When using pytorch_transformers, the following error is raised: ModuleNotFoundError: No module named 'fused_layer_norm_cuda'
Solution
Build and install apex manually from source:
git clone https://github.com/NVIDIA/apex
cd apex
CUDA_HOME=/usr/local/cuda-11.2 pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
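If the CUDA toolkit used for the build differs from the CUDA version PyTorch was compiled with, the apex installation fails with an error like this: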
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/media/backup/john/project/apex/setup.py", line 177, in <module>
    check_cuda_torch_binary_vs_bare_metal(CUDA_HOME)
  File "/media/backup/john/project/apex/setup.py", line 34, in check_cuda_torch_binary_vs_bare_metal
    raise RuntimeError(
RuntimeError: Cuda extensions are being compiled with a version of Cuda that does not match the version used to compile Pytorch binaries. Pytorch binaries were compiled with Cuda 11.3.
In some cases, a minor-version mismatch will not cause later errors: https://github.com/NVIDIA/apex/pull/323#discussion_r287021798. You can try commenting out this check (at your own risk).
ERROR: Command errored out with exit status 1: /home/anaconda3/envs/py38/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/media//john/project/apex/setup.py'"'"'; __file__='"'"'/media//john/project/apex/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' --cpp_ext --cuda_ext install --record /tmp/pip-record-rw1s_hs_/install-record.txt --single-version-externally-managed --compile --install-headers /home//anaconda3/envs/py38/include/python3.8/apex Check the logs for full command output.
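The RuntimeError says PyTorch was built with CUDA 11.3, while the install command pointed CUDA_HOME at cuda-11.2. Before rebuilding, it can help to compare the two versions directly. The following is only a minimal sketch, not part of apex: the helper names (parse_cuda_release, parse_torch_cuda, versions_match) are hypothetical, and the /usr/local/cuda fallback for CUDA_HOME is an assumption about the machine layout. It relies on torch.version.cuda and on the "release X.Y" text that nvcc --version prints.

```python
import os
import re
import subprocess
from typing import Optional, Tuple

Version = Optional[Tuple[int, int]]


def parse_cuda_release(nvcc_output: str) -> Version:
    """Extract (major, minor) from `nvcc --version` text such as 'release 11.2,'."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_torch_cuda(version: Optional[str]) -> Version:
    """Convert a torch.version.cuda string such as '11.3' to (11, 3).

    torch.version.cuda is None for CPU-only builds, so None is passed through.
    """
    if not version:
        return None
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def versions_match(toolkit: Version, torch_cuda: Version) -> bool:
    """True only when both versions are known and major.minor are identical."""
    return toolkit is not None and toolkit == torch_cuda


if __name__ == "__main__":
    # Assumption: fall back to /usr/local/cuda when CUDA_HOME is not set.
    cuda_home = os.environ.get("CUDA_HOME", "/usr/local/cuda")
    nvcc = os.path.join(cuda_home, "bin", "nvcc")
    try:
        nvcc_text = subprocess.run(
            [nvcc, "--version"], capture_output=True, text=True, check=True
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        nvcc_text = ""  # nvcc missing or failed; treated as "unknown"

    try:
        import torch
        torch_cuda = parse_torch_cuda(torch.version.cuda)
    except ImportError:
        torch_cuda = None  # PyTorch not installed in this environment

    toolkit = parse_cuda_release(nvcc_text)
    print("CUDA toolkit (nvcc):", toolkit)
    print("PyTorch built with :", torch_cuda)
    print("match:", versions_match(toolkit, torch_cuda))
```

If the two versions differ, one option is to point CUDA_HOME at a toolkit whose version matches the one PyTorch reports, provided such a toolkit is installed; the alternative mentioned in the error message, commenting out the check, is at your own risk.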