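While installing Caffe I ran into several bugs that forced me to reinstall CUDA and cuDNN. The NVIDIA driver was already installed.
1. Installing CUDA, cuDNN, and the NVIDIA driver
1.1 The struggle begins
The Caffe problems broke CUDA, and at that point CUDA would no longer install either, so I had to start over with version 11.0.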
    # Download the CUDA 11.0 toolkit for Ubuntu 20.04 from the official site
    wget https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda_11.0.3_450.51.
    sudo sh cuda_11.0.3_450.51.06_linux.run
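The installer kept reporting a problem, and it still failed after I removed the driver. I decided not to go with 11.0. Since the error came from the driver component, I tried skipping the driver in the installer, and the installation then succeeded.
1.2 Installing the recommended NVIDIA driver, 460
1.3 After the driver installs successfully: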
    nvidia-smi
    # reports CUDA 11.2, NVIDIA driver 460
1.4 Adding the environment variables after installation
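    gedit ~/.bashrc
    # Add the following lines to ~/.bashrc:
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
    export PATH=$PATH:/usr/local/cuda/bin
    # CUDA_HOME is a single directory, not a colon-separated list
    export CUDA_HOME=/usr/local/cuda
    # Reload the file so the current shell picks up the changes
    source ~/.bashrc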
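Because ~/.bashrc is sourced every time a shell starts (and again by hand after each edit), plain `export PATH=$PATH:...` lines can pile up duplicate entries. A minimal sketch of an idempotent variant follows; `append_once` is a hypothetical helper name, not something from the CUDA toolkit, and /usr/local/cuda is assumed to be the install prefix used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a directory to a colon-separated list
# only if it is not already in the list.
append_once() {
    local current="$1" dir="$2"
    case ":${current}:" in
        *":${dir}:"*) printf '%s\n' "$current" ;;             # already present
        *) printf '%s\n' "${current:+${current}:}${dir}" ;;   # append without a leading colon
    esac
}

# Assumed install prefix (the default symlink created by the CUDA installer).
export CUDA_HOME=/usr/local/cuda
export PATH="$(append_once "$PATH" "$CUDA_HOME/bin")"
export LD_LIBRARY_PATH="$(append_once "${LD_LIBRARY_PATH:-}" "$CUDA_HOME/lib64")"
```

Sourcing this several times leaves PATH and LD_LIBRARY_PATH unchanged after the first run.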
1.5 Testing CUDA
Sure enough, another bug appeared.
    cd /usr/local/cuda/samples/1_Utilities/deviceQuery
    sudo make
    ./deviceQuery
    /usr/local/cuda/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_80,code=compute_80 -o deviceQuery.o -c deviceQuery.cpp
    nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    g++: No such file or directory
    nvcc fatal : Failed to preprocess host compiler properties.
    make: *** [Makefile:309:deviceQuery.o] Error 1
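The `g++: No such file or directory` line in the output above means nvcc could not find its host compiler. A small pre-flight check like the following sketch would surface that before running make; `missing_tools` is a hypothetical helper, and the list of tools checked is an assumption about what the samples need.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: print the names of any tools not found on PATH.
missing_tools() {
    local tool missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="${missing:+${missing} }${tool}"
    done
    printf '%s\n' "$missing"
}

# Assumed tool list for building the CUDA samples.
missing="$(missing_tools g++ make nvcc)"
if [ -n "$missing" ]; then
    echo "missing tools: $missing" >&2
else
    echo "toolchain looks complete"
fi
```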
1.5 g++ was not installed; installing it through build-essential ran into dependency problems
    The following packages have unmet dependencies:
     libc6-dev : Depends: libc6 (= 2.31-0ubuntu9) but 2.31-0ubuntu9.2 is to be installed
    E: Unable to correct problems, you have held broken packages.
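1.6 Resolving the dependency problem with aptitude
Once aptitude proposed a workable solution, g++ installed successfully. Running sudo make now produced only warnings and no errors, but the deviceQuery output shows that it failed again.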
    ./deviceQuery Starting...

     CUDA Device Query (Runtime API) version (CUDART static linking)

    cudaGetDeviceCount returned 999 -> unknown error
    Result = FAIL
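Error 999 is the runtime's generic "unknown error", so the output alone does not say what went wrong. One thing that may be worth checking (an assumption on my part, since the post does not diagnose the cause) is whether the NVIDIA kernel modules are actually loaded. The sketch below reads /proc/modules; `module_loaded` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical check: is a kernel module listed in /proc/modules?
# An alternative file can be passed as the second argument.
module_loaded() {
    local name="$1" file="${2:-/proc/modules}"
    [ -r "$file" ] && grep -q "^${name} " "$file"
}

for mod in nvidia nvidia_uvm; do
    if module_loaded "$mod"; then
        echo "$mod: loaded"
    else
        echo "$mod: NOT loaded"
    fi
done
```

If nvidia_uvm turns out to be missing, loading it with `sudo modprobe nvidia_uvm` or rebooting after a driver change are possible next steps, though neither is guaranteed to clear the error.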
1.7 Trying again
Out of options, I uninstalled it and tried version 11.02, which failed again.
    /usr/local/cuda/bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o deviceQuery.o -c deviceQuery.cpp
    nvcc fatal : Unknown option '--threads'
    make: *** [Makefile:323:deviceQuery.o] Error 1
nvcc -V does report the expected version, though. I could not resolve this error, so I moved on to the next step.
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2020 NVIDIA Corporation
    Built on Thu_Jun_11_22:26:38_PDT_2020
    Cuda compilation tools, release 11.0, V11.0.194
    Build cuda_11.0_bu.TC445_37.28540450_0
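A possible reading of the `Unknown option '--threads'` error is a version mismatch: as far as I know, nvcc only gained `--threads` in CUDA 11.2, while the nvcc shown here is release 11.0, so the samples Makefile being built is probably newer than the compiler. The sketch below extracts the release number from `nvcc -V` and compares it with 11.2; the helper names and the 11.2 threshold are assumptions, not something stated in the post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `nvcc -V` text on stdin and print the release, e.g. 11.0
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: succeed if version $1 >= version $2 (uses sort -V ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v nvcc >/dev/null 2>&1; then
    rel="$(nvcc -V | nvcc_release)"
    # 11.2 is the assumed first release that accepts --threads.
    if version_ge "$rel" 11.2; then
        echo "nvcc $rel should accept --threads"
    else
        echo "nvcc $rel is older than 11.2; the samples are probably newer than the toolkit"
    fi
fi
```

If the versions do differ, building the samples that shipped with the installed toolkit, rather than a newer checkout, is one way to test this explanation.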
1.8 Ignoring the error for now; cuDNN is fine
The cuDNN and CUDA paths check out.
    cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
    cat /usr/local/cuda/version.txt
    CUDA Version 11.0.207
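The grep above prints the three raw `#define` lines. As a small convenience, the sketch below turns them into a single MAJOR.MINOR.PATCH string; `cudnn_version` is a hypothetical helper, and it assumes the header defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL on separate lines, as cuDNN 8 headers do.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print MAJOR.MINOR.PATCH from a cudnn_version.h file.
cudnn_version() {
    local header="${1:-/usr/local/cuda/include/cudnn_version.h}"
    [ -r "$header" ] || { echo "header not found: $header" >&2; return 1; }
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$header"
}

# Uses the default header path from the post; prints an error if it is absent.
cudnn_version || true
```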
2. Installing Caffe
    sudo apt install --no-install-recommends libboost-all-dev
    sudo apt install cmake git unzip libgflags-dev libgoogle-glog-dev libprotobuf-dev libleveldb-dev liblmdb-dev libsnappy-dev libhdf5-serial-dev protobuf-compiler libatlas-base-dev libopenblas-dev liblapack-dev python3-dev python3-skimage graphviz python-protobuf
    pip install --upgrade pip
    pip install numpy pydot protobuf scikit-image
Along the way, a dependency conflict came up that could only be resolved by downgrading a package.
    Keep the following packages at their current version:
    1)     libhdf5-dev [Not Installed]
    2)     libprotobuf-dev [Not Installed]
    3)     python3-dev [Not Installed]
    4)     python3.8-dev [Not Installed]
    5)     zlib1g-dev [Not Installed]

    Leave the following dependencies unresolved:
    6)     protobuf-compiler recommends libprotobuf-dev

    Accept this solution? [Y/n/q/?] n
    The following actions will resolve these dependencies:

    Downgrade the following packages:
    1)     zlib1g [1:1.2.11.dfsg-2ubuntu1.2 (now) -> 1:1.2.11.dfsg-2ubuntu1 (focal)]
With that resolved, clone Caffe with git and create its build configuration:
    git clone https://github.com/BVLC/caffe.git
    # Makefile.config.example lives in the repository root
    cd caffe
    cp Makefile.config.example Makefile.config
    gedit Makefile.config
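The post does not say which lines were changed in Makefile.config. As an illustration only, the sketch below uncomments options of the form `# KEY := value` from the command line instead of in gedit; `enable_option` is a hypothetical helper, and the choice of USE_CUDNN and WITH_PYTHON_LAYER is an assumption about a typical GPU build with Python support.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn "# KEY := value" into "KEY := value" in a config file.
# A backup copy is written next to the file with a .bak suffix.
enable_option() {
    local key="$1" file="$2"
    sed -i.bak "s/^#[[:space:]]*\(${key}[[:space:]]*:=\)/\1/" "$file"
}

# Only touches Makefile.config if it exists in the current directory.
if [ -f Makefile.config ]; then
    enable_option USE_CUDNN Makefile.config          # assumed: build with cuDNN
    enable_option WITH_PYTHON_LAYER Makefile.config  # assumed: enable Python layers
fi
```

Lines that are not commented out, and options with other names, are left untouched, so the helper can be run repeatedly.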
    conda create -n caffe python=3.x   # use the same Python version as Ubuntu's system Python
    conda activate caffe
    conda install -y numpy
    conda install -y scikit-image

    sudo cp -r /usr/lib/python3/dist-packages/caffe /home/guoba/anaconda3/envs/caffe/lib/python3.8/site-packages/caffe
    # caffe_scr_dir is the Caffe install path, /usr/lib/python3/dist-packages/ by default
    # anaconda_dir is the Anaconda install path, ~/anaconda by default
    sudo cp -r caffe_scr_dir/google anaconda_dir/envs/caffe/lib/python3.x/site-packages/google
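After copying the packages into the conda environment, a quick import check can confirm that Python actually sees them. This is a minimal sketch; `can_import` is a hypothetical helper, the PYTHON variable is an assumed override for choosing the interpreter, and the module list simply mirrors the packages mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the chosen Python can import the given module.
# Set PYTHON to pick an interpreter; python3 on PATH is the assumed default.
can_import() {
    "${PYTHON:-python3}" -c "import $1" >/dev/null 2>&1
}

for mod in numpy skimage google.protobuf caffe; do
    if can_import "$mod"; then
        echo "ok:      $mod"
    else
        echo "missing: $mod"
    fi
done
```

Run it with the caffe environment activated so that python3 resolves to the environment's interpreter.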
    cmake -D CMAKE_BUILD_TYPE=RELEASE \
          -D CMAKE_INSTALL_PREFIX=/usr/local \
          -D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib/modules \
          -D BUILD_TIFF=ON \
          -D WITH_FFMPEG=ON \
          -D WITH_GSTREAMER=ON \
          -D WITH_TBB=ON \
          -D BUILD_TBB=ON \
          -D WITH_EIGEN=ON \
          -D WITH_V4L=ON \
          -D WITH_LIBV4L=ON \
          -D WITH_VTK=OFF \
          -D WITH_QT=OFF \
          -D WITH_OPENGL=ON \
          -D OPENCV_ENABLE_NONFREE=ON \
          -D INSTALL_C_EXAMPLES=OFF \
          -D INSTALL_PYTHON_EXAMPLES=OFF \
          -D BUILD_NEW_PYTHON_SUPPORT=ON \
          -D OPENCV_GENERATE_PKGCONFIG=ON \
          -D BUILD_TESTS=OFF \
          -D OPENCV_DNN_CUDA=ON \
          -D ENABLE_FAST_MATH=ON \
          -D CUDA_FAST_MATH=ON \
          -D CUDA_ARCH_BIN=7.0 \
          -D WITH_CUBLAS=ON \
          -D WITH_CUDNN=ON \
          -D CUDNN_LIBRARY=/usr/local/cuda/lib64/libcudnn.so.8.0.5 \
          -D CUDNN_INCLUDE_DIR=/usr/local/cuda/include \
          -D BUILD_EXAMPLES=OFF ..
Detailed reference links:
https://cyfeng.science/2020/05/02/ubuntu-install-nvidia-driver-cuda-cudnn-suits/
https://www.lijingle.com/thread-36-1-1.html
https://zhuanlan.zhihu.com/p/339835760
https://www.dazhuanlan.com/2019/12/05/5de8098e42817
https://www.cnblogs.com/klchang/p/14353384.html