From Fedora Project Wiki

Revision as of 16:44, 22 August 2017 by Tc01 (talk | contribs) (→‎Install NVIDIA libraries: nvidia-smi is now in the xorg-x11-drv-nvidia-cuda package)

Note these instructions *only* apply to Fedora 16 and later releases.

This document provides instructions to install and remove CUDA 4.2 on Fedora. The final goal is to run GPU-enabled BOINC applications (in particular, GPUGRID).

Installation

Prerequisites

First, be sure your GPU is compatible with CUDA. Refer to Nvidia's list of CUDA GPUs.
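
To look your card up in that list you need its exact model name. The following is a minimal sketch, assuming the lspci tool (from the pciutils package) is installed; the helper name nvidia_gpus is made up for this example and is not part of any package.

```shell
# Hypothetical helper: keep only the lspci lines that describe NVIDIA
# display devices. Reads lspci-style text on standard input.
nvidia_gpus() {
    grep -i -E 'VGA compatible controller|3D controller' | grep -i 'nvidia'
}

# Usage: only runs when lspci is available on this machine.
if command -v lspci >/dev/null 2>&1; then
    lspci | nvidia_gpus || echo "No NVIDIA display device found."
fi
```

The model name printed in square brackets is what you compare against Nvidia's list.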

Then, install required packages with DNF:

su -c 'dnf install wget make gcc-c++ freeglut-devel libXi-devel libXmu-devel mesa-libGLU-devel'

Or with YUM:

su -c 'yum install wget make gcc-c++ freeglut-devel libXi-devel libXmu-devel mesa-libGLU-devel'
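
If you are unsure which of the two tools your release ships, a small sketch like the following picks one. The function name pick_pkg_manager is hypothetical; the sketch only prints the install command so you can review it before running it.

```shell
# Print the package manager to use: dnf if present, otherwise yum.
# Prints nothing and returns 1 if neither is installed.
pick_pkg_manager() {
    if command -v dnf >/dev/null 2>&1; then
        echo dnf
    elif command -v yum >/dev/null 2>&1; then
        echo yum
    else
        return 1
    fi
}

# Usage: print the command instead of running it.
if PKG=$(pick_pkg_manager); then
    echo "su -c '$PKG install wget make gcc-c++ freeglut-devel libXi-devel libXmu-devel mesa-libGLU-devel'"
else
    echo "Neither dnf nor yum found." >&2
fi
```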

Install NVIDIA libraries

If you want to run BOINC projects, you will also need the NVIDIA drivers and libraries available in the "rpmfusion" repository.

Additional confirmation needed
Here, the graphics driver is changed from the standard 'nouveau' driver to the 'nvidia' driver. Is this really necessary? One may be able to keep the 'nouveau' driver.
May activate annoying bug in Flash Player
Using the 'nvidia' driver might cause your Flash Player (if you have it) to exhibit the "blue YouTube" bug, as Flash Player will try to use hardware acceleration, but will fail in doing so.
  • See ForbiddenItems#NVIDIA for Fedora's official policy on the NVIDIA drivers (which is why the "nouveau" graphics driver instead of the "nvidia" one is used by default.)
  • See rpmfusion configuration on how to activate the "rpmfusion" repository for your installation.
  • See xorg-x11-drv-nvidia on what the RPM fusion "xorg-x11-drv-nvidia" package is about and how to install it.

The "rpmfusion" repository provides several RPM packages:

  • The "nvidia" graphics driver, which replaces the "nouveau" driver.
  • Two configuration tool packages.
  • The NVIDIA library packages. There are 32-bit and 64-bit versions. On a 64-bit machine you need to install both; otherwise some BOINC projects won't run.
  • The NVIDIA kernel module package. This package is built for a specific kernel and needs to be updated whenever your system is updated to a new kernel (see below).

This illustration (made with Gliffy) depicts the packages and their interdependencies. It is sufficient to install the library packages to pull them all in, including the correct kernel module package:

RPM fusion packages and their interdependencies

On 32 bit using DNF:

su -c 'dnf install xorg-x11-drv-nvidia-libs' 

Or with YUM:

su -c 'yum install xorg-x11-drv-nvidia-libs' 

On 64 bit using DNF:

su -c 'dnf install xorg-x11-drv-nvidia-libs xorg-x11-drv-nvidia-libs.i686' 

Or with YUM:

su -c 'yum install xorg-x11-drv-nvidia-libs xorg-x11-drv-nvidia-libs.i686' 
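
The choice between the 32 bit and 64 bit package lists can be derived from the machine architecture. This is only a sketch: the function name nvidia_lib_packages is invented here, and it assumes that anything other than x86_64 should be treated as a 32 bit system.

```shell
# Print the library packages to install for a given architecture
# (as reported by `uname -m`).
nvidia_lib_packages() {
    case "$1" in
        x86_64) echo "xorg-x11-drv-nvidia-libs xorg-x11-drv-nvidia-libs.i686" ;;
        *)      echo "xorg-x11-drv-nvidia-libs" ;;
    esac
}

# Usage: show what would be installed on this machine.
echo "Packages for $(uname -m): $(nvidia_lib_packages "$(uname -m)")"
```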

After installation, check the list of provided libraries using "rpm --query --list xorg-x11-drv-nvidia-libs | grep -P '\.so(\.[123])?$'"

/usr/lib64/nvidia/libGL.so.1
/usr/lib64/nvidia/libOpenCL.so.1
/usr/lib64/nvidia/libXvMCNVIDIA.so.1
/usr/lib64/nvidia/libcuda.so
/usr/lib64/nvidia/libcuda.so.1
/usr/lib64/nvidia/libnvcuvid.so.1
/usr/lib64/nvidia/libnvidia-cfg.so.1
/usr/lib64/nvidia/libnvidia-compiler.so.1
/usr/lib64/nvidia/libnvidia-glcore.so
/usr/lib64/nvidia/libnvidia-glcore.so.1
/usr/lib64/nvidia/libnvidia-ml.so.1
/usr/lib64/nvidia/libnvidia-opencl.so.1
/usr/lib64/nvidia/libnvidia-tls.so.1
/usr/lib64/nvidia/tls/libnvidia-tls.so
/usr/lib64/nvidia/tls/libnvidia-tls.so.1
/usr/lib64/vdpau/libvdpau_nvidia.so
/usr/lib64/vdpau/libvdpau_nvidia.so.1
/usr/lib/nvidia/libGL.so.1
/usr/lib/nvidia/libOpenCL.so.1
/usr/lib/nvidia/libXvMCNVIDIA.so.1
/usr/lib/nvidia/libcuda.so
/usr/lib/nvidia/libcuda.so.1
/usr/lib/nvidia/libnvcuvid.so.1
/usr/lib/nvidia/libnvidia-cfg.so.1
/usr/lib/nvidia/libnvidia-compiler.so.1
/usr/lib/nvidia/libnvidia-glcore.so
/usr/lib/nvidia/libnvidia-glcore.so.1
/usr/lib/nvidia/libnvidia-ml.so.1
/usr/lib/nvidia/libnvidia-opencl.so.1
/usr/lib/nvidia/libnvidia-tls.so.1
/usr/lib/nvidia/tls/libnvidia-tls.so
/usr/lib/nvidia/tls/libnvidia-tls.so.1
/usr/lib/vdpau/libvdpau_nvidia.so
/usr/lib/vdpau/libvdpau_nvidia.so.1
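
The grep pattern in the command above keeps only names that end in .so, .so.1, .so.2 or .so.3, which hides the fully versioned files. A sketch of the same filter, written with extended regular expressions (-E) instead of -P; the function name so_links_only is hypothetical.

```shell
# Keep only library names ending in .so or .so.[123]; reads names on stdin.
so_links_only() {
    grep -E '\.so(\.[123])?$'
}

# Usage with the real package list:
#   rpm --query --list xorg-x11-drv-nvidia-libs | so_links_only
```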
Playing around with rpm
You may want to play around with rpm --query PACKAGENAME --requires and rpm --query PACKAGENAME --provides to list the capabilities required and provided by the packages. The counterparts rpm --query --whatrequires CAPABILITY and rpm --query --whatprovides CAPABILITY list the packages that require or provide said capabilities. Listing the capabilities of the kernel module requires that you write the name in full, including version, release and architecture: rpm --query kmod-nvidia-3.5.4-2.fc17.x86_64-304.37-1.fc17.5.x86_64 --requires
Configuration interface 1
The rpmfusion package xorg-x11-drv-nvidia-cuda comes with the 'nvidia-smi' application, which enables you to manage the graphics hardware from the command line. From the man page: "'nvidia-smi' provides monitoring information for each of NVIDIA's Tesla devices and each of its high-end Fermi-based and Kepler-based Quadro devices. It provides very limited information for other types of NVIDIA devices."
Configuration interface 2
The rpmfusion package nvidia-settings provides a GUI tool to manage the graphics card. From the man page: "The nvidia-settings utility is a tool for configuring the NVIDIA graphics driver. It operates by communicating with the NVIDIA X driver, querying and updating state as appropriate. This communication is done via the NV-CONTROL, GLX, XVideo, and RandR X extensions."

Upgrading your kernel

After a dnf upgrade or yum update that upgraded the kernel, booting will not work because the 'nvidia' kernel module for the new kernel version is missing and the 'nouveau' kernel module has been blacklisted via file '/etc/modprobe.d/blacklist-nouveau.conf'.

To fix this, log in as root, then:

  1. Determine the version code of your next kernel (it should be the highest version code in '/boot', e.g. '3.5.4-2.fc17.x86_64').
  2. Store this version code in a shell variable for ease-of-use: NEXT=3.5.4-2.fc17.x86_64.
  3. Add the new 'kmod-nvidia' package from the rpmfusion repository to your system by running
    dnf install "kmod-nvidia-${NEXT}"
    Or with YUM:
    yum install "kmod-nvidia-${NEXT}"
    This will add the kernel module as '/usr/lib/modules/${NEXT}/extra/nvidia/nvidia.ko'.
  4. After that, you may have to recreate the initial RAM disk as well (this seems to be needed on first installation only) using
    mv /boot/initramfs-${NEXT}.img /boot/initramfs-${NEXT}.old
    and then
    dracut /boot/initramfs-${NEXT}.img ${NEXT}
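
Steps 1 and 2 above can be sketched as follows. This assumes kernel images in '/boot' are named 'vmlinuz-VERSION' and that GNU sort (with the -V version sort option) is available; the function name latest_kernel_in is hypothetical. Check the printed value by hand before installing anything with it.

```shell
# Print the highest kernel version found as vmlinuz-* in a directory
# (default: /boot). Prints nothing if no kernel image is found.
latest_kernel_in() {
    dir=${1:-/boot}
    for f in "$dir"/vmlinuz-*; do
        [ -e "$f" ] || continue                  # glob did not match anything
        printf '%s\n' "${f##*/vmlinuz-}"         # strip path and prefix
    done | sort -V | tail -n 1
}

# Usage: store the version code and show the package name that step 3 needs.
NEXT=$(latest_kernel_in /boot)
if [ -n "$NEXT" ]; then
    echo "Next kernel: ${NEXT} (package: kmod-nvidia-${NEXT})"
else
    echo "No kernel image found in /boot." >&2
fi
```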

On the other hand, if you install the "akmod-nvidia" package instead, the kernel module is rebuilt automatically for each new kernel, so your system should keep working after kernel upgrades.

Download CUDA Toolkit and GPU Computing SDK

You will have to download the installers for the "CUDA Toolkit" and for the "GPU Computing SDK". Refer to the Nvidia CUDA downloads page for the latest versions.

Let's download and save them on the Desktop.

32 bit :

cd ~/Desktop
wget http://developer.download.nvidia.com/compute/cuda/4_2/rel/toolkit/cudatoolkit_4.2.9_linux_32_fedora14.run
wget http://developer.download.nvidia.com/compute/cuda/4_2/rel/sdk/gpucomputingsdk_4.2.9_linux.run

64 bit :

cd ~/Desktop
wget http://developer.download.nvidia.com/compute/cuda/4_2/rel/toolkit/cudatoolkit_4.2.9_linux_64_fedora14.run
wget http://developer.download.nvidia.com/compute/cuda/4_2/rel/sdk/gpucomputingsdk_4.2.9_linux.run
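
The two toolkit URLs differ only in the 32/64 part, so the right one can be chosen from the machine architecture. A sketch under the assumption that anything other than x86_64 is a 32 bit system; the function name cuda_toolkit_url is made up for this example.

```shell
# Print the CUDA 4.2 toolkit URL for a given architecture (from `uname -m`).
cuda_toolkit_url() {
    case "$1" in
        x86_64) bits=64 ;;
        *)      bits=32 ;;
    esac
    echo "http://developer.download.nvidia.com/compute/cuda/4_2/rel/toolkit/cudatoolkit_4.2.9_linux_${bits}_fedora14.run"
}

# Usage: print the URL for this machine; pass it to wget to download.
cuda_toolkit_url "$(uname -m)"
```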

Install CUDA Toolkit

Go to "Desktop", make the downloaded cudatoolkit file executable, and run it with root permissions:

cd ~/Desktop
chmod +x cudatoolkit_4.2.9_linux_*
su -c './cudatoolkit_4.2.9_linux_*'

When it asks you:

Enter install path (default /usr/local/cuda, '/cuda' will be appended):

type:

/opt

Install GPU Computing SDK

As before, go to "Desktop", make the downloaded gpucomputingsdk file executable, and run it without root permissions:

cd ~/Desktop
chmod +x gpucomputingsdk_4.2.9_linux.run
./gpucomputingsdk_4.2.9_linux.run

When it asks you:

Enter install path (default ~/NVIDIA_GPU_Computing_SDK):

press [enter] (to use default path)

When it asks you:

Enter CUDA install path (default /usr/local/cuda):

type

/opt/cuda

Preparation

Executable search path

Extend the executable search path to include CUDA executables:

export PATH=$PATH:/opt/cuda/bin

To make this permanent, modify your ~/.bashrc (modifying ~/.bash_profile will cause the path to be extended for the login shell only):

echo 'export PATH=$PATH:/opt/cuda/bin' >> ~/.bashrc
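
Because ~/.bashrc is read by every new interactive shell, including nested ones, a plain append can add the same directory several times. One possible guard, as a sketch (the function name path_append is hypothetical):

```shell
# Append a directory to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

path_append /opt/cuda/bin
export PATH
```

Calling path_append a second time with the same directory leaves PATH unchanged.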

Library search path

Extend the library search path to include CUDA libraries:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cuda/lib:/opt/cuda/lib64

To make this permanent, modify your ~/.bashrc:

echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cuda/lib:/opt/cuda/lib64' >> ~/.bashrc
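
One detail worth knowing: if LD_LIBRARY_PATH is empty beforehand, the line above produces a value with a leading colon, and the dynamic linker treats an empty entry as the current directory. A sketch that avoids the empty entry; the function name cuda_ld_path is invented for this example.

```shell
# Build a library path: the existing value (if any) followed by the CUDA
# directories, without a leading colon when the existing value is empty.
cuda_ld_path() {
    echo "${1:+$1:}/opt/cuda/lib:/opt/cuda/lib64"
}

LD_LIBRARY_PATH=$(cuda_ld_path "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH
```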

Or more correctly, add entries to /etc/ld.so.conf.d/, then run ldconfig once:

su -c 'echo "/opt/cuda/lib" > /etc/ld.so.conf.d/nvidia-cuda.conf; echo "/opt/cuda/lib64" > /etc/ld.so.conf.d/nvidia-cuda64.conf ; ldconfig'

If you later want to check whether any of your programs need something that isn't there, run this command in the relevant directory:

find . -exec file --mime '{}' ';' | grep 'application/x-executable' | cut --fields=1 --delimiter=':' | xargs ldd | grep 'not found'
Applications looking for CUDA 3
Some applications (in this case, einstein@home) may look for CUDA 3 libraries libcudart.so.3 and libcufft.so.3. In that case, run
su -c 'cd /opt/cuda/lib; ln -s libcudart.so.4 libcudart.so.3; ln -s libcufft.so.4 libcufft.so.3'
and on 64 bit machines additionally
su -c 'cd /opt/cuda/lib64; ln -s libcudart.so.4 libcudart.so.3; ln -s libcufft.so.4 libcufft.so.3'
That seems to do the trick.
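
The ln commands above fail if the links already exist. A sketch of a variant that can be run repeatedly, assuming the same library names as above; the function name link_cuda3_compat is hypothetical, and it has to be run as root for the directories under /opt/cuda.

```shell
# Create libNAME.so.3 -> libNAME.so.4 links in a directory, skipping
# libraries that are missing and links that already exist.
link_cuda3_compat() {
    dir=$1
    for lib in libcudart libcufft; do
        [ -e "$dir/$lib.so.4" ] || continue
        [ -e "$dir/$lib.so.3" ] || [ -L "$dir/$lib.so.3" ] || ln -s "$lib.so.4" "$dir/$lib.so.3"
    done
}

# Usage (as root):
#   link_cuda3_compat /opt/cuda/lib
#   link_cuda3_compat /opt/cuda/lib64    # 64 bit machines only
```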

Compile

Fedora 16

We finally compile:

32bits:

cd ~/NVIDIA_GPU_Computing_SDK/C
LINKFLAGS=-L/usr/lib/nvidia/ make

64bits:

cd ~/NVIDIA_GPU_Computing_SDK/C
LINKFLAGS=-L/usr/lib64/nvidia/ make

Fedora 17

Some compatibility problems appeared with gcc-4.7. You will have to install a compatibility version:

su -c 'yum install compat-gcc-34 compat-gcc-34-c++'

Create a symbolic link to make CUDA use gcc-3.4:

su -c 'ln -s /usr/bin/gcc34 /opt/cuda/bin/gcc'

Now, you can compile.

32bits:

cd ~/NVIDIA_GPU_Computing_SDK/C
LINKFLAGS=-L/usr/lib/nvidia/ make cuda-install=/opt/cuda

64bits:

cd ~/NVIDIA_GPU_Computing_SDK/C
LINKFLAGS=-L/usr/lib64/nvidia/ make cuda-install=/opt/cuda
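
To find out whether the Fedora 17 workaround applies to your system, you can compare the installed gcc version against 4.7. This is a sketch only: the function name gcc_too_new_for_cuda42 is hypothetical, and the 4.7 threshold is taken from the compatibility problem described above.

```shell
# Succeed if the given gcc version (e.g. "4.7.2") is 4.7 or newer.
gcc_too_new_for_cuda42() {
    ver=$1
    major=${ver%%.*}          # text before the first dot
    rest=${ver#*.}            # text after the first dot
    minor=${rest%%.*}         # second version component
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 7 ]; }
}

# Usage: check the system compiler, if one is installed.
if command -v gcc >/dev/null 2>&1 && gcc_too_new_for_cuda42 "$(gcc -dumpversion)"; then
    echo "gcc $(gcc -dumpversion) is 4.7 or newer: use the compat-gcc-34 workaround."
fi
```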

Test

Now, let's test if CUDA is working correctly. Type:

~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/fluidsGL

You should see something like this on the command line:

[fluidsGL] starting...

[fluidsGL] - [OpenGL/CUDA simulation] starting...
   OpenGL device is Available
CUDA device [GeForce GT 610] has 1 Multi-Processors

A window with a fluid dynamics simulation should appear. Use the mouse pointer to generate some activity.

Now we can use GPUGRID applications with BOINC.

If the following error message appears instead:

[fluidsGL] starting...

[fluidsGL] - [OpenGL/CUDA simulation] starting...
   OpenGL device is NOT Available, [fluidsGL] exiting...
[fluidsGL] test results...
WAIVED

You are probably running the application as a user that does not currently have access to the display.

There are additional test programs underneath ~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/. Try a few.

In particular

~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/deviceQuery

will tell you about the capabilities of the graphics device:

Device 0: "GeForce GT 610"
  CUDA Driver Version / Runtime Version          5.0 / 4.2
  CUDA Capability Major/Minor version number:    2.1
  Total amount of global memory:                 1024 MBytes (1073283072 bytes)
  ( 1) Multiprocessors x ( 48) CUDA Cores/MP:    48 CUDA Cores
  GPU Clock rate:                                1620 MHz (1.62 GHz)
  Memory Clock rate:                             667 Mhz
  Memory Bus Width:                              64-bit
  L2 Cache Size:                                 65536 bytes
  Max Texture Dimension Size (x,y,z)             1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)
  Max Layered Texture Size (dim) x layers        1D=(16384) x 2048, 2D=(16384,16384) x 2048
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 65535
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and execution:                 Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Concurrent kernel execution:                   Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support enabled:                No
  Device is using TCC driver mode:               No
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           2 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
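
The "CUDA Driver Version / Runtime Version" line is worth a second look: the driver version should not be lower than the runtime version. A sketch that extracts both numbers from deviceQuery output in the format shown above; the function name check_cuda_versions is hypothetical.

```shell
# Read deviceQuery output on stdin and compare driver and runtime versions.
check_cuda_versions() {
    awk '/CUDA Driver Version \/ Runtime Version/ {
        driver = $(NF-2); runtime = $NF
        split(driver, d, "."); split(runtime, r, ".")
        # Compare major numbers first, then minor numbers.
        ok = (d[1]+0 > r[1]+0) || (d[1]+0 == r[1]+0 && d[2]+0 >= r[2]+0)
        printf "driver %s, runtime %s: %s\n", driver, runtime, (ok ? "ok" : "driver too old")
    }'
}

# Usage:
#   ~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/deviceQuery | check_cuda_versions
```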

Cleanup

Now that CUDA has been installed, the installer files are no longer needed. You can remove them:

cd ~/Desktop
rm cudatoolkit_4.2.9_linux_*
rm gpucomputingsdk_4.2.9_linux.run

Uninstall

If you want to remove CUDA completely, just delete the /opt/cuda and ~/NVIDIA_GPU_Computing_SDK folders:

rm -r ~/NVIDIA_GPU_Computing_SDK
su -c 'rm -r /opt/cuda'

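and remove the export PATH=$PATH:/opt/cuda/bin and export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cuda/lib:/opt/cuda/lib64 lines from your ~/.bashrc file. If you created the /etc/ld.so.conf.d/nvidia-cuda.conf and /etc/ld.so.conf.d/nvidia-cuda64.conf files, remove them as well and run ldconfig again.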
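
Removing the two export lines can also be scripted. The sketch below deletes only lines that match them exactly and keeps a backup copy with a .bak suffix. The function name strip_cuda_exports is hypothetical; pass it the file you appended the lines to (~/.bashrc in the Preparation steps above).

```shell
# Remove the two CUDA export lines from a shell startup file.
# A backup of the original is kept as FILE.bak.
strip_cuda_exports() {
    file=$1
    tmp=$(mktemp) || return 1
    # grep exits with 1 when no line is left over; that is not an error here.
    grep -v -x -F \
        -e 'export PATH=$PATH:/opt/cuda/bin' \
        -e 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cuda/lib:/opt/cuda/lib64' \
        "$file" > "$tmp" || [ $? -eq 1 ] || { rm -f "$tmp"; return 1; }
    cp "$file" "$file.bak" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage:
#   strip_cuda_exports ~/.bashrc
```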

Useful links