How to install TensorFlow with NVIDIA GPU support, using the same GPU for both computing and display. The GPU in the example is a GTX 1080 on Ubuntu 16 (updated for Linux Mint 19). TensorFlow is installed with Virtualenv. For a pip install of TensorFlow for CPU only, you can check here:

Installing tensorflow on Ubuntu google cloud platform

Steps described in this article:

Initial article setup:

  • python 3.5
  • Linux Mint 18
  • CUDA 9.0
  • CUDNN 7.1

Updated version(August 2018):

  • python 3.6
  • Linux Mint 19
  • CUDA 9.2
  • CUDNN 7.2

Why use the GPU instead of the normal CPU version of tensorflow? My tests show that a single NVIDIA GTX 1080 is roughly 10 times faster than 24 CPUs on Google Cloud Platform. This measurement is rough and doesn't take many factors into account. But from a practical point of view, one and the same NN with the same training set takes 48 hours on 24 CPUs and 4 hours on a single 1080 (used in dual mode: display and compute), which is a 12x difference.

Prerequisite

In order to follow this article you need:

  • Ubuntu or Linux Mint
  • a CUDA-capable NVIDIA GPU
  • Python installed
  • basic knowledge of Linux terminal commands

Install Required libraries

Update: you can also install CUDA with:

sudo apt install cuda-9-0

Older versions of CUDA (such as 7.0 and 8.0) can be found here:
CUDA Toolkit Archive

Install Cuda Toolkit

In order to use your CUDA GPU you need to install the CUDA Toolkit. The latest available version is 9.1, but so far it is not compatible with tensorflow, so I had to downgrade to 9.0 in order to avoid this error:

libcublas.so.9.0: cannot open shared object file: No such file or directory
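Once CUDA is installed, a quick way to see whether this error would still occur is to ask the dynamic loader for the same file. The snippet below is only a minimal sketch: the helper name can_load_shared_library is my own, and it checks nothing beyond whether the library can be opened.

```python
import ctypes


def can_load_shared_library(name):
    """Return True if the dynamic loader can open the library called `name`."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        # Raised when the file is missing or cannot be opened.
        return False


if __name__ == "__main__":
    # libcublas.so.9.0 is the file named in the error message above.
    lib = "libcublas.so.9.0"
    print(lib, "found" if can_load_shared_library(lib) else "NOT found")
```

If it prints NOT found, the library is either not installed or not on the loader's search path (for example LD_LIBRARY_PATH).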

The official documentation from NVidia is here: NVIDIA CUDA Installation Guide for Linux

In short:

  1. Verify the kernel headers and gcc (more info at the link above). In my case everything was fine and no action was needed for my installation of Ubuntu 16 (also tested on Linux Mint 18).
  2. Download the version for your system from CUDA Toolkit 9.1 Download
    2.1 Select version 9.0 (from legacy releases)
    2.2 Operating System - Linux
    2.3 Architecture - x86_64
    2.4 Distribution - Ubuntu
    2.5 Version - 16.04
    2.6 Installer Type - deb (local)
  3. Download the file
  4. Run the following scripts:
sudo dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
sudo apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

or, for 9.1:

sudo dpkg -i cuda-repo-ubuntu1604-9-1-local_9.1.85-1_amd64.deb
sudo apt-key add /var/cuda-repo-9-1-local/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

Set environment variables:

Add the CUDA binaries to your PATH by adding the line

export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}

or

export PATH=/usr/local/cuda-9.1/bin${PATH:+:${PATH}}

to your ~/.bashrc file with this command (or any other way of adding a single line to .bashrc). sudo is not needed, because the file belongs to your own user:

nano ~/.bashrc
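The ${PATH:+:${PATH}} part of the export line can look cryptic. It appends a colon and the old PATH only when PATH is already set and non-empty, so you never end up with a trailing colon. Here is a small Python sketch of the same rule (the function name prepend_to_path is just for illustration):

```python
import os


def prepend_to_path(new_dir, current):
    """Mimic new_dir${PATH:+:${PATH}}: add ':' + current only if current is non-empty."""
    return new_dir + (":" + current if current else "")


if __name__ == "__main__":
    cuda_bin = "/usr/local/cuda-9.0/bin"
    # Show what PATH would look like after the export line runs.
    print(prepend_to_path(cuda_bin, os.environ.get("PATH", "")))
```

With an empty PATH the result is just the CUDA folder. Otherwise the CUDA folder comes first, followed by the existing entries.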

Install CUDNN

You also need to install cuDNN from this link: cuDNN 7.1

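  • Create an account
  • Download cuDNN v7.1 (latest for 9.0 and 9.1)
  • Select the cuDNN libraries for Linux: development, documentation and runtime
  • Install them with the commands below (these file names are the CUDA 9.1 builds; use the files that match your CUDA version)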
sudo dpkg -i libcudnn7_7.1.2.21-1+cuda9.1_amd64.deb 
sudo dpkg -i libcudnn7-dev_7.1.2.21-1+cuda9.1_amd64.deb 
sudo dpkg -i libcudnn7-doc_7.1.2.21-1+cuda9.1_amd64.deb 

Install Nvidia Driver

Go to NVIDIA Driver Downloads and select the driver for your card. For the 1080 these are my filters:

  • Product Type: GeForce
  • Product Series: GeForce 10 Series
  • Product: GeForce GTX 1080
  • Operating System: Linux 64-bit

Install the driver with

sudo sh NVIDIA-Linux-x86_64-390.48.run

You may need to run it from a text-only terminal, without the graphical user interface running.

Additional steps

Run:

sudo apt-get install cuda-command-line-tools

Add the CUPTI libraries to your library path:

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/usr/local/cuda/extras/CUPTI/lib64

by adding the line to ~/.bashrc with

nano ~/.bashrc
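After opening a new terminal you can confirm that the CUPTI folder really ended up in LD_LIBRARY_PATH. This is a rough sketch: the helper is_on_search_path is hypothetical and only compares strings, it does not check that the folder exists on disk.

```python
import os

# The folder from the export line above.
CUPTI_DIR = "/usr/local/cuda/extras/CUPTI/lib64"


def is_on_search_path(directory, search_path):
    """Return True if `directory` is one of the ':'-separated entries of `search_path`."""
    entries = [entry.rstrip("/") for entry in search_path.split(":") if entry]
    return directory.rstrip("/") in entries


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    print("CUPTI on LD_LIBRARY_PATH:", is_on_search_path(CUPTI_DIR, current))
```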

You may need to install Java:

sudo apt-get update
sudo apt-get upgrade 
sudo apt-get install default-jre
sudo apt-get install default-jdk

Check your GPU information

You may need to check your GPU information in order to avoid this error:

InvalidArgumentError (see above for traceback): Cannot assign a device to node 'MatMul_1': Could not satisfy explicit device specification '/device:GPU:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0

One solution is to make your GPU visible to CUDA by exporting the variable in the terminal where you run tensorflow:

export CUDA_VISIBLE_DEVICES=0

To check your GPU information you can use some of the following commands:

lspci | grep -i nvidia
nvidia-smi 
inxi -Fxz
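If you want the same information inside a script, nvidia-smi can print selected fields as CSV. The sketch below assumes nvidia-smi is on your PATH and returns None when it is not. The function names are my own and the chosen fields (name, driver_version, memory.total) are just an example.

```python
import shutil
import subprocess


def parse_gpu_line(line):
    """Split one CSV line 'name, driver_version, memory.total' into a dict."""
    name, driver, memory = [field.strip() for field in line.split(",")]
    return {"name": name, "driver": driver, "memory": memory}


def query_gpus():
    """Return a list of GPU dicts, or None if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=name,driver_version,memory.total",
         "--format=csv,noheader"],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    return [parse_gpu_line(line) for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    print(query_gpus())
```

A result of None means the driver tools are not installed or not working, which is the first thing to fix before looking at tensorflow itself.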

Another error which can be raised when you first test tensorflow with GPU support is:

ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

In this case check that the GPU drivers are properly installed and working by running nvidia-smi. The output should look similar to this:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.48                 Driver Version: 390.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:20:00.0  On |                  N/A |
|  0%   46C    P8    15W / 200W |    528MiB /  8116MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1291      G   /usr/lib/xorg/Xorg                           255MiB |
|    0      1826      G   cinnamon                                      93MiB |
|    0      2045      G   ...-token=0131A404F88EDFB69CFBE87FE27CF2B5     8MiB |
|    0      2776      G   ...-token=4213622D51C92E34E15CBCBCB028F7CA    60MiB |
|    0     25949      G   ...-token=81FD9A54DC3DB7C8FBAA2380BC4090AB    68MiB |
|    0     26937      G   ...-token=730AF28A86DCED16D77B4CD5AF0378A4    39MiB |
+-----------------------------------------------------------------------------+

If this is not the case you can reinstall the video card driver. To uninstall all graphics drivers related to NVIDIA run the command below (the pattern is quoted so that the shell passes it to apt-get unchanged):

sudo apt-get remove --purge 'nvidia*'

Install Tensorflow

I prefer to create a virtual environment for tensorflow because:

  • you can have several different versions of tensorflow
  • if something goes wrong you can easily fix it or build a new environment
  • you have fewer problems with modules and required libraries that differ between projects. For example, one project requires numpy 2.0 while another requires a different version.

Create virtual environment

Before creating the environment you need to install several libraries:

sudo apt-get install -y python3-pip
sudo apt-get install build-essential libssl-dev libffi-dev python-dev
sudo apt-get install -y python3-venv

Create a folder for your environments. If you create the environment in your home folder you will be able to use the commands from the official documentation:

source ~/tensorflow/bin/activate

Otherwise you need to write a simple script and run it. You can check my script at the end of the post.

Creating the environment:

virtualenv --system-site-packages -p python3 tensorflow

Check the directory in which you are going to create the environment. In the example above the new environment is named tensorflow and it will be created in the current folder. If you want, you can choose a different name, such as face:

virtualenv --system-site-packages -p python3 face

And activate the environment with

source ~/tensorflow/bin/activate # bash
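As an alternative to the virtualenv command, Python 3 ships a venv module (the python3-venv package installed earlier) that can create a similar environment. This is a different tool from the one used above, so treat the snippet as a sketch. The function name create_environment is hypothetical, and system_site_packages=True is my attempt to mirror the --system-site-packages flag.

```python
import os
import venv


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment at env_dir that can also see system packages."""
    venv.create(env_dir, system_site_packages=True, with_pip=with_pip)
    # Path of the script you would pass to `source` on Linux.
    return os.path.join(env_dir, "bin", "activate")


# Example (not run automatically, because it writes to your home folder):
# print(create_environment(os.path.expanduser("~/tensorflow")))
```

The returned path is the one you would pass to source in order to activate the new environment.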

Make sure pip is version 8.1 or later by upgrading it:

easy_install -U pip
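To see whether the pip inside the environment already meets the 8.1 requirement, you can compare version numbers from Python. This is a minimal sketch with helper names of my own. It only looks at the leading numeric parts of the version string.

```python
import re


def version_tuple(version):
    """Turn '8.1.2' into (8, 1, 2), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def pip_is_recent_enough(version, minimum=(8, 1)):
    """Return True if `version` is at least `minimum`."""
    return version_tuple(version) >= minimum


if __name__ == "__main__":
    try:
        import pip
        print(pip.__version__, pip_is_recent_enough(pip.__version__))
    except ImportError:
        print("pip is not importable in this interpreter")
```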

Install tensorflow

Installing tensorflow can be done by:

pip install --upgrade tensorflow-gpu  # for Python 2.7 and GPU
pip3 install --upgrade tensorflow-gpu # for Python 3.n and GPU

Testing Tensorflow

You can activate your environment with the command below (to deactivate it, simply run the command deactivate):

source ~/tensorflow/bin/activate

and test tensorflow with this simple code:

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
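The hello world above also runs on a CPU-only install, so it does not prove that the GPU is used. A rough way to check is to list the devices tensorflow can see. The sketch below uses device_lib from tensorflow.python.client, and the wrapper function is my own. It returns an empty list when tensorflow cannot be imported (for example because of the libcublas error).

```python
def list_tensorflow_devices():
    """Return (name, device_type) pairs for the devices TensorFlow sees, or [] on import failure."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return []
    return [(device.name, device.device_type)
            for device in device_lib.list_local_devices()]


if __name__ == "__main__":
    devices = list_tensorflow_devices()
    print(devices)
    print("GPU visible:", any(kind == "GPU" for _, kind in devices))
```

If only CPU entries are listed, tensorflow was installed without GPU support or cannot reach the driver.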

How to uninstall TensorFlow

Removing TensorFlow is simple: you only need to remove the environment folder that you created (here named myVirtEnv):

rm -r myVirtEnv

Script for activating and running Tensorflow

This is a simple script which activates the tensorflow virtual environment from the folder:

  • /home/user/Software/Tensorflow/environments/tensorflow

and runs a training in:

  • /home/user/Software/Tensorflow/myproject/faceRecognition/

cd /home/user/Software/Tensorflow/environments/tensorflow
source ./bin/activate
cd /home/user/Software/Tensorflow/myproject/faceRecognition/
python3 train.py 
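The same idea can be written as a small Python launcher. For a single script, calling the interpreter inside the environment's bin folder has much the same effect as activating the environment first. The paths are the ones from the script above, while the function names are hypothetical, so take it as a sketch.

```python
import os
import subprocess

ENV_DIR = "/home/user/Software/Tensorflow/environments/tensorflow"
PROJECT_DIR = "/home/user/Software/Tensorflow/myproject/faceRecognition/"


def build_command(env_dir, script="train.py"):
    """Build the command that runs `script` with the environment's own interpreter."""
    return [os.path.join(env_dir, "bin", "python3"), script]


def run_training():
    """Run train.py from the project folder and return its exit code."""
    return subprocess.call(build_command(ENV_DIR), cwd=PROJECT_DIR)


# run_training()  # uncomment on a machine where both folders exist
```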

Errors

Error Failed to initialize NVML: Driver/library version mismatch

If you get the error:

Failed to initialize NVML: Driver/library version mismatch

you most probably need to restart your computer so that the video driver information is updated.

E cuda_driver.cc:466] failed call to cuInit: CUDA_ERROR_NO_DEVICE

The problem is most probably related to the environment variable CUDA_VISIBLE_DEVICES. Check what is set for CUDA_VISIBLE_DEVICES and, if needed, set it to 0 (note that bash does not allow spaces around the = sign):

export CUDA_VISIBLE_DEVICES=0

If this is not the case, check the NVIDIA driver with:

nvidia-smi

Not loading or using latest nvidia driver

If tensorflow is not using the latest NVIDIA driver, the training of neural nets will take much longer. To verify which driver is in use, check:

nvidia-smi

and

dpkg --get-selections | grep nvidia

If you see several versions, like libnvidia-common-390 and libnvidia-common-396, it's better to remove one of them. In my case I removed 396 and this solved the issue after a restart.
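Spotting mixed versions in a long package list by eye is easy to get wrong, so here is a small sketch that collects the numeric suffixes of installed nvidia packages. It assumes the usual two-column output of dpkg --get-selections (package name, then state). The function name is my own, and the sample text in the usage comment is made up from the package names mentioned above.

```python
import shutil
import subprocess


def installed_driver_versions(dpkg_output):
    """Return the sorted numeric suffixes of installed nvidia/libnvidia packages."""
    versions = set()
    for line in dpkg_output.splitlines():
        fields = line.split()
        if len(fields) != 2 or fields[1] != "install":
            continue
        package = fields[0].split(":")[0]  # drop an architecture suffix such as :amd64
        if not package.startswith(("nvidia-", "libnvidia-")):
            continue
        suffix = package.rsplit("-", 1)[-1]
        if suffix.isdigit():
            versions.add(suffix)
    return sorted(versions)


if __name__ == "__main__":
    if shutil.which("dpkg"):
        output = subprocess.check_output(["dpkg", "--get-selections"],
                                         universal_newlines=True)
        print(installed_driver_versions(output))
```

More than one number in the result, for example 390 and 396, points to the mixed installation described above.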

InvalidArgumentError (see above for traceback): Cannot assign a device to node 'MatMul_1'

You may need to check your GPU information in order to avoid this error:

InvalidArgumentError (see above for traceback): Cannot assign a device to node 'MatMul_1': Could not satisfy explicit device specification '/device:GPU:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0

One solution is to make your GPU visible to CUDA by exporting the variable in the terminal where you run tensorflow:

export CUDA_VISIBLE_DEVICES=0
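The variable can also be set from inside the Python script itself, as long as this happens before tensorflow is imported. A minimal sketch, assuming the GPU you want is device 0:

```python
import os

# Set the variable before TensorFlow is imported, because CUDA reads it
# when it initializes and ignores later changes.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # expose only the first GPU

# import tensorflow as tf  # import TensorFlow only after the variable is set
print("CUDA_VISIBLE_DEVICES =", os.environ["CUDA_VISIBLE_DEVICES"])
```

This keeps the setting together with the training code, so it does not depend on which terminal the script is started from.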

Resources and additional information