modprobe: FATAL: Module nvidia-uvm not found in directory /lib/modules/

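After successfully installing and testing Tensorflow compiled with GPU support, a problem appeared recently.

After rebooting the machine, I got the following error message when I tried to run a Tensorflow program: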

... 
('Extracting', 'MNIST_data/t10k-labels-idx1-ubyte.gz') 
modprobe: FATAL: Module nvidia-uvm not found in directory /lib/modules/4.4.0-34-generic 
E tensorflow/stream_executor/cuda/cuda_driver.cc:491] failed call to cuInit: CUDA_ERROR_UNKNOWN 
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:140] kernel driver does not appear to be running on this host (caffe-desktop): /proc/driver/nvidia/version does not exist 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:92] No GPU devices available on machine. 
(0, 114710.45) 
(1, 95368.891) 
... 
(98, 56776.922) 
(99, 57289.672)

Screencapture of error

Code: https://github.com/llSourcell/autoencoder_demo

Question: Why would rebooting the Ubuntu 16.04 machine break Tensorflow?

Source

2016-08-18 Cantren

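I actually solved my own problem and want to share the solution that worked for me.

The magic Google search was: "modprobe: FATAL: Module nvidia-uvm not found in directory /lib/modules/"

That led me to the following answer on askubuntu: https://askubuntu.com/a/496146

The author of that answer, Sneetsher, explains it well, so if the link doesn't 404 I would start there.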

Cliff notes

Diagnosis: I suspect Ubuntu installed a kernel update that took effect when I rebooted.
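One way to test that suspicion is to look for the nvidia-uvm module under the directory named in the modprobe error. The following is a minimal sketch, not part of the original answer: the function names are hypothetical, and it assumes the module file is called nvidia-uvm.ko or nvidia_uvm.ko (possibly compressed) somewhere below /lib/modules/<kernel release>.

```shell
# Print the paths of any nvidia-uvm module files for one kernel release.
# $1: kernel release (default: the running kernel, from `uname -r`)
# $2: modules root (default: /lib/modules, as in the modprobe error)
nvidia_uvm_modules_for_kernel() {
    uvm_release="${1:-$(uname -r)}"
    uvm_root="${2:-/lib/modules}"
    # Assumed file names: nvidia-uvm.ko or nvidia_uvm.ko, optionally compressed
    find "$uvm_root/$uvm_release" -name 'nvidia[-_]uvm.ko*' 2>/dev/null
}

# Print "found" or "missing" for the same kernel release.
check_nvidia_modules() {
    if [ -n "$(nvidia_uvm_modules_for_kernel "$@")" ]; then
        echo "found"
    else
        echo "missing"
    fi
}
```

Calling check_nvidia_modules with no arguments checks the running kernel. If it prints "missing" while an older release listed by ls /lib/modules prints "found", that is consistent with a kernel update having left the driver behind.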

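Solution: Reinstalling the NVIDIA driver fixed the error.

Problem:

Paraphrasing the askubuntu answer: the NVIDIA driver can't be installed while the X server is running.

Two different methods for fixing the NVIDIA driver: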

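1) With keyboard and monitor attached:

1) Switch to a text-only console (Ctrl+Alt+F1, or any up to F6).

2) Build and install the driver modules for the current kernel (the one that was just installed): sudo ./<DRIVER>.run -K

Credit to "Sneetsher": https://askubuntu.com/a/496146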

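I don't have a keyboard or monitor attached to this computer, so here is the "headless" approach that I actually used:

2) Over SSH:

Follow this guide to reboot into the console: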

http://ubuntuhandbook.org/index.php/2014/01/boot-into-text-console-ubuntu-linux-14-04/

$ sudo cp -n /etc/default/grub /etc/default/grub.orig
$ sudo nano /etc/default/grub
$ sudo update-grub

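Edit the grub file according to the link above (3 changes):

Comment out the line GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" by adding # at the beginning; this disables the Ubuntu purple screen.

Change GRUB_CMDLINE_LINUX="" to GRUB_CMDLINE_LINUX="text"; this makes Ubuntu boot directly into text mode.

Uncomment the line #GRUB_TERMINAL=console by removing the # at the beginning; this puts the GRUB menu into true black-and-white text mode (no background image).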
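The three edits can also be applied without an interactive editor, which may be convenient over SSH. This is a minimal sketch, not the method from the guide: grub_text_mode is a hypothetical helper, and it assumes the three lines appear in the file exactly as quoted above. It only prints the edited text and leaves the input file untouched.

```shell
# Print a text-boot version of a grub defaults file to stdout.
# $1: path to the file (default: /etc/default/grub); the file is not modified.
grub_text_mode() {
    # 1st expression: comment out the "quiet splash" line (& is the matched text)
    # 2nd expression: boot straight into text mode
    # 3rd expression: uncomment GRUB_TERMINAL=console
    sed \
        -e 's/^GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"/#&/' \
        -e 's/^GRUB_CMDLINE_LINUX=""/GRUB_CMDLINE_LINUX="text"/' \
        -e 's/^#GRUB_TERMINAL=console/GRUB_TERMINAL=console/' \
        "${1:-/etc/default/grub}"
}
```

A cautious way to use it is to redirect the output to a scratch file, compare it with the original using diff, and only then copy it over /etc/default/grub with sudo and run sudo update-grub.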

UPDATE: if running Ubuntu 16.04, also run:
$ sudo systemctl set-default multi-user.target

Reboot into the console:

$ sudo shutdown -r now
$ sudo service lightdm stop
$ sudo ./<DRIVER>.run

Follow the NVIDIA driver installer prompts, then restore the original GRUB configuration:

$ sudo mv /etc/default/grub /etc/default/grub.textonly
$ sudo mv /etc/default/grub.orig /etc/default/grub
$ sudo update-grub
$ sudo shutdown -r now
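The two mv commands fail awkwardly if the grub.orig backup was never created, so a guarded version may be safer. This is a minimal sketch under that assumption: restore_grub_defaults is a hypothetical helper, and the directory argument exists only so it can be tried on a scratch copy first.

```shell
# Swap the saved grub defaults file back into place.
# $1: directory holding the files (default: /etc/default)
restore_grub_defaults() {
    grub_dir="${1:-/etc/default}"
    # Refuse to touch anything when there is no backup to restore
    if [ ! -f "$grub_dir/grub.orig" ]; then
        echo "no backup at $grub_dir/grub.orig, nothing changed" >&2
        return 1
    fi
    # Keep the text-only file for next time, then restore the original
    mv "$grub_dir/grub" "$grub_dir/grub.textonly" &&
        mv "$grub_dir/grub.orig" "$grub_dir/grub"
}
```

On the real files it needs root, and sudo update-grub still has to be run afterwards. One addition that is not in the original answer: if sudo systemctl set-default multi-user.target was used on Ubuntu 16.04, sudo systemctl set-default graphical.target should switch the default back to the graphical session.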

Results (should look something like this, now that the GPU is successfully detected):

...
('Extracting', 'MNIST_data/t10k-labels-idx1-ubyte.gz')
I tensorflow/core/common_runtime/gpu/gpu_init.cc:118] Found device 0 with properties:
name: GeForce GTX 970
major: 5 minor: 2 memoryClockRate (GHz) 1.342
pciBusID 0000:01:00.0
Total memory: 3.94GiB
Free memory: 3.88GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:138] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:148] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:868] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 970, pci bus id: 0000:01:00.0)
(0, 113040.92)
(1, 94895.867)
...
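The failing run in the question reported that /proc/driver/nvidia/version does not exist, so checking that file is a quick way to confirm the driver is loaded before starting Tensorflow. A minimal sketch, with hypothetical function names and an optional path argument added only so it can be tried against a test file:

```shell
# Succeed when the NVIDIA kernel driver's version file is present.
# $1: path to check (default: /proc/driver/nvidia/version)
nvidia_driver_loaded() {
    [ -e "${1:-/proc/driver/nvidia/version}" ]
}

# Print a one-line status for the same file.
nvidia_driver_status() {
    version_file="${1:-/proc/driver/nvidia/version}"
    if nvidia_driver_loaded "$version_file"; then
        # The first line of the file carries the driver version string
        echo "loaded: $(head -n 1 "$version_file")"
    else
        echo "not loaded: $version_file does not exist"
    fi
}
```

Running nvidia_driver_status with no arguments after the final reboot should report "loaded" if the reinstall worked.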

Screencapture of the same

Source

2016-08-18 17:50:43 Cantren

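Hi, I'm facing exactly the same problem. I followed your instructions, but I didn't understand the line sudo ./<DRIVER>.run. Should I run it exactly as written, or should I replace something, or should I first download some driver file and then run it...? – m0j1

<DRIVER>.run is the filename of the NVIDIA driver downloaded from the NVIDIA website. In my case the filename is "NVIDIA-Linux-x86_64-367.27.run". It can be downloaded from http://www.nvidia.com/download/driverResults.aspx/104284/en-us assuming you have a 64-bit x86 CPU and don't need a newer driver version. If 367.27 no longer suits your purposes, you may need to substitute a newer driver version. – Cantren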
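To make the <DRIVER> placeholder concrete, the filename can be built from the driver version. This is a minimal sketch: nvidia_run_filename is a hypothetical helper, and the naming pattern is inferred only from the single filename quoted in the comment above, so it is an assumption for other versions or architectures.

```shell
# Build the installer filename for a driver version.
# $1: driver version, e.g. 367.27
# $2: architecture label (default: x86_64, the only one named in the thread)
nvidia_run_filename() {
    echo "NVIDIA-Linux-${2:-x86_64}-$1.run"
}
```

For example, nvidia_run_filename 367.27 prints NVIDIA-Linux-x86_64-367.27.run. The downloaded file usually has to be made executable with chmod +x before sudo ./<DRIVER>.run will start.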

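An easy solution to "Problem: NVIDIA drivers can't be installed with the X server running" is to use SSH from another computer:

Access the Ubuntu computer over SSH
Disconnect the screen (display device) from the Ubuntu computer
Restart the computer with sudo reboot, then access it again

Source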

2016-11-16 20:30:03 withr
