
I have updated my nvidia driver to nvidia-driver-525-open using the GUI (which was marked as tested).

After a reboot I no longer get a graphical login screen, just a blinking cursor. Using Ctrl+Alt+F4 I've been able to switch to a terminal login.

I've had a similar issue before, where I tried to fix it myself and ended up with a system that no longer booted at all. This time I decided to find out as much as I could first and then ask for help.

First I looked at gdm which is AFAIK what actually shows the login screen.

systemctl status gdm3 produces the following output:

gdm.service - GNOME Display Manager
    Loaded: loaded (/lib/systemd/system/gdm.service; static)
    Active: active (running) since Sat 2023-04-08 13:44:18 CEST; 42min ago
   Process: 1086 ExecStartPre=/usr/share/gdm/generate-config (code=exit, status=0/SUCCESS)
  Main PID: 1091 (gdm3)
     Tasks: 3 (limit: 18407)
    Memory: 4.6M
       CPU: 987ms
    CGroup: /system.slice/gdm.service
            |-1091 /usr/sbin/gdm3

Apr 08 13:47:09 computername gdm-launch-environment] [2464]: pam_unix(gdm-launch-environment:session): session closed for user gdm
Apr 08 13:47:09 computername gdm3[1091]: Gdm: GdmDisplay: Session never registered, failing
Apr 08 13:47:09 computername gdm-launch-environment] [2464]: GLib-GObject: g_object_unref: assertion 'G_IS_OBJECT (object)' failed
Apr 08 13:47:09 computername gdm3[2653]: modprobe: FATAL: Module nvidia not found in directory /lib/modules/5.19.0-35-generic
Apr 08 13:47:09 computername gdm3[1091]: Gdm: GdmLocalDisplayFactory: maximum number of X display failures reached: check X server log for errors
Apr 08 13:47:09 computername gdm3[1091]: Gdm: Child process -2531 was already dead.
Apr 08 13:47:09 computername gdm3[1091]: Gdm: GdmDisplay: Session never registered, failing
Apr 08 13:47:09 computername gdm3[2998]: modprobe: FATAL: Module nvidia not found in directory /lib/modules/5.19.0-35-generic
Apr 08 13:47:09 computername gdm3[3028]: modprobe: FATAL: Module nvidia not found in directory /lib/modules/5.19.0-35-generic
Apr 08 13:47:09 computername gdm3[1091]: Gdm: Child process -2531 was already dead.

modprobe nvidia also outputs modprobe: FATAL: Module nvidia not found in directory /lib/modules/5.19.0-35-generic

While researching this error I found a related question. There, a user asked for the output of dkms status (nothing for me) and uname -r (5.19.0-35-generic).
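The mismatch hinted at by these two commands can be checked directly. The following is a minimal sketch, assuming the module packages follow the linux-modules-nvidia-<driver>-<kernel> naming seen in the command quoted further down. Both function names are hypothetical helpers, not part of any Ubuntu tool.

```shell
# Sketch: does any installed NVIDIA module package target a given kernel?
# Usage: has_nvidia_modules KERNEL_RELEASE PACKAGE_NAME...
has_nvidia_modules() {
    kernel="$1"
    shift
    for pkg in "$@"; do
        case "$pkg" in
            # Matches e.g. linux-modules-nvidia-525-5.19.0-35-generic
            linux-modules-nvidia-*-"$kernel") return 0 ;;
        esac
    done
    return 1
}

# Print the names of fully installed ("ii") packages matching that scheme.
installed_nvidia_module_packages() {
    dpkg-query -W -f='${db:Status-Abbrev} ${Package}\n' 'linux-modules-nvidia-*' 2>/dev/null |
        awk '$1 == "ii" { print $2 }'
}

# Example call (left commented out so that sourcing this file has no side effects):
# has_nvidia_modules "$(uname -r)" $(installed_nvidia_module_packages) \
#     || echo "no NVIDIA modules installed for $(uname -r)"
```

If the check fails for the running kernel but succeeds for another installed one, booting that other kernel is a lower-risk first step than installing more packages.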

The first answer there suggests running the following command (with the versions exchanged for my own).

sudo apt install linux-modules-nvidia-510-5.17.0-1020-oem

However, I believe that was the point where it all went wrong last time, so I'm hesitant to try it again.

Any help and/or advice would be appreciated!

EDIT 1: I've done a dry run of the linux module install (sudo apt-get install --dry-run linux-modules-nvidia-525-5.19.0-35-generic) to try to figure out what it would actually do. It would also install linux-objects-nvidia-525-5.19.0-35-generic
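To see at a glance what such a dry run would pull in, the simulation output can be filtered. This is a small sketch that relies on apt-get's simulation mode prefixing each package it would install with "Inst"; packages_to_install is a hypothetical helper name.

```shell
# Sketch: read `apt-get install --dry-run` output on stdin and print
# only the names of the packages that would be installed.
packages_to_install() {
    awk '$1 == "Inst" { print $2 }'
}

# Example call (commented out; the package name is the one from the dry run above):
# sudo apt-get install --dry-run linux-modules-nvidia-525-5.19.0-35-generic | packages_to_install
```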

EDIT 2: Looking through the installed packages I noticed that the NVIDIA packages were installed for kernel version 5.19.0-38. Booting with this kernel version gets me a login screen again, but neither my Nvidia graphics card nor my processor's integrated graphics is being used. The screen is extremely low resolution and the About page in Ubuntu's settings reports Graphics as "llvmpipe (LLVM 15.0.6, 256 bits)".

HBnet
  • 91

1 Answer


The initial issue: the NVIDIA kernel modules for the kernel version I was booting (5.19.0-35) were missing.

While scrolling through the installed apt packages I noticed that the modules were installed for version 5.19.0-38. Due to an issue I had half a year ago, I have GRUB set up to boot the last manually selected kernel version. Ubuntu seems not to have taken the kernel actually in use into account and just installed the packages for the newest one.
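For anyone unsure whether their own GRUB is configured this way: one common way to get "boot the last selected entry" behaviour is GRUB_DEFAULT=saved together with GRUB_SAVEDEFAULT=true in /etc/default/grub. Whether my setup used exactly these settings is an assumption here. The sketch below only reads the values; grub_setting is a hypothetical helper name.

```shell
# Sketch: print the value of one variable from a GRUB defaults file.
# Usage: grub_setting VARIABLE_NAME FILE
grub_setting() {
    # Take the last assignment if the variable is set more than once,
    # and strip surrounding quotes.
    sed -n "s/^$1=//p" "$2" | tail -n 1 | tr -d "\"'"
}

# Example calls (commented out; /etc/default/grub is the usual location on Ubuntu):
# grub_setting GRUB_DEFAULT /etc/default/grub
# grub_setting GRUB_SAVEDEFAULT /etc/default/grub
```

If the first call prints "saved", the kernel chosen at the last boot is reused, so a newly installed kernel is not selected automatically.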

Solution for this initial problem: enter the GRUB menu and select kernel version 5.19.0-38. Result: the kernel boots and I get a graphical login screen, but my graphics card is not being used. Ubuntu is using llvmpipe for graphics.

Based on this I can also form a hypothesis about the problem I had half a year ago. Ubuntu might have installed a new kernel version, which was then booted automatically, without installing the nvidia kernel modules for that new kernel version. I don't know if that is what happened, but it would make sense to me.

The llvmpipe issue: using sudo dmesg | grep nvidia-driver I found the following message: Open nvidia.ko is only ready for use on Data Center GPUs

After finding a thread about this message on the NVIDIA forums, I installed the non-open version of the driver instead, and things seem to be back to normal.
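The check and the switch can be sketched as follows. This assumes the non-open package has the same name without the -open suffix (nvidia-driver-525 for nvidia-driver-525-open). Both function names are hypothetical helpers.

```shell
# Sketch: succeed if the kernel log on stdin contains the message quoted above.
open_module_rejected() {
    grep -q 'Open nvidia.ko is only ready for use on Data Center GPUs'
}

# Sketch: derive the non-open package name by dropping a trailing "-open".
closed_variant() {
    printf '%s\n' "${1%-open}"
}

# Example calls (commented out; review the dry run before installing for real):
# sudo dmesg | open_module_rejected && echo "open kernel module refused this GPU"
# sudo apt-get install --dry-run "$(closed_variant nvidia-driver-525-open)"
```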

HBnet
  • 91