Bugzilla – Bug 3316
multiple gdmgreeters prevent login
Last modified: 2008-12-09 14:36:01 UTC
i'm currently running opensolaris 2008.11 snv_97. (although i saw this problem with previous builds as well.) my system has a nvidia graphics card with two displays connected to it. i do not run the displays in xinerama mode.

whenever my system starts up, a gdm window pops up, and if i log in the second the window pops up, then everything looks ok. but if i wait just a second (or if i mistype my login or password), then all of a sudden the screen flashes blank and the login screen appears again, and all the state associated with the previous gdm login window (ie, any username or password characters i typed in) is gone. this seems to be because a second gdmgreeter process has been started and is taking over the display from the first gdmgreeter process.

here are the processes that i see associated with gdm:
---8<---
STATE          STIME    FMRI
online         10:00:16 svc:/application/graphical-login/gdm:default
               10:00:16   101507 gdm-binary
               10:01:55   101667 gdm-binary
               10:01:56   101668 Xorg
               10:01:56   101692 gdm-binary
               10:01:58   101706 gdmgreeter
               10:02:06   101722 gdmgreeter
---8<---

if i try to log in after the second gdmgreeter process appears, then login fails. after i enter my password, the login screen hangs for a while, the screen flashes, and i'm back to the login screen again.

i thought that perhaps this problem was due to the fact that i have two screens and gdm-binary might be trying to start a copy of gdmgreeter for each screen. (right now, both copies only run on my left display.) so i tried modifying /etc/X11/xorg.conf and commenting out the configuration related to the second display, but this didn't fix the problem.
i just upgraded to snv_98 and i'm still seeing this problem. i also got personal email from someone working for the us govt asking me about this bug since he was hitting it as well. (so it's not just me. ;)
a workaround:
---8<---
cd /usr/lib
mv gdmgreeter gdmgreeter.bin
root@mcescher$ cat >gdmgreeter<<'EOF'
#!/bin/sh
# wait until no other greeter is running, then start the real one
while :; do
        pgrep -x gdmgreeter.bin >/dev/null
        [ $? != 0 ] && break
        sleep 5
done
exec /usr/lib/gdmgreeter.bin "$@"
EOF
chmod a+x gdmgreeter
svcadm restart gdm
---8<---
i just image-update'd to snv_99 and i'm still seeing this issue.
This does sound strange. Could you attach your /etc/X11/gdm/custom.conf file so I can look at it?

Note that GDM does not "automagically" manage multiple displays. It should only try to manage the displays set up in your GDM configuration file. However, GDM does have some code which calls XQueryExtension to see if Xinerama is on, and tries to manage via Xinerama if the extension is available and XineramaGetInfo returns multiple screens. You say you aren't using Xinerama, but this is the only place I see where GDM might automagically try to manage additional displays.

It would also be good to run gdmsetup as root, then enable debug messages by checking the box on the "Security" tab. Then reboot, recreate the problem of the two greeters being displayed, and attach the gdm-related messages at the bottom of your syslog (/var/adm/messages) so I can review them. This might help me understand why it is launching a second greeter.
Created an attachment (id=578): /etc/X11/gdm/custom.conf file from mcescher
i attached my /etc/X11/gdm/custom.conf file, but it's all just comments and section headers. there's no actual content in any of the sections. tomorrow i can try running gdmsetup.
Created an attachment (id=589): gdm related DEBUG messages from /var/adm/messages
i've enabled debugging logging via gdmsetup and attached the DEBUG entries from my /var/adm/messages file.
Created an attachment (id=590): gdm related DEBUG messages from /var/adm/messages
re-attaching a new gdm DEBUG message log file. (the previous one was invalid. sorry.)
Created an attachment (id=591): gdm related DEBUG messages from /var/adm/messages
sigh. re-attaching a new gdm DEBUG message log file. (third time is the charm...)
looking at the log file i attached, i see that a second gdmgreeter is being spawned. it performs exactly the same actions as the first greeter that is spawned. unfortunately, there don't seem to be any messages (that i see) in the log file that indicate why the second greeter is spawned. so i ran the following dscript and restarted the gdm login processes:
---8<---
dtrace -w \
    -n 'syscall::exece:entry/copyinstr(arg0) == "/usr/lib/gdmgreeter"/
    {
        stop();
        printf("\n");
        system("pgrep -xo gdm-binary | xargs ptree");
        system("pstack %d", pid);
        system("prun %d", pid);
    }'
---8<---

here's the output that i got:
---8<---
CPU     ID                    FUNCTION:NAME
  1    850                      exece:entry
103280 /usr/sbin/gdm-binary
103281 /usr/sbin/gdm-binary
103282 /usr/X11/bin/Xorg :0 -depth 24 -nolisten tcp -audit 0 -br -auth /var/lib/gdm/:0
103306 /usr/sbin/gdm-binary
103318 /usr/sbin/gdm-binary
103318: /usr/sbin/gdm-binary
 feefdab7 execve (80f2630, 80f1008) + 7
 0806f1cc exec_command (80f93e8, 0) + a4
 0806f88d gdm_slave_greeter (48f799eb, 80a05e8, 0, ffffffff, 80473ac, feeead38) + 6a5
 0806c9b1 gdm_slave_run (80f4600) + 805
 0806ba18 gdm_slave_start (80f4600, 80f4600, 80f21f8, 80f4438, 8077a83, 80f3980) + 410
 08069e86 gdm_display_manage (80f4600) + 1de
 0805edf8 gdm_start_first_unborn_local (0, 8047e94, 8047e24, feffb7dc, 8047e3c, fefb0680) + 74
 08060e0e main (1, 8047e68, 8047e70) + 69e
 0805ea0e _start (1, 8047efc, 0, 8047f11, 8047f1e, 8047f36) + 7a

  0    850                      exece:entry
103280 /usr/sbin/gdm-binary
103281 /usr/sbin/gdm-binary
103282 /usr/X11/bin/Xorg :0 -depth 24 -nolisten tcp -audit 0 -br -auth /var/lib/gdm/:0
103306 /usr/sbin/gdm-binary
103338 /usr/sbin/gdm-binary
103318 /usr/lib/gdmgreeter
103338: /usr/sbin/gdm-binary
 feefdab7 execve (80f2630, 80f1008) + 7
 0806f1cc exec_command (80f93e8, 0) + a4
 0806f88d gdm_slave_greeter (48f799eb, 80a05e8, 0, ffffffff, 80473ac, feeead38) + 6a5
 0806c9b1 gdm_slave_run (80f4600) + 805
 0806ba18 gdm_slave_start (80f4600, 80f4600, 80f21f8, 80f4438, 8077a83, 80f3980) + 410
 08069e86 gdm_display_manage (80f4600) + 1de
 0805edf8 gdm_start_first_unborn_local (0, 8047e94, 8047e24, feffb7dc, 8047e3c, fefb0680) + 74
 08060e0e main (1, 8047e68, 8047e70) + 69e
 0805ea0e _start (1, 8047efc, 0, 8047f11, 8047f1e, 8047f36) + 7a
---8<---
i think i found the problem. looking at the gdm process tree on my system i see:
---8<---
103527 /usr/sbin/gdm-binary
103536 /usr/sbin/gdm-binary
103545 /usr/X11/bin/Xorg :0 -depth 24 -nolisten tcp -audit 0 -br -auth /var/
103577 /usr/sbin/gdm-binary
103665 /usr/lib/gdmgreeter
103621 /usr/lib/gdmgreeter
---8<---

so we see that there are three gdm-binary processes, two of which have spawned gdmgreeter processes. so the question i then asked was why is the third gdm-binary process being started? to figure this out i used the following dscript:
---8<---
dtrace -w \
    -n 'syscall::exece:entry/execname == "gdm-binary"/
    {
        trace(pid);
    }' \
    -n 'syscall::fork*:entry/execname == "gdm-binary"/
    {
        stop();
        printf("\n");
        system("pgrep -xo gdm-binary | xargs ptree");
        system("pstack %d", pid);
        system("prun %d", pid);
    }'
---8<---

and after filtering through the output i discovered the following stack:
---8<---
  1    990                    forksys:entry
103527 /usr/sbin/gdm-binary
103536 /usr/sbin/gdm-binary
103545 /usr/X11/bin/Xorg :0 -depth 24 -nolisten tcp -audit 0 -br -auth
103536: /usr/sbin/gdm-binary
 feefe85b __forkx () + b
 feeeb6ea fork () + 1a
 0807668b gdm_exec_fbconsole () + 4f
 080774f1 gdm_server_start () + 39d
 0806c25d gdm_slave_run () + b1
 0806ba18 gdm_slave_start () + 410
 08069e86 gdm_display_manage (80f4600) + 1de
 0805edf8 gdm_start_first_unborn_local () + 74
 08060e0e main () + 69e
 0805ea0e _start () + 7a
  1    850                      exece:entry    103577
---8<---

if you look at the source for gdm_exec_fbconsole(), you see that it blindly tries to exec /usr/openwin/bin/fbconsole, which doesn't exist on my system. i say blindly, because it never checks for failure from exec(). so when the exec fails, the forked child keeps running as another gdm-binary process and spawns another gdmgreeter process. oops.

so the new workaround for this bug becomes:
---8<---
ln -s /bin/true /usr/openwin/bin/fbconsole
svcadm restart gdm
---8<---
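To make the failure mode above concrete, here is a minimal C sketch of the usual fork/exec pattern with the failure check that the comment says gdm_exec_fbconsole() lacks. It is not GDM's code: `run_helper` is a hypothetical name, and the sketch assumes only standard POSIX calls. The key point is that an exec call returns only on failure, so the child has to `_exit()` right away. Otherwise it falls through and keeps running a copy of the parent's code.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from GDM): run `path` in a child
 * process and return its exit status, or -1 if fork/wait fails or the
 * child did not exit normally.
 */
int run_helper(const char *path)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: execl() only returns if the exec failed. */
        execl(path, path, (char *)NULL);
        perror(path);
        /*
         * Without this _exit(), the child would carry on executing the
         * parent's code path, which is the behavior this report
         * describes (a stray gdm-binary starting its own greeter).
         */
        _exit(127);
    }

    /* Parent: collect the child's status. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `run_helper("/usr/openwin/bin/fbconsole")` on a system without that file would print an error and return 127 instead of leaving a second copy of the caller running. The value 127 is just the shell's convention for "command not found", chosen here for illustration.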
I have fixed this bug in upstream GDM, and also in our vermillion builds, so that GDM will just avoid calling fbconsole if /usr/openwin/bin/fbconsole can't be found. The fix will arrive the next time GDM is updated in Nevada/OpenSolaris. However, this fix won't go into the upcoming OpenSolaris release because they've addressed the issue there by creating a /usr/openwin/bin/fbconsole symlink, so the issue should never happen.
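The approach described here (skip fbconsole when the binary is missing) can be sketched roughly as follows. This is an illustration only, not the actual upstream patch, which is not shown in this report. `helper_available` and `maybe_start_fbconsole` are hypothetical names, and the sketch assumes POSIX `access()`.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical check: returns 1 if `path` exists and the caller may
 * execute it, 0 otherwise.
 */
int helper_available(const char *path)
{
    return access(path, X_OK) == 0;
}

/*
 * Hypothetical wrapper: returns 1 if the helper would be launched,
 * 0 if it was skipped because the binary is missing.
 */
int maybe_start_fbconsole(const char *path)
{
    if (!helper_available(path)) {
        fprintf(stderr, "%s not found, skipping\n", path);
        return 0;
    }
    /* The fork()/exec() of the helper would go here. */
    return 1;
}
```

A check like this only narrows the window: the file could still disappear between `access()` and the exec, so the child should still exit if the exec fails.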