Serguei Kolos
2010-09-30 19:53:21 UTC
Hi,
From time to time my CORBA server application deadlocks during the
ORB::shutdown operation. GDB shows that the deadlock is caused by two
threads that try to lock the same two mutexes in opposite order:
1. The main thread, which is executing ORB::shutdown, locks the
giopServer::pd_lock mutex (in giopServer::stop(), which then calls
deactivate()) and then hangs waiting for the
SocketCollection::pd_collection_lock mutex in the
SocketCollection::wakeUp() call.
2. Another thread locks the SocketCollection::pd_collection_lock mutex
in SocketCollection::Select (line 510) and then calls
tcpEndpoint::notifyReadable (SocketCollection.cc, line 556), which ends
up in giopServer::notifyRzReadable, where it hangs trying to lock the
giopServer::pd_lock mutex (giopServer.cc, line 1090).
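To make the lock ordering easier to see, here is a minimal standalone sketch of the same pattern. It is not omniORB code: server_lock and collection_lock are hypothetical stand-ins for giopServer::pd_lock and SocketCollection::pd_collection_lock, and the helper names are invented for illustration. It uses std::timed_mutex with a timeout so that the sketch reports the inversion instead of hanging the way the real server does.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Outcome of each thread's attempt to take its second lock.
struct InversionResult {
  bool shutdown_got_collection_lock = false;
  bool select_got_server_lock = false;
};

// Spin until `counter` reaches `target`; a tiny barrier for two threads.
inline void wait_until(const std::atomic<int>& counter, int target) {
  assert(target > 0);
  while (counter.load() < target) std::this_thread::yield();
}

inline InversionResult demonstrate_lock_inversion() {
  std::timed_mutex server_lock;      // stand-in for giopServer::pd_lock
  std::timed_mutex collection_lock;  // stand-in for pd_collection_lock
  std::atomic<int> holding{0};       // threads holding their first lock
  std::atomic<int> tried{0};         // threads done with the second attempt
  InversionResult result;
  const std::chrono::milliseconds timeout(50);

  // Shutdown path: server lock first, then the collection lock.
  std::thread shutdown_thread([&] {
    std::lock_guard<std::timed_mutex> first(server_lock);
    ++holding;
    wait_until(holding, 2);  // make sure the other thread holds its lock too
    if (collection_lock.try_lock_for(timeout)) {
      result.shutdown_got_collection_lock = true;
      collection_lock.unlock();
    }
    ++tried;
    wait_until(tried, 2);  // keep the first lock until both attempts finish
  });

  // Select path: collection lock first, then the server lock.
  std::thread select_thread([&] {
    std::lock_guard<std::timed_mutex> first(collection_lock);
    ++holding;
    wait_until(holding, 2);
    if (server_lock.try_lock_for(timeout)) {
      result.select_got_server_lock = true;
      server_lock.unlock();
    }
    ++tried;
    wait_until(tried, 2);
  });

  shutdown_thread.join();
  select_thread.join();
  return result;
}
```

Calling demonstrate_lock_inversion() returns with both flags false, because each thread still holds the lock the other one wants. With a plain blocking lock(), as in the traces below, neither thread would ever return.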
Complete stack traces for the threads are given at the end of the message.
I'm using omniORB 4.1.3 on SLC5 Linux (kernel 2.6.18) with gcc43.
Is this a known issue that has already been fixed in omniORB 4.1.4?
Cheers,
Sergei
Thread 3 (Thread 12096):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0x004656e9 in __lll_lock_wait () from /lib/libpthread.so.0
#2 0x00460d9f in _L_lock_885 () from /lib/libpthread.so.0
#3 0x00460c66 in pthread_mutex_lock () from /lib/libpthread.so.0
#4 0x0807bed1 in omni_mutex::lock (this=0x905f858)
at
/afs/cern.ch/atlas/project/tdaq/cmt/tdaq/nightly/installed/include/omnithread.h:220
#5 0x0807beff in omni_mutex_lock::omni_mutex_lock (this=0xfff66898, m=...)
at
/afs/cern.ch/atlas/project/tdaq/cmt/tdaq/nightly/installed/include/omnithread.h:260
#6 0xf7d9fa25 in omni::SocketCollection::wakeUp (this=0x905f850) at
../src/lib/omniORB/orbcore/SocketCollection.cc:615
#7 0xf7dd8824 in omni::tcpEndpoint::Poke (this=0x905f848) at
../src/lib/omniORB/orbcore/tcp/tcpEndpoint.cc:671
#8 0xf7d326ef in omni::giopRendezvouser::terminate (this=0x905ff90)
at ../src/lib/omniORB/orbcore/giopRendezvouser.cc:136
#9 0xf7dd0d7a in omni::giopServer::deactivate (this=0x905f5c8) at
../src/lib/omniORB/orbcore/giopServer.cc:671
#10 0xf7dd18c2 in omni::giopServer::stop (this=0x905f5c8) at
../src/lib/omniORB/orbcore/giopServer.cc:426
#11 0xf7d84c3e in omni::omniObjAdapter::adapterInactive (this=0x905fe5c)
at ../src/lib/omniORB/orbcore/objectAdapter.cc:575
#12 0xf7db27e9 in omni::omniOrbPOA::do_destroy (this=0x905fe50,
etherealize_objects=true)
at ../src/lib/omniORB/orbcore/poa.cc:2512
#13 0xf7db2cc0 in omni::omniOrbPOA::destroy (this=0x905fe50,
etherealize_objects=true, wait_for_completion=true)
at ../src/lib/omniORB/orbcore/poa.cc:927
#14 0xf7da48a9 in omni::omniOrbPOA::shutdown () at
../src/lib/omniORB/orbcore/poa.cc:4180
#15 0xf7d6ecc5 in omniOrbORB::actual_shutdown (this=0x905f5a8) at
../src/lib/omniORB/orbcore/corbaOrb.cc:1013
#16 0xf7d6ee0d in omniOrbORB::do_shutdown (this=0x905f5a8,
wait_for_completion=true)
at ../src/lib/omniORB/orbcore/corbaOrb.cc:1075
#17 0xf7d6f619 in omniOrbORB::shutdown (this=0x905f5a8,
wait_for_completion=true)
at ../src/lib/omniORB/orbcore/corbaOrb.cc:869
#18 0xf7ed1abe in IPCCore::shutdown () at ../src/core.cc:350
#19 0xf7ed1cc1 in IPCCore::atExitFun () at ../src/core.cc:398
#20 0x002f8da9 in exit () from /lib/libc.so.6
#21 0x002e2ea4 in __libc_start_main () from /lib/libc.so.6
#22 0x08068801 in _start ()
Thread 2 (Thread 12099):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0x00462ef2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib/libpthread.so.0
#2 0xf7c6e58f in omni_condition::timedwait (this=0x905f560,
secs=1285851447, nanosecs=620655000)
at ../src/lib/omnithread/posix.cc:172
#3 0xf7d4fde5 in omni::Scavenger::execute (this=0x9060788) at
../src/lib/omniORB/orbcore/giopStrand.cc:735
#4 0xf7d31dcf in omniAsyncWorker::real_run (this=0x9060ce0) at
../src/lib/omniORB/orbcore/invoker.cc:235
#5 0xf7d31285 in omniAsyncWorkerInfo::run (this=0xf6dfe378) at
../src/lib/omniORB/orbcore/invoker.cc:283
#6 0xf7d31edd in omniAsyncWorker::run (this=0x9060ce0) at
../src/lib/omniORB/orbcore/invoker.cc:162
#7 0xf7c6e017 in omni_thread_wrapper (ptr=0x9060ce0) at
../src/lib/omnithread/posix.cc:456
#8 0x0045e832 in start_thread () from /lib/libpthread.so.0
#9 0x0039ee0e in clone () from /lib/libc.so.6
Thread 1 (Thread 12098):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0x004656e9 in __lll_lock_wait () from /lib/libpthread.so.0
#2 0x00460d9f in _L_lock_885 () from /lib/libpthread.so.0
#3 0x00460c66 in pthread_mutex_lock () from /lib/libpthread.so.0
#4 0x0807bed1 in omni_mutex::lock (this=0x905f5e8)
at
/afs/cern.ch/atlas/project/tdaq/cmt/tdaq/nightly/installed/include/omnithread.h:220
#5 0x0807beff in omni_mutex_lock::omni_mutex_lock (this=0xf78360e0, m=...)
at
/afs/cern.ch/atlas/project/tdaq/cmt/tdaq/nightly/installed/include/omnithread.h:260
#6 0xf7dce7c7 in omni::giopServer::notifyRzReadable (this=0x905f5c8,
conn=0x906b968, force_create=false)
at ../src/lib/omniORB/orbcore/giopServer.cc:1090
#7 0xf7d329c3 in omni::giopRendezvouser::notifyReadable
(this_=0x905ff90, conn=0x906b968)
at ../src/lib/omniORB/orbcore/giopRendezvouser.cc:77
#8 0xf7dd84be in omni::tcpEndpoint::notifyReadable (this=0x905f848,
sh=0x906b980)
at ../src/lib/omniORB/orbcore/tcp/tcpEndpoint.cc:761
#9 0xf7da0108 in omni::SocketCollection::Select (this=0x905f850) at
../src/lib/omniORB/orbcore/SocketCollection.cc:556
#10 0xf7dd8591 in omni::tcpEndpoint::AcceptAndMonitor (this=0x905f848,
func=0xf7d3298e <omni::giopRendezvouser::notifyReadable(void*,
omni::giopConnection*)>, cookie=0x905ff90)
at ../src/lib/omniORB/orbcore/tcp/tcpEndpoint.cc:701
#11 0xf7d327f2 in omni::giopRendezvouser::execute (this=0x905ff90) at
../src/lib/omniORB/orbcore/giopRendezvouser.cc:95
#12 0xf7d31dcf in omniAsyncWorker::real_run (this=0x905ffc0) at
../src/lib/omniORB/orbcore/invoker.cc:235
#13 0xf7d31285 in omniAsyncWorkerInfo::run (this=0xf7836378) at
../src/lib/omniORB/orbcore/invoker.cc:283
#14 0xf7d31edd in omniAsyncWorker::run (this=0x905ffc0) at
../src/lib/omniORB/orbcore/invoker.cc:162
#15 0xf7c6e017 in omni_thread_wrapper (ptr=0x905ffc0) at
../src/lib/omnithread/posix.cc:456
#16 0x0045e832 in start_thread () from /lib/libpthread.so.0
#17 0x0039ee0e in clone () from /lib/libc.so.6