Discussion:
[omniORB] Timed out waiting for rendezvousers to terminate
Mike Richmond
2007-09-21 22:18:50 UTC
I am using omniORB 4.1.0 and am seeing a problem where my app takes
a long time (about 10 seconds) to quit. I've tracked this down to a
timeout while waiting for the rendezvouser to terminate
(giopServer.cc:683). I've tried extending the wait by changing the
value of timeout in gdb to 30 seconds, but the wait still times out.
Before and after the wait the rendezvouser thread is in this state:

Thread 3 (process 993 thread 0x160b):
#0 0x9001a1cc in select ()
#1 0x0207fe42 in omni::do_select (maxfd=17, r=0xb0101cd4, w=0x0, e=0x0, t=0x0) at SocketCollection.cc:1161
#2 0x02080136 in omni::SocketCollection::Select (this=0x493c258) at SocketCollection.cc:1239
#3 0x020a2c82 in omni::tcpEndpoint::AcceptAndMonitor (this=0x493c250, func=0x206a334 <omni::giopRendezvouser::notifyReadable(void*, omni::giopConnection*)>, cookie=0x493c700) at ./tcp/tcpEndpoint.cc:613
#4 0x0206a425 in omni::giopRendezvouser::execute (this=0x493c700) at giopRendezvouser.cc:97
#5 0x020bcd95 in omniAsyncWorker::real_run (this=0x493c730) at invoker.cc:234
#6 0x02023811 in omniAsyncWorkerInfo::run (this=0xb0101ef4) at invoker.cc:282
#7 0x020bd033 in omniAsyncWorker::run (this=0x493c730) at invoker.cc:161
#8 0x015d3862 in omni_thread_wrapper (ptr=0x493c730) at posix.cc:451
#9 0x90024227 in _pthread_body ()

Adding a breakpoint shows that omni::SocketCollection::Select() does
not return.

However, if I step through the rendezvouser terminate() method, and in
particular through tcpAddress->Poke(), then
omni::SocketCollection::Select() does return. Inside tcpAddress->Poke(),
::connect() gives EINPROGRESS and CLOSESOCKET() returns 0.

My theory is that it is possible to close the socket in
tcpAddress->Poke() before it has "done enough to poke the endpoint".
In support of this theory I observe that
omni::SocketCollection::Select() returns if I sleep for a short time
before closing the socket in tcpAddress->Poke(), or if I undefine
USE_NONBLOCKING_CONNECT.

My machine is pretty quick - a 2 x 2.66 GHz Dual-Core Intel Xeon Mac
Pro running Mac OS X 10.4.10. Unfortunately I don't know enough
about sockets to know if this is a problem with tcpAddress->Poke(),
or with the socket implementation on Mac OS X. After some googling I
tried adding a loop calling getsockopt( sock, SOL_SOCKET, SO_ERROR,
&err, &len ) between ::connect() and CLOSESOCKET(), but getsockopt()
returned 0 with err = 0 on the first call, so this didn't fix my problem.
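
One way to test the theory above is to delay the close until the
non-blocking connect has actually resolved. The usual POSIX idiom is to
wait for the socket to become writable and only then read SO_ERROR;
polling SO_ERROR straight after connect() tends to report 0 simply
because no error is pending yet. The sketch below is not omniORB code:
poke_and_wait() and demo_poke_wakes_listener() are hypothetical names,
and a loopback listener stands in for the rendezvouser's endpoint.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper, not omniORB's Poke(): connect to a listening TCP
// endpoint and close the socket only after the connect has resolved.
// Returns true if the connection was established within timeout_ms.
bool poke_and_wait(const sockaddr_in& addr, int timeout_ms) {
  int sock = ::socket(AF_INET, SOCK_STREAM, 0);
  if (sock < 0) return false;

  // Non-blocking mode, so connect() cannot stall the caller.
  int flags = ::fcntl(sock, F_GETFL, 0);
  ::fcntl(sock, F_SETFL, flags | O_NONBLOCK);

  bool connected = false;
  int rc = ::connect(sock, reinterpret_cast<const sockaddr*>(&addr),
                     sizeof(addr));
  if (rc == 0) {
    connected = true;  // completed immediately
  } else if (errno == EINPROGRESS) {
    // Handshake still in flight: writability signals that it has
    // finished, successfully or not.
    fd_set wfds;
    FD_ZERO(&wfds);
    FD_SET(sock, &wfds);
    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    if (::select(sock + 1, nullptr, &wfds, nullptr, &tv) > 0) {
      // Only now does SO_ERROR describe the outcome of the connect.
      int err = 0;
      socklen_t len = sizeof(err);
      if (::getsockopt(sock, SOL_SOCKET, SO_ERROR, &err, &len) == 0 &&
          err == 0) {
        connected = true;
      }
    }
  }
  ::close(sock);
  return connected;
}

// Usage sketch: poke a loopback listener, then check that a select()
// on the listening socket would return.
bool demo_poke_wakes_listener() {
  int lsock = ::socket(AF_INET, SOCK_STREAM, 0);
  assert(lsock >= 0);
  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = 0;  // let the kernel choose a free port
  int rc = ::bind(lsock, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
  assert(rc == 0);
  rc = ::listen(lsock, 1);
  assert(rc == 0);
  (void)rc;
  socklen_t alen = sizeof(addr);
  ::getsockname(lsock, reinterpret_cast<sockaddr*>(&addr), &alen);

  bool poked = poke_and_wait(addr, 1000);

  fd_set rfds;
  FD_ZERO(&rfds);
  FD_SET(lsock, &rfds);
  timeval tv{1, 0};
  bool woke = ::select(lsock + 1, &rfds, nullptr, nullptr, &tv) > 0;
  ::close(lsock);
  return poked && woke;
}
```

This only checks the "closed too early" theory. It does not help if the
platform never reports the half-open connection to the listener, in
which case a separate wake-up mechanism is needed.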

Any suggestions?

Mike Richmond
Global Graphics Software Ltd
Duncan Grisby
2007-09-30 20:42:21 UTC
Please can you try with the latest CVS snapshot? There have been quite
a few changes to the shutdown code, so it's quite possible that the
problem you're seeing has already been fixed.

Cheers,

Duncan.
--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --
Mike Richmond
2007-10-23 15:24:25 UTC
For the list, with Duncan's help I solved this by applying this
single change from 4.1.1 to our 4.1.0 source:

Wed Mar 28 17:20:39 BST 2007 dgrisby
====================================

- Always wake up SocketCollection in Poke in case connect seems to
work but does not actually wake the thread.

src/lib/omniORB/orbcore/ssl/sslEndpoint.cc
src/lib/omniORB/orbcore/tcp/tcpEndpoint.cc
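
The changelog entry does not say how the wake-up is done. A common way
to wake a thread blocked in select() without relying on a connect is the
self-pipe technique, sketched below under that assumption. WakeablePoller,
wake_up() and wait() are hypothetical names for illustration, not
omniORB's SocketCollection API.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

// Hypothetical class for illustration; not omniORB's SocketCollection.
class WakeablePoller {
public:
  WakeablePoller() {
    int rc = ::pipe(fds_);
    assert(rc == 0);
    (void)rc;
    // Non-blocking at both ends: wake_up() never stalls on a full pipe
    // and the drain loop in wait() never blocks on an empty one.
    for (int fd : fds_) {
      ::fcntl(fd, F_SETFL, ::fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
    }
  }
  ~WakeablePoller() {
    ::close(fds_[0]);
    ::close(fds_[1]);
  }

  // Called from the thread that wants the poller to return.
  void wake_up() {
    char b = 0;
    ssize_t n = ::write(fds_[1], &b, 1);
    (void)n;  // a full pipe already guarantees a pending wake-up
  }

  // Blocks with no timeout until listen_fd (if >= 0) is readable or
  // wake_up() has been called. Returns true if woken via wake_up().
  bool wait(int listen_fd) {
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fds_[0], &rfds);
    int maxfd = fds_[0];
    if (listen_fd >= 0) {
      FD_SET(listen_fd, &rfds);
      if (listen_fd > maxfd) maxfd = listen_fd;
    }
    if (::select(maxfd + 1, &rfds, nullptr, nullptr, nullptr) <= 0) {
      return false;
    }
    if (!FD_ISSET(fds_[0], &rfds)) return false;
    char buf[16];
    while (::read(fds_[0], buf, sizeof(buf)) > 0) {
      // drain pending wake-up bytes
    }
    return true;
  }

private:
  int fds_[2];
};
```

With something like this in place, a terminate path could call wake_up()
unconditionally after the poke, so shutdown no longer depends on the
connect being noticed by the accepting thread.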

Mike Richmond
Global Graphics Software Ltd
