Discussion:
[omniORB] Client hangs while trying to narrow reference specified by corbaloc
Patrick Hartling
2006-07-26 04:21:20 UTC
Permalink
I have run into a problem when using omniINSPOA and corbaloc with
omniORB 4.0.7, and I have not been able to find any answers in the
documentation or list archives. On the server side, I provide access
to a bootstrap object through omniINSPOA, and that object returns a
reference to another object activated in a second POA (a child of
RootPOA). In case it matters, the second POA has the following
policies set: UNIQUE_ID, RETAIN, and MAIN_THREAD_MODEL, and it has
BIDIRECTIONAL_POLICY_TYPE set to BiDirPolicy::BOTH.

Everything works very well if I have the client and server running on
the same machine. If I run on two separate machines, however, the
client hangs while trying to narrow the reference to the bootstrap
object that it gets back from CORBA::ORB::string_to_object(). It
reports the following error:

omniORB: Client attempt to connect to giop:tcp:192.168.1.199:42000
omniORB: AsyncInvoker: thread id = 1 has started. Total threads = 2
omniORB: giopRendezvouser task execute for giop:tcp:192.168.1.183:37128
omniORB: AsyncInvoker: thread id = 2 has started. Total threads = 2
omniORB: Scavenger task execute.
omniORB: Client opened connection to giop:tcp:192.168.1.199:42000
omniORB: sendChunk: to giop:tcp:192.168.1.199:42000 98 bytes
omniORB: inputMessage: from giop:tcp:192.168.1.199:42000 12 bytes
omniORB: throw giopStream::CommFailure from
giopImpl10.cc:298(1,NO,COMM_FAILURE_WaitingForReply)

The server reports a communication failure while trying to un-marshal arguments:

omniORB: Server accepted connection from giop:tcp:192.168.1.183:37129
omniORB: AsyncInvoker: thread id = 5 has started. Total threads = 3
omniORB: Scavenger task execute.
omniORB: AsyncInvoker: thread id = 6 has started. Total threads = 3
omniORB: giopWorker task execute.
omniORB: Accepted connection from giop:tcp:192.168.1.183:37129 because
of this rule: "* bidir,tcp"
omniORB: inputMessage: from giop:tcp:192.168.1.183:37129 98 bytes
omniORB: sendCloseConnection: to giop:tcp:192.168.1.183:37129 12 bytes
omniORB: throw giopStream::CommFailure from
giopStream.cc:880(0,NO,COMM_FAILURE_UnMarshalArguments)

Does anyone know what would cause the different behavior when using
separate machines? I have only started using omniINSPOA today, and my
code worked fine before when using a rudimentary protocol based on the
stringified IOR. There should not be any firewall interference in this
case. I do have the bidirectional policy in place to deal with the
case of a firewall, but no requests are currently being made to use
bidirectional GIOP. I have tested various combinations of Red Hat
Enterprise Linux 4 (x86_64), Windows XP, and Mac OS X 10.4, and the
problem is the same every time.

-Patrick
--
Patrick L. Hartling
http://www.137.org/patrick/
Duncan Grisby
2006-07-27 16:14:00 UTC
Permalink
Post by Patrick Hartling
Everything works very well if I have the client and server running on
the same machine. If I run on two separate machines, however, the
client hangs while trying to narrow the reference to the bootstrap
object that it gets back from CORBA::ORB::string_to_object(). It
Does it hang? From the trace you sent, it looks like it got an
exception.
Post by Patrick Hartling
omniORB: Client attempt to connect to giop:tcp:192.168.1.199:42000
omniORB: AsyncInvoker: thread id = 1 has started. Total threads = 2
omniORB: giopRendezvouser task execute for giop:tcp:192.168.1.183:37128
omniORB: AsyncInvoker: thread id = 2 has started. Total threads = 2
omniORB: Scavenger task execute.
omniORB: Client opened connection to giop:tcp:192.168.1.199:42000
omniORB: sendChunk: to giop:tcp:192.168.1.199:42000 98 bytes
omniORB: inputMessage: from giop:tcp:192.168.1.199:42000 12 bytes
This message is a CloseConnection message...
Post by Patrick Hartling
omniORB: throw giopStream::CommFailure from
giopImpl10.cc:298(1,NO,COMM_FAILURE_WaitingForReply)
...so the client says communication failed.
Post by Patrick Hartling
The server reports a communication failure while trying to un-marshal
omniORB: Server accepted connection from giop:tcp:192.168.1.183:37129
omniORB: AsyncInvoker: thread id = 5 has started. Total threads = 3
omniORB: Scavenger task execute.
omniORB: AsyncInvoker: thread id = 6 has started. Total threads = 3
omniORB: giopWorker task execute.
omniORB: Accepted connection from giop:tcp:192.168.1.183:37129 because
of this rule: "* bidir,tcp"
omniORB: inputMessage: from giop:tcp:192.168.1.183:37129 98 bytes
omniORB: sendCloseConnection: to giop:tcp:192.168.1.183:37129 12 bytes
The server immediately closed the connection when it got the call. I
can't explain why that happened. Try running with traceLevel 40 to see
the GIOP messages. That might give some idea of what's going on.
Post by Patrick Hartling
omniORB: throw giopStream::CommFailure from
giopStream.cc:880(0,NO,COMM_FAILURE_UnMarshalArguments)
That's not part of the problem -- that's simply the server noticing that
the client has gone away after it was sent the CloseConnection message.

Cheers,

Duncan.
--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --
Patrick Hartling
2006-07-27 19:55:07 UTC
Permalink
Post by Duncan Grisby
Post by Patrick Hartling
Everything works very well if I have the client and server running on
the same machine. If I run on two separate machines, however, the
client hangs while trying to narrow the reference to the bootstrap
object that it gets back from CORBA::ORB::string_to_object(). It
Does it hang? From the trace you sent, it looks like it got an
exception.
Well, I say it hangs because the _narrow() call never returns. It just
keeps going through a cycle of sending a message, waiting for a while,
getting the WaitingForReply exception, and then trying again. I should
have been more clear in my message about when the client and server
pause.
Post by Duncan Grisby
Post by Patrick Hartling
omniORB: Client attempt to connect to giop:tcp:192.168.1.199:42000
omniORB: AsyncInvoker: thread id = 1 has started. Total threads = 2
omniORB: giopRendezvouser task execute for giop:tcp:192.168.1.183:37128
omniORB: AsyncInvoker: thread id = 2 has started. Total threads = 2
omniORB: Scavenger task execute.
omniORB: Client opened connection to giop:tcp:192.168.1.199:42000
omniORB: sendChunk: to giop:tcp:192.168.1.199:42000 98 bytes
There is a long pause here, which I take to mean that the client is
waiting on the server.
Post by Duncan Grisby
Post by Patrick Hartling
omniORB: inputMessage: from giop:tcp:192.168.1.199:42000 12 bytes
This message is a CloseConnection message...
Post by Patrick Hartling
omniORB: throw giopStream::CommFailure from
giopImpl10.cc:298(1,NO,COMM_FAILURE_WaitingForReply)
...so the client says communication failed.
Post by Patrick Hartling
omniORB: Server accepted connection from giop:tcp:192.168.1.183:37129
omniORB: AsyncInvoker: thread id = 5 has started. Total threads = 3
omniORB: Scavenger task execute.
omniORB: AsyncInvoker: thread id = 6 has started. Total threads = 3
omniORB: giopWorker task execute.
omniORB: Accepted connection from giop:tcp:192.168.1.183:37129 because
of this rule: "* bidir,tcp"
omniORB: inputMessage: from giop:tcp:192.168.1.183:37129 98 bytes
There is a long pause here before the next line of output is printed.
Post by Duncan Grisby
Post by Patrick Hartling
omniORB: sendCloseConnection: to giop:tcp:192.168.1.183:37129 12 bytes
The server immediately closed the connection when it got the call. I
can't explain why that happened. Try running with traceLevel 40 to see
the GIOP messages. That might give some idea of what's going on.
Here is the client side with traceLevel set to 40:

omniORB: Distribution date: Fri Jan 13 13:47:35 GMT 2006 dgrisby
omniORB: My addresses are:
omniORB: 127.0.0.1
omniORB: 192.168.1.183
omniORB: Maximum supported GIOP version is 1.2
omniORB: Native char code sets: ISO-8859-1 UTF-8.
omniORB: Transmission char code sets: ISO-8859-1(1.2) ISO-8859-1(1.1)
ISO-8859-1(1.0) UTF-8(1.2) UTF-8(1.1).
omniORB: Native wide char code sets: UTF-16.
omniORB: Transmission wide char code sets: UTF-16(1.2).
omniORB: Initialising omniDynamic library.
omniORB: Current configuration is as follows:
omniORB: DefaultInitRef (file) =
omniORB: DefaultInitRef (args) =
omniORB: InitRef = NameService=corbaname::localhost
omniORB: abortOnInternalError = 0
omniORB: abortOnNativeException = 0
omniORB: acceptBiDirectionalGIOP = 1
omniORB: acceptMisalignedTcIndirections = 0
omniORB: bootstrapAgentHostname =
omniORB: bootstrapAgentPort = 900
omniORB: clientCallTimeOutPeriod = 0
omniORB: clientTransportRule = * unix,ssl,tcp
omniORB: configFile = ../../../release/config/omniORB4.cfg
omniORB: connectionWatchImmediate = 0
omniORB: connectionWatchPeriod = 50000
omniORB: diiThrowsSysExceptions = 0
omniORB: dumpConfiguration = 0
omniORB: endPoint = giop:tcp::
omniORB: endPointPublishAllIFs = 0
omniORB: giopMaxMsgSize = 2097152
omniORB: giopTargetAddressMode = KeyAddr
omniORB: id = omniORB4
omniORB: inConScanPeriod = 180
omniORB: lcdMode = 0
omniORB: maxGIOPConnectionPerServer = 5
omniORB: maxGIOPVersion = 1.2
omniORB: maxInterleavedCallsPerConnection = 5
omniORB: maxServerThreadPerConnection = 100
omniORB: maxServerThreadPoolSize = 100
omniORB: maxSocketRecv = 2147483647
omniORB: maxSocketSend = 2147483647
omniORB: nativeCharCodeSet = ISO-8859-1
omniORB: nativeWCharCodeSet = UTF-16
omniORB: objectTableSize = 0
omniORB: offerBiDirectionalGIOP = 0
omniORB: omniORB_27_CompatibleAnyExtraction = 0
omniORB: oneCallPerConnection = 1
omniORB: outConScanPeriod = 120
omniORB: poaHoldRequestTimeout = 0
omniORB: poaUniquePersistentSystemIds = 1
omniORB: principal = [Null]
omniORB: scanGranularity = 5
omniORB: serverCallTimeOutPeriod = 0
omniORB: serverTransportRule = * bidir,tcp
omniORB: strictIIOP = 1
omniORB: supportBootstrapAgent = 1
omniORB: supportCurrent = 1
omniORB: supportPerThreadTimeOut = 0
omniORB: tcAliasExpand = 0
omniORB: threadPerConnectionLowerLimit = 9000
omniORB: threadPerConnectionPolicy = 1
omniORB: threadPerConnectionUpperLimit = 10000
omniORB: threadPoolWatchConnection = 1
omniORB: traceExceptions = 1
omniORB: traceFile = [stderr]
omniORB: traceInvocations = 0
omniORB: traceLevel = 40
omniORB: traceThreadId = 0
omniORB: unixTransportDirectory = /tmp/omni-%u
omniORB: unixTransportPermission = 777
omniORB: useTypeCodeIndirections = 1
omniORB: verifyObjectExistsAndType = 1
omniORB: Initialising incoming endpoints.
omniORB: Bind to address 0.0.0.0.
omniORB: Starting serving incoming endpoints.
omniORB: Creating ref to remote: key<JdasImmersive>
target id : IDL:omg.org/CORBA/Object:1.0
most derived id:
omniORB: Client attempt to connect to giop:tcp:192.168.1.199:42000
omniORB: AsyncInvoker: thread id = 2 has started. Total threads = 2
omniORB: Scavenger task execute.
omniORB: AsyncInvoker: thread id = 1 has started. Total threads = 2
omniORB: giopRendezvouser task execute for giop:tcp:192.168.1.183:37985
omniORB: Client opened connection to giop:tcp:192.168.1.199:42000
omniORB: sendChunk: to giop:tcp:192.168.1.199:42000 98 bytes
omniORB:
4749 4f50 0100 0100 5600 0000 0000 0000 GIOP....V.......
0200 0000 010a 6f6d 0d00 0000 4a64 6173 ......om....Jdas
496d 6d65 7273 6976 656e 6974 0600 0000 Immersivenit....
5f69 735f 6100 203d 0000 0000 2200 0000 _is_a. =...."...
4944 4c3a 6172 632f 496d 6d65 7273 6976 IDL:arc/Immersiv
6541 7070 496e 7465 7266 6163 653a 312e eAppInterface:1.
3000 0.

[Pause occurs here while messages about scanning for idle connections
are printed.]

omniORB: inputMessage: from giop:tcp:192.168.1.199:42000 12 bytes
omniORB:
4749 4f50 0100 0105 0000 0000 GIOP........
omniORB: throw giopStream::CommFailure from
giopImpl10.cc:298(1,NO,COMM_FAILURE_WaitingForReply)
omniORB: Client connection refcount = 0
omniORB: Client close connection to giop:tcp:192.168.1.199:42000
omniORB: Client attempt to connect to giop:tcp:192.168.1.199:42000
omniORB: Client opened connection to giop:tcp:192.168.1.199:42000
omniORB: sendChunk: to giop:tcp:192.168.1.199:42000 98 bytes
omniORB:
4749 4f50 0100 0100 5600 0000 0000 0000 GIOP....V.......
0200 0000 010a 6f6d 0d00 0000 4a64 6173 ......om....Jdas
496d 6d65 7273 6976 656e 6974 0600 0000 Immersivenit....
5f69 735f 6100 203d 0000 0000 2200 0000 _is_a. =...."...
4944 4c3a 6172 632f 496d 6d65 7273 6976 IDL:arc/Immersiv
6541 7070 496e 7465 7266 6163 653a 312e eAppInterface:1.
3000 0.

And here is the server side:

omniORB: Distribution date: Fri Jan 13 13:47:35 GMT 2006 dgrisby
omniORB: My addresses are:
omniORB: 127.0.0.1
omniORB: 192.168.1.199
omniORB: Maximum supported GIOP version is 1.2
omniORB: Native char code sets: UTF-8 ISO-8859-1.
omniORB: Transmission char code sets: UTF-8(1.2) UTF-8(1.1)
ISO-8859-1(1.2) ISO-8859-1(1.1) ISO-8859-1(1.0).
omniORB: Native wide char code sets: UTF-16.
omniORB: Transmission wide char code sets: UTF-16(1.2).
omniORB: Initialising omniDynamic library.
omniORB: Current configuration is as follows:
omniORB: DefaultInitRef (file) =
omniORB: DefaultInitRef (args) =
omniORB: abortOnInternalError = 0
omniORB: abortOnNativeException = 0
omniORB: acceptBiDirectionalGIOP = 1
omniORB: acceptMisalignedTcIndirections = 0
omniORB: bootstrapAgentHostname =
omniORB: bootstrapAgentPort = 900
omniORB: clientCallTimeOutPeriod = 0
omniORB: clientTransportRule = * unix,ssl,tcp
omniORB: configFile = ../../../release/config/omniORB4.cfg
omniORB: connectionWatchImmediate = 0
omniORB: connectionWatchPeriod = 50000
omniORB: diiThrowsSysExceptions = 0
omniORB: dumpConfiguration = 0
omniORB: endPoint = giop:tcp::42000
omniORB: endPointPublishAllIFs = 0
omniORB: giopMaxMsgSize = 2097152
omniORB: giopTargetAddressMode = KeyAddr
omniORB: id = omniORB4
omniORB: inConScanPeriod = 180
omniORB: lcdMode = 0
omniORB: maxGIOPConnectionPerServer = 5
omniORB: maxGIOPVersion = 1.2
omniORB: maxInterleavedCallsPerConnection = 5
omniORB: maxServerThreadPerConnection = 100
omniORB: maxServerThreadPoolSize = 100
omniORB: maxSocketRecv = 2147483647
omniORB: maxSocketSend = 2147483647
omniORB: nativeCharCodeSet = ISO-8859-1
omniORB: nativeWCharCodeSet = UTF-16
omniORB: objectTableSize = 0
omniORB: offerBiDirectionalGIOP = 0
omniORB: omniORB_27_CompatibleAnyExtraction = 0
omniORB: oneCallPerConnection = 1
omniORB: outConScanPeriod = 120
omniORB: poaHoldRequestTimeout = 0
omniORB: poaUniquePersistentSystemIds = 1
omniORB: principal = [Null]
omniORB: scanGranularity = 5
omniORB: serverCallTimeOutPeriod = 0
omniORB: serverTransportRule = * bidir,tcp
omniORB: strictIIOP = 1
omniORB: supportBootstrapAgent = 1
omniORB: supportCurrent = 1
omniORB: supportPerThreadTimeOut = 0
omniORB: tcAliasExpand = 0
omniORB: threadPerConnectionLowerLimit = 9000
omniORB: threadPerConnectionPolicy = 1
omniORB: threadPerConnectionUpperLimit = 10000
omniORB: threadPoolWatchConnection = 1
omniORB: traceExceptions = 1
omniORB: traceFile = [stderr]
omniORB: traceInvocations = 0
omniORB: traceLevel = 40
omniORB: traceThreadId = 0
omniORB: unixTransportDirectory = /tmp/omni-%u
omniORB: unixTransportPermission = 777
omniORB: useTypeCodeIndirections = 1
omniORB: verifyObjectExistsAndType = 1
omniORB: Initialising incoming endpoints.
omniORB: Bind to address 0.0.0.0.
omniORB: Starting serving incoming endpoints.
omniORB: AsyncInvoker: thread id = 2 has started. Total threads = 1
omniORB: giopRendezvouser task execute for giop:tcp:192.168.1.199:42000

[other POA and object activation output omitted]

omniORB: Adding key<JdasImmersive> (activating) to object table.
omniORB: State key<JdasImmersive> (activating) -> active
omniORB: Server accepted connection from giop:tcp:192.168.1.183:37986
omniORB: AsyncInvoker: thread id = 3 has started. Total threads = 3
omniORB: Scavenger task execute.
omniORB: AsyncInvoker: thread id = 4 has started. Total threads = 3
omniORB: giopWorker task execute.
omniORB: Accepted connection from giop:tcp:192.168.1.183:37986 because
of this rule: "* bidir,tcp"
omniORB: inputMessage: from giop:tcp:192.168.1.183:37986 98 bytes
omniORB:
4749 4f50 0100 0100 5600 0000 0000 0000 GIOP....V.......
0200 0000 0101 0008 0d00 0000 4a64 6173 ............Jdas
496d 6d65 7273 6976 6509 0031 0600 0000 Immersive..1....
5f69 735f 6100 018a 0000 0000 2200 0000 _is_a......."...
4944 4c3a 6172 632f 496d 6d65 7273 6976 IDL:arc/Immersiv
6541 7070 496e 7465 7266 6163 653a 312e eAppInterface:1.
3000 0.
omniORB: Scavenger reduce idle count for strand 0xa1c4690 to 35
[...]
omniORB: Scavenger reduce idle count for strand 0xa1c4690 to 0
omniORB: Scavenger close connection from giop:tcp:192.168.1.183:37986
omniORB: sendCloseConnection: to giop:tcp:192.168.1.183:37986 12 bytes
omniORB:
4749 4f50 0100 0105 0000 0000 GIOP........
omniORB: Scan for idle connections done (1154007899,211513000).
omniORB: throw giopStream::CommFailure from
giopStream.cc:880(0,NO,COMM_FAILURE_UnMarshalArguments)
omniORB: Server accepted connection from giop:tcp:192.168.1.183:37987
omniORB: AsyncInvoker: thread id = 5 has started. Total threads = 4
omniORB: giopWorker task execute.
omniORB: Accepted connection from giop:tcp:192.168.1.183:37987 because
of this rule: "* bidir,tcp"
omniORB: inputMessage: from giop:tcp:192.168.1.183:37987 98 bytes
omniORB:
4749 4f50 0100 0100 5600 0000 0000 0000 GIOP....V.......
0200 0000 0100 0000 0d00 0000 4a64 6173 ............Jdas
496d 6d65 7273 6976 6500 0000 0600 0000 Immersive.......
5f69 735f 6100 0000 0000 0000 2200 0000 _is_a......."...
4944 4c3a 6172 632f 496d 6d65 7273 6976 IDL:arc/Immersiv
6541 7070 496e 7465 7266 6163 653a 312e eAppInterface:1.
3000 0.
omniORB: Server connection refcount = 1
omniORB: Server connection refcount = 0
omniORB: Server close connection from giop:tcp:192.168.1.183:37986

Please let me know if there is more information that I can provide.
Many thanks for your assistance.

-Patrick
--
Patrick L. Hartling
http://www.137.org/patrick/
Duncan Grisby
2006-08-08 15:56:36 UTC
Permalink
On Thursday 27 July, "Patrick Hartling" wrote:

[...]
Post by Patrick Hartling
omniORB: Accepted connection from giop:tcp:192.168.1.183:37986 because
of this rule: "* bidir,tcp"
omniORB: inputMessage: from giop:tcp:192.168.1.183:37986 98 bytes
4749 4f50 0100 0100 5600 0000 0000 0000 GIOP....V.......
0200 0000 0101 0008 0d00 0000 4a64 6173 ............Jdas
496d 6d65 7273 6976 6509 0031 0600 0000 Immersive..1....
5f69 735f 6100 018a 0000 0000 2200 0000 _is_a......."...
4944 4c3a 6172 632f 496d 6d65 7273 6976 IDL:arc/Immersiv
6541 7070 496e 7465 7266 6163 653a 312e eAppInterface:1.
3000 0.
omniORB: Scavenger reduce idle count for strand 0xa1c4690 to 35
[...]
omniORB: Scavenger reduce idle count for strand 0xa1c4690 to 0
omniORB: Scavenger close connection from giop:tcp:192.168.1.183:37986
omniORB: sendCloseConnection: to giop:tcp:192.168.1.183:37986 12 bytes
4749 4f50 0100 0105 0000 0000 GIOP........
This is very strange. It shows the server receiving an _is_a request as
the client tries to confirm the type of the object reference, but then
the server does not do anything at all to service the request. The fact
that it scavenges the connection as idle shows that it does not think
there are any requests in progress on the connection. The call has just
vanished.

Please can you run with additional tracing of -ORBtraceInvocations 1
-ORBtraceThreadId 1. That might narrow it down a bit.

Are you able to attached to the server with a debugger at the time it's
received the _is_a request and is pausing? If so, it would be helpful
to get a stack backtrace from all the threads running.

Cheers,

Duncan.
--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --
Loading...