Jeff Frontz
2013-07-03 18:53:53 UTC
We're running an app that has client/server processes co-resident on a
virtual server. It's been running fine for years.
We recently made a slight change to one of our (many) applications that
changed when the client first contacts the server (that is, when it narrows
on an object serviced by the server). The narrow call was moved earlier in
the lifetime of the client, but it still happens long after the server was
activated.
Every so often, a narrow on an object will throw
a COMM_FAILURE_MarshalArguments (1096024067) exception. After reviewing
the exception trace (which I've unfortunately deleted and am trying to
reproduce), I poked through the omniORB source (4.1.2). The first obvious
source is a timeout -- except all of our timeouts are set to "0"
(forever). Looking further, the next likely culprit seems to be send(2)
experiencing some sort of (transient?) error. Since these processes are
on the same machine, I can't imagine any sort of intramachine congestion
in the TCP stack. There doesn't seem to be any obvious processor/resource
overload (per sar), and the fact that other clients (of different
applications) running simultaneously on the same machine continue to
execute perfectly would seem to rule out any obvious resource issue.
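
For anyone matching the number in the exception against the source, here is
a small sketch that splits the minor code quoted above into a vendor part
and an ordinal. It assumes the OMG convention of a 20-bit vendor minor code
ID in the high bits and a 12-bit code in the low bits, and it assumes
omniORB's vendor base is 0x41540000 (which is what 1096024067 implies).
The names OMNIORB_VMCID and split_minor are made up for illustration and are
not part of omniORB.

```python
# Assumed omniORB vendor minor code ID (VMCID) base; inferred from the value
# in the post, not read from the omniORB headers.
OMNIORB_VMCID = 0x41540000


def split_minor(minor):
    """Split a CORBA minor code into (vendor_base, ordinal).

    OMG convention: the high 20 bits identify the vendor, the low 12 bits
    are the vendor-specific code.
    """
    return minor & 0xFFFFF000, minor & 0x00000FFF


vendor, ordinal = split_minor(1096024067)
print(hex(vendor), ordinal)  # prints: 0x41540000 3
```

So the value is code 3 within the assumed omniORB range, which is handy
when grepping the minor code table in the source tree.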
Are there other less likely sources for this exception?
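
As a point of comparison for the send(2) theory, the sketch below shows one
way a send can fail on a single machine with no congestion at all: the peer
has already closed its end. This is only an illustration, not a diagnosis of
the failure above. It uses a local socketpair as a stand-in for the real TCP
connection, and the function name send_after_peer_close is hypothetical.

```python
import errno
import socket


def send_after_peer_close():
    """Return the errno raised by send() once the peer end has been closed,
    or None if the send unexpectedly succeeds."""
    a, b = socket.socketpair()  # connected local pair, stand-in for client/server
    try:
        b.close()  # the "server" side goes away first
        try:
            a.send(b"x")  # the "client" then tries to write a request
        except OSError as exc:
            return exc.errno
        return None
    finally:
        a.close()


err = send_after_peer_close()
# Typically EPIPE on Linux; other platforms may report a different errno.
print(errno.errorcode.get(err, err))
```

If something similar were happening here, the errno from the failing send
(visible with strace or a higher omniORB trace level) would distinguish it
from a timeout.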
Thanks,
Jeff