Discussion:
[omniORB] Immediate rope switch
Lazar Stricevic
2007-12-01 00:29:37 UTC
Hi everyone,

We are using omniORB in a client-server configuration, where both server
and client have two network interfaces and are configured with two
endpoints (one for each interface). The reason for this is fault
tolerance: when the connection over one network interface is lost (e.g.
because a cable was disconnected), communication can continue over the
other one.
Upon loss of a connection, the current behavior of omniORB is to try to
reestablish the connection over the same "rope" (i.e. network interface),
to wait if reestablishment fails (to make sure that the connection is
impossible over that interface), and only then to switch the connection to
the other "rope". This makes sense in regular non-real-time usage, but
unfortunately our system is not such a case.
We decided that the best approach for our system is to switch ropes as soon
as it encounters a communication problem (transient exception), even if it
is still possible to continue using the current interface. The waiting
proved to be too costly for us.
The desired behavior is achieved by changing the method notifyCommFailure()
of the GIOP_C object (file
omniORB-4.1.1/src/lib/omniORB/orbcore/GIOP_C.cc). The part that decides
whether to try the connection again over the same interface is commented
out. A patch file is attached.
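
To make the difference between the two policies described above concrete, here is a toy C++ model. It is a sketch under simplifying assumptions, not omniORB code: Policy, Address, CallResult and invoke() are hypothetical names, and the stock behavior is reduced to a single retry on the same address before moving on, whereas the real decision in GIOP_C::notifyCommFailure() depends on strand state such as first_use and orderly_closed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: none of these types or functions exist in omniORB.
enum class Policy {
    RetrySameFirst,     // simplified stand-in for the stock behavior
    SwitchImmediately   // simplified stand-in for the patched behavior
};

struct Address {
    const char* name;   // hypothetical endpoint label
    bool reachable;     // whether a call over this interface succeeds
};

struct CallResult {
    int failed_attempts;  // attempts that failed before success or giving up
    int used;             // index of the address that succeeded, -1 if none
};

// Simulate one invocation that starts on addrs[start] and walks the
// address list cyclically until a call succeeds or every address has
// been given up on.
CallResult invoke(const std::vector<Address>& addrs, std::size_t start,
                  Policy policy) {
    int failures = 0;
    std::size_t current = start;
    bool retried_same = false;
    std::size_t abandoned = 0;  // number of addresses we have moved away from

    while (abandoned < addrs.size()) {
        if (addrs[current].reachable) {
            return {failures, static_cast<int>(current)};
        }
        ++failures;
        if (policy == Policy::RetrySameFirst && !retried_same) {
            retried_same = true;  // try the same address once more
            continue;
        }
        // Give up on this address and move to the next one in the list.
        retried_same = false;
        current = (current + 1) % addrs.size();
        ++abandoned;
    }
    return {failures, -1};
}
```

With a list of two addresses where only the second is reachable, the model counts two failed attempts before success under RetrySameFirst and one under SwitchImmediately. In a real deployment each failed attempt also costs whatever timeout applies, which is where the waiting comes from.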

Is there any other, regular way to do this (without changing the code of
omniORB)?

Best regards,
Lazar

-------------- next part --------------
diff -cr omniORB-4.1.1/src/lib/omniORB/orbcore/GIOP_C.cc src/lib/omniORB/orbcore/GIOP_C.cc
*** omniORB-4.1.1/src/lib/omniORB/orbcore/GIOP_C.cc Tue Jul 18 18:21:22 2006
--- src/lib/omniORB/orbcore/GIOP_C.cc Tue Jun 26 18:38:35 2007
***************
*** 289,295 ****

OMNIORB_ASSERT(pd_calldescriptor);

! if (pd_strand->first_use) {
const giopAddress* firstaddr = pd_calldescriptor->firstAddressUsed();
const giopAddress* currentaddr;

--- 289,295 ----

OMNIORB_ASSERT(pd_calldescriptor);

! // if (pd_strand->first_use) {
const giopAddress* firstaddr = pd_calldescriptor->firstAddressUsed();
const giopAddress* currentaddr;

***************
*** 303,314 ****
currentaddr = pd_calldescriptor->currentAddress();
}

! if (pd_strand->orderly_closed) {
! // Strand was closed before / during our request. Retry with the
! // same address.
! retry = 1;
! }
! else {
currentaddr = pd_rope->notifyCommFailure(currentaddr,heldlock);
pd_calldescriptor->currentAddress(currentaddr);

--- 303,314 ----
currentaddr = pd_calldescriptor->currentAddress();
}

! // if (pd_strand->orderly_closed) {
! // // Strand was closed before / during our request. Retry with the
! // // same address.
! // retry = 1;
! // }
! // else {
currentaddr = pd_rope->notifyCommFailure(currentaddr,heldlock);
pd_calldescriptor->currentAddress(currentaddr);

***************
*** 322,330 ****
// Retry will use the next address in the list.
retry = 1;
}
! }
! }
! else if (pd_strand->biDir &&
pd_strand->isClient() &&
pd_strand->biDir_has_callbacks) {

--- 322,330 ----
// Retry will use the next address in the list.
retry = 1;
}
! // }
! // }
! /* else if (pd_strand->biDir &&
pd_strand->isClient() &&
pd_strand->biDir_has_callbacks) {

***************
*** 356,362 ****
minor = TRANSIENT_ConnectionClosed;
break;
}
! }

////////////////////////////////////////////////////////////////////////
CORBA::ULong
--- 356,362 ----
minor = TRANSIENT_ConnectionClosed;
break;
}
! */}

////////////////////////////////////////////////////////////////////////
CORBA::ULong
Duncan Grisby
2007-12-21 16:19:41 UTC
On Friday 30 November, Lazar Stricevic wrote:

[...]
Post by Lazar Stricevic
Upon the loss of connection, current behavior of omniORB is to try to
reestablish connection over the same "rope" i.e. network interface, to
wait if reestablishment fails (to make sure that the connection is
impossible over that interface), and only then switch the connection
to the other "rope". This makes sense in regular non-real-time usage,
but unfortunately that is not the case with our system.
We decided that the best for our system is to switch rope when it
encounters problems in communication (transient exception), even if it
is possible to continue to use current interface. The waiting proved
to be too costly for us.
Desired behavior is achieved by changing the method
notifyCommFailure() of the GIOP_C object (file
omniORB-4.1.1/src/lib/omniORB/orbcore/GIOP_C.cc). The part which
decides whether to check connection again over the same interface is
commented out. Patch file is attached.
Is there any other, regular way to do this (without changing the code
of omniORB)?

I think that's a reasonable way to handle your situation. There's no way
to do it without the kind of changes you've done inside omniORB. I'm not
sure it's a common enough requirement to make it worth exposing as a
configuration parameter.

Anyone else interested in this?

Cheers,

Duncan.
--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --
Lazar Stricevic
2007-12-24 19:59:15 UTC
Hi Duncan,

Thank you for your answer.
This small change significantly reduces the reaction time to events such
as a communication line break, which is very important for our usage (we
have a 2 sec reaction time), and that is why it is the only way for us to
use omniORB. (More details about what we are using it for are available at
http://www.dmsgroup.co.yu/)
I believe that a parameter making this behavior configurable would make
life significantly easier for everyone who uses omniORB in a real-time
environment. I certainly know that it would make my life easier, because
then I would not have to patch every new version of omniORB, which I have
been doing since version 4.0.5. :)

Cheers,
Lazar