[omniORB] Question about ORB configuration

Discussion:

Thomas Zumbiehl

2008-11-27 16:13:15 UTC

Hi,

I am having trouble configuring the ORB for performance.
First, let me explain the context:
GUI client ---- Server A ---- Server B
CLI client ----- |

I have a number of B servers, which are connected only to server A. Server A "only" relays requests to the different B servers.
Server A also has a set of GUI and CLI clients that connect to it to make requests, which are relayed to the B servers.
All connections are defined as bidirectional.

My problems occur when a lot of requests arrive at the same time (about 100). Sometimes server A seems to be blocked, even at the ORB level; sometimes it is server B that is blocked; and sometimes the client gets a CORBA exception when sending a _non_existent request to server A, which should never happen because the server is running. The only thing is that the server might have a lot of threads running...
I am quite lost as to which configuration I should set on servers A and B so that they can talk to each other (to do some polling) and also serve many requests at the same time.

Could anyone provide some information on "standard" configurations, depending on the performance/availability trade-off expected from the server?
Since a server is also a client of another server, is any special configuration needed?

Thanks in advance for any information.

Thomas

Thomas Zumbiehl
Chef de Projet Développement
BV Associates
http://www.bvassociates.fr

Duncan Grisby

2008-12-12 22:21:53 UTC

Post by Thomas Zumbiehl
I am having trouble configuring the ORB for performance.
GUI client ---- Server A ---- Server B
CLI client ----- |
I have a number of B servers, which are connected only to server A. Server A
"only" relays requests to the different B servers.
Server A also has a set of GUI and CLI clients that connect to it to make
requests, which are relayed to the B servers.
All connections are defined as bidirectional.

Do you mean Bi-directional GIOP? If so, that limits things somewhat...

Post by Thomas Zumbiehl
My problems occur when a lot of requests arrive at the same time (about
100). Sometimes server A seems to be blocked, even at the ORB level;
sometimes it is server B that is blocked; and sometimes the client gets a
CORBA exception when sending a _non_existent request to server A, which
should never happen because the server is running. The only thing is that
the server might have a lot of threads running...

What thread and connection parameters are you currently using? Have you
read the manual chapter on connection and thread management?

http://omniorb.sourceforge.net/omni41/omniORB/omniORB008.html

Cheers,

Duncan.

--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --

Thomas Zumbiehl

2008-12-12 22:28:13 UTC

Hi Duncan,

Yes, I meant Bi-directional GIOP.
And yes, I did read the manual chapter on connection and thread management.

The latest configuration I have on both the A and B servers is:
scanGranularity=5

clientTransportRule=* tcp,bidir
offerBiDirectionalGIOP=1
oneCallPerConnection=0
maxGIOPConnectionPerServer=200
outConScanPeriod=600

endPoint=giop:tcp::0
serverTransportRule=* tcp,bidir
acceptBiDirectionalGIOP=1
maxServerThreadPerConnection=200
maxServerThreadPoolSize=200
inConScanPeriod=600

Do you think this is correct?

Cheers,

Thomas Zumbiehl

Duncan Grisby

2008-12-12 22:59:19 UTC

Post by Thomas Zumbiehl
scanGranularity=5
clientTransportRule=* tcp,bidir
offerBiDirectionalGIOP=1
oneCallPerConnection=0
maxGIOPConnectionPerServer=200
outConScanPeriod=600
endPoint=giop:tcp::0
serverTransportRule=* tcp,bidir
acceptBiDirectionalGIOP=1
maxServerThreadPerConnection=200
maxServerThreadPoolSize=200
inConScanPeriod=600
Do you think this is correct?

It's generally a good idea to set outConScanPeriod to be smaller than
inConScanPeriod so that clients usually close connections. When servers
close connections, clients can end up having to retry calls due to the
connection closures.
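
The rule of thumb above can be checked mechanically before the ORB starts. The following is a minimal sketch, not part of omniORB: `optionValue` and `clientClosesFirst` are hypothetical helpers, and the values 120 and 600 are illustrative assumptions rather than recommendations. The table uses the `{name, value}` layout that omniORB's `CORBA::ORB_init` accepts as an optional extra argument.

```cpp
#include <cstdlib>
#include <cstring>

// Option table in the {name, value} layout that omniORB's ORB_init accepts
// as an optional extra argument, for example:
//   CORBA::ORB_init(argc, argv, "omniORB4", kOrbOptions);
// The numeric values are illustrative assumptions, not recommendations.
static const char* kOrbOptions[][2] = {
    {"outConScanPeriod", "120"},  // idle outgoing (client-side) connections
    {"inConScanPeriod", "600"},   // idle incoming (server-side) connections
    {nullptr, nullptr}            // terminator
};

// Hypothetical helper: numeric value of an option, or -1 if it is absent.
long optionValue(const char* options[][2], const char* name) {
    for (int i = 0; options[i][0] != nullptr; ++i) {
        if (std::strcmp(options[i][0], name) == 0) {
            return std::strtol(options[i][1], nullptr, 10);
        }
    }
    return -1;
}

// Hypothetical helper: true when both periods are set and the client-side
// period is the shorter one, so clients normally close idle connections first.
bool clientClosesFirst(const char* options[][2]) {
    const long out = optionValue(options, "outConScanPeriod");
    const long in = optionValue(options, "inConScanPeriod");
    return out > 0 && in > 0 && out < in;
}
```

With both periods set to 600, as in the configuration posted earlier in this thread, `clientClosesFirst` returns false, because neither side is guaranteed to time out before the other.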

What is the setting of threadPerConnectionPolicy? How many concurrent
calls do you have relative to the setting of 200 for the thread pool size?

What happens when the processes become "blocked"? Do they recover after
a while, or stay blocked forever? If they're blocked forever, try
attaching with a debugger and get a stack trace for all the threads.

If it's only temporary, you can get more information by running both
ends with traceLevel 25 traceThreadId 1 traceTime 1. That will output a
lot of debugging information that will hopefully show where the hold-up
is.

Cheers,

Duncan.

Thomas Zumbiehl

2008-12-12 23:22:36 UTC

threadPerConnectionPolicy is set to its default value (I'm using omniORB 4.0.7, so it is 1).
As for concurrent calls, let's say up to 100. I chose 200 to be sure I stay under the limit.

When the process blocks, it is forever.

The "process is blocked" part of the problem seems to have been solved by changing the code.

Now I'm really interested in the best configuration for tuning servers A and B to handle concurrent requests.

Cheers,

Thomas Zumbiehl

Serguei Kolos

2008-12-12 23:38:33 UTC

Hello

I have the impression that a bug was introduced in omniORB 4.1.3 in the
handling of exceptions when reference forwarding is used. The new
omniObjRef.cxx file defines the RECOVER_FORWARD macro as:

#define RECOVER_FORWARD do {\
  omni::revertToOriginalProfile(this); \
  CORBA::TRANSIENT ex2(TRANSIENT_FailedOnForwarded, ex.completed()); \
  if( !_omni_callTransientExceptionHandler(this, retries++, ex2) ) \
    throw; \
} while(0)

The issue is that at line 790 it is used inside the
catch(const giopStream::CommFailure& ex) block. As a consequence, if a
transient exception handler returns 0, the giopStream::CommFailure
exception is propagated to user space, which must never happen.
What is the best way of solving this issue?

Cheers,
Sergei

PS: In omniORB 4.0.7 this worked, since RECOVER_FORWARD threw the ex2
exception instead of rethrowing the original one:

#define RECOVER_FORWARD do {\
  omni::revertToOriginalProfile(this); \
  CORBA::TRANSIENT ex2(TRANSIENT_FailedOnForwarded, ex.completed()); \
  if( !_omni_callTransientExceptionHandler(this, retries++, ex2) ) \
    throw ex2; \
} while(0)
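
The difference between the two macro versions comes down to C++ rethrow semantics: inside a catch block, a bare `throw;` re-raises the exception currently being handled, whereas `throw ex2;` raises the newly constructed object. The following is a minimal standalone sketch of that behaviour. `CommFailure` and `Transient` are stand-in types, not the omniORB classes, and the sketch assumes the transient exception handler has declined to retry.

```cpp
#include <string>

struct CommFailure {};  // stand-in for giopStream::CommFailure
struct Transient {};    // stand-in for CORBA::TRANSIENT

// Mirrors the 4.1.3 macro: the bare `throw;` re-raises the CommFailure
// that is being handled, even though ex2 has been constructed.
void recoverWithBareThrow() {
    try {
        throw CommFailure();
    } catch (const CommFailure&) {
        Transient ex2;
        (void)ex2;  // constructed but never thrown
        throw;
    }
}

// Mirrors the 4.0.7 macro: ex2 replaces the internal exception.
void recoverWithThrowEx2() {
    try {
        throw CommFailure();
    } catch (const CommFailure&) {
        Transient ex2;
        throw ex2;
    }
}

// Hypothetical helper: names the exception type that escapes from f.
template <typename F>
std::string escapingException(F f) {
    try {
        f();
    } catch (const CommFailure&) {
        return "CommFailure";
    } catch (const Transient&) {
        return "Transient";
    }
    return "none";
}
```

Calling `escapingException(recoverWithBareThrow)` yields "CommFailure", while `escapingException(recoverWithThrowEx2)` yields "Transient", which matches the behaviour described for 4.1.3 and 4.0.7 respectively.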

Duncan Grisby

2008-12-12 23:45:22 UTC

Post by Serguei Kolos
I have the impression that a bug was introduced in omniORB 4.1.3 in the
handling of exceptions when reference forwarding is used.

[...]

Post by Serguei Kolos
The issue is that at line 790 it is used inside the catch(const
giopStream::CommFailure& ex) block. As a consequence, if a transient
exception handler returns 0, the giopStream::CommFailure exception is
propagated to user space, which must never happen.

Yes, it is a bug.

Post by Serguei Kolos
What is the best way of solving this issue?

Update to the latest CVS snapshot. The bug is already fixed there.

Cheers,

Duncan.
