Discussion:
[omniORB] Big lag in method call
Sylvain Gault
2013-07-31 19:38:19 UTC
Hi there.

I'm facing a problem right now. My application is a bit complex (it's a
MapReduce implementation), so I'll try to keep only the meaningful parts.

I have several processes, each running on a distinct node.
- One Master process.
- M Mapper processes.
- R Reducer processes.

For the example, let's say M = R = 10.

The Master controls everything.
At some point, the Master will start 10 threads to perform 10 calls
simultaneously to a method of one mapper process (named PM_1) to start
the data transfers to the 10 reducer processes (named PR_1 to PR_10).

And the transfer method on PM_1 will call a method of PR_1 to actually
transfer its data.

The problem I have is that I apparently can't make another method call
to PM_1 while those 10 transfers are running.
When I try, I get a lag of several seconds and my method call only
succeeds once some of those 10 transfers have finished.

I first thought my data transfers were too bandwidth intensive and
delayed my other short call. But given the statistics, it looks more
like a limit on the number of parallel method calls.


Any clue about this?


Thanks.
Regards,
Sylvain Gault
Sylvain Gault
2013-07-31 21:32:08 UTC
Replying to myself: I found the configuration option
maxGIOPConnectionPerServer, whose default value is 5. That limit was
what blocked my calls. A value of 1000 should be better. :]
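
For anyone hitting the same lag, here is a minimal sketch of how that option could be set in an omniORB configuration file. It assumes the usual "name = value" syntax of omniORB.cfg. The value 1000 is just the one mentioned above, not a tuned recommendation. As far as I understand, the limit applies on the calling side, so it would go in the configuration read by the process that makes the concurrent calls (the Master in this setup).

```
# omniORB.cfg (sketch; the file location depends on your installation)
# Maximum number of concurrent connections the ORB opens to one server.
# The default is 5, which caps the number of parallel calls to PM_1.
maxGIOPConnectionPerServer = 1000
```

The same option can also be given on the command line as "-ORBmaxGIOPConnectionPerServer 1000", as long as the program passes its arguments to CORBA::ORB_init.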


Sylvain Gault
