Sylvain Gault
2013-07-31 19:38:19 UTC
Hi there.
I'm facing a problem right now. My application is a bit complex (it's a
MapReduce implementation), so I'll try to keep only the meaningful parts.
I have several processes, each running on a distinct node.
- One Master process.
- M Mapper processes.
- R Reducer processes.
For the example, let's say M = R = 10.
The Master controls everything.
At some point, the Master will start 10 threads to perform 10 calls
simultaneously to a method of one mapper process (named PM_1) to start
the data transfers to the 10 reducer processes (named PR_1 to PR_10).
And the transfer method on PM_1 will call a method of PR_1 to actually
transfer its data.
The problem I have is that I apparently can't make another method call
to PM_1 while those 10 transfers are running.
When I try, I get a lag of several seconds and my method call only
succeeds once some of those 10 transfers have finished.
I first thought my data transfers were too bandwidth intensive and
delayed my other short call. But given the statistics, it looks more
like a limit on the number of parallel method calls.
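
To make that hypothesis concrete, here is a minimal sketch in plain Python.
It is not my application and does not use my actual middleware: the pool
size of 10 and the names demo, transfer and short_call are assumptions made
up for the illustration. It only shows that if the callee served incoming
calls from a fixed pool of 10 threads, 10 long transfers would hold every
thread and a short call would have to wait until one of them finished.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

POOL_SIZE = 10  # assumed limit on concurrent calls served by PM_1 (hypothetical)


def demo(pool_size=POOL_SIZE, transfers=10):
    """Return (short call ran while transfers were active, its result)."""
    release = threading.Event()        # set when the transfers may finish
    started = threading.Semaphore(0)   # counts transfers that are running

    def transfer():
        # Stands in for a long data transfer to one reducer.
        started.release()
        release.wait()
        return "transfer done"

    def short_call():
        # Stands in for the short, unrelated method call.
        return "pong"

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        for _ in range(transfers):
            pool.submit(transfer)
        # Wait until as many transfers as the pool allows are running.
        for _ in range(min(transfers, pool_size)):
            started.acquire()

        short = pool.submit(short_call)
        try:
            short.result(timeout=0.5)
            ran_while_busy = True
        except FutureTimeout:
            # No free worker: the short call is still queued.
            ran_while_busy = False

        release.set()                  # let the transfers complete
        result = short.result(timeout=5)
    return ran_while_busy, result


if __name__ == "__main__":
    print(demo(pool_size=10))  # short call waits for a free worker
    print(demo(pool_size=11))  # one spare worker serves it right away
```

With one spare worker the short call is served immediately, which is the
behaviour I expected to see in my application.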
Any clue about this?
Thanks.
Regards,
Sylvain Gault