Serguei Kolos
2009-03-15 21:31:53 UTC
Hello
I'm using omniORB 4.1.3 on Linux kernel 2.6 with gcc34.
In my configuration I have a single server application serving many
concurrent multi-threading
clients which are sending requests to this server from several threads
concurrently. I set
oneCallPerConnection parameter to 0 on the clients. In this
configuration very often
I have some clients timing out while the server is not really under
heavy load. To investigate the
situation I have run the server and the clients with the traceLevel 35
and I have found a possible
indication that when a bulk of messages arrives over the same connection
to the server and the
server splits them into separate buffers then it looks like only first
message gets processed and
others get lost.
Here is how it appears in the server output:
omniORB: inputMessage: from giop:tcp:10.146.34.229:52782 2224 bytes
omniORB: 128 bytes out of 2224
4749 4f50 0100 0100 6c00 0000 0000 0000 GIOP....l.......
1200 0000 0100 3d20 1800 0000 ff69 7063 ......= .....ipc
2f70 6172 7469 7469 6f6e 0061 6c6c 5f68 /partition.all_h
6f73 7473 0800 0000 7265 736f 6c76 6500 osts....resolve.
0000 0000 0200 0000 0e00 0000 6973 2f72 ............is/r
6570 6f73 6974 6f72 7900 0000 0400 0000 epository.......
6970 6300 0800 0000 5275 6e43 7472 6c00 ipc.....RunCtrl.
0400 0000 6970 6300 4749 4f50 0100 0100 ....ipc.GIOP....
omniORB: Split input data into 19 messages
omniORB: sendChunk: to giop:tcp:10.146.34.229:52782 176 bytes
omniORB: 128 bytes out of 176
4749 4f50 0100 0101 a400 0000 0000 0000 GIOP............
1200 0000 0000 0000 1600 0000 4944 4c3a ............IDL:
6973 2f72 6570 6f73 6974 6f72 793a 312e is/repository:1.
3000 0000 0100 0000 0000 0000 7000 0000 0...........p...
0101 0200 0e00 0000 3130 2e31 3436 2e34 ........10.146.4
322e 3232 3800 cf81 1900 0000 ff69 732f 2.228........is/
7265 706c 6963 61fe 3c35 bc49 0100 2dda replica.<5.I..-.
0000 0000 0000 0000 0200 0000 0000 0000 ................
The output message above is the answer to the very first call in the
input requests sequence.
Then the next appearance of an IO on the giop:tcp:10.146.34.229:52782
endpoint
is happening much later and looks like a new request/response pattern:
inputMessage: from giop:tcp:10.146.34.229:52782 116 bytes
omniORB:
4749 4f50 0102 0100 6c00 0000 0a00 0000 GIOP....l.......
0300 0000 0000 4f54 1800 0000 ff69 7063 ......OT.....ipc
2f70 6172 7469 7469 6f6e 0061 6c6c 5f68 /partition.all_h
6f73 7473 0800 0000 7265 736f 6c76 6500 osts....resolve.
0000 0000 0100 0000 0200 0000 0e00 0000 ................
6973 2f72 6570 6f73 6974 6f72 7900 0000 is/repository...
0400 0000 6970 6300 0400 0000 504d 4700 ....ipc.....PMG.
0400 0000 6970 6300 ....ipc.
omniORB: sendChunk: to giop:tcp:10.146.34.229:52782 176 bytes
omniORB: 128 bytes out of 176
4749 4f50 0100 0101 a400 0000 0000 0000 GIOP............
1000 0000 0000 0000 1600 0000 4944 4c3a ............IDL:
6973 2f72 6570 6f73 6974 6f72 793a 312e is/repository:1.
3000 0000 0100 0000 0000 0000 7000 0000 0...........p...
0101 0200 0e00 0000 3130 2e31 3436 2e34 ........10.146.4
322e 3233 3100 908f 1900 0000 ff69 732f 2.231........is/
7265 706c 6963 61fe 6335 bc49 0100 276c replica.c5.I..'l
0000 0000 0000 0000 0200 0000 0000 0000 ................
Is it a bug somewhere in the omniORB or I'm missing something?
Cheers,
Sergei
I'm using omniORB 4.1.3 on Linux kernel 2.6 with gcc34.
In my configuration I have a single server application serving many
concurrent multi-threading
clients which are sending requests to this server from several threads
concurrently. I set
oneCallPerConnection parameter to 0 on the clients. In this
configuration very often
I have some clients timing out while the server is not really under
heavy load. To investigate the
situation I have run the server and the clients with the traceLevel 35
and I have found a possible
indication that when a bulk of messages arrives over the same connection
to the server and the
server splits them into separate buffers then it looks like only first
message gets processed and
others get lost.
Here is how it appears in the server output:
omniORB: inputMessage: from giop:tcp:10.146.34.229:52782 2224 bytes
omniORB: 128 bytes out of 2224
4749 4f50 0100 0100 6c00 0000 0000 0000 GIOP....l.......
1200 0000 0100 3d20 1800 0000 ff69 7063 ......= .....ipc
2f70 6172 7469 7469 6f6e 0061 6c6c 5f68 /partition.all_h
6f73 7473 0800 0000 7265 736f 6c76 6500 osts....resolve.
0000 0000 0200 0000 0e00 0000 6973 2f72 ............is/r
6570 6f73 6974 6f72 7900 0000 0400 0000 epository.......
6970 6300 0800 0000 5275 6e43 7472 6c00 ipc.....RunCtrl.
0400 0000 6970 6300 4749 4f50 0100 0100 ....ipc.GIOP....
omniORB: Split input data into 19 messages
omniORB: sendChunk: to giop:tcp:10.146.34.229:52782 176 bytes
omniORB: 128 bytes out of 176
4749 4f50 0100 0101 a400 0000 0000 0000 GIOP............
1200 0000 0000 0000 1600 0000 4944 4c3a ............IDL:
6973 2f72 6570 6f73 6974 6f72 793a 312e is/repository:1.
3000 0000 0100 0000 0000 0000 7000 0000 0...........p...
0101 0200 0e00 0000 3130 2e31 3436 2e34 ........10.146.4
322e 3232 3800 cf81 1900 0000 ff69 732f 2.228........is/
7265 706c 6963 61fe 3c35 bc49 0100 2dda replica.<5.I..-.
0000 0000 0000 0000 0200 0000 0000 0000 ................
The output message above is the answer to the very first call in the
input requests sequence.
Then the next appearance of an IO on the giop:tcp:10.146.34.229:52782
endpoint
is happening much later and looks like a new request/response pattern:
inputMessage: from giop:tcp:10.146.34.229:52782 116 bytes
omniORB:
4749 4f50 0102 0100 6c00 0000 0a00 0000 GIOP....l.......
0300 0000 0000 4f54 1800 0000 ff69 7063 ......OT.....ipc
2f70 6172 7469 7469 6f6e 0061 6c6c 5f68 /partition.all_h
6f73 7473 0800 0000 7265 736f 6c76 6500 osts....resolve.
0000 0000 0100 0000 0200 0000 0e00 0000 ................
6973 2f72 6570 6f73 6974 6f72 7900 0000 is/repository...
0400 0000 6970 6300 0400 0000 504d 4700 ....ipc.....PMG.
0400 0000 6970 6300 ....ipc.
omniORB: sendChunk: to giop:tcp:10.146.34.229:52782 176 bytes
omniORB: 128 bytes out of 176
4749 4f50 0100 0101 a400 0000 0000 0000 GIOP............
1000 0000 0000 0000 1600 0000 4944 4c3a ............IDL:
6973 2f72 6570 6f73 6974 6f72 793a 312e is/repository:1.
3000 0000 0100 0000 0000 0000 7000 0000 0...........p...
0101 0200 0e00 0000 3130 2e31 3436 2e34 ........10.146.4
322e 3233 3100 908f 1900 0000 ff69 732f 2.231........is/
7265 706c 6963 61fe 6335 bc49 0100 276c replica.c5.I..'l
0000 0000 0000 0000 0200 0000 0000 0000 ................
Is it a bug somewhere in the omniORB or I'm missing something?
Cheers,
Sergei