Discussion:
[omniORB] Memory corruption when using omniORB 4.1.4 with SLES 11 64-bit
Jingdong Sun
2013-03-27 18:36:06 UTC
Hi there,

I am using omniORB 4.1.4 in my project.
Recently, while testing on SLES, I noticed that the server side sometimes
(not always) hits memory corruption.

With ORBtraceLevel set to 45, I got the following trace information:
omniORB: (7) inputMessage: from giop:tcp:[::ffff:10.6.25.60]:56354 2048 bytes
omniORB: (7)
4749 4f50 0102 0300 6467 0000 0a00 0000 GIOP....dg......
0300 0000 0000 0000 0e00 0000 fed6 ef51 ...............Q
5100 0034 2e00 0000 0000 6f72 0800 0000 Q..4......or....
7374 6172 7450 4500 0000 0000 2234 3522 startPE....."45"
2e67 0000 3c3f 786d 6c20 7665 7273 696f .g..<?xml versio
6e3d 2231 2e30 2220 656e 636f 6469 6e67 n="1.0" encoding
3d22 5554 462d 3822 2073 7461 6e64 616c ="UTF-8" standal
6f6e 653d 226e 6f22 203f 3e0a 3c61 7567 one="no" ?>.<aug
(Jingdong: I skipped some lines here......)
2020 3c74 743a 6174 7472 206e 616d 653d <tt:attr name=
omniORB: (7) inputCopyChunk: from giop:tcp:[::ffff:10.6.25.60]:56354 24432 bytes
omniORB: (7)
2263 6861 696e 4964 2220 7479 7065 3d22 "chainId" type="
696e 7433 3222 2f3e 0a20 2020 2020 203c int32"/>. <
(Jingdong: I skipped some lines here too.....)
(Jingdong: the following part is corrupted; it is not the content I expected.)
3020 3820 3020 3020 3020 3020 3120 3120 0 8 0 0 0 0 1 1
3020 3020 3020 3120 3120 340a 3120 3435 0 0 0 1 1 4.1 45
omniORB: (7) inputMessage: from giop:tcp:[::ffff:10.6.25.60]:56354 18 bytes
omniORB: (7)
4749 4f50 0102 0107 0600 0000 0a00 0000 GIOP............
0a00
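
Each of the two inputMessage dumps above begins with a 12-byte GIOP header, which can be decoded by hand. The sketch below does that in Python, assuming the standard GIOP 1.2 header layout (magic, version, flags, message type, body size). The function name parse_giop_header and the dictionary keys are made up for illustration and are not part of omniORB.

```python
import struct

# GIOP message types as listed in the CORBA specification.
MESSAGE_TYPES = {
    0: "Request", 1: "Reply", 2: "CancelRequest", 3: "LocateRequest",
    4: "LocateReply", 5: "CloseConnection", 6: "MessageError", 7: "Fragment",
}


def parse_giop_header(hex_dump):
    """Decode the fixed 12-byte GIOP header from a hex string (hypothetical helper)."""
    raw = bytes.fromhex(hex_dump.replace(" ", ""))[:12]
    if len(raw) < 12 or raw[:4] != b"GIOP":
        raise ValueError("not a GIOP header")
    major, minor, flags, msg_type = raw[4], raw[5], raw[6], raw[7]
    # Bit 0 of the flags octet selects the byte order, bit 1 marks "more fragments".
    little_endian = bool(flags & 0x01)
    (size,) = struct.unpack("<I" if little_endian else ">I", raw[8:12])
    return {
        "version": (major, minor),
        "little_endian": little_endian,
        "more_fragments": bool(flags & 0x02),
        "type": MESSAGE_TYPES.get(msg_type, "Unknown"),
        "body_size": size,
    }


# First 16 bytes of each inputMessage dump, copied from the trace.
first = parse_giop_header("4749 4f50 0102 0300 6467 0000 0a00 0000")
last = parse_giop_header("4749 4f50 0102 0107 0600 0000 0a00 0000")
print(first)
print(last)
```

Under these assumptions the first message decodes as a little-endian GIOP 1.2 Request with the "more fragments" flag set and a body of 26468 bytes, and the 18-byte message is the closing Fragment with a 6-byte body.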

What I noticed:
1. The memory corruption does not happen every time, and when it does
happen, a second try generally succeeds.
2. All the corruption I have seen so far involved relatively large data
(about 24K) and occurred in "inputCopyChunk", as the trace above shows.
3. The size the server side receives is correct even when the content is
corrupted. (The size of 24432 bytes is correct in the example I copied here.)
omniORB: (7) inputCopyChunk: from giop:tcp:[::ffff:10.6.25.60]:56354 24432 bytes
4. When corruption happens, sometimes the content is simply truncated, and
sometimes the end of the content is replaced by meaningless data.
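
The size in observation 3 can be cross-checked against the trace itself. This is a minimal sketch, assuming the standard 12-byte GIOP header and assuming that the 2048-byte inputMessage and the 24432-byte inputCopyChunk together make up the first message. The variable names are illustrative only.

```python
GIOP_HEADER_SIZE = 12  # fixed GIOP header length in bytes

# Bytes 8..11 of the first header in the trace ("6467 0000"), little-endian.
body_size = int.from_bytes(bytes.fromhex("64670000"), "little")

expected_total = GIOP_HEADER_SIZE + body_size  # header plus declared body
received_total = 2048 + 24432                  # inputMessage plus inputCopyChunk

# The trailing Fragment message: 12-byte header plus the 6-byte body it declares.
fragment_total = GIOP_HEADER_SIZE + int.from_bytes(bytes.fromhex("06000000"), "little")

print(expected_total, received_total, fragment_total)
```

Both totals for the first message come to 26480 bytes, and the Fragment comes to the 18 bytes the trace reports, which is consistent with the sizes being right even when the content is not.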

Please help me.
Thanks.
Jingdong Sun
InfoSphere Streams Development
Phone 507 253-5958 (T/L 553-5958)
jindong at us.ibm.com
姜维
2013-03-28 01:22:25 UTC
It's very likely to be a bug in your code.




_______________________________________________
omniORB-list mailing list
omniORB-list at omniorb-support.com
http://www.omniorb-support.com/mailman/listinfo/omniorb-list
Jingdong Sun
2013-03-28 04:29:28 UTC
All the trace points are from omniORB, so why do you think it is a bug in my
code?
Can you give me some examples of bugs in my code that could make this kind
of memory corruption show up in omniORB?

Thanks.
Jingdong Sun
InfoSphere Streams Development
Phone 507 253-5958 (T/L 553-5958)
jindong at us.ibm.com



姜维
2013-03-28 05:32:03 UTC
"Talk is cheap, show me your code."
Jayaraman, Thirupurasundari
2013-03-28 05:32:54 UTC
Hi

We found the Google performance tools, specifically TCMALLOC_DEBUG, very effective for tracking down memory corruption.

You need to link your code with the TCMALLOC_DEBUG option. The executable will then crash every time there is a potential memory issue. We were able to fix quite a few issues by running the executable in this mode.

The executable will run slowly and use more RAM, so please make sure the use case you use to narrow the problem down can run without consuming a lot of memory.

Check gperftools on code.google.com.


Regards
Sundari.

Jingdong Sun
2013-03-28 12:57:54 UTC
Thank you both.

My original code is part of my product, so I will create a test case to try
this.

I will update when I have results.

Thanks again.
Jingdong Sun
InfoSphere Streams Development
Phone 507 253-5958 (T/L 553-5958)
jindong at us.ibm.com


