Discussion:
[omniORB] Re: omni::createObjRef thread safety
Luke Deller
2010-01-28 04:09:20 UTC
Hi,

Duncan, did you have any thoughts on this issue?

Having used this patch in production for three months, I can now say with more confidence that it has fixed this crash for us.

Regards,
Luke.

-----Original Message-----
From: Luke Deller
Sent: Tuesday, 27 October 2009 6:27 PM
To: 'omniorb-***@omniorb-support.com'
Subject: omni::createObjRef thread safety

Hi,

We've been getting an occasional but recurring crash under load in the omni::createObjRef function in omniInternal.cc. It looks to me like a thread safety issue. I have some confidence that the error is indeed in omniORB rather than in our own code, because the attached patch to omniORB seems to have stopped the crash from occurring.

This is using a snapshot of omniORB (and omniORBpy) from 20080908, but I think that there have been no relevant changes since then.
From analysing the crash dump files (Windows "mini-dump" files), I can see that the crash is caused by memory corruption in the "omniIdentity" object pointed to by the "id" local variable, resulting in an access violation on line 1064 or 1069 of omniInternal.cc (which line varies with the nature of the corruption):

1064:   omniObjRef* objref = pof->newObjRef(ior, id);
1065:   if( target_intf_not_confirmed ) objref->pd_flags.type_verified = 0;
1066:
1067:   {
1068:     omni_optional_lock sync(*internalLock, locked, locked);
1069:     id->gainRef(objref);
1070:   }

The call stack is always as follows:

[1] omniORB413_vc9_rt!omni::createObjRef
[2] omniORB413_vc9_rt!omniObjRef::_unMarshal
[3] omniORB413_vc9_rt!CORBA::Object::_unmarshalObjRef
[4] omniDynamic413_vc9_rt!CORBA::Object_Member::operator<<=
[5] omniDynamic413_vc9_rt!omni::fastCopyUsingTC
[6] omniDynamic413_vc9_rt!omni::tcParser::copyStreamToStream
[7] omniDynamic413_vc9_rt!CORBA::Any::operator<<=
[8] _omniOTS!CosTransactions::PropagationContext::operator<<=
[9] _omniOTS!GetTxnContext
[10] omniORB413_vc9_rt!omni::GIOP_S::handleRequest
[11] omniORB413_vc9_rt!omni::GIOP_S::dispatcher

Note that this is called from the "serverReceiveRequest" interceptor installed by omniOTS, which is called very often and from multiple threads.

It is trying to unmarshal the CosTransactions::PropagationContext struct. Here is the IDL (note that the implementation_specific_data field in CosTransactions::PropagationContext contains an omniOTS::SpecificData struct).

module CosTransactions { // from standard OMG IDL
  struct PropagationContext {
    unsigned long timeout;
    TransIdentity current;
    sequence <TransIdentity> parents;
    any implementation_specific_data;
  };
};
module omniOTS { // from omniOTS source code
  struct SpecificData {
    octet magic[7];
    short version_major;
    short version_minor;
    CosTransactions::Control ctrl;
  };
};

I can see from inspecting variables in the crash dump that it is trying to unmarshal the final CosTransactions::Control object reference. I can't tell you yet whether this object is in-process or not - either case could arise.

In the omni::createObjRef function I notice that if locked=0 then there is a gap during which omni::internalLock is NOT held, after "omniIdentity* id" has been obtained from omni::createIdentity but before it is used (at which point I am seeing that it is occasionally corrupted). My theory is that omni::createIdentity is returning a valid object but that the object is deallocated or otherwise rendered invalid by another thread during this gap. Is this possible?
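
To make the suspected gap concrete, here is a minimal sketch of the same shape of problem. It is not omniORB code: Registry, Identity, lookupUnpinned and lookupAndPin are hypothetical names invented for illustration, and the sketch assumes the identity's lifetime is controlled by a reference count protected by a single lock. It does not show what the attached patch changes.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for an identity object; NOT omniORB's omniIdentity.
struct Identity {
    int refs = 0;  // only modified while Registry's lock is held
};

// Hypothetical table of identities guarded by one lock.
class Registry {
public:
    // Risky shape: the lock is dropped as soon as the pointer is returned.
    // If the caller takes its reference later, another thread can release
    // and destroy the entry in the gap, leaving the caller's pointer dangling.
    Identity* lookupUnpinned(const std::string& key) {
        std::lock_guard<std::mutex> guard(lock_);
        return findOrCreate(key);
    }

    // Safer shape: look up and take the reference in one critical section,
    // so the entry cannot be destroyed before the caller owns a reference.
    Identity* lookupAndPin(const std::string& key) {
        std::lock_guard<std::mutex> guard(lock_);
        Identity* id = findOrCreate(key);
        ++id->refs;
        return id;
    }

    // Drop one reference; the entry is destroyed when none remain.
    void release(const std::string& key) {
        std::lock_guard<std::mutex> guard(lock_);
        auto it = table_.find(key);
        if (it != table_.end() && --it->second->refs <= 0) table_.erase(it);
    }

    bool contains(const std::string& key) {
        std::lock_guard<std::mutex> guard(lock_);
        return table_.count(key) != 0;
    }

private:
    // Caller must hold lock_.
    Identity* findOrCreate(const std::string& key) {
        std::unique_ptr<Identity>& slot = table_[key];
        if (!slot) slot.reset(new Identity());
        return slot.get();
    }

    std::mutex lock_;
    std::map<std::string, std::unique_ptr<Identity>> table_;
};
```

With lookupAndPin, a release by another holder leaves the entry alive because the caller already owns a reference. With lookupUnpinned, the same release destroys the entry and any later use of the returned pointer touches freed memory, which would match the kind of corruption seen in the dumps if the theory is right.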

See the attached patch, which seems to stop the crash from happening. It is not an ideal solution, but I think it does support my theory of what the problem is.

Regards,
Luke.

**********************************************************************************************
Important Note
This email (including any attachments) contains information which is confidential and may be subject to legal privilege. If you are not the intended recipient you must not use, distribute or copy this email. If you have received this email in error please notify the
sender immediately and delete this email. Any views expressed in this email are not necessarily the views of IRESS Market Technology Limited.

It is the duty of the recipient to virus scan and otherwise test the information provided before loading onto any computer system.
IRESS Market Technology Limited does not warrant that the information is free of a virus or any other defect or error.
**********************************************************************************************
-------------- next part --------------
A non-text attachment was scrubbed...
Name: omni-objreflocking.patch
Type: application/octet-stream
Size: 493 bytes
Desc: omni-objreflocking.patch
Url : http://www.omniorb-support.com/pipermail/omniorb-list/attachments/20100128/2a8427f1/omni-objreflocking.obj
Duncan Grisby
2010-01-28 16:30:01 UTC
Post by Luke Deller
Duncan did you have any thoughts on this issue?
Hi Luke,

Did you not see my reply? I sent a different patch for you to try out:

http://www.omniorb-support.com/pipermail/omniorb-list/2009-December/030458.html

Can you try that and see if it solves the problem? I hope so because
it's gone in to svn...

Cheers,

Duncan.
--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --
Luke Deller
2010-01-28 18:50:22 UTC
Hi Duncan,

Oh, I missed that reply. Thanks for having a look! I'll test your patch.

Regards,
Luke.

