[omniORB] Memory leak related to Proxy Object Factories

Discussion:

Tuyen Chau

2006-12-06 06:13:33 UTC

I need some help tracking down what appears to be a memory leak deep in
the proxy object factories (POF), or how they interact with other
components in omniORB. In our application, we derive from the default
POFs to instantiate our own proxies. This all works fine. However,
when we take another step in our proxy object factories and try to cache
our proxies so that there is a unique object reference for each servant,
we ran into a memory leak of about 600-700 bytes, each time our POF is
called.

Here's how we cache the object. We use the key in the omniIdentity to
identify object. If an object has been created for that key, we
increment the reference count on the object and return it. If not, we
create a new object and cache it. Our newObjRef(() function looks
approximately like this,

OurPofClass::newObjRef(omniIOR* ior, omniIdentity* ident) {

unsigned char *key = (unsigned char *)ident->key();
int keySize = ident->keysize();
omniObjRef obj;

obj = FIND_IN_CACHE(key, keySize);

if (obj == NULL) {

obj = new OUR PROXY();

CACHE(obj, key, keySize);

} else {

ClassName::_duplicate(dynamic_cast<ClassName_ptr>(obj));
}

return obj;
}

As far we can tell, there is no leak in our cache implementation, which
really is a hash table. If we comment out just the line of code that
puts the object in the cache, the memory leak is gone. If we put the
object in the cache, the memory leak is back. It seems that omniORB
checks on the return object and do something different if the object is
not "brand new", and that something "different" is causing the memory
leak. But then we are puzzled as to how omniORB can even tell if the
return object is "brand new" vs. cached.

Does this make sense to any of you? Does anyone know how can we fix
this memory leak?

Best regards,
Tuyen

Duncan Grisby

2006-12-12 00:31:30 UTC

Permalink

Post by Tuyen Chau
I need some help tracking down what appears to be a memory leak deep
in the proxy object factories (POF), or how they interact with other
components in omniORB. In our application, we derive from the default
POFs to instantiate our own proxies. This all works fine. However,
when we take another step in our proxy object factories and try to
cache our proxies so that there is a unique object reference for each
servant, we ran into a memory leak of about 600-700 bytes, each time
our POF is called. Here's how we cache the object. We use the key in
the omniIdentity to identify object. If an object has been created
for that key, we increment the reference count on the object and
return it. If not, we create a new object and cache it. Our
newObjRef(() function looks approximately like this,
OurPofClass::newObjRef(omniIOR* ior, omniIdentity* ident) {

The things that are leaking are the ior and identity. The new object
reference takes ownership of them, so if you return an existing object
reference, they are never released. It's easy for you to release the
ior, but the identity is more tricky since it is assigned to the objref
in createObjRef() in omniInternal.cc. You'll have to tweak that to look
to see if the objref already has a different identity and release the
new one if it does.

Cheers,

Duncan.

--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --

Tuyen Chau

2006-12-15 01:29:33 UTC

Permalink

Post by Duncan Grisby
The things that are leaking are the ior and identity. The new object
reference takes ownership of them, so if you return an existing object
reference, they are never released. It's easy for you to release the
ior, but the identity is more tricky since it is assigned to the objref
in createObjRef() in omniInternal.cc. You'll have to tweak that to look
to see if the objref already has a different identity and release the
new one if it does.
Cheers,
Duncan.

Thanks for your response, Duncan.

I've looked into the omniOrb code as you suggested and it is not as
straight forward as it sounds, so instead, I looked for work arounds.
The work-around that we came up and are happy with, is to create a dummy
objref to take ownership of the ior and identity, and set a timer for 1
second with the dummy objref passed into the timer handler as a
parameter. When the timer expires, the timer handler gets called, and
releases the dummy objref.

In this work around, we assume 1 second is long enough to execute the
omniOrb code between the newObjRef() call and before calling our code.
We also assume that returning a cache objref, instead of a new one, does
not confuse omniOrb in any other way other than the memory leak problem.

We've tested our work-around and got mixed results. In a simple test,
the leak is gone. In a much more complicated test that is supposed to
run for days, we got a C++ Runtime exception (no pure virtual function
call) that seemed to happen on a random basis. We are not sure if the
exception that we got is a different problem altogether or related to
the work-around that we took. Does this work-around look OK to you, or
are we missing the boat here?

Thanks so much in advance for any assistance that you can provide,
Tuyen

Duncan Grisby

2006-12-22 21:56:30 UTC

Permalink

Post by Tuyen Chau
I've looked into the omniOrb code as you suggested and it is not as
straight forward as it sounds, so instead, I looked for work arounds.
The work-around that we came up and are happy with, is to create a
dummy objref to take ownership of the ior and identity, and set a
timer for 1 second with the dummy objref passed into the timer handler
as a parameter. When the timer expires, the timer handler gets
called, and releases the dummy objref.
In this work around, we assume 1 second is long enough to execute the
omniOrb code between the newObjRef() call and before calling our code.
We also assume that returning a cache objref, instead of a new one,
does not confuse omniOrb in any other way other than the memory leak
problem.
We've tested our work-around and got mixed results. In a simple test,
the leak is gone. In a much more complicated test that is supposed to
run for days, we got a C++ Runtime exception (no pure virtual function
call) that seemed to happen on a random basis. We are not sure if the
exception that we got is a different problem altogether or related to
the work-around that we took. Does this work-around look OK to you,
or are we missing the boat here?

Most likely, you were unlucky and the 1 second timeout expired before
omniORB tried to call a method on the object. That would explain the
pure virtual method called error you see. I don't think your approach is
at all safe.

Why are you returning cached object references? I was assuming that it
was to avoid the overhead of constructing new ones, but if you consider
it acceptable to create a new one then immediately delete it again, that
can't be the reason. So you must be doing it to share some state between
them. Rather than trying to fight omniORB's desire to create new
objrefs, why don't you put whatever shared state you have into a
sub-object that is shared between all the objrefs that omniORB creates.
That way you can do whatever it is you need without having to subvert
what omniORB tries to do.

Cheers,

Duncan.

--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --

Tuyen Chau

2006-12-22 23:48:11 UTC

Permalink

Post by Duncan Grisby
Most likely, you were unlucky and the 1 second timeout expired before
omniORB tried to call a method on the object. That would explain the
pure virtual method called error you see. I don't think your approach is
at all safe.
Why are you returning cached object references? I was assuming that it
was to avoid the overhead of constructing new ones, but if you consider
it acceptable to create a new one then immediately delete it again, that
can't be the reason. So you must be doing it to share some state between
them. Rather than trying to fight omniORB's desire to create new
objrefs, why don't you put whatever shared state you have into a
sub-object that is shared between all the objrefs that omniORB creates.
That way you can do whatever it is you need without having to subvert
what omniORB tries to do.
Cheers,
Duncan.

Thanks for your clarification, Duncan. You are right that we are not
concerned about the overhead of constructing new objrefs. We are
caching non-dynamic data in the proxies and don't want to go and fetch
it from the server every time we need to create an objref. In addition,
we are also storing our own client data in the proxies that can't be
easily recreated.

We can cache data in a different way without storing it in the proxies
as you suggested, but we couldn't afford the work at this time. We are
porting our code from Orbix, and did not want to change more code than
necessary to avoid introducing defects into the product. In general, we
are happy with omniORB but there's always surprises that we need to sort
out on how to deal with them.

We ended up changing omniORB to properly fix the memory leak. Our code
in omniInternal.cc now looks like this:

{
omni_optional_lock sync(*internalLock, locked, locked);

// If the objref already has a different identity, release the
// new one. This can happen if a custom pof wants to reuse
// an objref. (Verano patch for memory leak)
//
if (objref->pd_id == id) {
id->gainRef(objref);
} else {
ior->release();
id->loseRef(NULL);
}
}

We figured the mutex call in here is what killed the 1 second timeout,
and gave up on that work-around.

Does this look like code that you can incorporate into future releases
of omniORB? We would be happy if you could do that so we don't need to
maintain a separate version for it. We think that people who come from a
different implementation of CORBA like Orbix, would probably require the
unique objref feature like we do. This would make it easier for them to
migrate to omniORB. If you can think of a better way to rewrite the
code above, please let us know too.

Best regards,
Tuyen