[omniORB] Question concerning problems with an old omniORB-version

Discussion:

mrfynn at gmx.net ()

2006-08-07 17:27:17 UTC

Hi @all,

Years ago I ported omniORB 3.0.4 to an embedded system we are using,
a realtime OS for a PowerPC CPU (8260).

So far so good. Our system has been running properly without any complaints for
4 years now.

Recently we made some changes to our system startup to make things run faster. This included changes to the linker command file and a new startup routine in our system (all so that the system runs from RAM and no longer from ROM, for speed).

With these changes our system encounters a weird crash when we do the following:

1. Establishing a connection to our CORBA interface running on the PowerPC.
(We export the stringified IOR for this purpose, so no IIOP-connection is necessary. The client is a simple VS 6 console application for tests.) The client does the following:

1.1. Creating an Object_var out of the sIor by calling string_to_object.
1.2. Narrowing the object-pointer to the designated interface-type.
1.3. Checking for is_nil.

2. Calling one of the methods of our CORBA interface.

3. Destroying the client's ORB.

4. Closing the client application by returning 0.

The system crash occurs only when the application performs step 2, and then only after the client application has been closed (after step 4).

I think there might be some problem when the remains of the client application are cleaned up after the return.

Can you suggest a point where I can start debugging my software on the server system? Any suggestions are welcome!

Thank you in advance,
Jakob

P.S.: For illustration a simplified code sequence:

int main(int argc, char* argv[])
{
    // Step 1.
    CORBA::ORB_var orb = CORBA::ORB_init(argc, argv, "omniORB3");

    // ... getting sIor

    // Step 1.1.
    CORBA::Object_var obj = orb->string_to_object(sIor);

    // Step 1.2.
    IManager_var ManagerRef = IManager::_narrow(obj);

    // Step 1.3. (checks the narrowed reference declared above)
    if( CORBA::is_nil(ManagerRef) )
    {
        printf("Can't narrow reference to type Manager (or it was nil).\n");
        return 1;
    }

    // Step 2.
    ManagerRef->GetSettings();

    // Step 3.
    orb->destroy();

    // Step 4.
    return 0;
} // <-- the crash occurs after the client application returns from here

Duncan Grisby

2006-08-08 17:38:35 UTC

On Monday 7 August, ***@gmx.net wrote:

[...]

Post by mrfynn at gmx.net ()
3. Destroying the client's ORB.
4. Closing client application by returning 0.
Only when the application is doing step 2 the system crash occurs, but
only after the client application is closed (after Step 4).

It's not totally clear from your message which program crashes, the
server on the embedded system or the Windows client. I think you're
talking about the server, aren't you?

If possible, run both client and server with -ORBtraceLevel 25 to see
what they think is going on.
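
On a target without a command line, one way to hand this option to the ORB is to append it to a synthetic argument vector before calling ORB_init. The following is only a sketch under that assumption: `withTraceLevel` and `toArgv` are hypothetical helper names, and the ORB_init call is shown as a comment because it needs the omniORB headers.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns a copy of the arguments with
// "-ORBtraceLevel <level>" appended.
std::vector<std::string> withTraceLevel(const std::vector<std::string>& args,
                                        int level)
{
    std::vector<std::string> out(args);
    out.push_back("-ORBtraceLevel");
    out.push_back(std::to_string(level));
    return out;
}

// Hypothetical helper: builds a null-terminated char* view of the arguments.
// The strings in 'args' must outlive the returned vector.
std::vector<char*> toArgv(std::vector<std::string>& args)
{
    std::vector<char*> argv;
    for (std::size_t i = 0; i < args.size(); ++i)
        argv.push_back(&args[i][0]);
    argv.push_back(nullptr);
    return argv;
}

// Intended use (requires omniORB, so shown as a comment only):
//   std::vector<std::string> base;
//   base.push_back("server");
//   std::vector<std::string> args = withTraceLevel(base, 25);
//   std::vector<char*> argv = toArgv(args);
//   int argc = static_cast<int>(argv.size()) - 1;
//   CORBA::ORB_var orb = CORBA::ORB_init(argc, argv.data(), "omniORB3");
```

On the Windows client the same option can simply be given on the command line.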

I assume that the network connection is not being closed by orb
destruction, but only happens when the client exits.

One wild guess is that your new compile options mean you no longer have
thread safe C++ exception handling. When the client connection goes
away, the server will throw an exception that should be caught higher up
the call chain. If the exception handling is broken, that could cause a
crash.
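
To test that guess in isolation, a small program that throws and catches in several threads at once may help. This is a minimal sketch, not omniORB code: it assumes a toolchain with std::thread (on the embedded target the RTOS task API or omnithread would take its place), and the function names are made up for illustration.

```cpp
#include <atomic>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>

// Throws from a nested call so the exception has to leave one stack frame
// before it reaches the handler.
static void failingCall(int id)
{
    throw std::runtime_error("connection broken in worker " + std::to_string(id));
}

// Starts 'threads' workers; each throws and catches 'iterations' times.
// Returns the total number of exceptions that were caught.
int runThreadedThrowTest(int threads, int iterations)
{
    std::atomic<int> caught(0);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&caught, t, iterations] {
            for (int i = 0; i < iterations; ++i) {
                try {
                    failingCall(t);
                } catch (const std::runtime_error&) {
                    ++caught;
                }
            }
        });
    }
    for (std::size_t i = 0; i < pool.size(); ++i)
        pool[i].join();
    return caught.load();
}
```

If exception handling is thread safe, `runThreadedThrowTest(4, 1000)` should return 4000; a crash or a smaller count would point at the runtime rather than at omniORB.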

Cheers,

Duncan.

--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --

mrfynn at gmx.net ()

2006-08-09 15:02:31 UTC

Hi Duncan,

thanks for your response!

Post by Duncan Grisby
It's not totally clear from your message which program crashes, the
server on the embedded system or the Windows client. I think you're
talking about the server, aren't you?

Yes indeed the server crashes.

Post by Duncan Grisby
If possible, run both client and server with -ORBtraceLevel 25 to see
what they think is going on.

Thanks for reminding me of this option, I will have a look there.

Post by Duncan Grisby
I assume that the network connection is not being closed by orb
destruction, but only happens when the client exits.

Yes, you are right. Yesterday I managed to debug into the
tcpSocketWorker::_realRun method, just up to the point where
the exception omniConnectionBroken is caught, or should be.
From then on my system's behaviour looks very weird.

After that I tested another kind of exception by transmitting
a nonexistent (i.e. too large) CORBA enum constant during a method call.
During the << operation, _CORBA_marshal_error() is then invoked, as expected.
The throw statement that follows (in _CORBA_marshal_error) also leads to a
crash, although the symptoms differ.

A simple try-catch block with a thrown int exception does work properly,
but I don't think this will be of any use to me at the moment.
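
A more demanding check than the int case could look like the sketch below: it throws a class type through several frames that hold objects with destructors and catches it by base class, which exercises the unwinding and type-matching tables. The types `BaseError`, `BrokenConnection` and `Guard` are hypothetical stand-ins, not omniORB classes.

```cpp
// Hypothetical exception hierarchy, standing in for a real one.
struct BaseError { virtual ~BaseError() {} };
struct BrokenConnection : BaseError {};

// Counts destructor calls so that stack unwinding becomes observable.
struct Guard {
    int& counter;
    explicit Guard(int& c) : counter(c) {}
    ~Guard() { ++counter; }
};

static void level3(int& unwound) { Guard g(unwound); throw BrokenConnection(); }
static void level2(int& unwound) { Guard g(unwound); level3(unwound); }
static void level1(int& unwound) { Guard g(unwound); level2(unwound); }

// Returns the number of Guard destructors run during unwinding,
// -1 if the exception was not matched by its base class,
// -2 if nothing was thrown at all.
int runUnwindTest()
{
    int unwound = 0;
    try {
        level1(unwound);
    } catch (const BaseError&) {
        return unwound;
    } catch (...) {
        return -1;
    }
    return -2;
}
```

With working exception handling `runUnwindTest()` returns 3, one per frame that was unwound.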

Post by Duncan Grisby
One wild guess is that your new compile options mean you no longer have
thread safe C++ exception handling. When the client connection goes
away, the server will throw an exception that should be caught higher up
the call chain. If the exception handling is broken, that could cause a
crash.

The compile options remained the same. But as far as I have figured out so far
(a wild guess, too), the place I will have to look has something to do with exception handling.
Our new speedup extension uses some tricks. The linker command file is written so that the code is linked to run at RAM addresses but is stored in ROM.
A new startup routine initializes the runtime environment and afterwards
copies the needed code sections into RAM. That is where I will have to look,
because this speedup extension has so far been used with neither exception handling nor CORBA ("black magic").
So I will start by comparing the map files of the server firmware with and
without the extension to see whether there is some difference, because without this speedup extension everything works fine!
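
Besides comparing the maps, the copy step itself can be checked at runtime. The sketch below simulates that with plain buffers; the `SectionCopy` table, its field names and the section labels are assumptions for illustration only, since the real startup routine and linker symbols are specific to our toolchain.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical descriptor of one section the startup routine has to copy.
struct SectionCopy {
    const char* name;           // label used for diagnostics only
    const unsigned char* load;  // where the image is stored (ROM)
    unsigned char* run;         // where the code was linked to run (RAM)
    std::size_t size;           // number of bytes to copy
};

// Copies every listed section from its load address to its run address.
void copySections(const std::vector<SectionCopy>& table)
{
    for (std::size_t i = 0; i < table.size(); ++i)
        std::memcpy(table[i].run, table[i].load, table[i].size);
}

// Returns the name of the first section whose RAM copy differs from the
// ROM image, or nullptr if all listed sections match.
const char* firstMismatch(const std::vector<SectionCopy>& table)
{
    for (std::size_t i = 0; i < table.size(); ++i)
        if (std::memcmp(table[i].run, table[i].load, table[i].size) != 0)
            return table[i].name;
    return nullptr;
}
```

Running such a check over every section that the map file places at a RAM address, including whatever sections the compiler uses for its exception tables, would show whether the startup routine skips one of them.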

If you are interested, I will keep you informed about my progress!

Cheers,

Jakob