Discussion:
[omniORB] Synchronization between omniNames service & Services
R. P. Janaka
2008-08-04 17:32:33 UTC
Permalink
Hi all,

I am writing a simple fault-tolerant server that can handle client requests
without any service interruption, using C++ and omniORB.

In my system there are two servers, a Master and a Slave.

When they start, they register with the name service as usual.
Normally the first one becomes the Master and the next one the Slave.

Everything works properly, but I still have a simple problem that I was unable
to solve. It is as follows.


- When both Master & Slave are running....
- Manually kill the Slave
- Start the Slave again.

In some situations this Slave restart does not work properly and throws an
exception. The exception is my own, and it says that the Slave already
exists.
This happens randomly, but the frequency is pretty high (at
least 30% of the time).

If I retry several times, it eventually succeeds: calling the start method
again within the catch block noticeably reduces how often the exception
occurs.


So I guess the problem is with the name service. When I just kill
the Slave, the name service does not know anything about it, so it
still keeps the registration details of the Slave. Because of that, it throws
an exception when I try to register the Slave again.

Can anyone think of any other reason for this problem?

If my guess is correct, can anyone help me solve it?
Is there continuous synchronization between the name service and the
service itself?
--
Regards,
R. P. Janaka
evgeni.rojkov at durr.com
2008-08-04 18:14:26 UTC
Permalink
Which exception do you get?
Is it CosNaming::NamingContext::AlreadyBound?
If that is the case, try rebind().
Kind Regards, Evgeni

CORBA::Object_var obj = orb->resolve_initial_references("NameService");
CosNaming::NamingContext_var rootContext = CosNaming::NamingContext::_narrow(obj);
.....

try {
    // bind() fails if the name is already registered
    rootContext->bind(objectName, objref);
}
catch(CosNaming::NamingContext::AlreadyBound& ex) {
    // replace the binding left behind by the previous instance
    rootContext->rebind(objectName, objref);
}
....

Michael
2008-08-04 18:17:32 UTC
Permalink
Hi,

there is no constant synchronisation between the two. You just register
an IOR at the naming service in a specific context (this is basically
like writing information to a database).

You could do the following on slave restart:

try
{
    context->bind(name, objref);
}
catch(CosNaming::NamingContext::AlreadyBound& ex)
{
    context->rebind(name, objref);
}

This assumes that context is the context you want to register in (the "path"
in the naming service), objref is the object reference of your slave
service, and name is the name you want to register (a CosNaming::Name).
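
To make the "database" comparison concrete, here is a minimal self-contained sketch of why a restarted server can hit AlreadyBound. ToyNamingContext, its AlreadyBound type, and register_server are hypothetical stand-ins written for this illustration; they are not the omniNames implementation or the CosNaming API, and the "IOR" values are plain strings.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for CosNaming::NamingContext::AlreadyBound.
struct AlreadyBound : std::runtime_error {
    AlreadyBound() : std::runtime_error("AlreadyBound") {}
};

// Toy model of a naming context: a plain name -> reference table.
// Nothing in it tracks whether the server behind a reference is alive.
class ToyNamingContext {
public:
    // Like bind(): refuses to overwrite an existing entry.
    void bind(const std::string& name, const std::string& ior) {
        if (!table_.emplace(name, ior).second) throw AlreadyBound();
    }
    // Like rebind(): creates the entry or overwrites the old one.
    void rebind(const std::string& name, const std::string& ior) {
        table_[name] = ior;
    }
    // Returns whatever was stored last, stale or not.
    const std::string& resolve(const std::string& name) const {
        return table_.at(name);
    }
private:
    std::map<std::string, std::string> table_;
};

// Registers a server, replacing an entry left behind by a killed predecessor.
// Returns true if an old entry had to be replaced.
bool register_server(ToyNamingContext& ctx, const std::string& name,
                     const std::string& ior) {
    assert(!name.empty());
    try {
        ctx.bind(name, ior);
        return false;
    } catch (const AlreadyBound&) {
        ctx.rebind(name, ior);
        return true;
    }
}
```

In this model, killing the slave changes nothing in the table, so the second registration under the same name takes the rebind path.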

If you want a failsafe setup, using the naming service to determine whether a
service exists might not be the best idea. As an alternative, you could
bind master and slave to fixed endpoints (-ORBendPoint) and use
resolve_initial_references, e.g. by using omniMapper or inside your code (see
the omniNames source for an example of how to do that), and configure the
references in omniORB.cfg:

InitRef = MyMaster=corbaloc::127.0.0.1:20000/MyService
InitRef = MySlave=corbaloc::127.0.0.1:20001/MyService

Alternatively, you should be able to specify multiple endpoints in
InitRef, so the ORB will try both endpoints in the specified order (I
have never done this, so maybe somebody else can provide information about
it).
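
For the multiple-endpoint idea, the corbaloc URL syntax accepts a comma-separated address list in front of the object key. The sketch below only builds such a string; make_corbaloc is a hypothetical helper, and whether omniORB tries the addresses strictly in the listed order is not confirmed anywhere in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds a corbaloc URL that lists several IIOP endpoints for one object key.
// Each endpoint is given as "host:port".
std::string make_corbaloc(const std::vector<std::string>& endpoints,
                          const std::string& key) {
    assert(!endpoints.empty());
    std::string url = "corbaloc:";
    for (std::size_t i = 0; i < endpoints.size(); ++i) {
        if (i > 0) url += ',';   // separator between addresses
        url += ':';              // an empty protocol name means IIOP
        url += endpoints[i];
    }
    return url + '/' + key;
}
```

With the two endpoints from the omniORB.cfg lines above and the key MyService, this yields corbaloc::127.0.0.1:20000,:127.0.0.1:20001/MyService, which could be used as a single InitRef value or passed to orb->string_to_object().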

cheers
michael
R. P. Janaka
2008-08-05 10:47:52 UTC
Permalink
No, I still have not found a solution.

New experiments show that this problem is not specific to the
Master/Slave case. If we consider only the Master, the problem still
remains. I found this by debugging.

I repeated the same procedure with only a single server.


- Start the server and register it with the name service
- Manually kill the server
- Start the server and register it with the name service *again*.


This time the registration succeeds. But when the client tries to call the
server's functions, it gets an exception.

The exception is *"SystemException: TRANSIENT_ConnectFailed"*

The reason for this problem may be the sudden death of the server
without the name service being informed.

Is there any solution for this?
Post by M. Okeke
Hi Janaka
Have you found any solution to the problem you mention below? Is it
possible for the Master and Slave to mutually check their respective
status with the name server once they are up and running? They could then
deregister/reregister with the name server after the slave has been
killed. I assume this could be done through a messaging system.
Cheers
M. Okeke
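
The mutual status check suggested here could look roughly like the following sketch. It is self-contained and uses a hypothetical in-memory Bindings table instead of a real naming context; unbind_if_dead and the is_alive probe are illustrative names. With a real ORB the probe might wrap obj->_non_existent() and treat a CORBA::TRANSIENT exception as "dead", and the erase would be an unbind() on the context, but that mapping is an assumption and not something the thread confirms.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical in-memory table standing in for the naming context.
using Bindings = std::map<std::string, std::string>;

// Removes the binding for `name` if the registered reference fails the probe.
// `is_alive` is supplied by the caller and decides whether a reference still
// answers. Returns true if a stale binding was removed.
bool unbind_if_dead(Bindings& bindings, const std::string& name,
                    const std::function<bool(const std::string&)>& is_alive) {
    assert(static_cast<bool>(is_alive));     // a probe must be supplied
    auto it = bindings.find(name);
    if (it == bindings.end()) return false;  // nothing registered under name
    if (is_alive(it->second)) return false;  // peer still answers, keep it
    bindings.erase(it);                      // drop the stale entry
    return true;
}
```

The surviving peer would run such a check periodically, so that a restarted slave finds its name free again and a plain bind() succeeds.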
--
Regards,
R. P. Janaka
Nigel Rantor
2008-08-05 15:39:09 UTC
Permalink
You're going to have to be more explicit about this.

Which of the following is the order things are happening in?

A
--------------------------------------------------------------
- Start server
- Server registers with Nameservice as "server"
- Kill server
- Start server again
- Server registers with Nameservice as "server"
- Start client
- Client looks up "server" with Nameservice
- Client invokes an operation on the reference from the Nameservice
- Client gets "SystemException: TRANSIENT_ConnectFailed"
--------------------------------------------------------------

B
--------------------------------------------------------------
- Start server
- Server registers with Nameservice as "server"
- Client looks up "server" with Nameservice
- Kill server
- Start server again
- Server registers with Nameservice as "server"
- Start client
- Client invokes an operation on the reference from the Nameservice
- Client gets "SystemException: TRANSIENT_ConnectFailed"
--------------------------------------------------------------

My question is: has the client got the most recently bound object reference
from the nameservice or not?

It appears from your description that it is trying to contact the
original object reference that was registered with the Nameservice
rather than the new one.

n
R. P. Janaka
2008-08-05 16:10:10 UTC
Permalink
The second one (B) is the scenario.

Like you, I suspect the problem is that the client is trying to contact the
original object reference that was registered with the Nameservice rather
than the new one.
--
Regards,
R. P. Janaka
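
If scenario B is indeed the cause, one common remedy is for the client to look the name up again whenever a call on its cached reference fails, instead of holding the first reference forever. The following is a minimal self-contained sketch of that pattern. TransientError and call_with_reresolve are hypothetical names used only here; in a real omniORB client the error would presumably be CORBA::TRANSIENT, and resolve would call resolve() on the naming context followed by _narrow.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for CORBA::TRANSIENT in this sketch.
struct TransientError : std::runtime_error {
    TransientError() : std::runtime_error("TRANSIENT") {}
};

// Invokes an operation on a cached reference. When the call fails with a
// transient error, the cached reference is replaced by a fresh lookup and
// the call is retried, for at most max_attempts invocations in total.
template <typename Ref, typename Resolve, typename Invoke>
auto call_with_reresolve(Ref& cached, Resolve resolve, Invoke invoke,
                         int max_attempts) -> decltype(invoke(cached)) {
    assert(max_attempts > 0);
    for (int attempt = 1;; ++attempt) {
        try {
            return invoke(cached);
        } catch (const TransientError&) {
            if (attempt == max_attempts) throw;  // give up, propagate error
            cached = resolve();                  // fetch the current binding
        }
    }
}
```

The retry limit keeps a client from looping forever when the server really is down; only the stale-reference case is recovered.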