Discussion:
[omniORB] omninames: cannot create a worker for this endpoint
Bruno CARLUS
2006-10-11 04:15:26 UTC
Hello,
I've been running omniNames (omniORB 4.0.7) for quite a long time on the
same machine (Mandriva 2006 x86) and everything was going smoothly, but
one morning this week, after an omniNames restart (which went fine), the
clients could no longer resolve the name service reference and omniNames
printed a large number of "omninames cannot create a worker for this
endpoint" messages. The point is that there are a lot of clients
connecting to the name service at the same time (about 1000).
Has anybody encountered this problem before?
Is there a system limit that could have been hit here (maximum number of
threads, TCP connection buffer memory) that would make omniNames unable
to answer?

Thanks!
Bruno.
K.D.Welast at t-online.de ()
2006-10-11 13:44:40 UTC
An HTML attachment was scrubbed...
URL: http://www.omniorb-support.com/pipermail/omniorb-list/attachments/20061011/e85c0b7c/attachment.htm
Bruno Carlus
2006-10-11 13:57:07 UTC
Hello Bruno,
take a look at my answers in the thread "[omniORB] linux, windows
communication problem + new". Maybe the same file handle limit is
causing the problem.
Best regards
Mit freundlichen Grüßen
Kl. D. Welast
Hellersberstr. 35A
41460 Neuss
Tel: +49 2131 166657
Mobil: +49 171 5638203
That may be my problem...
Is it possible to increase this file handle limit? Is that an omniORB option?

Thanks,
Bruno.
K.D.Welast at t-online.de ()
2006-10-11 16:51:25 UTC
Hello Bruno,

I'm not sure, but I think it's defined in limits.h; see the line below.

#define OPEN_MAX 256 /* max # of files a process can have open */

Sorry, I don't know what needs to be done if you change this value. I think
you would have to rebuild a lot of software.

Maybe Duncan or someone else can help to answer this question.

Best regards

Mit freundlichen Grüßen
Kl. D. Welast

Hellersberstr. 35A
41460 Neuss
Tel: +49 2131 166657
Mobil: +49 171 5638203
Email: ***@t-online.de



Bruno Carlus
2006-10-11 19:58:39 UTC
I followed the procedure described at the link below to increase the limit
(which is actually set to 1024 on startup):
http://www.xenoclast.org/doc/benchmark/HTTP-benchmarking-HOWTO/node7.html

It seems OK now, but I will leave the system running for a while to be sure
that this was really the origin of the problem.

Bruno.
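
For readers who want to check the per-process descriptor limit discussed above: the sketch below is not the procedure from the linked HOWTO, just a minimal illustration using Python's standard `resource` module on a Unix-like system. The function name `raise_nofile_soft_limit` and the value 2048 are illustrative assumptions. It only raises the soft limit as far as the existing hard limit allows.

```python
import resource


def raise_nofile_soft_limit(wanted):
    """Try to raise the soft RLIMIT_NOFILE to `wanted`, capped at the hard limit.

    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    unlimited = resource.RLIM_INFINITY

    # Never ask for more than the hard limit; raising that needs privileges.
    target = wanted if hard == unlimited else min(wanted, hard)

    if soft != unlimited and target > soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
        except (ValueError, OSError):
            # Some platforms reject the request; keep the current limit.
            pass

    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    before = resource.getrlimit(resource.RLIMIT_NOFILE)
    after = raise_nofile_soft_limit(2048)
    print("before:", before, "after:", after)
```

The limit is per process and inherited by child processes, so for a daemon such as omniNames it would normally be raised in the environment that launches it (for example with the shell's `ulimit -n`) rather than from a separate script.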
Duncan Grisby
2006-10-12 00:04:10 UTC
Post by Bruno CARLUS
I've been running omniNames (omniORB 4.0.7) for quite a long time on
the same machine (Mandriva 2006 x86) and everything was going smoothly,
but one morning this week, after an omniNames restart (which went fine),
the clients could no longer resolve the name service reference and
omniNames printed a large number of "omninames cannot create a worker
for this endpoint" messages. The point is that there are a lot of
clients connecting to the name service at the same time (about 1000).
"Cannot create a worker for this endpoint" means that omniORB was unable
to start a thread. Putting it into thread pool mode will probably make
it work. You might also want to reduce the idle connection timeout
(inConScanPeriod parameter) so that connections from idle clients are
closed sooner.

Cheers,

Duncan.
--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --
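
As a sketch of how the two suggestions above might look in an omniORB configuration file: the parameter names all appear in the configuration dump further down this thread, but the values are illustrative assumptions, not recommendations made by anyone in the thread. Setting threadPerConnectionPolicy to 0 is, as far as I understand the omniORB 4.0 documentation, what selects thread pool mode.

```
# omniORB.cfg fragment (values are placeholders chosen for illustration)

# 0 = thread pool mode, 1 = one thread per connection
threadPerConnectionPolicy = 0

# upper bound on the number of threads in the pool
maxServerThreadPoolSize = 100

# seconds after which an idle incoming connection is closed
inConScanPeriod = 30
```

omniORB options can generally also be passed on the command line in the form `-ORBthreadPerConnectionPolicy 0`, which may be more convenient when only omniNames should be affected.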
Bruno Carlus
2006-10-12 18:43:00 UTC
Duncan,
When I use a thread pool, omniNames uses about 50% of the CPU!
If I increase the size of the pool (to 250), the load decreases to 15%.

With no pool I have about 1000 threads... Would it be a solution to use a
pool of up to 1000 threads?
By the way, what is the difference, for the system, between a pool of n
threads and n separate threads?

Thanks,
Bruno.
Bruno Carlus
2006-10-13 16:18:01 UTC
Thanks, everybody!
I solved the "cannot create a worker for this endpoint" problem with a
thread pool, but now have another problem:
I've got about 1000 client apps trying to fetch the reference of their
server app from the name service.
When I try to start the server app, it finds the name service but
cannot bind its reference if there are more than 1009 (?) clients
contacting the name service... Is there some kind of limit there?
Since I increased the number of file descriptors per process, I don't know
where this new problem comes from...

Here is the log of the server app:

omniORB: Distribution date: Thu Apr 14 17:19:57 BST 2005 dgrisby
omniORB: My addresses are:
omniORB: 127.0.0.1
omniORB: 134.158.140.49
omniORB: 172.16.11.1
omniORB: 172.16.0.2
omniORB: Maximum supported GIOP version is 1.2
omniORB: Native char code sets: ISO-8859-1 UTF-8.
omniORB: Transmission char code sets: ISO-8859-1(1.2) ISO-8859-1(1.1) ISO-8859-1(1.0) UTF-8(1.2) UTF-8(1.1).
omniORB: Native wide char code sets: UTF-16.
omniORB: Transmission wide char code sets: UTF-16(1.2).
omniORB: Initialising omniDynamic library.
omniORB: Current configuration is as follows:
omniORB: DefaultInitRef (file) =
omniORB: DefaultInitRef (args) =
omniORB: InitRef = NameService=corbaname::172.16.0.2:2809
omniORB: InitRef = EventService=corbaname::172.16.0.2:11169
omniORB: abortOnInternalError = 0
omniORB: acceptBiDirectionalGIOP = 0
omniORB: acceptMisalignedTcIndirections = 0
omniORB: bootstrapAgentHostname =
omniORB: bootstrapAgentPort = 900
omniORB: clientCallTimeOutPeriod = 1500
omniORB: clientTransportRule = * unix,ssl,tcp
omniORB: diiThrowsSysExceptions = 0
omniORB: dumpConfiguration = 0
omniORB: endPoint = giop:tcp:172.16.0.2:
omniORB: endPoint = giop:tcp:172.16.0.2:
omniORB: endPointPublishAllIFs = 0
omniORB: giopMaxMsgSize = 2097152
omniORB: giopTargetAddressMode = KeyAddr
omniORB: id = omniORB4
omniORB: inConScanPeriod = 10
omniORB: lcdMode = 0
omniORB: maxGIOPConnectionPerServer = 5
omniORB: maxGIOPVersion = 1.2
omniORB: maxInterleavedCallsPerConnection = 5
omniORB: maxServerThreadPerConnection = 20
omniORB: maxServerThreadPoolSize = 100
omniORB: nativeCharCodeSet = ISO-8859-1
omniORB: nativeWCharCodeSet = UTF-16
omniORB: objectTableSize = 0
omniORB: offerBiDirectionalGIOP = 0
omniORB: omniORB_27_CompatibleAnyExtraction = 0
omniORB: oneCallPerConnection = 1
omniORB: outConScanPeriod = 20
omniORB: poaHoldRequestTimeout = 0
omniORB: poaUniquePersistentSystemIds = 1
omniORB: principal = [Null]
omniORB: scanGranularity = 3
omniORB: serverCallTimeOutPeriod = 0
omniORB: serverTransportRule = * unix,ssl,tcp
omniORB: strictIIOP = 1
omniORB: supportBootstrapAgent = 0
omniORB: supportCurrent = 1
omniORB: supportPerThreadTimeOut = 0
omniORB: tcAliasExpand = 0
omniORB: threadPerConnectionLowerLimit = 250
omniORB: threadPerConnectionPolicy = 1
omniORB: threadPerConnectionUpperLimit = 300
omniORB: threadPoolWatchConnection = 1
omniORB: traceExceptions = 1
omniORB: traceInvocations = 0
omniORB: traceLevel = 50
omniORB: traceThreadId = 0
omniORB: unixTransportDirectory = /tmp/omni-%u
omniORB: unixTransportPermission = 777
omniORB: useTypeCodeIndirections = 1
omniORB: verifyObjectExistsAndType = 1
omniORB: Initialising incoming endpoints.
omniORB: Explicit bind to host 172.16.0.2.
omniORB: Bind to address 172.16.0.2.
omniORB: Explicit bind to host 172.16.0.2.
omniORB: Bind to address 172.16.0.2.
omniORB: Starting serving incoming endpoints.
omniORB: Adding root<0> (activating) to object table.
omniORB: State root<0> (activating) -> active
omniORB: Creating ref to local: root<0>
target id : IDL:HighPrecisionTracker/managerToHPTdaqItf:1.0
most derived id: IDL:HighPrecisionTracker/managerToHPTdaqItf:1.0
omniORB: Adding root<1> (activating) to object table.
omniORB: State root<1> (activating) -> active
omniORB: Creating ref to local: root<1>
target id : IDL:HighPrecisionTracker/HPTsensorToHPTdaqItf:1.0
most derived id: IDL:HighPrecisionTracker/HPTsensorToHPTdaqItf:1.0
omniORB: Creating ref to remote: key<NameService>
target id : IDL:omg.org/CORBA/Object:1.0
most derived id:
omniORB: Initial reference `NameService' resolved from configuration file.
omniORB: Client attempt to connect to giop:tcp:172.16.0.2:2809
omniORB: Client opened connection to giop:tcp:172.16.0.2:2809
omniORB: sendChunk: to giop:tcp:172.16.0.2:2809 100 bytes
omniORB:
4749 4f50 0100 0100 5800 0000 0000 0000 GIOP....X.......
0200 0000 010a 6f6d 0b00 0000 4e61 6d65 ......om....Name
5365 7276 6963 6566 0600 0000 5f69 735f Servicef...._is_
6100 7669 0000 0000 2800 0000 4944 4c3a a.vi....(...IDL:
6f6d 672e 6f72 672f 436f 734e 616d 696e omg.org/CosNamin
672f 4e61 6d69 6e67 436f 6e74 6578 743a g/NamingContext:
312e 3000 1.0.
omniORB: Switch rope to use address giop:tcp:172.16.0.2:2809
omniORB: throw giopStream::CommFailure from giopStream.cc:834(0,MAYBE,COMM_FAILURE_WaitingForReply)
omniORB: AsyncInvoker: thread id = 1 has started. Total threads = 3
omniORB: giopRendezvouser task execute for giop:tcp:172.16.0.2:34823
omniORB: AsyncInvoker: thread id = 2 has started. Total threads = 3
omniORB: giopRendezvouser task execute for giop:tcp:172.16.0.2:34824
omniORB: AsyncInvoker: thread id = 3 has started. Total threads = 3
omniORB: Scavenger task execute.
omniORB: Client connection refcount = 0
omniORB: Client close connection to giop:tcp:172.16.0.2:2809
omniORB: throw COMM_FAILURE from omniObjRef.cc:754 (MAYBE,COMM_FAILURE_WaitingForReply)
omniORB: ORB not destroyed; no final clean-up.


Any idea ?

Thanks, Bruno.

PS: sorry Duncan for the redundant message...
Duncan Grisby
2006-10-23 21:17:09 UTC
On Thursday 12 October, Bruno Carlus wrote:

[...]
Post by Bruno Carlus
When I use a thread pool, omniNames uses about 50% of the CPU!
If I increase the size of the pool (to 250), the load decreases to 15%.
It sounds like your clients are hitting omniNames really hard. What kind
of usage pattern and naming structure do you have? How many names do
you have in each naming context?
Post by Bruno Carlus
With no pool I have about 1000 threads... Would it be a solution to use a
pool of up to 1000 threads?
By the way, what is the difference, for the system, between a pool of n
threads and n separate threads?
In thread pool mode, the pool size places an upper bound on the number
of threads, and calls are spread between the threads, even if you have
more network connections than threads; in thread per connection mode,
each network connection gets its own thread, so you always have at least
as many threads as connections. Since you were running out of threads,
thread pool mode is better for you. If you find you need a large number
of threads in the pool, then that isn't a particular problem. Just pick
a number that's a fair bit smaller than the limit of the total threads a
process can have.

Cheers,

Duncan.
--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --
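
The difference between the two threading modes described above can be illustrated outside omniORB. The following is a sketch in Python using only the standard library, not omniORB's actual dispatcher; `handle_call`, the connection counts and the pool size are assumptions made for illustration. It counts how many distinct threads end up doing the work in each mode.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_call(connection_id):
    # Stand-in for servicing one request; report which thread did the work.
    return connection_id, threading.get_ident()


def serve_with_pool(n_connections, pool_size):
    """Spread calls from many connections over at most pool_size threads."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(handle_call, range(n_connections)))
    return {thread_id for _, thread_id in results}


def serve_thread_per_connection(n_connections):
    """Give every connection its own thread, all alive at the same time."""
    idents = []
    lock = threading.Lock()
    barrier = threading.Barrier(n_connections)

    def worker():
        barrier.wait()  # hold every thread open, like a live connection
        with lock:
            idents.append(threading.get_ident())

    threads = [threading.Thread(target=worker) for _ in range(n_connections)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return set(idents)


if __name__ == "__main__":
    print("pool threads used:", len(serve_with_pool(1000, 25)))
    print("per-connection threads used:", len(serve_thread_per_connection(50)))
```

The pool never uses more threads than its configured size, however many connections there are, while the thread-per-connection variant needs one thread for every connection that is open at once.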
Duncan Grisby
2006-10-23 21:25:05 UTC
Post by Bruno Carlus
I solved the "cannot create a worker for this endpoint" problem with a
thread pool, but now have another problem:
I've got about 1000 client apps trying to fetch the reference of their
server app from the name service.
When I try to start the server app, it finds the name service but
cannot bind its reference if there are more than 1009 (?) clients
contacting the name service... Is there some kind of limit there?
There is no inherent limit in omniORB. The limit is either that the OS
has a hard limit on the number of file descriptors, or, more likely,
that you've hit the limit of the fd_set size. In thread pool mode,
omniORB uses select() to watch the incoming connections, so it can't
handle file descriptors outside the fd_set range. The solution to that
is to move to omniORB 4.1, which uses poll() instead of select(). poll()
has no limit on the file descriptor number.

The other thing that can help is to configure omniNames to close
connections more rapidly than default. Try setting scanGranularity to 1
and inConScanPeriod to 2.

Cheers,

Duncan.
--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --
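
The fd_set limitation mentioned above can be seen from any language that wraps select(). The sketch below uses Python's standard `select` module on a Unix-like system rather than omniORB itself. The descriptor number 5000 is an arbitrary value assumed to be above FD_SETSIZE (commonly 1024 on Linux), and the helper names are made up for illustration. CPython rejects out-of-range descriptor numbers for select() with a ValueError, whereas poll() accepts the number and simply reports the descriptor as invalid because nothing is open there.

```python
import os
import select

HIGH_FD = 5000  # assumed to be >= FD_SETSIZE on this platform


def select_can_watch(fd):
    """True if select() accepts this descriptor number at all."""
    try:
        select.select([fd], [], [], 0)
    except ValueError:
        # The descriptor number does not fit in an fd_set.
        return False
    except OSError:
        # In range, but not an open descriptor; range-wise it is fine.
        return True
    return True


def poll_events(fd):
    """Register fd with poll() and return the event mask reported for it."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return dict(poller.poll(0)).get(fd, 0)


if __name__ == "__main__":
    read_end, write_end = os.pipe()
    print("select, low fd: ", select_can_watch(read_end))
    print("select, high fd:", select_can_watch(HIGH_FD))
    print("poll, high fd reported invalid:",
          bool(poll_events(HIGH_FD) & select.POLLNVAL))
    os.close(read_end)
    os.close(write_end)
```

With roughly a thousand client connections plus the descriptors a server already holds for its listening sockets and log files, new connections can easily be assigned descriptor numbers beyond what select() can watch, even after the per-process descriptor limit has been raised.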