[omniORB] Failure to connect on Windows 2008 Server
Mike Richmond
2009-08-26 22:21:17 UTC
I'm seeing a problem whereby tcpAddress::Connect() fails because
getpeername() fails with error 10057 (WSAENOTCONN). This is after
connect() has returned 10035 (WSAEWOULDBLOCK) and select() has
returned 1. I see the error when using omniORB 4.1.3, but not with
omniORB 4.1.0.

Also I am only seeing this problem on one machine, which is running
Windows 2008 Server. Another Windows 2008 Server machine is OK, as
are machines with other Windows OSes. A possibly significant
property of the problem machine is that it has a 6to4 IPv6 address,
which AIUI Windows assigns because the machine has a public IPv4
address. Might that be causing the problem? Any thoughts on a fix?

Mike Richmond
Global Graphics Software Ltd
Duncan Grisby
2009-08-28 15:36:40 UTC
Post by Mike Richmond
I'm seeing a problem whereby tcpAddress::Connect() fails because
getpeername() fails with error 10057 (WSAENOTCONN). This is after
connect() has returned 10035 (WSAEWOULDBLOCK) and select() has
returned 1. I see the error when using omniORB 4.1.3, but not with
omniORB 4.1.0.
omniORB 4.1.0 didn't do the getpeername() check -- it just assumed that
when the select() returned, the socket was connected, which isn't always
the case.
Post by Mike Richmond
Also I am only seeing this problem on one machine, which is running
Windows 2008 Server. Another Windows 2008 Server machine is OK, as
are machines with other Windows OSes. A possibly significant
property of the problem machine is that it has a 6to4 IPv6 address,
which AIUI Windows assigns because the machine has a public IPv4
address. Might that be causing the problem? Any thoughts on a fix?
What happens if you modify the code to retry the select() and
subsequent getpeername() if you get WSAENOTCONN? That will be wrong if
the connection fails, but it will tell us if the error is a transient
thing or whether it's permanently failed.
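
A minimal sketch of the experiment described above, written against POSIX sockets rather than Winsock so that it is self-contained. It is not omniORB's tcpAddress.cc; the function name connect_checked and its parameters are made up for illustration. On Windows the corresponding error codes would be WSAEWOULDBLOCK in place of EINPROGRESS and WSAENOTCONN in place of ENOTCONN, read via WSAGetLastError().

```cpp
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: non-blocking connect, then select() for writability,
// then getpeername() to confirm the socket really is connected.
// If getpeername() reports ENOTCONN, retry select() + getpeername()
// up to max_retries times. Returns a connected descriptor (still in
// non-blocking mode) or -1 on failure.
int connect_checked(const sockaddr* addr, socklen_t len,
                    int timeout_ms, int max_retries) {
    int fd = socket(addr->sa_family, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }

    if (connect(fd, addr, len) == 0) return fd;  // connected immediately
    if (errno != EINPROGRESS) {                  // hard failure
        close(fd);
        return -1;
    }

    for (int attempt = 0; attempt <= max_retries; ++attempt) {
        fd_set wfds;
        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        timeval tv{timeout_ms / 1000, (timeout_ms % 1000) * 1000};

        int n = select(fd + 1, nullptr, &wfds, nullptr, &tv);
        if (n <= 0) break;  // timeout or select() error

        sockaddr_storage peer{};
        socklen_t plen = sizeof(peer);
        if (getpeername(fd, reinterpret_cast<sockaddr*>(&peer), &plen) == 0)
            return fd;      // a peer name exists, so the connection is up

        if (errno != ENOTCONN) break;  // some other error: give up
        // ENOTCONN: possibly transient, so go round again.
    }

    close(fd);
    return -1;
}
```

As noted above, the retry is only a diagnostic: when the connection has genuinely failed, every pass sees ENOTCONN and the loop simply runs out of retries.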

Cheers,

Duncan.
--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --
Mike Richmond
2009-09-01 22:08:18 UTC
Post by Duncan Grisby
Post by Mike Richmond
I'm seeing a problem whereby tcpAddress::Connect() fails because
getpeername() fails with error 10057 (WSAENOTCONN). This is after
connect() has returned 10035 (WSAEWOULDBLOCK) and select() has
returned 1. I see the error when using omniORB 4.1.3, but not with
omniORB 4.1.0.
omniORB 4.1.0 didn't do the getpeername() check -- it just assumed that
when the select() returned, the socket was connected, which isn't always
the case.
Post by Mike Richmond
Also I am only seeing this problem on one machine, which is running
Windows 2008 Server. Another Windows 2008 Server machine is OK, as
are machines with other Windows OSes. A possibly significant
property of the problem machine is that it has a 6to4 IPv6 address,
which AIUI Windows assigns because the machine has a public IPv4
address. Might that be causing the problem? Any thoughts on a fix?
What happens if you modify the code to retry the select() and
subsequent getpeername() if you get WSAENOTCONN?
I get WSAENOTCONN again, and every time up to a retry limit of 10.
Looks like getpeername() will always return WSAENOTCONN.
Post by Duncan Grisby
That will be wrong if
the connection fails, but it will tell us if the error is a transient
thing or whether it's permanently failed.
I've also tried using tcpAddress.cc from omniORB 4.1.4. Tracing from
that shows:

09/01/09 09:53:16: ORBImpl: omniORB: Client attempt to connect to
giop:tcp:<MyMachine>:9905 (INFO)
09/01/09 09:53:17: ORBImpl: omniORB: Name '<MyMachine>' resolved:
<IPv6 6to4 address> (INFO)
09/01/09 09:53:18: ORBImpl: omniORB: Failed to connect (no peer
name): <IPv6 6to4 address> (INFO)
09/01/09 09:53:18: ORBImpl: omniORB: Name '<MyMachine>' resolved:
<IPv4 address> (INFO)
09/01/09 09:53:18: ORBImpl: omniORB: Client opened connection to
giop:tcp:<IPv4 address>:9905 (INFO)

which seems to show that it's the 6to4 address which is a problem. I
guess we should look to use 4.1.4, but would you expect any problems
from just dropping tcpAddress.cc into a 4.1.3 source tree?
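
The trace above suggests a "try each resolved address in turn" pattern: the first address fails, the next one succeeds. A rough sketch of that pattern follows, assuming POSIX sockets and plain blocking connect() for brevity. It is not taken from omniORB; Candidate and connect_first_usable are hypothetical names.

```cpp
#include <cstddef>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <vector>

// Hypothetical container for one resolved address (IPv4 or IPv6).
struct Candidate {
    sockaddr_storage addr;
    socklen_t len;
};

// Try each candidate in order and return the first descriptor that
// connects, or -1 if none does. If used_index is non-null it receives
// the position of the candidate that worked.
int connect_first_usable(const std::vector<Candidate>& candidates,
                         std::size_t* used_index) {
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        const Candidate& c = candidates[i];
        int fd = socket(c.addr.ss_family, SOCK_STREAM, 0);
        if (fd < 0) continue;  // e.g. address family not available

        if (connect(fd, reinterpret_cast<const sockaddr*>(&c.addr),
                    c.len) == 0) {
            if (used_index) *used_index = i;
            return fd;
        }
        close(fd);             // this address failed; fall through to the next
    }
    return -1;
}
```

In real use the candidate list would typically come from getaddrinfo(), which can return both IPv6 and IPv4 results for one host name.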

Mike
Duncan Grisby
2009-09-03 21:33:39 UTC
On Tuesday 1 September, Mike Richmond wrote:

[...]
Post by Mike Richmond
I've also tried using tcpAddress.cc from omniORB 4.1.4. Tracing from
that shows:
09/01/09 09:53:16: ORBImpl: omniORB: Client attempt to connect to
giop:tcp:<MyMachine>:9905 (INFO)
09/01/09 09:53:17: ORBImpl: omniORB: Name '<MyMachine>' resolved:
<IPv6 6to4 address> (INFO)
Ah, I hadn't realised you were using a host name in your IORs.

Are you sure it's a 6to4 address, and not a link-local address (i.e. one
that starts fe80)? I've seen the problem that resolving the machine's
own name on Windows 2008 can return a useless link-local address,
leading to the problem you see.
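
For anyone wanting to check which kind of address a name resolved to, here is a small sketch that tells the two apart by prefix: link-local addresses fall in fe80::/10 and 6to4 addresses in 2002::/16. It uses the POSIX inet_pton() (on Windows the declaration lives in ws2tcpip.h); the enum and the function name classify_ipv6 are invented for this example.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string>

enum class Ipv6Kind { Invalid, LinkLocal, SixToFour, Other };

// Classify a textual IPv6 address by its leading bits.
// Note: a scope suffix such as "%eth0" is not accepted by inet_pton(),
// so strip it before calling this.
Ipv6Kind classify_ipv6(const std::string& text) {
    in6_addr addr{};
    if (inet_pton(AF_INET6, text.c_str(), &addr) != 1)
        return Ipv6Kind::Invalid;

    const unsigned char* b = addr.s6_addr;

    // fe80::/10: the first ten bits are 1111 1110 10
    if (b[0] == 0xfe && (b[1] & 0xc0) == 0x80)
        return Ipv6Kind::LinkLocal;

    // 2002::/16: 6to4, with the IPv4 address embedded in the next 32 bits
    if (b[0] == 0x20 && b[1] == 0x02)
        return Ipv6Kind::SixToFour;

    return Ipv6Kind::Other;
}
```

For example, classify_ipv6("fe80::1") yields Ipv6Kind::LinkLocal, while an address beginning 2002: yields Ipv6Kind::SixToFour.
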
Post by Mike Richmond
09/01/09 09:53:18: ORBImpl: omniORB: Failed to connect (no peer
name): <IPv6 6to4 address> (INFO)
09/01/09 09:53:18: ORBImpl: omniORB: Name '<MyMachine>' resolved:
<IPv4 address> (INFO)
09/01/09 09:53:18: ORBImpl: omniORB: Client opened connection to
giop:tcp:<IPv4 address>:9905 (INFO)
which seems to show that it's the 6to4 address which is a problem. I
guess we should look to use 4.1.4, but would you expect any problems
from just dropping tcpAddress.cc into a 4.1.3 source tree?
You'd be much better off using the full 4.1.4 release, but I would
expect it to work to drop tcpAddress.cc into the tree as you have done.

Cheers,

Duncan.
--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --
Mike Richmond
2009-09-07 14:23:22 UTC
Post by Duncan Grisby
[...]
Post by Mike Richmond
I've also tried using tcpAddress.cc from omniORB 4.1.4. Tracing from
that shows:
09/01/09 09:53:16: ORBImpl: omniORB: Client attempt to connect to
giop:tcp:<MyMachine>:9905 (INFO)
09/01/09 09:53:17: ORBImpl: omniORB: Name '<MyMachine>' resolved:
<IPv6 6to4 address> (INFO)
Ah, I hadn't realised you were using a host name in your IORs.
Are you sure it's a 6to4 address, and not a link-local address (i.e. one
that starts fe80)?
Yes, the reported resolved address starts 2002.
Post by Duncan Grisby
I've seen the problem that resolving the machine's
own name on Windows 2008 can return a useless link-local address,
leading to the problem you see.
Post by Mike Richmond
I guess we should look to use 4.1.4, but would you expect any problems
from just dropping tcpAddress.cc into a 4.1.3 source tree?
You'd be much better off using the full 4.1.4 release, but I would
expect it to work to drop tcpAddress.cc into the tree as you have done.
Thanks.

Mike
