Discussion:
[omniORB] Port allocation problem on windows (incl. patch)
Hannes Lerchl
2010-04-14 16:43:52 UTC
Hello,

We're developing a testing application using omniORB (to be accessible via CORBA). The application is meant to run on MS Windows, so I downloaded the binary package omniORB-4.1.3-x86_win32-vs9.zip.

Things work OK, but one of our customers "discovered" a strange "feature" of Windows' socket library, which omniORB uses:

Windows has a fundamentally different understanding of the socket option SO_REUSEADDR than POSIX systems:
Instead of only allowing the port to be reused immediately after program termination, Windows leaves the port free for reuse even while another program is still using it. The outcome is undefined and most probably both applications are unhappy (for our application this meant that further communication was impossible, without either of the two applications knowing what went wrong).

See also http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4476378 and http://msdn.microsoft.com/en-us/library/ms740621%28VS.85%29.aspx
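
A minimal sketch of how one could check this behaviour on a given machine. The helper names (listen_with_reuseaddr, second_bind_succeeds) are made up for this example and are not part of omniORB. Both sockets set SO_REUSEADDR; the second one tries to take a port the first is still listening on. On Linux the second bind fails with EADDRINUSE, so the function returns false. Going by the MSDN page above, it would be expected to return true on Windows, but I have not verified that here.

```cpp
#include <cstring>

#ifdef _WIN32
#  include <winsock2.h>
#  include <ws2tcpip.h>
typedef SOCKET sock_t;
typedef int addrlen_t;
static const sock_t BAD_SOCK = INVALID_SOCKET;
static void close_sock(sock_t s) { ::closesocket(s); }
#else
#  include <arpa/inet.h>
#  include <netinet/in.h>
#  include <sys/socket.h>
#  include <unistd.h>
typedef int sock_t;
typedef socklen_t addrlen_t;
static const sock_t BAD_SOCK = -1;
static void close_sock(sock_t s) { ::close(s); }
#endif

// Hypothetical helper: create a TCP socket, set SO_REUSEADDR, bind it to
// 127.0.0.1:port and start listening. Port 0 lets the OS pick a free port.
// Returns BAD_SOCK if any step fails.
static sock_t listen_with_reuseaddr(unsigned short port) {
  sock_t s = ::socket(AF_INET, SOCK_STREAM, 0);
  if (s == BAD_SOCK) return BAD_SOCK;

  int valtrue = 1;
  ::setsockopt(s, SOL_SOCKET, SO_REUSEADDR,
               (const char*)&valtrue, sizeof(valtrue));

  sockaddr_in addr;
  std::memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = htons(port);

  if (::bind(s, (sockaddr*)&addr, sizeof(addr)) != 0 ||
      ::listen(s, 5) != 0) {
    close_sock(s);
    return BAD_SOCK;
  }
  return s;
}

// Returns true if a second socket managed to bind and listen on a port
// that a first socket in this process is still listening on.
static bool second_bind_succeeds() {
#ifdef _WIN32
  WSADATA wsa;
  if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0) return false;
#endif
  sock_t first = listen_with_reuseaddr(0);
  if (first == BAD_SOCK) return false;

  // Ask which port the OS assigned to the first socket.
  sockaddr_in bound;
  addrlen_t len = sizeof(bound);
  ::getsockname(first, (sockaddr*)&bound, &len);

  sock_t second = listen_with_reuseaddr(ntohs(bound.sin_port));
  bool stolen = (second != BAD_SOCK);

  if (stolen) close_sock(second);
  close_sock(first);
  return stolen;
}
```

Calling second_bind_succeeds() from a small main() and printing the result is enough to compare platforms. The sketch uses the loopback address only, so it does not expose anything on the network.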

I didn't find out whether SO_REUSEADDR and SO_EXCLUSIVEADDRUSE (which is Microsoft's way of fixing the described issue) can be used together, but using only SO_REUSEADDR on Windows does more harm than good.

So I patched the two files src/lib/omniORB/orbcore/ssl/sslEndpoint.cc and src/lib/omniORB/orbcore/tcp/tcpEndpoint.cc (where SO_REUSEADDR is set) and replaced it with SO_EXCLUSIVEADDRUSE. The patch is not perfect but it works for us and now the second started instance of our application gets a socket error when booting (as desired).

Below I append the patch we applied (created with Mercurial, but it should be compatible with the standard Unix patch tool).


Best regards,
Hannes Lerchl


diff -r d5da39bdabf5 -r cd803700f307 src/lib/omniORB/orbcore/ssl/sslEndpoint.cc
--- a/src/lib/omniORB/orbcore/ssl/sslEndpoint.cc Wed Apr 07 10:50:27 2010 +0200
+++ b/src/lib/omniORB/orbcore/ssl/sslEndpoint.cc Fri Apr 09 10:10:46 2010 +0200
@@ -482,11 +482,23 @@

if (pd_address.port) {
int valtrue = 1;
+#ifdef __WIN32__
+ // Windows has a fundamentally different understanding about SO_REUSEADDR
+ // so we don't use it but take SO_EXCLUSIVEADDRUSE. See also
+ // http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4476378 and
+ // http://msdn.microsoft.com/en-us/library/ms740621%28VS.85%29.aspx
+ if (setsockopt(pd_socket,SOL_SOCKET,SO_EXCLUSIVEADDRUSE,
+ (char*)&valtrue,sizeof(int)) == RC_SOCKET_ERROR) {
+
+ omniORB::logs(2, "Warning: failed to set SO_EXCLUSIVEADDRUSE option.");
+ }
+#else
if (setsockopt(pd_socket,SOL_SOCKET,SO_REUSEADDR,
(char*)&valtrue,sizeof(int)) == RC_SOCKET_ERROR) {

omniORB::logs(2, "Warning: failed to set SO_REUSEADDR option.");
}
+#endif
}
if (omniORB::trace(25)) {
omniORB::logger log;
diff -r d5da39bdabf5 -r cd803700f307 src/lib/omniORB/orbcore/tcp/tcpEndpoint.cc
--- a/src/lib/omniORB/orbcore/tcp/tcpEndpoint.cc Wed Apr 07 10:50:27 2010 +0200
+++ b/src/lib/omniORB/orbcore/tcp/tcpEndpoint.cc Fri Apr 09 10:10:46 2010 +0200
@@ -470,11 +470,23 @@

if (pd_address.port) {
int valtrue = 1;
+#ifdef __WIN32__
+ // Windows has a fundamentally different understanding about SO_REUSEADDR
+ // so we don't use it but take SO_EXCLUSIVEADDRUSE. See also
+ // http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4476378 and
+ // http://msdn.microsoft.com/en-us/library/ms740621%28VS.85%29.aspx
+ if (setsockopt(pd_socket,SOL_SOCKET,SO_EXCLUSIVEADDRUSE,
+ (char*)&valtrue,sizeof(int)) == RC_SOCKET_ERROR) {
+
+ omniORB::logs(2, "Warning: failed to set SO_EXCLUSIVEADDRUSE option.");
+ }
+#else
if (setsockopt(pd_socket,SOL_SOCKET,SO_REUSEADDR,
(char*)&valtrue,sizeof(int)) == RC_SOCKET_ERROR) {

omniORB::logs(2, "Warning: failed to set SO_REUSEADDR option.");
}
+#endif
}
if (omniORB::trace(25)) {
omniORB::logger log;



--
Dipl.-Inf. Hannes Lerchl, Senior Software Architect
Phone: +49.89.45 23 47-22



--
jambit Software Development & Management GmbH
Nymphenburger Straße 13-15, D-80335 München
Phone: +49.89.45 23 47-0 Fax: +49.89.45 23 47-70

http://www.jambit.com where innovation works

Geschäftsführer: Peter F. Fellinger, Markus Hartinger
Sitz: München; Registergericht: München, HRB 129139
Felix Nawothnig
2010-04-14 23:28:05 UTC
> Windows has a fundamentally different understanding of the socket option
> SO_REUSEADDR than posix systems [...]

Ran into this too a couple of weeks ago.

> I didn't find out whether SO_REUSEADDR and SO_EXCLUSIVEADDRUSE (which is
> Microsofts' way of fixing the described issue) can be used together but
> using only SO_REUSEADDR on windows does more harm than good.

According to the pages I read on MSDN they can't.

From what I've understood from MSDN the situation is like this
(Disclaimer: I have not tested all of this - this is mostly based on texts
spread through dozens of MSDN pages):

Microsoft's SO_REUSEADDR implementation is purely insane and should never
be used under any circumstances (for obvious reasons).

To fix SO_REUSEADDR they invented SO_EXCLUSIVEADDRUSE which does nothing
but prevent _other_ applications from hijacking ports using SO_REUSEADDR -
but it does _not_ solve the original problem for which SO_REUSEADDR is a
solution on unix (the impossibility to rebind to sockets in TIME_WAIT
state).

Also note that before Vista SO_EXCLUSIVEADDRUSE required administrative
rights, so it's not really usable when targeting XP or earlier systems.

So - as I understand this: If you want to safely listen to a TCP socket on
Windows you can not use SO_REUSEADDR, have to use SO_EXCLUSIVEADDRUSE, be
prepared that you might need administrative rights and live with the fact
that you have to wait for the TIME_WAIT to time-out.

I'd be curious if anyone can give me a hint on what the _fuck_ the
developers might have been smoking when they designed this.

Personally I simply use SO_REUSEADDR, create a named mutex to ensure my
application doesn't bind twice to the same port and add a disclaimer that
everything might break down horribly if someone runs another application
besides mine on that machine - Windows isn't really for multitasking you
know.
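
A rough sketch of such a single-instance guard, under stated assumptions: the function name acquire_instance_guard and the guard name are invented for this example. On Windows it uses CreateMutexA and checks GetLastError() for ERROR_ALREADY_EXISTS. The POSIX branch (an flock()ed file under /tmp) is only a stand-in so the sketch builds elsewhere; it is not what the text above describes. Note that this only guards against a second copy of the same application, not against unrelated programs taking the port.

```cpp
#include <string>

#ifdef _WIN32
#  include <windows.h>
#else
#  include <fcntl.h>
#  include <sys/file.h>
#  include <unistd.h>
#endif

// Returns true if this call obtained the guard called `name`, false if
// some other holder (e.g. another running instance) already has it.
// The handle / descriptor is deliberately left open: the guard is meant
// to be held until the process exits.
static bool acquire_instance_guard(const std::string& name) {
#ifdef _WIN32
  HANDLE h = CreateMutexA(NULL, FALSE, name.c_str());
  if (h == NULL) return false;
  if (GetLastError() == ERROR_ALREADY_EXISTS) {
    // The named mutex existed before this call: someone else holds it.
    CloseHandle(h);
    return false;
  }
  return true;
#else
  // Stand-in for non-Windows builds: a non-blocking advisory file lock.
  std::string path = "/tmp/" + name + ".lock";
  int fd = ::open(path.c_str(), O_RDWR | O_CREAT, 0600);
  if (fd < 0) return false;
  if (::flock(fd, LOCK_EX | LOCK_NB) != 0) {
    ::close(fd);
    return false;
  }
  return true;
#endif
}
```

One would call this once at startup, before binding, with a name derived from the port (for example "myapp-port-2809", a made-up name), and refuse to start if it returns false.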

As an alternative one could call netstat to see if there is another
application using that port right now. Race-con? What race-con?

Note that there is some text on
http://msdn.microsoft.com/en-us/library/ms740621%28VS.85%29.aspx talking
about SO_LINGER which _might_ be a solution to solve the TIME_WAIT
issue... maybe someone knows more about this.

Cheers,

Felix

P.S.: Sorry for this flamebait - I just wasted way too much time on this
issue until I finally gave up... I'd very much like to be proven wrong...
Duncan Grisby
2010-05-06 22:08:55 UTC
On Wed, 2010-04-14 at 12:43 +0200, Hannes Lerchl wrote:

> Windows has a fundamentally different understanding of the socket
> option SO_REUSEADDR than posix systems:
>
> Instead of immediately reusing the port after program termination the
> socket is free for reuse even if another program is still using it.
> The outcome is undefined and most probably both applications are
> unhappy (for our application this meant that further communication was
> impossible without any of the two applications knowing what went
> wrong).

Sadly, like Posix systems, Windows needs SO_REUSEADDR to allow a server
that has terminated unexpectedly to restart using the same port. There
is no way to allow that on Windows without also allowing another process
to "steal" the port.

SO_EXCLUSIVEADDRUSE doesn't come into it -- that is there to prevent
other processes stealing a port from the process that specified it, not
to prevent a process accidentally stealing another process' port.

Your patch to set SO_EXCLUSIVEADDRUSE is definitely not appropriate --
it will prevent restart of servers after they have crashed.

I would accept a patch that added a configuration parameter to
optionally disable the setting of SO_REUSEADDR, so the user can choose
which of the two problems they would prefer to have.
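
To make the idea concrete, here is a minimal sketch of what such a switch could look like at the socket level. EndpointSettings, apply_endpoint_settings and reuse_addr_is_set are hypothetical names for this example only; they are not omniORB types or configuration parameters, and how the flag would actually be wired into the ORB's configuration is left open.

```cpp
#ifdef _WIN32
#  include <winsock2.h>
typedef SOCKET sock_t;
typedef int optlen_t;
#else
#  include <netinet/in.h>
#  include <sys/socket.h>
typedef int sock_t;
typedef socklen_t optlen_t;
#endif

// Hypothetical settings holder for this sketch; not an omniORB type.
struct EndpointSettings {
  bool reuse_addr;  // true = current behaviour, false = leave option unset
};

// Apply the settings to a socket before bind().
// Returns false only if SO_REUSEADDR was requested and could not be set.
static bool apply_endpoint_settings(sock_t s, const EndpointSettings& cfg) {
  if (!cfg.reuse_addr) return true;  // user opted out: touch nothing
  int valtrue = 1;
  return ::setsockopt(s, SOL_SOCKET, SO_REUSEADDR,
                      (const char*)&valtrue, sizeof(valtrue)) == 0;
}

// Read the option back; mainly useful for checking the sketch.
static bool reuse_addr_is_set(sock_t s) {
  int val = 0;
  optlen_t len = sizeof(val);
  if (::getsockopt(s, SOL_SOCKET, SO_REUSEADDR, (char*)&val, &len) != 0)
    return false;
  return val != 0;
}
```

With reuse_addr left at true nothing changes from today. With it set to false, a second instance can no longer take a port that is in use, at the cost of possibly having to wait before a crashed server can rebind.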

Cheers,

Duncan.

--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --
Hannes Lerchl
2010-05-07 20:31:42 UTC
Hi Duncan,

> SO_EXCLUSIVEADDRUSE doesn't come into it -- that is there to prevent
> other processes stealing a port from the process that specified it, not
> to prevent a process accidentally stealing another process' port.

Our system is a bit different in that it is (also) meant to run on user machines
and waits to be accessed by some client side. So if the user accidentally
starts two instances, things go wrong (and we can't detect it, which is worse).

> Your patch to set SO_EXCLUSIVEADDRUSE is definitely not appropriate --
> it will prevent restart of servers after they have crashed.

OK, I see that point.

> I would accept a patch that added a configuration parameter to
> optionally disable the setting of SO_REUSEADDR, so the user can choose
> which of the two problems they would prefer to have.

That'll be sufficient for our system since we already inject some options via
argc/argv. I'll try to provide such a patch and mail again.

Btw: omniORBpy has the same problem, so it'll also need such a parameter.

Best regards,
Hannes Lerchl

PS: I will be offline for the next two weeks since I'll attend a conference.