Discussion:
[omniORB] Non-NULL terminated strings
David
2008-04-17 22:59:05 UTC
Hi,

How would I represent a non-NULL terminated string in the best way? In the
C++ codebase I'm working with, those kinds of strings are represented with a
char * and a length field, put together in a struct. Unfortunately, omniidl
seems to map the string type to a char *, and not a C++ string. This means
that the IDL string datatype cannot be used for strings containing NULLs.
The spontaneous workaround was to do this in IDL:

typedef sequence<char> StringData;

Will this be incredibly inefficient? Are there other collection-type
datatypes in IDL, or is sequence the only one? Reason I'm asking is because
I think it's quite a hassle to use. A growable list would be a lot easier to
work with. Just to re-iterate, here's my original question: how do you guys
handle this scenario? Any help is appreciated.

/David
Michael
2008-04-18 00:38:31 UTC
Hi,

I would suggest reading "Advanced CORBA Programming with C++" and getting a copy of the
CORBA standard (for omniORB, version 2.6 seems to be the best choice to read); you can get it for
free at www.omg.org. Since the CORBA standard is basically pre-STL, none of the useful
containers/strings etc. of C++ are used (which sometimes makes it a pain to use). Your
best bet is to write some helper templates to ease using these constructs. CORBA in
general uses NULL-terminated strings everywhere (even for the non-C++ mappings like
Python). So sequence<char> or sequence<octet> are your best options (even so, personally I
would try to avoid strings containing 0, because it makes interoperability with
standard C APIs impossible).

Depending on your environment, the sequence approach should be acceptable performance-wise,
especially if you use strings internally and make only one call to do the conversion (so
you allocate the sequence and copy into it instead of growing it element by element).
Sequences of primitive types also allow you to provide a buffer allocated by yourself,
which might be interesting when making calls, since you could avoid an extra copy on the
client side (read about sequences in the C++ mapping for details). This will of course only
work for IN parameters.

Example:

Assume the IDL:
typedef sequence<char> StringData;
interface X
{
    void call(in StringData inString);
};

And client code using it:

StringData corba_string_in(std::string& s)
{
    // last parameter "false" means that StringData will not free the buffer on destruction
    return StringData(s.size(), s.size(), reinterpret_cast<CORBA::Char*>(&s[0]), false);
}

std::string s(...);
X_ptr x = ...;
x->call(corba_string_in(s));

This of course is neither nice nor necessarily very portable (and I didn't test it), but by
encapsulating this hack in corba_string_in it should be very easy to change if
necessary. You might want to make this a little more sophisticated, so that StringData
isn't modified by accident (which would most likely crash your application):

class InString
{
    std::auto_ptr<StringData> data_;

public:
    InString(std::string& s):
        data_(new StringData(s.size(), s.size(), reinterpret_cast<CORBA::Char*>(&s[0]), false))
    {}

    // Read-only access, so the borrowed buffer cannot be modified through the wrapper
    const StringData& in() const
    {
        return *data_;
    }
};

std::string s(...);
X_ptr x = ...;
x->call(InString(s).in());

For the server side you would of course need something like:

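std::string to_string(const StringData& sd)
{
    // Pointer-plus-length constructor, so embedded NULs are copied as well
    return std::string(reinterpret_cast<const char*>(sd.get_buffer()), sd.length());
}

void X_impl::call(const StringData& inString)
{
    std::string s = to_string(inString);
}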

Again, none of that is tested and I wouldn't consider it particularly good style.
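
A small standalone sketch (plain C++, no CORBA headers) of why the server-side conversion has to go through a pointer plus an explicit length: std::string's const char* constructor stops at the first NUL, while the (pointer, length) constructor copies every byte. The helper names bytes_to_string and c_string_to_string are made up for this illustration and are not part of CORBA or omniORB.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a CORBA/omniORB function): copies exactly len bytes,
// so embedded '\0' characters survive the conversion.
inline std::string bytes_to_string(const char* data, std::size_t len)
{
    return std::string(data, len);
}

// For contrast: the const char* constructor treats the input as a C string
// and therefore stops at the first '\0'.
inline std::string c_string_to_string(const char* data)
{
    return std::string(data);
}
```

Given the five bytes 'a', 'b', '\0', 'c', 'd', the first helper yields a string of size 5 and the second a string of size 2.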

Hope it helps anyway :)

cheers
michael
Duncan Grisby
2008-04-20 00:59:25 UTC
Post by David
How would I represent a non-NULL terminated string in the best way? In the
C++ codebase I'm working with, those kinds of strings are represented with a
char * and a length field, put together in a struct. Unfortunately, omniidl
seems to map the string type to a char *, and not a C++ string. This means
that the IDL string datatype cannot be used for strings containing NULLs.
That is the standard C++ mapping of CORBA strings. It's not omniidl's
decision. The CORBA core standard says that CORBA strings may not
contain nulls, so it's not just a language mapping issue.
Post by David
typedef sequence<char> StringData;
Will this be incredibly inefficient?
It will certainly be pretty inefficient, because the code set conversion
framework will look at each character individually to see if it needs
conversion. You would probably be better off with a sequence<octet>,
which is just sent uninterpreted and is thus the fastest way of sending
data from one side to the other.

Cheers,

Duncan.
--
-- Duncan Grisby --
-- ***@grisby.org --
-- http://www.grisby.org --
David
2008-04-20 20:00:16 UTC
Hi,

All great replies. Thanks for the good input.

/David
David
2008-05-27 14:28:52 UTC
Hi folks,

I'm going to re-open this thread for a moment. So, currently I have this IDL
representation:

typedef sequence<octet> StringData;
struct MyStruct {
    StringData s;
};

Then I define a function to convert a char * to a StringData:

StringData *convert( char *char_p, int len ) {
    CORBA::Octet *chars = new CORBA::Octet[len];
    for ( int i = 0; i < len; i++ ) {
        chars[i] = char_p[i];
    }
    return new StringData(len, len, chars, false);
}

It is then used like this:

void foo( char *p, int len ) {
    struct MyStruct myStruct;
    myStruct.s = StringData_var( convert( p, len ) );
    // ... do stuff
}

Now my question is, is the memory I have allocated here with the new
StringData and new Octet freed when foo() returns? How much does the
StringData_var take care of -- i.e., does it free the Octets? Could someone
clarify the meaning of the fourth parameter to new StringData()? I have a
feeling that it might be relevant here.

Thanks,
David
Clarke Brunt
2008-05-27 16:35:29 UTC
You really need to read (if you haven't) the C++ language mapping
specification - you are very likely to leak something if you haven't fully
understood it.

Just writing off the top of my head here, so some risk that _I'll_ get
it wrong as well.

That final constructor parameter: you're passing 'false' (which is also the
default), which means that the sequence will _not_ do memory management of the
supplied buffer. So I reckon at best you're going to leak the char array. It
would stand a better chance with 'true'.

But if you're going to let sequences manage memory that you have
allocated, then you're supposed to use allocbuf to allocate it (rather
than 'new'), to be certain that it matches with the freebuf that the
sequence will do.
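
To make the ownership rule concrete, here is a toy model of the release flag. OctetSeqModel is an invented stand-in, NOT the class omniidl generates; it only mimics the (maximum, length, buffer, release) constructor and the allocbuf/freebuf pairing described above, and its live_buffers counter exists purely for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for an IDL-generated sequence<octet> class. This is an
// illustration of who frees the buffer, not the omniORB implementation.
class OctetSeqModel
{
public:
    typedef unsigned char Octet;

    static int live_buffers;   // buffers handed out by allocbuf and not yet freed

    static Octet* allocbuf(std::size_t n) { ++live_buffers; return new Octet[n]; }
    static void freebuf(Octet* p) { if (p) { --live_buffers; delete[] p; } }

    // release == true : the sequence owns data and calls freebuf on destruction.
    // release == false: the caller keeps ownership and must free the buffer.
    OctetSeqModel(std::size_t max, std::size_t len, Octet* data, bool release = false)
        : max_(max), len_(len), data_(data), release_(release) {}

    ~OctetSeqModel() { if (release_) freebuf(data_); }

    std::size_t maximum() const { return max_; }
    std::size_t length() const { return len_; }

private:
    OctetSeqModel(const OctetSeqModel&);             // copying left out of the model
    OctetSeqModel& operator=(const OctetSeqModel&);

    std::size_t max_;
    std::size_t len_;
    Octet* data_;
    bool release_;
};

int OctetSeqModel::live_buffers = 0;
```

In this model, a sequence built with release set to true returns its buffer when it goes out of scope, while one built with false leaves the buffer alive until the caller frees it, which is the leak described above.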

Rather than allocating your own buffer at all, why not just set the
sequence to the correct length, and then fill its own buffer, i.e.

StringData *convert( char *char_p, int len ) {
    // Use a _var just as good practice - freed if we happened to get an
    // exception before returning
    StringData_var s = new StringData(len);
    for ( int i = 0; i < len; i++ ) {
        s[i] = char_p[i];
    }
    return s._retn();
}

Or if you thought that repeated use of operator[] on the sequence was a
bit inefficient, then use get_buffer to obtain a pointer to its buffer.

Now your StringData_var in foo: It will delete the sequence when it goes
out of scope, but with your code won't delete the chars because of that
'false'. But I reckon its presence is completely pointless. It's doing
no harm, but is just wasting the construction/destruction of the _var
and the copy into the structure member, which will do its own memory
management.

So just do:

myStruct.s = convert( p, len );

Clarke Brunt
TRAFFICMASTER PLC
UNIT 22 / ST. JOHN'S INNOVATION CENTRE
COWLEY ROAD / CAMBRIDGE / CB4 0WS
T: 01223 422469
F:
E: ***@trafficmaster.co.uk


Please consider the environment before printing this email. --------------------------------------------------------

Trafficmaster PLC is a limited Company registered in England and Wales.
Registered number: 2292714 Registered office: Martell House, University Way, Cranfield, BEDS. MK43 0TR

This message (and any associated files) is intended only for the use of omniorb-***@omniorb-support.com and may contain information that is confidential, subject to copyright or constitutes a trade secret. If you are not omniorb-***@omniorb-support.com you are hereby notified that any dissemination, copying or distribution of this message, or files associated with this message, is strictly prohibited. If you have received this message in error, please notify us immediately by replying to the message and deleting it from your computer. Any views or opinions presented are solely those of the author ***@trafficmaster.co.uk and do not necessarily represent those of the company.

Warning: Although the company has taken reasonable precautions to ensure no viruses are present in this email, the company cannot accept responsibility for any loss or damage arising from the use of this email or attachments.
--------------------------------------------------------
Clarke Brunt
2008-05-27 16:47:51 UTC
Post by Clarke Brunt
Rather than allocating your own buffer at all, why not just set the
sequence to the correct length, and then fill its own buffer, i.e.
StringData *convert( char *char_p, int len ) {
    // Use a _var just as good practice - freed if we happened to get an
    // exception before returning
    StringData_var s = new StringData(len);
    for ( int i = 0; i < len; i++ ) {
        s[i] = char_p[i];
    }
    return s._retn();
}
Oops! Forgot what the single-argument constructor does - just sets the
maximum length, not the actual length. So make that e.g.

StringData_var s = new StringData;
s->length(len);
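
The maximum-versus-length distinction has a close analogue in std::vector, which may help as a mental model: reserve() only sets capacity, much like the single-argument sequence constructor, while resize() sets the actual element count, like length(len). The sketch below uses std::vector purely as a stand-in for the sequence, and copy_bytes and size_after_reserve are names invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies len bytes from a raw buffer into a byte vector.
// std::vector stands in for the CORBA sequence; resize() plays the role of
// length(len), so every index below len is valid before the copy loop runs.
inline std::vector<unsigned char> copy_bytes(const char* src, std::size_t len)
{
    std::vector<unsigned char> out;
    out.resize(len);
    for (std::size_t i = 0; i < len; ++i) {
        out[i] = static_cast<unsigned char>(src[i]);
    }
    return out;
}

// For contrast: reserve() only sets capacity, so the element count stays at zero.
inline std::size_t size_after_reserve(std::size_t len)
{
    std::vector<unsigned char> v;
    v.reserve(len);
    return v.size();
}
```

Indexing into a container whose length is still zero is the same mistake as writing s[i] right after the single-argument constructor.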
--
Clarke Brunt
