[Expat-bugs] [ expat-Bugs-1255896 ] Expat 1.95.8 ReadMe

SourceForge.net noreply at sourceforge.net
Wed Aug 10 21:45:36 CEST 2005


Bugs item #1255896, was opened at 2005-08-10 10:43
Message generated for change (Settings changed) made by kwaclaw
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1255896&group_id=10127

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Documentation
Group: Not a Bug
>Status: Closed
>Resolution: Rejected
Priority: 1
Submitted By: abhijitk (abhijitkankaria)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Expat 1.95.8 ReadMe

Initial Comment:
From the Expat ReadMe:
--------------------------------------------------------------------------------------
If you are interested in building Expat to provide document
information in UTF-16 rather than the default UTF-8,
follow these instructions:

  1. For UTF-16 output as unsigned short (and version/error
     strings as char), run:

         ./configure CPPFLAGS=-DXML_UNICODE

     For UTF-16 output as wchar_t (incl. version/error strings),
     run:

         ./configure CFLAGS="-g -O2 -fshort-wchar" \
                     CPPFLAGS=-DXML_UNICODE_WCHAR_T

  2. Edit the MakeFile, changing:

         LIBRARY = libexpat.la

     to:

         LIBRARY = libexpatw.la

     (Note the additional "w" in the library name.)

  3. Run "make buildlib" (which builds the library only).

  4. Run "make installlib" (which installs the library only).
--------------------------------------------------------------------------------------
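
The two configure variants quoted above differ in which C type the library uses for one character unit. The following is a minimal sketch of that choice, mirroring the preprocessor logic in Expat's public headers as commonly described. The names `demo_xml_char`, `demo_xml_char_size` and `demo_check_unit` are hypothetical and exist only for this illustration; they are not part of Expat.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Illustration only: demo_xml_char is a hypothetical stand-in for XML_Char. */
#ifdef XML_UNICODE_WCHAR_T
#  ifndef XML_UNICODE
#    define XML_UNICODE   /* the wchar_t variant implies UTF-16 output */
#  endif
#endif

#ifdef XML_UNICODE
#  ifdef XML_UNICODE_WCHAR_T
typedef wchar_t demo_xml_char;        /* must be 2 bytes, hence -fshort-wchar */
#  else
typedef unsigned short demo_xml_char; /* 2 bytes on common platforms */
#  endif
#else
typedef char demo_xml_char;           /* default build: UTF-8 */
#endif

/* Size in bytes of one character unit for the selected variant. */
size_t demo_xml_char_size(void)
{
    return sizeof(demo_xml_char);
}

/* In a UTF-16 build the unit must be exactly 2 bytes. This assertion
   fails if the wchar_t variant is compiled where wchar_t is 4 bytes. */
void demo_check_unit(void)
{
#ifdef XML_UNICODE
    assert(sizeof(demo_xml_char) == 2);
#endif
}
```

Compiled with no extra flags this selects `char`; adding `-DXML_UNICODE` selects `unsigned short`; adding `-DXML_UNICODE_WCHAR_T` selects `wchar_t`, which is where `-fshort-wchar` comes in.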
As per the definition of -fshort-wchar:

-fshort-wchar
         Override the underlying type for wchar_t to be short
         unsigned int instead of the default for the target.
         This option is useful for building programs to run under
         WINE.

         Warning: the -fshort-wchar switch causes GCC to generate
         code that is not binary compatible with code generated
         without that switch.  Use it to conform to a non-default
         application binary interface.
--------------------------------------------------------------------------------------

So this indicates that the option -fshort-wchar is to
be used in case I need UTF-16 output as unsigned short.
But, as the ReadMe suggests, should I use the option
-fshort-wchar for UTF-16 output as wchar_t?

Please correct me if my understanding is incorrect.

----------------------------------------------------------------------

>Comment By: Karl Waclawek (kwaclaw)
Date: 2005-08-10 15:45

Message:
Logged In: YES 
user_id=290026

I am glad you found some answers, even though
they are probably not what you wanted.

Closing this issue.

----------------------------------------------------------------------

Comment By: abhijitk (abhijitkankaria)
Date: 2005-08-10 15:33

Message:
Logged In: YES 
user_id=1312629

I came across this bug:
[ 931546 ] Unicode support for Windows and Unix are not
compatible

This is exactly the problem I am facing too. I already found
your answer there.

----------------------------------------------------------------------

Comment By: abhijitk (abhijitkankaria)
Date: 2005-08-10 13:29

Message:
Logged In: YES 
user_id=1312629

Yes, Google is the only help I have been using for a few years
now.
Thanks for your time and info; it made things a bit clearer
for me.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2005-08-10 13:25

Message:
Logged In: YES 
user_id=290026

I don't know of any specific sources for wchar_t problems,
but Google should help you there.

I would think that if all your application code is built on the
assumption of a 2-byte wchar_t, then it would make sense to 
use the system libraries compiled for a short wchar_t.

----------------------------------------------------------------------

Comment By: abhijitk (abhijitkankaria)
Date: 2005-08-10 13:13

Message:
Logged In: YES 
user_id=1312629

In my code I am using the wchar_t data type; I have not
specifically defined it in my code.
The existing library works on Mac because Mac has wchar_t
defined as unsigned short in the system headers.
On Solaris I use the -fshort-wchar option when compiling, so I
guess wchar_t gets defined as unsigned short.

Let me explain briefly what I am trying to do here: I am
porting the application from Mac to Solaris.
On Mac the expat library is compiled with XML_UNICODE, so
XML_Char is defined as wchar_t, and wchar_t is defined as
unsigned short in the system headers. My library, which
interfaces with expat, uses wchar_t everywhere because it is
available on Mac as unsigned short, so it worked fine.

Now on Solaris, if I have to use the -fshort-wchar option to
get a two-byte wchar_t, I have two options:
1) Compile everything with the -fshort-wchar option, including
system libraries, so that everything uses a two-byte wchar_t.
OR
2) Change my entire library code to use something
like XML_Char so I can control how it is defined.
But in the second case the system libraries will still use
wchar_t defined as long.
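
Option 2 could look roughly like the following: a minimal sketch, assuming a C99 compiler with `<stdint.h>`, of an application-owned 16-bit character type that does not depend on how the compiler defines wchar_t. The names `app_char16` and `app_strlen16` are hypothetical and not part of Expat.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application-owned UTF-16 code unit: always 16 bits,
   with or without -fshort-wchar. */
typedef uint16_t app_char16;

/* Count code units (not characters) before the terminating zero. */
size_t app_strlen16(const app_char16 *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != 0)
        n++;
    return n;
}
```

With a type like this the application never passes its UTF-16 strings to system functions that expect the platform's own wchar_t, so the system libraries can stay as they are.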

This is an off-topic question, but please advise, or point me to
any place where I can get more info on the wide build of expat.

Thanks.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2005-08-10 11:51

Message:
Logged In: YES 
user_id=290026

Apparently, you want to process Unicode on Solaris not as 
UTF-8 but as UTF-16 encoded. Therefore you need a
two-byte data type for the UTF-16 base character type.

Which base data-type for UTF-16 are you using elsewhere, 
unsigned short or a (redefined) wchar_t?

----------------------------------------------------------------------

Comment By: abhijitk (abhijitkankaria)
Date: 2005-08-10 11:43

Message:
Logged In: YES 
user_id=1312629

So wchar_t has to be two bytes for expat to work correctly?

I am compiling a 32-bit application on Solaris, so if I don't
use -fshort-wchar, wchar_t will be defined as long. My
code does not depend on the size of wchar_t; will expat give
the desired result in this scenario?

Basically I am getting a segmentation fault in my application,
so I am checking whether the -fshort-wchar switch, which causes
GCC to generate code that is not binary compatible with
code generated without that switch, is the reason.
My own libraries are built with this option, but the other system
libraries (from /usr/lib) are not compiled with this option.
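
One plausible way such a mix can crash is that the two sides disagree on how many bytes a wide string occupies. A minimal sketch follows, where the unit sizes 2 and 4 are assumptions taken from this thread (a short wchar_t versus the typical Unix wchar_t) and `wide_buffer_bytes` is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for n_units wide character units plus a terminator,
   given the size in bytes of one unit. */
size_t wide_buffer_bytes(size_t n_units, size_t unit_size)
{
    assert(unit_size > 0);
    return (n_units + 1) * unit_size;
}
```

Code built with a 2-byte wchar_t would allocate `wide_buffer_bytes(3, 2)`, that is 8 bytes, for a three-unit string, while a library built with a 4-byte wchar_t would treat the same string as needing `wide_buffer_bytes(3, 4)`, that is 16 bytes, and could read or write past the end of the buffer.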

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2005-08-10 11:02

Message:
Logged In: YES 
user_id=290026

No, use -fshort-wchar only if you want UTF-16 output
to be delivered as a *two-byte* wchar_t type (necessary
because on Unix wchar_t is typically four bytes).
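
A quick way to see which case applies on a given compiler and flag set is to look at `sizeof(wchar_t)`. This is a minimal sketch; `wchar_is_two_bytes` and `describe_wchar` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Returns 1 if this translation unit sees a 2-byte wchar_t
   (for example when built with -fshort-wchar), 0 otherwise. */
int wchar_is_two_bytes(void)
{
    return sizeof(wchar_t) == 2;
}

/* Writes a short human-readable report into buf; returns snprintf's result. */
int describe_wchar(char *buf, size_t len)
{
    assert(buf != NULL);
    return snprintf(buf, len, "wchar_t is %u byte(s)",
                    (unsigned)sizeof(wchar_t));
}
```

Building the same file once with and once without `-fshort-wchar` should show the two different widths discussed here.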

----------------------------------------------------------------------
