[Expat-discuss] Expat and JavaScript

Franky Braem franky.braem at gmail.com
Tue Aug 1 19:43:33 CEST 2006


Karl Waclawek wrote:
> Franky Braem wrote:
>   
>> I'm porting expat to JavaScript (using SpiderMonkey, see my wxJS 
>> project). The application is in Unicode.
>> I have the following problem:
>>
>> When I pass a normal JavaScript string (SpiderMonkey uses UTF-16) to a 
>> 'print' method I don't have to do any conversions to write the data to 
>> stdout.
>> When I use a string generated by expat (it is built in UNICODE and 
>> creates data in UTF-16) I have to do a conversion before I can send the 
>> string to the 'print' method.
>> When I look at the expat string in debug mode, I see nothing but odd 
>> Chinese characters. When I debug the normal JavaScript string, I see 
>> normal text.
>> Is there a conversion needed between SpiderMonkey and Expat?
>>
>> Maybe this JavaScript code makes some things clearer:
>>
>> p.parse('<tag1 attr="1" attr2="2"><child1>child</child1></tag1>', true);
>> if ( p.currentElement.name == "tag1")
>> {
>>  wxJS.print("Ok");
>> }
>> else
>> {
>>  wxJS.print("Not ok");
>> }
>> wxJS.print(p.currentElement.name);
>>
>> The comparison always fails (although the parsing in expat went fine).
>
> How does p.currentElement.name get its value from Expat?
This is how it is done:

void XMLParser::StartElementHandler(void *userData,
                                    const XML_Char *name,
                                    const XML_Char **atts)
{
    wxMBConvUTF16 conv;
    XMLParser *parser = (XMLParser*) userData;

    wxString element(name, conv);

    ...

    jsval rval;
    jsval argv[] =
    {
        // JS_NewUCStringCopyN returns a JSString*, so wrap it to get a jsval
        STRING_TO_JSVAL(JS_NewUCStringCopyN(cx, (jschar *) element.c_str(),
                                            element.length())),
        OBJECT_TO_JSVAL(objAttr)
    };

    wxJS_CallFunctionProperty(parser->m_cx, parser->m_obj, "onStartElement",
                              2, argv, &rval);
}


The element name is passed to the onStartElement function, which is a 
JavaScript function that sets the current element.
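
To see what actually reaches onStartElement, it can help to dump the UTF-16
code units of the name on the JavaScript side. The helper below is only a
minimal sketch: hexUnits is a hypothetical name, not part of wxJS or
SpiderMonkey, and it relies on nothing beyond the standard charCodeAt.

```javascript
// Hypothetical debugging helper: list the UTF-16 code units of a string in hex.
function hexUnits(s)
{
  var out = [];
  for (var i = 0; i < s.length; i++)
  {
    var h = s.charCodeAt(i).toString(16);
    while (h.length < 4)
      h = "0" + h;          // pad to four hex digits
    out.push(h);
  }
  return out.join(" ");
}

// A correctly decoded "tag1" gives: 0074 0061 0067 0031
var dump = hexUnits("tag1");
```

Printing hexUnits(element) from inside onStartElement would then show
whether the name matches that sequence or holds something else.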

When I do the following conversion, it seems to work:

    wxCharBuffer s = conv.cWC2MB(element);
    wxWCharBuffer t = wxConvCurrent->cMB2WC(s);
    element = t;
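
One possible explanation for both the symptom and the workaround, offered as
a guess rather than a diagnosis: if the data coming from Expat is really
8-bit (UTF-8) and gets decoded as UTF-16, pairs of ASCII bytes fuse into
single code units far outside the ASCII range, and encoding back with the
same converter restores the original bytes. The JavaScript sketch below only
illustrates that arithmetic. packLE and unpackLE are hypothetical helpers,
and little-endian byte order is an assumption.

```javascript
// Hypothetical helper: read a string of 8-bit characters as little-endian
// 16-bit units, the way a UTF-16LE decoder would read the raw bytes.
function packLE(bytes)
{
  var out = "";
  for (var i = 0; i + 1 < bytes.length; i += 2)
    out += String.fromCharCode(bytes.charCodeAt(i) | (bytes.charCodeAt(i + 1) << 8));
  return out;
}

// Hypothetical inverse: split each 16-bit unit back into its two bytes.
function unpackLE(units)
{
  var out = "";
  for (var i = 0; i < units.length; i++)
  {
    var u = units.charCodeAt(i);
    out += String.fromCharCode(u & 0xFF, u >> 8);
  }
  return out;
}

var garbled = packLE("tag1");      // two units: 0x6174 and 0x3167
var restored = unpackLE(garbled);  // back to the four original bytes
```

The first unit, 0x6174, falls in the CJK ideograph range, which would look
like Chinese text in a debugger, and a comparison of the garbled string
with "tag1" fails.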


The full JavaScript code:

// Constructor for a XMLElement object
function XMLElement(parent, name, attrs)
{
  this.parent = parent;
  if ( parent != null )
    parent.elements[parent.elements.length] = this; // append without leaving a hole
  this.name = name;
  this.attrs = attrs;
  this.elements = new Array();
}

var p = new XMLParser();
p.currentElement = null;

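// The handlers are called on the parser object, so the current element
// is reached through 'this' (a bare currentElement would be a global).
p.onStartElement = function(element, attrs)
{
  this.currentElement = new XMLElement(this.currentElement, element, attrs);
}

p.onEndElement = function(element)
{
  if ( this.currentElement.parent != null )
     this.currentElement = this.currentElement.parent;
}

p.onCharacter = function(data)
{
  this.currentElement.data = data;
}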

p.parse('<tag1 attr="1" attr2="2"><child1>child</child1></tag1>', true);
if ( p.currentElement.name == "tag1")
{
  wxJS.print("Ok");
}
else
{
  wxJS.print("Not ok");
}
wxJS.print(p.currentElement.name);
// p.currentElement points to the root tag now (tag1)
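
For testing the handler logic without Expat in the loop, the same
tree-building code can be driven by hand. This is only a sketch: FakeParser
is a hypothetical stand-in that replays the events for the sample document,
it assumes the handlers are called with the parser object as 'this', and it
appends children using elements.length as the index.

```javascript
function XMLElement(parent, name, attrs)
{
  this.parent = parent;
  this.name = name;
  this.attrs = attrs;
  this.elements = new Array();
  if ( parent != null )
    parent.elements[parent.elements.length] = this;
}

// Hypothetical stand-in for XMLParser: replays fixed events, no real parsing.
function FakeParser()
{
  this.currentElement = null;
}

FakeParser.prototype.parse = function()
{
  this.onStartElement("tag1", { attr: "1", attr2: "2" });
  this.onStartElement("child1", {});
  this.onCharacter("child");
  this.onEndElement("child1");
  this.onEndElement("tag1");
};

var p = new FakeParser();

p.onStartElement = function(element, attrs)
{
  this.currentElement = new XMLElement(this.currentElement, element, attrs);
};

p.onEndElement = function(element)
{
  if ( this.currentElement.parent != null )
    this.currentElement = this.currentElement.parent;
};

p.onCharacter = function(data)
{
  this.currentElement.data = data;
};

p.parse();
var rootName = p.currentElement.name;
var childCount = p.currentElement.elements.length;
```

If the comparison with "tag1" succeeds here but fails with the real parser,
the difference has to come from the strings delivered by the C++ side and
not from the JavaScript handlers.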



