[Tutor] Playing with XML

Peter Otten __peter__ at web.de
Fri Jun 21 10:48:27 CEST 2013


Danilo Chilene wrote:

> Hello,
> 
> Below is my code:
> 
> #!/bin/env python
> # -*- coding: utf-8 -*-
> import requests
> from lxml import etree
> 
> url = 'http://192.168.0.1/webservice.svc?wsdl'
> headers = {'Content-Type': 'text/xml;charset=UTF-8',
>            'SOAPAction': 'http://tempuri.org/ITService/SignIn'}
> xml = '''<soapenv:Envelope
> xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
> xmlns:tem="http://tempuri.org/">
>             <soapenv:Header></soapenv:Header>
>             <soapenv:Body>
>                 <tem:SignIn>
>                     <tem:rq>
>                         <env:ClientSystem>123</env:ClientSystem>
>                         <env:CompanyId>123</env:CompanyId>
>                         <env:Password>123</env:Password>
>                         <env:Signature>omg</env:Signature>
>                     </tem:rq>
>                 </tem:SignIn>
>             </soapenv:Body>
>         </soapenv:Envelope>'''
> 
> response = requests.post(url, data=xml, headers=headers).text
> print response
> 
> doc = etree.parse(response)
> 
> 
> The variable response contains a large XML document with some values
> that I want.
> 
> Part of variable response:
> ---------------------------------------------------------------------------------------------------------------------
> <s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"><s:Body>
> <SignInResponse xmlns="http://tempuri.org/">
> <SignInResult
> xmlns:a="http://schemas.datacontract.org/2004/07/Core.DTO.Envelopes.Authentication"
> xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
> <Errors xmlns="http://schemas.datacontract.org/2004/07/Framework.BaseEnvelopes"
> xmlns:b="http://schemas.datacontract.org/2004/07/Framework"/>
> <Messages xmlns="http://schemas.datacontract.org/2004/07/Framework.BaseEnvelopes"
> xmlns:b="http://schemas.datacontract.org/2004/07/Framework"/>
> <Successful xmlns="http://schemas.datacontract.org/2004/07/Framework.BaseEnvelopes">true</Successful>
> <Warnings xmlns="http://schemas.datacontract.org/2004/07/Framework.BaseEnvelopes"
> xmlns:b="http://schemas.datacontract.org/2004/07/Framework"/>
> <a:ApplicationSettings
> xmlns:b="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
> <b:KeyValueOfstringstring><b:Key>removeDuplicatedFlights</b:Key><b:Value>true</b:Value></b:KeyValueOfstringstring>
> <b:KeyValueOfstringstring><b:Key>useWeakPassword</b:Key>
> ---------------------------------------------------------------------------------------------------------------------
> 
> Below is the result of doc = etree.parse(response):
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3630:
> ordinal not in range(128)
> The type of response is unicode.
> 
> 
> The whole idea is to sign in to this web service, get a security token,
> and then send another XML request from the same script.
> 
> Any ideas on how to turn this unicode string into XML and parse it?

Read carefully:

>>> help(lxml.etree.parse)
Help on built-in function parse in module lxml.etree:

parse(...)
    parse(source, parser=None, base_url=None)
    
    Return an ElementTree object loaded with source elements.  If no parser
    is provided as second argument, the default parser is used.
    
    The ``source`` can be any of the following:
    
    - a file name/path
    - a file object
    - a file-like object
    - a URL using the HTTP or FTP protocol
    
    To parse from a string, use the ``fromstring()`` function instead.
    
    Note that it is generally faster to parse from a file path or URL
    than from an open file object or file-like object.  Transparent
    decompression from gzip compressed sources is supported (unless
    explicitly disabled in libxml2).
    
    The ``base_url`` keyword allows setting a URL for the document
    when parsing from a file-like object.  This is needed when looking
    up external entities (DTD, XInclude, ...) with relative paths.
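
As the help text says, parse() expects a file name, a URL or a file(-like) object rather than the XML text itself. A minimal sketch of what that means in practice, assuming Python 3 and using a tiny made-up document in place of the real response; the fallback import is only there so the snippet also runs where lxml is not installed:

```python
import io

try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    # Fallback for this sketch only; the stdlib module has the same parse() call
    import xml.etree.ElementTree as etree

# Made-up stand-in for the XML text returned by the web service
text = u"<a>äöü</a>"

# parse() reads from a file-like object, so wrap the encoded bytes in one
tree = etree.parse(io.BytesIO(text.encode("utf-8")))
root = tree.getroot()

print(root.tag)
```

parse() returns an ElementTree, so getroot() is needed to reach the top-level element.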

A quick test confirms that fromstring() accepts unicode:

>>> lxml.etree.fromstring(u"<a>äöü</a>")
<Element a at 0x2266320>
>>> print _.text
äöü
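
Once the string is parsed, the values can be looked up with find(). The sketch below assumes Python 3 and uses a hand-shortened sample modelled on the response quoted above, not the real service output. The prefixes in the ns dictionary are arbitrary local names chosen for this example; only the namespace URIs have to match the document:

```python
try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    # Fallback for this sketch only; fromstring() and find() behave alike here
    import xml.etree.ElementTree as etree

# Abbreviated, hand-made sample based on the quoted response
sample = (
    u'<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">'
    u'<s:Body><SignInResponse xmlns="http://tempuri.org/"><SignInResult>'
    u'<Successful xmlns="http://schemas.datacontract.org/2004/07/Framework.BaseEnvelopes">'
    u'true</Successful>'
    u'</SignInResult></SignInResponse></s:Body></s:Envelope>'
)

# fromstring() takes the XML text itself and returns the root element
root = etree.fromstring(sample)

# Map local prefixes (chosen for this example) to the namespace URIs
ns = {
    "s": "http://schemas.xmlsoap.org/soap/envelope/",
    "t": "http://tempuri.org/",
    "f": "http://schemas.datacontract.org/2004/07/Framework.BaseEnvelopes",
}

# SignInResult inherits the default namespace declared on SignInResponse
successful = root.find("s:Body/t:SignInResponse/t:SignInResult/f:Successful", ns)
print(successful.text)
```

In the original script this would amount to replacing etree.parse(response) with etree.fromstring(response).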

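A requests Response object is not itself file-like, but its raw attribute
is when the request is made with stream=True, so the following should work,
too:

#untested
response = requests.post(url, data=xml, headers=headers, stream=True)
doc = etree.parse(response.raw)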



