[Tutor] Playing with XML
Peter Otten
__peter__ at web.de
Fri Jun 21 10:48:27 CEST 2013
Danilo Chilene wrote:
> Hello,
>
> Below is my code:
>
> #!/bin/env python
> # -*- coding: utf-8 -*-
> import requests
> from lxml import etree
>
> url = 'http://192.168.0.1/webservice.svc?wsdl'
> headers = {'Content-Type': 'text/xml;charset=UTF-8', 'SOAPAction': '
> http://tempuri.org/ITService/SignIn'}
> xml = '''<soapenv:Envelope xmlns:soapenv="
> http://schemas.xmlsoap.org/soap/envelope/"
> xmlns:tem="http://tempuri.org/">
> <soapenv:Header></soapenv:Header>
> <soapenv:Body>
> <tem:SignIn>
> <tem:rq>
> <env:ClientSystem>123</env:ClientSystem>
> <env:CompanyId>123</env:CompanyId>
> <env:Password>123</env:Password>
> <env:Signature>omg</env:Signature>
> </tem:rq>
> </tem:SignIn>
> </soapenv:Body>
> </soapenv:Envelope>'''
>
> response = requests.post(url, data=xml, headers=headers).text
> print response
>
> doc = etree.parse(response)
>
>
> The content of variable response is a big XML with some values that I
> want.
>
> Part of variable response:
> ---------------------------------------------------------------------------------------------------------------------
> <s:Envelope
> xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"><s:Body><SignInResponse
> xmlns="http://tempuri.org/"><SignInResult xmlns:a="
> http://schemas.datacontract.org/2004/07/Core.DTO.Envelopes.Authentication"
> xmlns:i="http://www.w3.org/2001/XMLSchema-instance"><Errors xmlns="
> http://schemas.datacontract.org/2004/07/Framework.BaseEnvelopes" xmlns:b="
> http://schemas.datacontract.org/2004/07/Framework"/><Messages xmlns="
> http://schemas.datacontract.org/2004/07/Framework.BaseEnvelopes" xmlns:b="
> http://schemas.datacontract.org/2004/07/Framework"/><Successful xmlns="
> http://schemas.datacontract.org/2004/07/Framework.BaseEnvelopes">true</Successful><Warnings
> xmlns="http://schemas.datacontract.org/2004/07/Framework.BaseEnvelopes"
> xmlns:b="http://schemas.datacontract.org/2004/07/Framework"/><a:ApplicationSettings
> xmlns:b="http://schemas.microsoft.com/2003/10/Serialization/Arrays
> "><b:KeyValueOfstringstring><b:Key>removeDuplicatedFlights</b:Key><b:Value>true</b:Value></b:KeyValueOfstringstring><b:KeyValueOfstringstring><b:Key>useWeakPassword</b:Key>
> ---------------------------------------------------------------------------------------------------------------------
>
> Below the return of doc = etree.parse(response)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3630:
> ordinal not in range(128)
> The type of response is unicode.
>
>
> The whole idea is sign in on this webservice and get a Security token and
> then run another XML on the same script.
>
> Any ideas to transform this unicode on XML and parse it?
Read carefully:

>>> help(lxml.etree.parse)
Help on built-in function parse in module lxml.etree:

parse(...)
    parse(source, parser=None, base_url=None)

    Return an ElementTree object loaded with source elements. If no parser
    is provided as second argument, the default parser is used.

    The ``source`` can be any of the following:

    - a file name/path
    - a file object
    - a file-like object
    - a URL using the HTTP or FTP protocol

    To parse from a string, use the ``fromstring()`` function instead.

    Note that it is generally faster to parse from a file path or URL
    than from an open file object or file-like object. Transparent
    decompression from gzip compressed sources is supported (unless
    explicitly disabled in libxml2).

    The ``base_url`` keyword allows setting a URL for the document
    when parsing from a file-like object. This is needed when looking
    up external entities (DTD, XInclude, ...) with relative paths.

A quick test confirms that fromstring() accepts unicode:
>>> lxml.etree.fromstring(u"<a>äöü</a>")
<Element a at 0x2266320>
>>> print _.text
äöü
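
Applied to the sign-in response, the same call might look like the sketch below. The response_text string is a hypothetical, heavily shortened stand-in for the real reply, and the stdlib fallback import is only there so the snippet also runs without lxml installed. One caveat: lxml's fromstring() raises ValueError for a unicode string that starts with an XML encoding declaration; in that case pass the undecoded bytes (response.content) instead.

```python
try:
    from lxml import etree  # the library used in this thread
except ImportError:
    import xml.etree.ElementTree as etree  # stdlib fallback, same calls below

# Hypothetical, shortened stand-in for the text returned by the web service.
response_text = (
    '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">'
    '<s:Body><SignInResponse xmlns="http://tempuri.org/"><SignInResult>'
    '<Successful xmlns="http://schemas.datacontract.org/2004/07/'
    'Framework.BaseEnvelopes">true</Successful>'
    '</SignInResult></SignInResponse></s:Body></s:Envelope>'
)

# fromstring() takes the document itself, not a file name or URL.
root = etree.fromstring(response_text)

# Elements in a default namespace must be looked up as {namespace}tag.
BASE = "http://schemas.datacontract.org/2004/07/Framework.BaseEnvelopes"
successful = root.find(".//{%s}Successful" % BASE)
print(successful.text)
```

The security token can presumably be fetched the same way once you know the namespace and tag name of the element that carries it.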
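The Response object itself is not file-like, but with stream=True its
raw attribute is, so the following should work, too:
#untested
response = requests.post(url, data=xml, headers=headers, stream=True)
response.raw.decode_content = True  # undo gzip/deflate, if any
doc = etree.parse(response.raw)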
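
If you would rather stay with parse(), one option that does not depend on what the response object exposes is to wrap the body's bytes in io.BytesIO. A minimal sketch, assuming the bytes come from response.content; the body variable below is a hypothetical stand-in so the snippet runs without a server, and the stdlib fallback import is only for machines without lxml.

```python
import io

try:
    from lxml import etree  # the library used in this thread
except ImportError:
    import xml.etree.ElementTree as etree  # stdlib fallback, same calls below

# Hypothetical stand-in for response.content (the undecoded body as bytes).
body = u"<a>äöü</a>".encode("utf-8")

# BytesIO gives parse() the file-like object it expects.
doc = etree.parse(io.BytesIO(body))
root = doc.getroot()
print(root.text)
```

Handing the parser bytes rather than decoded text lets it work out the encoding itself, which sidesteps the UnicodeDecodeError from the original post.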