[Python-es] Extracting data between tags
Kennedy Sanchez
kurokysan at gmail.com
Mon Apr 8 16:37:34 CEST 2013
Thanks, Andrey.
The lxml XML toolkit is a Pythonic binding for the C libraries libxml2
<http://xmlsoft.org/> and libxslt <http://xmlsoft.org/XSLT/>. It is unique
in that it combines the
speed and XML feature completeness of these libraries with the simplicity
of a native Python API, mostly compatible but superior to the well-known
ElementTree <http://effbot.org/zone/element-index.htm> API. The latest
release works with all CPython versions from 2.4 to 3.3. See the
introduction <http://lxml.de/intro.html> for more information about
background and goals of the lxml project. Some common questions are
answered in the FAQ <http://lxml.de/FAQ.html>.
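
Since the lxml API is described above as mostly compatible with ElementTree, a lookup like the one asked about in this thread can be sketched so that it runs with either library. This is only a minimal sketch: the shortened SAMPLE string and the helper name get_variable are made up for illustration, and the fallback import is an assumption about how one might cope with lxml not being installed.

```python
# Prefer lxml when it is installed; otherwise fall back to the standard
# library, whose ElementTree API covers everything used in this sketch.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Shortened, hypothetical sample in the same shape as the data in this thread.
SAMPLE = """<listitem>
<variable name="Name">kurokysan</variable>
<variable name="QuotaDay">5242880</variable>
</listitem>"""

root = etree.fromstring(SAMPLE)


def get_variable(root, name):
    """Return the text of <variable name="..."> or None if it is missing."""
    # Attribute predicates like [@name='...'] are understood by both lxml
    # and the standard library ElementTree (Python 2.7 / 3.2 and later).
    node = root.find("variable[@name='%s']" % name)
    return None if node is None else node.text


usuario = get_variable(root, "Name")
quota = get_variable(root, "QuotaDay")
print(usuario, quota)
```

The name is interpolated into the path without escaping, so this only suits fixed, trusted names such as the ones above.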
On 8 April 2013 09:40, Andrey Antukh <niwi at niwi.be> wrote:
> lxml is your friend for XML processing
> http://lxml.de/
> On 6 April 2013 14:08, Kennedy Sanchez <kurokysan at gmail.com> wrote:
>> Thanks, everyone, for the input. I'll see which option is best to
>> implement in the code, although xml.etree.ElementTree looks like a good
>> fit for my case. I'll try to avoid regex as much as I can :s
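
For reference, the ElementTree route preferred above might look roughly like the following. It is a sketch under assumptions, not the code that was eventually used: XML_TEXT is a shortened copy of the sample quoted further down in this thread, and variables_to_dict is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample data quoted later in the thread.
XML_TEXT = """<listitem>
<variable name="UUID">2bd9c142-9e91-4182-b85c-5bb616823bd9</variable>
<variable name="Name">kurokysan</variable>
<variable name="QuotaDayEnabled">0</variable>
<variable name="QuotaDayType"></variable>
<variable name="QuotaDay">5242880</variable>
</listitem>"""


def variables_to_dict(xml_text):
    """Map each <variable> element's name attribute to its text content."""
    root = ET.fromstring(xml_text)
    # An empty element such as QuotaDayType has text None; store "" instead.
    return {var.get("name"): (var.text or "") for var in root.iter("variable")}


settings = variables_to_dict(XML_TEXT)
usuario = settings["Name"]
quota = int(settings["QuotaDay"])  # the quota is numeric in the sample
print(usuario, quota)
```

Building the dictionary once means every other field (UUID, Lang and so on) is available by name without further parsing.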
>> On 5 April 2013 12:30, Kiko <kikocorreoso at gmail.com> wrote:
>>> On 5 April 2013 16:01, Harenson Henao <harenson at gmail.com> wrote:
>>>> Hi, this may point you in the right direction:
>>>> http://docs.python.org/2/library/xml.etree.elementtree.html
>>> And/or this (crude and not very elegant):
>>> http://docs.python.org/2/library/string.html#string.split
>>> kk = """
>>> <listitem>
>>> <variable name="UUID">2bd9c142-9e91-4182-b85c-5bb616823bd9</variable>
>>> <variable name="Name">kurokysan</variable>
>>> <variable name="AccStatus">1</variable>
>>> <variable name="WWWFilter">JARSP</variable>
>>> <variable name="UseTemplate">1</variable>
>>> <variable name="Rights">0</variable>
>>> <variable name="AdmFilter">JARSP</variable>
>>> <variable name="QuotaDayEnabled">0</variable>
>>> <variable name="QuotaDayType"></variable>
>>> <variable name="QuotaDay">5242880</variable>
>>> <variable name="QuotaWeekEnabled">0</variable>
>>> <variable name="QuotaWeekType"></variable>
>>> <variable name="QuotaWeek">0</variable>
>>> <variable name="QuotaMonthEnabled">0</variable>
>>> <variable name="QuotaMonthType"></variable>
>>> <variable name="QuotaMonth">0</variable>
>>> <variable name="QuotaAction"></variable>
>>> <variable name="QuotaSendAlert">0</variable>
>>> <variable name="Lang">detect</variable>
>>> <variable name="DontUseLangTemp">0</variable>
>>> <variable name="DetectedLang"></variable>
>>> </listitem>""".split('\"')
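>>> # After splitting on '"', the piece that follows an attribute value such
>>> # as 'Name' starts with '>value</variable>', so cut between '>' and '<'.
>>> usuario = kk[kk.index('Name')+1].split('>')[1].split('<')[0]
>>> quota = kk[kk.index('QuotaDay')+1].split('>')[1].split('<')[0]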
>>> and/or using regular expressions; examples and links to documentation
>>> are here:
>>> http://pybonacci.wordpress.com/2013/02/21/regex-mediante-ejemplos/
>>> If you don't understand any of the above, keep asking.
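
The regular-expression option mentioned above could be sketched like this. The pattern and the name VARIABLE_RE are assumptions for illustration only. The pattern expects the attribute to be written exactly as name="..." and the element to hold plain text, so it is more brittle than a real XML parser.

```python
import re

# Shortened copy of the sample data from the message above.
XML_TEXT = """<listitem>
<variable name="Name">kurokysan</variable>
<variable name="QuotaDayType"></variable>
<variable name="QuotaDay">5242880</variable>
</listitem>"""

# Group 1 captures the name attribute, group 2 the (possibly empty) text.
VARIABLE_RE = re.compile(r'<variable name="([^"]+)">([^<]*)</variable>')

# findall returns (name, value) tuples, which dict() accepts directly.
values = dict(VARIABLE_RE.findall(XML_TEXT))
usuario = values["Name"]
quota = values["QuotaDay"]
print(usuario, quota)
```

Extra attributes, single quotes, or entities such as &amp; in the values would all defeat this pattern, which is a reasonable argument for the parser-based approaches.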
>>> _______________________________________________
>>> Python-es mailing list
>>> Python-es at python.org
>>> http://mail.python.org/mailman/listinfo/python-es
>>> FAQ: http://python-es-faq.wikidot.com/
>> --
>> <Ksanchez>
> --
> Andrey Antukh - Андрей Антух - <niwi at niwi.be>
> http://www.niwi.be/about.html
> http://www.kaleidos.net/A5694F/
> "Linux is for people who hate Windows, BSD is for people who love UNIX"
> "Social Engineer -> Because there is no patch for human stupidity"