Mailman 3 Re: [Python-Dev] cpython (2.7): #14538: HTMLParser can now parse correctly start tags that contain a bare /. - Python-Dev

Re: [Python-Dev] cpython (2.7): #14538: HTMLParser can now parse correctly start tags that contain a bare /.

Georg Brandl

April 24, 2012

12:13 p.m.

On 19.04.2012 03:36, ezio.melotti wrote:

...

I think that's misleading: there's no way to "correctly" parse malformed HTML. Georg

Benjamin Peterson

April 2012

1:34 p.m.

New subject: cpython (2.7): #14538: HTMLParser can now parse correctly start tags that contain a bare /.

2012/4/24 Georg Brandl <g.brandl@gmx.net>:

...

There is in the since that you can follow the HTML5 algorithm, which can "parse" any junk you throw at it. -- Regards, Benjamin
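The case from the commit subject, a start tag containing a bare /, can be tried directly. This is only a minimal sketch: it assumes Python 3's `html.parser` module (the commit under discussion targets the 2.7 `HTMLParser` module), and the `StartTagCollector` class and `collect_start_tags` helper are hypothetical names made up for illustration.

```python
from html.parser import HTMLParser


class StartTagCollector(HTMLParser):
    """Hypothetical helper: remembers the name of every start tag seen."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # Called for every start tag the tokenizer recognizes.
        self.start_tags.append(tag)


def collect_start_tags(markup):
    """Feed markup to a fresh parser and return the start tag names."""
    parser = StartTagCollector()
    parser.feed(markup)
    parser.close()
    return parser.start_tags


# "<meta / >" has a bare slash that is not part of a "/>" terminator.
tags = collect_start_tags("<meta / ><p>text</p>")
print(tags)
```

The point of the sketch is only that the bare slash does not stop the parser from reporting the `meta` tag or the `p` tag after it; it does not claim this is what a full HTML5 tree builder would produce.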

Fred Drake

2 p.m.

On Tue, Apr 24, 2012 at 2:34 PM, Benjamin Peterson <benjamin@python.org> wrote:

...

There is in the since that you can follow the HTML5 algorithm, which can "parse" any junk you throw at it.

This whole can of worms is why I gave up on HTML years ago (well, one reason among many). There are markup languages, and there's soup. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> "A person who won't read has no advantage over one who can't read." --Samuel Langhorne Clemens

Georg Brandl

2:02 p.m.

On 24.04.2012 20:34, Benjamin Peterson wrote:

...

Ah, good. Then I hope we are following the algorithm here (and are slowly coming to use it for htmllib in general). Georg

Éric Araujo

2:34 p.m.

Le 24/04/2012 15:02, Georg Brandl a écrit :

...

Yes, Ezio’s commits on html.parser/HTMLParser in the last months have been following the HTML5 spec. Ezio, RDM and I have had some discussion about that on some bug reports, IRC and private mail and reached the agreement to do the useful thing, that is follow HTML5 and not pretend that the stdlib parser is strict or validating. Ezio was thinking about a blog.python.org post to advertise this. Regards
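To make "not strict or validating" concrete, here is a small sketch, again assuming Python 3's `html.parser` rather than the 2.7 module the commit touches. `EventRecorder` and `record_events` are hypothetical names used only for this illustration. The parser reports tags in the order it sees them and does not check that they nest or match.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical helper: records start and end tag events in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def record_events(markup):
    """Feed markup to a fresh parser and return the recorded tag events."""
    parser = EventRecorder()
    parser.feed(markup)
    parser.close()
    return parser.events


# <p> and <b> are never closed, and </span> was never opened.
events = record_events("<div><p>unclosed <b>bold</div></span>")
for event in events:
    print(event)
```

No error is raised for the unclosed `p` and `b` or for the stray `</span>`; any notion of well-formedness is left to the code that consumes the events.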

Brian Curtin

2:41 p.m.

On Tue, Apr 24, 2012 at 14:34, Éric Araujo <merwok@netwok.org> wrote:

...

Please do this, and I welcome anyone else who wants to write about their work on the blog to do so. Contact me for info.

Benjamin Peterson

2:05 p.m.

2012/4/24 Benjamin Peterson <benjamin@python.org>:

...

There is in the since

This is confusing, since I meant "sense". -- Regards, Benjamin
