Movie (MPAA) ratings and Python?

Ned Batchelder ned at nedbatchelder.com
Wed Dec 11 19:35:17 CET 2013


On 12/10/13 6:50 PM, Dan Stromberg wrote:
>
> On Tue, Dec 10, 2013 at 1:07 PM, Petite Abeille
> <petite.abeille at gmail.com> wrote:
>
>
>     On Dec 10, 2013, at 6:25 AM, Dan Stromberg <drsalists at gmail.com>
>     wrote:
>
>      > The IMDB flat text file probably came the closest, but it appears
>     to have encoding issues; it's apparently nearly windows-1255, but
>     not quite.
>
>     It's ISO-8859-1.
>
> Thanks - that reads well from CPython 3.3.
>
> Now the question becomes: Why did chardet tell me it was windows-1255?  :)

It probably told you it was Windows-1252 (I'm assuming the last 5 is a 
typo).

Windows-1252 is a super-set of ISO-8859-1 as far as printable characters 
go (the two differ only in bytes 0x80-0x9F, which are control codes in 
ISO-8859-1), so ordinary ISO-8859-1 text is also correct Windows-1252.  
In addition, it's not uncommon to find text marked as ISO-8859-1 that in 
fact has characters that make it Windows-1252.
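
A quick way to see the relationship for yourself is to decode the same 
bytes both ways.  This is only a minimal sketch: the byte strings are 
made-up samples, not taken from the IMDB file, and the variable names 
are just for illustration.

```python
# Made-up sample bytes, not taken from the IMDB file.
plain = b"caf\xe9"              # 0xE9 is e-acute in both encodings
latin1_text = plain.decode("iso-8859-1")
cp1252_text = plain.decode("windows-1252")
print(latin1_text == cp1252_text)   # same text either way

# Bytes in the 0x80-0x9F range are where the two encodings part ways.
curly = b"\x93quoted\x94"
as_latin1 = curly.decode("iso-8859-1")    # C1 control characters
as_cp1252 = curly.decode("windows-1252")  # curly double quotes
print(repr(as_latin1))
print(repr(as_cp1252))
```

One caveat: Python's Windows-1252 codec leaves a handful of bytes in 
that range (0x81, for example) undefined and raises UnicodeDecodeError 
on them, while the ISO-8859-1 codec accepts every byte value.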


-- 
Ned Batchelder, http://nedbatchelder.com



