[Tutor] Truncated urlopen

Dave Angel davea at davea.name
Mon Feb 11 14:36:16 CET 2013


On 02/11/2013 05:58 AM, Válas Péter wrote:
> Hi tutors,
>
> aboard again after a long time.

Welcome back.

> http://..._export.php?mehet=1 is a link which exports me some data in CSV
> correctly when I click on it in my browser. (It contains personal data,
> that's why I dotted.) I want to process it directly from Python, excluding
> the browser phase.
>
> The following piece of code in Python 3.3 prints the first n-1 lines
> nicely, then the beginning of the last line truncated. I mean there are 12
> fields in CSV and in the last line only 1.5 fields of them are displayed. I
> suspect some closing/caching error.
> In the first times it worked well but after a certain time it spoiled
> without PHP code being modified. I guess it may somehow be connected to
> quantity of lines.
> How could I read the last line?
> Thx, Péter
>
> from urllib.request import urlopen
> x=urlopen('http://..._export.php?mehet=1')
> y=x.read().decode('windows-1250')

I'd suggest splitting that expression into two separate statements, one
that does the read, and the other that decodes.  Then you can print the
byte string and see whether it is truncated too, or whether the
truncation happened during the decoding.

(Though I can't imagine how decoding one of the windows-xxx character
sets could fail to process the whole byte string.)
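
Something along these lines is what I have in mind. It is only a sketch:
read_then_decode is a name I made up, and the io.BytesIO object with its
sample bytes is a stand-in for the real response so the snippet runs
without touching your server. With the real URL you would pass the object
returned by urlopen instead.

```python
import io

def read_then_decode(response, encoding="windows-1250"):
    """Read and decode in two separate steps so each can be inspected."""
    raw = response.read()        # step 1: the bytes exactly as received
    text = raw.decode(encoding)  # step 2: decoding, as its own statement
    return raw, text

# Stand-in for the HTTP response (hypothetical sample data).
# Real use would be something like:
#     x = urlopen('http://..._export.php?mehet=1')
#     raw, text = read_then_decode(x)
fake_response = io.BytesIO("161;Válas Péter;phone\r\n".encode("windows-1250"))
raw, text = read_then_decode(fake_response)

# Compare the tail of the byte string with the tail of the decoded text.
print(len(raw), repr(raw[-20:]))
print(len(text), repr(text[-20:]))
```

If the tail of raw is already cut off, the problem is in the transfer (or
on the server side) and not in the decoding.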

Also, if you use print(repr(mystring))
you might discover that there are funny escape sequence characters that
are fooling your console.
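
As an illustration of the repr() idea, here is a small sketch. The helper
find_control_chars and the sample string are invented for the example; I
am not claiming your data contains this particular character, only showing
how such a character would show up.

```python
def find_control_chars(text, allowed="\r\n\t"):
    """Return (index, code point) for each control character in text,
    ignoring the ordinary line-ending and tab characters in `allowed`."""
    return [(i, ord(ch)) for i, ch in enumerate(text)
            if ord(ch) < 32 and ch not in allowed]

# Hypothetical line with an embedded control character (0x1a).
sample = "161;Name of c\x1austomer161;phone\r\n"

print(repr(sample))                # escapes are shown instead of acted on
print(find_control_chars(sample))  # positions of any suspicious characters
```

If the list comes back empty and repr() shows the complete last line, the
console is not the culprit.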

> print(y)
>
> Output looks like this:
> 159;Name of customer159;xxxx at gmail.com;phone;22;0;0;0;1;0;0;2013-02-09
> 20:20:26
> 160;Name of customer160;yyyy at gmail.com;phone;14;0;0;1;0;0;0;2013-02-09
> 20:38:16
> 161;Name of c
>


-- 
DaveA

