[Tutor] file.read..... Abort Problem
Kent Johnson
kent37 at tds.net
Thu Oct 20 21:33:36 CEST 2005
Tomas Markus wrote:
> Hello Pythoners,
>
> This is probably a very newbie question (after all I am one):
>
> I am trying to read a file with some 2500 lines and to do a scan for
> "not allowed" characters in there (such as accented ones and so on). The
> problem is that the file I am testing with has got one of those
> somewhere on line 2100 (the hex value of the character is 1a) and all of
> the file.read functions (read, readline, readlines) actually stop
> reading the file exactly at that line as if it was interpreted as an
> EOF. Can you, please, help?
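Are you on Windows? If so, try opening the file in binary mode. In text mode on Windows, the byte 0x1a (Ctrl-Z) is treated as an end-of-file marker, so reading stops there: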
>>> d='abc\x1adef\nghij\n'
>>> d
'abc\x1adef\nghij\n'
>>> open('(temp)/test.txt', 'w').write(d)
>>> d1=open('(temp)/test.txt').read()
>>> d1
'abc'
>>> d1=open('(temp)/test.txt', 'rb').read()
>>> d1
'abc\x1adef\r\nghij\r\n'
> Btw: When I am past this problem I will be
> asking yet another question: what is the most effective way to check a
> file for not allowed characters or how to check it for allowed only
> characters (which might be i.e. ASCII only).
You can read the whole file with read() and use a regex to search for disallowed characters.
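Here is a rough sketch of that idea, not a definitive implementation. It assumes a current Python 3 and that "allowed" means printable ASCII plus tab, CR and LF; adjust the character class to suit your own rules. The names DISALLOWED, find_disallowed and scan_file are made up for this example.

```python
import re

# Assumption: "allowed" means printable ASCII (0x20-0x7e) plus tab, LF and CR.
# Any other byte counts as disallowed.
DISALLOWED = re.compile(rb'[^\x09\x0a\x0d\x20-\x7e]')

def find_disallowed(data):
    """Return a list of (line_number, column, byte_value) tuples,
    one for each disallowed byte in the bytes object `data`."""
    hits = []
    for lineno, line in enumerate(data.split(b'\n'), 1):
        for m in DISALLOWED.finditer(line):
            # m.group() is a one-byte bytes object; [0] gives its integer value.
            hits.append((lineno, m.start() + 1, m.group()[0]))
    return hits

def scan_file(path):
    """Read `path` in binary mode and report its disallowed bytes."""
    # Binary mode returns the bytes exactly as stored, with no newline
    # translation or decoding.
    with open(path, 'rb') as f:
        return find_disallowed(f.read())
```

For example, find_disallowed(b'abc\x1adef\r\nghij\r\n') reports one hit on line 1, column 4, with value 0x1a. Working on bytes rather than decoded text means accented characters show up as whatever bytes the file's encoding uses for them.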
Kent
>
> Many, many thanks.
>
> Tom
>
>