Reading by positions plain text files
Tim Harig
usernet at ilthio.net
Sun Dec 12 14:09:56 EST 2010
On 2010-12-12, javivd <javiervandam at gmail.com> wrote:
> On Dec 1, 7:15 am, Tim Harig <user... at ilthio.net> wrote:
>> On 2010-12-01, javivd <javiervan... at gmail.com> wrote:
>> > On Nov 30, 11:43 pm, Tim Harig <user... at ilthio.net> wrote:
>> >> encodings and how you mark line endings. Frankly, the use of the
>> >> world columns in the header suggests that the data *is* separated by
>> >> line endings rather then absolute position and the position refers to
>> >> the line number. In which case, you can use splitlines() to break up
>> >> the data and then address the proper line by index. Nevertheless,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Note that I specifically questioned the use of absolute file position vs.
position within a column. These are two different things. You use
different methods to extract each.
>> > I work in a survey research firm. the data im talking about has a lot
>> > of 0-1 variables, meaning yes or no of a lot of questions. so only one
>> > position of a character is needed (not byte), explaining the 123-123
>> > kind of positions of a lot of variables.
>>
>> Then file.seek() is what you are looking for; but, you need to be aware of
>> line endings and encodings as indicated. Make sure that you open the file
>> using whatever encoding was used when it was generated or you could have
>> problems with multibyte characters affecting the offsets.
>
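To illustrate the point about multibyte characters, here is a minimal sketch. The sample string is made up for the example and UTF-8 is only an assumed encoding; the idea is that a character index and a byte offset stop agreeing as soon as one character takes more than one byte.

```python
# Hypothetical record whose first character is not ASCII.
text = "é123456789"

# Assumption: the file was written as UTF-8, where 'é' occupies two bytes.
data = text.encode("utf-8")

# Index 3 counted in characters versus offset 3 counted in bytes.
char_at_3 = text[3]
byte_at_3 = data[3:4]

print(len(text), len(data))
print(char_at_3, byte_at_3)
```

The encoded form is one byte longer than the text, so every byte offset after the first character is shifted by one.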
> f = open(r'c:c:\somefile.txt', 'w')
I suspect you don't need to use the c: twice.
> f.write('0123456789\n0123456789\n0123456789')
Note that the file you are writing contains three lines. Is the data that
you are looking for located at an absolute position in the file or at a
position within an individual line? If the latter, note that line endings
may be composed of more than a single character.
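As a rough sketch of why the line ending length matters, assume fixed-length lines of ten characters, as in your sample data. The helper absolute_offset() is a name I made up for this example; it converts a (line, column) pair into an absolute offset for a given line ending length.

```python
# The same three records with Unix (\n) and Windows (\r\n) line endings.
unix_data = b"0123456789\n0123456789\n0123456789"
dos_data = b"0123456789\r\n0123456789\r\n0123456789"


def absolute_offset(line_index, column, line_length, newline_length):
    """Byte offset of `column` on line `line_index`, assuming every line
    has exactly `line_length` characters plus its line ending."""
    return line_index * (line_length + newline_length) + column


# Column 3 of the second line (line index 1) in each variant.
unix_offset = absolute_offset(1, 3, 10, 1)
dos_offset = absolute_offset(1, 3, 10, 2)

unix_char = unix_data[unix_offset:unix_offset + 1]
dos_char = dos_data[dos_offset:dos_offset + 1]

# Using the Unix offset on the \r\n data lands one character early.
wrong_char = dos_data[unix_offset:unix_offset + 1]

print(unix_offset, unix_char)
print(dos_offset, dos_char)
print(wrong_char)
```

The further down the file you go, the larger the error becomes if you assume the wrong line ending length.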
> f.write('0123456789\n0123456789\n0123456789')
              ^ position 3 using seek()
> for line in f:
Perhaps you meant:
for character in f.read():
or
for line in f.read().splitlines():
> f.seek(3,0)
This will always take you back to the exact fourth position in the file
(indicated above).
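A small sketch of that behaviour, using an in-memory io.BytesIO buffer in place of your file (an assumption made only so the example is self-contained): with whence set to 0 the offset is counted from the start of the data, no matter where the previous read left off.

```python
import io

# In-memory stand-in for the file contents written in the quoted code.
buf = io.BytesIO(b"0123456789\n0123456789\n0123456789")

buf.read(15)        # move somewhere into the second line first
buf.seek(3, 0)      # whence=0: offset is measured from the start
first = buf.read(1)

buf.read(5)         # move again
buf.seek(3, 0)      # same call, same place
second = buf.read(1)

print(first, second)
```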
> I used .seek() in this manner, but is not working.
It is working the way it is supposed to.
If you want the absolute position 3 in a file then:
f = open('somefile.txt', 'r')
f.seek(3)
variable = f.read(1)
If you want the absolute position in a column:
f = open('somefile.txt', 'r').read().splitlines()
for column in f:
    variable = column[3]
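For comparison, here is a self-contained sketch that runs both approaches against the same sample data. The temporary file location is made up for the example, and I open the file in binary mode for the seek() case on the assumption that you want a plain byte offset; adjust the encoding to whatever your data actually uses.

```python
import os
import tempfile

# Hypothetical sample file holding the data from the quoted code.
path = os.path.join(tempfile.mkdtemp(), "somefile.txt")
with open(path, "wb") as f:
    f.write(b"0123456789\n0123456789\n0123456789")

# Absolute position 3 in the file: seek to the offset and read one byte.
with open(path, "rb") as f:
    f.seek(3)
    absolute = f.read(1)

# Position 3 within each line: split into lines, then index each line.
with open(path, "r", encoding="ascii") as f:
    per_line = [line[3] for line in f.read().splitlines()]

print(absolute)
print(per_line)
```

The first approach gives a single value for the whole file; the second gives one value per line, which is probably closer to what a survey file with one respondent per line needs.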