[Tutor] Using contents of a document to change file names, (was Re: how to extract data only after a certain ...)
Christian Witts
cwitts at compuscan.co.za
Mon Oct 11 14:23:37 CEST 2010
On 11/10/2010 13:46, Josep M. Fontana wrote:
> I tried your suggestion of using .split() to get around the problem
> but I still cannot move forward. I don't know if my implementation of
> your suggestion is the correct one but here's the problem I'm having.
> When I do the following:
>
> -----------------
>
> fileNameCentury = open(r'/Volumes/DATA/Documents/workspace/GCA/CORPUS_TEXT_LATIN_1/FileNamesYears.txt'.split('\r'))
> dct = {}
> for pair in fileNameCentury:
>     key,value = pair.split(',')
>     dct[key] = value
> print dct
>
> --------------
>
> I get the following long error message:
>
> fileNameCentury = open(r'/Volumes/DATA/Documents/workspace/GCA/CORPUS_TEXT_LATIN_1/FileNamesYears.txt'.split('\n'))
>
>
> TypeError: coercing to Unicode: need string or buffer, list found
>
> ------------
What you should be doing is:
fileNameCentury = open('/Volumes/DATA/Documents/workspace/GCA/CORPUS_TEXT_LATIN_1/FileNamesYears.txt', 'r')
dct = {}
for line in fileNameCentury:  # File objects have built-in iteration
    key, value = line.strip().split(',')
    dct[key] = value
What you were doing originally was splitting the filename you passed to
the open function, hence the error message stating `need string or
buffer, list found`: .split() returns a list, and open() expects a
string. If you wish to read in the entire file and then split it on
newline characters you would do fileObject.read().splitlines(), but it
is more efficient to create your file object and just iterate over it
(that way only one line at a time is stored in memory and you're not
reading in the entire file first).
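To make the difference concrete, here is a minimal sketch of both
approaches side by side. The sample data and the temporary file are
made up for illustration and only stand in for your FileNamesYears.txt;
I am assuming each line has exactly one comma.

```python
import os
import tempfile

# Hypothetical sample data standing in for FileNamesYears.txt.
sample = "fileA.txt,1200\nfileB.txt,1350\n"

# Write the sample to a temporary file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

# Calling .split() on the path gives a list, which open() rejects.
assert isinstance(path.split("\n"), list)

# Option 1: read the whole file into memory, then split it into lines.
with open(path, "r") as f:
    lines = f.read().splitlines()
dct_all_at_once = dict(line.split(",") for line in lines)

# Option 2: iterate over the file object, one line at a time.
dct_by_line = {}
with open(path, "r") as f:
    for line in f:
        key, value = line.strip().split(",")
        dct_by_line[key] = value

# Clean up the temporary file.
os.remove(path)
```

Both dictionaries end up with the same contents; the second option just
avoids holding the whole file in memory at once.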
It's not a Mac problem, just a problem with how you were going about it.
Hope that helps.
--
Kind Regards,
Christian Witts