Reading plain text file database tables

Li Dongfeng mavip5 at inet.polyu.edu.hk
Thu Sep 9 05:45:28 EDT 1999


Thanks for the pointer. I'm also trying to
solve this with regular expressions. But for
extracting a structure like "ab \" cd", your
[^"\\] puzzles me, because a character class
in [] matches only one character, not the
two-character sequence \". Do you have a
worked-out example?
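
For what it's worth, here is a minimal sketch of how the pattern seems to
behave. The class [^"\\] does match only one character, but it is one of
two alternatives inside the repeated group; the other alternative, \\.,
consumes the backslash together with whatever character follows it. The
names "quoted", "unit" and "sample" are only illustrative.

```python
import re

# Stephan's pattern, unchanged: an opening quote, any number of "units",
# then a closing quote. A unit is either one ordinary character or a
# backslash plus the character after it.
quoted = re.compile(r'"([^"\\]|(\\.))*"')

sample = r'"ab \" cd"'
m = quoted.match(sample)
matched = m.group(0)  # the whole quoted string, escaped quote included

# The same two alternatives on their own, to show how the inside of the
# string is consumed unit by unit.
unit = re.compile(r'[^"\\]|\\.')
units = unit.findall(matched[1:-1])

# On a full line, the match stops at the first unescaped quote.
first_field = quoted.match(r'"Peter\" Thomson" 25 36').group(0)

print(matched)
print(units)
print(first_field)
```

The escaped quote shows up in "units" as a single two-character item, so
the closing quote is only matched when no backslash precedes it.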

Stephan Houben wrote:
> 
> On Thu, 09 Sep 1999 14:15:38 +0800, Li Dongfeng
> <mavip5 at inet.polyu.edu.hk> wrote:
> 
> >
> >Do we have a module to read plain text file
> >database tables?
> >
> >All the data management software, e.g. Excel,
> >dBase, SAS, etc., supports reading and writing a
> >table from and to a plain text file; fields can be
> >separated by their column position, by spaces, by
> >tabs, by commas, etc.
> >
> >How can we read this kind of file into a matrix-like
> >structure (a list of lists)? I have written one that
> >reads files with fixed-width fields. For delimited files,
> >simply using string.split works most of the time, but
> >fails on lines like
> >
> >  "Peter Thomson"  25  36
> >
> >or even
> >
> >  "Peter\" Thomson" 25 36
> >
> >I think this is a common task, so maybe someone has
> >already given a very good solution.
> 
> Use regular expressions.
> A regular expression that matches strings within "..",
> while escaping " with \, is:
> 
> import re
> r = re.compile(r'"([^"\\]|(\\.))*"')
> 
> Extend this to get what you want.
> If you're very worried about speed, create
> a lexer using (f)lex, and then call the generated
> C code from python.
> 
> Greetings,
> 
> Stephan
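
One way the pattern might be extended to split a whole line is sketched
below. This is only an assumption about what "extend this" could look
like, not Stephan's code: TOKEN and split_fields are hypothetical names,
and fields are assumed to be separated by whitespace.

```python
import re

# Hypothetical extension: a token is either a quoted string (the pattern
# from the thread, with a group around the contents) or a run of
# characters that are neither whitespace nor a double quote.
TOKEN = re.compile(r'"((?:[^"\\]|\\.)*)"|([^\s"]+)')


def split_fields(line):
    """Split one line into fields, honouring "..." and backslash escapes."""
    fields = []
    for m in TOKEN.finditer(line):
        if m.group(1) is not None:
            # Quoted field: drop the backslash in front of each escaped character.
            fields.append(re.sub(r'\\(.)', r'\1', m.group(1)))
        else:
            # Bare field: keep it as it is.
            fields.append(m.group(2))
    return fields


print(split_fields('"Peter Thomson"  25  36'))
print(split_fields(r'"Peter\" Thomson" 25 36'))
```

Calling split_fields on every line of a file would give the list of
lists asked for above. Note that this sketch silently skips a quote that
is never closed; a real reader would probably want to report that.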



