how can i change the text delimiter

sonald sonaldgr8 at gmail.com
Mon Sep 4 08:59:40 CEST 2006


Hi,
Thanks a lot for the snips you have included in your post...
those were quite helpful...

And about the 3rd party data....
we receive the data in csv format ... but we are not supposed to modify
the files provided by the user directly...

Instead we make another file with the same name & different
extensions... and use the new files created by the python for further
processing....

>         quote_char
>             Defines the character used to quote fields that
>             contain the field separator or newlines.  If set to None
>             special characters will be escaped using the escape_char.
> ##### That's what you are looking for #####

Yes you got me right....
I was indeed looking for the quote_char...
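
For readers on a current Python, where the standard-library csv module has replaced the object-craft parser, the counterpart of quote_char is the `quotechar` argument of `csv.reader`. The following is only a minimal sketch: the sample line is made up for illustration, and it assumes a file that already uses pipe as its text qualifier.

```python
import csv
import io

# Hypothetical sample: pipe is the text qualifier, comma the field separator.
sample = '1,|Smith, John|,|He said "Hello"|\n'

# With quotechar='|', double quotes inside a field are ordinary characters.
reader = csv.reader(io.StringIO(sample), delimiter=',', quotechar='|')
rows = list(reader)
print(rows)
```

The comma inside the second field is kept because the field is wrapped in pipes, and the double quotes in the third field pass through untouched.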

> Aha!! Looks like some misguided person has got a copy of the
> object-craft code, renamed it fastcsv, and compiled it to run with
> Python 2.4 ... so you want some docs. The simplest thing to do is to
> ask it, e.g. like this, but with Python 2.4 (not 2.2) and call it
> fastcsv (not csv):
>

I guess... that's true... ;)

Thank you very much.




Thanks a lot for the response.
John Machin wrote:

> sonald wrote:
> > Hi,
> > I am using
> > Python version python-2.4.1 and along with this there are other
> > installables
> > like:
> > 1. fastcsv-1.0.1.win32-py2.4.exe
>
> Well, you certainly didn't get that from the object-craft website --
> just go and look at their download page
> http://www.object-craft.com.au/projects/csv/download.html -- stops dead
> in 2002 and the latest windows kit is a .pyd for Python 2.2. As you
> have already been told and as the object-craft csv home-page says,
> their csv was the precursor of the Python csv module.
>
>
> > 2. psyco-1.4.win32-py2.4.exe
> > 3. scite-1.63-setup.exe
> >
> > We are freshers here, joined new... and are now into handling this
> > module which validates the data files, which are provided in some
> > predefined format from the third party.
> > The data files are provided in the comma separated format.
> >
> > The fastcsv package is imported in the code...
> >      import fastcsv
> > and
> >      csv = fastcsv.parser(strict = 1,field_sep = ',')
>
> Aha!! Looks like some misguided person has got a copy of the
> object-craft code, renamed it fastcsv, and compiled it to run with
> Python 2.4 ... so you want some docs. The simplest thing to do is to
> ask it, e.g. like this, but with Python 2.4 (not 2.2) and call it
> fastcsv (not csv):
>
> ... command-prompt...>\python22\python
> Python 2.2.3 (#42, May 30 2003, 18:12:08) [MSC 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import csv
> >>> help(csv.parser)
> Help on built-in function parser:
>
> parser(...)
>     parser(ms_double_quote = 1, field_sep = ',',
>            auto_clear = 1, strict = 0,
>            quote_char = '"', escape_char = None) -> Parser
>
>     Constructs a CSV parser object.
>
>         ms_double_quote
>             When True, quotes in a fields must be doubled up.
>
>         field_sep
>             Defines the character that will be used to separate
>             fields in the CSV record.
>
>         auto_clear
>             When True, calling parse() will automatically call
>             the clear() method if the previous call to parse() raised
>             an exception during parsing.
>
>         strict
>             When True, the parser will raise an exception on
>             malformed fields rather than attempting to guess the right
>             behavior.
>
>         quote_char
>             Defines the character used to quote fields that
>             contain the field separator or newlines.  If set to None
>             special characters will be escaped using the escape_char.
> ##### That's what you are looking for #####
>         escape_char
>             Defines the character used to escape special
>             characters.  Only used if quote_char is None.
>
> >>> help(csv)
> Help on module csv:
>
> NAME
>     csv - This module provides class for performing CSV parsing and
>     writing.
>
> FILE
>     SOMEWHERE\csv.pyd
>
> DESCRIPTION
>     The CSV parser object (returned by the parser() function) supports
>     the following methods:
>         clear()
>             Discards all fields parsed so far.  If auto_clear is set to
>             zero. You should call this after a parser exception.
>
>         parse(string) -> list of strings
>             Extracts fields from the (partial) CSV record in string.
>             Trailing end of line characters are ignored, so you do not
>             need to strip the string before passing it to the parser.
>             If you pass more than a single line of text, a csv.Error
>             exception will be raised.
>
>         join(sequence) -> string
>             Construct a CSV record from a sequence of fields.
>             Non-string elements will be converted to string.
>
>     Typical usage:
>
>         import csv
>         p = csv.parser()
>         file = open('afile.csv')
>         while 1:
>             line = file.readline()
>             if not line:
>                 break
>             fields = p.parse(line)
>             if not fields:
>                 # multi-line record
>                 continue
>             # process the fields
> [snip remainder of docs]
> >
> > can u plz tell me where to find the parser function definition, (used
> > above)
> > so that if possible i can provide a parameter for
> > text qualifier or text separator or text delimiter..
> > just as {field_sep = ','} (as given above)
> >
> > I want to handle string containing double quotes (")
> > but the problem is that the default text qualifier is double quote
> >
> > Now if I can change the default text qualifier... to say pipe (|)
> > the double quote inside the string may be ignored...
> > plz refer to the example given in my previous query...
> >
>
> It *appears* from this message that you have data already in a file,
> and that data is *NOT* (as some one has already told you) in standard
> CSV format.
>
> Let me explain: The magic spell for quoting a field in standard CSV
> format is:
> quote = '"'
> sep = ','
> twoquotes = quote + quote
> if quote in fld:
>     fld = quote + fld.replace(quote, twoquotes) + quote
> elif sep in fld:
>     fld = quote + fld + quote
>
> Note carefully that if the quote character appears in the raw input
> data, it must be *doubled* in the output. If it is not, the standard
> reader can't decode the input unambiguously. It is possible that
> using ms_double_quote=0 with the [fast]csv module will do the job for
> you. If not, it is possible, if the original data contains *pairs* of
> quotes e.g. -- He said "Hello" to his friend -- to decode that using a
> different state machine. If that's what you've got, e-mail me; I may be
> able to help. However the example you gave had just one quote :-(
>
> *But* are you reading or writing this data? On one hand you say that
> you are getting the data from a 3rd party and can't change it [which
> implies that you are reading] but on the other hand you want to know
> how to tell the [fast]csv module use a "|" as the quote character; that
> would be appropriate under two circumstances (1) you are reading a file
> that already has "pipe" as the quote character (2) you want to create a
> file that quotes using "pipe" ... IOW, it's not guaranteed to work for
> reading an existing file that uses " as the quote character. If there
> is a pipe character in the original data, it will fail. If (more
> likely) there are commas in the original data, then you will get one
> extra field per comma.
>
> A quick simple question: after the above csv = fastcsv.parser(.......),
> does it do csv.parse(.....) or csv.join(...)???? Can you see any
> fread() or fwrite() calls in the code??? If so, which???
>
> HTH -- but you will have to describe what's going on a lot more
> precisely.
> 
> Cheers,
> John
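
The quoting rule and the warning about reading with a pipe quote character, both from the message quoted above, can be tried out with the standard-library csv module. This is a hedged sketch rather than the fastcsv behaviour: `quote_field` is a hypothetical helper that wraps the quoted snippet in a function, and the sample fields are invented.

```python
import csv
import io


def quote_field(fld, quote='"', sep=','):
    """Quote one field using the rule from the message above."""
    if quote in fld:
        # Embedded quote characters are doubled, then the field is wrapped.
        return quote + fld.replace(quote, quote + quote) + quote
    if sep in fld:
        # A separator alone only needs the surrounding quotes.
        return quote + fld + quote
    return fld


raw = ['1', 'Smith, John', 'He said "Hello" to his friend']
line = ','.join(quote_field(f) for f in raw)

# The standard reader recovers the original fields from the doubled quotes.
decoded = next(csv.reader(io.StringIO(line)))

# Reading the same line with pipe as the quote character treats '"' as
# ordinary text, so the comma inside "Smith, John" splits that field.
misread = next(csv.reader(io.StringIO(line), quotechar='|'))

print(line)
print(decoded)
print(len(raw), len(misread))
```

The rule as quoted does not cover fields that contain newlines, so the helper does not either. The second read shows the "one extra field per comma" effect: three fields were written, and four come back.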



