[Tutor] get columns from txt file

Marc Tompkins marc.tompkins at gmail.com
Fri Jul 13 09:28:07 CEST 2012


On Thu, Jul 12, 2012 at 11:36 PM, susana moreno colomer <
susana_87 at hotmail.com> wrote:

>  Hi!
> I am trying this, but I still get 6 numbers per cell. The only
> difference is that I get a comma between numbers instead of a space.
> I am also opening the document with Excel.
> Many thanks,
> Susana
>

CSV stands for Comma Separated Values.  Those commas are the separators
between cells.  Your current problem is that a CSV file is just a regular
text file (that happens to have a lot of commas in it), and Excel is trying
to read it as a normal text file.
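To make "commas are the separators between cells" concrete, here is a minimal sketch in Python 3. It assumes the input is whitespace-separated numbers, six per line; the sample data in raw is made up for illustration and is not your actual file. The point is that each number has to be its own list element when it is handed to the writer:

```python
import csv
import io

# Hypothetical input: whitespace-separated numbers, six per line.
raw = "1 2 3 4 5 6\n7 8 9 10 11 12\n"

out = io.StringIO()            # stands in for a file opened with newline=""
writer = csv.writer(out)
for line in raw.splitlines():
    fields = line.split()      # one list element per number
    writer.writerow(fields)    # each element becomes its own cell

csv_text = out.getvalue()
print(csv_text)
```

If the whole line were passed as a single string inside a one-element list, the writer would treat it as one field, which is one possible way to end up with several numbers in a single cell.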

CSV is about the simplest way ever invented to store tabular data in a
file, but there are some complicating factors.  The most obvious is: what
happens if the data you want to store in your file actually contains commas
(e.g. address fields with "city, state zip", or numbers with thousands
separators, etc.)?  One way around the problem is to put quotes around
fields, and then separate the fields with commas (but then, what if your
data contains quotes?); another is to separate the fields with tab
characters instead of commas (technically this isn't really a CSV file
anymore, but the acronym TSV never caught on.)
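As a rough illustration of the quoting problem, this sketch (Python 3, with made-up sample rows) lets the csv module decide when to quote. With its default settings it quotes only the fields that need it and doubles any quote characters inside a quoted field:

```python
import csv
import io

# Made-up rows: one field contains commas, another contains quotes.
rows = [
    ["1", "2", "3", "4,000,000", "5"],
    ["a", "b", "c", 'She said "hi"', "e"],
]

buf = io.StringIO()
writer = csv.writer(buf)       # default dialect, minimal quoting
writer.writerows(rows)
text = buf.getvalue()
print(text)

# Reading the text back recovers the original fields unchanged.
parsed = list(csv.reader(io.StringIO(text)))
print(parsed == rows)
```

So the "what if your data contains quotes?" case is handled by writing each embedded quote twice, and the reader undoes that on the way back in.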

Excel's native flavor* of CSV is the oldest, simplest, and stupidest of all
- just commas between fields, and newlines between records:
1, 2, 3, 4, 5
a, b, c, d, e

Quotes-and-commas style:
"1", "2", "3", "4,000,000", 5
"a", "b", "c", "Dammit, Janet", "e"

Tab-separated (well, you'll just have to imagine; I don't feel like
reconfiguring my text editor):
1       2       3      4      5
a      b      cde     fghi    j

and a bunch of others I can't think of right now.
* Note: Excel will happily import a quotes-and-commas CSV file and display
it normally - but if you export it to CSV, it will revert to the dumb
bare-commas format.

From the Python csv module docs:

> To make it easier to specify the format of input and output records,
> specific formatting parameters are grouped together into dialects. A
> dialect is a subclass of the Dialect class
> <http://docs.python.org/library/csv.html#csv.Dialect> having a set of
> specific methods and a single validate() method.
>

So you can specify which dialect you want to read or write, and/or you can
specify which delimiter(s) you want to use.
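A small sketch of both options, assuming Python 3 and made-up rows. The helper name dump is hypothetical; the "excel-tab" dialect and the delimiter keyword are part of the standard csv module:

```python
import csv
import io

rows = [["1", "2", "3", "4", "5"], ["a", "b", "cde", "fghi", "j"]]

def dump(rows, **fmt):
    """Write rows to a string using the given csv formatting options."""
    buf = io.StringIO()
    csv.writer(buf, **fmt).writerows(rows)
    return buf.getvalue()

comma_text = dump(rows)                       # default "excel" dialect
tab_text = dump(rows, dialect="excel-tab")    # tab-separated
semi_text = dump(rows, delimiter=";")         # override just the delimiter

# The reader needs to be told the same format to split fields correctly.
tab_rows = list(csv.reader(io.StringIO(tab_text), dialect="excel-tab"))

print(csv.list_dialects())   # names of the dialects registered by default
```

The same dialect or delimiter argument has to be used on both the writing and the reading side, otherwise each line comes back as a single field.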

Hope that helps...