[Tutor] Help with strings and lists.
Kent Johnson
kent37 at tds.net
Fri Jul 14 13:08:56 CEST 2006
Alan Collins wrote:
> Hi,
>
> I do a fair bit of data manipulation and decided to try one of my
> favourite utilities in Python. I'd really appreciate some optimization
> of the script. I'm sure that I've missed many tricks in even this short
> script.
>
> Let's say you have a file with this data:
>
> Monday 7373 3663657 2272 547757699 reached 100%
> Tuesday 7726347 552 766463 2253 under-achieved 0%
> Wednesday 9899898 8488947 6472 77449 reached 100%
> Thursday 636648 553 22344 5699 under-achieved 0%
> Friday 997 3647757 78736632 357599 over-achieved 200%
>
> You now want columns 1, 5, and 7 printed and aligned (much like a
> spreadsheet). For example:
>
> Monday 547757699 100%
> Wednesday 77449 100%
> ...
>
> This script does the job, but I reckon there are better ways. In the
> interests of brevity, I have dropped the command-line argument handling
> and hard-coded the columns for the test and I hard-coded the input file
> name.
>
You might like to see how it is done in this recipe:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/267662
> -------------------------------------------------------
> """
> PrintColumns
>
> Print specified columns, alignment based on data type.
>
> The script works by parsing the input file twice. The first pass gets
> the maximum length of all values on the columns. This value is used to
> pad the column on the second pass.
>
> """
> import sys
>
> columns = [0] # hard-code the columns to be printed.
> colwidth = [0] # list into which the maximum field lengths will be stored.
>
> """
> This part is clunky. Can't think of another way to do it without making
> the script somewhat longer and slower. What it does is that if the user
> specifies column 0, all columns will be printed. This bit builds up the
> list of columns, from 1 to 100.
> """
>
> if columns[0] == 0:
>     columns = [1]
>     while len(columns) < 100:
>         columns.append(len(columns)+1)
>
columns = range(1, 101)
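As a quick check of the range() shortcut, here is a minimal sketch in Python 3 (an assumption; the original script is Python 2, where range() already returns a list). The name columns_from_range is only for illustration. It compares the list built by the loop with the one built by range():

```python
# Build the column list the way the original loop does: 1 through 100.
columns = [1]
while len(columns) < 100:
    columns.append(len(columns) + 1)

# The same list from range(). The stop value is exclusive, so 101 is
# needed to include column 100. In Python 3 range() is lazy, hence list().
columns_from_range = list(range(1, 101))

print(columns == columns_from_range)
```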
> """
> First pass. Read all lines and determine the maximum width of each
> selected column.
> """
> infile = file("mylist", "r")
> indata = infile.readlines()
> for myline in indata:
>     mycolumns = myline.split()
>     colindex = 0
>     for column in columns:
>         if column <= len(mycolumns):
>             if len(colwidth)-1 < colindex:
>                 colwidth.append(len(mycolumns[column-1]))
>             else:
>                 if colwidth[colindex] < len(mycolumns[column-1]):
>                     colwidth[colindex] = len(mycolumns[column-1])
>         colindex += 1
> infile.close()
>
> """
> Second pass. Read all lines and print the selected columns. Text values
> are left justified, while numeric values are right justified.
> """
> infile = file("mylist", "r")
> indata = infile.readlines()
>
No need to read the file again, you still have indata.
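A minimal sketch of the read-once idea, in Python 3 (an assumption; the original is Python 2). The sample lines from the top of the message stand in for infile.readlines(), and the names rows and ncols are made up for illustration. Each line is split once, and the split rows are reused to work out the widths with a list comprehension:

```python
# Sample lines from the message, standing in for infile.readlines().
indata = [
    "Monday 7373 3663657 2272 547757699 reached 100%\n",
    "Tuesday 7726347 552 766463 2253 under-achieved 0%\n",
    "Wednesday 9899898 8488947 6472 77449 reached 100%\n",
    "Thursday 636648 553 22344 5699 under-achieved 0%\n",
    "Friday 997 3647757 78736632 357599 over-achieved 200%\n",
]

# Split each line once and keep the result, so later passes can reuse it.
rows = [line.split() for line in indata]

# Widest value in each column position, skipping rows that are too short.
ncols = max(len(row) for row in rows)
colwidth = [
    max(len(row[i]) for row in rows if i < len(row))
    for i in range(ncols)
]

print(colwidth)
```

The second pass can then loop over rows again instead of reopening the file.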
> for myline in indata:
>     mycolumns = myline.split()
>     colindex = 0
>     for column in columns:
>         if column <= len(mycolumns):
>             if mycolumns[column-1].isdigit():
>                 x = mycolumns[column-1].rjust(colwidth[colindex]) + ' '
>             else:
>                 x = mycolumns[column-1].ljust(colwidth[colindex]+1)
>             print x,
>         colindex += 1
>     print ""
> infile.close()
>
Hmm...you really should make columns be the correct length. If you use a
list comp to make colwidth then you can just make columns the same
length as colwidth. Then if you make a helper function for the formatting

def format(value, width):
    if value.isdigit():
        return value.rjust(width) + ' '
    else:
        return value.ljust(width+1)

Now the formatting becomes

values = [ format(mycolumns[i], colwidth[i]) for i in columns ]

which you print with

print ''.join(values)
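
Putting the pieces together, here is a rough sketch in Python 3 (an assumption; the original is Python 2) of the helper-function approach applied to the sample data, selecting columns 1, 5 and 7 as in the question. The names format_field, align_columns, sample and aligned are hypothetical; the helper is renamed from format only to avoid shadowing the Python 3 builtin. Fields are joined with a single space and trailing padding is stripped, which is one possible choice rather than what the original script does.

```python
def format_field(value, width):
    """Right-justify digit-only values, left-justify everything else."""
    if value.isdigit():
        return value.rjust(width)
    return value.ljust(width)


def align_columns(lines, columns):
    """Return the selected 1-based columns of each line, padded to align."""
    rows = [line.split() for line in lines]
    # Convert the 1-based column numbers to 0-based list indices.
    indices = [c - 1 for c in columns]
    # Widest value per selected column; 0 if no row has that column.
    widths = [
        max((len(row[i]) for row in rows if i < len(row)), default=0)
        for i in indices
    ]
    aligned = []
    for row in rows:
        fields = [
            format_field(row[i], width)
            for i, width in zip(indices, widths)
            if i < len(row)
        ]
        aligned.append(" ".join(fields).rstrip())
    return aligned


sample = [
    "Monday 7373 3663657 2272 547757699 reached 100%",
    "Tuesday 7726347 552 766463 2253 under-achieved 0%",
    "Wednesday 9899898 8488947 6472 77449 reached 100%",
    "Thursday 636648 553 22344 5699 under-achieved 0%",
    "Friday 997 3647757 78736632 357599 over-achieved 200%",
]

aligned = align_columns(sample, [1, 5, 7])
for line in aligned:
    print(line)
```

Note that "100%" is not all digits, so isdigit() treats it as text and it is left-justified.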
Kent
> -------------------------------------------------------
>
> Any help greatly appreciated.
> Regards,
> Alan.