J. Cliff Dyer
jcd at sdf.lonestar.org
Wed Jun 24 22:40:29 CEST 2009
On Wed, 2009-06-24 at 20:53 +0100, Angus Rodgers wrote:
> ... my first Python program! So please be gentle (no fifty ton
> weights on the head!), but tell me if it's properly "Pythonic",
> or if it's a dead parrot (and if the latter, how to revive it).
Yay. Welcome to Python.
> I'm working from Beazley's /Python: Essential Reference/ (2nd
> ed. 2001), so my first newbie question is how best to find out
> what's changed from version 2.1 to version 2.5. (I've recently
> installed 2.5.4 on my creaky old Win98SE system.) I expect to
> be buying the 4th edition when it comes out, which will be soon,
> but before then, is there a quick online way to find this out?
Check here: http://docs.python.org/whatsnew/index.html
It's not designed to be newbie friendly, but it's in there.
> Having only got up to page 84 - where we can actually start to
> read stuff from the hard disk - I'm emboldened to try to learn
> to do something useful, such as removing all those annoying hard
> tab characters from my many old text files (before I cottoned on
> to using soft tabs in my text editor).
> This sort of thing seems to work, in the interpreter (for an
> ASCII text file, named 'h071.txt', in the current directory):
> stop = 3 # Tab stops every 3 characters
> from types import StringType # Is this awkwardness necessary?
Not anymore. You can just use str for this.
> detab = lambda s : StringType.expandtabs(s, stop) # Or use def
First, use def. lambda is best kept for the rare cases where you'd
rather not bind your function to a name.
Second, expandtabs is a method on string objects. s is a string object,
so you can just use s.expandtabs(stop)
Third, I'd recommend passing your tabstops into detab with a default
argument, rather than defining it irrevocably in a global variable
(which is brittle and ugly)
def detab(s, stop=3):
    return s.expandtabs(stop)
Then you can do
three_space_version = detab(s)
eight_space_version = detab(s, 8)
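Putting those three points together, a minimal sketch might look like this. The sample string is made up for illustration, and the printed values assume the usual tab-stop behaviour of str.expandtabs:

```python
def detab(s, stop=3):
    """Return s with each tab expanded to the next multiple of `stop` columns."""
    return s.expandtabs(stop)

line = "a\tb\n"                        # hypothetical sample line
three_space_version = detab(line)      # uses the default stop of 3
eight_space_version = detab(line, 8)   # overrides the default

print(repr(three_space_version))  # 'a  b\n'
print(repr(eight_space_version))  # 'a       b\n'
```

The default argument keeps the common case short while still letting a caller pick a different tab stop per call.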
> f = open('h071.txt') # Do some stuff to f, perhaps, and then:
f is not opened for writing, so if you do stuff to the contents of f,
you'll have to put the new version in a different variable, and
f.seek(0) doesn't help. If you don't do stuff to it, then you're still
at the beginning of the file. Either way, you shouldn't need f.seek(0).
> print ''.join(map(detab, f.xreadlines()))
Files have been iterable since Python 2.2, which means you can do the
following:
for line in f:
    print detab(line),
Much prettier than running through join/map shenanigans. This is also
the place to modify the output before passing it to detab:
for line in f:
    # do stuff to line
    print detab(line),
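Also note that you can iterate over a file several times, as long as
you rewind it with f.seek(0) between passes:
f = open('foo.txt')
for line in f:
    print line[0] # prints the first character of every line
f.seek(0) # go back to the start; the first loop read to the end
for line in f:
    print line[1] # prints the second character of every line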
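One thing to watch when looping over the same file object twice: the first pass leaves it at end-of-file, so a second pass sees nothing until you call f.seek(0). Here is a small sketch of that behaviour, using a throwaway temporary file whose contents are made up for the example:

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
tmp = os.fdopen(fd, "w")
tmp.write("abc\ndef\n")
tmp.close()

f = open(path)
first_chars = [line[0] for line in f]    # first pass reads to end of file
exhausted = [line[0] for line in f]      # nothing left, so this is empty
f.seek(0)                                # rewind to the start
second_chars = [line[1] for line in f]   # second pass works again
f.close()
os.remove(path)

print(first_chars)   # ['a', 'd']
print(exhausted)     # []
print(second_chars)  # ['b', 'e']
```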
> Obviously, to turn this into a generally useful program, I need
> to learn to write to a new file, and how to parcel up the Python
> code, and write a script to apply the "detab" function to all the
> files found by searching a Windows directory, and replace the old
> files with the new ones; but, for the guts of the program, is this
> a reasonable way to write the code to strip tabs from a text file?
> For writing the output file, this seems to work in the interpreter:
> g = open('temp.txt', 'w')
> g.writelines(map(detab, f.xreadlines()))
That doesn't avoid building everything in memory, as map returns a
list. You can use itertools.imap instead, or you can use a for loop, as
above.
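Another lazy option, alongside itertools.imap and the plain for loop, is a generator function. The sketch below is only an illustration: detab_lines and the file names are hypothetical, and the files live in a temporary directory so nothing real gets touched.

```python
import os
import shutil
import tempfile

def detab_lines(lines, stop=3):
    """Yield each line with tabs expanded, one line at a time."""
    for line in lines:
        yield line.expandtabs(stop)

# Throwaway files in a temporary directory (hypothetical names).
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "input.txt")
dst_path = os.path.join(workdir, "output.txt")

src = open(src_path, "w")
src.write("a\tb\n\tc\n")
src.close()

f = open(src_path)
g = open(dst_path, "w")
# writelines pulls lines from the generator as it goes,
# so no complete list of detabbed lines is built first.
g.writelines(detab_lines(f))
g.close()
f.close()

check = open(dst_path)
result = check.read()
check.close()
shutil.rmtree(workdir)

print(repr(result))  # 'a  b\n   c\n'
```

Because detab_lines yields one line per step, memory use stays roughly constant however large the input file is.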
> In practice, does this avoid creating the whole string in memory
> at one time, as is done by using ''.join()? (I'll have to read up
> on "opaque sequence objects", which have only been mentioned once
> or twice in passing - another instance perhaps being an xrange()?)
> Not that that matters much in practice (in this simple case), but
> it seems elegant to avoid creating the whole output file at once.
The terms to look for, rather than "opaque sequence objects", are
"iterators" and "generators".
> OK, I'm just getting my feet wet, and I'll try not to ask too many
> silly questions!
> First impressions are: (1) Python seems both elegant and practical;
> and (2) Beazley seems a pleasantly unfussy introduction for someone
> with at least a little programming experience in other languages.
Glad you're enjoying Beazley, but I would look for something more
up to date. Python's come a long way since 2.1. I'd hate for you to
miss out on all the iterators, booleans, codecs, subprocess, yield,
unified int/longs, decorators, decimals, sets, context managers and
new-style classes that have come since then.
> Angus Rodgers