Python dos2unix one liner
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Sat Feb 27 06:44:30 EST 2010
On Sat, 27 Feb 2010 10:36:41 +0100, @ Rocteur CC wrote:
> cat file.dos | python -c "import sys,re;
> [sys.stdout.write(re.compile('\r\n').sub('\n', line)) for line in
> sys.stdin]" >file.unix
Holy cow!!!!!!! Calling a regex just for a straight literal-to-literal
string replacement! You've been infected by too much Perl coding!
*wink*
Regexes are expensive, even in Perl, but more so in Python. When you
don't need the 30 pound sledgehammer of regexes, use lightweight string
methods.
import sys; sys.stdout.write(sys.stdin.read().replace('\r\n', '\n'))
ought to do it. It's not particularly short, but Python doesn't value
extreme brevity -- code golf isn't terribly exciting in Python.
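To put a rough number on the regex-versus-string-method point, here is a minimal sketch using the standard timeit module. The sample data and the helper names with_regex and with_replace are made up for illustration. Timings depend on the machine and the Python version, so none are quoted here.

```python
import re
import timeit

# Hypothetical sample data: 30,000 DOS-style lines.
SAMPLE = "one\r\ntwo\r\nthree\r\n" * 10000

CRLF = re.compile("\r\n")

def with_regex(text):
    # Regex substitution, as in the quoted one-liner, but with the
    # pattern compiled once up front.
    return CRLF.sub("\n", text)

def with_replace(text):
    # Plain literal-to-literal string replacement.
    return text.replace("\r\n", "\n")

if __name__ == "__main__":
    # Both approaches must produce identical output.
    assert with_regex(SAMPLE) == with_replace(SAMPLE)
    for func in (with_regex, with_replace):
        seconds = timeit.timeit(lambda: func(SAMPLE), number=100)
        print("%-12s %.4f s for 100 runs" % (func.__name__, seconds))
```

Both functions return the same text; only the cost differs.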
[steve at sylar ~]$ cat -vet file.dos
one^M$
two^M$
three^M$
[steve at sylar ~]$ cat file.dos | python -c "import sys; sys.stdout.write(sys.stdin.read().replace('\r\n', '\n'))" > file.unix
[steve at sylar ~]$ cat -vet file.unix
one$
two$
three$
[steve at sylar ~]$
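Works fine. Unfortunately it still doesn't work in-place, although I
think that's a side-effect of the shell, not Python: if you redirect the
output to the same file you are reading, the shell truncates it before
Python gets to read anything. To do it in place, I would pass the file
name: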
# Tested and working in the interactive interpreter.
import sys
filename = sys.argv[1]
text = open(filename, 'rb').read().replace('\r\n', '\n')
open(filename, 'wb').write(text)
Turning that into a one-liner isn't terribly useful or interesting, but
here we go:
python -c "import sys;open(sys.argv[1], 'wb').write(open(sys.argv[1],
'rb').read().replace('\r\n', '\n'))" file
Unfortunately, this does NOT work: I suspect it is because the file gets
opened for writing (and hence emptied) before it gets opened for reading.
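The suspicion about evaluation order can be checked with a small sketch. It assumes CPython and uses bytes literals so that it also runs on Python 3; the names clobbering_convert and demo are hypothetical, and the file is a throwaway temporary one.

```python
import os
import tempfile

def clobbering_convert(path):
    # Same shape as the failing one-liner: the outer open(..., 'wb') is
    # evaluated first and truncates the file, so the inner read() sees
    # an already-empty file.
    open(path, "wb").write(open(path, "rb").read().replace(b"\r\n", b"\n"))

def demo():
    # Create a scratch file with DOS line endings, run the broken
    # conversion on it, and return whatever is left in the file.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"one\r\ntwo\r\n")
        clobbering_convert(path)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)

if __name__ == "__main__":
    print(repr(demo()))
```

The callee expression (the open for writing) is evaluated before the argument, so the file ends up empty instead of converted.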
Here's another attempt:
python -c "import sys;t=open(sys.argv[1], 'rb').read().replace('\r\n',
'\n');open(sys.argv[1], 'wb').write(t)" file
[steve at sylar ~]$ cp file.dos file.txt
[steve at sylar ~]$ python -c "import sys;t=open(sys.argv[1], 'rb').read().replace('\r\n', '\n');open(sys.argv[1], 'wb').write(t)" file.txt
[steve at sylar ~]$ cat -vet file.txt
one$
two$
three$
[steve at sylar ~]$
Success!
Of course, none of these one-liners are good practice. The best thing is
to use a dedicated utility, or to write a proper script that has proper
error checking.
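As a rough idea of what such a "proper script" might look like, here is a minimal sketch, assuming Python 3. The function names dos2unix and main, the usage message and the exit codes are illustrative choices, not an existing tool. It writes to a temporary file and renames it over the original, so a failure part-way through does not destroy the input.

```python
import os
import shutil
import sys
import tempfile

def dos2unix(path):
    """Convert CRLF to LF in place. Return True if the file was changed."""
    with open(path, "rb") as f:
        data = f.read()
    converted = data.replace(b"\r\n", b"\n")
    if converted == data:
        return False  # nothing to change; leave the file untouched
    # Write to a temporary file in the same directory, then rename it
    # over the original.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(converted)
        shutil.copymode(path, tmp_path)  # keep the original permission bits
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise
    return True

def main(argv):
    if len(argv) < 2:
        sys.stderr.write("usage: %s FILE [FILE ...]\n" % argv[0])
        return 2
    status = 0
    for path in argv[1:]:
        try:
            dos2unix(path)
        except OSError as err:
            # Report the problem and carry on with the remaining files.
            sys.stderr.write("%s: %s\n" % (path, err))
            status = 1
    return status
```

To run it as a script, end the file with sys.exit(main(sys.argv)). Like the one-liners, it reads the whole file into memory, so it is only suitable for files that fit comfortably in RAM.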
> Is there a better way in Python or is this kind of thing best done in
> Perl ?
If by "this kind of thing" you mean text processing, then no, Python is
perfectly capable of doing text processing. Regexes aren't as highly
optimized as in Perl, but they're more than good enough for when you
actually need a regex.
If you mean "code golf" and one-liners, then, yes, this is best done in
Perl :)
--
Steven