Python dos2unix one liner
steve at REMOVE-THIS-cybersource.com.au
Sat Feb 27 18:42:31 CET 2010
On Sat, 27 Feb 2010 16:01:53 +0100, @ Rocteur CC wrote:
> On 27 Feb 2010, at 12:44, Steven D'Aprano wrote:
>> On Sat, 27 Feb 2010 10:36:41 +0100, @ Rocteur CC wrote:
>>> cat file.dos | python -c "import sys,re;
>>> [sys.stdout.write(re.compile('\r\n').sub('\n', line)) for line in
>>> sys.stdin]" >file.unix
>> Holy cow!!!!!!! Calling a regex just for a straight literal-to-literal
>> string replacement! You've been infected by too much Perl coding!
> Thanks for the replies I'm looking at them now, however, for those who
> misunderstood, the above cat file.dos pipe python does not come from
> Perl but comes from:
Whether it comes from Larry Wall himself, or a Python wiki, using regexes
for a simple string replacement is like using an 80 lb sledgehammer to
crack a peanut.
>> Apply regular expression to lines from stdin [another command] | python
>> -c "import sys,re;
>> [sys.stdout.write(re.compile('PATTERN').sub('SUBSTITUTION', line)) for
>> line in sys.stdin]"
And if PATTERN is an actual regex, rather than just a simple substring,
that would be worthwhile. But if PATTERN is a literal string, then string
methods are much faster and use much less memory.
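To make that concrete, here is a minimal sketch comparing the two approaches on a literal pattern. The function names and the sample line are mine, purely for illustration; the regex version mirrors the quoted one-liner and the other uses the plain string method.

```python
import re

def convert_with_regex(line):
    # Mirrors the quoted one-liner: build a regex, then substitute.
    return re.compile('\r\n').sub('\n', line)

def convert_with_replace(line):
    # Literal-to-literal replacement needs no regex machinery at all.
    return line.replace('\r\n', '\n')

# Hypothetical sample line with a DOS-style ending.
sample = 'first line\r\n'

print(repr(convert_with_regex(sample)))
print(repr(convert_with_replace(sample)))
```

Both calls return the same string. The regex only earns its keep once PATTERN contains something a literal substring can't express.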
> Nothing to do with Perl, Perl only takes a handful of characters to do
I'm sure it does. If I were interested in code-golf, I'd be impressed.
> and certainly does not require the creation of an intermediate file,
The solution I gave you doesn't use an intermediate file either.
*slaps head and is enlightened*
Oh, I'm an idiot!
Since you're reading text files, there's no need to call
replace('\r\n','\n'). Since there shouldn't be any bare \r characters in
a DOS-style text file, just use replace('\r', '').
Of course, that's an unsafe assumption in the real world. But for a quick
and dirty one-liner (and all one-liners are quick and dirty), it should
be good enough.
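A rough sketch of the difference between the two replacements, assuming Python 3 and working on bytes (in Python 3, text-mode stdin usually translates line endings before your code ever sees them, so the bytes form is the one that shows the effect). The function names and sample data are made up for illustration.

```python
def strip_cr(data: bytes) -> bytes:
    # Quick and dirty: drop every carriage return, bare ones included.
    return data.replace(b'\r', b'')

def crlf_to_lf(data: bytes) -> bytes:
    # More careful: convert only CR LF pairs and leave a bare CR alone.
    return data.replace(b'\r\n', b'\n')

# Hypothetical inputs: a clean DOS file, and one containing a stray bare CR.
clean_dos = b'one\r\ntwo\r\n'
with_bare_cr = b'one\rstill one\r\ntwo\r\n'

print(strip_cr(clean_dos) == crlf_to_lf(clean_dos))        # same result on clean input
print(strip_cr(with_bare_cr) == crlf_to_lf(with_bare_cr))  # differs when a bare CR is present

# To use it as a filter, something along these lines should work:
#   import sys
#   sys.stdout.buffer.write(strip_cr(sys.stdin.buffer.read()))
```

On a well-formed DOS file the two give identical output. They only diverge when a bare \r turns up, which is exactly the unsafe assumption mentioned above.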