ignoring or replacing white lines in a diff
Adriaan Renting
renting at astron.nl
Fri Jan 15 04:44:32 EST 2016
Thanks to everyone who provided help.
Peter Otten provided me with a working solution:
I had to split the "-I '^[[:space:]]*$'" into two separate arguments.
cmd = ["diff", "-w", "-I", r"^[[:space:]]*$",
       "./xml/%s.xml" % name, "test.xml"]
p = subprocess.Popen(cmd, stdin=open('/dev/null'),
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
logs = p.communicate()
diffs = logs[0].splitlines() #stdout
This also works:
cmd = ["diff -w -I '^[[:space:]]*$' ./xml/%s.xml test.xml" % name]
p = subprocess.Popen(cmd, stdin=open('/dev/null'),
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                     shell=True)
logs = p.communicate()
diffs = logs[0].splitlines() #stdout
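
For anyone finding this thread later, here is a small sketch of why the split matters. It does not run diff at all; it only uses shlex.split to show how a POSIX shell would break the command line into arguments. The file name example.xml is a placeholder I made up for the illustration.

```python
import shlex

# The command line exactly as it would be typed in a shell.
# "example.xml" is a placeholder name, not a file from my test suite.
command_line = "diff -w -I '^[[:space:]]*$' ./xml/example.xml test.xml"

# shlex.split mimics POSIX shell word splitting and quote removal,
# so this is the argument list diff would receive from the shell.
argv = shlex.split(command_line)

for index, argument in enumerate(argv):
    print("%d: %r" % (index, argument))

# My original list passed the option, a space, and the quoted pattern as
# ONE element. Without shell=True nothing strips the quotes or splits on
# the space, so diff gets a single argument that still contains them.
broken_element = "-I '^[[:space:]]*$'"
print("shell gives -I and the pattern separately:",
      argv[2:4], "versus", [broken_element])
```

The shell removes the single quotes and hands "-I" and the pattern to diff as two arguments, which is what the list form of Popen has to reproduce by hand.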
As to other comments:
- I've found that stdin=open('/dev/null') is essential in
subprocess.Popen to make it work from automated (headless) scripts.
- Using "print line," (with the trailing comma) did remove the extra
newlines, but didn't get rid of the blank lines.
- making it a raw string with r"-I '^[[:space:]]*$'" made no difference
(also tried r"-I ^[[:space:]]*$")
- I didn't investigate difflib further but will keep it in mind for the
future.
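
In case it is useful to someone, this is roughly what a difflib version could look like. It is an untested-in-production sketch, not what I ended up using. The helper names significant_lines and xml_diff are made up for this example. It approximates "diff -w" by stripping all whitespace from each line, and it drops blank lines before comparing, which is not quite the same as what -I does (that only suppresses hunks in which every changed line matches the pattern).

```python
import difflib


def significant_lines(text):
    """Return the lines of text with all whitespace removed, skipping
    lines that are empty or contain only whitespace."""
    result = []
    for line in text.splitlines():
        squeezed = "".join(line.split())  # remove every whitespace run
        if squeezed:
            result.append(squeezed)
    return result


def xml_diff(old_text, new_text):
    """Return unified-diff lines between the two texts, ignoring
    whitespace differences and blank lines."""
    return list(difflib.unified_diff(
        significant_lines(old_text),
        significant_lines(new_text),
        fromfile="old", tofile="new",
        lineterm="", n=0))


# Two small made-up documents: they differ only in blank lines and
# indentation, so no differences should be reported.
old = "<a>\n\n  <b>1</b>\n</a>\n"
new = "<a>\n<b>1</b>\n\n</a>\n"
print(len(xml_diff(old, new)))

# A real content change does show up.
for line in xml_diff("<v>2.6.0</v>\n", "<v>2.15.0</v>\n"):
    print(line)
```

One drawback is that the reported lines are the whitespace-stripped ones, so the output is less readable than that of diff itself.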
Thank you for your help,
Adriaan.
Adriaan Renting | Email: renting at astron.nl
Software Engineer Radio Observatory
ASTRON | Phone: +31 521 595 100 (797 direct)
P.O. Box 2 | GSM: +31 6 24 25 17 28
NL-7990 AA Dwingeloo | FAX: +31 521 595 101
The Netherlands | Web: http://www.astron.nl/~renting/
>>> On 14-1-2016 at 22:05, Peter Otten <__peter__ at web.de> wrote:
> Adriaan Renting wrote:
>
>>
>> Maybe someone here has a clue what is going wrong here? Any help is
>> appreciated.
>>
>> I'm writing a regression test for a module that generates XML.
>>
>> I'm using diff to compare the results with a pregenerated one from an
>> earlier version.
>>
>> I'm running into two problems:
>>
>> The diff doesn't seem to behave properly with the -B option. (diff (GNU
>> diffutils) 2.8.1 on OSX 10.9)
>>
>> Replacing -B with -I '^[[:space:]]*$' fixes it on the command line,
>> which should be exactly the same according to:
>>
>> http://www.gnu.org/software/diffutils/manual/html_node/Blank-Lines.html#Blank-Lines
>>
>> (for Python problem continue below)
>>
>> MacRenting 21:00-159> diff -w -B test.xml xml/Ticket_6923.xml
>> 3,5c3,5
>> < <version>2.15.0</version>
>> < <template version="2.15.0" author="Alwin de Jong,Adriaan Renting" changedBy="Adriaan Renting">
>> < <description>XML Template generator version 2.15.0</description>
>> ---
>> > <version>2.6.0</version>
>> > <template version="2.6.0" author="Alwin de Jong" changedBy="Alwin de Jong">
>> > <description>XML Template generator version 2.6.0</description>
>> 113d112
>> <
>> 163d161
>> <
>> 213d210
>> <
>> 258d254
>> <
>> 369d364
>> <
>> 419d413
>> <
>> 469d462
>> <
>> 514d506
>> <
>> 625d616
>> <
>> 675d665
>> <
>> 725d714
>> <
>> 770d758
>> <
>> 881d868
>> <
>> 931d917
>> <
>> 981d966
>> <
>> 1026d1010
>> <
>> 1137d1120
>> <
>> 1187d1169
>> <
>> 1237d1218
>> <
>> 1282d1262
>> <
>>
>> /Users/renting/src/CEP4-DevelopClusterModel-Story-Task8432-SAS/XML_generator/test
>> MacRenting 21:00-160> diff -w -I '^[[:space:]]*$' test.xml xml/Ticket_6923.xml
>> 3,5c3,5
>> < <version>2.15.0</version>
>> < <template version="2.15.0" author="Alwin de Jong,Adriaan Renting" changedBy="Adriaan Renting">
>> < <description>XML Template generator version 2.15.0</description>
>> ---
>> > <version>2.6.0</version>
>> > <template version="2.6.0" author="Alwin de Jong" changedBy="Alwin de Jong">
>> > <description>XML Template generator version 2.6.0</description>
>>
>>
>> Now I try to use this in Python:
>>
>> cmd = ["diff", "-w", "-I '^[[:space:]]*$'", "./xml/%s.xml" % name,
>> "test.xml"]
>
> Instead of
>
> ..., "-I '^[[:space:]]*$'", ...
>
> try two separate arguments
>
> ..., "-I", "^[[:space:]]*$", ...
>
>> ## -w ignores differences in whitespace
>> ## -I '^[[:space:]]*$' because -B doesn't work for blank lines (on OSX?)
>> p = subprocess.Popen(cmd, stdin=open('/dev/null'),
>> stdout=subprocess.PIPE, stderr=subprocess.PIPE)
>
> I don't think you need to specify stdin.
>
>> logs = p.communicate()
>> diffs = logs[0].splitlines() #stdout
>> print "diff reply was %i lines long" % len(diffs)
>>
>> This doesn't work. I've tried escaping the various bits, like the * and
>> $, even though with single quotes that should not be needed.
>>
>> I tried first removing the blank lines from the file:
>>
>> import fileinput
>> for line in fileinput.FileInput("test.xml",inplace=1):
>>     if line.rstrip():
>>         print line
>>
>> This makes it worse, as it adds an empty line for each line in the
>> file.
>
> Add a trailing comma to suppress the newline:
>
> print line,
>
>> I've tried various other options. The only thing I can think of, is
>> ditching Python and trying to rewrite the whole script in Bash.
>> (It's quite complicated, as it loops over various things and does some
>> pretty output in between and I'm not very fluent in Bash)
>>
>> Any suggestions?
>
> Whatever floats your boat ;)