[Tutor] Some Strange Behavior

Alan Gauld alan.gauld at btinternet.com
Sun Mar 25 17:16:23 CEST 2007


"Utkarsh Tandon" <utkarsh.tandon at gmail.com> wrote

> So I was just trying to make a program that removed
> comments from a C program.
> The program worked but a whitespace came after every
> character. Can anyone please tell me the reason for the
> whitespace.

See comments below.

But a general observation first.
It is possible to write C code in Python, but it's not very effective.
Just as in C you wouldn't use low-level assembler features to write
to the screen but would use printf or puts, so in Python there are
lots of high-level functions and libraries that you can use to do
the job more effectively.

But as a learning exercise, heh, it's OK...

> Here is the program

> def main(filename):
>    text = open(str(filename), 'r')

You don't need to convert it to a string; hopefully it already is
one, and if it isn't, converting whatever it is to a string is always
going to be risky. Better to wrap the open() in a try/except and
catch the error if the name isn't valid.
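
A minimal sketch of that try/except, assuming a current Python 3
(the function name read_source and the file name are made up for
illustration, they are not from the original program):

```python
def read_source(filename):
    """Return the file's text, or None if the file cannot be opened."""
    try:
        with open(filename, 'r') as source:
            return source.read()
    except (IOError, OSError) as err:
        # A missing file or a permissions problem ends up here.
        print('Could not open %r: %s' % (filename, err))
        return None

# Hypothetical file name, chosen so that the error path is exercised.
text = read_source('no_such_file_for_this_demo.c')
```

The caller can then test for None instead of crashing with a
traceback.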

>    realtext = text.read()
>    realtext = list(realtext)
>    length = len(realtext)

This converts the contents of the file into a list of characters.
But there's no need to do that. Python's for and len will work
just as well on strings. Strings are just another type of
sequence to Python. You need the list only because of the
way you are deleting the characters (strings can't be changed
in place), but that's an extremely non-Pythonic way of doing things.
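
To illustrate the point about strings being sequences, here is a
small sketch (Python 3; the sample text is invented):

```python
code = 'a /* b */ c'

length = len(code)      # len() works directly on the string
opener = code[2:4]      # slicing gives a two-character lookahead
tail = code[10:12]      # a slice running past the end is just shorter

slashes = 0
for ch in code:         # for loops over the characters one at a time
    if ch == '/':
        slashes += 1

print(length, opener, tail, slashes)
```

Because a slice past the end of a string never raises IndexError,
a loop built on slices needs no try/except around the lookahead.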

>    string = False

??? This doesn't do anything; the variable is never used again ???

>    for i in range (0, length):
>        try:
>            if realtext[i] == '/' and realtext[i + 1] == '*':
>                del realtext[i]
>                del realtext[i]
>
>                while realtext[i] != '*' and realtext[i+1] != '/':
>                    del realtext[i]

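Note that you never increment i here, so you are relying on del
shuffling the remaining list elements down to fill the gap. Python
lists do behave that way, but the list shrinks while range(0, length)
keeps counting up to the original length, which is why you need the
IndexError handler to get out of the loop. It works, but it is fragile.
Also, the while test should use 'or' rather than 'and': as written
the loop stops at the first '*' inside the comment, or at any
character that is followed by '/'.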

>
>                del realtext[i]
>                del realtext[i]

You can delete slices in Python, so you could do this in one statement.
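
For instance, a whole comment can go in one del once its start and
end are known. This is only a sketch on an invented sample line,
in Python 3:

```python
text = 'x = 1; /* note */ y = 2;'
chars = list(text)

start = text.index('/*')                 # where the comment opens
end = text.index('*/', start + 2) + 2    # just past the closing '/'

del chars[start:end]                     # one statement removes the lot
remaining = ''.join(chars)
print(repr(remaining))
```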

>        except IndexError:
>            break

>    filename = str(raw_input('Enter the name of the new file '))
>
>    file = open(str(filename), 'w')
>    realtext = str(realtext)

You are trying to turn the list of chars back to a string but in
fact you get a string representation of the list - including [] and
commas etc. (Try it at the >>> prompt!)

You need to look at the string join() method.
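
A short sketch of the difference, using a made-up list of
characters (Python 3):

```python
chars = ['m', 'a', 'i', 'n']

as_repr = str(chars)       # the printable form of the list itself
as_text = ''.join(chars)   # the characters glued together, no separator

print(as_repr)
print(as_text)
```

The string you call join() on is the separator, so an empty string
puts the characters back together exactly as they were.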

>    realtext = realtext.replace('[', "").replace(']', 
> "").replace(',',
> "").replace("'", "").replace('\\n', "").replace('\\t', "")

Eek! Now you try to force the list into the right shape...
But you forgot to replace the spaces after the commas, I think,
and those are the whitespace you are seeing after every character.
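
You can reproduce the effect with a few characters. The sketch
below (Python 3, sample text invented) applies the same kind of
replace chain to the string form of a list:

```python
chars = list('main()')
text = str(chars)   # "['m', 'a', 'i', 'n', '(', ')']"

# Strip the brackets, commas and quotes, as the original program does.
cleaned = (text.replace('[', '').replace(']', '')
               .replace(',', '').replace("'", ''))

print(cleaned)      # the space that followed each comma is still there
```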

>    file.write(realtext)
>
> main(str(raw_input('Enter the filename ')))

You definitely don't need str here since raw_input always
returns a string.

> Here is its output, notice the unnecessary whitespace after every
> character:-
>   m a i n ( )   {   p r i n t f ( " " ) ;     }

That's a very long-winded and difficult way to do a fairly simple
Python task. Don't try to write Python like a C programmer.

FWIW, here is my attempt to do what you want using broadly
the same technique but in a somewhat more Pythonic style:

#####################
fname = raw_input("What's the C file name? ")
code = open(fname).read()

result = ''
inComment = False
i = 0
while i < len(code):
    if inComment:
        if code[i:i+2] == '*/':   # end of comment
            inComment = False
            i += 1                # skip the '*'; the '/' is skipped below
    elif code[i:i+2] == '/*':     # start of comment
        inComment = True
        i += 1                    # skip the '/'; the '*' is skipped below
    else:
        result = result + code[i]
    i += 1

print result # write to an output file if you prefer...
################

There seems to be a slight bug with multiline comments...
But I didn't test it extensively.
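
One way to chase that would be to wrap the same loop in a function
and feed it test strings. This is only a sketch: strip_comments is
a name made up here, the samples are invented, and it is written
for a current Python 3 rather than the Python 2 used above.

```python
def strip_comments(code):
    """Remove /* ... */ comments using the same scan as the loop above."""
    result = []
    in_comment = False
    i = 0
    while i < len(code):
        pair = code[i:i+2]
        if in_comment:
            if pair == '*/':
                in_comment = False
                i += 1          # skip the second character of '*/'
        elif pair == '/*':
            in_comment = True
            i += 1              # skip the second character of '/*'
        else:
            result.append(code[i])
        i += 1
    return ''.join(result)

multiline = "int x; /* first line\n   second line */\nint y;\n"
in_string = 'printf("/* not a comment */");'

print(repr(strip_comments(multiline)))
print(repr(strip_comments(in_string)))
```

On these samples the multi-line comment is removed cleanly. The
case that does go wrong is the second one: the scan knows nothing
about C string literals, so a comment marker inside quotes is
stripped too.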

HTH. However, in practice I'd probably use regular expressions
to solve this particular problem...
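
A regular expression version might look something like the sketch
below. The pattern and the name strip_comments_re are my own
illustration (Python 3), and it shares the usual limitations: it
ignores // comments and will also strip comment markers that
appear inside string literals.

```python
import re

# Non-greedy match from '/*' to the nearest '*/'. DOTALL lets '.'
# match newlines, so multi-line comments are covered as well.
COMMENT = re.compile(r'/\*.*?\*/', re.DOTALL)

def strip_comments_re(code):
    """Return code with every /* ... */ comment removed."""
    return COMMENT.sub('', code)

sample = "main() { /* say\n   hello */ printf(\"hi\"); /* done */ }"
print(strip_comments_re(sample))
```

The non-greedy '.*?' matters: a plain '.*' would run from the first
'/*' to the last '*/' and swallow the code in between.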

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld 



