Replace and inserting strings within .txt files with the use of regex

Νίκος nikos.the.gr33k at gmail.com
Tue Aug 10 01:11:44 EDT 2010


On 10 Aug, 01:43, MRAB <pyt... at mrabarnett.plus.com> wrote:
> Νίκος wrote:
> > D:\>convert.py
> >   File "D:\convert.py", line 34
> > SyntaxError: Non-ASCII character '\xce' in file D:\convert.py on line
> > 34, but no encoding declared; see
> > http://www.python.org/peps/pep-0263.html for details
>
> > D:\>
>
> > What is it referring to? Which character cannot be identified?
>
> > Line 34 is:
>
> > src_data = src_data.replace( '</body>', '<br><br><center><h4><font
> > color=green> Αριθμός Επισκεπτών: %(counter)d </body>' )
>
> Didn't you say that you're using Python 2.7 now? The default file
> encoding will be ASCII, but your file isn't ASCII, it contains Greek
> letters. Add the encoding line:
>
>      # -*- coding: utf-8 -*-
>
> and check that the file is saved as UTF-8.
>
> > Also,
>
> > for currdir, dirs, files in os.walk('test'):
>
> >    for f in files:
>
> >            if f.lower().endswith("php"):
>
> > in the above lines
>
> > Should I use os.walk('test') or os.walk('d:\test')?
>
> The path 'test' is relative to the current working directory. Is that
> D:\ for your script? If not, then it won't find the (correct) folder.
>
> It might be better to use an absolute path instead. You could use
> either:
>
>      r'd:\test'
>
> (note that I've made it a raw string because it contains a backslash
> which I want treated as a literal backslash) or:
>
>      'd:/test'
>
> (Windows should accept a slash as well as a backslash.)
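
To make the difference between the three spellings concrete, here is a minimal sketch (written for Python 3). The helper name find_php_files is hypothetical and only serves the illustration; the directory d:\test is taken from the question and is never actually opened by the string comparison.

```python
import os

plain = 'd:\test'     # '\t' is read as a TAB character, not backslash + t
raw = r'd:\test'      # raw string: the backslash is kept literally
forward = 'd:/test'   # a forward slash needs no escaping at all

print(repr(plain))
print(repr(raw))
print('\t' in plain, '\t' in raw)


def find_php_files(top):
    """Yield the paths of files under top whose names end in '.php'."""
    # os.walk yields a (dirpath, dirnames, filenames) tuple per directory.
    for currdir, dirs, files in os.walk(top):
        for f in files:
            if f.lower().endswith('.php'):
                yield os.path.join(currdir, f)
```

Calling list(find_php_files(r'd:\test')) would then collect the php files below that folder regardless of the current working directory.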

I will try it as soon as I make another change that I missed:

The ID number of each php page was contained in the old php code
within this string:

PageID = some_number

So instead of creating a new ID number for each page, I have to pull
out this number and store it at the beginning of the file as a comment
line, because it is directly tied to the mysql database: it is used to
track each webpage and to look up its counter.

# Grab the PageID contained within the php code and store it in the
# id variable
id = re.search( 'PageID = ', src_data )

How do I tell Python to grab the number after the 'PageID = ' string
and store it in the variable id for later use in the program?
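
One possible approach, as a minimal sketch: put a capture group around the digits and read it back with group(1). The sample text in src_data is invented for the illustration, and the name page_id is used instead of id only to avoid shadowing the built-in id() function.

```python
import re

# Hypothetical sample standing in for the contents of one php file.
src_data = "<?php\n$PageID = 42;\n?>\n<html><body>Hello</body></html>"

# \s* tolerates varying spacing around '='; (\d+) captures the digits.
match = re.search(r'PageID\s*=\s*(\d+)', src_data)

if match:
    page_id = int(match.group(1))   # captured digits converted to an int
else:
    page_id = None                  # this file contains no PageID

print(page_id)
```

re.search returns a match object (or None), not the number itself, so the value has to be pulled out of the match before it can be stored or written back to the file.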

I also made another change. Would something like this work:

===============================
# open same php file for storing modified data
print ( 'writing to %s' % src_f )
f = open(src_f, 'w')
f.write(src_data)
f.close()

# rename edited .php file to .html extension
dst_f = src_f.replace('.php', '.html')
os.rename( src_f, dst_f )
===============================

Instead of creating a new .html file and inserting the desired data
from the old php file, which would leave two files (the old php and the
new html), I decided to open the same php file for writing that data
and then rename it to html.
Would the above code work?
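
A minimal sketch of the overwrite-then-rename idea, under some assumptions: it is written for Python 3, it assumes the data should be saved as UTF-8, and the function name rewrite_and_rename and the temporary demo file are hypothetical.

```python
import os
import tempfile


def rewrite_and_rename(src_f, src_data):
    """Overwrite src_f with src_data, then rename it from .php to .html."""
    print('writing to %s' % src_f)
    # Assumption: the modified page should be saved as UTF-8.
    with open(src_f, 'w', encoding='utf-8') as f:
        f.write(src_data)

    # Swap only the final extension, leaving the rest of the path alone.
    root, ext = os.path.splitext(src_f)
    dst_f = root + '.html'
    os.rename(src_f, dst_f)
    return dst_f


# Demo on a throwaway file in a temporary directory.
tmp_dir = tempfile.mkdtemp()
src_f = os.path.join(tmp_dir, 'page.php')
with open(src_f, 'w', encoding='utf-8') as f:
    f.write('<?php ?>')

dst_f = rewrite_and_rename(src_f, '<html></html>')
print(dst_f)
```

os.path.splitext is used here because str.replace('.php', '.html') would also change a '.php' that happens to appear earlier in the path. On Windows, os.rename raises an error if the destination file already exists, so an existing .html file with the same name would have to be dealt with first.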


