Python script that does batch find and replace in txt files

Syed Khalid khalidness at gmail.com
Sun Nov 9 22:51:37 CET 2014


Albert,

Thanks a million for the script.

It worked fine after I closed the bracket.


import glob, codecs, re, os

regex = re.compile(r"Age: |Sex: |House No:  ") # etc etc

for txt in glob.glob("D:/Python/source/*.txt"):
    with codecs.open(txt, encoding="utf-8") as f:
        oldlines = f.readlines()
    for i, line in enumerate(oldlines):
        if "Elector's Name:" in line:
            break
    newlines = [regex.sub("", line).strip().replace("-", "_") for line in oldlines[i:]]
    with codecs.open(txt + "_out.txt", "wb", encoding="utf-8") as w:
        w.write(os.linesep.join(newlines))


However, this program does not delete the empty lines or the lines that contain only blank characters.
Kindly do the needful.
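
One possible way to drop those lines, offered as a minimal untested-in-production sketch rather than a definitive fix: after stripping each line, keep only the lines that are still non-empty. The helper name `clean_lines` and the constant `LABELS` are hypothetical names introduced here for illustration, and falling back to the start of the file when the marker is missing is an assumption, not something the script above does.

```python
import re

# Labels to delete; same pattern as in the script above, extend as needed.
LABELS = re.compile(r"Age: |Sex: |House No: ")


def clean_lines(lines, marker="Elector's Name:"):
    """Clean the lines from the first one containing `marker` onwards."""
    # Assumption: if the marker never occurs, start from the first line.
    start = next((i for i, line in enumerate(lines) if marker in line), 0)
    cleaned = (
        LABELS.sub("", line).strip().replace("-", "_")
        for line in lines[start:]
    )
    # A line that held only spaces or tabs is an empty string after
    # strip(), so this filter removes blank and whitespace-only lines.
    return [line for line in cleaned if line]
```

Inside the loop, `newlines = clean_lines(oldlines)` would then take the place of the `for ... break` search and the list comprehension; the reading and writing code stays the same.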


On Mon, Nov 10, 2014 at 2:50 AM, Syed Khalid <khalidness at gmail.com> wrote:

>
> Code after adding the path of the .txt files:
>
> import glob, codecs, re, os
>
> regex = re.compile(r"Age: |Sex: |House No: ") # etc etc
>
> for txt in glob.glob("D:/Python/source/*.txt"):
>     with codecs.open(txt, encoding="utf-8") as f:
>         oldlines = f.readlines()
>     for i, line in enumerate(oldlines):
>         if "Elector's Name:" in line:
>             break
>     newlines = [regex.sub("", line).strip().replace("-", "_") for line in
> oldlines[i:]
>     with codecs.open(txt + "_out.txt", "wb", encoding="utf-8") as w:
>         w.write(os.linesep.join(newlines))
>
>
> I executed the code in EditRocket.
>
> Error message:
>
>
> File "EamClean.log", line 12
>     with codecs.open(txt + "_out.txt", "wb", encoding="utf-8") as w:
>        ^
> SyntaxError: invalid syntax
>
>
>
>
>
> On Mon, Nov 10, 2014 at 2:22 AM, Syed Khalid <khalidness at gmail.com> wrote:
>
>> Hi Albert,
>>
>> Thank you for the script.
>>
>> I am getting the error below:
>>
>>   File "EamClean.log", line 12
>>     with codecs.open(txt + "_out.txt", "wb", encoding="utf-8") as w:
>>        ^
>> SyntaxError: invalid syntax
>>
>> Kindly do the needful.
>>
>> On Mon, Nov 10, 2014 at 1:53 AM, Albert-Jan Roskam <fomcl at yahoo.com>
>> wrote:
>>
>>>
>>>
>>>
>>>
>>> ----- Original Message -----
>>> > From: Syed Khalid <khalidness at gmail.com>
>>> > To: python-list at python.org
>>> > Cc:
>>> > Sent: Sunday, November 9, 2014 8:58 PM
>>> > Subject: Python script that does batch find and replace in txt files
>>> >
>>> > Python script that does batch find and replace in txt files
>>> >
>>> > I need a Python script that opens all .txt files in a folder, finds and
>>> > replaces or deletes text, and saves the files.
>>> >
>>> > I have text files and I need to perform below steps for each file.
>>> >
>>> > Step 1: Put cursor at start of file and Search for "Contact's
>>> > Name:". Delete all the rows before it.
>>> > Step 2: Put cursor at end of file, Search for "Contact's Name:"
>>> > select option UP.
>>> > Step 3: Search for "Photo of the" Replace with blanks
>>> > Step 4: Search for "Contact is" Replace with blanks
>>> > Step 5: Search for "Contact's Name:" Replace with blanks
>>> > Step 6: Search for "Age:" Replace with blanks
>>> > Step 7: Search for "Sex:" Replace with blanks
>>> > Step 8: Search for "House No:" Replace with blanks
>>> > Step 9: Search for "available" Replace with blanks
>>> > Step 10: Remove Empty Lines Containing Blank Characters from file
>>> > Step 11: Trim Leading Space for each line
>>> > Step 12: Trim Trailing Space after each line
>>> > Step 13: Search for - (hyphen) Replace with _ (underscore)
>>>
>>> > Step 14: Save file.
>>>
>>> something like (untested)
>>>
>>>
>>> import glob, codecs, re, os
>>>
>>> regex = re.compile(r"Age: |Sex: |House No: ") # etc etc
>>>
>>> for txt in glob.glob("/some/path/*.txt"):
>>>     with codecs.open(txt, encoding="utf-8") as f:
>>>         oldlines = f.readlines()
>>>     for i, line in enumerate(oldlines):
>>>         if "Contact's Name: " in line:
>>>             break
>>>     newlines = [regex.sub("", line).strip().replace("-", "_") for line
>>> in oldlines[i:]
>>>     with codecs.open(txt + "_out.txt", "wb", encoding="utf-8") as w:
>>>         w.write(os.linesep.join(newlines))
>>>
>>>
>>>
>>>
>>> >
>>> > Currently I have recorded a macro in Notepad++.
>>> > I open each file, run macro and save file.
>>> > As there are many files I was looking for a program to automate the
>>> > process.
>>> >
>>> > I posted the same query in Notepad++ forum. I got a reply that it can
>>> > be done by using Python script.
>>> >
>>> > Kindly do the needful.
>>> >
>>> > Thank you.
>>> > khalidness
>>> >
>>> > --
>>> > https://mail.python.org/mailman/listinfo/python-list
>>> >
>>>
>>
>>
>
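
A note on the SyntaxError quoted in this thread: the list comprehension was missing its closing bracket, and older Python versions report such an error on the following line, which is why the traceback pointed at the `with` statement. The sketch below reproduces this with a cut-down snippet; the helper `syntax_error_line` and the two source strings are hypothetical, made up only for this illustration. The line number reported can differ between Python versions, so none is shown here.

```python
# A shortened version of the failing code: the "[" is never closed.
broken = (
    "newlines = [line.strip() for line in\n"
    "oldlines\n"
    "with open('out.txt', 'w') as w:\n"
    "    w.write('')\n"
)
# The same code with the bracket closed.
fixed = broken.replace("oldlines\n", "oldlines]\n", 1)


def syntax_error_line(source):
    """Return the line number of the SyntaxError in `source`, or None."""
    try:
        # compile() only parses the code; nothing is executed.
        compile(source, "<snippet>", "exec")
    except SyntaxError as exc:
        return exc.lineno
    return None


print(syntax_error_line(broken))  # a line number; depends on the version
print(syntax_error_line(fixed))   # None, the snippet parses cleanly
```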