inplace text filter - without writing file

MRAB python at mrabarnett.plus.com
Sun Oct 2 14:22:37 EDT 2016


On 2016-10-02 06:33, Sayth Renshaw wrote:
> On Sunday, 2 October 2016 16:19:14 UTC+11, Sayth Renshaw  wrote:
>> On Sunday, 2 October 2016 12:14:43 UTC+11, MRAB  wrote:
>> > On 2016-10-02 01:21, Sayth Renshaw wrote:
>> > > Hi
>> > >
>> > > I have a file object which was working fine; however, I now want to delete a line from the file object before yielding.
>> > >
>> > > def return_files(file_list):
>> > >     for filename in sorted(file_list):
>> >
>> > When joining paths together, it's better to use 'os.path.join'.
>> >
>> > >         with open(dir_path + filename) as fd:
>> > >              for fileItem in fd:
>> > >                  yield fileItem
>> > >
>> > > Ned gave an answer over here http://stackoverflow.com/a/6985814/461887
>> > >
>> > > for i, line in enumerate(input_file):
>> > >     if i == 0 or not line.startswith('#'):
>> > >         output.write(line)
>> > >
>> > > which I would change, because it is the first line and I want to get rid of '<!--'
>> > >
>> > > for i, line in enumerate(input_file):
>> > >     if line.startswith('<!--'):
>> > >         output.write(line)
>> > >
>> > > However, I do not want to write the file. I want to get the enumerated lines back into the fileItem I need to yield.
>> > >
>> > > How do I do this?
>> > >
>> > You use the logic of the answer from StackOverflow in your
>> > 'return_files' function:
>> >
>> > def return_files(file_list):
>> >      for filename in sorted(file_list):
>> >          with open(dir_path + filename) as fd:
>> >              for fileItem in fd:
>> >                  if keep_item(fileItem):
>> >                      yield fileItem
>> >
>> >
>> > If you need to know the line number when deciding whether to keep a
>> > line, use enumerate too:
>> >
>> > def return_files(file_list):
>> >      for filename in sorted(file_list):
>> >          with open(dir_path + filename) as fd:
>> >              for i, fileItem in enumerate(fd):
>> >                  if keep_item(fileItem, i):
>> >                      yield fileItem
>>
>> I just can't quite get it.
>>
>> def return_files(file_list):
>>     for filename in sorted(file_list):
>>         file = os.path.join(dir_path, filename)
>>         print(file)
>>         with open(file) as fd:
>>             print(fd)
>>             for fileItem in fd:
>>                 print(fileItem)
>>                 for line in fileItem:
>>                     print(line[0])
>>                     if line.startswith('<!--'):
>>                         print(line[1:])
>>                         yield line[1:]
>>                     else:
>>                         yield line[:]
>>
>> I may be overbaking it, but I'm just not getting it right.
>>
>> Sayth
>
> Aargh, half solved it. It now works, except that I am yielding a list, and lxml doesn't accept a list; it wants the file.
>
> def return_files(file_list):
>     """
>     Take a list of files and return file when called.
>
>     Calling function to supply attributes
>     """
>     for filename in sorted(file_list):
>         file = os.path.join(dir_path, filename)
>         with open(file, 'r') as fd:
>             data = fd.read().splitlines(True)
>             if data[0].startswith('<!--'):
>                 print(data[1:])
>                 yield data[1:]
>             else:
>                 yield data[0:]
>
A shorter way of reading the lines is:

     lines = fd.readlines()

You can yield the lines one at a time with:

     yield from lines

This is equivalent to:

     for line in lines:
         yield line

but is shorter, quicker, and doesn't need a loop variable!
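
Putting the pieces together, here is a minimal sketch of how the generator might look. The names 'filtered_lines' and 'marker' are made up for illustration, and passing 'dir_path' as a parameter is an assumption (in your code it is a global). Wrapping the lines in io.StringIO is just one possible way to hand a file-like object to a parser that won't take a list; I haven't tried it against your data.

```python
import os
from io import StringIO


def filtered_lines(path, marker='<!--'):
    """Yield the lines of one file, skipping a first line that starts with marker.

    'filtered_lines' and 'marker' are hypothetical names for this sketch.
    """
    with open(path) as fd:
        lines = fd.readlines()
    # Drop only the first line, and only if it begins with the marker.
    if lines and lines[0].startswith(marker):
        lines = lines[1:]
    yield from lines


def return_files(file_list, dir_path):
    """Yield one in-memory file-like object per file, in sorted filename order."""
    for filename in sorted(file_list):
        path = os.path.join(dir_path, filename)
        # Join the kept lines back into text and wrap it so it can be read like a file.
        yield StringIO(''.join(filtered_lines(path)))
```

Each item yielded by 'return_files' then has a .read() method, so it can be passed wherever an open text file was passed before.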



