Python Equivalent for dd & fold

Thu Jul 16 11:44:18 EDT 2009

On Jul 16, 10:12 am, seldan24 <selda... at gmail.com> wrote:
> On Jul 15, 1:48 pm, Emile van Sebille <em... at fenx.com> wrote:
>
>
>
> > On 7/15/2009 10:23 AM MRAB said...
>
> > >> On Jul 15, 12:47 pm, Michiel Overtoom <mot... at xs4all.nl> wrote:
> > >>> seldan24 wrote:
> > >>>> what can I use as the equivalent for the Unix 'fold' command?
> > >>> def fold(s,len):
> > >>>      while s:
> > >>>          print s[:len]
> > >>>          s=s[len:]
>
> > <snip>
> > > You might still need to tweak the above code as regards how line endings
> > > are handled.
>
> > You might also want to tweak it if the strings are _really_ long to
> > simply slice out the substrings as opposed to reassigning the balance to
> > a newly created s on each iteration.
>
> > Emile
>
> Thanks for all of the help.  I'm almost there.  I have it working now,
> but the 'fold' piece is very slow.  When I use the 'fold' command in
> shell it is almost instantaneous.  I was able to do the EBCDIC->ASCII
> conversion using the decode method in the built-in str type.  I didn't
> have to import the codecs module.  I just decoded the data from cp037,
> which works fine.
>
> So now, I'm left with a large file, consisting of one extremely long
> line of ASCII data that needs to be sliced up into 35 character
> lines.  I did the following, which works but takes a very long time:
>
> f = open(ascii_file, 'w')
> while ascii_data:
>     f.write(ascii_data[:len])
>     ascii_data = ascii_data[len:]
> f.close()
>
> I know that Emile suggested slicing out the substrings rather than
> gradually trimming the string variable, as I am doing now.  So, I'm
> going to give that a try... I'm a bit confused by what that means; I'm
> guessing that a slice can break up a string by character position;
> will research.  Thanks for the help thus far.  I'll post again when
> all is working fine.

The problem is that "ascii_data = ascii_data[len:]" creates a new string
on every iteration, copying everything that remains, so the total work
grows roughly with the square of the data length.  I believe Emile was
suggesting that you just keep moving the starting index through the
same string, something like (warning - untested code!):

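>>> length = 35  # line width; not named 'len', which would shadow the built-in
>>> i = 0
>>> str_len = len(ascii_data)
>>> while i < str_len:
...     j = min(i + length, str_len)  # end of this slice, clamped to the string
...     print(ascii_data[i:j])
...     i = j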
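
For what it's worth, the same index-stepping idea can be sketched as a small generator plus a helper that writes the result to a file. This is only a sketch under a few assumptions: the names fold, fold_to_file, and width are made up for illustration, and the data is assumed to fit in memory as a single string. One thing to watch is that f.write() does not append a newline the way print does, so the helper adds one after each slice.

```python
def fold(text, width=35):
    """Yield successive slices of text, each at most width characters long.

    Only the start index moves; the remainder of the string is never
    copied, so the work stays proportional to len(text).
    """
    for start in range(0, len(text), width):
        # Slicing past the end is safe: the last piece is simply shorter.
        yield text[start:start + width]


def fold_to_file(text, path, width=35):
    """Write text to path, one width-character line at a time (hypothetical helper)."""
    with open(path, "w") as out:
        for line in fold(text, width):
            # write() adds no newline on its own, unlike print.
            out.write(line + "\n")
```

With the variables from the quoted message, the call would be something like fold_to_file(ascii_data, ascii_file, 35).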