Python Equivalent for dd & fold
Casey Webster
Caseyweb at gmail.com
Thu Jul 16 11:44:18 EDT 2009
On Jul 16, 10:12 am, seldan24 <selda... at gmail.com> wrote:
> On Jul 15, 1:48 pm, Emile van Sebille <em... at fenx.com> wrote:
>
>
>
> > On 7/15/2009 10:23 AM MRAB said...
>
> > >> On Jul 15, 12:47 pm, Michiel Overtoom <mot... at xs4all.nl> wrote:
> > >>> seldan24 wrote:
> > >>>> what can I use as the equivalent for the Unix 'fold' command?
> > >>> def fold(s,len):
> > >>>     while s:
> > >>>         print s[:len]
> > >>>         s=s[len:]
>
> > <snip>
> > > You might still need to tweak the above code as regards how line endings
> > > are handled.
>
> > You might also want to tweak it if the strings are _really_ long to
> > simply slice out the substrings as opposed to reassigning the balance to
> > a newly created s on each iteration.
>
> > Emile
>
> Thanks for all of the help. I'm almost there. I have it working now,
> but the 'fold' piece is very slow. When I use the 'fold' command in
> shell it is almost instantaneous. I was able to do the EBCDIC->ASCII
> conversion using the decode method in the built-in str type. I didn't
> have to import the codecs module. I just decoded the data to cp037
> which works fine.
>
> So now, I'm left with a large file, consisting of one extremely long
> line of ASCII data that needs to be sliced up into 35 character
> lines. I did the following, which works but takes a very long time:
>
> f = open(ascii_file, 'w')
> while ascii_data:
>     f.write(ascii_data[:len])
>     ascii_data = ascii_data[len:]
> f.close()
>
> I know that Emile suggested that I can slice out the substrings rather
> than do the gradual trimming of the string variable as is being done
> by moving around the length. So, I'm going to give that a try... I'm
> a bit confused by what that means, am guessing that slice can break up
> a string based on characters; will research. Thanks for the help thus
> far. I'll post again when all is working fine.

The problem is that "ascii_data = ascii_data[len:]" creates a new string
on every iteration: each slice copies everything that remains, so the
total work grows roughly with the square of the string's length. I
believe Emile was suggesting that you leave the original string alone
and just keep moving the starting index through it, something like
(warning - untested code!):

    length = 35  # line width, from your description
    i = 0
    str_len = len(ascii_data)
    while i < str_len:
        j = min(i + length, str_len)
        print(ascii_data[i:j])
        i = j