Fwd: Processing a large string

Yaşar Arabacı yasar11732 at gmail.com
Sun Aug 28 15:52:23 EDT 2011


---------- Forwarded message ----------
From: Yaşar Arabacı <yasar11732 at gmail.com>
Date: 28 August 2011 22:51
Subject: Re: Processing a large string
To: Paul Rudin <paul.nospam at rudin.co.uk>


Are you getting an OverflowError or a MemoryError? In case you don't know
what those mean:

An OverflowError occurs when a list grows beyond sys.maxsize items.
A MemoryError occurs when your objects take up so much memory that no more
memory can be allocated for you.

For example, if you have only one item in your list and that item consumes
all your memory, you will get a MemoryError. On the other hand, if you put so
many items into the list that the list index reaches sys.maxsize, you will
get an OverflowError.
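
To make the difference between the two errors concrete, here is a small
sketch. It assumes CPython and Python 3 syntax, and error_name is a made-up
helper name, not something from this thread. On CPython both requests fail
up front, before any large block of memory is actually handed out.

```python
import sys

def error_name(n):
    """Try to build a list of n items and report which exception is raised."""
    try:
        [0] * n
    except (OverflowError, MemoryError) as exc:
        return type(exc).__name__
    return "no error"

# One more than sys.maxsize: the size itself does not fit in a list index.
print(error_name(sys.maxsize + 1))

# Exactly sys.maxsize: the size is a valid index, but that much memory
# cannot be allocated.
print(error_name(sys.maxsize))
```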

The answer to your question depends on the specifics of your needs and your
current situation. If you can read the whole string without getting a
MemoryError, you won't get a MemoryError when you split the string into a
list (at least in my theory), but you can still get an OverflowError (in 2.6
at least). If your string is in a file or buffer, I would do something like:

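file__ = open("big-string.txt", "r")
output__ = open("outputfile", "w")

temp__ = ""

def process_string(string_):
    # Placeholder: put the real processing of one part here.
    processed_string = string_
    return processed_string

while 1:
    new_char = file__.read(1)  # read one character at a time
    if new_char == "":
        # End of file: process whatever came after the last "3".
        output__.write(process_string(temp__))
        break
    elif new_char == "3":
        # Delimiter found: process the part collected so far.
        output__.write(process_string(temp__))
        temp__ = ""
    else:
        temp__ = temp__ + new_char

file__.close()
output__.close()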
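
Reading one character at a time is simple but slow. A variation on the same
idea, as a rough sketch only: read the file in larger blocks and hand out each
"3"-delimited part as soon as it is complete. The names split_stream and
chunk_size are my own assumptions, and io.StringIO stands in for the real
file so the example can run by itself.

```python
import io

def split_stream(stream, delimiter="3", chunk_size=4096):
    """Yield the parts of `stream` separated by `delimiter`, reading in chunks."""
    pending = ""
    while True:
        chunk = stream.read(chunk_size)
        if chunk == "":
            break
        pending += chunk
        parts = pending.split(delimiter)
        # The last element may be an incomplete part; keep it for the next read.
        pending = parts.pop()
        for part in parts:
            yield part
    # Whatever remains after the last delimiter is the final part.
    yield pending

sample = io.StringIO("akakksssk3dhdhdhdbddb3dkdkdkddk3dmdmdmd3dkdkdkdk3asnsn")
for piece in split_stream(sample, chunk_size=8):
    print(piece)
```

Only the current block and the unfinished part are held in memory at any
time, so the size of the whole string does not matter as long as the
individual parts stay small.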


2011/8/28 Paul Rudin <paul.nospam at rudin.co.uk>

> goldtech <goldtech at worldpost.com> writes:
>
> > Hi,
> >
> > Say I have a very big string with a pattern like:
> >
> > akakksssk3dhdhdhdbddb3dkdkdkddk3dmdmdmd3dkdkdkdk3asnsn.....
> >
> > I want to split the string into separate parts on the "3" and process
> > each part separately. I might run into memory limitations if I use
> > "split" and get a big array(?)  I wondered if there's a way I could
> > read (stream?) the string from start to finish and read what's
> > delimited by the "3" into a variable, process the smaller string
> > variable then append/build a new string with the processed data?
> >
> > Would I loop it and read it char by char till a "3"...? Or?
> >
> > Thanks.
>
> import itertools
>
> s = "akakksssk3dhdhdhdbddb3dkdkdkddk3dmdmdmd3dkdkdkdk3asnsn"
> for k, subs in itertools.groupby(s, lambda x: x=="3"):
>   if not k:  # skip the groups that are just the "3" delimiter
>     print(''.join(subs))
>
>
> what you actually do in the body of the loop depends on what you want to
> do with the bits.
> --
> http://mail.python.org/mailman/listinfo/python-list
>



-- 
http://yasar.serveblog.net/



