Fastest way to remove the first x characters from a very long string

Chris Angelico rosuav at gmail.com
Sat May 16 09:45:57 EDT 2015


On Sat, May 16, 2015 at 11:28 PM,  <bruceg113355 at gmail.com> wrote:
> I have a string that contains 10 million characters.
>
> The string is formatted as:
>
> "0000001 : some hexadecimal text ... \n
> 0000002 : some hexadecimal text ... \n
> 0000003 : some hexadecimal text ... \n
> ...
> 0100000 : some hexadecimal text ... \n
> 0100001 : some hexadecimal text ... \n"
>
> and I need the string to look like:
>
> "some hexadecimal text ... \n
> some hexadecimal text ... \n
> some hexadecimal text ... \n
> ...
> some hexadecimal text ... \n
> some hexadecimal text ... \n"
>
> I can split the string at the ":" then iterate through the list removing the first 8 characters then convert back to a string. This method works, but it takes too long to execute.
>
> Any tricks to remove the first n characters of each line in a string faster?

Given that your definition is "each line", what I'd advise is first
splitting the string into lines, then changing each line, and then
rejoining them into a single string.

# Split on newlines, drop the first 8 characters of each line, then rejoin.
lines = original_text.split("\n")
new_text = "\n".join(line[8:] for line in lines)

Would that work?
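
One thing to check is the slice width. If the lines really look like the
quoted sample ("0000001 : ..."), the prefix is seven digits plus " : ",
which is ten characters rather than eight. Here is a minimal,
self-contained sketch of both a fixed-width slice and a variant that cuts
at the first " : " instead of counting characters. The function names,
the " : " separator and the sample data are assumptions made for
illustration, not something taken from your real input.

```python
def strip_line_prefix(text, n=8):
    """Drop the first n characters of every line in text."""
    return "\n".join(line[n:] for line in text.split("\n"))


def strip_through_separator(text, sep=" : "):
    """Drop everything up to and including the first sep on each line.

    Lines that do not contain sep are left unchanged.
    """
    out = []
    for line in text.split("\n"):
        head, found, tail = line.partition(sep)
        out.append(tail if found else line)
    return "\n".join(out)


# Hypothetical two-line sample in the format shown in the question.
sample = "0000001 : de ad be ef\n0000002 : ca fe ba be\n"

print(repr(strip_line_prefix(sample, 10)))
print(repr(strip_through_separator(sample)))
```

The fixed-width slice does less work per line, so it is probably the
faster of the two; the partition version is only worth it if the prefix
width can vary. Either way, time it on your actual data before deciding.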

ChrisA


