Splitting a string into substrings of equal size

MRAB python at mrabarnett.plus.com
Sat Aug 15 18:28:09 EDT 2009


Brian wrote:
> 
> 
> On Sat, Aug 15, 2009 at 4:06 PM, MRAB <python at mrabarnett.plus.com> wrote:
> 
>     ryles wrote:
> 
>         On Aug 14, 8:22 pm, candide <cand... at free.invalid> wrote:
> 
>             Suppose you need to split a string into substrings of a
>             given size (except
>             possibly the last substring). I make the hypothesis the
>             first slice is at the
>             end of the string.
>             A typical example is provided by formatting a decimal string
>             with thousands
>             separator.
> 
>             What is the pythonic way to do this ?
> 
>             For my part, i reach to this rather complicated code:
> 
>             # ----------------------
> 
>             def comaSep(z,k=3, sep=','):
>                z=z[::-1]
>                x=[z[k*i:k*(i+1)][::-1] for i in range(1+(len(z)-1)/k)][::-1]
>                return sep.join(x)
> 
>             # Test
>             for z in ["75096042068045", "509", "12024", "7", "2009"]:
>                print z+" --> ", comaSep(z)
> 
>             # ----------------------
> 
>             outputting :
> 
>             75096042068045 -->  75,096,042,068,045
>             509 -->  509
>             12024 -->  12,024
>             7 -->  7
>             2009 -->  2,009
> 
>             Thanks
> 
> 
>         py> s='1234567'
>         py> ','.join(_[::-1] for _ in re.findall('.{1,3}',s[::-1])[::-1])
>         '1,234,567'
>         py> # j/k ;)
> 
> 
>     If you're going to use re, then:
> 
> 
>      >>> for z in ["75096042068045", "509", "12024", "7", "2009"]:
>      ...     print re.sub(r"(?<=.)(?=(?:...)+$)", ",", z)
>      ...
>     75,096,042,068,045
>     509
>     12,024
>     7
>     2,009
> 
> 
> Can you please break down this regex?
> 
The call replaces a zero-width match with a comma, i.e. it inserts a
comma, wherever both of the following conditions are met:

"(?<=.)"
     Look behind for one character: there must be at least one character
before the current position. This ensures that a comma is never inserted
at the start of the string. I could also have used "(?<!^)". Note that
the pattern doesn't check whether the first character is a "-", so
"-123" would become "-,123". Handling that is left as an exercise for
the reader. :-)
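
A minimal sketch of the lookbehind behaviour, written in Python 3 syntax
(the session quoted above is Python 2). The function names and the
digit-only variant are illustrative assumptions, not part of the original
one-liner; the digit-only variant is just one possible way to handle a
leading "-".

```python
import re

def add_commas(text):
    # Pattern from this post: any one character behind, groups of 3 ahead.
    return re.sub(r"(?<=.)(?=(?:...)+$)", ",", text)

def add_commas_not_start(text):
    # Alternative lookbehind: "not at the start of the string".
    return re.sub(r"(?<!^)(?=(?:...)+$)", ",", text)

def add_commas_signed(text):
    # Hypothetical variant: require a digit behind and digits ahead,
    # so no comma is inserted right after a leading "-".
    return re.sub(r"(?<=\d)(?=(?:\d{3})+$)", ",", text)

print(add_commas("123456"))           # 123,456
print(add_commas_not_start("123456")) # 123,456
print(add_commas("-123456"))          # -,123,456
print(add_commas_signed("-123456"))   # -123,456
```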

"(?=(?:...)+$)"
     Look ahead for one or more groups of 3 characters, followed by the
end of the string, i.e. the number of characters remaining must be a
non-zero multiple of 3.
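
To see where the lookahead succeeds, the sketch below (Python 3 syntax)
lists the offsets at which the zero-width pattern matches and then
generalises the group size. The helper names and the k and sep parameters
are assumptions made for illustration only.

```python
import re

def comma_positions(text):
    # Offsets where the zero-width pattern matches, i.e. where a comma goes.
    return [m.start() for m in re.finditer(r"(?<=.)(?=(?:...)+$)", text)]

def group_from_right(text, k=3, sep=","):
    # Hypothetical generalisation: groups of k characters instead of 3.
    pattern = r"(?<=.)(?=(?:.{%d})+$)" % k
    return re.sub(pattern, sep, text)

print(comma_positions("1234567"))                 # [1, 4]
print(group_from_right("1234567"))                # 1,234,567
print(group_from_right("1234567", k=2, sep=" "))  # 1 23 45 67
```

At offsets 1 and 4 the remaining text is 6 and 3 characters long, so the
lookahead succeeds there and nowhere else.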


