Extract the middle N chars of a string

Wed May 18 21:34:59 EDT 2016

On Thu, 19 May 2016 03:00 am, MRAB wrote:

> I think your results are inconsistent.
> 
> For an odd number of characters you have "abc" + "de" + "fg", i.e. more
> on the left, but for an even number of characters you have "a" + "bcd" +
> "ef", i.e. more on the right.

Correct. That's intentional.

I didn't start with an algorithm. I started by manually extracting the N
middle characters, for various values of N, then wrote code to get the same
result.

For example, if I start with an odd-length string, like "inquisition", then
the "middle N" cases for odd-N are no-brainers, because they have to be
centered on the middle character:

N=1 's'
N=3 'isi'
N=5 'uisit'

For even-N, I had a choice:

N=2 'is' or 'si'

and to be perfectly frank, I didn't really care much either way and just
arbitrarily picked the second, based on the fact that str.center() ends
up with a slight bias to the right:

py> 'ab'.center(5, '*')
'**ab*'

For an even-length string, like "aardvark", the even-N case is the no-brainer:

N=2 "dv"
N=4 "rdva"
N=6 "ardvar"

but with odd-N I have a choice:

N=3 "rdv" or "dva"

In this case, I *intentionally* biased it the other way (to the left, giving
"rdv"), so that (in some sense) overall the mid() function would be unbiased:

- half the cases, there's no bias at all;
- a quarter of the time, there's a bias to the right;
- a quarter of the time, there's a bias to the left;
- so on average, the bias is zero.
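
A minimal sketch of the behaviour described above, reconstructed from the
worked examples in this post. It is not the original code, and the name
mid_balanced is invented here for illustration. The only assumption is that
the offset rounds down by default and is bumped one place to the right when
the string length is odd and N is even:

```python
def mid_balanced(string, n):
    """Return the middle n chars of string, alternating the bias.

    Hypothetical reconstruction: leans right for odd-length strings
    with even n, and left for even-length strings with odd n.
    """
    if n <= 0:
        return ''
    if n >= len(string):
        return string
    extra = len(string) - n   # characters left over, split across both sides
    ofs = extra // 2          # rounds down: slice leans left when extra is odd
    if extra % 2 and len(string) % 2:
        ofs += 1              # odd-length string, even n: lean right instead
    return string[ofs:ofs + n]


print(mid_balanced("inquisition", 2))  # 'si'
print(mid_balanced("aardvark", 3))     # 'rdv'
```

When the leftover count is even, there is nothing to choose and the slice is
exactly centred.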

> My own solution is:
> 
> 
> def mid(string, n):
>      """Return middle n chars of string."""
>      if n <= 0:
>          return ''
>      if n > len(string):
>          return string
>      ofs = (len(string) - n) // 2
>      return string[ofs : ofs + n]
> 
> 
> If there's an odd number of characters remaining, it always has more on
> the right.

Thanks to you and Ian (who independently posted a similar solution). That's
quite good too, if you don't care about the bias.
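
For comparison, here is the quoted function run on the two ambiguous cases
from earlier in this post. Because the offset always rounds down, the slice
leans left in both of them:

```python
def mid(string, n):
    """Return middle n chars of string (MRAB's version, as quoted)."""
    if n <= 0:
        return ''
    if n > len(string):
        return string
    ofs = (len(string) - n) // 2   # always rounds down
    return string[ofs : ofs + n]


print(mid("inquisition", 2))  # 'is'
print(mid("aardvark", 3))     # 'rdv'
```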

-- 
Steven