Extract the middle N chars of a string
Steven D'Aprano
steve at pearwood.info
Wed May 18 21:34:59 EDT 2016
On Thu, 19 May 2016 03:00 am, MRAB wrote:
> I think your results are inconsistent.
>
> For an odd number of characters you have "abc" + "de" + "fg", i.e. more
> on the left, but for an even number of characters you have "a" + "bcd" +
> "ef", i.e. more on the right.
Correct. That's intentional.
I didn't start with an algorithm. I started by manually extracting the N
middle characters, for various values of N, then wrote code to get the same
result.
For example, if I start with an odd-length string, like "inquisition", then
the "middle N" cases for odd-N are no-brainers, because they have to be
centered on the middle character:
N=1 's'
N=3 'isi'
N=5 'uisit'
For even-N, I had a choice:
N=2 'is' or 'si'
and to be perfectly frank, I didn't really care much either way and just
arbitrarily picked the second, based on the fact that str.center() ends
up with a slight bias to the right:
py> 'ab'.center(5, '*')
'**ab*'
For an even-length string, like "aardvark", the even-N case is the no-brainer:
N=2 "dv"
N=4 "rdva"
N=6 "ardvar"
but with odd-N I have a choice:
N=3 "rdv" or "dva"
In this case, I *intentionally* biased it the other way, so that (in some
sense) overall the mid() function would be unbiased:
- half the cases, there's no bias at all;
- a quarter of the time, there's a bias to the right;
- a quarter of the time, there's a bias to the left;
- so on average, the bias is zero.
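The cases above can be captured in a few lines. What follows is only a
minimal sketch that reproduces the hand-worked examples in this message,
not necessarily the code as originally posted; the name mid_balanced and
the rounding trick are assumptions made for illustration.

```python
def mid_balanced(string, n):
    """Return the middle n chars of string, alternating the bias.

    Hypothetical helper: it reproduces the hand-worked examples above.
    """
    if n <= 0:
        return ''
    if n >= len(string):
        return string
    extra = len(string) - n  # characters left over, both sides combined
    # When extra is odd the slice cannot be centred exactly.
    # Odd-length strings round the offset up (bias to the right),
    # even-length strings round it down (bias to the left).
    ofs = (extra + len(string) % 2) // 2
    return string[ofs:ofs + n]


print(mid_balanced("inquisition", 2))  # prints: si
print(mid_balanced("aardvark", 3))     # prints: rdv
```

When extra is even, adding len(string) % 2 before the floor division
changes nothing, so the unambiguous cases stay exactly centred.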
> My own solution is:
>
>
> def mid(string, n):
>     """Return middle n chars of string."""
>     if n <= 0:
>         return ''
>     if n > len(string):
>         return string
>     ofs = (len(string) - n) // 2
>     return string[ofs : ofs + n]
>
>
> If there's an odd number of characters remaining, it always has more on
> the right.
Thanks to you and Ian (who independently posted a similar solution). That's
quite good too, if you don't care about the bias.
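To make the difference concrete, here is the quoted floor-division
approach as a runnable sketch. The name mid_floor is hypothetical, used
here only to label it. It differs from the hand-picked results above in
just one case, an odd-length string with even N.

```python
def mid_floor(string, n):
    """Return middle n chars of string, always rounding the offset down."""
    if n <= 0:
        return ''
    if n > len(string):
        return string
    # Floor division: any odd leftover character ends up on the right.
    ofs = (len(string) - n) // 2
    return string[ofs:ofs + n]


print(mid_floor("inquisition", 2))  # prints: is  (hand-picked choice was 'si')
print(mid_floor("aardvark", 3))     # prints: rdv (same as the hand-picked choice)
```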
--
Steven