Flexable Collating (feedback please)
Gabriel Genellina
gagsl-py at yahoo.com.ar
Wed Oct 18 18:55:45 EDT 2006
At Wednesday 18/10/2006 03:42, Ron Adam wrote:
>I put together the following module today and would like some feedback on any
>obvious problems. Or even opinions of whether or not it is a good approach.
>     if self.flag & CAPS_FIRST:
>         s = s.swapcase()
This only works by coincidence: it relies on lowercase letters sorting
before uppercase ones in the locale's collating sequence, and I don't
see why that should always be so.
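One way to avoid depending on how the locale orders case is to state the uppercase-first rule explicitly in the sort key. The following is only a minimal sketch, not the module's code: `caps_first_key` is a hypothetical name, and it compares the case-folded text by code point rather than through `locale.strxfrm`.

```python
def caps_first_key(s):
    # Primary key: the text with case differences removed.
    # Tie-breaker: one flag per character; False (not lowercase) sorts
    # before True (lowercase), so capitalized forms come first.
    return (s.lower(), [c.islower() for c in s])

words = ['apple', 'Banana', 'Apple', 'banana']
result = sorted(words, key=caps_first_key)
print(result)
```

With this key the capitalized form of each word is placed directly before its lowercase form, whatever the collating sequence says about case.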
>     if self.flag & IGNORE_LEADING_WS:
>         s = s.strip()
This strips trailing whitespace too. (lstrip?)
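A quick check of the difference, using a made-up sample string:

```python
s = '  abc  '

both_ends = s.strip()    # removes leading and trailing whitespace
leading_only = s.lstrip()  # removes leading whitespace only

print(repr(both_ends))
print(repr(leading_only))
```

Only `lstrip()` leaves the trailing blanks in place, which is what a flag named IGNORE_LEADING_WS suggests.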
>     if self.flag & NUMERICAL:
>         if self.flag & COMMA_IN_NUMERALS:
>             rex = re.compile('^(\d*\,?\d*\.?\d*)(\D*)(\d*\,?\d*\.?\d*)',
>                              re.LOCALE)
>         else:
>             rex = re.compile('^(\d*\.?\d*)(\D*)(\d*\.?\d*)', re.LOCALE)
>         slist = rex.split(s)
>         for i, x in enumerate(slist):
>             if self.flag & COMMA_IN_NUMERALS:
>                 x = x.replace(',', '')
>             try:
>                 slist[i] = float(x)
>             except:
>                 slist[i] = locale.strxfrm(x)
>         return slist
>     return locale.strxfrm(s)
You should try to make this part a bit more generic. If you are
concerned about locales, do not hard-code "comma" explicitly. In other
countries 10*100=1.000, and 1,234 is a fraction between 1 and 2.
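As a rough illustration of what "more generic" could mean, the separators can be taken from `locale.localeconv()` instead of being written into the code. This is a sketch under assumptions: `make_number_parser` and `parse` are hypothetical names, not part of the module, and the separators can be passed explicitly so the behaviour does not depend on which locale happens to be active. (`locale.atof` performs a similar conversion for the current locale.)

```python
import locale

def make_number_parser(decimal_point=None, thousands_sep=None):
    """Return a function that converts a localized numeral to float.

    Separators default to those of the active locale (locale.localeconv()).
    """
    conv = locale.localeconv()
    if decimal_point is None:
        decimal_point = conv['decimal_point']
    if thousands_sep is None:
        thousands_sep = conv['thousands_sep']

    def parse(text):
        # Drop the grouping character first, then normalize the decimal mark.
        if thousands_sep:
            text = text.replace(thousands_sep, '')
        if decimal_point and decimal_point != '.':
            text = text.replace(decimal_point, '.')
        return float(text)

    return parse

# Explicit separators, so the example does not need any locale installed.
parse_comma_decimal = make_number_parser(decimal_point=',', thousands_sep='.')
parse_dot_decimal = make_number_parser(decimal_point='.', thousands_sep=',')

print(parse_comma_decimal('1.000'))   # thousands separator removed
print(parse_comma_decimal('1,234'))   # comma read as the decimal mark
print(parse_dot_decimal('1,234'))     # comma read as a grouping character
```

The same text, '1,234', comes out as a fraction or as a four-digit integer depending on the convention, which is why a fixed COMMA_IN_NUMERALS flag cannot be right everywhere.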
> The NUMERICAL option orders leading and trailing digits as numerals.
>
> >>> t = ['a5', 'a40', '4abc', '20abc', 'a10.2', '13.5b', 'b2']
> >>> collated(t, NUMERICAL)
> ['4abc', '13.5b', '20abc', 'a5', 'a10.2', 'a40', 'b2']
From the name "NUMERICAL" I would expect this sorting: b2, 4abc, a5,
a10.2, 13.5b, 20abc, a40 (that is, sorting as numbers only).
Maybe GROUP_NUMBERS... but I don't like that too much either...
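To make the two readings of the name concrete, here is a small sketch of both orderings as sort keys. Neither function comes from the module; `numbers_only_key` and `grouped_key` are hypothetical names, and the regular expressions assume a plain "." decimal point.

```python
import re

NUMBER = r'\d+(?:\.\d+)?'

def numbers_only_key(s):
    # Order by the first number found anywhere in the string;
    # strings without any number go last.
    m = re.search(NUMBER, s)
    return float(m.group()) if m else float('inf')

def grouped_key(s):
    # Split into optional leading number, text, optional trailing number.
    # Each piece is tagged so that numbers (tag 0) sort before text (tag 1)
    # and pieces of different kinds are never compared directly.
    lead, text, trail = re.match(r'(%s)?(\D*)(%s)?' % (NUMBER, NUMBER), s).groups()
    parts = []
    if lead:
        parts.append((0, float(lead), ''))
    parts.append((1, 0.0, text))
    if trail:
        parts.append((0, float(trail), ''))
    return parts

t = ['a5', 'a40', '4abc', '20abc', 'a10.2', '13.5b', 'b2']
by_number = sorted(t, key=numbers_only_key)
by_group = sorted(t, key=grouped_key)
print(by_number)
print(by_group)
```

The first key gives the order I would expect from "NUMERICAL"; the second reproduces the order shown in the quoted docstring.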
--
Gabriel Genellina
Softlab SRL