[stdlib-sig] textwrap module and hyphenation

Antoine Pitrou solipsis at pitrou.net
Sat Apr 19 21:36:08 CEST 2008


On Saturday 19 April 2008 at 00:08 -0700, Brett Cannon wrote:
> On Fri, Apr 18, 2008 at 11:18 PM, Sylvain Fourmanoit
> <syfou at users.sourceforge.net> wrote:
> > I just noticed the textwrap module in the standard library will break and
> >  line-wrap hyphenated words given the opportunity:
> >
> >  >>> from textwrap import wrap
> >  >>> wrap('yaba daba-doo', width=10)
> >  ['yaba daba-', 'doo']

[...]

> I personally don't think so as you could easily just walk the list and
> just concatenate the hyphenated words.

But then the words wouldn't be wrapped properly, would they?
In the above example, if you join the two strings together, the result
is more than 10 chars long.
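
To make that concrete, here is a minimal sketch of the kind of
post-processing being suggested. The merging loop is only my reading of
"walk the list and concatenate"; it is not code from this thread:

```python
from textwrap import wrap

width = 10
lines = wrap('yaba daba-doo', width=width)  # wraps at the hyphen, as quoted above

# Walk the list and glue each line onto a previous line that ends in a hyphen.
merged = []
for line in lines:
    if merged and merged[-1].endswith('-'):
        merged[-1] += line
    else:
        merged.append(line)

# The merged line is no longer guaranteed to fit within `width`.
print(merged)
print([len(line) for line in merged])
```

The hyphenated word is whole again, but the line it sits on was never
re-wrapped, so it can overflow the requested width.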

I think this feature makes sense, and doesn't really clutter the API.
In the meantime, a workaround is to use another Unicode hyphen (*) in
place of the ASCII one in order to get the desired result, e.g.:

>>> import textwrap
>>> print(" | ".join(textwrap.wrap('yaba daba-doo', width=10)))
yaba daba- | doo
>>> print(" | ".join(textwrap.wrap('yaba daba\u2010doo', width=10)))
yaba | daba‐doo
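
If the text has to stay plain ASCII afterwards, the same trick can be
wrapped in a small helper. This is only a sketch: `wrap_keep_hyphens` is
a hypothetical name, not something in textwrap, and it assumes the input
does not already contain U+2010 (those would be turned into ASCII
hyphens on the way out):

```python
import textwrap

ASCII_HYPHEN = '-'
UNICODE_HYPHEN = '\u2010'  # HYPHEN; textwrap does not break after it


def wrap_keep_hyphens(text, width=70):
    """Hypothetical helper: wrap text without splitting at ASCII hyphens."""
    # Swap in U+2010 so each hyphenated word is seen as a single chunk.
    protected = text.replace(ASCII_HYPHEN, UNICODE_HYPHEN)
    lines = textwrap.wrap(protected, width=width)
    # Both hyphens are one character long, so swapping back keeps line lengths.
    return [line.replace(UNICODE_HYPHEN, ASCII_HYPHEN) for line in lines]


print(wrap_keep_hyphens('yaba daba-doo', width=10))
```

Because the substitution is one character for one character, the
lengths textwrap computed still hold after the hyphens are restored.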


(*) http://www.fileformat.info/info/unicode/char/00ad/index.htm
http://www.fileformat.info/info/unicode/char/2010/index.htm




