[Python-ideas] New explicit methods to trim strings

David Mertz mertz at gnosis.cx
Mon Apr 1 22:43:10 EDT 2019


On Mon, Apr 1, 2019 at 8:54 PM Steven D'Aprano <steve at pearwood.info> wrote:

> I can think of at least one English suffix pair that clash: -ify, -fy.
> How about other languages? How comfortable are you to say that nobody
> doing text processing in German or Hindi will need to deal with clashing
> affixes?
>

Here are the 30 most common suffixes in a large list of Dutch words.  For
the same tallies in other languages, see
https://gist.github.com/DavidMertz/1a4aac0e889097d7bf80d8d41a3a644d.  Note
that there is absolutely nothing morphological here; these are simply dumb
string literals, and clashing ones are easy to spot (e.g. 'en', 'den', and
'rden' all appear in the list):

% head -30 suffix-frequency-nl.txt
('en', 55338)
('er', 14387)
('de', 12541)
('den', 11427)
('ten', 9402)
('te', 8263)
('ng', 7502)
('es', 7398)
('st', 7102)
('ing', 6949)
('gen', 6836)
('rs', 6592)
('ers', 5581)
('ren', 4842)
('el', 4602)
('ngen', 4451)
('rde', 4255)
('ken', 4203)
('re', 3870)
('je', 3868)
('len', 3784)
('ste', 3680)
('ie', 3658)
('nd', 3635)
('erde', 3620)
('rden', 3593)
('jes', 3307)
('eren', 3193)
('id', 3123)
('rd', 3083)
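
For anyone who wants to reproduce something like this, here is a rough
sketch of how such a tally could be made.  It is not the script behind the
gist; the function names, the 2-to-4 character range, and the tiny word list
are all assumptions chosen for illustration.  It counts literal trailing
substrings and then lists pairs where one suffix is itself the tail of
another, which is the kind of clash Steven asked about:

```python
from collections import Counter


def suffix_counts(words, min_len=2, max_len=4):
    """Count literal trailing substrings of each word (no morphology).

    min_len and max_len are illustrative assumptions, not values taken
    from the original tally.
    """
    counts = Counter()
    for word in words:
        word = word.strip().lower()
        for n in range(min_len, max_len + 1):
            # Only count proper suffixes, never the whole word.
            if len(word) > n:
                counts[word[-n:]] += 1
    return counts


def clashing(suffixes):
    """Return (shorter, longer) pairs where longer ends with shorter."""
    return [(a, b) for a in suffixes for b in suffixes
            if a != b and b.endswith(a)]


# Hypothetical miniature word list standing in for the real one.
words = ["worden", "werden", "paarden", "boeken"]
counts = suffix_counts(words)
clashes = clashing(["en", "den", "rden", "er"])
```

With a real word list (one word per line), counts.most_common(30) would
give a listing shaped like the one above, and clashing() applied to those
30 strings would show how many of them nest inside each other.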



-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.