agrepy 1.1: Python port of agrep (string matching with errors)

Agrep, written by Sun Wu and Udi Manber (described in "Fast Text Searching Allowing Errors", CACM, 35(10), 1992), is a suite of C functions which together perform various string matching operations under UNIX (i.e. specified at the command line). agrepy takes agrep from its user-level setting and makes it available as a Python module. Specifically, this port implements those functions from agrep that relate to inexact matching of text strings which contain no metacharacters. For example, agrep is able to find matches despite differences due to American versus British spelling.

agrepy also extends agrep in the following sense: given a pattern and a text string, agrepy returns a list of all non-overlapping pairs of text indexes such that the start index is the first character of the text that matches the earliest pattern character exactly, and the end is the last text character that matches exactly. The end index of each match is 1 place greater than the actual index so it can be immediately used to construct a slice. (agrep itself is content with recognizing that the input line contains a match, but does not say where or differentiate multiple matches.)

agrepy 1.0 (July 1999) had an end-of-text bug in the short pattern module, while both the long and short pattern modules had problems at times finding the precise ends of a match. There can also be genuine ambiguity in specifying the ends of a match when the candidates have an equal number of errors. The methodology now is that the original algorithms find the ends of matches fairly accurately, and a separate, recursive function firms up the end position and finds the start position.

The authors of the original programs are Sun Wu and Udi Manber, as described above. See below for the copyright on the original algorithms. Other portions have been written by Michael J. Wise and are copyright under the terms of the Open Source Definition (http://www.opensource.org/osd.html).

Original Copyright

This material was developed by Sun Wu and Udi Manber at the University of Arizona, Department of Computer Science. Permission is granted to copy this software, to redistribute it on a nonprofit basis, and to use it for any purpose, subject to the following restrictions and understandings.

1. Any copy made of this software must include this copyright notice in full.

2. All materials developed as a consequence of the use of this software shall duly acknowledge such use, in accordance with the usual standards of acknowledging credit in academic research.

3. The authors have made no warranty or representation that the operation of this software will be error-free or suitable for any application, and they are under no obligation to provide any services, by way of maintenance, update, or otherwise. The software is an experimental prototype offered on an as-is basis.

4. Redistribution for profit requires the express, written permission of the authors.

SOURCE

The code for the port is available from:
ftp://ftp.ccsr.cam.ac.uk/pub/michaelw/src/agrepy_1.1.tar.gz

URL

http://www.ccsr.cam.ac.uk/~mw263/pyagrep.html

Dr Michael J. Wise
Bristol-Myers Squibb Senior Research Fellow
Pembroke College   | Centre for Communications Systems Research (CCSR)
Cambridge CB2 1RF  | 10 Downing St
England            | Cambridge CB2 3DS
                   | England
Telephone: (+44 1223) 740 121  FAX: (+44 1223) 740 099
Internet: M.Wise1@ccsr.cam.ac.uk (remove 1 to use the address)
URL: http://www.ccsr.cam.ac.uk/~michaelw
Research Visitor at the European Bioinformatics Institute
--
"If I'm not for myself, who is for me? But if I am only for myself, what am I? AND IF NOT NOW, WHEN?"
 - Sayings of the Fathers (Hillel) (emphasis my own)

<P><A HREF="http://www.ccsr.cam.ac.uk/~mw263/pyagrep.html">agrepy 1.1</A> - String matching in the presence of a small number of errors. (30-Sep-99)

--
----------- comp.lang.python.announce (moderated) ----------
Article Submission Address: python-announce@python.org
Python Language Home Page: http://www.python.org/
Python Quick Help Index: http://www.python.org/Help.html
------------------------------------------------------------
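
To illustrate the index convention described in the announcement above (each match is a start index plus an end index one past the last matched character, so the pair can be used directly as a slice), here is a minimal sketch. It is not agrepy's API and not the Wu-Manber algorithm: `find_approx` is a hypothetical, naive matcher that allows substitutions only and compares fixed-length windows. It also does not enforce agrepy's rule that a match starts and ends on exactly matching characters.

```python
def find_approx(pattern, text, max_errors):
    """Hypothetical helper, not part of agrepy.

    Returns non-overlapping (start, end) pairs where the window
    text[start:end] differs from pattern in at most max_errors
    positions (substitutions only). end is exclusive.
    """
    m = len(pattern)
    matches = []
    i = 0
    while i + m <= len(text):
        window = text[i:i + m]
        # Count positions where the window and the pattern disagree.
        mismatches = sum(1 for p, t in zip(pattern, window) if p != t)
        if mismatches <= max_errors:
            matches.append((i, i + m))
            i += m  # jump past this match so pairs never overlap
        else:
            i += 1
    return matches


text = "We organise events and organize talks."
pairs = find_approx("organize", text, 1)

# Because each end index is one past the last matched character,
# every pair can be used directly as a slice.
words = [text[start:end] for start, end in pairs]
print(pairs)  # [(3, 11), (23, 31)]
print(words)  # ['organise', 'organize']
```

The half-open convention means no `+ 1` adjustment is needed when extracting the matched text, which is the design choice the announcement describes.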