[Tutor] Picking up citations

Lie Ryan lie.1296 at gmail.com
Tue Feb 10 12:22:41 CET 2009


On Mon, 09 Feb 2009 14:42:47 -0800, Marc Tompkins wrote:

> Aha! My list of "magic words"!
> (Sorry for the top post - anybody know how to change quoting defaults in
> Android Gmail?)
> ---  www.fsrtechnologies.com
> 
> On Feb 9, 2009 2:16 PM, "Dinesh B Vadhia" <dineshbvadhia at hotmail.com>
> wrote:
> 
>  Kent /Emmanuel
> 
> I found a list of words before the first word that can be removed which
> I think is the only way to successfully parse the citations.  Here they
> are:
> 
> | E.g. | Accord | See |See + Also | Cf. | Compare | Contra | But + See |
> But + Cf. | See Generally | Citing | In |
> 

I think the only reliable way to parse all the citations correctly,
without relying on "magic words", is to keep a list of party names. It
involves a bit of manual work, but should be good enough if a small
number of cases are cited many times.

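>>> import re
>>> names = '|'.join(['Carter', 'Jury Commission of Greene County',
...                   'Lathe Turner', 'Fouche'])
>>> # placeholder that matches any reporter; list real abbreviations here
>>> rep = '|'.join(['.*?'])
>>> dd = {'names': names, 'publ': rep}
>>> # text is the string containing the citations
>>> re.search(r'((%(names)s) v\. (%(names)s)'
...           r'(, [0-9]+ (%(publ)s) [0-9]+)* \([0-9]+\))' % dd, text).group()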
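
As a rough, self-contained sketch of the same idea: the sample sentence,
the party_names list and the character class used for the reporter are my
own assumptions for illustration, not taken from the original documents.
It escapes the names before joining them and uses finditer to collect
every citation rather than only the first one.

```python
import re

# Hypothetical, hand-compiled list of party names (assumption for this sketch).
party_names = ['Carter', 'Jury Commission of Greene County', 'Turner', 'Fouche']

# Escape each name so characters such as '.' are matched literally.
names = '|'.join(re.escape(n) for n in party_names)

# "Party v. Party", then zero or more ", volume reporter page" groups,
# then a four-digit year in parentheses.
citation_re = re.compile(
    r'(?:%(names)s) v\. (?:%(names)s)'
    r'(?:, [0-9]+ [A-Za-z0-9. ]+? [0-9]+)*'
    r' \([0-9]{4}\)' % {'names': names}
)

# Made-up sample sentence; the leading "See also" is not part of any match.
text = ('See also Carter v. Jury Commission of Greene County, '
        '396 U.S. 320 (1970); Turner v. Fouche, 396 U.S. 346 (1970).')

citations = [m.group() for m in citation_re.finditer(text)]
for c in citations:
    print(c)
```

Because the match can only start at a known party name, a leading signal
such as "See also" is skipped without needing a separate list of them.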


