[Tutor] Picking up citations
Lie Ryan
lie.1296 at gmail.com
Tue Feb 10 12:22:41 CET 2009
On Mon, 09 Feb 2009 14:42:47 -0800, Marc Tompkins wrote:
> Aha! My list of "magic words"!
> (Sorry for the top post - anybody know how to change quoting defaults in
> Android Gmail?)
> --- www.fsrtechnologies.com
>
> On Feb 9, 2009 2:16 PM, "Dinesh B Vadhia" <dineshbvadhia at hotmail.com>
> wrote:
>
> Kent /Emmanuel
>
> I found a list of words before the first word that can be removed which
> I think is the only way to successfully parse the citations. Here they
> are:
>
> | E.g. | Accord | See |See + Also | Cf. | Compare | Contra | But + See |
> But + Cf. | See Generally | Citing | In |
>
I think the only reliable way to parse all the citations correctly in
the absence of a "magic word" is to have a list of names. It involves a
bit of manual work, but it should be good enough if a small number of
cases are cited many times.
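>>> import re
>>> # Known party names; extend this list by hand as needed.
>>> names = '|'.join(['Carter', 'Jury Commission of Greene County',
...                   'Lathe Turner', 'Fouche'])
>>> # Reporter abbreviations; '.*?' is a catch-all placeholder.
>>> rep = '|'.join(['.*?'])
>>> dd = {'names': names, 'publ': rep}
>>> # `text` is the string that contains the citations.
>>> re.search(r'((%(names)s) v\. (%(names)s)(, [0-9]+ (%(publ)s) [0-9]+)*'
...           r' \([0-9]+\))' % dd, text).group()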
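
For anyone who wants to try the name-list idea outside the interactive
prompt, here is a rough, self-contained sketch. The sample text, the
reporter list, and the helper name build_citation_pattern are my own
assumptions for illustration and do not come from the thread's data. It
escapes each name, tries longer names first, and collects every match
rather than only the first.

```python
import re

# Sample text made up for illustration; it is not the thread's actual data.
text = ("See Carter v. Jury Commission of Greene County, 396 U.S. 320 (1970); "
        "Cf. Turner v. Fouche, 396 U.S. 346 (1970).")


def build_citation_pattern(names, reporters):
    """Build a regex for 'Name v. Name, vol Reporter page (year)' (hypothetical helper)."""
    # Longer alternatives go first so a short name cannot shadow a longer
    # name that starts with the same characters.
    name_alt = '|'.join(re.escape(n) for n in sorted(names, key=len, reverse=True))
    rep_alt = '|'.join(re.escape(r) for r in sorted(reporters, key=len, reverse=True))
    return re.compile(
        r'(?:%(names)s) v\. (?:%(names)s)'      # the two parties
        r'(?:, [0-9]+ (?:%(publ)s) [0-9]+)*'    # zero or more "vol Reporter page"
        r' \([0-9]+\)'                          # the year in parentheses
        % {'names': name_alt, 'publ': rep_alt})


# Assumed name and reporter lists; in practice these are maintained by hand.
names = ['Carter', 'Jury Commission of Greene County', 'Turner', 'Fouche']
reporters = ['U.S.', 'S. Ct.', 'L. Ed. 2d']

pattern = build_citation_pattern(names, reporters)
citations = [m.group() for m in pattern.finditer(text)]
for citation in citations:
    print(citation)
```

Using re.escape() keeps the dots in names and reporter abbreviations
literal, and an explicit reporter list is stricter than the '.*?'
placeholder, so it is less likely to swallow unrelated text between two
numbers.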