[Tutor] Parsing Bible verses

Eduardo Vieira eduardo.susan at gmail.com
Tue May 26 07:48:35 CEST 2009


On Sat, May 23, 2009 at 3:37 AM, C or L Smith <smiles at worksmail.net> wrote:
> Here is something from my toolbox of routines that might be useful for the number ranges:
>
> >>> indices('-5--2')
> [-5, -4, -3, -2]
> >>> indices('3-4')
> [3, 4]
> >>> indices('3-4,10')
> [3, 4, 10]
>
> /chris
>
> def indices(s,n=None): #("1-3,7")->1,2,3,7;i("1,-3--1")->1,-3,-2,-1; or (slc,n=None)->slc.start,stop,step [for range(n)]
>    """Return a list of indices as defined by a MSWord print dialog-like range:
>
>    e.g. "1,3,5-7" -> [1, 3, 5, 6, 7]
>
>    A trailing comma will be ignored; a trailing dash will generate an error."""
>
>    # ranges must be increasing: -3--4 will not generate any numbers
>    assert type(s) is str
>    r=[x.strip() for x in s.split(',')]
>    rv = []
>    for ri in r:
>        if not ri: continue
>        if ri.find('-',1)>0: #ignore - in first position
>            dashat = ri.find('-',1) #start searching at position 1
>            nums = ri[:dashat],ri[dashat+1:]
>            #one might want to use sys.maxint-1 for stop if the '-' is encountered, the
>            #meaning being "from start to the end (as defined by the code elsewhere)"
>            #but then this should be made into an iterator rather than generating the
>            #whole list
>            if nums[1] in ['','-']:
>                raise ValueError('missing number in request to indices: %s'%ri)
>            start, stop = [int(x.strip()) for x in nums]
>            for i in xrange(start, stop+1):
>                rv.append(i)#yield i
>        else:
>            rv.append(int(ri))#yield int(ri)
>    return rv
>
Thank you, your examples do give me some good ideas. I still want to
look into the pyparsing solution more closely, though.
One thing I came up with was a way to parse and transform a list of
verses, adding the book name where it is missing. This is a first
step toward the bigger program. Here's the code:
#=====
import re

reference = 'Luke 1:25; 2:1-5, 8; 4:23; 1 Corinthians 2:24; 3:1-10; Salm 23'

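def addbook(ref):
    """Prefix every chapter:verse item with the last book name seen."""
    parseref = re.split(r'; *', ref)
    book = None
    for i, refe in enumerate(parseref):
        if not refe:
            continue  # empty piece left by a trailing ';'
        if refe[0].isalpha() or refe[1:2].isspace():
            # The item names its book ('Luke 1:25', '1 Corinthians 2:24'):
            # everything before the last space is taken as the book.
            book = refe.rsplit(' ', 1)[0]
        elif refe[0].isdigit() and book is not None:
            # Bare chapter:verse item: reuse the current book.
            parseref[i] = book + ' ' + refe
    return parseref


print(addbook(reference))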
#==========
This gives the following result:
['Luke 1:25', 'Luke 2:1-5, 8', 'Luke 4:23', '1 Corinthians 2:24',
 '1 Corinthians 3:1-10', 'Salm 23']
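
One caveat with taking the book from rsplit(' ', 1): if the item that
names the book also ends in a verse list, e.g. 'Luke 2:1-5, 8', the
split happens at the space before the 8 and the "book" becomes
'Luke 2:1-5,'. A minimal sketch of one way around that, assuming a book
name is an optional leading digit plus non-digit text followed by the
chapter number. The names BOOK_RE and addbook_re and the pattern itself
are hypothetical, for illustration only:

```python
import re

# Hypothetical pattern: optional leading digit ("1 Corinthians"), then a
# name made of non-digit characters, then whitespace and the chapter.
BOOK_RE = re.compile(r'^(\d?\s?[^\d:;,]+?)\s+(\d.*)$')


def addbook_re(ref):
    """Return the ';'-separated items of ref, each prefixed with its book."""
    result = []
    book = None
    for item in re.split(r';\s*', ref.strip()):
        if not item:
            continue  # empty piece left by a trailing ';'
        match = BOOK_RE.match(item)
        if match:
            # The item carries its own book name: remember it.
            book = match.group(1)
            result.append(item)
        elif book is not None:
            # Bare chapter:verse item: reuse the last book seen.
            result.append(book + ' ' + item)
        else:
            raise ValueError('no book name before %r' % item)
    return result


print(addbook_re('Luke 2:1-5, 8; 4:23; 1 Corinthians 2:24; 3:1-10'))
```

With this variant the first item may contain a verse list and the book
is still taken as 'Luke'.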

Now, a little further on the topic of a Bible database: I'm not sure
how I should proceed. I don't have the db file I need, so I will
have to generate it somehow from Bible software, because the
version I want is in Portuguese. I have found a Bible in SQL and a Bible
in MS Access, which give me some ideas on how to structure my database.
But my question is: do I really need a SQL database such as sqlite,
since I will only be reading from it, never adding or updating?
Would a pickled dictionary mapping Bible_reference to verse be
faster? Should I work with anydbm?
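
To make two of these options concrete, here is a minimal sketch, not a
benchmark. It assumes a hypothetical four-column layout (book, chapter,
verse, text); the table name verses, the helper names, and the
placeholder rows are all invented for illustration. The sqlite3 module
is in the standard library from Python 2.5 on.

```python
import sqlite3

# Placeholder rows: (book, chapter, verse, text). The text is dummy data.
ROWS = [
    ('Luke', 1, 25, 'placeholder text for Luke 1:25'),
    ('Luke', 2, 1, 'placeholder text for Luke 2:1'),
    ('Luke', 2, 2, 'placeholder text for Luke 2:2'),
]


def build_db(rows):
    """Load the rows into an in-memory sqlite database (hypothetical schema)."""
    conn = sqlite3.connect(':memory:')
    conn.execute(
        'CREATE TABLE verses (book TEXT, chapter INTEGER, verse INTEGER, '
        'text TEXT, PRIMARY KEY (book, chapter, verse))')
    conn.executemany('INSERT INTO verses VALUES (?, ?, ?, ?)', rows)
    conn.commit()
    return conn


def lookup_range(conn, book, chapter, first, last):
    """Return (verse, text) pairs for book chapter:first-last, in order."""
    cur = conn.execute(
        'SELECT verse, text FROM verses '
        'WHERE book = ? AND chapter = ? AND verse BETWEEN ? AND ? '
        'ORDER BY verse', (book, chapter, first, last))
    return cur.fetchall()


def build_dict(rows):
    """The dictionary alternative: key on the (book, chapter, verse) tuple."""
    return {(book, chapter, verse): text
            for book, chapter, verse, text in rows}


conn = build_db(ROWS)
print(lookup_range(conn, 'Luke', 2, 1, 5))
print(build_dict(ROWS)[('Luke', 1, 25)])
```

A pickled dictionary has to be loaded into memory in full before the
first lookup, while sqlite reads only what a query needs and can fetch a
verse range such as 2:1-5 in one query. Which is faster in practice
depends on the size of the data and would need measuring.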

Thanks for your knowledge and help

Eduardo

