[Tutor] Parsing Bible verses
Eduardo Vieira
eduardo.susan at gmail.com
Tue May 26 07:48:35 CEST 2009
On Sat, May 23, 2009 at 3:37 AM, C or L Smith <smiles at worksmail.net> wrote:
> Here is something from my toolbox of routines that might be useful for the number ranges:
>
> >>> indices('-5--2')
> [-5, -4, -3, -2]
> >>> indices('3-4')
> [3, 4]
> >>> indices('3-4,10')
> [3, 4, 10]
>
> /chris
>
> def indices(s,n=None): #("1-3,7")->1,2,3,7;i("1,-3--1")->1,-3,-2,-1; or (slc,n=None)->slc.start,stop,step [for range(n)]
>     """Return a list of indices as defined by a MSWord print dialog-like range:
>
>     e.g. "1,3,5-7" -> [1, 3, 5, 6, 7]
>
>     A trailing comma will be ignored; a trailing dash will generate an error."""
>
>     # ranges must be increasing: -3--4 will not generate any numbers
>     assert type(s) is str
>     r=[x.strip() for x in s.split(',')]
>     rv = []
>     for ri in r:
>         if not ri: continue
>         if ri.find('-',1)>0: #ignore - in first position
>             dashat = ri.find('-',1) #start searching at position 1
>             nums = ri[:dashat],ri[dashat+1:]
>             #one might want to use sys.maxint-1 for stop if the '-' is encountered, the
>             #meaning being "from start to the end (as defined by the code elsewhere")
>             #but then this should be made into an iterator rather than generating the
>             #whole list
>             if nums[1] in ['','-']:
>                 raise ValueError('missing number in request to indices: %s'%ri)
>             start, stop = [int(x.strip()) for x in nums]
>             for i in xrange(start, stop+1):
>                 rv.append(i)#yield i
>         else:
>             rv.append(int(ri))#yield int(ri)
>     return rv
>
Thank you, your examples do give me some good ideas. I still want to
investigate the pyparsing solution further, though.
One thing I came up with was a way to parse and transform a list of
verses, adding the book name where it was missing. This is a first
step toward the bigger program. Here's the code I came up with:
#=====
import re
reference = 'Luke 1:25; 2:1-5, 8; 4:23; 1 Corinthians 2:24; 3:1-10; Salm 23'
def addbook(ref):
    parseref = re.split(r'; *', ref)
    for i, refe in enumerate(parseref):
        if refe[0].isalpha() or refe[1].isspace():
            book = refe.rsplit(' ', 1)
        elif refe[0].isdigit() and refe[1]:
            vers = parseref.pop(i)
            parseref.insert(i, book[0] + ' ' + vers)
    return parseref
print addbook(reference)
#==========
This will give me this result:
['Luke 1:25', 'Luke 2:1-5, 8', 'Luke 4:23', '1 Corinthians 2:24',
 '1 Corinthians 3:1-10', 'Salm 23']
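A possible next step, borrowing the range idea from indices() above, would be
to split each completed reference into book, chapter and a list of verse
numbers. This is only a rough sketch: expand_verses is a made-up name, and it
assumes every reference looks like 'Book chapter:verses' with a single
chapter. Whole-chapter references such as 'Salm 23' are not handled and raise
an error.

```python
import re

def expand_verses(ref):
    """Turn 'Luke 2:1-5, 8' into ('Luke', 2, [1, 2, 3, 4, 5, 8]).

    Hypothetical helper: assumes one chapter per reference and a
    'Book chapter:verses' layout, as produced by addbook().
    """
    match = re.match(r'^(.+?) (\d+):(.+)$', ref)
    if match is None:
        raise ValueError('no chapter:verse part in %r' % ref)
    book, chapter, verse_part = match.groups()
    verses = []
    for piece in verse_part.split(','):
        piece = piece.strip()
        if not piece:
            continue  # ignore a trailing comma
        if '-' in piece:
            # an inclusive range such as '1-5'
            start, stop = [int(x) for x in piece.split('-', 1)]
            verses.extend(range(start, stop + 1))
        else:
            verses.append(int(piece))
    return book, int(chapter), verses
```

Book names that start with a number, like '1 Corinthians', still come out
right, because the pattern only treats the last number before the colon as the
chapter.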
Now, a little further on the topic of a Bible database. I'm not sure
how I should proceed. I don't have the db file I need, so I will
have to generate it somehow from a Bible program, because the
version I want is in Portuguese. I have found a Bible in SQL and a Bible
in MS Access, which give me some ideas on how to structure my database.
But my question is: do I really need a SQL database such as sqlite, since
I will only be reading from it, never adding or updating? Would a
pickled dictionary mapping each Bible reference to its verse be
faster? Should I work with anydbm?
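To make the question concrete, here is a minimal sketch of the two options
side by side. Everything in it is an assumption for illustration: the table
name verses, its columns, the (book, chapter, verse) key and the placeholder
verse text are all made up, and the SQLite database lives in memory rather
than in a real file.

```python
import pickle
import sqlite3

# Made-up sample rows: (book, chapter, verse, text). Placeholder text only.
rows = [
    ('Luke', 1, 25, 'text of Luke 1:25'),
    ('Luke', 2, 1, 'text of Luke 2:1'),
    ('Luke', 2, 2, 'text of Luke 2:2'),
]

# Option 1: an SQLite table (hypothetical schema), queried one verse at a time.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE verses ('
    'book TEXT, chapter INTEGER, verse INTEGER, text TEXT, '
    'PRIMARY KEY (book, chapter, verse))'
)
conn.executemany('INSERT INTO verses VALUES (?, ?, ?, ?)', rows)

def lookup_sql(book, chapter, verse):
    """Return the verse text from the SQLite table, or None if absent."""
    cur = conn.execute(
        'SELECT text FROM verses WHERE book = ? AND chapter = ? AND verse = ?',
        (book, chapter, verse),
    )
    row = cur.fetchone()
    return row[0] if row else None

# Option 2: a dictionary keyed by (book, chapter, verse), round-tripped
# through pickle the way it would be when saved to and loaded from disk.
verse_dict = dict(((b, c, v), t) for b, c, v, t in rows)
loaded = pickle.loads(pickle.dumps(verse_dict))

def lookup_dict(book, chapter, verse):
    """Return the verse text from the unpickled dictionary, or None."""
    return loaded.get((book, chapter, verse))
```

One difference that does not depend on timing: the pickled dictionary has to
be loaded into memory in full before the first lookup, while SQLite reads only
what each query needs. Which one is actually faster for a whole Bible would
have to be measured.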
Thanks for your knowledge and help
Eduardo