[Tutor] How to set up an Array?

Tue May 12 07:46:43 CEST 2009

John Fouhy wrote:
> 2009/5/12 nickel flipper <nickelflipper at yahoo.com>:
>> sfr (key=PORTA addr=0xf80 size=1 access='rw rw rw u rw rw rw rw')
>>    reset (por='xxxxxxxx' mclr='uuuuuuuu')
>>    bit (names='RA7 RA6 RA5 - RA3 RA2 RA1 RA0' width='1 1 1 1 1 1 1 1')
>>    bit (tag=scl names='RA' width='8')
>>    bit (names='OSC1 OSC2 AN4 - AN3 AN2 AN1 AN0' width='1 1 1 1 1 1 1 1')
>>    bit (names='CLKI CLKO nSS1 - VREF_PLUS VREF_MINUS C2INA C1INA' width='1 1 1 1 1 1 1 1')
>>    bit (names='- - LVDIN - C1INB CVREF_MINUS PMPA7 PMPA6' width='1 1 1 1 1 1 1 1')
>>    bit (names='- - RCV - - C2INB RP1 RP0' width='1 1 1 1 1 1 1 1')
>>    bit (names='- - RP2 - - - - -' width='1 1 1 1 1 1 1 1')
> 
> Hmm, hairy!
> 
> You could try looking into regular expressions.  Let's see:
> 
>     \b\w+='.+?'
> 
> That should be a regular expression matching the beginning of a word,
> followed by one or more word characters, an =, a ', some characters,
> up to the first '.
> 
> i.e. we're trying to match things like "names='RA7 RA6 RA5 - RA3 RA2 RA1 RA0'"
> 
> So, here we go:
> 
>>>> import re
>>>> s = "   bit (names='RA7 RA6 RA5 - RA3 RA2 RA1 RA0' width='1 1 1 1 1 1 1 1')"
>>>> rex = re.compile(r"\b\w+='.+?'")
> 
> (note the r before the string.. that makes it a "raw" string and
> prevents backslashes from causing problems, mostly)
> 
>>>> def to_dict(s):
> ....   res = {}
> ....   for token in rex.findall(s):
> ....     word, rest = token.split('=')
> ....     rest = rest.replace("'", '')
> ....     res[word] = rest.split()
> ....   return res
> ....
> 
> (note that my regular expression, rex, is here effectively a global
> variable.  That's convenient in the interactive interpreter, but you
> may want to do that a little differently in your final script)
> 
>>>> to_dict(s)
> {'width': ['1', '1', '1', '1', '1', '1', '1', '1'], 'names': ['RA7',
> 'RA6', 'RA5', '-', 'RA3', 'RA2', 'RA1', 'RA0']}
> 
> If regular expressions aren't powerful enough, you could look into a
> full-fledged parsing library.  There are two I know of: simpleparse
> and pyparsing.  SimpleParse works well if you're familiar with writing
> grammars: you write a grammar in the usual style, and simpleparse
> makes a parser out of it.  pyparsing is more OO.
> 

A regular expression or a parsing library is certainly the easiest way
to do this. However, I would not recommend learning regexes or parser
libraries until you have the basics down; before that they will just
confuse you.

Work through the beginner tutorials in the documentation first. In
particular, you need to know that str.startswith is a method, so
startpin.startswith == "RA" or "RB" or ... or "RJ" does not do anything
useful: it compares the method itself (not the result of calling it)
with "RA", which is always False, and then or-s that with the string
"RB". Since a non-empty string is always true, the conditional is
always true.
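
To see this for yourself, here is a small sketch you can paste into the
interpreter. The value "OSC1" is just an example pin name I picked
because it clearly does not start with any of the port prefixes; the
list of prefixes is shortened for brevity.

```python
startpin = "OSC1"  # example value; does NOT start with "RA", "RB" or "RJ"

# This is what the original test really evaluates.
# startpin.startswith == "RA" compares a method object with a string,
# which is False, so `or` moves on and returns the first true operand.
buggy = startpin.startswith == "RA" or "RB" or "RJ"

print(repr(buggy))   # the string 'RB', not True/False
print(bool(buggy))   # True, even though "OSC1" matches no prefix
```

So an `if` built on that expression runs its body for every pin name.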

The correct way to do it (using .startswith) is to call the method and
pass it a tuple of prefixes:

startpin.startswith(("RA", "RB", "RC", "RD", "RE", "RF", "RG", "RH", "RJ"))
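
A minimal sketch of how that might look in a script. The names
PORT_PREFIXES and is_port_pin are my own, not from your code, and the
sample names are taken from the first "bit" line of your data.

```python
# Tuple of prefixes; str.startswith accepts a tuple and returns True
# if the string starts with any one of them.
PORT_PREFIXES = ("RA", "RB", "RC", "RD", "RE", "RF", "RG", "RH", "RJ")

def is_port_pin(name):
    """Return True if name begins with one of the port prefixes."""
    return name.startswith(PORT_PREFIXES)

names = 'RA7 RA6 RA5 - RA3 RA2 RA1 RA0'.split()
port_pins = [n for n in names if is_port_pin(n)]
print(port_pins)   # the "-" placeholder is filtered out
```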

Also, it is generally not a good idea to hardcode string indexes (e.g.
pin7[8:3]), as it will be a PITA if the format changes even slightly.
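
One way to avoid fixed indexes while sticking to basic string methods
is to let the punctuation in the line tell you where things are. The
following is only a sketch: parse_fields is a name I made up, and it
assumes each line has one pair of parentheses and that the values you
care about are in single quotes. Unquoted pairs such as tag=scl are
simply skipped.

```python
def parse_fields(line):
    """Return {key: [values, ...]} for every key='...' pair on the line."""
    fields = {}
    # Find the parentheses instead of assuming their positions.
    inner = line[line.index('(') + 1:line.rindex(')')]
    # Splitting on the quote gives: text-before, value, text-before, value, ...
    parts = inner.split("'")
    for before, value in zip(parts[0::2], parts[1::2]):
        # The key is the last word before the quote, minus its "=".
        key = before.split()[-1].rstrip('=')
        fields[key] = value.split()
    return fields

s = "   bit (names='RA7 RA6 RA5 - RA3 RA2 RA1 RA0' width='1 1 1 1 1 1 1 1')"
result = parse_fields(s)
print(result['names'])
print(result['width'])
```

If the leading whitespace or the order of the fields changes, this
still works, which is the point of not hardcoding the slices.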