Parsing a serial stream too slowly

Thomas Rachel nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915 at
Tue Jan 24 08:04:10 EST 2012

On 24.01.2012 00:13, Thomas Rachel wrote:

> And finally, you can make use of re.finditer(), or rather
> sensorre.finditer(). So you can do
>
>     sensorre = re.compile(r'\$(.)(.*?)\$') # note the change
>     theonebuffer = '$A1234$$B-10$$C987$' # for now
>     sensorresult = None # init it for later
>     for sensorresult in sensorre.finditer(theonebuffer):
>         sensor, value = sensorresult.groups()
>         # replace the self.SensorAValue concept with a dict
>         self.sensorvalues[sensor] = value
>     # and now, keep the rest
>     if sensorresult is not None:
>         # the for loop has done something - cut out the old stuff
>         # and keep a possible incomplete packet at the end
>         theonebuffer = theonebuffer[sensorresult.end():]
>
> This removes the mentioned string copying as a source of increased slowness.

But it has one major flaw: if you lose synchronization, the regex may
start matching at the "$" that *ends* a packet, so that only the data
*between* the packets is returned - which is mostly empty strings.
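
To make the failure mode concrete, here is a small sketch that runs the
quoted pattern on a buffer whose first packet has lost its beginning. The
truncated buffer and the helper name parse() are made up for illustration;
how exactly the damage shows up depends on the pattern and on where the
stream was cut.

```python
import re

sensorre = re.compile(r'\$(.)(.*?)\$')  # pattern from the quoted message

def parse(buffer):
    """Return (sensor values, unparsed rest) using the finditer approach."""
    values = {}
    last = None
    for last in sensorre.finditer(buffer):
        sensor, value = last.groups()
        values[sensor] = value
    # keep whatever follows the last match; keep everything if nothing matched
    rest = buffer[last.end():] if last is not None else buffer
    return values, rest

in_sync = parse('$A1234$$B-10$$C987$')
# hypothetical damaged input: the leading "$A1" never arrived
out_of_sync = parse('234$$B-10$$C987$')

print(in_sync)      # ({'A': '1234', 'B': '-10', 'C': '987'}, '')
print(out_of_sync)  # ({'$': 'B-10', 'C': '987'}, '')
```

In the damaged case the first match starts at the "$" that closes the
broken packet, so the empty gap is taken for a packet start and B's value
ends up under the key "$".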

So it would be wise either to change the firmware of the device to use 
different characters for starting and ending a packet, or to return 
all data between "$"s and discard the empty strings.
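
The first option could look roughly like this. The delimiters "<" and ">"
are purely an assumption (the real firmware uses "$" for both ends), as
are the names packetre and parse_framed(); the point is only that with
distinct delimiters the pattern cannot mistake a packet end for a packet
start.

```python
import re

# Hypothetical framing: "<" opens a packet, ">" closes it.
# Neither delimiter may appear inside the sensor name or the value.
packetre = re.compile(r'<([^<>])([^<>]*)>')

def parse_framed(buffer):
    """Return (sensor values, unparsed rest) for "<"/">" framed packets."""
    values = {}
    end = 0
    for match in packetre.finditer(buffer):
        values[match.group(1)] = match.group(2)
        end = match.end()
    # everything after the last complete packet may be an incomplete one
    return values, buffer[end:]

# damaged start, garbage between packets, incomplete packet at the end
result = parse_framed('234><B-10>xx<C987><D3')
print(result)  # ({'B': '-10', 'C': '987'}, '<D3')
```

If no complete packet is found, the whole buffer is kept, so a caller
would probably want to cap its length.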

As regexes might be overkill here, we could do

def splitup(theonebuffer):
    # Split at "$" and give every piece except the last its "$" back,
    # so the caller can tell complete pieces from an unterminated tail.
    parts = theonebuffer.split("$")
    for part in parts[:-1]:
        yield part + "$"
    yield parts[-1] # str.split() always returns at least one element

sensorvalues = {}
theonebuffer = '1garbage$A1234$$B-10$2garbage$C987$D3' # for now
for part in splitup(theonebuffer):
    if not part.endswith("$"):
        # it is the last one which is probably not a full packet
        theonebuffer = part
        break
    part = part[:-1] # cut off the $
    if not part:
        continue # $$ -> gap between packets
    # now part is the contents of one packet which may be valid or not.
    # TODO: Do some tests - maybe for valid sensor names and values.
    sensor = part[0]
    value = part[1:]
    sensorvalues[sensor] = value # add the "self." in your case -
    # for now, it is better without

Now I get sensorvalues, theonebuffer as ({'1': 'garbage', 'A': '1234', 
'2': 'garbage', 'B': '-10', 'C': '987'}, 'D3').
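
The garbage entries '1' and '2' are what the TODO in the loop is meant to
catch. A minimal sketch of such a test, assuming that sensor names are
single uppercase letters and values are signed integers (the thread does
not say so, and is_valid() is a made-up name):

```python
def is_valid(part):
    """Hypothetical check: one uppercase ASCII letter, then a signed integer."""
    if len(part) < 2 or not ("A" <= part[0] <= "Z"):
        return False
    try:
        int(part[1:])  # note: int() also accepts forms like "+5" or " 7"
    except ValueError:
        return False
    return True

# packet contents as they come out of the loop above, "$" already cut off
packets = ["1garbage", "A1234", "B-10", "2garbage", "C987"]
sensorvalues = {p[0]: p[1:] for p in packets if is_valid(p)}
print(sensorvalues)  # {'A': '1234', 'B': '-10', 'C': '987'}
```

A stricter check (for example a fixed set of known sensor letters) may
fit the real device better.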

D3 is not (yet) a value; it might come out as D3, D342 or whatever, as 
the packet is not complete yet.
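
Carrying the leftover over to the next read could be sketched as follows.
The function name feed() and the chunk boundaries are invented for
illustration, and the star-unpacking needs Python 3; in the real program
the chunks would come from the serial port.

```python
def feed(buffer, chunk, sensorvalues):
    """Append chunk to buffer, store complete packets, return the leftover."""
    buffer += chunk
    # everything before the last "$" is complete; the tail may be partial
    *complete, rest = buffer.split("$")
    for part in complete:
        if part:  # empty piece -> gap between packets
            sensorvalues[part[0]] = part[1:]
    return rest

sensorvalues = {}
rest = ""
leftovers = []
# hypothetical chunks, cut at arbitrary positions
for chunk in ["$A1234$$B-10$$C9", "87$$D3", "42$"]:
    rest = feed(rest, chunk, sensorvalues)
    leftovers.append(rest)

print(leftovers)     # ['C9', 'D3', '']
print(sensorvalues)  # {'A': '1234', 'B': '-10', 'C': '987', 'D': '342'}
```

Here "D3" stays in the buffer after the second chunk and only becomes the
value D = 342 once the closing "$" has arrived.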

