[Python-ideas] Support parsing stream with `re`
Ram Rachum
ram at rachum.com
Sun Oct 7 09:11:57 EDT 2018
Hi Cameron,
Thanks for putting in the time to study my problem and sketch a solution.
Unfortunately, it's not helpful. I was developing a solution similar to
yours before I came to the conclusion that a multiline regex would be more
elegant. I find this algorithm to be quite complicated. It's basically a
poor man's regex engine.
I'm more likely to use a shim to make the re package work on streams (like
regexy or reading chunks until I get a match) than to use an algorithm like
that.
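The "reading chunks until I get a match" shim could look roughly like the
sketch below. This is an illustration under assumptions, not a tested tool:
iter_stream_matches and its chunk_size parameter are hypothetical names, and
the "; layer N" comment format is taken from the slicer output described in the
quoted message. A match is only trusted once it ends before the end of the
buffer (or the stream is exhausted), because a match touching the buffer's end
might have extended with more data. That heuristic suits patterns that end at
an explicit terminator; a greedy pattern can still match differently than it
would on the whole file, and the buffer grows for as long as nothing matches.

```python
import io
import re


def iter_stream_matches(stream, pattern, chunk_size=64 * 1024):
    """Yield match objects for `pattern` from a text stream, chunk by chunk.

    Hypothetical helper. Match positions are relative to the internal
    buffer, not to the start of the stream.
    """
    buffer = ""
    eof = False
    while not eof:
        chunk = stream.read(chunk_size)
        if chunk:
            buffer += chunk
        else:
            eof = True
        pos = 0
        while pos <= len(buffer):
            m = pattern.search(buffer, pos)
            if m is None:
                break
            if not eof and m.end() == len(buffer):
                # The match touches the end of the buffer, so more data
                # could change it. Wait for the next chunk.
                break
            yield m
            # Step past empty matches so the loop always makes progress.
            pos = m.end() if m.end() > m.start() else m.end() + 1
        # Drop everything already consumed by yielded matches.
        buffer = buffer[pos:]


if __name__ == "__main__":
    sample = "; layer 1\nG1 X1\nG1 X2\n; layer 2\nG1 X3\n"
    # One match per layer: lazy body, terminated by the next layer comment
    # or the end of the data.
    layer = re.compile(r"; layer (\d+)\n(.*?)(?=; layer |\Z)", re.DOTALL)
    for m in iter_stream_matches(io.StringIO(sample), layer, chunk_size=7):
        print(m.group(1), repr(m.group(2)))
```

Memory use is bounded by the largest single match plus one chunk, as long as
matches keep arriving; anchors and lookbehinds see the buffer start as the
start of the string, which is one more way results can differ from a
whole-file search.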
Thanks,
Ram.
On Sun, Oct 7, 2018 at 9:58 AM Cameron Simpson <cs at cskk.id.au> wrote:
> On 07Oct2018 07:30, Ram Rachum <ram at rachum.com> wrote:
> >I'm doing multi-color 3d-printing. The slicing software generates a GCode
> >file, which is a text file of instructions for the printer, each command
> >meaning something like "move the head to coordinates x,y,z while extruding
> >plastic at a rate of w" and lots of other administrative commands. (Turn
> >the print fan on, heat the extruder to a certain temperature, calibrate the
> >printer, etc.)
> >
> >Here's an example of a simple GCode from a print I did last week:
> >https://www.dropbox.com/s/kzmm6v8ilcn0aik/JPL%20Go%20hook.gcode?dl=0
> >
> >It's 1.8MB in size. They could get to 1GB for complex prints.
> >
> >Multi-color prints means that at some points in the print, usually in a
> >layer change, I'm changing the color. This means I need to insert an M600
> >command, which tells the printer to pause the print, move the head around,
> >and give me a prompt to change the filament before continuing printing.
> >
> >I'm sick of injecting the M600 manually after every print. I've been doing
> >that for the last year. I'm working on a script so I could say "Insert an
> >M600 command after layers 5, 88 and 234, and also before process Foo."
> >
> >The slicer inserts comments saying "; layer 234" or "; process Foo". I want
> >to identify the entire layer as one match. That's because I want to find
> >the layer and possibly process at the start, I want to find the last
> >retraction command, the first extrusion command in the new layer, etc. So
> >it's a regex that spans potentially thousands of lines.
> >
> >Then I'll know just where to put my M600 and how much retraction to do
> >afterwards.
>
> Aha.
>
> Yeah, don't use a regexp for "the whole layer". I've fetched your file, and
> it is one instruction or comment per line. This is _easy_ to parse. Consider
> this totally untested sketch:
>
>   import re
>   import sys
>
>   layer_re = re.compile(r'^; layer (\d+), Z = (.*)')
>   with open("JPL.gcode") as gcode:
>       current_layer = None
>       for lineno, line in enumerate(gcode, 1):
>           m = layer_re.match(line)
>           if m:
>               # new layer
>               new_layer = int(m.group(1))
>               new_z = float(m.group(2))
>               if current_layer is not None:
>                   # process the saved previous layer
>                   ...
>               current_layer = new_layer
>               accrued_layer = []
>           if current_layer is not None:
>               # we're saving lines for later work
>               accrued_layer.append(line)
>               continue
>           # otherwise, copy the line straight out
>           sys.stdout.write(line)
>
> The idea is that you scan the data on a per-line basis, adjusting some state
> variables as you see important lines. If you're "saving" a chunk of lines
> such as the instructions in a layer (in the above code: "current_layer is
> not None") you can stuff just those lines into a list for use when complete.
>
> On changes of state you deal with what you may have saved, etc.
>
> But just looking at your examples, you may not need to save anything; just
> insert or append lines during the copy. Example:
>
>   import sys
>
>   with open("JPL.gcode") as gcode:
>       for lineno, line in enumerate(gcode, 1):
>           # pre line actions
>           if line.startswith('; process '):
>               print("M600 instruction...")
>           # copy out the line
>           sys.stdout.write(line)
>           # post line actions
>           if ...:
>               ...
>
> So you don't need to apply a regexp to a huge chunk of file. Process the
> file on an instruction basis and insert/append your extra instructions as
> you see the boundaries of the code you're after.
>
> A minor note. This incantation:
>
> for lineno, line in enumerate(gcode, 1):
>
> is to make it easy to print error messages which recite the file line
> number to aid debugging. If you don't need that you'd just run with:
>
> for line in gcode:
>
> Cheers,
> Cameron Simpson <cs at cskk.id.au>
>
>
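For comparison, the per-line approach quoted above can be reduced to a small
generator when the only goal is "insert a command after layers 5, 88 and 234".
This is a minimal sketch under assumptions: insert_filament_changes and its
parameters are hypothetical names, the "; layer N" comment format is taken from
the regex in the quoted sketch, and a layer is assumed to end just before the
next layer comment or at the end of input. It does not address where the
command should sit relative to the last retraction or first extrusion.

```python
import io
import re

# Matches the slicer's layer comments, e.g. "; layer 234, Z = 46.8".
LAYER_RE = re.compile(r"^; layer (\d+)")


def insert_filament_changes(lines, change_after_layers, command="M600\n"):
    """Copy `lines` through, emitting `command` at the end of each listed layer."""
    current_layer = None
    for line in lines:
        m = LAYER_RE.match(line)
        if m:
            # A new layer starts here, so the previous one has just ended.
            if current_layer in change_after_layers:
                yield command
            current_layer = int(m.group(1))
        yield line
    # The last layer has no following layer comment; close it at end of input.
    if current_layer in change_after_layers:
        yield command


if __name__ == "__main__":
    sample = "G28\n; layer 1, Z = 0.2\nG1 X1\n; layer 2, Z = 0.4\nG1 X2\n"
    for out_line in insert_filament_changes(io.StringIO(sample), {1}):
        print(out_line, end="")
```

Only one line is held in memory at a time, so file size does not matter; the
trade-off is that any decision needing a view of the whole layer requires
extra state variables or buffering of that layer's lines.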