Bash-like brace expansion

Peter Waller peter.waller at gmail.com
Tue Mar 24 08:28:17 EDT 2009


Okay, I got fed up with there not being any (obvious) good examples of
how to do bash-like brace expansion in Python, so I wrote it myself.
Here it is for all to enjoy!

If anyone has any better solutions or any other examples of how to do
this, I'd be glad to hear from them.

#~ BraceExpand.py - Bash-like brace expansion in Python
#~ Copyright (C) 2009 <peter.waller at gmail.com>

#~ This program is free software: you can redistribute it and/or
modify
#~ it under the terms of the GNU Affero General Public License as
#~ published by the Free Software Foundation, either version 3 of the
#~ License, or (at your option) any later version.

#~ This program is distributed in the hope that it will be useful,
#~ but WITHOUT ANY WARRANTY; without even the implied warranty of
#~ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#~ GNU Affero General Public License for more details.

#~ You should have received a copy of the GNU Affero General Public
License
#~ along with this program.  If not, see <http://www.gnu.org/licenses/
>.

import re
from itertools import imap

class NoBraces(Exception): pass

def BraceExpand(text, ordering = []):
    """Recursively brace expand text in a bash-like fashion.

    See 'man 2 bash' under the heading 'Brace Expansion'

    Example: "/data/file{1..10}.root" expands to file1.root ...
file9.root

    Ordering allows changing the order of iteration. The rules for
this are
    a little complicated. Ordering is a list of boolean values.
    If it is possible to expand the postamble, the first value from
the list
    is removed, and used to determine whether the postamble should be
iterated
    over before iterating over the amble. The list is then passed
down
    recursively to the next Brace.

    What does this code do?

    It is simpler than it looks.

    There are three main steps:
      1) Split the string into three parts, consisting of pre-amble,
amble and
         post-amble. (This requires keeping track of nested {}'s.)
      2) In the amble, figure out which commas are not stuck in {}'s,
and
         split the string by those commas.
      3) For each part of this split string, Add together the pre-
amble, the
         string and the post-amble. Call BraceExpand() on this string
to deal
         with nested braces and any expansion required in the
postamble

    Other things this code does which make it look more complicated:
      * Expand ranges along the lines of {1..10}
      * Allow for re-arranging the order of iteration

    Todo/Not Implemented:
      * Escaping/quoting

    Example C implementation from bash (inspiration for this):
        http://www.oldlinux.org/Linux.old/bin/old/bash-1.11/braces.c
    """

    def FindMatchedBraces(text, position = -1):
        "Search for nested start and end brace in text starting from
position"
        braceDepth = 0
        nextClose = -1
        first = True

        # Search for a {
        # is it closer than the nearest } ?
        #   Yes : increase brace depth
        #   No  : decrease brace depth
        # When we reach braceDepth == 0, we have found our close brace
        while braceDepth or first:

            nextOpen = text.find("{", position+1)
            nextClose = text.find("}", position+1)

            if first and nextOpen >= 0:
                startBrace = nextOpen
            first = False

            if nextOpen < nextClose and nextOpen >= 0:
                braceDepth += 1
                position = nextOpen
            elif nextClose >= 0:
                braceDepth -= 1
                position = nextClose
            else:
                raise NoBraces()

        return startBrace, position

    try: start, end = FindMatchedBraces(text)
    except NoBraces:
        # There are no braces! Nothing to expand!
        return [text]

    # Split the text into three bits, '{pre,,post}amble'. The 'pre' is
anything
    # before expansion, the '' is the bit that needs expanding and
gluing to
    # the pre and the post. After gluing together, we can recursively
expand
    # again
    preamble = text[:start]
    amble = text[start+1:end]
    postamble = text[end+1:]

    def BareCommaSearcher(amble):
        "Search for commas which are not encapsulated in {}"

        haveBraces = True
        try: start, end = FindMatchedBraces(amble)
        except NoBraces: haveBraces = False

        position = -1
        while True:
            position = amble.find(",", position+1)

            if position < 0:
                # We didn't find any comma after 'position', finish
searching.
                break

            if haveBraces and start < position < end:
                # We're inside some braces, skip to the end of them,
find the
                # next set.
                position = end

                haveBraces = True
                try: start, end = FindMatchedBraces(amble, position)
                except NoBraces: haveBraces = False

                continue

            yield position

        # Reached the end of the string. (Conveniently
"text"[pos:None] is the
        # same as "text"[pos:]
        yield None

    parts = []
    rangeMatcher = re.compile(r"^(\d+)..(\d+)$")

    # Find each segment of the amble, splitting by 'bare' commas
    position = 0
    for commaPos in BareCommaSearcher(amble):
        # Get a bit between bare commas
        part = amble[position:commaPos]

        # Found a matched range, expand it!
        matchedRange = rangeMatcher.match(part)
        if matchedRange:
            matchedIndices = map(int, matchedRange.groups())
            parts.extend(imap(str,xrange(*matchedIndices)))
        else:
            parts.append(part)

        if not commaPos:
            # Avoid None + 1 at the end of iteration
            break

        position = commaPos + 1

    # Is the postamble expandable? (Does it contain braces?)
    postambleExpandable = True
    try: start, end = FindMatchedBraces(postamble)
    except NoBraces: postambleExpandable = False

    # Check to see if we should iterate over postamble first, also
sort out
    # iteration ordering
    postambleFirst = ordering
    thisPostambleFirst = False
    if postambleExpandable and postambleFirst:
        postambleFirst = list(postambleFirst)
        thisPostambleFirst = postambleFirst.pop(0)

    result = []

    if postambleExpandable and thisPostambleFirst:
        # Iterate of expanded postamble first
        for postpart in BraceExpand(postamble):
            for part in parts:
                part = "".join([preamble, part, postpart])
                result.extend(BraceExpand(part, postambleFirst))

    else:
        for part in parts:
            part = "".join([preamble, part, postamble])
            result.extend(BraceExpand(part, postambleFirst))

    return result

if __name__ == "__main__":
    from pprint import pprint
    pprint(BraceExpand("electron_{n,{pt,eta,phi}[{1,2}]}", ordering =
[True]))

    #~ pprint(BraceExpand("Myfile{1,3..10}.root"))

    #~ pprint(BraceExpand("{pre,,post}amble"))

    #~ pprint(BraceExpand("amble{a,b,}}"))



More information about the Python-list mailing list