do a sed / awk filter with python tools (at least as fast)
Peter Otten
__peter__ at web.de
Mon Jul 7 16:04:52 EDT 2008
Mathieu Prevot wrote:
> I use in a bourne shell script the following filter:
>
> sed '/watch?v=/! d;s/.*v=//;s/\(.\{11\}\).*/\1/' \
> | sort | uniq | awk 'ORS=" "{print $1}'
>
> that give me all sets of 11 characters that follows the "watch?v="
> motif. I would like to do it in python on stdout from a
> subprocess.Popen instance, using python tools rather than sed awk etc.
> How can I do this ? Can I expect something as fast ?
You should either do it in Python, e.g.:
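def process(lines):
    # Split each line at the marker; c is the text after "/watch?v=".
    candidates = (line.rstrip().partition("/watch?v=") for line in lines)
    # Keep the first 11 characters of every tail that is long enough.
    matches = (c[:11] for a, b, c in candidates if len(c) >= 11)
    print(" ".join(sorted(set(matches))))

if __name__ == "__main__":
    import sys
    process(sys.stdin)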
or invoke your shell script via subprocess.Popen(). Invoking a Python script
via subprocess doesn't make sense IMHO.
Peter
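
For the case in the question, where the lines come from the stdout of a subprocess.Popen instance, the same filter can be applied to the pipe directly. What follows is only a minimal sketch in Python 3 syntax, not code from the thread: the extract_ids name, its marker and id_len parameters, and the child command are all hypothetical, and the child merely stands in for whatever program really produces the input.

```python
import subprocess
import sys


def extract_ids(lines, marker="/watch?v=", id_len=11):
    """Return the sorted, de-duplicated IDs that follow `marker` in `lines`."""
    ids = set()
    for line in lines:
        # partition() yields an empty tail when the marker is absent,
        # so non-matching lines fail the length check below.
        _, _, tail = line.rstrip().partition(marker)
        if len(tail) >= id_len:
            ids.add(tail[:id_len])
    return sorted(ids)


if __name__ == "__main__":
    # Hypothetical child process standing in for the real producer.
    child_cmd = [
        sys.executable,
        "-c",
        "print('http://example.invalid/watch?v=abcdefghijk&x=1')",
    ]
    # text=True makes proc.stdout yield str lines instead of bytes.
    with subprocess.Popen(child_cmd, stdout=subprocess.PIPE, text=True) as proc:
        print(" ".join(extract_ids(proc.stdout)))
```

Iterating over proc.stdout consumes the output line by line as the child writes it, so the set only ever holds the distinct IDs rather than the whole input.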