Here are my elaborate &quot;slides&quot; from yesterday:<br><br>This is a random collection of topics related to Python tools.<br><br>Talk about the UNIX philosophy:<br>    Small tools.<br>    My problems tend to be too large for RAM, but not too big for one machine.<br>

    UNIX and batch processing are a natural fit.<br>    Multiple processes = multiple CPUs.<br>    Multiple programming languages = more flexibility.<br>    Pipes = concurrency without the pain.<br>    Scales linearly and predictably, unlike databases.<br>

    UNIX tools that already exist are helpful and fast.<br><br>Use the optparse module to provide consistent command line APIs:<br>    Here&#39;s an example of the setup from the docs:<br>        : from optparse import OptionParser<br>

        : parser = OptionParser()<br>        : parser.add_option(&quot;-f&quot;, &quot;--file&quot;, dest=&quot;filename&quot;,<br>        :                   help=&quot;write report to FILE&quot;, metavar=&quot;FILE&quot;)<br>

        : parser.add_option(&quot;-q&quot;, &quot;--quiet&quot;,<br>        :                   action=&quot;store_false&quot;, dest=&quot;verbose&quot;, default=True,<br>        :                   help=&quot;don&#39;t print status messages to stdout&quot;)<br>

        : (options, args) = parser.parse_args()<br>    Here&#39;s an example of my own help text:<br>        : Usage: cleancuttsv.py [options]<br>        : <br>        : Options:<br>        :   -h, --help            show this help message and exit<br>

        :   --assert-head=FIELD1\tFIELD2\t...<br>        :                         assert that the first line of the file matches this<br>        :   --delete-head         delete the first line of input<br>        :   -n NUM, --num-fields=NUM<br>

        :                         assert that there are this many fields per line<br>        :   --drop-blank-lines    delete blank lines instead of raising an error<br>        :<br><br>sort:<br>    <a href="http://jjinux.blogspot.com/2008/08/python-sort-uniq-c-via-subprocess.html">http://jjinux.blogspot.com/2008/08/python-sort-uniq-c-via-subprocess.html</a><br>

    sort -S 20% -T /mnt/some_other_drive ...<br>    <a href="http://jjinux.blogspot.com/2008/08/python-memory-conservation-tip-sort.html">http://jjinux.blogspot.com/2008/08/python-memory-conservation-tip-sort.html</a><br>
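    A minimal sketch of driving sort | uniq -c from Python 3 via subprocess, so the sorting happens outside the Python process. The function name sort_uniq_c and its sort_args parameter are made up for illustration; with GNU sort, extra flags such as -S (buffer size) and -T (temp directory) could be passed through sort_args.<br>

```python
import os
import subprocess


def sort_uniq_c(lines, sort_args=()):
    """Pipe lines through `sort | uniq -c` and return (count, line) pairs.

    sort_args is a hypothetical hook for extra sort flags, for example
    ["-S", "20%", "-T", "/mnt/some_other_drive"] with GNU sort.
    """
    # LC_ALL=C gives plain byte-wise ordering in both child processes.
    env = dict(os.environ, LC_ALL="C")
    sort = subprocess.Popen(
        ["sort"] + list(sort_args),
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, env=env, text=True)
    uniq = subprocess.Popen(
        ["uniq", "-c"],
        stdin=sort.stdout, stdout=subprocess.PIPE, env=env, text=True)
    # Drop our copy of the pipe so only uniq holds the read end.
    sort.stdout.close()

    # sort consumes all of its input before it emits anything, so it is
    # safe to finish writing before we start reading from uniq.
    for line in lines:
        sort.stdin.write(line + "\n")
    sort.stdin.close()

    results = []
    for out in uniq.stdout:
        # uniq -c prints a right-aligned count, one space, then the line.
        count, value = out.rstrip("\n").lstrip().split(" ", 1)
        results.append((int(count), value))
    uniq.stdout.close()

    if sort.wait() != 0 or uniq.wait() != 0:
        raise RuntimeError("sort | uniq -c failed")
    return results


if __name__ == "__main__":
    for count, value in sort_uniq_c(["b", "a", "b"]):
        print(count, value)
```

    The input never has to fit in Python&#39;s memory as long as lines is a generator; sort spills to temporary files on its own.<br>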

<br>tsv:<br>    You need a consistent format.<br>    Downsides:<br>        Most UNIX tools don&#39;t understand true TSV, but only an approximation thereof:<br>            My own code raises an exception in cases where it would actually matter.<br>

        Many UNIX tools are ignorant of encoding issues:<br>            Sometimes playing dumb works and sometimes it hurts.<br>    Using the csv module:<br>        : import csv<br>        : <br>        : DEFAULT_KARGS = dict(dialect=&#39;excel-tab&#39;, lineterminator=&#39;\n&#39;)<br>

        : MYSQL_LOAD_DATA_INFILE_DESC = &quot;&quot;&quot;\<br>        :     FIELDS TERMINATED BY &#39;\t&#39;<br>        :            OPTIONALLY ENCLOSED BY &#39;&quot;&#39;<br>        :            ESCAPED BY &#39;&#39;<br>

        :     LINES TERMINATED BY &#39;\n&#39;&quot;&quot;&quot;<br>        : <br>        : def create_default_reader(iterable):<br>        :     &quot;&quot;&quot;Return a csv.reader with our default options.&quot;&quot;&quot;<br>

        :     return csv.reader(iterable, **DEFAULT_KARGS)<br>        : ...<br>    Using mysqlimport:<br>        : mysqlimport \<br>        :     --user=$MYSQL_USERNAME \<br>        :     --password=$MYSQL_PASSWORD \<br>        :     --columns=id,name \<br>

        :     --fields-optionally-enclosed-by=&#39;&quot;&#39; \<br>        :     --fields-terminated-by=&#39;\t&#39; \<br>        :     --fields-escaped-by=&#39;&#39; \<br>        :     --lines-terminated-by=&#39;\n&#39; \<br>

        :     --local \<br>        :     --lock-tables \<br>        :     --replace \<br>        :     --verbose \<br>        :     $DATABASE ${BUILD}/sometable.tsv<br>        To see warnings:<br>            <a href="http://jjinux.blogspot.com/2009/03/mysql-encoding-hell.html">http://jjinux.blogspot.com/2009/03/mysql-encoding-hell.html</a><br>
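    A small round-trip sketch of the csv options above, assuming Python 3. It reuses the DEFAULT_KARGS settings from these notes; the helper names rows_to_tsv and tsv_to_rows are hypothetical. It shows what separates true TSV from the approximation most UNIX tools assume: a field containing a tab or a quote gets enclosed in double quotes.<br>

```python
import csv
import io

# Same options as DEFAULT_KARGS in the notes above.
DEFAULT_KARGS = dict(dialect="excel-tab", lineterminator="\n")


def rows_to_tsv(rows):
    """Hypothetical helper: serialize rows to a TSV string."""
    buf = io.StringIO(newline="")
    csv.writer(buf, **DEFAULT_KARGS).writerows(rows)
    return buf.getvalue()


def tsv_to_rows(text):
    """Hypothetical helper: parse a TSV string back into lists of fields."""
    return list(csv.reader(io.StringIO(text, newline=""), **DEFAULT_KARGS))


rows = [
    ["1", "plain"],
    ["2", "has\ttab"],        # embedded tab forces quoting
    ["3", 'has "quote"'],     # embedded quote is doubled and enclosed
]
text = rows_to_tsv(rows)

if __name__ == "__main__":
    print(text, end="")
    print(tsv_to_rows(text) == rows)
```

    A tool like cut -f2 would split the second row in the wrong place, which is the kind of case where it pays to raise an error instead of guessing.<br>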

<br>Show pdb in the context of a web app:<br>    : import pdb<br>    : from pprint import pprint<br>    : pdb.set_trace()<br>    : pprint(request.environ)<br>    <a href="http://localhost:5000/api/ratio">http://localhost:5000/api/ratio</a><br>