<div dir="ltr"><div>greetings,</div><div><br></div><div>i'm writing a program to scan a data file. from each line of the data file i'd like to add something like below to a dictionary. my perl background makes me want python to autovivify, but when i do:</div><div><br></div><div> file_data = {}</div><div><br></div><div> [... as i loop through lines in the file ...]</div><div><br></div><div> file_data[ md5sum ][ inode ] = { 'path' : path, 'size' : size, }</div><div><br></div><div>i get:</div><div><br></div><div>Traceback (most recent call last):</div><div> File "foo.py", line 45, in &lt;module&gt;</div><div> file_data[ md5sum ][ inode ] = { 'path' : path, 'size' : size, }</div><div>KeyError: '91b152ce64af8af91dfe275575a20489'</div><div><br></div><div>what is the pythonic way to build a nested "file_data" data structure like the one above?</div><div><br></div><div>on <a href="http://en.wikipedia.org/wiki/Autovivification">http://en.wikipedia.org/wiki/Autovivification</a> there is a section on how to do autovivification in python, but i want to learn how a python programmer would normally build a data structure like this.</div><div><br></div><div>here is the code so far:</div><div><br></div><div>#!/usr/bin/python</div><div><br></div><div>import argparse</div><div>import os</div><div><br></div><div>ASCII_NUL = chr(0)</div><div><br></div><div>HOSTNAME = 0</div><div>MD5SUM = 1</div><div>FSDEV = 2</div><div>INODE = 3</div><div>NLINKS = 4</div><div>SIZE = 5</div><div>PATH = 6</div><div><br></div><div>file_data = {}</div><div><br></div><div>if __name__ == "__main__":</div><div> parser = argparse.ArgumentParser(description='scan files in a tree and print a line of information about each regular file')</div><div> parser.add_argument('--file', '-f', required=True, help='File from which to read data')</div><div> parser.add_argument('--field-separator', '-s', default=ASCII_NUL, help='Specify the string to use as a field separator in output. 
The default is the ascii nul character.')</div><div> args = parser.parse_args()</div><div><br></div><div> file = args.file</div><div> field_separator = args.field_separator</div><div><br></div><div> with open( file, 'rb' ) as f:</div><div> for line in f:</div><div> line = line.rstrip('\n')</div><div> if line == 'None': continue</div><div> fields = line.split( ASCII_NUL )</div><div><br></div><div> hostname = fields[ HOSTNAME ]</div><div> md5sum = fields[ MD5SUM ]</div><div> fsdev = fields[ FSDEV ]</div><div> inode = fields[ INODE ]</div><div> nlinks = int( fields[ NLINKS ] )</div><div> size = int( fields[ SIZE ] )</div><div> path = fields[ PATH ]</div><div><br></div><div> if size < ( 100 * 1024 * 1024 ): continue</div><div><br></div><div> ### print "'%s' '%s' '%s' '%s' '%s' '%s' '%s'" % ( hostname, md5sum, fsdev, inode, nlinks, size, path, )</div><div><br></div><div> file_data[ md5sum ][ inode ] = { 'path' : path, 'size' : size, }</div><div><br></div><div>thanks,</div><div>david</div><div>-- <br><div dir="ltr">Our decisions are the most important things in our lives.<div>***</div><div>Live in a world of your own, but always welcome visitors.</div></div>
</div></div>
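for reference, here is a minimal sketch of two common ways to build the md5sum -> inode -> info structure described above without a KeyError. it assumes the fields have already been parsed from each line; the `records` list and its values are hypothetical sample data standing in for the parsed file, and `plain` is just an illustrative name. `collections.defaultdict` and `dict.setdefault` are both in the standard library.

```python
from collections import defaultdict

# Hypothetical sample records standing in for lines parsed from the data
# file: (md5sum, inode, size, path). The values are made up for illustration.
records = [
    ("91b152ce64af8af91dfe275575a20489", "1001", 200 * 1024 * 1024, "/data/a.iso"),
    ("91b152ce64af8af91dfe275575a20489", "1002", 200 * 1024 * 1024, "/backup/a.iso"),
]

# Option 1: defaultdict(dict) creates an empty inner dict the first time
# an unseen md5sum is used as a key, so the nested assignment just works.
file_data = defaultdict(dict)
for md5sum, inode, size, path in records:
    file_data[md5sum][inode] = {'path': path, 'size': size}

# Option 2: a plain dict with setdefault, which returns the existing inner
# dict for md5sum or inserts (and returns) a new empty one.
plain = {}
for md5sum, inode, size, path in records:
    plain.setdefault(md5sum, {})[inode] = {'path': path, 'size': size}
```

both loops end up with the same nested contents. one thing to keep in mind with the defaultdict version is that merely reading a missing md5sum key also inserts an empty inner dict, so membership tests are better written as `md5sum in file_data` than as a lookup.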