<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<blockquote type="cite">
<div><br>
</div>
<div>Can you give an example of how these data structures look
after reading only the first 5 lines?</div>
</blockquote>
Sure, here you go:<br>
<br>
In [38]: mpef._ustore._store<br>
Out[38]: defaultdict(&lt;type 'dict'&gt;, {'Measurement':
{'8991c2dc67a49b909918477ee4efd767':
&lt;micropheno.exchangeformat.Exceptions.FileContext object at
0x2f0fe90&gt;, '7b38b429230f00fe4731e60419e92346':
&lt;micropheno.exchangeformat.Exceptions.FileContext object at
0x2f0fad0&gt;, 'b53531471b261c44d52f651add647544':
&lt;micropheno.exchangeformat.Exceptions.FileContext object at
0x2f0f4d0&gt;, '44ea6d949f7c8c8ac3bb4c0bf4943f82':
&lt;micropheno.exchangeformat.Exceptions.FileContext object at
0x2f0f910&gt;, '0de96f928dc471b297f8a305e71ae3e1':
&lt;micropheno.exchangeformat.Exceptions.FileContext object at
0x2f0f550&gt;}})<br>
<br>
In [39]:
mpef._ustore._store['Measurement']['b53531471b261c44d52f651add647544'].typeStr<br>
Out[39]: 'Measurement'<br>
<br>
In [40]:
mpef._ustore._store['Measurement']['b53531471b261c44d52f651add647544'].lineNumber<br>
Out[40]: 5<br>
<br>
In [41]: mpef._ustore._idstore<br>
Out[41]: defaultdict(&lt;class
'micropheno.exchangeformat.KBaseID.IDStore'&gt;, {'Measurement':
&lt;micropheno.exchangeformat.KBaseID.IDStore object at
0x2f0f950&gt;})<br>
<br>
In [43]: mpef._ustore._idstore['Measurement']._SIDstore<br>
Out[43]: defaultdict(&lt;function &lt;lambda&gt; at 0x2ece7d0&gt;,
{'emailRemoved': defaultdict(&lt;function &lt;lambda&gt; at
0x2c4caa0&gt;, {'microPhenoShew2011': defaultdict(&lt;type
'dict'&gt;, {0: {'MLR_124572462':
'8991c2dc67a49b909918477ee4efd767', 'MLR_124572161':
'7b38b429230f00fe4731e60419e92346', 'SMMLR_12551352':
'b53531471b261c44d52f651add647544', 'SMMLR_12551051':
'0de96f928dc471b297f8a305e71ae3e1', 'SMMLR_12550750':
'44ea6d949f7c8c8ac3bb4c0bf4943f82'}})})})<br>
<br>
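To make the layout easier to follow, here is a minimal sketch of how two
lookups like these could be built. It is not the real micropheno code: the
FileContext class below is a stand-in for the real one, index_lines() and
the sample line contents are made up for illustration, and the id store is
flattened to a single level instead of the nested user/source levels shown
in Out[43]. Counting lines from 1 is also an assumption.<br>
<pre>
```python
import hashlib
from collections import defaultdict


class FileContext(object):
    """Stand-in record; __slots__ avoids a per-instance __dict__."""
    __slots__ = ("typeStr", "lineNumber")

    def __init__(self, typeStr, lineNumber):
        self.typeStr = typeStr
        self.lineNumber = lineNumber


def index_lines(lines, type_str="Measurement"):
    """Hypothetical helper: build both lookups one line at a time."""
    store = defaultdict(dict)     # type, then md5 of line, then FileContext
    id_store = defaultdict(dict)  # type, then first token of line, then md5
    # enumerate() pulls lines lazily, so a filehandle works as input.
    for line_number, line in enumerate(lines, 1):
        line = line.rstrip("\n")
        if not line:
            continue
        digest = hashlib.md5(line.encode("utf-8")).hexdigest()
        source_id = line.split()[0]
        store[type_str][digest] = FileContext(type_str, line_number)
        id_store[type_str][source_id] = digest
    return store, id_store


# Made-up line contents; only the ids come from the session above.
sample = [
    "MLR_124572462\tvalue one\n",
    "MLR_124572161\tvalue two\n",
    "SMMLR_12551352\tvalue three\n",
]
store, id_store = index_lines(sample)
digest = id_store["Measurement"]["SMMLR_12551352"]
print(store["Measurement"][digest].lineNumber)  # prints 3
```
</pre>
In this sketch only the ids, the md5 strings and the small FileContext
objects are kept; the line text itself is dropped after hashing.<br>
<br>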
-MrsE<br>
<br>
On 9/25/2012 4:33 AM, Oscar Benjamin wrote:
<blockquote
cite="mid:CAHVvXxTOZzjv__r-TgOUdJNnYBuAUUc-yKRPvQF82XQocNr9pQ@mail.gmail.com"
type="cite">
<div class="gmail_quote">On 25 September 2012 00:58, Junkshops <span
dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:junkshops@gmail.com" target="_blank">junkshops@gmail.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi Tim, thanks for the response.
<div class="im"><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
- check how you're reading the data: are you iterating
over<br>
the lines a row at a time, or are you using<br>
.read()/.readlines() to pull in the whole file and then<br>
operate on that?<br>
</blockquote>
</div>
I'm using enumerate() on an iterable input (which in this case
is the filehandle).
<div class="im"><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
- check how you're storing them: are you holding onto
more<br>
than you think you are?<br>
</blockquote>
</div>
I've used ipython to look through my data structures (without
going into ungainly detail, 2 dicts with X key/value pairs
each, where X = number of lines in the file), and
everything seems to be working correctly. Like I say, heapy
output looks reasonable - I don't see anything surprising
there. In one dict I'm storing an id string (the first token in
each line of the file) as the key, with the md5 of the contents
of the line as the value (again, without going into massive
detail). The second dict has the md5 as the key and, as the
value, an object with __slots__ set that stores the line number
of the file and the type of object that line represents.</blockquote>
<div><br>
</div>
<div>Can you give an example of how these data structures look
after reading only the first 5 lines?</div>
<div><br>
</div>
<div>Oscar</div>
</div>
</blockquote>
</body>
</html>