[Tutor] Simultaneous read and write on file
Peter Otten
__peter__ at web.de
Tue Jan 19 04:52:20 EST 2016
Anshu Kumar wrote:
> Hello All,
>
> Thanks so much for your response.
>
> Here is my actual scenario: I have a CSV file that will already be
> present. I need to read it and remove some rows based on some logic. I
> originally wrote two separate file opens, which I think was nice and clean.
>
> Actual code:
>
> with open(file_path, 'rb') as fr:
>     for row in csv.DictReader(fr):
>         # Skip those segments which are part of overridden_ids
>         if row['id'] not in overriden_ids:
Oops, a typo ("overriden_ids"); so probably not your actual code :(
>             segments[row['id']] = {
>                 'id': row['id'],
>                 'attrib': json.loads(row['attrib']),
>                 'stl': json.loads(row['stl']),
>                 'meta': json.loads(row['meta']),
>             }
>
> # Rewriting the file with deduplicated segments
> with open(file_path, 'wb') as fw:
>     writer = csv.UnicodeWriter(fw)
>     writer.writerow(["id", "attrib", "stl", "meta"])
>     for seg in segments.itervalues():
>         writer.writerow([seg['id'], json.dumps(seg["attrib"]),
>                          json.dumps(seg["stl"]), json.dumps(seg["meta"])])
>
>
> I have got review comments asking me to improve this block by using
> just a single file open and minimum memory usage.
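If the rows to be dropped are fully determined by overridden_ids, you
don't need the segments dict at all: stream every surviving row into a
temporary file and swap it in when you are done. A rough, untested
sketch (Python 2, reusing file_path and overridden_ids from your
snippet, with plain csv.writer standing in for your UnicodeWriter):

    import csv
    import os
    import tempfile

    # Stream surviving rows into a temp file in the same directory,
    # then atomically replace the original. Nothing accumulates in
    # memory.
    dirname = os.path.dirname(file_path) or '.'
    fd, tmp_path = tempfile.mkstemp(dir=dirname, suffix='.csv')
    with open(file_path, 'rb') as fr, os.fdopen(fd, 'wb') as fw:
        writer = csv.writer(fw)
        writer.writerow(["id", "attrib", "stl", "meta"])
        for row in csv.DictReader(fr):
            if row['id'] not in overridden_ids:
                # The json.loads()/json.dumps() round trip is
                # unnecessary when the fields pass through unchanged.
                writer.writerow([row['id'], row['attrib'],
                                 row['stl'], row['meta']])
    os.rename(tmp_path, file_path)

That is a single pass in near-constant memory; the temporary file is
the price of not destroying the original if something fails halfway
through the rewrite. It does assume, though, that every id to drop is
known before the loop starts, which brings me to a question: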
Are the duplicate ids stored in overridden_ids, or are they implicitly
removed by overwriting them in segments[row["id"]] = ...? If the latter,
does it matter whether the last or the first row with a given id is kept?
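For what it's worth, the dict assignment keeps the last row seen for a
given id. If keeping the first one is acceptable, the streaming approach
above still works: remember the ids already written in a set instead of
storing whole segments. Another untested sketch of just the loop, with a
hypothetical "seen" set and the same writer as above:

    seen = set()  # ids already written out
    for row in csv.DictReader(fr):
        if row['id'] in overridden_ids or row['id'] in seen:
            continue  # drop overridden ids and later duplicates
        seen.add(row['id'])
        writer.writerow([row['id'], row['attrib'],
                         row['stl'], row['meta']])

If the last row has to win you cannot decide a row's fate until the
whole file has been read, so a single streaming pass won't do; you would
need either the dict you already have or two passes over the input.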