[Tutor] Regular Expression help

Mike Hansen Mike.Hansen at atmel.com
Wed Jun 27 16:24:22 CEST 2007


 

> -----Original Message-----
> From: tutor-bounces at python.org 
> [mailto:tutor-bounces at python.org] On Behalf Of Gardner, Dean
> Sent: Wednesday, June 27, 2007 3:59 AM
> To: tutor at python.org
> Subject: [Tutor] Regular Expression help
> 
> Hi 
> 
> I have a text file that I would like to split up so that I 
> can use it in Excel to filter a certain field. However as it 
> is a flat text file I need to do some processing on it so 
> that Excel can correctly import it.
> 
> File Example: 
> tag             desc                    VR      VM 
> (0012,0042) Clinical Trial Subject Reading ID LO 1 
> (0012,0050) Clinical Trial Time Point ID LO 1 
> (0012,0051) Clinical Trial Time Point Description ST 1 
> (0012,0060) Clinical Trial Coordinating Center Name LO 1 
> (0018,0010) Contrast/Bolus Agent LO 1 
> (0018,0012) Contrast/Bolus Agent Sequence SQ 1 
> (0018,0014) Contrast/Bolus Administration Route Sequence SQ 1 
> (0018,0015) Body Part Examined CS 1 
> 
> What I essentially want is to use python to process this file 
> to give me 
> 
> 
> (0012,0042); Clinical Trial Subject Reading ID; LO; 1 
> (0012,0050); Clinical Trial Time Point ID; LO; 1 
> (0012,0051); Clinical Trial Time Point Description; ST; 1 
> (0012,0060); Clinical Trial Coordinating Center Name; LO; 1 
> (0018,0010); Contrast/Bolus Agent; LO; 1 
> (0018,0012); Contrast/Bolus Agent Sequence; SQ; 1 
> (0018,0014); Contrast/Bolus Administration Route Sequence; SQ; 1 
> (0018,0015); Body Part Examined; CS; 1 
> 
> so that I can import to excel using a delimiter. 
> 
> This file is extremely long and all I essentially want to do 
> is to break it into its 'fields' 
> 
> Now I suspect that regular expressions are the way to go but 
> I have only basic experience of using these and I have no 
> idea what I should be doing.
> 
> Can anyone help? 
> 
> Thanks 
> 

Hmmmm... You might be able to do this without regular expressions. You
can split the row on spaces, which gives you a list. Pull the tag off the
front and the VM and VR off the end, join what's left with spaces to get
the description, then rebuild the row with your delimiter between the
four fields.

In [63]: row = "(0012,0042) Clinical Trial Subject Reading ID LO 1"

In [64]: row_items = row.split(' ')

In [65]: row_items
Out[65]: ['(0012,0042)', 'Clinical', 'Trial', 'Subject', 'Reading', 'ID', 'LO', '1']

In [66]: tag = row_items.pop(0)

In [67]: tag
Out[67]: '(0012,0042)'

In [68]: vm = row_items.pop()

In [69]: vm
Out[69]: '1'

In [70]: vr = row_items.pop()

In [71]: vr
Out[71]: 'LO'

In [72]: desc = ' '.join(row_items)

In [73]: new_row = "%s; %s; %s; %s" % (tag, desc, vr, vm)

In [74]: new_row
Out[74]: '(0012,0042); Clinical Trial Subject Reading ID; LO; 1'
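The same steps could be wrapped in a function and run over every line of
the file. What follows is only a sketch under some assumptions: the names
convert_row and convert_lines are made up for illustration, every data
row is assumed to start with "(" (so the header line and blank lines get
skipped), and VR and VM are each assumed to be a single token with no
spaces in it. It uses split() with no argument instead of split(' '),
because the sample lines end in a space and split(' ') would leave an
empty string at the end of the list.

```python
def convert_row(row, delimiter="; "):
    """Turn 'tag desc... VR VM' into 'tag; desc; VR; VM'."""
    # split() with no argument splits on any run of whitespace and
    # ignores leading and trailing whitespace.
    items = row.split()
    if len(items) < 4:
        raise ValueError("expected at least 4 fields: %r" % row)
    tag = items[0]
    vr, vm = items[-2], items[-1]
    # Everything between the tag and the last two tokens is the description.
    desc = " ".join(items[1:-2])
    return delimiter.join([tag, desc, vr, vm])


def convert_lines(lines):
    """Yield converted rows, skipping the header and blank lines."""
    for line in lines:
        # Assumption: only data rows begin with a parenthesised tag.
        if line.strip().startswith("("):
            yield convert_row(line)
```

To process the whole file you could pass the open input file to
convert_lines and write each result, plus a newline, to the output file.
If some rows have a VR made of several words, this will put part of the
VR into the description, so it is worth spot-checking the output.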

Someone might think of a better way with them thar fancy lambdas and
list comprehensions thingys, but I think this will work. 
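Since the question was about regular expressions, here is roughly what
that route might look like, for comparison. Again a sketch, not a tested
solution for the real file: the names ROW_RE and convert_row_re are
invented here, the pattern assumes tags always look like two groups of
four hex digits, and it assumes VR and VM are single tokens. Rows that
don't fit the pattern come back as None instead of raising an error.

```python
import re

# Verbose pattern: whitespace and comments inside it are ignored.
ROW_RE = re.compile(r"""
    ^\s*
    (\([0-9A-Fa-f]{4},[0-9A-Fa-f]{4}\))   # tag, e.g. (0012,0042)
    \s+
    (.+?)                                  # description, as short as possible
    \s+
    (\S+)                                  # VR: second-to-last token
    \s+
    (\S+)                                  # VM: last token
    \s*$
""", re.VERBOSE)


def convert_row_re(row):
    """Return 'tag; desc; VR; VM', or None if the row doesn't match."""
    match = ROW_RE.match(row)
    if match is None:
        return None
    return "; ".join(match.groups())
```

For this particular file layout the regular expression doesn't buy much
over split and join, but it does check the shape of the tag, which can
help catch odd lines.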

Mike

