[Tutor] regexp help needed

Kent Johnson kent37 at tds.net
Wed Oct 18 20:55:05 CEST 2006


János Juhász wrote:
> Dear All,
> 
> I have a problem with the EDI invoices created by our ERP system.
> I have to make a small correction to them just before sending them by
> FTP.
> 
> The problem is that large numbers are printed with a thousands separator.
> 
> U:\ediout\INVOIC\Backup>grep \...., *.doc
> File 063091.doc:
>      MOALIN    203       79.524,480 DKK4
>      PRI       YYY   1.095,130   1        PC
>      MOATOT    79  594.629,400      DKK4
> File 063092.doc:
>      MOALIN    203       47.281,680 DKK4
>      MOATOT    86   56.738,016      DKK4
>      MOATOT    79   47.281,680      DKK4
> 
> I have to remove the thousands separator by shifting the digits before it
> to the right,
> so that the number and character groups stay in their original positions.
> 
> I have to make this kind of change on the problematic lines:
>      MOATOT    79   47.281,680      DKK4
>      MOATOT    79    47281,680      DKK4
> 
> I have no idea how to do it :(

Break it up into smaller problems:
for each line in the data:
   break the line up into fields
   fix the field containing the amount
   rebuild the line
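
The "fix the field" step might look roughly like the sketch below. The
name fix_amount and the AMOUNT pattern are my own, not anything from your
ERP system, and the pattern assumes amounts always look like the samples
above: "." between groups of three digits and "," before the decimals.

```python
import re

# Assumed amount format, based only on the sample lines:
# 1-3 digits, one or more ".ddd" thousands groups, then ",ddd" decimals.
AMOUNT = re.compile(r"\d{1,3}(?:\.\d{3})+,\d+")

def fix_amount(field):
    """Drop the thousands separators and pad on the left to keep the width."""
    if not AMOUNT.fullmatch(field):
        return field  # not an amount with separators, leave it alone
    # Removing the dots shortens the field; rjust() pads it back to the
    # original width so the digits move right and nothing else shifts.
    return field.replace(".", "").rjust(len(field))
```

Fields such as "79" or "DKK4" do not match the pattern, so they come back
unchanged.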

You don't really have to make a regex for the whole line. re.split() is 
useful for splitting the line and preserving the whitespace so you can 
rebuild the line with the same format.

Kent


