Parsing of a file
John Machin
sjmachin at lexicon.net
Wed Aug 6 17:36:01 EDT 2008
On Aug 7, 7:06 am, John Machin <sjmac... at lexicon.net> wrote:
> On Aug 7, 6:02 am, Mike Driscoll <kyoso... at gmail.com> wrote:
>
> > On Aug 6, 1:55 pm, Tommy Grav <tg... at mac.com> wrote:
>
> > > I have a file with the format
>
> > > Field f29227: Ra=20:23:46.54 Dec=+67:30:00.0 MJD=53370.06797690 Frames 5 Set 1
> > > Field f31448: Ra=20:24:58.13 Dec=+79:39:43.9 MJD=53370.06811620 Frames 5 Set 2
> > > Field f31226: Ra=20:24:45.50 Dec=+78:26:45.2 MJD=53370.06823860 Frames 5 Set 3
> > > Field f31004: Ra=20:25:05.28 Dec=+77:13:46.9 MJD=53370.06836020 Frames 5 Set 4
> > > Field f30782: Ra=20:25:51.94 Dec=+76:00:48.6 MJD=53370.06848210 Frames 5 Set 5
> > > Field f30560: Ra=20:27:01.82 Dec=+74:47:50.3 MJD=53370.06860400 Frames 5 Set 6
> > > Field f30338: Ra=20:28:32.35 Dec=+73:34:52.0 MJD=53370.06872620 Frames 5 Set 7
> > > Field f30116: Ra=20:30:21.70 Dec=+72:21:53.6 MJD=53370.06884890 Frames 5 Set 8
> > > Field f29894: Ra=20:32:28.54 Dec=+71:08:55.0 MJD=53370.06897070 Frames 5 Set 9
> > > Field f29672: Ra=20:34:51.89 Dec=+69:55:56.6 MJD=53370.06909350 Frames 5 Set 10
>
> > > I would like to parse this file by extracting the field id, ra, dec
> > > and mjd for each line. It is not, however, certain that the width of
> > > each value of the field id, ra, dec or mjd is the same in each line.
> > > Is there a way to do this such that even if there was a line where
> > > Ra=****** and MJD=******** were swapped it would be parsed correctly?
>
> > > Cheers
> > > Tommy
>
> > I'm sure Python can handle this. Try the PyParsing module or learn
> > Python regular expression syntax.
>
> > http://pyparsing.wikispaces.com/
>
> > You could probably do it very crudely by just iterating over each line
> > and then using the string's find() method.
>
> Perhaps you and the OP could spend some time becoming familiar with
> built-in functions and str methods. In particular, str.split is your
> friend:
>
> C:\junk>type tommy_grav.py
> # Look, Ma, no imports!
>
> guff = """\
> Field f29227: Ra=20:23:46.54 Dec=+67:30:00.0 MJD=53370.06797690 Frames 5 Set 1
> Field f31448: MJD=53370.06811620123 Dec=+79:39:43.9 Ra=20:24:58.13 Frames 5 Set 2
> Field f31226: Ra=20:24:45.50 Dec=+78:26:45.2 MJD=53370.06823860 Frames 5 Set 3
> Field f31004: Ra=20:25:05.28 Dec=+77:13:46.9 MJD=53370.06836020 Frames 5 Set 4
> Field f30782: Ra=20:25:51.94 Dec=+76:00:48.6 MJD=53370.06848210 Frames 5 Set 5
>
> Field f30560: Ra=20:27:01.82 Dec=+74:47:50.3 MJD=53370.06860400 Frames 5 Set 6
> Field f30338: Ra=20:28:32.35 Dec=+73:34:52.0 MJD=53370.06872620 Frames 5 Set 7
> Field f30116: Ra=20:30:21.70 Dec=+72:21:53.6 MJD=53370.06884890 Frames 5 Set 8
> Field f29894: Ra=20:32:28.54 Dec=+71:08:55.0 MJD=53370.06897070 Frames 5 Set 9
> Field f29672: Ra=20:34:51.89 Dec=+69:55:56.6 MJD=53370.06909350 Frames 5 Set 10
>
> """
>
> is_angle = {
>     'ra': True,
>     'dec': True,
>     'mjd': False,
>     }
>
> def convert_angle(text):
>     deg, min, sec = map(float, text.split(':'))
>     return (sec / 60. + min) / 60. + deg
>
> def parse_line(line):
>     t = line.split()
>     assert t[0].lower() == 'field'
>     assert t[1].startswith('f')
>     assert t[1].endswith(':')
>     field_id = t[1].rstrip(':')
>     rdict = {}
>     for f in t[2:]:
>         parts = f.split('=')
>         if len(parts) == 2:
>             key = parts[0].lower()
>             value = parts[1]
>             assert key not in rdict
>             if is_angle[key]:
>                 rvalue = convert_angle(value)
>             else:
>                 rvalue = float(value)
>             rdict[key] = rvalue
>     return field_id, rdict['ra'], rdict['dec'], rdict['mjd']
>
> for line in guff.splitlines():
>     line = line.strip()
>     if not line:
>         continue
>     field_id, ra, dec, mjd = parse_line(line)
>     print field_id, ra, dec, mjd
>
> C:\junk>tommy_grav.py
> f29227 20.3962611111 67.5 53370.0679769
> f31448 20.4161472222 79.6621944444 53370.0681162
> f31226 20.4126388889 78.4458888889 53370.0682386
> f31004 20.4181333333 77.2296944444 53370.0683602
> f30782 20.4310944444 76.0135 53370.0684821
> f30560 20.4505055556 74.7973055556 53370.068604
> f30338 20.4756527778 73.5811111111 53370.0687262
> f30116 20.5060277778 72.3648888889 53370.0688489
> f29894 20.5412611111 71.1486111111 53370.0689707
> f29672 20.5810805556 69.9323888889 53370.0690935
>
> Cheers,
> John
Slightly less ugly:
C:\junk>diff tommy_grav.py tommy_grav_2.py
18,23d17
< is_angle = {
<     'ra': True,
<     'dec': True,
<     'mjd': False,
<     }
<
27a22,27
> converter = {
>     'ra': convert_angle,
>     'dec': convert_angle,
>     'mjd': float,
>     }
>
41,44c41
<             if is_angle[key]:
<                 rvalue = convert_angle(value)
<             else:
<                 rvalue = float(value)
---
>             rvalue = converter[key](value)
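
For readers on Python 3, here is a sketch of roughly what tommy_grav_2.py looks like with the diff applied. It is not the original file: SAMPLE is a shortened stand-in for guff, print is called as a function, and the sign handling in convert_angle is an assumption added here so that a value such as Dec=-05:30:00 would come out as -5.5 rather than -4.5. The posted data contains only positive declinations, so the original script never hits that case.

```python
# Sketch only: the posted script with the diff applied, ported to Python 3.
# SAMPLE is a shortened stand-in for the original guff string.

SAMPLE = """\
Field f29227: Ra=20:23:46.54 Dec=+67:30:00.0 MJD=53370.06797690 Frames 5 Set 1
Field f31448: MJD=53370.06811620123 Dec=+79:39:43.9 Ra=20:24:58.13 Frames 5 Set 2
"""


def convert_angle(text):
    """Convert 'dd:mm:ss.s' (or 'hh:mm:ss.s') text to a decimal number."""
    # Assumption, not in the original post: a leading '-' applies to the
    # whole value, so minutes and seconds are added to the magnitude first.
    sign = -1.0 if text.startswith('-') else 1.0
    whole, minutes, seconds = (abs(float(p)) for p in text.split(':'))
    return sign * ((seconds / 60.0 + minutes) / 60.0 + whole)


# One converter per known key replaces the is_angle lookup plus if/else.
converter = {
    'ra': convert_angle,
    'dec': convert_angle,
    'mjd': float,
}


def parse_line(line):
    t = line.split()
    assert t[0].lower() == 'field'
    assert t[1].startswith('f') and t[1].endswith(':')
    field_id = t[1].rstrip(':')
    rdict = {}
    for f in t[2:]:
        parts = f.split('=')
        if len(parts) == 2:  # tokens without '=' (Frames, 5, Set, ...) are skipped
            key = parts[0].lower()
            assert key not in rdict
            rdict[key] = converter[key](parts[1])
    return field_id, rdict['ra'], rdict['dec'], rdict['mjd']


for line in SAMPLE.splitlines():
    line = line.strip()
    if line:
        print(*parse_line(line))
```

Because each token is looked up by its key, the order of Ra=, Dec= and MJD= on a line does not matter. An unknown key such as Foo=1 raises KeyError, just as it does in the posted script. Python 3 prints floats at full precision, so the numbers show more digits than the 2008 output above.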