Reading mixed ASCII/bin file-HOW?

Chris Barker chrishbarker at attbi.com
Tue Dec 4 19:59:33 CET 2001


josegomez at gmx.net wrote:
>         I want to read a number of files, all of which have a header,
> and then some data. More often than not :-), the data will be binary.

I have done this a few times myself.

> 31/05/00 16:59:01
> COMMENTS
> Cal - C Band. 12 cm tri
> ENDOFCOMMENTS
> NETWORK ANALYSER SETTINGS
> 4,            Start Freq (GHz)
> 6,            End Freq (GHz)
> 1601,         No. of Freqs per Step
> 
>         The order of the parameters (and units) is always the same, so
> what I want to do is to read all that information into a dictionary. For
> the previous example, that'd be
> data = {'Date':"31/05/00",
>         'Time':"16:59:01",
>         'Comments':"Cal - C Band, 12 cm tri",
>         'StartFreq':4.0,
>         'EndFreq':6.0,
>         'NumFreq':1601}
> 
>         Now, I don't really understand how this is done, but I guess
> that I could parse it through a number of readline() calls. There has to
> be a better way though...

What's wrong with that way?? A bunch of readline() calls is exactly what
you want. For example:

data = {}
data['Date'], data['Time'] = file.readline().strip().split()
file.readline()   # skip the "COMMENTS" line
comments = []     # create the list once, before the loop
while 1:
	line = file.readline().strip()
	if line == "ENDOFCOMMENTS":
		break
	comments.append(line)
data['Comments'] = "\n".join(comments)

....

etc.

With a header that is as free-form as this, you really don't have much
choice. The NETWORK ANALYSER SETTINGS section could be parsed out in
the same way, but if it really has only 3 lines, I'd just read them in
and set the values. Note that if you want the numbers as numbers, rather
than strings, you'll want to use int() or float().
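
As a rough sketch of that last step, the three settings lines can be
split at the first comma and converted by hand. This is written for a
current Python 3. The function name parse_settings and the dictionary
keys are made up for illustration, and it assumes every line looks like
"value, description" as in the sample header.

```python
import io

def parse_settings(f):
    """Read the three 'value, description' lines that follow the
    NETWORK ANALYSER SETTINGS line and return them as numbers."""
    def value(line):
        # keep only the text before the first comma
        return line.split(",", 1)[0].strip()

    settings = {}
    settings["StartFreq"] = float(value(f.readline()))
    settings["EndFreq"] = float(value(f.readline()))
    settings["NumFreq"] = int(value(f.readline()))
    return settings

# a small in-memory stand-in for the real file
sample = io.StringIO(
    "NETWORK ANALYSER SETTINGS\n"
    "4,            Start Freq (GHz)\n"
    "6,            End Freq (GHz)\n"
    "1601,         No. of Freqs per Step\n"
)
sample.readline()               # skip the section title
settings = parse_settings(sample)
print(settings)
```

If the file is opened in binary mode under Python 3, readline() returns
bytes, so each header line would need a .decode("ascii") first.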

>         OK, and now we get to the problem of what to do if the file is
> mixed  binary/ascii. Basically, after the header, I get a "delimiter"
> in ASCII and a bunch of datapoints (to be calculated from the
> information on the header).

First of all, make sure to open the file in binary mode so your binary
data won't get mangled. If you are on *nix, it won't matter, but it's a
good habit to get into. On Windows, you are going to get some extra
\r-s, but they are whitespace and will be removed with strip(), so it's
no big deal. 

file = open(filename, 'rb')

Look for your delimiter, and after it, read the data with
file.read(numbytes). I assume you have the number of bytes from the info
in the header. To convert the binary data to something Python can
understand, you have three options (at least):

use the struct module
use the array module, and either fromstring or fromfile.
use Numeric (http://sourceforge.net/projects/numpy) and the fromstring()
function.
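
Here is a minimal sketch of the delimiter search plus the first two
options, for a current Python 3 and using only the standard library.
The delimiter text "BINARYDATA", the function name read_binary_block,
and the data layout (little-endian 4-byte floats) are all assumptions
for the sake of the example; substitute whatever your instrument
actually writes.

```python
import io
import struct
import sys
from array import array

DELIMITER = b"BINARYDATA"   # hypothetical; use whatever your files contain

def read_binary_block(f, nvalues):
    """Skip lines until the delimiter, then read nvalues 4-byte floats."""
    while True:
        line = f.readline()
        if not line:                      # end of file reached
            raise ValueError("delimiter not found")
        if line.strip() == DELIMITER:     # strip() also removes any \r
            break
    raw = f.read(4 * nvalues)
    if len(raw) != 4 * nvalues:
        raise ValueError("file is shorter than the header says")
    # "<" = little-endian, "f" = 4-byte float (both are assumptions)
    return raw, struct.unpack("<%df" % nvalues, raw)

# build a small fake file in memory to try it out
payload = struct.pack("<4f", 1.0, 2.0, 3.5, -4.25)
fake = io.BytesIO(b"some header line\r\n" + DELIMITER + b"\r\n" + payload)

raw, values = read_binary_block(fake, 4)
print(values)

# the array module does the same job, but always in native byte order
a = array("f")
a.frombytes(raw)
if sys.byteorder != "little":
    a.byteswap()            # the fake data above is little-endian
print(list(a))
```

struct lets you spell out the byte order in the format string, which is
handy when the file comes from a different machine; array is simpler
but leaves any byte swapping to you.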

I highly recommend Numeric; if you are working with a large 2-d array,
it is indispensable.

> Also, if the data isn't binary and is ascii,
> I get a lot of lines containing two numbers.

I'm assuming you will know when you read the header. For large tables of
ascii numbers, native Python is not so good. You can read() or
readline() or readlines() or xreadlines() the data, and then use
.split() and int() or float() and maybe map:

data =  map(float, file.readline().strip().split())

That works fine, but it is kind of slow. For better alternatives, check
out SciPy (www.scipy.org), and look for io stuff (also search the web
for TableIO and others).
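
For reference, the plain-Python approach for a table with two numbers
per line might look like the sketch below (current Python 3). The name
read_pairs is made up, and it assumes the two numbers are separated by
whitespace and that the number of points is known from the header.

```python
import io

def read_pairs(f, npoints):
    """Read npoints lines of 'number number' and return a list of tuples."""
    pairs = []
    for _ in range(npoints):
        fields = f.readline().split()     # split() also drops \r and \n
        if len(fields) != 2:
            raise ValueError("expected two numbers, got %r" % fields)
        pairs.append((float(fields[0]), float(fields[1])))
    return pairs

# a small in-memory stand-in for the data section of the file
sample = io.StringIO("1.0 2.0\n-0.5   3e-2\n0 1\n")
pairs = read_pairs(sample, 3)
print(pairs)
```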

>         In general, these datapoints fit neatly into a 2D complex array,
> and I have to convert the pairs of reals into complex number (Re+j*Im),
> for both ASCII and binary types.

If the format is right, you might be able to read them straight into a
NumPy Complex Array; otherwise, you can easily do:

A = zeros((m,n),Complex)
A[:,:] = realpart + 1j * imagpart
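
If you don't have Numeric around, the same pairing can be sketched in
plain Python (current Python 3). This assumes the values come
interleaved as re, im, re, im, ... and fill the 2D array row by row;
check both against your file format. The name to_complex_rows is made
up for the example.

```python
def to_complex_rows(values, ncols):
    """Turn a flat sequence re0, im0, re1, im1, ... into rows of complex numbers."""
    if len(values) % 2:
        raise ValueError("need an even number of values")
    # even positions are real parts, odd positions are imaginary parts
    z = [complex(re, im) for re, im in zip(values[0::2], values[1::2])]
    if len(z) % ncols:
        raise ValueError("points do not fill whole rows")
    # cut the flat list into rows of ncols points each
    return [z[i:i + ncols] for i in range(0, len(z), ncols)]

flat = (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0)
rows = to_complex_rows(flat, 2)
print(rows)
```

This is fine for small files; for a large 2D array a list of lists gets
slow and memory-hungry, which is where an array package earns its keep.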
 

That should get you started.

-Chris


-- 
Christopher Barker,
Ph.D.                                                           
ChrisHBarker at home.net                 ---           ---           ---
http://members.home.net/barkerlohmann ---@@       -----@@       -----@@
                                   ------@@@     ------@@@     ------@@@
Oil Spill Modeling                ------   @    ------   @   ------   @
Water Resources Engineering       -------      ---------     --------    
Coastal and Fluvial Hydrodynamics --------------------------------------
------------------------------------------------------------------------


