[PythonCAD] Improved DWG Reader

Art Haas ahaas at airmail.net
Fri Oct 3 12:09:39 EDT 2003


Hi.

After several fits and starts I've finished enough of the next iteration
of the DWG reader that I'm making it available for testing. This version
is currently set up for reading just R15 file formats; I'll be working on
beating the R13/R14 reader into shape shortly.

The new DWG reader is structured like this:

dwgbase.py: Class definition for Dwg class, file entities, and common
routines for version-specific readers.
dwg15.py: R15 file reading functions

The 'dwg.py' file will be renamed to 'dwg1314.py'. The 'dwg12.py' will
be adjusted to fit in the layout after some feedback about the new
layout is obtained. I'm not sold on the name 'dwgbase.py' yet, so that may
change, but I think I like the way things are laid out with regard to
the approach of reading DWG files.

So, how does this beastie work? Like this ...

$ python
>>> import dwgbase
>>> dwg = dwgbase.Dwg("/path/to/the/dwg/file.dwg")
[ ... right now 'file.dwg' must be an R15 file - that will change
      once the other readers are converted ... ]
>>> keys = dwg.getHeaderKeys()
[ ... retrieve all the header variables from the file and store them
     in the dwg object. The value of each variable can be extracted
     like so ... ]
>>> dwg.getHeader('DIMSZ')
1.0
>>> dwg.getHeader('LTSCALE')
10.0
[ ... etc, etc, etc ...]

>>> keys = dwg.getClassKeys()
[ ... retrieve the class key info in the file. These keys are currently
      integer values. The class data is needed for reading the objects
      in the file, so let's move on to that ... ]

>>> objs = dwg.getObjects()
[ ... read all(*) the objects in the file and store them in a list.
      Let's pretend we have 100 objects ... ]
>>> objs[0].getType()
27
[ ... this is a point object - see the OpenDWG spec ... ]
>>> objs[0].getHandle()
(0, 2, 5, 30)
[ ... or something similar - remember this is pretend :-) ... ]
>>> objs[0].getEntityKeys()
['point', 'thickness', 'extrusion', 'x_axis_angle']
[ ... these are the entity specific values ... ]
>>> objs[0].getEntityData('point')
(0.0, 1.0, 0.5)
[ ... the point's coordinates ... ]
>>> objs[0].getEntityData('extrusion')
(0.0, 0.0, 1.0)
[ ... the extrusion value ... ]

If you're wondering what "all(*)" meant, remember that some of the
entities are not completely described in the OpenDWG spec, and there are
a few entities omitted from the spec that are appearing in drawings I've
been sent. The reader does the best it can but there are some omissions.

Now let's pretend that objs[50] is a circle at (40, 20) with a radius of
14.25:

>>> objs[50].getType()
18
[ ... see the OpenDWG specs ... ]
>>> objs[50].getEntityKeys()
[ 'center', 'radius', 'thickness', 'extrusion' ]
>>> objs[50].getEntityData('center')
(40.0, 20.0, 0.0)
>>> objs[50].getEntityData('radius')
14.25

The above gives you an idea of how things work. The various
entity-specific fields are listed with a getEntityKeys() call, then each
field can be obtained with a getEntityData() call. Instead of using DXF
codes as the keys I chose to use descriptive strings because, well, they
are descriptive. It is easier to figure out this ...

getEntityData('radius')

... than something like ...

getEntityData(40)
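
Since the keys are plain strings, the keys/data pairing can be flattened into an ordinary dictionary. The following is only a sketch: it assumes nothing beyond the getEntityKeys() and getEntityData() calls shown above, FakeEntity is a hypothetical stand-in so the snippet runs without the reader installed, and entity_to_dict() is an illustrative helper, not something in dwgbase.

```python
class FakeEntity:
    """Hypothetical stand-in for an object returned by dwg.getObjects()."""

    def __init__(self, data):
        self._data = dict(data)

    def getEntityKeys(self):
        return list(self._data.keys())

    def getEntityData(self, key):
        return self._data[key]


def entity_to_dict(obj):
    """Collect every entity-specific field of obj into a dictionary."""
    return {key: obj.getEntityData(key) for key in obj.getEntityKeys()}


# The pretend circle from above; the thickness value is made up.
circle = FakeEntity({
    'center': (40.0, 20.0, 0.0),
    'radius': 14.25,
    'thickness': 0.0,
    'extrusion': (0.0, 0.0, 1.0),
})
fields = entity_to_dict(circle)
print(fields['radius'])
```

With a real file the same helper would be called on each item of the list returned by dwg.getObjects().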

The code is short on documentation now, so you'll have to read it a
bit if you want to play with it. In the case of the 'Dwg' class, methods
that start with 'get' are free to be used, and methods starting with '_set'
are meant for use only within the reader code. For the class representing
the objects in the file, the same restriction holds, but these objects
have more 'get/_set' methods. I think this is a design flaw that will
probably be amended in the next iteration.
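
Until the documentation catches up, the naming convention itself can be used to list the calls that are meant for outside use. A rough sketch, assuming only the 'get' / '_set' convention described above; SketchDwg is a hypothetical stand-in class and '_setHeader' is an assumed name, not necessarily one the reader defines.

```python
def public_getters(obj):
    """Return the sorted names of callable attributes that start with 'get'."""
    return sorted(
        name for name in dir(obj)
        if name.startswith('get') and callable(getattr(obj, name))
    )


class SketchDwg:
    """Hypothetical stand-in following the get/_set naming convention."""

    def __init__(self):
        self._headers = {}

    def getHeader(self, key):
        return self._headers[key]

    def getHeaderKeys(self):
        return list(self._headers.keys())

    def _setHeader(self, key, value):
        # Assumed name; by convention only the reader code calls this.
        self._headers[key] = value


getters = public_getters(SketchDwg())
print(getters)
```

Calling public_getters() on a real Dwg instance, or on one of the objects it returns, should give a quick list of the methods that are safe to play with.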

To recap:

import dwgbase
dwg = dwgbase.Dwg("/path/to/file.dwg")
headers = dwg.getHeaderKeys()
classes = dwg.getClassKeys()
objects = dwg.getObjects()

The 'Dwg' class also has methods for retrieving the image data that R13,
R14, and R15 files may have stored ...

bmpimg = dwg.getImageData('BMP') # could return None
wmfimg = dwg.getImageData('WMF') # could return None

These methods need testing as I haven't tried to do anything with them
yet.
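
Because either call may return None, anything that saves the image should check for that first. Here is a minimal sketch of that check. It assumes getImageData() hands back a bytes-like object when an image is present, which is untested; save_preview() and FakeDwg are hypothetical names used only for illustration.

```python
import os
import tempfile


def save_preview(dwg, kind, path):
    """Write the stored image of the given kind ('BMP' or 'WMF') to path.

    Returns True if data was written, False if the file holds no such image.
    Assumes getImageData() returns a bytes-like object or None.
    """
    data = dwg.getImageData(kind)
    if data is None:
        return False
    with open(path, 'wb') as fh:
        fh.write(data)
    return True


class FakeDwg:
    """Hypothetical stand-in: has BMP image data but no WMF image data."""

    def __init__(self):
        self._images = {'BMP': b'BM\x00\x00'}  # dummy bytes, not a real bitmap

    def getImageData(self, kind):
        return self._images.get(kind)


tmpdir = tempfile.mkdtemp()
bmp_path = os.path.join(tmpdir, 'preview.bmp')
wmf_path = os.path.join(tmpdir, 'preview.wmf')
wrote_bmp = save_preview(FakeDwg(), 'BMP', bmp_path)
wrote_wmf = save_preview(FakeDwg(), 'WMF', wmf_path)
print(wrote_bmp, wrote_wmf)
```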

For each object in the 'objects' list above the following methods will
probably be used the most ...

obj = objects[0] # pick one ...
objtype = obj.getType() # avoid shadowing the builtin 'type'
handle = obj.getHandle()
keys = obj.getEntityKeys()
val = obj.getEntityData('foo') # one of the valid entity keys ...
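
A common first step with the object list is to see which entity types a drawing contains and to pull out just one kind. A small sketch, assuming only the getType() call shown above and the type codes mentioned earlier (27 for a point, 18 for a circle); FakeObject, count_types() and objects_of_type() are hypothetical names, not part of the reader.

```python
from collections import Counter


def count_types(objects):
    """Map each type code to the number of objects that report it."""
    return Counter(obj.getType() for obj in objects)


def objects_of_type(objects, type_code):
    """Return only the objects whose getType() equals type_code."""
    return [obj for obj in objects if obj.getType() == type_code]


class FakeObject:
    """Hypothetical stand-in for an object returned by dwg.getObjects()."""

    def __init__(self, type_code):
        self._type = type_code

    def getType(self):
        return self._type


# One point and two circles, using the type codes from the examples above.
objects = [FakeObject(27), FakeObject(18), FakeObject(18)]
counts = count_types(objects)
circles = objects_of_type(objects, 18)
print(counts[18], len(circles))
```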

The 'dwg15.py' file is somewhat messy now due to this conversion. It
will be cleaned up in due time. The immediate goal is to provide a
usable interface for extracting the drawing info stored in the DWG file
so please let me know what you think of the means I've provided so far.
If you find some potential method call missing please provide feedback.

While working on this conversion I found and fixed a number of bugs in
the R15 reader, and also found that there are at least two entity types
not described in the OpenDWG spec that appear in some of the drawings
I've been sent. The DXF document I have, the 'AutoCAD 2000 DXF Reference',
lists several entities that the OpenDWG spec omits (e.g. ARCALIGNEDTEXT),
so maybe someone can create a simple drawing with one or two of these
entities in both DWG and DXF formats and we can try to figure out the
bit layout in the DWG file.

Congrats if you've made it this far through this message ... :-)

Those people willing to test this can send me an e-mail and I'll mail
off the files to you.

Art
-- 
Man once surrendering his reason, has no remaining guard against absurdities
the most monstrous, and like a ship without rudder, is the sport of every wind.

-Thomas Jefferson to James Smith, 1822


