How to read files written with COBOL

Steve Williams stevewilliams at wwc.com
Wed May 12 00:28:03 EDT 2004


asdf sdf wrote:
> Steve Williams wrote:
> 
>> I wrote an ETL system in python for a client to convert from 
>> Microfocus COBOL to DB2.  Here are some of the problems I saw:
>>
>> 1)  COBOL has a very rich set of datatypes defined by the PICTURE clause
> 
> <...snipping various items...>
> 
>>     That is, I took the original COBOL 01 level definition and
>>     converted it to a list with definition parameters name, type,
>>     length, decimal point, etc. to make it easy for Python and
>>     to add some stuff to make DB2 happy (convert to title case. . .)
> 
> Steve,
> 
> I've been looking for ideas on getting at DB2 and Adabas from Python. 
> You might have some thoughts.
> 
> Is it feasible to go directly to MVS/DB2/Adabas from Python on Unix 
> or Win?
> 
> Is it more realistic to hit DB2 on AIX or Linux and use some kind of DB2 
>  linking or replication to reach DB2/MVS?
> 
> Other ideas?  Maybe 3270 emulation with screen scraping?  How about 
> telnet 3270?  (Hundreds of years ago, I could dial into a command line 
> MVS environment.)
> 
> I don't mean to hijack the thread.  I think this is related and might be 
> helpful to unfortunates who have to interoperate with legacy systems.
> 
Well, the application processed a lot of data on a nightly basis.  It 
used FTP to connect to the COBOL machine (an AIX box) and FTP callbacks
to sequentially read the files and convert the data.  There are two
bugs in the Python FTP module (ftplib) that surface if the file size is
larger than 2 GB, but they're easily fixed.
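
For anyone who wants to try the callback approach, here is a rough
sketch. It is not the code from the project: the class and function
names, host, login, path and record length are all made up for
illustration, and it assumes fixed-length records with no delimiters
between them. ftplib hands the callback blocks of arbitrary size, so
the callback has to reassemble records that straddle block boundaries.

```python
from ftplib import FTP


class RecordSplitter:
    """Collect FTP data blocks and hand out fixed-length records.

    Hypothetical helper for illustration; assumes every record in the
    file is exactly record_length bytes long.
    """

    def __init__(self, record_length, handler):
        self.record_length = record_length
        self.handler = handler  # called once per complete record
        self.buffer = b""

    def __call__(self, block):
        # Blocks do not line up with record boundaries, so prepend the
        # bytes left over from the previous block.
        data = self.buffer + block
        n = self.record_length
        pos = 0
        while len(data) - pos >= n:
            self.handler(data[pos:pos + n])
            pos += n
        self.buffer = data[pos:]  # incomplete tail, kept for next call


def fetch_records(host, user, password, path, record_length, handler):
    """Stream a remote file through handler, one record at a time."""
    splitter = RecordSplitter(record_length, handler)
    with FTP(host) as ftp:
        ftp.login(user, password)
        # Binary mode, so the bytes arrive exactly as stored.
        ftp.retrbinary("RETR " + path, splitter)
    return splitter.buffer  # non-empty means a truncated last record
```

Because each record is handled as it arrives, the whole file never has
to sit in memory or on the local disk.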

I developed this application on Windows, initially targeting a test DB2 
database on Windows and then moving the DB2 database to AIX and posting 
with ODBC over the network from Windows.

In the full production environment I moved the Python
application to AIX.  The moves were straightforward--Python was platform 
independent for my purposes.

Initially I used ODBC or the API to post the data to DB2, but
that turned out to be slow.  To get the speed I needed, I just wrote
the converted data to a CSV flat file and passed the file to the
DB2 loader utilities.  No matter how good your code is, you'll never
outperform the database utilities.
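
As an illustration of the flat-file route, here is a minimal sketch
that drives the conversion from a layout list of (name, type, length,
decimal places), along the lines of what I described earlier. The
field names, the layout and the function names are invented for the
example. It only handles plain text fields and unsigned display
numerics with an implied decimal point; packed decimal (COMP-3) and
signed fields need extra decoding that is not shown here.

```python
import csv
import io
from decimal import Decimal

# Hypothetical layout: (name, type, length, decimal places).
LAYOUT = [
    ("ACCOUNT_ID", "char", 6, 0),
    ("BALANCE", "num", 7, 2),  # like PIC 9(5)V99: implied decimal point
]


def parse_record(record, layout=LAYOUT):
    """Slice one fixed-length text record into a list of field values."""
    row = []
    pos = 0
    for name, ftype, length, decimals in layout:
        raw = record[pos:pos + length]
        pos += length
        if ftype == "num":
            # Digits only; shift the implied decimal point into place.
            row.append(Decimal(int(raw)).scaleb(-decimals))
        else:
            row.append(raw.rstrip())  # drop COBOL space padding
    return row


def write_csv(records, out):
    """Write the converted records to an open text file as CSV."""
    writer = csv.writer(out)
    for record in records:
        writer.writerow(parse_record(record))


buf = io.StringIO()
write_csv(["ABC   0012345"], buf)
print(buf.getvalue())
```

The resulting file is what gets handed to the DB2 load utility. The
exact load command and its options depend on your DB2 version and
setup, so check them against the DB2 documentation.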

I've never used replication or linking.  I know nothing about DB2 on
MVS.  In general, my experience with DB2 on networks (admittedly Unix
and Windows boxes) tells me accessing DB2 on MVS over a network would
not be a problem.  I know nothing about ADABAS.

Python will certainly do TELNET and screen scraping, but life is short.

Other than the overall success of the project (I've been told successful
data warehouse projects are rare) the major benefit of using Python was
the ability to try new concepts quickly.  With Python you have
enormous flexibility, as opposed to compiled languages (COBOL, C, etc.)
or third-party ETL utilities.

As an example, my application converted accounting data on
a nightly basis.  With no advance warning, the Accounting department
converted to another package.  The Python code to extract and load
the data from the new system was written and in production in 2 days.



