python daemon - compress data and load data into MySQL by pyodbc

MacRules MacRules at nome.com
Thu Sep 3 15:35:36 EDT 2009


Martin P. Hellwig wrote:
> MacRules wrote:
> <cut>
>> What I am looking for is this.
>>
>> Oracle DB in data center 1 (LA, west coast)
>> MSSQL DB in data center 2 (DC, east coast)
>> So network bandwidth is an issue, I prefer to have gzip fist and 
>> deliver the data.
> 
> If bandwidth is really an issue, you should send compressed delta's.
> 
>>
>> I need 2 python daemons or a web service here in the future.
>> I will enter the Oracle table name, user id and password.
>> So the task is dump out Oracle data (Linux) and insert that to MSSQL.
> 
> That is assuming the table is the same and the columns of the table have 
>  the same type with the same restrictions and don't even get me started 
> about encoding differences.
> 
>>
>>
>> I can try first with 1 daemon python. Take the Oracle data file, and 
>> let the daemon connects to MSSQL (with pyodbc) and load the data in.
> 
> I think that you are underestimating your task, I would recommend to 
> 'design' your application first using an UML like approach, whether you 
> need one, two or bazillion daemons should not be a design start but a 
> consequence of the design.
> 
> Anyway here is a sum up and some further pointers to take in regard for 
> your design:
> - Can I do delta's? (if yes how do I calculate them)
> - Are the tables comparable in design
>   Specifically if there is a limit on the character width, does Oracle 
> and MS-SQL think the same about newlines? (one versus two characters)
> - How about encoding, are you sure it works out right?
> - Is ATOMIC an issue
> - Is latency an issue, that is how long may the tables be out of sync
> - Can it be manual or is it preferably automatic
>   How about a trigger in the database that sets the sync going, if that 
> is too much burden how about a trigger that set a flag on the disk and a 
> scheduled job that reads that flag first
> - Is security an issue, do I need an encrypted channel
> 
> In the past I've wrote a program that had the same starting point as you 
> had, over time it grew in design to be a datawarehouse push/pull central 
> server with all other databases as an agency. The only reason I wrote it 
> was because the more standard approach like business objects data 
> integrator was just way too expensive and oracles solutions didn't play 
> nice with PostgreSQL (though over time this issue seemed to be resolved).
> 

You understand my issue clearly. For now, I only look for 1 time event, 
I do not need push/pull yet.

For a good push/pull + compression + security + real time in mind, you 
can sell it for decent money.




More information about the Python-list mailing list