python daemon - compress data and load data into MySQL by pyodbc
MacRules
MacRules at nome.com
Thu Sep 3 15:35:36 EDT 2009
Martin P. Hellwig wrote:
> MacRules wrote:
> <cut>
>> What I am looking for is this.
>>
>> Oracle DB in data center 1 (LA, west coast)
>> MSSQL DB in data center 2 (DC, east coast)
>> So network bandwidth is an issue; I prefer to gzip first and
>> then deliver the data.
>
> If bandwidth is really an issue, you should send compressed deltas.
>
>>
>> I need 2 python daemons or a web service here in the future.
>> I will enter the Oracle table name, user id and password.
>> So the task is to dump out the Oracle data (Linux) and insert it into
>> MSSQL.
>
> That is assuming the table is the same and the columns of the table have
> the same type with the same restrictions and don't even get me started
> about encoding differences.
>
>>
>>
>> I can try first with 1 Python daemon. Take the Oracle data file, and
>> let the daemon connect to MSSQL (with pyodbc) and load the data in.
>
> I think that you are underestimating your task. I would recommend that
> you 'design' your application first using a UML-like approach; whether
> you need one, two or a bazillion daemons should not be the starting
> point of the design but a consequence of it.
>
> Anyway, here is a summary and some further points to consider in
> your design:
> - Can I do deltas? (If yes, how do I calculate them?)
> - Are the tables comparable in design?
>   Specifically, if there is a limit on the character width, do Oracle
>   and MS-SQL think the same about newlines? (one versus two characters)
> - How about encoding, are you sure it works out right?
> - Is atomicity an issue?
> - Is latency an issue, that is, how long may the tables be out of sync?
> - Can it be manual or is it preferably automatic?
>   How about a trigger in the database that sets the sync going? If that
>   is too much of a burden, how about a trigger that sets a flag on the
>   disk and a scheduled job that reads that flag first?
> - Is security an issue, do I need an encrypted channel?
>
> In the past I wrote a program that had the same starting point as
> yours; over time it grew in design to be a data warehouse push/pull
> central server with all other databases as an agency. The only reason I
> wrote it was because the more standard approach, like Business Objects
> Data Integrator, was just way too expensive and Oracle's solutions
> didn't play nice with PostgreSQL (though over time this issue seems to
> have been resolved).
>
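On the "how do I calculate them" part of the delta question above, one
rough way is to keep a digest per primary key from the previous run and
compare it with the current rows. This is only a sketch under
assumptions: the function names row_digest and compute_delta are made up
for illustration, the key is assumed to be a single column, and the
whole key-to-digest map is assumed to fit in memory.

```python
import hashlib


def row_digest(row):
    """Return a stable digest for one row (a sequence of column values)."""
    # repr() keeps None and '' distinct; the unit separator keeps
    # ('ab', 'c') and ('a', 'bc') distinct.
    joined = "\x1f".join(repr(value) for value in row)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


def compute_delta(old_digests, current_rows, key_index=0):
    """Compare current rows against digests saved from the previous run.

    old_digests: dict mapping primary key -> digest (empty on first run).
    current_rows: iterable of row sequences from the source table.
    Returns (inserts, updates, deleted_keys, new_digests).
    """
    new_digests = {}
    inserts, updates = [], []
    for row in current_rows:
        key = row[key_index]
        digest = row_digest(row)
        new_digests[key] = digest
        if key not in old_digests:
            inserts.append(row)          # key not seen before
        elif old_digests[key] != digest:
            updates.append(row)          # same key, different content
    deleted_keys = [k for k in old_digests if k not in new_digests]
    return inserts, updates, deleted_keys, new_digests
```

Only the inserts, updates and deleted keys would then need to be
compressed and shipped; new_digests is what gets saved for the next run.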
You understand my issue clearly. For now, I am only looking for a
one-time event; I do not need push/pull yet.
With good push/pull + compression + security + real time in mind, you
could sell it for decent money.
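
For the one-time case, here is roughly what I have in mind, as a sketch
rather than a finished tool. It assumes a DB-API cursor on the Oracle
side that has already executed the SELECT, and a connection on the MSSQL
side that uses "?" placeholders (pyodbc does). The names dump_rows_gzip
and load_rows_gzip are mine, the table and column names are assumed to
be trusted, and going through CSV means every value arrives as text, so
NULLs and column types would need extra handling.

```python
import csv
import gzip


def dump_rows_gzip(cursor, path, batch_size=1000):
    """Write all rows of an already-executed cursor to a gzipped CSV file."""
    count = 0
    with gzip.open(path, "wt", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        # First line: column names taken from the cursor description.
        writer.writerow([col[0] for col in cursor.description])
        while True:
            rows = cursor.fetchmany(batch_size)
            if not rows:
                break
            writer.writerows(rows)
            count += len(rows)
    return count


def load_rows_gzip(conn, table, path, batch_size=1000):
    """Insert rows from a gzipped CSV file into `table` over `conn`."""
    count = 0
    with gzip.open(path, "rt", newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        columns = next(reader)
        # Table and column names are interpolated: trusted input only.
        sql = "INSERT INTO {} ({}) VALUES ({})".format(
            table, ", ".join(columns), ", ".join("?" * len(columns))
        )
        cur = conn.cursor()
        batch = []
        for row in reader:
            batch.append(row)
            if len(batch) >= batch_size:
                cur.executemany(sql, batch)
                count += len(batch)
                batch = []
        if batch:
            cur.executemany(sql, batch)
            count += len(batch)
    conn.commit()
    return count
```

The gzipped file is the only thing that has to cross the network; the
load step would run next to the MSSQL server.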