[Tutor] python, xml, mongodb

Emile van Sebille emile at fenx.com
Sun Oct 27 21:10:02 CET 2013


Two things bother me about your assignment -- First, you say "i really 
dunno what format to expect, no details on that" and second, "i have 
like three weeks to complete this."  Doesn't sound like a winning 
combination to me.  :(

When I've written systems to aggregate and normalize data from multiple 
input sources I've generally written an adapter per input stream to 
parse and feed the content into a standard processing environment which 
does everything else.  It sounds like you have a handle on this back end 
part.
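
A minimal sketch of the adapter-per-input idea, not the actual system described above. The field names (hotel_id, price), the sample CSV and XML layouts, and the function names are all assumptions made up for illustration. Each adapter turns one source format into the same kind of record, and the back end only ever sees that record shape.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_adapter(text):
    """Yield normalized records from CSV text that has an 'id,price' header (assumed layout)."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {"hotel_id": row["id"].strip(), "price": float(row["price"])}


def xml_adapter(text):
    """Yield normalized records from <hotel id="..."><price>..</price></hotel> elements (assumed layout)."""
    root = ET.fromstring(text)
    for hotel in root.iter("hotel"):
        yield {"hotel_id": hotel.get("id"), "price": float(hotel.findtext("price"))}


def process(records):
    """Stand-in back end: it knows the normalized record shape, not the source format."""
    return sorted(records, key=lambda record: record["hotel_id"])


# Hypothetical sample inputs from two different clients.
csv_text = "id,price\nH2,80.5\nH1,120\n"
xml_text = '<hotels><hotel id="H3"><price>95</price></hotel></hotels>'

combined = process(list(csv_adapter(csv_text)) + list(xml_adapter(xml_text)))
for record in combined:
    print(record)
```

The point of the split is that process() never changes when a new client format shows up; only a new adapter gets written.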

As to the front end part, I've written adapters that accept Excel files, 
email messages, input from hand-held data capture devices, EDI content, 
custom EBCDIC files, AS2 data interchange, and web interfaces that all 
feed through to the same back end.  Once I had the basic back end going, 
it was about a day's work to fully integrate and test each additional 
data source.
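
One way to keep each additional data source cheap to add is a small registry that maps a format name to its adapter. This is only a sketch under assumptions: the ADAPTERS dict, the adapter decorator, the format names, and the "hotel_id|room" pipe layout are hypothetical and not taken from any real system.

```python
ADAPTERS = {}  # format name -> parser function (hypothetical registry)


def adapter(fmt):
    """Decorator that registers a parser function under a format name."""
    def register(func):
        ADAPTERS[fmt] = func
        return func
    return register


@adapter("csv")
def parse_csv(text):
    # Naive comma split; assumes a header row and no quoted fields.
    header, *lines = text.strip().splitlines()
    keys = header.split(",")
    return [dict(zip(keys, line.split(","))) for line in lines]


@adapter("pipe")
def parse_pipe(text):
    # Assumed layout: one "hotel_id|room" pair per line, no header.
    keys = ("hotel_id", "room")
    return [dict(zip(keys, line.split("|"))) for line in text.strip().splitlines()]


def ingest(fmt, text):
    """Look up the adapter for fmt and return normalized records."""
    try:
        parse = ADAPTERS[fmt]
    except KeyError:
        raise ValueError(f"no adapter registered for format {fmt!r}") from None
    return parse(text)


print(ingest("csv", "hotel_id,room\nH1,seaview\n"))
print(ingest("pipe", "H2|standard"))
```

With this shape, supporting a new source means writing one decorated function; ingest() and everything behind it stay untouched.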

Hope this helps,

Emile




On 10/24/2013 1:44 PM, Ismar Sehic wrote:
 > hello, me again - the guy with a (mis)fortune of having to deal with a
 > lot of the company's incoming and outgoing xml. I guess they just like xml
 > as a data interchange format, it's human readable. i've done my task of
 > exporting the entire postgresql database to some prestructured xml, and
 > i guess i've done the job so well, they want me to develop something
 > like a web service, that will be receiving all kinds of hotel related
 > data, no matter what format (csv, xml, txt, maybe even dBase or
 > whatever) from various clients. my service should be parsing the received
 > data into the prestructured xml format, store everything in one xml file
 > per client, then send it to some other service. i really love programming
 > in python and struggling my way through streams of data (i don't like the
 > fact that i'm working in a tourism related company, where clients
 > dictate the terms, but i hope i will change my job some day...)
 > i need some help with the idea of the architecture itself, i'm still a
 > novice in python (started 8 months ago), although i manage to do some
 > nice work, i guess i'm stubborn...
 > so - on the input part ---> i'm receiving a lot of data in various
 > formats, that needs to be validated and parsed in a way i can use it to
 > populate my predefined xml elements. i really dunno what format to
 > expect, no details on that, i just know that 'whatever' i receive will
 > be containing some essential data like hotel id's, occupancies, room
 > details (seaview, room service, prices etc...). is there some way to write
 > a unique parser, that will load a file and look for some pattern of
 > data, then grab it? i will really appreciate any ideas on that input
 > parsing part.
 > next little problem - what type of database should i use to store the
 > data in? i would prefer something where i can set the default template
 > and then just pass the parsed data to it, so my output xml is already
 > half-way formed (for example, i set a column name like Hotel_name, pass
 > all the hotel names to it, hotel_id - where i just pass all the id's
 > etc) so i can just export it and i have my xml that matches the
 > company's template.
 > i know it's unusual to ask for an idea how to approach a problem - but
 > my project manager and head of the company aren't of much use, they are
 > interested only in clients and financial gain, not really helpful. so
 > they pass me a problem and i have to find the best way to do it.
 > it's my first job, i cannot change any of the terms, i can
 > just go along, or refuse to do it - meaning i'm losing the job. so
 > basically you guys are the best help i can get.
 > so please, give me some ideas, or point me in the right direction. i have
 > like three weeks to complete this.
 > i'll understand if all this is too much to ask, no problem.
 > anyway, thanks :)



