[Tutor] python, xml, mongodb
Emile van Sebille
emile at fenx.com
Sun Oct 27 21:10:02 CET 2013
Two things bother me about your assignment -- First, you say "i really
dunno what format to expect, no details on that" and second, "i have
like three weeks to complete this." Doesn't sound like a winning
combination to me. :(
When I've written systems to aggregate and normalize data from multiple
input sources I've generally written an adapter per input stream to
parse and feed the content into a standard processing environment which
does everything else. It sounds like you have a handle on this back end
part.
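To make that layout concrete, here is a rough sketch rather than anything from a real system. The field names (hotel_id, room_type, price), the two input layouts, and the output element names are all made up for illustration; your company's template will differ.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical standard record: a dict with exactly these keys.
FIELDS = ("hotel_id", "room_type", "price")

def csv_adapter(text):
    """Adapter for an assumed CSV layout with columns HotelID,Room,Rate."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {"hotel_id": row["HotelID"],
               "room_type": row["Room"],
               "price": row["Rate"]}

def xml_adapter(text):
    """Adapter for an assumed XML layout:
    <rooms><room hotel=".." type=".." price=".."/></rooms>"""
    for room in ET.fromstring(text).iter("room"):
        yield {"hotel_id": room.get("hotel"),
               "room_type": room.get("type"),
               "price": room.get("price")}

def build_output(records):
    """Back end: turn standard records into one output XML document."""
    root = ET.Element("hotels")
    for rec in records:
        offer = ET.SubElement(root, "offer")
        for name in FIELDS:
            ET.SubElement(offer, name).text = rec[name]
    return ET.tostring(root, encoding="unicode")

csv_input = "HotelID,Room,Rate\nH1,seaview,120\n"
xml_input = '<rooms><room hotel="H2" type="standard" price="80"/></rooms>'

# Each adapter only knows its own format; the back end only knows FIELDS.
records = list(csv_adapter(csv_input)) + list(xml_adapter(xml_input))
output_xml = build_output(records)
print(output_xml)
```

The point of the split is that build_output never sees CSV or XML input quirks, so validation and export logic is written once.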
As to the front end part, I've written adapters that accept Excel files,
email messages, input from hand-held data capture devices, EDI content,
custom EBCDIC files, AS2 data interchange, and web interfaces that all
feed through to the same back end. Once I had the basic back end going,
it was about a day's work to fully integrate and test each additional
data source.
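One way to keep each additional source cheap is a small registry that maps a format name to its adapter. This is only a sketch under assumptions: the format names, the column names, and the pipe-delimited text layout are invented for the example.

```python
import csv
import io

ADAPTERS = {}  # maps a format name to its adapter function

def adapter(fmt):
    """Register the decorated function as the adapter for one input format."""
    def register(func):
        ADAPTERS[fmt] = func
        return func
    return register

@adapter("csv")
def from_csv(text):
    # Assumed CSV layout with columns HotelID,Rate.
    for row in csv.DictReader(io.StringIO(text)):
        yield {"hotel_id": row["HotelID"], "price": row["Rate"]}

# A source added later: an assumed pipe-delimited feed, one record per line.
@adapter("txt")
def from_pipe_text(text):
    for line in text.splitlines():
        if line.strip():
            hotel_id, price = line.split("|")
            yield {"hotel_id": hotel_id.strip(), "price": price.strip()}

def load(fmt, text):
    """Front door: pick the adapter and return standard records."""
    if fmt not in ADAPTERS:
        raise ValueError("no adapter for format: %s" % fmt)
    return list(ADAPTERS[fmt](text))
```

With this shape, supporting a new client format means writing one more decorated function; load and everything behind it stay untouched.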
Hope this helps,
Emile
On 10/24/2013 1:44 PM, Ismar Sehic wrote:
> hello, me again - the guy with a (mis)fortune of having to deal with a
> lot of company's in and outgoing xml. I guess they just like xml as a
> data interchange format, it's human readable. i've done my task of
> exporting the entire postgresql database to some prestructured xml, and
> i guess i've done the job so well, they want me to develop something
> like a web service, that will be receiving all kinds of hotel related
> data, no matter what format (csv, xml, txt, maybe even dBase or
> whatever) from various clients. my service should be parsing the received
> data into the prestructured xml format, store everything in one xml file per
> client, then send to some other service. i really love programming in
> python and struggling my way through streams of data (i don't like the
> fact that i'm working in a tourism related company, where clients
> dictate the terms, but i hope i will change my job some day...)
> i need some help in the idea of the architecture itself, i'm still a
> novice in python (started 8 months ago), although i manage to do some
> nice work, i guess i'm stubborn...
> so - on the input part ---> i'm receiving a lot of data in various
> formats, that needs to be validated and parsed in a way i can use it to
> populate my predefined xml elements. i really dunno what format to
> expect, no details on that, i just know that 'whatever' i receive will
> be containing some essential data like hotel id's, occupancies, room
> details (seaview, room service, prices etc...). is there some way to write
> a unique parser, that will load a file and look for some pattern of
> data, then grab it? i will really appreciate any ideas on that input
> parsing part.
> next little problem - what type of database should i use to store the
> data in. i would prefer something where i can set the default template
> and then just pass the parsed data to it, so my output xml is already
> half-way formed (for example, i set a column name like Hotel_name, pass
> all the hotel names to it, hotel_id - where i just pass all the id's
> etc) so i can just export it and i have my xml that matches the
> company's template.
> i know it's unusual to ask for an idea how to approach a problem - but
> my project manager and head of the company aren't of much use, they are
> interested only in clients and financial gain, not really helpful. so
> they pass me a problem and i have to find the best way to do it.
> it's my first job, i cannot change any of the terms, i can
> just go along, or refuse to do it - meaning i'm losing the job. so
> basically you guys are the best help i can get.
> so please, give me some ideas, or point me in the right direction. i have
> like three weeks to complete this.
> i'll understand if all this is too much to ask, no problem.
> anyway, thanks :)