[BangPypers] NLTK

Deepu Thomas Philip deepu.dtp at gmail.com
Sun Sep 15 21:04:28 CEST 2013


Take a look at Apache Solr (http://lucene.apache.org/solr/).
Python clients for Solr are listed here:
http://wiki.apache.org/solr/SolPython

For a quick solution, add all your products to Solr with the name of the
product as an indexed field.
Add the *product id* as a stored field if you plan on storing specs
elsewhere, so that a hit can be joined back to its specs.
Add all the spec names to a list of stopwords while creating your Solr
index, so that words like "weight" in a query do not affect the match
against product names.
Run the user's query against the index and you should get a list of
products ranked by a match score between the query and the stored product
names.
Remove the product name from the user's query and then match what is left
against the spec the user is looking for. You can use regular expressions or
https://code.google.com/p/esmre/ for this.
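
The last step (strip the product name, then match the spec) could look
roughly like the sketch below. It uses only the standard re module. The
spec names, the function name extract_spec and the assumption that Solr
has already returned "Samsung Galaxy S3" as the best product hit are all
made up for illustration; adapt them to your own data.

```python
import re

# Hypothetical spec names; in practice these would come from the JSON spec database.
SPEC_NAMES = ["weight", "battery", "screen size", "price"]

# Try longer names first so a multi-word spec wins over a shorter overlapping one.
SPEC_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(s) for s in sorted(SPEC_NAMES, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def extract_spec(query, product_name):
    """Drop the product-name tokens from the query, then look for a spec name.

    product_name is assumed to be the name of the top hit returned by Solr.
    Returns the matched spec name in lower case, or None if nothing matched.
    """
    product_tokens = set(product_name.lower().split())
    remaining = [t for t in query.split() if t.lower() not in product_tokens]
    match = SPEC_PATTERN.search(" ".join(remaining))
    return match.group(1).lower() if match else None


if __name__ == "__main__":
    print(extract_spec("Samsung Galaxy S3 Weight", "Samsung Galaxy S3"))
    print(extract_spec("Galaxy SIII Weight", "Samsung Galaxy S3"))
```

Leftover tokens such as "SIII" are simply ignored here, since only the spec
name matters once Solr has picked the product.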

You should be able to account for spelling errors in the user's query with
a little more work on the Solr side of things. Solr will also open up use
cases where the user wants a list of phones which weigh 200g and cost <10k.
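
As a rough sketch of what such queries could look like, the snippet below
only builds the request parameters for Solr's /select handler and does not
talk to a server. The field names (name, weight_g, price), the base URL and
the helper names are assumptions for illustration, and they presume the
weight and price are indexed as numeric fields. The trailing "~" asks for a
fuzzy term match, and the mixed "[* TO 10000}" range (exclusive upper bound)
needs a reasonably recent Solr (4.x).

```python
import re
from urllib.parse import urlencode

# Characters with special meaning in the Lucene/Solr query syntax.
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def escape_term(term):
    """Backslash-escape query-syntax characters in a user-supplied term."""
    return _SPECIAL.sub(r"\\\1", term)


def build_params(name_terms, weight_g=None, max_price=None, fuzzy=False):
    """Build (key, value) request parameters for a product search.

    The field names name, weight_g and price are hypothetical.
    """
    suffix = "~" if fuzzy else ""
    if name_terms:
        q = " ".join("name:%s%s" % (escape_term(t), suffix) for t in name_terms)
    else:
        q = "*:*"  # match every document, leave the narrowing to the filters
    params = [("q", q), ("wt", "json")]
    if weight_g is not None:
        # Exact match on a numeric weight field.
        params.append(("fq", "weight_g:%d" % weight_g))
    if max_price is not None:
        # "}" makes the upper bound exclusive, i.e. price < max_price.
        params.append(("fq", "price:[* TO %d}" % max_price))
    return params


def build_select_url(base_url, params):
    """Join a core's base URL (hypothetical) with the encoded parameters."""
    return base_url + "/select?" + urlencode(params)


if __name__ == "__main__":
    params = build_params([], weight_g=200, max_price=10000)
    print(build_select_url("http://localhost:8983/solr/products", params))
```

Filter queries (fq) are used for the numeric constraints so that they narrow
the result set without changing the match score on the product name.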

Regards,
Deepu

On Sun, Sep 8, 2013 at 12:04 PM, Gopalakrishnan Subramani <
gopalakrishnan.subramani at gmail.com> wrote:

> I have a database of specs in JSON format. This is not a manual effort.
>
> Right now, NLTK seems hard to me. I will try a plain Python wrapper
> based on word matching and approach NLTK later.
>
> Thanks.
>
>
> On Sun, Sep 8, 2013 at 11:29 AM, harish badrinath <
> harishbadrinath at gmail.com> wrote:
>
> > Hello,
> >
> > On Sun, Sep 8, 2013 at 2:34 AM, Gopalakrishnan Subramani <
> > gopalakrishnan.subramani at gmail.com> wrote:
> >
> > > Dear All,
> > >
> > > I want to build a simple automatic text-based chat bot for mobile and
> > > tablet specs as a proof of concept.
> > >
> > How do you plan to preseed the knowledge for the application (manually
> > or information extraction through webpages, etc.)?
> >
> >
> > > The question is, when the user talks about "Samsung Galaxy S3 Weight" or
> > > "Galaxy SIII Weight", can NLTK predict a product (e.g. Galaxy SIII) and
> > > give me the unique _id of the product for further lookup of a
> > > group/attribute like weight?
> > >
> > If the knowledge is entered manually, then NLTK should not be required
> > (something like yacc plus a good database schema should suffice; again,
> > it depends on the type of input language you plan to support).
> >
> > Warm regards,
> > Harish Badrinath
> > _______________________________________________
> > BangPypers mailing list
> > BangPypers at python.org
> > https://mail.python.org/mailman/listinfo/bangpypers
> >

