[Tutor] string comparison modules
Alan Gauld
alan.gauld at yahoo.co.uk
Fri Jan 14 04:03:57 EST 2022
On 14/01/2022 05:59, mhysnm1964 at gmail.com wrote:
> result JSON data construct has a field for title and author. I want to only
> grab the correct author and title JSON record. As there are slight
> differences between the author and title I have provided to google and what
> I have received back. I am wondering what is the easiest method of doing a
> string compare?
First you have to define what "correct" means.
Is it an exact match? - apparently not.
Does it mean a punctuationless match? - yes but also...
Does it mean of a fixed length? - maybe, but we don't have enough samples
Does it mean excluding numbers? - possibly, but we don't have enough data
Only you can answer those questions. (And more study of sample data)
> Returned from Google - Author: S. A. Smith
> I provided: S A Smith
>
> The title from google is: Russia in Revolution, An Empire in Crisis
> I sent: Russia in Revolution, An Empire in Crisis, 1890 to 1928
> I could strip out the punctuation chars like '.'.
That may well be a starting point.
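
As a rough sketch of that starting point, assuming you only care about
ASCII punctuation and are happy to collapse runs of spaces (the name
strip_punctuation is just mine for illustration):

```python
import string

def strip_punctuation(text):
    """Remove ASCII punctuation and collapse whitespace (illustrative helper)."""
    # A translation table whose third argument lists the characters to delete
    table = str.maketrans("", "", string.punctuation)
    # split()/join() squeezes any leftover runs of whitespace to one space
    return " ".join(text.translate(table).split())

print(strip_punctuation("S. A. Smith") == strip_punctuation("S A Smith"))
```

With your author example both sides reduce to the same string, but
note this does nothing about case or about the extra "1890 to 1928"
in the title.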
> I could use 'in' operator
I doubt if that is a good strategy.
> for the titles or the find method.
That might work, but will require some processing of the data first.
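
For instance, here is a minimal sketch of using find() after some
processing. The processing I've assumed (dropping punctuation, folding
case, squeezing whitespace) and the function names are my own choices,
not requirements:

```python
import string

def normalise(text):
    """Drop ASCII punctuation, fold case, collapse whitespace (assumed rules)."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.translate(table).casefold().split())

def contains_either_way(a, b):
    """True if the shorter normalised string occurs inside the longer one."""
    shorter, longer = sorted((normalise(a), normalise(b)), key=len)
    # str.find returns -1 when the substring is absent
    return longer.find(shorter) != -1

sent = "Russia in Revolution, An Empire in Crisis, 1890 to 1928"
returned = "Russia in Revolution, An Empire in Crisis"
print(contains_either_way(sent, returned))
```

Be aware that a very short title would be "found" inside many longer
ones, so this is still a loose test.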
> As there are so many different ways a
> title could be written or the author name. I am concerned my approach is not
> very robust and I will not capture the right book record.
Only you can decide what you need.
One loose search mechanism is to simply take the set of characters
used and see if the search set is equal to (or a complete subset of)
the returned result. Or vice versa.
But that will potentially yield multiple results for a single
search term. You need to decide if that is good enough, or if
you need to narrow down to a single (or zero) result.
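
A quick sketch of the character-set idea, assuming we ignore case and
anything that is not a letter or digit. The function names and the
second author name ("Tim Hass") are invented purely to show the
problem:

```python
def char_set(text):
    """Set of letters/digits used in text, case-folded (assumed rule)."""
    return {ch for ch in text.casefold() if ch.isalnum()}

def loose_match(query, candidate):
    """True if either character set is a subset of (or equal to) the other."""
    q, c = char_set(query), char_set(candidate)
    return q <= c or c <= q

print(loose_match("S A Smith", "S. A. Smith"))  # the pair from your example
print(loose_match("S A Smith", "Tim Hass"))     # a made-up, different author
```

Both calls report a match, because "Tim Hass" happens to use exactly
the same letters as "S A Smith". That is the multiple-results problem
in miniature.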
Another approach is to use regular expressions. But that
requires a very carefully defined query.
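
As one possible illustration (not the only way to define the query),
this sketch builds a pattern from the author name that tolerates an
optional full stop after each part, any amount of whitespace between
parts, and any case. The helper name author_pattern and those three
rules are my assumptions:

```python
import re

def author_pattern(name):
    """Compile a tolerant pattern from an author name (illustrative rules)."""
    # Treat full stops in the input as separators, then split into parts
    parts = name.replace(".", " ").split()
    # Each part may be followed by an optional ".", parts are separated
    # by one or more whitespace characters
    body = r"\s+".join(re.escape(part) + r"\.?" for part in parts)
    return re.compile(body, re.IGNORECASE)

pattern = author_pattern("S A Smith")
print(bool(pattern.fullmatch("S. A. Smith")))
print(bool(pattern.fullmatch("s a smith")))
print(bool(pattern.fullmatch("S A Smythe")))
```

Every rule you add or relax changes what counts as a match, which is
why the definition has to come first.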
In any case you need to sit down and think very carefully and
specifically about what you define as a match.
How flexible should it be? You haven't mentioned case,
but does the case matter? Would "s a smith" count in the
example above?
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos