How to compare words from .txt file against words in .xlsx file via Python? I will then extract these words by writing it to a new .xls file
MRAB
python at mrabarnett.plus.com
Sun Aug 4 14:29:00 EDT 2019
On 2019-08-04 18:53, A S wrote:
> Hi Mrab,
>
> Thank you so much for your detailed response, I really really
> appreciate it as I have been constantly trying to seek help regarding
> this issue.
>
> Yes, I figured that the dictionary is only capturing the last value :(
> I've been trying to get it to capture and store all the values to
> memory in python but it's not working..
>
> Are there any improvements that I could make to allow my code to work?
>
> I would be truly grateful if you could provide further insights on this..
>
> Thank you so much.
>
Make 'dictionary' a set, created once before the loop, and then add each
word to it inside the loop instead of assigning to it.
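
As a rough sketch of what I mean (not your actual script): the list
'cell_values' below is only a stand-in for the values you read from the
spreadsheet with cell_range.value, the sample text is made up, and the
strip/lower normalisation is my own assumption about how you want words
to match.

```python
# Stand-in for the values read from the spreadsheet cells; in the real
# script each one would come from cell_range.value inside the loop.
cell_values = ["apple", "Banana", None, "cherry ", "apple"]

dictionary = set()  # create the empty set once, *before* the loop
for value in cell_values:
    if value is None:  # skip empty cells
        continue
    # add() keeps every value; plain assignment would keep only the last
    dictionary.add(str(value).strip().lower())

# Words taken from a .txt file (hypothetical sample text).
txtwords = "I ate an apple and a cherry today".lower().split()

# Membership in a set is an exact whole-word match.
matches = [word for word in txtwords if word in dictionary]
print(matches)
```

A set also drops the duplicate "apple" for free, and looking a word up in
it stays fast however many spreadsheet words you collect.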
>
> On Mon, 5 Aug 2019, 1:45 am MRAB, <python at mrabarnett.plus.com> wrote:
>
> On 2019-08-04 09:29, aishan0403 at gmail.com wrote:
> > I want to compare the common words from multiple .txt files based on
> > the words in multiple .xlsx files.
> >
> > Could anyone kindly help with my code? I have been stuck for weeks
> > and really need help..
> >
> > Please refer to this link:
> >
> https://stackoverflow.com/questions/57319707/how-to-compare-words-from-txt-file-against-words-in-xlsx-file-via-python-i-wi
> >
> > Any help is greatly appreciated really!!
> >
> First of all, in this line:
>
> folder_path1 = os.chdir("C:/Users/xxx/Documents/xxxx/Test python dict")
>
> it changes the current working directory (not a problem), but 'chdir'
> returns None, so from that point 'folder_path1' has the value None.
>
> Then in this line:
>
> for file in os.listdir(folder_path1):
>
> it's actually doing:
>
> for file in os.listdir(None):
>
> which happens to work because passing it None means to return the names
> in the current directory.
>
> Now to your problem.
>
> This line:
>
> dictionary = cell_range.value
>
> sets 'dictionary' to the value in the spreadsheet cell, and you're doing
> it each time around the loop. At the end of the loop, 'dictionary' will
> be set to the _last_ such value. You're not collecting the values, but
> merely remembering the last value.
>
> Looking further on, there's this line:
>
> if txtwords in dictionary:
>
> Remember, 'dictionary' is the last value (a string), so that'll be True
> only if 'txtwords' is a substring of the string in 'dictionary'.
>
> That's why you're seeing only one match.
>
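
To make the two pitfalls quoted above concrete, here is a small
illustration. It is only a sketch: it changes to the current directory so
that it is harmless to run, and the words in it are invented for the
example rather than taken from your files.

```python
import os

# os.chdir() changes the working directory but returns None, so keep
# the path in its own variable rather than assigning chdir's result.
folder_path = os.getcwd()  # hypothetical folder; use your own path here
result = os.chdir(folder_path)
print(result)

# 'in' on a string is a substring test...
last_value = "concatenate"
substring_hit = "cat" in last_value
print(substring_hit)

# ...whereas 'in' on a set is an exact, whole-word membership test.
words = {"concatenate"}
set_hit = "cat" in words
print(set_hit)
```

The second half is why a string in 'dictionary' can report a match for a
fragment of a word, while a set only matches complete words.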