[python-win32] opening files with names in non-english characters.

venu madhav venutaurus539 at gmail.com
Tue Feb 24 08:55:23 CET 2009


Thank you very much for your timely replies. They were really helpful in
making my work easier. But the main issue I am facing is that if the
file name has some Unicode characters (Arabic, Russian, etc.), the Python
interpreter fails to detect them and in turn returns "???" in place
of those characters. As a result I am unable to proceed further. The main
problem is that Python by default uses ASCII encoding, and since the
characters are in Unicode it fails. I also tried changing the default
encoding to Unicode in the site.py script (where the setdefaultencoding
function is defined), but to no avail.
Thank you once again,
Venu
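
One thing that may be worth trying for the "???" names: on Python 2 under
Windows, os.listdir() returns byte strings when it is given a byte-string
path and Unicode strings when it is given a unicode path. As far as I know,
setdefaultencoding only affects implicit str/unicode conversions, not what
the OS call returns. A minimal sketch, assuming a hypothetical helper name
list_unicode_names (it is not part of the script discussed in this thread);
on Python 3 names are already Unicode, so the helper is harmless there:

```python
import os
import sys

def list_unicode_names(dir_path):
    """Return the entries of dir_path as Unicode strings.

    Hypothetical helper for illustration only. On Python 2, passing a
    unicode path makes os.listdir() return unicode names; on Python 3
    a str path already yields str names.
    """
    if isinstance(dir_path, bytes):
        # Assumption: a byte-string path is in the file system encoding.
        dir_path = dir_path.decode(sys.getfilesystemencoding())
    return os.listdir(dir_path)
```

The names returned this way can be joined with os.path.join and passed on
to os.path.exists or os.stat without first being squeezed through ASCII.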

On Tue, Feb 24, 2009 at 1:09 PM, Gerdus van Zyl <gerdusvanzyl at gmail.com> wrote:

> ok, I understand better now, some more points.
>
> You need to handle the case that it doesn't find the file after the
> for loop, one source of a None return.
>
> also as mentioned in point 2 when recursing you need to return the result
> so:
>         if os.path.isdir(full_path):
>            findFile(full_path)
> becomes
>         if os.path.isdir(full_path):
>             return findFile(full_path)
>
> Another thing you need to test: I don't know whether you will always
> receive files and folders in the same order, which would mean that the
> nth file won't be consistent.
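
Putting these points together, a minimal sketch of the search might look
like the following. It is an illustration under stated assumptions, not the
original script: the function name find_nth_file and the one-element list
`remaining` are hypothetical, the list is used so that the decrement is
shared across recursive calls, entries are sorted to get a stable order,
and the recursive result is returned only when it is not None so that the
search carries on into later directories.

```python
import os

def find_nth_file(dir_path, remaining):
    """Return the path of the n-th file under dir_path, or None.

    `remaining` is a one-element list holding how many files are still
    to be counted; a list is used so recursive calls share the counter.
    """
    for name in sorted(os.listdir(dir_path)):  # sorted for a stable order
        full_path = os.path.join(dir_path, name)
        if os.path.isdir(full_path):
            found = find_nth_file(full_path, remaining)
            if found is not None:
                return found  # propagate the hit up the call chain
        else:
            remaining[0] -= 1
            if remaining[0] == 0:
                return full_path
    return None  # explicit: fewer than n files below dir_path
```

It would be called as find_nth_file(root, [n]), and the caller should check
the result for None before using it as a path.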
>
> On Tue, Feb 24, 2009 at 9:27 AM, venu madhav <venutaurus539 at gmail.com>
> wrote:
> > Hello,
> >         The value of n is initialized in the main procedure which calls it.
> > Basically I am trying to find the n'th file in the directory (it can be
> > in its subdirectories too). As I showed in the previous mail,
> > file = findFile(path)
> > invokes that function. When the path is a directory it just recurses
> > into it. As for your idea of storing all the items in a list, it can't
> > be used here because my folder contains thousands of files and storing
> > them in a list would eat up my memory.
> > Thanks for your suggestions,
> > Venu
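
On the memory concern: the n-th file can be picked without building a list
of every path by walking the tree lazily. A minimal sketch, assuming the
hypothetical names iter_files and nth_file and a 1-based n; note that
os.walk still holds the names of one directory at a time, and that it
visits the files of a directory before descending into its subdirectories.

```python
import itertools
import os

def iter_files(root):
    """Lazily yield the path of every file below root, one at a time."""
    for dir_path, dir_names, file_names in os.walk(root):
        dir_names.sort()  # in-place sort fixes the traversal order
        for name in sorted(file_names):
            yield os.path.join(dir_path, name)

def nth_file(root, n):
    """Return the n-th file (1-based) below root, or None if there are fewer."""
    return next(itertools.islice(iter_files(root), n - 1, None), None)
```

Because islice stops consuming the generator as soon as the n-th item is
reached, the rest of the tree is never listed.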
> > On Tue, Feb 24, 2009 at 12:49 PM, Gerdus van Zyl <gerdusvanzyl at gmail.com> wrote:
> >>
> >> I see a couple of problems with your code:
> >>
> >> 1. Where is n first given a value, and what is it (total file count,
> >> etc.)? Also, you decrement the value; do you want the last file in the
> >> directory or something?
> >> 2. The if os.path.isdir(full_path): .. findFile(full_path) part
> >> doesn't return or handle the value, so it's not useful as far as I can
> >> see. So you either need "return findFile(full_path)" or "value =
> >> findFile(full_path)".
> >> 3. I am not sure of your usage of n; the way I do similar things is
> >> to build a list and then just get the item I want by index or slicing.
> >>
> >> ~g
> >>
> >> On Tue, Feb 24, 2009 at 5:28 AM, venu madhav <venutaurus539 at gmail.com>
> >> wrote:
> >> > Hello,
> >> >         First of all thanks for your response. I've written a function
> >> > as shown below to recurse a directory and return a file based on the
> >> > value of n. I am calling this function from my main code to catch that
> >> > filename. The folder which it recurses through contains a folder having
> >> > files with Unicode names (as in the example I gave earlier).
> >> >
> >> >
> >> > -----------------------------------------------------------------------------
> >> > def findFile(dir_path):
> >> >     for name in os.listdir(dir_path):
> >> >         full_path = os.path.join(dir_path, name)
> >> >         print full_path
> >> >         if os.path.isdir(full_path):
> >> >             findFile(full_path)
> >> >         else:
> >> >             n = n - 1
> >> >             if(n ==0):
> >> >                 return full_path
> >> >
> >> >
> >> > -----------------------------------------------------------------------------
> >> > The problem is in the return statement. In the function, when I tried
> >> > to print the file name, it printed properly, but the receiving variable
> >> > is not getting populated with the file name. The output below (1st
> >> > statement) shows the value of the full_path variable while control is
> >> > at the return statement. The second statement is in the main code from
> >> > where the function call has been made.
> >> > Once control has returned to the main procedure after executing the
> >> > findFile procedure, the third statement gives the status of the file
> >> > variable, which has type NoneType and value None. Now when I try to
> >> > check whether the path exists, it fails, giving the traceback below.
> >> >
> >> > -----------------------------------------------------------------------------
> >> > E:\DataSet\Unicode\UnicodeFiles_8859\001_0006_test_folder\0003testUnicode_ÍÎIÐNOKÔÕÖ×ØUÚÛÜUUßaáâãäåæicéeëeíîidnokôõö÷øuúûüuu.txt.txt
> >> > -----------------------------------------------------------------------------
> >> > file = findFile(fpath)
> >> > -----------------------------------------------------------------------------
> >> > file
> >> > NoneType
> >> > None
> >> > -----------------------------------------------------------------------------
> >> > This is the final trace back:
> >> > Traceback (most recent call last):
> >> >   File "C:\RecallStubFopen.py", line 268, in <module>
> >> >     if os.path.exists(file):
> >> >   File "C:\Python26\lib\genericpath.py", line 18, in exists
> >> >     st = os.stat(path)
> >> > TypeError: coercing to Unicode: need string or buffer, NoneType found
> >> > -----------------------------------------------------------------------------
> >> > Please ask if you need any further information.
> >> >
> >> > Thank you,
> >> > Venu
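
The traceback above comes from handing None to os.path.exists(). Whatever
the search function ends up looking like, a small guard in the calling code
avoids that. A minimal sketch, assuming a hypothetical helper name
existing_path that is not part of the original script:

```python
import os

def existing_path(path):
    """Return True only if path is not None and names something that exists."""
    # None means the search found nothing; testing for it first avoids
    # passing None on to os.path.exists().
    return path is not None and os.path.exists(path)
```

In the main code this would replace the bare "if os.path.exists(file):"
test with "if existing_path(file):".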
> >> >
> >> > On Mon, Feb 23, 2009 at 11:32 PM, Chris Rebert <clp2 at rebertia.com>
> >> > wrote:
> >> >>
> >> >> On Mon, Feb 23, 2009 at 5:51 AM, venutaurus539 at gmail.com
> >> >> <venutaurus539 at gmail.com> wrote:
> >> >> > Hi all,
> >> >> >          I am trying to find the attributes of a file whose name has
> >> >> > non-English characters, like the one given below. When I try to run
> >> >> > my Python script, it fails with an error saying the filename must be
> >> >> > a string or Unicode. When I try to copy the name of the file in as a
> >> >> > string, the Komodo IDE does not allow me to save the script, saying
> >> >> > that it cannot convert some of the characters to the current
> >> >> > encoding, which is Western European (CP-1252).
> >> >> >
> >> >> >
> >> >> > 0010testUnicode_ėíîïðņōóôõöũøųúûüýþĸ !#$%&'()+,-.
> >> >> > 0123456789;=@ABCD.txt.txt
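
One way around the editor's save error is to keep the source file pure
ASCII and write the non-ASCII characters as \u escapes in a unicode
literal; the other is to save the script as UTF-8 with a coding declaration
on the first line. A minimal sketch of the first option, assuming a short
hypothetical file name (not the real one quoted above) and a hypothetical
helper file_size:

```python
import os

# Hypothetical short name for illustration, not the real file name.
# \u00ed is "i with acute" and \u00ee is "i with circumflex"; written as
# escapes, the script itself contains only ASCII characters.
name = u"test_\u00ed\u00ee.txt"

def file_size(path):
    """Return the size in bytes of path, or None if it does not exist."""
    if not os.path.exists(path):
        return None
    return os.stat(path).st_size
```

Because the name is a unicode object, the path built from it with
os.path.join stays Unicode all the way down to the os.stat call.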
> >> >>
> >> >> (1) How are you entering or retrieving that filename?
> >> >> (2) Please provide the exact error and Traceback you're getting.
> >> >>
> >> >> Cheers,
> >> >> Chris
> >> >>
> >> >> --
> >> >> Follow the path of the Iguana...
> >> >> http://rebertia.com