walking a directory with very many files
Nick Craig-Wood
nick at craig-wood.com
Mon Jun 15 17:29:33 EDT 2009
Jean-Paul Calderone <exarkun at divmod.com> wrote:
> On Mon, 15 Jun 2009 09:29:33 -0500, Nick Craig-Wood <nick at craig-wood.com> wrote:
> >Hrvoje Niksic <hniksic at xemacs.org> wrote:
> >> Nick Craig-Wood <nick at craig-wood.com> writes:
> >>
> >> > Here is a ctypes generator listdir for unix-like OSes.
> >>
> >> ctypes code scares me with its duplication of the contents of system
> >> headers. I understand its use as a proof of concept, or for hacks one
> >> needs right now, but can anyone seriously propose using this kind of
> >> code in a Python program? For example, this seems much more
> >> "Linux-only", or possibly even "32-bit-Linux-only", than
> >> "unix-like":
> >
> >It was a proof of concept certainly..
> >
> >It can be done properly with gccxml though which converts structures
> >into ctypes definitions.
> >
> >That said the dirent struct is specified by POSIX so if you get the
> >correct types for all the individual members then it should be correct
> >everywhere. Maybe ;-)
>
> The problem is that POSIX specifies the fields with types like off_t and
> ino_t. Since ctypes doesn't know anything about these types, application
> code has to specify their size and other attributes. As these vary from
> platform to platform, you can't get it correct without asking a real C
> compiler.
These types could be part of ctypes. After all, ctypes knows how big a
long is on all platforms, and it knows that a uint32_t is the same on
all platforms, so it could conceivably know how big an off_t or an
ino_t is too.
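As a rough illustration of what ctypes already knows, here is a minimal
sketch that uses only the standard ctypes module. The labels in the dict
are just my own names for the C types. As far as I know there is no
c_off_t or c_ino_t to add to it, which is the gap under discussion.

```python
import ctypes

# Sizes, in bytes, of C types that ctypes ships with. The values for
# short/int/long depend on the platform; the fixed-width ones do not.
sizes = {
    "short": ctypes.sizeof(ctypes.c_short),
    "int": ctypes.sizeof(ctypes.c_int),
    "long": ctypes.sizeof(ctypes.c_long),
    "int32_t": ctypes.sizeof(ctypes.c_int32),
    "uint32_t": ctypes.sizeof(ctypes.c_uint32),
    "uint64_t": ctypes.sizeof(ctypes.c_uint64),
}

for name, size in sizes.items():
    print(f"{name:9s} {size} bytes")
```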
> In other words, POSIX talks about APIs and ctypes deals with ABIs.
>
> http://pypi.python.org/pypi/ctypes_configure/0.1 helps with the problem,
> and is a bit more accessible than gccxml.
I haven't seen that before - looks interesting.
> It is basically correct to say that using ctypes without using something
> like gccxml or ctypes_configure will give you non-portable code.
Well, it depends on whether the API is specified in types that ctypes
understands, e.g. short, int, long, int32_t, uint64_t etc. A lot of
interfaces are specified exactly like that and work just fine with
ctypes in a portable way. I agree with you that struct dirent
probably isn't one of those though!
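For instance, here is a small sketch of the portable case. It assumes a
unix-like system where ctypes.CDLL(None) exposes the C library; labs()
and strlen() are standard C functions whose prototypes use only types
ctypes has built in.

```python
import ctypes

# Assumption: a unix-like OS where CDLL(None) gives access to libc symbols.
libc = ctypes.CDLL(None)

# long labs(long): the whole prototype is expressed in a type ctypes knows.
libc.labs.argtypes = [ctypes.c_long]
libc.labs.restype = ctypes.c_long

# size_t strlen(const char *): likewise, no platform-specific typedefs.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.labs(-42))          # 42
print(libc.strlen(b"dirent"))  # 6
```

Nothing here needs a struct layout or a typedef whose size varies by
platform, which is why it travels well.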
I think it would be relatively easy to implement the code I demonstrated
in a portable way though... I'd do it by defining dirent as a block
of memory and then, on the first run, finding a known filename in the
block. That establishes the offset of the name field, which is all we
are interested in for the OP's problem.
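A rough, untested-across-platforms sketch of that idea follows. The
names PROBE_NAME, MAX_HEADER, find_name_offset and listdir_iter are made
up for illustration, and the 32-byte bound on the name offset is my
assumption rather than anything POSIX promises. It also assumes a
unix-like libc reachable through ctypes.CDLL(None).

```python
import ctypes
import os
import tempfile

# Assumption: a unix-like OS where CDLL(None) exposes opendir/readdir/closedir.
libc = ctypes.CDLL(None, use_errno=True)
libc.opendir.argtypes = [ctypes.c_char_p]
libc.opendir.restype = ctypes.c_void_p
libc.readdir.argtypes = [ctypes.c_void_p]
libc.readdir.restype = ctypes.c_void_p   # treat struct dirent as opaque memory
libc.closedir.argtypes = [ctypes.c_void_p]
libc.closedir.restype = ctypes.c_int

PROBE_NAME = b"dirent_offset_probe"  # hypothetical marker filename
MAX_HEADER = 32                      # assumed upper bound on the name offset


def _raw_entries(path):
    """Yield the address of each dirent block; valid until the next step."""
    dirp = libc.opendir(os.fsencode(path))
    if not dirp:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path)
    try:
        while True:
            entry = libc.readdir(dirp)
            if not entry:
                break
            yield entry
    finally:
        libc.closedir(dirp)


def find_name_offset():
    """Locate the name field by searching dirent blocks for a known filename."""
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, os.fsdecode(PROBE_NAME)), "w").close()
        needle = PROBE_NAME + b"\0"
        window = MAX_HEADER + len(needle)
        hits = []
        for entry in _raw_entries(tmp):
            pos = ctypes.string_at(entry, window).find(needle)
            if pos >= 0:
                hits.append(pos)
    if not hits:
        raise RuntimeError("probe filename not found in any dirent block")
    # A window that starts at an earlier entry can overlap the probe's
    # entry and report a larger position, so the smallest hit is the offset.
    return min(hits)


def listdir_iter(path, name_offset):
    """Generate filenames in path, reading each name at the found offset."""
    for entry in _raw_entries(path):
        name = ctypes.string_at(entry + name_offset)  # reads up to the NUL
        if name not in (b".", b".."):
            yield os.fsdecode(name)
```

Usage would be something like offset = find_name_offset() once at
startup, then for name in listdir_iter("/some/dir", offset). Whatever
dirent layout the libc's readdir() really uses, the probe measures it
rather than guessing it from headers.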
--
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick