[Cython] Cython 0.17 beta 1 released

Yaroslav Halchenko lists at onerussian.com
Thu Jul 26 03:12:37 CEST 2012


actually I have not stated alternative variant since I thought it would
not be wise to 'waste' memory :  just store association between a
particular build and target module_name but now I have mentioned that
such code is pretty much there ... but incorrect and not used:

$> grep -e '\Wkey\W' -e '^def cython_inline' -e 'code_cache' Cython/Build/Inline.py 
_code_cache = {}
def cython_inline(code,
    key = orig_code, arg_sigs, sys.version_info, sys.executable, Cython.__version__
    module_name = "_cython_inline_" + hashlib.md5(str(key).encode('utf-8')).hexdigest()
        for key, value in literals.items():
            module_code = module_code.replace(key, value)
        _code_cache[key] = module_name


so 
1. key in for loop overrides the key tuple identifying the module_path
2. _code_cache is not used anywhere (but does waste memory although probably
   not much since there is not that many values of key I guess which
   it would get)

thus I wondered if it is ok to waste memory ( ;) ) should following patch be
used?  or should _code_cache be removed altogether (or am I missing its role?)

$> quilt diff
--- a/Cython/Build/Inline.py
+++ b/Cython/Build/Inline.py
@@ -138,13 +138,11 @@ def cython_inline(code,
     arg_sigs = tuple([(get_type(kwds[arg], ctx), arg) for arg in arg_names])
     key = orig_code, arg_sigs, sys.version_info, sys.executable, Cython.__version__
     module_name = "_cython_inline_" + hashlib.md5(str(key).encode('utf-8')).hexdigest()
-
-    so_ext = [ ext for ext,_,mod_type in imp.get_suffixes() if mod_type == imp.C_EXTENSION ][0]
-    module_path = os.path.join(lib_dir, module_name+so_ext)
+    module_path = _code_cache.get(module_name, None)
 
     if not os.path.exists(lib_dir):
         os.makedirs(lib_dir)
-    if force or not os.path.isfile(module_path):
+    if force or module_path is None or not os.path.isfile(module_path):
         cflags = []
         c_include_dirs = []
         qualified = re.compile(r'([.\w]+)[.]')
@@ -189,7 +187,9 @@ def __invoke(%(params)s):
         build_extension.build_temp = os.path.dirname(pyx_file)
         build_extension.build_lib  = lib_dir
         build_extension.run()
-        _code_cache[key] = module_name
+        # Should we check if module_path is not None either it still matches?
+        module_path = os.path.join(lib_dir, build_extension.get_ext_filename(module_name))
+        _code_cache[module_name] = module_path
 
     module = imp.load_dynamic(module_name, module_path)
     arg_list = [kwds[arg] for arg in arg_names]




On Wed, 25 Jul 2012, Yaroslav Halchenko wrote:


> On Wed, 25 Jul 2012, Robert Bradshaw wrote:

> > One essential feature of cython.inline(...) is that if the code has
> > already been compiled (and loaded) it should return very fast. This
> > would seem to add significant overhead. 

> that is what was my concern also with such an approach...  I am not sure
> if that is a significant overhead -- on my laptop (python 2.7):

> In [13]: !cat test_get_ext_time.py
> from distutils.core import Distribution
> from distutils.command.build_ext import build_ext

> def get_ext_filename(module_name):
>     dist = Distribution()
>     # Ensure the build respects distutils configuration by parsing
>     # the configuration files
>     config_files = dist.find_config_files()
>     dist.parse_config_files(config_files)
>     build_extension = build_ext(dist)
>     build_extension.finalize_options()

>     return build_extension.get_ext_filename(module_name)

> In [14]: %run test_get_ext_time.py

> In [15]: %timeit get_ext_filename('/asd/f.asdf/asdf/asd/fasdf/xx')
> 1000 loops, best of 3: 301 us per loop

> of cause it is a relatively big slowdown relatively to 3ms of solution
> based on imp.get_suffixes ...  and with cython 0.16 full dummy inline:

> In [4]: %timeit cython_inline('i=1')
> 1000 loops, best of 3: 445 us per loop

> so I guess indeed 300 us would be a significant overhead

> > Is the extension relatively
> > consistant? Perhaps it could be cached at module load time.

> could indeed be especially given the code of

>     def get_ext_filename(self, ext_name):
>         r"""Convert the name of an extension (eg. "foo.bar") into the name
>         of the file from which it will be loaded (eg. "foo/bar.so", or
>         "foo\bar.pyd").
>         """
>         from distutils.sysconfig import get_config_var
>         ext_path = ext_name.split('.')
>         # OS/2 has an 8 character module (extension) limit :-(
>         if os.name == "os2":
>             ext_path[len(ext_path) - 1] = ext_path[len(ext_path) - 1][:8]
>         # extensions in debug_mode are named 'module_d.pyd' under windows
>         so_ext = get_config_var('SO')
>         if os.name == 'nt' and self.debug:
>             return os.path.join(*ext_path) + '_d' + so_ext
>         return os.path.join(*ext_path) + so_ext

> suggesting only get_config_var and 1 self.debug variable which could affect .. which are probably would not be changed at run time.  Paranoid me
> though would still have added some check when extension does get built to
> verify that assumption was right and then spit out a warning and adjust
> module_name to match so that load_dynamic  doesn't fail... or would that be too
> much?

> would you like me to prep a perspective path or you would like to do that? ;)
-- 
Yaroslav O. Halchenko
Postdoctoral Fellow,   Department of Psychological and Brain Sciences
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        


More information about the cython-devel mailing list