[Python-ideas] revisit pep 377: good use case?

Ethan Furman ethan at stoneleaf.us
Wed Feb 29 22:48:26 CET 2012


Ethan Furman wrote:
re-posting to list

Craig Yoshioka wrote:
 > Here is what the context might look like:
 >
 > class Uncached(object):
 >   def __init__(self,path):
 >     self.path = path
 >     self.lock = path + '.locked'
 >   def __enter__(self):
 >     if os.path.exists(self.path):
 >        return SKipWithBlock # skips body goes straight to __exit__
 >     try:
 >         os.close(os.open(self.lock,os.O_CREAT|os.O_EXCL|os.O_RDWR))
 >     except OSError as e:
 >         if e.errno != errno.EEXIST:
 >             raise
 >         while os.path.exists(self.lock):
 >             time.sleep(0.1)
 >         return self.__enter__()
 >     return self.path
 >   def __exit__(self,et,ev,st):
 >     if os.path.exists(self.lock):
 >       os.unlink(self.lock)
 >
 > class Cache(object):
 >   def __init__(self,*args,**kwargs):
 >      self.base = os.path.join(CACHE_DIR,hashon(args,kwargs))
 >   #.....
 >   def create(self,path):
 >      return Uncached(os.path.join(self.base,path))
 >   #.....
 >
 > def cached(func):
 >   def wrapper(*args,**kwargs):
 >     cache = Cache(*args,**kwargs)
 >     return func(cache,*args,**kwargs)
 >   return wrapper
 >
 > ---------------------------------------------------------------------
 > Person using code:
 > ---------------------------------------------------------------------
 >
 > @cached
 > def createdata(cache,x):
 >   path = cache.pathfor('output.data')
 >   with cache.create(path) as cpath:
 >     with open(cpath,'wb') as cfile:
 >       cfile.write(x*10000)
 >   return path
 >
 > pool.map(createdata,['x','x','t','x','t'])
 >
 > ---------------------------------------------------------------------
 >
 > so separate processes return the path to the cached data and create it
 > if it doesn't exist, and even wait if another process is working on
 > it.
 >
 > my collaborators could hopefully very easily wrap their programs with
 > minimal effort using the cleanest syntax possible,
 > and since inputs get hashed to consistent output paths for each
 > wrapped function, the wrapped functions can be easily combined,
 > chained, etc. and behind the scenes they are reusing as much work as
 > possible.
 >
 > Here are the current possible alternatives:
 >
 > 1. use the passed var as a flag, they must insert the if for every use
 > of the context, if not, then cached results get recomputed
 >
 > @cached
 > def createdata(cache,x):
 >   path = cache.pathfor('output.data')
 >   with cache.create(path) as cpath:
 >     if not cpath: return
 >     with open(cpath,'wb') as cfile:
 >       cfile.write(x*10000)
 >   return path
 >
 > 2. using the for loop and an iterator instead of a context, is more
 > fool-proof, but a bit confusing?
 >
 > @cached
 > def createdata(cache,x):
 >   path = cache.pathfor('output.data')
 >   for cpath in cache.create(path):
 >     if not cpath: return
 >     with open(cpath,'wb') as cfile:
 >       cfile.write(x*10000)
 >   return path
 >
 > 3. using a class the outputs and caching function need to be specified
 > separately so that calls can be scripted together, also a lot more
 > boilerplate:
 >
 > class createdata(CachedWrapper):
 >   def outputs(self,x):
 >     self.outputs += [self.cache.pathfor('output.data')]
 >   def tocache(self,x):
 >     with open(self.outputs[0],'wb') as cfile:
 >       cfile.write(x*10000)




More information about the Python-ideas mailing list