[Python-ideas] revisit pep 377: good use case?
Ethan Furman
ethan at stoneleaf.us
Wed Feb 29 22:48:26 CET 2012
Ethan Furman wrote:
re-posting to list
Craig Yoshioka wrote:
> Here is what the context might look like:
>
> class Uncached(object):
> def __init__(self,path):
> self.path = path
> self.lock = path + '.locked'
> def __enter__(self):
> if os.path.exists(self.path):
> return SKipWithBlock # skips body goes straight to __exit__
> try:
> os.close(os.open(self.lock,os.O_CREAT|os.O_EXCL|os.O_RDWR))
> except OSError as e:
> if e.errno != errno.EEXIST:
> raise
> while os.path.exists(self.lock):
> time.sleep(0.1)
> return self.__enter__()
> return self.path
> def __exit__(self,et,ev,st):
> if os.path.exists(self.lock):
> os.unlink(self.lock)
>
> class Cache(object):
> def __init__(self,*args,**kwargs):
> self.base = os.path.join(CACHE_DIR,hashon(args,kwargs))
> #.....
> def create(self,path):
> return Uncached(os.path.join(self.base,path))
> #.....
>
> def cached(func):
> def wrapper(*args,**kwargs):
> cache = Cache(*args,**kwargs)
> return func(cache,*args,**kwargs)
> return wrapper
>
> ---------------------------------------------------------------------
> Person using code:
> ---------------------------------------------------------------------
>
> @cached
> def createdata(cache,x):
> path = cache.pathfor('output.data')
> with cache.create(path) as cpath:
> with open(cpath,'wb') as cfile:
> cfile.write(x*10000)
> return path
>
> pool.map(createdata,['x','x','t','x','t'])
>
> ---------------------------------------------------------------------
>
> so separate processes return the path to the cached data and create it
> if it doesn't exist, and even wait if another process is working on
> it.
>
> my collaborators could hopefully very easily wrap their programs with
> minimal effort using the cleanest syntax possible,
> and since inputs get hashed to consistent output paths for each
> wrapped function, the wrapped functions can be easily combined,
> chained, etc. and behind the scenes they are reusing as much work as
> possible.
>
> Here are the current possible alternatives:
>
> 1. use the passed var as a flag, they must insert the if for every use
> of the context, if not, then cached results get recomputed
>
> @cached
> def createdata(cache,x):
> path = cache.pathfor('output.data')
> with cache.create(path) as cpath:
> if not cpath: return
> with open(cpath,'wb') as cfile:
> cfile.write(x*10000)
> return path
>
> 2. using the for loop and an iterator instead of a context, is more
> fool-proof, but a bit confusing?
>
> @cached
> def createdata(cache,x):
> path = cache.pathfor('output.data')
> for cpath in cache.create(path):
> if not cpath: return
> with open(cpath,'wb') as cfile:
> cfile.write(x*10000)
> return path
>
> 3. using a class the outputs and caching function need to be specified
> separately so that calls can be scripted together, also a lot more
> boilerplate:
>
> class createdata(CachedWrapper):
> def outputs(self,x):
> self.outputs += [self.cache.pathfor('output.data')]
> def tocache(self,x):
> with open(self.outputs[0],'wb') as cfile:
> cfile.write(x*10000)
More information about the Python-ideas
mailing list