in place list modification necessary? What's a better idiom?
andrew cooke
andrew at acooke.org
Tue Apr 7 06:37:01 EDT 2009
Carl Banks wrote:
> import collections
> import itertools
>
> def createInitialCluster(fileName):
> fixedPoints = []
> # quantization is a dict that assigns sequentially-increasing
> numbers
> # to values when reading keys that don't yet exit
> quantization = defaultdict.collections(itertools.count().next)
> with open(fileName, 'r') as f:
> for line in f:
> dimensions = []
> for s in line.rstrip('\n').split(","):
> if isNumeric(s):
> dimensions.append(float(s))
> else:
> dimensions.append(float(quantization[s]))
> fixedPoints.append(Point(dimensions))
> return Cluster(fixedPoints)
nice reply (i didn't know defaultdict worked like that - very neat).
two small things i noticed:
1 - do you need a separate quantization for each column? the code above
might give, for example, non-contiguous ranges of integers for a
particular column if a string occurs ("by accident" perhaps) in more than
one.
2 - don't bother with isNumeric. just return the cast value or catch the
exception:
[...]
try:
dimensions.append(float(s))
except:
dimensions.append(float(quantization[s]))
(not sure float() is needed there either if you're using a recent version
of python - only reason i can think of is to avoid integer division in
older versions).
andrew
More information about the Python-list
mailing list