frequency of values in a field
noydb
jenn.duerr at gmail.com
Wed Feb 9 15:44:45 EST 2011
On Feb 9, 3:28 pm, Ethan Furman <et... at stoneleaf.us> wrote:
> noydb wrote:
>
> > Paul Rubin wrote:
>
> >> The Decimal module is pretty slow but is conceptually probably the right
> >> way to do this. With just 50k records it shouldn't be too bad. With
> >> more records you might look for a faster way.
>
> >> from decimal import Decimal as D
> >> from collections import defaultdict
>
> >> records = ['3.14159','2.71828','3.142857']
>
> >> td = defaultdict(int)
> >> for x in records:
> >>     td[D(x).quantize(D('0.01'))] += 1
>
> >> print td
>
> > I played with this - it worked. I'm using Python 2.6, so Counter is not available.
>
> > I require an output text file of sorted "key value" so I added
> > (further code to write out to an actual textfile, not important here)
> >>> for z in sorted(set(td)):
> >>>     print z, td[z]
>
> > So it seems the idea is to add all the records in the particular field
> > of interest into a list (records). How does one do this in pure
> > Python?
> > Normally in my work with gis/arcgis sw, I would do a search cursor on
> > the DBF file and add each value in the particular field into a list
> > (to populate records above). Something like:
>
> > --> import arcgisscripting
> > --> # Create the geoprocessor object
> > --> gp = arcgisscripting.create()
> > --> records_list = []
> > --> cur = gp.SearchCursor(dbfTable)
> > --> row = cur.Next()
> > --> while row:
> > -->     value = row.particular_field
> > -->     records_list.append(value)
> > -->     row = cur.Next()  # advance the cursor, otherwise the loop never ends
>
> Are you trying to get away from arcgisscripting? There is a pure python
> dbf package on PyPI (I know, I put it there ;) that you can use to
> access the .dbf file in question (assuming it's a dBase III, IV, or
> FoxPro format).
>
> http://pypi.python.org/pypi/dbf/0.88.16 if you're interested.
>
> Using it, the code above could be:
>
> -----------------------------------------------------
> import dbf
> from collections import defaultdict
> from decimal import Decimal
>
> table = dbf.Table('path/to/table/table_name')
>
> freq = defaultdict(int)
> for record in table:
>     value = Decimal(record['field_of_interest'])
>     key = value.quantize(Decimal('0.01'))
>     freq[key] += 1
>
> for z in sorted(freq):
>     print z, freq[z]
>
> -----------------------------------------------------
>
> Numeric/Float field types are returned as python floats*, so there may
> be slight discrepancies between the stored value and the returned value.
>
> Hope this helps.
>
> ~Ethan~
>
> *Unless created with zero decimal places, in which case they are
> returned as python integers.
Oops, didn't see this before I posted last.
Thanks! I'll try this, looks good, makes sense.
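
For reference, here is a minimal sketch of how the pieces in this thread might fit together: rounding each value to two decimal places, counting, and writing sorted "key value" lines. It is written for Python 3, and it is not the dbf package's API. The names field_frequencies, write_frequencies and the sample values are hypothetical stand-ins for whatever a real table returns. Going through str() before Decimal is an assumption on my part, meant to sidestep the float discrepancies Ethan mentions.

```python
import io
from collections import defaultdict
from decimal import Decimal


def field_frequencies(values, places="0.01"):
    """Count how often each value occurs after rounding to `places`."""
    step = Decimal(places)
    freq = defaultdict(int)
    for value in values:
        # Convert via str() first: Decimal(3.14) built straight from a float
        # would carry the float's binary representation error.
        key = Decimal(str(value)).quantize(step)
        freq[key] += 1
    return freq


def write_frequencies(freq, out):
    """Write one 'key value' line per distinct key, in ascending key order."""
    for key in sorted(freq):
        out.write("%s %d\n" % (key, freq[key]))


# Hypothetical sample values standing in for one numeric field of a table.
sample = [3.14159, 2.71828, 3.142857, 3.14]
freq = field_frequencies(sample)

# An in-memory buffer keeps the sketch self-contained; pass an object
# returned by open(..., 'w') instead to produce an actual text file.
buf = io.StringIO()
write_frequencies(freq, buf)
print(buf.getvalue())
```

Keeping the counting separate from the writing means the same function should work whether the values come from a search cursor, the dbf package, or a plain list.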