[Tutor] Get a single random sample
d at davea.name
Fri Sep 9 14:24:01 CEST 2011
On 09/09/2011 06:44 AM, kitty wrote:
> I'm new to python and I have read through the tutorial on:
> which was really good, but I have been an R user for 7 years and am
> finding it difficult to do even basic things in python, for example I want
> to import my data (a tab-delimited .txt file) so that I can index and select
> a random sample of one column based on another column. my data has
> 2 columns named 'area' and 'change.dens'.
> In R I would just
> data<-read.table("FILE PATH\\Road.density.municipio.all.txt", header=T)
> #header =T gives columns their headings so that I can call each individually
> Then to Index I would simply:
> subset<-change.dens[area<2000&area>700] # so return change.dens values that
> have corresponding 'area's of between 700 and 2000
> then to randomly sample a value from that I just need to
> My question is how do I get python to do this???
> Sorry I know it is very basic but I just can't seem to get going,
> Thank you
No, it's not basic. If that R fragment is self-contained, apparently R
makes lots of assumptions about its environment. Python can handle all
of this, but not so succinctly and not without libraries (modules). In
particular, your problem would be addressed with the csv module and the
The following (untested) code might get you started.
    import csv

    infilename = "path to file/Road.density.municipio.all.txt"
    infile = open(infilename, "r")
    incsv = csv.DictReader(infile, delimiter="\t")  # \t is the tab character
    for index, item in enumerate(incsv):
        print(index, item)  # show each row while experimenting
    infile.close()
item will be a dict representing one line of the file, each time
through the loop. You can choose some of those with an if statement,
and build a list of the dicts that are useful.
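
As a rough sketch of that filtering step (Python 3 assumed, not tested against your real file): the sample data, the select_rows name and the use of random.choice are my own assumptions for illustration. Only the column names 'area' and 'change.dens' and the 700 to 2000 range come from your description.

```python
import csv
import io
import random

# Made-up sample data standing in for the real tab-delimited file.
SAMPLE = "area\tchange.dens\n500\t0.10\n800\t0.25\n1500\t0.40\n2500\t0.55\n"


def select_rows(infile, low=700, high=2000):
    """Return the row dicts whose 'area' lies strictly between low and high."""
    selected = []
    for item in csv.DictReader(infile, delimiter="\t"):
        area = float(item["area"])  # csv gives strings, so convert first
        if low < area < high:
            selected.append(item)
    return selected


rows = select_rows(io.StringIO(SAMPLE))

# Pull out the change.dens values of the selected rows as numbers.
values = []
for row in rows:
    values.append(float(row["change.dens"]))

sample = random.choice(values)  # one value picked at random from the subset
print(values, sample)
```

With a real file you would pass the object returned by open() instead of the io.StringIO wrapper.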
You can combine some of these steps using things like list
comprehensions, but it's easier to take it a step at a time and see what
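
For comparison, here is roughly what a list comprehension version might look like (Python 3 assumed). As before, the sample data is made up and random.choice is my assumption about how you would want to draw the single value.

```python
import csv
import io
import random

# Made-up sample data standing in for the real tab-delimited file.
SAMPLE = "area\tchange.dens\n500\t0.10\n800\t0.25\n1500\t0.40\n2500\t0.55\n"

reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")

# Filter and convert in one expression: keep change.dens where 700 < area < 2000.
subset = [
    float(row["change.dens"])
    for row in reader
    if 700 < float(row["area"]) < 2000
]

print(random.choice(subset))  # one value picked at random from the subset
```

It is shorter, but harder to debug when a row does not convert cleanly, which is one reason to start with the explicit loop.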