Having done something similar, I see three options (depending on the dataset):

1: Python to read, clean and classify data, then R to do the analysis (e.g.
regression analysis)

2: Python to read, clean and classify data, and Python for the analysis

3: All in R

If you want to use Python for the analysis, most people would probably use
Pandas for the data cleaning and SciPy for the stats. However, there are
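
As a rough illustration of option 2, here is a minimal sketch that cleans a small table with Pandas and fits a simple linear regression with SciPy. The data, the column names (dose, response) and the helper functions clean and fit_line are all made up for the example; your own dataset will need its own cleaning rules.

```python
import pandas as pd
from scipy import stats

# Hypothetical raw data: names and values are invented for illustration only.
raw = pd.DataFrame({
    "dose": ["1", "2", "3", "4", "n/a", "5"],
    "response": [2.0, 4.0, 6.0, 8.0, 9.0, None],
})


def clean(df):
    """Convert 'dose' to numbers and drop rows with missing values."""
    out = df.copy()
    # Unparseable entries such as "n/a" become NaN instead of raising.
    out["dose"] = pd.to_numeric(out["dose"], errors="coerce")
    return out.dropna(subset=["dose", "response"])


def fit_line(df):
    """Ordinary least-squares fit of response against dose."""
    return stats.linregress(df["dose"], df["response"])


cleaned = clean(raw)
result = fit_line(cleaned)
print(f"rows kept: {len(cleaned)}")
print(f"slope={result.slope:.3f}, intercept={result.intercept:.3f}")
```

For option 1 you would stop after the cleaning step, write the cleaned table out (for example with cleaned.to_csv) and read it into R for the regression.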

There is a tutorial that describes almost exactly the same problem as yours
here, using Pandas and some other packages:



