How to capture a CSV file and read it into a Pandas Dataframe?

Skip Montanaro skip.montanaro at gmail.com
Wed Apr 5 17:18:57 EDT 2017


David> This is something very different.

David> New thinking and methods are needed.

David> Try to click on the following link

David> European Commission : CORDIS : Search : Results page
<http://cordis.europa.eu/search/result_en?q=uk&format=csv>

Hopefully not too different. :-)

It looks to me like UTF-16 or UTF-32, or some other encoding that needs
a BOM. I was able to read it using Python's csv module (this is in
Python 3):

>>> import csv
>>> f = open("cordis-search-results.csv")
>>> rdr = csv.DictReader(f, delimiter=';')
>>> for row in rdr:
...   print(row["Title"])
...
Incremental Nonlinear flight Control supplemented with Envelope ProtecTION
techniques
AdvancEd aicRaft-noIse-AlLeviation devIceS using meTamaterials
Imaging Biomarkers (IBs) for Safer Drugs: Validation of Translational
Imaging Methods in Drug Safety Assessment (IB4SD-TRISTAN)
Big Data for Better Outcomes, Policy Innovation and Healthcare System
Transformation (DO->IT)
Translational quantitative systems toxicology to improve the understanding
of the safety of medicines
Real world Outcomes across the AD spectrum for better care: Multi-modal
data Access Platform
Models Of Patient Engagement for Alzheimer’s Disease
INtestinal Tissue ENgineering Solution
Small vessel diseases in a mechanistic perspective: Targets for
InterventionAffected pathways and mechanistic exploitation for prevention
of stroke and dementia
How does dopamine link QMP with reproductive repression to mediate colony
harmony and productivity in the honeybee?
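
If the BOM guess is right, one way to check it is to look at the first few
bytes of the file and pick the codec from that, rather than relying on the
platform default that open() uses. This is only a rough sketch: the helper
names guess_encoding and read_rows are mine, not part of the csv module, and
falling back to UTF-8 when no BOM is found is an assumption.

```python
import codecs
import csv
import io

# Check the UTF-32 marks first: the UTF-32-LE BOM begins with the same
# two bytes as the UTF-16-LE BOM, so order matters here.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(raw, default="utf-8"):
    """Return a codec name based on a leading BOM (hypothetical helper)."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    # No BOM found; the default is an assumption, not a detection.
    return default


def read_rows(raw, delimiter=";"):
    """Decode raw bytes and parse them into a list of dicts, one per row."""
    text = raw.decode(guess_encoding(raw))
    # newline="" leaves line endings alone so the csv module can handle them.
    return list(csv.DictReader(io.StringIO(text, newline=""), delimiter=delimiter))
```

For the file above that would be something like
read_rows(open("cordis-search-results.csv", "rb").read()).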


Note that I had to specify the delimiter as a semicolon. That also works
with pandas.read_csv:

>>> import pandas as pd
>>> df = pd.read_csv("cordis-search-results.csv", sep=";")
>>> print(df["Title"])
0    Incremental Nonlinear flight Control supplemen...
1    AdvancEd aicRaft-noIse-AlLeviation devIceS usi...
2    Imaging Biomarkers (IBs) for Safer Drugs: Vali...
3    Big Data for Better Outcomes, Policy Innovatio...
4    Translational quantitative systems toxicology ...
5    Real world Outcomes across the AD spectrum for...
6    Models Of Patient Engagement for Alzheimer’s D...
7               INtestinal Tissue ENgineering Solution
8    Small vessel diseases in a mechanistic perspec...
9    How does dopamine link QMP with reproductive r...
Name: Title, dtype: object
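
As for the "capture" part of the question, I saved the file from the browser,
but the download could probably be scripted too. Here is a minimal sketch
using only the standard library. The function names fetch_bytes and save_csv
are made up for illustration, the 30 second timeout is arbitrary, and I am
assuming the URL from the quoted message returns the CSV directly without
needing cookies or a redirect through a results page.

```python
import urllib.request

# URL copied from the quoted message; that it serves CSV as-is is an assumption.
CORDIS_URL = "http://cordis.europa.eu/search/result_en?q=uk&format=csv"


def fetch_bytes(url, timeout=30):
    """Return the raw response body for url as bytes (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def save_csv(url, path, timeout=30):
    """Download url to path without decoding it; return the byte count."""
    raw = fetch_bytes(url, timeout)
    # Write in binary mode so whatever encoding the server used is preserved.
    with open(path, "wb") as out:
        out.write(raw)
    return len(raw)
```

After save_csv(CORDIS_URL, "cordis-search-results.csv") the file can be read
as shown above. pandas.read_csv also accepts a file-like object, so wrapping
the result of fetch_bytes in io.BytesIO and passing sep=";" should work
without writing anything to disk.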


Skip

