Fri Oct 6 09:11:38 EDT 2017

New issue 2673: PyPy3 failure with pandas C-engine

Omer Ben-Amram:

I have both pypy2 and pypy3 installed on my Mac.

I'm running into this error when trying to read a CSV file:

import sys
print(sys.version)

> 3.5.3 (d72f9800a42b, Oct 06 2017, 09:04:27)
[PyPy 5.9.0-beta0 with GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)]

import pandas as pd

df = pd.DataFrame({'a': 1, 'b': 2}, index=pd.RangeIndex(2))

ValueError                                Traceback (most recent call last)
<ipython-input-11-beb04ea51840> in <module>()
      4 df.to_csv('test.csv')
----> 6 pd.read_csv('./test.csv')

~/pypy3-5.9-osx/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
    653                     skip_blank_lines=skip_blank_lines)
--> 655         return _read(filepath_or_buffer, kwds)
    657     parser_f.__name__ = name

~/pypy3-5.9-osx/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
    404     # Create the parser.
--> 405     parser = TextFileReader(filepath_or_buffer, **kwds)
    407     if chunksize or iterator:

~/pypy3-5.9-osx/site-packages/pandas/io/parsers.py in __init__(self, f, engine, **kwds)
    762             self.options['has_index_names'] = kwds['has_index_names']
--> 764         self._make_engine(self.engine)
    766     def close(self):

~/pypy3-5.9-osx/site-packages/pandas/io/parsers.py in _make_engine(self, engine)
    983     def _make_engine(self, engine='c'):
    984         if engine == 'c':
--> 985             self._engine = CParserWrapper(self.f, **self.options)
    986         else:
    987             if engine == 'python':

~/pypy3-5.9-osx/site-packages/pandas/io/parsers.py in __init__(self, src, **kwds)
   1603         kwds['allow_leading_cols'] = self.index_col is not False
-> 1605         self._reader = parsers.TextReader(src, **kwds)
   1607         # XXX

ValueError: only single character unicode strings can be converted to Py_UCS4, got length 0

It works if I specify the "python" engine:

`pd.read_csv('./test.csv', engine='python')`

It also works with pypy2 5.9 with the C engine.
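
Until the C engine works, the fallback could be automated. This is only a minimal sketch: it assumes the failure always surfaces as a `ValueError`, as in the traceback above, and `read_csv_with_fallback` is a hypothetical helper name, not part of pandas. The reader function is passed in as an argument, so the helper itself does not import pandas.

```python
def read_csv_with_fallback(read_csv, path, engines=("c", "python"), **kwargs):
    """Try each parser engine in order and return (result, engine_used).

    `read_csv` is the reader callable (for example `pd.read_csv`).
    Hypothetical helper: it assumes a broken engine raises ValueError.
    """
    if not engines:
        raise ValueError("at least one engine name is required")

    last_error = None
    for engine in engines:
        try:
            # Forward any extra keyword arguments unchanged to the reader.
            return read_csv(path, engine=engine, **kwargs), engine
        except ValueError as exc:
            # Remember the failure and move on to the next engine.
            last_error = exc

    # Every engine failed: re-raise the most recent error.
    raise last_error
```

With pandas imported as `pd`, `read_csv_with_fallback(pd.read_csv, './test.csv')` would return the parsed frame together with the name of the engine that succeeded. Note that it also hides genuine `ValueError`s raised by the first engine for malformed input, so it is a stopgap rather than a fix.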

Any idea why this might be happening?
I'm aware that `Py_UCS4` does not exist in Python 2, but I'm not sure what would cause this. Maybe some incompatibility with cpyext?


