anyone know pandas ? Don't understand error: NotImplementedError...
someone
newsboost at gmail.com
Thu Apr 18 15:30:20 EDT 2013
On 04/18/2013 04:07 PM, Neil Cerutti wrote:
> On 2013-04-18, Wayne Werner <wayne at waynewerner.com> wrote:
>> On Wed, 17 Apr 2013, someone wrote:
....
>> Go to line 214, and take a look-see at what you find. My guess is it will
>> be something like:
>>
>>     def rule_code():
>>         raise NotImplementedError()
>>
>> Which is terribly unhelpful.
>
> It most likely means that the program is instantiating an
> abstract base class when it should be using one of its subclasses
> instead, e.g., BusinessDay, MonthEnd, MonthBegin,
> BusinessMonthEnd, etc.
>
> http://pandas.pydata.org/pandas-docs/dev/timeseries.html
Hi Neil and Wayne,
Thank you very much for your suggestions... I now found out something:
In the function:
def convertListPairToTimeSeries(dList, cList):
    ...
    ...
    # create the timeseries
    ts = pandas.Series(cListL, index=indx)
    # fill in missing days
    #ts = ts.asfreq(pandas.datetools.DateOffset())
    return ts
I had to comment out the last line before the return statement (not sure
what that line is supposed to do in the first place)...
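Judging by its "fill in missing days" comment, the disabled line was presumably meant to reindex the series onto a regular daily grid, so that days without data show up as NaN. A minimal sketch of that idea, assuming a pandas version whose asfreq accepts the "D" (calendar day) frequency alias; the sample values are made up for illustration and this is not necessarily what the original author of the function intended:

```python
import datetime
import pandas

# Same four dates as in the program; the values are illustrative only.
dates = [datetime.datetime(2011, 12, 1),
         datetime.datetime(2011, 12, 2),
         datetime.datetime(2011, 12, 3),
         datetime.datetime(2011, 12, 10)]
ts = pandas.Series([1.0, 2.0, 3.0, 4.0], index=pandas.DatetimeIndex(dates))

# Assumption: "D" stands in for the bare DateOffset() in the disabled line.
# asfreq reindexes onto every day from the first to the last timestamp and
# leaves NaN wherever the original series had no value.
daily = ts.asfreq("D")

print(len(daily))                  # 10 rows: 2011-12-01 through 2011-12-10
print(int(daily.isnull().sum()))   # 6 days had no data
```

Passing a concrete frequency rather than a bare DateOffset() would also be in line with Neil's suggestion to use a specific offset instead of the base class.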
Now the program runs, but no plot is shown. Then I found out that I had
to add:
    import matplotlib.pyplot as plt
at the top of the program and add the following at the bottom of the
program:
    plt.show()
Final program:
==================
#!/usr/bin/python
import pandas
import datetime
import numpy
import ipdb
import matplotlib.pyplot as plt
datesList = [datetime.date(2011,12,1),
             datetime.date(2011,12,2),
             datetime.date(2011,12,3),
             datetime.date(2011,12,10)]
countsList = numpy.random.randn(len(datesList))
startData = datetime.datetime(2011,12,3)
endData = datetime.datetime(2011,12,8)
def convertListPairToTimeSeries(dList, cList):
    # my dateList had date objects, so convert back to datetime objects
    dListDT = [datetime.datetime.combine(x, datetime.time()) for x in dList]
    # found that NaN didn't work if the cList contained int data
    cListL = [float(x) for x in cList]
    # create the index from the datetimes list
    indx = pandas.Index(dListDT)
    # create the timeseries
    ts = pandas.Series(cListL, index=indx)
    # fill in missing days
    #ts = ts.asfreq(pandas.datetools.DateOffset())
    return ts
print "\nOriginal datesList list:\n", datesList
tSeries = convertListPairToTimeSeries(datesList, countsList)
print "\nPandas timeseries:\n", tSeries
# use slicing to change length of data
tSeriesSlice = tSeries.ix[startData:endData]
print "\nPandas timeseries sliced between", startData.date(), \
    "and", endData.date(), ":\n", tSeriesSlice
# use truncate instead of slicing to change length of data
tSeriesTruncate = tSeries.truncate(before=startData, after=endData)
print "\nPandas timeseries truncated between", startData.date(), \
    "and", endData.date(), ":\n", tSeriesTruncate
# my data had lots of gaps that were actually 0 values, not missing data
# So I used this to fix the NaN outside the known outage
startOutage = datetime.datetime(2011,12,7)
endOutage = datetime.datetime(2011,12,8)
tsFilled = tSeries.fillna(0)
# set the known outage values back to NAN
tsFilled.ix[startOutage:endOutage] = numpy.NAN
print "\nPandas timeseries NaN reset to 0 outside known outage between", \
    startOutage.date(), "and", endOutage.date(), ":\n", tsFilled
print "\nPandas series.tail(1) and series.head(1) are handy for " +\
    "checking ends of list:\n", tsFilled.head(1), tsFilled.tail(1)
print
tsFilled.plot()
plt.show()
==================
This seems to work, although I don't fully understand it, as I'm pretty
new to pandas...
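One thing worth noting: with the asfreq line disabled, the series holds only the four original dates, so fillna(0) and the outage assignment have nothing to act on. A hedged sketch of how the gap handling might behave once the series sits on a daily grid, assuming asfreq("D") as the reindexing step and .loc for label-based slicing (both are assumptions, the program above uses .ix and has the reindexing disabled); the values are made up for illustration:

```python
import datetime
import numpy
import pandas

# Same four dates as in the program; the values are illustrative only.
dates = [datetime.datetime(2011, 12, d) for d in (1, 2, 3, 10)]
ts = pandas.Series([1.0, 2.0, 3.0, 4.0], index=pandas.DatetimeIndex(dates))

# Assumption: reindex onto calendar days so the gaps become NaN.
daily = ts.asfreq("D")

# Treat every gap as a real 0 value...
filled = daily.fillna(0)

# ...then mark the known outage as missing again. Label-based slicing
# with .loc includes both end points.
start_outage = datetime.datetime(2011, 12, 7)
end_outage = datetime.datetime(2011, 12, 8)
filled.loc[start_outage:end_outage] = numpy.nan

print(filled)
```

In this sketch only 2011-12-07 and 2011-12-08 end up as NaN; the other days without data become 0 and the four original values are unchanged.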