[scikit-learn] Heisenbug?
Dan Stromberg
dstromberg at grokstream.com
Mon Dec 16 20:02:09 EST 2019
Hi folks.
I'm new to Scikit-learn.
I have a very large Python project that seems to have a heisenbug which is
manifesting in scikit-learn code.
Short of constructing an SSCCE, are there any magical techniques I should
try for pinning down the precise cause? Like valgrind or something?
An SSCCE will most likely be pretty painful: the project has copious
shared, mutable state, and I've already tried a largish test program that
calls into the same code path, but the error manifested 0 times in 100 runs.
It's quite possible the root cause will turn out to be some other part of
the software stack.
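One way to look for the origin elsewhere in the stack, sketched under assumptions: NumPy can be told to raise at the operation that first produces an inf or NaN, instead of letting the value flow on until validation rejects it. The helper name run_with_fp_traps is hypothetical. This only traps NumPy's own floating-point operations, not arithmetic inside compiled code such as XGBoost, and the setting applies to the calling thread, so it would not reach joblib worker processes unless the search runs with n_jobs=1.

```python
import numpy as np


def run_with_fp_traps(func, *args, **kwargs):
    """Call func with NumPy floating-point errors promoted to exceptions.

    Hypothetical helper: divide-by-zero, overflow, and invalid operations
    inside NumPy raise FloatingPointError at the point where they occur.
    """
    with np.errstate(divide="raise", over="raise", invalid="raise"):
        return func(*args, **kwargs)


if __name__ == "__main__":
    try:
        run_with_fp_traps(np.divide, np.array([1.0]), np.array([0.0]))
    except FloatingPointError as exc:
        print("trapped:", exc)
```

Wrapping the feature-preparation step that builds X_train in such a call would show whether the bad value is created by NumPy code before scikit-learn ever sees it.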
The traceback from pytest looks like:
sequential/test_training.py:101:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../rt/classifier/coach.py:146: in train
    **self.classifier_section
../domain/classifier/factories/classifier_academy.py:115: in create_classifier
    **kwargs)
../domain/classifier/factories/imp/xgb_factory.py:164: in create
    clf_random.fit(X_train, y_train)
../../../../.local/lib/python3.6/site-packages/sklearn/model_selection/_search.py:722: in fit
    self._run_search(evaluate_candidates)
../../../../.local/lib/python3.6/site-packages/sklearn/model_selection/_search.py:1515: in _run_search
    random_state=self.random_state))
../../../../.local/lib/python3.6/site-packages/sklearn/model_selection/_search.py:711: in evaluate_candidates
    cv.split(X, y, groups)))
../../../../.local/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py:996: in __call__
    self.retrieve()
../../../../.local/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py:899: in retrieve
    self._output.extend(job.get(timeout=self.timeout))
../../../../.local/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py:517: in wrap_future_result
    return future.result(timeout=timeout)
/usr/lib/python3.6/concurrent/futures/_base.py:425: in result
    return self.__get_result()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <Future at 0x7f15571ec7f0 state=finished raised ValueError>
    def __get_result(self):
        if self._exception:
>           raise self._exception
E           ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
/usr/lib/python3.6/concurrent/futures/_base.py:384: ValueError
The above exception is raised in roughly 12 to 14 of every 100 runs of the
full-blown automated test suite.
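A cheap check that may help before attempting an SSCCE, offered as a sketch rather than a diagnosis: the ValueError is the message scikit-learn's input validation emits when an array contains NaN, infinity, or a value outside the target dtype's range. Scanning the training data just before the fit call would show whether the bad values are already present at that point. The helper name find_non_finite is hypothetical, and the sketch assumes the data is a numeric array that NumPy can convert to float64.

```python
import numpy as np


def find_non_finite(X, dtype=np.float32):
    """Return [row, col] indices of entries that are NaN, infinite,
    or too large in magnitude to be represented in `dtype`.

    Hypothetical helper; assumes X is convertible to a float64 array.
    """
    arr = np.asarray(X, dtype=np.float64)
    if arr.ndim == 1:
        arr = arr.reshape(-1, 1)
    limit = np.finfo(dtype).max
    # Comparisons involving NaN are False; silence any warning they emit.
    with np.errstate(invalid="ignore"):
        bad = ~np.isfinite(arr) | (np.abs(arr) > limit)
    return np.argwhere(bad)


if __name__ == "__main__":
    sample = np.array([[1.0, 2.0], [np.nan, 1e39], [np.inf, 3.0]])
    for row, col in find_non_finite(sample):
        print("bad value at row", row, "column", col, ":", sample[row, col])
```

Calling it on X_train immediately before clf_random.fit(X_train, y_train) and logging or dumping the array when the result is non-empty would at least separate "the data is already bad on entry" from "the data is corrupted during the parallel search".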
Thanks for the cool software.