[scikit-learn] Heisenbug?

Dan Stromberg dstromberg at grokstream.com
Tue Dec 17 10:50:28 EST 2019


Hi.

Overflow does sound possible.  We're sending semi-random values to the
test.
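
For what it's worth, here is a minimal sketch of how a value that is
perfectly finite as float64 could still trip a float32 finiteness check.
The helper name and the sample values are made up for illustration; I'm
assuming the data gets cast to float32 somewhere before it is validated.

```python
import numpy as np


def overflows_float32(values):
    """Flag values that are finite as float64 but not after a float32 cast."""
    as_float64 = np.asarray(values, dtype=np.float64)
    # Newer NumPy versions warn on overflow during a cast; silence that here.
    with np.errstate(over="ignore"):
        as_float32 = as_float64.astype(np.float32)
    return np.isfinite(as_float64) & ~np.isfinite(as_float32)


# Hypothetical inputs: 1e39 is above the float32 maximum (about 3.4e38).
sample = [1.0, 1e39, -2.5]
mask = overflows_float32(sample)
print(mask)
```

If semi-random test values occasionally land above the float32 range, a
check like this would flag them only on some runs, which would fit an
intermittent failure.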

I believe our systems are all x86_64, Linux.  Some are Ubuntu 16.04, some
are Mint 19.2.

I realized on the way to work this morning that I left out some important
information. I suspect a heisenbug for 3 reasons:

1) If I try to look at it with print calls, I get a traceback from code
that runs after the prints, but no print output.  This happens both when
writing to a disk-based file and when printing to stdout.

2) If I try to look at it with pudb (a debugger) via pudb.set_trace(),
pudb fails to start.

3) If I create a small test program that sends the same inputs to the
function in question, the function works fine.
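
One thing that might sidestep both the missing print output and the
debugger problem is to check the arrays just before the fit call and save
any suspicious one to disk, so the small test program can replay the exact
input rather than a reconstruction of it. This is only a sketch: the
function name, the file naming, and the idea of casting to float32 first
are my assumptions, not anything from scikit-learn.

```python
import os
import tempfile

import numpy as np


def dump_if_not_finite(X, label, out_dir=None):
    """Save X as a .npy file if a float32 cast yields NaN or inf.

    Returns the path of the saved file, or None if every value is finite.
    Hypothetical helper; assumes X is numeric.
    """
    arr = np.asarray(X, dtype=np.float64)
    # Newer NumPy versions warn on overflow during a cast; silence that here.
    with np.errstate(over="ignore"):
        as_float32 = arr.astype(np.float32)
    if np.isfinite(as_float32).all():
        return None
    out_dir = out_dir or tempfile.gettempdir()
    # Include the process id so parallel runs do not overwrite each other.
    name = "bad_input_{}_{}.npy".format(label, os.getpid())
    path = os.path.join(out_dir, name)
    np.save(path, arr)
    return path
```

It could be called as dump_if_not_finite(X_train, "X_train") right before
clf_random.fit(X_train, y_train), and the saved file loaded later with
np.load(). Because it writes a file from the parent process before any
workers start, it does not depend on stdout being visible.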

Thanks.

On Mon, Dec 16, 2019 at 11:20 PM Joel Nothman <joel.nothman at gmail.com>
wrote:

> Hi Dan, this kind of error can come from overflow. Are all of your test
> systems the same architecture?
>
> On Tue., 17 Dec. 2019, 12:03 pm Dan Stromberg, <dstromberg at grokstream.com>
> wrote:
>
>> Hi folks.
>>
>> I'm new to Scikit-learn.
>>
>> I have a very large Python project that seems to have a heisenbug which
>> is manifesting in scikit-learn code.
>>
>> Short of constructing an SSCCE, are there any magical techniques I should
>> try for pinning down the precise cause?  Like valgrind or something?
>>
>> An SSCCE will most likely be pretty painful: the project has copious
>> shared, mutable state, and I've already tried a largish test program that
>> calls into the same code path with the error manifesting 0 times in 100.
>>
>> It's quite possible the root cause will turn out to be some other part of
>> the software stack.
>>
>> The traceback from pytest looks like:
>> sequential/test_training.py:101:
>> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>> _ _ _ _ _ _ _ _ _ _ _ _
>> ../rt/classifier/coach.py:146: in train
>>     **self.classifier_section
>> ../domain/classifier/factories/classifier_academy.py:115: in
>> create_classifier
>>     **kwargs)
>> ../domain/classifier/factories/imp/xgb_factory.py:164: in create
>>     clf_random.fit(X_train, y_train)
>> ../../../../.local/lib/python3.6/site-packages/sklearn/model_selection/_search.py:722:
>> in fit
>>     self._run_search(evaluate_candidates)
>> ../../../../.local/lib/python3.6/site-packages/sklearn/model_selection/_search.py:1515:
>> in _run_search
>>     random_state=self.random_state))
>> ../../../../.local/lib/python3.6/site-packages/sklearn/model_selection/_search.py:711:
>> in evaluate_candidates
>>     cv.split(X, y, groups)))
>> ../../../../.local/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py:996:
>> in __call__
>>     self.retrieve()
>> ../../../../.local/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py:899:
>> in retrieve
>>     self._output.extend(job.get(timeout=self.timeout))
>> ../../../../.local/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py:517:
>> in wrap_future_result
>>     return future.result(timeout=timeout)
>> /usr/lib/python3.6/concurrent/futures/_base.py:425: in result
>>     return self.__get_result()
>> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>> _ _ _ _ _ _ _ _ _ _ _ _
>>
>> self = <Future at 0x7f15571ec7f0 state=finished raised ValueError>
>>
>>     def __get_result(self):
>>         if self._exception:
>> >           raise self._exception
>> E           ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
>>
>> /usr/lib/python3.6/concurrent/futures/_base.py:384: ValueError
>>
>>
>> The above exception is raised about 12 to 14 times in 100 in full-blown
>> automated testing.
>>
>> Thanks for the cool software.
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>