[scikit-learn] Can cluster help me to cluster data with length of continuous series?

Christian Braune christian.braune79 at gmail.com
Wed Apr 3 06:18:13 EDT 2019


Hi,

That does not really sound like a clustering problem to me, but more like a
preprocessing problem. For each item you want to calculate the length of the
longest run of consecutive "1"s. That could be done by a simple function and
would create a new (one-dimensional) feature for each of your items.
You could then apply any clustering algorithm to this feature (i.e. you'd
be clustering a one-dimensional dataset).
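
A minimal sketch of such a function, assuming each item is a plain list of
0/1 flags as in the quoted example below (the names longest_run, rows and X
are made up for illustration):

```python
from itertools import groupby


def longest_run(bits, value=1):
    """Return the length of the longest run of consecutive `value` entries."""
    # groupby splits the sequence into runs of equal neighbouring elements;
    # we keep only the runs made of `value` and take the longest one.
    return max(
        (sum(1 for _ in run) for key, run in groupby(bits) if key == value),
        default=0,  # no `value` at all (or an empty sequence)
    )


# Rows from the question: item ID mapped to its access flags.
rows = {
    0: [1, 0, 0, 1],
    1: [1, 0, 0, 1],
    2: [0, 0, 1, 1],
    3: [0, 1, 1, 1],
}

# One number per item: the new one-dimensional feature.
features = {item_id: longest_run(flags) for item_id, flags in rows.items()}
print(features)  # {0: 1, 1: 1, 2: 2, 3: 3}

# Most scikit-learn estimators expect a 2-D input of shape (n_samples, 1).
X = [[length] for length in features.values()]
```

X could then be handed to any clusterer, for example
sklearn.cluster.KMeans(n_clusters=k).fit_predict(X), where the number of
clusters k is something you would have to choose for your data.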

Regards,
  Christian

lampahome <pahome.chen at mirlab.org> wrote on Wed, 3 Apr 2019 at
11:08:

> I have data which contains the access durations of each item.
>
> Ex: t0~t4 are the access time durations. 1 means the item was accessed in
> that time duration, 0 means it was not.
> ID,t0,t1,t2,t3,t4
> 0,1,0,0,1
> 1,1,0,0,1
> 2,0,0,1,1
> 3,0,1,1,1
>
> What I want to cluster on is the length of the continuous duration.
> Ex:
> ID=3 > 2 > 1 = 0
>
> Is there any distance metric that can help with clustering based on the
> length of the continuous duration?