[ANN] - Skyline v1.2.2-stable-luminosity

For those interested in anomaly detection and deflection in streamed time series data, I would like to announce a new release of Skyline, v1.2.2 - https://github.com/earthgecko/skyline/releases/tag/v1.2.2-stable-luminosity

What is Skyline?
----------------

Anomaly deflection: the obvious next evolution in the use of all the anomaly detection data?

Skyline is a Python based anomaly detection/deflection stack that analyses, detects anomalies in, deflects, fingerprints and learns from vast amounts of streamed time series data.

- Skyline ingests streamed metric time series data - skyline/horizon
- Skyline uses a ```CONSENSUS``` of 3-sigma algorithms to detect anomalies on batch processed, streamed metric time series data - skyline/analyzer - anomaly detector
- It handles large and small seasonality in the data - skyline/mirage - anomaly deflector and detector
- You can train it on what is NOT anomalous and it learns - skyline/ionosphere - anomaly deflector
- It records all your anomalies - skyline/panorama - anomaly memory
- It shows you all your data - skyline/webapp - anomaly view

We want our metrics to be not anomalous most of the time, we want to know when they ARE anomalous, and we try to build systems that behave within not anomalous bounds so that they perform well. As a result we have:

- A lot of metric time series data that are not anomalous most of the time.
- A lot of data to train a system on what is NOT anomalous in a given time series data set, so that it focuses not only on what is anomalous but also on what is not anomalous.

To achieve this Skyline implements a novel time series similarities comparison algorithm and a boundary layers methodology. The approach generates fingerprints of time series data using the sum of the values of the features of the time series, which are extracted using the tsfresh feature extraction package - https://github.com/blue-yonder/tsfresh - and evaluates against boundary layer algorithms, to determine whether a 3-sigma triggered anomaly is actually a normal, known pattern in the data.

The Skyline-Ionosphere-Tsfresh Time Series Similarities Comparison Algorithm - SITTSSCA, first coined here :) - compares the generated fingerprints of two time series and can determine whether they closely resemble each other in terms of:

- the amount of "power/energy", range and "movement" there is within the time series data set, somewhat like RMS

Erol Kalkan from the United States Geological Survey: “Another approach to compute the differences between two time series is moving window root-mean-square. RMS can be run for both series separately. This way, you can compare the similarities in energy (gain) level of time series. You may vary the window length for best resolution.” (https://www.researchgate.net/post/How_can_I_perform_time_series_data_similar...)

http://stackoverflow.com/questions/5613244/root-mean-square-in-numpy-and-com...

SITTSSCA compares how close the fingerprint values are as a percentage. Varying this percentage either focuses the algorithm with greater precision the closer the parameter gets to 0%, which is the perfect match (or possibly a mirror match too - unknown/untested), or incrementally increases the tolerance as the percentage increases, so that the matching becomes less and less reliable.

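As a rough illustration of the percentage comparison described above, here is a minimal sketch. It is not Skyline's actual code: the function names, the 1% default and the feature values are all hypothetical, and in practice the feature values would come from a tsfresh extraction rather than being written by hand.

```python
def fingerprint(feature_values):
    """Sum the feature values of one time series into a single number."""
    return float(sum(feature_values))


def percent_different(fp_a, fp_b):
    """Absolute difference between two fingerprints as a percentage of fp_a."""
    if fp_a == 0:
        # Assumption for this sketch: two zero fingerprints match exactly,
        # anything else is treated as infinitely far from a zero fingerprint.
        return 0.0 if fp_b == 0 else float("inf")
    return abs(fp_a - fp_b) / abs(fp_a) * 100.0


def is_similar(features_a, features_b, max_percent_diff=1.0):
    """True if the two fingerprints are within max_percent_diff of each other."""
    diff = percent_different(fingerprint(features_a), fingerprint(features_b))
    return diff <= max_percent_diff


# Hypothetical feature values for a trained (known NOT anomalous) series
# and for a series that has just triggered a 3-sigma anomaly.
trained = [10.0, 2.5, 0.5, 87.0]
candidate = [10.2, 2.4, 0.5, 87.4]

print(is_similar(trained, candidate))                        # tolerance of 1%
print(is_similar(trained, candidate, max_percent_diff=0.1))  # tolerance of 0.1%
```

With these made-up values the fingerprints are about 0.5% apart, so the candidate matches at the 1% tolerance and does not match at 0.1%.
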
However there is a sweet spot, and here SITTSSCA works extremely well :)

Added to SITTSSCA is an optional layer of simple boundary algorithms that are user defined during the operator's training interaction with Skyline, where the operator augments the SITTSSCA results with boundaries that describe the expected norm within the time series. This is very similar to being able to describe the Active Brownian Motion of a time series - https://github.com/blue-yonder/tsfresh/pull/143#issuecomment-272314801

This results in an anomaly detection/deflection system which enables the user to very simply label time series and train Skyline on the peaks and troughs and the expected Active Brownian Motion, or a best effort thereof.

It takes a little effort on your part to train Skyline, but with that effort Skyline is very good at anomaly detection and deflection. With your help.

There is no easy anomaly detection or deflection, but there is some reward with a bit of effort.

To learn more...
----------------

Project page -> https://github.com/earthgecko/skyline
Documentation -> https://earthgecko-skyline.readthedocs.io/en/latest/index.html

With the hope Skyline can make the universe a bit less anomalous.

Regards
Gary
participants (1)
-
Gary Wilson