Concepts#

In Skchange a change detector is composed of an interval scorer and a penalty. A change detector is the object you use for detecting changes, an interval scorer is the user-specified component that tells the detector what distributional feature of the data to look for changes in, while the penalty controls the number of detected events. This page introduces each of these concepts to give you a high level understanding of the library’s design.

Change detectors#

A change detector in Skchange is a Scikit-learn-style estimator (see Scikit-learn compatibility for details) that finds changepoints in a single univariate or multivariate time series. A changepoint is a point in the time series where the data distribution changes in some way. The workflow of change detectors is familiar to anyone who has used Scikit-learn: You first initialise the detector with algorithm-specific parameters, before calling fit on a time series to fit the detector on training data. Finally, you can use the fitted detector to detect changepoint on new data using predict_changepoints or predict, depending on the output format you want.

predict_changepoints(X)

Returns a numpy array of changepoint indices of shape (n_changepoints,). A changepoint marks the inclusive start of a new segment. This means that when cpts is the array of detected changepoints, the segments of the input time series are X[:cpts[0]], X[cpts[0]:cpts[1]], ..., X[cpts[-1]:].

predict(X)

Returns a numpy array of segment labels of shape (n_samples,) — one integer label per input sample. This mirrors the output convention of Scikit-learn clusterers and classifiers, so the array plugs directly into pipelines and metrics. Note that segment labels in general can reoccur for discontiguous segments, so the labels are not necessarily ordered.

In addition to fit, these two methods are mandatory on all change detectors in Skchange.

As a concrete example, the snippet below generates a univariate series with a single mean change at index 20, fits PELT to it, and calls both methods to make the output shapes concrete.

[1]:
from skchange.new_api.datasets import generate_piecewise_normal_data
from skchange.new_api.detectors import PELT
from skchange.new_api.interval_scorers import L2Cost

# A univariate series with a change in mean at index 20. Reused throughout the page.
X = generate_piecewise_normal_data(means=[0, 5], lengths=[20, 10], seed=1)

detector = PELT(cost=L2Cost(), penalty=10.0).fit(X)

print("predict_changepoints:", detector.predict_changepoints(X))
print("predict:             ", detector.predict(X))

predict_changepoints: [20]
predict:              [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1]

Segment anomaly detectors#

Some change detectors in Skchange can natively identify anomalous segments — segments where the data deviates from a “normal” or “baseline” behaviour. In this case, the changepoints are the boundaries of the anomalous segments. Such detectors expose the following additional method:

predict_segment_anomalies(X)

Returns a numpy array of anomalous intervals of shape (n_anomalies, 2) with [start, end) rows. The samples outside these intervals are considered normal, and are all labelled as 0 in the output of predict.

For example, running CAPA on the same X returns the post-change region as a single anomalous segment relative to the baseline mean learned robustly from the data:

[2]:
from skchange.new_api.detectors import CAPA
from skchange.new_api.interval_scorers import L2Saving

anomaly_detector = CAPA(segment_saving=L2Saving(), segment_penalty=10.0).fit(X)
anomaly_detector.predict_segment_anomalies(X)

[2]:
array([[20, 30]])

Advanced outputs#

Most Skchange detectors also expose the following methods for advanced users:

predict_scores(X, return_index=False)

Returns the detector’s internal interval scoring objective as a 1D numpy array. The length depends on the algorithm — one entry per evaluated interval, candidate split point, etc. — and is not generally equal to n_samples. With return_index=True it returns a (scores, index_dict) tuple, where index_dict carries algorithm-specific metadata that locates each score on the input timeline. Used by the tuning module for penalty calibration.

predict_all(X)

Convenience method on detectors that compute all outputs in a single pass, which could be more than the sum of the individual methods. Returns a dict whose keys are detector-specific. Useful for power users who want all the details in one go, without repeating work.

Scikit-learn compatibility#

All Skchange detectors inherit from scikit-learn’s BaseEstimator via Skchange’s BaseChangeDetector class, and the aim is to follow the Scikit-learn API conventions as closely as possible. This gives you sklearn-standard machinery for free, but a few sklearn tools are intentionally not supported because they assume properties that time series data do not have.

Data types: Like Scikit-learn estimators, Skchange detectors accept 2D array-like input of shape (n_samples, n_features) and return an np.ndarray (except the predict_all method intended for advanced use).

What works:

  • ``get_params`` / ``set_params`` for inspecting and updating hyperparameters.

  • ``sklearn.base.clone`` for making unfitted copies.

  • ``Pipeline`` with sklearn transformers that don’t rely on the ordering of samples.

  • Fitted-attribute convention: attributes set in fit end with _ (e.g. detector.penalty_), and check_is_fitted works as expected.

What does not work, and why:

  • ``GridSearchCV`` / ``cross_val_score`` and other cross-validation utilities. Scikit-learn’s cross-validation tools don’t respect the ordering of samples crucial to time series, and thus cannot be used with Skchange detectors. Use the built-in tuning utilities in skchange.new_api.tuning instead.

Interval scorers#

Skchange provides modular and flexible algorithms for change detection that also run fast. Interval scorers are the components that make this possible. They are user-specified building blocks of all change detectors in Skchange.

Apart from being constructed, interval scorers are not meant to be used directly. They are used internally by the detectors. To make full use of the library, however, it is important to understand how they work and how they can be used to build more complex detectors.

The choice of interval scorer represents the choice of distributional feature(s) to detect changes in, for example the mean, the variance, the covariance matrix, regression parameters, the full distribution, a Poisson rate, or something else. You have already seen two examples above: L2Cost for detecting changes in mean, and L2Saving for detecting segments with anomalous means relative to a baseline.

An interval scorer is responsible for evaluating a score over interval specifications (interval specs for short). An interval spec consists of the start and end indices of an interval, and possibly split points inside it. Different detectors require different types of scores. Below is a summary of the four types of interval scorers in Skchange, and examples of which detectors use them:

Score type

Interval spec

What it scores

Example detectors

Cost

start, end

The cost/loss of a model fit to a single interval [start, end).

PELT, CROPS

Change score

start, split, end

The degree of change between two adjacent intervals [start, split) and [split, end). A two-sample statistical test, also known as an “at-most-one-change” test statistic.

SeededBinarySegmentation, MovingWindow

Saving

start, end

The degree to which an interval [start, end) deviates from a fixed baseline model. A one-sample statistical test.

CAPA

Transient score

start, split1, split2, end

The degree to which an interval [split1, split2) is different compared to its local context [start, split1) and [split2, end). A two-sample statistical test, but with a different splitting strategy than change scores.

CircularBinarySegmentation

The four score types are covered in more detail in the subsections below.

Fast scoring. The computational bottleneck of most change detection algorithms is to evaluate the score over a large number of interval specs, often with overlapping regions. The design of interval scorers provides two mechanisms for speeding this up:

  1. Precomputation: Interval scorers can precompute quantities that are reused across many interval specs, such as cumulative sums or sufficient statistics.

  2. Vectorisation: Interval scorers can evaluate many interval specs in a single call, often using numba for acceleration.

The scorer API. All interval scorers share the same three-step usage:

  • fit(X): Fits the scorer to training data. This only validates the data and parameters in many cases, but not all. This is called within the detector’s fit method.

  • precompute(X): Takes (possibly) new data and returns a cache of reusable quantities. Called within the detector’s predict_* methods.

  • evaluate(cache, interval_specs): Scores a batch of interval specs in one call.

Like detectors, interval scorers inherit from Scikit-learn’s BaseEstimator. This enables seemless use of get_params, set_params, and clone for hyperparameter inspection, updating, and copying.

Cost#

A cost measures the cost/loss/error of a model fit to a data interval X[start:end]. For example, L2Cost returns the sum of squared deviations from the sample mean within each interval — small when the interval is well-modelled by a single mean, large when it straddles a change. In the example below, we reuse the series of length 30 with a changepoint at index 20 from the Change detectors section.

[3]:
from skchange.new_api.interval_scorers import L2Cost

cost = L2Cost().fit(X)
cache = cost.precompute(X)
cost.evaluate(cache, [[0, 5], [15, 25], [25, 30]])

[3]:
array([[ 3.10965652],
       [70.94171088],
       [ 5.78706847]])

The cost is much larger for the interval [15, 25) than for the other two, because that interval straddles the change point at index 20 and a single-mean model fits poorly there.

Change score#

A change score takes in a triplet (start, split, end) and measures the degree of change between the two adjacent intervals X[start:split] and X[split:end]. Change scores can be statistical tests, time-series distances, or any other measure of difference. A classical example is the CUSUM score for a change in mean.

[4]:
from skchange.new_api.interval_scorers import CUSUM

score = CUSUM().fit(X)
cache = score.precompute(X)
score.evaluate(cache, [[0, 3, 6], [15, 20, 25], [20, 25, 30]])

[4]:
array([[0.22776324],
       [7.62895011],
       [1.10324031]])

Again, the change score is largest for the interval that contains the change point at index 20.

Saving#

A saving takes a (start, end) spec and measures the difference in cost between a locally estimated and a fixed reference model parameter over X[start:end]. The reference parameter represents the “normal” data behaviour and is either user-specified or estimated robustly from the training data in the saving’s fit method. Savings are used primarily for segment anomaly detection.

[5]:
from skchange.new_api.interval_scorers import L2Saving

saving = L2Saving(baseline_mean=None)  # Estimate the baseline mean robustly in fit.
saving.fit(X)
print(f"Baseline mean: {saving.baseline_mean_[0]:.3f}")

cache = saving.precompute(X)
saving.evaluate(cache, [[0, 5], [15, 25], [20, 30]])

Baseline mean: 0.405
[5]:
array([[  0.95511629],
       [ 28.46673296],
       [197.89773247]])

The saving is largest for the last interval, whose mean differs most from the baseline.

Transient score#

A transient score takes a quadruplet (start, split1, split2, end) and measures how different the inner interval X[split1:split2] is compared to its local context to the left and right, X[start:split1] and X[split2:end]. These scores serve a similar purpose to savings and and are primarily used for segment anomaly detection, but they don’t require a fixed baseline model at the cost of often being heavier to compute.

As an example, L2TransientScore measures the reduction in squared error obtained by letting the inner interval have one mean and the surrounding intervals a separate mean, relative to the whole outer interval X[start:end] sharing a single mean.

[6]:
from skchange.new_api.interval_scorers import L2TransientScore

transient_score = L2TransientScore().fit(X)
cache = transient_score.precompute(X)
transient_score.evaluate(cache, [[0, 5, 10, 15], [15, 20, 25, 30], [0, 20, 30, 30]])

[6]:
array([[  0.6842639 ],
       [ 14.19496277],
       [159.30247397]])

The transient score is largest for the last interval, whose inner region [20, 30) sits entirely in the post-change segment while the surrounding context [0, 20) sits entirely in the pre-change segment. The first interval is homogeneous and scores near zero; the second sits across the change point but its surrounding context mixes pre- and post-change data, so the contrast is smaller.

Penalties#

Interval scorers score, but they don’t decide. In Skchange, the detector’s decision is shaped by a penalty — a non-negative number (or array, see below) that controls the trade-off between under- and over-segmentation. A low penalty makes the detector eager to accept changes, producing many segments, while a high penalty makes it conservative, producing few.

For change scores, savings, and transient scores combined with a scalar penalty, the penalty acts as a threshold: A candidate is accepted only when its score exceeds the penalty. For costs, the penalty is added to the segmentation objective, so accepting an extra change point has to reduce the total cost by at least the penalty. Either way, higher penalty means fewer accepted changes.

The penalty can be a hand-picked scalar, a value derived from a theoretical formula such as BIC, or a value calibrated from data to control the false-positive rate.

Penalty arrays#

Some detectors support non-negative and non-decreasing array penalties. The purpose of array penalties is to allow different levels of penalisation depending on how many features in a multivariate time series are affected by a change. Plenty of statistical research articles have shown that for optimal detection of changes in multivariate time series when the number of affected features is unknown, different levels of penalisation is necessary.