The goal of dynamic time warping (DTW) is to find a function that transforms, or "warps," time in order to approximately align two signals. DTW has various applications, some of which are demonstrated in the Interactive Demo below. Existing methods are often slow, with quadratic time complexity, and suffer from inaccuracies falling into two categories: susceptibility to local minima leading to poor alignments, and/or sharp irregularities in the warping function, known as "singularities." Many methods have attempted to address these issues in an ad-hoc manner, usually through hand-tuned pre- or post-processing steps. Our method is the first general solution that rapidly —in linear time— produces a smooth warping function without the need for any pre- or post-processing steps. Additionally, we offer a fast open-source C++ and Python package under a generous Apache license, comprehensive documentation, and several demos.
Read the paperChoose a dataset and adjust the sliders below to interactively explore how the smoothing and cumulative warp regularizers affect the results.
Please wait while this example loads.
The documentation is written to be consistent with the paper, and will often refer to the variable and symbols used in the paper. So that the community can benefit from a common knowledge-base, software-related questions should be submitted via GitHub issues.
The Python package requires Python 3.6 or greater. There are pre-compiled binary distributions available for OSX, Linux, and Windows.
The Python package is distributed through PyPi and installed using the following command:
pip install gdtw
You may download the source code by following the link above, or by cloning the repo with the following command:
git clone https://www.github.com/dderiso/gdtw
General notes on compilation are provided as comments in setup.py
.
Compilation requires the Numpy C++ bindings (automatically installed with Numpy) and g++ compiler with support for C++11.
git clone https://www.github.com/dderiso/gdtw
cd gdtw
python setup.py install
This software is provided under the Apache license.
Deriso, Dave, and Stephen Boyd (2022). A general optimization framework for dynamic time warping. Optimization and Engineering, 1-22.
@article{deriso2022general, title={A general optimization framework for dynamic time warping}, author={Deriso, Dave and Boyd, Stephen}, journal={Optimization and Engineering}, pages={1--22}, year={2022}, publisher={Springer} }
So that the community can benefit from a common knowledge-base, software-related questions should be submitted via GitHub issues.
Other questions about code or documenation can be addressed to Dave Deriso (dderiso/at/alunmi-dot-stanford-dot-edu
).
In this example, we'll define signal \( x(t) = \sin(2 \pi * 5 t) \) to be a 5Hz sine wave and signal \( y(t) = x(q(t)) \) to be the same 5Hz sine wave but warped with a quadratic time warping function \( q(t) = t^2 \).
import numpy as np
t = np.linspace(0,1,1000)
q = lambda t_: t_**2
x = lambda t_: np.sin(2*np.pi * 5 * t_)
y = lambda t_: x(q(t_))
The signals are plotted using the following code.
If you don't have matplotlib
installed, run pip install matplotlib
into your shell to install the library.
import matplotlib.pyplot as plt
plt.plot(t, y(t), '-', color='C1', label='y(t)')
plt.plot(t, x(t), '-', color='C0', label='x(t)')
plt.legend()
plt.show()
To warp signal \( x \) into signal \( y \), we'll use the function
gdtw.warp(x,y,params)
using the default parameters.
We'll pass discretized signals \( x(t) \) and \( y(t) \),
where \( t \) is a vector of \( N \) values \( 0 = t_1 < t_2 < \cdots < t_N = 1 \),
into arguments x
and y
as Numpy arrays.
Recall that in our formulation, the warping function is denoted as \( \phi \), the discretized warped time is denoted as \( \tau \), the discretized warped signal is denoted as \( \hat{x} = x(\tau) \), and objective function evaluated on \( \tau \) (aka. DTW distance) is denoted as \( \hat{f}(\tau) \) (see \( \S 4 \) in the paper). Correspondingly, the function returns \( \phi \), \( x(\tau) \), and \( \hat{f}(\tau) \) as tuple consisting of a function, a Numpy array, and a Numpy double precision float.
import gdtw
phi, x_tau, f_tau, g = gdtw.warp(x(t), y(t))
The warped signals are plotted using the following code. You can plot the discretized result, \( x ( \tau ) \).
plt.plot(t,y(t), '-', color='C1', label='y(t)')
plt.plot(t,x_tau, '--', color='C0', label='x(tau)')
plt.legend()
plt.show()
Alternatively, you can plot the composition \( x \circ \phi \).
plt.plot(t,y(t), '-', color='C1', label='y(t)')
plt.plot(t,x(phi(t)), '--', color='C0', label='x(phi(t))')
plt.legend()
plt.show()
As illustrated in the interactive demonstration above, the choice of hyper-parameters greatly affects the solution. In addition to specifying hyper-parameter values, you can also specify custom loss and regularization functions. Together, you can design a custom objective function to better address more complicated problems. A table of allowable parameters is included in the API Reference.
gdtw.warp(x, y, params={
"lambda_cum": 0.001,
"lambda_inst": 0.001,
"Loss": lambda x,y: np.abs(x-y),
"R_cum": lambda phi,t: np.abs(phi-t),
"R_inst": lambda grad_phi: np.abs(grad_phi-1)
})
Congratulations! You now know how to time-warp two signals together, inspect the warping function, set hyper-parameter values, ' and specify custom loss and regularization functions. This is all you need to know for most applications.
gdtw.warp(x,y,params)
Uses our formulation of dynamic time warping and method of iterative refinement to find a function \( \phi : t \to \tau \) that minimizes the difference between \( x(\phi(t)) \) and \( y(t) \). Recall that in our formulation, the warping function is denoted as \( \phi \), the discretized warped time is denoted as \( \tau \), the discretized warped signal is denoted as \( x(\tau) \), and the objective function evaluated on \( \tau \) (aka. DTW distance) is denoted as \( \hat{f}(\tau) \) (see \( \S 4 \) in the paper). Correspondingly, the function returns \( \phi \), \( \tau \), \( x(\tau) \), and \( \hat{f}(\tau) \) as tuple consisting of a function, two Numpy arrays, and a Numpy double precision float.
Argument | Type(s) | Default | Description |
---|---|---|---|
x
| np.ndarray or function | Required |
\(x\) is a sequence, or \(x: R \to R \) is a function, that is being warped.
Any specified function must have the following signature:
Here is an example of a valid function:
|
y
| np.ndarray or function | Required | \(y\) is a sequence, or \(y: R \to R \) is a function, that serves as the target.
Any specified function must have the following signature:
Here is an example of a valid function:
|
params
| dictionary | {} | Parameters are optional. When a parameter is left unspecified, the solver uses default value listed in the Parameter Table below. |
Key | Value Type(s) | Default Value | Description |
---|---|---|---|
"lambda_inst"
| np.double | 0.1 | \(\Laminst \ge 0 \) is a positive hyperparameter for the regularization functional \(\calRinst\). This hyperparameter increases the penalty on the instantaneous amount of warping. Higher values produce a smoother \( \phi \), and lower values result in singularities. |
"lambda_cum"
| np.double | 1.0 | \(\Lamcum \ge 0\) is a positive hyperparameter for the regularization functional \(\calRcum\). This hyperparameter increases the penalty on the cumulative amount of warping. Higher values result in less over-fitting. |
"Loss"
| string or function | "L2" |
\( \mathcal{L} \) is the loss that ensures \( x( \phi (t)) \) is close to a target \( y(t) \).
Fast C++ methods are invoked by passing a string instead of a function, built-in options include: "L1" or "L2" .
Any specified function must have the following signature:
Here is an example of a valid function:
|
"R_cum"
| string or function | "L2" |
\( \calRcum \) is the regularization functional for the cumulative warp and reduces over-fitting.
Fast C++ methods are invoked by passing a string instead of a function, built-in options include: "L1" or "L2" .
Any specified function must have the following signature:
Here is an example of a valid function:
|
"R_inst"
| string or function | "L2" |
\( \calRinst \) is the regularization functional for the instantaneous warp and improves the smoothness of the warping.
Any specified function must have the following signature:
Here is an example of a valid function:
The constraint that the slope of \( \phi \) must be between \( s_\mathrm{min} \) and \( s_\mathrm{max} \) is handled in the solver outside of this function.
|
"N"
| np.double | 300 | \( N \) is the number of time points in the downsampled input. The computation time grown linearly as \( N \) increases. |
"M"
| np.double | 165 | \( M \) is the search window size, and must satisfy \( M \le M_\mathrm{max}\). When \( M \) is not explicitly set, \( M \) is computed as \( \mathrm{min}( 0.55*N, M_\mathrm{max}) \). With default settings \( N = 300 \) and \( M_\mathrm{max} = 300 \), \( M = 165 \) The computation time grows quadratically as parameter \( M \) increases. |
"M_max"
| np.double | 300 | \( M_\mathrm{max} \) is the maximum search window size. |
"eta"
| np.double | 0.15 | \( \eta \) is the step size of the iterative refinement. Smaller values result in fewer iterations, but may result in lower quality results. |
"s_min"
| np.double | 1E-8 | \( s_\mathrm{min} \) specifies the constraint for the minimum slope of \( \phi \). When this is non-zero, the warping function will be monotone increasing. We can allow the slope of \( \phi \) be negative by choosing \( s_\mathrm{min} < 0 \). |
"s_max"
| np.double | 1E8 | \( s_\mathrm{max} \) specifies the constraint for the maximum slope of \( \phi \). Large values allow \( \phi \) to have sharp discontinuities. |
"s_beta"
| np.double | 0 | \( s_\beta \) specifies a tolerance on the boundary constraints. The default is \(s_\beta = 0 \), which enforces boundary conditions \( \phi(0)=0 \) and \( \phi(1)=1 \). |
phi
| function |
\( \phi [0,1]: R \to R \) is the time warping function that maps \( t \to \tau \).
This function has the following signature:
|
---|---|---|
x_tau
| np.ndarray | \( x( \tau ) \in R^N \) is an array of length \( N \) that contains the discretized version of the time warped signal \( x \). |
f_tau
| np.double | \( \hat{f}(\tau) \), a.k.a. the "DTW distance", is the discretized version of the objective function evaluated on \( \tau \). |
g
| GDTW | This is a reference to the GDTW solver object. It's generally ignored. |
ValueError
| This error is raised when x or y are not of type np.ndarray or do not have the required function signature (t:float) -> float . |
---|---|
ValueError
| This error is raised when params is not of the dictionary type. |
This example is explained in detail in the quick start tutorial.
import numpy as np
t = np.linspace(0,1,1000)
q = lambda t_: t_**2
x = lambda t_: np.sin(2*np.pi * 5 * t_)
y = lambda t_: x(q(t_))
import gdtw
phi, x_tau, f_tau, g = gdtw.warp( x(t), y(t) )
import matplotlib.pyplot as plt
plt.plot(t,y(t), '-', color='C1', label='y(t)')
plt.plot(t,x_tau, '--', color='C0', label='x(tau)')
plt.legend()
plt.show()
This method supports custom hyper-parameter values as well as custom loss and regularizer functionals. The available options are listed in the Parameter Table above.
gdtw.warp(x, y, params={
"lambda_cum": 0.001,
"lambda_inst": 0.001,
"Loss": lambda x,y: np.abs(x-y),
"R_cum": lambda phi,t: np.abs(phi-t),
"R_inst": lambda grad_phi: np.abs(grad_phi-1)
})
The node costs are computed and stored in an \( M\times N \) array, and the edge costs are computed on the fly and stored in an \( M\times M\times N \) array. The computational cost of our dynamic programming procedure is order \( \mathcal{O}( N ) \) flops (not counting the evaluation of the loss and regularization terms). With current hardware, it is entirely practical for \( N=1000 \) or (much) larger. For example, \( N=1000 \) takes \( 0.83 \) ms to solve on a \( 4 \)-core Macbook and only \( 4 \) iterations are needed before the iterative refinement procedure terminates.