A General Optimization Framework for Dynamic Time Warping

Software

General

The documentation is written to be consistent with the paper, and will often refer to the variable and symbols used in the paper. So that the community can benefit from a common knowledge-base, software-related questions should be submitted via GitHub issues.

System Requirements

The Python package requires Python 3.6 or greater. There are pre-compiled binary distributions available for OSX, Linux, and Windows.

Installation

The Python package is distributed through PyPi and installed using the following command:

pip install gdtw

Source Code

Source Code on Github

You may download the source code by following the link above, or by cloning the repo with the following command:

git clone https://www.github.com/dderiso/gdtw

Custom Compilation

General notes on compilation are provided as comments in setup.py. Compilation requires the Numpy C++ bindings (automatically installed with Numpy) and g++ compiler with support for C++11.

git clone https://www.github.com/dderiso/gdtw
cd gdtw
python setup.py install

License

This software is provided under the Apache license.

Citing

Deriso, Dave, and Stephen Boyd (2022). A general optimization framework for dynamic time warping. Optimization and Engineering, 1-22.

@article{deriso2022general,
  title={A general optimization framework for dynamic time warping},
  author={Deriso, Dave and Boyd, Stephen},
  journal={Optimization and Engineering},
  pages={1--22},
  year={2022},
  publisher={Springer}
}

Questions

So that the community can benefit from a common knowledge-base, software-related questions should be submitted via GitHub issues. Other questions about code or documenation can be addressed to Dave Deriso (dderiso/at/alunmi-dot-stanford-dot-edu).

Quick Start

1. Generate Signals

In this example, we'll define signal \( x(t) = \sin(2 \pi * 5 t) \) to be a 5Hz sine wave and signal \( y(t) = x(q(t)) \) to be the same 5Hz sine wave but warped with a quadratic time warping function \( q(t) = t^2 \).

import numpy as np
t = np.linspace(0,1,1000)
q = lambda t_: t_**2
x = lambda t_: np.sin(2*np.pi * 5 * t_)
y = lambda t_: x(q(t_))

2. Plot Signals

The signals are plotted using the following code. If you don't have matplotlib installed, run pip install matplotlib into your shell to install the library.

import matplotlib.pyplot as plt
plt.plot(t, y(t), '-', color='C1', label='y(t)')
plt.plot(t, x(t), '-', color='C0', label='x(t)')
plt.legend()
plt.show()

3. Perform Time Warping

To warp signal \( x \) into signal \( y \), we'll use the function gdtw.warp(x,y,params) using the default parameters. We'll pass discretized signals \( x(t) \) and \( y(t) \), where \( t \) is a vector of \( N \) values \( 0 = t_1 < t_2 < \cdots < t_N = 1 \), into arguments x and y as Numpy arrays.

Recall that in our formulation, the warping function is denoted as \( \phi \), the discretized warped time is denoted as \( \tau \), the discretized warped signal is denoted as \( \hat{x} = x(\tau) \), and objective function evaluated on \( \tau \) (aka. DTW distance) is denoted as \( \hat{f}(\tau) \) (see \( \S 4 \) in the paper). Correspondingly, the function returns \( \phi \), \( x(\tau) \), and \( \hat{f}(\tau) \) as tuple consisting of a function, a Numpy array, and a Numpy double precision float.

import gdtw
phi, x_tau, f_tau, g = gdtw.warp(x(t), y(t))

4. Plot Warped Signals

The warped signals are plotted using the following code. You can plot the discretized result, \( x ( \tau ) \).

plt.plot(t,y(t),       '-', color='C1', label='y(t)')
plt.plot(t,x_tau, '--', color='C0', label='x(tau)')
plt.legend()
plt.show()

Alternatively, you can plot the composition \( x \circ \phi \).

plt.plot(t,y(t),       '-', color='C1', label='y(t)')
plt.plot(t,x(phi(t)), '--', color='C0', label='x(phi(t))')
plt.legend()
plt.show()

4. Custom Parameters

As illustrated in the interactive demonstration above, the choice of hyper-parameters greatly affects the solution. In addition to specifying hyper-parameter values, you can also specify custom loss and regularization functions. Together, you can design a custom objective function to better address more complicated problems. A table of allowable parameters is included in the API Reference.

gdtw.warp(x, y, params={
    "lambda_cum": 0.001, 
    "lambda_inst": 0.001,
    "Loss":   lambda x,y:      np.abs(x-y),
    "R_cum":  lambda phi,t:    np.abs(phi-t),
    "R_inst": lambda grad_phi: np.abs(grad_phi-1)
})

6. Summary

Congratulations! You now know how to time-warp two signals together, inspect the warping function, set hyper-parameter values, ' and specify custom loss and regularization functions. This is all you need to know for most applications.

API Reference

gdtw.warp

`gdtw.warp(x,y,params)`

Uses our formulation of dynamic time warping and method of iterative refinement to find a function \( \phi : t \to \tau \) that minimizes the difference between \( x(\phi(t)) \) and \( y(t) \). Recall that in our formulation, the warping function is denoted as \( \phi \), the discretized warped time is denoted as \( \tau \), the discretized warped signal is denoted as \( x(\tau) \), and the objective function evaluated on \( \tau \) (aka. DTW distance) is denoted as \( \hat{f}(\tau) \) (see \( \S 4 \) in the paper). Correspondingly, the function returns \( \phi \), \( \tau \), \( x(\tau) \), and \( \hat{f}(\tau) \) as tuple consisting of a function, two Numpy arrays, and a Numpy double precision float.

Arguments

Argument Type(s) Default Description

Argument	Type(s)	Default	Description
`x`	np.ndarray or function	Required	\(x\) is a sequence, or \(x: R \to R \) is a function, that is being warped. Any specified function must have the following signature: `(t:float) -> float` Here is an example of a valid function: `lambda t: np.sin(2np.pi t)`
`y`	np.ndarray or function	Required	\(y\) is a sequence, or \(y: R \to R \) is a function, that serves as the target. Any specified function must have the following signature: `(t:float) -> float` Here is an example of a valid function: `lambda t: np.sin(2np.pi t)`
`params`	dictionary	{}	Parameters are optional. When a parameter is left unspecified, the solver uses default value listed in the Parameter Table below.

x

np.ndarray or function

Required

\(x\) is a sequence, or \(x: R \to R \) is a function, that is being warped. Any specified function must have the following signature:

(t:float) -> float

Here is an example of a valid function:

lambda t: np.sin(2*np.pi * t)

y

np.ndarray or function

Required

\(y\) is a sequence, or \(y: R \to R \) is a function, that serves as the target. Any specified function must have the following signature:

(t:float) -> float

Here is an example of a valid function:

lambda t: np.sin(2*np.pi * t)

params dictionary {} Parameters are optional. When a parameter is left unspecified, the solver uses default value listed in the Parameter Table below.

Parameter Table

Key	Value Type(s)	Default Value	Description
`"lambda_inst"`	np.double	0.1	\(\Laminst \ge 0 \) is a positive hyperparameter for the regularization functional \(\calRinst\). This hyperparameter increases the penalty on the instantaneous amount of warping. Higher values produce a smoother \( \phi \), and lower values result in singularities.
`"lambda_cum"`	np.double	1.0	\(\Lamcum \ge 0\) is a positive hyperparameter for the regularization functional \(\calRcum\). This hyperparameter increases the penalty on the cumulative amount of warping. Higher values result in less over-fitting.
`"Loss"`	string or function	"L2"	\( \mathcal{L} \) is the loss that ensures \( x( \phi (t)) \) is close to a target \( y(t) \). Fast C++ methods are invoked by passing a string instead of a function, built-in options include: `"L1"` or `"L2"`. Any specified function must have the following signature: `(x:float, y:float) -> float` Here is an example of a valid function: `lambda x,y: np.abs(x-y)`
`"R_cum"`	string or function	"L2"	\( \calRcum \) is the regularization functional for the cumulative warp and reduces over-fitting. Fast C++ methods are invoked by passing a string instead of a function, built-in options include: `"L1"` or `"L2"`. Any specified function must have the following signature: `(phi:float, t:float) -> float` Here is an example of a valid function: `lambda phi,t: np.abs(phi-t)`
`"R_inst"`	string or function	"L2"	\( \calRinst \) is the regularization functional for the instantaneous warp and improves the smoothness of the warping. Any specified function must have the following signature: `(grad_phi:float) -> float` Here is an example of a valid function: `lambda grad_phi: np.abs(grad_phi-1)` The constraint that the slope of \( \phi \) must be between \( s_\mathrm{min} \) and \( s_\mathrm{max} \) is handled in the solver outside of this function.
`"N"`	np.double	300	\( N \) is the number of time points in the downsampled input. The computation time grown linearly as \( N \) increases.
`"M"`	np.double	165	\( M \) is the search window size, and must satisfy \( M \le M_\mathrm{max}\). When \( M \) is not explicitly set, \( M \) is computed as \( \mathrm{min}( 0.55*N, M_\mathrm{max}) \). With default settings \( N = 300 \) and \( M_\mathrm{max} = 300 \), \( M = 165 \) The computation time grows quadratically as parameter \( M \) increases.
`"M_max"`	np.double	300	\( M_\mathrm{max} \) is the maximum search window size.
`"eta"`	np.double	0.15	\( \eta \) is the step size of the iterative refinement. Smaller values result in fewer iterations, but may result in lower quality results.
`"s_min"`	np.double	1E-8	\( s_\mathrm{min} \) specifies the constraint for the minimum slope of \( \phi \). When this is non-zero, the warping function will be monotone increasing. We can allow the slope of \( \phi \) be negative by choosing \( s_\mathrm{min} < 0 \).
`"s_max"`	np.double	1E8	\( s_\mathrm{max} \) specifies the constraint for the maximum slope of \( \phi \). Large values allow \( \phi \) to have sharp discontinuities.
`"s_beta"`	np.double	0	\( s_\beta \) specifies a tolerance on the boundary constraints. The default is \(s_\beta = 0 \), which enforces boundary conditions \( \phi(0)=0 \) and \( \phi(1)=1 \).

Returns

`phi`	function	\( \phi [0,1]: R \to R \) is the time warping function that maps \( t \to \tau \). This function has the following signature: `(t:float) -> np.double`
`x_tau`	np.ndarray	\( x( \tau ) \in R^N \) is an array of length \( N \) that contains the discretized version of the time warped signal \( x \).
`f_tau`	np.double	\( \hat{f}(\tau) \), a.k.a. the "DTW distance", is the discretized version of the objective function evaluated on \( \tau \).
`g`	GDTW	This is a reference to the `GDTW` solver object. It's generally ignored.

Raises

`ValueError`	This error is raised when `x` or `y` are not of type `np.ndarray` or do not have the required function signature `(t:float) -> float`.
`ValueError`	This error is raised when `params` is not of the `dictionary` type.

Example 1

This example is explained in detail in the quick start tutorial.

import numpy as np
t = np.linspace(0,1,1000)
q = lambda t_: t_**2
x = lambda t_: np.sin(2*np.pi * 5 * t_)
y = lambda t_: x(q(t_))

import gdtw
phi, x_tau, f_tau, g = gdtw.warp( x(t), y(t) )

import matplotlib.pyplot as plt
plt.plot(t,y(t),       '-', color='C1', label='y(t)')
plt.plot(t,x_tau, '--', color='C0', label='x(tau)')
plt.legend()
plt.show()

Example 2

This method supports custom hyper-parameter values as well as custom loss and regularizer functionals. The available options are listed in the Parameter Table above.

gdtw.warp(x, y, params={
    "lambda_cum": 0.001, 
    "lambda_inst": 0.001,
    "Loss":   lambda x,y:      np.abs(x-y),
    "R_cum":  lambda phi,t:    np.abs(phi-t),
    "R_inst": lambda grad_phi: np.abs(grad_phi-1)
})

Performance

The node costs are computed and stored in an \( M\times N \) array, and the edge costs are computed on the fly and stored in an \( M\times M\times N \) array. The computational cost of our dynamic programming procedure is order \( \mathcal{O}( N ) \) flops (not counting the evaluation of the loss and regularization terms). With current hardware, it is entirely practical for \( N=1000 \) or (much) larger. For example, \( N=1000 \) takes \( 0.83 \) ms to solve on a \( 4 \)-core Macbook and only \( 4 \) iterations are needed before the iterative refinement procedure terminates.

A General Optimization Framework for Dynamic Time Warping

Dave Deriso and Stephen Boyd

The Dynamic Time Warping Problem

Interactive Demo

Software

General

System Requirements

Installation

Source Code

Custom Compilation

License

Citing

Questions

Quick Start

1. Generate Signals

2. Plot Signals

3. Perform Time Warping

4. Plot Warped Signals

4. Custom Parameters

6. Summary

API Reference

gdtw.warp

`gdtw.warp(x,y,params)`

Arguments

Parameter Table

Returns

Raises

Example 1

Example 2

Performance