{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Fast Approximate Matrix Profiles with SCRUMP\n", "\n", "[](https://mybinder.org/v2/gh/TDAmeritrade/stumpy/main?filepath=notebooks/Tutorial_Fast_Approximate_Matrix_Profiles.ipynb)\n", "\n", "In \n", "[this paper](https://www.cs.ucr.edu/~eamonn/SCRIMP_ICDM_camera_ready_updated.pdf), a new approach called \"SCRIMP++\", which computes a matrix profile in an incremental fashion, is presented. When only an approximate matrix profile is needed, this algorithm uses certain properties of the matrix profile calculation to greatly reduce the total computational time. In this tutorial, we'll demonstrate how this approach may be sufficient for your applications.\n", "\n", "`stumpy` implements this approach for both self-joins and AB-joins in the `stumpy.scrump` function, and it allows the matrix profile to be easily refined when a higher-resolution output is desired." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Getting started\n", "\n", "First, let us import some packages we will use for data loading, analysis, and plotting." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import pandas as pd\n", "import stumpy\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from matplotlib.patches import Rectangle\n", "\n", "plt.style.use('https://raw.githubusercontent.com/TDAmeritrade/stumpy/main/docs/stumpy.mplstyle')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load the Steamgen Dataset\n", "\n", "This data was generated using fuzzy models applied to mimic a steam generator at the Abbott Power Plant in Champaign, IL. 
The feature we are interested in is the output steam flow telemetry, measured in kg/s. The data is \"sampled\" every three seconds, for a total of 9,600 data points.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | drum pressure | excess oxygen | water level | steam flow |
|---|---|---|---|---|
| 0 | 320.08239 | 2.506774 | 0.032701 | 9.302970 |
| 1 | 321.71099 | 2.545908 | 0.284799 | 9.662621 |
| 2 | 320.91331 | 2.360562 | 0.203652 | 10.990955 |
| 3 | 325.00252 | 0.027054 | 0.326187 | 12.430107 |
| 4 | 326.65276 | 0.285649 | 0.753776 | 13.681666 |