{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Semantic Segmentation \n", "\n", "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/TDAmeritrade/stumpy/main?filepath=notebooks/Tutorial_Semantic_Segmentation.ipynb)\n", "\n", "## Analyzing Arterial Blood Pressure Data with FLUSS and FLOSS\n", "\n", "This example applies the main takeaways from the [Matrix Profile VIII](https://www.cs.ucr.edu/~eamonn/Segmentation_ICDM.pdf) research paper. For proper context, we highly recommend that you read the paper first; our implementations follow it closely.\n", "\n", "According to the aforementioned publication, \"one of the most basic analyses one can perform on [increasing amounts of time series data being captured] is to segment it into homogeneous regions.\" In other words, wouldn't it be nice if you could take a long time series and segment it into `k` regions (where `k` is small), with the ultimate goal of presenting only `k` short representative patterns to a human (or machine) annotator in order to produce labels for the entire dataset? These segmented regions are also known as \"regimes\". Additionally, as an exploratory tool, segmentation may uncover previously undiscovered, actionable insights in the data. Fast low-cost unipotent semantic segmentation (FLUSS) is an algorithm that produces an \"arc curve\", which annotates the raw time series with information about the likelihood of a regime change. 
Fast low-cost online semantic segmentation (FLOSS) is a variation of FLUSS that, according to the original paper, is domain agnostic, offers streaming capabilities with potential for actionable real-time intervention, and is suitable for real-world data (i.e., does not assume that every region of the data belongs to a well-defined semantic segment).\n", "\n", "To demonstrate the API and underlying principles, we will look at arterial blood pressure (ABP) data from a healthy volunteer resting on a medical tilt table and see whether we can detect when the table is tilted from a horizontal position to a vertical position. This is the same data presented throughout the original paper (above)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Getting Started\n", "\n", "Let's import the packages that we'll need to load, analyze, and plot the data." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import pandas as pd\n", "import numpy as np\n", "import stumpy\n", "from stumpy.floss import _cac\n", "import matplotlib.pyplot as plt\n", "from matplotlib.patches import Rectangle, FancyArrowPatch\n", "from matplotlib import animation\n", "from IPython.display import HTML\n", "import os\n", "\n", "plt.style.use('https://raw.githubusercontent.com/TDAmeritrade/stumpy/main/docs/stumpy.mplstyle')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Retrieve the Data" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n",
"| | time | abp |\n",
"|---|---|---|\n",
"| 0 | 0 | 6832.0 |\n",
"| 1 | 1 | 6928.0 |\n",
"| 2 | 2 | 6968.0 |\n",
"| 3 | 3 | 6992.0 |\n",
"| 4 | 4 | 6980.0 |\n",
"