Thoughts on data science, time series, and R.

100 Time Series Data Mining Questions - Part 2

In the last post we started looking for a known pattern in a time series. For the next question, we will still be using the datasets available at so you can try this at home. The original code (MATLAB) and data are here. Now let’s start: Are there any repeated patterns in my data? Now we don’t know what we are looking for, but we want to discover something.

Read More…

100 Time Series Data Mining Questions (with answers!) - Part 1

I decided to start this series on Time Series Data Mining based on Eamonn’s presentation, which is why the title says “100”. That’s the idea, but for now, we only have 19 questions ready to go. I’ll use the datasets available at so you can try this at home. The original code (MATLAB) and data are here. So, let’s start with number one: Have we ever seen a pattern that looks just like this?

Read More…

tsmp is going big!

Since the beginning of the tsmp package, it was evident that a series of algorithms around the Matrix Profile would pop up sooner or later. After the creation of the Matrix Profile Foundation (MPF), the tsmp package has doubled its number of monthly downloads, and that is a good thing! The current version of tsmp, as shown in the previous post, added the new Pan-Matrix Profile and introduced the Matrix Profile API, which aims to standardize high-level tools across multiple programming languages.

Read More…

SIR Model for Portugal - North Region

This post is just a placeholder for another post I’ve been maintaining on RPubs since April 11.

There, at least for now, I’m comparing the SIR model (derived from Tim Churches’ work) that I ran on April 11 with the real data available for the North region of Portugal during the COVID-19 pandemic.

Please follow this link for the SIR Model at RPubs.

tsmp v0.4.8 release – Introducing the Matrix Profile API

A new tool for painlessly analyzing your time series. We’re surrounded by time-series data. From finance to IoT to marketing, many organizations produce thousands of these metrics and mine them to uncover business-critical insights. A Site Reliability Engineer might monitor hundreds of thousands of time-series streams from a server farm, in the hope of detecting anomalous events and preventing catastrophic failure. Alternatively, a brick-and-mortar retailer might care about identifying patterns of customer foot traffic and leveraging them to guide inventory decisions.

Read More…

The early stages of the `tsmp` package concept

Recently, I began to look further into Time Series (TS). During my Master’s degree, I used the forecast package quite a bit (thanks to Prof. Hyndman), and TS got my attention. So, after reading lots of publications on just about every aspect of TS, I came across one publication from Prof. Eamonn, of the University of California, that made me contact him to ask a few questions.

Read More…