On July 19, 2024, a faulty content update from CrowdStrike caused 8.5 million Windows machines to crash simultaneously — the largest IT outage in history. Airlines grounded flights. Hospitals postponed surgeries. Banks froze transactions. The total economic damage exceeded $10 billion. The root cause was a single bad configuration file pushed to production. An anomaly detection system monitoring the deployment’s telemetry — CPU spikes, crash rates, memory patterns — could have flagged the cascading failure within seconds and triggered an automatic rollback before 0.1% of those machines were affected.
This is not a hypothetical benefit. Companies like Netflix, Uber, and Meta operate real-time anomaly detection systems that catch exactly these patterns — sudden deviations in request latency, error rates, transaction volumes, or system metrics that indicate something has gone wrong before users notice. The difference between catching an anomaly in 30 seconds versus 30 minutes can mean the difference between a minor incident and front-page news.
Time-series anomaly detection — the task of identifying unusual patterns in sequential, timestamped data — has experienced a remarkable transformation over the past three years. Classical statistical methods that served practitioners for decades are being augmented and in some cases replaced by deep learning architectures, transformer-based models, and most recently, pre-trained foundation models that can detect anomalies in time series they’ve never seen before, without any task-specific training. The pace of innovation in this space has been extraordinary, and the gap between what’s possible in a research paper and what works in production is narrowing rapidly.
This guide covers the full landscape: from classical approaches that remain surprisingly competitive, through the deep learning revolution of 2020-2024, to the foundation model frontier of 2025-2026. Whether you’re building anomaly detection for infrastructure monitoring, financial fraud detection, predictive maintenance, or healthcare, understanding these models — their strengths, limitations, and practical trade-offs — is essential.
Why Anomaly Detection in Time Series Is Harder Than You Think
Detecting anomalies in tabular data is relatively straightforward: a transaction amount of $50,000 when the customer’s average is $200 is clearly unusual. Time-series anomaly detection is fundamentally harder because the definition of “unusual” depends on temporal context — patterns that are normal at one time are anomalous at another.
Consider server CPU usage. A spike to 95% utilization at 3 AM might be perfectly normal — that’s when the batch processing job runs. The same spike at 3 PM, when only light API traffic is expected, might indicate a runaway process or a denial-of-service attack. A gradual drift from 40% baseline to 60% over six weeks might indicate a memory leak that will eventually cause a crash. Each of these requires the detection system to understand not just the current value but its relationship to seasonal patterns, trends, and the broader temporal context.
The challenges break down into several categories:
Rarity of labeled anomalies. In most real-world datasets, anomalies represent less than 1% of observations — often less than 0.01%. Supervised learning approaches struggle because the classes are so imbalanced. Most state-of-the-art methods therefore operate in unsupervised or semi-supervised settings, learning what “normal” looks like and flagging deviations.
Concept drift. What constitutes “normal” changes over time. A system that learned normal patterns from January data may flag perfectly healthy February patterns as anomalous if the business grew, the user base shifted, or infrastructure was upgraded. Models must adapt to evolving baselines without losing sensitivity to genuine anomalies.
Multivariate dependencies. Modern systems generate hundreds or thousands of metrics simultaneously. An anomaly may not be visible in any single metric — CPU looks fine, memory looks fine, disk I/O looks fine — but the specific combination of all three at slightly elevated levels, simultaneously, indicates an emerging problem. Capturing these inter-metric correlations is where deep learning approaches excel over classical univariate methods.
A Taxonomy of Time-Series Anomalies
Before selecting a model, you need to know what kind of anomaly you’re looking for. Different model architectures excel at detecting different anomaly types:
| Anomaly Type | Description | Example | Best Detection Approach |
|---|---|---|---|
| Point anomaly | A single observation far from expected | Sudden CPU spike to 100% | Statistical thresholds, Isolation Forest |
| Contextual anomaly | Normal value in wrong context | High traffic at 4 AM (normally low) | Seasonal decomposition, LSTM, Transformer |
| Collective anomaly | A sequence of observations anomalous together | Sustained elevated error rate for 10 minutes | Sliding-window models, sequence-to-sequence |
| Trend anomaly | Gradual shift from expected trajectory | Memory usage growing 2% weekly (leak) | Change-point detection, trend decomposition |
| Shapelet anomaly | Unusual pattern shape in a subsequence | Abnormal ECG waveform morphology | Matrix Profile, deep autoencoders |
Classical Approaches: Where It All Started
Before deep learning, time-series anomaly detection relied on statistical methods that remain relevant and surprisingly competitive for many use cases. Understanding these foundations is essential — they serve as baselines, they’re interpretable, and they run efficiently without GPU infrastructure.
Statistical and Decomposition Methods
STL Decomposition + Residual Thresholding: Seasonal-Trend decomposition using LOESS (STL) separates a time series into trend, seasonal, and residual components. Anomalies are identified by flagging residuals that exceed a threshold (typically 3 standard deviations). This method is simple, interpretable, and handles seasonality well — making it excellent for business metrics like daily active users or hourly revenue.
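As a minimal sketch of the residual-thresholding idea — substituting a centered moving average for the LOESS trend and per-phase means for the seasonal fit (statsmodels' `STL` is the usual production implementation):

```python
import numpy as np

def seasonal_residual_anomalies(series, period=24, z_thresh=3.0):
    """Rough stand-in for STL + residual thresholding: moving-average
    trend, per-phase-mean seasonal component, z-score on the residual.
    Edge points with no full trend window get NaN and are never flagged."""
    x = np.asarray(series, dtype=float)
    half = period // 2
    trend = np.full(len(x), np.nan)
    trend[half:half + len(x) - period + 1] = np.convolve(
        x, np.ones(period) / period, mode="valid")
    detrended = x - trend
    seasonal = np.array([np.nanmean(detrended[p::period]) for p in range(period)])
    residual = detrended - np.tile(seasonal, len(x) // period + 1)[:len(x)]
    z = (residual - np.nanmean(residual)) / np.nanstd(residual)
    return np.abs(z) > z_thresh  # NaN z-scores compare as False

# Two weeks of hourly data with daily seasonality and one injected spike
rng = np.random.default_rng(0)
t = np.arange(24 * 14)
series = 50 + 10 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 1, len(t))
series[200] += 25  # point anomaly
flags = seasonal_residual_anomalies(series, period=24)
print(flags[200], flags.sum())
```

The seasonal component absorbs the daily sine wave, so only the injected spike leaves a large residual.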
ARIMA-based Detection: AutoRegressive Integrated Moving Average models forecast the next value based on historical patterns. Observations that deviate significantly from the forecast are flagged. ARIMA works well for stationary series with clear autoregressive structure but struggles with complex multi-seasonal patterns or non-linear dynamics.
Exponential Smoothing State Space Models (ETS): Similar in spirit to ARIMA but using exponential weighting of past observations. The Holt-Winters variant handles both trend and seasonality and remains a workhorse in production monitoring systems.
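The non-seasonal core of the ETS idea fits in a few lines: treat the exponentially smoothed level as a one-step-ahead forecast and flag large residuals. This is a deliberately minimal sketch — Holt-Winters adds trend and seasonal terms on top:

```python
import numpy as np

def ses_anomaly_flags(series, alpha=0.3, z_thresh=3.0):
    """Simple exponential smoothing as a one-step-ahead forecaster:
    flag points whose forecast residual exceeds z_thresh standard
    deviations of the residuals."""
    x = np.asarray(series, dtype=float)
    level = x[0]
    errors = np.zeros(len(x))
    for i in range(1, len(x)):
        errors[i] = x[i] - level               # one-step-ahead residual
        level = alpha * x[i] + (1 - alpha) * level
    sigma = errors[1:].std()
    return np.abs(errors) > z_thresh * sigma

rng = np.random.default_rng(7)
series = 100 + rng.normal(0, 2, 500)
series[300] = 130  # sudden level spike
flags = ses_anomaly_flags(series)
print(bool(flags[300]))
```

Note the smoothing lag: the point or two immediately after a spike may also flag, because the level takes a few steps to re-adapt.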
Isolation Forest and Tree-Based Methods
Isolation Forest (Liu et al., 2008) takes a brilliantly different approach: instead of building a model of normal behavior and looking for deviations, it directly identifies anomalies by measuring how easy they are to isolate. Anomalous points, being different from the majority, require fewer random partitions to separate from the rest of the data. Isolation Forest is fast, scales well to high-dimensional data, and handles multivariate anomaly detection naturally.
```python
from sklearn.ensemble import IsolationForest
import numpy as np
import pandas as pd

# Create windowed features from the raw time series
def create_features(series, window=24):
    features = []
    for i in range(window, len(series)):
        window_data = series[i - window:i]
        features.append({
            'mean': np.mean(window_data),
            'std': np.std(window_data),
            'min': np.min(window_data),
            'max': np.max(window_data),
            'last': window_data[-1],
            'trend': np.polyfit(range(window), window_data, 1)[0],
        })
    return pd.DataFrame(features)

# Fit Isolation Forest on the windowed features
features = create_features(cpu_usage_series, window=24)
model = IsolationForest(contamination=0.01, random_state=42)
predictions = model.fit_predict(features)  # -1 = anomaly, 1 = normal
```
Matrix Profile: The Subsequence Analysis Powerhouse
Matrix Profile (Yeh et al., 2016) computes the distance between every subsequence in a time series and its nearest neighbor, producing a profile of how “unique” each subsequence is. Subsequences with high matrix profile values — meaning their nearest neighbor is unusually far away — are anomalous. Matrix Profile excels at detecting shapelet anomalies (unusual pattern shapes) and is remarkably efficient thanks to the STOMP algorithm, which computes the full matrix profile in O(n²) time, independent of subsequence length.
The Python library stumpy provides production-grade Matrix Profile implementations and remains one of the most underappreciated tools in the anomaly detection practitioner’s toolkit.
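To make the quantity concrete, here is a brute-force O(n²·m) illustration of what the matrix profile measures; `stumpy.stump` computes the same thing orders of magnitude faster and is what you should use in practice:

```python
import numpy as np

def matrix_profile_brute(series, m):
    """Brute-force matrix profile: for each length-m subsequence, the
    z-normalized distance to its nearest non-trivial neighbor.
    For illustration only — use stumpy.stump in practice."""
    x = np.asarray(series, dtype=float)
    n = len(x) - m + 1
    subs = np.array([x[i:i + m] for i in range(n)])
    subs = (subs - subs.mean(axis=1, keepdims=True)) / subs.std(axis=1, keepdims=True)
    profile = np.full(n, np.inf)
    excl = m // 2  # exclusion zone around trivial self-matches
    for i in range(n):
        d = np.linalg.norm(subs - subs[i], axis=1)
        d[max(0, i - excl):i + excl + 1] = np.inf
        profile[i] = d.min()
    return profile

# A repeating pattern with one anomalous subsequence
t = np.arange(400)
series = np.sin(2 * np.pi * t / 50)
series[210:225] = 0.0  # flatten one stretch: a shapelet anomaly
mp = matrix_profile_brute(series, m=25)
print(int(np.argmax(mp)))  # peaks in/near the flattened region
```

Every normal window has a near-identical twin one period away (distance near zero); the windows overlapping the flattened stretch have no twin anywhere, so the profile spikes exactly there.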
The Deep Learning Revolution in Anomaly Detection
Starting around 2019, deep learning models began consistently outperforming classical methods on complex, multivariate anomaly detection benchmarks. The key insight: deep neural networks can learn non-linear temporal patterns that are invisible to linear statistical models.
LSTM Autoencoders: The First Deep Success
The LSTM Autoencoder architecture — an encoder that compresses a time-series window into a latent representation, followed by a decoder that reconstructs the original window — became the first widely adopted deep learning approach for time-series anomaly detection. The model learns to reconstruct “normal” patterns during training. At inference time, windows with high reconstruction error are flagged as anomalous, because the model has never learned to reconstruct those patterns.
LSTM Autoencoders handle temporal dependencies (the LSTM component) and learn what to expect (the autoencoder objective) simultaneously. They were the standard deep approach from roughly 2019-2022 and remain effective for many applications.
```python
import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    def __init__(self, n_features, hidden_size=64, n_layers=2):
        super().__init__()
        self.encoder = nn.LSTM(
            n_features, hidden_size, n_layers, batch_first=True
        )
        self.decoder = nn.LSTM(
            hidden_size, hidden_size, n_layers, batch_first=True
        )
        self.output_layer = nn.Linear(hidden_size, n_features)

    def forward(self, x):
        # Encode: compress the sequence into the final hidden state
        _, (hidden, cell) = self.encoder(x)
        # Decode: reconstruct the sequence from the repeated latent vector
        seq_len = x.size(1)
        decoder_input = hidden[-1].unsqueeze(1).repeat(1, seq_len, 1)
        decoder_out, _ = self.decoder(decoder_input)
        reconstruction = self.output_layer(decoder_out)
        return reconstruction

# Anomaly score = reconstruction error (MSE per window)
# High reconstruction error → anomaly
```
GDN and GNN-Based Methods: Modeling Inter-Metric Relationships
Graph Deviation Network (GDN) (Deng & Hooi, 2021) introduced an elegant solution for multivariate anomaly detection: model the relationships between sensors/metrics as a graph, where each node is a time series and edges represent learned dependencies. When a metric deviates from what the graph structure predicts based on its neighbors’ values, it’s flagged as anomalous.
GDN’s key advantage is its ability to identify anomalies that are invisible in individual metrics but manifest as broken inter-metric correlations. For example, in a server cluster, CPU and memory usage typically correlate. If CPU spikes but memory doesn’t — or vice versa — GDN detects the correlation violation, even if both values are individually within normal ranges.
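GDN learns these dependency structures automatically; a hand-coded sketch of the anomaly class it targets — flagging timestamps where a known-correlated pair of metrics decouples, even though each stays individually in range — looks like this (the rolling-correlation check is an illustration, not GDN's method):

```python
import numpy as np

def correlation_break_flags(a, b, window=60, corr_floor=0.5):
    """Flag timestamps where two normally-correlated metrics decouple:
    the rolling Pearson correlation over the trailing window drops
    below corr_floor. Unlike GDN, the pair (a, b) must be known
    in advance to move together."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    flags = np.zeros(len(a), dtype=bool)
    for i in range(window, len(a)):
        wa, wb = a[i - window:i], b[i - window:i]
        if wa.std() > 0 and wb.std() > 0:
            flags[i] = np.corrcoef(wa, wb)[0, 1] < corr_floor
    return flags

# CPU and memory share a common load driver until memory decouples
rng = np.random.default_rng(1)
t = np.arange(600)
load = 50 + 10 * np.sin(2 * np.pi * t / 100)   # shared load cycle
cpu = load + rng.normal(0, 1, 600)
mem = load + rng.normal(0, 1, 600)
mem[400:] = 50 + rng.normal(0, 1, 200)  # memory stops tracking load
flags = correlation_break_flags(cpu, mem, window=60)
print(flags[300], flags[500])
```

After index 400, every memory reading is still inside its historical range — only the broken CPU-memory correlation reveals the problem.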
USAD: UnSupervised Anomaly Detection
USAD (Audibert et al., 2020) combines autoencoders with adversarial training. Two decoder networks compete: one reconstructs the input from the latent space, while the other tries to reconstruct the first decoder’s output. This adversarial training scheme forces the autoencoders to learn sharper boundaries between normal and anomalous patterns, significantly improving detection accuracy compared to standard autoencoders. USAD is fast to train, works well on multivariate data, and has become a popular baseline in academic benchmarks.
Transformer-Based Models: The Current State of the Art
The transformer architecture — originally designed for natural language processing — has proven remarkably effective for time-series analysis. Its self-attention mechanism can capture long-range dependencies in sequences without the vanishing gradient problems that limit RNNs and LSTMs. Several transformer-based models have set new state-of-the-art results on anomaly detection benchmarks.
Anomaly Transformer (ICLR 2022)
Anomaly Transformer (Xu et al., 2022) introduced a key insight: in normal time-series data, each point’s attention pattern should focus on adjacent points (the “prior-association”) and on semantically similar points elsewhere in the series (the “series-association”). These two association patterns align for normal data but diverge for anomalies. Anomaly Transformer introduces an Association Discrepancy metric that measures this divergence, providing a principled anomaly score.
The model achieved state-of-the-art results on six benchmark datasets at the time of publication and remains among the strongest methods for unsupervised multivariate anomaly detection. Its key contribution — using attention pattern discrepancy rather than reconstruction error as the anomaly score — represents a conceptual advance over prior autoencoder-based approaches.
DCdetector: Dual Attention Contrastive (ICML 2023)
DCdetector (Yang et al., 2023) builds on the association discrepancy idea with a contrastive learning framework. It creates two representations of each time step — one from a “patch-wise” attention view and one from a “channel-wise” attention view — and uses contrastive learning to maximize agreement for normal patterns and divergence for anomalies. DCdetector achieved new state-of-the-art results on multiple benchmarks, improving on Anomaly Transformer’s F1 scores by 2-5 points on several datasets.
TimesNet: From Temporal to Spatial (ICLR 2023)
TimesNet (Wu et al., 2023) takes a creative approach: it transforms 1D time-series data into 2D representations by reshaping each period (daily, weekly, etc.) into a 2D image-like tensor, then applies 2D convolutional neural networks to capture both intra-period and inter-period patterns simultaneously. This transformation allows TimesNet to leverage the powerful feature extraction capabilities of CNNs — originally developed for computer vision — on temporal data.
TimesNet is a general-purpose time-series model (it handles forecasting, classification, and anomaly detection), and its multi-task capability makes it a strong choice for teams that need a single architecture serving multiple analytical needs.
| Model | Year | Core Idea | Strengths | Limitations |
|---|---|---|---|---|
| LSTM Autoencoder | 2019 | Reconstruct normal patterns | Simple, well-understood | Limited long-range context |
| GDN | 2021 | Graph-based inter-metric modeling | Catches correlation anomalies | Complex graph construction |
| Anomaly Transformer | 2022 | Attention association discrepancy | Strong benchmark results | Computationally expensive |
| TimesNet | 2023 | 1D→2D transformation + CNN | Multi-task capable | Assumes periodic structure |
| DCdetector | 2023 | Dual-attention contrastive learning | SOTA on multiple benchmarks | Requires careful tuning |
Foundation Models for Time Series: The 2025-2026 Frontier
The most exciting development in time-series analysis over the past two years has been the emergence of foundation models — large, pre-trained models that can perform time-series tasks (including anomaly detection) on data they’ve never seen before, without task-specific training. This is the same paradigm shift that GPT brought to language and CLIP brought to vision: train once on massive diverse data, then apply to arbitrary downstream tasks via fine-tuning or zero-shot inference.
TimesFM (Google, 2024)
TimesFM (Time Series Foundation Model) was developed by Google Research and pre-trained on approximately 100 billion time points from diverse sources — financial markets, weather stations, energy consumption, web traffic, and synthetic data. The 200-million-parameter model is a decoder-only transformer that generates point forecasts; anomaly detection is achieved by flagging observations that deviate significantly from the model’s zero-shot forecast.
TimesFM’s remarkable property is that it produces competitive forecasts — and therefore competitive anomaly detection — without ever seeing your specific data during training. You feed it a time series, it generates a forecast based on patterns learned from 100 billion diverse time points, and you compare actuals against forecasts. This zero-shot capability eliminates the need for per-dataset model training, dramatically reducing time-to-deployment for new monitoring use cases.
Chronos (Amazon, 2024)
Chronos (Ansari et al., 2024) from Amazon takes an innovative approach: it tokenizes time-series values into discrete bins (similar to how language models tokenize words) and then applies a standard language model architecture (T5) to the tokenized sequence. This allows Chronos to leverage battle-tested language model architectures and training recipes for time-series tasks.
Chronos offers multiple model sizes (Mini: 20M, Small: 46M, Base: 200M, Large: 710M parameters) and performs remarkably well in zero-shot evaluations. For anomaly detection, the approach is forecast-based: Chronos generates probabilistic forecasts, and observations falling outside the prediction intervals are flagged as anomalous.
```python
import torch
from chronos import ChronosPipeline

# Load a pre-trained Chronos model
pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-base",
    device_map="auto",
    torch_dtype=torch.float32,
)

# Generate a probabilistic forecast (zero-shot — no training needed)
context = torch.tensor(historical_data)  # your time series
forecast = pipeline.predict(
    context,
    prediction_length=24,  # forecast the next 24 steps
    num_samples=100,       # generate 100 forecast samples
)

# Anomaly detection via prediction intervals
# (Tensor.median returns a named tuple; Tensor.quantile returns a plain tensor)
median_forecast = forecast.median(dim=1).values
lower_bound = forecast.quantile(0.025, dim=1)  # 2.5th percentile
upper_bound = forecast.quantile(0.975, dim=1)  # 97.5th percentile

# Points outside the 95% prediction interval are anomalies
anomalies = (actual_values < lower_bound) | (actual_values > upper_bound)
```
MOMENT (CMU, 2024)
MOMENT (Goswami et al., 2024) — Multi-task Open-source pre-trained Model for Every Time series — is a family of models specifically designed for multiple time-series tasks, including anomaly detection, classification, forecasting, and imputation. Unlike TimesFM and Chronos, which approach anomaly detection indirectly through forecasting, MOMENT is explicitly trained with an anomaly detection objective during pre-training.
MOMENT uses a masked reconstruction objective: during pre-training, random patches of the time series are masked, and the model learns to reconstruct them. For anomaly detection, the reconstruction error at each time step serves as the anomaly score. Observations that are hard for the model to reconstruct from context — because they deviate from patterns the model has learned across its massive pre-training dataset — receive high anomaly scores.
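The scoring principle can be illustrated without the model itself: hide a patch, reconstruct it from the surrounding context, and score by reconstruction error. The toy sketch below substitutes linear interpolation for MOMENT's pre-trained transformer — so it only catches what interpolation cannot explain — but the scoring logic has the same shape:

```python
import numpy as np

def masked_reconstruction_scores(series, patch=8):
    """Toy illustration of masked-reconstruction scoring: hide each
    patch, 'reconstruct' it by linearly interpolating between its
    neighbors (MOMENT uses a pre-trained transformer instead), and
    score each point by squared reconstruction error."""
    x = np.asarray(series, dtype=float)
    scores = np.zeros(len(x))
    for start in range(patch, len(x) - patch, patch):
        left, right = x[start - 1], x[start + patch]
        recon = np.linspace(left, right, patch + 2)[1:-1]
        scores[start:start + patch] = (x[start:start + patch] - recon) ** 2
    return scores

t = np.arange(256)
series = np.sin(2 * np.pi * t / 32)
series[100] = 3.0  # point anomaly
scores = masked_reconstruction_scores(series)
print(int(np.argmax(scores)))  # → 100
```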
MOMENT is open-source, available on Hugging Face, and supports fine-tuning for domain-specific applications. Its anomaly detection performance is competitive with specialized models that were trained on the target dataset, despite MOMENT requiring zero task-specific training.
Timer and TimeGPT: Commercial and Research Alternatives
TimeGPT (Nixtla, 2024) is a commercially available foundation model with an API-based interface. Users send time-series data to the API and receive forecasts and anomaly scores without managing any model infrastructure. TimeGPT is attractive for teams that want foundation model capabilities without the complexity of model deployment, though it requires sending data to an external service — a non-starter for sensitive applications.
Timer (Liu et al., 2024) from Tsinghua University is a generative pre-trained transformer for time series that unifies multiple analytical tasks. It uses an autoregressive next-token prediction objective (analogous to GPT) on tokenized time-series data and can perform anomaly detection, forecasting, and imputation in a single framework.
| Foundation Model | Origin | Parameters | Open Source | Anomaly Approach | Key Advantage |
|---|---|---|---|---|---|
| TimesFM | Google | 200M | Yes | Forecast-based | Massive pre-training data (100B points) |
| Chronos | Amazon | 20M-710M | Yes | Probabilistic forecast | Multiple sizes, LLM architecture |
| MOMENT | CMU | 40M-385M | Yes | Masked reconstruction | Explicit anomaly detection objective |
| TimeGPT | Nixtla | Undisclosed | No (API) | Forecast-based | Zero infrastructure, API-ready |
| Timer | Tsinghua | 67M | Yes | Autoregressive | GPT-style unified framework |
Benchmarks and Real-World Performance
The academic community evaluates anomaly detection models on several standard benchmark datasets. Understanding these benchmarks — and their limitations — helps calibrate expectations for real-world performance.
| Dataset | Domain | Dimensions | Anomaly % | Key Challenge |
|---|---|---|---|---|
| SMD | Server Machines | 38 | ~4.2% | Multi-entity, diverse patterns |
| MSL | NASA Spacecraft | 55 | ~10.7% | Telemetry with complex physics |
| SMAP | NASA Soil Moisture | 25 | ~13.1% | Sensor noise, gradual drifts |
| SWaT | Water Treatment Plant | 51 | ~12.1% | Cyber-physical attacks, subtle |
| PSM | eBay Server Metrics | 25 | ~27.8% | High anomaly rate, noisy labels |
Practical Guide: Choosing the Right Model for Your Problem
With so many available models, the selection decision can feel overwhelming. Here’s a practical decision framework based on real-world constraints:
Decision Framework
Do you have labeled anomaly data?
- Yes (100+ labeled anomalies): Fine-tune a supervised or semi-supervised model. Consider fine-tuning MOMENT or training DCdetector with the labels guiding threshold selection.
- No: Use unsupervised methods. Continue to next question.
Is this a new deployment with no historical training data?
- Yes: Use a foundation model (Chronos, TimesFM, or MOMENT) in zero-shot mode. You’ll get competitive detection immediately without any training.
- No (ample historical data): Train a specialized model for best performance. Continue to next question.
Univariate or multivariate?
- Univariate (single metric): STL decomposition + thresholding is hard to beat for simplicity and interpretability. For higher accuracy, use Matrix Profile or an LSTM autoencoder.
- Multivariate (many correlated metrics): Use Anomaly Transformer, DCdetector, or GDN to capture inter-metric correlations.
Latency requirements?
- Real-time (sub-second): Avoid transformer models for inference. Use Isolation Forest, streaming Matrix Profile (via STUMPY), or lightweight LSTM models.
- Near-real-time (seconds to minutes): Any model is feasible with proper infrastructure.
- Batch (hourly/daily): Prioritize accuracy over speed. Use the most capable model available.
Implementation: Building an Anomaly Detection Pipeline
A production anomaly detection system involves more than just a model. Here’s the full pipeline architecture:
```python
# Complete anomaly detection pipeline with Chronos
import torch
import numpy as np
from chronos import ChronosPipeline
from dataclasses import dataclass

@dataclass
class AnomalyResult:
    timestamp: str
    value: float
    expected: float
    lower_bound: float
    upper_bound: float
    anomaly_score: float
    is_anomaly: bool

class TimeSeriesAnomalyDetector:
    def __init__(
        self,
        model_name: str = "amazon/chronos-t5-small",
        context_length: int = 512,
        prediction_length: int = 1,
        confidence_level: float = 0.95,
    ):
        self.pipeline = ChronosPipeline.from_pretrained(
            model_name,
            device_map="auto",
            torch_dtype=torch.float32,
        )
        self.context_length = context_length
        self.prediction_length = prediction_length
        self.alpha = 1 - confidence_level

    def detect(
        self,
        history: np.ndarray,
        actual_value: float,
        timestamp: str,
    ) -> AnomalyResult:
        """Detect if actual_value is anomalous given history."""
        # Use the last context_length points
        context = torch.tensor(
            history[-self.context_length:]
        ).unsqueeze(0).float()
        # Generate a probabilistic forecast
        forecast = self.pipeline.predict(
            context,
            prediction_length=self.prediction_length,
            num_samples=200,
        )
        # Extract prediction intervals from the forecast samples
        # (Tensor.quantile returns a plain tensor, unlike Tensor.median)
        median = forecast.median(dim=1).values[0, 0].item()
        lower = forecast.quantile(self.alpha / 2, dim=1)[0, 0].item()
        upper = forecast.quantile(1 - self.alpha / 2, dim=1)[0, 0].item()
        # Anomaly score: deviation normalized by interval width
        interval_width = upper - lower
        if interval_width > 0:
            score = abs(actual_value - median) / interval_width
        else:
            score = abs(actual_value - median)
        is_anomaly = actual_value < lower or actual_value > upper
        return AnomalyResult(
            timestamp=timestamp,
            value=actual_value,
            expected=median,
            lower_bound=lower,
            upper_bound=upper,
            anomaly_score=score,
            is_anomaly=is_anomaly,
        )

# Usage
detector = TimeSeriesAnomalyDetector()
result = detector.detect(
    history=cpu_usage_last_7_days,
    actual_value=current_cpu_reading,
    timestamp="2026-04-03T08:15:00Z",
)
if result.is_anomaly:
    print(f"ANOMALY at {result.timestamp}: "
          f"value={result.value:.1f}, "
          f"expected={result.expected:.1f} "
          f"[{result.lower_bound:.1f}, {result.upper_bound:.1f}]")
```
Key pipeline components beyond the model itself:
- Data preprocessing: Handle missing values (forward-fill or interpolation), normalize scales across metrics, align timestamps across data sources.
- Threshold calibration: Use a validation period of known-normal data to calibrate anomaly thresholds. A threshold set too low floods operators with false positives; too high misses real incidents.
- Suppression and deduplication: A single incident may trigger dozens of anomaly alerts across correlated metrics. Group alerts by time window and root cause to avoid alert fatigue.
- Feedback loop: Operators who acknowledge or dismiss alerts provide implicit labels. Feed this data back into the model as fine-tuning signal to improve detection over time.
- Seasonal awareness: Explicitly model known business cycles (daily patterns, weekend effects, holiday traffic changes) to reduce false positives during expected-but-unusual periods.
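For the calibration step above, a minimal sketch: given anomaly scores computed over a known-normal validation window, choose the threshold that would produce an acceptable alert rate on normal data (the 0.1% budget below is an assumed operational choice, not a universal constant):

```python
import numpy as np

def calibrate_threshold(validation_scores, target_alert_rate=0.001):
    """Pick an anomaly-score threshold from a known-normal validation
    period so that roughly target_alert_rate of normal points would
    alert. Lower rates mean fewer false positives but less sensitivity."""
    scores = np.asarray(validation_scores, dtype=float)
    return float(np.quantile(scores, 1 - target_alert_rate))

# Example: scores from a quiet validation week, here simulated as
# half-normal for illustration
rng = np.random.default_rng(42)
normal_scores = np.abs(rng.normal(0, 1, 100_000))
threshold = calibrate_threshold(normal_scores, target_alert_rate=0.001)
print(round(threshold, 2))  # ≈ 3.29 for half-normal scores
```

Recalibrate periodically: under concept drift, a threshold fit to January's score distribution will slowly go stale.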
Where the Field Is Heading
Time-series anomaly detection is at an inflection point. The convergence of foundation models, transformer architectures, and practical tooling is making it possible to deploy sophisticated anomaly detection systems with dramatically less effort than even two years ago. Where a 2022 deployment required collecting domain-specific training data, training a specialized model, and calibrating thresholds through iterative experimentation, a 2026 deployment can start with a zero-shot foundation model that delivers competitive performance from day one and improves with domain-specific fine-tuning.
Several trends will shape the next 2-3 years:
Multimodal foundation models that jointly reason over time-series metrics, log messages, and trace data are emerging from research labs. An anomaly detection system that can correlate a latency spike with a specific error message in the application logs and a deployment event in the change management system would dramatically reduce mean time to diagnosis — not just detection.
LLM-augmented anomaly explanation is another frontier. Current systems tell you that something is anomalous; they rarely tell you why. Integrating LLMs that can explain anomaly detections in natural language (“CPU spiked to 95% at 3:14 PM, coinciding with a deployment of version 2.4.1 to the payment service; historical pattern suggests a connection between this deployment and similar spikes”) would close the gap between detection and remediation.
Edge deployment of lightweight anomaly detection models is becoming practical as foundation model distillation techniques improve. Running a compact anomaly detector directly on IoT devices, industrial sensors, or network routers — without round-tripping data to a cloud service — enables real-time detection with lower latency and better data privacy.
The field has moved from “can we detect anomalies automatically?” (yes, reliably, since the late 2010s) to “can we detect anomalies without per-dataset training?” (yes, with foundation models, since 2024) to the current frontier: “can we detect, explain, and suggest remediation, all in real time?” That question is being actively answered, and the pace of progress suggests it won’t be open for long.
References
- Xu, Jiehui, et al. “Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy.” ICLR 2022.
- Yang, Yiyuan, et al. “DCdetector: Dual Attention Contrastive Representation Learning for Time Series Anomaly Detection.” ICML 2023.
- Wu, Haixu, et al. “TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis.” ICLR 2023.
- Ansari, Abdul Fatir, et al. “Chronos: Learning the Language of Time Series.” arXiv:2403.07815, 2024.
- Das, Abhimanyu, et al. “A Decoder-Only Foundation Model for Time-Series Forecasting.” (TimesFM) ICML 2024.
- Goswami, Mononito, et al. “MOMENT: A Family of Open Time-Series Foundation Models.” ICML 2024.
- Deng, Ailin, and Bryan Hooi. “Graph Neural Network-Based Anomaly Detection in Multivariate Time Series.” AAAI 2021.
- Audibert, Julien, et al. “USAD: UnSupervised Anomaly Detection on Multivariate Time Series.” KDD 2020.
- Kim, Siwon, et al. “Towards a Rigorous Evaluation of Time-Series Anomaly Detection.” AAAI 2023.
- Liu, Fei Tony, Kai Ming Ting, and Zhi-Hua Zhou. “Isolation Forest.” ICDM 2008.
- Yeh, Chin-Chia Michael, et al. “Matrix Profile I: All Pairs Similarity Joins for Time Series.” ICDM 2016.
- Time-Series-Library (THU) — Unified framework for time-series models including anomaly detection
- Amazon Chronos GitHub Repository
- MOMENT GitHub Repository