USAD: UnSupervised Anomaly Detection on Multivariate Time Series
   by ์†์ง€์šฐ

Audibert, J., Michiardi, P., Guyard, F., Marti, S., & Zuluaga, M. A. (2020, August). Usad: Unsupervised anomaly detection on multivariate time series. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 3395-3404).

In Short

1. Introduction

  • Problem with AE-based AD: when an anomaly is similar to the normal data, its reconstruction loss is small and detection fails
  • Problem with GAN-based AD: unstable training (mode collapse or non-convergence)

USAD grafts adversarial training onto an AutoEncoder to resolve both limitations.

2-1. LSTM-AE

  • unsupervised
  • Training: train an LSTM-AE on the reconstruction error of normal data, so that it learns the distribution of normal data
  • Anomaly Detection: flag new data as anomalous when its reconstruction error exceeds a threshold.

$$
\text{Anomaly Score} = \bigg|\bigg|X - D(E(X))\bigg|\bigg|_2
$$
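As a minimal NumPy sketch of this scoring rule (the `encode`/`decode` pair below is a hypothetical stand-in for the trained LSTM-AE, not the paper's model):

```python
import numpy as np

def anomaly_score(x, encode, decode):
    """L2 reconstruction error ||x - D(E(x))||_2 of a window x."""
    return np.linalg.norm(x - decode(encode(x)))

def is_anomaly(x, encode, decode, threshold):
    """Flag the window as anomalous when the score exceeds the threshold."""
    return anomaly_score(x, encode, decode) > threshold

# Toy stand-ins: an "AE" that projects onto one learned direction, so it
# reconstructs normal data well and off-manifold anomalies poorly.
w = np.array([1.0, 0.0])       # hypothetical learned direction
encode = lambda x: x @ w       # E: R^2 -> R
decode = lambda z: z * w       # D: R -> R^2

normal = np.array([2.0, 0.0])  # lies on the learned direction -> score ~ 0
anomaly = np.array([2.0, 3.0]) # deviates from it -> large score
```

The point of the sketch is only the decision rule: normal windows reconstruct almost perfectly, so any fixed threshold separates them from windows the AE cannot reproduce.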

2-2. MAD-GAN

  • unsupervised
  • Training: train an LSTM-based Generator and Discriminator on normal data only, so they learn the distribution of normal data
  • Anomaly Score = Reconstruction Loss + Discrimination Loss

$$
\text{Anomaly Score} = \lambda \Big|\Big| X - G(Z) \Big|\Big| + (1-\lambda)\, D(X)
$$
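The combined score is just a convex combination of the two terms; a small sketch (the generator output `g_z` and discriminator output `d_x` are assumed to come from a trained MAD-GAN):

```python
import numpy as np

def madgan_score(x, g_z, d_x, lam=0.5):
    """Anomaly score as written above: lam * ||X - G(Z)||
    plus (1 - lam) times the discriminator term D(X)."""
    return lam * np.linalg.norm(x - g_z) + (1 - lam) * d_x
```

With `lam = 0.5`, a reconstruction residual of 1.0 and a discriminator term of 0.2 give a score of 0.6.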

(Figure: MAD-GAN)

3. Methods

3-1. Autoencoder Training

(Figure: USAD, phase 1: autoencoder training)

3-2. Adversarial Training

(Figure: USAD, phase 2: adversarial training)

ํŽธ์˜์ƒ AutoEncoder1์„ AE1, AutoEncoder2์„ AE2๋ผ๊ณ  ํ•˜๊ฒ ๋‹ค.

AE1: (1) input ๋ณต์› & (2) AE2 ์†์ด๊ธฐ
AE2: (1) input ๋ณต์› & (2) AE1์ด ๋ณต์›ํ•œ ๋ฐ์ดํ„ฐ์™€ input ๊ตฌ๋ณ„

$$
\begin{align} &\min_{AE_1} \frac{1}{n}||X - AE_1(X)||_2 + (1-\frac{1}{n})||X - AE_2(AE_1(X))||_2 \\ &\min_{AE_2} \frac{1}{n}||X - AE_2(X)||_2 - (1-\frac{1}{n})||X - AE_2(AE_1(X))||_2 \end{align}
$$

Here n is the epoch index: early epochs emphasize the reconstruction loss, and the weight gradually shifts toward the adversarial term as training proceeds.
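The epoch-dependent weighting can be sketched directly from the two objectives above (the scaled-identity "autoencoders" below are hypothetical stand-ins, used only to make the arithmetic concrete):

```python
import numpy as np

def usad_losses(x, ae1, ae2, n):
    """Training objectives at epoch n (1-indexed): the 1/n weight on the
    reconstruction term decays while the adversarial weight (1 - 1/n) grows."""
    rec1 = np.linalg.norm(x - ae1(x))       # ||X - AE1(X)||_2
    rec2 = np.linalg.norm(x - ae2(x))       # ||X - AE2(X)||_2
    adv  = np.linalg.norm(x - ae2(ae1(x)))  # ||X - AE2(AE1(X))||_2
    loss_ae1 = (1 / n) * rec1 + (1 - 1 / n) * adv  # AE1 minimizes adv
    loss_ae2 = (1 / n) * rec2 - (1 - 1 / n) * adv  # AE2 maximizes adv
    return loss_ae1, loss_ae2

# Hypothetical stand-ins for the two autoencoders
ae1 = lambda v: 0.9 * v
ae2 = lambda v: 0.8 * v
x = np.ones(4)
```

At n = 1 the adversarial weight is zero and both losses reduce to pure reconstruction; at large n the adversarial term dominates, which is exactly the schedule described above.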

3-3. Anomaly Score

$$
\text{Anomaly Score} = \alpha \bigg|\bigg| X - AE_1(X) \bigg|\bigg|_2 + \beta \bigg|\bigg| X - AE_2(AE_1(X))\bigg|\bigg|_2
$$

  • \(\alpha + \beta = 1\)
  • \(\alpha > \beta\) : fewer true & false positives = low detection sensitivity scenario
  • \(\alpha < \beta\) : more true & false positives = high detection sensitivity scenario
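The sensitivity trade-off above can be shown with the score formula itself (again with hypothetical scaled-identity autoencoders; since the second term double-passes the input, it is typically the larger residual):

```python
import numpy as np

def usad_score(x, ae1, ae2, alpha=0.5, beta=0.5):
    """alpha * ||X - AE1(X)||_2 + beta * ||X - AE2(AE1(X))||_2,
    with alpha + beta = 1 steering detection sensitivity."""
    return (alpha * np.linalg.norm(x - ae1(x))
            + beta * np.linalg.norm(x - ae2(ae1(x))))

# Hypothetical stand-ins for the trained autoencoders
ae1 = lambda v: 0.9 * v
ae2 = lambda v: 0.8 * v
x = np.ones(4)

low_sensitivity  = usad_score(x, ae1, ae2, alpha=0.9, beta=0.1)
high_sensitivity = usad_score(x, ae1, ae2, alpha=0.1, beta=0.9)
```

For the same input, shifting weight onto \(\beta\) raises the score, so more windows cross any fixed threshold: more true positives, but also more false positives.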

4. Result

  • Sensitive to \(\alpha, \beta\).
  • Otherwise robust to hyperparameter changes overall (e.g., window size).

5. Conclusion

USAD grafts adversarial training onto an AutoEncoder to overcome the limitations of AE-based and GAN-based detection noted above.

Critical Point (MY OWN OPINION)
