Fall Semester 2017, 손시운 (ssw5176@kangwon.ac.kr), ysmoon/courses/2017_2/dm/p-09.pdf ·...
Time Series
What is a time series?
– A set of data observed over time
– e.g. a person's blood pressure, Obama's popularity rating, the annual rainfall in Seattle, the value of Google stock, etc.
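Time-Series Data Mining
What is time-series data mining?
– The task of finding meaning in time-series data using various mining techniques
Similarity
– A numeric value that measures how similar two time series are
• Euclidean distance, DTW distance, etc. (for distances, a smaller value means more similar)
– Similarity is used in most time-series data mining techniques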
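Data Generation
Generate 16 sequences, each consisting of 10 real numbers in the range 0 to 100
– The first sequence (ts_data[1]) is used as the query
– The remaining sequences (ts_data[2:16]) are used as the data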
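The generation code from the slide is not part of this transcript. As a minimal sketch, assuming `ts_data` is a list of numeric vectors and the values are drawn uniformly at random, the setup could look like the following. The seed and the names `n_sequences`, `seq_length`, `query`, and `data_seqs` are illustrative assumptions, not taken from the original.

```r
# Assumption: a fixed seed so the sketch is reproducible (not in the original).
set.seed(2017)

n_sequences <- 16  # number of sequences, as stated on the slide
seq_length  <- 10  # values per sequence

# Each sequence holds 10 real numbers drawn uniformly from [0, 100].
# simplify = FALSE keeps the result as a list of numeric vectors.
ts_data <- replicate(n_sequences,
                     runif(seq_length, min = 0, max = 100),
                     simplify = FALSE)

query     <- ts_data[[1]]   # first sequence: the query
data_seqs <- ts_data[2:16]  # remaining 15 sequences: the data
```

With a list, `ts_data[1]` returns a one-element list, while `ts_data[[1]]` extracts the numeric vector itself, which is what the distance computations need.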
Data Plotting
Plot the data as two-dimensional line graphs for comparison
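The plotting code is likewise missing from the transcript. One possible sketch, assuming randomly generated data as described on the previous slide, overlays all sequences in a single line chart with base R's `matplot()` and highlights the query. The colors and axis labels are arbitrary choices made for illustration.

```r
# Assumption: regenerate illustrative data (16 sequences of 10 values in [0, 100]).
set.seed(2017)
ts_data <- replicate(16, runif(10, min = 0, max = 100), simplify = FALSE)

# matplot() draws one line per matrix column, so bind the sequences as columns.
ts_matrix <- do.call(cbind, ts_data)

# Draw every sequence as a grey line on shared axes.
matplot(ts_matrix, type = "l", lty = 1, col = "grey60",
        xlab = "Time index", ylab = "Value", ylim = c(0, 100))

# Highlight the query (first sequence) on top of the others.
lines(ts_matrix[, 1], col = "red", lwd = 2)
```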
Euclidean Distance (1/3)
The most basic similarity measure
Measures the similarity of two time series by computing the distance between each pair of corresponding points
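For two sequences Q and S of equal length n, the Euclidean distance is the square root of the sum of squared differences between corresponding points, sqrt(sum((q_i - s_i)^2)). A hand-written sketch in R, where the function name `euclidean_distance` and the example vectors are made up for illustration:

```r
# Euclidean distance between two equal-length numeric sequences.
euclidean_distance <- function(q, s) {
  stopifnot(length(q) == length(s))  # point-by-point comparison needs equal lengths
  sqrt(sum((q - s)^2))
}

q <- c(0, 0, 0)
s <- c(3, 4, 0)
euclidean_distance(q, s)  # sqrt(3^2 + 4^2 + 0^2) = 5
```

Because the i-th point of one sequence is always compared with the i-th point of the other, the measure requires sequences of the same length and is sensitive to shifts along the time axis.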
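Euclidean Distance (2/3)
In R, it can be computed simply with the dist() function
– e.g. computing the distance between the query sequence and the first data sequence
To find the most similar time series, compare the query against every data sequence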
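The slide's code is not included in the transcript, so the following is only a sketch of how `dist()` might be applied here. `dist()` computes distances between the rows of a matrix (Euclidean by default), so the two sequences are stacked with `rbind()`. The data generation, seed, and variable names are assumptions.

```r
# Assumption: illustrative data, 16 sequences of 10 values in [0, 100].
set.seed(2017)
ts_data   <- replicate(16, runif(10, min = 0, max = 100), simplify = FALSE)
query     <- ts_data[[1]]
data_seqs <- ts_data[2:16]

# Distance between the query and the first data sequence.
# dist() returns a "dist" object; as.numeric() extracts the single value.
d_first <- as.numeric(dist(rbind(query, data_seqs[[1]])))

# Distance between the query and every data sequence.
ed <- sapply(data_seqs, function(s) as.numeric(dist(rbind(query, s))))

# Position (within data_seqs) of the most similar sequence.
best <- which.min(ed)
```

Since `data_seqs` starts at the second element of `ts_data`, position `best` corresponds to `ts_data[[best + 1]]`.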
Euclidean Distance (3/3)
Plot the data sequences in order of similarity to the query sequence, most similar first
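A possible sketch of the ranked plot, assuming the same illustrative random data: `order()` sorts the data sequences by ascending distance, and a 3-by-5 panel grid shows all 15 of them. The grid layout, margins, and panel titles are choices made for this example rather than the slide's actual settings.

```r
# Assumption: illustrative data, 16 sequences of 10 values in [0, 100].
set.seed(2017)
ts_data   <- replicate(16, runif(10, min = 0, max = 100), simplify = FALSE)
query     <- ts_data[[1]]
data_seqs <- ts_data[2:16]

# Euclidean distance from the query to each data sequence.
ed <- sapply(data_seqs, function(s) sqrt(sum((query - s)^2)))

# Positions in data_seqs sorted by ascending distance (most similar first).
ranking <- order(ed)

# One panel per data sequence, filled row by row in rank order.
old_par <- par(mfrow = c(3, 5), mar = c(2, 2, 2, 1))
for (k in ranking) {
  plot(data_seqs[[k]], type = "l", ylim = c(0, 100),
       main = sprintf("ts_data[%d]: %.1f", k + 1, ed[k]))
}
par(old_par)  # restore the previous graphics settings
```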
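Dynamic Time Warping Distance (1/5)
More complex than Euclidean distance, but a more accurate similarity measure when sequences are shifted or stretched along the time axis
Compares each point with the corresponding point and its neighboring points in the other sequence, and aligns the points so that the accumulated distance is minimized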
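To make the alignment idea concrete, here is a minimal dynamic-programming sketch of DTW in base R. It assumes the classic unweighted recurrence, D[i, j] = |q_i - s_j| + min(D[i-1, j], D[i, j-1], D[i-1, j-1]), with absolute difference as the point-wise cost. Library implementations may use other step patterns or costs, so their values can differ from this one. The function name `dtw_distance` is hypothetical.

```r
# DTW distance via dynamic programming (classic unweighted recurrence).
dtw_distance <- function(q, s) {
  n <- length(q)
  m <- length(s)

  # D[i + 1, j + 1] holds the minimum accumulated cost of aligning
  # q[1..i] with s[1..j]; the extra first row and column act as borders.
  D <- matrix(Inf, nrow = n + 1, ncol = m + 1)
  D[1, 1] <- 0

  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- abs(q[i] - s[j])
      # Extend the cheapest of the three neighboring alignments.
      D[i + 1, j + 1] <- cost + min(D[i, j + 1],  # q advances, s[j] is reused
                                    D[i + 1, j],  # s advances, q[i] is reused
                                    D[i, j])      # both advance
    }
  }
  D[n + 1, m + 1]
}

# The second sequence repeats the value 2; warping absorbs the repetition.
dtw_distance(c(1, 2, 3), c(1, 2, 2, 3))  # 0
```

Unlike Euclidean distance, this works on sequences of different lengths, because one point may be matched to several points in the other sequence.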
Dynamic Time Warping Distance (2/5)
Download and install the DTW package
– https://cran.r-project.org/web/packages/dtw/index.html
– or ...
Dynamic Time Warping Distance (3/5)
Computing DTW
Plotting DTW
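The slide's code for these two steps is not in the transcript. A hedged sketch using the `dtw` package, assuming it is already installed and using illustrative random data: `dtw()` returns an alignment object whose `distance` element is the DTW distance, and `plot()` with `type = "twoway"` draws both sequences with lines connecting the matched points.

```r
library(dtw)  # assumes the dtw package is already installed

# Assumption: illustrative data, 16 sequences of 10 values in [0, 100].
set.seed(2017)
ts_data   <- replicate(16, runif(10, min = 0, max = 100), simplify = FALSE)
query     <- ts_data[[1]]
reference <- ts_data[[2]]  # first data sequence

# Computing DTW: keep.internals = TRUE stores the input series in the
# result so that the alignment can be plotted afterwards.
alignment <- dtw(query, reference, keep.internals = TRUE)
alignment$distance  # accumulated DTW distance

# Plotting DTW: both series with the matched point pairs connected.
plot(alignment, type = "twoway")
```

The elements `alignment$index1` and `alignment$index2` give the matched positions in the query and the reference along the warping path.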
Dynamic Time Warping Distance (4/5)
To find the most similar time series, compare the query against every data sequence
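One way this comparison might be written with the `dtw` package, again assuming illustrative random data and assumed variable names. Setting `distance.only = TRUE` skips computing the warping path, which is enough when only the distance values are needed.

```r
library(dtw)  # assumes the dtw package is already installed

# Assumption: illustrative data, 16 sequences of 10 values in [0, 100].
set.seed(2017)
ts_data   <- replicate(16, runif(10, min = 0, max = 100), simplify = FALSE)
query     <- ts_data[[1]]
data_seqs <- ts_data[2:16]

# DTW distance from the query to every data sequence.
dtw_dist <- sapply(data_seqs, function(s) {
  dtw(query, s, distance.only = TRUE)$distance
})

# Position (within data_seqs) of the most similar sequence under DTW.
best <- which.min(dtw_dist)
```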
Dynamic Time Warping Distance (5/5)
Plot the data sequences in order of similarity to the query sequence, most similar first
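A sketch of the ranked plot under DTW, assuming the `dtw` package is installed and the data is randomly generated as before. Here each panel also overlays the query as a dashed line so that the match can be judged visually; that overlay, the grid layout, and the titles are illustrative choices, not the slide's.

```r
library(dtw)  # assumes the dtw package is already installed

# Assumption: illustrative data, 16 sequences of 10 values in [0, 100].
set.seed(2017)
ts_data   <- replicate(16, runif(10, min = 0, max = 100), simplify = FALSE)
query     <- ts_data[[1]]
data_seqs <- ts_data[2:16]

# DTW distance from the query to each data sequence.
dtw_dist <- sapply(data_seqs, function(s) {
  dtw(query, s, distance.only = TRUE)$distance
})

# Positions in data_seqs sorted by ascending DTW distance.
ranking <- order(dtw_dist)

old_par <- par(mfrow = c(3, 5), mar = c(2, 2, 2, 1))
for (k in ranking) {
  plot(data_seqs[[k]], type = "l", ylim = c(0, 100),
       main = sprintf("ts_data[%d]: %.1f", k + 1, dtw_dist[k]))
  lines(query, lty = 2, col = "red")  # query overlaid for reference
}
par(old_par)  # restore the previous graphics settings
```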