Mirror Outlier Detection in Foreign Trade Data
Markos Fragkakis
NTTS 2009


Introduction
• Foreign Trade (FT) data
• Improving FT data quality is essential
• Quality can be assessed along several dimensions (e.g. accuracy, timeliness, clarity)
• We focus on accuracy, using outlier detection
• Methods for outlier detection (e.g. threshold-based, model-based)
• Presentation of the Mirror Outlier Detection application

Methodology
• Univariate detection in time series (value, quantity, supplementary quantity)
• Median Absolute Deviation (MAD)
• Robust
  ◦ median, not mean
  ◦ non-parametric

\[
T_i = \frac{x_i - M_1}{M_2} = \frac{x_i - M_1}{\operatorname{Median}_j\left(|x_j - M_1|\right)} > c
\]

where M_1 is the median of the series, M_2 is the median absolute deviation, and c is the detection threshold.
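The MAD test can be sketched in a few lines of Python. This is a minimal illustration, not the MOD implementation: the threshold c = 3.5, the function names and the sample series are all assumptions, and the sketch flags observations on |T_i| > c while keeping the sign of T_i, on the assumption that the sign is what the mirror comparison later refers to.

```python
from statistics import median


def mad_scores(series):
    """Return T_i = (x_i - M1) / M2 for every observation in the series."""
    m1 = median(series)                          # M1: median of the series
    m2 = median(abs(x - m1) for x in series)     # M2: median absolute deviation
    if m2 == 0:
        raise ValueError("MAD is zero; scores are undefined for this series")
    return [(x - m1) / m2 for x in series]


def detect_outliers(series, c=3.5):
    """Return {index: T_i} for observations with |T_i| > c (c is an assumed default)."""
    return {i: t for i, t in enumerate(mad_scores(series)) if abs(t) > c}


# Made-up monthly trade values with one obvious spike at position 5.
monthly_values = [100, 102, 98, 101, 99, 250, 103, 97]
outliers = detect_outliers(monthly_values)
print(outliers)
```

Because both M_1 and M_2 are medians, the single spike barely moves them, which is the robustness the slide refers to; a mean and standard deviation would be pulled towards the spike.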

Mirror Outlier Detection
• Characterization of outliers according to the mirror flow
• Possible outlier types:
  ◦ Green: outlier appears in mirror (same sign)
  ◦ Red: outlier does not appear in mirror
  ◦ Violet: outlier appears in mirror (opposite sign)
  ◦ Black: mirror series not present
  ◦ Pink: mirror series not present (confidentiality)
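The colour scheme above can be read as a small decision rule. The following Python sketch is one possible reading, assuming each outlier comes with a signed MAD score and that the mirror series, when available, supplies a score for the same period; the names `MirrorType` and `classify`, the `confidential` flag and the threshold value are hypothetical.

```python
from enum import Enum


class MirrorType(Enum):
    GREEN = "outlier appears in mirror (same sign)"
    RED = "outlier does not appear in mirror"
    VIOLET = "outlier appears in mirror (opposite sign)"
    BLACK = "mirror series not present"
    PINK = "mirror series not present (confidentiality)"


def classify(score, mirror_score=None, confidential=False, c=3.5):
    """Colour one detected outlier.

    score        -- signed score of the outlier in the reported series
    mirror_score -- signed score of the same period in the mirror series,
                    or None when the mirror series is absent
    confidential -- True when the mirror series is withheld (assumed flag)
    c            -- detection threshold (assumed value)
    """
    if mirror_score is None:
        return MirrorType.PINK if confidential else MirrorType.BLACK
    if abs(mirror_score) <= c:
        return MirrorType.RED            # mirror shows no outlier in that period
    same_sign = (score > 0) == (mirror_score > 0)
    return MirrorType.GREEN if same_sign else MirrorType.VIOLET


print(classify(5.0, 6.0).name, classify(5.0, 0.4).name, classify(5.0, -7.0).name)
```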

Additional functionalities
• Outlier classification (error in dimension, not observed values)
  ◦ Swapping of observations between series
  ◦ Copy of observations
  ◦ Time delay (hidden green outlier)
• Outlier detection in short series (product code changes)
• Reporting for
  ◦ Detected outliers per country (e-mailed)
  ◦ Summary reporting
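As an illustration of the time-delay case, the sketch below looks for a same-sign outlier in the mirror series a few periods before or after the reported outlier, which would make a red outlier a "hidden green" one. The search rule, the `max_lag` window, the threshold and the function name are assumptions made for this example; the slides do not specify how MOD detects the delay.

```python
def find_time_delay(score, index, mirror_scores, c=3.5, max_lag=2):
    """Return the signed lag (in periods) at which the mirror series shows a
    same-sign outlier, or None if no such outlier lies within max_lag periods.

    score         -- signed score of the outlier in the reported series
    index         -- position of that outlier in the series
    mirror_scores -- signed scores of the mirror series, aligned by period
    """
    for lag in range(1, max_lag + 1):            # nearest periods first
        for j in (index - lag, index + lag):
            if 0 <= j < len(mirror_scores):
                t = mirror_scores[j]
                if abs(t) > c and (t > 0) == (score > 0):
                    return j - index
    return None


# Made-up mirror scores: the mirror flow records the spike one period later.
mirror_scores = [0.2, -0.5, 0.1, 0.3, 6.1, 0.0]
print(find_time_delay(5.8, 3, mirror_scores))
```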

Example of detected outlier (figure)

Example of error due to swap (figure)

Error due to time delay (figure)

Technical Information
• MOD-DB has an RDBMS repository for storing outlier data (support for Oracle, MySQL)
• Implemented in Java (portability, maintainability)
• Command Line Interface
• Performance issues
  ◦ Large volumes of data cause a bottleneck in the DB
  ◦ Storage is in question (several GBs per month)

Architecture (figure)

Proposal for new platform
• Use a multi-dimensional viewer
• Enable OLAP functions (slice, dice, roll-up, drill-down)
• Create dynamic charts from data
• Estimated variables (indices from raw outlier data)
• Data mining could be performed for extracting inferences from data
  ◦ Log-linear models
• Pinpointing of poor data involving high values
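To make the OLAP operations concrete, here is a small Python sketch over a handful of made-up outlier records. The dimension names (reporter, product, month, colour), the codes and the helper functions are hypothetical and only illustrate slice, roll-up and drill-down on outlier counts; they do not describe the proposed platform itself.

```python
from collections import Counter

# Hypothetical fact rows: one row per detected outlier.
DIMENSIONS = ("reporter", "product", "month", "colour")
records = [
    ("DE", "0101", "2008-01", "green"),
    ("DE", "0101", "2008-02", "red"),
    ("DE", "0202", "2008-01", "red"),
    ("FR", "0101", "2008-01", "violet"),
    ("FR", "0202", "2008-02", "red"),
]


def slice_rows(rows, **fixed):
    """Slice (one dimension fixed) or dice (several fixed): filter the rows."""
    idx = {name: i for i, name in enumerate(DIMENSIONS)}
    return [r for r in rows if all(r[idx[k]] == v for k, v in fixed.items())]


def rollup(rows, *keep):
    """Roll-up: count outliers grouped by the kept dimensions only."""
    positions = [DIMENSIONS.index(k) for k in keep]
    return Counter(tuple(r[i] for i in positions) for r in rows)


red_only = slice_rows(records, colour="red")                     # slice
by_reporter = rollup(red_only, "reporter")                       # roll-up
by_reporter_product = rollup(red_only, "reporter", "product")    # drill-down
print(by_reporter)
print(by_reporter_product)
```

Drill-down here is simply a roll-up that keeps one more dimension, so the same function serves both directions.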

Conclusions
• Use of mirror flow for outlier characterisation
• New features
• Improving quality
• Enables building a new platform for data exploration
• Expansion of MOD to other FT data outside the EU and to other domains

Questions

Thank you for your attention