Visual Analytics Research at WPI
Dr. Matthew Ward and Dr. Elke Rundensteiner
Computer Science Department
What is Visual Analytics?
• “The science of analytical reasoning facilitated by interactive visual interfaces”, from Illuminating the Path – the Research and Development Agenda for Visual Analytics, J. Thomas and K. Cook (eds.), 2005
• More than information visualization or visual data mining, it involves technology to support all aspects of the analysis and reasoning processes.
An Overview of VA at WPI
Stages: Data Sources; Transforms and Abstractions; Visual Representations; Interaction Spaces; Discovery & Reasoning
• Past Work
– Files, databases, numeric, nominal
– Clustering, sampling, nominal to ordinal, dimension reduction
– Data (multiple), statistics, structure (hierarchy)
– Data, structure (hierarchy)
– Clusters, associations
• Recent Work
– Quality, uncertainty, missing values
– Clutter reduction
– Data quality, abstraction quality, anomalies
– Spatial, temporal, quality
– Nuggets, outliers
• Planned Work
– Streaming, evidence
– Events, trends, hypotheses
Examples of Projects
Multiresolution Visualization
• For large datasets, visualizations quickly get cluttered
• We have extended all of our visualizations to work at multiple resolutions
• Hierarchical clustering generates many levels of detail
• User can select areas of interest to view at full resolution while the rest of the data is shown via cluster centers and extents (shown as bands of variable opacity)
This work was funded by NSF grant IIS-9732897
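As a rough illustration of the multiresolution idea above, the sketch below builds a cluster hierarchy with a naive agglomerative pass and summarizes each cluster at a chosen level by its center and per-dimension extents. The function names, the centroid-linkage merge rule, and the sample data are assumptions made for illustration; this is not the implementation behind the visualizations described on this slide.

```python
from itertools import combinations

def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cluster_levels(points):
    """Naive agglomerative clustering (centroid linkage, an assumed choice).

    levels[0] has one cluster per point; each later level has one
    cluster fewer, ending with a single cluster.
    """
    clusters = [[p] for p in points]
    levels = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        # Merge the two clusters whose centroids are closest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: sq_dist(centroid(clusters[ij[0]]),
                                   centroid(clusters[ij[1]])),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        levels.append([list(c) for c in clusters])
    return levels

def summarize(cluster):
    """Center plus per-dimension (min, max) extents of one cluster."""
    dims = range(len(cluster[0]))
    return {
        "center": centroid(cluster),
        "extent": [(min(p[d] for p in cluster), max(p[d] for p in cluster))
                   for d in dims],
        "size": len(cluster),
    }

# Hypothetical 2-D data: two tight groups and one isolated point.
data = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8), (9.0, 1.0)]
levels = cluster_levels(data)
coarse = [summarize(c) for c in levels[2]]  # the level with three clusters
```

A display could draw each entry of `coarse` as a center with a band spanning its extents, and swap in the raw points of a cluster when the user selects it for full-resolution viewing.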
Dimension Reduction
• Dimensions are hierarchically clustered based on similarity measures
• Hierarchy displayed using InterRing
• Users select clusters of dimensions or representative dimensions for detailed analysis
Figure: 42-dimension census dataset.
This work was funded by NSF grant IIS-0119276
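The dimension-clustering step can be sketched as follows, assuming that similarity between two dimensions is measured by absolute Pearson correlation and that clusters are merged with average linkage until a dissimilarity threshold is reached. The slide does not say which similarity measure or linkage was used, so both are assumptions, as are the function names, the threshold, and the toy columns.

```python
from itertools import combinations
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def cluster_dimensions(columns, max_dissimilarity=0.2):
    """Agglomerative grouping of dimensions (average linkage, assumed).

    columns maps a dimension name to its list of values. Dissimilarity
    is 1 - |Pearson r|. Merging stops once the closest pair of clusters
    is farther apart than max_dissimilarity.
    """
    names = list(columns)
    dis = {frozenset((a, b)): 1 - abs(pearson(columns[a], columns[b]))
           for a, b in combinations(names, 2)}

    def linkage(c1, c2):
        return mean(dis[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [[n] for n in names]
    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        if linkage(clusters[i], clusters[j]) > max_dissimilarity:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

def representative(cluster, columns):
    """Member most correlated, on average, with the rest of its cluster."""
    if len(cluster) == 1:
        return cluster[0]
    return max(cluster, key=lambda a: mean(
        abs(pearson(columns[a], columns[b])) for b in cluster if b != a))

# Hypothetical columns: two nearly redundant dimensions and one unrelated.
columns = {
    "income":   [10, 20, 30, 40, 50],
    "spending": [12, 19, 33, 41, 48],
    "age":      [50, 20, 40, 30, 60],
}
groups = cluster_dimensions(columns)
reps = [representative(g, columns) for g in groups]
```

Recording each merge instead of stopping at a threshold would give the full hierarchy that a radial display such as InterRing could show; the threshold here only keeps the example short.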
Linking Spatial and Non-Spatial
• Diagonal plots of scatterplot matrix can have numerous uses
• We’ve implemented histograms, line plots, and 2-D options
• Example shows multispectral remote sensing data, one layer per diagonal plot
• User can select in either 2-D or parameter space and see corresponding elements in other views.
Layout Strategies
• Different layout strategies can reveal different patterns in the data
• Detecting, classifying, and measuring trends, outliers, repeated patterns, clusters, and correlations can be facilitated via specific layouts
Figure panels: Cyclic Data Driven; Principal Components Order Driven
Visualizing Data with Nominal Fields
• Arbitrary assignment of non-numeric fields to numbers can lead to misinterpretation, lost patterns
• By looking at similarities in distributions across all dimensions, we can group values of a nominal variable with similar global characteristics
• Assignments used to convey order and relative distance
Figure panels: Original Assignment; Assignment after Correspondence Analysis
This work was funded by NSF grant IIS-0119276 and funds from the NSA
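To make the nominal-to-number mapping more concrete, the sketch below computes scores on the first correspondence analysis axis by reciprocal averaging over a small contingency table, so that nominal values with similar distributions receive nearby numbers. The table, the function name, and the restriction to a single axis are assumptions for illustration; it is not the procedure from the work cited on this slide, and it assumes no row or column of the table sums to zero.

```python
def ca_first_axis(table, iterations=200):
    """First correspondence-analysis axis via reciprocal averaging.

    table maps each nominal value to a list of counts, one per category
    of some other variable. Returns one score per nominal value; values
    with similar count profiles get similar scores.
    """
    names = list(table)
    rows = [table[n] for n in names]
    n_cols = len(rows[0])
    row_tot = [sum(r) for r in rows]
    col_tot = [sum(r[j] for r in rows) for j in range(n_cols)]
    total = sum(row_tot)

    x = [float(i) for i in range(len(rows))]  # arbitrary starting scores
    for _ in range(iterations):
        # Column scores: count-weighted average of the row scores.
        y = [sum(r[j] * xi for r, xi in zip(rows, x)) / col_tot[j]
             for j in range(n_cols)]
        # Row scores: count-weighted average of the column scores.
        x = [sum(c * yj for c, yj in zip(r, y)) / rt
             for r, rt in zip(rows, row_tot)]
        # Center and rescale with row masses to drop the trivial solution.
        m = sum(rt * xi for rt, xi in zip(row_tot, x)) / total
        x = [xi - m for xi in x]
        sd = (sum(rt * xi * xi for rt, xi in zip(row_tot, x)) / total) ** 0.5
        if sd < 1e-12:
            break  # all profiles identical: no ordering to recover
        x = [xi / sd for xi in x]
    return dict(zip(names, x))

# Hypothetical counts: A and B have the same profile, C differs.
counts = {"A": [8, 2], "B": [4, 1], "C": [1, 9]}
scores = ca_first_axis(counts)
```

Here `A` and `B` end up with the same score and `C` lands far from both, which is the order and relative distance the assignment is meant to convey. The sign of the axis is arbitrary.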
Visual Clutter Reduction
• In scenes with thousands of moving objects, there is a need to reduce clutter
• We’ve explored and developed many strategies, including:
– Information-preserving
– Information-reducing
– Visual remapping
This work was funded by a grant from the AFRL
Data Quality Visual Encoding
• Data quality refers to the degree of uncertainty of data
• Quality measures are visually encoded into existing visualizations
• This helps users focus on high quality data to draw reliable conclusions
This work was funded by NSF grant IIS-0414380
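One simple way to encode a quality measure into an existing visualization is to map it to opacity, so that uncertain values recede. The sketch below assumes quality scores normalized to [0, 1] and a linear mapping to alpha; the mapping, the function name, and the sample records are illustrative assumptions rather than the encodings used in this project.

```python
def quality_to_rgba(base_rgb, quality, min_alpha=0.15):
    """Encode a quality score in [0, 1] as opacity (an assumed mapping).

    High-quality values are drawn nearly opaque; low-quality values fade
    toward min_alpha so they stay visible but recede.
    """
    q = min(1.0, max(0.0, quality))  # clamp out-of-range scores
    alpha = min_alpha + (1.0 - min_alpha) * q
    r, g, b = base_rgb
    return (r, g, b, alpha)

# Hypothetical records, each carrying a value and its quality score.
records = [
    {"value": 3.2, "quality": 1.0},  # e.g. measured directly
    {"value": 2.9, "quality": 0.4},  # e.g. an imputed missing value
    {"value": 7.5, "quality": 0.0},  # e.g. highly uncertain
]
colors = [quality_to_rgba((0.1, 0.3, 0.8), r["quality"]) for r in records]
```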
Quality Space Visualization
• Quality space is visualized separately to convey patterns in the data quality measures
• Records or dimensions can be ordered by quality to reveal structure and relations
• Stripe view shows individual data value quality; Histogram view shows summarization and distribution
Figure panels: Stripe Quality Map; Histogram Quality Map
This work was funded by NSF grant IIS-0414380
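The two quality-space operations described above, ordering records by quality and summarizing the distribution of quality values, can be sketched as below. The sketch assumes one quality score in [0, 1] per data value and uses mean quality per record as the sort key; both choices, and all names, are assumptions for illustration.

```python
def order_by_quality(quality_rows):
    """Record indices sorted from lowest to highest mean quality."""
    means = [sum(row) / len(row) for row in quality_rows]
    return sorted(range(len(quality_rows)), key=lambda i: means[i])

def quality_histogram(quality_rows, bins=5):
    """Count individual quality values per equal-width bin over [0, 1]."""
    counts = [0] * bins
    for row in quality_rows:
        for q in row:
            # A score of exactly 1.0 is placed in the last bin.
            counts[min(int(q * bins), bins - 1)] += 1
    return counts

# Hypothetical quality scores: one row per record, one column per dimension.
quality = [
    [1.0, 0.95, 1.0],
    [0.25, 0.55, 0.15],
    [0.75, 0.85, 0.65],
]
order = order_by_quality(quality)   # row order for a stripe-style view
hist = quality_histogram(quality)   # bin counts for a histogram-style view
```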
Interactions between Data Space and Quality Space
• Linking brush: When users select a subset in one space, the corresponding subset in the other space will be highlighted accordingly.
• Sample figures: The data points in the data space with high values in the third dimension are highlighted, then the distribution of quality measures for this subset is rendered in the quality map.
Figure panels: Data space with highlighting; Linked quality space
This work was funded by NSF grant IIS-0414380
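A linking brush comes down to sharing a set of selected record indices between the two spaces. The sketch below mirrors the sample figures: it brushes records with a high value in the third dimension and collects the quality measures of that subset for the quality view. The data, the 0.7 cutoff, and the function names are hypothetical.

```python
def brush(data_rows, predicate):
    """Indices of the records selected in the data space."""
    return {i for i, row in enumerate(data_rows) if predicate(row)}

def linked_quality(quality_rows, selected):
    """Quality values of the brushed records, for the quality view."""
    return [q for i in sorted(selected) for q in quality_rows[i]]

# Hypothetical data space: four records, three dimensions.
data = [
    (0.1, 0.4, 0.9),
    (0.3, 0.2, 0.2),
    (0.5, 0.9, 0.8),
    (0.7, 0.1, 0.1),
]
# Matching quality space: one quality score per data value.
quality = [
    (1.0, 0.9, 0.3),
    (0.8, 1.0, 0.9),
    (0.9, 0.7, 0.4),
    (1.0, 1.0, 1.0),
]

# Brush records with a high value in the third dimension.
selected = brush(data, lambda row: row[2] > 0.7)
subset_quality = linked_quality(quality, selected)
```

The same index set works in the other direction: brushing low-quality records in the quality space yields indices that the data view can highlight.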
Nugget Management System (NMS)
• Nuggets are patterns, clusters, anomalies or other features of a data set that have been visually or computationally isolated.
• NMS helps users to extract, consolidate and manage nuggets during their visual exploration. NMS eventually builds a hypothesis view based on the nugget space to support or refute hypotheses of users.
Figure panels: Nugget Space; Hypothesis View
Common Themes and Strategies
• Provide data and attributes in multiple, linked spaces
• Use automated and interactive tools for controlling and optimizing views
• Measure quality at all stages of the pipeline and convey to the user for decision support
• Assess quality measures by comparing them to user responses
• Manage scale via abstractions such as sampling and clustering, but communicate information loss to analyst to allow trade-offs
• Perform usability testing with all visualizations and interactive tools
• Release code to the public domain for widest possible impact
Some References
• Hierarchical Parallel Coordinates:
– Fua, Y.-H., Ward, M. O., and Rundensteiner, E. A., "Hierarchical Parallel Coordinates for Visualizing Large Multivariate Data Sets," IEEE Conf. on Visualization '99, Oct. 1999.
• Hierarchical Dimension Management:
– Jing Yang, Matthew O. Ward, Elke A. Rundensteiner and Shiping Huang, "Visual Hierarchical Dimension Reduction for Exploration of High Dimensional Datasets", Proc. VisSym 2003.
– Jing Yang, Wei Peng, Matthew O. Ward and Elke A. Rundensteiner, "Interactive Hierarchical Dimension Ordering, Spacing and Filtering for Exploration of High Dimensional Datasets", IEEE Symposium on Information Visualization 2003 (InfoVis 2003), pp 105-112, October 2003.
• Visual Clutter Measurement and Reduction:
– Wei Peng, Matthew O. Ward and Elke A. Rundensteiner, "Clutter Reduction in Multi-Dimensional Data Visualization Using Dimension Reordering", IEEE Symposium on Information Visualization 2004 (InfoVis 2004), pp 89-96, October 2004.
• Glyph Layout:
– Matthew O. Ward, "A taxonomy of glyph placement strategies for multidimensional data visualization", Information Visualization, Vol 1, pp 194-210, 2002.
• Nominal Data Visualization:
– Geraldine E. Rosario, Elke A. Rundensteiner, David C. Brown, Matthew O. Ward and Shiping Huang, "Mapping Nominal Values to Numbers for Effective Visualization", Information Visualization Journal, Vol 3, pp 80-95, 2004.
• Data Quality Visualization:
– Z. Xie, S. Huang, M. Ward, and E. Rundensteiner, "Exploratory Visualization of Multivariate Data with Variable Quality," Proc. IEEE Symposium on Visual Analytics Science and Technology, pp 183-190, 2006.
– Zaixian Xie, Matthew O. Ward, Elke A. Rundensteiner, Shiping Huang, "Integrating Data and Quality Space Interactions in Exploratory Visualizations", The Fifth International Conference on Coordinated & Multiple Views in Exploratory Visualization (CMV 2007), pp 47-60, July 2007.
• Discovery Management:
– Di Yang, Elke A. Rundensteiner, Matthew O. Ward, "Nugget Discovery in Visual Exploration Environments by Query Consolidation", ACM CIKM 2007, November 2007.
– Di Yang, Elke A. Rundensteiner, Matthew O. Ward, "Analysis Guided Visual Exploration to Multivariate Data", IEEE Symposium on Visual Analytics Science and Technology, October 2007.