analytics on unstructured data

Analytics on UnStructured Data (USD): Business Value in UnStructured Data. Sai Turlapati, Jan 2016. Acknowledgement: The contents (images and text) of this presentation are gathered from my research. I acknowledge any copyright on the material used. This presentation is for knowledge-sharing purposes only.

Upload: sai-turlapati

Post on 16-Apr-2017


TRANSCRIPT

Page 1: Analytics on UnStructured Data

Analytics on UnStructured Data (USD)
Business Value in UnStructured Data

Sai Turlapati
Jan 2016

Acknowledgement: The contents (images and text) of this presentation are gathered from my research. I acknowledge any copyright on the material used. This presentation is for knowledge-sharing purposes only.

Page 2: Analytics on UnStructured Data

What is UnStructured Data?

Unstructured data is a generic term for data that lacks the rigorous structure of, for example, a relational database. Unstructured data can be:
• Textual
• Non-Textual

Textual unstructured data is generated in media like email messages, PowerPoint presentations, Word documents, collaboration software and instant messages.

Non-textual unstructured data is generated in media like JPEG images, MP3 audio files and Flash video files.
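The textual / non-textual split above can be illustrated with a minimal sketch that sorts files by extension. The extension lists and the `classify` helper are assumptions made for illustration, loosely based on the media named on this slide; a real inventory would need a far richer mapping and content inspection.

```python
from pathlib import Path

# Hypothetical extension lists, based on the examples named on the slide.
TEXTUAL = {".eml", ".msg", ".ppt", ".pptx", ".doc", ".docx", ".txt"}
NON_TEXTUAL = {".jpg", ".jpeg", ".mp3", ".flv", ".swf"}


def classify(filename: str) -> str:
    """Return 'textual', 'non-textual' or 'unknown' for a file name."""
    suffix = Path(filename).suffix.lower()
    if suffix in TEXTUAL:
        return "textual"
    if suffix in NON_TEXTUAL:
        return "non-textual"
    return "unknown"


if __name__ == "__main__":
    for name in ["Q4-review.pptx", "holiday.JPEG", "call.mp3", "README"]:
        print(f"{name}: {classify(name)}")
```

Classifying by extension is only a first pass: it says nothing about what the file actually contains, which is where the analytics on later slides come in.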

Page 3: Analytics on UnStructured Data

Value in UnStructured Data

80 percent of organization-relevant information originates in unstructured form, primarily text. Emerging sources such as image, audio and video formats are pushing the envelope and are increasingly becoming sources for Business Intelligence and Analytics.

Advances in natural language processing, together with the ability to read, store and analyze large volumes of UnStructured Data, allow organizations not only to predict what will happen but also to make it happen (Prescriptive Analytics).

Page 4: Analytics on UnStructured Data

UnStructured Data Warehouse

Data warehouses are moving beyond managing only Structured Data. The value of UnStructured Data is pushing organizations (public and private) to expand their data warehouses. Sources of Unstructured Data can be internal or external.

Internal Unstructured Data can be spreadsheets, emails, surveys, documents, images, PowerPoint presentations, security data, log data, network data, videos, audio…

External Unstructured Data can be social media text, images, fraud data, surveys, industry news, videos…

Structured and UnStructured Data Warehouses, combined with enormous storage and processing capacity, are being referred to as Data Lakes!

Page 5: Analytics on UnStructured Data

Analytics on UnStructured Data

Above image is from http://www.datanami.com

Text Analytics: the process of extracting value from UnStructured Text by mining data from various text sources. Text analytics, through the use of natural language processing, holds the key to unlocking the business value within these vast data assets.
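As a rough illustration of mining text sources, the sketch below counts the most frequent terms across a few free-text comments. It is a deliberately simple stand-in for real natural language processing: the `top_terms` helper, the stop-word list and the sample feedback are all hypothetical and are not taken from the presentation.

```python
import re
from collections import Counter

# Hypothetical, minimal stop-word list; real tools ship much larger ones.
STOP_WORDS = {"the", "a", "an", "and", "but", "was", "is", "to", "of", "in"}


def top_terms(documents, n=3):
    """Return the n most frequent non-stop-word terms across documents."""
    counts = Counter()
    for text in documents:
        # Lower-case the text and keep runs of letters as tokens.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


if __name__ == "__main__":
    # Invented sample feedback, for illustration only.
    feedback = [
        "The delivery was late and the support was slow.",
        "Support resolved the delivery issue.",
        "Late delivery again, but support was helpful.",
    ]
    for term, count in top_terms(feedback):
        print(term, count)
```

Even a count this crude surfaces recurring themes in free text; production text analytics adds steps such as stemming, entity extraction and sentiment scoring.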

SAS and SPSS are positioned as leaders among Gartner's Advanced Analytics platform providers. Speech, video, image and audio analytics are moving out of infancy.

Page 6: Analytics on UnStructured Data

Challenges of UnStructured Data

Managing the growth of Unstructured Data and dealing with its storage, archival, query, search and extraction is a challenge.

MDM and Data Quality: UnStructured Data by nature lacks a defined format, type or relationship. In today's market, a gap exists in commercial software for Unstructured Data quality and Master Data Management.

Identifying the important segments of Unstructured Data to address business goal(s).

Securing unstructured data that resides on file servers, NAS, SAN, portals, mailboxes, the cloud, the data center, social media… is challenging.

Regulatory compliance, auditing and monitoring of UnStructured Data are posing a big challenge for IT auditors.

Given the above challenges, small, medium and enterprise IT vendors are developing solutions. Selecting a vendor to address UnStructured Data, while staying aligned with the organization's strategic relationships and technology road map, is very challenging.

Page 7: Analytics on UnStructured Data

Acknowledgement: The contents (images and text) of this presentation are gathered from my internet research. I acknowledge any copyright on the material used. This presentation is for knowledge-sharing purposes only.

Thank You!