Text Analytics Presentation LinkedIn

Stacy Faught IT Business Solutions 11/18/2016

Upload: stacy-faught

Post on 20-Mar-2017

TRANSCRIPT

Page 1: Text Analytics Presentation LinkedIn

Stacy Faught, IT Business Solutions

11/18/2016

Page 2: Text Analytics Presentation LinkedIn

Introduction

GOAL OF THIS PRESENTATION: Define text analytics capabilities to build the business case for creating the service.

Page 3: Text Analytics Presentation LinkedIn

Ontology Of Text Analytics BUZZ Words

[Diagram: concept map linking the terms below; the individual connections are not recoverable from the transcript]

Terms: Text Analytics, Text Mining, Big Data, Linguistics, Machine Learning, NLP, Faceted Search, Taxonomy, Thesaurus, Ontology, Semantic Network, Syntax, Semantics, Morphology, Disambiguation, Entity Extraction, Sentiment

Relationship labels: associated with, relies upon, sub-discipline of, works together with, technology tools for, enables, synonymous with, more complex, used to build

Page 4: Text Analytics Presentation LinkedIn

Beginnings

Text analytics leverages and learns from massive quantities of textual data to reveal customer intentions and sentiment…

Text Analytics World

Page 5: Text Analytics Presentation LinkedIn

Gaining Interest

Text Analytics is the process of deriving information from text sources.

Gartner

Text Analytics is a method for making unstructured content useful and accessible.

Expert System

Page 6: Text Analytics Presentation LinkedIn

It’s All About the CONCEPT

Page 7: Text Analytics Presentation LinkedIn

Current State - Keyword Searching

All these documents contain the keywords “big cat oil discoveries”. Read ALL the documents to find the ones relevant to you.

Page 8: Text Analytics Presentation LinkedIn

Keyword Search Not Always Enough

Advantages

Speculative searching (e.g. Where are the best tacos in Houston?)

Finding general information (e.g. address of Access Sciences’ website)

Disadvantages

Large result sets mean not enough time to read all documents

Noisy and irrelevant hits have to be filtered out

Narrowing the question may mean missing a key result

Have to type in all variants of a term

e.g. significant oil discovery, large oil find, >200M barrels?
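
The "type in all variants" problem can be sketched in a few lines. This is only an illustration: the variant table `VARIANTS`, the concept label, and the toy documents are hypothetical, and only the two example phrasings come from the slide.

```python
# Hypothetical variant table: one concept label mapped to the phrasings
# a keyword searcher would otherwise have to type one by one.
VARIANTS = {
    "big oil discovery": [
        "significant oil discovery",
        "large oil find",
    ],
}

# Toy documents, invented for illustration only.
DOCS = [
    "Operator reports a significant oil discovery in the basin.",
    "Analysts call it a large oil find.",
    "Quarterly results were in line with expectations.",
]


def keyword_search(query, docs):
    """Return documents containing the exact query phrase (case-insensitive)."""
    q = query.lower()
    return [d for d in docs if q in d.lower()]


def expanded_search(concept, docs, variants=VARIANTS):
    """Return documents matching the concept label or any listed variant."""
    phrases = [concept] + variants.get(concept, [])
    return [d for d in docs if any(p.lower() in d.lower() for p in phrases)]
```

With these toy inputs, `keyword_search("large oil find", DOCS)` finds only the second document, while `expanded_search("big oil discovery", DOCS)` finds the first two. A numeric threshold such as >200M barrels would need more than phrase matching.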

Page 9: Text Analytics Presentation LinkedIn

What Text Analytics Does

TYPICAL SEARCH

INDEX

Text Analytics occurs here

Analyzes content & extracts meaningful metadata

Entities

Themes

Sentiment

SMARTER Searching

Page 10: Text Analytics Presentation LinkedIn

How it works

Concept: big cat
Definition: Carnivorous mammal
Parent: Mammal
Child: Jaguar
Synonym: Feline
Relationship: Wild
Constraint: Not domestic

Concept: big cat
Definition: Caterpillar machinery
Parent: Equipment
Child: Drilling equipment
Synonym: Heavy equipment
Relationship: Mining
Constraint: Not Hitachi

Concept: big cat
Definition: Large oil discovery
Parent: Oil discovery
Child: Shale big cat
Synonym: Significant
Relationship: Elephant
Constraint: >200Mboe
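
One way to picture how the three "big cat" concepts could be stored and told apart is sketched below. The field values are taken from the slide, but the `Concept` class, the `pick_sense` function, and the word-overlap scoring are assumptions made for illustration, not the method of any particular text analytics tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    label: str
    definition: str
    parent: str
    child: str
    synonym: str
    relationship: str
    constraint: str

    def cue_words(self):
        """Lower-cased words from the descriptive fields, minus the label itself."""
        text = " ".join([self.definition, self.parent, self.child,
                         self.synonym, self.relationship])
        return set(text.lower().split()) - set(self.label.lower().split())


SENSES = [
    Concept("big cat", "Carnivorous mammal", "Mammal", "Jaguar",
            "Feline", "Wild", "Not domestic"),
    Concept("big cat", "Caterpillar machinery", "Equipment",
            "Drilling equipment", "Heavy equipment", "Mining", "Not Hitachi"),
    Concept("big cat", "Large oil discovery", "Oil discovery",
            "Shale big cat", "Significant", "Elephant", ">200Mboe"),
]


def pick_sense(text, senses=SENSES):
    """Return the sense whose cue words overlap most with the text.

    Assumed scoring for illustration: a plain count of shared words.
    On a tie, the first sense in the list wins.
    """
    words = set(text.lower().replace(",", " ").replace(".", " ").split())
    return max(senses, key=lambda s: len(s.cue_words() & words))
```

For example, `pick_sense("Shell has discovered oil on three big cat prospects offshore Nigeria")` selects the oil discovery sense because "oil" is one of its cue words. The constraint field is stored but not used by this simple scorer.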

Page 11: Text Analytics Presentation LinkedIn

Interpreting the meaning of text

• Groups words into meaningful units

• Searches for different forms of words (morphology)

• Searches for words with semantic relationships

Sentences

Noun groups: match entities

Verb groups: match actions

Morphology: match different forms

Semantics: match related meanings
Total has confirmed just one “big cat” -- with more than 200 million barrels -- in Bolivia in May 2011 that extends a 2004 discovery.

Shell has discovered oil on three big cat prospects offshore Nigeria, plus a large gas-condensate field in the Norwegian Sea.

Firm makes major Gatwick oil find.
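
Matching different forms of a word (morphology) can be sketched with a deliberately crude suffix stripper. The `SUFFIXES` tuple and both functions are hypothetical and far simpler than the stemmers or lemmatizers a real product would use.

```python
import re

# Assumed suffix list, checked in order; illustration only.
SUFFIXES = ("ies", "ing", "ed", "y", "s")


def crude_stem(word):
    """Strip one common suffix, keeping at least four characters of stem."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 4:
            return w[: -len(suffix)]
    return w


def matches_form(query_word, sentence):
    """True if any word in the sentence shares a crude stem with query_word."""
    target = crude_stem(query_word)
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return any(crude_stem(t) == target for t in tokens)
```

Under this sketch, a search for "discovery" matches both the Total sentence ("discovery") and the Shell sentence ("discovered"). It does not match "Firm makes major Gatwick oil find.", which is the case that needs semantic matching ("find" as a related meaning) rather than morphology.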

Page 12: Text Analytics Presentation LinkedIn

Automatically Extract Known Entities

• Entity extraction

People: Erle P. Halliburton, Charles Holiday

Organizations: Total S.A., Saudi Aramco, Royal Dutch Shell, Exxon Mobil

Places: Europe, France, Paris, Monaco, Tyrrhenian Sea, Oceania

Basins, Plays, Fields

Total S.A. (French oil & gas co.) vs. total (adj. meaning entire)

Charles Holiday (Shell Chairman) vs. 4th of July Holiday
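
A minimal sketch of extracting known entities from a list (a gazetteer) follows. The names are taken from the slide; the `GAZETTEER` dictionary and `extract_entities` function are hypothetical, and the only disambiguation shown is case-sensitive, whole-name matching.

```python
import re

# Known entities from the slide, mapped to their type.
GAZETTEER = {
    "Total S.A.": "Organization",
    "Saudi Aramco": "Organization",
    "Royal Dutch Shell": "Organization",
    "Exxon Mobil": "Organization",
    "Erle P. Halliburton": "Person",
    "Charles Holiday": "Person",
    "France": "Place",
    "Paris": "Place",
    "Monaco": "Place",
    "Tyrrhenian Sea": "Place",
}


def extract_entities(text, gazetteer=GAZETTEER):
    """Return (name, type) pairs for gazetteer names found in text, in order.

    Matching is case-sensitive and requires the full name, so the adjective
    "total" and the word "Holiday" on its own are not tagged.
    """
    found = []
    for name, kind in gazetteer.items():
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        for m in re.finditer(pattern, text):
            found.append((m.start(), name, kind))
    return [(name, kind) for _, name, kind in sorted(found)]
```

This only handles the easy cases. A sentence that refers to the company simply as "Total" would be missed here, which is where context-aware disambiguation is needed.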

Page 13: Text Analytics Presentation LinkedIn

Put the Puzzle Pieces Together…

• Faceted search

Concepts
Big Cat: Discoveries, Prospects
Play: Shale, Conventional

Entities
Organizations: Shell, Total
Places: Africa, S. America
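
Faceted search over extracted metadata can be sketched as counting and filtering on fields. The facet values below echo the slide, but the document records, their field names, and the assignment of values to documents are invented for illustration.

```python
from collections import Counter

# Hypothetical documents with metadata that text analytics might have extracted.
DOCS = [
    {"id": 1, "org": "Total", "place": "S. America", "play": "Conventional"},
    {"id": 2, "org": "Shell", "place": "Africa", "play": "Conventional"},
    {"id": 3, "org": "Shell", "place": "S. America", "play": "Shale"},
]


def facet_counts(docs, facet):
    """Count how many documents carry each value of the given facet."""
    return Counter(d[facet] for d in docs)


def filter_docs(docs, **selected):
    """Keep documents whose metadata matches every selected facet value."""
    return [d for d in docs if all(d.get(k) == v for k, v in selected.items())]
```

A search interface would show the counts next to each facet value, and each click would narrow the result set, for example `filter_docs(DOCS, org="Shell", place="Africa")`.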

Page 14: Text Analytics Presentation LinkedIn

Find the Missing Piece

Page 15: Text Analytics Presentation LinkedIn

Automatically Extract Relevant Facts

WHO              WHEN         WHERE
Total            May 2011     Bolivia
Shell            July 2005    Offshore Nigeria
UK Oil and Gas   April 2015   Sussex
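
As a rough sketch of how a WHO / WHEN / WHERE row might be pulled from a sentence, the example below combines small lookup lists with a month-and-year pattern. The lists `ORGS` and `PLACES`, the regular expression, and both functions are assumptions for illustration; real fact extraction relies on much richer linguistic analysis.

```python
import re

# Hypothetical lookup lists built from the table's values.
ORGS = ["Total", "Shell", "UK Oil and Gas"]
PLACES = ["Bolivia", "Offshore Nigeria", "Sussex"]

# Matches a month name followed by a four-digit year, e.g. "May 2011".
MONTH_YEAR = re.compile(
    r"\b(January|February|March|April|May|June|July|August|September|"
    r"October|November|December)\s+(\d{4})\b"
)


def first_match(candidates, text):
    """Return the first candidate found in text (case-insensitive), else None."""
    lowered = text.lower()
    for c in candidates:
        if c.lower() in lowered:
            return c
    return None


def extract_fact(sentence):
    """Build a who/when/where record; any missing part is None."""
    when = MONTH_YEAR.search(sentence)
    return {
        "who": first_match(ORGS, sentence),
        "when": when.group(0) if when else None,
        "where": first_match(PLACES, sentence),
    }
```

Applied to the Total sentence from the earlier slide, this yields Total, May 2011, Bolivia. When a sentence does not state a date, the sketch leaves WHEN empty rather than guessing.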

Page 16: Text Analytics Presentation LinkedIn

Use Cases

Auto Classification

Competitive Intel

Page 17: Text Analytics Presentation LinkedIn

Auto Classification

• In its simplest form

Internal Data Sources

In Place: Share Drives, Legacy Datasets

Migrations: Consolidation/Expansion, Mergers and Acquisitions

Page 18: Text Analytics Presentation LinkedIn

Competitive Intel

External Data Sources

Public Domain, Industry Publications, Regulatory Reporting

Page 19: Text Analytics Presentation LinkedIn

Industries and Drivers

Oil & Gas

Pharmaceuticals

Government Agencies

Legal

Competitive Intel

Research

Page 20: Text Analytics Presentation LinkedIn

Knowledge Areas/Roles

Text Mining, Library Sciences, Taxonomy, Linguistics, Foreign Language, Technology, Domain Expert

Page 21: Text Analytics Presentation LinkedIn

Technology Tools

Expert System Cogito
conceptSearching
Smartlogic Semaphore
HP Autonomy
Linguamatics I2E
SAS Text Analytics Suite
IBM Languageware / Content Analytics
Lexalytics Text Analytics
Provalis Research QDA Miner / WordStat
PingarAPI
AlchemyAPI
Content Analyst
Angoss KnowledgeREADER
NetOwl
Language Computer Corp.
Basis Technology
MeaningCloud
Forest RIM’s Textual ETL

Page 22: Text Analytics Presentation LinkedIn

Q & A
