text analytics presentation linkedin

Post on 20-Mar-2017


Stacy Faught, IT Business Solutions

11/18/2016

Introduction

GOAL OF THIS PRESENTATION: Define text analytics capabilities to build the business case for creating the service.


Ontology Of Text Analytics BUZZ Words

[Diagram: concept map relating the buzz words.]

Terms: Text Analytics, Text Mining, Big Data, Linguistics, Machine Learning, NLP, Faceted Search, Taxonomy, Thesaurus, Ontology, Semantic Network, Syntax, Semantics, Morphology, Disambiguation, Entity Extraction, Sentiment

Relationship labels: synonymous with, associated with, relies upon, sub-discipline of, works together with, technology tools for, enables, more complex, used to build

Beginnings

Text analytics leverages and learns from massive quantities of textual data to reveal customer intentions and sentiment…

Text Analytics World


Gaining Interest

Text Analytics is the process of deriving information from text sources.

Gartner


Text Analytics is a method for making unstructured content useful and accessible.

Expert System

It’s All About the CONCEPT


Current State - Keyword Searching

All these documents contain the keywords “big cat oil discoveries”. Read ALL the documents to find the ones relevant to you.


Keyword Search Not Always Enough

Advantages

Speculative searching (e.g. Where are the best tacos in Houston?)

Finding general information (e.g. Address of Access Sciences’ website)

Disadvantages


Large result sets mean not enough time to read all documents

Noisy and irrelevant hits have to be filtered out

Narrowing the question may mean missing a key result

Have to type in all variants of a term

e.g. significant oil discovery, large oil find, >200M barrels

What Text Analytics Does

TYPICAL SEARCH

INDEX

Text Analytics occurs here: analyzes content & extracts meaningful metadata

Entities

Themes

Sentiment

SMARTER Searching

How it works

Concept: big cat

Definition: Carnivorous mammal

Child: Jaguar

Relationship: Wild

Synonym: Feline

Constraint: Not domestic

Parent: Mammal

Concept: big cat

Definition: Caterpillar machinery

Child: Drilling equipment

Relationship: Mining

Synonym: Heavy equipment

Constraint: Not Hitachi

Parent: Equipment

Concept: big cat discovery

Definition: Large oil discovery

Child: Shale big cat

Relationship: Elephant

Synonym: Significant

Constraint: >200Mboe

Parent: Oil discovery
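The three senses of "big cat" above can be held as small concept records, and a text can be assigned to the sense whose cue words it shares. The following is a minimal sketch of that idea, not how any particular product does it. The `Concept` class, the `context_terms` cue words, and the `disambiguate` function are all hypothetical names introduced for illustration; only the definitions, parents, and synonyms come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One sense of a term, using the attributes shown on the slide."""
    label: str
    definition: str
    parent: str
    synonyms: set = field(default_factory=set)
    # Hypothetical cue words that suggest this sense; not from the slide.
    context_terms: set = field(default_factory=set)


SENSES = [
    Concept("big cat", "Carnivorous mammal", "Mammal",
            {"feline"}, {"jaguar", "wild", "prey"}),
    Concept("big cat", "Caterpillar machinery", "Equipment",
            {"heavy equipment"}, {"drilling", "mining", "machinery"}),
    Concept("big cat discovery", "Large oil discovery", "Oil discovery",
            {"significant"}, {"oil", "barrels", "shale", "prospects"}),
]


def disambiguate(text, senses=SENSES):
    """Return the sense sharing the most cue words with text, or None."""
    words = set(text.lower().replace(",", " ").replace(".", " ").split())
    scored = [(len(words & s.context_terms), s) for s in senses]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best if best_score > 0 else None


sense = disambiguate(
    "Shell has discovered oil on three big cat prospects offshore Nigeria")
print(sense.definition if sense else "no sense matched")
```

Counting shared cue words is the simplest possible scoring rule. It is used here only to make the role of the concept definition visible.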

Interpreting the meaning of text

• Groups words into meaningful units

• Searches for different forms of words (morphology)

• Searches for words with semantic relationships

sentences

Noun groups: Match entities

Verb groups: Match actions

Morphology: Match different forms

Semantics: Match related meanings

Total has confirmed just one “big cat” -- with more than 200 million barrels -- in Bolivia in May 2011 that extends a 2004 discovery.

Shell has discovered oil on three big cat prospects offshore Nigeria, plus a large gas-condensate field in the Norwegian Sea.

Firm makes major Gatwick oil find.
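To make the contrast with keyword search concrete, the sketch below runs a literal "big cat" search and a simple concept match over the three example sentences. It assumes two hand-written pattern lists: `DISCOVERY_FORMS` stands in for morphology (discovery, discoveries, discovered, find) and `SIZE_TERMS` stands in for semantic relationships (big cat, major, significant, large). Both lists and the function name are illustrative assumptions, not taken from a real taxonomy.

```python
import re

SENTENCES = [
    "Total has confirmed just one “big cat” -- with more than 200 million "
    "barrels -- in Bolivia in May 2011 that extends a 2004 discovery.",
    "Shell has discovered oil on three big cat prospects offshore Nigeria, "
    "plus a large gas-condensate field in the Norwegian Sea.",
    "Firm makes major Gatwick oil find.",
]

# Assumed word forms for the idea "discovery" (morphology).
DISCOVERY_FORMS = r"(?:discover(?:y|ies|ed|s)?|finds?)"
# Assumed related terms for the idea "big" (semantics).
SIZE_TERMS = r"(?:big cat|major|significant|large)"


def mentions_big_discovery(sentence):
    """True if the sentence has both a size term and a discovery form."""
    s = sentence.lower()
    has_size = re.search(rf"\b{SIZE_TERMS}\b", s) is not None
    has_discovery = re.search(rf"\b{DISCOVERY_FORMS}\b", s) is not None
    return has_size and has_discovery


keyword_hits = [s for s in SENTENCES if "big cat" in s.lower()]
concept_hits = [s for s in SENTENCES if mentions_big_discovery(s)]
print(len(keyword_hits), "keyword hits;", len(concept_hits), "concept hits")
```

The literal search misses the Gatwick sentence because it never uses the phrase "big cat", while the concept match picks it up through "major" and "find". A rule this loose would also produce false positives on unrelated text, which is where the constraints in a concept definition come in.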

Automatically Extract Known Entities

• Entity extraction

People: Erle P. Halliburton; Charles Holiday (Shell Chairman) vs. 4th of July Holiday

Organizations: Total S.A. (French oil & gas co.) vs. total (adj. meaning entire); Saudi Aramco; Royal Dutch Shell; Exxon Mobil

Places: Europe; France; Paris; Monaco; Tyrrhenian Sea; Oceania; Basins; Plays; Fields
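One simple way to extract known entities is a lookup list (gazetteer) matched case-sensitively against the text. The sketch below assumes a tiny gazetteer built only from the names on this slide; `GAZETTEER` and `extract_entities` are hypothetical names. It separates "Total S.A." from the adjective "total" and "Charles Holiday" from "4th of July Holiday" only because it requires the full, capitalized name, which is a much cruder test than the linguistic disambiguation a real tool would apply.

```python
import re

# Assumed gazetteer: names copied from the slide, grouped by entity type.
GAZETTEER = {
    "Organization": ["Total S.A.", "Saudi Aramco", "Royal Dutch Shell",
                     "Exxon Mobil"],
    "Person": ["Erle P. Halliburton", "Charles Holiday"],
    "Place": ["Tyrrhenian Sea", "France", "Paris", "Monaco", "Oceania",
              "Europe"],
}


def extract_entities(text):
    """Return (name, type) pairs for gazetteer names found in text, in order."""
    hits = []
    for etype, names in GAZETTEER.items():
        for name in names:
            # Match whole names only; the search is deliberately case-sensitive.
            pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
            for m in re.finditer(pattern, text):
                hits.append((m.start(), name, etype))
    return [(name, etype) for _, name, etype in sorted(hits)]


print(extract_entities(
    "Total S.A. reported the total cost in Paris on the 4th of July Holiday."))
```

A bare "Total" at the start of a sentence would not be matched by this sketch, so it trades recall for precision.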

Put the Puzzle Pieces Together…

• Faceted search

Concepts: Big Cat, Discoveries, Prospects

Play: Shale, Conventional

Entities: Organizations: Shell, Total

Places: Africa, S. America
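Once documents carry extracted metadata, faceted search amounts to filtering on facet values and counting what remains under each facet. The sketch below illustrates this with three made-up document records; the `DOCS` list, its field names, and both functions are assumptions for illustration, and the play types in particular are invented.

```python
from collections import Counter

# Hypothetical documents tagged with extracted metadata (values are made up).
DOCS = [
    {"id": 1, "organization": "Total", "place": "S. America", "play": "Conventional"},
    {"id": 2, "organization": "Shell", "place": "Africa", "play": "Conventional"},
    {"id": 3, "organization": "Shell", "place": "S. America", "play": "Shale"},
]


def filter_docs(docs, **selected):
    """Keep documents whose metadata matches every selected facet value."""
    return [d for d in docs
            if all(d.get(facet) == value for facet, value in selected.items())]


def facet_counts(docs, facet):
    """Count how many documents fall under each value of one facet."""
    return Counter(d[facet] for d in docs)


shell_docs = filter_docs(DOCS, organization="Shell")
print(facet_counts(shell_docs, "place"))
```

Selecting one facet value narrows the result set, and the counts on the remaining facets tell the user which further refinements are available.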

Find the Missing Piece


Automatically Extract Relevant Facts

WHO             WHEN        WHERE
Total           May 2011    Bolivia
Shell           July 2005   Offshore Nigeria
UK Oil and Gas  April 2015  Sussex
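A row of this WHO / WHEN / WHERE table can be approximated by looking up known organizations and places in a sentence and pulling out a "Month YYYY" date with a regular expression. The sketch below is a rough illustration under those assumptions: `ORGS` and `PLACES` are hand-made lists, `extract_fact` is a hypothetical helper, and plain substring lookup is far weaker than real fact extraction.

```python
import re

ORGS = ["Total", "Shell", "UK Oil and Gas"]   # assumed list of known organizations
PLACES = ["Bolivia", "Nigeria", "Sussex"]      # assumed list of known places
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
DATE_RE = re.compile(rf"\b(?:{MONTHS}) \d{{4}}\b")


def first_match(candidates, text):
    """Return the candidate that appears earliest in text, or None."""
    positions = [(text.find(c), c) for c in candidates if c in text]
    return min(positions)[1] if positions else None


def extract_fact(sentence):
    """Build one WHO / WHEN / WHERE record from a single sentence."""
    date = DATE_RE.search(sentence)
    return {
        "who": first_match(ORGS, sentence),
        "when": date.group(0) if date else None,
        "where": first_match(PLACES, sentence),
    }


fact = extract_fact(
    "Total has confirmed just one “big cat” -- with more than 200 million "
    "barrels -- in Bolivia in May 2011 that extends a 2004 discovery.")
print(fact)
```

When a sentence carries no date, the "when" field is left as `None` rather than guessed.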

Use Cases

Auto Classification

Competitive Intel

Auto Classification

Internal Data Sources

In Place: Share Drives, Legacy Datasets

• In its simplest form

Migrations: Consolidation/Expansion, Mergers and Acquisitions

Competitive Intel

External Data Sources

Public Domain, Industry Publications, Regulatory Reporting

Industries and Drivers

Oil & Gas

Pharmaceuticals

Government Agencies

Legal

Competitive Intel

Research

Knowledge Areas/Roles

Text Mining, Library Sciences, Taxonomy, Linguistics, Foreign Language, Technology, Domain Expert

Technology Tools

Expert System Cogito

conceptSearching

Smartlogic Semaphore

HP Autonomy

Linguamatics I2E

SAS Text Analytics Suite

IBM Languageware / Content Analytics

Lexalytics Text Analytics

Provalis Research QDA Miner / WordStat

PingarAPI

AlchemyAPI

Content Analyst

Angoss KnowledgeREADER

NetOwl

Language Computer Corp.

Basis Technology

MeaningCloud

Forest RIM’s Textual ETL

Q & A
