iKnow – handling unstructured data

Post on 19-Jul-2020

Page 1

CacheConf

Using iKnow for handling unstructured data

Danny Wijnschenk

Sales Engineer

Page 2

Making Decisions

Page 3

Making Decisions

Page 4

Making Better Decisions

Page 5

Why investors need unstructured data

Page 6

Why investors need unstructured data

Page 7

Why clinicians need unstructured data

Page 8

Why clinicians need unstructured data

Page 9

Unstructured Data

Page 10

Unstructured Data

Page 11

How?

Page 12

What is iKnow

Page 13

Computational Linguistics: Common

Structural or Analytical Approach

Discourse Analysis: study of document structures

Semantics: study of meaning

Syntax: study of sentence structures

Morphology: study of word forms

Data-Driven or Statistical Approach

Page 14

Computational Linguistics: Common

Little Pete walks to the new bakery.

Little (Adj-s) Pete (N-s) walk(s) (V-3ps) to (prep) the (det) new (Adj-s) bakery (N-s).

S: NP ((Adj-s) + (N-s)) + VP (V + PP (prep + NP ((det) + ((Adj-s) + (N-s)))))

Little Pete | walks | to the new bakery

Sem Roles:

(Subject + Agent + Person) (NP1): Little Pete

(Act + Move) (VP): walks

(Direction) (PP): to

(Object) (NP2): the new bakery

Page 15

Statistical approach?

i.Know, Based on, breakthrough, removing, hidden in, to, knowledge, the, applications, you

Page 16

The iKnow Approach

With iKnow you can create breakthrough applications based on knowledge hidden in unstructured text

English                      Russian
With                         С помощью
iKnow
you                          вы
can create                   можете создать
breakthrough applications    прорывные приложения
based on                     основанные на
knowledge                    знаниях
hidden in                    скрытых в
unstructured text            неструктурированном тексте

The iKnow engine finds the relations; all the words between them are meaningful word groups.
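The splitting idea on this slide can be illustrated with a small Python sketch. It is only a toy: the list of relation phrases is hand-written for this one sentence (an assumption for illustration; iKnow identifies relations itself and needs no such list), the function name is hypothetical, and the sentence is the shortened form that drops 'With' and 'you'.

```python
import re

# Hypothetical, hand-written relation phrases for this one sentence.
# iKnow itself finds relations without a user-supplied list.
RELATIONS = ["can create", "based on", "hidden in"]


def split_on_relations(sentence, relations):
    """Return (kind, text) pairs: relations, and the word groups between them."""
    # Longest phrases first so longer relations win over shorter ones.
    ordered = sorted(relations, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(r) for r in ordered) + r")\b",
        re.IGNORECASE,
    )
    parts = []
    position = 0
    for match in pattern.finditer(sentence):
        concept = sentence[position:match.start()].strip(" .,")
        if concept:
            parts.append(("concept", concept))
        parts.append(("relation", match.group(1).lower()))
        position = match.end()
    tail = sentence[position:].strip(" .,")
    if tail:
        parts.append(("concept", tail))
    return parts


sentence = (
    "iKnow can create breakthrough applications "
    "based on knowledge hidden in unstructured text"
)
parts = split_on_relations(sentence, RELATIONS)
for kind, text in parts:
    print(f"{kind:8} {text}")
```

Everything between two relations is kept as one word group, which is why 'breakthrough applications' and 'unstructured text' stay together instead of being broken into single words.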

Page 17

iKnow Building blocks

Entities: Concepts and relationships

‘iKnow’ ‘breakthrough applications’ ‘knowledge’ ‘unstructured text’

CRCs: Concept-Relationship-Concept triples

‘iKnow – can create – breakthrough applications’ ‘breakthrough applications – based on – knowledge’ ‘knowledge – hidden in – unstructured text’

CCs: Concept-Concept pairs

‘iKnow – breakthrough applications’ ‘breakthrough applications – knowledge’ ‘knowledge – unstructured text’

Paths: Concept and relation chains

‘iKnow – can create – breakthrough applications – based on – knowledge – hidden in – unstructured text’

Sentences
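How the larger building blocks follow from the entities can be sketched in a few lines of Python. This is an illustration only, not iKnow's API: the sentence is split by hand, and the sketch assumes a strictly alternating concept, relation, concept sequence, which real text does not always provide.

```python
# Alternating concepts and relations for the slide's example sentence,
# labelled by hand here; iKnow produces this split automatically.
parts = [
    ("concept", "iKnow"),
    ("relation", "can create"),
    ("concept", "breakthrough applications"),
    ("relation", "based on"),
    ("concept", "knowledge"),
    ("relation", "hidden in"),
    ("concept", "unstructured text"),
]

# Entities of type concept, in sentence order.
concepts = [text for kind, text in parts if kind == "concept"]

# CRC: a concept, the relation that follows it, and the next concept.
# Stepping by 2 relies on the strict alternation assumed above.
crcs = [
    (parts[i][1], parts[i + 1][1], parts[i + 2][1])
    for i in range(0, len(parts) - 2, 2)
]

# CC: the same neighbouring concepts with the relation dropped.
ccs = [(left, right) for left, _, right in crcs]

# Path: the whole chain of concepts and relations in sentence order.
path = " - ".join(text for _, text in parts)

print(concepts)
print(crcs)
print(ccs)
print(path)
```

The output lists reproduce the four concepts, three CRCs, three CCs and the single path shown on the slide.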

Page 18

What makes iKnow Unique

Multi-Purpose: Content routing, discrete data extraction, intelligent browsing, tone analysis, ...

Domain-Independent: Uses in Healthcare, Contact Centers, Media, Real Estate, Legal, Police, ...

Multi-Lingual: Simultaneous support for English, Spanish, French, Portuguese, German, Dutch, Russian, ...

Pro-Active: No need for training, upfront knowledge, dictionaries, expert input, ...

Page 19

iKnow Use Cases

Page 20

Demo: Aviation accident reports

NTSB has aircraft accident reports with structured data and a narrative of the accident

Page 21

Demo: Aviation accident reports

The demo shows how:

Insight from narrative text can complement structured data (find new insight)

iKnow can be used to improve or correct structured data by analysing the narrative data

The demo also shows the integration of iKnow with DeepSee

Page 22

Aviation Demo: ‘new’ insight

More fatal accidents when wreckage is found vs ‘substantial damage’

Page 23

Aviation Demo: correct structured data

Pivot table shows ‘Highest Injuries’ from structured data (columns) vs injury concepts from text (rows)

Everything above the diagonal is suspicious: e.g. 1 accident labeled ‘Serious’ in the structured data has a concept ‘fatal’ in the text
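The diagonal check can be sketched in Python as follows. The records are made up, the severity ordering is an assumption (the slide only names ‘Serious’ and ‘fatal’), and ‘above the diagonal’ is read here as: the text contains an injury concept more severe than the structured label.

```python
# Assumed severity ordering, lowest to highest (not stated on the slide).
SEVERITY = ["none", "minor", "serious", "fatal"]
RANK = {label: i for i, label in enumerate(SEVERITY)}

# Made-up records: the structured 'Highest Injuries' value plus the injury
# concepts assumed to have been extracted from the narrative text.
records = [
    {"id": "A1", "highest_injury": "serious", "text_concepts": {"serious", "fatal"}},
    {"id": "A2", "highest_injury": "fatal", "text_concepts": {"fatal"}},
    {"id": "A3", "highest_injury": "minor", "text_concepts": {"minor"}},
    {"id": "A4", "highest_injury": "none", "text_concepts": set()},
]


def suspicious(record):
    """True when the text mentions a worse injury than the structured field."""
    worst_in_text = max(
        (RANK[c] for c in record["text_concepts"] if c in RANK),
        default=-1,  # no injury concept found in the text
    )
    return worst_in_text > RANK[record["highest_injury"]]


flagged = [r["id"] for r in records if suspicious(r)]
print(flagged)
```

A flagged record is only a candidate for review: the narrative may mention the concept in a negated or unrelated context, which is why the demo goes on to drill down into the full text.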

Page 24

Aviation Demo: evidence

Drill-down on a record shows the full narrative

Page 25

Aviation Demo: pivot analysis of cells

Analysis of ‘Fatal’ cells: green is more relevant, red is less relevant

Page 26

Categorisation Demo

Use most frequent concepts to quickly scan for probable Categories

Use Similar concepts to find other causes in the same Category

Combine concepts in a Category Set

Look at highlighted source for Category overlaps
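The four steps above can be sketched in Python. All record IDs, concepts and category names below are made up for illustration, and choosing which concepts belong to a set is done by hand here, as it would be by the person reviewing the frequent and similar concepts.

```python
from collections import Counter

# Made-up records, each with the concepts assumed to be extracted from its text.
record_concepts = {
    "r1": {"engine failure", "power loss"},
    "r2": {"engine failure", "icing"},
    "r3": {"icing", "low visibility"},
    "r4": {"power loss"},
}

# Step 1: most frequent concepts, to scan for probable categories.
frequency = Counter(
    concept for concepts in record_concepts.values() for concept in concepts
)

# Steps 2 and 3: a frequent concept plus hand-picked similar concepts
# are combined into one set per category.
category_sets = {
    "Engine": {"engine failure", "power loss"},
    "Weather": {"icing", "low visibility"},
}


def categories_for(concepts):
    """Names of every category whose set shares a concept with the record."""
    return sorted(
        name for name, members in category_sets.items() if concepts & members
    )


assigned = {rid: categories_for(c) for rid, c in record_concepts.items()}

# Step 4: records that land in more than one category are the overlaps
# whose highlighted source text is worth reading.
overlaps = sorted(rid for rid, cats in assigned.items() if len(cats) > 1)

print(frequency.most_common(3))
print(assigned)
print(overlaps)
```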

Page 27

Categorisation Demo

Combine most frequent concepts + similar concepts in a category

Page 28

Categorisation Demo

View sources of overlapping Sets

Page 29

Categorisation Demo

Assign Categories to records and review

Page 30

Patient Portal Demo

Use iKnow to surface concepts from notes

Use dictionaries to filter relevant medical terms

Automatic negation detection

Highlight source text

Page 31

Patient Portal Demo

Show frequent concepts for a patient

Page 32

Patient Portal Demo

Show negation, Similar and Related Concepts, Highlight Source

Page 33

Patient Portal Demo

Use dictionaries to surface relevant medical concepts

Page 34

Find trigger concepts that predict seclusion of patients based on doctor’s notes

Page 35

Parnassia: preventing seclusion

Ward list

Patient 1     High Risk
Patient 2     High Risk
Patient 3     Medium Risk
Patient 4     Medium Risk
Patient 5     Medium Risk
Patient 6     Low Risk
Patient 7
Patient 8
Patient 9
Patient 10

Patient 1

Page 36

Page 37

Finding structured data in text

Page 38

Ranking hotels…

Page 39

What we have

A number of reviews with lots of plain text and an overall rating of the hotel (not necessary)

Sample review (translated from Russian):

The hotel itself is actually quite decent, and the views are beautiful. But check-in is a quiet nightmare! They put you in rooms at the neighbouring hotel, which is about a hundred times worse; without an "additional" arrangement, don't count on anything better! The hotel does not live up to 5*, but the price tells you as much. The beach has no litter bins as such……

Hotel Rating 4.0

N Hotel Rating 3.98

Plus many, many texts

Page 40

What we need

Clear advice: separate and substantiated ratings for hotel categories such as comfort, food, hospitality, territory, location, service

Sample review (translated from Russian):

The hotel itself is actually quite decent, and the views are beautiful. But check-in is a quiet nightmare! They put you in rooms at the neighbouring hotel, which is about a hundred times worse; without an "additional" arrangement, don't count on anything better! The hotel does not live up to 5*, but the price tells you as much. The beach has no litter bins as such……

Hotel Rating 4.0

N Hotel Rating 3.98

Category       Rating
Comfort        4.12
Food           4.47
Hospitality    3.71
Territory      3.88
Location       3.90
Service        4.08

Categories description

Page 41

How it works

Reviews (translated from Russian):

First impressions of the hotel were not bad: friendly Russian-speaking staff at reception, we were checked into a room right away (a non-standard one, as no others were available), and the grounds are good. Two days later we were moved to a standard room, almost with a sea view even. The previous guests' dirty towels had simply been folded onto the shelf; we went to reception and were told "by 21:00". The room was not cleaned, they only folded the blanket on the bed, the towels still were not changed, and they did not even put out toilet paper! The room was cleaned, the paper was hung up, and the towels and EVEN THE BED LINEN were changed!

A day later my friend and her daughter (6 years old) joined us. My older daughter (16) liked everything, especially the entertainment programme. My husband, our son (3) and I were happy with almost everything except the food and the service. The little one kept spilling juice on the sheets, so they were changed for us every day. So as far as the accommodation goes, we were happy with everything.

The hotel itself is actually quite decent, and the views are beautiful. But check-in is a quiet nightmare! They put you in rooms at the neighbouring hotel, which is about a hundred times worse; without an "additional" arrangement, don't count on anything better! The hotel does not live up to 5*, but the price tells you as much. The beach has no litter bins as such, so there was plenty of rubbish((( Next door was a German hotel, also 5*: a world of difference!

iKnow

Dictionaries

Hotel grades

Category grades

Short text summary of review

Review quality grades

Searching for the best and most useful reviews

Searching for similar reviews

Excluding advertisement reviews

Collecting data on specific hotel features (swimming pool, sand, drinks variety, etc.)

Page 42

Rating correlation results

[Chart: Rating per Hotel, User vs iKnow]

Algorithm              Correlation value
Sentence + Negation    79%
Sentence - Negation    84%
Path + Negation        78%
Path - Negation        79%
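The slide does not say which correlation measure was used. As a sketch under the assumption that it is the Pearson coefficient between the users' overall ratings and the ratings derived from the review text, a value like those in the table could be computed as follows. The ratings are made up and the function name is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / (spread_x * spread_y)


# Made-up ratings for four hotels, purely for illustration.
user_ratings = [4.0, 3.0, 5.0, 2.0]
text_ratings = [3.8, 3.1, 4.6, 2.5]

r = pearson(user_ratings, text_ratings)
print(f"correlation: {r:.0%}")
```

Running the same comparison once per algorithm variant (sentence or path level, with or without negation) would give one value per table row.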

Page 43

Auxipress

Media monitoring and analysis solutions to make better strategic decisions and to achieve your business goals

Aggregates written press, websites, radio, television and social media

4,000,000 web pages; 10,000,000 press articles; 1,000,000 Radio/TV/Web TV files

Page 44

Horse-Gate scandal media analysis

iKnow @ AuxiPress

Page 45

Flow

iKnow

Page 46

eBook summary

Page 47

Related books based on content

Page 48

And more …

Summarize sources

Page 49

And more …

Tag Clouds

Page 50

Technical …

iKnow is a toolset with which you can index a source to get the ‘building blocks’

Use APIs to get indexing results:

Caché ObjectScript code

SQL stored procedures

REST / JSON

Use iKnow in Caché or as a black box (sources in, building blocks out) for other NLP tools

Page 51

Questions?