Text Analytics World - Expert System USA
Text Analytics World, San Francisco – March 31, 2015
4:15-4:45pm
Speaker: Bryan Bell, Executive Vice President, Expert System USA
What is in Your Business Requirement: Searching or Finding? Enterprise Search
Product Demonstration: The Google Search Appliance (GSA) integrated with a semantic technology platform.
1. Internal and external information comes at us faster than we can keep up with.
2. Businesses expect deployed enterprise search and content navigation systems to capture the hidden value of strategic information.
3. CONTEXT: Exploiting deep linguistic analysis combined with semantics offers the ability to create contextually correct metadata.
4. Dynamically enrich content with contextually relevant metadata and deploy it at the heart of knowledge management applications and the Google Search Appliance.
1. Internal and external information comes at us faster than we can keep up with.
80 – 90% is unstructured text.
Zettabyte: 1,000,000,000,000,000,000,000 bytes
The Google crawler visits 20 billion web sites a day. The search engine has located more than 30 trillion unique URLs.
Google processes 100 billion searches every month.
• 3.3 billion searches per day.
• Over 38,000 searches per second.
• A single Google query uses 1,000 computers to retrieve an answer.
• This volume, combined with the PageRank algorithm, PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), is why Google is so good on the internet.
• 16% to 20% of the queries asked every day have never been asked before. (Amit Singhal, Senior Vice President of development, Google Search, August 2012)
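To make the PageRank formula quoted above concrete, here is a minimal sketch that iterates it on a tiny link graph. The three-page graph, the damping factor d = 0.85, and the fixed iteration count are assumptions chosen for illustration; this is not a description of how Google actually computes rankings at scale.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over pages T linking to A.

    links maps each page to the pages it links to; C(T) is the number of
    distinct outbound links on page T.
    """
    graph = {src: set(targets) for src, targets in links.items()}
    pages = set(graph)
    for targets in graph.values():
        pages.update(targets)
    ranks = {page: 1.0 for page in pages}
    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            # Sum the rank shares passed on by every page that links here.
            incoming = sum(
                ranks[src] / len(targets)
                for src, targets in graph.items()
                if page in targets
            )
            new_ranks[page] = (1 - d) + d * incoming
        ranks = new_ranks
    return ranks


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
toy_ranks = pagerank(toy_links)
for page in sorted(toy_ranks, key=toy_ranks.get, reverse=True):
    print(page, round(toy_ranks[page], 3))
```

In this toy graph, page C collects links from both A and B and so ends up ranked highest. The score depends entirely on link structure, which is the signal the web supplies in enormous volume.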
The Internet
2. Deploying an internal enterprise search engine / content navigation system to capture and share the hidden value of the information that is available to the company.
The intranet / corporate portal
“Our search stinks!
I want it to work like Google.”
Good news: PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
• Don’t have 3.3 billion searches per day.
• Don’t have 38,000 searches per second.
• Don’t have 1,000 computers to retrieve an answer.
Key words
No metadata
Poor metadata
Inconsistent
= POOR CONTENT FINDABILITY
stock
People are able to disambiguate “on the fly”, but machines cannot.
Key words vs. Context
Language ambiguity
stock, apple, Apple
“I bought 10,000 shares of stock in Apple.”
“I have 10,000 apples in stock.”
Context is King
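The ambiguity can be seen in a few lines of code. The sketch below is a deliberately naive keyword matcher (the function name `keyword_hits` and its crude plural handling are assumptions made for illustration, not any product's behavior). It shows that a query for "apple stock" matches both example sentences equally, even though they mean different things.

```python
import re

sentences = [
    "I bought 10,000 shares of stock in Apple.",
    "I have 10,000 apples in stock.",
]


def keyword_hits(query, docs):
    """Return docs containing every query term, ignoring case and trailing 's'."""
    terms = [term.lower().rstrip("s") for term in query.split()]
    hits = []
    for doc in docs:
        # Reduce each document to a bag of lowercased, crudely singularized words.
        words = {w.lower().rstrip("s") for w in re.findall(r"[A-Za-z]+", doc)}
        if all(term in words for term in terms):
            hits.append(doc)
    return hits


hits = keyword_hits("apple stock", sentences)
for doc in hits:
    print(doc)
```

Both sentences come back, because nothing in a bag of keywords separates the company from the fruit or shares from inventory.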
3. Exploiting deep linguistic analysis, combined with semantics.
4. Dynamically enrich content with contextually relevant metadata.
How is word context established?
Morphological analysis (word forms): dog, dog-catcher, doggy bag
Grammatical analysis (parts of speech): "There are 40 rows in the table." (noun); "She rows 5 times a week." (verb)
Logical analysis (word relationships): "The car I bought, to replace my Chrysler, stinks."
Semantic analysis (word context): "I bought 10,000 shares of stock in Apple."; "I have 10,000 apples in stock."; "I used chicken broth for my soup stock."
Deep linguistic analysis of words to achieve word disambiguation.
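As a rough illustration of the semantic-analysis step only, the sketch below picks a sense for "stock" by counting how many context words in the sentence overlap with a small set of cue words per sense. The sense inventory and cue words are hand-made assumptions. A real linguistic engine would draw on morphological, grammatical, and logical analysis as well, so this is a toy, not the method presented in the talk.

```python
import re

# Hypothetical, hand-made sense inventory for the word "stock".
SENSES = {
    "financial security": {"shares", "bought", "sold", "market", "investor"},
    "inventory": {"apples", "warehouse", "supply", "items", "inventory"},
    "cooking broth": {"soup", "broth", "chicken", "simmer"},
}


def disambiguate(sentence, senses=SENSES):
    """Return the sense whose cue words overlap most with the sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(words & cues) for sense, cues in senses.items()}
    return max(scores, key=scores.get)


examples = [
    "I bought 10,000 shares of stock in Apple.",
    "I have 10,000 apples in stock.",
    "I used chicken broth for my soup stock.",
]
for sentence in examples:
    print(disambiguate(sentence), "<-", sentence)
```

The same keyword, "stock", resolves to three different senses once the surrounding words are taken into account.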
How is word context established and deployed with the GSA?
www.intelligenceapi.com
Linguistic and semantic analysis engine
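One way to picture the enrichment step (point 4 of the outline) is as attaching the analysis engine's output to a document as metadata before it is indexed. The sketch below assumes the metadata arrives as a plain dictionary and that the search engine indexes HTML meta tags. The function name `enrich_html` and the field names "organization" and "topic" are hypothetical, and nothing here reflects the actual interface of the engine or the GSA integration shown in the demonstration.

```python
from html import escape


def enrich_html(title, body, metadata):
    """Wrap text in a minimal HTML page carrying one <meta> tag per metadata field."""
    meta_tags = "\n".join(
        f'    <meta name="{escape(name, quote=True)}" '
        f'content="{escape(value, quote=True)}">'
        for name, value in sorted(metadata.items())
    )
    return (
        "<html>\n"
        "  <head>\n"
        f"    <title>{escape(title)}</title>\n"
        f"{meta_tags}\n"
        "  </head>\n"
        f"  <body><p>{escape(body)}</p></body>\n"
        "</html>"
    )


# Hypothetical output of a semantic analysis step for one sentence.
page = enrich_html(
    "Trade note",
    "I bought 10,000 shares of stock in Apple.",
    {"organization": "Apple", "topic": "equities"},
)
print(page)
```

With the company and topic recorded as explicit fields, a search for the organization Apple no longer has to rely on the bare keyword.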
Case Study: GSA – Google Search Appliance
What is in Your Business Requirement? Searching or Finding.
Contacts
Thank you
Bryan Bell
@bellbryan
+1.847.508.7938
www.expertsystem.com