Meltwater - nlp matiné 2014
Gábor Pécsy
Senior Manager, Data Enrichment
Meltwater Group
June 25, 2014
History / Overview
• Founded: 2001 in Oslo, Norway with $15,000
• Privately owned and organically grown
• Headquarters: San Francisco, California
• Employees: 900+
• Customers: 35,000+
• Core business: Media Monitoring
Financials
• Strong top line growth since inception
• Consistently profitable every year of operation
• Mostly self-funded
• Yearly revenue 165M USD
Jorn Lyseggen
• CEO & Founder
• Involved in four startups to date
• Founded Meltwater in 2001
Offices All Around the Globe
• 50+ offices in Europe, North and South America, Asia, Africa, Australia
• 900+ employees, mostly sales
Not-for-profit NGO fully funded and run by Meltwater
Our product vision
Morning coffee
Informed decisions
Help our clients track and understand
• own brand
• competitors
• leads
• partners
• product reviews
• own industry
Uses Meltwater to find out about new instances of vandalism and break-ins. Often, the victim is in need of services
Uses Meltwater to help determine how public perception of certain ingredient chemicals will influence adoption & sales
Uses Meltwater to be alerted when certain patents expire in target markets
TV Station In India: Uses Meltwater to monitor the performance and popularity of news anchors and programs
Uses Meltwater social listening to estimate and prevent infrastructure attacks
Meltwater in Budapest
• Operations started in 2009
• No sales
• Originally a technology research group
• Currently two teams present:
• Content Services: responsible for content acquisition
• Data Enrichment: data analytics and enrichments (including NLP)
• Current size: 11, plan to grow to 20 by end of the year
Our technology in numbers
• Content:
• News crawler: 250K+ sources, 2M+ documents daily
• Over 3 billion documents since 2001
• Blog crawler (icerocket.com): 30M blogs
• Social data: 100M+ documents daily from various sources (Twitter, Facebook, YouTube, comment streams, Wikipedia etc.)
• Data enrichment:
• NLP services in 12 languages (details later)
• Search and Storage:
• Elasticsearch index
• Riak – the largest known installation according to Basho
• ~150TB of data
Existing NLP Services
● Language detection
53 Languages
● Sentiment analysis
12 languages with support for numeric values
● Key phrase extraction
12 languages
● Named Entity Recognition
4 languages (English, German, Swedish, Norwegian)
● Content Categorization
12 Languages with support for dynamic categories
● Intent detection
1 Language (English)
“I want to buy an iPhone.” → PURCHASE → Sales
“How can I play music on my iPhone?” → QUESTION → Customer Support
● Named Entity Disambiguation
1 Language
● Near duplicate detection
Language Agnostic
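The intent detection examples above pair a sentence with an intent label (PURCHASE, QUESTION) and a receiving team (Sales, Customer Support). The following is a toy Python sketch of that flow, not Meltwater's implementation: the keyword patterns are a hypothetical stand-in for a trained classifier, and the names `detect_intent`, `route` and `ROUTES` are invented for illustration.

```python
import re
from typing import Optional

# Routing table taken from the two examples on the slide.
ROUTES = {"PURCHASE": "Sales", "QUESTION": "Customer Support"}

# Hypothetical keyword patterns; a real service would use a trained model.
PURCHASE_PATTERN = re.compile(
    r"\b(want to buy|looking to buy|going to buy|where can i buy)\b",
    re.IGNORECASE,
)
QUESTION_PATTERN = re.compile(
    r"^\s*(how|what|where|when|why|who|can|does|is)\b.*\?\s*$",
    re.IGNORECASE,
)


def detect_intent(text: str) -> Optional[str]:
    """Return an intent label for the text, or None if no pattern matches."""
    cleaned = text.strip().strip("“”\"")
    # Purchase intent is checked first, so "Where can I buy ...?" is a lead.
    if PURCHASE_PATTERN.search(cleaned):
        return "PURCHASE"
    if QUESTION_PATTERN.match(cleaned):
        return "QUESTION"
    return None


def route(text: str) -> Optional[str]:
    """Map the text to the team that should handle it, if any."""
    intent = detect_intent(text)
    return ROUTES.get(intent) if intent is not None else None


if __name__ == "__main__":
    for sentence in ("I want to buy an iPhone.",
                     "How can I play music on my iPhone?"):
        print(sentence, "->", detect_intent(sentence), "->", route(sentence))
```

Keeping the label-to-team table separate from the detector means the classifier could be replaced without touching the routing.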
NLP Capabilities under development
• Entity level sentiment
• Relationship extraction
• Document Grouping
• Searchable knowledge base
Current Products
mNews
mPress - now part of mNews
mBuzz