The Technology Perspective - Staffan Truvé, Recorded Future

Post on 06-May-2015

DESCRIPTION

Presentation by Staffan Truvé (Recorded Future) from the 'Perfect Swell' workshop on text and data mining, held on 27 September 2013.

TRANSCRIPT

Robots’ Rights and the Future of Web Intelligence

Staffan Truvé, Ph.D.

CTO, Recorded Future

truve@recordedfuture.com

Who are we?

•  Founded in 2009

•  US-Swedish startup

•  40 persons, 50/50 US/SE

•  Boston (HQ), DC, Göteborg, and a few more places around the globe

•  Focus on web intelligence, for governments and industry

•  Backed by Google Ventures, Atlas Venture, Balderton, IA Ventures, and I-Q-T

Chart: cost to acquire and analyze data vs. amount of data

Today’s Discussion, Tomorrow’s News

Silicon Valley executives head to Vail, Colo. next week for the annual Pacific Crest Technology Leadership Forum

The carrier may select partners to set up a new carrier as early as next month

“2010 is the year when Iran will kick out Islam. Ya Ahura we will.”

“... Dr Sarkar says the new facility will be operational by March 2014...”

Unilever will hold their UK launch event early next week in Manchester

“...opposition organizers plan to meet on Thursday to protest...”

“Excited to see Morsi speak this weekend...”

“According to TechCrunch China’s new 4G network will be deployed by mid-2010”

“Strange new Russian worm set to unleash botnet on 4/1/2013...”

Estimated study completion date: November 2014

…new facility will be operational by March 2013…

...the transaction is expected close in late 2013…
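
The snippets above all contain a reference to a point in time later than their publication. As a rough illustration only, and not a description of Recorded Future's actual pipeline, a minimal sketch of spotting explicit "Month YYYY" references that lie in the future relative to a publication date might look like this. The function name `future_month_refs` and the choice to handle only English month names are assumptions made for the example; relative expressions such as "next week" or "on Thursday" are deliberately not handled.

```python
import re
from datetime import date

# English month names mapped to month numbers (assumption: English text only).
MONTHS = {name: i for i, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

# Matches explicit references such as "March 2014".
MONTH_YEAR = re.compile(r"\b(" + "|".join(MONTHS) + r")\s+(\d{4})\b")

def future_month_refs(text, published):
    """Return (year, month) pairs in `text` that fall after the month of `published`."""
    refs = []
    for match in MONTH_YEAR.finditer(text):
        year, month = int(match.group(2)), MONTHS[match.group(1)]
        if (year, month) > (published.year, published.month):
            refs.append((year, month))
    return refs

published = date(2013, 9, 27)  # hypothetical publication date
print(future_month_refs("Estimated study completion date: November 2014", published))
```

A production system would also need to resolve relative expressions against the publication date, which is a considerably harder problem than this pattern match.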

Organizing the Web for Analysis

250,000 Real-time Sources

10 Billion Time-tagged Facts

“Kuo expects that Apple will introduce an iPhone 5S around June or July of this year”

“...opposition organizers plan to meet on Thursday to protest...”

Drought and malnutrition hinder next spring’s expansion plans in Kabul...

A few minutes from publishing to analysis

Inside the Web Intelligence Machine

Web Intelligence – at Web Scale

•  Processing 100s of millions of documents

•  Sources from all over the world, in 8 languages – English, Arabic, Chinese x 2, Russian, Spanish, French, Farsi

•  From government sites and big media to blogs and social media

•  10B “facts” - growing fast

•  25+ entity types

•  100+ event types

•  Metrics, signals, alerts

•  In real time!

Evolution, Opportunities, and Threats

Without text mining, the web is useless!

•  No way to find stuff without search engines – which all rely on text mining

•  And all publishers realize this

•  Search → Analysis

•  A necessary evolution as the web grows

•  Creating new value

•  Aggregation, analysis

•  Enabling media criticism

Drivers / Opportunities

•  Moore’s Law!

•  Advances in linguistics, algorithms, math

•  Exponential growth of content

•  The volume of information on the web is making traditional search worthless

Threats

•  Appification / closed vertical silos

•  Deep Web (& darkweb)

•  IP protectionism / legislation

Restricting the right to analyze is absurd

•  What is the borderline between reading and analyzing?

•  Impossible to differentiate humans from ‘robots’

•  Robots must have the same rights as human readers

Turing would have laughed!

Staffan Truvé truve@recordedfuture.com

The best way to predict the future is to invent it! (Alan Kay)

Dynabook, 1968
