thomson reuters machine_readable_news_v3

THOMSON REUTERS MACHINE READABLE NEWS OVERVIEW RICH BROWN GLOBAL BUSINESS MANAGER May, 2010 V3.1

Upload: andrey-udovitskiy

Post on 31-Oct-2014


Category:

Business



Page 1: Thomson reuters machine_readable_news_v3

THOMSON REUTERS MACHINE READABLE NEWS OVERVIEW

RICH BROWN

GLOBAL BUSINESS MANAGER

May, 2010 V3.1

Page 2: Thomson reuters machine_readable_news_v3

EXPLOITING NEWS CONTENT

• News is emerging as a differentiated, value-generating content set
  – Quant strategies – all trading frequencies
  – Human decision support – especially with analytic enhancements

• Key uses
  – Speed – beat the humans, beat the machines
  – Manage scale and scope of events affecting portfolio
  – Risk management and loss avoidance

• Machine Readable News product line – robust set of capabilities
  – Historical data to back-test and build algorithms
  – Real-time feeds for deployment, including ultra-low latency feed
  – Analytic add-ons which convert qualitative text into quantitative data

Page 3: Thomson reuters machine_readable_news_v3

EXPLOITING NEWS CONTENT

• News flow is a good indicator of volume and volatility.

• Price movements accompanied by news tend to show momentum; those without news tend to revert toward average trends.

• The market tends to overreact when there is a lot of news on something and underreact when there is little news.

• For direction and magnitude, find cause-and-effect relationships.

Page 4: Thomson reuters machine_readable_news_v3

MACHINE READABLE NEWS USE CASES

• Circuit breaker / halt trading alert (Wolf detection)

• News flow algorithms (more participatory algos)

• Alpha generating signal

• Risk Management – Quantify “event risk” and manage portfolio volatility

• Compliance – monitor for potential market abuse

• Post trade analysis – why did the algo/strategy not work?

• Stock screening tool (good/bad news stocks)

• Fundamental research – measure company sentiment, peer analysis, aggregate for market/sector outlook

• Trader support – confirming/contrarian trading signals, volatility signals

Page 5: Thomson reuters machine_readable_news_v3

MACHINE READABLE NEWS PRODUCT PORTFOLIO

• News Archive
  – Historical database of Reuters and select third-party market-moving sources used to build and back-test news or event-based strategies.

• News Feed Direct
  – Ultra-low latency feed of highly structured news and economic data used to implement news or event-based strategies.

• Thomson Reuters News Analytics (fka NewsScope Sentiment Engine)
  – Automated news analysis solution measuring sentiment, relevance, and novelty of text along with a host of other valuable metadata. Used to predict returns and volatility, applicable across all trading frequencies and short-term (intraday) to medium-term (months) investment horizons.

• Event Indices
  – Automated news analysis solution indicating when abnormal amounts of news occur across various categories. Used to predict abnormal risk and volatility as well as returns.

• A Real-time News license, formerly NewsScope Real-Time, is also available to license news from our consolidated feed (RDF-delivered news).

Page 6: Thomson reuters machine_readable_news_v3

NEWS ARCHIVE

• Historical file of Reuters and select third-party news for building and back-testing strategies
  – Reuters – history to 1987
  – Third parties (PR Newswire, Business Wire, etc.) – history to 2003

• Follow a story as it unfolded, including all alerts, headlines, updates, corrections, and deletions

• Millisecond time-stamped in GMT; synchronized with Tick History, with common symbology and access methods

• Robust metadata including company identifiers, topic codes, headline tags, stage of story, genre, etc.

• Delivered in .csv format via FTP site

• English items available via FasTick/Market QA EAP 2Q10

Page 7: Thomson reuters machine_readable_news_v3

NEWS FEED DIRECT

• Proprietary, machine-readable, market-moving news
  – Ultra-low latency machine-readable economics
  – Reuters news
  – Third-party newswires
  – Company Events – structured company data
  – Credit Ratings data (from S&P - EAP - Q1, 2010)

• Robust platform optimized for ultra-low latency delivery
  – Massive parallel data processing and routing procedures
  – Full text & comprehensive metadata via streaming broadcast
  – XML format for text with binary format for latency-sensitive releases
  – Millisecond transmission timestamps
  – Assured delivery options
  – Full geographic redundancy

• Manhattan & Piscataway, NJ – Dedicated circuits or internet availability
• London – Dedicated circuits, Co-lo, or internet availability
• Chicago – Dedicated circuits, Co-lo, or internet from NY metro
• Washington, D.C. – beta site available for Washington-based releases

Click here for datacenter details

Page 8: Thomson reuters machine_readable_news_v3

NEWS FEED DIRECT DISTRIBUTION LOCATIONS

[Map of distribution sites: Chicago, New York, Washington, D.C., London]

Page 9: Thomson reuters machine_readable_news_v3


NEWS FEED DIRECT CONNECTIONS

Page 10: Thomson reuters machine_readable_news_v3

NEWS FEED DIRECT CONNECTION OPTIONS BY SITE

• New York
  – Dedicated, managed circuits ordered by TR (OCL and OCP)
  – Dedicated circuits ordered and managed by customer for OCL
  – Internet

• Chicago (Elektron hub)
  – Dedicated circuits ordered and managed by customer
  – Cross-connect to/from Savvis

• London (Elektron hub)
  – Dedicated circuits ordered and managed by customer
  – Cross-connect to/from Savvis
  – Internet

• Washington, D.C. (beta site, limited data)
  – Dedicated circuits ordered and managed by customer
  – Cross-connect to/from Savvis

Page 11: Thomson reuters machine_readable_news_v3

NEWS FEED DIRECT CONTENT

• Reuters News (Full feed - Equities, Treasury, C&E, Political and General)

• Third-party newswires (PRN, BSW, Globe, MKTW)

• Economic data

• Company Events

• Credit Ratings EAP Q1, 2010

Economic data sources (click here for full list of all sources)

US
• Department of Labor
• Department of Commerce
• Treasury
• Treasury Auction Data
• Philadelphia Fed
• ISM (Institute for Supply Management)
• Housing Data
• University of Michigan Advance Feed
• Natural gas inventories (Oil in Q2)

Europe
• Bank of England
• Office for National Statistics
• Eurostat

Rest of world – Q2

Click here for econ data release procedures

Page 12: Thomson reuters machine_readable_news_v3

ECONOMIC ANNOUNCEMENT RELEASE PROCEDURES

• Economic announcements are released in four ways:
  – Lock-up releases
  – Embargoed data
  – Press releases
  – Internet site posted

• Thomson Reuters is consistently the fastest on nearly all market-moving releases.

• The time difference between the #1 and #2 providers, as well as the rest of the pack, will vary based upon how the data is released.

Page 13: Thomson reuters machine_readable_news_v3

ECONOMIC ANNOUNCEMENT RELEASE PROCEDURES: LOCK-UPS

A “lock-up” is used to control the release procedure and ensure the data is distributed at the same time from the news agencies in the lock-up.

• News agencies are in the lock-up and receive data prior to the scheduled release. No external communication of that announcement is permitted.

• Distribution enforcement is controlled through two primary methods from the lock-up:
  – Finger push
    • Countdown to the release time (5, 4, 3, 2, 1, Go)
    • Winner is more random due to human process
    • Dept of Treasury releases this way.
  – Switch release
    • No communication lines are open prior to the time of the release.
    • At the appropriate release time, a power switch controls communication equipment. When powered up, the data leaves the lock-up, ensuring a level playing field (no fast fingers or other methods to release data early).
    • Winner is based on which provider has the best end-to-end technology and networks.
    • TR is consistently #1 on these announcements (since summer, 2009)
    • Dept of Labor & Dept of Commerce release this way.

Page 14: Thomson reuters machine_readable_news_v3

ECONOMIC ANNOUNCEMENT RELEASE PROCEDURES: EMBARGOED

• News agencies are provided with data ahead of a specified release time (under embargo).

• Data is typically released by the agency at the specified release time using an NTP-synchronized clock.

• Empire Manufacturing and Consumer Confidence are examples, as is the Thomson Reuters / University of Michigan Surveys of Consumers, although the latter is exclusive to TR.

Page 15: Thomson reuters machine_readable_news_v3

ECONOMIC ANNOUNCEMENT RELEASE PROCEDURES: PRESS RELEASES

Press releases are used by certain organizations to distribute their information. News providers receive press releases, typically from providers like Business Wire and Market Wire, and extract data for distribution to clients.

• ISM (Institute for Supply Management), ADP, and Philly Fed are examples.

• TR’s automated application extracts information from the wire very quickly, putting it in the lead.

• Tens of milliseconds ahead of the #2 provider on ISM, Philly Fed, and ADP data

Page 16: Thomson reuters machine_readable_news_v3

ECONOMIC ANNOUNCEMENT RELEASE PROCEDURES: INTERNET

Internet Sourced

• Certain releases are posted to a specific web site as the “public notice” or announcement
  – There is a controlled release, typically through a government web site.
  – Strict rules for access/ping frequency for certain releases.
  – Distribution of win rates is more random.

• Oil number posted to the EIA (Energy Information Administration, US Dept. of Energy) web page.

Page 17: Thomson reuters machine_readable_news_v3

UNIVERSITY OF MICHIGAN ADVANCE FEED

• Advance feed of Thomson Reuters / University of Michigan Surveys of Consumers headline information
  – Twice-monthly releases (preliminary and final)
  – Thomson Reuters has exclusive media release rights to data
    • Conference call begins at 9:55
    • Sent to terminals for human reading at 9:55
    • Machine Readable News license for API access at 9:55
  – Advance feed sent at 9:54:58 for News Feed Direct subscribers

Page 18: Thomson reuters machine_readable_news_v3

COMPANY EVENTS

• Company Events is a structured feed of key data points from company press releases
  – TR analyses Business Wire and PR Newswire company press releases on US companies
  – Fully automated extraction using ClearForest text analytics and other technology for data verification prior to publication
  – Sub-second extraction and delivery process
  – XML representation of key facts (EPS, Rev, FFO and guidance)
  – Recall: % of facts extracted from releases - average 80%+
  – Precision: accuracy - over 95%
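The recall and precision figures can be read as simple set ratios. The sketch below is only an illustration of that arithmetic: the function name `precision_recall` and the fact tuples are hypothetical, and it assumes a fact counts as correct only when it exactly matches a hand-checked reference fact. It does not describe how Thomson Reuters measures these numbers.

```python
def precision_recall(extracted, reference):
    """Return (precision, recall) for extracted facts against a reference set."""
    extracted, reference = set(extracted), set(reference)
    correct = extracted & reference
    # Precision: share of the extracted facts that are right.
    precision = len(correct) / len(extracted) if extracted else 0.0
    # Recall: share of the reference facts that were extracted at all.
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical facts as (RIC, fact type, period, value) tuples.
reference_facts = {
    ("AAA.N", "EPS", "Q3", "0.16"),
    ("AAA.N", "Revenue", "Q3", "120.5"),
    ("AAA.N", "GuidanceEPS", "2009", "0.75-0.77"),
    ("BBB.O", "EPS", "Q3", "1.02"),
    ("BBB.O", "FFO", "Q3", "0.48"),
}
# The extractor found four of the five facts and made no mistakes.
extracted_facts = reference_facts - {("BBB.O", "FFO", "Q3", "0.48")}

print(precision_recall(extracted_facts, reference_facts))  # (1.0, 0.8)
```

In this toy case every extracted fact is correct (precision 1.0) while one fact in five is missed (recall 0.8).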

Click here for positioning

Page 19: Thomson reuters machine_readable_news_v3

COMPANY EVENTS POSITIONING

It is:

• A very fast way to get select data points from company news releases in the US (US at launch, expanding later).

• An easier and faster way for customers to consume this data vs parsing it from other sources

• A complement to IBES, First Call, and Estimates Delta, used to determine if actuals (via Company Events) vary from expectations (via IBES, FC, or ED)

It is not:

• A fully comprehensive array of company event data designed to replace other feeds such as fundamental data, Reuters Knowledge, company 10Ks/Qs or 8K announcements

• A replacement for broker research, Estimates Delta, or the human interpretation of context in a company’s press release

Page 20: Thomson reuters machine_readable_news_v3

COMPANY EVENTS STRUCTURE

• For a single press release, the facts extracted include:
  – Company Name and Reuters Instrument Code (RIC)
    • Example:

      <Company>
        <CompanyName>AFC Enterprises Inc</CompanyName>
        <Symbol type="RIC">AFCE.O</Symbol>
      </Company>

  – EPS (both GAAP & non-GAAP)
  – Guidance EPS
  – Actual Revenue
  – Guidance Revenue
  – Actual Funds From Operations (FFO) per share
  – Guidance Funds From Operations (FFO) per share

Click here for details

Page 21: Thomson reuters machine_readable_news_v3

COMPANY EVENTS STRUCTURE (cont)

• For each Actual or Guidance fact, the following elements can be present:
  – Period (required)
    • A period can be either a Quarter or a Year.
      – If it is a Quarter, the period specifies which quarter and the periodYear says which year it belongs to.
    • Sometimes it is not possible to discern from the source document which year, or even which quarter, the document applies to.
      – In that case the quarter is given as just 'Q', indicating the current quarter.
      – If the period attribute is absent, the period could not be inferred from the source document.
    • Example:

      <Period periodType="Quarter" period="Q3"/>
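The Period rules can be sketched as a small reader. This is a minimal illustration using Python's standard `xml.etree.ElementTree`. The function name `describe_period` and the returned strings are hypothetical, and treating `periodYear` as an XML attribute is an assumption, since the slide example shows only `periodType` and `period`.

```python
import xml.etree.ElementTree as ET


def describe_period(period_xml):
    """Summarise a <Period> element as a short human-readable string."""
    elem = ET.fromstring(period_xml)
    period = elem.get("period")
    year = elem.get("periodYear")  # assumed to be an attribute
    if period is None:
        # Attribute absent: the period could not be inferred from the source.
        return "unknown period"
    if elem.get("periodType") == "Quarter" and period == "Q":
        # A bare 'Q' means the current quarter, without saying which one.
        return "current quarter"
    return f"{period} {year}" if year else period


print(describe_period('<Period periodType="Quarter" period="Q3"/>'))  # Q3
```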

Page 22: Thomson reuters machine_readable_news_v3

COMPANY EVENTS STRUCTURE (cont)

• ValueUnitCurrency (required)
  – The ValueUnitCurrency element contains one of:
    • a single value
    • a range represented by LowValue and HighValue
  – And one of:
    • a currencyCode (such as USD)
    • other units (currently the only value of units is ‘%’, used for guidance figures when expressed as a percentage increase/decrease)

• Example:

  <ValueUnitCurrency currencyCode="USD">
    <Value>0.16</Value>
  </ValueUnitCurrency>

Page 23: Thomson reuters machine_readable_news_v3

COMPANY EVENTS STRUCTURE (cont)

• Example of ValueUnitCurrency with a range:

  <ValueUnitCurrency currencyCode="USD">
    <LowValue>0.75</LowValue>
    <HighValue>0.77</HighValue>
  </ValueUnitCurrency>

• Example of ValueUnitCurrency with a percentage:

  <ValueUnitCurrency units="%">
    <LowValue>10.00</LowValue>
    <HighValue>12.00</HighValue>
  </ValueUnitCurrency>

• Type (required)
  – For both Actual and Guidance elements
  – This specifies the type of value represented, which can be
    • EPS
    • Revenue
    • FFO

• Example:

  <Type>EPS</Type>
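A consumer has to handle both shapes of ValueUnitCurrency (single value or range) and both unit styles (currencyCode or units). The sketch below is one possible way to normalise them, assuming the element layout shown in the examples. The function name `parse_value` and the choice to return a `(low, high, unit)` tuple are illustrative, not part of the feed specification.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal


def parse_value(vuc_xml):
    """Return (low, high, unit) for a ValueUnitCurrency element.

    A single <Value> is returned as a range whose low and high are equal.
    """
    elem = ET.fromstring(vuc_xml)
    # Exactly one of currencyCode / units is expected to be present.
    unit = elem.get("currencyCode") or elem.get("units")
    single = elem.findtext("Value")
    if single is not None:
        low = high = Decimal(single.strip())
    else:
        low = Decimal(elem.findtext("LowValue").strip())
        high = Decimal(elem.findtext("HighValue").strip())
    return low, high, unit


range_xml = (
    '<ValueUnitCurrency currencyCode="USD">'
    "<LowValue>0.75</LowValue><HighValue>0.77</HighValue>"
    "</ValueUnitCurrency>"
)
print(parse_value(range_xml))
```

Using `Decimal` rather than `float` keeps the published figures exact, which matters when comparing an actual against an estimate to the cent.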

Page 24: Thomson reuters machine_readable_news_v3

COMPANY EVENTS STRUCTURE (cont)

• QualifierList (optional)
  – Contained in both Actuals and Guidance
  – Specifies one or more qualifiers for the values.
  – The possible values are:
    • ExcludingItems
    • ProForma
    • NonGAAP
    • Cash
    • GAAP
    • IncludingItems
    • Adjusted
    • Core
    • Operating
    • FromContinuingOperations

• Example:

  <QualifierList>
    <Qualifier>ExcludingItems</Qualifier>
    <Qualifier>FromContinuingOperations</Qualifier>
    <Qualifier>NonGAAP</Qualifier>
  </QualifierList>

Page 25: Thomson reuters machine_readable_news_v3

SAMPLE DATA AVAILABLE FOR TRIAL

• Sample data available on Customer Zone

• How to request a trial?

[Flow diagram boxes: TAM enables customer to request; Customer requests; NewsScope product team approves; Customer receives email with download link]

Page 26: Thomson reuters machine_readable_news_v3

SAMPLE OF A FULL PUBLISHED EVENT

<Data xsi:type="e:StructuredAlertsSdi">
  <StructuredAlertsSdi xmlns="http://company.schemas.tfn.thomson.com/2008-11-13/">
    <Company>
      <CompanyName>PG&amp;E Corp</CompanyName>
      <Symbol type="RIC">PCG.N</Symbol>
    </Company>
    <ActualList>
      <Actual>
        <Period periodType="Quarter" period="Q2"/>
        <ValueUnitCurrency currencyCode="USD">
          <Value>0.83</Value>
        </ValueUnitCurrency>
        <Type>EPS</Type>
        <QualifierList>
          <Qualifier>NonGAAP</Qualifier>
          <Qualifier>Operating</Qualifier>

Page 27: Thomson reuters machine_readable_news_v3

SAMPLE OF A FULL PUBLISHED EVENT (cont)

        </QualifierList>
      </Actual>
    </ActualList>
    <GuidanceList>
      </Guidance>
      <Guidance>
        <Period periodType="Year" period="2009"/>
        <ValueUnitCurrency currencyCode="USD">
          <LowValue>3.15</LowValue>
          <HighValue>3.25</HighValue>
        </ValueUnitCurrency>
        <Type>EPS</Type>
        <QualifierList>
          <Qualifier>Operating</Qualifier>
        </QualifierList>
      </Guidance>
    </GuidanceList>
  </StructuredAlertsSdi>
</Data>
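To show how such an event might be consumed, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It makes several assumptions: only the inner `StructuredAlertsSdi` element is parsed (the outer `Data` wrapper uses an `xsi` prefix whose declaration is not shown on the slide), the stray `</Guidance>` from the slide is dropped so the sample is well-formed, and the names `NS`, `parse_fact`, and `parse_event` are hypothetical helpers rather than a Thomson Reuters API.

```python
import xml.etree.ElementTree as ET

# Default namespace taken from the sample; "ce" is an arbitrary local prefix.
NS = {"ce": "http://company.schemas.tfn.thomson.com/2008-11-13/"}

SAMPLE_EVENT = """
<StructuredAlertsSdi xmlns="http://company.schemas.tfn.thomson.com/2008-11-13/">
  <Company>
    <CompanyName>PG&amp;E Corp</CompanyName>
    <Symbol type="RIC">PCG.N</Symbol>
  </Company>
  <ActualList>
    <Actual>
      <Period periodType="Quarter" period="Q2"/>
      <ValueUnitCurrency currencyCode="USD"><Value>0.83</Value></ValueUnitCurrency>
      <Type>EPS</Type>
      <QualifierList>
        <Qualifier>NonGAAP</Qualifier>
        <Qualifier>Operating</Qualifier>
      </QualifierList>
    </Actual>
  </ActualList>
  <GuidanceList>
    <Guidance>
      <Period periodType="Year" period="2009"/>
      <ValueUnitCurrency currencyCode="USD">
        <LowValue>3.15</LowValue><HighValue>3.25</HighValue>
      </ValueUnitCurrency>
      <Type>EPS</Type>
      <QualifierList><Qualifier>Operating</Qualifier></QualifierList>
    </Guidance>
  </GuidanceList>
</StructuredAlertsSdi>
""".strip()


def parse_fact(fact):
    """Flatten one <Actual> or <Guidance> element into a dict of strings."""
    vuc = fact.find("ce:ValueUnitCurrency", NS)
    return {
        "type": fact.findtext("ce:Type", namespaces=NS),
        "period": fact.find("ce:Period", NS).get("period"),
        "currency": vuc.get("currencyCode"),
        "value": vuc.findtext("ce:Value", namespaces=NS),
        "low": vuc.findtext("ce:LowValue", namespaces=NS),
        "high": vuc.findtext("ce:HighValue", namespaces=NS),
        "qualifiers": [
            q.text for q in fact.findall("ce:QualifierList/ce:Qualifier", NS)
        ],
    }


def parse_event(event_xml):
    """Extract company identifiers plus all actual and guidance facts."""
    root = ET.fromstring(event_xml)
    return {
        "company": root.findtext("ce:Company/ce:CompanyName", namespaces=NS),
        "ric": root.findtext("ce:Company/ce:Symbol[@type='RIC']", namespaces=NS),
        "actuals": [parse_fact(f) for f in root.findall("ce:ActualList/ce:Actual", NS)],
        "guidance": [parse_fact(f) for f in root.findall("ce:GuidanceList/ce:Guidance", NS)],
    }


event = parse_event(SAMPLE_EVENT)
print(event["company"], event["ric"], event["actuals"][0]["value"])
```

Because the sample declares a default namespace, every element lookup has to be namespace-qualified; attributes such as `period` and `type` are unqualified and can be read directly.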

Page 28: Thomson reuters machine_readable_news_v3

CREDIT RATINGS

• A feed of structured Credit Ratings from S&P
  – Corporate debt
  – Sovereign
  – Munis

• Structured tags to describe the ratings:
  – Org id, Cusip, Ticker to identify the organization and securities
  – Well-defined universe of action codes

• Feed significantly faster than corresponding news headlines on terminals (measurements show at least a half-second advantage)

• Allows reacting to unexpected events, e.g. the Greece downgrade at 20091216 12:16:01 EST

Click here for details

Page 29: Thomson reuters machine_readable_news_v3

CREDIT RATINGS – SAMPLE OUTPUT

<xrefdate>24-feb-2010</xrefdate>
<xreftime>13:26</xreftime>
<doc_type>rating</doc_type>
<ticker>cmcsa</ticker>
<org_name>comcast corp.</org_name>
<org_id>100533</org_id>
<action>
  <action_code>ONOL</action_code>
  <action_description>on outlook</action_description>
</action>
<ratings_group>
  <lc_long_outlook_current>stable</lc_long_outlook_current>
  <lc_long_outlook_prior>positive</lc_long_outlook_prior>
  <lc_long_outlook_date>24-feb-2010</lc_long_outlook_date>
  <lc_long_current>bbb+</lc_long_current>
  <lc_long_prior>bbb+</lc_long_prior>
  <lc_long_rating_date>14-jun-2005</lc_long_rating_date>
  […]
</ratings_group>

Callouts on the sample:
• Transmission time (EST), Ticker, Org Id
• Action: On Outlook
• Outlook stable from positive
• Rating unchanged
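The callouts on the sample (outlook changed, rating unchanged) can be derived mechanically by comparing the current and prior fields. The following is a sketch only: the slide shows a fragment without its enclosing root element, so the code wraps it in a synthetic `<record>` element, which is an assumption, as are the names `SAMPLE_RATING` and `summarize_rating`. Fields elided with "[…]" on the slide are simply left out.

```python
import xml.etree.ElementTree as ET

# Subset of the fragment shown on the slide.
SAMPLE_RATING = """
<ticker>cmcsa</ticker>
<org_id>100533</org_id>
<action>
  <action_code>ONOL</action_code>
  <action_description>on outlook</action_description>
</action>
<ratings_group>
  <lc_long_outlook_current>stable</lc_long_outlook_current>
  <lc_long_outlook_prior>positive</lc_long_outlook_prior>
  <lc_long_current>bbb+</lc_long_current>
  <lc_long_prior>bbb+</lc_long_prior>
</ratings_group>
"""


def summarize_rating(fragment):
    """Report the action code and whether outlook or rating changed."""
    # Assumption: wrap the sibling elements in a synthetic root so they parse.
    root = ET.fromstring(f"<record>{fragment}</record>")
    group = root.find("ratings_group")
    outlook_prior = group.findtext("lc_long_outlook_prior")
    outlook_current = group.findtext("lc_long_outlook_current")
    rating_prior = group.findtext("lc_long_prior")
    rating_current = group.findtext("lc_long_current")
    return {
        "ticker": root.findtext("ticker"),
        "action_code": root.findtext("action/action_code"),
        "outlook": (outlook_prior, outlook_current),
        "outlook_changed": outlook_prior != outlook_current,
        "rating_changed": rating_prior != rating_current,
    }


summary = summarize_rating(SAMPLE_RATING)
print(summary)
```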

Page 30: Thomson reuters machine_readable_news_v3

CREDIT RATINGS – S&P ACTION CODES

Action Code - Description
AFFIRMED - Affirmed
CWUPD - CreditWatch Update
DOWN - Downgraded
NEWRAT - New Rating
OLDEV - Outlook: Developing
OLNEG - Outlook: Negative
OLPOS - Outlook: Positive
OLST - Outlook: Stable
ONCWDEV - On CreditWatch: Developing
ONCWNEG - On CreditWatch: Negative
ONCWPOS - On CreditWatch: Positive
PRELIMRAT - Preliminary Rating
REMCW - Removed From CreditWatch
SUSP - Suspended
UP - Upgraded
WITHDRAWN - Withdrawn
NEWCW - New CreditWatch
ONCW - On CreditWatch
NEWOL - New Outlook
ONOL - On Outlook

Page 31: Thomson reuters machine_readable_news_v3

THOMSON REUTERS NEWS ANALYTICS

• Linguistics system scores text across three dimensions
  – Sentiment (Author tone – positive, neutral, negative)
  – Relevance (Is it substantively about the company?)
  – Novelty (How unique is the article?)

• Sentiment: Assigns sentiment scores to different words/phrases
  – Put into context by part of speech, surrounding words, proximity of words to one another, and other sophisticated linguistic cues
  – Scores combined to determine prevailing sentiment for a given entity within article – entity-level scoring gives a more complete picture

• Process emerged in PR/marketing industry to track media reputation
  – Human scores ~6-10 articles per hour
  – Limited scope of media outlets and number of articles

• Customer feedback and internal studies indicate alpha-bearing signal across all trading frequencies

More on NLP techniques

Powered by:

an Infonic company

Page 32: Thomson reuters machine_readable_news_v3

THOMSON REUTERS NEWS ANALYTICS

• Many “text analysis” systems for finance measure the volume of news only. Volume is an indicator, but not the sole one.

• There are 2 types of sentiment analysis:
  – Market sentiment – measures numerical values (e.g. EPS) or keywords only (e.g. “outperform”)
  – Author sentiment – Thomson Reuters News Analytics measures the author tone of the whole text

Example text snippets shown on the slide:
• “…Vodafone upgraded to outperform…”
• “…EPS was higher than expected…”
• “…but sales growth was disappointing…”
• “…and litigation increased…”
• “…EPS of 20p/share beat analyst consensus of 15p/share…”

Page 33: Thomson reuters machine_readable_news_v3

THOMSON REUTERS NEWS ANALYTICS: A HYBRID STATISTICAL/LINGUISTIC SYSTEM

Example sentence: “BP gave analysts a negative surprise”

1. Statistical: how many times “negative” appears

2. Linguistic Syntactic: which verb acts on which object?

3. Linguistic Semantic: “results were not very good”

[Parse diagram of the example sentence: nodes BP, give, analysts, negative, surprise; role labels AGENT, RECIPIENT, INSTRUMENT, MODIFIER]

Page 34: Thomson reuters machine_readable_news_v3


ANALYSIS OF A SINGLE DOCUMENT QUICKLY BECOMES COMPLICATED

Page 35: Thomson reuters machine_readable_news_v3


THOMSON REUTERS NEWS ANALYTICS EQUITIES SAMPLE OUTPUT

Relevance: 0 - 1.0

Prevailing Sentiment: 1, 0, -1

Positive, Neutral, Negative: Probabilities which sum to 1.00, providing more granular sentiment

Novelty represented by Linked Counts: 12h, 24h; 3d, 5d, 7d

Item Type: Alert, Article, Updates, Corrections

Headline: Alert or Headline text

Topic Codes: What the story is about; RCH=Research; RES=Results; RESF=Results Forecast; MRG=Merger & Acquisitions . . .

Other metadata: Index IDs, Linked references, Story Chains (41 total)

Page 36: Thomson reuters machine_readable_news_v3


THOMSON REUTERS NEWS ANALYTICS COMMODITIES & ENERGY SAMPLE OUTPUT

Page 37: Thomson reuters machine_readable_news_v3

THOMSON REUTERS NEWS ANALYTICS – V2.0 KEY NEW FIELDS

Field Name: Explanation
News Source: The publisher/feed provider
Number of Words: The number of words used in the sentiment calculation
Total Words: The total number of words in the item
First Mention: The first sentence in which the scored entity is mentioned
Total Sentences: The total number of sentences in the news item
Item count 1 - 5: The number of items that mention the scored entity in history periods 1 - 5
Novelty / Linked references: Notes the amount of repetition within a feed and across feeds/sources in history periods 1-5 and provides links to the repeated items
Broker Action: Action of a broker: "UPGRADE", "DOWNGRADE", "MAINTAIN", "BROKER", "UNDEFINED"
Market Commentary: Market commentary indicator
Headline Tags: Story type - Interview, Exclusive, BreakingViews, Wrap-up, etc.
Company Count: Number of companies mentioned in item

Page 38: Thomson reuters machine_readable_news_v3

THOMSON REUTERS NEWS ANALYTICS

• Visualizations
  – Spotfire showcasing IBM & Sony sentiment
  – Spotfire showcasing aggregate monthly sentiment
  – Panopticon scatter plots
  – Calais document viewer
  – News story filtering (Ford example)

Page 39: Thomson reuters machine_readable_news_v3

IMPACT OF DAILY NEWS SENTIMENT ON IBM PRICE

• Daily News & Price data in the same view (Jan-June 2007)
• Daily Net Positive Sentiment [orange]: daily sum of each item's Relevance*(Positive - Negative Sentiment)
• Average Daily Price [blue]
• Y-axis normalised to go from 0-100% of the respective values

• Event above shows direct correlation between dip in News Sentiment and Price on a single day
• Series of Events above show close correlation between upturns in News Sentiment and Price over a sustained period of a few days (multiple short-term signals lead to longer-term movement)

Chart callouts:
• Dip in net positive sentiment and price
• Rise & fall in net positive sentiment lead to similar movements in price
• Upturns in Net Positive Sentiment correlate to upward price momentum over period of a few days
• What happened here to drive the price down at the end of February?
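The Daily Net Positive Sentiment definition, the daily sum of each item's Relevance*(Positive - Negative Sentiment), translates directly into code. The sketch below assumes each news item is available as a `(day, relevance, positive, negative)` tuple; that tuple layout and the function name are hypothetical, and the sample values are made up.

```python
from collections import defaultdict
from datetime import date


def daily_net_positive_sentiment(items):
    """Sum relevance * (positive - negative) over all items of each day."""
    totals = defaultdict(float)
    for day, relevance, positive, negative in items:
        totals[day] += relevance * (positive - negative)
    return dict(totals)


# Made-up items: (day, relevance, positive probability, negative probability).
items = [
    (date(2007, 2, 26), 1.0, 0.75, 0.25),   # contributes +0.5
    (date(2007, 2, 26), 0.5, 0.25, 0.75),   # contributes -0.25
    (date(2007, 2, 27), 0.5, 0.5, 0.0),     # contributes +0.25
]
scores = daily_net_positive_sentiment(items)
print(scores[date(2007, 2, 26)], scores[date(2007, 2, 27)])  # 0.25 0.25
```

Weighting by relevance means a strongly negative story that only mentions the company in passing moves the daily score less than a mildly negative story that is entirely about it.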

Page 40: Thomson reuters machine_readable_news_v3

EXAMINING DAILY NET POSITIVE SENTIMENT VS PRICE FOR SONY

Conclusions drawn:
• Daily or intra-day sentiment can be a powerful indicator for stock price movements
  – Real-time for very rapid decision-making – market making, high frequency
  – Daily sentiment feeds into the following day's or week’s price movement
  – Multi-day signals for longer-term movements
  – Weight and filter by relevance and novelty

[Chart: daily net positive sentiment vs price for Sony, with numbered callouts 1-10 and markers A and B]

Page 41: Thomson reuters machine_readable_news_v3

IMPACT OF CUMULATIVE NEWS SENTIMENT ON IBM PRICE

• Overall positive correlation between Price and Cumulative Sentiment

• Cumulative Sentiment can be a powerful measure to predict medium- to long-term movements

• Variations:
  – X-day moving averages
  – Relevance filtering and weighting
  – De-duplication
  – Multiple content sources

• Same downturn as seen previously, but visually a contrarian signal. Why?
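Cumulative sentiment and the X-day moving average variation can both be computed from a series of daily sentiment scores. This is a minimal sketch under the assumption that the input is a plain list with one score per trading day; the function names and the sample values are illustrative only.

```python
from itertools import accumulate


def cumulative_sentiment(daily_scores):
    """Running total of the daily sentiment scores."""
    return list(accumulate(daily_scores))


def moving_average(daily_scores, window):
    """Simple X-day moving average; the first window-1 days yield no value."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(daily_scores[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(daily_scores))
    ]


daily = [0.5, -0.25, 0.75, 0.0]   # made-up daily net sentiment scores
print(cumulative_sentiment(daily))  # [0.5, 0.25, 1.0, 1.0]
print(moving_average(daily, 2))     # [0.125, 0.25, 0.375]
```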

Page 42: Thomson reuters machine_readable_news_v3

BENCHMARK PRICE TREND – IBM VS XLK

• Drop in IBM’s share price between 2/21/07 and 3/5/07
• Corresponding drop in IBM's benchmark, the Technology Select Sector SPDR (XLK)
• Broader market factors were influencing the price during this time

Page 43: Thomson reuters machine_readable_news_v3

PRICE TREND AND INDIVIDUAL NEWS

1. Can News explain this downturn in Price?

2. Highlight this significant cluster of negative news stories which are only slightly relevant to IBM

3. News is related to general worries on the China economy in February 2007

Human decision support:

• Analyze price movements

• Drill into news stories by type, source, sentiment, relevance, topic, other criteria

Page 44: Thomson reuters machine_readable_news_v3

NEWS VOLUME AND TRADING VOLUME

General observation: spikes in quantity of news (tall bars in the top view) coincide with spikes in trading volume (third pane), especially when the news is negative (second pane).

Page 45: Thomson reuters machine_readable_news_v3

MONTHLY NET SENTIMENT (NUMBER OF POSITIVES – NUMBER OF NEGATIVES)

[Chart series: S&P500; XLE (Energy) Monthly Net Sentiment. Callouts: “September, 2008 financial collapse”; “But overall sentiment turns in October, 2007”]

Page 46: Thomson reuters machine_readable_news_v3


Panopticon-based visualizations; Weekly price change (Y axis) vs average weekly sentiment (x-axis) ; Aggregate Industry view

Page 47: Thomson reuters machine_readable_news_v3


Panopticon-based visualizations; Weekly price change (Y axis) vs average weekly sentiment (x-axis) ; Financial industry drill-down

Page 48: Thomson reuters machine_readable_news_v3


Panopticon-based visualizations; Weekly price change (Y axis) vs average weekly sentiment (x-axis) ; Company view with sector color

Page 49: Thomson reuters machine_readable_news_v3


SPEED READING WITH CALAIS DOCUMENT VIEWER

Page 50: Thomson reuters machine_readable_news_v3


NEWS VIEWER FILTER

Page 51: Thomson reuters machine_readable_news_v3


188 STORIES ON FORD ON 11/19/2008

Page 52: Thomson reuters machine_readable_news_v3


FILTERED BY RELEVANCE; 1.0 FILTER CUTS STORY COUNT TO 74

Page 53: Thomson reuters machine_readable_news_v3


FILTERED BY LINKED COUNT; ELIMINATE DUPLICATES -- CUTS STORY COUNT TO 44

Page 54: Thomson reuters machine_readable_news_v3


SORT BY SENTIMENT: RED=NEGATIVE; CLR=NEUTRAL; GREEN=POSITIVE
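The Ford walkthrough (filter by relevance, drop items with linked duplicates, then sort by sentiment) can be sketched as a small pipeline. Everything named here is hypothetical: the `Story` class, its field names, and the thresholds are stand-ins for whatever the real data columns are called, and the four sample stories are invented.

```python
from dataclasses import dataclass


@dataclass
class Story:
    headline: str
    relevance: float    # 0 - 1.0
    linked_count: int   # earlier similar items in the chosen history window
    sentiment: int      # prevailing sentiment: 1, 0, or -1


def filter_stories(stories, min_relevance=1.0, max_linked_count=0):
    """Keep highly relevant, non-duplicate stories, most negative first."""
    kept = [
        s for s in stories
        if s.relevance >= min_relevance and s.linked_count <= max_linked_count
    ]
    # sorted() is stable, so stories with equal sentiment keep feed order.
    return sorted(kept, key=lambda s: s.sentiment)


stories = [
    Story("Upbeat outlook", 1.0, 0, 1),
    Story("Passing mention", 0.4, 0, -1),    # dropped: low relevance
    Story("Repeat of earlier item", 1.0, 3, 0),  # dropped: duplicate
    Story("Plant closure", 1.0, 0, -1),
]
for story in filter_stories(stories):
    print(story.sentiment, story.headline)
```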

Page 55: Thomson reuters machine_readable_news_v3

THOMSON REUTERS NEWS ANALYTICS SAMPLE DATA

• Available on 12,500+ companies globally (20K in May, 30K available in summer)

• 39 Commodities and Energy topics

• Guide to sample data and system overview

• Metadata keys for topic codes and list of RICs

• Tab delimited TXT files - each row corresponds to a news item which mentions a company in a material way.

• Each line item is millisecond time stamped corresponding to the NewsScope Archive (synchronized with RDTH)

• Clients can choose one year of content (2003-2010) for a no-charge, 30 day trial

Page 56: Thomson reuters machine_readable_news_v3

THOMSON REUTERS NEWS ANALYTICS EQUITIES COVERAGE

Equities: (Summer, 2010)
• All equities: 34,037 (100%)
• Active companies: 32,719 (96.1%)
• Inactive companies: 1,318 (3.9%)

Equity coverage by region
• Americas: 14,785
• APAC: 11,055
• EMEA: 8,197

Page 57: Thomson reuters machine_readable_news_v3

THOMSON REUTERS NEWS ANALYTICS REGULATORY AND COMPLIANCE USES

• Filter by topic codes – Mergers, EPS, Management changes, etc.
• Filter highly negative or positive items
• Filter out market commentary
• Filter out low relevance items
• Filter out duplicates
• Combine with other TR data sets
  – Conference call transcripts
  – Broker research
  – Fundamentals & Estimates – Earnings surprises
  – Insider trading
  – Deals
  – MRN Credit Ratings

Page 58: Thomson reuters machine_readable_news_v3

COMMODITIES AND ENERGY TOPIC CODES

BIOF - Biofuels

COC - Cocoa

COF - Coffee

COR - Corn

COT - Cotton and silk

GMO - Genetically Modified Organisms

GOL - Gold and Precious Metals

GRA - Grains

LIV - Livestock

MEAL - Meals and feeds

OILS - Oils

ORJ - Orange juice

RUB - Rubber

SUG - Sugar

TEA - Tea

URAN - Uranium

WOO - Wool

BUN - Bunkers

CO2 - Emissions

COA - Coal

CRU - Crude oil

PROD - Refined oil products

OPEC - OPEC

MET - Non-Ferrous Metals

TIM - Forestry & Timber

NGS - Natural Gas

NSEA - North Sea Oil

JET - Jet Fuel

LPG - Liquefied Petroleum Gas

LNG - Liquefied Natural Gas

RFO - Residual fuel oil

HOIL - Heating Oil

MOG - Gasoline

NAP - Naphtha

SOY - Soybeans

DIAM - Diamonds

STL - Iron & Steel

PLAS - Plastics

CHE - Chemicals

Page 59: Thomson reuters machine_readable_news_v3

THOMSON REUTERS NEWS ANALYTICS DELIVERY OPTIONS

• Hosted in RHS (Elektron-Summer, 2010 w/VPN delivery)

• Deployed at customer site (connecting to TRMDS)
  – Customer specific configurations
  – Customer content or additional feeds

• Daily updates via TRQA (FasTick API)

• Historical files for testing via customer zone

Click here for TRMDS architecture

Or here for competitive differentiators

Page 60: Thomson reuters machine_readable_news_v3

THOMSON REUTERS NEWS ANALYTICS COMPETITIVE DIFFERENTIATORS

• Market-leading news
  – Only system able to provide real-time professional-grade news content from Reuters
    • 2800+ reporters
    • 2.5M unique stories/yr
    • 900,000 alerts/yr
    • 200 bureaus worldwide
  – Supplemented by licensed third parties

• Market-leading sentiment and Natural Language Processing capabilities
  – Lexalytics/Infonic have more than a decade of experience
  – Tuned system for financial services

• Extensive and powerful metadata across 80+ fields

Page 61: Thomson reuters machine_readable_news_v3

THOMSON REUTERS NEWS ANALYTICS COMPETITIVE DIFFERENTIATORS

• Entity-level news analytics
  – Competitor scores the article
    • Less accurate for given entities within article
    • Problems with comparisons and assigning sentiment to proper entity
    • Problems with casual references

• Granular sentiment indicators
  – Competitor claims 100-point rating system, but only 0, 50, 100 for overall article
  – TRNA has range from 0-1.0 for more precise measurements and better signals for individual entities

• Pure sentiment – no calibration on expected stock price reactions
  – More consistent signal through bullish and bearish market turns
  – Customer calibrating its model on a calibrated measurement can be problematic

• Commodities and Energy module
  – Calibrated for C&E nuances
  – “Increases in supply” sounds good, but for commodities, it’s not

Page 62: Thomson reuters machine_readable_news_v3

THOMSON REUTERS NEWS ANALYTICS COMPETITIVE DIFFERENTIATORS

• More powerful novelty indicators
  – Not only tells you it’s repetitive, but how repetitive and over which time periods
    • Default = 12 hours, 24 hours, 3 days, 5 days, 7 days
    • Configurable by client in deployed environments
  – Links back to original repetitive items

• Sentence-level entity location indicator
  – Tells you in which sentence the item first appeared.
  – Paragraphs are problematic due to varying size of articles.

• Includes headlines for better understanding

• Numerous signals to detect market commentary

• Broker research flags

• Headline tags to weight articles differently

Page 63: Thomson reuters machine_readable_news_v3

THOMSON REUTERS NEWS ANALYTICS COMPETITIVE DIFFERENTIATORS

• Flexible delivery options
  – Real-time = Hosted and deployed
    • Hosting simplifies infrastructure
    • Deployed facilitates customization and addition of customer sources
  – TR Quantitative Analytics
    • Full integration into Market QA / FasTick for easier research and analysis
    • Nightly updates
  – Historical data for testing in tab-delimited file
    • Monthly updates dating back to 2003 on all companies from over 40 sources

• Fault tolerant, fully resilient systems built on TR APIs
  – Built for zero down-time
  – TRMDS installed at over 2600 sites worldwide
  – APIs power over 50,000 applications worldwide

Page 64: Thomson reuters machine_readable_news_v3

THOMSON REUTERS NEWS ANALYTICS TECHNICAL ENVIRONMENT

Page 65: Thomson reuters machine_readable_news_v3

EVENT INDICES

• Collaboration with AlphaSimplex – quantitative research firm

• Produces analysis of the types of news reported
  – 45 indices (macroeconomic, political, violence, bullish, bearish, natural disasters, central banking, etc.)
  – Adjusted for seasonality – What amount of news is abnormal for that period of time?
  – Analysis shows predictive indicator to FX and stock volatility
  – Applicable to other asset classes – next phase of research to focus more on equities
    • Violence Index – defense stocks?
    • Natural Disaster Index – Insurance stocks?
    • Bullish/Bearish – broader market signals?



EVENT INDICES INDEX FRAMEWORK

• Analysis shows:
– Strong seasonalities are present (e.g., 10:00am EST ≠ 3:00am EST)
– Different types of news behave differently (e.g., bullish ≠ bearish)
– Importance of news changes over time (e.g., “subprime” today vs. 2006; Bernanke today vs. Greenspan)
– Relevance depends on applications (e.g., trading vs. risk management)

• Framework:
– Construct broadest possible set of base indexes: 45 initially
– 16 sample indexes provided, focused on currencies
– Allow users to construct customized indexes using base indexes
– Applicable across other asset classes and processes
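One way a customized index could be built from base indexes is a weighted average, sketched below. The slides do not say how customized indexes are actually combined, so the weighting scheme, the function name `custom_index` and the sample index names and scores are all assumptions for illustration.

```python
def custom_index(base_scores, weights):
    """Weighted average of base index scores (each on a 0-100 scale).

    Assumption: a plain weighted average; the deck does not specify
    the real combination rule.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(base_scores[name] * w for name, w in weights.items()) / total


# Hypothetical base index readings at one point in time.
base = {"macroeconomic": 80.0, "central_banking": 60.0, "violence": 20.0}
fx_risk = custom_index(
    base, {"macroeconomic": 2.0, "central_banking": 1.0, "violence": 1.0}
)
```

Note that an average of percentile scores is not itself a percentile, so a combined series like this would likely need to be re-ranked against its own history before applying a threshold.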


EVENT INDICES CONSTRUCTING AN INDEX

• What does an index look like?
– A real number between 0 and 100, updated as often as once per second
– Represents percentile of news relative to “comparable” periods
– Example: a Violence Index of 90 implies that the volume of violence news exceeds the current level only 10% of the time during comparable periods

• News index event = a time when the score exceeds 99.5
– High threshold eliminates idiosyncratic noise
– Users can define their own significant index event
– Analyze returns/volatility before and after events
– Aggregate many events by averaging across such events
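The percentile score and the 99.5 event threshold described above can be sketched as follows. This is a simplified illustration, not the published methodology: it assumes the news counts for "comparable" periods (for example, the same time of day on earlier dates) have already been selected, and it treats an event as the first observation on which the score crosses above the threshold. The function names are hypothetical.

```python
from bisect import bisect_left


def index_score(current_count, comparable_counts):
    """Percentile (0-100) of the current news count among comparable periods."""
    if not comparable_counts:
        raise ValueError("need at least one comparable period")
    ordered = sorted(comparable_counts)
    below = bisect_left(ordered, current_count)  # periods with strictly lower counts
    return 100.0 * below / len(ordered)


def index_events(scores, threshold=99.5):
    """Positions where the score crosses from at-or-below to above the threshold."""
    events = []
    above = False
    for i, score in enumerate(scores):
        if score > threshold and not above:
            events.append(i)
        above = score > threshold
    return events


# Hypothetical history of news counts for 1000 comparable periods.
history = list(range(1000))
score_90 = index_score(900, history)
events = index_events([50.0, 99.6, 99.9, 80.0, 99.7])
```

The `threshold` parameter reflects the point that users can define their own significant index event; lowering it yields more events at the cost of more noise.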


EVENT INDICES SAMPLE EMPIRICAL RESULTS

• Macroeconomic: unemployment, retail sales, fed, greenspan, bernanke, inflation, goods, svcs, housing . . .
– 1893 news index events over 4.25 years
– t-stat = 9.7 (p < 0.001)

• Macroeconomic index events are statistically significant
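A t-statistic of this kind could come from an event study comparing volatility before and after each index event. The sketch below is one plausible setup, not the study's documented method: it assumes one return per minute, uses root-mean-square return as a volatility proxy, and computes a paired t-statistic on post-event minus pre-event volatility over 30-minute windows. The data are synthetic and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def realized_vol(returns):
    """Root-mean-square of returns as a simple volatility proxy."""
    return sqrt(sum(r * r for r in returns) / len(returns))


def event_study_t(returns, event_idx, window=30):
    """Paired t-statistic for post-event minus pre-event volatility."""
    diffs = []
    for i in event_idx:
        if i - window < 0 or i + window > len(returns):
            continue  # skip events without a full window on both sides
        pre = realized_vol(returns[i - window:i])
        post = realized_vol(returns[i:i + window])
        diffs.append(post - pre)
    if len(diffs) < 2:
        raise ValueError("need at least two usable events")
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Synthetic one-minute returns: a quiet baseline with larger moves
# in the 30 minutes after each hypothetical event.
returns = [0.001 if k % 2 == 0 else -0.001 for k in range(400)]
events = [50, 150, 250, 350]
for n, i in enumerate(events):
    scale = 0.002 + 0.0001 * n
    for k in range(i, i + 30):
        returns[k] = scale if k % 2 == 0 else -scale

t_stat = event_study_t(returns, events)
```

Averaging the per-event differences corresponds to the "aggregate many events by averaging" step; with real data, overlapping event windows would need separate handling.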

[Figure: Event Study of Macro News Index (EUR). Volatility plotted against event time from -30 mins to +30 mins, with the pre/post-event volatility distribution.]


EVENT INDICES SAMPLE EMPIRICAL RESULTS

• Impact of event indices can change over time (decrease example)

• Bearish in 2004 (t = 5.4, significant)

• Bearish in 2006 (t = 0.11, not significant)

[Figure: two Event Study on EUR Volatility panels, each plotting volatility against event time from -30 min to +30 min, with the pre/post-event distribution.]


EVENT INDICES SAMPLE EMPIRICAL RESULTS

• Bearish Index becomes significant again in 2007 (esp. Q3 and Q4)

• Bearish in 2007 (t = 5.5, significant)

• Bearish in Q3, Q4 of 2007 (t = 7.85, significant)

[Figure: two Event Study on EUR Volatility panels, each plotting volatility against event time from -30 min to +30 min, with the pre/post-event distribution.]


EVENT INDICES SAMPLE EMPIRICAL RESULTS

• Impact of event indices can change over time (increase example)

• Livestock in 2003 (t = 2.4, not significant)

• Livestock in 2004 (t = 12.91, significant, also for 2005, 2006)

[Figure: two Event Study on EUR Volatility panels, each plotting volatility against event time from -30 min to +30 min, with the pre/post-event distribution.]


EVENT INDICES SAMPLE EMPIRICAL RESULTS

• News indexes also have significance for equities

• Stock Topics (t = 7.6, significant)

• Political Topics (t = 7.9, significant)

[Figure: two Event Study on S&P 500 Volatility panels, each plotting volatility against event time from -30 min to +30 min, with the pre/post-event distribution.]


EVENT INDICES

• Research Whitepaper: describes detailed methodology

• Sample historical data (from Jan 2003) for back-testing

• White label opportunities for brokers and risk management platforms


Questions?