Post on 05-Jan-2016
…optimise your IT investments
Warehousing for low latency analytics
Philip Howard
Research Director – Bloor Research
telling the Information Management storyConfidential © Bloor Research 2009 telling the right storyConfidential © Bloor Research 2010
Agenda
What are low latency analytics?
The inverse quantity/latency relationship problem
What types of application need them?
What is required in query and analytic terms?
What’s special about low latency analytics?
What database structures and features will be useful?
What is low latency?
A spectrum of query requirements, typically ranging up to a few minutes in duration
The quantity/latency issue
Typically:
Large volumes to be ingested in (close to) real-time
Often involve large quantities of historic data
Analytics are often complex
Analytics are often ad hoc
High performance requirement
photo by Amnemona
Low latency applications
Social media, social networking, web comparison sites, recommendation engines …
Primarily about understanding customers and influencers
Latency varies depending on requirements
Low latency applications
On-line gaming: casinos and video gaming
Typically about up/cross-selling but also fraud
The latter tends to have lower latency requirements
Low latency applications
Telecommunications
Various applications including traffic analysis, mobile advertising (especially location-specific analytics), re-pricing and fraud
Can be very low latency, e.g. mobile advertising
Low latency applications
Real-time log and event management
Particularly important (and very low latency) for monitoring and responding to security threats such as cyber attacks
Low latency applications
Real-time web analytics
Important to support on-line marketing campaigns and dropped shopping cart re-marketing
Typically a few minutes' latency is fine
Low latency applications
Others include:
Fraud prevention
Capital markets
Network monitoring
Sensor-based applications
…
Query and analytic requirements
Often require complex analytics with:
Large table scans
Multi-way joins
Correlated sub-queries
…
Often cannot be predicted in advance
Even in monitoring environments may require scoring against a model
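As an illustration of the kind of query listed on this slide, the sketch below runs a join combined with a correlated sub-query against an in-memory SQLite database. The `customers` and `orders` tables, their columns and the sample rows are hypothetical and exist only for this example; they are not taken from the presentation.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate the query shape.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
INSERT INTO orders VALUES
    (1, 1, 10.0), (2, 1, 50.0),
    (3, 2, 20.0), (4, 2, 20.0), (5, 2, 80.0);
""")

# The inner SELECT is correlated: it is re-evaluated for each outer row,
# because it refers to o.customer_id from the outer query.
rows = conn.execute("""
SELECT c.name, o.amount
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
WHERE o.amount > (
    SELECT AVG(o2.amount)
    FROM orders AS o2
    WHERE o2.customer_id = o.customer_id
)
ORDER BY c.name
""").fetchall()

print(rows)  # orders that exceed that customer's own average order value
```

Without an index or summary table on `orders`, each evaluation of the inner query is potentially a scan, which is why queries of this shape are demanding at low latency.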
What’s special about low latency analytics?
Not just about (real-time) load speeds and ingestion rates
Not just about query performance
Also about what’s in-between
Having data available in memory before it is stored on disk
Standardising the data – may be multiple formats
Writing to disk
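The three in-between steps above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not a description of any particular product: the `standardise` function, the `HotBuffer` class, the two input formats and the field names `user` and `amount` are all hypothetical, and a plain list stands in for disk storage.

```python
import csv
import io
import json


def standardise(raw: str, fmt: str) -> dict:
    """Convert one raw event (hypothetical 'json' or 'csv' format) to a common shape."""
    if fmt == "json":
        rec = json.loads(raw)
        return {"user": str(rec["user"]), "amount": float(rec["amount"])}
    if fmt == "csv":
        user, amount = next(csv.reader(io.StringIO(raw)))
        return {"user": user.strip(), "amount": float(amount)}
    raise ValueError(f"unknown format: {fmt}")


class HotBuffer:
    """Keeps newly ingested rows queryable in memory before they are written out."""

    def __init__(self, flush_size: int):
        self.flush_size = flush_size
        self.pending = []  # standardised rows held in memory, not yet stored
        self.stored = []   # stands in for rows already written to disk

    def ingest(self, raw: str, fmt: str) -> None:
        self.pending.append(standardise(raw, fmt))
        if len(self.pending) >= self.flush_size:
            self.flush()

    def flush(self) -> None:
        # "Writing to disk": move the in-memory rows to the stored set.
        self.stored.extend(self.pending)
        self.pending.clear()

    def total_for(self, user: str) -> float:
        # Queries see stored and not-yet-stored rows alike.
        return sum(r["amount"] for r in self.stored + self.pending if r["user"] == user)


buf = HotBuffer(flush_size=3)
buf.ingest('{"user": "alice", "amount": 10}', "json")
buf.ingest("alice, 2.5", "csv")
total_before_flush = buf.total_for("alice")  # answered entirely from memory
```

The point of the sketch is that a query issued between ingestion and the write to disk still sees the new rows, so the latency of the write does not add to the latency of the answer.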
Database features
High performing and scalable loading
Trickle feed, micro-batch or CDC
Agility (data sources)
No indexes, summary tables etc.
Avoid whole table scans
Performance speed-up
In-memory capabilities
High availability
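To make the loading options above concrete, here is a minimal sketch of micro-batch loading: events are grouped into small fixed-size batches and each batch is written in one operation. The function names `micro_batches` and `load`, and the use of a plain list as the sink, are assumptions for illustration only. Setting `batch_size=1` approximates a trickle feed; CDC is not shown.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def micro_batches(events: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive batches of at most batch_size events."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(events)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def load(events: Iterable[dict], batch_size: int, sink: List[dict]) -> int:
    """Write events to the sink one batch at a time; return the number of batches."""
    batches = 0
    for batch in micro_batches(events, batch_size):
        sink.extend(batch)  # one write per batch rather than one per event
        batches += 1
    return batches


events = [{"id": i} for i in range(7)]
sink: List[dict] = []
batch_count = load(events, batch_size=3, sink=sink)
```

A production loader would usually also flush on a timer so that a partially filled batch does not wait indefinitely; that is omitted here to keep the sketch short.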
Conclusion
Analytic warehouses and marts for low latency applications have all the same requirements as for any other analytic environment
But they also require not just high ingestion rates and fast query performance, but high performance in the intermediate step(s) between ingestion and reading data off disk