Axibase Time Series Database

Upload: heinrichvk

Post on 03-Aug-2015

Axibase Time Series Database

Prepared by Axibase

Axibase Time-Series Database (ATSD) is a clustered, non-relational database that stores data produced by IT infrastructure. ATSD is specifically designed to store and analyze large volumes of statistical data collected at high frequency.

Database History

• 1970 – E. F. Codd at IBM introduced the relational model for data processing.

• A Cambrian explosion of relational database management systems followed.

• 2000 – the first large-scale applications emerge, such as Google Search.

• 2004 – Google Bigtable – the first non-relational database built on a distributed file system.

• Currently we are experiencing a Cambrian explosion of non-relational (a.k.a. NoSQL) databases.

Key Differences Between SQL and NoSQL

SQL NoSQL

High-level Programming Language SQL

Transactions

Query Optimizer

Non-key indexes

Key Differences Between SQL and NoSQL

Feature | SQL | NoSQL

Scalability | TB | PB

Maximum Cluster Size | 48 (Oracle RAC) | 1000+

Distributed | |

Read Time | Depends on table size and indexes | Linear

Write Time | Depends on table size and indexes | Linear

Table Schema (column names, data types) | Predetermined | Raw bytes. Schema determined by application

How Proven Is NoSQL Technology?

NoSQL is the leading technology behind big data applications.

• Google – search, Gmail, App Engine

• Yahoo/Microsoft – search

• Amazon – e-commerce, search, cloud computing (AWS DynamoDB)

• IBM BigInsights, Microsoft Azure HDInsight

Big Data Adoption

HBase behind Facebook Messages:

• 6+ billion messages per day

• 75+ billion R/W operations per day

• Peak throughput: 1.5 million R/W operations per second

• 2+ petabytes of data (6+ PB including replicas) with data growth of over 8 TB per day

Big Data Adoption

IBM BigInsights behind Vestas:

• A wind energy company in Denmark is reducing the time to analyze petabytes of data from several weeks to 15 minutes to improve the accuracy of wind turbine placement.

• Stores 2.8 PB of company historical data together with over 178 external parameters: temperature, barometric pressure, humidity, precipitation, wind direction, wind velocity, etc.

• Stores precise data on weather over the past 11 years.

• Collects data from over 35,000 meteorological stations.

Big Data Adoption

HBase behind Explorys:

• Explorys uses HBase to enable search and analysis of patient populations, treatment protocols, and clinical outcomes.

• Stores over 275 billion clinical, financial, and operational data elements.

• Holds 48 million unique patient files.

• Collects data from over 340 hospitals and 300,000 healthcare providers.

• Pulls data from 22 integrated major healthcare systems.

Axibase Time Series Database

Scalability & Speed

• Collects billions of samples per day. Retains detailed data forever.

Features

• Combines database, rule engine, and visualization in one product.

Analytical Rule Engine

• Applies aggregate functions and filters on streaming data.

Integration

• Accepts data from any source based on industry-standard protocols.

Visualization

• Built-in portals with smart widgets.

Big Data for IT Monitoring

• Retain detailed data forever.

• Collect statistics at high frequency, for example every 15 seconds.

• Consolidate performance statistics from all systems into one database: facilities, network, storage, servers, applications, databases, transactions, service providers, user activity, etc.

• Monitor infrastructure based on abnormal deviations instead of manual thresholds.

• Apply statistical formulas to predict outages.

• Take advantage of schema-less database to collect data from any source.

Big Data for Developers

• Support for annotation-style instrumentation.

• Alternative to byte-code instrumentation and file logging.

• Collect detailed performance and usage statistics for reporting and analytics, without writing custom monitors.
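To illustrate what annotation-style instrumentation can look like, here is a minimal Python sketch using a decorator. It is not the ATSD client API: the `monitored` decorator, the `DURATIONS_MS` in-process store, and the metric name are all hypothetical, and a real integration would forward the measurements to the database instead of keeping them in memory.

```python
import functools
import time
from collections import defaultdict

# Hypothetical in-process store: metric name -> list of call durations (ms).
DURATIONS_MS = defaultdict(list)


def monitored(metric):
    """Record the wall-clock duration of every call under `metric`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on both normal return and exception.
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                DURATIONS_MS[metric].append(elapsed_ms)
        return wrapper
    return decorator


@monitored("orders.lookup.duration")  # hypothetical metric name
def lookup_order(order_id):
    return {"id": order_id, "status": "shipped"}


lookup_order(1)
lookup_order(2)
```

The business function stays free of timing code; adding or removing the decorator is the only change needed to start or stop collecting statistics for it.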

Big Data for Operations

• Gather and analyze statistical data generated by various systems and sensors.

• Provide analytics that can support decision control systems.

• Enable better real-time operational decision support.

• Generate accurate forecasts of upcoming issues:

  • Delays

  • Scheduled maintenance based on product usage and sensor data instead of warranty periods

• Improve customer service times and standards.

ATSD Architecture

• ATSD architecture combines database, analytics and reporting tools into one complete product.

• Data locality makes analytics run faster.

• Application server layer is simplified to provide core shared services.

ATSD Components

• A pluggable driver provides support for different storage engines.

• Compute, persistence, and data collection layers are scaled independently.

Fault Tolerance

• ATSD is a distributed system with high fault tolerance.

• Each data sample is automatically replicated 3 times for recovery.

ATSD Scalability

• ATSD is a distributed, non-relational database with high throughput, high fault tolerance, and fast reads.

• ATSD can collect billions of metrics per day and store petabytes of data.

• ATSD supports millisecond resolution and sampling rates of up to several measurements per second. The data is stored without losing accuracy.

• Additional nodes can be added at runtime to handle increasing volumes. ATSD automatically distributes the table across active nodes.

• New nodes can be added in remote data centers to minimize network traffic.

Supported Data Types

• Two types of data ingestion: push and pull.

• ATSD supports numeric values, log messages, and properties (collections of key-value pairs).

• ATSD uses collectors to retrieve structured and unstructured data from remote sources.

• Support for standard protocols and formats: Telnet, ICMP, CSV/TSV, FILE, JMX, HTTP, and JSON.

Data Collection

• Collection is agentless; data is pushed by external systems into ATSD.

• New metrics are auto-registered. No need to update schema or restart any server components.

• Existing monitoring tools can be instrumented to stream data into ATSD.

• Each data sample can be tagged (key = value) at source for subsequent querying, aggregations, and roll-ups.
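The tagging idea can be sketched in a few lines of Python. This is an illustration only, not ATSD's data model or API: the `Sample` class, the `rollup_avg` helper, and the entity, metric, and tag names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Sample:
    """One measurement with free-form key = value tags (hypothetical model)."""
    entity: str
    metric: str
    value: float
    tags: dict = field(default_factory=dict)


def rollup_avg(samples, metric, tag_key):
    """Average the values of one metric, grouped by the value of one tag."""
    groups = defaultdict(list)
    for s in samples:
        if s.metric == metric and tag_key in s.tags:
            groups[s.tags[tag_key]].append(s.value)
    return {tag: sum(vals) / len(vals) for tag, vals in groups.items()}


samples = [
    Sample("host1", "cpu_busy", 40.0, {"location": "NYC", "env": "prod"}),
    Sample("host2", "cpu_busy", 60.0, {"location": "NYC", "env": "prod"}),
    Sample("host3", "cpu_busy", 90.0, {"location": "SFO", "env": "test"}),
]

# Roll up CPU usage by data center without any predefined schema.
by_location = rollup_avg(samples, "cpu_busy", "location")
```

Because the tags travel with each sample, a new grouping dimension (for example `env`) can be queried later without changing how the data was stored.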

Data Storage

• Built-in data compression provides 70%-80% disk space savings over raw data.

• No data needs to be deleted. Seek time is almost linear regardless of the dataset size.

• Data storage is sparse and efficient. ATSD stores only what is collected instead of long rows with NULLs or zeros, as is the case in the relational model.

• VMware VMFS-attached disks are sufficient for small to medium clusters.

• Direct attached disks with JBOD are recommended for larger clusters.

• JBOD alternatives to minimize node recovery time are available from leading storage vendors, such as NetApp E-Series.
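A rough disk-sizing estimate can be derived from the figures quoted in this deck. The sketch below is back-of-the-envelope arithmetic under stated assumptions, not an official sizing formula: the 16 bytes per raw sample is a hypothetical figure, the 75% saving is the midpoint of the 70%-80% range above, and the factor of 3 assumes that "replicated 3 times" means three stored copies.

```python
def daily_disk_bytes(samples_per_day, raw_bytes_per_sample,
                     compression_saving, replicas):
    """Estimate bytes written to disk per day across the whole cluster."""
    raw = samples_per_day * raw_bytes_per_sample
    compressed = raw * (1.0 - compression_saving)
    return compressed * replicas


# Assumptions: 1 billion samples/day, 16 bytes/sample (hypothetical),
# 75% compression saving, 3 copies of each sample.
estimate = daily_disk_bytes(1_000_000_000, 16, 0.75, 3)
estimate_gb = estimate / 1e9  # decimal gigabytes
```

Under these assumptions, one billion samples per day adds about 12 GB of disk usage per day cluster-wide; the result scales linearly with each input.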

Built-in Instruments

Unlike conventional data warehouses, ATSD comes with a set of built-in tools for data analysis:

• Analytical Rule Engine

• Forecasting

• Visualization

Analytical Rule Engine

• Evaluates incoming data in memory based on statistical rules.

• Statistical rules are applied to the incoming data stream before data is stored on disk.

• As data is ingested by the ATSD server, a subset of samples that match rule queries is routed to the rule engine for processing.

• The rule engine supports both time-based and count-based data windows.

• Rule expressions and filters can reference not just numeric values but also tags such as system type, location, and priority to ensure that alerts are raised only for critical issues.

• Multiple metrics and entities can be correlated within the same rule.
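The difference between count-based and time-based windows can be sketched as follows. This is a simplified illustration, not the ATSD rule engine: the `CountWindow` and `TimeWindow` classes and the `should_alert` helper are hypothetical, timestamps are plain seconds, and the time window keeps samples strictly newer than `seconds` before the latest sample.

```python
from collections import deque


class CountWindow:
    """Keep only the last `size` values, regardless of when they arrived."""
    def __init__(self, size):
        self.values = deque(maxlen=size)

    def add(self, timestamp, value):
        # The timestamp is ignored; eviction is purely by count.
        self.values.append(value)

    def avg(self):
        return sum(self.values) / len(self.values)


class TimeWindow:
    """Keep only values newer than `seconds` before the latest sample."""
    def __init__(self, seconds):
        self.seconds = seconds
        self.samples = deque()

    def add(self, timestamp, value):
        self.samples.append((timestamp, value))
        # Evict samples that fell out of the (timestamp - seconds, timestamp] range.
        while self.samples and self.samples[0][0] <= timestamp - self.seconds:
            self.samples.popleft()

    def avg(self):
        return sum(v for _, v in self.samples) / len(self.samples)


def should_alert(window, threshold=75.0):
    """Rule of the form avg(value) > threshold, evaluated in memory."""
    return window.avg() > threshold


count_w = CountWindow(3)
time_w = TimeWindow(seconds=60)
stream = [(0, 70.0), (30, 80.0), (60, 90.0), (90, 100.0)]
alerts = []
for ts, value in stream:
    count_w.add(ts, value)
    time_w.add(ts, value)
    alerts.append((ts, should_alert(count_w), should_alert(time_w)))
```

Both windows are updated as each sample arrives, so the rule is evaluated on the stream itself and does not need to read stored data.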

Analytical Rule Engine – Rule Examples

Type | Window | Example | Description

threshold | none | value > 75 | Raise an alert if the last metric value exceeds the threshold

range | none | value > 50 AND value <= 75 | Raise an alert if the value falls within the specified range

statistical-count | count(10) | avg(value) > 75 | Raise an alert if the average value of the last 10 samples exceeds the threshold

statistical-time | time('15 min') | avg(value) > 75 | Raise an alert if the average value for the last 15 minutes exceeds the threshold

statistical-deviation | time('15 min') | avg(value) / avg(value(time: '1 hour')) > 1.25 | Raise an alert if the 15-minute average exceeds the 1-hour average by more than 25%

statistical-ungrouped | time('15 min') | avg(value) > 75 | Raise an alert if the 15-minute average values for all entities in the group exceed the threshold

metric correlation | time('15 min') | avg(value) > 75 AND avg(value(metric: 'loadavg.1m')) > 0.5 | Raise an alert if the average values of two separate metrics for the last 15 minutes exceed predefined thresholds

entity correlation | time('15 min') | avg(value) > 75 AND avg(value(entity: 'host2')) > 75 | Raise an alert if the average values for two entities for the last 15 minutes exceed thresholds

threshold override | time('15 min') | avg(value) >= entity.groupTag('cpu_avg').min() | Raise an alert if the 15-minute average value reaches the minimum threshold specified for the groups to which the entity belongs

cpu forecast deviation | time('5 min') | abs(forecast_deviation(wavg())) > 2 | Raise an alert if the 5-minute weighted average deviates from the forecast by more than two standard deviations

cpu forecast diff | time('10 min') | abs(wavg() - forecast()) > 25 | Raise an alert if the absolute difference between the weighted average and the forecast exceeds the specified value

disk threshold | time('15 min') | new_maximum() && threshold_linear_time(99) < 120 | Raise an alert if the last value is the highest observed and linear extrapolation predicts that the 99% threshold will be reached in less than 120 minutes
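Two of the rule types above, statistical-deviation and the linear part of disk threshold, can be approximated in plain Python. The sketch below is one reading of what these expressions compute and makes no claim about how ATSD implements `threshold_linear_time`; the function names `deviation_alert` and `minutes_until_threshold` are hypothetical, and a least-squares line is assumed for the extrapolation.

```python
def mean(values):
    return sum(values) / len(values)


def deviation_alert(samples, now, short_s=900, long_s=3600, ratio=1.25):
    """samples: list of (timestamp_seconds, value).

    True if the short-window average exceeds the long-window average
    by more than the given ratio (1.25 means 25%).
    """
    short = [v for t, v in samples if now - short_s < t <= now]
    long_ = [v for t, v in samples if now - long_s < t <= now]
    if not short or not long_:
        return False
    long_avg = mean(long_)
    return long_avg != 0 and mean(short) / long_avg > ratio


def minutes_until_threshold(samples, threshold):
    """samples: list of (timestamp_minutes, value).

    Fit a least-squares line and return the minutes from the last sample
    until the line reaches `threshold`; None if the trend is flat or falling.
    """
    ts = [t for t, _ in samples]
    vs = [v for _, v in samples]
    t_mean, v_mean = mean(ts), mean(vs)
    denom = sum((t - t_mean) ** 2 for t in ts)
    if denom == 0:
        return None
    slope = sum((t - t_mean) * (v - v_mean) for t, v in samples) / denom
    if slope <= 0:
        return None
    intercept = v_mean - slope * t_mean
    t_cross = (threshold - intercept) / slope
    return max(0.0, t_cross - ts[-1])


# One sample every 5 minutes for an hour; load doubles in the last 15 minutes.
hour = [(t, 80.0 if t > 2700 else 40.0) for t in range(300, 3601, 300)]
spike = deviation_alert(hour, now=3600)

# Disk usage in percent, growing 0.2 points per minute.
disk = [(0, 90.0), (5, 91.0), (10, 92.0), (15, 93.0)]
eta = minutes_until_threshold(disk, 99.0)
```

In the first example the 15-minute average is 80 against a 1-hour average of 50, a ratio of 1.6, so the rule fires. In the second, the fitted line reaches 99% about 30 minutes after the last sample, which would satisfy a condition such as `< 120`.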

Forecasting

• Customers have a growing need to predict problems before they occur. The accuracy of predictions and the percentage of false positives and false negatives depend heavily on the frequency of data collection, the retention interval, and the algorithms used.

• The built-in time-series extrapolation algorithms in ATSD (Holt-Winters, ARIMA, etc.) make it possible to predict system failures at an early stage.

• The forecasting process is resource intensive and is most effective in a clustered system with data locality, such as ATSD.

• Dynamic predictions eliminate the need to set manual thresholds.
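As a rough illustration of this family of algorithms, the sketch below implements Holt's linear-trend method (double exponential smoothing). It is deliberately simpler than the Holt-Winters method named above because it has no seasonal component, and it is not ATSD's implementation; the function name and the default smoothing parameters are assumptions.

```python
def holt_forecast(series, alpha=0.5, beta=0.5, horizon=3):
    """Holt's linear-trend smoothing: level + trend, no seasonal term.

    alpha smooths the level, beta smooths the trend; both in (0, 1].
    Returns `horizon` future values extrapolated from the final state.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        # Blend the new observation with the previous one-step prediction.
        level = alpha * y + (1 - alpha) * (level + trend)
        # Blend the newly observed slope with the previous trend estimate.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


# A perfectly linear series is extrapolated exactly.
forecast = holt_forecast([10, 12, 14, 16, 18])
```

A dynamic alert can then compare incoming values against the forecast instead of a fixed threshold, which is the idea behind the forecast-based rules in the rule engine.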

Forecasting Example

Forecast Settings

• ATSD selects the most accurate forecasting algorithm for each time series separately, based on a ranking system.

• The winning algorithm is used to compute the forecast for the next day, week, or month.

• Pre-computed forecasts can be used in the rule engine.
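The deck does not describe how the ranking works, so the following sketch shows only one common way to rank candidate algorithms: score each on a held-out tail of the series and pick the lowest error. The three candidate forecasters, the use of mean absolute error, and all function names are assumptions made for illustration.

```python
def naive_last(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon


def mean_forecast(train, horizon):
    """Repeat the historical mean."""
    return [sum(train) / len(train)] * horizon


def drift(train, horizon):
    """Extend the straight line between the first and last observation."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * h for h in range(1, horizon + 1)]


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rank_algorithms(series, candidates, holdout):
    """Score each candidate on the last `holdout` points; best first."""
    train, test = series[:-holdout], series[-holdout:]
    scores = [(mae(test, fn(train, holdout)), name)
              for name, fn in candidates.items()]
    return sorted(scores)


candidates = {"naive": naive_last, "mean": mean_forecast, "drift": drift}
series = [10, 12, 14, 16, 18, 20, 22, 24]

ranking = rank_algorithms(series, candidates, holdout=3)
best_name = ranking[0][1]
# The winner is re-applied to the full series to produce the stored forecast.
forecast = candidates[best_name](series, 3)
```

Running the ranking per series means a steadily growing metric and a flat one can each end up with a different winning algorithm.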

Visualization

• ATSD can be integrated with Axibase Enterprise Reporting (AER) using the ATSD adapter.

• ATSD comes with a wide variety of widgets for creating interactive portals directly in ATSD.

• ATSD widgets are designed from the ground up to handle large data sets and calculations on the client.

• ATSD visualization is supported on mobile devices and Smart TVs.

Search

• ATSD includes a log file search system for detecting problems in distributed systems for the purposes of security, audit, and change control.

Notifications

• Supports standard notification mechanisms: email, console, web service, and physical notification in the environment.

• An example of the latter is the Axibase LED lighting system, the "Data Cube", which changes color depending on the status of IT services.

ATSD Benefits

• Enables customers to extract value from data that already exists in their operational and IT infrastructures.

• Delivers preemptive monitoring through identification of abnormal behaviors in production systems.

• Eliminates most manually defined rules from the customer's monitoring catalog.

• Serves as a centralized repository for historical data.

• Directly supported by AER for dashboards, reports, and capacity planning.

System Requirements

• Operating Systems:

  • Red Hat Enterprise Linux 5.6+

  • Ubuntu 12.04+

  • SUSE Linux Enterprise Server 10+

• Computing Hardware:

Edition | Community - FREE | Standard | Enterprise

ATSD Nodes | 1 | 1 + 1 | > 5

Processors | 2 vCPU, 2+ GHz | 4 vCPU, 2+ GHz | 4 vCPU, 2+ GHz

Memory | 4 GB (2 GB for JVM) | 16 GB (8 GB for JVM) | 16 GB (8 GB for JVM)

Use Cases

• ITM long-term history extension

• nmon reporting for AIX, Linux and Solaris

• Minimize exceptions in monitoring catalog

• Collect environmental data from SCADA

• Predictive Maintenance – based on sensors

ITM History Extension

• ITM can be instrumented to write streaming data into CSV files.

• The CSV files can be uploaded into ATSD as soon as they are written, using the inotify utility and wget.

• Example: private history streaming in ITM

• KHD_CSV_OUTPUT_ACTIVATE = Y

ITM History Extension

• The Warehouse Proxy Agent is set up to save history data to a CSV file on the local machine.

• ATSD ingests the CSV files for analytics and long-term storage.

• ATSD converts the data using built-in parsers.
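The conversion step can be pictured as turning each CSV row into one sample per numeric column. The sketch below is not the ATSD parser, and the ITM CSV layout is not shown in this deck: the column names `hostname` and `timestamp`, the metric columns, and the sample rows are all hypothetical.

```python
import csv
import io


def parse_history_csv(text, entity_col="hostname", time_col="timestamp"):
    """Turn CSV text into (entity, metric, timestamp, value) tuples.

    Every column other than the entity and time columns is treated as a
    metric; cells that are not numeric are skipped.
    """
    samples = []
    for row in csv.DictReader(io.StringIO(text)):
        entity = row[entity_col]
        timestamp = row[time_col]
        for column, raw in row.items():
            if column in (entity_col, time_col):
                continue
            try:
                value = float(raw)
            except (TypeError, ValueError):
                continue  # skip missing or non-numeric cells
            samples.append((entity, column, timestamp, value))
    return samples


# Hypothetical file content; real ITM output will have different columns.
csv_text = """hostname,timestamp,cpu_busy,disk_used
host1,2015-08-03T10:00:00Z,42.5,71
host1,2015-08-03T10:00:15Z,47.0,n/a
"""

samples = parse_history_csv(csv_text)
```

Treating column headers as metric names fits the schema-less approach described earlier: a new column in the file becomes a new metric without any configuration change.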

nmon Reporting

• Consolidate trusted statistics from UNIX systems in one database.

• ATSD is able to collect, parse, and analyze nmon files.

• Analyze nmon data with forecasting algorithms.

• Capitalize on nmon data with two predefined visualization portals or easily create your own portals using built-in HTML5 widgets.

nmon Predefined Portals

Predefined AIX Portal

Predefined Linux Portal

Contact Axibase

Axibase Contact Details:

• General - 408.973.7897

• Fax - 408.725.8885

• Email - [email protected]

Our headquarters are located in Cupertino, Silicon Valley:

• 19925 Stevens Creek Blvd., Cupertino, CA 95014 USA