with ibm corp. · 2017. 6. 9. · apache hadoop stack. the featur es of iop that ar e used in...

IBM Network Performance Insight 1.2.1Document Revision R2E1

Network Performance Insight Overview

IBM

NoteBefore using this information and the product it supports, read the information in “Notices” on page 59.

This edition applies to version 1.2.1.0 of IBM Network Performance Insight and to all subsequent releases andmodifications until otherwise indicated in new editions.

© Copyright IBM Corporation 2015, 2017.US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contractwith IBM Corp.

Contents

Introduction . . . . . . . . . . . . . vIntended audience . . . . . . . . . . . . vService Management Connect. . . . . . . . . vNetwork Performance Insight technical training . . vSupport information . . . . . . . . . . . . vConventions used in this publication . . . . . . vi

Typeface conventions . . . . . . . . . . vi

Chapter 1. Network Performance Insightvalue propositions . . . . . . . . . . 1

Chapter 2. Network Performance Insightarchitecture . . . . . . . . . . . . . 3Foundation services . . . . . . . . . . . . 5

Manager . . . . . . . . . . . . . . . 5DNS . . . . . . . . . . . . . . . . 6Event . . . . . . . . . . . . . . . . 6Storage . . . . . . . . . . . . . . . 7UI . . . . . . . . . . . . . . . . 11

Entity Metric services . . . . . . . . . . . 11Tivoli Network Manager Collector . . . . . . 11Formula Service . . . . . . . . . . . . 12SNMP Collector . . . . . . . . . . . . 23Entity Analytics . . . . . . . . . . . . 24Thresholds . . . . . . . . . . . . . 25

Flow Metric services . . . . . . . . . . . 26Flow Collector . . . . . . . . . . . . 26Flow Analytics . . . . . . . . . . . . 48Remote Flow Collector . . . . . . . . . 52

Chapter 3. Platform support . . . . . 53Suggested node and services layout . . . . . . 53

Cluster behavior . . . . . . . . . . . . 55

Notices . . . . . . . . . . . . . . 59Trademarks . . . . . . . . . . . . . . 61Terms and conditions for product documentation. . 62

© Copyright IBM Corp. 2015, 2017 iii

iv IBM Network Performance Insight: Network Performance Insight Overview

Introduction

IBM® Network Performance Insight is a network performance monitoring system.It provides a comprehensive and scalable visibility on network traffic withvisualization and reporting of network performance data for complex, multivendor,multi-technology networks.

The Network Performance Insight overview provides a description of the product, itsfeatures, and the steps to take if you decide to purchase IBM Network PerformanceInsight, Version 1.2.1.

Intended audienceThe audience includes those considering IBM Network Performance Insight as anetwork performance monitoring solution and all the new users who want anoverview of the system.

Service Management ConnectConnect, learn, and share with Service Management professionals and productsupport technical experts who provide their perspectives and expertise.

Access Network and Service Assurance community at https://www.ibm.com/developerworks/servicemanagement/nsa/index.html. Use Service ManagementConnect in the following ways:v Become involved with transparent development, an ongoing, open engagement

between other users and IBM developers of Tivoli products. You can access earlydesigns, sprint demonstrations, product roadmaps, and prerelease code.

v Connect one-on-one with the experts to collaborate and network about Tivoliand the Network and Service Assurance community.

v Read blogs to benefit from the expertise and experience of others.v Use wikis and forums to collaborate with the broader user community.Related information:

IBM Network Performance Insight community on developerWorks

Network Performance Insight technical trainingFor Tivoli technical training information, see the following Network PerformanceInsight Training website at https://tnpmsupport.persistentsys.com/updated_trainings.

Support informationIf you have a problem with your IBM Software, you want to resolve it quickly.IBM provides the following ways for you to obtain the support you need:

OnlineAccess the IBM Software Support site at http://www.ibm.com/software/support/probsub.html .

IBM Support AssistantThe IBM Support Assistant is a free local software serviceability workbench

© Copyright IBM Corp. 2015, 2017 v

https://www.ibm.com/developerworks/servicemanagement/nsa/index.html

https://www.ibm.com/developerworks/servicemanagement/nsa/index.html

https://www.ibm.com/developerworks/community/groups/service/html/communitystart?communityUuid=eaa9489e-3122-440a-93a3-ee5cd204c603

https://tnpmsupport.persistentsys.com/updated_trainings

https://tnpmsupport.persistentsys.com/updated_trainings

http://www.ibm.com/software/support/probsub.html

http://www.ibm.com/software/support/probsub.html

that helps you resolve questions and problems with IBM Softwareproducts. The Support Assistant provides quick access to support-relatedinformation and serviceability tools for problem determination. To installthe Support Assistant software, go to http://www.ibm.com/software/support/isa.

Troubleshooting GuideFor more information about resolving problems, see the problemdetermination information for this product.

Conventions used in this publicationSeveral conventions are used in this publication for special terms, actions,commands, and paths that are dependent on your operating system.

Typeface conventionsThis publication uses the following typeface conventions:

Bold

v Lowercase commands and mixed case commands that are otherwisedifficult to distinguish from surrounding text

v Interface controls (check boxes, push buttons, radio buttons, spinbuttons, fields, folders, icons, list boxes, items inside list boxes,multicolumn lists, containers, menu choices, menu names, tabs, propertysheets), labels (such as Tip:, and Operating system considerations:)

v Keywords and parameters in text

Italic

v Citations (examples: titles of publications, diskettes, and CDs)v Words defined in text (example: a nonswitched line is called a

point-to-point line)v Emphasis of words and letters (words as words example: "Use the word

that to introduce a restrictive clause."; letters as letters example: "TheLUN address must start with the letter L.")

v New terms in text (except in a definition list): a view is a frame in aworkspace that contains data.

v Variables and values you must provide: ... where myname represents....

Monospace

v Examples and code examplesv File names, programming keywords, and other elements that are difficult

to distinguish from surrounding textv Message text and prompts addressed to the userv Text that the user must typev Values for arguments or command options

Bold monospace

v Command names, and names of macros and utilities that you can typeas commands

v Environment variable names in textv Keywordsv Parameter names in text: API structure parameters, command

parameters and arguments, and configuration parameters

vi IBM Network Performance Insight: Network Performance Insight Overview

http://www.ibm.com/software/support/isa

http://www.ibm.com/software/support/isa

v Process namesv Registry variable names in textv Script names

Introduction vii

viii IBM Network Performance Insight: Network Performance Insight Overview

Chapter 1. Network Performance Insight value propositions

Network Performance Insight is a network performance management anddiagnostics solution that is integrated with IBM Netcool® Operations Insight. It canhelp enterprises detect, isolate, and diagnose network anomalous events andmaintain optimal performance of IT resources.

Integrated reportingProvides a quick view of network and IP device session data monitoringfrom IBM Netcool Operations Insight dashboards.

Exhaustive network visibilityThe available dashboards help to understand network usage and deviceusage patterns to view all information about end points and enhanced dataslicing facility on longer duration.

User-defined thresholds and alertsPredefined and custom threshold settings help provide proactive alerts ontraffic data violations at interface level. These violations might be due tohigh usage of applications.

Faster troubleshooting and root cause analysis

Improves visibility into network performance to help minimize servicedegradations and disruptions, and speed troubleshooting. The usage andtrend analysis of the reports remove network bottlenecks, blind spots withseamless navigation among the dashboards and performance monitoringprotocols to allow the total problem resolution.

Reduces operational costs and boosts efficiencies by consolidatingperformance management of systems and enabling proactive problemidentification. The traffic data usage reports provide clear data points todifferentiate bandwidth spikes, along with information about congestion innetwork.

© Copyright IBM Corp. 2015, 2017 1

2 IBM Network Performance Insight: Network Performance Insight Overview

Chapter 2. Network Performance Insight architecture

IBM Network Performance Insight is a network performance monitoring system. Itoffers near real-time and interactive view on the network data that helps inreduced network downtime and optimized network performance.

Network Performance Insight provides IBM Netcool Operations Insight withcomprehensive IP network device performance monitoring and session trafficanalysis.

The following diagram shows how data is flowing through the variouscomponents in Network Performance Insight:

Network Performance Insight services

Network Performance Insight services are running on microservice architecturethat has the software application as a suite of independently deployable, small,modular services in which each service runs a unique process and communicatesthrough a well-defined, lightweight mechanism. Currently, Network PerformanceInsight 1.2.1 consists of the following microservices:

Foundation services

v Manager


v DNSv Eventv Storagev UI

Entity Metric services

v Tivoli Network Manager Collectorv SNMP Collectorv Formula Servicev Entity Analyticsv Thresholds

Flow Metric services

v Flow Collectorv Flow Analytics

For more information about these services, see their respective sections in IBMNetwork Performance Insight: Product Overview.

IBM Open Platform with Apache Spark and Apache Hadoopcomponents

IBM Open Platform with Apache Spark and Apache Hadoop (IOP) can be used tohelp process and analyze the volume, variety, and velocity of data that continuallyenters your organization every day. Network Performance Insight is installed as aservice extension to the installed IBM Open Platform with Apache Spark andApache Hadoop stack.

The features of IOP that are used in Network Performance Insight:v IBM Open Platform with Apache Spark and Apache Hadoopv Default support for rolling upgrades for Hadoop servicesv Support for long-running applications within YARN for enhanced reliabilityv Spark in-memory distributed compute engine for dramatic performance

increasesv Apache Ambari operational framework. Apache Ambari is an open framework

for provisioning, managing, and monitoring Apache Hadoop clusters. Ambariprovides an intuitive and easy-to-use Hadoop management web UI backed byits collection of tools and APIs that simplify the operation of Hadoop clusters.

v Essentially includes the following open source technologies for working withNetwork Performance Insight:– HDFS– Kafka– Ambari– Spark– ZooKeeper

Note: Because Zookeeper requires a majority, it is best to use an odd numberof machines. For example, with four machines ZooKeeper can only handle thefailure of a single machine; if two machines fail, the remaining two machinesdo not constitute a majority. However, with five machines ZooKeeper canhandle the failure of two machines.


http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/bi_hadoop.html

http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/bi_kafka.html

http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/ambari.html

http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/bi_spark.html

http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/bi_zookeeper.html

Integrated products

The products that are needed to work with Network Performance Insight, V1.2.1are as follows:

Jazz™ for Service Management 1.1.3.0Dashboard Application Services Hub provides visualization and dashboardservices in Jazz for Service Management. It has a single console foradministering IBM products and related applications. Visualization forNetwork Performance Insight is federated into Dashboard ApplicationServices Hub.

Products that are integrated with Network Performance Insight 1.2.1:

IBM Tivoli® Network Manager IP Edition 4.2.0.3Tivoli Network Manager provides network discovery, device polling,including storage of polled SNMP data for reporting and analysis, andtopology visualization. In addition, Network Manager can display networkevents, perform root-cause analysis of network events, and enrich networkevents with topology and other network data.

Tivoli Netcool/OMNIbus component of IBM Netcool Operations Insight 1.4.1Netcool Operations Insight is powered by the fault managementcapabilities of IBM Tivoli Netcool/OMNIbus. In Network PerformanceInsight v1.2.1, Tivoli Netcool/OMNIbus 8.1.0.11 is an important part of thesolution for monitoring the network threshold violations.

Related information:

IBM Network Performance Insight on IBM Knowledge Center

IBM BigInsights 4.2 documentation

HDFS Architecture

Apache Hadoop YARN

Apache Kafka

Apache Zookeeper

Foundation servicesFoundation services are the basic infrastructure services that are used by multipleother Network Performance Insight services.

ManagerMonitors the status and health of all Network Performance Insight microservices.

Some salient features of the Manager service:v Periodically monitors the status of all Network Performance Insight services.v Sends metrics to Ambari.v Provides the npi-manager-cmd command line interface to start, stop, and check

the status of selected Network Performance Insight microservices from Ambari.For more information about the command, see npi-manager-cmd commandreference in IBM Network Performance Insight: References.

v Copies the Spark and Hadoop configuration files to HDFS for the first time afterthe installation is complete.

Chapter 2. Network Performance Insight architecture 5

http://www.ibm.com/support/knowledgecenter/SSCVHB

http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.welcome.doc/doc/welcome.html

https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#HDFS_Architecture

http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html

http://kafka.apache.org/documentation.html

http://zookeeper.apache.org/

DNSThe DNS Service operates as a cluster singleton. The DNS service is stateless. Itsperformance depends on the performance of the supporting DNS server that isprovided by your enterprise.

Resolves DNS names for reporting and distributing the interface metadata.

By default, the network requests that support DNS lookup run on port 53 for TCPand UDP protocols. The configuration settings are available on Ambari server fromServices > NPI > NPI Settings under NPI DNS Services pane.

EventThe Event Service operates as a cluster singleton. The performance of this servicedepends on the performance of the STDIN probe and Tivoli Netcool/OMNIbusserver. STDIN probe is used to send events to the IBM Tivoli Netcool/OMNIbusserver. The terms alert and event can be used interchangeably.

“Flow Analytics” on page 48 Service processes and manages events for Flow dataand “Thresholds” on page 25 Service for Entity Metric data. The Flow AnalyticsService and Threshold Service generate alerts when any violation occurs.

Event evaluation process

Thresholds are computed based on the data that is collected from 1-minuteaggregation.

The event evaluation process goes through these phases:1. The Event Service then compares for any threshold violations.

If a KPI enters a violation state of Major and critical severity, an event isgenerated. The Event Viewer displays the active events from IBM NetcoolOperations Insight dashboards.

2. If an existing threshold violation state changes, an event is generated.The severity is set to CLEAR and the event from the Event Viewer is removed.Whether a cleared event is removed from Tivoli Netcool/OMNIbus EventViewer or not is determined by Tivoli Netcool/OMNIbus Web GUIconfigurations.

For more information about exiting a threshold violation in Network PerformanceInsight, see Exiting a threshold violation.

Generating an event

An event is generated only in the following conditions:v KPI value violates the threshold configuration for an interface and no existing

violation for the same interface has been triggered.v The threshold state is cleared.

The two main event management scenarios:

Generating a new eventWhen the following conditions are met, an event is generated and sent toIBM Netcool Operations Insight dashboards:v Traffic data violates the threshold configurations.


v The severity is not changed for a KPI that has violated a threshold.

Note: If the severity is not change for of a KPI that is already in aviolated state, a new event is not generated.

Generating an event when the severity changesWhen the following conditions are met, an event is generated and sent toIBM Netcool Operations Insight dashboards:v The Traffic data that is in violated state changes to a new severity.

StorageIBM Network Performance Insight uses a built-in columnar storage database. TheStorage Service operates as a cluster singleton that provides resiliency.

Network performance data profiling requires advanced database technology tohandle high volumes of data in a large enterprise and service providerinfrastructures. Network Performance Insight uses an indigenously developeddatabase with efficient memory usage and has the following features:v Compressed, columnar data storage that manages the data that is collected by

Network Performance Insight.v Supports high-bandwidth queries and analysis of data.v Provides inbound APIs to insert and update data.v Maintains HDFS file-store, aging data, query coordination.v Periodically consolidates storage files and purging old data with the help of

Storage optimizer.v Queries are delegated to spark for scalability. All fact data are stored as parquet

files in compressed, columnar format which can be processed by sparkefficiently.

v Provides a mechanism for incremental storage of network performance data forspecific time intervals

The Network Performance Insight Storage Service uses Hadoop-based big datainfrastructure for its distributed computing features.

Hadoop and Spark are a fully integrated infrastructure solution with clustermanagement and analytics software that is optimized for Hadoop and Spark basedworkloads. Hadoop comprises of two main components; the HDFS, and a Java™

based file system for storing large volumes of data and a programming paradigm,that is, Hadoop MapReduce.

The Storage Service state is maintained by the following components:v HDFS - fact tablesv Kafka - domain tablesv Application logic - database schemav Table Schema information - Kafka Schema Registry

State is automatically recovered at start up. Scaleout is achieved by using Spark tosupport queries.


Storage optimization

Network Performance Insight database is optimized to improve NetworkPerformance Insight overall performance. Storage optimization minimizes the dataretrieval time and reduces hardware and administration costs.

Data storageDatabase tables that store specific types of data collected by Network PerformanceInsight. .

The following table lists the types of data that is stored in Network PerformanceInsight database schema:

Table 1. Network Performance Insight database schema.

Database schema Type of data

Flow Metric

Flow Raw data

Aggregated data of 1 minute, 30 minutes, and daily for alltop 10

Aggregation status

Interface configuration settings for flow metrics data

Configuration Domain Names, Retention Profiles, and Thresholdconfiguration settings and static threshold definition for flowmetrics data

Event Event data for both flow and entity metrics data

Inventory Inventory related metadata that is collected from IPSLAenabled devices and this schema stores the followinginformation in its tables:

v IPSLA enabled devices properties names and values

v IPSLA enabled devices formula or metric names.

v IPSLA enabled devices entity or resource names and parententity names.

v IP to hostname mappings

v Network interface information that is discovered by TivoliNetwork Manager and available in Network Connectivityand Inventory Model database.

Defaults Dashboard Application Services Hub integration-related data

NCIM Discovered Network interface related data

Data is polled from Tivoli Network Manager database byusing Kafka Connect which is sent to Kafka message bus andstored within Network Performance Insight database schema.

NCPOLLDATA Resources and metrics metadata from Tivoli NetworkManager

Tivoli Network Manager core server, which contains ApacheStorm server manages poll data aggregation. These data ispushed to Kafka and stored within Network PerformanceInsight database schema.

Threshold Threshold evaluation records and static threshold definitionsof entity metric data.

Static threshold definition for entity metrics data.


Table 1. Network Performance Insight database schema (continued).

Database schema Type of data

Entity Metric Entity Metric raw data

Tivoli Network Manager metric data and SNMP data from IPSLA enabled devices.

Entity Metric aggregated data of 30 minutes, 6 hours anddaily

Entity Metric aggregation status

Flow Aggregated data

Network Performance Insight consists of Fact tables data and Domain Objects.

All raw NetFlow data is stored up to 10 days as the initial default configuration ofNetwork Performance Insight in the flow database table.

Aggregated data represents the total Ingress and Egress traffic, top 10 applications,top 10 conversations, and top 10 protocols for each interface for every 1-minuteinterval. Aggregated data is progressively stored in 1 minute, 30 minutes, andDaily aggregation tables.

The most recent data is available with 1-minute aggregation. Data less than 50hours are populated from 1-minute aggregation tables while data more than 50hours but less than 100 days are populated from 30-minutes aggregation tables. Alldata that is more than 100 days are populated from 1-day aggregation tables.

The following table lists the traffic tables that are queried for the data that isdisplayed in dashboards in Network Performance Insight for different timeintervals:

Time interval Traffic tables

Less than 50 hours 1-minute aggregation

50 hours to less than 100 days 30-minutes aggregation

More than 100 days 1-day aggregation

Entity Metric Aggregated data

All Tivoli Network Manager SNMP raw data is stored up to 90 days as the initialdefault configuration of Network Performance Insight in the Entity Metric rawdatabase table.

The most recent data is available with 30-minutes aggregation. Data are populatedfrom the 30-minutes, 6-hours aggregation or 1-day aggregation table aggregationtable depending on the selection of time period and aggregation type from therelevant reports.

Retention periodNetwork Performance Insight supports data retention, which relates to historicaldata and managing the data storage periods.


You can maintain Network Performance Insight database by configuring theretention period to keep the required data. The time period for which you canstore raw, aggregated, resolved data by DNS, and events types of data, depends onthe number of flows that are received by Network Performance Insight and thefree disk space available on your system.

You can maintain Network Performance Insight database by configuring theretention period to keep the required data.

After the specified period, the database is cleaned up automatically. For resolveddata by DNS, after the specified retention period, DNS server auto resolves thehostname. If the DNS resolution is unsuccessful, the IP address is populatedinstead.

The following table lists the default retention period that is used for the followingtypes of data:

Type of data Retention period

Flow Raw 5 Days

Flow 1 minute aggregated 1 Month

Flow 30 minutes aggregated 12 Months

Flow Daily aggregated 12 Months

DNS 3 Months

Events 6 Weeks

Logs 10 Days

Entity Raw 90 Days

Entity 30 minutes aggregated 12 Months

Entity 6 hours aggregated 12 Months

Entity Daily aggregated 12 Months

Entity Threshold State 90 Days

For example, if you consider the date as 12 January, the data is retained from 12January until 12 April before it gets purged for Entity Raw.

Note:

v The specified disk space usage is calculated based on the default Hardwarerequirements.

v All these options except the logs retention period can be configured from theNetwork Performance Insight System Configuration UI.– To configure your own settings for database retention period for logs, see

Configuring logging in Troubleshooting IBM Network Performance Insight.– To configure the rest of the database retention period profiles, see

Administering retention profiles in Administering IBM Network PerformanceInsight.


UIThe UI service operates in cluster load balancing mode. Each instance of UI is fullyoperational and capable of serving requests.

Jazz for Service Management federation can be configured in one of the followingways:v A single instance of Jazz for Service Management can be configured to federate

against any one of the UI instances. If either Jazz for Service Management serveror the UI instance that it is federating fails, all UI functions are disabled.

v A HTTP/HTTPS load balancer can be deployed between a single instance ofJazz for Service Management and a cluster of UI instances. The load balancercan be configured to present a single IP address for the pool of UI instances.Load is distributed by the load balancer and if one or more UI instances fail, theothers pick up the load.

The configuration settings are available on Ambari server from Services > NPI >NPI Settings under NPI Web Services pane.

Dashboard Application Services Hub federation settings are available on Ambariserver from NPI > NPI Core Settings under Advanced > Advanced npi-authpane.

Entity Metric servicesServices that are required for Network Performance Insight entity metric data thatis collected, aggregated, and monitored.

Tivoli Network Manager CollectorThe Tivoli Network Manager Collector Service runs as a cluster singleton.

Network Manager uses the Apache Storm real-time computation system toaggregate raw poll data into historical poll data, and stores raw and historical polldata in the NCPOLLDATA database. The Storm Spout provides polling data andrelated metadata through Kafka. Periodically, Spout checks the NCPOLLDATAdatabase for new metric data and new collection metadata.

Network Performance Insight periodically poll IP SLA probes that are discoveredby Tivoli Network Manager and perform reconciliation. If the new probes arediscovered, a new request is generated and if a probe is stopped, the existingrequest is withdrawn. It pulls the SNMP metrics that are polled and IP SLA probesthat are discovered from Tivoli Network Manager to Network Performance InsightStorage Service.

Metric data from Tivoli Network Manager is handled as follows:v Network Performance Insight receives data from three Tivoli Network Manager

NCPOLLDATA tables:– POLLDATA

Contains the actual Tivoli Network Manager polled data that includesreferences to the instance and object, the timestamp, and metric value.

– MONITOREDOBJECTIdentifies the Tivoli Network Manager metrics that are being computed orpolled.

– MONITOREDINSTANCE


Identifies the Tivoli Network Manager interfaces or entities that are beingpolled.

Storm Spout in Tivoli Network Manager supports requests for sending full setsof MONITOREDOBJECT (metrics) and MONITOREDINSTANCE (resources)data. These requests are initiated by messages on the data request topic. StormSpout pushes the data that is available in Tivoli Network ManagerMONITOREDOBJECT and MONITOREDINSTANCE tables and populates thedata into the same tables in Network Performance Insight.See Configuring Apache Storm Spout in Network Manager in Installing andConfiguring IBM Network Performance Insight

v Kafka Connect polls the Tivoli Network Manager database and populates theNCIM.NETWORK_INTERFACE table in Network Performance Insight database.The Tivoli Network Manager Collector Service refers to Network PerformanceInsight configuration settings to connect to Tivoli Network Manager database.For more information about these settings, see Configuring communication withTivoli Network Manager in Installing and Configuring IBM Network PerformanceInsight.The configuration settings are available on Ambari server from Services > NPI >Configs > NOI Core Settings under NOI Components > NOI SNMP Collectorpane. These settings are populated with default information.

The Tivoli Network Manager Collector Service has the Kafka Consumers for eachinbound data type. The Kafka Consumers receive the Tivoli Network Managerdata from the Kafka topic and publish it on an internal event stream.

The collector service collates the Tivoli Network Manager data and writes therecords into npi.timeseries.raw kafka topic. The data from the Kafka topic ispicked up by the Storage Service and written to ENTITY_METRIC.RAW table.

The Tivoli Network Manager poll data is made available to other NetworkPerformance Insight services through the Storage Service.Related information:

Tivoli Network Manager

Integrating Network Manager with Network Performance Insight

Formula ServiceThe Formula Service calculates metric values for the data that is collected bySNMP Collector. It uses formulas that are deployed against specific entity types.The Formula Service operates as a cluster singleton.

Based on the list of IP SLA probes that are detected by Tivoli Network ManagerCollector, the Formula Service creates poll definition requests for SNMP Collectorto start the polling. Formula Service detects new IP SLA Probes or deactivatesprobes and sets up SNMP collector for metric collection.

Currently, the Formula Service supports SNMP data from Cisco IPSLA devicesonly. The Formula Service requests SNMP data on probes and uses the samepolling interval that the probes use to measure RTT. If a probe is configured tomeasure every 30 seconds, then the Formula Service requests OIDs on the probeinstance every 30 seconds.


http://www.ibm.com/support/knowledgecenter/SSSHRK_4.2.0/itnm/ip/wip/poll/reference/nmip_ref_aboutncpolldatadb.html

https://www.ibm.com/support/knowledgecenter/SSTPTP_1.4.1/com.ibm.netcool_ops.doc/itnm/ip/wip/install/task/nmip_con_enablenpiintegration.html

The Formula Service obtains data to be used in formula execution and metriccalculation from the npi.snmp.poll.data Kafka topic. All the calculated metricvalues are submitted to npi.timeseries.raw Kafka topic. This data is picked up bythe Storage Service and written to ENTITY_METRIC.RAW table.

Supported formulas


Table 2. Entity Type: cisco.probe.http

Metric Description Units Formula

httpDnsRtt The sum of RTT taken toperform a Domain NameServer Query (DNS)within the HTTPoperations.

Milliseconds rttMonLatestHTTPOperDNSRTT

httpRtt If a successful test iscompleted since the lastpoll, return the sum of theround trip time (RTT) ofsuccessful HTTPoperations.

Milliseconds rttMonLatestHTTPOperRTT

httpTcpConnectRtt The sum of RTT taken toperform a TCP connectionto the HTTP server.

Milliseconds rttMonLatestHTTPOperTCPConnectRTT

httpTransactionRtt The sum of Round TripTime (RTT) taken todownload the object thatis specified by URLwithin HTTP operations.

Milliseconds rttMonLatestHTTPOperTransactionRTT

httpTransactionFailed The percentage of requestthat have HTTP errorswhen downloading thebase page within HTTPoperations. If no errorsreturn 0, otherwise return100.

Percent 100 * (not(rttMonLatestHTTPErrorSenseDescription==""))

httpTransactionSucceeded The percentage of requestthat had no HTTP errorswhen downloading thebase page within HTTPoperations. If no errorsreturn 100, otherwisereturn 0.

Percent 100 * ((rttMonLatestHTTPErrorSenseDescription==""))

14IB

M N

etwork Perform

ance Insight:N

etwork Perform

ance Insight Overview

Table 3. Entity Type: cisco.probe.jitter


jitterAvgInbound If a successful probe issent since the last poll,return the averageinbound jitter.

Milliseconds (rttMonLatestJitterOperSumOfPositivesDS +rttMonLatestJitterOperSumOfNegativesDS) /(rttMonLatestJitterOperNumOfRTT - 1)

jitterMaxNegativeInbound If a successful probe issent since the last poll,return the maximuminbound negative jitter.

Milliseconds rttMonLatestJitterOperMaxOfNegativesDS

jitterPacketLossInbound If a test is completed sincethe last poll, return thepercentage of inboundpackets lost.

Percent 100 * rttMonLatestJitterOperPacketLossDS /rttMonEchoAdminNumPackets

jitterMaxPositiveInbound If a successful probe issent since the last poll,return the maximuminbound positive jitter.

Milliseconds rttMonLatestJitterOperMaxOfPositivesDS

jitterSucceeded If a successful probe issent since the last poll,return the averageoutbound jitter.

Milliseconds 100 * rttMonLatestJitterOperNumOfRTT/ rttMonEchoAdminNumPackets

jitterMaxNegativeOutbound If a successful probe issent since the last poll,return the maximumoutbound negative jitter.

Milliseconds rttMonLatestJitterOperMaxOfNegativesSD

jitterPacketLossOutbound If a test is completed sincethe last poll, return thepercentage of outboundpackets lost.

Percent 100 * rttMonLatestJitterOperPacketLossSD /rttMonEchoAdminNumPackets

jitterMaxPositiveOutbound If a successful probe issent since the last poll,return the maximumoutbound positive jitter.

Milliseconds rttMonLatestJitterOperMaxOfPositivesSD

jitterPacketCount If a test is completed sincethe last poll, return thenumber of sent probes,otherwise return 0.

Number rttMonEchoAdminNumPackets

Chapter

2. Netw

ork Performance Insight architecture

15

Table 3. Entity Type: cisco.probe.jitter (continued)


jitterPacketLoss If a test is completed sincethe last poll, return thenumber of sent probes,otherwise return 0.

Percent rttMonEchoAdminNumPackets

jitterAvgOutbound If a successful probe issent since the last poll,return the standarddeviation of the round triptime.

Milliseconds (rttMonLatestJitterOperRTTSum2 /rttMonLatestJitterOperNumOfRTT -((rttMonLatestJitterOperRTTSum /rttMonLatestJitterOperNumOfRTT) ^ 2)) ^ 0.5

jitterPacketLossUnknown If a test is completed sincethe last poll, return thepercentage of packets thatare lost where thedirection is unknown.

Percent 100 * rttMonLatestJitterOperPacketMIA /rttMonEchoAdminNumPackets

jitterAvgInboundOneWay The average of the delaystraveling from destinationto source for all the probesduring the last test.

Milliseconds rttMonLatestJitterOperOWSumDS /rttMonLatestJitterOperNumOfRTT

jitterAvgOutboundOneWay The average of the delaystraveling from source todestination for all theprobes during the last test.

Milliseconds rttMonLatestJitterOperOWSumSD1 /rttMonLatestJitterOperNumOfRTT

16IB

M N

etwork Perform

ance Insight:N

etwork Perform


Table 4. Entity Type: cisco.probe.voip


voipIcpif The maximum of allCalculated PlanningImpairment Factor(ICPIF) value for thejitter operations.

Number rttMonLatestJitterOperICPIF

voipMos The maximum of allMean Opinion Score(MOS) value for the jitteroperations.

Number rttMonLatestJitterOperMOS / 100

jitterAvgInbound If a successful probe issent since the last poll,return the averageinbound jitter.

Milliseconds (rttMonLatestJitterOperSumOfPositivesDS +rttMonLatestJitterOperSumOfNegativesDS) /(rttMonLatestJitterOperNumOfRTT - 1)

jitterMaxNegativeInbound If a successful probe issent since the last poll,return the maximuminbound negative jitter.

Milliseconds rttMonLatestJitterOperMaxOfNegativesDS

jitterPacketLossInbound If a test is completedsince the last poll, returnthe percentage ofinbound packets lost.

Percent 100 * rttMonLatestJitterOperPacketLossSD/ rttMonEchoAdminNumPackets

jitterMaxPositiveInbound If a successful probe issent since the last poll,return the maximuminbound positive jitter.

Milliseconds rttMonLatestJitterOperMaxOfPositivesDS

jitterSucceeded If a successful probe issent since the last poll,return the averageoutbound jitter.

Milliseconds 100 * rttMonLatestJitterOperNumOfRTT /rttMonEchoAdminNumPackets

jitterMaxNegativeOutbound If a successful probe issent since the last poll,return the maximumoutbound negative jitter.

Milliseconds rttMonLatestJitterOperMaxOfNegativesSD

Chapter

2. Netw


17

Table 4. Entity Type: cisco.probe.voip (continued)


jitterPacketLossOutbound If a test is completedsince the last poll, returnthe percentage ofoutbound packets lost.

Percent 100 * rttMonLatestJitterOperPacketLossSD /rttMonEchoAdminNumPackets

jitterMaxPositiveOutbound If a test is completedsince the last poll, returnthe percentage ofoutbound packets lost.

Milliseconds rttMonLatestJitterOperMaxOfPositivesSD

jitterPacketCount If a test is completedsince the last poll, returnthe number of sentprobes, otherwise return0.

Number rttMonEchoAdminNumPackets

jitterPacketLoss If a test is completedsince the last poll, returnthe percentage of probeslost.

Percent rttMonEchoAdminNumPackets

jitterAvgOutbound If a successful probe issent since the last poll,return the standarddeviation of the roundtrip time.

Milliseconds (rttMonLatestJitterOperRTTSum2 /rttMonLatestJitterOperNumOfRTT -((rttMonLatestJitterOperRTTSum /rttMonLatestJitterOperNumOfRTT) ^ 2)) ^ 0.5

jitterPacketLossUnknown If a test is completedsince the last poll, returnthe percentage of packetsthat are lost where thedirection is unknown.

Percent 100 * rttMonLatestJitterOperPacketMIA /rttMonEchoAdminNumPackets

jitterAvgInboundOneWay The average of the delaystraveling fromdestination to source forall the probes during thelast test.

Milliseconds rttMonLatestJitterOperOWSumDS /rttMonLatestJitterOperNumOfRTT

jitterAvgOutboundOneWay The average of the delaystraveling from source todestination for all theprobes during the lasttest.

Milliseconds rttMonLatestJitterOperOWSumSD /rttMonLatestJitterOperNumOfRTT

18IB

M N

etwork Perform

ance Insight:N

etwork Perform


Chapter

2. Netw


19

Table 5. Entity Type: cisco.probe.rtp


rtpAvgInbound If a successful probe issent since the last poll,return the maximuminbound time.

Milliseconds rttMonLatestRtpOperAvgOWDS

rtpEarlyPacketsInbound Number of early packetsat source for the latestoperation.

Percent 100 * rttMonLatestRtpOperPacketEarlyDS /rttMonLatestRtpOperTotalPaksDS

rtpFrameLossInbound If a test is completedsince the last poll, returnthe percentage ofinbound frame lost.

Percent rttMonLatestRtpOperFrameLossDS

rtpInterArrivalJitterInbound If a successful probe issent since the last poll,return the maximuminbound jitter.

Milliseconds rttMonLatestRtpOperIAJitterDS

rtpLatePacketsInbound Number of late packetsat source for the latestoperation.

Percent 100 * rttMonLatestRtpOperPacketLateDS /rttMonLatestRtpOperTotalPaksDS

rtpMosCqInbound The average estimatedinbound Mean OpinionScore for ConversationalQuality (MOS-CQ).

Number rttMonLatestRtpOperMOSCQDS

rtpMosLqInbound The average estimatedinbound Mean OpinionScore for ListeningQuality (MOS-LQ).

Number rttMonLatestRtpOperMOSLQDS

rtpPacketLossInbound If a test is completedsince the last poll, returnthe percentage ofinbound packets lost.

Percent 100 * rttMonLatestRtpOperPacketLossDS /rttMonLatestRtpOperTotalPaksDS

rtpRFactorInbound The average ofestimated voice qualityvalue that is transmittedfrom destination tosource for the RTPoperation.

Number rttMonLatestRtpOperRFactorDS

20IB

M N

etwork Perform

ance Insight:N

etwork Perform


Table 5. Entity Type: cisco.probe.rtp (continued)


rtpAvgOutbound If a successful probe issent since the last poll,return the maximumoutbound time.

Milliseconds rttMonLatestRtpOperAvgOWSD

rtpInterArrivalJitterOutbound If a successful probe issent since the last poll,return the maximumoutbound jitter.

Milliseconds rttMonLatestRtpOperIAJitterSD.%I1 *

rtpMosCqOutbound The average estimatedoutbound MeanOpinion Score forConversational Quality(MOS-CQ).

Number rttMonLatestRtpOperMOSCQSD

rtpPacketLossOutbound If a test is completedsince the last poll, returnthe percentage ofoutbound packets lost.

Percent 100 * rttMonLatestRtpOperPacketLossSD /rttMonLatestRtpOperTotalPaksSD

rtpRFactorOutbound The average ofestimated voice qualityvalue that is transmittedfrom source todestination for the RTPoperation.

Number rttMonLatestRtpOperRFactorSD

rtpRtt If a successful test iscompleted since the lastpoll, return the roundtrip time.

Milliseconds rttMonLatestRtpOperRTT

rtpPacketLossUnknown If a test is completedsince the last poll, returnthe percentage ofpackets that are lostwhere the direction isunknown.

Percent 100 * rttMonLatestRtpOperPacketsMIA /rttMonLatestRtpOperTotalPaksSD

Chapter

2. Netw


21

Table 6. Entity Type: cisco.probe.echo


echoProbeCount If a test is completedsince the last poll, return1, otherwise return 0.

Number not(rttMonLatestRttOperTime ==last(rttMonLatestRttOperTime))

echoProbeLoss If a test is completedsince the last poll, returnthe percentage of probeslost.

Percent 100 * not(rttMonLatestRttOperSense == 1)

echoProbeSucceeded If a test is completedsince the last poll, returnthe percentage of probesthat succeeded.

Percent 100 * (rttMonLatestRttOperSense == 1)

echoRtt If a successful test iscompleted since the lastpoll, return the roundtrip time.

Milliseconds rttMonLatestRttOperCompletionTime

22IB

M N

etwork Perform

ance Insight:N

etwork Perform


SNMP CollectorThe SNMP Collector Service can poll SNMP enabled devices to provide metricpolling data to Network Performance Insight. The SNMP Collector Service receivespolling requests from Kafka, dispatches the appropriate SNMP requests, and placesthe resulting metric data back to Kafka for further analytic operations. SNMPCollector Service operates as a cluster singleton.

SNMP enabled devices can be configured to probe and measure how traffic isflowing across the network with the help of Cisco IP SLA metric data such asresponse times, latency, jitter, and packet loss. This information can be used todetermine the current performance of the network from the user perspective.

SNMP Collector Service collects Cisco IP SLA-specific data and provides SNMP IPSLA network monitoring for specific quality of service measurements. The SNMPCollector receives poll requests and SNMP credentials from Kafka topics.

Polling definitions

By default, the SNMP Collector receives polling definitions from a Kafka topic byname npi.snmp.poll.definition. The polling definition messages are in thefollowing format:[resource]/[oid]/[interval]

Where:

Field name Description

resource The resource that must be polled.

oid The OID on the resource that must bepolled.

interval The time interval that the OID or instanceon the resource that must be polled.

Polling credentials

By default, the SNMP Collector receives SNMP credentials from a Kafka topic byname npi.snmp.poll.credentials. The polling definition messages are in thefollowing format:

For example:{

"agent":{"retries":2,"version":1,"port":161,"ipAddress":"10.55.239.4","timeout":10000

},"credentials":{

"readCommunity":"j5pxixEqV/kv6ZHPYdSJ3w==","writeCommunity":"m7J4uOY3d2c0TJqis8Fg+Q=="

}}

Where:



agent Device

retries How many times the SNMPhelper and polling operationsattempted to access a device.

version SNMP version

port UDP port number for IP SLAagent. By default. 161.

ipAddress IP address of the resourcethat the credentials must beused.

timeout The time in milliseconds towait for a reply beforetiming out.

credentials readCommunity Encrypted read communitystring.

writeCommunity Encrypted write communitystring.Note: The read and writecommunity string protectsthe device againstunauthorized changes.

Polling data

By default, the SNMP collector produces polled data on a Kafka topic by namenpi.snmp.poll.data. The polled data messages are in the following format:<resource>,<attribute>,<timestamp>:,<value>

Fpr example:{"resource":"10.55.239.211:161","attribute":"1.3.6.1.4.1.9.9.42.1.5.2.1.14.500","timestamp":1495677950905,"value":"65"}


resource The resource that is polled.

attribute The OID on the resource that is polled.

timestamp The time stamp that the OID or instance onthe resource that is polled.

value The value that is retrieved from the OID orinstance on the resource.

Entity AnalyticsEntity Analytics service is used to perform aggregation of RAW data that iscollected from Tivoli Network Manager.

The Entity Analytics Service performs 30 minutes, 6 hours and 1-day aggregations.

The Tivoli Network Manager collector writes entity metric RAW records in parquetfile format on HDFS every minute and sends an aggregation status message toindicate that a new RAW data is available. It checks and triggers the nextaggregation when the data is available.


It refines the raw data, filters the results, and aggregates the KPI values. Thevalues are aggregated by sum, min, max, and count. The results are then stored inNetwork Performance Insight database.

The Entity Analytics Service operates as a cluster singleton. It delegates queries toApache Spark to achieve faster query response time.

Thresholdssupports burst or static thresholds.

Network Performance Insight and Tivoli Network Manager performs thresholdevaluation on the entity metric data by using period-based data aggregationsagainst the threshold definitions.

A threshold is a value that is compared against the predefined thresholdconfigurations. It is evaluated to see whether it violates a specific restriction. Theprimary objective of thresholding is to determine any violations and to generatealerts. When the value falls outside the acceptable threshold range, the systemgenerates and stores the event condition and forwards it to the Event ManagementSystem.

Static thresholds

Static (Burst) thresholding is user-defined static values at specific intervals, whichanalyze data and generate events when a violation occurs.

If your IBM Netcool Operations Insight solution is integrated with NetworkPerformance Insight, then you can define static thresholds for anomaly detection.

You can define a static threshold for a KPI within the poll definition that polls forthat KPI. If these static thresholds are violated for any performance measure on adevice or interface, IBM Tivoli Netcool/OMNIbus events are generated at anappropriate severity level.

Thresholds define the status of an attribute based on specific conditions. You canenable threshold evaluation on a selected resource or interface. A threshold isviolated when the result of the collected metric value is evaluated as exceeding(upper) or dropping (lower) to a specified configured threshold level. The actualevaluation and disposition depends on the threshold type, Upper, Lower, or Band.

To define an anomaly threshold, see Defining anomaly thresholds.Related information:

Defining performance thresholds for anomaly detection


https://www.ibm.com/support/knowledgecenter/SSTPTP_1.4.1/com.ibm.netcool_ops.doc/itnm/ip/wip/noi/task/noi_poll_crtbasictholddef.html

https://www.ibm.com/support/knowledgecenter/SSTPTP_1.4.1/com.ibm.netcool_ops.doc/itnm/ip/wip/noi/task/noi_poll_defininganomalythreshold.html

Flow Metric servicesServices that are required for Network Performance Insight traffic flow data that iscollected, aggregated, and monitored.

Flow CollectorFlow Collector Service is responsible for collecting IP traffic information fromflow-enabled devices. A flow is a sequence of packets with common characteristicssuch as same source and destination IP address, transport layer port information,and type of protocol. The flow enabled devices or exporters collect flow data fromthe network.

The Flow Collector Service in Network Performance Insight performs these basicfunctions:v Receives flow records from flow-exporters.v Parses, validates, and normalizes the various flow record formats into a common

format.v Enriches and filters Interfaces based on enable/disable flag set per network

interface.v Limits the number of interfaces that are enabled in Network Performance Insightv Stores the normalized and enriched flow records in apache.parquet files in

HDFS.v Notifies the Storage Service of the availability of Flow RAW data files via Kafka.

The configuration settings are available on Ambari server from Services > NPI >Configs > NPI Settings under NPI Components > NPI Flow Collector .

For more information about these configurations, see Configuring the Flow CollectorService in Installing and Configuring IBM Network Performance Insight.

Collection processA collection process must be able to receive the flow information that is passingthrough multiple network elements within the data network.

Flow metric collection

Traffic on a data network can be perceived as a flow of data between twoend-points that passes through network elements. For administrative or otherpurposes, it is beneficial to have the network elements observe these flows andreport their characteristics to Network Performance Insight to understand thenetwork usage patterns.

The three main components in NetFlow technology are; NetFlow cache, NetFlowexporter, and NetFlow collector.

NetFlow cacheA large amount of network information is condensed into a database ofNetFlow information that is called the NetFlow cache. NetFlow can beconfigured to capture flows to the NetFlow cache. Typically, the NetFlowcache is constantly filling with flows and the router or switch searches thecache for flows that are terminated or expired and these flows are exportedto the NetFlow collector.


NetFlow exporterThe NetFlow exporter sends flows that are in the cache to a NetFlowcollector. NetFlow exporters are configured for Ingress interface traffic andEgress interface traffic or both.

NetFlow collectorNetFlow Collector receives and pre-processes flow data that is receivedfrom a flow exporter.

A flow is ready for export when it is inactive, when no new packets are receivedfor the flow or if the flow is long lived (active) and lasts longer than the activetimer. A flow is inactive if it has not received a packet for a specific duration thatis longer than the inactive timeout value that is specified in the configuration. Theflow record is deleted from the flow cache and an export record is generated, whenthe inactive timeout is triggered. By default, active timeout value is 30 minutes andinactive timeout value is 15 seconds.

The collector component in Network Performance Insight can then process andtransform the data.

The collection process in Network Performance Insight is as follows:

Collector requirements in Network Performance InsightSpecifies the representation of different flows, the additional data that is requiredfor flow interpretation, packet format, transport mechanisms that are used.

Important: For more information about flow concepts, see the specific vendordocuments.


Flow version support

Flow formats VendorSupportedversions

Supportedtraffic types

Supportedprotocols

NetFlow Cisco 1, 5, 9Note: See,“RFC 3954” forNetFlow V9

v IPv4

v IPv6

v UDP

Note: ForNetworkPerformanceInsight 1.2.1,UDP is the onlysupportedprotocol.

JFlow JuniperNetworks

5, 9

CFlow Alcatel 5, 9

NetStream Huawei 5, 9

IPFIX IndustrystandardNote: See, “RFC7011”

X

NetFlow V1 formats:

V1 format is the original format that is supported in the initial NetFlow releases.

V1 header format

Bytes Contents Description

0-1 version NetFlow export formatversion number

2-3 count Number of flows that areexported in this packet (1-24)

4-7 SysUptime Current time in millisecondssince the export device isstarted

8-11 unix_secs Current count of secondssince 0000 CoordinatedUniversal Time 1970

12-16 unix_nsecs Residual nanoseconds since0000 Coordinated UniversalTime 1970

V1 Flow record format


0-3 srcaddr Source IP address

4-7 dstaddr Destination IP address

8-11 nexthop IP address of next hop router

12-13 input SNMP index of inputinterface

14-15 output SNMP index of outputinterface

16-19 dPkts Packets in the flow



20-23 dOctets Total number of Layer 3bytes in the packets of theflow

24-27 First SysUptime at start of flow

28-31 Last SysUptime at the time thelast packet of the flow wasreceived

32-33 srcport TCP/UDP source portnumber or equivalent

34-35 dstport TCP/UDP destination portnumber or equivalent

36-37 pad1 Unused (zero) byte

38 prot IP protocol type (forexample, TCP = 6; UDP = 17)

39 tos IP type of service (ToS)

40 flags Cumulative OR of TCP flags

41-43 pad1pad2pad3

Unused (zero) bytes

44-48 Reserved Unused (zero) bytes


NetFlow Export Datagram Format

NetFlow V5 formats:

V5 format is an enhancement that adds Border Gateway Protocol (BGP)autonomous system information and flow sequence numbers.

V5 header format

Bytes Fields Description

0-1 version NetFlow export formatversion number

2-3 count Number of flows that areexported in this packet (1-30)

4-7 SysUptime Current time in millisecondssince the export devicestarted

8-11 unix_secs Current count of secondssince 0000 CoordinatedUniversal Time 1970

12-15 unix_nsecs Residual nanoseconds since0000 Coordinated UniversalTime 1970

16-19 flow_sequence Sequence counter of totalflows seen

20 engine_type Type of flow-switchingengine


http://www.cisco.com/c/en/us/td/docs/net_mgmt/netflow_collection_engine/3-6/user/guide/format.html


21 engine_id Slot number of theflow-switching engine

22-23 sampling_interval First two bits hold thesampling mode; remaining14 bits hold value ofsampling interval

V5 Flow record format


0-3 srcaddr Source IP address

4-7 dstaddr Destination IP address

8-11 nexthop IP address of next hop router

12-13 input SNMP index of inputinterface

14-15 output SNMP index of outputinterface

16-19 dPkts Packets in the flow

20-23 dOctets Total number of Layer 3bytes in the packets of theflow

24-27 First SysUptime at start of flow

28-31 Last SysUptime at the time thelast packet of the flow wasreceived

32-33 srcport TCP/UDP source portnumber or equivalent

34-35 dstport TCP/UDP destination portnumber or equivalent

36 pad1 Unused (zero) byte

37 tcp_flags Cumulative OR of TCP flags

38 prot IP protocol type (forexample, TCP = 6; UDP = 17)

39 tos IP type of service (ToS)

40-41 src_as Autonomous system numberof the source, either origin orpeer

42-43 dst_as Autonomous system numberof the destination, eitherorigin or peer

44 src_mask Source address prefix maskbits

45 dst_mask Destination address prefixmask bits

46-47 pad2 Unused (zero) bytes


NetFlow Export Datagram Format


http://www.cisco.com/c/en/us/td/docs/net_mgmt/netflow_collection_engine/3-6/user/guide/format.html

NetFlow V9 formats:

V9 format is template-based. Templates provide an extensible design to the recordformat.

V9 packet layout

The NetFlow V9 record format consists of a packet header and at least one or moretemplate or data FlowSets. A template FlowSet provides a description of the fieldsthat will be present in future data FlowSets. These data FlowSets might occur laterwithin the same export packet or in subsequent export packets.

V9 packet header format

V9 packet header fields

Fields Description

version The version of NetFlow records exported inthis packet; for Version 9, this value is0x0009

count Number of FlowSet records (both templateand data) contained within this packet

SysUptime Current time in milliseconds since the exportdevice is started.

UNIX seconds Current time in seconds that have elapsedsince 00:00:00 Coordinated Universal Time,Thursday, 1 January 1970.

Sequence number Incremental sequence counter of all exportpackets that are sent by this export device;this value is cumulative, and it can be usedto identify any missed export packets.Note: This is a change from the NetFlowVersion 5 and Version 8 headers, where thisnumber represented “total flows.”


Fields Description

Source ID The Source ID field is a 32-bit value that isused to guarantee uniqueness for all flowsthat are exported from a particular device.(The Source ID field is the equivalent of theengine type and engine ID fields that arefound in the NetFlow Version 5 and Version8 headers). The format of this field isvendor-specific. In the Ciscoimplementation, the first two bytes arereserved for future expansion, and is alwayszero. Byte 3 provides uniqueness about therouting engine on the exporting device. Byte4 provides uniqueness about the particularline card or Versatile Interface processor onthe exporting device. Collector devices mustuse the combination of the source IP addressplus the Source ID field to associate anincoming NetFlow export packet with aunique instance of NetFlow on a particulardevice.


NetFlow Version 9 Flow-Record Format

Template FlowSet format:

One of the key elements in the new NetFlow V9 format is the template FlowSet.

NetFlow V9 template FlowSet format

Templates enhance the flexibility of the NetFlow record format because they allowa NetFlow collector or display application to process NetFlow data withoutnecessarily knowing the format of the data in advance. Templates are used todescribe the type and length of individual fields within a NetFlow data record thatmatch a template ID.Related information:


Template FlowSet field descriptions:

Template IDs are not consistent across a router restart. Template IDs must changeonly if the configuration of NetFlow on the export device changes.

NetFlow V9 template FlowSet field descriptions

Field Name Value

FlowSet ID The FlowSet ID is used to distinguishtemplate records from data records. Atemplate record always has a FlowSet ID inthe range of 0-255. Currently, the templaterecord that describes flow fields has aFlowSet ID of zero and the template recordthat describes option fields has a FlowSet IDof 1. A data record always has a nonzeroFlowSet ID greater than 255.


http://www.cisco.com/en/US/technologies/tk648/tk362/technologies_white_paper09186a00800a3db9.html


Field Name Value

Length Length refers to the total length of thisFlowSet. Because an individual templateFlowSet might contain multiple templateIDs, the length value must be used todetermine the position of the next FlowSetrecord, which might be either a template ora data FlowSet.

Length is expressed in Type/Length/Value(TLV) format, meaning that the valueincludes the bytes used for the FlowSet IDand the length bytes themselves, and thecombined lengths of all template recordsincluded in this FlowSet.

Template ID As a router generates different templateFlowSets to match the type of NetFlow datait is exporting, each template is given aunique ID. This uniqueness is local to therouter that generated the template ID.

Templates that define data record formatsbegin numbering at 256 since 0-255 arereserved for FlowSet IDs.

Field Count This field gives the number of fields in thistemplate record. Because a template FlowSetmight contain multiple template records,this field allows the parser to determine theend of the current template record and thestart of the next.

Field Type This numeric value represents the type ofthe field. The possible values of the fieldtype are vendor-specific. Cisco suppliedvalues are consistent across all platformsthat support NetFlow V9.

At the time, of the initial release of theNetFlow V9 code (and after any subsequentchanges that might add new field-typedefinitions), Cisco provides a file thatdefines the known field types and theirlengths.

Field Length This number gives the length of the definedfield, in bytes.

Templates periodically expire if they are not refreshed. Templates can be refreshedin two ways:v A template can be sent again every N number of export packets. Default

template-resend interval of 20 packets, configurable 1-1000 data packets.v A template can also be sent on a timer, so that it is refreshed every N number of

minutes. Both options are user configurable. Default template-resend time of 10minutes, configurable between 1 minute and 1 day.

Note: Both options are configurable by the user on the Exporter. When one ofthese expiry conditions is met, the Exporter must send the template FlowSet andOptions template.



NetFlow V 9 Flow-Record Format

V9 field type definitions:

When extensibility is required, the new field types can be added to the list. Thenew field types must be updated on the Exporter and Collector but the NetFlowexport format remains unchanged.

Field Type Value Length (bytes) Description

IN_BYTES 1 N (default is 4)Incoming counter with length N x8 bits for number of bytesassociated with an IP Flow.

IN_PKTS 2 N (default is 4)

Incoming counter with length N x8 bits for the number of packetsthat are associated with an IPFlow

FLOWS 3 NNumber of flows that areaggregated; default for N is 4

PROTOCOL 4 1 IP protocol byte

SRC_TOS 5 1Type of Service byte setting whenthere is an incoming interface

TCP_FLAGS 6 1Cumulative of all the TCP flagsseen for this flow

L4_SRC_PORT 7 2TCP/UDP source port number.That is, FTP, Telnet, or equivalent

IPV4_SRC_ADDR 8 4 IPv4 source address

SRC_MASK 9 1

The number of contiguous bits inthe source address subnet mask.That is, the submask in slashnotation

INPUT_SNMP 10 NInput interface index; default forN is 2 but higher values might beused

L4_DST_PORT 11 2TCP/UDP destination portnumber. That is, FTP, Telnet, orequivalent

IPV4_DST_ADDR 12 4 IPv4 destination address

DST_MASK 13 1

The number of contiguous bits inthe destination address subnetmask. That is, the submask inslash notation.

OUTPUT_SNMP 14 NOutput interface index; default forN is 2 but higher values might beused

IPV4_NEXT_HOP 15 4 IPv4 address of next-hop router

SRC_AS 16 N (default is 2)Source BGP autonomous systemnumber where N might be 2 or 4

DST_AS 17 N (default is 2)Destination BGP autonomoussystem number where N might be2 or 4




BGP_IPV4_NEXT_HOP 18 4Next-hop router's IP in the BGPdomain

MUL_DST_PKTS 19 N (default is 4)

IP multicast outgoing packetcounter with length N x 8 bits forpackets that are associated withthe IP Flow

MUL_DST_BYTES 20 N (default is 4)IP multicast outgoing byte counterwith length N x 8 bits for bytesassociated with the IP Flow

LAST_SWITCHED 21 4System uptime at which the lastpacket of this flow was switched

FIRST_SWITCHED 22 4System uptime at which the firstpacket of this flow was switched

OUT_BYTES 23 N (default is 4)Outgoing counter with length N x8 bits for the number of bytesassociated with an IP Flow

OUT_PKTS 24 N (default is 4)

Outgoing counter with length N x8 bits for the number of packetsthat are associated with an IPFlow.

MIN_PKT_LNGTH 25 2Minimum IP packet length onincoming packets of the flow

MAX_PKT_LNGTH 26 2Maximum IP packet length onincoming packets of the flow

IPV6_SRC_ADDR 27 16 IPv6 Source Address

IPV6_DST_ADDR 28 16 IPv6 Destination Address

IPV6_SRC_MASK 29 1Length of the IPv6 source mask incontiguous bits

IPV6_DST_MASK 30 1Length of the IPv6 destinationmask in contiguous bits

IPV6_FLOW_LABEL 31 3IPv6 flow label as in RFC 2460definition

ICMP_TYPE 32 2Internet Control Message Protocol(ICMP) packet type; reported as((ICMP Type*256) + ICMP code)

MUL_IGMP_TYPE 33 1Internet Group ManagementProtocol (IGMP) packet type

SAMPLING_INTERVAL 34 4

During the use of sampledNetFlow, the rate at which packetsare sampled. That is, a value of100 indicates that one of every 100packets is sampled

SAMPLING_ALGORITHM 35 1

The type of algorithm that is usedfor sampled NetFlow: 0x01Deterministic Sampling, 0x02Random Sampling

FLOW_ACTIVE_TIMEOUT 36 2Timeout value (in seconds) foractive flow entries in the NetFlowcache



FLOW_INACTIVE_TIMEOUT 37 2Timeout value (in seconds) forinactive flow entries in theNetFlow cache

ENGINE_TYPE 38 1Type of flow switching engine: RP= 0, VIP/Linecard = 1

ENGINE_ID 39 1ID number of the flow switchingengine

TOTAL_BYTES_EXP 40 N (default is 4)

Counter with length N x 8 bits forbytes for the number of bytesexported by the ObservationDomain

TOTAL_PKTS_EXP 41 N (default is 4)

Counter with length N x 8 bits forbytes for the number of packetsthat are exported by theObservation Domain

TOTAL_FLOWS_EXP 42 N (default is 4)

Counter with length N x 8 bits forbytes for the number of flows thatare exported by the ObservationDomain

*Vendor Proprietary* 43

IPV4_SRC_PREFIX 44 4IPv4 source address prefix(specific for Catalyst architecture)

IPV4_DST_PREFIX 45 4IPv4 destination address prefix(specific for Catalyst architecture)

MPLS_TOP_LABEL_TYPE 46 1

MPLS Top Label Type: 0x00UNKNOWN 0x01 TE-MIDPT 0x02ATOM 0x03 VPN 0x04 BGP 0x05LDP

MPLS_TOP_LABEL_IP_ADDR 47 4Forwarding Equivalent Classcorresponding to the MPLS TopLabel

FLOW_SAMPLER_ID 48 1Identifier that is shown in "showflow-sampler"

FLOW_SAMPLER_MODE 49 1

The type of algorithm that is usedfor sampling data: 0x02 randomsampling. Use withFLOW_SAMPLER_MODE

FLOW_SAMPLER_RANDOM_INTERVAL50 4Packet interval at which tosample. Use withFLOW_SAMPLER_MODE

*Vendor Proprietary* 51

MIN_TTL 52 1Minimum TTL on incomingpackets of the flow

MAX_TTL 53 1Maximum TTL on incomingpackets of the flow

IPV4_IDENT 54 2 The IP v4 that identifies field

DST_TOS 55 1Type of Service byte setting whenexiting outgoing interface

IN_SRC_MAC 56 6 Incoming source MAC address



OUT_DST_MAC 57 6Outgoing destination MACaddress

SRC_VLAN 58 2Virtual LAN identifier that isassociated with ingress interface

DST_VLAN 59 2Virtual LAN identifier that isassociated with egress interface

IP_PROTOCOL_VERSION 60 1

Internet Protocol version is set to4 for IPv4, and set to 6 for IPv6. Ifnot present in the template, thenversion 4 is assumed.

DIRECTION 61 1Flow direction: 0 - ingress flow, 1- egress flow

IPV6_NEXT_HOP 62 16IPv6 address of the next-hoprouter

BPG_IPV6_NEXT_HOP 63 16Next-hop router in the BGPdomain

IPV6_OPTION_HEADERS 64 4Bit-encoded field that identifiesIPv6 option headers found in theflow

Vendor Proprietary 65





MPLS_LABEL_1 70 3

MPLS label at position 1 in thestack. It comprises 20 bits ofMPLS label, 3 EXP (experimental)bits and 1 S (end-of-stack) bit.

MPLS_LABEL_2 71 3


MPLS_LABEL_3 72 3


MPLS_LABEL_4 73 3


MPLS_LABEL_5 74 3


MPLS_LABEL_6 75 3




MPLS_LABEL_7 76 3


MPLS_LABEL_8 77 3


MPLS_LABEL_9 78 3


MPLS_LABEL_10 79 3


IN_DST_MAC 80 6Incoming destination MACaddress

OUT_SRC_MAC 81 6 Outgoing source MAC address

IF_NAME 82

N

Shortened interface name, FE1/0(default that isspecified intemplate)

IF_DESC 83N (default thatis specified intemplate)

Full interface name, FastEthernet1/0

SAMPLER_NAME 84N (default thatis specified intemplate)

Name of the flow sampler

IN_ PERMANENT _BYTES 85 N (default is 4)Running byte counter for apermanent flow

IN_ PERMANENT _PKTS 86 N (default is 4)Running packet counter for apermanent flow

* Vendor Proprietary* 87

FRAGMENT_OFFSET 88 2The fragment-offset value fromfragmented IP packets



FORWARDING STATUS 89 1

Forwarding status is encoded on 1byte with the 2 left bits giving thestatus and the 6 remaining bitsgiving the reason code.

Status is either unknown (00),Forwarded (10), Dropped (10) orConsumed (11). List of forwardingstatus values with their meanings:

v Unknown

– 0

v Forwarded

– Unknown 64

– Forwarded Fragmented 65

– Forwarded not Fragmented66

v Dropped

– Unknown 128

– Drop ACL Deny 129

– Drop ACL drop 130

– Drop Unroutable 131

– Drop Adjacency 132

– Drop Fragmentation and DFset 133

– Drop Bad header checksum134

– Drop Bad total Length 135

– Drop Bad Header Length 136

– Drop bad TTL 137

– Drop Policer 138

– Drop WRED 139

– Drop RPF 140

– Drop For us 141

– Drop Bad output interface142

– Drop Hardware 143

v Consumed

– Unknown 192

– Terminate Punt Adjacency193

– Terminate IncompleteAdjacency 194

– Terminate For us 195

MPLS PAL RD 90 8 (array) MPLS PAL Route Distinguisher.



MPLS PREFIX LEN 91 1Number of consecutive bits in theMPLS prefix length.

SRC TRAFFIC INDEX 92 4BGP Policy Accounting SourceTraffic Index

DST TRAFFIC INDEX 93 4BGP Policy AccountingDestination Traffic Index

APPLICATION DESCRIPTION 94 N Application description.

APPLICATION TAG 95 1+n8 bits of engine ID, followed by nbits of classification.

APPLICATION NAME 96 NName that is associated with aclassification.

postipDiffServCodePoint 98 1

The value of a DifferentiatedServices Code Point (DSCP)encoded in the DifferentiatedServices field after modification.

replication factor 99 4 Multicast replication factor.

DEPRECATED 100 N DEPRECATED

layer2packetSectionOffset 102Layer 2 packet section offset.Potentially a generic offset.

layer2packetSectionSize 103Layer 2 packet section size.Potentially a generic size.

layer2packetSectionData 104 Layer 2 packet section data.

105 - 127 Reserved for future use by cisco



Mediation process:

The Collector component receives template definitions from an exporter before theflow records are received. The Flow Records are then decoded and stored locally. Ifthe template definitions are not received at the time a flow record is received, theCollector keeps the flow record for later decode after the template definitions arereceived.

The Collector does not assume that only one template FlowSet is present in anexport packet. In rare circumstances, the export packet might contain severaltemplate FlowSets.

Templates live only for a certain time frame. The lifetime of a template is deductedon the Collector that is based on the time when the last template FlowSet isreceived from the exporter. The Collector does not attempt to decode the flowrecords with an expired template. The Collector maintains a similar list as follows:<Exporter, Export Interface, Template ID, Template ID, Template Def, Last Received>

If a new template definition is received when the exporter is restarted, itimmediately overrides the existing definition.Related information:





IPFIX overview:

Internet Protocol Flow Information Export (IPFIX) is an IETF protocol, and thename of the working group that defines the protocol. The IPFIX protocol providesnetwork administrators with access to IP Flow information. It was created basedon the need for a common, universal standard of export for Internet Protocol flowinformation from routers, probes, and other devices that are used by mediationsystems, accounting, or billing systems and network management systems tofacilitate services such as measurement, accounting, and billing.

A Metering Process collects data packets at an Observation Point, optionally, filtersthem and aggregates information about these packets. Using the IPFIX protocol, anExporter then sends this information to a Collector.

IPFIX is a push protocol, that is, each sender periodically send IPFIX messages toconfigured receivers without any interaction by the receiver. The actual makeup ofdata in IPFIX messages is to a great extent up to the sender. IPFIX introduces themakeup of these messages to the receiver with the help of special Templates. Thesender is also free to use user-defined data types in its messages, so the protocol isfreely extensible and can adapt to different scenarios. IPFIX prefers the StreamControl Transmission Protocol as its transport layer protocol, but also allows theuse of the Transmission Control Protocol or User Datagram Protocol.Related information:

Specification of the IP Flow Information Export (IPFIX) Protocol for theExchange of Flow Information

IPFIX message format:

An IPFIX Message consists of a Message Header, followed by one or more Sets.The Sets can be any of the possible three types - Data Set, Template Set, or OptionsTemplate Set.

Note: The Exporter codes all binary integers of the Message Header and thedifferent Sets in network-byte order (also known as the big-endian byte ordering).

An IPFIX Message consisting of interleaved Template, Data, and Options TemplateSets.

IPFIX message header format

Following are the message header field descriptions:message header field descriptions


https://tools.ietf.org/html/rfc7011


Header field Description

Version Version of Flow Record format that isexported in this message. The value of thisfield is 0x000a for the current version,incrementing by one the version that is usedin the NetFlow services export version 9

Length Total length of the IPFIX Message, which ismeasured in octets, including MessageHeader and Sets.

Export Time Time, in seconds, since 0000 CoordinatedUniversal Time Jan 1, 1970, at which theIPFIX Message Header leaves the Exporter.

Sequence Number Incremental sequence counter-modulo 2^32of all IPFIX Data Records sent on thisPR-SCTP stream from the currentObservation Domain by the ExportingProcess. Check the specific meaning of thisfield in the subsections of Section 10 whenUDP or TCP is selected as the transportprotocol. This value must be used by theCollecting Process to identify whether anyIPFIX Data Records are missed. Templateand Options Template Records do notincrease the Sequence Number.

Observation Domain ID v A 32-bit identifier of the ObservationDomain that is locally unique to theExporting Process. The Exporting Processuses the Observation Domain ID touniquely identify to the Collector.

v Process the Observation Domain thatmetered the Flows. It is recommendedthat this identifier is unique per IPFIXDevice. Collecting Processes must use theTransport Session.

v Observation Domain ID field to separatedifferent export streams that originatefrom the same Exporting Process. TheObservation Domain ID must be 0 whenno specific Observation Domain ID isrelevant for the entire IPFIX Message. Forexample, when the Exporting ProcessStatistics are exported, or in a hierarchy ofCollectors when aggregated Data Recordsare exported.

IPFIX Set format

An IPFIX message consists of a message header followed by multiple Sets ofdifferent types. A Set is a generic term for collection of records that have a similarstructure. There are three types of Sets - Data Set, Template Set, and OptionsTemplate Set. Each oif these have a Set header and one or more records.

Every Set contains a common header. Following are the message header fielddescriptions:


Header Description

Set ID Set ID value identifies the Set. A value of 2is reserved for the Template Set. A value of 3is reserved for the Option Template Set. Allother values 4-255 are reserved for futureuse. Values more than 255 are used for DataSets. The Set ID values of 0 and 1 are notused for historical reasons

Length Total length of the Set, in octets, includingthe Set Header, all records, and the optionalpadding. Because an individual Set MAYcontain multiple records, the Length valuemust be used to determine the position ofthe next Set.

There are three types of sets:v Data Setv Template Setv Options Template SetRelated information:

Specification of the IP Flow Information Export (IPFIX) Protocol for theExchange of Flow Information

Normalized flow record fields in Network Performance InsightA list of normalized flow fields that are used across all vendors and protocols.




Field name V1 V5 V9 V10 Required

packetSequence=header(sequenceNumber)

Note: This is calculated by Collector. Yes

exportTimestampMillis =header(unixSeconds) * 1000 + (header(unixNSecs)/ 1000000)

baseTimestamp= exportTimestampMillis - header(sysuptime)

Not exported

No

templateID =header(sourceID) + templateID All fields for IPFix (V10) arethe same field index and typeas V9.startTimestampMillis =baseTimestamp+

bytes(24-27)=baseTimestamp+bytes(24-27)

=baseTimestamp+22(FIRST_SWITCHED)

endTimestampMillis =baseTimestamp+bytes(28-31)

=baseTimestamp+bytes(28-31)

=baseTimestamp+21(LAST_SWITCHED)

ingressOctets bytes (20-23) bytes (20-23) 1(IN_BYTES)

ingressPackets bytes (16-19) bytes(16-19) 2(IN_PKTS)

egressOctets 23(OUT_BYTES) No

egressPackets 24(OUT_PKTS) No

ingressInterface bytes (12-13) bytes (12-13) 10(INPUT_SNMP) Yes

egressInterface bytes (14-15) bytes (14-15) 14(OUTPUT_SNMP)

sourceIpAddress bytes (0-3) bytes (0-3) 8(IPV4_SRC_ADDR) | 27(IPV6_SRC_ADDR) Yes

destIpAddress bytes (4-7) bytes (4-7) 12(IPV4_DST_ADDR) | (28(IPV6_DST_ADDR) Yes

sourcePort bytes (32-33) bytes (32-33) 7(L4_SRC_PORT) Yes

destPort bytes (34-35) bytes (34-35) 12(L4_DST_PORT) Yes

protocol byte (38) byte (38) 4 (PROTOCOL) Yes

direction ingress ingress 61(direction) No

44IB

M N

etwork Perform

ance Insight:N

etwork Perform



nextHop bytes (8-11) bytes (8-11)15(IPV4_NEXT_HOP) |62(IPV6_NEXT_HOP)

See the note

No

tcpFlags byte (37) byte (37) 6(TCP_FLAGS) No

sourceTos byte (39) byte (39) 5(SRC_TOS) Yes

destTos 55(DEST_TOS) No

sourceAs bytes (40-41) 16(SRC_AS) No

destAs bytes (42-43) 17(DST_AS) No

srcMask byte (44) byte (44) 9(SRC_MASK) | 29(IPV6_SRC_MASK) No

destMask byte (45) byte (45) 13(DST_MASK) |30 (IPV6_DST_MASK) No

bgpNextHop 18(BGP_IPV4_NEXT_HOP) |63(BPG_IPV6_NEXT_HOP)

No

Added fields

applicationId/NameDerived from port +protocol

Same as V195(application_tag) |

Derived from port+protocol

applicationName Derived from appId Same as V196(application_name)

| Derived from appId

exporterIpExtracted from incoming packet,not a part of the incoming flow

record itself Yes

flowSequenceNumberDerived by collector, not a part

of the incoming flow recorditself Yes

version First byte in the incoming packet

Chapter

2. Netw


45


Discarded fields engineType(header(20)) Everything else

engineId(header (21))

samplisongInterval(header (22-23))

46IB

M N

etwork Perform

ance Insight:N

etwork Perform


Note: appName is resolved by using /etc/protocols and /etc/services as lookuptables or finding the service name based on a lookup from port and protocol fieldsfrom a NetFlow record. When no match is found, the appName field is populatedwith <protocolId>:<lowerPortNumber>

Normalized flow field name RAW field name

packetSequence pktSeqNum

egressOctets outOctets

egressPackets outPackets

ingressInterface inIfId

egressInterface outIfId

sourceIpAddress srcIp

destIpAddress dstIp

sourcePort srcPort

protocol protocolId

direction direction

nextHop nextHopIp

tcpFlags tcpBits

sourceTos srcTos

destTos dstTos

sourceAs bgpSrcAsNum

destAs bgpDstAsNum

srcMask srcMask

destMask dstMask

bgpNextHop bgpNextHopIp

exporterIp exporterIp

flowSequenceNumber flowSeqNum

Related concepts:“NetFlow V1 formats” on page 28V1 format is the original format that is supported in the initial NetFlow releases.“NetFlow V5 formats” on page 29V5 format is an enhancement that adds Border Gateway Protocol (BGP)autonomous system information and flow sequence numbers.“NetFlow V9 formats” on page 31V9 format is template-based. Templates provide an extensible design to the recordformat.

Basic fields for V9 and IPFIX records:

Required fields in V9 and IPFIX records for Network Performance Insight v1.2.1.

A flow record can contain either IN counters (bytes and packets) or OUT counters(bytes and packets) or both IN and OUT counters.v IN_BYTES or OUT_BYTESv IN_PACKETS or OUT_PACKETSv PROTOCOL_ID


v SRC_PORTv DST_PORTv INPUT_SNMPv OUTPUT_SNMP

Important: If a flow record does not have either INPUT_SNMP orOUTPUT_SNMP, the record is discarded.

Flow AnalyticsFlow Analytics Service performs aggregations on the flow data to compute Top-Naggregations.

The Flow Analytics Service computes Top-N aggregations for 1 minute, 30 minutes,and 1-day intervals and also evaluates thresholds on interface usage. It providesTraffic Ingress and Egress details of every interface and network level.

The Flow Analytics Service is also responsible for aggregation processes. It refinesthe raw data, filters the results, and aggregates the KPI values. The values areaggregated by SUM, and the results are then stored in Network PerformanceInsight database.

Flow Analytics Service aggregates the data every 1 minute, 30 minutes, and 1-dayintervals. For 1-minute aggregation, it is distributed based on record segmentationand the aggregation is based on SUM of RAW data. For 30 minutes and 1-dayaggregation, it is distributed based on the aggregation type. The aggregation for 30minutes is based on SUM of 1-minute data and for daily aggregation is based onSUM of 30-minutes data.

The Flow Analytics Service also triggers the IP to Domain Name, and DomainName to IP resolution to DNS service.

Flow Analytics Service provides these basic functions:v Categorizes, aggregates, and ranks the data that collected by the Collector.v Detects and alerts real-time threshold violations. A threshold is a value that is

compared against metrics to determine whether the metrics exceed or dropbelow a specific constraint. Currently, only burst threshold is supported. Burstthreshold ignores the natural network bursts by evaluating how long in a rowthe violations occurred per resource. Burst thresholds can be set and resetmultiple times.

Related concepts:“DNS” on page 6The DNS Service operates as a cluster singleton. The DNS service is stateless. Itsperformance depends on the performance of the supporting DNS server that isprovided by your enterprise.

Built-in aggregation definitionsThis section details the grouping keys of the TopN views from the Traffic Detailsdashboard.

Table 7. Traffic Details (Interface level)

Dashboard Views Grouping keys

Top Applications INTERFACE + APPLICATION_NAME


Table 7. Traffic Details (Interface level) (continued)

Dashboard Views Grouping keys

Top Applications with Source INTERFACE + APPLICATION_NAME +SOURCE_IP

Top Applications with Destination INTERFACE + APPLICATION_NAME +DESTINATION_IP

Top Applications with Conversation INTERFACE + APPLICATION_NAME +SOURCE _IP + DESTINATION_IP

Top Conversations INTERFACE + SOURCE_IP +DESTINATION_IP

Top Conversations with Application INTERFACE + SOURCE_IP +DESTINATION_IP + APPLICATION_NAME

Top Destinations INTERFACE + DESTINATION_IP

Top Destinations with Application INTERFACE + DESTINATION_IP +APPLICATION_NAME

Top Protocols INTERFACE + PROTOCOL_ID

Top Protocols with Source INTERFACE + PROTOCOL_ID +SOURCE_IP

Top Protocols with Destination INTERFACE + PROTOCOL_ID +DESTINATION_IP

Top Protocols with Conversation INTERFACE + PROTOCOL_ID +SOURCE_IP + DESTINATION_IP

Top Protocols with Application INTERFACE + PROTOCOL_ID +APPLICATION_NAME

Top Sources INTERFACE + SOURCE_IP

Top Sources with Application INTERFACE + SOURCE_IP +APPLICATION_NAME

Flow ThresholdsNetwork Performance Insight performs threshold evaluation on the data by usingperiod-based data aggregations against the threshold definitions. NetworkPerformance Insight supports burst thresholding. It ignores the natural networkbursts by counting consecutive violations occurrences. Burst threshold is auser-defined static value at specific intervals and is considered static thresholds.

A threshold is a value that is compared against the predefined thresholdconfigurations. It is checked to see whether it violates a specific restriction. Theprimary objective of thresholding is to determine any violations and to generatealerts. This information is stored in Network Performance Insight database and isused by IBM Netcool Operations Insight dashboards.

You can configure thresholds for interfaces based on Ingress or Egress traffic, orboth. The threshold configurations are also based on limit type, threshold upper, orlower limit and number of events. The threshold configuration UI is available onthe Dashboard Application Services Hub.

Enabling a threshold

Threshold values can be configured based on traffic volume data limit, andnumber of events.v Traffic volume data


Set the upper or lower limit for traffic volume data usage by the IP groups orother device groups.

v Number of eventsSet the number of times the traffic volume data usage is allowed to exceed athreshold value for every continuous minute before an alert is raised.

For more information about how to configure flow thresholds from NetworkPerformance Insight system, see Configuring flow thresholds in Installing andConfiguring IBM Network Performance Insight.

Static thresholds definition

Threshold values can be configured based on traffic volume data limit, andnumber of events.

Thresholds define the status of an attribute based on specific conditions. You canenable threshold evaluation on a selected resource or interface. A threshold isviolated when the result of the collected KPI metric is evaluated as exceeding(upper) or dropping (lower) to a specified configured threshold level. The actualevaluation and disposition depends on the threshold type, Over, Under, or Band.

Thresholds have the following attributes:

State Indicates whether a threshold is enabled or disabled to generate the eventswhen a threshold violation occurs.

Limit type

v Over - The measurement must be over the threshold limits to beconsidered in violation.

v Under - The measurement must be under the threshold limits to beconsidered in violation.

v Band - The measurement must be over the upper limit or under thelower limit to be considered in violation. That is, the band between theupper and lower limits is not a violation.

Threshold limitThe severity of a violation is decided based on the violation against eitheron the upper or lower limit that is configured for a threshold.

Number of eventsThe number of events provides a mechanism to suppress short durationviolations.

Number of events is the number of times the traffic volume data usage isallowed to exceed a threshold value for every continuous minute before analert is raised. For an event to be sent to OMNIbus, violations must occurthe specified number of times in a row. Since thresholds are applied to1-minute aggregations, number of events is effectively the number ofminutes that a measurement must be in violation.

Threshold violation:

A KPI is considered to be in a violation state when it breaches the limits of trafficvolume data for a specific number of times according to the configuration settings.

Threshold typeThe severity of a violation is decided based on the violation against eitheron the upper or lower limit that is configured for a threshold.


v If the threshold is configured to check for Over, then severity isdetermined as:– If KPI value is greater than or equal to the Upper Limit, then the

severity is Critical.– If the KPI value is greater than or equal to the lower limit, but not

exceeding the upper limit, then the severity is Major.v If the threshold is configured to check for Under, then severity is

determined as:– If the KPI value is lesser than or equal to the upper limit, but not

dropping below the lower limit, then the severity is Major.– If KPI value is lesser than or equal to the lower limit, then the

severity is Critical.v If the threshold is configured to check for Band, then severity is

determined as:– If the KPI value is greater than or equal to the upper limit, then the

severity is Critical.– If KPI value is lesser than or equal to the lower limit, then the

severity is Critical.

For example, if the limit type is Over, and the traffic volume data is greater thanthe upper limit, the severity of the threshold is set to Critical. If the traffic volumedata is greater than the lower limit, the threshold severity is set to Major.

The following table summarizes how a threshold severity is set to either Major orCritical for different limit types:

Table 8. Threshold severity for limit types

Limit Type Scenario Severity

Over Traffic volume data greaterthan the upper limit

Critical

Traffic volume data greaterthan the lower limit

Major

Under Traffic volume data lesser thanupper limit

Major

Traffic volume data lesser thanthe lower limit

Critical

Band Traffic volume data greaterthan the upper limit

CriticalTraffic volume data lesser thanthe lower limit

Note: For limit type Band, the severity is always set to Critical for any thresholdviolation scenarios.

Exiting a threshold violation

Exiting from a violation state is the action of setting the severity of an event toCLEAR. The severity is cleared after a KPI goes into a non-violation state. Thethreshold at which an event is raised then is cleared.Non-Violation -> Major Violation -> CLEAR -> Non-ViolationOrNon-Violation -> Critical Violation -> CLEAR -> Non-Violation


Network Performance Insight supports system generated auto clear. The followingtable shows, when the system can set the severity to CLEAR.

Table 9. Scenarios for CLEAR severity.

Table that shows scenarios when CLEAR severity occurs

Limit Type Current Severity Scenario New Severity

Over KPI is either in Majoror Critical level

Traffic volume datalesser than the lowerlimit

CLEAR

Under KPI is either in Majoror Critical level

Traffic volume datagreater than upperlimit

CLEAR

Band KPI is in Criticallevel

Traffic volume datafalls within the rangeof upper limit andlower limit

CLEAR

If an existing threshold configuration is modified, by changing any of theconfiguration values such as limit type, upper or lower limit, number of events ordisable the threshold, the system sets the severity to CLEAR.

For more information about the configuration steps, see Configuring Thresholds fromConsole Integrations in Using Network Performance Insight.

Remote Flow CollectorInstall Remote Flow Collector if you want the collector to be co-located with theFlow exporters from which it is collecting data. The flow records that are collectedby the Remote Collectors are sent to the Ambari agent hosts where the StorageService is located.

It performs the same functions as a Flow Collector Service does. The ratio betweenremote and local collectors must be 1:1.


Chapter 3. Platform support

All Network Performance Insight components must be installed on Red Hat Linux,Version 7.2 only.

Co-location rules

While it is possible to deploy all the Network Performance Insight and itsassociated components on a single instance for evaluation purpose. Typically, youmust have at least three hosts; one master Ambari server, and two Ambari agentslaves for Network Performance Insight cluster.

You must plan for servers to install Jazz for Service Management and otherNetcool Operations Insight components.Related information:

Performing a fresh installation

Suggested node and services layoutUse the following guide lines for setting up your IBM Open Platform with ApacheSpark and Apache Hadoop and Network Performance Insight 1.2.1 services in yourcluster.

Ambari agent deployment deploys Network Performance Insight service layer andapplication binary to the cluster hosts, and installs each component to the defaultlocation /opt/IBM/npi and the IBM Open Platform with Apache Spark and ApacheHadoop components to /usr/iop/current directory.

Multi-node cluster deployment

It is suggested that you have at least one Ambari server node and the rest of themas Ambari agents. In the diagram, HOST A is the Ambari server and HOST B, C,and D are the Ambari agents.


https://www.ibm.com/support/knowledgecenter/SSTPTP_1.4.1/com.ibm.netcool_ops.doc/soc/integration/task/soc_int_installfresh.html

Note:

v Make sure that you install Manager Service and Kafka Broker in all Ambariagent nodes. Kafka Schema Registry must be installed along with Kafka Broker.

v Because Zookeeper requires a majority, it is best to use an odd number ofmachines. For example, with four machines ZooKeeper can handle the failure ofa single machine; if two machines fail, the remaining two machines do notconstitute a majority. However, with five machines ZooKeeper can handle thefailure of two machines.

v One Flow Collector can support only one Remote Flow Collector.

Note:

See Netcool Operations Insight documentation for installation scenario for otherintegrated components.Related information:

Deployment considerations

Suggested services layout for IBM Open Platform with Apache Spark andApache Hadoop and BigInsights value-added services


https://www.ibm.com/support/knowledgecenter/SSTPTP_1.4.1/com.ibm.netcool_ops.doc/soc/integration/concept/soc_int_deployment.html

http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.install.doc/doc/service_layout.html

http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.install.doc/doc/service_layout.html

Cluster behaviorProvides the relevance between Network Performance Insight and its relatedservices with the node behavior in a cluster.

Network Performance Insight supports the following types of node behavior.

Cluster singletonA clustered singleton service (also known as an HA singleton) is a servicethat is deployed on multiple nodes in a cluster, but is providing its serviceon only one of the nodes. The node that is running the singleton service istypically called the oldest node.

Load balancingLoad balancing improves the distribution of workloads across multiplenodes where each of the node serves different set of clients that aremutually exclusive.

Managed load balancingThe difference between Load Balancing with Managed load balancing hereis that, node acts as manager node to monitor the load balancing activities.The manager node monitors and distributes the workload among theactive nodes.

Data replication A replication strategy determines the nodes where data replicas are placed.The replicas on multiple nodes are stored to ensure reliability and faulttolerance.

Monitoring in each node A service that is installed on each node in a cluster, where it monitors andprovides information on the installed nodes.

Single instanceA service that is installed on a single node in a cluster, which provides itsservice across all nodes.

The following table lists the service components and their node behavior. Use thefollowing information as guidance to set up your environment.

Chapter 3. Platform support 55

Table 10. Cluster node behavior

Services Service Type Service ComponentsCluster nodebehavior

NetworkPerformance Insight

Master Manager Monitoring on eachnode

Slave DNS Cluster singleton

Slave Entity Analytics Cluster singleton

Slave Event Cluster singleton

Slave Flow Analytics Managed loadbalancing

Slave Flow Collector Load balancing

For moreinformation, seeConfiguring thenumber of interfaces inInstalling andConfiguring IBMNetwork PerformanceInsight.

Slave Tivoli NetworkManager Collector

Cluster singleton

Slave Storage Cluster singleton

Slave Threshold Cluster singleton

Slave UI Load balancing

For moreinformation, see UIService in IBMNetwork PerformanceInsight: ProductOverview.

Slave SNMP Collector Cluster singleton

Slave Formula Service Cluster singleton

HDFS Master NameNode Single instance

Master SNameNode Single instance

Slave DataNode Data replication

YARN Master Timeline Server Single instance

Master Resource Manager Single instance

Slave Node Manager Managed loadbalancing

ZooKeeper Master ZooKeeper Data replication

Ambari Metrics Master Collector Single instance

Slave Monitor Monitoring on eachnode

Kafka Master Kafka Broker Data replication

Master Kafka Connect Single instance

Master Kafka SchemaRegistry

Load balancing


Table 10. Cluster node behavior (continued)

Services Service Type Service ComponentsCluster nodebehavior

MapReduce2 Master History Server Single instance

Chapter 3. Platform support 57

Notices

This information was developed for products and services offered in the US. Thismaterial might be available from IBM in other languages. However, you may berequired to own a copy of the product or product version in that language in orderto access it.

IBM may not offer the products, services, or features discussed in this document inother countries. Consult your local IBM representative for information on theproducts and services currently available in your area. Any reference to an IBMproduct, program, or service is not intended to state or imply that only that IBMproduct, program, or service may be used. Any functionally equivalent product,program, or service that does not infringe any IBM intellectual property right maybe used instead. However, it is the user's responsibility to evaluate and verify theoperation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matterdescribed in this document. The furnishing of this document does not grant youany license to these patents. You can send license inquiries, in writing, to:

IBM Director of LicensingIBM CorporationNorth Castle Drive, MD-NC119Armonk, NY 10504-1785US

For license inquiries regarding double-byte character set (DBCS) information,contact the IBM Intellectual Property Department in your country or sendinquiries, in writing, to:

Intellectual Property LicensingLegal and Intellectual Property LawIBM Japan Ltd.19-21, Nihonbashi-Hakozakicho, Chuo-kuTokyo 103-8510, Japan

INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THISPUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHEREXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIEDWARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESSFOR A PARTICULAR PURPOSE. Some jurisdictions do not allow disclaimer ofexpress or implied warranties in certain transactions, therefore, this statement maynot apply to you.

This information could include technical inaccuracies or typographical errors.Changes are periodically made to the information herein; these changes will beincorporated in new editions of the publication. IBM may make improvementsand/or changes in the product(s) and/or the program(s) described in thispublication at any time without notice.

Any references in this information to non-IBM websites are provided forconvenience only and do not in any manner serve as an endorsement of those


websites. The materials at those websites are not part of the materials for this IBMproduct and use of those websites is at your own risk.

IBM may use or distribute any of the information you provide in any way itbelieves appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the purposeof enabling: (i) the exchange of information between independently createdprograms and other programs (including this one) and (ii) the mutual use of theinformation which has been exchanged, should contact:

IBM Director of LicensingIBM CorporationNorth Castle Drive, MD-NC119Armonk, NY 10504-1785US

Such information may be available, subject to appropriate terms and conditions,including in some cases, payment of a fee.

The licensed program described in this document and all licensed materialavailable for it are provided by IBM under terms of the IBM Customer Agreement,IBM International Program License Agreement or any equivalent agreementbetween us.

The performance data discussed herein is presented as derived under specificoperating conditions. Actual results may vary.

The client examples cited are presented for illustrative purposes only. Actualperformance results may vary depending on specific configurations and operatingconditions.

Information concerning non-IBM products was obtained from the suppliers ofthose products, their published announcements or other publicly available sources.IBM has not tested those products and cannot confirm the accuracy ofperformance, compatibility or any other claims related to non-IBM products.Questions on the capabilities of non-IBM products should be addressed to thesuppliers of those products.

Statements regarding IBM's future direction or intent are subject to change orwithdrawal without notice, and represent goals and objectives only.

All IBM prices shown are IBM's suggested retail prices, are current and are subjectto change without notice. Dealer prices may vary.

This information is for planning purposes only. The information herein is subject tochange before the products described become available.

This information contains examples of data and reports used in daily businessoperations. To illustrate them as completely as possible, the examples include thenames of individuals, companies, brands, and products. All of these names arefictitious and any similarity to actual people or business enterprises is entirelycoincidental.

COPYRIGHT LICENSE:


This information contains sample application programs in source language, whichillustrate programming techniques on various operating platforms. You may copy,modify, and distribute these sample programs in any form without payment toIBM, for the purposes of developing, using, marketing or distributing applicationprograms conforming to the application programming interface for the operatingplatform for which the sample programs are written. These examples have notbeen thoroughly tested under all conditions. IBM, therefore, cannot guarantee orimply reliability, serviceability, or function of these programs. The sampleprograms are provided "AS IS", without warranty of any kind. IBM shall not beliable for any damages arising out of your use of the sample programs.

Each copy or any portion of these sample programs or any derivative work mustinclude a copyright notice as follows:

© (your company name) (year).Portions of this code are derived from IBM Corp. Sample Programs.© Copyright IBM Corp. _enter the year or years_.

TrademarksIBM, the IBM logo, and ibm.com are trademarks or registered trademarks ofInternational Business Machines Corp., registered in many jurisdictions worldwide.Other product and service names might be trademarks of IBM or other companies.A current list of IBM trademarks is available on the web at "Copyright andtrademark information" at www.ibm.com/legal/copytrade.shtml.

Adobe, Acrobat, PostScript and all Adobe-based trademarks are either registeredtrademarks or trademarks of Adobe Systems Incorporated in the United States,other countries, or both.

IT Infrastructure Library is a registered trademark of the Central Computer andTelecommunications Agency which is now part of the Office of GovernmentCommerce.

Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo,Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks orregistered trademarks of Intel Corporation or its subsidiaries in the United Statesand other countries.

Linux is a registered trademark of Linus Torvalds in the United States, othercountries, or both

Microsoft and Windows are trademarks of Microsoft Corporation in the UnitedStates, other countries, or both.

ITIL is a registered trademark, and a registered community trademark of TheMinister for the Cabinet Office, and is registered in the U.S. Patent and TrademarkOffice.

UNIX is a registered trademark of The Open Group in the United States and othercountries.

Notices 61

http://www.ibm.com/legal/us/en/copytrade.shtml

Java and all Java-based trademarks and logosare trademarks or registered trademarks ofOracle and/or its affiliates.

Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in theUnited States, other countries, or both and is used under license therefrom.

Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo aretrademarks of HP, IBM Corp. and Quantum in the U.S. and other countries.

Terms and conditions for product documentationPermissions for the use of these publications are granted subject to the followingterms and conditions.

Applicability

These terms and conditions are in addition to any terms of use for the IBMwebsite.

Personal use

You may reproduce these publications for your personal, noncommercial useprovided that all proprietary notices are preserved. You may not distribute, displayor make derivative work of these publications, or any portion thereof, without theexpress consent of IBM.

Commercial use

You may reproduce, distribute and display these publications solely within yourenterprise provided that all proprietary notices are preserved. You may not makederivative works of these publications, or reproduce, distribute or display thesepublications or any portion thereof outside your enterprise, without the expressconsent of IBM.

Rights

Except as expressly granted in this permission, no other permissions, licenses orrights are granted, either express or implied, to the publications or anyinformation, data, software or other intellectual property contained therein.

IBM reserves the right to withdraw the permissions granted herein whenever, in itsdiscretion, the use of the publications is detrimental to its interest or, asdetermined by IBM, the above instructions are not being properly followed.

You may not download, export or re-export this information except in fullcompliance with all applicable laws and regulations, including all United Statesexport laws and regulations.

IBM MAKES NO GUARANTEE ABOUT THE CONTENT OF THESEPUBLICATIONS. THE PUBLICATIONS ARE PROVIDED "AS-IS" AND WITHOUTWARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDINGBUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY,


NON-INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE.

Notices 63

IBM®

Printed in USA

with ibm corp. · 2017. 6. 9. · apache hadoop stack. the featur es of iop that ar e used in...

Documents