Post on 03-Jul-2020

vSAN Performance Monitor User Manual


TABLE OF CONTENTS

OVERVIEW
REQUIREMENTS
INSTALLATION
CONFIGURATION
STARTUP
TROUBLESHOOTING
SCREENSHOTS


Overview

The vSAN performance monitor is a monitoring and visualization tool based on vSAN performance metrics. It periodically collects vSAN performance and other metrics from the configured clusters. The collected data is visualized in an efficient, user-friendly way.

The vSAN performance monitor comes with preconfigured dashboards that help customers evaluate the performance of vSAN clusters, identify and diagnose problems, and understand current and future bottlenecks. The dashboards are heavily inspired by vSAN Observer.

The vSAN performance monitor is delivered as a virtual appliance with three major components: a Telegraf collector, InfluxDB, and a Grafana frontend.

• Telegraf: the agent that collects metrics from the vSAN cluster and stores them in InfluxDB.

• InfluxDB: the database that stores the metrics.

• Grafana: the frontend used to visualize the metrics stored in InfluxDB.

Once the appliance is deployed, users need to make a few simple configuration changes to point the collector at the target vSAN cluster(s), and then start the service. After that, data is collected periodically and can be visualized for meaningful insights.


Requirements

• Web Browser: IE8+, Firefox or Chrome

For deploying the client VM

• vSphere 6.0 or later with VM version 11 or later is needed to deploy the client VM.

For the target vCenter you want to monitor

• vSphere 6.0 or later is required for the vCenter to be monitored.

• Clusters must have vSAN enabled.

• The vSAN performance service must be turned on in the vCenter you want to monitor. Please refer to the VMware documentation page on enabling the performance service for details. You can select your vSphere version in the top-right corner of that page.


Installation

1. Log in to the vSphere Client and right-click the datacenter or cluster on which you want to deploy the virtual machine.

2. Choose “Deploy OVF Template”.

3. Select the vSAN-Performance-Monitor file as the OVF template, then follow the steps to select the compute resource, storage, and network, and to set the root password.


4. Once the status of the “Deploy OVF template” task changes to “Completed”, start the virtual machine by clicking Actions -> Power -> Power On.

5. After the VM has started successfully, its IP address is displayed, and you can log in to the VM to complete the configuration. If you have any problem starting the VM, please refer to the Troubleshooting section.


Configuration

To use the vSAN performance monitor to collect and visualize metrics from one or more vCenters, you need to update /root/telegraf.conf. The configuration follows the Telegraf plugin we have published.

1. Open a command line and log in to the VM over SSH as root, using the root password you set during deployment. The default root password is vmware.

ssh root@<vm-ip>

2. Edit /root/telegraf.conf with vim:

vim /root/telegraf.conf


3. You will need to change the following fields:

• vCenter credentials

vcenters = ["https://<vcenter-ip>/sdk"] # a list of vCenters to connect to
username = "<name>" # the username for the vCenters
password = "<pwd>" # the password for the vCenters

• vSAN cluster filter

# vSAN performance metrics are collected at the cluster level, and the clusters to be monitored are selected using inventory paths.
# You need to configure the path if your inventory layout is customized.
# You can also modify this field if you want to collect from only a subset of clusters.
# By default, all clusters are collected.
vsan_cluster_include = ["/*/host/*"] # inventory paths of the clusters to collect

• Skipping vCenter certificate verification (if you use this option, skip the next item)

# In this mode, TLS is susceptible to man-in-the-middle attacks.
# Use this only for testing, or if you deliberately want to skip certificate verification.
insecure_skip_verify = true # skip verification of the vCenter's certificate chain

• vCenter certificate verification:


vcenters = ["https://<vc-name>/sdk"] # use the vCenter hostname instead of its IP
# The path /root/certificates/ is mounted into the Docker container. Please keep your certificates in that folder.
ssl_ca = "/root/certificates/<ca certificate>" # path to the CA certificate

Step 1: Download the root CA certificates and place them under /root/certificates/.

Step 2: In telegraf.conf, set ssl_ca to the path of the CA file you just downloaded. E.g.,

ssl_ca = "/root/certificates/f076756e.0"

Step 3: Use the vCenter hostname instead of its IP so that the certificate chain can be verified. E.g.,

vcenters = ["https://2-10-184-161-2.vmware.com/sdk"]


4. Optional: you can also change the following fields if necessary:

• Interval

interval = "300s" # how often metrics are collected; 300s is recommended
flush_interval = "300s" # how often metrics are sent; 300s is recommended

• Metrics to collect

vsan_metric_include = […] # vSAN performance entities to collect

The default config file collects the following metrics:

"summary.disk-usage", "summary.health", "summary.resync",
"performance.cluster-domclient", "performance.cluster-domcompmgr",
"performance.host-domclient", "performance.host-domcompmgr",
"performance.cache-disk", "performance.disk-group", "performance.capacity-disk", "performance.disk-group",
"performance.virtual-machine", "performance.vscsi", "performance.virtual-disk",
"performance.vsan-host-net", "performance.vsan-vnic-net", "performance.vsan-pnic-net",
"performance.vsan-iscsi-host", "performance.vsan-iscsi-target", "performance.vsan-iscsi-lun",
"performance.lsom-world-cpu", "performance.nic-world-cpu", "performance.dom-world-cpu", "performance.cmmds-world-cpu",
"performance.host-cpu", "performance.host-domowner", "performance.host-memory-slab", "performance.host-memory-heap"

• Query concurrency

collect_concurrency = 5 # the number of simultaneous queries for collection
discover_concurrency = 5 # the number of simultaneous queries for discovery

• Others

# whether or not to force discovery of new objects on the initial gather call before collecting metrics
force_discover_on_init = false
# the interval before (re)discovering objects subject to metrics collection
object_discovery_interval = "300s"
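
The hostname requirement for certificate verification in step 3 lends itself to a quick sanity check before the service is started. The following Python sketch is only an illustration and is not part of the appliance: the function name check_vcenter_urls is hypothetical, and it takes the values of vcenters and insecure_skip_verify as plain arguments rather than parsing telegraf.conf. It flags any vCenter URL that uses an IP address while certificate verification is enabled.

```python
import ipaddress
from urllib.parse import urlsplit


def check_vcenter_urls(vcenters, insecure_skip_verify):
    """Return a list of warnings for the given vcenters / insecure_skip_verify values.

    Hypothetical helper for illustration; it does not read telegraf.conf itself.
    """
    if insecure_skip_verify:
        # Verification is off, so hostname vs. IP does not matter here.
        return ["insecure_skip_verify is true: certificate verification "
                "is skipped (use for testing only)"]

    warnings = []
    for url in vcenters:
        host = urlsplit(url).hostname or ""
        try:
            ipaddress.ip_address(host)
        except ValueError:
            continue  # not an IP literal, so it is treated as a hostname
        warnings.append(f"{url}: uses an IP address; use the vCenter hostname "
                        "so that the certificate chain can be verified")
    return warnings
```

For example, check_vcenter_urls(["https://10.0.0.5/sdk"], False) returns one warning, while a URL with a hostname returns an empty list. The addresses here are placeholders.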

Startup

1. After the config file is edited and saved, run docker-compose up -d to start all components in detached mode. Then run docker ps to verify that Telegraf, InfluxDB, and Grafana have started successfully. There should be three containers with a STATUS of “Up”.


2. Open http://<vm-ip>:3000 in the browser, and you will see the Grafana home page.

3. The default login credentials are admin/admin. You may want to change the password after you log in.

4. After logging in, click the "Dashboards" icon on the left bar and choose "Manage" to view all available dashboards.


5. With the default collection interval, data appears in Grafana after 5 to 10 minutes. For example, you can click “vSAN Overview dashboard” to see an overview.

6. To stop all components, run

docker-compose down
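
The container check in step 1 can also be scripted. The sketch below is an example under stated assumptions, not something shipped with the appliance: it assumes Python and the docker CLI are available on the VM, runs docker ps -a with a --format template that prints each container's name and status separated by "|", and sorts the containers into those whose status starts with "Up" and the rest. Both helper names are hypothetical.

```python
import subprocess


def docker_ps_output():
    """Run `docker ps -a` and return one 'name|status' line per container."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.Names}}|{{.Status}}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def split_by_status(ps_output):
    """Split 'name|status' lines into (up, not_up) lists of container names."""
    up, not_up = [], []
    for line in ps_output.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        name, _, status = line.partition("|")
        if status.startswith("Up"):
            up.append(name)
        else:
            not_up.append(name)
    return up, not_up
```

Calling split_by_status(docker_ps_output()) on the VM should yield three names in the first list once all components are running. Any name in the second list is a candidate for docker logs.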


Troubleshooting

1. “No host is compatible with the virtual machine”

Users may run into VMware-specific problems when trying to start the Fling VM (e.g. "The guest operating system 'vmwarePhoton64Guest' is not supported").

Solution:

Right-click the VM and click "Compatibility" > "Upgrade VM compatibility" > "Yes". When asked to choose "Compatible with", keep the default option (e.g. "ESXi 6.7 and later" or "Workstation 12 and later"), and then press "OK". The VM should now be able to start.

2. How to view the logs?

Run docker ps -a to list running and stopped containers with their IDs.

Run docker logs <container-id> to view logs.

3. Not all Docker containers start successfully

If one or more Docker containers fail to start, the most likely reason is that telegraf.conf is not configured correctly.

Solution:

Run docker ps -a to find out which container failed.

Run docker logs <container-id> to see the reason for the failure.

4. No data points are available in the Grafana UI

With the default collection interval, data points usually appear in Grafana 5 to 10 minutes after you run docker-compose up -d. If you do not see any data coming in, please make sure that:


• All three containers have started successfully. Please refer to the previous troubleshooting steps.

• The vSAN performance service is enabled on the target vCenter, as described in the Requirements section.

• The vSAN clusters’ inventory paths are configured correctly and the clusters are successfully discovered during the discovery phase. By default, the inventory path of a vSAN cluster is /DatacenterName/host/ClusterName, where the “host” folder is created by the system. You can use wildcards to select a group of resources, e.g.

▪ /DatacenterName/host/* for all clusters in a datacenter

▪ /*/host/* for all clusters in all datacenters

If you are not sure about the inventory path of a cluster, please go to https://<vcenter-ip>/mob.
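
To get a feel for which clusters a wildcard pattern selects, the matching can be approximated in a few lines of Python. This is a rough illustration only, not the plugin's actual matching code: it assumes that "*" matches within a single path segment, and the function name path_matches and the sample paths are hypothetical.

```python
from fnmatch import fnmatchcase


def path_matches(pattern, inventory_path):
    """Approximate inventory-path matching, one path segment at a time.

    Assumption: '*' never crosses a '/' boundary, so the pattern and the
    path must have the same number of segments.
    """
    pattern_parts = pattern.strip("/").split("/")
    path_parts = inventory_path.strip("/").split("/")
    if len(pattern_parts) != len(path_parts):
        return False
    return all(
        fnmatchcase(part, pat)
        for pat, part in zip(pattern_parts, path_parts)
    )
```

Under this assumption, /*/host/* matches /DC1/host/ClusterA but not /DC1/host/Folder/ClusterA, which is why a customized inventory layout (for example, clusters placed inside folders) needs its own path in vsan_cluster_include.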

5. Certificate verification issue

• If you see errors like “x509: certificate signed by unknown authority”, please make sure you have followed the certificate verification instructions in the Configuration section and used the correct certificate file.

• If you see errors like “vcenter cannot validate certificate for <ipaddress> because it doesn't contain any IP SANs”, please make sure you are using the vCenter’s hostname instead of its IP address.

• If certificate verification still fails, you can skip verification by setting insecure_skip_verify = true. In this mode, TLS is susceptible to man-in-the-middle attacks, so this should be used only for testing.


Screenshots


References

1. Grafana: https://grafana.com/

2. InfluxDB: https://www.influxdata.com/products/influxdb-overview/

3. Telegraf: https://www.influxdata.com/time-series-platform/telegraf/