Monitoring a Galera Cluster Codership Training

Introduction
Galera & Monitoring Overview
Galera Status Variables
Server Logs
Notification Command
Customized Scripts
Third-Party Software
Conclusion

Introduction

Version 1.0. Copyright Codership Oy 2019. All Rights Reserved.

Introductions

Codership Oy
  Creators & Developers of Galera Cluster
  Employees in Multiple Countries

Galera Cluster
  Released Initially in May 2007
  Over 1.5 Million Downloads

Russell Dyer, Presenter
  KB Editor, Documentation, Instructor (MySQL, MariaDB)
  Writer (O'Reilly Books)

Tutorial Outline

Galera & Monitoring Overview
Galera Status Variables
Server Logs
Notification Command
Customized Scripts
Third-Party Software

Galera & Monitoring Overview

Basic Replication Concepts

Nodes: Physical & Virtual Servers
Databases: MySQL, MariaDB, XtraDB
Replication
Load Balancing
High Availability

[Diagram: a Master and two Slaves, labeled with Client Writes and Client Reads]

Galera Cluster Concepts

Virtual Synchronous Replication
True Multi-Master Solution
Conflict Detection & Resolution on Commit
Multiple, Odd Number of Nodes
Easy Maintenance
  Automatic Provisioning
  Node Isolation
  Rolling Upgrades

[Diagram: a Galera Cluster of three nodes (Node 1, Node 2, Node 3), labeled with Client Writes and Client Reads]

Monitoring a Cluster

What to Monitor
  Nodes Down
  Partitioned Clusters
  Lagging Nodes
Resources Available
  Status Variables
  Log Files
  Scripts & Software

Galera Status Variables

Galera Status Variables

Cluster Integrity
  Communication among Nodes
Node Status
  Networking & Basic Requirements of Individual Nodes
Replication Health
  Performance and Dependability of Schema and Data Changes

SHOW GLOBAL STATUS LIKE 'wsrep_%';

+------------------------+-------------------+
| Variable_name          | Value             |
+------------------------+-------------------+
| wsrep_local_state_uuid | bd5fe1c3-7d80-... |
| wsrep_protocol_version | 10                |
| ...                    | ...               |
| wsrep_thread_count     | 6                 |
+------------------------+-------------------+

Cluster Integrity

Compare UUIDs (wsrep_cluster_state_uuid)
Count the Nodes (wsrep_cluster_size)
Do an Audit (wsrep_cluster_conf_id)
Ensure Primary Status (wsrep_cluster_status)

SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';

+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+
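The checks on this slide could be combined in a small shell helper. The following is only a sketch: the function names and the operator-supplied expected cluster size are assumptions, and the values are passed in as arguments so the logic can be tried without a live node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: judge cluster integrity from values already read
# from a node's wsrep_cluster_* status variables.
#   $1 = wsrep_cluster_status   (should be "Primary")
#   $2 = wsrep_cluster_size     (number of nodes this node sees)
#   $3 = expected cluster size  (an assumption supplied by the operator)
check_cluster_integrity() {
    local status="$1" size="$2" expected="$3"

    if [ "$status" != "Primary" ]; then
        echo "CRITICAL: node is not in the Primary Component ($status)"
        return 2
    fi
    if [ "$size" -lt "$expected" ]; then
        echo "WARNING: cluster size is $size, expected $expected"
        return 1
    fi
    echo "OK: Primary, $size of $expected nodes"
    return 0
}

# Hypothetical helper: succeed only when every UUID passed in is the same,
# as wsrep_cluster_state_uuid should be on all nodes of one cluster.
same_cluster_uuid() {
    local first="$1" uuid
    for uuid in "$@"; do
        [ "$uuid" = "$first" ] || return 1
    done
    return 0
}
```

On a node, a value might be read with something like `mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'" | awk '{print $2}'` and then passed to `check_cluster_integrity`.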

Node Status

Ready & Connected (wsrep_ready, wsrep_connected)
Node State Comment (wsrep_local_state_comment)

ERROR 1047 (08S01): Unknown command

SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';

+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+
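As a rough illustration of how the three node-status variables might be combined, here is a sketch of a shell helper. The function name and the messages are assumptions. The values are passed in as arguments rather than queried from a server.

```shell
# Hypothetical helper: decide whether a node can take traffic, given values
# already read from its status variables.
#   $1 = wsrep_ready, $2 = wsrep_connected, $3 = wsrep_local_state_comment
node_is_usable() {
    local ready="$1" connected="$2" state="$3"

    [ "$ready" = "ON" ] || { echo "node not ready (wsrep_ready=$ready)"; return 1; }
    [ "$connected" = "ON" ] || { echo "node not connected (wsrep_connected=$connected)"; return 1; }
    case "$state" in
        Synced)         echo "node usable (Synced)"; return 0 ;;
        Donor/Desynced) echo "node serving as donor"; return 1 ;;
        *)              echo "node not synced ($state)"; return 1 ;;
    esac
}
```

Treating a donor as unusable is a conservative choice made for this sketch; a site may prefer to keep routing reads to a donor.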

Replication Health

Bunching of Writes (wsrep_local_recv_queue_avg)
Flow Control Paused (wsrep_flow_control_paused)
Sequentially in Parallel (wsrep_cert_deps_distance)

SHOW STATUS LIKE 'wsrep_local_send%';

+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| wsrep_local_send_queue     | 0        |
| wsrep_local_send_queue_max | 1        |
| wsrep_local_send_queue_min | 0        |
| wsrep_local_send_queue_avg | 0.000000 |
+----------------------------+----------+

SHOW STATUS LIKE 'wsrep_local_recv%';

+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| wsrep_local_recv_queue     | 0        |
| wsrep_local_recv_queue_max | 1        |
| wsrep_local_recv_queue_min | 0        |
| wsrep_local_recv_queue_avg | 0.145000 |
+----------------------------+----------+

Documentation on Replication Health: https://galeracluster.com/library/documentation/monitoring-cluster.html#check-replication-health
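A script could turn two of these values into a simple health verdict. The sketch below assumes a hypothetical function name and illustrative thresholds (0.5 for the receive-queue average, 0.1 for the paused fraction). Neither threshold is a Galera recommendation. It uses awk because the values are decimals.

```shell
# Hypothetical helper: flag replication-health problems from two values.
#   $1 = wsrep_local_recv_queue_avg  (average receive-queue length)
#   $2 = wsrep_flow_control_paused   (fraction of time paused, 0.0 to 1.0)
#   $3, $4 = optional thresholds; the defaults are illustrative assumptions
replication_health() {
    local recv_avg="$1" paused="$2"
    local max_recv_avg="${3:-0.5}" max_paused="${4:-0.1}"

    awk -v r="$recv_avg" -v p="$paused" -v mr="$max_recv_avg" -v mp="$max_paused" '
        BEGIN {
            # Adding 0 forces a numeric comparison.
            if (p + 0 > mp + 0) { print "WARNING: flow control paused " p; exit 1 }
            if (r + 0 > mr + 0) { print "WARNING: receive queue average " r; exit 1 }
            print "OK"
        }'
}
```

With the receive-queue average shown on this slide, `replication_health 0.145000 0.0` would report OK under these assumed thresholds.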

Server Logs

MySQL Server Logs

Error Log: Errors, Start-Up & Shutdown
Binary Log: Standard Replication and Point-in-Time Recovery
Slow Query Log
SQL Error Log: Govt. Compliance

Relevant Entries from my.cnf configuration file

Error log entries:

log-error='/var/log/mysqld.log'
log-warnings = 1

Binary log entries (purge after 7 days):

log-bin
binlog-expire-logs-seconds=604800

Slow query log entries:

[mysqld]
slow_query_log = ON
slow_query_log_file='/var/log/slow_query.log'
long_query_time = 0.5
log_queries_not_using_indexes
log_slow_admin_statements

SQL error log (requires INSERT privilege on the mysql.plugin table):

INSTALL PLUGIN sql_error_log SONAME 'sql_errlog.so';

Recording More in Server Logs

Log Conflicts (wsrep_log_conflicts)
Certification Failures (cert.log_conflicts)
Debugging (wsrep_debug)

Relevant Entries from my.cnf configuration file

log-error
wsrep_log_conflicts=ON
wsrep_provider_options="cert.log_conflicts=ON"
wsrep_debug=ON

Log Conflicts

Notification Command

Notifying

Notification Script on Each Node
Node Status Changes (e.g., Disconnects)
Notification Command Triggered

[Diagram: a Galera Cluster of three nodes (Node 1, Node 2, Node 3); one node is marked DISCONNECTED, which triggers the notification command]

Node Status String

Starting Node
  Undefined: Not Part of the Primary Component (PC)
  Joiner: Receiving a State Snapshot Transfer (SST)
Existing Nodes
  Donor: Sending SST
  Joined: Processing Updates
  Synced: Synchronized
  Error: Problems Occurred

Documentation on Node Status Values: https://galeracluster.com/library/documentation/notification-cmd.html#node-status

Members List

Node UUID
Node Name (wsrep_node_name)
Incoming Address (wsrep_node_incoming_address)

A members list (line breaks added):

09fd7887-a84d-11e9-bdaa-1718db36144e/galera2/AUTO,
0a1e9504-a871-11e9-944c-13bde588c4e3/galera3/AUTO,
38807f55-a7d2-11e9-aa9b-7674d700b955/galera1/AUTO

Documentation on Members List: https://galeracluster.com/library/documentation/notification-cmd.html#member-list-format
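Since each member is a UUID, a node name and an incoming address separated by slashes, a script can split the list with standard tools. A minimal sketch, in which the function name is an assumption:

```shell
# Hypothetical helper: print one "uuid name address" line per member from a
# comma-separated members list in the uuid/name/address format.
parse_members() {
    # Drop any spaces, put each member on its own line, then split on "/".
    printf '%s\n' "$1" | tr -d ' ' | tr ',' '\n' |
    while IFS='/' read -r uuid name address; do
        # Skip empty items, e.g. from a trailing comma.
        [ -n "$uuid" ] || continue
        echo "$uuid $name $address"
    done
}
```

For the list on this slide, the output would be three lines, one per node, which is easy to count or search with `wc -l` or `grep`.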

Customized Scripts

Simple Notification Script
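The script shown on this slide is not reproduced in the transcript. As a stand-in, here is a minimal sketch of what such a script might look like. It assumes Galera calls the command with the option/value pairs --status, --uuid, --primary, --members and --index, and it appends one line per notification to the log file used elsewhere in this tutorial. The function name and the log line format are assumptions.

```shell
#!/usr/bin/env bash
# Sketch of a notification command, not the presenter's script.
# LOG_FILE can be overridden from the environment for testing.
LOG_FILE="${LOG_FILE:-/var/log/galera-node-monitor.log}"

notify_main() {
    local status="" uuid="" primary="" members="" index=""

    # Read the arguments as option/value pairs; unknown options are skipped.
    while [ "$#" -ge 2 ]; do
        case "$1" in
            --status)  status="$2" ;;
            --uuid)    uuid="$2" ;;
            --primary) primary="$2" ;;
            --members) members="$2" ;;
            --index)   index="$2" ;;
        esac
        shift 2
    done

    # Append one timestamped line per notification.
    printf '%s status=%s uuid=%s primary=%s index=%s members=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" \
        "$status" "$uuid" "$primary" "$index" "$members" >> "$LOG_FILE"
}

# When saved as a script, its last line would be: notify_main "$@"
```

Keeping the work inside a function makes it possible to try the parsing by hand, for example with `LOG_FILE=/tmp/test.log` set and a call such as `notify_main --status Synced --primary yes`.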

Setting a Notification Command

Locate Script in Protected Directory
Touch Log File
Change Permissions & Ownership
Set wsrep_notify_cmd in Configuration File

Preparing empty log file (executed from command-line):

cd /var/log
touch galera-node-monitor.log
chown mysql:mysql galera-node-monitor.log
chmod 755 galera-node-monitor.log

Telling Galera about notification command (excerpt from MySQL configuration file, e.g., my.cnf):

wsrep_notify_cmd='/commands/note-cmd.bash'

Testing a Notification Script

Notifications Logged in MySQL

Logging to MySQL will Replicate to All Nodes
Generating Reports is Easier
  Patterns may Emerge
  Detect Problems Early

MySQL Database Log

Notification Script Interfacing with MySQL
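The script from this slide is not included in the transcript. One way a notification script might hand its data to MySQL is to build an INSERT statement and pipe it to the mysql client. The sketch below only builds the statement. The table monitoring.node_events, its columns and both function names are assumptions made for illustration.

```shell
# Hypothetical helper: quote a value for use in a SQL string literal by
# doubling backslashes and single quotes.
sql_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e "s/'/''/g")"
}

# Hypothetical helper: build an INSERT statement for one notification.
# The table and column names are assumptions, not taken from the slides.
#   $1 = node name, $2 = node status string, $3 = primary (yes/no)
build_notification_insert() {
    local node="$1" status="$2" primary="$3"
    printf 'INSERT INTO monitoring.node_events (node_name, node_status, is_primary, logged_at) VALUES (%s, %s, %s, NOW());\n' \
        "$(sql_quote "$node")" "$(sql_quote "$status")" "$(sql_quote "$primary")"
}
```

The output could be piped to the client, for example `build_notification_insert galera1 Synced yes | mysql`, assuming credentials come from an option file. The insert can only succeed while the node accepts queries, so a script would probably keep the plain log file as a fallback.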

Creating an Alert

Conditional Statements for Alert Worthiness
Email Alert
  mail
  sendmail
SMS Telephone Message
  Subscription to Text Messaging Service
  Determine DBA On-Call
  Assemble Alert Message

curl -X POST https://textbelt.com/text \
  --data-urlencode phone="$admin_nbr" \
  --data-urlencode message="$alert" \
  -d key=textbelt
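The conditional statements for alert worthiness might look like the sketch below. The function name and the rules (alert when the node is outside the Primary Component, or when its status is Error or Undefined) are illustrative assumptions. Each site would choose its own conditions.

```shell
# Hypothetical helper: decide whether a notification deserves an alert and,
# if so, print the message to send. Returns 0 when an alert is warranted
# and 1 otherwise.
#   $1 = node name, $2 = node status string, $3 = primary (yes/no)
build_alert() {
    local node="$1" status="$2" primary="$3"

    if [ "$primary" != "yes" ]; then
        echo "Galera alert: $node is outside the Primary Component (status: $status)"
        return 0
    fi
    case "$status" in
        Error|Undefined)
            echo "Galera alert: $node reports status $status"
            return 0 ;;
    esac
    return 1   # not alert-worthy
}
```

A caller could capture the message with `alert="$(build_alert "$node" "$status" "$primary")"` and, when the function succeeds, pass `$alert` to `mail -s` or to the text-message request.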

Third-Party Software

Monitoring & Trending Software

Monitors
  Cluster Status
  Node States
  Cluster Size
  Flow Control
  Send & Receive Queue
  Slow Nodes
Trends
  Record Stats for Reports & Graphs
  Host Metrics: Repetition & Patterns
  Spot Physical Server & Network Problems
  InnoDB Status (e.g., Spikes)

Popular Software

Monitoring
  Nagios Plugin
  Galera Monitor (MariaDB MaxScale): Load Balancing, Designating Master & Slaves
  Monyog: Individual Nodes
Trending
  SCUMM (Severalnines)
  Percona Monitoring & Management

SCUMM: https://severalnines.com/database-blog/scumm-agent-based-database-monitoring-infrastructure-clustercontrol

Nagios Plugin: https://www.fromdual.com/galera-cluster-nagios-plugin-en

Conclusion

A Monitoring & Reaction Plan

Inventory of Nodes & Cluster
  Notes on Location, Logins, Etc.
  Minimum Nodes Needed for Traffic
  Spare Nodes: Notes on Deploying Them
Monitoring Plans
  List of Relevant Status Variables
  Log Files: Path and Names
  Method or Software for Monitoring
Practice
  Removing a Node for Maintenance
  Adding a Node for Increased Traffic
  Restarting a Cluster

Additional Resources

Codership Library (galeracluster.com/library)
  Documentation (/library/documentation)
  Knowledge Base (/library/kb)
  FAQ (/library/faq)
  Training (/library/training)
    Videos (/library/training/videos)
    Tutorials (/library/training/tutorials)

Tutorial Article on Monitoring Galera: https://galeracluster.com/library/training/tutorials/galera-monitoring.html

Thanks for Listening