Regression Test Old ETL

Page 1: Regression Test Old ETL

Administrator Training with Apache Cassandra

Introduction to Operations

Datastax Inc.

http://www.datastax.com/

August 27, 2012

Datastax Inc. Administrator Training with Apache Cassandra 1/ 22

Page 2: Regression Test Old ETL

Outline

1 Introduction

2 Course Overview

Page 3: Regression Test Old ETL

Advanced Operations with Apache Cassandra

Many Facets to Operating Apache Cassandra

Why a multi-day Course?

Unfamiliar Tools

Hands-On Tasks

Discussion

Page 5: Regression Test Old ETL

What you need to know

Node Routing

Client Access Patterns

dumb clients
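
As an illustration of node routing, here is a minimal sketch of how a row key might be mapped to the node that owns it on a token ring. It is loosely modelled on the MD5-based RandomPartitioner; the modulo step, the three-node ring, and the names `key_token`, `owner`, and `ring` are assumptions made for this example, not Cassandra's actual code.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space, as with RandomPartitioner

def key_token(key):
    """Hash a row key (bytes) to a token in [0, RING_SIZE).

    Simplification: the MD5 digest is reduced with a modulo here.
    """
    digest = int.from_bytes(hashlib.md5(key).digest(), "big")
    return digest % RING_SIZE

def owner(ring, token):
    """Return the node whose range (previous_token, node_token] holds the token."""
    tokens = sorted(ring)
    index = bisect.bisect_left(tokens, token)
    # Tokens past the highest node token wrap around to the first node.
    return ring[tokens[index % len(tokens)]]

# Hypothetical three-node ring: token -> node name.
ring = {
    0: "node-a",
    RING_SIZE // 3: "node-b",
    2 * (RING_SIZE // 3): "node-c",
}

print(owner(ring, key_token(b"user:42")))
```

A "dumb" client does not need this logic: it can send the request to any node, and that node acts as coordinator and forwards the request to the owners.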

Page 6: Regression Test Old ETL

The Write Path

Memtables

SSTables

The Commit Log
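
The interplay of the three components above can be sketched as a toy model: a write is appended to the commit log first, then applied to the in-memory memtable, and a full memtable is flushed to an immutable SSTable. The class `ToyNode`, the `flush_threshold` parameter, and the use of plain Python dicts are assumptions for illustration only; real SSTables are on-disk files.

```python
class ToyNode:
    """Toy model of the write path on a single node."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []   # append-only record used for crash recovery
        self.memtable = {}     # in-memory: key -> (timestamp, value)
        self.sstables = []     # flushed, immutable tables, oldest first
        self.flush_threshold = flush_threshold

    def write(self, key, value, timestamp):
        # 1. Durability first: append the mutation to the commit log.
        self.commit_log.append((key, value, timestamp))
        # 2. Apply it to the memtable, keeping the newest timestamp.
        current = self.memtable.get(key)
        if current is None or timestamp >= current[0]:
            self.memtable[key] = (timestamp, value)
        # 3. Flush once the memtable holds enough keys.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the memtable out sorted by key and never modify it again.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        # The flushed data is now safe, so its log entries can be dropped.
        self.commit_log.clear()

node = ToyNode(flush_threshold=2)
node.write("b", "2", timestamp=1)
node.write("a", "1", timestamp=2)   # second key triggers a flush
node.write("c", "3", timestamp=3)   # lands in a fresh memtable
```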

Page 7: Regression Test Old ETL

The Read Path

Memtables

SSTables

The Commit Log

Row Merging

Cache
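
Row merging can be illustrated with a short sketch: a read collects the versions of each column from the memtable and from every SSTable that holds the row, and the version with the newest timestamp wins. The function `read_row` and the nested-dict layout are assumptions for this example; caches and bloom filters are left out.

```python
def read_row(key, memtable, sstables):
    """Merge one row from the memtable and all SSTables; newest timestamp wins."""
    merged = {}  # column -> (timestamp, value)
    for source in [memtable, *sstables]:
        for column, (timestamp, value) in source.get(key, {}).items():
            if column not in merged or timestamp > merged[column][0]:
                merged[column] = (timestamp, value)
    return {column: value for column, (_, value) in merged.items()}

# Hypothetical data: row key -> column -> (timestamp, value).
memtable = {"user1": {"email": (30, "new@example.com")}}
sstables = [
    {"user1": {"email": (10, "old@example.com"), "name": (10, "Ann")}},
    {"user1": {"name": (20, "Anna")}},
]

row = read_row("user1", memtable, sstables)
```

Because every SSTable containing the row may have to be consulted, the number of SSTables per row has a direct effect on read latency.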

Page 8: Regression Test Old ETL

Compaction

Compaction Strategies

What is Compaction

Choosing a Compaction Strategy
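
As one example of a compaction strategy, a size-tiered approach groups SSTables of similar size and compacts a group once it has enough members. The sketch below is a simplified illustration; the function names and the values of `bucket_low`, `bucket_high`, and `min_threshold` are assumptions chosen for the example, not a statement of Cassandra's exact behaviour.

```python
def size_tiered_buckets(sizes, bucket_low=0.5, bucket_high=1.5):
    """Group SSTable sizes into buckets of roughly similar size."""
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            average = sum(bucket) / len(bucket)
            # Join a bucket if the size is close to the bucket's average.
            if bucket_low * average <= size <= bucket_high * average:
                bucket.append(size)
                break
        else:
            buckets.append([size])  # no similar bucket found, start a new one
    return buckets

def compaction_candidates(sizes, min_threshold=4):
    """Return the buckets that hold enough SSTables to be worth compacting."""
    return [b for b in size_tiered_buckets(sizes) if len(b) >= min_threshold]

sstable_sizes_mb = [10, 11, 12, 9, 100, 400]  # hypothetical sizes
candidates = compaction_candidates(sstable_sizes_mb)
```

Here the four small tables form one bucket and become a compaction candidate, while the two large tables are left alone.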

Page 9: Regression Test Old ETL

Installing Apache Cassandra

Packaging

Binary and Source Releases

Page 10: Regression Test Old ETL

Sizing and Hardware

Different Hardware Configurations

Pros and Cons

Memory, CPU, IO

SSDs and Spindles

Cloud (Amazon/Rackspace)

Page 11: Regression Test Old ETL

Tuning

Disks

Memory

JVM

Heap Size

Garbage Collection

Swap

Cache

row_cache_provider

Compaction

Compaction Strategies

Page 12: Regression Test Old ETL

Benchmarking

stress

reads

writes

mixed workloads

Page 13: Regression Test Old ETL

Schema

What is in the Schema?

Online Updates

Concurrent Updates

Best Practices for Schema Updates

Page 14: Regression Test Old ETL

Ring Management

Token Selection

Token Movement

Bootstrapping

Moving

Balancing

Replacing Dead Nodes (Strategies)
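
For token selection, a common starting point is to space the initial tokens evenly around the ring so that each node owns an equal share. The sketch below assumes a single data center and a token space of 0 to 2**127, as with RandomPartitioner; the function name `balanced_tokens` is made up for this example.

```python
RING_SIZE = 2 ** 127  # assumed token space (RandomPartitioner)

def balanced_tokens(node_count):
    """Return evenly spaced initial tokens, one per node."""
    return [i * RING_SIZE // node_count for i in range(node_count)]

for index, token in enumerate(balanced_tokens(4)):
    print(f"node {index}: initial_token = {token}")
```

Adding a node to an existing ring changes the ideal spacing, which is why token movement and rebalancing appear alongside token selection.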

Page 15: Regression Test Old ETL

The System Keyspace

What is the System Keyspace?

What are the components?

Replication and Propagation

Page 16: Regression Test Old ETL

Managing Consistency

Nodetool Repair

Failure
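
When reasoning about consistency under failure, the usual arithmetic is that a quorum is a majority of the replicas, and that a read is guaranteed to see the latest acknowledged write when the read and write replica counts overlap. The helper names below are assumptions for this sketch.

```python
def quorum(replication_factor):
    """Number of replicas that form a majority."""
    return replication_factor // 2 + 1

def tolerated_failures(replication_factor):
    """Replicas that can be down while quorum operations still succeed."""
    return replication_factor - quorum(replication_factor)

def reads_see_latest_write(read_replicas, write_replicas, replication_factor):
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor

rf = 3
print(quorum(rf), tolerated_failures(rf))
```

When reads and writes do not overlap, replicas can diverge until a repair brings them back in sync.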

Page 17: Regression Test Old ETL

Managing Data

Where is my data?

How is it stored?

What are all these files?

Recover a lost index or bloom filter?

Snapshots

Import/Export

Page 18: Regression Test Old ETL

Sizing

Target data size

Why smaller is better (usually)

Number of nodes

Replication and sizing
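
The relationship between data size, replication, and node count can be sketched with simple arithmetic. This is a rough estimate only: it assumes data is spread evenly, and it ignores compaction headroom, indexes, and snapshots. The function names and the example figures are assumptions.

```python
import math

def per_node_load_gb(raw_data_gb, replication_factor, node_count):
    """Approximate data held by each node when load is evenly balanced."""
    return raw_data_gb * replication_factor / node_count

def nodes_needed(raw_data_gb, replication_factor, target_per_node_gb):
    """Smallest node count that keeps each node at or below the target size."""
    total_gb = raw_data_gb * replication_factor
    # A cluster needs at least as many nodes as there are replicas.
    return max(replication_factor, math.ceil(total_gb / target_per_node_gb))

print(per_node_load_gb(1000, 3, 6))   # 1 TB raw data, RF 3, 6 nodes
print(nodes_needed(1000, 3, 400))     # same data, 400 GB target per node
```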

Page 19: Regression Test Old ETL

Troubleshooting

Data Corruption?

IO Exceptions

syslog

Recovery

Nodetool Scrub

Latency

Wedged Schema

When IO cannot keep up

Page 20: Regression Test Old ETL

Monitoring

OpsCenter

JMX

nodetool

cfstats

netstats

tpstats

Logs

Monitoring Compaction

Monitoring Repair

Monitoring Back-Pressure

Page 21: Regression Test Old ETL

Replication

Replication Strategies

Configuring Replication

Modifying Replication Parameters

Changing Replication Strategies

Page 22: Regression Test Old ETL

DataStax Enterprise

What is DataStax Enterprise?

Additional Capabilities

Operational Considerations

Overview of Hadoop Components
