regression test old etl

Post on 26-Jan-2015

Administrator Training with Apache Cassandra

Introduction to Operations

Datastax Inc.

http://www.datastax.com/

August 27, 2012

Datastax Inc. Administrator Training with Apache Cassandra 1/ 22

Outline

1 Introduction

2 Course Overview

Advanced Operations with Apache Cassandra

Many Facets to Operating Apache Cassandra

Why a multi-day Course?

Unfamiliar Tools

Hands On Tasks

Discussion

What you need to know

Node Routing

Client Access Patterns

dumb clients

The Write Path

Memtables

SSTables

The Commit Log

The Read Path

Memtables

SSTables

The Commit Log

Row Merging

Cache

Compaction

Compaction Strategies

What is Compaction

Choosing a Compaction Strategy

Installing Apache Cassandra

Packaging

Binary and Source Releases

Sizing and Hardware

Different Hardware Configurations

Pros and Cons

Memory, CPU, IO

SSDs and Spindles

Cloud (Amazon/Rackspace)

Tuning

Disks

Memory

JVM

Heap Size

Garbage Collection

Swap

Cache

row_cache_provider+

Compaction

Compaction Strategies

Benchmarking

stress

reads

writes

mixed workloads

Schema

What is in the Schema?

Online Updates

Concurrent Updates

Best Practices for Schema Updates

Ring Management

Token Selection

Token Movement

Bootstrapping

Moving

Balancing

Replacing Dead Nodes (Strategies)
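
Token selection can be illustrated with a small sketch. It assumes the RandomPartitioner, whose token space is conventionally treated as running from 0 to 2**127, and it assumes one token per node. The function name balanced_tokens and the ring_size parameter are hypothetical, introduced here only for illustration.

```python
def balanced_tokens(node_count, ring_size=2**127):
    """Return evenly spaced initial tokens for a ring of node_count nodes.

    Assumption: RandomPartitioner-style token space of size 2**127,
    with exactly one token assigned to each node.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Node i is placed i/node_count of the way around the ring.
    return [i * ring_size // node_count for i in range(node_count)]


if __name__ == "__main__":
    for index, token in enumerate(balanced_tokens(4)):
        print(f"node {index}: initial_token = {token}")
```

Each value would be the candidate initial token for one node. Adding a node to an existing ring changes the even spacing, which is why token movement and balancing appear on the same slide.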

The System Keyspace

What is the System Keyspace?

What are the components?

Replication and Propagation

Managing Consistency

Nodetool Repair

Failure

Managing Data

Where is my data?

How is it stored?

What are all these files?

Recover a lost index or bloom filter?

Snapshots

Import/Export

Sizing

Target data size

Why smaller is better (usually)

Number of nodes

Replication and sizing
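
The relationship between replication and node count can be sketched as simple arithmetic. This is a rough estimate under stated assumptions, not a sizing method from the course: it ignores compaction headroom, compression, and snapshots, and the function name and parameters are hypothetical.

```python
import math


def estimate_node_count(raw_data_gb, replication_factor, target_per_node_gb):
    """Estimate how many nodes are needed to hold the replicated data set.

    Assumptions: every row is stored replication_factor times, data is
    spread evenly across nodes, and no extra disk headroom is reserved.
    """
    if raw_data_gb < 0 or replication_factor < 1 or target_per_node_gb <= 0:
        raise ValueError("inputs must be positive")
    total_gb = raw_data_gb * replication_factor
    by_capacity = math.ceil(total_gb / target_per_node_gb)
    # A cluster needs at least replication_factor nodes to place each
    # replica on a different node.
    return max(replication_factor, by_capacity)


if __name__ == "__main__":
    print(estimate_node_count(1000, 3, 500))
```

Lowering the target data size per node raises the node count, which is the trade-off behind the "smaller is better (usually)" bullet.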

Troubleshooting

Data Corruption?

IO Exceptions

syslog

Recovery

Nodetool Scrub

Latency

Wedged Schema

When IO cannot keep up

Monitoring

OpsCenter

JMX

nodetool

cfstats

netstats

tpstats

Logs

Monitoring Compaction

Monitoring Repair

Monitoring Back-Pressure

Replication

Replication Strategies

Configuring Replication

Modifying Replication Parameters

Changing Replication Strategies

DataStax Enterprise

What is DataStax Enterprise?

Additional Capabilities

Operational Considerations

Overview of Hadoop Components
