Riptano Training for Apache Cassandra

Riptano

Friday, June 18, 2010 from 9:00 AM to 5:00 PM (PDT)

San Francisco, CA

Event Details

Riptano's training program takes you from 0 to 60 building scalable applications on Apache Cassandra.  This session is appropriate for developers and DBAs looking to understand the design principles involved in modeling against Cassandra, as well as best practices for deploying and maintaining a Cassandra cluster.

This training will include hands-on exercises.  Attendees must bring their own laptop, and should pre-install VMware Player (Fusion on OS X) to run the virtual machine provided by Riptano.  Training will run from 9 to 5 with a one-hour break for lunch (provided by Riptano) and two shorter breaks, one in the morning and one in the afternoon.


Agenda

Installation and configuration

Your VM will come with a single-node Cassandra instance installed.  We'll extend that to three nodes locally, and explain the configuration options available, including multi-datacenter replication.  We'll show how to do simple benchmarks with the py_stress tool.

Application design

Cassandra data modeling is not like relational schema design.  We will cover why denormalization is your friend and how to think in ColumnFamilies, as well as the Thrift API.  As concrete examples, we will explain the data model behind the Twissandra application and CloudKick's time series data.
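To make the denormalization idea concrete, here is a minimal sketch in plain Python.  It does not use Cassandra or the Thrift API, and it is not Twissandra's actual schema: the names `tweets`, `followers`, `timeline`, `post_tweet`, and `read_timeline` are hypothetical, and nested dictionaries stand in for ColumnFamilies (row key, then column name, then value).  The point it illustrates is that data is copied at write time so that a read touches a single row.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for ColumnFamilies:
# row key -> {column name -> value}. Not Twissandra's real schema.
tweets = {}                      # tweet_id -> {"user": ..., "body": ...}
followers = defaultdict(set)     # username -> usernames that follow them
timeline = defaultdict(dict)     # username -> {(timestamp, tweet_id): tweet_id}


def post_tweet(tweet_id, user, body, timestamp):
    """Store the tweet once, then copy a reference into each follower's row."""
    tweets[tweet_id] = {"user": user, "body": body}
    for reader in followers[user] | {user}:
        # Column names sort by timestamp, so a row is already in time order.
        timeline[reader][(timestamp, tweet_id)] = tweet_id


def read_timeline(user, limit=10):
    """Read one row, newest columns first; no join is needed at read time."""
    columns = sorted(timeline[user].items(), reverse=True)[:limit]
    return [tweets[tweet_id]["body"] for _, tweet_id in columns]
```

The trade-off shown here is more work per write (one column per follower) in exchange for a read that is a single-row slice.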

Basics of Cassandra Internals

To understand Cassandra performance, you need to know a little about how it was designed, just as you need to understand query plans when working with relational databases.  We'll explain memtables and sstables, Cassandra's SEDA design, and how to use the JMX metrics it exports to infer its internal state.
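As a rough picture of the memtable and sstable idea, the toy store below buffers writes in memory and flushes them to immutable sorted tables.  It is a sketch under simplifying assumptions, not Cassandra's implementation: the class name `TinyStore` and the `flush_threshold` parameter are hypothetical, and the commit log, timestamps, bloom filters, and compaction are all left out.  In particular, it resolves conflicts by checking the newest table first, whereas Cassandra reconciles columns by timestamp.

```python
class TinyStore:
    """Toy write path: an in-memory memtable flushed to immutable sorted tables."""

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold  # hypothetical knob, for illustration
        self.memtable = {}                      # recent writes, mutable
        self.sstables = []                      # flushed tables, oldest first

    def put(self, key, value):
        """Writes only touch memory; a full memtable triggers a flush."""
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Freeze the memtable into a sorted, immutable table and start a new one."""
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key):
        """Check the memtable first, then flushed tables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for stored_key, value in table:
                if stored_key == key:
                    return value
        return None
```

Even in this toy form, a read may have to consult several tables, which is one reason the number of sstables matters for read performance.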

Operations

We'll cover how Cassandra replication works with no single points of failure, and what this means for adding, load-balancing, and replacing machines safely and efficiently.  We'll explain gossip and failure detection, as well as columnfamily modification, snapshots, and data import and export.
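One way to picture replication without a master is a token ring, sketched below.  This is a simplified illustration, not Cassandra's code: the function `replicas_for`, the integer tokens, and the node names are hypothetical, and it models only the simplest placement rule (the node owning the key's token, followed by the next nodes around the ring), ignoring racks and datacenters.

```python
import bisect


def replicas_for(token, ring, replication_factor):
    """Return the nodes holding a key whose token is `token`.

    `ring` maps each node's token to its name. A node owns the range
    (previous token, its own token], wrapping around at the end.
    """
    tokens = sorted(ring)
    # First node whose token is >= the key's token, wrapping past the last node.
    start = bisect.bisect_left(tokens, token) % len(tokens)
    count = min(replication_factor, len(tokens))
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(count)]


# Example with made-up tokens and node names.
ring = {0: "node1", 100: "node2", 200: "node3"}
print(replicas_for(50, ring, replication_factor=2))
```

Because every node can compute this mapping from the ring state alone, any node can route a request, and adding or replacing a machine changes ownership only for neighboring ranges.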

Tuning and troubleshooting

There are many factors that affect Cassandra performance.  We'll cover OS- and machine-level factors such as the OS buffer cache and disk utilization, JVM factors such as garbage collector settings, and Cassandra tunables such as cache sizes.  We'll also cover how to use the metrics covered previously to recognize warning signs that you need to add capacity to your cluster.  Yes, there will be war stories.


About Apache Cassandra

Cassandra is the "hands down winner for transaction processing performance at scale."  Cassandra is in use at Digg, Facebook, Twitter, Reddit, Rackspace, and many other companies with large, active data sets.  Cassandra's fully-distributed design with no single points of failure allows exceptional reliability.

 

About your instructor

Jonathan Ellis is project chair of Apache Cassandra and co-founder of Riptano.

 

When & Where



Holiday Inn Civic Center
50 Eighth St
San Francisco, CA 94103

Organizer

Riptano

Riptano is the leading expert for Apache Cassandra, providing software, support, and training for all things Cassandra. Riptano is obsessed with providing great customer service.

Our mission is to help you with all of your Cassandra needs so you can focus on your core business. Contact us with your questions.

 
