CASSANDRA COMMUNITY WEBINARS AUGUST 2013
CASSANDRA
INTERNALS
Aaron Morton
@aaronmorton
Co-Founder & Principal Consultant
www.thelastpickle.com
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
About The Last Pickle
Work with clients to deliver and improve
Apache Cassandra based solutions.
Apache Cassandra Committer, DataStax MVP,
Hector Maintainer, 6+ years combined
Cassandra experience.
Based in New Zealand & Austin, TX.
Architecture
Code
Cassandra Architecture
APIs
Cluster Aware
Cluster Unaware
Clients
Disk
Cassandra Cluster Architecture
APIs
Cluster Aware
Cluster Unaware
Clients
Disk
APIs
Cluster Aware
Cluster Unaware
Disk
Node 1 Node 2
Dynamo Cluster Architecture
APIs
Dynamo
Database
Clients
Disk
APIs
Dynamo
Database
Disk
Node 1 Node 2
Architecture
API
Dynamo
Database
API Transports
Thrift
Native Binary
Thrift Transport
// Custom TServer implementations
o.a.c.thrift.CustomTThreadPoolServer
o.a.c.thrift.CustomTNonBlockingServer
o.a.c.thrift.CustomTHsHaServer
API Transports
Thrift
Native Binary
Native Binary Transport
Beta in Cassandra 1.2
Uses Netty
Enabled with
start_native_transport
(Disabled by default)
o.a.c.transport.Server.run()
// Set up the Netty server
new ExecutionHandler()
new NioServerSocketChannelFactory()
ServerBootstrap.setPipelineFactory()
o.a.c.transport.Message.Dispatcher.messageReceived()
// Process message from client
ServerConnection.validateNewMessage()
Request.execute()
ServerConnection.applyStateTransition()
Channel.write()
Messages
Defined in the Native Binary
Protocol
$SRC/doc/native_protocol.spec
API Services
JMX
Thrift
CQL 3
JMX Management Beans
Spread around the code base.
Interfaces named *MBean
JMX Management Beans
Registered with names such as
org.apache.cassandra.db:
type=StorageProxy
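The naming convention on the two slides above can be tried with the JDK alone. The following is a minimal sketch, not Cassandra code: `DemoProxy`, `DemoProxyMBean`, the `ReadCount` attribute and the `org.example.demo` domain are all hypothetical names chosen for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JmxSketch {
    // Management interface. JMX pairs a class Foo with an interface FooMBean.
    public interface DemoProxyMBean {
        long getReadCount(); // exposed as the read-only attribute "ReadCount"
    }

    // Hypothetical service; stands in for something like StorageProxy.
    public static class DemoProxy implements DemoProxyMBean {
        private final AtomicLong reads = new AtomicLong();

        public void recordRead() { reads.incrementAndGet(); }

        @Override
        public long getReadCount() { return reads.get(); }
    }

    // Registers the bean under a "domain:type=Name" style ObjectName.
    static ObjectName register(DemoProxy bean) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("org.example.demo:type=DemoProxy");
            if (server.isRegistered(name)) {
                server.unregisterMBean(name); // allow re-registration in this demo
            }
            server.registerMBean(bean, name);
            return name;
        } catch (Exception e) {
            throw new IllegalStateException("MBean registration failed", e);
        }
    }

    // Reads an attribute back through the MBean server, as a JMX client would.
    static Object readAttribute(ObjectName name, String attribute) {
        try {
            return ManagementFactory.getPlatformMBeanServer().getAttribute(name, attribute);
        } catch (Exception e) {
            throw new IllegalStateException("attribute read failed", e);
        }
    }

    public static void main(String[] args) {
        DemoProxy proxy = new DemoProxy();
        proxy.recordRead();
        ObjectName name = register(proxy);
        System.out.println(name + " ReadCount=" + readAttribute(name, "ReadCount"));
    }
}
```

Once registered, the bean shows up in tools such as jconsole under the chosen domain, which is how the Cassandra beans are usually browsed.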
API Services
JMX
Thrift
CQL 3
o.a.c.thrift.CassandraServer
// Implements Thrift Interface
// Access control
// Input validation
// Mapping to/from Thrift and internal types
Thrift Interface
Thrift IDL
$SRC/interface/cassandra.thrift
o.a.c.thrift.CassandraServer.get_slice()
// get columns for one row
Tracing.begin()
ClientState cState = state()
cState.hasColumnFamilyAccess()
multigetSliceInternal()
CassandraServer.multigetSliceInternal()
// get columns for many rows
ThriftValidation.validate*()
// Create ReadCommands
getSlice()
CassandraServer.getSlice()
// Process ReadCommands
// return Thrift types
readColumnFamily()
thriftifyColumnFamily()
CassandraServer.readColumnFamily()
// Process ReadCommands
// Return ColumnFamilies
StorageProxy.read()
API Services
JMX
Thrift
CQL 3
o.a.c.cql3.QueryProcessor
// Prepares and executes CQL3 statements
// Used by Thrift & Native transports
// Access control
// Input validation
// Returns transport.ResultMessage
CQL3 Grammar
ANTLR Grammar
$SRC/o.a.c.cql3/Cql.g
o.a.c.cql3.statements.ParsedStatement
// Subclasses generated by ANTLR
// Tracks bound term count
// Prepare CQLStatement
prepare()
o.a.c.cql3.statements.CQLStatement
checkAccess(ClientState state)
validate(ClientState state)
execute(ConsistencyLevel cl,
QueryState state,
List<ByteBuffer> variables)
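The parse, prepare, execute lifecycle on the last two slides can be illustrated with a toy model. This is a rough sketch under heavy simplification: `Prepared` and `prepare()` are hypothetical stand-ins, values are plain strings rather than `ByteBuffer`s, and the "parser" only counts `?` markers instead of using the ANTLR grammar.

```java
import java.util.Arrays;
import java.util.List;

class StatementSketch {
    // Executable form; knows how many bound values it expects.
    static final class Prepared {
        final String query;
        final int boundTerms;

        Prepared(String query, int boundTerms) {
            this.query = query;
            this.boundTerms = boundTerms;
        }

        // Validates the variable count, then substitutes values for markers.
        String execute(List<String> variables) {
            if (variables.size() != boundTerms) {
                throw new IllegalArgumentException(
                    "expected " + boundTerms + " variables, got " + variables.size());
            }
            StringBuilder bound = new StringBuilder();
            int next = 0;
            for (char c : query.toCharArray()) {
                if (c == '?') bound.append(variables.get(next++));
                else bound.append(c);
            }
            return bound.toString();
        }
    }

    // "Parsed" step: count the '?' markers and hand back the executable form.
    // Naive on purpose: a '?' inside a string literal would also be counted.
    static Prepared prepare(String query) {
        int boundTerms = 0;
        for (char c : query.toCharArray()) {
            if (c == '?') boundTerms++;
        }
        return new Prepared(query, boundTerms);
    }

    public static void main(String[] args) {
        Prepared select = prepare("SELECT * FROM users WHERE id = ?");
        System.out.println(select.execute(Arrays.asList("42")));
    }
}
```

The point of the split is that the bound term count is fixed at prepare time, so a mismatched variable list can be rejected before any read is attempted.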
statements.SelectStatement.RawStatement
// Implements ParsedStatement
// Input validation
prepare()
statements.SelectStatement.execute()
// Create ReadCommands
StorageProxy.read()
Architecture
API
Dynamo
Database
Dynamo Layer
o.a.c.service
o.a.c.net
o.a.c.dht
o.a.c.gms
o.a.c.locator
o.a.c.stream
o.a.c.service.StorageProxy
// Cluster wide storage operations
// Select endpoints & check CL available
// Send messages to Stages
// Wait for response
// Store Hints
o.a.c.service.StorageService
// Ring operations
// Track ring state
// Start & stop ring membership
// Node & token queries
o.a.c.service.IResponseResolver
preprocess(MessageIn<T> message)
resolve() throws
DigestMismatchException
RowDigestResolver
RowDataResolver
RangeSliceResponseResolver
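The idea behind a digest resolver can be sketched as follows: one replica returns the full row, the others return only a hash, and `resolve()` fails if any hash disagrees with the data. This is a simplified model, not the real `RowDigestResolver`; the class and method names are hypothetical, rows are raw byte arrays, MD5 is assumed as the hash, and the exception is unchecked here only to keep the sketch short.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DigestResolverSketch {
    // Unchecked in this sketch; the slide shows resolve() declaring it.
    static class DigestMismatchException extends RuntimeException {
        DigestMismatchException(String message) { super(message); }
    }

    private byte[] data;                                     // full row from one replica
    private final List<byte[]> digests = new ArrayList<>();  // hashes from the others

    static byte[] digest(byte[] row) {
        try {
            return MessageDigest.getInstance("MD5").digest(row);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stand-ins for preprocess(): record each replica response as it arrives.
    void onData(byte[] row) { data = row; }
    void onDigest(byte[] replicaDigest) { digests.add(replicaDigest); }

    // Returns the data if every digest matches it, otherwise signals a mismatch.
    byte[] resolve() {
        if (data == null) {
            throw new IllegalStateException("no data response received");
        }
        byte[] expected = digest(data);
        for (byte[] replicaDigest : digests) {
            if (!Arrays.equals(expected, replicaDigest)) {
                throw new DigestMismatchException("replica digests disagree with data");
            }
        }
        return data;
    }
}
```

A mismatch does not say which replica is stale, only that the replicas disagree, so the caller has to follow up with a full data read.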
Response Handlers / Callback
implements IAsyncCallback<T>
response(MessageIn<T> msg)
o.a.c.service.ReadCallback.get()
// Wait for blockFor responses & data
condition.await(timeout,
TimeUnit.MILLISECONDS)
// If the wait timed out
throw new ReadTimeoutException()
resolver.resolve()
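The blocking wait in `get()` can be modelled with a `CountDownLatch`. This is a sketch of the pattern only: the slide shows a condition object, the latch is a substitution made here, and the class below (with string responses and an unchecked `ReadTimeoutException`) is hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class ReadCallbackSketch {
    static class ReadTimeoutException extends RuntimeException {
        ReadTimeoutException(int blockFor, int received) {
            super("needed " + blockFor + " responses, received " + received);
        }
    }

    private final int blockFor;
    private final CountDownLatch remaining;
    private final List<String> responses = new CopyOnWriteArrayList<>();

    ReadCallbackSketch(int blockFor) {
        this.blockFor = blockFor;
        this.remaining = new CountDownLatch(blockFor);
    }

    // Called once per replica reply, typically from a messaging thread.
    void response(String message) {
        responses.add(message);
        remaining.countDown();
    }

    // Blocks the caller until blockFor replies arrive or the timeout expires.
    List<String> get(long timeoutMillis) {
        boolean complete;
        try {
            complete = remaining.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        if (!complete) {
            throw new ReadTimeoutException(blockFor, responses.size());
        }
        return responses;
    }
}
```

In the real class the collected responses are handed to the resolver at this point; the sketch just returns them.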
o.a.c.service.StorageProxy.fetchRows()
getLiveSortedEndpoints()
new RowDigestResolver()
new ReadCallback()
MessagingService.sendRR()
---------------------------------------
ReadCallback.get() // blocking
catch (DigestMismatchException ex)
catch (ReadTimeoutException ex)
Dynamo Layer
o.a.c.service
o.a.c.net
o.a.c.dht
o.a.c.gms
o.a.c.locator
o.a.c.stream
o.a.c.net.MessagingService.Verb <<enum>>
MUTATION
READ
REQUEST_RESPONSE
TREE_REQUEST
TREE_RESPONSE
(And more...)
o.a.c.net.MessagingService.verbHandlers
new EnumMap<Verb,
IVerbHandler>(Verb.class)
o.a.c.net.IVerbHandler<T>
doVerb(MessageIn<T> message,
String id);
o.a.c.net.MessagingService.verbStages
new EnumMap<MessagingService.Verb,
Stage>(MessagingService.Verb.class)
o.a.c.net.MessagingService.receive()
runnable = new MessageDeliveryTask(
message, id, timestamp);
StageManager.getStage(
message.getMessageType());
stage.execute(runnable);
o.a.c.net.MessageDeliveryTask.run()
// If droppable and older than rpc_timeout
MessagingService.incrementDroppedMessages(verb);
MessagingService.getVerbHandler(verb)
verbHandler.doVerb(message, id)
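The verb, handler and stage maps from the preceding slides fit together roughly as below. This is a simplified model rather than the real `MessagingService`: payloads are strings, stages are plain `Executor`s, and which verbs count as droppable is an assumption made for illustration.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;

class MessagingSketch {
    enum Verb { MUTATION, READ, TREE_REQUEST }

    interface VerbHandler {
        void doVerb(String payload, String id);
    }

    // Assumed for the sketch: only these verbs may be dropped when stale.
    static final Set<Verb> DROPPABLE = EnumSet.of(Verb.MUTATION, Verb.READ);

    private final Map<Verb, VerbHandler> verbHandlers = new EnumMap<>(Verb.class);
    private final Map<Verb, Executor> verbStages = new EnumMap<>(Verb.class);
    private final AtomicInteger dropped = new AtomicInteger();
    private final long timeoutMillis;

    MessagingSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    void register(Verb verb, Executor stage, VerbHandler handler) {
        verbStages.put(verb, stage);
        verbHandlers.put(verb, handler);
    }

    // Wraps the message in a delivery task and hands it to the verb's stage.
    void receive(Verb verb, String payload, String id, long createdAtMillis) {
        Runnable deliveryTask = () -> {
            long age = System.currentTimeMillis() - createdAtMillis;
            if (DROPPABLE.contains(verb) && age > timeoutMillis) {
                dropped.incrementAndGet(); // the sender has given up; skip the work
                return;
            }
            verbHandlers.get(verb).doVerb(payload, id);
        };
        verbStages.get(verb).execute(deliveryTask);
    }

    int droppedCount() {
        return dropped.get();
    }
}
```

The age check runs inside the task, when the stage gets around to it, not when the message arrives. A message that sat in a backed-up queue past the timeout is dropped rather than processed.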
Architecture
API Layer
Dynamo Layer
Database Layer
Database Layer
o.a.c.concurrent
o.a.c.db
o.a.c.cache
o.a.c.io
o.a.c.trace
o.a.c.concurrent.StageManager
stages = new EnumMap<Stage,
ThreadPoolExecutor>(Stage.class);
getStage(Stage stage)
o.a.c.concurrent.Stage
READ
MUTATION
GOSSIP
REQUEST_RESPONSE
ANTI_ENTROPY
(And more...)
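A stage is essentially a named thread pool, and the manager is a map from stage to pool. The sketch below shows that shape using only the JDK; the three stages are taken from the slide, but the pool sizes, the daemon threads and the `callOn` helper are assumptions added for the example.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class StageManagerSketch {
    enum Stage { READ, MUTATION, GOSSIP }

    private static final Map<Stage, ThreadPoolExecutor> stages = new EnumMap<>(Stage.class);

    static {
        // Pool sizes are made up for the sketch.
        stages.put(Stage.READ, newPool(4));
        stages.put(Stage.MUTATION, newPool(4));
        stages.put(Stage.GOSSIP, newPool(1));
    }

    private static ThreadPoolExecutor newPool(int threads) {
        return new ThreadPoolExecutor(
            threads, threads, 60, TimeUnit.SECONDS,
            new LinkedBlockingQueue<Runnable>(),
            runnable -> {
                Thread thread = new Thread(runnable);
                thread.setDaemon(true); // do not keep the JVM alive
                return thread;
            });
    }

    static ThreadPoolExecutor getStage(Stage stage) {
        return stages.get(stage);
    }

    // Convenience for the demo: run a task on a stage and wait for its result.
    static <T> T callOn(Stage stage, Callable<T> task) {
        try {
            return getStage(stage).submit(task).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("task failed on " + stage, e);
        }
    }
}
```

Giving each kind of work its own pool means a backlog of reads cannot starve gossip, and each queue can be observed on its own.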
Database Layer
o.a.c.concurrent
o.a.c.db
o.a.c.cache
o.a.c.io
o.a.c.trace
o.a.c.db.Table
// Keyspace
open(String table)
getColumnFamilyStore(String cfName)
getRow(QueryFilter filter)
apply(RowMutation mutation,
boolean writeCommitLog)
o.a.c.db.ColumnFamilyStore
// Column Family
getColumnFamily(QueryFilter filter)
getTopLevelColumns(...)
apply(DecoratedKey key,
ColumnFamily columnFamily,
SecondaryIndexManager.Updater
indexer)
o.a.c.db.IColumnContainer
addColumn(IColumn column)
remove(ByteBuffer columnName)
ColumnFamily
SuperColumn
o.a.c.db.ISortedColumns
addColumn(IColumn column,
Allocator allocator)
removeColumn(ByteBuffer name)
ArrayBackedSortedColumns
AtomicSortedColumns
TreeMapBackedSortedColumns
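A tree-map-backed column container can be sketched in a few lines. This is a simplified illustration, not the real `TreeMapBackedSortedColumns`: names and values are strings instead of `ByteBuffer`s, ordering is plain string order rather than a column comparator, there is no `Allocator`, and on equal timestamps the existing column is kept, which is an arbitrary choice made here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class SortedColumnsSketch {
    static final class Column {
        final String name;
        final String value;
        final long timestamp;

        Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // TreeMap keeps the columns ordered by name.
    private final TreeMap<String, Column> columns = new TreeMap<>();

    // Adds a column; if the name already exists, the higher timestamp wins.
    void addColumn(Column column) {
        columns.merge(column.name, column,
            (existing, incoming) -> incoming.timestamp > existing.timestamp ? incoming : existing);
    }

    void removeColumn(String name) {
        columns.remove(name);
    }

    Column getColumn(String name) {
        return columns.get(name);
    }

    // Column names in sorted order.
    List<String> names() {
        return new ArrayList<>(columns.keySet());
    }
}
```

Because reconciliation happens on insert, the order in which writes arrive does not matter; only their timestamps do.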
o.a.c.db.Memtable
put(DecoratedKey key,
ColumnFamily columnFamily,
SecondaryIndexManager.Updater
indexer)
flushAndSignal(CountDownLatch latch,
Future<ReplayPosition>
context)
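The put and flush behaviour of a memtable can be approximated with a concurrent sorted map. This sketch assumes a lot: rows are keyed by a plain string rather than a `DecoratedKey`, columns are string pairs without timestamps, there is no index updater or replay position, and "flushing" just returns the sorted rows instead of writing an SSTable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.CountDownLatch;

class MemtableSketch {
    // Rows sorted by key; each row holds its columns sorted by name.
    private final ConcurrentSkipListMap<String, TreeMap<String, String>> rows =
        new ConcurrentSkipListMap<>();

    // Merges the given columns into the row, creating the row if needed.
    void put(String key, Map<String, String> columns) {
        rows.compute(key, (k, existing) -> {
            TreeMap<String, String> merged =
                existing == null ? new TreeMap<>() : new TreeMap<>(existing);
            merged.putAll(columns);
            return merged;
        });
    }

    // Emits every row in key order, empties the memtable, then signals the latch.
    List<String> flushAndSignal(CountDownLatch latch) {
        List<String> flushed = new ArrayList<>();
        rows.forEach((key, columns) -> flushed.add(key + "=" + columns));
        rows.clear();
        latch.countDown();
        return flushed;
    }

    int rowCount() {
        return rows.size();
    }
}
```

Keeping rows sorted in memory is what lets a flush write them out sequentially in one pass.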
o.a.c.db.ReadCommand
getRow(Table table)
SliceByNamesReadCommand
SliceFromReadCommand
o.a.c.db.IDiskAtomFilter
getMemtableColumnIterator(...)
getSSTableColumnIterator(...)
IdentityQueryFilter
NamesQueryFilter
SliceQueryFilter
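The difference between the three filters is easiest to see against a single sorted row. The sketch below is a hypothetical model, not the `IDiskAtomFilter` API: a row is a `NavigableMap` of string columns, and each filter is a function that selects columns from it.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

class QueryFilterSketch {
    interface ColumnFilter {
        SortedMap<String, String> apply(NavigableMap<String, String> row);
    }

    // Names filter: fetch specific columns, skipping any that are absent.
    static ColumnFilter names(String... names) {
        return row -> {
            TreeMap<String, String> selected = new TreeMap<String, String>();
            for (String name : names) {
                if (row.containsKey(name)) selected.put(name, row.get(name));
            }
            return selected;
        };
    }

    // Slice filter: a contiguous range [start, finish], limited to count columns.
    static ColumnFilter slice(String start, String finish, int count) {
        return row -> {
            TreeMap<String, String> selected = new TreeMap<String, String>();
            for (Map.Entry<String, String> column : row.subMap(start, true, finish, true).entrySet()) {
                if (selected.size() >= count) break;
                selected.put(column.getKey(), column.getValue());
            }
            return selected;
        };
    }

    // Identity filter: every column in the row.
    static ColumnFilter identity() {
        return row -> new TreeMap<String, String>(row);
    }
}
```

A names filter costs one lookup per requested column, while a slice is a single seek followed by a sequential scan, which is why the two are implemented separately.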
Summary
CustomTThreadPoolServer Message.Dispatcher
CassandraServer QueryProcessor
ReadCommand
StorageProxy
IResponseResolver
IAsyncCallback
MessagingService
IVerbHandler
Table ColumnFamilyStore IDiskAtomFilter
API
Dynamo
Database
Thanks.