Cassandra
Where Did Cassandra Come From
• Cassandra originated at Facebook in 2007 to
  solve that company’s inbox search problem
  – large volumes of data
  – many random reads
  – many simultaneous random writes
• It was released as an open source project on Google Code
  in July 2008
• In March 2009 it moved to the Apache Incubator
• On February 17, 2010 it was voted into a top-level
  Apache project
Cassandra in 50 Words or Less
• Apache Cassandra is an
    –   open source
    –   distributed
    –   decentralized
    –   elastically scalable
    –   highly available
    –   fault-tolerant
    –   tuneably consistent
    –   column-oriented
  database that
    –   bases its distribution design on Amazon’s Dynamo
    –   bases its data model on Google’s Bigtable
•   Created at Facebook, it is now used at some of the most popular sites on the Web
Who Is Using Cassandra
• Twitter is using Cassandra for analytics.
• Mahalo uses it for its primary near-time data store.
• Facebook still uses it for inbox search, though they are using a
  proprietary fork.
• Digg uses it for its primary near-time data store.
• Rackspace uses it for its cloud service, monitoring, and logging.
• Reddit uses it as a persistent cache.
• Cloudkick uses it for monitoring statistics and analytics.
• Ooyala uses it to store and serve near real-time video analytics
  data.
• SimpleGeo uses it as the main data store for its real-time location
  infrastructure.
• Onespot uses it for a subset of its main data store.

Decentralized


• Decentralized
  – all nodes are the same; the failure of a node won’t disrupt service
• Master/slave
  – if the master node fails, the whole database is in jeopardy
Elastic Scalability
• Add another machine and Cassandra will find it
  and start sending it work
High Availability and Fault Tolerance
ACID
• Atomic
  – All or nothing
• Consistent

• Isolated
  – Two transactions that modify the same data do not see each other’s partial changes
• Durable

Brewer’s CAP Theorem
• You can strongly support only two of the three:
  – Consistency
     • All database clients will read the same value for the same query,
       even given concurrent updates
  – Availability
     • All database clients will always be able to read and write
       data
  – Partition Tolerance
     • The database can be split across multiple machines
     • It can continue functioning in the face of network segmentation
       breaks
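
Cassandra's tunable consistency lets a client choose, per request, how far to lean toward consistency or availability. A common rule of thumb, which is not stated on the slide, is that a read sees the latest write when the number of replicas read (R) plus the number written (W) exceeds the replication factor (N). The sketch below only does that arithmetic; the class and method names are illustrative.

```java
final class QuorumCheck {
    // A quorum is a strict majority of the replicas.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when every read set must overlap every write set (R + W > N).
    static boolean overlaps(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }
}
```

With a replication factor of 3, reading and writing at quorum (2 + 2 > 3) overlaps, while reading and writing a single replica (1 + 1) does not, which trades consistency for availability.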
CAP




transaction
Usage
•   connect localhost/9160;
•   show cluster name;
•   show keyspaces;
•   create keyspace XXXXX;
•   use XXXXX;
•   create column family YYYYY;
•   describe keyspace XXXXX;
•   set YYYYY['XiaoMing']['name'] = '小明';
•   get YYYYY['XiaoMing'];

• List
• Map
• MapList<row_id, Map>
• Column Family
• create column family User
  with key_validation_class=UTF8Type
Column family
Super column family

Clusters (Ring)
• If the first node goes down, a replica can
  respond to queries. The peer-to-peer protocol
  allows the data to replicate across nodes in a
  manner transparent to the user

• Replication factor
Keyspaces
• Analogous to a database in a relational system
• Don’t create too many keyspaces
Gossip protocols
• Intra-ring communication so that each node
  can have state information about other nodes
• Runs every second
• Gossip messages:
  – Send: GossipDigestSynMessage
  – Ack: GossipDigestAckMessage
  – Send (in reply to the ack): GossipDigestAck2Message
• Failure detection algorithm:
  – Phi Accrual Failure Detection
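
Phi Accrual Failure Detection produces a suspicion level that grows the longer a node stays silent, rather than a yes/no verdict. The sketch below is a simplified version, assuming heartbeat intervals follow an exponential distribution so that phi = elapsed / (mean interval × ln 10). The class name, window size, and method names are illustrative and are not Cassandra's own classes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class PhiAccrualSketch {
    private final Deque<Long> intervals = new ArrayDeque<>(); // recent heartbeat gaps (ms)
    private final int windowSize;
    private long lastHeartbeatMillis = -1;

    PhiAccrualSketch(int windowSize) {
        this.windowSize = windowSize;
    }

    // Record a heartbeat (for example, a gossip message) received at nowMillis.
    void heartbeat(long nowMillis) {
        if (lastHeartbeatMillis >= 0) {
            intervals.addLast(nowMillis - lastHeartbeatMillis);
            if (intervals.size() > windowSize) {
                intervals.removeFirst(); // keep only a sliding window of samples
            }
        }
        lastHeartbeatMillis = nowMillis;
    }

    // Suspicion level: -log10 of the probability that a heartbeat is merely late.
    double phi(long nowMillis) {
        if (intervals.isEmpty()) {
            return 0.0; // not enough history to judge
        }
        double mean = intervals.stream().mapToLong(Long::longValue).average().orElse(0.0);
        if (mean <= 0.0) {
            return 0.0;
        }
        double elapsed = nowMillis - lastHeartbeatMillis;
        return elapsed / (mean * Math.log(10.0));
    }
}
```

A caller would compare `phi(now)` with a threshold of its own choosing and treat the node as down once the value exceeds it; a higher threshold tolerates more jitter at the cost of slower detection.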
Anti-entropy
• Anti-entropy is the replica synchronization
  mechanism in Cassandra for ensuring that
  data on different nodes is updated to the
  newest version
• Uses Merkle trees to compare the data held by replicas
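
To illustrate why a Merkle tree helps here: two replicas can compare a single root hash instead of shipping every row. The sketch below is a minimal, hypothetical version that hashes rows given as strings and only reports whether two replicas agree; a real implementation would also walk down the tree to find which ranges differ. All names are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MerkleSketch {
    // Hash the concatenation of the given byte arrays with SHA-256.
    static byte[] sha256(byte[]... parts) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] part : parts) {
                md.update(part);
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is available on every JVM
        }
    }

    // Build the tree bottom-up and return the root hash.
    static byte[] root(List<String> rows) {
        if (rows.isEmpty()) {
            return sha256();
        }
        List<byte[]> level = new ArrayList<>();
        for (String row : rows) {
            level.add(sha256(row.getBytes(StandardCharsets.UTF_8)));
        }
        while (level.size() > 1) {
            List<byte[]> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                byte[] left = level.get(i);
                // An unpaired last node is combined with itself.
                byte[] right = (i + 1 < level.size()) ? level.get(i + 1) : left;
                next.add(sha256(left, right));
            }
            level = next;
        }
        return level.get(0);
    }

    // Replicas are considered in sync when their root hashes match.
    static boolean inSync(List<String> replicaA, List<String> replicaB) {
        return Arrays.equals(root(replicaA), root(replicaB));
    }
}
```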

Memtable, SSTable, and Commit Log
• Memtable
  – A written value goes to a memory-resident data structure
• SSTable
  – Includes Data, Index, and Filter files
  – Concept borrowed from Google’s Bigtable
  – When a memtable reaches a threshold, it is flushed to disk as an SSTable
• Commit log
  – Flush status bit: 0 / 1
     • 1: flush started
     • 0: flush succeeded
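
The interplay of the three structures can be sketched as a toy write path. This is a simplification under stated assumptions: the commit log is an in-memory list standing in for a file on disk, the flush threshold counts entries rather than bytes, and clearing the log after a flush stands in for the status bit. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

final class WritePathSketch {
    private final List<String> commitLog = new ArrayList<>();
    private NavigableMap<String, String> memtable = new TreeMap<>();
    private final List<NavigableMap<String, String>> sstables = new ArrayList<>();
    private final int flushThreshold;

    WritePathSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void write(String key, String value) {
        commitLog.add(key + "=" + value); // 1. append to the commit log for durability
        memtable.put(key, value);         // 2. apply to the in-memory, sorted memtable
        if (memtable.size() >= flushThreshold) {
            flush();                      // 3. threshold reached: flush to an SSTable
        }
    }

    private void flush() {
        // An SSTable is immutable once written.
        sstables.add(Collections.unmodifiableNavigableMap(memtable));
        memtable = new TreeMap<>();
        commitLog.clear(); // flushed data no longer needs replaying
    }

    // Check the memtable first, then SSTables from newest to oldest.
    String read(String key) {
        String value = memtable.get(key);
        if (value != null) {
            return value;
        }
        for (int i = sstables.size() - 1; i >= 0; i--) {
            value = sstables.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    int sstableCount() {
        return sstables.size();
    }

    int commitLogSize() {
        return commitLog.size();
    }
}
```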
Hinted handoff & Compaction
• Hinted handoff
  – Applies when the replica node that a write is destined for is not available
  – Cassandra writes a hint to a live node so the write can be replayed once the node is back

• Compaction:
  – Merges SSTables
  – Merged data is sorted
  – A new index is created over the sorted data
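
The merge step can be sketched as follows, assuming each SSTable is modelled as a map from key to a timestamped cell and that the newest timestamp wins when the same key appears in several tables. The sorted `TreeMap` stands in for the sorted output; building the on-disk index is not modelled. All names are illustrative.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CompactionSketch {
    // A value together with the timestamp of the write that produced it.
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Merge several SSTables into one sorted table, keeping the newest cell per key.
    static TreeMap<String, Cell> compact(List<? extends Map<String, Cell>> sstables) {
        TreeMap<String, Cell> merged = new TreeMap<>();
        for (Map<String, Cell> table : sstables) {
            for (Map.Entry<String, Cell> entry : table.entrySet()) {
                merged.merge(entry.getKey(), entry.getValue(),
                        (current, candidate) ->
                                candidate.timestamp > current.timestamp ? candidate : current);
            }
        }
        return merged;
    }
}
```

Because the result depends only on timestamps, the order in which the input tables are listed does not change the outcome.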
major compaction
• Bloom filters are stored in memory
• They are used to improve performance by reducing disk
  access on key lookups
Tombstones
• Known as a “soft delete”
• Data is not removed immediately when a delete
  operation executes; a tombstone marker is written instead
• Garbage Collection Grace Seconds:
  – GCGraceSeconds
     • Default: 10 days (864000 sec)
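
As a small worked example of the grace period: a tombstone becomes eligible for removal only after GCGraceSeconds have elapsed since the delete. The helper below is a hypothetical sketch of that check, with times expressed in seconds; it is not Cassandra's own code.

```java
final class TombstoneSketch {
    // 10 days * 24 hours * 60 minutes * 60 seconds = 864000 seconds.
    static final long DEFAULT_GC_GRACE_SECONDS = 10L * 24 * 60 * 60;

    // True once the tombstone is older than the grace period.
    static boolean purgeable(long deletedAtSeconds, long nowSeconds, long gcGraceSeconds) {
        return nowSeconds - deletedAtSeconds > gcGraceSeconds;
    }
}
```

The grace period gives replicas that were down at delete time a chance to learn about the delete before the marker disappears.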

Staged Event-Driven Architecture (SEDA)
• Originally proposed in a 2001 paper called “SEDA: An
  Architecture for Well-Conditioned, Scalable Internet
  Services”
• A stage consists of an incoming event queue, an event
  handler, and an associated thread pool
• Operations represented as stages in Cassandra:
   –   Read
   –   Mutation
   –   Gossip
   –   Response
   –   Anti-Entropy
   –   Load Balance
   –   Migration
   –   Streaming
   –   …
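
A single stage can be sketched with the standard library, assuming a fixed-size thread pool whose internal work queue plays the role of the incoming event queue. The class and method names are hypothetical, and this is an illustration of the pattern rather than Cassandra's stage implementation.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

final class StageSketch<E> {
    private final String name;
    private final ExecutorService pool; // owns the event queue and the worker threads
    private final Consumer<E> handler;  // the stage's event handler

    StageSketch(String name, int threads, Consumer<E> handler) {
        this.name = name;
        this.pool = Executors.newFixedThreadPool(threads);
        this.handler = handler;
    }

    String name() {
        return name;
    }

    // Queue an event; a pool thread picks it up and runs the handler.
    void enqueue(E event) {
        pool.execute(() -> handler.accept(event));
    }

    // Stop accepting events and wait for queued ones to finish.
    boolean drain(long timeoutMillis) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }
}
```

Giving each kind of operation its own stage means a slow stage fills its own queue without starving the threads of the others.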
Custom FactoryUtil
• Prevents version incompatibility
Configuring Cassandra
• system_add_keyspace
   – Creates a keyspace.
• system_rename_keyspace
   – Changes the name of a keyspace after taking a snapshot of it. Note that this
     method blocks until its work is done.
• system_drop_keyspace
   – Deletes an entire keyspace after taking a snapshot of it.
• system_add_column_family
   – Creates a column family.
• system_drop_column_family
   – Deletes a column family after taking a snapshot of it.
• system_rename_column_family
   – Changes the name of a column family after taking a snapshot of it. Note that
     this method blocks until its work is done.
Creating a Column Family
•   column_type
      – Either Super or Standard.
•   clock_type
      – The only valid value is Timestamp.
•   comparator
      – Valid options include AsciiType, BytesType, LexicalUUIDType, LongType, TimeUUIDType, and UTF8Type.
•   subcomparator
      – Name of comparator used for subcolumns when the column_type is Super. Valid options are the same as comparator.
•   reconciler
      – Name of the class that will reconcile conflicting column versions. The only valid value at this time is Timestamp.
•   comment
      – Any human-readable comment in the form of a string.
•   rows_cached
      – The number of rows to cache.
•   preload_row_cache
      – Set this to true to automatically load the row cache.
•   key_cache_size
      – The number of keys to pull into the cache.
•   read_repair_chance
      – Valid values are a number between 0.0 and 1.0.

Replicas
• Simple Strategy
  – RackUnawareStrategy
• Old Network Topology Strategy
  – RackAwareStrategy
• Network Topology Strategy
  – DataCenterShardStrategy
  – datacenter.properties
Replication Factor
• specifies how many copies of each piece of
  data will be stored and distributed throughout
  the Cassandra cluster
• Factor = 1 : your data will exist only in a single
  node in the cluster. Losing that node means
  that data becomes unavailable
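
How the replication factor turns into a set of replica nodes can be sketched for the simplest, rack-unaware case: find the first node at or after the key's token and keep walking clockwise around the ring. The sketch assumes one token per node and integer tokens; the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class RingSketch {
    // Token -> node name, kept sorted so the ring can be walked clockwise.
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addNode(int token, String node) {
        ring.put(token, node);
    }

    // Return the nodes that hold the copies of the data for keyToken.
    List<String> replicasFor(int keyToken, int replicationFactor) {
        List<String> replicas = new ArrayList<>();
        if (ring.isEmpty()) {
            return replicas;
        }
        // There cannot be more copies than nodes.
        int wanted = Math.min(replicationFactor, ring.size());
        Integer token = ring.ceilingKey(keyToken);
        if (token == null) {
            token = ring.firstKey(); // wrap around past the highest token
        }
        while (replicas.size() < wanted) {
            replicas.add(ring.get(token));
            token = ring.higherKey(token);
            if (token == null) {
                token = ring.firstKey();
            }
        }
        return replicas;
    }
}
```

With a replication factor of 1 the returned list has a single node, which is the case the slide warns about: losing that node makes the data unavailable.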
Increasing the Replication Factor
• As the number of nodes grows, you may want to increase the replication factor
• How to do it:
  – ensure that all the data is flushed to the SSTables
     • flush -h 192.168.1.1 -p 9160
  – stop that node
  – copy the data files from your keyspaces
  – place those data files on the new node
Replica Placement Strategies
• Simple Strategy
• Old Network Topology Strategy
• Network Topology Strategy

Adding Nodes to a Cluster
• If you want to add a new seed node, then you should
  autobootstrap it first, and then change it to a seed
  afterward

• Node1:
   – listen_address: 192.168.1.1
   – rpc_address: 0.0.0.0
• Node2:
   – auto_bootstrap: true
   – listen_address: 192.168.2.34
   – rpc_address: 0.0.0.0
Hector
• Cluster myCluster = HFactory.getOrCreateCluster("Test Cluster", "192.168.2.3:9160");
• ThriftCfDef columnFamilyDefinition = new ThriftCfDef("s3", "nb", ComparatorType.UTF8TYPE);
• columnFamilyDefinition.setReplicateOnWrite(true);
Hector
• ThriftCfDef columnFamilyDefinition = new ThriftCfDef("s3", "bb", ComparatorType.UTF8TYPE);
• columnFamilyDefinition.setKeyValidationClass("org.apache.cassandra.db.marshal.UTF8Type");
• columnFamilyDefinition.setDefaultValidationClass("org.apache.cassandra.db.marshal.UTF8Type");
• // myCluster.addColumnFamily(columnFamilyDefinition);
• columnFamilyDefinition.setId(1013);
• myCluster.updateColumnFamily(columnFamilyDefinition);
Hector
• Keyspace myKeyspace = HFactory.createKeyspace("s3", myCluster);
• Mutator<String> mutator = HFactory.createMutator(myKeyspace, StringSerializer.get());
• mutator.insert("b", "bb", HFactory.createStringColumn("column1", "你好在"));

Hector
• ColumnQuery<String, String, String> q = HFactory.createColumnQuery(myKeyspace,
  StringSerializer.get(), StringSerializer.get(), StringSerializer.get());
• // set key, name, cf and execute
• QueryResult<HColumn<String, String>> r = q
       .setColumnFamily("bb")
       .setKey("b")
       .setName("column1")
       .execute();
• // read value from the result
• HColumn<String, String> c = r.get();
• String value = c.getValue();
• System.out.println(value);

Instaclustr
 

Cassandra

  • 2. Where Did Cassandra Come From
      • Cassandra originated at Facebook in 2007 to solve that company’s inbox search problem:
          – large volumes of data
          – many random reads
          – many simultaneous random writes
      • It was released as an open source Google Code project in July 2008.
      • In March 2009 it was moved to an Apache Incubator project.
      • On February 17, 2010 it was voted into a top-level project.
  • 3. Cassandra in 50 Words or Less
      • Apache Cassandra is an
          – open source
          – distributed
          – decentralized
          – elastically scalable
          – highly available
          – fault-tolerant
          – tuneably consistent
          – column-oriented
      • database that
          – bases its distribution design on Amazon’s Dynamo
          – and its data model on Google’s Bigtable.
      • Created at Facebook, it is now used at some of the most popular sites on the Web.
  • 4. Who Is Using Cassandra
      • Twitter is using Cassandra for analytics.
      • Mahalo uses it for its primary near-time data store.
      • Facebook still uses it for inbox search, though they are using a proprietary fork.
      • Digg uses it for its primary near-time data store.
      • Rackspace uses it for its cloud service, monitoring, and logging.
      • Reddit uses it as a persistent cache.
      • Cloudkick uses it for monitoring statistics and analytics.
      • Ooyala uses it to store and serve near real-time video analytics data.
      • SimpleGeo uses it as the main data store for its real-time location infrastructure.
      • Onespot uses it for a subset of its main data store.
  • 5. Decentralized
      • Master/slave: if the master node fails, the whole database is in jeopardy.
      • Decentralized: all nodes are the same, so the failure of a node won’t disrupt service.
  • 6. Elastic Scalability
      • Add another machine, and Cassandra will find it and start sending it work.
  • 7. High Availability and Fault Tolerance
  • 8. ACID
      • Atomic – all or nothing
      • Consistent
      • Isolated – concurrent transactions that modify the same data do not interfere with one another
      • Durable
  • 9. Brewer’s CAP Theorem
      • You can strongly support only two of the three:
          – Consistency: all database clients will read the same value for the same query, even given concurrent updates.
          – Availability: all database clients will always be able to read and write data.
          – Partition tolerance: the database can be split across multiple machines, and it can continue functioning in the face of network segmentation breaks.
  • 11. Usage
      • Connect localhost/9160;
      • Show cluster name
      • Show keyspaces
      • Create keyspace XXXXX
      • Use XXXXX
      • Create column family YYYYY
      • Describe keyspace XXXXX
  • 12.
      • Set YYYYY['XiaoMing']['name'] = '小明'
      • Get YYYYY['XiaoMing']
  • 13. • List • Map • MapList<row_id, Map>
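The Set and Get commands and the row-to-map structure listed on these slides can be pictured as a nested map: a row key points to a map of column names and values. The following is only a minimal in-memory sketch of that reading; the class name `ColumnFamilySketch` and its methods are hypothetical and are not part of Cassandra or any client library.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the data model only; not Cassandra's storage code.
class ColumnFamilySketch {
    // row key -> (column name -> column value); TreeMap keeps both levels sorted
    private final Map<String, Map<String, String>> rows = new TreeMap<>();

    // Counterpart of: Set YYYYY[rowKey][column] = value
    void set(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Counterpart of: Get YYYYY[rowKey]; returns an empty map for an unknown row
    Map<String, String> get(String rowKey) {
        return rows.getOrDefault(rowKey, new TreeMap<>());
    }
}
```

Rows do not need to share the same set of columns in this model; each row simply carries its own map.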
  • 14.
      • Column Family
      • create column family User with key_validation_class=UTF8Type
  • 17. Clusters (Ring)
      • If the first node goes down, a replica can respond to queries. The peer-to-peer protocol allows the data to replicate across nodes in a manner transparent to the user.
      • Replication factor
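To make the ring and replication factor concrete, here is a small sketch of one simple placement rule: the node owning the first token at or after the key's token holds the data, and the next nodes clockwise hold the replicas. It assumes one token per node and integer tokens chosen by hand; `RingSketch` and its methods are hypothetical names, and real placement depends on the configured partitioner and replica placement strategy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch of replica placement on a token ring (one token per node).
class RingSketch {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addNode(int token, String node) {
        ring.put(token, node);
    }

    // Walk clockwise from the key's token and collect replicationFactor nodes.
    List<String> replicasFor(int keyToken, int replicationFactor) {
        List<String> replicas = new ArrayList<>();
        if (ring.isEmpty()) {
            return replicas;
        }
        int wanted = Math.min(replicationFactor, ring.size());
        Integer token = ring.ceilingKey(keyToken);
        if (token == null) {
            token = ring.firstKey(); // wrap around the ring
        }
        while (replicas.size() < wanted) {
            replicas.add(ring.get(token));
            token = ring.higherKey(token);
            if (token == null) {
                token = ring.firstKey(); // wrap around the ring
            }
        }
        return replicas;
    }
}
```

With nodes A, B, and C at tokens 0, 100, and 200, a key with token 150 and a replication factor of 2 is placed on C and then A, so either node can answer if the other is down.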
  • 18. Keyspaces
      • A keyspace is roughly the counterpart of a database.
      • Don’t create too many keyspaces.
  • 19. Gossip protocols
      • Used for intra-ring communication so that each node can have state information about other nodes
      • Runs every second
      • Gossip messages:
          – Send: GossipDigestSynMessage
          – Ack: GossipDigestAckMessage
          – Send: GossipDigestAck2Message
      • Failure detection algorithm:
          – Phi Accrual Failure Detection
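The idea behind accrual failure detection is to output a suspicion level (phi) that grows the longer a heartbeat is overdue, instead of a yes/no verdict. The sketch below assumes heartbeat intervals follow an exponential distribution, in which case phi reduces to the elapsed time divided by the mean interval times ln(10). The class `PhiAccrualSketch`, the window size, and any threshold applied to the result are assumptions for illustration, not Cassandra's actual detector.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a phi accrual failure detector for one peer.
class PhiAccrualSketch {
    private final Deque<Long> intervalsMillis = new ArrayDeque<>();
    private final int windowSize;
    private long lastHeartbeatMillis = -1;

    PhiAccrualSketch(int windowSize) {
        this.windowSize = windowSize;
    }

    // Record a heartbeat (for example, a gossip message) received at nowMillis.
    void heartbeat(long nowMillis) {
        if (lastHeartbeatMillis >= 0) {
            intervalsMillis.addLast(nowMillis - lastHeartbeatMillis);
            if (intervalsMillis.size() > windowSize) {
                intervalsMillis.removeFirst(); // keep a sliding window of intervals
            }
        }
        lastHeartbeatMillis = nowMillis;
    }

    // phi = -log10(P(heartbeat arrives later than now)), with P = exp(-elapsed / mean).
    double phi(long nowMillis) {
        if (intervalsMillis.isEmpty()) {
            return 0.0; // not enough history to suspect anything
        }
        double sum = 0;
        for (long interval : intervalsMillis) {
            sum += interval;
        }
        double mean = sum / intervalsMillis.size();
        double elapsed = nowMillis - lastHeartbeatMillis;
        return elapsed / (mean * Math.log(10.0));
    }
}
```

A caller would compare phi against a threshold of its choosing and treat the peer as down once the value exceeds it; a higher threshold tolerates longer gaps at the cost of slower detection.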
  • 20. Anti-entropy
      • Anti-entropy is the replica synchronization mechanism in Cassandra for ensuring that data on different nodes is updated to the newest version.
      • Replicas are compared using Merkle trees.
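A Merkle tree lets two replicas compare a whole range of data by exchanging a single root hash: if the roots match, the data matches; if not, the subtrees narrow down where they differ. The sketch below only builds the root and compares two replicas; the class `MerkleSketch`, the use of SHA-256, and the "key=value" row encoding are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: compare two replicas by the root hash of a Merkle tree.
class MerkleSketch {
    static byte[] sha256(byte[]... parts) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] part : parts) {
                md.update(part);
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Leaves are hashes of rows; each parent is the hash of its two children.
    static byte[] root(List<String> rows) {
        List<byte[]> level = new ArrayList<>();
        for (String row : rows) {
            level.add(sha256(row.getBytes(StandardCharsets.UTF_8)));
        }
        if (level.isEmpty()) {
            return sha256(new byte[0]);
        }
        while (level.size() > 1) {
            List<byte[]> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                byte[] left = level.get(i);
                // An odd node at the end of a level is paired with itself.
                byte[] right = (i + 1 < level.size()) ? level.get(i + 1) : left;
                next.add(sha256(left, right));
            }
            level = next;
        }
        return level.get(0);
    }

    static boolean inSync(List<String> replicaA, List<String> replicaB) {
        return Arrays.equals(root(replicaA), root(replicaB));
    }
}
```

When the roots differ, a fuller implementation would descend into the child hashes to find the differing ranges and stream only those rows between nodes.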
  • 21. Memtable, SSTable, and Commit Log
      • Memtable
          – The value is written to a memory-resident data structure.
      • SSTable
          – Includes Data, Index, and Filter
          – A concept borrowed from Google’s Bigtable
          – When a memtable reaches a threshold, it is flushed to disk as an SSTable.
      • Commit log
          – Flush status: 0 / 1
              • 1: flush started
              • 0: flush succeeded
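The interplay of the three structures can be sketched as a tiny write path: a write is appended to the commit log, applied to the memtable, and the memtable is flushed to an immutable sorted table once it reaches a threshold. Everything here is a simplification under stated assumptions: `WritePathSketch` is a hypothetical class, "disk" is just an in-memory list, and the commit log is simply cleared on flush rather than tracking a flush status bit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch of the write path: commit log -> memtable -> SSTable.
class WritePathSketch {
    final List<String> commitLog = new ArrayList<>();
    TreeMap<String, String> memtable = new TreeMap<>();
    final List<TreeMap<String, String>> sstables = new ArrayList<>();
    private final int flushThreshold;

    WritePathSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void write(String key, String value) {
        commitLog.add(key + "=" + value); // 1. record the write for durability
        memtable.put(key, value);         // 2. apply it to the in-memory table
        if (memtable.size() >= flushThreshold) {
            flush();                      // 3. threshold reached: flush to "disk"
        }
    }

    private void flush() {
        sstables.add(memtable);           // the flushed table is never modified again
        memtable = new TreeMap<>();
        commitLog.clear();                // flushed writes no longer need replaying
    }

    // Check the memtable first, then SSTables from newest to oldest.
    String read(String key) {
        if (memtable.containsKey(key)) {
            return memtable.get(key);
        }
        for (int i = sstables.size() - 1; i >= 0; i--) {
            String value = sstables.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```

Because flushed tables are never rewritten in place, an updated key can exist in several tables at once, which is the situation compaction later cleans up.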
  • 22. Hinted handoff & Compaction
      • Hinted handoff:
          – Applies when the node that a write is destined for is not available.
          – Cassandra creates a hint so that the write can be delivered to that node when it comes back.
      • Compaction:
          – SSTables are merged.
          – The merged data is sorted.
          – A new index is created over the sorted data.
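The merge step of compaction can be sketched as combining several sorted tables into one, keeping only the most recent version of each key. The class `CompactionSketch`, its `Cell` type, and the use of a plain timestamp to pick the winner are assumptions for illustration; building the new index is left out.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of compaction: merge SSTables, newest timestamp wins.
class CompactionSketch {
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Returns one table sorted by key that holds the latest cell for each key.
    static TreeMap<String, Cell> compact(List<Map<String, Cell>> sstables) {
        TreeMap<String, Cell> merged = new TreeMap<>();
        for (Map<String, Cell> table : sstables) {
            for (Map.Entry<String, Cell> entry : table.entrySet()) {
                merged.merge(entry.getKey(), entry.getValue(),
                        (current, candidate) ->
                                candidate.timestamp >= current.timestamp ? candidate : current);
            }
        }
        return merged;
    }
}
```

After the merge, the input tables can be discarded, which reclaims the space taken by overwritten values.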
  • 23. major compaction
      • Bloom filters:
          – stored in memory
          – used to improve performance by reducing disk access on key lookups
  • 24. Tombstones
      • Known as a “soft delete”
      • Data is not removed immediately when a delete operation is executed.
      • Garbage Collection Grace Seconds:
          – GCGraceSeconds
              • Default: 10 days (864000 sec)
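A soft delete can be sketched as writing a marker with the deletion time instead of erasing the data outright, and only discarding the marker once the grace period has passed. The class `TombstoneSketch` and its methods are hypothetical; only the 864000-second default comes from the slide.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of tombstones with a garbage-collection grace period.
class TombstoneSketch {
    static final long GC_GRACE_SECONDS = 864_000L; // 10 days, the default on the slide

    private final Map<String, String> live = new HashMap<>();
    private final Map<String, Long> tombstones = new HashMap<>(); // key -> deletion time (seconds)

    void put(String key, String value) {
        live.put(key, value);
        tombstones.remove(key);
    }

    // A delete hides the value and leaves a tombstone behind.
    void delete(String key, long nowSeconds) {
        live.remove(key);
        tombstones.put(key, nowSeconds);
    }

    String get(String key) {
        return live.get(key);
    }

    boolean hasTombstone(String key) {
        return tombstones.containsKey(key);
    }

    // Drop tombstones older than the grace period, as a compaction pass might.
    void purgeExpired(long nowSeconds) {
        tombstones.values().removeIf(deletedAt -> nowSeconds - deletedAt > GC_GRACE_SECONDS);
    }
}
```

Keeping the marker for a while gives replicas that missed the delete time to learn about it, so the deleted value is not brought back by a node that was down.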
  • 25. Staged Event-Driven Architecture (SEDA)
      • Originally proposed in a 2001 paper called “SEDA: An Architecture for Well-Conditioned, Scalable Internet Services”
      • A stage consists of an incoming event queue, an event handler, and an associated thread pool.
      • Operations handled as stages include:
          – Read
          – Mutation
          – Gossip
          – Response
          – Anti-Entropy
          – Load Balance
          – Migration
          – Streaming
          – …
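A single stage can be sketched as an incoming queue drained by its own thread pool, with a handler applied to each event. The sketch below builds this on the JDK's `ThreadPoolExecutor`; the class `StageSketch`, its constructor arguments, and the `drain` helper are hypothetical and only illustrate the shape of a stage, not Cassandra's stage classes.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch of one SEDA stage: event queue + thread pool + handler.
class StageSketch<E> {
    private final String name;
    private final ThreadPoolExecutor pool;
    private final Consumer<E> handler;

    StageSketch(String name, int threads, Consumer<E> handler) {
        this.name = name;
        this.handler = handler;
        // The LinkedBlockingQueue is the stage's incoming event queue.
        this.pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    String name() {
        return name;
    }

    // Callers hand the event over and return immediately.
    void enqueue(E event) {
        pool.execute(() -> handler.accept(event));
    }

    // Stop accepting events and wait for queued ones to finish.
    boolean drain(long timeoutMillis) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Giving each kind of operation its own queue and pool means a slow stage backs up only its own queue, and each pool can be sized separately.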
  • 26. Custom FactoryUtil
      • Prevents version incompatibility
  • 27. Configuring Cassandra
      • system_add_keyspace
          – Creates a keyspace.
      • system_rename_keyspace
          – Changes the name of a keyspace after taking a snapshot of it. Note that this method blocks until its work is done.
      • system_drop_keyspace
          – Deletes an entire keyspace after taking a snapshot of it.
      • system_add_column_family
          – Creates a column family.
      • system_drop_column_family
          – Deletes a column family after taking a snapshot of it.
      • system_rename_column_family
          – Changes the name of a column family after taking a snapshot of it. Note that this method blocks until its work is done.
  • 28. Creating a Column Family • column_type – Either Super or Standard. • clock_type – The only valid value is Timestamp. • comparator – Valid options include AsciiType, BytesType, LexicalUUIDType, LongType, TimeUUIDType, and UTF8Type. • subcomparator – Name of comparator used for subcolumns when the column_type is Super. Valid options are the same as comparator. • reconciler – Name of the class that will reconcile conflicting column versions. The only valid value at this time is Timestamp. • comment – Any human-readable comment in the form of a string. • rows_cached – The number of rows to cache. • preload_row_cache – Set this to true to automatically load the row cache. • key_cache_size – The number of keys to pull into the cache. • read_repair_chance – Valid values are a number between 0.0 and 1.0.
  • 29. Replicas • Simple Strategy – RackUnawareStrategy • Old Network Topology Strategy – RackAwareStrategy • Network Topology Strategy – DataCenterShardStrategy – datacenter.properties
  • 30. Replication Factor • Specifies how many copies of each piece of data will be stored and distributed throughout the Cassandra cluster • Replication factor = 1: your data exists on only a single node in the cluster, so losing that node makes the data unavailable
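What the replication factor means for placement can be sketched in Java. The sketch assumes the simplest placement rule: the first replica goes to the node that owns the key's token, and the remaining copies go to the next nodes clockwise around the ring, with no awareness of racks or data centers. The class `ReplicaPlacementSketch`, the token values, and the node names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy replica placement on a token ring (illustrative only). */
final class ReplicaPlacementSketch {

    /**
     * Returns the nodes that store a key with the given token.
     * The ring maps each node's token to the node's name.
     */
    static List<String> replicasFor(long keyToken, TreeMap<Long, String> ring, int replicationFactor) {
        List<String> replicas = new ArrayList<>();
        int copies = Math.min(replicationFactor, ring.size()); // cannot exceed the node count
        Long token = ring.ceilingKey(keyToken);                // first node at or after the key
        while (replicas.size() < copies) {
            if (token == null) {
                token = ring.firstKey();                       // wrap around the ring
            }
            replicas.add(ring.get(token));
            token = ring.higherKey(token);
        }
        return replicas;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> ring = new TreeMap<>();
        ring.put(0L, "node-A");
        ring.put(100L, "node-B");
        ring.put(200L, "node-C");
        System.out.println("RF=1: " + replicasFor(150L, ring, 1));
        System.out.println("RF=2: " + replicasFor(150L, ring, 2));
    }
}
```

With a factor of 1 the list contains a single node, which is the situation the slide warns about: if that node is lost, nothing else holds the data.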
  • 31. Increasing the Replication Factor • As the number of nodes grows, consider increasing the replication factor • How to do it: – Ensure that all the data is flushed to the SSTables • flush -h 192.168.1.1 -p 9160 – Stop that node – Copy the data files from your keyspaces – Paste those data files onto the new node
  • 32. Replica Placement Strategies • Simple Strategy • Old Network Topology Strategy • Network Topology Strategy
  • 33. Adding Nodes to a Cluster • If you want to add a new seed node, then you should autobootstrap it first, and then change it to a seed afterward • Node1: – listen_address: 192.168.1.1 – rpc_address: 0.0.0.0 • Node2: – auto_bootstrap: true – listen_address: 192.168.2.34 – rpc_address: 0.0.0.0
  • 34. Hector • Cluster myCluster = HFactory.getOrCreateCluster("Test Cluster", "192.168.2.3:9160"); • ThriftCfDef columnFamilyDefinition = new ThriftCfDef("s3","nb",ComparatorType.UTF8TYPE); • columnFamilyDefinition.setReplicateOnWrite(true);
  • 35. Hector • ThriftCfDef columnFamilyDefinition = new ThriftCfDef("s3","bb",ComparatorType.UTF8TYPE); • columnFamilyDefinition.setKeyValidationClass("org.apache.cassandra.db.marshal.UTF8Type"); • columnFamilyDefinition.setDefaultValidationClass("org.apache.cassandra.db.marshal.UTF8Type"); • //myCluster.addColumnFamily(columnFamilyDefinition); • columnFamilyDefinition.setId(1013); • myCluster.updateColumnFamily(columnFamilyDefinition);
  • 36. Hector • Keyspace myKeyspace = HFactory.createKeyspace("s3", myCluster); • Mutator<String> mutator = HFactory.createMutator(myKeyspace, StringSerializer.get()); • mutator.insert("b", "bb", HFactory.createStringColumn("column1", "你好 在"));
  • 37. Hector • ColumnQuery<String, String, String> q = HFactory.createColumnQuery(myKeyspace, StringSerializer.get(), StringSerializer.get(), StringSerializer.get()); • // set key, name, cf and execute • QueryResult<HColumn<String, String>> r = q • .setColumnFamily("bb") • .setKey("b") • .setName("column1") • .execute(); • // read value from the result • HColumn<String, String> c = r.get(); • String value = c.getValue(); • System.out.println(value);