Store and Process Big Data
with Hadoop and Cassandra
     Apache BarCamp
              By
      Deependra Ariyadewa
          WSO2, Inc.
Store Data with Cassandra

 ● Project site : http://cassandra.apache.org

 ● The latest release version is 1.0.7

 ● Cassandra is in use at Netflix, Twitter, Urban Airship, Constant
  Contact, Reddit, Cisco, OpenX, Digg, CloudKick and Ooyala

 ● Cassandra Users : http://www.datastax.com/cassandrausers

 ● The largest known Cassandra cluster has over 300 TB of data in over
   400 machines.

 ● Commercial support http://wiki.apache.org/cassandra/ThirdPartySupport
Cassandra Deployment architecture

 [Figure: Cassandra deployment architecture, showing hash(key1) and hash(key2)]

 key => {(k,v),(k,v),(k,v)}

 hash(key) => key order
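
The mapping "hash(key) => key order" can be sketched in plain Java. This is an illustrative model only, not Cassandra code: the class `TokenRing` and its methods are hypothetical names, the MD5-to-token step is simplified compared with what RandomPartitioner does internally, and replication is left out.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a token ring; not a Cassandra class.
class TokenRing {
    // token -> node name, kept sorted so the ring can be walked in token order
    private final TreeMap<BigInteger, String> ring = new TreeMap<>();

    void addNode(BigInteger token, String node) {
        ring.put(token, node);
    }

    // Hash a row key to a non-negative token using MD5 (simplified).
    static BigInteger tokenFor(String rowKey) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(rowKey.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest); // signum 1: read the bytes as unsigned
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // The owner is the first node whose token is >= the key's token,
    // wrapping around to the lowest token when no such node exists.
    String ownerOf(String rowKey) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("ring has no nodes");
        }
        Map.Entry<BigInteger, String> entry = ring.ceilingEntry(tokenFor(rowKey));
        return entry != null ? entry.getValue() : ring.firstEntry().getValue();
    }
}
```

Because the hash decides placement, rows end up ordered by hash(key) rather than by the key itself, which is why range scans over raw keys are not meaningful with a hash-based partitioner.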
How to Install Cassandra

 ● Download the artifact
   apache-cassandra-1.0.7-bin.tar.gz from   http://cassandra.apache.org/download/


 ● Extract
   tar -xzvf apache-cassandra-1.0.7-bin.tar.gz

 ● Set up folder paths

        mkdir -p /var/log/cassandra

        chown -R `whoami` /var/log/cassandra

        mkdir -p /var/lib/cassandra
        chown -R `whoami` /var/lib/cassandra
How to Configure Cassandra
Main configuration file:

  $CASSANDRA_HOME/conf/cassandra.yaml

    cluster_name: 'Test Cluster'

    seed_provider:
        - seeds: "192.168.0.121"

    storage_port: 7000

    listen_address: localhost

    rpc_address: localhost

    rpc_port: 9160
Cassandra Clustering

 initial_token:

 partitioner: org.apache.cassandra.dht.RandomPartitioner


 http://wiki.apache.org/cassandra/Operations
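
With RandomPartitioner, a common way to pick `initial_token` values is to space them evenly over the 0 to 2^127 token range, i.e. token(i) = i * 2^127 / N for N nodes. The sketch below only does that arithmetic; the class name `InitialTokens` is hypothetical, and the Operations page above remains the reference for balancing an actual cluster.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: computes evenly spaced tokens for RandomPartitioner.
class InitialTokens {
    // Size of the RandomPartitioner token range (2^127).
    static final BigInteger RING_SIZE = BigInteger.ONE.shiftLeft(127);

    static List<BigInteger> evenlySpaced(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        List<BigInteger> tokens = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            // token(i) = i * 2^127 / nodeCount
            tokens.add(RING_SIZE.multiply(BigInteger.valueOf(i))
                                .divide(BigInteger.valueOf(nodeCount)));
        }
        return tokens;
    }

    public static void main(String[] args) {
        int nodes = args.length > 0 ? Integer.parseInt(args[0]) : 4;
        List<BigInteger> tokens = evenlySpaced(nodes);
        for (int i = 0; i < tokens.size(); i++) {
            System.out.println("node " + i + " initial_token: " + tokens.get(i));
        }
    }
}
```

Each printed value would go into the `initial_token` setting of one node's cassandra.yaml.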
Cassandra DevOps

$CASSANDRA_HOME/bin$ ./cassandra-cli --host localhost

   [default@unknown] show keyspaces;
   Keyspace: system:
    Replication Strategy: org.apache.cassandra.locator.LocalStrategy
    Durable Writes: true
     Options: [replication_factor:1]
    Column Families:
     ColumnFamily: HintsColumnFamily (Super)
     "hinted handoff data"
       Key Validation Class: org.apache.cassandra.db.marshal.BytesType
       Default column value validator: org.apache.cassandra.db.marshal.BytesType
       Columns sorted by: org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
       Row cache size / save period in seconds / keys to save : 0.0/0/all
       Row Cache Provider: org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider
       Key cache size / save period in seconds: 0.01/0
       GC grace seconds: 0
       Compaction min/max thresholds: 4/32
       Read repair chance: 0.0
       Replicate on write: true
       Bloom Filter FP chance: default
       Built indexes: []
       Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
Cassandra CLI

[default@apache] create column family Location with comparator=UTF8Type and
default_validation_class=UTF8Type and key_validation_class=UTF8Type;
f04561a0-60ed-11e1-0000-242d50cf1fbf
Waiting for schema agreement...
... schemas agree across the cluster


[default@apache] set Location[00001][City]='Colombo';
Value inserted.
Elapsed time: 140 msec(s).


[default@apache] list Location;
Using default limit of 100
-------------------
RowKey: 00001
=> (column=City, value=Colombo, timestamp=1330311097464000)

1 Row Returned.
Elapsed time: 122 msec(s).
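
The `list Location` output above shows a row key holding named columns, each with a value and a timestamp. A rough in-memory model of that shape, assuming nothing about Cassandra's storage engine, could look like the sketch below. `ColumnFamilySketch` and its methods are hypothetical names; the only behaviour modelled is that columns are sorted by name and that a write with a newer timestamp replaces an older one.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of a column family; not a Cassandra or Hector class.
class ColumnFamilySketch {
    // A column value together with the timestamp of the write that produced it.
    static final class Column {
        final String value;
        final long timestamp;

        Column(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // row key -> (column name -> column); TreeMap keeps column names sorted.
    private final Map<String, TreeMap<String, Column>> rows = new TreeMap<>();

    void set(String rowKey, String columnName, String value, long timestamp) {
        TreeMap<String, Column> columns = rows.computeIfAbsent(rowKey, k -> new TreeMap<>());
        Column existing = columns.get(columnName);
        // Keep the write with the highest timestamp (tie handling is simplified here).
        if (existing == null || timestamp >= existing.timestamp) {
            columns.put(columnName, new Column(value, timestamp));
        }
    }

    // Returns the column value, or null when the row or column does not exist.
    String get(String rowKey, String columnName) {
        TreeMap<String, Column> columns = rows.get(rowKey);
        if (columns == null) {
            return null;
        }
        Column column = columns.get(columnName);
        return column == null ? null : column.value;
    }
}
```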
Store Data with Hector
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.factory.HFactory;

import java.util.HashMap;
import java.util.Map;

public class ExampleHelper {

    public static final String CLUSTER_NAME = "ClusterOne";
    public static final String USERNAME_KEY = "username";
    public static final String PASSWORD_KEY = "password";
    public static final String RPC_PORT = "9160";
    public static final String CSS_NODE0 = "localhost";
    public static final String CSS_NODE1 = "css1.stratoslive.wso2.com";
    public static final String CSS_NODE2 = "css2.stratoslive.wso2.com";

    public static Cluster createCluster(String username, String password) {
      Map<String, String> credentials =
            new HashMap<String, String>();
      credentials.put(USERNAME_KEY, username);
      credentials.put(PASSWORD_KEY, password);
      String hostList = CSS_NODE0 + ":" + RPC_PORT + "," + CSS_NODE1 + ":" + RPC_PORT + "," +
                                                                                  CSS_NODE2 + ":" + RPC_PORT;
      return HFactory.createCluster(CLUSTER_NAME,
                           new CassandraHostConfigurator(hostList), credentials);
    }

}
Store Data with Hector
Create Keyspace:

    KeyspaceDefinition definition = new ThriftKsDef(keyspaceName);
    cluster.addKeyspace(definition);

Add column family:
    ColumnFamilyDefinition familyDefinition = new ThriftCfDef(keyspaceName, columnFamily);
    cluster.addColumnFamily(familyDefinition);

Write Data:

    Mutator<String> mutator = HFactory.createMutator(keyspace, new StringSerializer());

    String columnValue = UUID.randomUUID().toString();
    mutator.insert(rowKey, columnFamily, HFactory.createStringColumn(columnName, columnValue));

Read Data:

    ColumnQuery<String, String, String> columnQuery = HFactory.createStringColumnQuery(keyspace);

    columnQuery.setColumnFamily(columnFamily).setKey(key).setName(columnName);
    QueryResult<HColumn<String, String>> result = columnQuery.execute();
    HColumn<String, String> hColumn = result.get();

    System.out.println("Column: " + hColumn.getName() + " Value : " + hColumn.getValue() + "\n");
Variable Consistency
    ● ANY: Wait until some node has accepted the write, even if only as a hint (writes only).

    ● ONE: Wait until one replica has responded.

    ● TWO: Wait until two replicas have responded.

    ● THREE: Wait until three replicas have responded.

    ● LOCAL_QUORUM: Wait for a quorum in the datacenter where the connection was
      established.

    ● EACH_QUORUM: Wait for quorum on each datacenter.

    ● QUORUM: Wait for a quorum of replicas (no matter which datacenter).

    ● ALL: Blocks for all the replicas before returning to the client.
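
The quorum levels above come down to simple arithmetic on the replication factor (RF): a quorum is floor(RF / 2) + 1 replicas, and a read is guaranteed to overlap the latest write when the number of replicas read plus the number written exceeds RF. The sketch below only illustrates that arithmetic; `QuorumMath` is a hypothetical name, not part of Cassandra or Hector.

```java
// Hypothetical helper illustrating quorum arithmetic.
class QuorumMath {
    // Number of replicas that must respond for a QUORUM operation.
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replicationFactor must be at least 1");
        }
        return replicationFactor / 2 + 1;
    }

    // True when every read set must share at least one replica with every write set.
    static boolean readsSeeLatestWrite(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println("RF=" + rf + " quorum=" + q);
        System.out.println("QUORUM read + QUORUM write overlap: " + readsSeeLatestWrite(q, q, rf));
        System.out.println("ONE read + ONE write overlap: " + readsSeeLatestWrite(1, 1, rf));
    }
}
```

With RF = 3, reading and writing at QUORUM touches two replicas each, so the sets always intersect; reading and writing at ONE does not give that guarantee.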
Variable Consistency

Create a customized Consistency Level:

ConfigurableConsistencyLevel configurableConsistencyLevel = new ConfigurableConsistencyLevel();
Map<String, HConsistencyLevel> clmap = new HashMap<String, HConsistencyLevel>();


clmap.put("MyColumnFamily", HConsistencyLevel.ONE);

configurableConsistencyLevel.setReadCfConsistencyLevels(clmap);
configurableConsistencyLevel.setWriteCfConsistencyLevels(clmap);


HFactory.createKeyspace("MyKeyspace", myCluster, configurableConsistencyLevel);
CQL

Insert data with CQL:

  cqlsh> INSERT INTO Location (KEY, City) VALUES ('00001', 'Colombo');


Retrieve data with CQL:

  cqlsh> select * from Location where KEY='00001';
Apache Hadoop


 ● Project Site: http://hadoop.apache.org

 ● Latest Version 1.0.1

 ● Hadoop is in use at Amazon, Yahoo, Adobe, eBay,
   Facebook

 ● Commercial support : http://hortonworks.com
                        http://www.cloudera.com
Hadoop deployment Architecture
How to install Hadoop

 ● Download the artifact from:

   http://hadoop.apache.org/common/releases.html

 ● Extract : tar -xzvf hadoop-1.0.1.tar.gz


 ● Copy the archive to each data node and extract it there.


       scp hadoop-1.0.1.tar.gz user@datanode01:/home/hadoop

 ● Start Hadoop : $HADOOP_HOME/bin/start-all.sh
Hadoop CLI - HDFS


Format Namenode :

  $HADOOP_HOME/bin/hadoop namenode -format

File operations on HDFS:

  $HADOOP_HOME/bin/hadoop dfs -lsr /

  $HADOOP_HOME/bin/hadoop dfs -mkdir /users/deep/wso2
Mapreduce

 [Figure: MapReduce diagram]

source: http://developer.yahoo.com/hadoop/tutorial/module4.html
Simple Mapreduce Job

Mapper

public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {

      private final static IntWritable one = new IntWritable(1);
      private Text word = new Text();

      public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter)
              throws IOException {

          String line = value.toString();
          StringTokenizer tokenizer = new StringTokenizer(line);

          while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            output.collect(word, one);
          }
      }
  }
Simple Mapreduce Job
Reducer:
public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {

   public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output,
                      Reporter reporter) throws IOException {
        int sum = 0;
        while (values.hasNext()) {
          sum += values.next().get();
        }
        output.collect(key, new IntWritable(sum));
    }
  }
Simple Mapreduce Job

Job Runner:

     JobConf conf = new JobConf(WordCount.class);
     conf.setJobName("wordcount");

     conf.setOutputKeyClass(Text.class);
     conf.setOutputValueClass(IntWritable.class);

     conf.setMapperClass(Map.class);
     conf.setCombinerClass(Reduce.class);
     conf.setReducerClass(Reduce.class);

     conf.setInputFormat(TextInputFormat.class);
     conf.setOutputFormat(TextOutputFormat.class);

     FileInputFormat.setInputPaths(conf, new Path(args[0]));
     FileOutputFormat.setOutputPath(conf, new Path(args[1]));

     JobClient.runJob(conf);
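
To see what the mapper, reducer and job runner above compute without a Hadoop cluster, the same word count can be imitated in a single JVM. This is a plain-Java sketch, not the Hadoop API: `LocalWordCount` and its methods are hypothetical names, and the "shuffle" step here is just an in-memory grouping by key.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical single-process imitation of the word count job; no Hadoop involved.
class LocalWordCount {
    // Map phase: emit a (word, 1) pair for every token in the line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            emitted.add(new AbstractMap.SimpleEntry<>(tokenizer.nextToken(), 1));
        }
        return emitted;
    }

    // Shuffle phase: group all emitted values by their key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs map, shuffle and reduce over the given lines and returns word -> count.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(emitted).entrySet()) {
            counts.put(group.getKey(), reduce(group.getValue()));
        }
        return counts;
    }
}
```

In the Hadoop job the framework performs the grouping between map and reduce across machines; since summing is associative, the same Reduce class can also serve as the combiner, as the job runner above does.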
High level Mapreduce Interfaces

● Hive

● Pig
Q&A

Mais conteúdo relacionado

Mais procurados

C*ollege Credit: Creating Your First App in Java with Cassandra
C*ollege Credit: Creating Your First App in Java with CassandraC*ollege Credit: Creating Your First App in Java with Cassandra
C*ollege Credit: Creating Your First App in Java with CassandraDataStax
 
Cassandra Summit 2015
Cassandra Summit 2015Cassandra Summit 2015
Cassandra Summit 2015jbellis
 
Cassandra Community Webinar | Become a Super Modeler
Cassandra Community Webinar | Become a Super ModelerCassandra Community Webinar | Become a Super Modeler
Cassandra Community Webinar | Become a Super ModelerDataStax
 
VPN Access Runbook
VPN Access RunbookVPN Access Runbook
VPN Access RunbookTaha Shakeel
 
Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3Markus Klems
 
Database administration commands
Database administration commands Database administration commands
Database administration commands Varsha Ajith
 
اسلاید اول جلسه چهارم کلاس پایتون برای هکرهای قانونی
اسلاید اول جلسه چهارم کلاس پایتون برای هکرهای قانونیاسلاید اول جلسه چهارم کلاس پایتون برای هکرهای قانونی
اسلاید اول جلسه چهارم کلاس پایتون برای هکرهای قانونیMohammad Reza Kamalifard
 
The Ring programming language version 1.8 book - Part 34 of 202
The Ring programming language version 1.8 book - Part 34 of 202The Ring programming language version 1.8 book - Part 34 of 202
The Ring programming language version 1.8 book - Part 34 of 202Mahmoud Samir Fayed
 
Python in the database
Python in the databasePython in the database
Python in the databasepybcn
 
Php 5.4: New Language Features You Will Find Useful
Php 5.4: New Language Features You Will Find UsefulPhp 5.4: New Language Features You Will Find Useful
Php 5.4: New Language Features You Will Find UsefulDavid Engel
 

Mais procurados (20)

C*ollege Credit: Creating Your First App in Java with Cassandra
C*ollege Credit: Creating Your First App in Java with CassandraC*ollege Credit: Creating Your First App in Java with Cassandra
C*ollege Credit: Creating Your First App in Java with Cassandra
 
Cassandra Summit 2015
Cassandra Summit 2015Cassandra Summit 2015
Cassandra Summit 2015
 
MongoDB-SESSION03
MongoDB-SESSION03MongoDB-SESSION03
MongoDB-SESSION03
 
Cassandra Community Webinar | Become a Super Modeler
Cassandra Community Webinar | Become a Super ModelerCassandra Community Webinar | Become a Super Modeler
Cassandra Community Webinar | Become a Super Modeler
 
Cassandra 2.2 & 3.0
Cassandra 2.2 & 3.0Cassandra 2.2 & 3.0
Cassandra 2.2 & 3.0
 
VPN Access Runbook
VPN Access RunbookVPN Access Runbook
VPN Access Runbook
 
Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3
 
Database administration commands
Database administration commands Database administration commands
Database administration commands
 
Oracle ORA Errors
Oracle ORA ErrorsOracle ORA Errors
Oracle ORA Errors
 
Config BuildConfig
Config BuildConfigConfig BuildConfig
Config BuildConfig
 
はじめてのGroovy
はじめてのGroovyはじめてのGroovy
はじめてのGroovy
 
اسلاید اول جلسه چهارم کلاس پایتون برای هکرهای قانونی
اسلاید اول جلسه چهارم کلاس پایتون برای هکرهای قانونیاسلاید اول جلسه چهارم کلاس پایتون برای هکرهای قانونی
اسلاید اول جلسه چهارم کلاس پایتون برای هکرهای قانونی
 
The Ring programming language version 1.8 book - Part 34 of 202
The Ring programming language version 1.8 book - Part 34 of 202The Ring programming language version 1.8 book - Part 34 of 202
The Ring programming language version 1.8 book - Part 34 of 202
 
Python database access
Python database accessPython database access
Python database access
 
Lodash js
Lodash jsLodash js
Lodash js
 
Python in the database
Python in the databasePython in the database
Python in the database
 
Mysql
MysqlMysql
Mysql
 
Bootstrap
BootstrapBootstrap
Bootstrap
 
Dmxedit
DmxeditDmxedit
Dmxedit
 
Php 5.4: New Language Features You Will Find Useful
Php 5.4: New Language Features You Will Find UsefulPhp 5.4: New Language Features You Will Find Useful
Php 5.4: New Language Features You Will Find Useful
 

Destaque

Apache hadoop: POSH Meetup Palo Alto, CA April 2014
Apache hadoop: POSH Meetup Palo Alto, CA April 2014Apache hadoop: POSH Meetup Palo Alto, CA April 2014
Apache hadoop: POSH Meetup Palo Alto, CA April 2014Kevin Crocker
 
Hadoop Cluster Configuration and Data Loading - Module 2
Hadoop Cluster Configuration and Data Loading - Module 2Hadoop Cluster Configuration and Data Loading - Module 2
Hadoop Cluster Configuration and Data Loading - Module 2Rohit Agrawal
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component rebeccatho
 
Introduction to Apache Hadoop Ecosystem
Introduction to Apache Hadoop EcosystemIntroduction to Apache Hadoop Ecosystem
Introduction to Apache Hadoop EcosystemMahabubur Rahaman
 
Hadoop as data refinery
Hadoop as data refineryHadoop as data refinery
Hadoop as data refinerySteve Loughran
 
Hadoop, HDFS and MapReduce
Hadoop, HDFS and MapReduceHadoop, HDFS and MapReduce
Hadoop, HDFS and MapReducefvanvollenhoven
 
Integrate Hue with your Hadoop cluster - Yahoo! Hadoop Meetup
Integrate Hue with your Hadoop cluster - Yahoo! Hadoop MeetupIntegrate Hue with your Hadoop cluster - Yahoo! Hadoop Meetup
Integrate Hue with your Hadoop cluster - Yahoo! Hadoop Meetupgethue
 
Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...
Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...
Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...Cloudera, Inc.
 
Distributed Data Analysis with Hadoop and R - Strangeloop 2011
Distributed Data Analysis with Hadoop and R - Strangeloop 2011Distributed Data Analysis with Hadoop and R - Strangeloop 2011
Distributed Data Analysis with Hadoop and R - Strangeloop 2011Jonathan Seidman
 
Simplified Data Management And Process Scheduling in Hadoop
Simplified Data Management And Process Scheduling in HadoopSimplified Data Management And Process Scheduling in Hadoop
Simplified Data Management And Process Scheduling in HadoopGetInData
 
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, GuindyScaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, GuindyRohit Kulkarni
 
Last Special Programming Task on December Presentation
Last Special Programming Task on December PresentationLast Special Programming Task on December Presentation
Last Special Programming Task on December PresentationDion Webiaswara
 

Destaque (20)

Apache hadoop: POSH Meetup Palo Alto, CA April 2014
Apache hadoop: POSH Meetup Palo Alto, CA April 2014Apache hadoop: POSH Meetup Palo Alto, CA April 2014
Apache hadoop: POSH Meetup Palo Alto, CA April 2014
 
Amazon Elastic Computing 2
Amazon Elastic Computing 2Amazon Elastic Computing 2
Amazon Elastic Computing 2
 
Hadoop Cluster Configuration and Data Loading - Module 2
Hadoop Cluster Configuration and Data Loading - Module 2Hadoop Cluster Configuration and Data Loading - Module 2
Hadoop Cluster Configuration and Data Loading - Module 2
 
Big Data and Hadoop - An Introduction
Big Data and Hadoop - An IntroductionBig Data and Hadoop - An Introduction
Big Data and Hadoop - An Introduction
 
Taller hadoop
Taller hadoopTaller hadoop
Taller hadoop
 
Hadoop administration
Hadoop administrationHadoop administration
Hadoop administration
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
 
Hadoop Trends
Hadoop TrendsHadoop Trends
Hadoop Trends
 
Hadoop fault-tolerance
Hadoop fault-toleranceHadoop fault-tolerance
Hadoop fault-tolerance
 
Introduction to Apache Hadoop Ecosystem
Introduction to Apache Hadoop EcosystemIntroduction to Apache Hadoop Ecosystem
Introduction to Apache Hadoop Ecosystem
 
Hadoop as data refinery
Hadoop as data refineryHadoop as data refinery
Hadoop as data refinery
 
Hadoop, HDFS and MapReduce
Hadoop, HDFS and MapReduceHadoop, HDFS and MapReduce
Hadoop, HDFS and MapReduce
 
Integrate Hue with your Hadoop cluster - Yahoo! Hadoop Meetup
Integrate Hue with your Hadoop cluster - Yahoo! Hadoop MeetupIntegrate Hue with your Hadoop cluster - Yahoo! Hadoop Meetup
Integrate Hue with your Hadoop cluster - Yahoo! Hadoop Meetup
 
Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...
Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...
Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...
 
Distributed Data Analysis with Hadoop and R - Strangeloop 2011
Distributed Data Analysis with Hadoop and R - Strangeloop 2011Distributed Data Analysis with Hadoop and R - Strangeloop 2011
Distributed Data Analysis with Hadoop and R - Strangeloop 2011
 
Hadoop admin
Hadoop adminHadoop admin
Hadoop admin
 
Simplified Data Management And Process Scheduling in Hadoop
Simplified Data Management And Process Scheduling in HadoopSimplified Data Management And Process Scheduling in Hadoop
Simplified Data Management And Process Scheduling in Hadoop
 
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, GuindyScaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
 
Last Special Programming Task on December Presentation
Last Special Programming Task on December PresentationLast Special Programming Task on December Presentation
Last Special Programming Task on December Presentation
 
E-learning and agriMoodle, OER Growers Autumn 2012
E-learning and agriMoodle, OER Growers Autumn 2012E-learning and agriMoodle, OER Growers Autumn 2012
E-learning and agriMoodle, OER Growers Autumn 2012
 

Semelhante a Store and Process Big Data with Hadoop and Cassandra

Hadoop Integration in Cassandra
Hadoop Integration in CassandraHadoop Integration in Cassandra
Hadoop Integration in CassandraJairam Chandar
 
Cascading Through Hadoop for the Boulder JUG
Cascading Through Hadoop for the Boulder JUGCascading Through Hadoop for the Boulder JUG
Cascading Through Hadoop for the Boulder JUGMatthew McCullough
 
Codepot - Pig i Hive: szybkie wprowadzenie / Pig and Hive crash course
Codepot - Pig i Hive: szybkie wprowadzenie / Pig and Hive crash courseCodepot - Pig i Hive: szybkie wprowadzenie / Pig and Hive crash course
Codepot - Pig i Hive: szybkie wprowadzenie / Pig and Hive crash courseSages
 
ETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk LoadingETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk Loadingalex_araujo
 
RestMQ - HTTP/Redis based Message Queue
RestMQ - HTTP/Redis based Message QueueRestMQ - HTTP/Redis based Message Queue
RestMQ - HTTP/Redis based Message QueueGleicon Moraes
 
Introduction to Scalding and Monoids
Introduction to Scalding and MonoidsIntroduction to Scalding and Monoids
Introduction to Scalding and MonoidsHugo Gävert
 
NoSQL and JavaScript: a Love Story
NoSQL and JavaScript: a Love StoryNoSQL and JavaScript: a Love Story
NoSQL and JavaScript: a Love StoryAlexandre Morgaut
 
Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19confluent
 
Fun Teaching MongoDB New Tricks
Fun Teaching MongoDB New TricksFun Teaching MongoDB New Tricks
Fun Teaching MongoDB New TricksMongoDB
 
Productionalizing spark streaming applications
Productionalizing spark streaming applicationsProductionalizing spark streaming applications
Productionalizing spark streaming applicationsRobert Sanders
 
Hazelcast and MongoDB at Cloud CMS
Hazelcast and MongoDB at Cloud CMSHazelcast and MongoDB at Cloud CMS
Hazelcast and MongoDB at Cloud CMSuzquiano
 
Artimon - Apache Flume (incubating) NYC Meetup 20111108
Artimon - Apache Flume (incubating) NYC Meetup 20111108Artimon - Apache Flume (incubating) NYC Meetup 20111108
Artimon - Apache Flume (incubating) NYC Meetup 20111108Mathias Herberts
 
Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...
Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...
Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...InfluxData
 
Cs267 hadoop programming
Cs267 hadoop programmingCs267 hadoop programming
Cs267 hadoop programmingKuldeep Dhole
 
Kerberizing spark. Spark Summit east
Kerberizing spark. Spark Summit eastKerberizing spark. Spark Summit east
Kerberizing spark. Spark Summit eastJorge Lopez-Malla
 
Spring data iii
Spring data iiiSpring data iii
Spring data iii명철 강
 
Hazelcast
HazelcastHazelcast
Hazelcastoztalip
 
What is row level isolation on cassandra
What is row level isolation on cassandraWhat is row level isolation on cassandra
What is row level isolation on cassandraKazutaka Tomita
 

Semelhante a Store and Process Big Data with Hadoop and Cassandra (20)

Hadoop Integration in Cassandra
Hadoop Integration in CassandraHadoop Integration in Cassandra
Hadoop Integration in Cassandra
 
Cascading Through Hadoop for the Boulder JUG
Cascading Through Hadoop for the Boulder JUGCascading Through Hadoop for the Boulder JUG
Cascading Through Hadoop for the Boulder JUG
 
Codepot - Pig i Hive: szybkie wprowadzenie / Pig and Hive crash course
Codepot - Pig i Hive: szybkie wprowadzenie / Pig and Hive crash courseCodepot - Pig i Hive: szybkie wprowadzenie / Pig and Hive crash course
Codepot - Pig i Hive: szybkie wprowadzenie / Pig and Hive crash course
 
ETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk LoadingETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk Loading
 
Solr @ Etsy - Apache Lucene Eurocon
Solr @ Etsy - Apache Lucene EuroconSolr @ Etsy - Apache Lucene Eurocon
Solr @ Etsy - Apache Lucene Eurocon
 
RestMQ - HTTP/Redis based Message Queue
RestMQ - HTTP/Redis based Message QueueRestMQ - HTTP/Redis based Message Queue
RestMQ - HTTP/Redis based Message Queue
 
Introduction to Scalding and Monoids
Introduction to Scalding and MonoidsIntroduction to Scalding and Monoids
Introduction to Scalding and Monoids
 
NoSQL and JavaScript: a Love Story
NoSQL and JavaScript: a Love StoryNoSQL and JavaScript: a Love Story
NoSQL and JavaScript: a Love Story
 
Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19
 
Fun Teaching MongoDB New Tricks
Fun Teaching MongoDB New TricksFun Teaching MongoDB New Tricks
Fun Teaching MongoDB New Tricks
 
Productionalizing spark streaming applications
Productionalizing spark streaming applicationsProductionalizing spark streaming applications
Productionalizing spark streaming applications
 
Hazelcast and MongoDB at Cloud CMS
Hazelcast and MongoDB at Cloud CMSHazelcast and MongoDB at Cloud CMS
Hazelcast and MongoDB at Cloud CMS
 
Artimon - Apache Flume (incubating) NYC Meetup 20111108
Artimon - Apache Flume (incubating) NYC Meetup 20111108Artimon - Apache Flume (incubating) NYC Meetup 20111108
Artimon - Apache Flume (incubating) NYC Meetup 20111108
 
Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...
Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...
Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...
 
Cs267 hadoop programming
Cs267 hadoop programmingCs267 hadoop programming
Cs267 hadoop programming
 
Kerberizing spark. Spark Summit east
Kerberizing spark. Spark Summit eastKerberizing spark. Spark Summit east
Kerberizing spark. Spark Summit east
 
Spring data iii
Spring data iiiSpring data iii
Spring data iii
 
Hazelcast
HazelcastHazelcast
Hazelcast
 
V8
V8V8
V8
 
What is row level isolation on cassandra
What is row level isolation on cassandraWhat is row level isolation on cassandra
What is row level isolation on cassandra
 

Último

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 


Store and Process Big Data with Hadoop and Cassandra

  • 1. Store and Process Big Data with Hadoop and Cassandra Apache BarCamp By Deependra Ariyadewa WSO2, Inc.
  • 2. Store Data with Cassandra ● Project site : http://cassandra.apache.org ● The latest release version is 1.0.7 ● Cassandra is in use at Netflix, Twitter, Urban Airship, Constant Contact, Reddit, Cisco, OpenX, Digg, CloudKick and Ooyala ● Cassandra Users : http://www.datastax.com/cassandrausers ● The largest known Cassandra cluster has over 300 TB of data on over 400 machines. ● Commercial support : http://wiki.apache.org/cassandra/ThirdPartySupport
  • 3. Cassandra Deployment architecture hash(key1) hash(key2) key => {(k,v),(k,v),(k,v)} hash(key) => key order
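The ring on slide 3 can be sketched in plain Java. This is a simplified illustration, not Cassandra's implementation: the class name TokenRing, the node names, and the tokens are hypothetical, and the hash is a plain MD5 digest read as a non-negative integer, which differs in detail from what RandomPartitioner does internally.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a token ring: each node owns the keys whose hash
// falls at or below its token, walking the ring in token order.
class TokenRing {
    // token -> node name, kept sorted so the ring can be searched by token
    private final TreeMap<BigInteger, String> ring = new TreeMap<>();

    void addNode(BigInteger token, String node) {
        ring.put(token, node);
    }

    // MD5 digest of the key, interpreted as a non-negative integer (assumption)
    static BigInteger hash(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(1, md5.digest(key.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // First node whose token is >= the hash; wraps around to the lowest token.
    // Assumes at least one node has been added.
    String ownerOf(BigInteger keyHash) {
        Map.Entry<BigInteger, String> entry = ring.ceilingEntry(keyHash);
        if (entry == null) {
            entry = ring.firstEntry();
        }
        return entry.getValue();
    }

    String ownerOf(String key) {
        return ownerOf(hash(key));
    }
}
```

Because placement depends only on hash(key), rows are spread by hash order rather than by the natural order of the keys.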
  • 4. How to Install Cassandra ● Download the artifact apache-cassandra-1.0.7-bin.tar.gz from http://cassandra.apache.org/download/ ● Extract tar -xzvf apache-cassandra-1.0.7-bin.tar.gz ● Set up folder paths mkdir -p /var/log/cassandra chown -R `whoami` /var/log/cassandra mkdir -p /var/lib/cassandra chown -R `whoami` /var/lib/cassandra
  • 5. How to Configure Cassandra Main Configuration file : $CASSANDRA_HOME/conf/cassandra.yaml cluster_name: 'Test Cluster' seed_provider: - seeds: "192.168.0.121" storage_port: 7000 listen_address: localhost rpc_address: localhost rpc_port: 9160
  • 6. Cassandra Clustering initial_token: partitioner: org.apache.cassandra.dht.RandomPartitioner http://wiki.apache.org/cassandra/Operations
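With RandomPartitioner, a common way to pick initial_token values is to space them evenly over the token range 0 to 2^127, so that node i of N gets i * 2^127 / N. The sketch below computes such tokens; the class and method names are hypothetical, and the values would still have to be copied by hand into each node's cassandra.yaml.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: evenly spaced initial_token values for RandomPartitioner.
class InitialTokens {
    // Upper end of the RandomPartitioner token range (2^127)
    static final BigInteger RANGE = BigInteger.valueOf(2).pow(127);

    // Token for node i of nodeCount is i * 2^127 / nodeCount
    static List<BigInteger> evenlySpaced(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        List<BigInteger> tokens = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            tokens.add(RANGE.multiply(BigInteger.valueOf(i))
                            .divide(BigInteger.valueOf(nodeCount)));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Print one initial_token line per node of a four-node cluster
        for (BigInteger token : evenlySpaced(4)) {
            System.out.println("initial_token: " + token);
        }
    }
}
```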
  • 7. Cassandra DevOps
      $CASSANDRA_HOME/bin$ ./cassandra-cli --host localhost
      [default@unknown] show keyspaces;
      Keyspace: system:
        Replication Strategy: org.apache.cassandra.locator.LocalStrategy
        Durable Writes: true
          Options: [replication_factor:1]
        Column Families:
          ColumnFamily: HintsColumnFamily (Super)
          "hinted handoff data"
            Key Validation Class: org.apache.cassandra.db.marshal.BytesType
            Default column value validator: org.apache.cassandra.db.marshal.BytesType
            Columns sorted by: org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
            Row cache size / save period in seconds / keys to save : 0.0/0/all
            Row Cache Provider: org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider
            Key cache size / save period in seconds: 0.01/0
            GC grace seconds: 0
            Compaction min/max thresholds: 4/32
            Read repair chance: 0.0
            Replicate on write: true
            Bloom Filter FP chance: default
            Built indexes: []
            Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
  • 8. Cassandra CLI
      [default@apache] create column family Location with comparator=UTF8Type and default_validation_class=UTF8Type and key_validation_class=UTF8Type;
      f04561a0-60ed-11e1-0000-242d50cf1fbf
      Waiting for schema agreement...
      ... schemas agree across the cluster
      [default@apache] set Location[00001][City]='Colombo';
      Value inserted.
      Elapsed time: 140 msec(s).
      [default@apache] list Location;
      Using default limit of 100
      -------------------
      RowKey: 00001
      => (column=City, value=Colombo, timestamp=1330311097464000)
      1 Row Returned.
      Elapsed time: 122 msec(s).
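The set and list commands above treat a column family as a map from row key to a sorted set of named columns. A rough in-memory analogy in plain Java, assuming nothing about Cassandra's storage engine (the class ColumnFamilySketch and its methods are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory analogy of a column family with a UTF8Type comparator:
// row key -> columns sorted by name.
class ColumnFamilySketch {
    private final Map<String, SortedMap<String, String>> rows = new HashMap<>();

    // Analogous to: set Location[rowKey][column]='value';
    void set(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<String, String>()).put(column, value);
    }

    // Returns the column value, or null if the row or column does not exist
    String get(String rowKey, String column) {
        SortedMap<String, String> columns = rows.get(rowKey);
        return columns == null ? null : columns.get(column);
    }

    // Returns all columns of a row in sorted order (empty if the row is absent)
    SortedMap<String, String> row(String rowKey) {
        SortedMap<String, String> columns = rows.get(rowKey);
        return columns == null ? new TreeMap<String, String>() : columns;
    }
}
```

Calling set("00001", "City", "Colombo") and then get("00001", "City") mirrors the CLI session above; timestamps, replication, and persistence are deliberately left out.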
  • 9. Store Data with Hector
      import me.prettyprint.cassandra.service.CassandraHostConfigurator;
      import me.prettyprint.hector.api.Cluster;
      import me.prettyprint.hector.api.factory.HFactory;
      import java.util.HashMap;
      import java.util.Map;
      public class ExampleHelper {
        public static final String CLUSTER_NAME = "ClusterOne";
        public static final String USERNAME_KEY = "username";
        public static final String PASSWORD_KEY = "password";
        public static final String RPC_PORT = "9160";
        public static final String CSS_NODE0 = "localhost";
        public static final String CSS_NODE1 = "css1.stratoslive.wso2.com";
        public static final String CSS_NODE2 = "css2.stratoslive.wso2.com";
        // Builds a Hector Cluster handle for the three nodes, using the given credentials.
        public static Cluster createCluster(String username, String password) {
          Map<String, String> credentials = new HashMap<String, String>();
          credentials.put(USERNAME_KEY, username);
          credentials.put(PASSWORD_KEY, password);
          // Comma-separated host:port list
          String hostList = CSS_NODE0 + ":" + RPC_PORT + "," + CSS_NODE1 + ":" + RPC_PORT + "," + CSS_NODE2 + ":" + RPC_PORT;
          return HFactory.createCluster(CLUSTER_NAME, new CassandraHostConfigurator(hostList), credentials);
        }
      }
  • 10. Store Data with Hector
      Create Keyspace:
        KeyspaceDefinition definition = new ThriftKsDef(keyspaceName);
        cluster.addKeyspace(definition);
      Add column family:
        ColumnFamilyDefinition familyDefinition = new ThriftCfDef(keyspaceName, columnFamily);
        cluster.addColumnFamily(familyDefinition);
      Write Data:
        Mutator<String> mutator = HFactory.createMutator(keyspace, new StringSerializer());
        String columnValue = UUID.randomUUID().toString();
        mutator.insert(rowKey, columnFamily, HFactory.createStringColumn(columnName, columnValue));
      Read Data:
        ColumnQuery<String, String, String> columnQuery = HFactory.createStringColumnQuery(keyspace);
        columnQuery.setColumnFamily(columnFamily).setKey(key).setName(columnName);
        QueryResult<HColumn<String, String>> result = columnQuery.execute();
        HColumn<String, String> hColumn = result.get();
        System.out.println("Column: " + hColumn.getName() + " Value : " + hColumn.getValue() + "\n");
  • 11. Variable Consistency ● ANY: Wait until some replica has responded. ● ONE: Wait until one replica has responded. ● TWO: Wait until two replicas have responded. ● THREE: Wait until three replicas have responded. ● LOCAL_QUORUM: Wait for a quorum in the datacenter where the connection was established. ● EACH_QUORUM: Wait for a quorum in each datacenter. ● QUORUM: Wait for a quorum of replicas (no matter which datacenter). ● ALL: Block until all replicas have responded before returning to the client.
  • 12. Variable Consistency
      Create a customized Consistency Level:
        ConfigurableConsistencyLevel configurableConsistencyLevel = new ConfigurableConsistencyLevel();
        Map<String, HConsistencyLevel> clmap = new HashMap<String, HConsistencyLevel>();
        // Use consistency level ONE for both reads and writes on this column family
        clmap.put("MyColumnFamily", HConsistencyLevel.ONE);
        configurableConsistencyLevel.setReadCfConsistencyLevels(clmap);
        configurableConsistencyLevel.setWriteCfConsistencyLevels(clmap);
        HFactory.createKeyspace("MyKeyspace", myCluster, configurableConsistencyLevel);
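The arithmetic behind the QUORUM levels can be sketched as follows. A quorum is floor(RF / 2) + 1 replicas, where RF is the replication factor, and a read is guaranteed to overlap the latest write when the number of replicas read plus the number written exceeds RF. The class QuorumMath is a hypothetical helper, not part of Hector or Cassandra.

```java
// Hypothetical helper illustrating quorum arithmetic for a given replication factor.
class QuorumMath {
    // Number of replicas that form a quorum: floor(RF / 2) + 1
    static int quorum(int replicationFactor) {
        if (replicationFactor <= 0) {
            throw new IllegalArgumentException("replicationFactor must be positive");
        }
        return replicationFactor / 2 + 1;
    }

    // True when every read set must share at least one replica with every write set
    static boolean readsSeeLatestWrite(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        System.out.println("quorum for RF=3: " + quorum(rf));
        System.out.println("QUORUM reads + QUORUM writes overlap: "
                + readsSeeLatestWrite(quorum(rf), quorum(rf), rf));
        System.out.println("ONE reads + ONE writes overlap: "
                + readsSeeLatestWrite(1, 1, rf));
    }
}
```

With a replication factor of 3, reading and writing at QUORUM touches two replicas each, so the sets overlap; reading and writing at ONE trades that guarantee for lower latency.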
  • 13. CQL
      Insert data with CQL:
        cqlsh> INSERT INTO Location (KEY, City) VALUES ('00001', 'Colombo');
      Retrieve data with CQL:
        cqlsh> select * from Location where KEY='00001';
  • 14. Apache Hadoop ● Project Site: http://hadoop.apache.org ● Latest Version: 1.0.1 ● Hadoop is in use at Amazon, Yahoo, Adobe, eBay, Facebook ● Commercial support : http://hortonworks.com http://www.cloudera.com
  • 16. How to install Hadoop ● Download the artifact from: http://hadoop.apache.org/common/releases.html ● Extract : tar -xzvf hadoop-1.0.1.tar.gz ● Copy the archive to each data node and extract it there: scp hadoop-1.0.1.tar.gz user@datanode01:/home/hadoop ● Start Hadoop : $HADOOP_HOME/bin/start-all.sh
  • 17. Hadoop CLI - HDFS Format Namenode : $HADOOP_HOME/bin/hadoop namenode -format File operations on HDFS: $HADOOP_HOME/bin/hadoop dfs -lsr / $HADOOP_HOME/bin/hadoop dfs -mkdir /users/deep/wso2
  • 19. Simple Mapreduce Job
      Mapper:
        public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
          private final static IntWritable one = new IntWritable(1);
          private Text word = new Text();
          // Emits (word, 1) for every whitespace-separated token in the input line.
          public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
              word.set(tokenizer.nextToken());
              output.collect(word, one);
            }
          }
        }
  • 20. Simple Mapreduce Job
      Reducer:
        public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
          // Sums the counts collected for one word and emits (word, total).
          public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
              sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
          }
        }
  • 21. Simple Mapreduce Job
      Job Runner:
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");
        // Output types of the job: (word, count)
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        // The reducer doubles as a combiner because summing is associative
        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        // args[0] is the input path, args[1] the output path
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
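To see what the word-count job computes without a Hadoop cluster, the map, shuffle, and reduce phases can be imitated in plain Java. This is only an in-memory sketch under that assumption: the class WordCountSketch and its methods are hypothetical, and none of Hadoop's distribution, fault tolerance, or I/O formats are modelled.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical in-memory imitation of the word-count map, shuffle and reduce phases.
class WordCountSketch {
    // Map phase: emit (word, 1) for every whitespace-separated token
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> output = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            output.add(new AbstractMap.SimpleEntry<String, Integer>(tokenizer.nextToken(), 1));
        }
        return output;
    }

    // Reduce phase: sum all counts collected for one word
    static int reduce(String word, Iterator<Integer> values) {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        return sum;
    }

    // Shuffle: group the mapper output by word, then reduce each group
    static SortedMap<String, Integer> run(List<String> lines) {
        SortedMap<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<Integer>())
                       .add(pair.getValue());
            }
        }
        SortedMap<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            counts.put(group.getKey(), reduce(group.getKey(), group.getValue().iterator()));
        }
        return counts;
    }
}
```

In the real job the grouping step is performed by the framework between the Mapper and the Reducer, across machines, rather than by a local TreeMap.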
  • 22. High level Mapreduce Interfaces ● Hive ● Pig
  • 23. Q&A