Big Data Practice
A Planning guide to set up Big Things..
Knowing the Big Data Fundamentals
 Big Data describes a holistic information management strategy that includes
and integrates many new types of data and data management alongside
traditional data.
 Big Data has also been defined by the four “V”s: Volume, Velocity, Variety,
and Value
 The massive proliferation of data and lower-cost computing models have encouraged
broader adoption.
 Big Data has popularized two foundational storage and processing
technologies: Apache Hadoop and the NoSQL database.
The Big Questions on Big Data
 The good news is that everyone has questions about Big Data! Both business
and IT are taking risks and experimenting, and there is a healthy bias by all to
learn.
 Big Data is an enterprise asset: it needs to be managed, from business
alignment to governance, as an integrated element of your current
information management architecture.
 As you move from a proof of concept to running at scale, you will run into the
same issues as other information management challenges, namely skill-set
requirements, governance, performance, scalability, management,
integration, security, and access.
THE BIG DATA QUESTIONS

Business Context
 Business Context: How will we make use of the data?
   Possible answers: Sell new products and services » Personalize customer experiences
 Business Usage: Which business processes can benefit?
   Possible answers: Operational ERP/CRM systems » BI and reporting systems » Predictive analytics, modeling, and data mining
 Data Ownership: Do we need to own (and archive) the data?
   Possible answers: Proprietary » Require historical data » Ensure lineage » Governance

Architecture Vision
 Data Storage: What storage technologies are best for our data reservoir?
   Possible answers: HDFS (Hadoop plus others) » File system » Data warehouse » RDBMS » NoSQL database
 Data Processing: What strategy is practical for my application?
   Possible answers: Leave it at the point of capture » Add minor transformations » ETL data to an analytical platform » Export data to desktops
 Performance: How do we maximize the speed of ad hoc queries, data transformations, and analytical modeling?
   Possible answers: Analyze and transform data in real time » Use parallel processing » Increase hardware and memory

Continued ..

Current State
 Open Source Experience: What experience do we have in open-source Apache projects (Hadoop, NoSQL, etc.)?
   Possible answers: Scattered experiments » Proofs of concept » Production experience » Contributor
 Analytics Skills: To what extent do we employ data scientists and analysts familiar with advanced and predictive analytics tools and techniques?
   Possible answers: Yes » No

Future State
 Best Practices: What are the best resources to guide decisions to build my future state?
   Possible answers: Reference architecture » Development patterns » Operational processes » Governance structures and policies » Conferences and communities of interest » Vendor best practices
 Data Types: How much transformation is required for raw unstructured data in the data reservoir?
   Possible answers: None » Derive a fundamental understanding with a schema or key-value pairs » Enrich the data
 Data Sources: How frequently do sources or content structure change?
   Possible answers: Frequently » Unpredictably » Never
 Data Quality: When should transformations be applied?
   Possible answers: In the network » In the reservoir » In the data warehouse » By the user at point of use » At run time
What’s Different about Big Data?
 Big Data introduces new technology, processes, and skills to your information architecture
and to the people who design, operate, and use them.
 The “V”s define the attributes of Big Data.
 Big Data approaches data structure and analytics differently than traditional information
architectures.
 “Schema on write”:
 In the traditional data warehouse approach, data undergoes standardized ETL processes
and is eventually mapped into predefined schemas.
 Amending a predefined schema is a lengthy process.
 “Schema on read”:
 Here data can be captured without requiring a defined data structure; the structure is
derived either from the data itself or through some other algorithmic process.
 This is enabled by new low-cost, in-memory, parallel-processing hardware/software
architectures such as HDFS/Hadoop and Spark.
 It eliminates the high cost of moving data through an ETL process and brings analytical
capabilities to the data in place, as the sketch below illustrates.
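To make schema-on-read concrete, here is a minimal PySpark sketch. It assumes newline-delimited JSON events have landed in the reservoir as-is; the HDFS path and the "page" field are hypothetical, not taken from this deck:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("schema-on-read-demo").getOrCreate()

# Schema on read: no structure was declared when these raw events were
# written. Spark infers the schema from the JSON itself at read time.
events = spark.read.json("hdfs:///data/raw/clickstream/")  # hypothetical path

events.printSchema()                    # structure derived from the data
events.groupBy("page").count().show()   # ad hoc analysis in place, no ETL
```

Contrast this with schema-on-write, where the schema, the ETL job, and the warehouse table would all have to exist before a single event could be loaded.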
Reference architecture sample
What are the main tools/technologies
employed for Big Data?
 Current Big Data engagements in the industry are centered on Hadoop.
 For the wide variety, types, and volumes of data, the tools in use include:
 Large unstructured data, managed by technologies like Hadoop
 Large structured data, managed by technologies such as Teradata and Netezza
 Real-time data, managed by the likes of SAP HANA, NoSQL, MongoDB, and Cassandra
 Ultimately, the aim is to let business and IT users work with the
variety of technologies that best suits their data analytics requirements.
Big Data Marketplace
Big Data Technology Types &
Specialised Vendors
Apache Hadoop* Stack
Apache Hadoop*: a community development effort

Key development subprojects
 Apache Hadoop* Distributed File System (HDFS): The primary storage system; it keeps multiple replicas of data blocks, distributes them on nodes throughout a cluster, and provides high-throughput access to application data.
 Apache Hadoop MapReduce: A programming model and software framework for applications that perform distributed processing of large datasets on compute clusters (see the word-count sketch after this list).
 Apache Hadoop Common: Utilities that support the Hadoop framework, including file system, remote procedure call (RPC), and serialization libraries.

Other related Hadoop projects
 Apache Avro*: A data serialization system.
 Apache Cassandra*: A scalable multi-master database with no single points of failure.
 Apache Chukwa*: A data collection system for monitoring large distributed systems, built on HDFS and MapReduce; includes a toolkit for displaying, monitoring, and analysing results.
 Apache HBase*: A scalable, distributed database that supports structured data storage for large tables, used for random, real-time read/write access to big data.
 Apache Hive*: A data warehouse infrastructure that provides data summarization, ad hoc querying, and analysis of large datasets in Hadoop-compatible file systems.
 Apache Mahout*: A scalable machine learning and data mining library with implementations of a wide range of algorithms, including clustering, classification, collaborative filtering, and frequent-pattern mining.
 Apache Pig*: A high-level data-flow language and execution framework for expressing parallel data analytics.
 Apache ZooKeeper*: A high-performance, centralized service that maintains configuration information and naming, and provides distributed synchronization and group services for distributed applications.
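To make the MapReduce programming model concrete, here is the classic word count written as a pair of Hadoop Streaming scripts. This is a minimal sketch, and the input/output paths in the usage note are hypothetical:

```python
#!/usr/bin/env python3
# mapper.py: reads raw text on stdin, emits one "word<TAB>1" pair per word
import sys

for line in sys.stdin:
    for word in line.split():
        print(f"{word}\t1")
```

```python
#!/usr/bin/env python3
# reducer.py: Hadoop Streaming delivers mapper output sorted by key, so all
# counts for a given word arrive contiguously and can be summed in one pass.
import sys

current_word, current_count = None, 0
for line in sys.stdin:
    word, count = line.rstrip("\n").split("\t", 1)
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            print(f"{current_word}\t{current_count}")
        current_word, current_count = word, int(count)
if current_word is not None:
    print(f"{current_word}\t{current_count}")
```

A job like this is submitted with the streaming jar that ships with Hadoop, along the lines of: hadoop jar hadoop-streaming-*.jar -files mapper.py,reducer.py -mapper mapper.py -reducer reducer.py -input /data/raw/text -output /data/out/wordcount.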
Hardware requirements
 Here are the recommended specifications for DataNode/TaskTrackers in a
balanced Hadoop cluster:
 12-24 1-4TB hard disks in a JBOD (Just a Bunch Of Disks) configuration
 2 quad-/hex-/octo-core CPUs, running at least 2-2.5GHz
 64-512GB of RAM
 Bonded Gigabit Ethernet or 10 Gigabit Ethernet (the higher the storage density,
the more network throughput is needed)
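As a back-of-the-envelope check on those numbers, the sketch below estimates usable HDFS capacity for a node profile in this range. The replication factor of 3 is the HDFS default; the 25% reserve for intermediate MapReduce output and OS overhead is an assumption, not a fixed rule:

```python
def usable_hdfs_capacity_tb(nodes, disks_per_node=12, disk_tb=4,
                            replication=3, temp_reserve=0.25):
    """Rough usable HDFS capacity (TB) for a balanced cluster.

    replication=3 is the HDFS default; temp_reserve is an assumed
    allowance for MapReduce intermediate data and OS overhead.
    """
    raw_tb = nodes * disks_per_node * disk_tb
    return raw_tb * (1 - temp_reserve) / replication

# Example: 20 DataNodes, each with 12 x 4 TB disks in JBOD
print(usable_hdfs_capacity_tb(20))  # -> 240.0 TB usable of 960 TB raw
```

The same arithmetic explains the networking bullet: the denser each node's storage, the more data a single node failure forces the cluster to re-replicate across the network.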
Start a Big Data Practice
 Identify, Integrate, Collaborate & Solve (IICS):
1. Establish the potential values and objectives
2. Secure commitment from senior management
3. Unveil a dedicated Big Data COE (Center of Excellence)
 Key aspects of the Big Data COE function:
 1. Big Data team: setting up teams with the right mix of technical and domain experts.
 2. Big Data lab: evaluating tools and frameworks, and solutioning.
 3. Business domains: applications of Big Data in various business verticals.
 4. POCs: prepare use cases, models, and pilot projects as testimonials.
 5. Knowledge banks: blogs, community learning, artifacts, white papers.
 The lesson to learn is that you will go further, faster, if you leverage prior
investments and training.
Big Data Architecture Patterns with
Use Cases
 Use Case #1: Retail Web Log Analysis. In this first example, a leading retailer reported disappointing results
from its web channels during the Christmas season and is looking to improve customers' experience on its online
shopping site. Analysts at the retailer will investigate website navigation patterns, especially abandoned shopping
carts. The architecture challenge is to implement a solution quickly, using mostly existing tools, skills, and
infrastructure, in order to minimize cost and deliver value to the business fast. There are few skilled Hadoop
programmers on staff, but there is solid SQL expertise. Loading all of the data into the existing Oracle data
warehouse, so that the SQL programmers could access it there, was rejected: the data movement would be extensive,
and the processing power and storage required would not make economic sense. The second option was to load the
data into Hadoop and access it directly in HDFS using SQL. The conceptual architecture, as shown in Figure 19,
provides direct access to the Hadoop Distributed File System by simply associating it with an Oracle Database
external table. Once connected, Oracle Big Data SQL enables traditional SQL tools to explore the data set (a sketch
of this pattern follows below).
 The key benefits of this architecture include: Low-cost Hadoop storage » Ability to leverage existing investments
and skills in Oracle SQL and BI tools » No client-side software installation » Leverage of Oracle data warehouse
security » No data movement into the relational database » Fast ingestion and integration of structured and
unstructured data sets
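The external-table association described above can be sketched roughly as follows. This is an illustration only: the table name, columns, HDFS location, and connection details are hypothetical, and the DDL in the comment follows the general shape of the Oracle Big Data SQL ORACLE_HDFS access driver rather than the retailer's actual schema:

```python
# Sketch only: names, columns, paths, and credentials below are hypothetical.
# Assumes a DBA has already created an Oracle Big Data SQL external table
# over the web logs in HDFS, along these general lines:
#
#   CREATE TABLE weblogs (
#     log_time    TIMESTAMP,
#     session_id  VARCHAR2(64),
#     url         VARCHAR2(4000),
#     cart_status VARCHAR2(20)
#   )
#   ORGANIZATION EXTERNAL (
#     TYPE ORACLE_HDFS
#     DEFAULT DIRECTORY DEFAULT_DIR
#     LOCATION ('/data/weblogs/')
#   )
#   REJECT LIMIT UNLIMITED;
#
# Once that exists, any ordinary SQL client can explore the HDFS data.
import cx_Oracle

conn = cx_Oracle.connect("analyst", "secret", "dwhost/ORCLPDB")  # hypothetical
cur = conn.cursor()
cur.execute("""
    SELECT url, COUNT(*) AS abandoned
    FROM weblogs
    WHERE cart_status = 'ABANDONED'
    GROUP BY url
    ORDER BY abandoned DESC
""")
for url, abandoned in cur.fetchmany(10):   # top 10 abandonment pages
    print(url, abandoned)
```

This is what delivers the "no data movement" benefit in the list above: the rows stay in HDFS, and only query results flow back to the SQL tools.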
References
 Step-by-step Tableau Big Data visualization project:
 http://blog.albatrosa.com/visualisation/step-by-step-guide-to-building-your-tableau-visualisation-big-data-project/
 Google search
 http://www.oracle.com/technetwork/
 www.intel.in/content/dam/..
Finally …..
Thank you
Editor's Notes

  1. Courtesy/reference: Google - www.directbi.com
  2. Big Data Team: While setting up a team for Big Data, the skills to look for are:
     Data processing: (1) open-source frameworks such as Apache Hadoop, HBase, Hive, Pig, Scala, Spark, and Python; (2) Big Data vendors such as Cloudera, MongoDB, and Cassandra.
     Data analytics: the team needs expertise in advanced analytical models, statistical methods, machine learning, and data science.
     Big Data Lab: set up clusters on hardware with substantially more RAM than usual for Big Data processing.
     Big Data proofs of concept (case studies): develop POC projects and use-case models that demonstrate your capabilities, to showcase to potential customers as testimonials.