ORC File Introduction
Owen O'Malley
I present the Optimized Row Columnar (ORC) file format for Apache Hive.
1. ORC Files
Owen O’Malley, owen@hortonworks.com, December 2012
© Hortonworks Inc. 2012
2. Top Level
3. File Structure
4. Stripe Structure
5. File Layout
6. Integer Column Serialization
7. String Column Serialization
8. Compression
9. Projection and Predicate Filtering
10. Example File Sizes
11. Final notes
12. Comparison

                               RC File   Trevni   ORC File
Hive Type Model                N         N        Y
Separate complex columns       N         Y        Y
Splits found quickly           N         Y        Y
Default column group size      4MB       64MB*    250MB
Files per a bucket             1         >1       1
Store min, max, sum, count     N         N        Y
Versioned metadata             N         Y        Y
Run length data encoding       N         N        Y
Store strings in dictionary    N         N        Y
Store row count                N         Y        Y
Skip compressed blocks         N         N        Y
Store internal indexes         N         N        Y
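
The comparison lists "Store min, max, sum, count" and "Skip compressed blocks" as ORC File features, and slide 9 is titled "Projection and Predicate Filtering". The transcript does not say how these fit together, so the following is only a minimal sketch of the general idea, assuming one summary per block of a single integer column. The names `BlockStats`, `summarize`, and `blocks_to_read` are hypothetical and are not part of the ORC or Hive APIs, and the sketch does not reproduce ORC's actual on-disk layout.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockStats:
    """Hypothetical per-block summary for one integer column."""
    minimum: int
    maximum: int
    total: int
    count: int


def summarize(values: List[int]) -> BlockStats:
    """Compute min, max, sum, and count for one block of values."""
    return BlockStats(min(values), max(values), sum(values), len(values))


def blocks_to_read(stats: List[BlockStats], low: int, high: int) -> List[int]:
    """Return the indexes of blocks whose [min, max] range overlaps [low, high].

    Blocks outside the range cannot contain a matching value, so a reader
    could skip them without decompressing them.
    """
    return [
        i for i, s in enumerate(stats)
        if s.maximum >= low and s.minimum <= high
    ]


# Three illustrative blocks of a single integer column (made-up data).
blocks = [[1, 5, 9], [20, 22, 31], [40, 41, 47]]
stats = [summarize(b) for b in blocks]

# Predicate: value BETWEEN 21 AND 30. Only the second block can match.
selected = blocks_to_read(stats, 21, 30)
print(selected)
```

The check is conservative: a selected block may still hold no matching rows, so the reader must apply the predicate to the rows it does read. The summaries only rule blocks out.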