HP Vertica Certification Guide
Softtek 2015
Vertica Architecture
Identify key features of Vertica
1. Performance Features
1. Column-orientation
2. Aggressive Compression
3. Read-Optimized Storage
4. Ability to exploit multiple sort orders
5. Parallel shared-nothing design on off-the-shelf hardware
6. Bottom Line
2. Administrative and Management Features
1. Vertica Database Designer
2. Recovery and High Availability through K-Safety
3. Continuous Load: Snapshot Isolation and the WOS
4. Monitoring and Administration Tools and APIs
Cristóbal Gómez | Identify key features of Vertica | 1
The Vertica Analytic Database Architecture
ROS Distribution And Tuple Mover
Victor Espinosa | Topic | # Page
Topics:
- Describe High Availability capabilities and describe Vertica’s transaction model.
- Identify characteristics and determine features of projections used in Vertica.
High Availability. Ability of the database to continue running even if a node goes
down.
[Figure: projections A, B, C in one row and their buddy copies C, A, B in the row below, offset by one position.]
Buddy Projections: copies of existing projections stored in adjacent nodes.
K-Safety: 0,1,2
High Availability and Recovery
- An HP Vertica database is said to be K-safe when it can lose up to K nodes and keep running.
High Availability with Projections.
- Vertica replicates small, unsegmented projections, creating and storing duplicates of them on all nodes.
- For large, segmented projections, HP Vertica creates buddy projections: copies of the segmented projections that are distributed across database nodes.
Features
- Columnar Orientation.
Vertica stores data in columns and reads only the columns referenced by the query.
- Advanced Encoding / Compression.
Data is compressed and encoded as part of the database design.
This reduces disk storage.
Encoded data does not need to be decoded to return a result.
- High Availability.
- Automatic Database Design
Database Designer transforms data into column-based projections.
It can improve query performance by comparing the loaded data with the most commonly used SQL queries.
- Application Integration.
Vertica uses standard SQL.
- Massively Parallel Processing.
[Figure: application integration. Labels: ETL, Replication, Data Quality, Vertica Analytics, Reporting.]
Projections
Characteristics and Features.
A projection is a representation of the columns in the source tables.
Vertica stores data in a columnar format called projections.
Vertica stores all data in projections.
Projections are updated automatically as data is loaded into the database.
Data is sorted and compressed.
Vertica distributes the data across all nodes.
3 Types of Projections:
Superprojections. Contain all data; they are created when data is first loaded into the database.
Query-Specific Projections. Contain only the columns needed for a specific query.
Buddy Projections. Copies of projections stored on an adjacent node.
Projections and the amount of data:
Projections with a large amount of data are segmented across the nodes. For a small amount of data, segmentation is not efficient, so Vertica copies the full projection to each node.
Create projections using DDL (Data Definition Language)
Vertica’s Transaction Model.
Vertica follows the SQL-92 transaction model.
- DML commands: INSERT, UPDATE, DELETE.
- You do not have to explicitly start a transaction.
- You must use COMMIT or ROLLBACK to end a transaction; COPY also ends it, because COPY commits automatically by default.
In Vertica:
- DELETE doesn’t delete data from disk storage; it marks rows as deleted so they can be found by historical queries.
- UPDATE writes two rows: one with new data and one marked for deletion.
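The delete and update behavior described here can be pictured with a toy model. The sketch below is plain Python, not Vertica code; `ToyTable`, its fields, and the epoch counter are hypothetical names used only for illustration. Rows are never removed: a delete records the epoch at which the row stopped being visible, and an update is a delete followed by an insert.

```python
class ToyTable:
    """Toy model of delete marking; not Vertica's actual storage format."""

    def __init__(self):
        self.rows = []          # (epoch inserted, value); never removed here
        self.deleted_at = {}    # row position -> epoch at which it was deleted
        self.epoch = 0          # hypothetical commit counter

    def insert(self, value):
        self.epoch += 1
        self.rows.append((self.epoch, value))

    def delete(self, value):
        self.epoch += 1
        for pos, (_, v) in enumerate(self.rows):
            if v == value and pos not in self.deleted_at:
                self.deleted_at[pos] = self.epoch   # mark, do not remove

    def update(self, old, new):
        self.delete(old)    # the old row is marked for deletion
        self.insert(new)    # the new row is written alongside it

    def select(self, at_epoch=None):
        """Return the values visible at the given epoch (default: latest)."""
        at = self.epoch if at_epoch is None else at_epoch
        return [v for pos, (ins, v) in enumerate(self.rows)
                if ins <= at and self.deleted_at.get(pos, at + 1) > at]


t = ToyTable()
t.insert("a")                         # epoch 1
t.insert("b")                         # epoch 2
t.update("a", "c")                    # epoch 3 marks "a", epoch 4 inserts "c"
current = t.select()                  # latest view
historical = t.select(at_epoch=2)     # view as of epoch 2, before the update
```

In this model the current view no longer shows "a", but a historical read at epoch 2 still does, and all three rows remain stored until something purges them.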
Like COPY, by default, INSERT, UPDATE and DELETE commands write the data to the WOS and, on overflow, write to the ROS. For large INSERTs or UPDATEs, you can use the DIRECT keyword to force HP Vertica to write rows directly to the ROS.
Loading a large number of rows as single-row inserts is not recommended for performance reasons. Use COPY instead.
Cristóbal Gómez | Topic | # Page
Topics
A1 - Identify key features of Vertica
C1 - Identify benefits of loading data into WOS and directly into ROS
D4 - Distinguish between deleting partitions and deleting records
F1 - Identify situations when a backup is recommended
H1 - Understanding analytics syntax
Identify benefits of loading data into WOS and directly into ROS
Arely Sandoval
Encoding
The process of converting data into a standard format. Vertica uses a number of different encoding strategies, depending on column data type, table cardinality, and sort order.
Compression
The process of transforming data into a compact format.
Encoding Types
ENCODING AUTO (default)
Lempel-Ziv-Oberhumer-based (LZO) compression is used for CHAR/VARCHAR, BOOLEAN,
BINARY/VARBINARY, and FLOAT columns.
ENCODING DELTAVAL
Stores only the differences between sequential data values instead of the values themselves. This
encoding type is best used for integer-based columns, but also applies to
DATE/TIME/TIMESTAMP/INTERVAL columns. It has no effect on other data types.
ENCODING RLE
Arely Sandoval | A3- Differentiate between compression and encoding| # Page
ENCODING BLOCK_DICT
For each block of storage, Vertica compiles distinct column values into a dictionary and then stores the dictionary and a list of indexes to represent the data block. It is ideal for few-valued, unsorted columns in which saving space is more important than encoding speed. BINARY/VARBINARY columns do not support BLOCK_DICT encoding.
ENCODING BLOCKDICT_COMP
This encoding type is similar to BLOCK_DICT except that dictionary indexes are entropy coded. This encoding type
requires significantly more CPU time to encode and decode and has a poorer worst-case performance. However, use
of this type can lead to space savings if the distribution of values is extremely skewed.
ENCODING DELTARANGE_COMP
It is ideal for many-valued FLOAT columns that are either sorted or confined to a range. Do not use it with unsorted columns that contain NULL values, as the storage cost for representing a NULL value is high. It has a high cost for both compression and decompression.
ENCODING COMMONDELTA_COMP
It is ideal for sorted FLOAT and INTEGER-based (DATE/TIME/TIMESTAMP/INTERVAL) data columns with predictable sequences and only occasional sequence breaks, such as timestamps recorded at periodic intervals or primary keys.
ENCODING NONE
Do not specify this value. It increases space usage, increases processing time, and leads to problems.
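Two of the encoding types listed here are easy to picture with a few lines of code. The following is a minimal Python sketch that follows the descriptions given in this section (a per-block dictionary plus indexes for BLOCK_DICT, and differences between sequential values for DELTAVAL). It is an illustration only; the function names are hypothetical and nothing here reflects Vertica's on-disk format.

```python
def block_dict_encode(block):
    """Dictionary-encode one block: distinct values plus a list of indexes."""
    dictionary = sorted(set(block))
    index_of = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [index_of[value] for value in block]


def block_dict_decode(dictionary, indexes):
    """Rebuild the block by looking each index up in the dictionary."""
    return [dictionary[i] for i in indexes]


def delta_encode(values):
    """Keep the first value, then only the difference to the previous value."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the values by accumulating the differences."""
    out = []
    for d in deltas:
        out.append(d if not out else out[-1] + d)
    return out


states = ["NY", "CA", "NY", "TX", "CA", "NY"]    # few-valued, unsorted column
dictionary, indexes = block_dict_encode(states)

timestamps = [1000, 1060, 1120, 1180, 1245]      # integer-based, mostly regular
deltas = delta_encode(timestamps)
```

The dictionary form stores each distinct string once and small integers for every row; the delta form turns large, slowly changing numbers into small ones, which is why it suits integer-based columns.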
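-- List the encoding type of a column in every projection that contains it.
-- Replace 'Column_Name' with the name of the column to inspect.
SELECT
    projection_name,
    projection_column_name,
    encoding_type,
    data_type
FROM
    v_catalog.projection_columns
WHERE
    projection_column_name = 'Column_Name';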
Differentiate between compression and encoding
● Encoded data can be processed directly by Vertica.
● Compressed data cannot be directly processed by Vertica. Data must first
be decompressed.
● Encoding depends on the data type of the data being encoded, whereas compression treats a block as opaque bytes, regardless of what it contains.
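The difference can be shown with a small Python sketch. It uses run-length pairs as a stand-in for an encoded column and `zlib` as a stand-in for a general-purpose compressor; both choices are assumptions made for illustration and are not the algorithms Vertica necessarily applies.

```python
import zlib
from itertools import groupby

column = [10] * 4 + [20] * 3 + [30] * 5    # a sorted, low-cardinality column

# Encoded form: (value, run length) pairs that still expose the values.
runs = [(value, len(list(group))) for value, group in groupby(column)]

# An aggregate can be computed on the encoded form without expanding it.
total_from_runs = sum(value * count for value, count in runs)

# Compressed form: opaque bytes; the values are not visible until decompressed.
raw = ",".join(map(str, column)).encode()
compressed = zlib.compress(raw)
restored = [int(x) for x in zlib.decompress(compressed).decode().split(",")]
total_from_compressed = sum(restored)
```

Both paths give the same sum, but only the encoded path gets there without first reconstructing every original value.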
● D6 - Identify the advantages of a group by pipe versus a
group by hash
● F3 - Define the Resource Manager's role in query
processing
● H3 - Using explain plans and query profiles
Juan Carlos Vázquez Tapia | Topic | # Page
Juan Carlos Vazquez Tapia
Topics
● Friday, March 20
○ Section: Projection Design
■ B5 - Understanding buddy projections.
● Tuesday, March 24
○ Section: Removing Data Permanently from Vertica and Advanced Projection Design.
■ D2 - Identify the advantages and disadvantages of using delete vectors to identify records marked for deletion.
● Wednesday, March 25
○ Section: Cluster Management in Vertica.
■ E4 - Define local segmentation capability in Vertica.
● Thursday, March 26
○ Section: Monitoring and Troubleshooting Vertica.
■ G4 - Defining, using and logging into Management Console.
Juan Carlos Vázquez Tapia | Understanding Buddy Projections | # Page
Projection Design
B5 - Understanding Buddy Projections
Definition:
HP Vertica creates buddy projections, which are replicas of existing projections, and distributes these replicas across the database nodes.
HP Vertica ensures that projections that contain the same data are placed on different nodes. This ensures that if a node goes down, all the data is available on the remaining nodes.
The number of buddy projections is determined by the value of K, as in K-safety.
B5 - Understanding Buddy Projections
Requirements:
Two projections must meet the following requirements to be considered “buddies”:
● They contain the same columns.
● They have the same hash segmentation.
● They use different node ordering.
Buddy projections can have different sort orders for query performance purposes.
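The effect of "same segmentation, different node ordering" can be sketched in Python. The model below is a simplification under stated assumptions: one segment per node, a buddy that is shifted by one node, and hypothetical helper names (`placement`, `available_segments`). It is not how Vertica computes placement internally.

```python
def placement(num_nodes, offset):
    """Map each segment to a node; the offset models a different node ordering."""
    return {segment: (segment + offset) % num_nodes for segment in range(num_nodes)}


def available_segments(placements, down_nodes):
    """Segments that still have at least one copy on a node that is up."""
    segments = set()
    for mapping in placements:
        segments.update(s for s, node in mapping.items() if node not in down_nodes)
    return segments


NUM_NODES = 3
primary = placement(NUM_NODES, offset=0)    # segment i on node i
buddy = placement(NUM_NODES, offset=1)      # same segments, shifted by one node

# With one buddy (K=1), losing any single node leaves every segment readable.
survives_single_failure = all(
    available_segments([primary, buddy], {down}) == set(range(NUM_NODES))
    for down in range(NUM_NODES)
)

# Losing the two nodes that hold both copies of a segment makes it unavailable.
after_two_failures = available_segments([primary, buddy], {0, 1})
```

In this toy cluster a single failure never loses data, while taking down nodes 0 and 1 together removes both copies of segment 0.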
Juan Neve
B4.- Describe the process of projection segmentation.
D1.- Describe the process used to mark records for
deletion.
E3.- Identify the steps of online recovery of a failed node.
G3.- Describe how to disallow user connections, while
preserving dbadmin connectivity.
B4.- Describe the purpose of
projection segmentation
● Provides high availability
● Recovery of data
● Optimizes query execution
Juan Antonio Neve Gómez | Page 1
[Figure labels: Segmented, Duplicated]
Segmentation
Random distribution of data is very important for segmentation to be effective. It keeps the load on each node to a minimum, so queries run more efficiently.
Replicated projections provide high availability because all of the data is available on each node. They also help recovery because there are more copies on the other nodes.
Carlos Leal
1. Determining segmentation and partitioning (B6)
2. Identify the process for processing a large delete or update (D3)
3. Distinguish between the items in Vertica Cluster (E5)
4. Administering a cluster using management console (F5)
Carlos Ivan Leal
Determining Segmentation and Partitioning
Partitioning and segmentation have completely separate functions in Vertica. It is important to clarify the differences because the concepts are similar, and the terms are often used interchangeably for other databases.
Carlos Leal | Segmentation and Partitioning | B6
Segmentation and Partitioning
Segmentation defines how data is spread among cluster nodes, while partitioning specifies how data is
organized within the individual nodes. Segmentation is defined by the projection, and partitioning is defined
by the table. Logically, the partition clause is applied after the segmented by clause.
Segmentation and partitioning have opposite goals regarding data localization. Partitioning attempts to
introduce hot spots within the node, allowing for a convenient way to drop data and reclaim the disk space.
Segmentation (by hash) distributes the data evenly across all nodes in a Vertica cluster.
Partitioning by year, for example, makes sense if you intend to retain and drop data at the granularity of a
year. On the other hand, segmenting the data by year would be an extremely bad choice, as the node holding
data for the current year would likely answer far more queries than the other nodes.
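The skew that results from segmenting by year can be illustrated with a short Python sketch. Everything in it is assumed for the sake of the example: a four-node cluster, synthetic rows, SHA-256 standing in for the segmentation hash, and `year % NUM_NODES` standing in for "one node per year".

```python
import hashlib
from collections import Counter

NUM_NODES = 4

# Hypothetical fact rows: (order_id, year). 70% of the rows are from 2015.
rows = [(order_id, 2015 if order_id % 10 < 7 else 2014) for order_id in range(10_000)]


def node_by_hash(order_id):
    """Segment on a hash of a high-cardinality key (SHA-256 is a stand-in)."""
    digest = hashlib.sha256(str(order_id).encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_NODES


def node_by_year(year):
    """Segment on year: every row of a given year lands on the same node."""
    return year % NUM_NODES


hash_counts = Counter(node_by_hash(order_id) for order_id, _ in rows)
year_counts = Counter(node_by_year(year) for _, year in rows)
```

Hashing the key spreads the rows over all four nodes in roughly equal shares, while segmenting by year puts 7,000 rows on one node, 3,000 on another, and leaves two nodes idle.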
Carlos Leal | Identify the process for processing a large delete or update
Identify the process for processing a large delete or update (D3)
● Performance Considerations for Deletes and Updates
A large number of (un-purged) deleted rows can negatively affect query and recovery performance.
To eliminate the deleted rows from the result, a query must do extra processing. It has been observed that if 10% or more of the total rows in a table have been deleted, the performance of a query on the table slows down. However, your experience may vary depending upon the size of the table, the table definition, and the query. The same problem can also happen during recovery. To avoid this, the deleted rows need to be purged in Vertica. For more information, see Purge Procedure.
Carlos Leal | Concurrency
Concurrency
Deletes and updates take exclusive locks on the table. Hence, only one delete or update
transaction on that table can be in progress at a time and only when no loads (or INSERTs) are
in progress. Deletes and updates on different tables can be run concurrently.
Carlos Leal | Optimizing
Optimizing Deletes and Updates for Performance
The process of optimizing a design for deletes and updates is the same. Some simple steps to
optimize a projection design or a delete or update statement can increase the query
performance by tens to hundreds of times. The following section details several proposed
optimizations to significantly increase delete and update performance.
Topics (Manuel Loza)
● B2 - Define RLE
● C6 - Understanding both WOS and ROS
● E1 - Identify the steps used to add nodes to an existing cluster
● G1 - Define the use of Management Console in
monitoring Vertica
Define RLE
Run-Length Encoding
o An encoding method.
o Increases performance because there is less disk I/O during query execution.
o Stores more data in less space.
How does it work?
● Replaces each sequence of the same data value within a column with a single value and a count.
Typically used when data is:
1. Sorted
2. Low cardinality
3. Of any data type
Example:
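A minimal Python sketch of run-length encoding follows. The function names and sample columns are made up for illustration; the sketch shows the idea of value-and-count pairs, not Vertica's storage layout.

```python
from itertools import groupby


def rle_encode(column):
    """Replace each run of equal values with a (value, count) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


sorted_column = ["F", "F", "F", "F", "M", "M", "M"]      # sorted, low cardinality
unsorted_column = ["F", "M", "F", "M", "F", "M", "F"]    # same values, unsorted

encoded_sorted = rle_encode(sorted_column)
encoded_unsorted = rle_encode(unsorted_column)
```

The sorted column collapses to two pairs, while the unsorted one needs a pair per row, which is why RLE is typically paired with a sort order on the low-cardinality column.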
Understanding both WOS and ROS
Write Optimized Store (WOS)
● Memory-Resident
● Used to store INSERT, UPDATE, DELETE and COPY actions
● Arranged by projection
● Records are stored in the order they are inserted
o Stores data without compression or indexing
■ Supports very fast load speeds
● A projection is sorted only when queried
o Remains sorted until new data is inserted into it
● Holds both committed and uncommitted transactions
Read Optimized Store (ROS)
● Disk storage structure
o Highly optimized
o Read oriented
● Like WOS, ROS is arranged by projection
o Projections in ROS are stored in ROS containers
● Makes optimal use of sorting (indexing) and compression
● COPY...DIRECT and INSERT (with the /*+ direct */ hint)
o Load data directly into ROS
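The relationship between the two stores can be pictured with a toy Python model. `ToyProjectionStore`, its capacity setting, and its method names are hypothetical; the model only mirrors the points listed here (WOS keeps insertion order, ROS holds sorted containers, direct loads bypass the WOS) and the moveout task named in the topic lists, and it ignores compression, commits, and mergeout.

```python
class ToyProjectionStore:
    """Toy model of one projection's WOS and ROS; names and sizes are hypothetical."""

    def __init__(self, wos_capacity):
        self.wos_capacity = wos_capacity
        self.wos = []               # in memory, kept in insertion order
        self.ros_containers = []    # on disk in the real system; sorted lists here

    def load(self, rows, direct=False):
        rows = list(rows)
        if direct or len(self.wos) + len(rows) > self.wos_capacity:
            # DIRECT load, or WOS overflow: write a sorted container to ROS.
            self.ros_containers.append(sorted(rows))
        else:
            self.wos.extend(rows)   # fast path: no sorting, no compression

    def moveout(self):
        """Move the WOS contents into a new sorted ROS container."""
        if self.wos:
            self.ros_containers.append(sorted(self.wos))
            self.wos = []


store = ToyProjectionStore(wos_capacity=5)
store.load([7, 3, 9])                         # small load lands in the WOS
store.load(range(100, 0, -1), direct=True)    # bulk load goes straight to ROS
wos_before = list(store.wos)
store.moveout()
```

Before the moveout the WOS still holds the three rows in arrival order; afterwards it is empty and the ROS has a second, sorted container.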
Luis Cárdenas
C2 Define the actions of the move out and merge out tasks
D5 Identify the advantages of merge join versus hash join.
F2 Features of the Vertica file used for backup and restore
H2 Using event-based windows, time series, event series joins, and pattern matching.
Ruben Gonzalez
A. Vertica Architecture (Friday 20)
4. Installation of Vertica.
C. Loading Data into Vertica. (Monday 23)
4. Copying data directly to ROS
D. Removing Data Permanently from Vertica and Advanced Projection Design. (Tuesday 24)
7. Describe the characteristics of a prejoin projection.
F. Backup/Restore and Resource Management in Vertica. (Thursday 26)
4. Describe the differences between MAXCONCURRENCY and PLANNEDCONCURRENCY.
Laura López
B3 - Describe Order By importance in projection design
C7 - Distinguishing between moveout and mergeout
actions
E2 - Describe the benefits of having identically sorted
buddy projections
G2 - Determine methods to troubleshoot spread
B3 - Describe Order By importance in projection design
● Specifies the columns to sort the projection on.
● You cannot specify an ascending or descending clause.
● HP Vertica always uses an ascending sort order in
physical storage.
● If you do not specify the ORDER BY table-column
parameter, HP Vertica uses the order in which columns
are specified as the sort order for the projection.
● One of the ways the projections can be optimized.
Identifying characteristics of data file directory
Disk Space Requirements for HP Vertica
In addition to actual data stored in the database, HP Vertica requires disk space for several data reorganization operations, such as mergeout and managing nodes in the cluster. For best results, HP recommends that disk utilization per node be no more than sixty percent (60%) for a K-Safe=1 database to allow such operations to proceed.
In addition, disk space is temporarily required by certain query execution operators, such as hash
joins and sorts, in the case when they cannot be completed in memory (RAM). Such operators
might be encountered during queries, recovery, refreshing projections, and so on. The amount of
disk space needed (known as temp space) depends on the nature of the queries, amount of data on
the node and number of concurrent users on the system. By default, any unused disk space on the
data disk can be used as temp space. However, HP recommends provisioning temp space
separate from data disk space. See Configuring Disk Usage to Optimize Performance
Prepare the Logical Schema Script
Designing a logical schema for an HP Vertica database is no different from designing one for any
other SQL database. Details are described more fully in Designing a Logical Schema.
To create your logical schema, prepare a SQL script (plain text file, typically with an extension of
.sql) that:
Prepare Data Files
Prepare two sets of data files:
● Test data files. Use test files to test the database after the partial data load. If possible, use part of the actual data files to prepare the test data files.
● Actual data files. Once the database has been tested and optimized, use your data files for your initial Bulk Loading Data.
How to Name Data Files
Name each data file to match the corresponding table in the logical schema. Case does not matter. Use the extension .tbl or whatever you prefer. For example, if a table is named Stock_Dimension, name the corresponding data file stock_dimension.tbl. When using multiple data files, append _nnn (where nnn is a positive integer in the range 001 to 999) to the file name. For example, stock_dimension.tbl_001, stock_dimension.tbl_002, and so on.
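The naming rule can be captured in a small helper. This is a Python sketch with a hypothetical function name; lowercasing the table name is a choice made here for consistency, since the rule says case does not matter.

```python
def data_file_names(table_name, num_files, extension="tbl"):
    """Build data file names for a table, following the naming rule above."""
    base = f"{table_name.lower()}.{extension}"
    if num_files == 1:
        return [base]
    if not 1 <= num_files <= 999:
        raise ValueError("suffix must be in the range 001 to 999")
    # Multiple files: append _nnn, zero-padded to three digits.
    return [f"{base}_{n:03d}" for n in range(1, num_files + 1)]


single = data_file_names("Stock_Dimension", 1)
several = data_file_names("Stock_Dimension", 3)
```

A single file keeps the plain name, and multiple files receive the zero-padded numeric suffix.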
Documentation
Core:
● HP Vertica Architecture White Paper (Key Features)
● HP Vertica 7.1 complete
● HP_Vertica_7.1.x_administrators Guide
● HP Vertica’s Certification Topic List
● Braindumps
● Built-in Pools
● HP2 - N36 Exam Prep Guide
● Vertica Client 7.1.1.032 32bits
● VNC Portable
● DBeaver
● PuTTY Direct Download
● Host: verticaserver.cloudapp.net Port: 22 User: dbadmin Pass: admin
● To access Vertica > VMart, run the command: “/opt/vertica/bin/admintools”
● Tableau (client for data extraction).
● JDBC Driver
Documentation pt2
The following files are located on the install disc:
HP_Vertica_7.1.x_ Administrators Guide
HP_Vertica_7.1.x_ Analyzing Data
HP_Vertica_7.1.x_ Best Practices for OEM Customers
HP_Vertica_7.1.x_ Concepts Guide
HP_Vertica_7.1.x_ Connecting To HP Vertica
HP_Vertica_7.1.x_ Cpp_SDK_API
HP_Vertica_7.1.x_ Distributed_R
HP_Vertica_7.1.x_ Error Messages
HP_Vertica_7.1.x_ Extending HP Vertica
HP_Vertica_7.1.x_ Flex_tables
HP_Vertica_7.1.x_ Flex Canonical CEF Parser
HP_Vertica_7.1.x_ Flextables Quickstart
HP_Vertica_7.1.x_ Getting Started
HP_Vertica_7.1.x_ HP Vertica For SQL On Hadoop
HP_Vertica_7.1.x_ Informatica_plug-ing_Guide
HP_Vertica_7.1.x_ Install_Guide
HP_Vertica_7.1.x_ Integrating Apache Hadoop
Documentation pt3
The following files are located on the install disc:
HP_Vertica_7.1.x_ Java_SDK_API
HP_Vertica_7.1.x_ MS_Connectivity_Pack
HP_Vertica_7.1.x_ New_Features
HP_Vertica_7.1.x_ Place
HP_Vertica_7.1.x_ Pulse
HP_Vertica_7.1.x_ SQL_Reference_Manual
HP_Vertica_7.1.x_ Supported_Platforms
HP_Vertica_7.1.x_ Third_Party
FAQs
1. What HDD Format Configuration is recommended for Data and Log Files?
2. What are the TOP Best Practices for Configuration?

Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 

Hpverticacertificationguide 150322232921-conversion-gate01

  • 1. HP Vertica Certification Guide Softtek 2015
  • 3. Identify key features of Vertica 1. Performance Features 1. Column-orientation 2. Aggressive Compression 3. Read-Optimized Storage 4. Ability to exploit multiple sort orders 5. Parallel shared-nothing design on off-the-shelf hardware 6. Bottom Line 2. Administrative and Management Features 1. Vertica Database Designer 2. Recovery and High Availability through K-Safety 3. Continuous Load: Snapshot Isolation and the WOS 4. Monitoring and Administration Tools and APIs Cristóbal Gómez | Identify key features of Vertica | 1
  • 4. The Vertica Analytic Database Architecture Cristóbal Gómez | Identify key features of Vertica | 2
  • 5. ROS Distribution And Tuple Mover Cristóbal Gómez | Identify key features of Vertica | 3
  • 6. Victor Espinosa | Topic | # Page Topics: - Describe High Availability capabilities and describe Vertica’s transaction model. - Identify characteristics and determine features of projections used in Vertica.
  • 7. High Availability. Ability of the database to continue running even if a node goes down. Proj A Proj B Proj C Proj C Proj A Proj B Buddy Projections: copies of existing projections stored in adjacent nodes. K-Safety: 0,1,2
  • 8. High Availability and Recovery - HP Vertica is said to be K-safe. High Availability with Projections. - Vertica replicates small, unsegmented projections. - It creates buddy projections for large, segmented projections. - For small tables, it replicates them, creating and storing duplicates of these projections on all nodes. - For large tables, HP Vertica creates buddy projections, which are copies of segmented projections that are distributed across database nodes.
  • 9. Features - Columnar Orientation. Vertica stores data in columns, reads only the columns referenced by the query. - Advanced Encoding / Compression. compress and encode as part of the database design. reduce disk storage. data does not need to be unencoded to return a result. - High Availability. - Automatic Database Design transform data into column-based projections. query performance can be enhanced by comparing the data loaded and the most commonly used SQL queries. - Application Integration. Vertica uses standard SQL. - Massively Parallel Processing. ETL Replication Data Quality Vertica Analytics Reporting
  • 10. Projections Characteristics and Features. A projection is a representation of the columns in the source tables. Vertica stores data in a columnar format called projections. Vertica stores all data in projections. Projections are updated automatically as data is loaded into the database. Data is sorted and compressed. Vertica distributes the data across all nodes. 3 Types of Projections: Superprojections. Contain all data; they are created when data is first loaded into the database. Query-Specific Projections. Contain only the columns needed for a specific query. Buddy Projections. Copies of projections stored on an adjacent node.
  • 11. Projections with large amounts of data are segmented across the nodes. For small amounts of data, segmentation is not efficient, so Vertica copies the full projection to each node.
  • 12. Create projections using DDL (Data Definition Language)
  • 13. Vertica’s Transaction Model. Vertica follows the SQL-92 transaction model. - DML commands: INSERT, UPDATE, DELETE. - You don’t have to explicitly start a transaction. - A transaction ends with COMMIT, ROLLBACK, or COPY (which commits automatically by default). In Vertica: - DELETE doesn’t delete data from disk storage; it marks rows as deleted so they can still be found by historical queries. - UPDATE writes two rows: one with the new data and one marked for deletion. Like COPY, by default, INSERT, UPDATE and DELETE commands write the data to the WOS and, on overflow, write to the ROS. For large INSERTs or UPDATEs, you can use the DIRECT keyword to force HP Vertica to write rows directly to the ROS. Loading a large number of rows as single-row inserts is not recommended for performance reasons. Use COPY instead.
  • 14. Cristóbal Gómez | Topic | # Page Topics A1 - Identify key features of Vertica C1 - Identify benefits of loading data into WOS and directly into ROS D4 - Distinguish between deleting partitions and deleting records F1 - Identify situations when a backup is recommended H1 - Understanding analytics syntax
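The delete and update behaviour described on this slide can be illustrated with a small sketch. This is a toy model in Python, not Vertica's storage engine: the `Table` class, its method names, and the per-statement epoch counter are all assumptions made for the illustration (a real database advances epochs on commit). It only shows why a row that is marked as deleted remains visible to a historical query.

```python
class Table:
    """Toy append-only table with delete markers (hypothetical model)."""

    def __init__(self):
        self.rows = []        # append-only storage: (insert_epoch, value)
        self.deleted_at = {}  # row position -> epoch in which it was marked deleted
        self.epoch = 0        # simplified: advances once per statement

    def insert(self, value):
        self.epoch += 1
        self.rows.append((self.epoch, value))
        return len(self.rows) - 1

    def delete(self, position):
        # Nothing is removed from storage; the row is only marked.
        self.epoch += 1
        self.deleted_at[position] = self.epoch

    def update(self, position, value):
        # An update marks the old row as deleted and appends the new one.
        self.epoch += 1
        self.deleted_at[position] = self.epoch
        self.rows.append((self.epoch, value))
        return len(self.rows) - 1

    def select(self, at_epoch=None):
        # A row is visible if it was inserted at or before the epoch
        # and not yet marked deleted as of that epoch.
        at = self.epoch if at_epoch is None else at_epoch
        return [
            value
            for position, (inserted, value) in enumerate(self.rows)
            if inserted <= at and self.deleted_at.get(position, at + 1) > at
        ]


table = Table()
first = table.insert("a")   # epoch 1
second = table.insert("b")  # epoch 2
table.update(first, "a2")   # epoch 3
table.delete(second)        # epoch 4

print(table.select())            # current view
print(table.select(at_epoch=2))  # historical view
print(len(table.rows))           # rows still held in storage
```

The current view contains only the updated value, the historical view at epoch 2 still returns both original rows, and storage holds three rows until something equivalent to a purge removes the marked ones.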
  • 15. Identify benefits of loading data into WOS and directly into ROS
  • 16. Arely Sandoval Encoding is the process of converting data into a standard format. Vertica uses a number of different encoding strategies, depending on column data type, table cardinality, and sort order. Compression is the process of transforming data into a compact format. Encoding Types ENCODING AUTO (default) Lempel-Ziv-Oberhumer-based (LZO) compression is used for CHAR/VARCHAR, BOOLEAN, BINARY/VARBINARY, and FLOAT columns. ENCODING DELTAVAL Records each value as its difference from the smallest value in the data block instead of storing the value itself. This encoding type is best used for integer-based columns, but also applies to DATE/TIME/TIMESTAMP/INTERVAL columns. It has no effect on other data types. ENCODING RLE Arely Sandoval | A3- Differentiate between compression and encoding| # Page
  • 17. ENCODING BLOCK_DICT For each block of storage, Vertica compiles distinct column values into a dictionary and then stores the dictionary and a list of indexes to represent the data block. It is ideal for few-valued, unsorted columns in which saving space is more important than encoding speed. BINARY/VARBINARY columns do not support BLOCK_DICT encoding. ENCODING BLOCKDICT_COMP This encoding type is similar to BLOCK_DICT except that dictionary indexes are entropy coded. This encoding type requires significantly more CPU time to encode and decode and has a poorer worst-case performance. However, use of this type can lead to space savings if the distribution of values is extremely skewed. ENCODING DELTARANGE_COMP It is ideal for many-valued FLOAT columns that are either sorted or confined to a range. Do not use it with unsorted columns that contain NULL values, as the storage cost for representing a NULL value is high. It has a high cost for both compression and decompression. ENCODING COMMONDELTA_COMP It is ideal for sorted FLOAT and INTEGER-based (DATE/TIME/TIMESTAMP/INTERVAL) data columns with predictable sequences and only occasional sequence breaks, such as timestamps recorded at periodic intervals or primary keys. ENCODING NONE Do not specify this value. It increases space usage, increases processing time, and leads to problems. Arely Sandoval | A3- Differentiate between compression and encoding| # Page
  • 18. SELECT PROJECTION_NAME, PROJECTION_COLUMN_NAME, ENCODING_TYPE, DATA_TYPE FROM PROJECTION_COLUMNS WHERE PROJECTION_COLUMN_NAME='Column_Name'; Differentiate between compression and encoding ● Encoded data can be processed directly by Vertica. ● Compressed data cannot be directly processed by Vertica. Data must first be decompressed. ● Encoding depends on the data type of the data being encoded, whereas compression treats a block as opaque bytes regardless of its contents. Arely Sandoval | A3- Differentiate between compression and encoding| # Page
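The BLOCK_DICT description above (a dictionary of distinct values plus a list of indexes per block) can be sketched in a few lines. This is a minimal illustration in Python under the assumption that a block is simply a list of values; the function names are hypothetical and the sketch says nothing about Vertica's on-disk layout.

```python
def block_dict_encode(block):
    """Return (dictionary, indexes) for one block of column values."""
    dictionary = []  # distinct values, in order of first appearance
    position = {}    # value -> its index in the dictionary
    indexes = []     # one dictionary index per row in the block
    for value in block:
        if value not in position:
            position[value] = len(dictionary)
            dictionary.append(value)
        indexes.append(position[value])
    return dictionary, indexes


def block_dict_decode(dictionary, indexes):
    """Rebuild the original block from the dictionary and index list."""
    return [dictionary[i] for i in indexes]


# A few-valued, unsorted column: the case the slide calls ideal.
block = ["red", "blue", "red", "green", "blue", "red"]
dictionary, indexes = block_dict_encode(block)

print(dictionary)  # ['red', 'blue', 'green']
print(indexes)     # [0, 1, 0, 2, 1, 0]
```

With only three distinct values, each row is reduced to a small integer, which is where the space saving for few-valued columns comes from.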
  • 19. ● D6 - Identify the advantages of a group by pipe versus a group by hash ● F3 - Define the Resource Manager's role in query processing ● H3 - Using explain plans and query profiles Arely Sandoval | A3- Differentiate between compression and encoding| # Page
  • 20. Juan Carlos Vázquez Tapia | Topic | # Page Juan Carlos Vazquez Tapia Topics ● Friday, March 20 ○ Section: Projection Design ■ B5 - Understanding buddy projections. ● Tuesday, March 24 ○ Section: Removing Data Permanently from Vertica and Advanced Projection Design. ■ D2 - Identify the advantages and disadvantages of using delete vectors to identify records marked for deletion. ● Wednesday, March 25 ○ Section: Cluster Management in Vertica. ■ E4 - Define local segmentation capability in Vertica. ● Thursday, March 26 ○ Section: Monitoring and Troubleshooting Vertica. ■ G4 - Defining, using and logging into Management Console.
  • 21. Juan Carlos Vázquez Tapia | Understanding Buddy Projections | # Page Projection Design B5 - Understanding Buddy Projections Definition: HP Vertica creates buddy projections, which are replicas of projections of the data in the database that exist in the cluster and these replicas are distributed across database nodes. HP Vertica ensures that projections that contain the same data are distributed to different nodes. This ensures that if a node goes down, all the data is available on the remaining nodes. The number of buddy projections is determined by the value of K as in K-safety
  • 22. Juan Carlos Vázquez Tapia | Understanding Buddy Projections | # Page B5 - Understanding Buddy Projections Requirements: Two projections must meet the following requirements to be considered “buddies”: ● They must contain the same columns ● They must have the same hash segmentation ● They must use different node ordering Buddy projections can have different sort orders for query performance purposes.
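The "different node ordering" requirement can be pictured as the same segments shifted by one node. The sketch below is a Python illustration under stated assumptions: a three-node cluster, one segment per node, and a buddy offset of 1. The names `segment_placement` and `available_segments` are hypothetical helpers, not Vertica functions.

```python
def segment_placement(nodes, offset):
    """Map segment i to the node `offset` positions further along the node list."""
    count = len(nodes)
    return {segment: nodes[(segment + offset) % count] for segment in range(count)}


nodes = ["node1", "node2", "node3"]      # assumed cluster
primary = segment_placement(nodes, 0)    # segment 0 -> node1, 1 -> node2, 2 -> node3
buddy = segment_placement(nodes, 1)      # segment 0 -> node2, 1 -> node3, 2 -> node1


def available_segments(failed_node):
    """Segments still readable from the primary or the buddy after one node fails."""
    return {
        segment
        for segment in primary
        if primary[segment] != failed_node or buddy[segment] != failed_node
    }


for node in nodes:
    print(node, "down ->", sorted(available_segments(node)))
```

Because the two copies of a segment never share a node, every segment stays available after any single node failure, which is the K=1 case.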
  • 23. Juan Neve B4.- Describe the process of projection segmentation. D1.- Describe the process used to mark records for deletion. E3.- Identify the steps of online recovery of a failed node. G3.- Describe how to disallow user connections, while preserving dbadmin connectivity.
  • 24. B4.- Describe the purpose of projection segmentation ● Provides high availability ● Recovery of data ● Optimizes query execution Juan Antonio Neve Gómez | Page 1
  • 26. Random distribution of data is very important for segmentation to be effective. It keeps the load on each node to a minimum, so queries run more efficiently. Replicated projections provide high availability because all of the data is available on each node. They also help with recovery because there are more copies on the other nodes. Juan Antonio Neve Gómez | Page 3
  • 27. Carlos Leal 1. Determining segmentation and partitioning (B6) 2. Identify the process for processing a large delete or update (D3) 3. Distinguish between the items in Vertica Cluster (E5) 4. Administering a cluster using management console (F5) Carlos Ivan Leal
  • 28. Determining Segmentation and Partitioning Partitioning and segmentation have completely separate functions in Vertica. It is important to clarify the differences because the concepts are similar, and the terms are often used interchangeably for other databases. Carlos Leal | Segmentation and Partitioning | B6
  • 29. Segmentation and Partitioning Segmentation defines how data is spread among cluster nodes, while partitioning specifies how data is organized within the individual nodes. Segmentation is defined by the projection, and partitioning is defined by the table. Logically, the partition clause is applied after the segmented by clause. Carlos Leal | Segmentation and Partitioning | B6
  • 30. Segmentation and Partitioning Segmentation and partitioning have opposite goals regarding data localization. Partitioning attempts to introduce hot spots within the node, allowing for a convenient way to drop data and reclaim the disk space. Segmentation (by hash) distributes the data evenly across all nodes in a Vertica cluster. Carlos Leal | Segmentation and Partitioning | B6
  • 31. Segmentation and Partitioning Partitioning by year, for example, makes sense if you intend to retain and drop data at the granularity of a year. On the other hand, segmenting the data by year would be an extremely bad choice, as the node holding data for the current year would likely answer far more queries than the other nodes. Carlos Leal | Segmentation and Partitioning | B6
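The contrast between segmenting by a high-cardinality key and segmenting by year can be made concrete with a short simulation. This is a sketch in Python under explicit assumptions: a three-node cluster, SHA-256 as a stand-in for the database's own hash function, and a made-up data set in which 90% of the rows belong to the current year.

```python
import hashlib
from collections import Counter

NODE_COUNT = 3  # assumed cluster size for the illustration


def node_for(value, node_count=NODE_COUNT):
    """Pick a node from a hash of the segmentation value (SHA-256 as a stand-in)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


# Hypothetical fact rows: (order_id, year); 90% of rows belong to 2015.
rows = [(order_id, 2015 if order_id % 10 else 2014) for order_id in range(3000)]

by_order_id = Counter(node_for(order_id) for order_id, _ in rows)
by_year = Counter(node_for(year) for _, year in rows)

print("segmented by order_id:", sorted(by_order_id.items()))
print("segmented by year:    ", sorted(by_year.items()))
```

Segmenting on `order_id` spreads the rows roughly evenly, while segmenting on `year` places every current-year row on a single node, which is the hot spot the slide warns about.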
  • 32. Carlos Leal | Identify the process for processing a large delete or update Identify the process for processing a large delete or update D3 ● Performance Considerations for Deletes and Updates A large number of (un-purged) deleted rows could negatively affect query and recovery performance. To eliminate the rows that have been deleted from the result, a query must do extra processing. It has been observed that if 10% or more of the total rows in a table have been deleted, the performance of a query on the table slows down. However, your experience may vary depending upon the size of the table, the table definition, and the query. The same problem can also happen during recovery. To avoid this, the deleted rows need to be purged in Vertica. For more information, see Purge Procedure.
  • 33. Carlos Leal | Concurrency Concurrency Deletes and updates take exclusive locks on the table. Hence, only one delete or update transaction on that table can be in progress at a time and only when no loads (or INSERTs) are in progress. Deletes and updates on different tables can be run concurrently.
  • 34. Carlos Leal | Optimizing Optimizing Deletes and Updates for Performance The process of optimizing a design for deletes and updates is the same. Some simple steps to optimize a projection design or a delete or update statement can increase the query performance by tens to hundreds of times. The following section details several proposed optimizations to significantly increase delete and update performance.
  • 35. Topics (Manuel Loza) ● B2 - Define RLE ● C6 - Understanding both WOS and ROS ● E1 - Identify the steps used to add nodes to an existing cluster ● G1 - Define the use of Management Console in monitoring Vertica
  • 36. Define RLE Run-Length Encoding o Is an encoding method. o Increases performance because there is less disk I/O during query execution. o Stores more data in less space. How does it work? ● Replaces sequences of the same data values within a column by a single value and a count number. Typically used when data is: 1. Sorted 2. Low cardinality 3. Any data type
  • 38. Understanding both WOS and ROS Write Optimized Store (WOS) ● Memory-Resident ● Used to store INSERT, UPDATE, DELETE and COPY actions ● Arranged by projection ● Records are stored in the order they are inserted o Stores data without compression or indexing o Supports very fast load speed ● A projection is sorted only when queried o Remains sorted until new data is inserted into it ● Holds both committed and uncommitted transactions
  • 39. Read Optimized Store (ROS) ● Disk storage structure o Highly optimized o Read oriented ● Like WOS, ROS is arranged by projection o Projections in ROS are stored in ROS containers ● Makes optimal use of sorting (indexing) and compression ● COPY...DIRECT and INSERT (with /*direct*/ hint) o Load data directly into ROS
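Run-length encoding as described above (each run of equal values replaced by one value and a count) fits in a few lines. The sketch below is a generic Python illustration, not Vertica's implementation; the function names and sample column are made up. It also shows why the slide lists "sorted" and "low cardinality" as the typical conditions.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal adjacent values into a (value, count) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


unsorted_column = ["NY", "CA", "NY", "TX", "CA", "NY"]
sorted_column = sorted(unsorted_column)

print(rle_encode(unsorted_column))  # six runs of length 1: no saving
print(rle_encode(sorted_column))    # [('CA', 2), ('NY', 3), ('TX', 1)]
```

On the unsorted column every run has length 1, so nothing is saved; once the column is sorted, six values shrink to three pairs.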
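The relationship between the two stores can be sketched as a toy model: rows land in the WOS in arrival order, a moveout step sorts them into a new ROS container, a direct load skips the WOS, and a mergeout step combines containers. Everything in this Python sketch (the `Projection` class, its method names, in-memory lists standing in for disk) is an assumption made for illustration, not Vertica's Tuple Mover.

```python
import heapq


class Projection:
    """Toy model of one projection's WOS and ROS (hypothetical)."""

    def __init__(self, sort_key):
        self.sort_key = sort_key
        self.wos = []             # unsorted, in arrival order
        self.ros_containers = []  # each container is a sorted list of rows

    def insert(self, row):
        # Default load path: append to the WOS without sorting.
        self.wos.append(row)

    def insert_direct(self, rows):
        # Direct load path: sort once and write a ROS container immediately.
        self.ros_containers.append(sorted(rows, key=self.sort_key))

    def moveout(self):
        # Sort the WOS contents into a new ROS container and empty the WOS.
        if self.wos:
            self.ros_containers.append(sorted(self.wos, key=self.sort_key))
            self.wos = []

    def mergeout(self):
        # Merge all sorted containers into a single sorted container.
        merged = list(heapq.merge(*self.ros_containers, key=self.sort_key))
        self.ros_containers = [merged] if merged else []


projection = Projection(sort_key=lambda row: row[0])
projection.insert((3, "c"))
projection.insert((1, "a"))
wos_before_moveout = list(projection.wos)

projection.moveout()
projection.insert_direct([(2, "b")])
containers_before_mergeout = len(projection.ros_containers)

projection.mergeout()
print(projection.ros_containers)
```

After the final step the projection holds one container with the rows in sort order, regardless of the order or path by which they were loaded.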
  • 40. Luis Cárdenas C2 Define the actions of the move out and merge out tasks D5 Identify the advantages of merge join versus hash join. F2 Features of the Vertica file used for backup and restore H2 Using event-based windows, time series, event series joins and pattern matching.
  • 41. Ruben Gonzalez A. Vertica Architecture (Friday 20) 4. Installation of Vertica. C. Loading Data into Vertica. (Monday 23) 4. Copying data directly to ROS D. Removing Data Permanently from Vertica and Advanced Projection Design. (Tuesday 24) 7. Describe the characteristics of a prejoin projection. F. Backup/Restore and Resource Management in Vertica. (Thursday 26) 4. Describe the differences between maxconcurrency and planned concurrency.
  • 42. Laura López B3 - Describe Order By importance in projection design C7 - Distinguishing between moveout and mergeout actions E2 - Describe the benefits of having identically sorted buddy projections G2 - Determine methods to troubleshoot spread
  • 43. B3 - Describe Order By importance in projection design ● Specifies the columns to sort the projection on. ● You cannot specify an ascending or descending clause. ● HP Vertica always uses an ascending sort order in physical storage. ● If you do not specify the ORDER BY table-column parameter, HP Vertica uses the order in which columns are specified as the sort order for the projection. ● One of the ways the projections can be optimized.
  • 44. B3 - Describe Order By importance in projection design
  • 45. Identifying characteristics of data file directory Disk Space Requirements for HP Vertica In addition to actual data stored in the database, HP Vertica requires disk space for several data reorganization operations, such as mergeout and managing nodes in the cluster. For best results, HP recommends that disk utilization per node be no more than sixty percent (60%) for a K-Safe=1 database to allow such operations to proceed.
  • 46. Identifying characteristics of data file directory In addition, disk space is temporarily required by certain query execution operators, such as hash joins and sorts, in the case when they cannot be completed in memory (RAM). Such operators might be encountered during queries, recovery, refreshing projections, and so on. The amount of disk space needed (known as temp space) depends on the nature of the queries, amount of data on the node and number of concurrent users on the system. By default, any unused disk space on the data disk can be used as temp space. However, HP recommends provisioning temp space separate from data disk space. See Configuring Disk Usage to Optimize Performance Prepare the Logical Schema Script Designing a logical schema for an HP Vertica database is no different from designing one for any other SQL database. Details are described more fully in Designing a Logical Schema. To create your logical schema, prepare a SQL script (plain text file, typically with an extension of .sql) that:
  • 47. Identifying characteristics of data file directory Prepare Data Files Prepare two sets of data files: ● Test data files. Use test files to test the database after the partial data load. If possible, use part of the actual data files to prepare the test data files. ● Actual data files. Once the database has been tested and optimized, use your data files for your initial Bulk Loading Data. How to Name Data Files Name each data file to match the corresponding table in the logical schema. Case does not matter. Use the extension .tbl or whatever you prefer. For example, if a table is named Stock_Dimension, name the corresponding data file stock_dimension.tbl. When using multiple data files, append _nnn (where nnn is a positive integer in the range 001 to 999) to the file name. For example, stock_dimension.tbl_001, stock_dimension.tbl_002, and so on.
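The naming convention on this slide can be written down as a small helper. The function below is a hypothetical Python utility, not part of any Vertica tooling. It lowercases the table name (the slide notes that case does not matter), uses the `.tbl` extension by default, and appends a three-digit suffix only when there are several files.

```python
def data_file_names(table_name, parts=1, extension="tbl"):
    """Return the data file name(s) for a table, following the slide's convention."""
    if not 1 <= parts <= 999:
        raise ValueError("parts must be between 1 and 999")
    base = f"{table_name.lower()}.{extension}"
    if parts == 1:
        return [base]
    # Multiple files: append _001, _002, ... to the base name.
    return [f"{base}_{number:03d}" for number in range(1, parts + 1)]


print(data_file_names("Stock_Dimension"))
print(data_file_names("Stock_Dimension", parts=2))
```

A single file is named `stock_dimension.tbl`, and two files are named `stock_dimension.tbl_001` and `stock_dimension.tbl_002`, matching the examples given on the slide.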
  • 48. Identifying characteristics of data file directory
  • 49. Documentation Core: ● HP Vertica Architecture White Paper (Key Features) ● HP Vertica 7.1 complete ● HP_Vertica_7.1.x_administrators Guide ● HP Vertica’s Certification Topic List ● Braindumps ● Built-in Pools ● HP2 - N36 Exam Prep Guide ● Vertica Client 7.1.1.032 32bits ● VNC Portable ● DBeaver ● PuTTY Direct Download ● Host: verticaserver.cloudapp.net Port: 22 User: dbadmin Pass: admin ● To access Vertica > VMart, run the command: “/opt/vertica/bin/admintools” ● Tableau (client for data extraction). ● JDBC Driver
  • 50. Documentation pt2 The Following files are located inside the install disc: HP_Vertica_7.1.x_ Administrators Guide HP_Vertica_7.1.x_ Analyzing Data HP_Vertica_7.1.x_ Best Practices for OEM Customers HP_Vertica_7.1.x_ Concepts Guide HP_Vertica_7.1.x_ Connecting To HP Vertica HP_Vertica_7.1.x_ Cpp_SDK_API HP_Vertica_7.1.x_ Distributed_R HP_Vertica_7.1.x_ Error Messages HP_Vertica_7.1.x_ Extending HP Vertica HP_Vertica_7.1.x_ Flex_tables HP_Vertica_7.1.x_ Flex Canonical CEF Parser HP_Vertica_7.1.x_ Flextables Quickstart HP_Vertica_7.1.x_ Getting Started HP_Vertica_7.1.x_ HP Vertica For SQL On Hadoop HP_Vertica_7.1.x_ Informatica_plug-ing_Guide HP_Vertica_7.1.x_ Install_Guide HP_Vertica_7.1.x_ Integrating Apache Hadoop
  • 51. Documentation pt3 The Following files are located inside the install disc: HP_Vertica_7.1.x_ Java_SDK_API HP_Vertica_7.1.x_ MS_Connectivity_Pack HP_Vertica_7.1.x_ New_Features HP_Vertica_7.1.x_ Place HP_Vertica_7.1.x_ Pulse HP_Vertica_7.1.x_ SQL_Reference_Manual HP_Vertica_7.1.x_ Supported_Platforms HP_Vertica_7.1.x_ Third_Party
  • 52. Speaker Name | Topic | # Page Layout
  • 53. FAQs 1. What HDD Format Configuration is recommended for Data and Log Files? 2. What are the TOP Best Practices for Configuration?