Producing and Analyzing Rich Data with PostgreSQL

sales@chartio.com
(855) 232-0320
sales@chartio.com
(855) 232-0320
How to get the best of both worlds with PostgreSQL
Relational and Non-Relational
Databases

sales@chartio.com
(855) 232-0320
The Prevailing View - Logical
Dimension Relational Non-Relational
Schema objects ● Structured rows and columns
● Schema on write
● Referential integrity
● Painful migrations
● Unstructured files, docs, etc
● Schema on read
● No referential integrity
● No migrations
Query languages ● SQL
● Declarative
● Easy enough for non-tech users
● Various
● Procedural
● Requires some programming skills
Exploratory analysis ● Native support for joins
● Interactive/low execution overhead
● No native support for joins
● OLAP - Batch processing
Data science and ML ● Only descriptive statistics
● Requires exporting dumps/samples
● Robust ecosystem
● Does not require exports

sales@chartio.com
(855) 232-0320
The Prevailing View - Physical
Dimension Relational Non-Relational
Parallel query
processing
● Single node system
● Single process per query
● Multiple node system
● Multiple processes per query
Concurrency ● High concurrency
● Single process per connection
● OLAP - low concurrency/high
scheduling overhead
High Availability &
Replication
● Async and sync replication
● HA may not be native
● Async and sync replication
● HA likely to be native
Sharding ● Sharding may not be native
● Difficult to manage
● Sharding likely to be native
● Easy to manage

sales@chartio.com
(855) 232-0320
The Prevailing View - Summary
- RDBMS have nice properties for producing rich data
- Easier for non-tech users and exploratory analysis
- Probably don’t meet the needs of today’s analysts
- Data science & Machine Learning
- Parallel processing
- Definitely don’t meet the needs of today’s apps
- Schema migrations
- Replication and sharding

sales@chartio.com
(855) 232-0320
The Practical Reality
- The product offerings are starting to overlap
- As we’ll see, old dogs are learning new tricks and vice versa
- However, the market sizes and talent pools still look very different

sales@chartio.com
(855) 232-0320
SQL 2011: Not Your Parents’ SQL
- Many people still think of SQL in terms of SQL-92
- Since then we’ve had: SQL:1999, SQL:2003, SQL:2006, SQL:2008,
SQL:2011
- http://use-the-index-luke.com/blog/2015-02/modern-sql
- Common Table Expressions (CTEs) / Recursive CTEs
- Window Functions
- Ordered-set Aggregates
- Lateral joins
- Temporal support
- The list goes on...

sales@chartio.com
(855) 232-0320
The PostgreSQL Extension System
- Various extension points
- Procedural languages
- Foreign data wrappers
- Data types/operators
- UDFs/UDAs (SQL, C/C++)
- Indexes (GiST, GIN)
- Custom Background Workers
- PostgreSQL Extension Network
- http://pgxn.org/
- PyPI, RubyGems, CPAN, CRAN

sales@chartio.com
(855) 232-0320
Query Languages
- Not only SQL
- Native
pgSQL Tcl Perl Python
- Community
Java PHP R Javascript Ruby Scheme
sh

sales@chartio.com
(855) 232-0320
Data Types & Schema Objects
- JSONB
- https://wiki.postgresql.org/images/b/b4/Pg-as-nosql-pgday-
fosdem-2013.pdf
- postgresql-hll
- PostGIS
- Foreign data wrappers
- Oracle, SQL Server, MySQL, JDBC, MongoDB, CouchDB,
Redis, S3, twitter, OpenStreetMap, LDAP, RSS, more…
- Multicorn
- https://github.com/shish/pgosquery

sales@chartio.com
(855) 232-0320
Data Science and Machine Learning
http://madlib.net/

sales@chartio.com
(855) 232-0320
Sharding
- https://github.com/citusdata/pg_shard

sales@chartio.com
(855) 232-0320
Parallel Processing
- MPP
- Proprietary
- Open source
- Columnar FDW:
- https://github.com/citusdata/cstore_fdw
- Parallel query
- http://rhaas.blogspot.com/2015/03/parallel-sequential-scan-for-postgresql.html

sales@chartio.com
(855) 232-0320
How Far Can We Go?
- Web app framework
- http://blog.aquameta.com/
- REST API
- https://github.com/begriffs/postgrest
- Unit testing framework
- http://pgtap.org/
- Firewall
- https://github.com/uptimejp/sql_firewall
- Full-text search
- https://github.com/zombodb/zombodb

sales@chartio.com
(855) 232-0320
Conclusion
- With today’s RDBMS you get
- more than rows and columns
- more than SELECT, FROM, WHERE, GROUP BY, ORDER BY
- more than a single machine
- Make sure you get the full return on your investment!
Get your Chartio free trial!
sales@chartio.com
(855) 232-0320

Producing and Analyzing Rich Data with PostgreSQL

Recomendados

Recomendados

Mais conteúdo relacionado

Mais procurados

Mais procurados (20)

Semelhante a Producing and Analyzing Rich Data with PostgreSQL

Semelhante a Producing and Analyzing Rich Data with PostgreSQL (20)

Mais de Chartio

Mais de Chartio (6)

Último

Último (20)

Producing and Analyzing Rich Data with PostgreSQL