10. Our Use Case
Based on an online project by Chris Wilson of Time magazine that uses flat
files for big data, built from publicly available datasets.
Uses flat files and Ajax to produce workable datasets from the highly
anticipated Open Payments data from cms.gov.
https://source.opennews.org/articles/case-flat-files-big-data-projects/
12. Value of the Data
[Chart: Average Payment and Percentile for Physicians from Drug Companies Per
Medication. Payments axis from $0 to $600; percentile axis from 10 to 90.]
13. Delivery Method- A Real Decision
The data is large (VLDB), whether or not a big data solution is used.
Often about 90% of the data is consistent between environments, with the rest
being data appends.
How often is a solution chosen based on the skill set of those in place, and
how will it support future growth?
Do you want to pay for licensing of database and client servers, as stated in
our use case example?
No need to patch, upgrade, etc. The goal was to just lock down file permissions
and maintain, and this resonated with many customer scenarios.
14. Should vs. Did
Although an RDBMS with JSON would have been the preferred method to
deliver the data, the author’s team made a different choice…
15. This…
“Presented a technical challenge, because our small [team] is
more comfortable with client-side web development than we
are with administering servers and databases. So we decided
to make the whole thing searchable using only flat files and
Ajax requests.”
I’ve heard similar stories before.
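A minimal sketch of the flat-file idea described in the quote, not the author's actual code: split the records into many small files keyed by a prefix, so a search only has to fetch one small file (the job the Ajax request does in the browser). The function names, the `name` field and the two-letter sharding rule are all assumptions made for illustration.

```python
import json
from collections import defaultdict
from pathlib import Path


def shard_key(name: str) -> str:
    """Hypothetical sharding rule: first two letters of the name, lowercased."""
    letters = [c for c in name.lower() if c.isalpha()]
    return "".join(letters[:2]) or "_"


def build_shards(rows, out_dir: Path) -> None:
    """Group rows by shard key and write one small JSON flat file per key."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_key(row["name"])].append(row)
    out_dir.mkdir(parents=True, exist_ok=True)
    for key, members in shards.items():
        (out_dir / f"{key}.json").write_text(json.dumps(members))


def lookup(name: str, out_dir: Path):
    """Read only the single shard that could hold the name, then filter it."""
    path = out_dir / f"{shard_key(name)}.json"
    if not path.exists():
        return []
    wanted = name.lower()
    return [r for r in json.loads(path.read_text()) if r["name"].lower() == wanted]
```

With this layout there is no database or server-side code to administer: the shard files can sit behind any static web server, and the client requests the one file whose name matches the search prefix.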
16. My Virtualization Demo Environment
Each zip file was under 1 GB (NOT big data), 16 GB uncompressed.
Being unstructured, it was cumbersome to work with.
Gave an excellent example of network bottlenecks when transferring to the source.
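The compressed and uncompressed sizes of an archive can be checked before anything is transferred. The sketch below uses Python's standard `zipfile` module; `zip_sizes` is a hypothetical helper name, not part of the demo environment.

```python
import zipfile


def zip_sizes(path):
    """Return (compressed_bytes, uncompressed_bytes) totals for one archive.

    `path` may be a file name or an open binary file object.
    """
    with zipfile.ZipFile(path) as zf:
        infos = zf.infolist()
        compressed = sum(i.compress_size for i in infos)
        uncompressed = sum(i.file_size for i in infos)
    return compressed, uncompressed
```

Only the archive's directory is read, so the check does not extract any data.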
21. Virtualization Options for Big Data
Partitioning- Because many big data environments are partitioned resources on a
single physical system, virtualizing is often easy with modern virtualization
products.
Isolation- Many big data environments may already be on VMs; creating a
virtualized dataset could eliminate the extensive storage requirements of
duplicate data.
Package- Collect all tiers and dependencies for a big data solution and
containerize them, making development, testing and delivery simple and automated.
23. What is a Delphix vFile?
Feature for “Unstructured Files”
A directory tree of files for Delphix to manage.
Can be:
- Linked from an existing dataset on a source server into a dSource
- Projected using NFS to a target server
A small part of a bigger “Swiss Army knife”, as Delphix is also able to
virtualize relational databases, applications, etc.