Building a data lake involves more than installing Hadoop or putting data into AWS. The goal in most organizations is to build multi-use data infrastructure that is not subject to past constraints. This tutorial covers design assumptions, design principles, and how to approach the architecture and planning for multi-use data infrastructure in IT.
Long:
The goal in most organizations is to build multi-use data infrastructure that is not subject to past constraints. This tutorial will discuss hidden design assumptions, review design principles to apply when building multi-use data infrastructure, and provide a reference architecture to use as you work to unify your analytics infrastructure.
The focus in our market has been on acquiring technology, and that ignores the more important part: the larger IT landscape within which this technology lives and the data architecture that lies at its core. If one expects longevity from a platform, then it should be a designed rather than an accidental architecture.
Architecture is more than just software. It starts from use and includes the data, the technology, the methods of building and maintaining them, and the organization of people. What are the design principles that lead to good design and a functional data architecture? What are the assumptions that limit older approaches? How can one integrate with, migrate from, or modernize an existing data environment? How will this affect an organization's data management practices? This tutorial will help you answer these questions.
Topics covered:
* A brief history of data infrastructure and past design assumptions
* Categories of data and data use in organizations
* Data architecture
* Functional architecture
* Technology planning assumptions and guidance