Stop the madness - Never doubt the quality of BI again using Data Governance
Data Governance with Tableau Improves Business Insights
1. Presented by:
Ashley Ohmann
Data Visualization Practice Leader, Kforce Advisory and Solutions Practice
Enterprise Data Governance with Tableau
2. • This presentation discusses
– What Data Governance is
– Why it’s important to your organization
– The Goals of Data Governance
– How it interacts with your Tableau plans
– How good Data Governance practices will impact the user experience
and make your business and analytics projects more successful
Objectives
4. “Data Governance is a set of
processes that ensures that
important data assets are
formally managed throughout the
enterprise.”
- Wikipedia
5. “Data Governance is an
emerging discipline with an
evolving definition.”
And then, the definition talks
about trust, accountability,
evolution, and people.
- Wikipedia also says,
6. “…the specification of decision rights and
an accountability framework to encourage
desirable behavior in the valuation,
creation, storage, use, archival and
deletion of data and information.
It includes the processes, roles,
standards and metrics that ensure the
effective and efficient use of data and
information in enabling an organization to
achieve its goals.”
- Oracle’s definition of Data Governance
7. Your data is your asset.
If you are providing a service—whether it’s financial
services or healthcare, or retail—your data defines and
quantifies the relationships between your clients and
your products.
Data Governance strategies do not have to answer
every question within your organization—you can get as
much impact from many smaller, project-based
strategies as you can from an enterprise-wide strategy.
What does this mean?
8. Data Governance is the determination of which data
assets are important to your business.
It also is the selection of tools that will help you make
the best use of your data—because it’s your asset, and
this is where Tableau comes into the discussion.
Data stewardship is also important: it is the process in
which we define individual data elements, map their
usages, and then define how they will be used and
transformed, so that something like Profit means the
same thing in one system that it does in another.
What does DG look like to me?
9. • Not only are all of the transactions and interactions between
your products and customers quantified, but the tools and
formats used to capture them are different.
• Human beings generate most of the data that we’re now
storing, and most of it is unstructured.
• The way that you store, analyze, and share your data
determines your abilities to understand your business
• Traditionally, information management has been reactive, rather
than proactive. The purpose of Data Governance is to reverse
that paradigm.
Everything is quantified.
Why is DG Important?
10. • Data Governance goals include
• Standards: which data elements are important and why
• Processes: how are data elements loaded and transformed?
• Compliance: Sarbanes-Oxley, Basel I and Basel II, HIPAA, to name
a few
• Security: keeping the data safe without unnecessary restrictions on
use
• Rights: who needs to see what to do their job
• Metrics: measure the performance of your enterprise
• Data Governance is not tactics, like ETL, data cleansing,
architecture, or master data management
• Determining what’s important enables the development of
stakeholders, metrics, timelines, and communication strategies
Goals of Data Governance
More accuracy and better decisions with less waste
11. • My job was to report on employee performance for our four different
business lines.
– We downloaded data from SAP. It was painful. And only quarterly.
– These analytics are important because they determine headcount.
• Our IT group attempted to build an analytics application that we
scoped very carefully, but in the end, their project failed:
– Bad database design
– Employed existing, flawed techniques to build a tool for the future
– No accountability
My Tableau Story Starts with DG
Our data was a mess.
All great projects start somewhere!
12. • Failure actually is a great opportunity to reassess.
• We determined what did not work. And that was painful.
And expensive!
• The root cause had nothing to do with the technology.
– It was a leadership issue.
– Our business executives and our IT executives had not agreed on
how to use, protect, and analyze our data.
• We actually didn’t need IT.
• We needed ownership and accountability.
• And good change management.
We took a long, hard look.
What did we do first?
13. • “Managerial courage”
• We identified stakeholders around our business.
– Then, we spent days on the phone and in person deciding what
aspects of our business performance were important to measure
• Once we knew what was important to measure, we decided
how to measure, and when.
• We also determined how and when to store our data.
• We assigned responsibility and communication plans. These
were critical.
• We determined which tools we needed.
Data Governance and analytics are about people.
The tools are secondary to the people who will use them to make decisions.
How Did We Fix the Problems?
14. • One of the challenges of Data Governance is to select
tools that will help enable the growth of your organization.
• The tools that you use—and the analytics that you build—
should be extensions of your strategy.
• We decided to build a new EDW, because we had serious
issues with
– Data quality
– Data storage
– Accuracy
• We selected SQL Server; we already had the licenses, and we
had the knowledge that we needed to build the processes to
load and transform data from its nearest sources.
Tools should be enablers, not barriers.
The tools followed the people.
15. • Four business lines and 45 metrics is a lot. We needed
– One set of metrics for everyone
– A consistent presentation layer for executives
– The same level of visibility into detail for our regions
• We wanted to make sure that the 2,000 engineers in our
organization—the people who were interacting with—knew on
what they were being measured, how, why, and where to get
the data to help their peers improve.
• With our EDW, we built an analytics tool with a high level of
trust and enabled the people running our business to make
decisions and plan for the future.
• Our data governance committee met twice a month,
and we discussed issues, resolutions, and trends.
Tableau was the icing on the cake.
What we did with Tableau
16. • Trust is one of the most important products of a good
Data Governance strategy.
• If people do not trust data, they will not use it.
• It doesn’t make sense to spend millions of dollars storing
and aggregating data that people do not think is correct.
• We used Tableau as an extension of our Data
Governance strategy by including several metadata
elements in our tooltips, which we sourced from a
dimension of metrics:
– Metric definition, both in English and math
– Data element sources
– Metric thresholds
– Data latency
Looks aren’t everything.
Goal: Building Trust
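A minimal sketch of the "dimension of metrics" idea, assuming one record per metric that carries the four metadata elements listed above. The class name, field names, and the example metric are hypothetical and are not taken from the original project; in Tableau the resulting fields would simply be placed on the tooltip.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    """One row of a hypothetical 'dimension of metrics' table."""
    name: str
    definition_english: str
    definition_math: str
    sources: tuple[str, ...]
    threshold: str
    latency: str


def tooltip_text(metric: MetricDefinition, value: float) -> str:
    """Build the multi-line text a tooltip could show next to a value."""
    return "\n".join([
        f"{metric.name}: {value:.1f}",
        f"Definition: {metric.definition_english}",
        f"Formula: {metric.definition_math}",
        f"Sources: {', '.join(metric.sources)}",
        f"Threshold: {metric.threshold}",
        f"Data latency: {metric.latency}",
    ])


# Illustrative values only; the metric and its rules are assumptions.
example_metric = MetricDefinition(
    name="On-Time Completion Rate",
    definition_english="Share of jobs completed by the promised date",
    definition_math="jobs_on_time / jobs_completed * 100",
    sources=("SAP work orders",),
    threshold="Target is 90 or higher",
    latency="Loaded daily",
)

print(tooltip_text(example_metric, 92.345))
```

Keeping the definitions in one table means every dashboard that shows the metric shows the same wording, which is the point of the trust goal above.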
17. • If you’re building new analytics, you can use Tableau to
control the level of detail of granular data that users can
see.
– This can help them understand your data and even help test it—
and realistically, most applications have some errors when they go
into production
– With Tableau and an agile EDW, it’s often easy (and
inexpensive) to correct errors. Everyone here has made mistakes;
the biggest problem with making a mistake is denying that you
made it.
• Transparency is a big part of the change management of
a new tool, too. It builds trust and speeds adoption.
Transparency is important.
Goal: Communicate Validity
18. • There are many people in your organization reporting on
the same data. And their analyses all look different.
• Shared data sources are wonderful: if you’re running
Tableau Server, find out the highest priority and most
used data sources, and create a shared data source that
refreshes frequently.
• We used Tableau Server (and our EDW) as an access
point for data for other areas of our business.
• Certify your data sources for accuracy.
Democratization of data
Goal: Enable and Control Access
19. • We solved one big hurdle of analytics—naming conventions—
with our EDW.
• What you call your data elements is very important, because
people need to know what they’re looking at quickly.
• In our EDW, we added the datestamp that each record was loaded
so that we could expose it to our users
• This also fed back into our data quality metrics: we sometimes found
out that we had not received a daily load from an upstream source
when end-users saw the age.
“Design is empathy.” – Brian Chesky
Goal: Manage Metadata
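The load datestamp described above also makes a simple freshness check possible. This is a sketch under stated assumptions: the source names, the dates, and the one-day tolerance are hypothetical, and the original project surfaced the load date to users in Tableau rather than running a script like this.

```python
from datetime import date, timedelta


def stale_sources(last_loaded: dict[str, date], today: date,
                  max_age_days: int = 1) -> list[str]:
    """Return the sources whose newest load is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, loaded in last_loaded.items()
                  if loaded < cutoff)


# Hypothetical newest load date per upstream source.
last_loaded = {
    "sap_timesheets": date(2014, 9, 8),
    "finance_manual": date(2014, 9, 5),
}

# Only the source that missed its daily load is reported.
print(stale_sources(last_loaded, today=date(2014, 9, 9)))
```

Exposing the same date in a dashboard lets end users notice a missed load themselves, as happened in the story above.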
20. • People need relevant information to make decisions—they don’t
need all of the information.
• Determining access privileges and needs is a big part of Data
Governance strategies.
• Overloading people with information is the same as disabling them.
• Tableau Server’s groups are excellent ways to manage who can see
what data, as are row-level permissions
Not all information is useful information.
Goal: Provision Rights
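The idea behind group-based, row-level permissions can be sketched in a few lines. This is not Tableau Server's API; it only illustrates the rule that a user sees a row when one of their groups is entitled to it. The group names, regions, and values are hypothetical.

```python
# Hypothetical entitlement table: which regions each server group may see.
ENTITLEMENTS: dict[str, set[str]] = {
    "east_managers": {"East"},
    "executives": {"East", "West"},
}


def visible_rows(rows: list[dict], user_groups: list[str]) -> list[dict]:
    """Keep only the rows whose region is granted to one of the user's groups."""
    allowed: set[str] = set()
    for group in user_groups:
        # Groups without an entry grant nothing, so the default is to deny.
        allowed |= ENTITLEMENTS.get(group, set())
    return [row for row in rows if row["region"] in allowed]


rows = [
    {"region": "East", "metric": "On-Time Completion Rate", "value": 92.3},
    {"region": "West", "metric": "On-Time Completion Rate", "value": 88.1},
]

print(len(visible_rows(rows, ["east_managers"])))  # 1
print(len(visible_rows(rows, ["executives"])))     # 2
print(len(visible_rows(rows, ["contractors"])))    # 0
```

Denying by default keeps the sketch in line with the goal above: people get the rows they need for their job, and nothing else.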
• Most businesses now are regulated in some way.
• The challenge with analytics is making sure that people who should
not get data do not get it.
• Considerations for need and legal compliance should be top of mind
when designing new dashboards.
• Tableau Server has excellent features for securing data, from row-
level security (which can be applied offline within Reader!) to the
ability to hide granular data from end-users.
• The challenge is giving people enough data to do their jobs, but not
so much data that you get in trouble.
• Determining how to maintain compliance with government
regulations is expensive, but the opportunity cost of failure is even
higher.
Lawsuits generally are expensive.
Goal: Maintain Compliance
22. • Our dashboards served hundreds of users. The 150 key users
spend about three hours less a week analyzing their
performance than they had before. Good Data Governance
strategies caused this.
• You can’t afford not to include your Tableau users in your DG
strategies, or to plan for it. The opportunity cost is failure.
• Implementing DG with Tableau, particularly Tableau Server, is
very easy, and it has a big payoff, both in terms of level of effort
and burden on your infrastructure.
• The most important factors in analytics are relevance, accuracy,
and consistency
• Our biggest accomplishment: an informed, strong, vocal,
influential group of users.
DG and Tableau = Better, Trusted, Powerful Visualizations
What was the Outcome?
23. THE DATA GOVERNANCE LIFECYCLE
Identification of Problem
Assessment of the Cause
Identification of Stakeholders
Congregation of Stakeholders and Delegates
Determination of Metrics/Data Elements and Priorities
Development of Processes
Designation of Toolsets
Development of Reporting Criteria
Development of Training Curricula
Frequent Communications!
24. Thank You
Follow me on Twitter @ashleyswain
Check out my blog!
www.dataviz.ninja
or email aohmann@kforce.com
25. Give Us Feedback
1. Navigate to this session from the Schedule
menu of the Data14 app (or use search)
2. Scroll down to Feedback and fill out the quick,
3-question survey
3. Hit Send Feedback
Editor’s Notes
Wikipedia actually has a very good definition of Data Governance.
Why is this a difficult topic? It’s a set of theories that we as humans then use to design processes and implement within our organizations. Data Governance is not a product. It’s a method of determining what is meaningful to you, defining it, and making people responsible for it.
So, you may be wondering—why is Ashley talking to you about Data Governance today?
My Tableau Story is one of Data Governance, and it’s a good example of how intertwined good Data Governance practices and excellent analytics products are.
When our journey started, we did not know that we would end up using Tableau. We just knew that we needed to take charge, and we needed to do things better than anyone had done them in our organization.
Our four different business lines were reporting to the same EVP, and they had the “same” KPIs, but each of them had just slightly different business rules for those metrics. And they were only quarterly, which is okay for measuring things like retail seasonality, but it isn’t good for managing daily service.
Headcount = who keeps their job.
Why did I mention courage? Because change is hard. And people get mad. And change management is as much a part of a successful Tableau deployment—or any new tool—as the tool itself.
If you can’t tell people quickly and easily why it benefits them to use a new tool, they will not use it.
A mandate is not the same as leadership.
Tools are very important. That’s where the people who work for your company input data. It’s how customer transactions are created and stored. If your data is your asset, then you owe it to your organization to store and analyze it in the best environments available within your budget. But more importantly, you owe it to your organization to plan for its future. How much will it grow? Where will it reside? How quickly will you need to be able to query it? And how will it need to be transformed in other systems and joined with other transactional and master data?
Each of these elements is part of Data Governance, particularly trust and the ability to plan for the future.
Trust had been a big issue at all levels of the organization, and that originated both in the validity of data within our EDW and with the transparency of previous tools. I found an issue once in which activities that took place on the weekend just were not being logged into a previous analytics tool. There was no reason why; after I read through hundreds of lines of code, it turned out that an individual developer had made the decision not to include them. This is the opposite of Data Governance. That person should have known what we were measuring, why it was important to the business, and how we were stewarding the individual data elements that he was using to create metrics. But he didn’t; it was a failure to communicate and a failure of accountability. It took me a day to pore through his code and figure out why the data coming from his application didn’t match manual calculations from SAP.
With our EDW and with our Tableau UI, we resolved that problem, but it took a long time to rebuild the trust of our users.
If you build it, they will use it? Maybe.
So, the finance teams who supplied manual data to us actually became recipients of their own data, as we joined it with other performance data and master data to which they didn’t have the access or ability to blend. We did not see this as redundancy—we created a denormalized fact table for use in Tableau, because the volume of data was high, and our finance team created an SSIS package that loaded the data into their own tables. (Since their VP was one of our Data Governance stakeholders, we thought that this was a great demonstration of our capabilities.)
The result was that there were multiple uses of the same data. Notice that I didn’t say “multiple versions”: there was only one version. Theirs looked different from ours, but the results were the same.
I can’t even tell you how much time we saved as an organization by redistributing our validated, certified data. In an organization run by people who have built their careers on their ability to analyze data, this is a big deal, because not only do they doubt other peoples’ analyses until they understand the methodology and assumptions completely, but they have seen so many tools fail before.
This goes back to what John Maeda said last year, which was a re-quote from Brian Chesky, the founder of AirBnB: “Design is empathy.” Tableau gives you greater abilities to design user experiences that are pleasurable. And part of that is speed. Your users should be able to understand exactly what they’re looking at without having to think about it—and simple, explicable field names are a big part of that, and every instance of a field should have the same name. This builds trust and consistency.
While Tableau doesn’t have a dedicated metadata layer, like QlikView, you can create one fairly easily by renaming fields, and this is something that Desktop authors will appreciate too. You can rename fields with a view that you then connect to from Tableau—which is very efficient if other people are consuming the view for use outside of Tableau—or in a custom SQL data connection. You also can change field names within a data source, and you can share data sources with other people.
Renaming fields manually is a pain, right? Who wants to spend an entire morning removing underscores? And it really does distract from the purpose of using Tableau, which is to deliver actionable, accurate insights.
Our business lines tended to change names, so we used views and slowly changing dimensions to track their name changes, so that the re-work in Tableau would be minimal and we could maintain a consistent, trusted user experience.
I recently worked in an organization with a magnificent IT group and some of the best data modelers around. And you know what? Very few of our field names had any vowels. “Sales” was “sls”. This was probably to save space. But everyone needed to change the names in their analytics, because it just wasn’t user-friendly.
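Renaming through a view, as described above, can be sketched as follows. An in-memory SQLite database stands in for the warehouse here, which is an assumption made only so the example runs anywhere; apart from "sls", the table, view, and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fct_sls (rgn_nm TEXT, sls REAL, ld_dt TEXT);
    INSERT INTO fct_sls VALUES ('East', 1200.0, '2014-09-08');

    -- The view is the single place where cryptic names are translated.
    CREATE VIEW sales_v AS
    SELECT rgn_nm AS "Region",
           sls    AS "Sales",
           ld_dt  AS "Load Date"
    FROM fct_sls;
""")

# Anything that connects to the view sees the friendly names.
cursor = conn.execute("SELECT * FROM sales_v")
column_names = [col[0] for col in cursor.description]
result_rows = cursor.fetchall()

print(column_names)
print(result_rows)
```

Because the names are fixed once in the view, every workbook and every other consumer of the view inherits them, and nobody spends a morning removing underscores by hand.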
We were using Teradata. Teradata is expensive; they charge by the CPU millisecond, right? And we had a ton of non-technical analysts querying the same tables for very similar data, which also was expensive for our network. Our COE built a shared data source and then certified the data source and educated the users about it, so that their presentations would look very similar, and they would be using the same vocabulary properly.
Something I tell analysts who are new to the tool is to make data source names, field names, and workbook names self-explanatory and simple, because someone else will be looking at their work at some point in the near future and will need to understand quickly what it is and how it adds value.
The most important factors in analytics are relevance, accuracy, and consistency: if you can design Data Governance strategies that can be communicated to business users, they will adapt and use them, and the result will be consistent, more efficient analytics that enable better decisions.