4. Import into Tableau's fast data engine
Microsoft JET driver
– 255 columns, 255 characters
– No COUNT DISTINCT or MEDIAN
– No support for files greater than 4GB
Tableau text parser
– Used when no data is modelled
File-based data sources
5. Indexes
– On all columns that are part of table JOINs
– On any column used in a FILTER
Referential integrity
– Primary and foreign keys explicitly defined
• Helps bypass integrity checks
– Enables join culling
Partitioning
– Split a larger table into smaller, individual tables
– Partition across a dimension
Relational data sources
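The value of indexing filter and join columns can be seen with a quick SQLite sketch (the `sales` table and `region` column are invented for illustration): with an index in place, the query plan reports an index search rather than a full table scan.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, amount REAL)")
con.execute("CREATE INDEX idx_sales_region ON sales (region)")

# Ask SQLite how it would execute a query filtering on the indexed column
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM sales WHERE region = 'EAST'"
).fetchall()
for row in plan:
    print(row[-1])  # the plan detail mentions the index, not a table scan
```

The same idea applies to any relational source Tableau queries: an index on the filtered column lets the database seek directly to the matching rows.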
6. NULLs
– Define dimensions as NOT NULL
– Increases the effectiveness of indexes
Calculations
– Move very complex calculations into a view or function within the DB
– Reference it via custom SQL
Summary tables
– Summarize data to a higher level of aggregation
Relational data sources
7. OLAP data sources
– Underlying language differences
• Metadata definition
• Filtering
• Totals and aggregations
• Data blending
Web-based data sources
– "Connect live" is not an option
– Extracts can be refreshed
• Automated and scheduled
Other data sources
9. Slow-running visualisations
– Time taken to query
– Time to stream records back
Number of records
– Large number of raw records vs. a smaller number of aggregated records
Desktop log files
Performance Recorder
Understanding the query
10. Multiple tables
– Tableau dynamically creates SQL based on the fields used
– Join culling drops unused dimension tables
• Requires referential integrity
Custom SQL
– Never deconstructed; always executed atomically
– Use in conjunction with Tableau's data engine
– Context filters materialise results in a temp table
– Include parameters in SQL statements
Multiple tables vs. custom SQL
11. More than one data source
– Blend data or use a federated database system
A join is better on the same data source
– Improves performance
– Improves filtering control
A blend is better across data sources
– When there are too many records for a join to be practical
– To display summary and detail at the same time
Blending vs. joining
13. Factors
– Database technology
– Network speed
– Data volumes
– Workstation specs
• Fast CPU with multiple cores
• Lots of RAM
• Fast I/O
Creation requires temp disk space
– Up to the square of the size of the resulting extract file
Create the extract on a workstation, populate it on the server
Creating extracts
14. Helps improve performance
– Aggregate to a summary level
– Filter out unnecessary values
– Hide all unused fields
Multiple levels of detail across extracts
– Drill from higher aggregation down to detail
Aggregated extracts
15. Optimize deterministic calculations
– String manipulations and concatenations
– Groups and sets
Non-deterministic calculations are not stored
It is cheaper to store data than to recalculate it
Optimizing extracts
16. The speed of the data engine is relative
The data engine is faster than
– A non-optimized database
– A file-based data source
The data engine is probably slower than
– A big cluster of fast machines
Use aggregated extracts to offload summary-style analysis
– Keep detailed source data in the data warehouse
Extracts vs. live connections
18. Discrete filters can be slower
– "Keep only" and "exclude" perform poorly
• Complex WHERE clause
• Join on a temp table
Ranged filters can be faster
– Faster than a large itemised list of discrete values
– The advantage grows with the dimension's cardinality
Indexes impact the efficiency of filters
Filtering categorical dimensions
19. Discrete dates or date levels
– Can result in poor query execution
• Tables are rarely partitioned on a DATEPART
– A data extract can optimise performance
• DATEPART values are materialised in the extract
Range of contiguous dates
– Very efficient for query optimisers
• Leverages indexes and partitions
Relative to a specific date
– Uses a ranged date filter
Filtering dates
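The contrast above can be sketched with SQLite (the `orders` table is invented): a contiguous range on an indexed date column becomes an index seek, while a DATEPART-style filter wraps the column in a function and forces a scan.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (order_date TEXT, amount REAL)")
con.execute("CREATE INDEX idx_orders_date ON orders (order_date)")

# A contiguous date range can be satisfied by seeking on the index
range_plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders "
    "WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31'"
).fetchall()

# A DATEPART-style filter applies a function to the column,
# so the optimiser cannot use the index and scans every row
part_plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders "
    "WHERE strftime('%m', order_date) = '03'"
).fetchall()

print(range_plan[-1][-1])  # index search
print(part_plan[-1][-1])   # full scan
```

The same reasoning is why Tableau's relative date filters, which translate to ranges, tend to outperform discrete DATEPART filters.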
20. All filters are computed independently
Set one or more filters as context filters
– Any other filters are defined as dependent filters
– Writes the filter result set to a temp table
• Subsequent filters and queries run on the smaller dataset
Creating the temp table is an expensive activity
– Use only when context filters are not frequently changed by the user
Custom SQL statements can also be optimised this way
Context filters
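The temp-table mechanism can be sketched in SQLite (table and column names are invented): materialise the context filter's result set once, then run the dependent filter against the smaller table.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, segment TEXT, amount REAL)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("EAST", "CONSUMER", 100.0), ("EAST", "CORPORATE", 50.0),
     ("WEST", "CONSUMER", 75.0)],
)

# "Context filter": materialise the filtered rows once in a temp table
con.execute(
    "CREATE TEMP TABLE ctx AS SELECT * FROM sales WHERE region = 'EAST'"
)

# Dependent filters and queries then run against the smaller temp table
total = con.execute(
    "SELECT SUM(amount) FROM ctx WHERE segment = 'CONSUMER'"
).fetchone()[0]
print(total)  # 100.0
```

This also makes the trade-off concrete: building `ctx` costs a full pass over `sales`, which only pays off if several dependent queries reuse it.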
21. Too many quick filters will slow you down
– Especially "Only Relevant Values"
– Lots of discrete lists
Try guided analytics instead of many quick filters
– Multiple dashboards with different levels of detail
– Action filters within a dashboard
Quick filters
22. Enumerated quick filters can be slow
– Require a query for all potential field values
• Multiple value list
• Single value list
• Compact list
• Slider
• Measure filters
• Ranged date filters
Non-enumerated quick filters can be helpful
– Do not require field values
• Custom value list
• Wildcard match
• Relative date filters
• Browse period date filters
Performance comes at the expense of visual context for the end user
Quick filters
23. Show potential values in 3 different ways
– All values in database
• No need to re-query when other filters change
– All values in context
• Uses the context temp table, regardless of other filters
– Only relevant values
• Other filters are considered
Quick filters
24. Create a parameter and filter based on the user's selection
– PROS
• Does not require a query before rendering
• Parameters + calculated fields = complex logic
• Can be used to filter across data sources
– CONS
• Single-value selection only
• The selection list cannot be dynamic
Quick filter alternatives
25. Use filter actions between views
– PROS
• Support multi-value selection
• Evaluated at run time to show a dynamic list
• Can be used to filter across data sources
– CONS
• Filter actions are more complex to set up
• Not the same UI as parameters or quick filters
• The source sheet still needs to query the data source
Quick filter alternatives
26. More data source I/O
– The exact same query must be asked again for each user
More cache space required
– Each user session creates its own query results and model cache
Caches being cleared can result in more I/O
User filters
28. Basic and aggregate calculations
– Expressed as part of the query sent to the data source
– Calculated by the database
– Basic calculations scale very well
• Tuning techniques can improve performance
Table calculations
– Calculated locally on the query results returned
• Generally done over a smaller set of records
– If performance is slow…
• Push calculations to the data source layer
• Consider aggregated data extracts
Basic and aggregate calculations
29. Native features are often more efficient than a manual calculation
– Grouping dimension members together
• Consider using groups
– Grouping measure values together into "bins"
• Consider using bins
– Changing displayed values for dimension members
• Consider using aliases
Calculations vs. native features
30. The data type used has a significant impact on performance
– Integers are faster than Booleans
– Both are faster than Strings
Use Booleans for basic logic calculations
– Bad
• IF [DATE] = TODAY() THEN "TODAY" ELSE "NOT TODAY" END
– Good
• [DATE] = TODAY()
String searches
– FIND() is slower than CONTAINS()
– CONTAINS() is slower than a wildcard match quick filter
Performance techniques
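The Boolean-over-string pattern can be mirrored in a short Python sketch (the `order_date` value is an invented stand-in for the `[DATE]` field): both versions carry the same information, but the Boolean is the comparison itself, with nothing to format or parse.

```python
from datetime import date, timedelta

today = date.today()
order_date = today - timedelta(days=1)  # hypothetical [DATE] value

# "Bad": encode the test result as a string
label = "TODAY" if order_date == today else "NOT TODAY"

# "Good": the comparison itself is the result
is_today = order_date == today

# The two encodings are logically equivalent;
# the Boolean is cheaper to store, compare and index
assert (label == "TODAY") == is_today
```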
31. Parameters for conditional calculations
– Take advantage of "display as"
• Use integer values for the calculation logic
Performance techniques
Instead of string parameter values:

VALUE     DISPLAY AS
YEAR      YEAR
QUARTER   QUARTER
MONTH     MONTH
WEEK      WEEK
DAY       DAY

use integer values with "display as" labels:

VALUE  DISPLAY AS
1      YEAR
2      QUARTER
3      MONTH
4      WEEK
5      DAY
32. Date conversion
– Converting a numeric field to a string and then to a date is inefficient
• Bad
– DATE(LEFT(STR([YYYYMMDD]),4) + "-" + MID(STR([YYYYMMDD]),5,2) + "-" + RIGHT(STR([YYYYMMDD]),2))
– Keep the numeric field and use DATEADD()
• Good
– DATEADD('day', [YYYYMMDD]%100-1, DATEADD('month', INT(([YYYYMMDD]%10000)/100)-1, DATEADD('year', INT([YYYYMMDD]/10000)-1900, #1900-01-01#)))
Date functions
– NOW() returns a time stamp
– TODAY() returns a date
Performance techniques
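The integer arithmetic inside the DATEADD() expression can be checked with a small Python sketch (the sample value 20240315 is invented): integer division and modulo pull the year, month and day straight out of the number, with no string round-trip.

```python
from datetime import date

yyyymmdd = 20240315  # hypothetical numeric [YYYYMMDD] field value

# The same decomposition the DATEADD() expression performs:
year = yyyymmdd // 10000           # 2024
month = (yyyymmdd % 10000) // 100  # 3
day = yyyymmdd % 100               # 15

d = date(year, month, day)
print(d)  # 2024-03-15
```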
33. Logic statements
– ELSEIF is not the same as ELSE IF
IF [REGION] = "EAST" AND [CUSTOMER SEGMENT] = "CONSUMER"
THEN "EAST-CONSUMER"
ELSE IF [REGION] = "EAST" AND [CUSTOMER SEGMENT] <> "CONSUMER"
THEN "EAST-ALL OTHERS"
END
END
– would run much faster as:
IF [REGION] = "EAST" AND [CUSTOMER SEGMENT] = "CONSUMER"
THEN "EAST-CONSUMER"
ELSEIF [REGION] = "EAST" AND [CUSTOMER SEGMENT] <> "CONSUMER"
THEN "EAST-ALL OTHERS"
END
– but this is faster still:
IF [REGION] = "EAST" THEN
    IF [CUSTOMER SEGMENT] = "CONSUMER" THEN
        "EAST-CONSUMER"
    ELSE "EAST-ALL OTHERS"
    END
END
Performance techniques
34. Separate basic and aggregate calculations
– When using extracts and custom aggregations
• Divide calculations into multiple parts
– Row-level calculations in one calculated field
– Aggregated calculations in a second calculated field
Performance techniques
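The split can be illustrated with a SQLite sketch (the `orders` table is invented): the row-level part (`price * qty`) is kept distinct from the aggregate part (`SUM`), mirroring the two calculated fields described above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (price REAL, qty INTEGER)")
con.executemany("INSERT INTO orders VALUES (?, ?)", [(10.0, 2), (5.0, 4)])

# Row-level calculation (price * qty) wrapped in the aggregate (SUM),
# just as a row-level calculated field is referenced by an aggregate one
total = con.execute("SELECT SUM(price * qty) FROM orders").fetchone()[0]
print(total)  # 40.0
```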
36. Only fetch and draw what you need
– Remove unnecessary fields from the level of detail
Charts vs. crosstabs
– Marks display faster than a tabular report
• Rendering a text table consumes more memory
– Tableau is not a back-door data extraction process
• Leverage aggregate-to-detail action filters
Remove unnecessary geographic roles
– Saves the time taken to look up generated latitudes and longitudes
Views
37. Blending vs. custom geographic roles
– Custom geocoding embeds the entire GEOCODING.FDB
• Significant increase in TWBX file size
– Joining or blending geographic data keeps the file smaller
• Save only the relevant custom geographic data
Views
38. Fewer views, fewer quick filters
– Each view requires at least one query
– Each quick filter requires at least one query
– Can result in a lot of I/O before rendering
– Dashboards process views from the same data source in a serial fashion
Turn off tabs
– Tableau must process every view in every tab
• It needs to understand the structure of tabs for actions and filters
– Reducing tabs improves performance
• Try ":tabs=no"
Dashboards
39. "Exclude all values"
– Avoids the expensive query asking for all data
Fixed-size dashboards
– Different window sizes mean views are drawn differently
• The VizQL server must render the view separately for each user
– "Automatic (Fit to Window)" has a low cache hit rate
Container considerations
• Consider reducing excessive use of containers
Dashboards
41. Executing query
– Process
• Executing a query to return records for the view
– Investigate
• Review the log file to see which queries take a long time
– Possible solution
• Consider calculation, query and filter techniques
Tableau desktop messages
42. Computing view layout
– Process
• Tableau is rendering the display from all data received
– Investigate
• Slow-running table calculations
• Very large crosstabs
• Lots of marks rendered
– Possible solution
• Review techniques for calculation optimisation and view design
Tableau desktop messages
43. Computing quick filters
– Process
• Tableau is processing quick filters for the view(s)
– Investigate
• A long time spent rendering and refreshing quick filters
– Possible solution
• Consider not using "show relevant values"
• Consider not using enumerated quick filters
• Review techniques for filter performance and view design
Tableau desktop messages
44. Understand where bottlenecks are occurring
– Information on what Tableau is doing
– Tableau's communication with the data source
– Time taken by each Tableau step
For Tableau Desktop
– C:\Users\<username>\Documents\My Tableau Repository\Logs
For Tableau Server, the VizQL log file:
– C:\ProgramData\Tableau\Tableau Server\data\tabsvc\vizqlserver\Logs
Log files
45. Database performance monitors
– Insight into the queries hitting your DB
– Understand how the database processes them
– Advice on additional tuning
Tableau 8 performance metrics
– Turn on recording of metrics
Database performance monitors
46. Interpreting the performance recording
– Timeline
• Workbook, dashboard, worksheet
– Events
• Event nature and duration
– Query
• Shown when "Executing Query" is selected in either the timeline or events
Performance metrics V8
47. Computing layouts
– If layouts are taking too long, consider simplifying your workbook.
Connecting to data source
– Slow connections could be due to network issues or issues with the database server.
Executing query
– If queries are taking too long, consult your database server's documentation.
Generating extract
– To speed up extract generation, consider importing only some of the data from the original data source.
Geocoding
– To speed up geocoding performance, try using less data or filtering out data.
Blending data
– To speed up data blending, try using less data or filtering out data.
Server rendering
– You can speed up server rendering by running additional VizQL Server processes on additional machines.
Performance metrics events
49. General guidelines
– Use 64-bit Tableau
• Processes have access to more memory
– Add more cores and memory
Configuration
– Schedule extract refreshes for off-peak hours
– Check the VizQL session timeout limit
• Default is 30 minutes
– An idle session still consumes memory
• Change it using tabadmin
– vizqlserver.session.expiry.timeout setting
– Assess your process configuration
General guidelines
50. Maximise cache use
– Reuse image tile and model caches
• Set the dashboard size rule to "exact size"
Tuning caches
– Tableau Server configuration utility
• Minimise queries
• Balanced
• Most up-to-date
» Add ":refresh=yes" to the view URL
– Tabadmin
• Model cache
» vizqlserver.modelcachesize: 30
• Query cache
» vizqlserver.querycachesize: 64
Caching