This webinar was held on December 12, 2012 and provided an overview of free and low-cost tools for cleaning and preparing data and building useful and beautiful data visualizations.
Exploring Data Preparation and Visualization Tools for Urban Forestry
1. Exploring Data Preparation and Visualization
Tools for Urban Forestry
340 N 12th St, Suite 402
Philadelphia, PA 19107
215.925.2600
info@azavea.com
www.azavea.com/opentreemap
2. About Us
Deborah Boyer
OpenTreeMap Project Manager
dboyer@azavea.com
215.701.7506
Jeremy Heffner
Product Manager
jheffner@azavea.com
215.701.7712
3. About Azavea
• Founded in 2000
• B Corporation
• 30+ people
• Based in Philadelphia
  – Boston office
• Geospatial + web + mobile
  – Software development
  – Spatial analysis services
  – User experience
4. Agenda
• The Ideal: Gathering Organized, Perfect Data
• The Reality: Cleaning and Preparing Your Data
• Adding Context
• Exploring, Preparing and Sharing Data Visualizations
• Questions
6. An open source tree data management system for collaborative, geography-enabled urban tree inventory
7. Main Features
• Search and Explore Tree Data
• View Ecosystem Benefits
• Add New Trees
• Edit and Update Trees
• Upload Tree Photos
• Track Stewardship Activities
9. Data Quality Checks
• Remove duplicate trees during data upload
• Tree watch list
• Drop down lists
• User groups
• Reputation points
11. Data Cleaning: Your Questions
• At what point in the data maintenance process do you find yourself cleaning data?
• Are there ways that you would like to improve the workflow?
13. Cleaning & Preparing Data
• Making sense of data starts at the point of collection
  – Define what you want to measure / track
    • Clearly define schema and fields
  – Have a shared meaning for values
  – Data validation on entry
  – Collect your data
  – Examine results
    • Are there common mistakes you could prevent?
    • Are there different interpretations of fields?
  – Close the feedback loop & iterate
14. Cleaning & Preparing Data
• Common data quality issues
  – Combined fields
    • Address: “340 N 12th St, Suite 402, Philadelphia, PA 19107”
  – Invalid entries
    • ZIP code: 1234 (length check, is number)
    • Age: 204 (reasonable range check, is number)
  – Format variations
    • State: PA vs. Pennsylvania (drop down or scrubbing rules)
  – Duplicates
    • CRM: John Smith with old and new addresses
16. What does this have to do with trees?
• We track things - tree inventories, potential planting sites, community groups, people who requested trees, etc.
• Data comes from lots of places - web forms, collected by various staff, submitted by community groups.
• None of it matches.
• Good data makes our lives easier.
17. Cleaning & Preparing Data
• Tools to clean tabular data
  – Excel (or open source equivalent)
    • Pros:
      – Broad features
      – Widely utilized / common skill
      – Formulas / sorting / flexible
    • Cons:
      – Doesn’t understand record concept
      – Mass changes can be tedious
18. Cleaning & Preparing Data
• Tools to clean tabular data
  – DataWrangler
    • http://vis.stanford.edu/wrangler/
    • Pros:
      – Focused on transforming data into relational format
      – Live previews
    • Cons:
      – Alpha quality version
      – Data size limits / online tool
      – Can be difficult to figure out which set of transforms is needed
19. Cleaning & Preparing Data
• Tools to clean tabular data
  – Google Refine
    • http://code.google.com/p/google-refine/
    • Pros:
      – Understands record concept
      – Formulas / Facets
      – Undo capability
      – Windows / Mac / Linux
    • Cons:
      – There is a learning curve
      – Unusual type of app
        » Download, unzip, run exe file, access through browser
22. Context: Your Questions
• What challenges have you faced putting your data in context?
• Are you struggling to identify what “context” means for your organization?
• Do you know what data you’d like to use, but have trouble finding it?
23. Your Data in Context
• Your data is essential!
• But it is more meaningful in context…
  – Ratios & rates
    • Service level
    • Market penetration
  – Indicators & trends
    • How you compare
  – Targeting
    • Key demographics
    • Custom summaries
(Juice Analytics)
24. What does this have to do with trees?
• Trees don’t exist in a vacuum.
• Contextual data = more effective outreach.
• More info gives you new insights.
25. Making Sense of the Census
• American FactFinder
• http://factfinder2.census.gov
  – Decennial Census
    • Every 10 years
    • Full population survey
    • Just 10 questions
  – American Community Survey (ACS)
    • Monthly sample
    • Aggregated over different time periods (1-, 3- and 5-year)
    • Extremely detailed questions
    • Subject to sampling error
27. Helpers: Social Explorer
• http://www.socialexplorer.com/
• Data Dictionary
  – Survey
  – Dataset
  – Table
  – Variable
  – Formula
  – Population
29. Helpers: ACS Alchemist
• https://github.com/azavea/acs-alchemist
• Retrieval of block group-level data
• Custom variable selection
• Delivery in spatial data format ready for mapping
This tool was developed by Azavea in collaboration with Jerry Ratcliffe and Ralph Taylor of Temple University Center for Security and Crime Science. This project was supported by Award No. 2010-DE-BX-K004, awarded by the National Institute of Justice, Office of Justice Programs, U.S. Department of Justice.
32. Helpers: ACS Alchemist
As easy as 1-2-3
1. Create a document with your selected variables
2. Pick your geographies and geolevels
3. Retrieve your shapefiles
33. Other Sources
• Public data
  – Open Data Portals
    • Federal, state & local data
  – Political Data
    • Voter data
    • Legislative boundaries
• Commercial data
  – Population Projections
  – Consumer Data
35. Data Visualization: Your Questions
• Do you currently share data with your constituents?
• Where do you use data visualizations (e.g. annual report, embedded infographics, live data trackers)?
• Do you currently map your data?
36. What does this have to do with trees?
• Charts, graphs, maps, and photos help us tell a story.
• Show that trees are more than just leaves and branches.
• Explore the science without making people’s eyes glaze over.
37. Exploring Data
• Visualization tools
  – Tableau
    • http://www.tableausoftware.com/
    • Pros:
      – Flexible interface makes data exploration easy
      – Fast even on large data sets
    • Cons:
      – Easy to visualize something that doesn’t make sense to look at
      – Price (for desktop tool)
39. Exploring Data
• Visualization tools
  – GeoCommons (GeoIQ)
    • http://geocommons.com/
    • Pros:
      – Intuitive interface
      – Analysis tools
      – Geocoding for up to 5,000 records
      – Supports KML (Google Maps) import & export
    • Cons:
      – US-only geocoding
40. Exploring Data
• Desktop GIS: Proprietary
  – Esri ArcGIS
    • Pros:
      – Industry standard
      – Many tools
      – Extensive training materials
      – Customer support
    • Cons:
      – Windows only
      – Potentially expensive *
*
41. Exploring Data
• Visualization tools
  – ArcGIS Explorer online
    • http://www.arcgis.com/explorer/
    • Pros:
      – Supports many data formats
      – Online digitizing
      – Integration with other Esri services
      – Presentation view / mobile app
    • Cons:
      – Can’t export geocoded results
      – Geocoding limited to 250 records
45. Contact Us
Deborah Boyer
OpenTreeMap Project Manager
dboyer@azavea.com
215.701.7506
Jeremy Heffner
Product Manager
jheffner@azavea.com
215.701.7712