Everyone can do data science with the help of tools such as:
- import.io for visually scraping data from the web
- Pandas to wrangle data in Python
- BigML to apply machine learning to data.
In this presentation , I introduce what machine learning is before moving on to a case study where I show how to build a real estate pricing model. Check out import.io's webinar for the whole thing: http://blog.import.io/post/become-a-data-scientist-in-an-hour
21. “A significant constraint on
realizing value from big data
will be a shortage of talent,
particularly of people with
deep expertise in statistics
and machine learning.”
(McKinsey & Co.)
35. • Classification and regression
• 2 phases in ML: train and predict
• Prediction APIs make it easy to build
models
• Let’s use them on real estate data to
predict price from house
characteristics
36. • Encoding domain knowledge
• Making our life easier: restricting
data to only 1 city
37.
38. BigML!
• Look at data
• Split into training and test
• Build model from training
• Evaluate model on test
• Errors: mean absolute error (or
percentage?)
39.
40. Other import.io + BigML use cases:!
- Predict ebook rating from description
- Predict sales of etsy stores