Analytic innovation: transforming Instagram data into predictive analytics, with references
1. Analytic Innovation:
Transforming Instagram Data
Into Predictive Analytics
Suresh.sood@uts.edu.au or linkedin.com/in/sureshsood
Xinhua.zhu@uts.edu.au or linkedin.com/pub/xinhua-zhu/27/448/621
2. Useful References Informing our Thinking
Silva et al. (2013), A comparison of Foursquare and Instagram to the study of city
dynamics and urban social behavior, Proceedings of the 2nd ACM SIGKDD
International Workshop on Urban Computing
Instagram and Foursquare datasets might be compatible in finding popular regions of
a city
Chaoming Song, et al. (2010), Limits of Predictability in Human Mobility, Science
There is a potential 93% average predictability in user mobility, an exceptionally high
value rooted in the inherent regularity of human behavior. Yet it is not the 93%
predictability that we find the most surprising. Rather, it is the lack of variability in
predictability across the population.
Scellato et al. (2011), NextPlace: A Spatio-temporal Prediction Framework for
Pervasive Systems. Proceedings of the 9th International Conference on Pervasive
Computing (Pervasive'11)
Daily and weekly routines => Few significant places every day => Regularity in human
activities => Regularity leads to predictability
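The chain above (routines, a few significant places, regularity, predictability) can be illustrated with a toy calculation. This is a minimal sketch, assuming a user's history is a list of (hour of day, place label) visits. The data, function names and place labels are hypothetical, and the code is a simple most-frequent-place-per-hour baseline, not the NextPlace algorithm itself.

```python
from collections import Counter, defaultdict


def top_place_share(visits, k=2):
    """Fraction of all visits that fall in the k most visited places."""
    counts = Counter(place for _, place in visits)
    top = sum(n for _, n in counts.most_common(k))
    return top / len(visits)


def fit_hourly_baseline(visits):
    """Map each hour of day to the place most often seen at that hour."""
    by_hour = defaultdict(Counter)
    for hour, place in visits:
        by_hour[hour][place] += 1
    return {hour: c.most_common(1)[0][0] for hour, c in by_hour.items()}


# Hypothetical history: (hour of day, place label)
history = [
    (9, "office"), (9, "office"), (9, "office"), (9, "cafe"),
    (20, "home"), (20, "home"), (20, "home"), (20, "gym"),
]

share = top_place_share(history, k=2)   # share of visits to the 2 top places
model = fit_hourly_baseline(history)    # e.g. predicted place at 9:00 and 20:00
print(share, model)
```

A high share of visits concentrated in a handful of places is what makes even a baseline this simple useful; the cited work goes well beyond it.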
3. Useful References Informing our Thinking
De Domenico, M., Lima, A. and Musolesi, M. (2012) Interdependence and Predictability of Human
Mobility and Social Interactions. Proceedings of the Nokia Mobile Data Challenge
Workshop.
we have shown that it is possible to exploit the correlation between movement data and
social interactions in order to improve the accuracy of forecasting of the future geographic
position of a user. In particular, mobility correlation, measured by means of mutual
information, and the presence of social ties can be used to improve movement forecasting
by exploiting mobility data of friends. Moreover, this correlation can be used as indicator of
potential existence of physical or distant social interactions and vice versa.
Sadilek, A and Krumm, J. (2012) Far Out: Predicting Long-Term Human Mobility
Where are you going to be 285 days from now at 2pm …we show that it is possible to
predict location of a wide variety of hundreds of subjects even years into the future and
with high accuracy.
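The mobility correlation "measured by means of mutual information" quoted above can be made concrete with a small example. This is a minimal sketch, assuming each user's movement is a sequence of discrete place labels sampled at the same time steps. It uses the plain plug-in (frequency count) estimate of mutual information in bits, which is an illustrative choice and not necessarily the estimator used in the cited paper. The user names and sequences are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two aligned label sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p(x, y) / (p(x) * p(y)) written with raw counts
        mi += p_xy * log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical aligned location sequences (one label per time step)
alice = ["home", "work", "home", "work"]
bob = ["home", "work", "home", "work"]    # moves in step with alice
carol = ["gym", "gym", "gym", "gym"]      # carries no information about alice

mi_friends = mutual_information(alice, bob)
mi_unrelated = mutual_information(alice, carol)
print(mi_friends, mi_unrelated)
```

A pair of users whose sequences share more information is a candidate for the "friends' mobility data improves forecasting" effect the authors describe.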
4. Topic Areas
1. Analytic innovation and exploratory analysis
2. Motivations for Instagram project
3. Pattern mining trajectories
4. Instagram analytics tools
5. NoSQL - MongoDB
6. Datafication 3 back end (walk thru)
7. Q&A
5. Analytic Innovation
“Let’s define analytic innovation as any type of
analytical approach that is new and unique. It is
something a given organization has not done
before, and perhaps something nobody
anywhere has done before…An analytic
innovation should be focused on analyzing a
new data source, solving a new problem…”
Franks, B. (2012) Taming the Big Data Tidal Wave, p. 255, John Wiley & Sons
6. Discovery (Exploratory) Analytics
Exploratory
– Unstructured
– Machine learning
– Data mining
– Complex analysis
– Data diversity
Richness
X Business Intelligence
– Dashboard
– Real time decisioning
– Alerts
– Fresh data
– Response time
Speed of Query
7. Smartphone, Google Glass or Apple Watch will
Know What You Want Before You Do
“…from 2014 your phone [glasses or watch] will
anticipate your needs, do the research, tell you
what you want to know – sometimes
before the question even occurs to you…”
Chapman, Jake (2013), The Wired World in 2014
11. Motivations for Instagram Project
• Trajectory data (not i.i.d., i.e. not independent and identically distributed)
• A new authentication approach based on trajectory
• Predictive capability of phones, glasses and watches
• Internet of Things (Sensors, RFID and Drones)
• Indoor GPS
• Car parking “anywhere”
• Location based services e.g. advertising
• Tourist recommender system
• Food analytics and traceability (farm to fork)
• Mobile apps with trajectory data e.g. Foursquare, Instagram, Nike+, EveryTrail
• Insurance “pay as you drive”: telematics black box based insurance policy
12. Black Box Insurance
• Telematics technology (black box) helps assess driving
behavior and deliver true driver-centric premiums by
capturing:
– Number of journeys
– Distances travelled
– Types of roads
– Speed
– Time of travel
– Acceleration and braking
– Any accidents
• Benefits low mileage, smooth and safe drivers
• Privacy vs. saving money on insurance (Canada)
– http://bit.ly/Black_box
13. Pattern Mining Trajectories
Trajectory Patterns:
Group of Trajectories
1. Hot regions (basic unit)
2. Trajectory pattern is relationships amongst regions
Opportunities:
Location based networks
Destination prediction
Car-pooling
Personal route planning
Group buying
Loyalty
Credit card data
Adapted from: Chang, Wei, Yeh and Peng, “Discovering Personalised Routes from Trajectories”,
ACM LBSN’11, Chicago, Illinois, USA, 1 November 2011
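The two ideas on this slide, hot regions as the basic unit and patterns as relationships amongst regions, can be sketched in a few lines. This is a minimal illustration under stated assumptions: a region is a fixed-size latitude/longitude grid cell, a region is "hot" when at least `min_support` distinct trajectories pass through it, and a pattern is a count of moves between hot regions. The grid size, threshold, function names and coordinates are all hypothetical, and this is not the algorithm of Chang et al.

```python
from collections import Counter


def to_cell(lat, lon, size=0.01):
    """Snap a coordinate to a grid cell of `size` degrees (assumed resolution)."""
    return (int(lat // size), int(lon // size))


def hot_regions(trajectories, min_support=2, size=0.01):
    """Cells visited by at least `min_support` distinct trajectories."""
    support = Counter()
    for traj in trajectories:
        # A set, so each trajectory counts once per cell
        support.update({to_cell(lat, lon, size) for lat, lon in traj})
    return {cell for cell, n in support.items() if n >= min_support}


def region_transitions(trajectories, hot, size=0.01):
    """Count moves between consecutive hot regions across all trajectories."""
    moves = Counter()
    for traj in trajectories:
        cells = [to_cell(lat, lon, size) for lat, lon in traj]
        path = [c for c in cells if c in hot]  # drop points outside hot regions
        for a, b in zip(path, path[1:]):
            if a != b:
                moves[(a, b)] += 1
    return moves


# Hypothetical trajectories as lists of (latitude, longitude) points
trajectories = [
    [(-33.865, 151.205), (-33.875, 151.215)],
    [(-33.865, 151.205), (-33.875, 151.215), (-33.905, 151.255)],
    [(-33.866, 151.206), (-33.874, 151.214)],
]

hot = hot_regions(trajectories, min_support=2)
moves = region_transitions(trajectories, hot)
print(hot, moves)
```

Frequent moves between hot regions are the raw material for the opportunities listed above, such as destination prediction or personal route planning.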
14. Instagram Analytics Tools (off the shelf)
• Statigram
– Lifetime likes
– Total comments
– New followers/last 7 days
– Most liked photos
• Simply Measured
– Total engagement Instagram, Facebook and Twitter
– Engaging photo/filter/location
– Top photos by date
– Active commenters
– Best time for engagement
– Best day for engagement
– Top filters
• Nitrogram
– Countries of followers
– Most engaging
– Most commented
– Likes and comments on a photo
16. Why is Instagram Popular?
• Mobile photo sharing app + social network
• Mobile first workflow:
– take picture or select => crop/filter => geo-tag/hashtag/description/share
• Instagram is “Twitter but with photo updates”
• Status updates are transformed photos
• By default, pictures and accounts are public
• Pictures include:
– Geolocation, hashtags, comments and likes
• Mobile app friendly vs. desktop
17. MongoDB - An Innovation in Databases?
“MongoDB gets the job done”
“document-oriented NoSQL database”
“MongoDB is natural choice when dealing with JSON”
“Same data model in code = same model in database”
“Data structure store to model applications”
“In MongoDB an Instagram post can be stored in a single collection, exactly as represented in the program, as one
object. In a relational database an Instagram post would occupy multiple tables.”
“MongoDB understands geo-spatial co-ordinates and supports geo-spatial indexing”
“Initial MongoDB prototype on Red Hat OpenShift (Public/Private or Community ‘Platform as a Service’)”
Recommendation engine integrating Mahout libraries and MongoDB (see Roadmap)
As discussed @ Journey to MongoDB: Trajectory Pattern Mining in Australian Instagram
By Suresh Sood and Xinhua Zhu
Sydney MongoDB Meetup, 30 April 2013
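To make the "one object, one document" and geo-spatial points above concrete, here is a minimal sketch of how an Instagram post might look as a single JSON-style document, together with a proximity query. The field names and values are hypothetical (they are not Instagram's actual API schema or the authors' data model). The location uses the GeoJSON Point form, and the query uses MongoDB's `$near` operator with `$geometry` and `$maxDistance`. The block uses only the standard library so that it runs without a database.

```python
import json

# Hypothetical Instagram post stored as one document (illustrative fields only)
post = {
    "_id": "post-001",
    "user": "example_user",
    "caption": "Morning at the harbour",
    "filter": "Valencia",
    "hashtags": ["sydney", "harbour"],
    "likes": 42,
    "comments": [{"user": "friend_a", "text": "Nice shot"}],
    # GeoJSON Point: coordinates are [longitude, latitude]
    "location": {"type": "Point", "coordinates": [151.2153, -33.8568]},
    "created_at": "2013-04-30T08:15:00Z",
}


def near_query(lon, lat, max_metres):
    """Build a MongoDB $near filter for posts within max_metres of a point."""
    return {
        "location": {
            "$near": {
                "$geometry": {"type": "Point", "coordinates": [lon, lat]},
                "$maxDistance": max_metres,
            }
        }
    }


query = near_query(151.21, -33.86, 1000)
print(json.dumps(post, indent=2))
print(json.dumps(query, indent=2))
```

With a running MongoDB server and the pymongo driver (an assumption; the slides do not name a driver), the same dictionaries could be passed to `collection.insert_one(post)`, `collection.create_index([("location", "2dsphere")])` and `collection.find(query)`. Comments, hashtags and location live inside the one document, where a relational design would typically spread them over several tables.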