TLC – Yellow Taxi
By: Alex Blom, James Kimball, Monique Atkinson, Zhiyun Qi,
and Alexander Stein
Business Understanding
❖Business Understanding: TLC's board wants a better understanding of
the customers it serves during major holidays like New Year's Day.
TLC would like to predict which customers tip and which do not, in
order to increase employee motivation. We hypothesize that larger
groups tip better, especially on holidays such as New Year's Day.
However, the current consensus among the drivers is that large groups
are less profitable and more annoying.
❖Business Question: Which customer segment is more likely to tip taxi
drivers?
❖Success Metric: We will consider this model a success if we discover a
clear trend between group size and tip amount.
Data Understanding
The data was collected by and obtained from the New York City Taxi &
Limousine Commission. The data set covers only trips recorded during
the month of January 2019.
The Data Dictionary includes categorical, text, nominal, and
quantitative data.
The fields we will be using within our model include:
•Ride Length (quantitative)
•Passenger Count (quantitative)
•Fare Amount (quantitative)
•Tip Amount (quantitative)
•Tip form (categorical)
Data Preparation
❖We began by deleting two columns that no taxi driver had filled
out. We may come back another time to investigate whether the
missing data is significant.
❖We deleted entries that fell outside the month of January.
❖We narrowed the data to trips on 1/1/2019.
❖We calculated ride length by converting the pick-up and drop-off
times to a Time format and subtracting the pick-up time from the
drop-off time.
❖In a new sheet we put the tpep_pickup_datetime,
tpep_dropoff_datetime, Ride_Length, passenger_count,
trip_distance, payment_type, fare_amount, and tip_amount columns
to run with the model.
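The preparation steps above were done in a spreadsheet. As a rough equivalent, they could be sketched in Python as follows. The sample rows, the timestamp format, and the helper name `prepare` are assumptions made for illustration; only the column names come from the slide.

```python
from datetime import date, datetime

# Hypothetical sample rows using the column names listed above.
# All values are invented for illustration only.
trips = [
    {"tpep_pickup_datetime": "2019-01-01 00:10:00",
     "tpep_dropoff_datetime": "2019-01-01 00:25:30",
     "passenger_count": 2, "trip_distance": 3.1,
     "payment_type": 1, "fare_amount": 13.0, "tip_amount": 2.5},
    {"tpep_pickup_datetime": "2018-12-31 23:40:00",   # outside January
     "tpep_dropoff_datetime": "2018-12-31 23:55:00",
     "passenger_count": 1, "trip_distance": 2.0,
     "payment_type": 2, "fare_amount": 9.0, "tip_amount": 0.0},
    {"tpep_pickup_datetime": "2019-01-02 08:00:00",   # January, but not 1/1
     "tpep_dropoff_datetime": "2019-01-02 08:12:00",
     "passenger_count": 1, "trip_distance": 1.8,
     "payment_type": 1, "fare_amount": 8.5, "tip_amount": 1.0},
]

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format of the timestamps
TARGET_DAY = date(2019, 1, 1)


def prepare(rows):
    """Keep trips picked up on TARGET_DAY and add Ride_Length in minutes."""
    prepared = []
    for row in rows:
        pickup = datetime.strptime(row["tpep_pickup_datetime"], TIMESTAMP_FORMAT)
        dropoff = datetime.strptime(row["tpep_dropoff_datetime"], TIMESTAMP_FORMAT)
        if pickup.date() != TARGET_DAY:
            continue  # drops rows outside January and rows on other days
        ride_minutes = (dropoff - pickup).total_seconds() / 60
        prepared.append({**row, "Ride_Length": ride_minutes})
    return prepared


new_years_trips = prepare(trips)
print(len(new_years_trips), new_years_trips[0]["Ride_Length"])
```

Filtering on the pick-up date is one possible choice; a trip that starts late on 12/31 and ends on 1/1 would be excluded under this rule.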
Modeling
❖We chose to use the Customer Segmentation model. We wanted to group
different customers and compare the segments by ride length, trip
fare, and tip amount.
❖The variables we considered were the pickup time, drop-off time, passenger
count, trip distance, fare amount, payment type, and tip amount. We calculated
ride length by doing a transformation on the pickup and drop-off times. We
also considered a location variable, but the location was based on area codes
from TLC, not general ZIP codes or boroughs. Since TLC is focused on group size
and tip amount, we did not use location as a final variable.
❖We assessed the results through the segmentations featured on the following
slides, as created by Solver.
❖We assessed the validity of the results by converting the data into a
box-and-whisker plot so we could visually identify statistical outliers.
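A box-and-whisker plot conventionally marks points beyond 1.5 times the interquartile range as outliers. The sketch below applies that common convention numerically; the slides do not state which rule Solver or the plotting tool used, and the tip values and function name are hypothetical.

```python
from statistics import quantiles


def box_plot_outliers(values):
    """Return the values outside the 1.5 * IQR whiskers (a common convention)."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical tip amounts for one segment, invented for illustration.
tips = [0.0, 1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 20.0]
outliers = box_plot_outliers(tips)
print(outliers)
```

Running the same check per segment would give a numeric counterpart to the outliers that are visible in the plots.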
Modeling
❖This graph shows tip amount per cluster. Segment 2 shows many
outliers that are important to investigate. This shows that single
riders who ride more than 30 min have the highest spread of tips.
This is our first clue that distance, and not party size, may be
more useful in determining tip amount.
Modeling
This graph shows the fare amount per cluster. The graph shows that
taxi drivers in segments four and five have negative tipping amounts.
This is most likely due to mistaken entries from the taxi drivers.
The mistakes may stem from these segments consisting of rides
shorter than 10 min.
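One way to follow up on the negative amounts is to separate those rows and check how many come from short rides. This is a minimal sketch with invented rows; the helper name `split_suspect_rows` and the 10-minute cutoff applied here are illustrative assumptions, not steps the team reported taking.

```python
# Hypothetical prepared rows; the values are invented for illustration.
rows = [
    {"Ride_Length": 14.0, "fare_amount": 11.5, "tip_amount": 2.0},
    {"Ride_Length": 4.0, "fare_amount": -3.0, "tip_amount": 0.0},
    {"Ride_Length": 6.5, "fare_amount": 5.0, "tip_amount": -1.0},
    {"Ride_Length": 25.0, "fare_amount": 22.0, "tip_amount": 4.0},
]


def split_suspect_rows(rows):
    """Separate rows with a negative fare or tip from the rest."""
    valid, suspect = [], []
    for row in rows:
        if row["fare_amount"] < 0 or row["tip_amount"] < 0:
            suspect.append(row)
        else:
            valid.append(row)
    return valid, suspect


valid, suspect = split_suspect_rows(rows)
# Count how many suspect rows are rides shorter than 10 minutes.
short_suspect = [r for r in suspect if r["Ride_Length"] < 10]
print(len(valid), len(suspect), len(short_suspect))
```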
Modeling
This graph shows the trip distance per cluster. It again shows
that segment two has the longest ride mileage.
Modeling
This graph shows the number of riders per cluster. While
segment two had the largest tipping spread in the earlier
graph, it tends to have very few riders, which contradicts
our earlier hypothesis. The segments with the most
riders are segments 1 and 4.
Model Evaluation
Across our 5 customer segments, we found that the distance of the
trip bears no correlation to whether passengers tip. However, we
identified two factors that did affect whether riders tipped: the
amount of time they spent in the taxi and the payment type. Our model
shows that riders who spent, on average, at least 10 minutes in the
taxi were likely to tip. Riders also tipped when a credit card was
used and did not tip when cash was used. We believe this to be a
consequence of the data being self-reported: cash tips may not be
reported and may be pocketed by the driver. Group size, trip
distance, and the fare amount did not change the tip amount in a way
that we could see.
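The two factors described above can be checked directly by comparing the share of trips with a tip across groups. The sketch below uses invented rows and a hypothetical helper, `tip_rate_by`; it assumes payment_type 1 means credit card and 2 means cash, as in the TLC data dictionary.

```python
from collections import defaultdict

# Hypothetical rows; values are invented for illustration.
# Assumption: payment_type 1 = credit card, 2 = cash.
rows = [
    {"payment_type": 1, "Ride_Length": 15.0, "tip_amount": 2.0},
    {"payment_type": 1, "Ride_Length": 8.0, "tip_amount": 0.0},
    {"payment_type": 1, "Ride_Length": 22.0, "tip_amount": 3.5},
    {"payment_type": 2, "Ride_Length": 12.0, "tip_amount": 0.0},
    {"payment_type": 2, "Ride_Length": 5.0, "tip_amount": 0.0},
    {"payment_type": 1, "Ride_Length": 30.0, "tip_amount": 5.0},
]


def tip_rate_by(rows, key):
    """Return the share of trips with a positive tip for each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [tipped, total]
    for row in rows:
        group = key(row)
        counts[group][0] += 1 if row["tip_amount"] > 0 else 0
        counts[group][1] += 1
    return {group: tipped / total for group, (tipped, total) in counts.items()}


by_payment = tip_rate_by(rows, lambda r: r["payment_type"])
by_duration = tip_rate_by(rows, lambda r: r["Ride_Length"] >= 10)
print(by_payment)
print(by_duration)
```

Because cash tips may go unrecorded, a zero tip rate for cash in data like this would reflect reporting, not necessarily rider behavior.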
Recommendation
We recommend that we dismiss this model and create a new
model that focuses on the distance of the cab ride instead of
the group size to see any possible correlation of tipping
patterns. While we still believe there is a way to increase
driver motivation, this model helped us determine which
variables may be more beneficial to focus on.
A possible fix to our model is to include more segmentations
to identify more trends in our segmentation.
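A simple first check for the proposed distance-focused model would be the Pearson correlation between trip distance and tip amount. The sketch below is hand-rolled so it needs no external libraries; the function name and the sample values are hypothetical, so its output says nothing about the real data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical values, invented for illustration only.
trip_distance = [0.8, 1.5, 2.2, 3.0, 5.4, 9.1]
tip_amount = [0.0, 1.0, 1.5, 2.0, 3.0, 6.5]
r = pearson(trip_distance, tip_amount)
print(round(r, 3))
```

The same function could be applied to passenger count and tip amount to compare how strongly each variable tracks tipping.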
Sources
Data Set:
https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page

Marketing Analytics Final Project
