Barcelona ML Meetup - Lessons Learned

2,728 views

Lessons learned from building real-life Machine Learning systems

Published in: Technology

  1. 1. Lessons Learned
  2. 2. Lessons Learned
  3. 3. More Data vs. Better Models
  4. 4. Really? Anand Rajaraman: Former Stanford Prof. & Senior VP at Walmart
  5. 5. Sometimes, it’s not about more data
  6. 6. Norvig: “Google does not have better Algorithms, only more Data”. Many features / low-bias models
  7. 7. Sometimes, it’s not about more data
  8. 8. You might not need all your “Big Data”
  10. 10. Sometimes you do need a Complex Model
  11. 11. It pays off to be smart about Hyperparameters
  13. 13. Supervised vs. plus Unsupervised Learning
  16. 16. Everything is an ensemble
  19. 19. The output of your model will be the input of another one (and other system design problems)
  22. 22. The pains & gains of Feature Engineering
  25. 25. Implicit signals beat explicit ones (almost always)
  28. 28. Be thoughtful about your Training Data
  31. 31. Your Model will learn what you teach it to learn
  34. 34. Learn to deal with Presentation Bias
  35. 35. More likely to see Less likely
  36. 36. Data and Models are great. You know what’s even better? The right evaluation approach!
  38. 38. You don’t need to distribute your ML algorithm
  41. 41. But, if you do, you should understand at what level to do it
  42. 42. The three levels of Distribution/Parallelization
      ● For each subset of the population (e.g. region)
      ● For each combination of the hyperparameters
      ● For each subset of the training data
      Each level has different requirements
      ANN Training over distributed GPUs
  43. 43. Some things are better done Online and others Offline… and, there is Nearline for everything in between
  44. 44. System Overview
      ● Blueprint for multiple personalization algorithm services
      ● Ranking
      ● Row selection
      ● Ratings
      ● …
      ● Recommendation involving multi-layered Machine Learning
  45. 45. Matrix Factorization Example
  46. 46. The two faces of your ML infrastructure
  51. 51. Why you should care about answering questions (about your model)
  53. 53. The untold story of Data Science and vs. ML engineering
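
Slide 42 names three levels at which ML training can be distributed: per subset of the population, per hyperparameter combination, and per subset of the training data. The sketch below is not taken from the deck. It is a minimal illustration of how the three levels nest, with several assumptions: the toy data, the one-parameter linear model, the function names (`shard_gradient`, `train`, `tune`, `tune_all_regions`), and the use of threads in place of separate machines or GPUs are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Toy rows of (region, x, y) with y roughly 2*x. All values are made up.
ROWS = [("eu", 1.0, 2.1), ("eu", 2.0, 3.9), ("eu", 3.0, 6.2),
        ("us", 1.0, 1.8), ("us", 2.0, 4.1), ("us", 3.0, 5.9)]


def shard_gradient(shard, w, l2):
    """Level 3: squared-error gradient (plus L2 term) on one data shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) + 2 * l2 * w * len(shard)


def train(data, lr, l2, steps=200, n_shards=2):
    """Fit y ~ w*x by gradient descent, combining shard gradients each step."""
    shards = [data[i::n_shards] for i in range(n_shards)]
    w = 0.0
    with ThreadPoolExecutor() as pool:
        for _ in range(steps):
            # Every shard needs the current w, so workers sync on every step.
            grads = pool.map(shard_gradient, shards,
                             [w] * n_shards, [l2] * n_shards)
            w -= lr * sum(grads) / len(data)
    return w


def fit_and_score(data, lr, l2):
    """Train once and return (training error, lr, l2, w)."""
    w = train(data, lr, l2)
    err = sum((w * x - y) ** 2 for x, y in data) / len(data)
    return err, lr, l2, w


def tune(data, lrs=(0.01, 0.05), l2s=(0.0, 0.1)):
    """Level 2: one independent training job per hyperparameter combination."""
    combos = list(product(lrs, l2s))
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda c: fit_and_score(data, *c), combos))
    return min(results)  # lowest training error wins


def tune_all_regions(rows):
    """Level 1: one independent tuning job per subset of the population."""
    by_region = {}
    for region, x, y in rows:
        by_region.setdefault(region, []).append((x, y))
    with ThreadPoolExecutor() as pool:
        best = list(pool.map(tune, by_region.values()))
    return dict(zip(by_region, best))


if __name__ == "__main__":
    for region, (err, lr, l2, w) in tune_all_regions(ROWS).items():
        print(region, "lr =", lr, "l2 =", l2, "w =", round(w, 3))
```

The sketch also hints at why "each level has different requirements": the per-region and per-hyperparameter jobs never talk to each other, so they can run on separate machines with no coordination, whereas the per-shard workers must exchange gradients at every step. Threads are used here only to keep the example self-contained; they illustrate the structure rather than deliver a real speedup.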
