Content-Based approaches for
Cold-Start Job Recommendations
ACM RecSys Challenge 2017
Lunatic Goats @PoliMi
M. Bianchi, F. Cesaro, F. Ciceri, M. Dagrada, A. Gasparin, D. Grattarola,
I. Inajjar, A. M. Metelli, L. Cella
Task Outline
● Cold Start recommendation scenario:
○ job posting recommendations;
○ focus on getting positive interactions;
○ penalized for negative interactions;
○ rewarded for recruiter interest.
● Two phases:
○ Offline - predictions for fixed sets of items and users.
○ Online - daily recommendation to variable sets of users.
Data Analysis - Impressions vs Interactions
● Impressions: ~97% of the data, little to no information
contained (discarded).
● Interactions: ~3% of the data.
● Interactions divided into:
○ positive interactions (types 1, 2 and 3);
○ negative interactions (type 4);
○ recruiter interest (type 5).
● Interactions treated as implicit feedback.
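The grouping of interaction types above can be sketched in a few lines of Python. This is a minimal illustration, not the team's code: the function name `split_interactions`, the `(user, item, type)` triple layout and the sample data are assumptions. Only the type codes 1 to 5 come from the slide.

```python
# Type codes as listed on the slide; the constant names are hypothetical.
POSITIVE_TYPES = {1, 2, 3}
NEGATIVE_TYPES = {4}
RECRUITER_TYPES = {5}


def split_interactions(interactions):
    """Group (user, item, type) triples into implicit (user, item) pair sets."""
    groups = {"positive": set(), "negative": set(), "recruiter": set()}
    for user, item, itype in interactions:
        if itype in POSITIVE_TYPES:
            groups["positive"].add((user, item))
        elif itype in NEGATIVE_TYPES:
            groups["negative"].add((user, item))
        elif itype in RECRUITER_TYPES:
            groups["recruiter"].add((user, item))
        # Any other code (for example an impression) is ignored in this sketch.
    return groups


# Made-up sample data for illustration only.
sample = [("u1", "i1", 1), ("u1", "i1", 3), ("u2", "i1", 4),
          ("u3", "i2", 5), ("u3", "i3", 0)]
groups = split_interactions(sample)
```

Storing pairs in sets drops repeat counts, which is one simple way to read "implicit": a pair is either present or absent.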
Local Validation
● Split the dataset into train and validation sets.
● Random sampling procedure:
○ randomly select target items from dataset;
○ remove all interactions with these items;
○ pick target users as a subset of those who have
interactions with these items.
● Preserve the user-item ratio.
● No cross-validation: too much data.
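The sampling procedure above might look roughly like the following sketch. The function name `cold_start_split`, its parameters and the toy data are assumptions. In particular, preserving the user-item ratio is left to the caller, who chooses `n_target_items` and `n_target_users` accordingly.

```python
import random


def cold_start_split(interactions, n_target_items, n_target_users, seed=0):
    """Hold out every interaction of randomly chosen items (hypothetical helper)."""
    rng = random.Random(seed)
    items = sorted({item for _, item in interactions})
    # Step 1: randomly select target items.
    target_items = set(rng.sample(items, n_target_items))
    # Step 2: remove all interactions with these items from the training set.
    train = [(u, i) for u, i in interactions if i not in target_items]
    held_out = [(u, i) for u, i in interactions if i in target_items]
    # Step 3: target users are a subset of those who interacted with the target items.
    candidates = sorted({u for u, _ in held_out})
    target_users = set(rng.sample(candidates, min(n_target_users, len(candidates))))
    validation = [(u, i) for u, i in held_out if u in target_users]
    return train, validation, target_items, target_users


# Made-up (user, item) pairs for illustration only.
interactions = [("u1", "i1"), ("u1", "i2"), ("u2", "i2"), ("u2", "i3"),
                ("u3", "i3"), ("u3", "i4"), ("u4", "i1"), ("u4", "i4")]
train, validation, target_items, target_users = cold_start_split(
    interactions, n_target_items=2, n_target_users=2)
```

Because the target items have no interactions left in `train`, they behave like cold-start items during local evaluation.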
Solution - Preprocessing
● One-hot encoding of both user and item features.
● Feature aggregation:
● TF-IDF application.
● Negative User Filtering: removing heavy deleters.
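A minimal sketch of the first and third preprocessing steps, one-hot encoding followed by TF-IDF weighting, is given below. The helper names, the feature strings and the IDF variant `log(N / df)` are assumptions: the slides do not say which TF-IDF formula was used. Dense lists are used for readability, whereas a real implementation would likely use sparse matrices.

```python
import math


def one_hot(feature_sets):
    """Turn {key: set of categorical features} into 0/1 rows over a shared vocabulary."""
    vocab = sorted({f for feats in feature_sets.values() for f in feats})
    index = {f: k for k, f in enumerate(vocab)}
    rows = {}
    for key, feats in feature_sets.items():
        row = [0.0] * len(vocab)
        for f in feats:
            row[index[f]] = 1.0
        rows[key] = row
    return vocab, rows


def tf_idf(rows):
    """Scale each column by log(N / document frequency); this variant is an assumption."""
    n = len(rows)
    n_cols = len(next(iter(rows.values())))
    df = [sum(1 for r in rows.values() if r[c] > 0) for c in range(n_cols)]
    idf = [math.log(n / d) if d else 0.0 for d in df]
    return {k: [v * w for v, w in zip(r, idf)] for k, r in rows.items()}


# Made-up item features for illustration only.
items = {"i1": {"tag:python", "tag:sql"},
         "i2": {"tag:python"},
         "i3": {"tag:python", "tag:java"}}
vocab, rows = one_hot(items)
weighted = tf_idf(rows)
```

In this toy example `tag:python` appears in every item, so its weight drops to zero, while the rarer tags keep a weight of `log(3)`.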
Solution Overview
Solution - Negative Recommendation
● The scoring function heavily penalizes negative (type 4) interactions.
● Use the CBF approach to predict type 4 interactions.
● Ensemble these predictions with a negative weight.
Solution – Content Based Filtering algorithms (CBF)
Recommend to a user items similar to the ones he/she likes.
● Run separately on positive (CBF+) and negative (CBF-)
interactions.
● Tanimoto similarity between items:
● Recommendation performed for filtered users only:
● Penalize heavy clickers.
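The formula for the similarity is not preserved in this transcript. As a hedged illustration, the sketch below uses the common vector form of the Tanimoto coefficient, `a·b / (|a|² + |b|² - a·b)`, and scores a candidate item by summing its similarity to the items a user already interacted with. The function names, the summation rule and the toy vectors are assumptions, and the heavy-clicker penalty is not modeled.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return dot / denom if denom else 0.0


def cbf_scores(user_items, item_features, candidates):
    """Score each candidate by its total similarity to the user's items (assumed rule)."""
    return {
        cand: sum(tanimoto(item_features[cand], item_features[i]) for i in user_items)
        for cand in candidates
    }


# Made-up binary feature vectors for illustration only.
features = {
    "seen1": [1, 1, 0, 0],
    "seen2": [0, 1, 1, 0],
    "new_a": [1, 1, 1, 0],
    "new_b": [0, 0, 0, 1],
}
scores = cbf_scores(["seen1", "seen2"], features, ["new_a", "new_b"])
```

Running the same routine on positive interactions would give a CBF+ style score, and on type 4 interactions a CBF- style score.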
Solution – Profile Matching (PM)
Recommend to a user items matching his/her profile.
● Cosine similarity between user and item:
● Items’ tags and titles compared with users’ jobroles.
● Recommendation performed for filtered users only.
● Unlike CBF, PM can also recommend to cold-start
users.
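Profile matching can be illustrated with a small cosine-similarity ranking. This is a sketch under assumptions: the user vector stands for encoded jobroles and the item vectors for encoded tags and titles over the same vocabulary, and the names `cosine` and `profile_match` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def profile_match(user_vec, item_vecs, top_n):
    """Rank items by similarity to the user profile; ties are broken by item id."""
    ranked = sorted(item_vecs, key=lambda i: (-cosine(user_vec, item_vecs[i]), i))
    return ranked[:top_n]


# Made-up vectors over a shared vocabulary ["data", "engineer", "nurse"].
user_profile = [1, 1, 0]
item_profiles = {
    "job_a": [1, 1, 0],
    "job_b": [1, 0, 0],
    "job_c": [0, 0, 1],
}
ranking = profile_match(user_profile, item_profiles, top_n=2)
```

No interaction history is needed here, which is why this kind of matching still works for users without any recorded interactions.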
Solution – Collaborative Filtering algorithms
● CF cannot be run directly in a cold-start scenario.
● Content-based microclustering approach:
○ for each cold-start item associate the interactions of the
top 5 CBF-similar non-cold-start items;
○ run standard CF algorithms.
● CF algorithms:
○ CF with item cosine similarity;
○ iALS (Implicit Alternating Least Squares).
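The microclustering step can be sketched as follows: each cold-start item borrows the interactions of its most content-similar items that do have interactions. The slide specifies the top 5 neighbours. Everything else here is an assumption, including the helper names, the use of a plain union of users, and the Jaccard similarity standing in for the CBF similarity.

```python
def borrow_interactions(cold_items, warm_items, similarity, interactions, k=5):
    """Give each cold item the users of its k most similar warm items (assumed: union)."""
    augmented = {item: set(users) for item, users in interactions.items()}
    for cold in cold_items:
        neighbours = sorted(warm_items, key=lambda w: (-similarity(cold, w), w))[:k]
        borrowed = set()
        for warm in neighbours:
            borrowed |= interactions.get(warm, set())
        augmented[cold] = borrowed
    return augmented


# Made-up tags and interactions for illustration only.
tags = {"cold": {"python", "ml"}, "w1": {"python", "ml", "sql"},
        "w2": {"python"}, "w3": {"nursing"}}
item_users = {"w1": {"u1", "u2"}, "w2": {"u2", "u3"}, "w3": {"u4"}}


def jaccard(a, b):
    """Set-overlap similarity used here as a stand-in content similarity."""
    return len(tags[a] & tags[b]) / len(tags[a] | tags[b])


# k=2 keeps the toy example small; the slide uses the top 5.
augmented = borrow_interactions(["cold"], ["w1", "w2", "w3"], jaccard, item_users, k=2)
```

After this step the cold item has a non-empty interaction profile, so a standard item-based CF or iALS model can be trained on `augmented` as usual.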
Solution - Ensemble Structure
● Divide algorithms by nature.
● Normalize and weight each
layer.
● Generate upper layers by
adding lower layers.
● Output 100 best scores.
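One layer of such an ensemble might be sketched as below: normalize each algorithm's scores, weight them, add them up, and keep the best items per user. The max-absolute-value normalization, the weights and the score values are assumptions, and the multi-layer structure from the slide is not reproduced. A negative weight shows how predictions of negative interactions can pull an item down.

```python
def normalize(scores):
    """Divide by the largest absolute score (assumed normalization)."""
    peak = max((abs(v) for v in scores.values()), default=0.0)
    return {k: v / peak for k, v in scores.items()} if peak else dict(scores)


def ensemble(score_maps, weights, top_n=100):
    """Weighted sum of normalized per-algorithm scores, then a top-N cut."""
    combined = {}
    for name, scores in score_maps.items():
        for item, value in normalize(scores).items():
            combined[item] = combined.get(item, 0.0) + weights[name] * value
    ranked = sorted(combined, key=lambda i: (-combined[i], i))
    return ranked[:top_n], combined


# Made-up scores for one user; algorithm names follow the slides, values do not.
score_maps = {
    "cbf_pos": {"a": 4.0, "b": 2.0},
    "cbf_neg": {"b": 1.0, "c": 0.5},
    "pm": {"c": 3.0, "a": 1.5},
}
weights = {"cbf_pos": 1.0, "cbf_neg": -0.5, "pm": 0.5}
ranked, combined = ensemble(score_maps, weights, top_n=100)
```

Normalizing before weighting keeps algorithms with very different score ranges comparable, so the weights express relative trust rather than compensating for scale.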
Solution - Parameter Tuning
● Ensemble tuning:
○ 9 weights (one for each block), reduced to 6 due to
normalization;
○ non-differentiable scoring function;
○ gradient-free optimization methods:
■ Genetic Algorithms - quick and acceptable results;
■ Powell’s Conjugate Direction method - slower but
superior results.
● Individual algorithms tuning:
○ greedy search on local test.
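To illustrate what gradient-free tuning of ensemble weights means, the sketch below uses a simple coordinate search with a shrinking step. It is a stand-in chosen for brevity: it is neither a genetic algorithm nor Powell's method, and the toy objective replaces the real challenge score. For Powell's method itself, SciPy exposes it through `scipy.optimize.minimize(..., method="Powell")`.

```python
def coordinate_search(objective, start, step=0.5, min_step=0.01):
    """Maximize objective by trying +/- step on one weight at a time (no gradients)."""
    best = list(start)
    best_value = objective(best)
    while step >= min_step:
        improved = False
        for k in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[k] += delta
                value = objective(trial)
                if value > best_value:
                    best, best_value, improved = trial, value, True
        if not improved:
            step /= 2  # No move helped at this scale, so refine the step.
    return best, best_value


def toy_score(w):
    """Hypothetical score with a kink at its optimum (1.0, -0.5); not the real metric."""
    return -(abs(w[0] - 1.0) + abs(w[1] + 0.5))


best_weights, best_score = coordinate_search(toy_score, start=[0.0, 0.0])
```

The search only ever compares objective values, which is what makes this family of methods usable when the score depends on rankings and has no useful derivative.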
Online - Changes to ensemble
● Normalization type.
● Cutting for each user
before items.
● Excluding slower
algorithms - prompt push
gives more exposure →
better scores.
Architecture & Runtime
● The recommender runs on VMs with 8 cores and 16GB RAM.
● The only exceptions are content-based microclustering and iALS,
which run on 8 cores with 64GB RAM.
● Code is heavily optimized for low memory use
(sparse matrix representations, efficient matrix operations).
● Results in short runtimes.
Scores - Local vs Offline
Algorithm     | Local score | Leaderboard score | Execution time
CBF+          | 57852       | 60257             | 13 min
CBF-          | -1330       | -8529             | 4 min
PM            | 17260       | 16777             | 7 min
CF            | 42213       | 39250             | 12 min
iALS          | 48081       | 52411             | 150 min
XING Baseline | 14742       | 14395             | 40 min
Ensemble      | 60625       | 71372             | 2 min
Results and Conclusions
● 2nd place in the online phase;
● 1st place in the offline phase.
● Points of strength:
○ speed (in particular offline ~20 min);
○ ease of implementation.
● Extensions:
○ feature weighting (user personalized, feature interaction);
○ time decay models.
