FD recommendation system
Dung Chu, data scientist
FD Mediagroep
Outline
• Business cases analysis and recommendation engine roadmap
• Personalized email campaigns
• A/B testing results
• Upcoming work
About me
• Dung Chu
• MSc in cloud computing @UvA
• Research in image processing group ISLA @UvA: https://ivi.fnwi.uva.nl/isis/index.html
• Data scientist @FD Mediagroep
FD Mediagroep
Het Financieele Dagblad
Het Financieele Dagblad is the source of news and inspiration that, at any moment of the day, gives financial and economic meaning to developments in the world.
Company info
Company info always gives real-time access to current company and prospect information on 2.5 million companies in the Netherlands.
FD/BNR Networks
FD/BNR Networks brings together ideas, opinions, and talented entrepreneurs and professionals by organizing, among other things, forums, debates, networking sessions, and event series.
BNR Nieuwsradio
BNR Nieuwsradio is the only radio station in the Netherlands that keeps ambitious and enterprising people informed of relevant news 24 hours a day.
Pensioen Pro
The joint venture between FD Pensioen Pro and IPN delivers reliable news, background, and analysis for the pension professional.
Redactie Partners
Produces custom media products for both internal and external communication needs.
Fondsnieuws
The largest journalistic platform for investment professionals in the Netherlands.
Energeia
Provides news, data, and opinion for energy professionals.
FD.nl
Publishing rate
[Chart: number of articles, x-axis from 1 to 23]
Per week: ~500 articles
Digital-first
FD traffic
[Chart: FD traffic, February 2017, daily (2/1/17 to 2/28/17); series: Users, Pageviews]
Business cases analysis
• Improve customer retention through direct digital activation with increased relevance
– Online traffic
– Email traffic
• Improve conversion via more digital engagement
– Convert unregistered to registered customers
• Increase ad sales via increased traffic
[Figure labels: All users; Logged-in users]
Business cases analysis
• Support journalists in understanding customer reading tastes:
– Provide more article insights to journalists
Choice: internal development of recommendation systems*
* We did try a pilot phase with a third party
Where to start
• Online recommendation on the website fd.nl
• Email newsletters
Current email newsletters: popularity sorting
Personalized email newsletters
• Target audience: registered customers without subscriptions
– Free customers with 5 free articles
– We wanted to improve conversion rates
Personalization: collaborative filtering
• Popularity sorting: no relation between articles
• Relation between articles A and B: if someone is reading A, what is the probability that he/she also finds B interesting?
• Relation:
– Content-based relation
– Relation based on collaborative reading behavior
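As a rough illustration of a relation based on reading behavior, the sketch below computes cosine similarity between articles from a tiny reading log. The reader ids, article ids, and function names are hypothetical and not taken from the slides; it only shows the idea that two articles are related when the same people read both.

```python
import math
from itertools import combinations

# Hypothetical reading log: reader id -> set of article ids that reader opened.
reads = {
    "r1": {"A", "B"},
    "r2": {"A", "B", "C"},
    "r3": {"A"},
    "r4": {"C"},
}

def article_readers(reads):
    """Invert the log: article id -> set of readers who opened it."""
    readers = {}
    for reader, articles in reads.items():
        for article in articles:
            readers.setdefault(article, set()).add(reader)
    return readers

def cosine_similarities(reads):
    """Cosine similarity between the binary reader columns of each article pair."""
    readers = article_readers(reads)
    sims = {}
    for a, b in combinations(sorted(readers), 2):
        shared = len(readers[a] & readers[b])
        sims[(a, b)] = shared / math.sqrt(len(readers[a]) * len(readers[b]))
    return sims

sims = cosine_similarities(reads)
for pair, score in sims.items():
    print(pair, round(score, 3))
```

In this toy log, A and B share two of their readers, so they score higher than A and C, which share only one.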
Architecture
Pipeline: Collect data → Article distance matrix → Recommend
Components: Article DB, Reader-article matrix, Email campaign management, Client list, Emailing system
Article distance matrix (PySpark):
+ Convert reader-article matrix to IndexedRowMatrix
+ columnSimilarities(): fast computation of cosine similarity between columns
Recommend:
+ Input: klant_nr
+ Infer list of articles read from reader-article matrix
+ Recommended articles: articles with the largest sum of distances to the articles read by the user
Diagram labels: Published last week, Older articles, 28 days, 7 days
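The recommend step can be sketched in plain Python as follows. This is a minimal illustration, not FD's PySpark implementation: the similarity scores, customer numbers, and the `recommend` helper are invented for the example, and it assumes the "distances" in the matrix are similarity scores where larger means more related.

```python
# Hypothetical symmetric article-to-article similarity scores (e.g. cosine).
similarity = {
    ("A", "B"): 0.82,
    ("A", "C"): 0.41,
    ("A", "D"): 0.10,
    ("B", "C"): 0.50,
    ("B", "D"): 0.05,
    ("C", "D"): 0.60,
}

# Hypothetical reader-article data keyed by customer number (klant_nr).
articles_read = {1001: {"A", "B"}, 1002: {"D"}}

def sim(a, b):
    """Look up a pair in either order; unknown pairs count as 0."""
    return similarity.get((a, b), similarity.get((b, a), 0.0))

def recommend(klant_nr, candidates, top_n=2):
    """Rank unread candidates by their summed similarity to the articles read."""
    read = articles_read.get(klant_nr, set())
    scores = {
        c: sum(sim(c, r) for r in read)
        for c in candidates
        if c not in read
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_n]

print(recommend(1001, ["A", "B", "C", "D"]))  # ['C', 'D']
print(recommend(1002, ["A", "B", "C", "D"]))  # ['C', 'A']
```

Articles the customer has already read are excluded from the candidates, so the email only contains articles that are new to that reader.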
Recommendation step
Item-to-item collaborative filtering
• Always sort articles based on the cosine metric
• Cold-start problem: articles that were read only a few times
– Evaluate dot-product metric
– Use popularity version
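A small sketch of why rarely read articles are a problem for the cosine metric, and how the two options above might look. The counts and the `MIN_READS` threshold are made up for illustration, and treating "use popularity version" as a fallback below a read threshold is an assumption, not something the slides specify.

```python
import math

def cosine(shared, n_a, n_b):
    """Cosine similarity of two binary read vectors from their counts."""
    return shared / math.sqrt(n_a * n_b)

def dot_product(shared):
    """Dot product of two binary read vectors: the number of shared readers."""
    return shared

# Hypothetical counts: (readers of first, readers of second, shared readers).
rare_pair = (1, 1, 1)        # two articles, each read once by the same person
popular_pair = (50, 50, 40)  # two well-read articles with a large overlap

for name, (n_a, n_b, shared) in [("rare", rare_pair), ("popular", popular_pair)]:
    print(name, cosine(shared, n_a, n_b), dot_product(shared))

MIN_READS = 5  # hypothetical threshold, not a value from the slides

def choose_ranking(read_count):
    """Fall back to popularity sorting when an article has too few reads."""
    return "collaborative" if read_count >= MIN_READS else "popularity"
```

With cosine, the pair read once by a single person gets a perfect score of 1.0 and outranks the well-supported pair at 0.8. The dot product reverses that order (1 versus 40) because it counts shared readers without normalizing.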
Personalized emails
A/B testing
• A setting (“combi”): recommended articles using item-to-item collaborative filtering
• B setting (“popularity”): recommended articles using article popularity scores
Results: free clients
T-test
Results: paid clients (fixed 2000)
T-test
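The slides do not say which t-test variant was used or show the underlying numbers. As one possible reading, the sketch below computes Welch's two-sample t statistic for clicks per recipient in the two settings. The click counts are made up for illustration and are not FD's results; `welch_t` is a hypothetical helper.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a, var_b = variance(sample_a), variance(sample_b)  # sample variances
    se_a, se_b = var_a / n_a, var_b / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df

# Made-up clicks per recipient, for illustration only (not FD's results).
combi = [1, 0, 2, 1, 3, 0, 1, 2]
popularity = [0, 1, 0, 1, 0, 0, 1, 0]

t, df = welch_t(combi, popularity)
print(f"t = {t:.3f}, df = {df:.1f}")
```

A p-value would then come from the t distribution with `df` degrees of freedom, for example through a statistics library. Welch's variant is chosen here only because it does not assume equal variances in the two groups.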
Number of articles sent – 1 week
Popularity emails: 6 articles
Personalized emails: 178 articles (60 clicked)
Long tail articles
Tooling
Future work
Take-home messages
• A simple recommendation model (with existing tools) works
• FD Mediagroep has started with AI, and we are doing more and more on this journey.
Twitter: @dungchu
Email: dung.manh.chu@fdmediagroep.nl
LinkedIn: https://www.linkedin.com/in/dungmanhchu/

FD recommendation engine in personalized newsletters