Optimization of Digital Marketing Campaigns
Armando Vieira, Inesting
Abstract
In this work we apply several clustering, visualization and predictive machine learning techniques to analyse data from digital marketing campaigns. For data exploration we used unsupervised techniques such as k-means, Principal Component Analysis (PCA), Multidimensional Scaling (MDS) and Self-Organizing Maps (SOM). We identified patterns that help the analyst understand the vast amount of data produced by digital trails and guide their actions (actionable insights). Support Vector Machines and the Random Forest algorithm were used for supervised learning of conversion prediction.
Keywords: ad optimization, Adwords, predictive analytics, SEO, digital marketing
1 Introduction
Online advertising has evolved into a $50 billion industry and continues to grow at double-digit rates. At the same time, powerful web analytics tools, such as Google Analytics, Facebook Insights or Kissmetrics, make key data easily available to anyone who wants to monitor the performance of their online campaigns. For e-commerce sites, the analyst can track every single action of the visitor along the conversion path and answer the fundamental questions (who, what, why, how and when) from lead to purchase. Our interest lies in monitoring the impact campaigns have on website traffic, engagement and revenue (in the case of e-commerce).
A principal form of online advertising is the promotion of products and services through search-based advertising. Today's most popular search-based advertising platform is Google Adwords, which has the largest share of revenues. Search remains the largest online advertising revenue format, accounting for 46.5% of 2011 advertising revenues, up from 44.8% in 2010. In 2011, search revenues totalled $14.8 billion, up almost 27% from $11.7 billion in 2010.
This gives unprecedented power to the marketing team, but at a cost: huge amounts of unstructured, disparate and complex data have to be processed and many parameters adjusted. The effort required to deal with the number of options and configurations for optimal performance of a company website is simply far beyond human capabilities. Furthermore, some parameters have non-linear interactions: for instance, the quality of the SEO boosts the position of the ad in Adwords campaigns, thus achieving better performance for a lower PPC. The budget allocated to the campaign also influences the ad position. There are even subtler influences and nuances when measuring the ROI. For instance, it is known that although display advertising brings very few direct sales, it may boost the performance of search ads since users were previously exposed to the product or brand.
To optimize this myriad of parameters we need to rely on machine learning algorithms to extract actionable insights and answer some simple questions: How can I improve my return on investment (ROI)? How can I boost customer engagement? Which products generate the most interest? What catalyses sales? Which strategy should I adopt? Which channels should I choose? How much should I invest, when and how? These are very important questions with no single clear answer. Most of them depend on the case at hand, and some are too vague to be answered. Under these circumstances, the safe strategy is to start by carefully designing an ad, selecting adequate keywords, setting the bids, segmenting the campaign properly and testing continuously for fine-tuning. If results are not as expected, then look at the data, learn, make corrections, and repeat the cycle.
Most of the research has focused on the publisher side, trying to devise strategies to maximize the CTR of ads by means of content contextualization and ad personalization, among others [**]. In this work, however, we take the perspective of the advertiser and explore the potential of machine learning tools for prediction and optimization of the marketing strategy. The objective is to maximize the performance and effectiveness of marketing campaigns, namely the Return On Investment (ROI). We propose a system to extract information from Google Analytics and determine the most important variables for optimization.
The article is organized as follows. In Section 2 we introduce the data and pre-processing. In Section 3 we explore the data and extract relevant features using clustering and visualization algorithms such as k-means, PCA, MDS and SOM. In Section 4 we introduce the supervised learning, where we predict conversions, revenues and user engagement. Finally, in Section 5 some conclusions are drawn.
2 Data
2.1 Data Extraction and Description
Data was collected from a customer running campaigns on an e-commerce site with Adwords, Facebook and email marketing. The data, collected daily over a period of 6 months, is described in Table 1. Our main data sources are Google Analytics (GA), which aggregates data from Google Adwords, and Facebook Insights. We focused on inputs that may give us access to insights, namely correlations between conversions and site usage or Adwords campaigns.
We used the package RGoogleAnalytics (RGA) to extract data from Google Analytics into R. We collected data from Adwords, Facebook and email campaigns (Table 1). Data was collected over different timeframes and consolidated by date. In some cases, data was broken down by traffic source in GA and by group segment, as in the case of Adwords, so each data point corresponds to a specific segment on a specific day. Two data sets were built: Data 1, with Adwords data only, and Data 2, with Analytics, Facebook and email data.
Table 1: Variables used for analysis. The colour fields are data from campaigns. Type: M = metric, D = dimension.
Group      Variable                   Name   Type   Comments
General    Traffic source             TO     D      Organic, Email, Adwords, Facebook, Others
General    Visit length               VL     M
General    Number of visits           NV     M
General    Bounce rate                BR     M
General    Pages per visit            PV     M
AdWords    Ad/campaign group          CG     D      Group of Ad
AdWords    Cost per Click             CPC    M
AdWords    Position                   P      M
AdWords    Type                       T      D      Search, display
AdWords    Click Through Rate         CTR    M
AdWords    Conversion Rate            CRA    M
AdWords    Impressions                Imp    M
Facebook   Click through rate         CTRf   M
Facebook   Cost per like              CPL    M
Facebook   Conversion Rate Facebook   CRF    M
Emails     Emails Sent                Em     M
Emails     Open Rate                  OR     M
Emails     Click Rate                 CT     M
Emails     Conversion Rate email      CRE    M
-          Total revenue              Re     M      Revenue from sales
2.2 Performance Ratios
For visualization purposes, we consider several aggregated metrics to benchmark the performance of a website and of the digital campaigns. We divide the metrics into two major categories: website usability and financial performance. All indexes are defined to have values between 0 and 1. A site can be highly engaging…
Website usability metrics
We define engagement as a composite index, following [8]:
$$ E = \sum_i \left[ Cd_i + Dd_i + Id_i + (1 - Br_i) \right] $$
where Br is the bounce rate and the other indices are defined below. The sum runs over any aggregation dimension we may be interested in. The coefficients are obtained from sessions originating from a particular dimension: visitor id, traffic source, time, etc. This index has the advantage of benchmarking both the quality of the site and the interaction of users with the content.
The Click Depth index (Cd) measures the depth of visits and is defined as:
$$ Cd = \frac{\text{Sessions with at least 4 page views}}{\text{All sessions}} $$
The Duration Depth index (Dd) measures the intensity of the visits, captured by the duration of visits on the website. It is defined as:
$$ Dd = \frac{\text{Sessions with a duration of at least 3 min}}{\text{All sessions}} $$
The Interaction Depth index (Id) captures the visitor's interaction with content or functionality designed to increase the level of attention. It is defined as:
$$ Id = \frac{\text{Sessions where visitors complete an action}}{\text{All sessions}} $$
where an action can be defined as a goal in GA, from downloading a document to filling in a form or watching a video.
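The usability indices above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Session` record, its field names and the sample values are hypothetical, and a bounce is assumed to be a single-page session, which the text does not define.

```python
from dataclasses import dataclass


@dataclass
class Session:
    page_views: int
    duration_s: float        # visit duration in seconds
    completed_action: bool   # e.g. a GA goal was reached (hypothetical field)


def engagement_terms(sessions):
    """Return (Cd, Dd, Id, Br) for one group of sessions."""
    n = len(sessions)
    if n == 0:
        raise ValueError("need at least one session")
    cd = sum(s.page_views >= 4 for s in sessions) / n      # at least 4 page views
    dd = sum(s.duration_s >= 180 for s in sessions) / n    # at least 3 minutes
    id_ = sum(s.completed_action for s in sessions) / n    # completed an action
    br = sum(s.page_views == 1 for s in sessions) / n      # assumption: bounce = 1 page
    return cd, dd, id_, br


def engagement(groups):
    """E = sum over groups i of Cd_i + Dd_i + Id_i + (1 - Br_i)."""
    total = 0.0
    for sessions in groups.values():
        cd, dd, id_, br = engagement_terms(sessions)
        total += cd + dd + id_ + (1.0 - br)
    return total


# Hypothetical sessions grouped by traffic source.
example = {
    "organic": [Session(5, 200, True), Session(1, 10, False)],
    "email": [Session(4, 100, False), Session(2, 400, True)],
}
```

Grouping by traffic source is just one choice; the same functions apply to any dimension (visitor id, day, ad group) used as the key of `groups`.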
Financial metrics
Engagement with a website is important, but the metrics that really matter, especially for e-commerce sites, are sales or leads. These are captured by financial ratios. There are dozens of financial ratios to measure the efficiency of a sales channel, but we will focus on the following:
• CR, Conversion Rate
• RPC, Revenue Per Channel
• ROI, Return On Investment
The CR is simply defined as:
$$ CR = \frac{\text{Sessions where visitors purchase a product}}{\text{All sessions}} $$
Typical CRs are low: 1% is considered very good for most sites, but it can be as low as 0.001%.
The Revenue Per Channel (RPC) is the total value earned by a sales channel over a fixed period of time.
The ROI of a channel is simply the ratio of revenue to the total investment made in this channel:
$$ ROI = \frac{RPC}{\text{Total cost}} $$
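As a small worked example of the financial ratios, the sketch below computes CR and ROI per channel. The channel names and all figures are made up for illustration and are not taken from the campaign data.

```python
def conversion_rate(purchase_sessions, all_sessions):
    """CR = sessions with a purchase / all sessions."""
    if all_sessions <= 0:
        raise ValueError("all_sessions must be positive")
    return purchase_sessions / all_sessions


def roi(rpc, total_cost):
    """ROI = revenue per channel / total cost of the channel."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return rpc / total_cost


# Hypothetical per-channel totals over a fixed period.
channels = {
    "adwords": {"sessions": 4000, "purchases": 8, "revenue": 300.0, "cost": 120.0},
    "email": {"sessions": 500, "purchases": 5, "revenue": 150.0, "cost": 10.0},
}

report = {
    name: {
        "CR": conversion_rate(c["purchases"], c["sessions"]),
        "ROI": roi(c["revenue"], c["cost"]),
    }
    for name, c in channels.items()
}
```

With these definitions an ROI above 1 means the channel earned more than it cost over the period.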
In Figure 1 we show the evolution of Engagement and ROI over time for the two main traffic sources.
Figure 1: Engagement over time (days), using a moving average.
In Figure 2 we plot the revenue per traffic source. The most important source of revenue was Facebook, while Google Organic ranks second and Adwords third. The most consistent channels are Direct traffic and the email newsletter.
Figure 2: Revenue distribution per channel (top 6).
3 Data Visualization with Unsupervised Techniques
In this section we use several techniques for data exploration and visualization in order to detect patterns and features hidden in high-dimensional data. We use unsupervised techniques, from simpler ones, like k-means, to more elaborate ones, like Self-Organizing Maps (SOM) and Multidimensional Scaling (MDS).
3.1 Adwords Data
We start by characterizing the data with the box plots in Figure 3, where the number of conversions, the CTR and the CR are displayed for all ad groups in our campaign. Three ad groups account for the majority of conversions (sales): groups 9, 10 and 11. The average CTR is almost constant for most of the groups (around 6%), but in some cases we do not have enough data to evaluate it accurately. The average position is 1.68 and the average CR is 0.2%, showing greater variability than the CTR.
Figure 3: Box plot of CTR (red), number of conversions (blue) and CR (green) for all Adwords groups.
In Figure 4 we plot the weekly revenues and costs over a period of 6 months of the Adwords campaign. Initially the campaign was not very efficient, since we ran a trial period to test and optimize its content, targeting and keywords. After week 6, a boost in investment also brought a more than proportional increase in sales.
Figure 4: Revenue and cost per week on Adwords campaigns.
Clustering
We then cluster the data using the k-means algorithm. K-means is one of the simplest and most widely used algorithms for unsupervised clustering. The only inputs are the number of clusters k and the metric used to calculate the distances between points. We tested the algorithm with two to five clusters using the Euclidean distance on the Adwords data. The best compromise between intra-cluster and inter-cluster distance was achieved at k = 3 clusters.
Results are presented in Figure 5, where we selected CTR and number of clicks as representative axes. The three patterns are very clear in this figure and the centroids are presented in Table 2. It can be seen that most conversions come from the green group, which corresponds to the greatest number of visits and clicks. The number of page visits is also a strong indicator of revenue. Figure ** shows the clustering on page views and visitors. CTR, CPC and position are almost the same for the three groups.
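For readers who want to see the mechanics of the clustering step, here is a minimal sketch of k-means (Lloyd's iterations) with the Euclidean distance. It is not the code used for the analysis; the function names, the seeded random initialisation and the stopping rule are assumptions made for the example.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain k-means on a list of equal-length tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: initialise on k random data points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # New centroid = mean of its members; an empty cluster keeps its old centroid.
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
    return centroids, labels


def within_cluster_ss(points, centroids, labels):
    """Sum of squared distances of each point to its assigned centroid."""
    return sum(math.dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))
```

Running `kmeans` for k = 2 to 5 and comparing `within_cluster_ss` across runs is one simple way to look for the compromise described above. Because the variables live on very different scales (clicks versus CTR), normalising the columns first is usually advisable.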
Figure 5: K-means algorithm with 3 clusters for data set 1.
Table 2: Centres of the 3 clusters obtained by k-means for the Adwords data set.
Cluster   Cost   Clicks   Imp.   Revenue   CTR (fraction)   CPC    Position
1         56.7   327      4739   85.1      0.07             0.14   1.79
2         81.7   474      6610   124.9     0.08             0.15   1.71
3         20.8   73       1194   14.1      0.06             0.17   1.30
In Figure 6 we plot the graph of correlations for the Adwords data set, obtained with the R function qgraph. There are strong correlations between **.
Figure 6: Correlations with qgraph.
3.2 PCA
Principal Component Analysis is one of the oldest and most widely used approaches to compress high-dimensional data into a small set of linear components. It has the disadvantage of being a linear model, but it is still very useful. In Figure 7 we plot the eigenvalues of the components in a two-dimensional plot. Two main principal components are clearly seen. Note that conversions are highly correlated with ad groups.
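To make the role of the eigenvalues concrete, the following sketch computes the covariance matrix of a small data set and extracts its leading principal component by power iteration. It is an illustrative stand-in, not the procedure used for Figure 7; the function names and the fixed start vector are assumptions.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(d)]
        for a in range(d)
    ]


def leading_component(cov, iters=500):
    """Largest eigenvalue and its unit eigenvector, found by power iteration.

    Assumption: the start vector (all ones) is not orthogonal to the
    leading eigenvector; a random start would avoid that corner case.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
    return eigenvalue, v


def explained_variance_ratio(cov, eigenvalue):
    """Share of the total variance (trace of cov) carried by one component."""
    return eigenvalue / sum(cov[i][i] for i in range(len(cov)))
```

A component whose explained-variance ratio is close to 1 summarises the data almost completely; two dominant components, as reported above, justify a two-dimensional plot.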
Figure 7: PCA for the Adwords data (left) and Google Analytics data (right).
3.3 SOM
The self-organizing map (SOM) is an unsupervised neural network proposed by Kohonen (Kohonen 2001) for visual cluster analysis. The neurons of the map are located on a regular grid embedded in a low-dimensional (usually 2 or 3) space and are associated with the cluster prototypes. During learning, the neurons compete with each other through the best-matching principle, i.e., the input is projected to the nearest neuron using a defined distance metric. The winning neuron and its neighbours on the map are adjusted towards the input in proportion to the neighbourhood distance; consequently, neighbouring neurons are likely to represent similar patterns of the input data space. Because it combines data clustering with spatialization through a topology-preserving projection, SOM is widely used in visual clustering applications.
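The competitive learning rule described above can be sketched as follows. This is a toy implementation under assumptions not stated in the text: a rectangular grid, a Gaussian neighbourhood, linearly decaying learning rate and radius, and inputs already scaled to the range 0 to 1.

```python
import math
import random


def best_matching_unit(weights, x):
    """Grid coordinate of the neuron whose prototype is closest to x."""
    return min(weights, key=lambda unit: math.dist(x, weights[unit]))


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a rows x cols map; returns {(row, col): prototype vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0           # assumed initial neighbourhood radius
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data:
            decay = 1.0 - step / total_steps
            lr = lr0 * decay                  # learning rate shrinks over time
            sigma = max(sigma0 * decay, 0.01)  # so does the neighbourhood
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                grid_dist = math.dist(unit, bmu)
                h = math.exp(-grid_dist ** 2 / (2.0 * sigma ** 2))
                for j in range(dim):
                    # Move the prototype towards the input, scaled by closeness to the winner.
                    w[j] += lr * h * (x[j] - w[j])
            step += 1
    return weights
```

A call such as `train_som(data, 6, 5)` would produce a 30-cell map; each record can then be placed on the grid with `best_matching_unit`.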
SOM is very appropriate for analysing the high-dimensional data of digital metrics. A range of research groups concentrate on the bankruptcy prediction problem, usually solved as a classification task that separates companies into distressed and healthy categories (binary) or into a number of predefined credit ratings (multi-class). SOM is used to determine the class through visual exploration (Merkevicius, Garsva & Simutis 2004). An enhanced version of Learning Vector Quantization (LVQ) can boost the prediction performance of a multi-layer perceptron neural network (Neves & Vieira 2006). In combination with independent component analysis for dimensionality reduction, LVQ has been employed to recognize distressed French companies (Chen & Vieira 2009).
Figure 8: SOM for data set 1 (Adwords campaigns) on a 6x5 = 30 cell space.
3.4 MDS
The SOM method presented previously involves the estimation of a conditional probability, which is computationally expensive and hard to extract. Here we test the Multidimensional Scaling (MDS) algorithm. MDS is a non-linear approach, mostly used for visualization, that captures the level of similarity between individual cases of a dataset. It is used to display the information contained in a distance matrix, evaluated according to some metric. The MDS algorithm places each object in an N-dimensional space such that the between-object distances are preserved as well as possible. Each object is then assigned coordinates in each of the N dimensions. The number of dimensions N of an MDS plot can exceed 2 and is specified a priori. Choosing N = 2 optimizes the object locations for a two-dimensional scatter plot (Figure 9).
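One way to quantify "preserved as well as possible" is a stress value that compares the original pairwise distances with those in the low-dimensional layout. The sketch below does not perform the MDS optimisation itself; it only scores a given embedding, using a normalisation (division by the sum of squared original distances) that is an assumption chosen for this example.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points, keyed by index pair."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }


def stress(original, embedded):
    """sqrt( sum (d_embedded - d_original)^2 / sum d_original^2 ); 0 means a perfect fit."""
    d_orig = pairwise_distances(original)
    d_emb = pairwise_distances(embedded)
    numerator = sum((d_emb[pair] - d_orig[pair]) ** 2 for pair in d_orig)
    denominator = sum(d ** 2 for d in d_orig.values())
    return math.sqrt(numerator / denominator)
```

A stress of 0 means the layout reproduces every distance exactly; larger values indicate that the two-dimensional picture distorts the original similarities.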
Figure 9: Aggregation by MDS on data set 2. Colours represent revenue levels (black = lowest, light blue = highest).
3.5 Heatmaps and ROI
We now investigate the return on investment (ROI) from the Adwords and Facebook campaigns. The Facebook campaign ran over the same period as the Adwords campaign, with a daily budget between 10 and 40 euros (Figure 10). The ROI is in general greater than 1, meaning that the campaign is producing good results. When we consider the global performance (sales originating from all channels), the ROI almost doubles, considering as cost only the investment in Adwords and Facebook.
Figure 10: ROI over time (days), using moving averages: (red) Adwords, (blue) Total.
We now plot the ROI for the paid channels. Email is number one, as expected, due to the small cost of promotion.
ROI and Eng for Data 1. **
Heat maps
Heat maps are a good visualization method for data exploration and causality explanation. In this case we use them to lay out conversions and engagement on a calendar to visually spot trends. We use the ggplot2 library to create a calendar heat map with data from 6 months. We plot engagement and visits as well as transactions on the calendar, so we get a perspective on how they interact along the timeline. In this case it is interesting to note that Tuesdays have a high number of visits, but Wednesday has been the day when most transactions occur. Visits increase towards the end of the year (shopping season) and then slow down towards the start of the year. Engagement has been improving over time.
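The data layout behind a calendar heat map is a week-by-weekday grid. The sketch below builds that grid and the per-weekday averages from a daily series; the plotting itself is left out, and the function names and the four sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def calendar_grid(daily):
    """Map (ISO year, ISO week) -> {weekday name: value}, one cell per day."""
    grid = defaultdict(dict)
    for day, value in daily.items():
        iso_year, iso_week, iso_weekday = day.isocalendar()
        grid[(iso_year, iso_week)][WEEKDAYS[iso_weekday - 1]] = value
    return dict(grid)


def weekday_means(daily):
    """Average value per weekday, useful for spotting the busiest day."""
    buckets = defaultdict(list)
    for day, value in daily.items():
        buckets[WEEKDAYS[day.weekday()]].append(value)
    return {name: sum(vals) / len(vals) for name, vals in buckets.items()}


# Hypothetical daily visit counts.
visits = {
    date(2024, 1, 1): 10,
    date(2024, 1, 2): 30,
    date(2024, 1, 8): 20,
    date(2024, 1, 9): 50,
}
```

Computing `weekday_means` separately for visits and for transactions gives a quick numeric check of weekday patterns such as the Tuesday and Wednesday effect noted above.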
Figure 11: Calendar heat map for visits (top) and revenue (bottom) over the last 6 months.
4 Supervised Learning for Revenue Prediction
In previous sections we explored patterns in the data without concern for causality between observations (unsupervised learning). In this section we go a step further and use supervised learning to make predictions based on past records. This is very important as it provides explanation, “the why” instead of “the what”, as we enter the field of predictive analytics. First we consider the problem from a broader perspective: can we predict the revenue from a certain channel by looking at the traffic data it generates? If so, with how much accuracy and confidence? How does the behaviour of a user who finalizes a purchase differ from that of other users? To answer these questions we run supervised algorithms trained with past data and perform a classification analysis.
As a first step, we enrich our data by extracting extra metrics, drilled down by 5 dimensions (time, traffic source, Adwords ad group, operating system, and city). The metrics used are: number of visits, average pages per visit, average visit duration, bounce rate, visit depth, CTR, page load time, social interaction and cost of ads on Adwords and Facebook. From these metrics we derive the additional performance ratios described in Section 2.2. As far as traffic sources are concerned, we selected only the top 10 performers. We consider a conversion to occur when at least one sale is concluded. All data is aggregated with daily granularity.
We run the algorithms as a classification task, trying to predict whether a given visit leads to a conversion in a given session. The data set contains 5680 sessions, of which 432 have conversions.
We used Support Vector Machines and the Random Forest algorithm since they can easily deal with categorical and continuous inputs, can be trained with relatively few examples, and are fairly robust to overfitting. Since many more visits lead to non-conversions than to conversions, we create a balanced data set by randomly eliminating entries that do not lead to conversions. We end up with 864 training examples. All data was normalized and the algorithm was tested using 10-fold cross-validation.
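The balancing and validation scheme can be sketched without any machine learning library. The code below is an illustration only: the function names are made up, labels are assumed to be 0 (no conversion) or 1 (conversion), and the random seed is arbitrary.

```python
import random


def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have the same size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = minority + rng.sample(majority, len(minority))
    rng.shuffle(keep)
    return [rows[i] for i in keep], [labels[i] for i in keep]


def kfold_indices(n, k=10, seed=0):
    """Yield (train_indices, test_indices) for k-fold cross-validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]
```

With 432 conversions among 5680 sessions, `undersample` returns 2 x 432 = 864 examples, matching the size quoted above; each of the 10 folds then serves once as the held-out test set.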
In Figure 13 we plot the ROC curve obtained over a period of 165 days. The AUC obtained was 0.84. For comparison, we used SVM and the AUC = **. This is a somewhat surprising result given the small set of inputs. In order to isolate the effect of traffic from Adwords, we ran the algorithm without traffic from this source. The results improved slightly.
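For reference, the AUC can be computed directly from classifier scores as the probability that a randomly chosen converting session is scored higher than a randomly chosen non-converting one. The sketch below uses that pairwise definition; the function name and the 0/1 label encoding are assumptions.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value such as 0.84 means a converting session outranks a non-converting one in 84% of pairs.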
Random Forest returns several measures of variable importance. The most reliable measure is based on the decrease in classification accuracy when the values of a variable in a node of a tree are randomly permuted, and this is the measure of variable importance we use. Table 3 presents the best discriminating indicators for predicting conversions, namely traffic origin and the number of visits (see also Figure 12).
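The idea behind permutation-based importance can be illustrated in a model-agnostic way: shuffle one input column, re-score the classifier, and record the drop in accuracy. The sketch below is a simplification, since a Random Forest computes its own version per tree on out-of-bag samples; the function names, the tuple row format and `n_repeats` are assumptions.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) equals the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Mean accuracy drop per feature when that feature's column is shuffled.

    rows must be a list of tuples; predict is any callable row -> label.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            permuted = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature the model ignores gets an importance of exactly zero, while shuffling an informative feature lowers accuracy; ranking features by this drop gives a list comparable in spirit to Table 3.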
Figure 12: Dispersion of inputs for data set 2.
Figure 13: ROC curve for the conversion prediction with Random Forest and SVM algorithms. FPR: false positive rate, TPR: true positive rate.
Table 3: Best performing conversion prediction indicators for the two datasets.
All variables      All without Adwords
Traffic source     Number of visits
Number of visits   Bounce rate
Bounce rate        Visit length
Visit length       Time on site
5 Conclusions
In this work we have used a set of machine learning techniques for data exploration and predictive analytics. We showed that exploratory tools can help to understand the dynamics of digital campaigns. We used the Random Forest algorithm (a collection of decision trees) and SVM to predict conversions with reasonable accuracy. The most important features are number of visits, origin of traffic and visit duration. Surprisingly, we found that CTR and CR have little influence as predictors of conversions.
6 References
1. Benjamin Edelman, Michael Ostrovsky, and Michael Schwarz. "Internet Advertising and the Generalized Second-Price Auction: Selling Billions of Dollars Worth of Keywords". American Economic Review 97(1), 2007, pp. 242-259.
2. P. Maille, E. Markakis, M. Naldi, G. D. Stamoulis, B. Tuffin. Sponsored Search Auctions: An Overview of Research with Emphasis on Game Theoretic Aspects. To appear in the Electronic Commerce Research journal (ECR).
3. Andrei Broder, Vanja Josifovski. Introduction to Computational Advertising. Course, Stanford University, California.
4. Anand Rajaraman and Jeffrey D. Ullman. Mining of Massive Datasets. Cambridge University Press, 2012. Chapter 8: Advertising on the Web.
5. James Shanahan. Digital Advertising and Marketing: A review of three generations. Tutorial at WWW 2012.
7. IAB's Internet Advertising Revenue Report. http://www.iab.net/AdRevenueReport
8. http://www.webanalyticsdemystified.com/downloads/Web_Analytics_Demystified_and_NextStage_Global_-_Measuring_the_Immeasurable_-_Visitor_Engagement.pdf