33. Example: Logistic Regression
val data = spark.textFile(...).map(readPoint).cache()
var w = Vector.random(D)
for (i <- 1 to ITERATIONS) {
  val gradient = data.map(p =>
    (1 / (1 + exp(-p.y*(w dot p.x))) - 1) * p.y * p.x
  ).reduce(_ + _)
  w -= gradient
}
println("Final w: " + w)
Source: Matei Zaharia (2013)
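The slide's loop depends on Spark and on names the slide does not define (`readPoint`, `D`, `ITERATIONS`, and Spark's own `Vector` helper). As a rough, Spark-free sketch of the same gradient step, the version below runs on an in-memory `Seq`. The `Point` case class, the `dot` helper, the toy data, the zero starting weights, and the iteration count are all assumptions made for illustration and are not part of the original example.

```scala
import scala.math.exp

object LocalLogisticRegression {
  // Hypothetical stand-in for the slide's data points: features x, label y in {-1, +1}.
  final case class Point(x: Vector[Double], y: Double)

  def dot(a: Vector[Double], b: Vector[Double]): Double =
    a.zip(b).map { case (u, v) => u * v }.sum

  // Sum over all points of the slide's per-point term:
  // (1 / (1 + exp(-y * (w dot x))) - 1) * y * x
  def gradient(data: Seq[Point], w: Vector[Double]): Vector[Double] =
    data
      .map { p =>
        val scale = (1.0 / (1.0 + exp(-p.y * dot(w, p.x))) - 1.0) * p.y
        p.x.map(_ * scale)
      }
      .reduce((a, b) => a.zip(b).map { case (u, v) => u + v })

  def train(data: Seq[Point], iterations: Int): Vector[Double] = {
    val dims = data.head.x.length
    // Assumption: start from zeros (not random weights) so runs are reproducible.
    var w = Vector.fill(dims)(0.0)
    for (_ <- 1 to iterations) {
      val g = gradient(data, w)
      w = w.zip(g).map { case (wi, gi) => wi - gi } // same update as `w -= gradient`
    }
    w
  }

  // Classify by the sign of the linear score.
  def predict(w: Vector[Double], x: Vector[Double]): Double =
    if (dot(w, x) >= 0) 1.0 else -1.0

  // Made-up, linearly separable toy data; the second feature acts as a bias term.
  val toyData: Seq[Point] = Seq(
    Point(Vector(2.0, 1.0), 1.0),
    Point(Vector(1.5, 1.0), 1.0),
    Point(Vector(-1.0, 1.0), -1.0),
    Point(Vector(-2.5, 1.0), -1.0)
  )

  def main(args: Array[String]): Unit = {
    val w = train(toyData, iterations = 50)
    println("Final w: " + w)
  }
}
```

In the Spark version, `data.map(...).reduce(_ + _)` computes the same sum across the cluster, and `cache()` keeps the parsed points in memory so each iteration does not re-read the input.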
34. Internet as a mass medium
“Half the money I spend on advertising is wasted; the trouble is I don't know which half.”
-- John Wanamaker, ~1875
35. Current Challenges
How to identify?
Find the "best match" between a given user in a given context and a suitable advertisement.
-- Dr. Andrei Broder and Dr. Vanja Josifovski, Stanford University
Limited info. Budget? Creative?... Bid price?
36. Channels play different roles in the customer journey
Source: http://www.thinkwithgoogle.com/
37. Advertiser Utility: The Value Funnel
CPM campaign: Revenue = N/1000 ⋅ CPM
CPC campaign: Revenue = N ⋅ CTR ⋅ CPC
CPA campaign: Revenue = N ⋅ CTR ⋅ CVR ⋅ CPA
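The three revenue formulas can be checked with a few lines of Scala. This is only a sketch: the object and function names are made up for illustration, and the sample figures in `main` (one million impressions, a 1% CTR, a 5% CVR, and the three prices) are arbitrary assumptions, not data from the talk.

```scala
object CampaignRevenue {
  // n = number of impressions (N in the formulas above).

  // CPM campaign: the advertiser pays per thousand impressions.
  def cpmRevenue(n: Long, cpm: Double): Double =
    n / 1000.0 * cpm

  // CPC campaign: only clicks are paid for, so impressions are scaled by CTR.
  def cpcRevenue(n: Long, ctr: Double, cpc: Double): Double =
    n * ctr * cpc

  // CPA campaign: only conversions are paid for, so clicks are scaled by CVR.
  def cpaRevenue(n: Long, ctr: Double, cvr: Double, cpa: Double): Double =
    n * ctr * cvr * cpa

  def main(args: Array[String]): Unit = {
    val n = 1000000L // assumed impression count
    println("CPM revenue: " + cpmRevenue(n, cpm = 2.0))
    println("CPC revenue: " + cpcRevenue(n, ctr = 0.01, cpc = 0.5))
    println("CPA revenue: " + cpaRevenue(n, ctr = 0.01, cvr = 0.05, cpa = 20.0))
  }
}
```

Each step down the funnel multiplies in one more estimated rate (CTR, then CVR), so the revenue figure depends more and more on how well those rates are predicted.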
38. How DSPs Track & Optimize Bidding
Audience Tracking
• Pixel/Beacon: landing, browse, shopping cart, conversion …
• Cookie in web (Cookie mapping)
• IDFA/AID in mobile
Audience Selection
• Feature engineering/Pre-generated tags, Look-alike, Re-targeting
• Privacy -> Campaign-based
• P(c|u)
Campaign Mgmt
• Campaign/Ad, TA, Creative…
• Base/upper-bound price, freq. cap…
Real-time bidding
• 100ms
• Winning probability function, Traffic forecasting
Impression
• Click, CTR
Conversion
• CVR, CPA
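To make the pipeline concrete, here is a minimal sketch of how the pieces might combine into a single bid. It is not the method described in the talk: it assumes one simple heuristic, bidding the expected value of the impression (an estimate of P(c|u) times the value of a click), clamped between the campaign's base and upper-bound price and skipped once the frequency cap is reached. The `Campaign` fields and the `bid` function are hypothetical names, and the winning-probability function and traffic forecast are left out.

```scala
object BidSketch {
  // Hypothetical campaign settings; field names are illustrative only.
  final case class Campaign(
    valuePerClick: Double,   // assumed worth of one click to the advertiser
    basePrice: Double,       // lowest bid the campaign will submit
    upperBoundPrice: Double, // highest bid the campaign allows
    freqCap: Int             // maximum impressions shown to one user
  )

  // Returns Some(bid) for one impression, or None when the user has
  // already seen the ad freqCap times.
  def bid(c: Campaign, pClick: Double, impressionsShownToUser: Int): Option[Double] =
    if (impressionsShownToUser >= c.freqCap) None
    else {
      // Expected value of the impression under the assumed heuristic.
      val expectedValue = pClick * c.valuePerClick
      // Clamp into [basePrice, upperBoundPrice].
      Some(math.min(c.upperBoundPrice, math.max(c.basePrice, expectedValue)))
    }

  def main(args: Array[String]): Unit = {
    val campaign = Campaign(valuePerClick = 2.0, basePrice = 0.01, upperBoundPrice = 0.05, freqCap = 3)
    println(bid(campaign, pClick = 0.01, impressionsShownToUser = 0))
    println(bid(campaign, pClick = 0.01, impressionsShownToUser = 3))
  }
}
```

Because the whole decision has to fit in the 100ms window, anything expensive (feature engineering, model training, forecasts) would normally be precomputed, leaving only a lookup and arithmetic of this kind at bid time.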
40. Traffic forecasting
• An impression on a Jeremy Lin BBS post on MiuPTT
• Two product ads
  • A: Linsanity T-Shirt
  • B: Basketball shoes
• Not optimal if the impression simply goes to the highest bid
  • B bids higher than A
  • Inventory for A is much scarcer than inventory for B
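The effect can be illustrated with a toy allocation. Everything here is an assumption for the sake of the example: two forecast impressions, one on the Jeremy Lin post (suitable for both ads) and one on a general sports post (suitable only for B), each ad wanting a single impression, and made-up bids in cents. The "scarcity-aware" rule, which prefers the eligible ad with the fewest forecast opportunities, is one simple heuristic and not necessarily the one the speaker had in mind.

```scala
object AllocationSketch {
  // Hypothetical types: bids in cents, demand = impressions the ad still wants.
  final case class Ad(name: String, bidCents: Int, demand: Int)
  final case class Impression(id: String, eligible: Set[String])

  // Walks the impressions in order and lets `pick` choose among the eligible
  // ads that still have demand; returns the total revenue in cents.
  private def run(ads: Seq[Ad], imps: Seq[Impression], pick: Seq[Ad] => Ad): Int = {
    val start = (ads.map(a => a.name -> a.demand).toMap, 0)
    val (_, revenue) = imps.foldLeft(start) { case ((remaining, total), imp) =>
      val candidates = ads.filter(a => imp.eligible(a.name) && remaining(a.name) > 0)
      if (candidates.isEmpty) (remaining, total)
      else {
        val winner = pick(candidates)
        (remaining.updated(winner.name, remaining(winner.name) - 1), total + winner.bidCents)
      }
    }
    revenue
  }

  // Baseline: every impression goes to the highest eligible bid.
  def highestBid(ads: Seq[Ad], imps: Seq[Impression]): Int =
    run(ads, imps, _.maxBy(_.bidCents))

  // Uses the forecast: prefer the ad with the fewest eligible impressions,
  // breaking ties by the higher bid.
  def scarcityAware(ads: Seq[Ad], imps: Seq[Impression]): Int = {
    val supply = ads.map(a => a.name -> imps.count(_.eligible(a.name))).toMap
    run(ads, imps, _.minBy(a => (supply(a.name), -a.bidCents)))
  }

  // Made-up scenario modeled on the slide.
  val ads: Seq[Ad] = Seq(Ad("A", bidCents = 100, demand = 1), Ad("B", bidCents = 150, demand = 1))
  val forecast: Seq[Impression] = Seq(
    Impression("jeremy-lin-post", eligible = Set("A", "B")),
    Impression("general-sports-post", eligible = Set("B"))
  )

  def main(args: Array[String]): Unit = {
    println("Highest bid only: " + highestBid(ads, forecast))
    println("Scarcity-aware:   " + scarcityAware(ads, forecast))
  }
}
```

In this scenario the highest-bid rule spends the only impression that suits A on B, while the forecast-aware rule places A first and still serves B on the second impression, which is why a traffic forecast matters to the bidding engine.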
47. A company that values only quantitative models will lose the ability to find potential growth opportunities: "The biggest problem with the pursuit of quantification is that it ignores the context in which people's behavior arises, detaches events from their setting, and overlooks the effect of variables that were not included in the model."
-- Roger Martin, Rotman School of Management, Toronto
48. 3R: Reach + Richness + Range
Big Data Economics
[Figure: three axes, each running from Low to High]
• Richness: data richness (the power source of behavioral forecasting)
• Reach: user reach (reach of UU)
• Range: user context (the audience's affiliation with the whole context)
50. Takeaway ~
• RTB, SSP, AdX, DSP
• Data Scientist as CEO of Data / Data Consultant
• Big Data Pricing Engine
  • Scalable Big data infrastructure
  • Spark, Kafka, Docker, HDFS, Couchbase, …
  • Bidding Strategy & Design of Pricing Engine
• Reach, Richness, Range
  • Reach: audience span, base of segmentation
  • Richness: relatedness (contribution) to conversion (target)
  • Range: affiliation with audience
    • Integrated, all media, full context engaging factors