SlideShare uma empresa Scribd logo
1 de 83
Baixar para ler offline
Federated Learning,
from Research to Practice
Brendan McMahan
Presenting the work of many
CMU 2019.09.05 g.co/federated
Federated learning
Enable machine learning engineers and data scientists
to work productively with decentralized data
with privacy by default
Outline
1. Why do we need federated learning?
2. What is federated learning?
3. Federated learning at Google
4. Federated learning beyond Google
5. Federated learning and privacy
6. Open problems
Why federated learning?
Data is born at the edge
Billions of phones & IoT devices constantly generate data
Data enables better products and smarter models
Data processing is moving on device:
● Improved latency
● Works offline
● Better battery life
● Privacy advantages
E.g., on-device inference for mobile
keyboards and cameras.
Can data live at the edge?
Data processing is moving on device:
● Improved latency
● Works offline
● Better battery life
● Privacy advantages
E.g., on-device inference for mobile
keyboards and cameras.
Can data live at the edge?
What about analytics?
What about learning?
2014: Three choices
Don’t use data to
improve products
and services
Log the data
centrally anyway
Invent a
new solution
Choose your poison 🕱
2014: Three choices
Don’t use data to
improve products
and services
Log the data
centrally anyway
Invent a
new solution
Choose your poison 🕱
2019: Good reason to hope
Don’t use data to
improve products
and services
Log the data
centrally anyway
Federated learning
and analytics
What is federated learning?
cloud
data
Train & evaluate
on cloud data
Model engineer
workflow
Final model
validation steps
Model
deployment
workflow
Model
deployment
workflow
Deploy model
to devices
for on-device
inference
Train & evaluate
on decentralized
data
Model engineer
workflow
opted-in
Federated learning
data device
Need me?
Federated learning
data device
Need me?
Not now
data device
Need me?
Yes!
Federated learning
initial model
Federated learning
data device updated
model
data device
Federated learning
initial model
(ephemeral)
updated
model
data device
Federated learning
initial model
updated
model
Privacy principle
Focused collection
Devices report only what is
needed for this computation
data device
Federated learning
initial model
updated
model
Privacy principle
Ephemeral reports
Server never persists
per-device reports
Stronger with SMPC
data device
combined
model
∑
Federated learning
initial model
updated
model
data device
combined
model
∑
Federated learning
initial model
updated
model
Privacy principle
Only-in-aggregate
Engineer may only
access combined
device reports
(another)
initial model
data device
combined
model
Federated learning
(another)
combined
model
(another)
initial model
∑
Federated learning
data device
data device
(another)
combined
model
(another)
initial model
∑
Typical orders-of-magnitude
100-1000s of users per round
100-1000s of rounds to convergence
1-10 minutes per round
Federated learning
Characteristics of federated learning
vs. traditional distributed learning
Data locality and distribution
● massively decentralized, naturally arising
(non-IID) partition
● Data is siloed, held by a small number of
coordinating entities
● system-controlled (e.g. shuffled, balanced)
Data availability
● limited availability, time-of-day variations
● almost all data nodes always available
Addressability
● data nodes are anonymous and interchangeable
● data nodes are addressable
Node statefulness
● stateless (generally no repeat computation)
● stateful
Node reliability
● unreliable (~10% failures)
● reliable
Wide-area communication pattern
● hub-and-spoke topology
● peer-to-peer topology (fully decentralized)
● none (centralized to one datacenter)
Distribution scale
● massively parallel (1e9 data nodes)
● single datacenter
Primary bottleneck
● communication
● computation
∑
∑
Constraints in the
federated setting
H. Eichner, et al. Semi-Cyclic Stochastic Gradient Descent. ICML 2019.
Each device reflects one users' data.
So no one device is representative of
the whole population.
Devices must idle, plugged-in, on wi-fi
to participate.
Device availability correlates with both
geo location and data distribution.
Round completion rate by hour (US)
Hour (over three days)
data device
(another)
combined
model
(another)
initial model
∑
Federated Averaging
Algorithm
Devices run multiple
steps of SGD on their
local data to compute
an update.
Server computes
overall update using
a simple weighted
average.
McMahan, et al. Communication-Efficient Learning of Deep
Networks from Decentralized Data. AISTATS 2017.
Server
Until Converged:
1. Select a random subset of clients
2. In parallel, send current parameters θt
to those clients
3. θt+1
= θt
+ data-weighted average of client updates
Selected Client k
1. Receive θt
from server.
2. Run some number of minibatch SGD steps,
producing θ'
3. Return θ'-θt
to server.
H. B. McMahan, et al.
Communication-Efficient Learning of
Deep Networks from Decentralized
Data. AISTATS 2017
The Federated Averaging algorithm
θt
θ'
Rounds to reach 10.5% Accuracy
23x decrease in
communication
rounds
H. B. McMahan, et al.
Communication-Efficient Learning of
Deep Networks from Decentralized
Data. AISTATS 2017
Large-scale LSTM for next-word prediction
Dataset: Large Social Network, 10m public posts, grouped by author.
Updates to reach 82%
49x
(IID and balanced data)
decrease in
communication
(updates) vs SGD
H. B. McMahan, et al.
Communication-Efficient Learning of
Deep Networks from Decentralized
Data. AISTATS 2017
CIFAR-10 convolutional model
Federated learning at Google
Federated learning at Google
0.5B installs
Daily use by multiple teams, dozens of ML engineers
Powering features in Gboard and on Pixel Devices
Federated Learning on Pixel Phones
Gboard: mobile keyboard
● Predict the next word based on typed text so far
● Powers the predictions strip
Gboard: language modeling
When should you consider federated learning?
● On-device data is more relevant than
server-side proxy data
● On-device data is privacy sensitive or large
● Labels can be inferred naturally from user
interaction
Gboard: language modeling
Federated model
compared to baseline A. Hard, et al. Federated Learning for
Mobile Keyboard Prediction.
arXiv:1811.03604
Gboard: language modeling
Federated RNN (compared to prior n-gram model):
● Better next-word prediction accuracy: +24%
● More useful prediction strip: +10% more clicks
Other federated models in Gboard
Emoji prediction
● 7% more accurate emoji predictions
● prediction strip clicks +4% more
● 11% more users share emojis!
Action prediction
When is it useful to suggest a gif, sticker, or
search query?
● 47% reduction in unhelpful suggestions
● increasing overall emoji, gif, and sticker
shares
Discovering new words
Federated discovery of what words people
are typing that Gboard doesn’t know.
T. Yang, et al. Applied Federated
Learning: Improving Google Keyboard
Query Suggestions. arXiv:1812.02903
M. Chen, et al. Federated Learning
Of Out-Of-Vocabulary Words.
arXiv:1903.10635
Ramaswamy, et al. Federated Learning
for Emoji Prediction in a Mobile
Keyboard. arXiv:1906.04329.
Google’s federated system
Towards Federated Learning at Scale: System Design
ai.google/research/pubs/pub47976
K. Bonawitz, et al. Towards
Federated Learning at Scale:
System Design. SysML 2019.
Federated learning beyond Google
Federated Learning Research
FL Workshop in Seattle 6/17-18
Federated learning workshops:
● NeurIPS 2019 (upcoming) Workshop on Federated
Learning for Data Privacy and Confidentiality
Submission deadline Sept. 9
● IJCAI 2019 Workshop on Federated Machine Learning for
User Privacy and Data Confidentiality (FML’19)
● Google-hosted June 2019
~40 faculty, 20 students, 40 Googlers
TensorFlow Federated
Experiment with federated technologies
in simulation
TensorFlow Federated
What’s in the box?
Federated Learning (FL) API
● Implementations of federated training/evaluation
● Can be immediately applied to existing TF models/data
Federated Core (FC) API
● Lower-level building blocks for expressing new federated algorithms
Local runtime for simulations
68.0
70.5
69.8
70.1
SERVER
69.5
Federated computation in TFF
the “placement”
type of local items
on each client
has type
68.0
70.5
69.8
70.1
SERVER
69.5
Federated computation in TFF
federated “op” can be interpreted as a function even
though its inputs and outputs are in different places
→
68.0
70.5
69.8
70.1
SERVER
69.5
Federated computation in TFF
federated “op” can be interpreted as a function even
though its inputs and outputs are in different places
→
it represents an abstract specification of
a distributed communication protocol
Federated computation in TFF
50
output: fraction of sensors
with readings > threshold
(an input)
(an input)
1
0
1`
>
to_float
>
>
THRESHOLD_TYPE = tff.FederatedType(tf.float32, tff.SERVER)
@tff.federated_computation(READINGS_TYPE, THRESHOLD_TYPE)
def get_fraction_over_threshold(readings, threshold):
@tff.tf_computation(tf.float32, tf.float32)
def _is_over_as_float(val, threshold):
return tf.to_float(val > threshold)
return tff.federated_mean(
tff.federated_map(_is_over_as_float, [
readings, tff.federated_broadcast(threshold)]))
For more TFF tutorials visit...
tensorflow.org/federated
Federated learning and privacy
ML on Sensitive Data: Privacy versus Utility
Privacy
Utility
ML on Sensitive Data: Privacy versus Utility
Privacy
Utility
perception
today
Privacy
Utility
1. Policy
2. Technology
ML on Sensitive Data: Privacy versus Utility
perception
Privacy
Utility
goal 1. Policy
2. New Technology
ML on Sensitive Data: Privacy versus Utility (?)
Push the pareto frontier
with better technology.
today
federated
training
model
deployment
Improving privacy
federated
training
What can the
network see?
What can the
server see?
What can the
device see?
Improving privacy
What can the
engineer see?
federated
training
Improving privacy
What can the server
admin see?
model
deployment
What can the
world see?
federated
training
Improving privacy
federated
training
model
deployment
What can the
world see?
What can the
network see?
What can the
server see?
What can the
device see?
What can the
engineer see?
Improving privacy
What can the server
admin see?
federated
training
model
deployment
What can the
world see?
What can the
network see?
What can the
server see?
What can the
device see?
What can the
engineer see?
Improving privacy
Ephemeral reports,
focused collection
Encryption at rest
and on the wire.
only-in-aggregate
release
What can the server
admin see?
federated
training
model
deployment
What can the
server see?
Improving privacy
Privacy Technology: Secure aggregation
Secure aggregation
Compute a (vector) sum of encrypted device reports
A practical protocol with
● Security guarantees
● Communication efficiency
● Dropout tolerance
K. Bonawitz, et al. Practical Secure Aggregation for
Privacy-Preserving Machine Learning. CCS 2017.
Secure
aggregation
Privacy Technology: Secure aggregation
Secure exchange of
data-masking pairs
= 0
+
K. Bonawitz, et al. Practical Secure Aggregation for
Privacy-Preserving Machine Learning. CCS 2017.
Secure
aggregation
Privacy Technology: Secure aggregation
All devices in a round
exchange pairs
K. Bonawitz, et al. Practical Secure Aggregation for
Privacy-Preserving Machine Learning. CCS 2017.
Secure
aggregation
Privacy Technology: Secure aggregation
Device report
+ +
+
+
+
+
Report vectors are
completely masked
by random values
K. Bonawitz, et al. Practical Secure Aggregation for
Privacy-Preserving Machine Learning. CCS 2017.
Secure
aggregation
Pairs cancel out when aggregated
together, revealing only the sum
Privacy Technology: Secure aggregation
+
+
+
+
+
+
+ +
+
+
+
+
K. Bonawitz, et al. Practical Secure Aggregation for
Privacy-Preserving Machine Learning. CCS 2017.
Secure
aggregation
Pairs cancel out when aggregated
together, revealing only the sum
Privacy Technology: Secure aggregation
+
+
+
+
+
+
+ +
+
+
+
+
K. Bonawitz, et al. Practical Secure Aggregation for
Privacy-Preserving Machine Learning. CCS 2017.
federated
training
model
deployment
What can the
world see?
What can the
engineer see?
Improving privacy
Gboard: language modeling
● Better next-word prediction accuracy: +24%
● More useful prediction strip: +10% more clicks
Gboard: language modeling
Can a language model be too
good?
Can a language model be too
good?
Gboard: language modeling
4012 8888 8888 1881
Brendan McMahan's credit
card number is ….
Can a language model be too
good?
Gboard: language modeling
4012 8888 8888 1881
Brendan McMahan's credit
card number is ….
Privacy Principle:
Don’t memorize
individuals’ data
Differential privacy can
provide formal bounds
Differential privacy
Dwork and Roth. The Algorithmic
Foundations of Differential Privacy. 2014.
Statistical science of learning common patterns in a
dataset without memorizing individual examples
Differential privacy
complements
federated learning
Use noise to obscure an
individual’s impact on the
learned model.
github.com/tensorflow/privacy
Privacy Principle #4:
Don’t memorize
individuals’ data
data device
updated
model
model
+
∑
1. Devices “clip” their updates, limiting any one
user's contribution
2. Server adds noise when combining updates
H. B. McMahan, et al. Learning Differentially Private
Recurrent Language Models. ICLR 2018.
Privacy Technology: Differentially Private Model Averaging
Open problems
Motivation from real-world constraints
Data locality and distribution
● massively decentralized, naturally arising
(non-IID) partition
● Data is siloed, held by a small number of
coordinating entities
● system-controlled (e.g. shuffled, balanced)
Data availability
● limited availability, time-of-day variations
● almost all data nodes always available
Addressability
● data nodes are anonymous and interchangeable
● data nodes are addressable
Node statefulness
● stateless (generally no repeat computation)
● stateful
Node reliability
● unreliable (~10% failures)
● reliable
Wide-area communication pattern
● hub-and-spoke topology
● peer-to-peer topology (fully decentralized)
● none (centralized to one datacenter)
Distribution scale
● massively parallel (1e9 data nodes)
● single datacenter
Primary bottleneck
● communication
● computation
Sample open problem areas
● Optimization algorithms for FL, particularly
communication-efficient algorithms tolerant
of non-IID data
● Approaches that scale FL to larger models,
including model and gradient compression
techniques
● Novel applications of FL, extension to new
learning algorithms and model classes.
● Theory for FL
● Enhancing the security and privacy of FL,
including cryptographic techniques and
differential privacy
● Bias and fairness in the FL setting (new
possibilities and new challenges)
● Attacks on FL including model poisoning,
and corresponding defenses
● Not everyone has to have the same model
(multi-task and pluralistic learning,
personalization, domain adaptation)
● Generative models, transfer learning,
semi-supervised learning
Finally ...
g.co/federated

Mais conteúdo relacionado

Semelhante a 2019-09-05Federated Learning.pdf

Federated learning and its role in the privacy preservation of IoT devices
Federated learning and its role in the privacy preservation of IoT devicesFederated learning and its role in the privacy preservation of IoT devices
Federated learning and its role in the privacy preservation of IoT devicesAlAtfat
 
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDistributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDatabricks
 
Hac IT 4. Emerging Technologies (1).pdf
Hac IT 4. Emerging Technologies  (1).pdfHac IT 4. Emerging Technologies  (1).pdf
Hac IT 4. Emerging Technologies (1).pdfAAFREEN SHAIKH
 
New Framework for Improving Bigdata Analaysis Using Mobile Agent
New Framework for Improving Bigdata Analaysis Using Mobile AgentNew Framework for Improving Bigdata Analaysis Using Mobile Agent
New Framework for Improving Bigdata Analaysis Using Mobile AgentMohammed Adam
 
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Dataconomy Media
 
Introduction to roof computing by Nishant Krishna
Introduction to roof computing by Nishant KrishnaIntroduction to roof computing by Nishant Krishna
Introduction to roof computing by Nishant KrishnaCodeOps Technologies LLP
 
Federated learning
Federated learningFederated learning
Federated learningMindos Cheng
 
IBM Data Centric Systems & OpenPOWER
IBM Data Centric Systems & OpenPOWERIBM Data Centric Systems & OpenPOWER
IBM Data Centric Systems & OpenPOWERinside-BigData.com
 
TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform Seldon
 
AI on Greenplum Using
 Apache MADlib and MADlib Flow - Greenplum Summit 2019
AI on Greenplum Using
 Apache MADlib and MADlib Flow - Greenplum Summit 2019AI on Greenplum Using
 Apache MADlib and MADlib Flow - Greenplum Summit 2019
AI on Greenplum Using
 Apache MADlib and MADlib Flow - Greenplum Summit 2019VMware Tanzu
 
Phoenix Data Conference - Big Data Analytics for IoT 11/4/17
Phoenix Data Conference - Big Data Analytics for IoT 11/4/17Phoenix Data Conference - Big Data Analytics for IoT 11/4/17
Phoenix Data Conference - Big Data Analytics for IoT 11/4/17Mark Goldstein
 
Handwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RHandwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RPoo Kuan Hoong
 
Enterprise deep learning lessons bodkin o reilly ai sf 2017
Enterprise deep learning lessons bodkin o reilly ai sf 2017Enterprise deep learning lessons bodkin o reilly ai sf 2017
Enterprise deep learning lessons bodkin o reilly ai sf 2017Ron Bodkin
 
PIMRC-2012, Sydney, Australia, 28 July, 2012
PIMRC-2012, Sydney, Australia, 28 July, 2012PIMRC-2012, Sydney, Australia, 28 July, 2012
PIMRC-2012, Sydney, Australia, 28 July, 2012Charith Perera
 
Gridcomputing
GridcomputingGridcomputing
Gridcomputingpchengi
 
Building Data Ecosystems for Accelerated Discovery
Building Data Ecosystems for Accelerated DiscoveryBuilding Data Ecosystems for Accelerated Discovery
Building Data Ecosystems for Accelerated Discoveryadamkraut
 
Feature Extraction and Analysis of Natural Language Processing for Deep Learn...
Feature Extraction and Analysis of Natural Language Processing for Deep Learn...Feature Extraction and Analysis of Natural Language Processing for Deep Learn...
Feature Extraction and Analysis of Natural Language Processing for Deep Learn...Sharmila Sathish
 

Semelhante a 2019-09-05Federated Learning.pdf (20)

Federated learning and its role in the privacy preservation of IoT devices
Federated learning and its role in the privacy preservation of IoT devicesFederated learning and its role in the privacy preservation of IoT devices
Federated learning and its role in the privacy preservation of IoT devices
 
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDistributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
 
Hac IT 4. Emerging Technologies (1).pdf
Hac IT 4. Emerging Technologies  (1).pdfHac IT 4. Emerging Technologies  (1).pdf
Hac IT 4. Emerging Technologies (1).pdf
 
New Framework for Improving Bigdata Analaysis Using Mobile Agent
New Framework for Improving Bigdata Analaysis Using Mobile AgentNew Framework for Improving Bigdata Analaysis Using Mobile Agent
New Framework for Improving Bigdata Analaysis Using Mobile Agent
 
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
 
Grid computing
Grid computingGrid computing
Grid computing
 
Introduction to roof computing by Nishant Krishna
Introduction to roof computing by Nishant KrishnaIntroduction to roof computing by Nishant Krishna
Introduction to roof computing by Nishant Krishna
 
Federated learning
Federated learningFederated learning
Federated learning
 
IBM Data Centric Systems & OpenPOWER
IBM Data Centric Systems & OpenPOWERIBM Data Centric Systems & OpenPOWER
IBM Data Centric Systems & OpenPOWER
 
TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform
 
AI on Greenplum Using
 Apache MADlib and MADlib Flow - Greenplum Summit 2019
AI on Greenplum Using
 Apache MADlib and MADlib Flow - Greenplum Summit 2019AI on Greenplum Using
 Apache MADlib and MADlib Flow - Greenplum Summit 2019
AI on Greenplum Using
 Apache MADlib and MADlib Flow - Greenplum Summit 2019
 
Phoenix Data Conference - Big Data Analytics for IoT 11/4/17
Phoenix Data Conference - Big Data Analytics for IoT 11/4/17Phoenix Data Conference - Big Data Analytics for IoT 11/4/17
Phoenix Data Conference - Big Data Analytics for IoT 11/4/17
 
Handwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RHandwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with R
 
Enterprise deep learning lessons bodkin o reilly ai sf 2017
Enterprise deep learning lessons bodkin o reilly ai sf 2017Enterprise deep learning lessons bodkin o reilly ai sf 2017
Enterprise deep learning lessons bodkin o reilly ai sf 2017
 
PIMRC-2012, Sydney, Australia, 28 July, 2012
PIMRC-2012, Sydney, Australia, 28 July, 2012PIMRC-2012, Sydney, Australia, 28 July, 2012
PIMRC-2012, Sydney, Australia, 28 July, 2012
 
Gridcomputing
GridcomputingGridcomputing
Gridcomputing
 
BigData Hadoop
BigData Hadoop BigData Hadoop
BigData Hadoop
 
Building Data Ecosystems for Accelerated Discovery
Building Data Ecosystems for Accelerated DiscoveryBuilding Data Ecosystems for Accelerated Discovery
Building Data Ecosystems for Accelerated Discovery
 
Feature Extraction and Analysis of Natural Language Processing for Deep Learn...
Feature Extraction and Analysis of Natural Language Processing for Deep Learn...Feature Extraction and Analysis of Natural Language Processing for Deep Learn...
Feature Extraction and Analysis of Natural Language Processing for Deep Learn...
 
Chapter 1
Chapter 1Chapter 1
Chapter 1
 

Último

The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfCionsystems
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 

Último (20)

The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdf
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 

2019-09-05Federated Learning.pdf

  • 1. Federated Learning, from Research to Practice Brendan McMahan Presenting the work of many CMU 2019.09.05 g.co/federated
  • 2. Federated learning Enable machine learning engineers and data scientists to work productively with decentralized data with privacy by default
  • 3. Outline 1. Why do we need federated learning? 2. What is federated learning? 3. Federated learning at Google 4. Federated learning beyond Google 5. Federated learning and privacy 6. Open problems
  • 5. Data is born at the edge Billions of phones & IoT devices constantly generate data Data enables better products and smarter models
  • 6. Data processing is moving on device: ● Improved latency ● Works offline ● Better battery life ● Privacy advantages E.g., on-device inference for mobile keyboards and cameras. Can data live at the edge?
  • 7. Data processing is moving on device: ● Improved latency ● Works offline ● Better battery life ● Privacy advantages E.g., on-device inference for mobile keyboards and cameras. Can data live at the edge? What about analytics? What about learning?
  • 8. 2014: Three choices Don’t use data to improve products and services Log the data centrally anyway Invent a new solution Choose your poison 🕱
  • 9. 2014: Three choices Don’t use data to improve products and services Log the data centrally anyway Invent a new solution Choose your poison 🕱
  • 10. 2019: Good reason to hope Don’t use data to improve products and services Log the data centrally anyway Federated learning and analytics
  • 11. What is federated learning?
  • 12. cloud data Train & evaluate on cloud data Model engineer workflow
  • 15. Train & evaluate on decentralized data Model engineer workflow
  • 20. data device Federated learning initial model (ephemeral) updated model
  • 21. data device Federated learning initial model updated model Privacy principle Focused collection Devices report only what is needed for this computation
  • 22. data device Federated learning initial model updated model Privacy principle Ephemeral reports Server never persists per-device reports Stronger with SMPC
  • 24. data device combined model ∑ Federated learning initial model updated model Privacy principle Only-in-aggregate Engineer may only access combined device reports
  • 27. data device (another) combined model (another) initial model ∑ Typical orders-of-magnitude 100-1000s of users per round 100-1000s of rounds to convergence 1-10 minutes per round Federated learning
  • 28. Characteristics of federated learning vs. traditional distributed learning Data locality and distribution ● massively decentralized, naturally arising (non-IID) partition ● Data is siloed, held by a small number of coordinating entities ● system-controlled (e.g. shuffled, balanced) Data availability ● limited availability, time-of-day variations ● almost all data nodes always available Addressability ● data nodes are anonymous and interchangeable ● data nodes are addressable Node statefulness ● stateless (generally no repeat computation) ● stateful Node reliability ● unreliable (~10% failures) ● reliable Wide-area communication pattern ● hub-and-spoke topology ● peer-to-peer topology (fully decentralized) ● none (centralized to one datacenter) Distribution scale ● massively parallel (1e9 data nodes) ● single datacenter Primary bottleneck ● communication ● computation
  • 29. ∑ ∑ Constraints in the federated setting H. Eichner, et al. Semi-Cyclic Stochastic Gradient Descent. ICML 2019. Each device reflects one users' data. So no one device is representative of the whole population. Devices must idle, plugged-in, on wi-fi to participate. Device availability correlates with both geo location and data distribution. Round completion rate by hour (US) Hour (over three days)
  • 30. data device (another) combined model (another) initial model ∑ Federated Averaging Algorithm Devices run multiple steps of SGD on their local data to compute an update. Server computes overall update using a simple weighted average. McMahan, et al. Communication-Efficient Learning of Deep Networks from Decentralized Data. AISTATS 2017.
  • 31. Server Until Converged: 1. Select a random subset of clients 2. In parallel, send current parameters θt to those clients 3. θt+1 = θt + data-weighted average of client updates Selected Client k 1. Receive θt from server. 2. Run some number of minibatch SGD steps, producing θ' 3. Return θ'-θt to server. H. B. McMahan, et al. Communication-Efficient Learning of Deep Networks from Decentralized Data. AISTATS 2017 The Federated Averaging algorithm θt θ'
  • 32. Rounds to reach 10.5% Accuracy 23x decrease in communication rounds H. B. McMahan, et al. Communication-Efficient Learning of Deep Networks from Decentralized Data. AISTATS 2017 Large-scale LSTM for next-word prediction Dataset: Large Social Network, 10m public posts, grouped by author.
  • 33. Updates to reach 82% 49x (IID and balanced data) decrease in communication (updates) vs SGD H. B. McMahan, et al. Communication-Efficient Learning of Deep Networks from Decentralized Data. AISTATS 2017 CIFAR-10 convolutional model
  • 35. Federated learning at Google 0.5B installs Daily use by multiple teams, dozens of ML engineers Powering features in Gboard and on Pixel Devices
  • 36. Federated Learning on Pixel Phones
  • 38. ● Predict the next word based on typed text so far ● Powers the predictions strip Gboard: language modeling When should you consider federated learning? ● On-device data is more relevant than server-side proxy data ● On-device data is privacy sensitive or large ● Labels can be inferred naturally from user interaction
  • 39. Gboard: language modeling Federated model compared to baseline A. Hard, et al. Federated Learning for Mobile Keyboard Prediction. arXiv:1811.03604
  • 40. Gboard: language modeling Federated RNN (compared to prior n-gram model): ● Better next-word prediction accuracy: +24% ● More useful prediction strip: +10% more clicks
  • 41. Other federated models in Gboard Emoji prediction ● 7% more accurate emoji predictions ● prediction strip clicks +4% more ● 11% more users share emojis! Action prediction When is it useful to suggest a gif, sticker, or search query? ● 47% reduction in unhelpful suggestions ● increasing overall emoji, gif, and sticker shares Discovering new words Federated discovery of what words people are typing that Gboard doesn’t know. T. Yang, et al. Applied Federated Learning: Improving Google Keyboard Query Suggestions. arXiv:1812.02903 M. Chen, et al. Federated Learning Of Out-Of-Vocabulary Words. arXiv:1903.10635 Ramaswamy, et al. Federated Learning for Emoji Prediction in a Mobile Keyboard. arXiv:1906.04329.
  • 42. Google’s federated system Towards Federated Learning at Scale: System Design ai.google/research/pubs/pub47976 K. Bonawitz, et al. Towards Federated Learning at Scale: System Design. SysML 2019.
  • 44. Federated Learning Research FL Workshop in Seattle 6/17-18 Federated learning workshops: ● NeurIPS 2019 (upcoming) Workshop on Federated Learning for Data Privacy and Confidentiality Submission deadline Sept. 9 ● IJCAI 2019 Workshop on Federated Machine Learning for User Privacy and Data Confidentiality (FML’19) ● Google-hosted June 2019 ~40 faculty, 20 students, 40 Googlers
  • 45. TensorFlow Federated Experiment with federated technologies in simulation
  • 46. TensorFlow Federated What’s in the box? Federated Learning (FL) API ● Implementations of federated training/evaluation ● Can be immediately applied to existing TF models/data Federated Core (FC) API ● Lower-level building blocks for expressing new federated algorithms Local runtime for simulations
  • 47. 68.0 70.5 69.8 70.1 SERVER 69.5 Federated computation in TFF the “placement” type of local items on each client has type
  • 48. 68.0 70.5 69.8 70.1 SERVER 69.5 Federated computation in TFF federated “op” can be interpreted as a function even though its inputs and outputs are in different places →
  • 49. 68.0 70.5 69.8 70.1 SERVER 69.5 Federated computation in TFF federated “op” can be interpreted as a function even though its inputs and outputs are in different places → it represents an abstract specification of a distributed communication protocol
  • 50. Federated computation in TFF 50 output: fraction of sensors with readings > threshold (an input) (an input) 1 0 1` > to_float > >
  • 51. THRESHOLD_TYPE = tff.FederatedType(tf.float32, tff.SERVER) @tff.federated_computation(READINGS_TYPE, THRESHOLD_TYPE) def get_fraction_over_threshold(readings, threshold): @tff.tf_computation(tf.float32, tf.float32) def _is_over_as_float(val, threshold): return tf.to_float(val > threshold) return tff.federated_mean( tff.federated_map(_is_over_as_float, [ readings, tff.federated_broadcast(threshold)]))
  • 52. For more TFF tutorials visit... tensorflow.org/federated
  • 54. ML on Sensitive Data: Privacy versus Utility Privacy Utility
  • 55. ML on Sensitive Data: Privacy versus Utility Privacy Utility perception
  • 56. today Privacy Utility 1. Policy 2. Technology ML on Sensitive Data: Privacy versus Utility perception
  • 57. Privacy Utility goal 1. Policy 2. New Technology ML on Sensitive Data: Privacy versus Utility (?) Push the pareto frontier with better technology. today
  • 59. federated training What can the network see? What can the server see? What can the device see? Improving privacy
  • 60. What can the engineer see? federated training Improving privacy What can the server admin see?
  • 61. model deployment What can the world see? federated training Improving privacy
  • 62. federated training model deployment What can the world see? What can the network see? What can the server see? What can the device see? What can the engineer see? Improving privacy What can the server admin see?
  • 63. federated training model deployment What can the world see? What can the network see? What can the server see? What can the device see? What can the engineer see? Improving privacy Ephemeral reports, focused collection Encryption at rest and on the wire. only-in-aggregate release What can the server admin see?
  • 65. Privacy Technology: Secure aggregation Secure aggregation Compute a (vector) sum of encrypted device reports A practical protocol with ● Security guarantees ● Communication efficiency ● Dropout tolerance K. Bonawitz, et al. Practical Secure Aggregation for Privacy-Preserving Machine Learning. CCS 2017.
  • 66. Secure aggregation Privacy Technology: Secure aggregation Secure exchange of data-masking pairs = 0 + K. Bonawitz, et al. Practical Secure Aggregation for Privacy-Preserving Machine Learning. CCS 2017.
  • 67. Secure aggregation Privacy Technology: Secure aggregation All devices in a round exchange pairs K. Bonawitz, et al. Practical Secure Aggregation for Privacy-Preserving Machine Learning. CCS 2017.
  • 68. Secure aggregation Privacy Technology: Secure aggregation Device report + + + + + + Report vectors are completely masked by random values K. Bonawitz, et al. Practical Secure Aggregation for Privacy-Preserving Machine Learning. CCS 2017.
  • 69. Secure aggregation Pairs cancel out when aggregated together, revealing only the sum Privacy Technology: Secure aggregation + + + + + + + + + + + + K. Bonawitz, et al. Practical Secure Aggregation for Privacy-Preserving Machine Learning. CCS 2017.
  • 70. Secure aggregation Pairs cancel out when aggregated together, revealing only the sum Privacy Technology: Secure aggregation + + + + + + + + + + + + K. Bonawitz, et al. Practical Secure Aggregation for Privacy-Preserving Machine Learning. CCS 2017.
  • 71. federated training model deployment What can the world see? What can the engineer see? Improving privacy
  • 72. Gboard: language modeling ● Better next-word prediction accuracy: +24% ● More useful prediction strip: +10% more clicks
  • 73. Gboard: language modeling Can a language model be too good?
  • 74. Can a language model be too good? Gboard: language modeling 4012 8888 8888 1881 Brendan McMahan's credit card number is ….
  • 75. Can a language model be too good? Gboard: language modeling 4012 8888 8888 1881 Brendan McMahan's credit card number is …. Privacy Principle: Don’t memorize individuals’ data Differential privacy can provide formal bounds
  • 76. Differential privacy Dwork and Roth. The Algorithmic Foundations of Differential Privacy. 2014. Statistical science of learning common patterns in a dataset without memorizing individual examples
  • 77. Differential privacy complements federated learning Use noise to obscure an individual’s impact on the learned model. github.com/tensorflow/privacy Privacy Principle #4: Don’t memorize individuals’ data
  • 78. data device updated model model + ∑ 1. Devices “clip” their updates, limiting any one user's contribution 2. Server adds noise when combining updates H. B. McMahan, et al. Learning Differentially Private Recurrent Language Models. ICLR 2018. Privacy Technology: Differentially Private Model Averaging
  • 80. Motivation from real-world constraints Data locality and distribution ● massively decentralized, naturally arising (non-IID) partition ● Data is siloed, held by a small number of coordinating entities ● system-controlled (e.g. shuffled, balanced) Data availability ● limited availability, time-of-day variations ● almost all data nodes always available Addressability ● data nodes are anonymous and interchangeable ● data nodes are addressable Node statefulness ● stateless (generally no repeat computation) ● stateful Node reliability ● unreliable (~10% failures) ● reliable Wide-area communication pattern ● hub-and-spoke topology ● peer-to-peer topology (fully decentralized) ● none (centralized to one datacenter) Distribution scale ● massively parallel (1e9 data nodes) ● single datacenter Primary bottleneck ● communication ● computation
  • 81. Sample open problem areas ● Optimization algorithms for FL, particularly communication-efficient algorithms tolerant of non-IID data ● Approaches that scale FL to larger models, including model and gradient compression techniques ● Novel applications of FL, extension to new learning algorithms and model classes. ● Theory for FL ● Enhancing the security and privacy of FL, including cryptographic techniques and differential privacy ● Bias and fairness in the FL setting (new possibilities and new challenges) ● Attacks on FL including model poisoning, and corresponding defenses ● Not everyone has to have the same model (multi-task and pluralistic learning, personalization, domain adaptation) ● Generative models, transfer learning, semi-supervised learning