Federated Learning & Privacy-preserving AI
Oguzhan Gencoglu, Head of AI - Top Data Science
Big Data Helsinki - 27 June 2019
About Top Data Science
● Business : “AI as a Service”
● Located in Helsinki, Finland
● 15 people (12 data scientists with MScs and PhDs)
● Excellent customer track record - Finland, Germany,
Denmark, Japan, Vietnam, Israel, USA
● 60+ machine learning solutions delivered
Customers & Partners
Outline
● The Problem
● The Solution : Federated Learning
● Application Example
● Differential Privacy
● Other Privacy-preserving AI Concepts
The Problem?
Federated Learning
Example - Gboard
Hard, Andrew, et al. "Federated learning for mobile keyboard prediction." arXiv preprint arXiv:1811.03604 (2018).
● Higher next-word prediction accuracy: +24%
● More useful prediction strip: +10% more clicks
● Better emoji recommendation: +7%
● 11% more users share emojis
Differential Privacy
a constraint on the algorithms used to publish aggregate information
about a database which limits the disclosure of private information
Learning common patterns in a dataset without memorizing
individual examples
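The constraint described above is usually stated formally as ε-differential privacy. The notation below (a randomized mechanism M, neighboring datasets D and D' that differ in one record, and a privacy budget ε) is the standard textbook form and is not taken from the slides:

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S]
\quad \text{for all neighboring datasets } D, D' \text{ and all output sets } S
```

A smaller ε makes the outputs on D and D' harder to tell apart, which means stronger privacy for any single record.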
“Do you pee in the shower?”
yes / no
[Diagram: coin-flip tree. First flip, HEADS / TAILS (50% / 50%): one branch leads to Mikko’s real answer. Second flip, HEADS / TAILS (25% / 25% overall): yes / no.]
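The coin-flip diagram appears to describe the classic randomized response scheme. The sketch below is a minimal illustration under that assumption: it assumes the first coin landing heads means "answer truthfully" and that a second coin otherwise picks the answer (the slide does not say which branch is which). The function names and the 30% population are hypothetical.

```python
import random
from typing import Sequence

def randomized_response(true_answer: bool, rng: random.Random) -> bool:
    """Report an answer with plausible deniability (two-coin scheme)."""
    if rng.random() < 0.5:       # first coin HEADS (assumed): answer truthfully
        return true_answer
    return rng.random() < 0.5    # first coin TAILS: second coin picks yes/no

def estimate_true_yes_rate(reports: Sequence[bool]) -> float:
    """Invert the noise: P(report yes) = 0.5 * p + 0.25, so p = 2 * P(report yes) - 0.5."""
    observed = sum(reports) / len(reports)
    return 2.0 * observed - 0.5

rng = random.Random(0)
true_answers = [i < 3000 for i in range(10000)]  # hypothetical population, 30% "yes"
reports = [randomized_response(a, rng) for a in true_answers]
estimate = estimate_true_yes_rate(reports)
print(f"estimated share of true 'yes' answers: {estimate:.3f}")
```

No individual report reveals the respondent's real answer, yet the aggregate rate can still be recovered. Under these assumptions a truthful "yes" is reported as "yes" with probability 0.75 and a truthful "no" with probability 0.25, so the scheme satisfies ε = ln 3.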
Relevant Concepts
Quasi-identifier : pieces of information that are not of themselves unique identifiers, but
are sufficiently well correlated with an entity to create a unique identifier
Typical bank loan eligibility data
Relevant Concepts
Exponential Mechanism : a technique for designing differentially private algorithms
(McSherry & Talwar, 2007)
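As a rough illustration of the exponential mechanism, the sketch below samples a candidate with probability proportional to exp(ε · utility / (2 · sensitivity)). The function name, the vote counts, and the choice of "vote count" as the utility are assumptions made for this example only.

```python
import math
import random
from typing import Callable, Sequence

def exponential_mechanism(
    candidates: Sequence[str],
    utility: Callable[[str], float],
    epsilon: float,
    sensitivity: float,
    rng: random.Random,
) -> str:
    """Pick one candidate, favoring high utility, with epsilon-DP randomness."""
    scores = [utility(c) for c in candidates]
    top = max(scores)  # subtracting the max keeps exp() stable; ratios are unchanged
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Hypothetical example: privately release the most popular option.
votes = {"A": 30, "B": 25, "C": 5}
rng = random.Random(0)
pick = exponential_mechanism(list(votes), votes.get, epsilon=1.0, sensitivity=1.0, rng=rng)
print(pick)
```

Adding or removing one voter changes each count by at most 1, which is why the sensitivity is set to 1 here. A larger ε concentrates the choice on the true winner, while a smaller ε spreads probability over the alternatives.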
Differentially Private FL
McMahan, Brendan, et al. "Learning Differentially Private Recurrent Language Models." (2018).
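The core idea behind differentially private federated averaging can be sketched as "clip each client's update, average, add Gaussian noise". The code below is a simplified illustration of that idea, not the cited paper's exact algorithm: it omits client sampling and privacy accounting, and all names and parameters are hypothetical.

```python
import math
import random
from typing import List, Sequence

def clip_update(update: Sequence[float], clip_norm: float) -> List[float]:
    """Scale the update down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]

def dp_federated_average(
    client_updates: Sequence[Sequence[float]],
    clip_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> List[float]:
    """Average clipped client updates and add Gaussian noise to the result."""
    clipped = [clip_update(u, clip_norm) for u in client_updates]
    n = len(clipped)
    dim = len(clipped[0])
    # Noise with std noise_multiplier * clip_norm on the sum equals this std on the mean.
    sigma = noise_multiplier * clip_norm / n
    return [sum(u[i] for u in clipped) / n + rng.gauss(0.0, sigma) for i in range(dim)]

rng = random.Random(0)
updates = [[3.0, 4.0], [0.1, -0.2], [-1.0, 0.5]]  # hypothetical per-client model deltas
noisy_mean = dp_federated_average(updates, clip_norm=1.0, noise_multiplier=1.0, rng=rng)
print(noisy_mean)
```

Clipping bounds how much any single client can move the average, and the noise is scaled to that bound, so the server never needs the raw data and the released model reveals little about any one client.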
Tools & Libraries
● github.com/tensorflow/federated
● github.com/tensorflow/privacy
● github.com/uber/sql-differential-privacy
● github.com/IBM/differential-privacy-library
Other Privacy-preserving AI Concepts
ML on Encrypted Data
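[Diagram: the End-User encrypts the Input Data into an Encrypted Input (e.g. f3a9d 71g3e) and sends it to the Third Party. The Trained Model there returns an Encrypted Prediction, which the End-User decrypts into the Decrypted Prediction (Benign / Tumor).]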
● The end-user encrypts her sensitive
data and sends it to a third-party
host.
● Because the end-user owns the private
key, the third party can decrypt neither
the input nor the output prediction.
● The third party produces an encrypted
prediction, which is returned to the
end-user.
● Privacy is preserved across the entire
pipeline, for both inputs and outputs.
Homomorphic Encryption
● Homomorphic Encryption (HE) is a form of encryption that allows computation (e.g. multiplication and
addition) on ciphertexts, generating an encrypted result which, when decrypted, matches the result of the
operations as if they had been performed on the plaintext.
Plain Domain: 3 + 5 = 8
Cipher Domain: d8d4h… + 8ke3s1… = 1u3y7…
Note : Computations in the cipher domain are very costly in terms of speed and memory.
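To make the "3 + 5 = 8" picture concrete, here is a toy sketch of an additively homomorphic scheme (Paillier), chosen for this example and not named in the slides. The primes are tiny and hard-coded for readability, so this is an illustration only and offers no real security. Note that in Paillier, addition in the plain domain corresponds to multiplying ciphertexts.

```python
import math
import random

# Toy Paillier key: tiny primes for readability only, far too small to be secure.
P, Q = 1009, 1013
N = P * Q
N_SQ = N * N
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)  # modular inverse of PHI mod N (Python 3.8+)

def encrypt(m: int, rng: random.Random) -> int:
    """Encrypt 0 <= m < N using the public key (N, g = N + 1)."""
    while True:
        r = rng.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ

def decrypt(c: int) -> int:
    """Decrypt with the private values PHI and MU."""
    return ((pow(c, PHI, N_SQ) - 1) // N) * MU % N

def add_encrypted(c1: int, c2: int) -> int:
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % N_SQ

rng = random.Random(0)
c_sum = add_encrypted(encrypt(3, rng), encrypt(5, rng))
print(decrypt(c_sum))
```

The party computing `add_encrypted` never sees 3, 5, or 8; only the key holder can decrypt the result. Even in this toy version every operation is a modular exponentiation on large integers, which hints at the cost mentioned in the note above.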
Membership Inference Attacks
An interesting insight: the accuracy of the inference
attack increases with the number of classes.
Given a black-box machine learning model
and a data record, determining whether the
record was part of the model’s training
dataset was shown to be possible with
extremely high accuracy [7].
As a result, we now know that simple query
access to a black-box API that returns the
model’s output on a given input can leak a
significant amount of information about the
individual data records on which the model
was trained.
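The intuition can be shown with a much simpler attack than the one cited: a confidence-threshold baseline against a deliberately overfit toy model. Everything below (the memorizing "model", the threshold, the data points) is hypothetical and exists only to show why query access alone can leak membership.

```python
import math
from typing import Callable, Sequence

def make_overfit_model(training_points: Sequence[float]) -> Callable[[float], float]:
    """Hypothetical black-box model whose confidence decays with the
    distance to the nearest training point (i.e. it memorizes its data)."""
    def confidence(x: float) -> float:
        nearest = min(abs(x - t) for t in training_points)
        return math.exp(-nearest)
    return confidence

def infer_membership(model: Callable[[float], float], record: float,
                     threshold: float = 0.9) -> bool:
    """Guess 'was in the training set' when the model is unusually confident."""
    return model(record) >= threshold

train = [0.0, 2.5, 7.0, 11.0]
model = make_overfit_model(train)  # the attacker only ever calls model(x)
guesses_members = [infer_membership(model, x) for x in train]
guesses_outsiders = [infer_membership(model, x) for x in [1.0, 5.0, 9.0]]
print(guesses_members, guesses_outsiders)
```

Real models leak far less cleanly than this toy, but the mechanism is the same: a model that behaves differently on records it has seen gives an attacker a signal to threshold on.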
Data Generation
Dar et al., Image synthesis in multi-contrast MRI with cGANs, 2019
Data Generation
Hyland et al., Real-valued time series generation with recurrent cGANs, 2017
Thank You!

Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguzhan Gencoglu


Editor's Notes

  1. Models need to run on device: offline and quicker