Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguzhan Gencoglu

"Machine learning algorithms require significant amounts of training data, which so far has been centralized on one machine or in a datacenter. For numerous applications, this need to collect data can be extremely privacy-invasive. Recent advancements in AI research approach this issue with a new paradigm for training AI models: Federated Learning. In federated learning, edge devices (phones, computers, cars, etc.) collaboratively learn a shared AI model while keeping all the training data on the device, decoupling the ability to do machine learning from the need to store the data in the cloud. From a personal data perspective, this paradigm makes it possible to train a model on the device without directly inspecting users’ data on a server. This talk will pinpoint several examples of AI applications benefiting from federated learning and the likely future of privacy-aware systems."
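
To make the training loop described in the abstract concrete, here is a minimal sketch of federated averaging on a one-parameter linear model. It is not taken from the talk: the function names, the synthetic client data, the learning rate and the number of rounds are all assumptions chosen for illustration. The only point is that the server sees model weights, never the clients' raw data.

```python
import random

def local_update(w, data, lr=0.5, epochs=5):
    """One client's training: a few gradient steps on y ~ w * x using only its local data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_averaging(clients, rounds=50, w=0.0):
    """Server loop: broadcast w, collect locally trained weights, average them by data size."""
    total = sum(len(d) for d in clients)
    for _ in range(rounds):
        local_ws = [local_update(w, d) for d in clients]  # raw data never leaves a client
        w = sum(len(d) * lw for d, lw in zip(clients, local_ws)) / total
    return w

# Hypothetical clients: each holds private samples of y = 3x + noise.
rng = random.Random(0)

def make_client(n):
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3.0 * x + rng.gauss(0, 0.05)) for x in xs]

clients = [make_client(n) for n in (20, 50, 30)]
w_global = federated_averaging(clients)
print(f"shared model weight: {w_global:.3f}")
```

Weighting each client's result by its number of samples is one common choice; production systems add client sampling, secure aggregation and compression on top of this loop.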

Federated Learning & Privacy-preserving AI
Oguzhan Gencoglu, Head of AI - Top Data Science
Big Data Helsinki - 27 June 2019

About Top Data Science
● Business : “AI as a Service”
● Located in Helsinki, Finland
● 15 people (12 data scientists with MScs and PhDs)
● Excellent customer track record - Finland, Germany, Denmark, Japan, Vietnam, Israel, USA
● 60+ machine learning solutions delivered

Customers & Partners

Outline
● The Problem
● The Solution : Federated Learning
● Application Example
● Differential Privacy
● Other Privacy-preserving AI Concepts

Example - Gboard
Hard, Andrew, et al. "Federated learning for mobile keyboard prediction." arXiv preprint arXiv:1811.03604 (2018).
● Higher next-word prediction accuracy = + 24%
● More useful prediction strip = + 10% more clicks
● Better emoji recommendation = + 7%
● 11% more users share emojis

Differential Privacy
Definition: a constraint on the algorithms used to publish aggregate information about a database, which limits the disclosure of private information.
Intuition: learning common patterns in a dataset without memorizing individual examples.
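
For reference, the informal statement above is usually formalized as follows. This is the standard textbook definition of ε-differential privacy, not something shown on the slide; the symbols M, D, D', S and ε are introduced here only for this statement.

```latex
% A randomized mechanism M is \varepsilon-differentially private if, for all
% pairs of datasets D and D' that differ in a single record, and for every
% set of outputs S \subseteq \mathrm{Range}(M):
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
```

A smaller ε means the output distribution changes less when any one person's record is added or removed, which is the sense in which individual examples are not "memorized".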

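[Figure: randomized-response diagram with two coin flips. First flip: HEADS (50%) or TAILS (50%). One branch yields Mikko’s real answer; the other leads to a second flip, HEADS or TAILS (25% each), which yields a forced "yes" or "no".]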
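
The coin-flip scheme in the figure can be simulated in a few lines. This is a sketch under assumptions: the slide text does not make clear which outcome maps to which branch, so the code assumes the classic convention (first flip heads means answer truthfully; otherwise a second flip forces "yes" on heads and "no" on tails). The population size and the 30% true rate are made up.

```python
import random

def randomized_response(truth, rng):
    """Answer reported by one respondent under the two-coin protocol (assumed mapping)."""
    if rng.random() < 0.5:       # first flip heads: give the real answer
        return truth
    return rng.random() < 0.5    # first flip tails: second flip decides, heads = yes

def estimate_true_rate(reports):
    """Invert P(report yes) = 0.5 * p + 0.25 to recover the population rate p."""
    observed = sum(reports) / len(reports)
    return 2 * (observed - 0.25)

rng = random.Random(42)
true_rate = 0.30  # hypothetical share of people whose real answer is "yes"
population = [rng.random() < true_rate for _ in range(100_000)]
reports = [randomized_response(t, rng) for t in population]
estimate = estimate_true_rate(reports)
print(f"estimated rate: {estimate:.3f}")
```

No single report reveals the respondent's real answer, since a "yes" is reported with probability 0.75 when the truth is "yes" and 0.25 when it is "no". That ratio of 3 means the scheme satisfies differential privacy with ε = ln 3, while the aggregate rate can still be estimated.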

Relevant Concepts
Quasi-identifier : pieces of information that are not in themselves unique identifiers, but are sufficiently well correlated with an entity that they can be combined to create a unique identifier
[Table: typical bank loan eligibility data]
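
As a small illustration of the definition, the sketch below counts how many records become unique once a few harmless-looking columns are combined. The records, column names and values are entirely hypothetical and are not the table from the slide.

```python
from collections import Counter

# Hypothetical loan-application rows: no name or ID column is present.
records = [
    {"zip": "00100", "birth_year": 1985, "gender": "F", "approved": True},
    {"zip": "00100", "birth_year": 1985, "gender": "F", "approved": False},
    {"zip": "00120", "birth_year": 1990, "gender": "M", "approved": True},
    {"zip": "00150", "birth_year": 1972, "gender": "F", "approved": False},
]

def uniquely_identified(rows, columns):
    """Return the rows that no other row matches on the given column combination."""
    key = lambda r: tuple(r[c] for c in columns)
    counts = Counter(key(r) for r in rows)
    return [r for r in rows if counts[key(r)] == 1]

exposed = uniquely_identified(records, ["zip", "birth_year", "gender"])
print(f"{len(exposed)} of {len(records)} rows are unique on (zip, birth_year, gender)")
```

Each column on its own is shared by many people, but the combination singles out two of the four rows, so anyone who knows those three facts about a person can look up the loan decision.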

Relevant Concepts
Exponential Mechanism : a technique for designing differentially private algorithms (McSherry & Talwar, 2007)
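
A minimal sketch of the exponential mechanism in its usual form: each candidate output is sampled with probability proportional to exp(ε · utility / (2 · sensitivity)). The vote counts, the choice of ε and the function name are assumptions for illustration; the utility here is a simple count, whose sensitivity is 1 because one person changes any count by at most 1.

```python
import math
import random

def exponential_mechanism(candidates, utility, epsilon, sensitivity, rng):
    """Sample one candidate with probability proportional to exp(eps * u / (2 * sensitivity))."""
    scores = [utility(c) for c in candidates]
    top = max(scores)  # shifting by the max avoids overflow and leaves probabilities unchanged
    weights = [math.exp(epsilon * (s - top) / (2 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Hypothetical private data: how many users picked each option.
votes = {"A": 30, "B": 25, "C": 5}
rng = random.Random(0)
picks = [
    exponential_mechanism(list(votes), votes.get, epsilon=1.0, sensitivity=1.0, rng=rng)
    for _ in range(2000)
]
print({option: picks.count(option) for option in votes})
```

The best option is returned most of the time, but not always, and that residual randomness is what hides any single user's contribution. Lowering ε flattens the distribution and strengthens the privacy guarantee at the cost of accuracy.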

Differentially Private FL
McMahan, Brendan, et al. "Learning Differentially Private Recurrent Language Models." (2018).
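
The general recipe behind differentially private federated learning is to bound each client's influence by clipping its model update and then to add noise to the aggregate. The sketch below shows only that core step. It is a simplification, not the cited paper's exact algorithm: the function names, the noise scaling and the example updates are assumptions, and no privacy accounting is done.

```python
import math
import random

def clip(update, max_norm):
    """Scale an update vector down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]

def dp_average(updates, max_norm, noise_multiplier, rng):
    """Average clipped client updates and add Gaussian noise to each coordinate."""
    clipped = [clip(u, max_norm) for u in updates]
    n = len(clipped)
    sigma = noise_multiplier * max_norm / n  # assumed scaling: noise relative to one client's bound
    dim = len(clipped[0])
    return [sum(c[i] for c in clipped) / n + rng.gauss(0.0, sigma) for i in range(dim)]

rng = random.Random(0)
# Hypothetical per-client model updates; the second one is an outlier.
updates = [[0.3, -0.1], [10.0, 10.0], [0.2, 0.0]]
clipped_big = clip([10.0, 10.0], 1.0)
noisy = dp_average(updates, max_norm=1.0, noise_multiplier=1.0, rng=rng)
print(clipped_big, noisy)
```

Clipping guarantees that no single client can move the average by more than max_norm / n, which is what makes it possible to calibrate the noise to a fixed bound.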

Tools & Libraries
● github.com/tensorflow/federated
● github.com/tensorflow/privacy
● github.com/uber/sql-differential-privacy
● github.com/IBM/differential-privacy-library

ML on Encrypted Data
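[Figure: the End-User encrypts the Input Data and sends the Encrypted Input (e.g. f3a9d, 71g3e) to the Third Party; the Trained Model returns an Encrypted Prediction, which the End-User decrypts into the Decrypted Prediction (Benign / Tumor).]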
● The end-user encrypts her sensitive data and sends it to a third-party host.
● Because the end-user owns the private key, the third party can decrypt neither the input nor the output prediction.
● The third party produces an encrypted prediction, which is returned to the end-user.
● Privacy is preserved across the entire pipeline, for both inputs and outputs.

Homomorphic Encryption
● Homomorphic Encryption (HE) is a form of encryption that allows computation (e.g. multiplication and addition) on ciphertexts, generating an encrypted result which, when decrypted, matches the result of the same operations performed on the plaintext.
Plain Domain: 3 + 5 = 8
Cipher Domain: d8d4h… + 8ke3s1… = 1u3y7...
Note : Computations in the cipher domain are very costly in terms of speed and memory.
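
The 3 + 5 = 8 example can be reproduced with a toy version of the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of their plaintexts. The slides do not name a specific scheme, so Paillier is a choice made here for illustration. The primes are tiny and give no security, and this scheme supports addition only, not multiplication of two ciphertexts.

```python
import math
import random

# Toy Paillier key generation. Real deployments use primes of 1024+ bits.
p, q = 61, 53
n = p * q
n_sq = n * n
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1), the private key
mu = pow(lam, -1, n)                               # modular inverse (Python 3.8+)

def encrypt(m, rng):
    """Encrypt integer m (0 <= m < n) using the public key (n, g)."""
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c):
    """Decrypt with the private key (lam, mu)."""
    return ((pow(c, lam, n_sq) - 1) // n * mu) % n

def add_encrypted(c1, c2):
    """Cipher-domain addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % n_sq

rng = random.Random(0)
c3, c5 = encrypt(3, rng), encrypt(5, rng)
c_sum = add_encrypted(c3, c5)   # computed without ever seeing 3 or 5
result = decrypt(c_sum)
print(result)
```

The party computing `add_encrypted` needs only the public modulus, which matches the third-party setting above: it never holds the key needed to read the inputs or the result.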

Membership Inference Attacks
Given a black-box machine learning model and a data record, it was shown to be possible to determine with extremely high accuracy whether the record was part of the model’s training dataset [7].
As a result, we now know that simple query access to a black-box API that returns the model’s output on a given input can leak a significant amount of information about the individual data records on which the model was trained.
An interesting insight: the accuracy of the inference attack increases with the number of classes.
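
The intuition behind such attacks can be shown with a deliberately simplified sketch: an overfit model tends to be more confident on records it was trained on, so thresholding the returned confidence already separates members from non-members. This is not the attack from [7]. The "model" below is a made-up stand-in that memorizes its training points, and the data, threshold and function names are all assumptions.

```python
import math
import random

rng = random.Random(1)

def sample(count):
    """Draw synthetic 2-D records from one fixed distribution."""
    return [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(count)]

members = sample(200)       # records the model was trained on
non_members = sample(200)   # fresh records from the same distribution

def model_confidence(x):
    """Stand-in for a black-box API of an overfit model: confidence decays
    with the distance to the nearest memorized training record."""
    nearest = min(math.dist(x, t) for t in members)
    return math.exp(-nearest)

def infer_membership(x, threshold=0.99):
    """Attacker's guess: very high confidence suggests a training record."""
    return model_confidence(x) >= threshold

correct = [infer_membership(x) for x in members]
correct += [not infer_membership(x) for x in non_members]
attack_accuracy = sum(correct) / len(correct)
print(f"attack accuracy: {attack_accuracy:.2f}")
```

The attacker only calls `model_confidence`, mirroring plain query access to a prediction API. Real models memorize less sharply than this stand-in, so real attacks need more machinery to reach high accuracy.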

Data Generation
Dar et al., Image synthesis in multi-contrast MRI with cGANs, 2019

Data Generation
Hyland et al., Real-valued time series generation with recurrent cGANs, 2017
