SlideShare uma empresa Scribd logo
1 de 66
Baixar para ler offline
The pipeline for
State-of-the-Art NLP
Hugging Face
Agenda
Lysandre DEBUT
Machine Learning Engineer @ Hugging Face,
maintainer and core contributor of
huggingface/transformers
Anthony MOI
Technical Lead @ Hugging Face, maintainer and
core contributor of huggingface/tokenizers
Some slides were adapted from previous
HuggingFace talk by Thomas Wolf, Victor Sanh and
Morgan Funtowicz
Feedback
Your feedback is important to us.
Don’t forget to rate and
review the sessions.
Hugging Face
Hugging Face
Most popular open source NLP
library
▪ 1,000+ Research paper
mentions
▪ Used in production by 1000+
companies
Hugging Face
Today’s Menu
Subjects we’ll dive in today
● NLP: Transfer learning, transformer networks
● Tokenizers: from text to tokens
● Transformers: from tokens to predictions
Transfer Learning - Transformer networks
One big training to rule them all
NLP took a turn in 2018
Self-supervised Training &
Transfer Learning
Large Text Datasets
Compute Power
The arrival of the transformer architecture
Transfer learning
In a few diagrams
Sequential transfer learning
Learn on one task/dataset, transfer to another task/dataset
word2vec
GloVe
skip-thought
InferSent
ELMo
ULMFiT
GPT
BERT
DistilBERT
Text classification
Word labeling
Question-Answering
....
Pre-training Adaptation
Computationally
intensive
step General purpose
model
Transformer Networks
Very large models - State of the Art in several tasks
Transformer Networks
● Very large networks
● Can be trained on very big datasets
● Better than previous architectures at maintaining
long-term dependencies
● Require a lot of compute to be trained Source: BERT: Pre-training of Deep Bidirectional
Transformers for
Language Understanding. Jacob Devlin, Ming-Wei Chang,
Kenton Lee, Kristina Toutanova.In NACCL, 2019.
Transformer Networks
Pre-training
Base model
Pre-trained language
model
Very large corpus
$$$ in compute
Days of training
Transformer Networks
Fine-tuning
Pre-trained language
model
Fine-tuned language
model
Training can be done on single GPU
Small dataset
Easily reproducible
Model Sharing
Reduced compute, cost, energy footprint
From 🏎 Smaller, faster, cheaper, lighter: Introducing DistilBERT, a
distilled version of BERT, by Victor Sanh
A deeper look at the inner mechanisms
Pipeline, pre-training, fine-tuning
Adaptation
Head
Pre-trained
model
Tokenizer
Transfer Learning pipeline in NLP
From text to tokens, from tokens to prediction
Jim
Henson
was
a
puppet
##eer
11067
5567
245
120
7756
9908
1.2 2.7 0.6 -0.2
3.7 9.1 -2.1 3.1
1.5 -4.7 2.4 6.7
6.1 2.4 7.3 -0.6
-3.1 2.5 1.9 -0.1
0.7 2.1 4.2 -3.1
True 0.7886
False 0.223
Jim Henson was a puppeteer
Tokenization
Convert to
vocabulary indices
Pre-trained model
Task-specificmodel
Pre-training
Many currently successful pre-training approaches are based on language
modeling: learning to predict Pϴ
(text) or Pϴ
(text | other text)
Advantages:
- Doesn’t require human annotation - self-supervised
- Many languages have enough text to learn high capacity models
- Versatile - can be used to learn both sentence and word representations with
a variety of objective functions
The rise of language modeling pre-training
Language Modeling
Objectives - MLM
['The', 'pipeline', 'for', 'State', '-', 'of', '-', 'the', '-', 'Art', 'Natural', 'Language', 'Process', '##ing']
The pipeline for State-of-the-Art Natural Language Processing
['The', ‘pipeline’ 'for', 'State', '-', 'of', '-', 'the', '-', 'Art', [MASK], 'Language', 'Process', '##ing']
Tokenization
Masking
['The', ‘pipeline’, 'for', 'State', '-', 'of', '-', 'the', '-', 'Art', [MASK], 'Language', 'Process', '##ing']
‘Natural’
‘Artificial’
‘Machine’
‘Processing’
‘Speech’
Prediction
Language Modeling
Objectives - CLM
['The', 'pipeline', 'for', 'State', '-', 'of', '-', 'the', '-', 'Art', 'Natural', 'Language', 'Process', '##ing']
The pipeline for State-of-the-Art Natural Language Processing
Tokenization
Prediction
['Process', '##ing', '(', 'NL', '##P', ')', 'software', 'which', 'will', 'allow', 'a', 'user', 'to', 'develop']
Tokenization
It doesn’t have to be slow
Tokenization
- Convert input strings to a set of numbers
Its role in the pipeline
Jim Henson was a puppet ##eer
11067 5567 245 120 7756 9908
Jim Henson was a puppeteer
- Goal: Find the most meaningful and smallest possible representation
Some examples
Let’s dive in the nitty-gritty
Word-based
Word by word tokenization
Let’s do tokenization!
Let ‘s do tokenization !
Split on punctuation:
Split on spaces:
▪ Split on spaces, or following specific rules to obtain words
▪ What to do with punctuation?
▪ Requires large vocabularies: dog != dogs, run != running
▪ Out-of-vocabulary (aka <UNK>) tokens for unknown words
Character
Character by character tokenization
▪ Split on characters individually
▪ Do we include spaces or not?
▪ Smaller vocabularies
▪ But lack of meaning -> Characters don’t necessarily have a meaning separately
▪ End up with a huge amount of tokens to be processed by the model
L e t ‘ s d o t o k e n i z a t i o n !
Byte Pair Encoding
Welcome subword tokenization
▪ First introduced by Philip Gage in 1994, as a compression algorithm
▪ Applied to NLP by Rico Sennrich et al. in “Neural Machine Translation of Rare Words with
Subwords Units”. ACL 2016.
Byte Pair Encoding
Welcome subword tokenization
A B C ... a b c ... ? ! ...
Initial alphabet:
▪ Start with a base vocabulary using Unicode characters seen in the data
▪ Most frequent pairs get merged to a new token:
1. T + h => Th
2. Th + e => The
Byte Pair Encoding
Welcome subword tokenization
▪ Less out-of-vocabulary tokens
▪ Smaller vocabularies
Let’s</w> do</w> token ization</w> !</w>
And a lot more
So many algorithms...
▪ Byte-level BPE as used in GPT-2 (Alec Radford et al. OpenAI)
▪ WordPiece as used in BERT (Jacob Devlin et al. Google)
▪ SentencePiece (Unigram model) (Taku Kudo et al. Google)
Tokenizers
Why did we build it?
▪ Performance
▪ One API for all the different tokenizers
▪ Easy to share and reproduce your work
▪ Easy to use any tokenizer, and re-train it on a new language/dataset
The tokenization pipeline
Inner workings
Normalization Pre-tokenization Tokenization Post-processing
The tokenization pipeline
Inner workings
Normalization Pre-tokenization Tokenization Post-processing
▪ Strip
▪ Lowercase
▪ Removing diacritics
▪ Deduplication
▪ Unicode normalization (NFD, NFC, NFKC, NFKD)
The tokenization pipeline
Inner workings
Normalization Pre-tokenization Tokenization Post-processing
▪ Set of rules to split:
- Whitespace use
- Punctuation use
- Something else?
The tokenization pipeline
Inner workings
Normalization Pre-tokenization Tokenization Post-processing
▪ Actual tokenization algorithm:
- BPE
- Unigram
- Word level
The tokenization pipeline
Inner workings
Normalization Pre-tokenization Tokenization Post-processing
▪ Add special tokens: for example [CLS], [SEP] with BERT
▪ Truncate to match the maximum length of the model
▪ Pad all sequence in a batch to the same length
▪ ...
Tokenizers
Let’s see some code!
Tokenizers
Let’s see some code!
Tokenizers
Let’s see some code!
Tokenizers
Let’s see some code!
Tokenizers
Let’s see some code!
Tokenizers
Let’s see some code!
Tokenizers
How to install it?
Transformers
Using complex models shouldn’t be complicated
Transformers
An explosion of Transformer architectures
▪ Wordpiece tokenization
▪ MLM & NSP
BERT
ALBERT
GPT-2
▪ SentencePiece tokenization
▪ MLM & SOP
▪ Repeating layers
▪ Byte-level BPE tokenization
▪ CLM
Same API
Transformers
As flexible as possible
Runs and trains on:
▪ CPU
▪ GPU
▪ TPU
With optimizations:
▪ XLA
▪ TorchScript
▪ Half-precision
▪ Others
All models
BERT & RoBERTa
More to come!
Transformers
Tokenization to prediction
transformers.PreTrainedTokenizer
transformers.PreTrainedModel
The pipeline for State-of-the-Art Natural Language Processing
[[464, 11523, 329, 1812, 12, ..., 15417, 28403]]
Tensor(batch_size, sequence_length, hidden_size) Task-specific prediction
Base model With task-specific head
Transformers
Available pre-trained models
transformers.PreTrainedTokenizer
transformers.PreTrainedModel
▪ We publicly host pre-trained tokenizer vocabularies and
model weights
▪ 1611 model/tokenizer pairs at the time of writing
Transformers
Pipelines
transformers.Pipeline
▪ Pipelines handle both the tokenization and prediction
▪ Reasonable defaults
▪ SOTA models
▪ Customizable
A few use-cases
That’s where it gets interesting
Transformers
Sentiment analysis/Sequence classification (pipeline)
Transformers
Question Answering (pipeline)
Transformers
Causal language modeling/Text generation
Transformers
Sequence Classification - Under the hood
Transformers
Sequence Classification - Under the hood
Transformers
Sequence Classification - Under the hood
Transformers
Sequence Classification - Under the hood
Transformers
Sequence Classification - Under the hood
Transformers
Sequence Classification - Under the hood
Transformers
Training models
Example scripts (TensorFlow & PyTorch)
- Named Entity Recognition
- Sequence Classification
- Question Answering
- Language modeling (fine-tuning & from scratch)
- Multiple Choice
Trains on TPU, CPU, GPU
Example scripts for PyTorch Lightning
Transformers
Just grazed the surface
The transformers library covers a lot more ground:
- ELECTRA
- Reformer
- Longformer
- Encoder-decoder architectures
- Translation & Summarization
Transformers + Tokenizers
The full pipeline?
Data Tokenization Prediction
🤗 nlp Tokenizers Transformers
Metrics
🤗 nlp
Feedback
Your feedback is important to us.
Don’t forget to rate and
review the sessions.
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hugging Face Tools

Mais conteúdo relacionado

Mais procurados

Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and TransformerArvind Devaraj
 
LanGCHAIN Framework
LanGCHAIN FrameworkLanGCHAIN Framework
LanGCHAIN FrameworkKeymate.AI
 
GPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask LearnersGPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask LearnersYoung Seok Kim
 
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroOpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroNumenta
 
LLMs_talk_March23.pdf
LLMs_talk_March23.pdfLLMs_talk_March23.pdf
LLMs_talk_March23.pdfChaoYang81
 
Neural Language Generation Head to Toe
Neural Language Generation Head to Toe Neural Language Generation Head to Toe
Neural Language Generation Head to Toe Hady Elsahar
 
Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"
Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"
Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"Fwdays
 
An introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTAn introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTSuman Debnath
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers Arvind Devaraj
 
BERT: Bidirectional Encoder Representations from Transformers
BERT: Bidirectional Encoder Representations from TransformersBERT: Bidirectional Encoder Representations from Transformers
BERT: Bidirectional Encoder Representations from TransformersLiangqun Lu
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaAlexey Grigorev
 
Neural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionNeural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionRrubaa Panchendrarajan
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language ModelsLeon Dohmen
 
Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)Yuta Niki
 
LangChain Intro by KeyMate.AI
LangChain Intro by KeyMate.AILangChain Intro by KeyMate.AI
LangChain Intro by KeyMate.AIOzgurOscarOzkan
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understandinggohyunwoong
 

Mais procurados (20)

Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and Transformer
 
LanGCHAIN Framework
LanGCHAIN FrameworkLanGCHAIN Framework
LanGCHAIN Framework
 
GPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask LearnersGPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask Learners
 
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroOpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
 
gpt3_presentation.pdf
gpt3_presentation.pdfgpt3_presentation.pdf
gpt3_presentation.pdf
 
LLMs_talk_March23.pdf
LLMs_talk_March23.pdfLLMs_talk_March23.pdf
LLMs_talk_March23.pdf
 
Neural Language Generation Head to Toe
Neural Language Generation Head to Toe Neural Language Generation Head to Toe
Neural Language Generation Head to Toe
 
BERT introduction
BERT introductionBERT introduction
BERT introduction
 
Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"
Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"
Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"
 
An introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTAn introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERT
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers
 
BERT: Bidirectional Encoder Representations from Transformers
BERT: Bidirectional Encoder Representations from TransformersBERT: Bidirectional Encoder Representations from Transformers
BERT: Bidirectional Encoder Representations from Transformers
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga Petrova
 
Neural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionNeural Architectures for Named Entity Recognition
Neural Architectures for Named Entity Recognition
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language Models
 
Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)
 
BERT
BERTBERT
BERT
 
LangChain Intro by KeyMate.AI
LangChain Intro by KeyMate.AILangChain Intro by KeyMate.AI
LangChain Intro by KeyMate.AI
 
Transformer Zoo
Transformer ZooTransformer Zoo
Transformer Zoo
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
 

Semelhante a Building a Pipeline for State-of-the-Art Natural Language Processing Using Hugging Face Tools

Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Jeongkyu Shin
 
Deep-learning based Language Understanding and Emotion extractions
Deep-learning based Language Understanding and Emotion extractionsDeep-learning based Language Understanding and Emotion extractions
Deep-learning based Language Understanding and Emotion extractionsJeongkyu Shin
 
DotNet 2019 | Pablo Doval - Recurrent Neural Networks with TF2.0
DotNet 2019 | Pablo Doval - Recurrent Neural Networks with TF2.0DotNet 2019 | Pablo Doval - Recurrent Neural Networks with TF2.0
DotNet 2019 | Pablo Doval - Recurrent Neural Networks with TF2.0Plain Concepts
 
Learning to Translate with Joey NMT
Learning to Translate with Joey NMTLearning to Translate with Joey NMT
Learning to Translate with Joey NMTJulia Kreutzer
 
YOUR VOICE IS MY PASSPORT
 YOUR VOICE IS MY PASSPORT YOUR VOICE IS MY PASSPORT
YOUR VOICE IS MY PASSPORTPriyanka Aash
 
05-transformers.pdf
05-transformers.pdf05-transformers.pdf
05-transformers.pdfChaoYang81
 
Building a Neural Machine Translation System From Scratch
Building a Neural Machine Translation System From ScratchBuilding a Neural Machine Translation System From Scratch
Building a Neural Machine Translation System From ScratchNatasha Latysheva
 
SCGPT : Few-shot Natural Language Generation for Task-Oriented Dialog
SCGPT : Few-shot Natural Language Generation for Task-Oriented DialogSCGPT : Few-shot Natural Language Generation for Task-Oriented Dialog
SCGPT : Few-shot Natural Language Generation for Task-Oriented Dialogtaeseon ryu
 
Building your own open-source voice assistant
Building your own open-source voice assistantBuilding your own open-source voice assistant
Building your own open-source voice assistantAll Things Open
 
DIY Jarvis All Things Open 2019
DIY Jarvis All Things Open 2019DIY Jarvis All Things Open 2019
DIY Jarvis All Things Open 2019Wes Widner
 
Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...
Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...
Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...Holden Karau
 
Deep network notes.pdf
Deep network notes.pdfDeep network notes.pdf
Deep network notes.pdfRamya Nellutla
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Saurabh Kaushik
 
Simplifying training deep and serving learning models with big data in python...
Simplifying training deep and serving learning models with big data in python...Simplifying training deep and serving learning models with big data in python...
Simplifying training deep and serving learning models with big data in python...Holden Karau
 
Machine Learning in NLP
Machine Learning in NLPMachine Learning in NLP
Machine Learning in NLPVijay Ganti
 
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...AI Frontiers
 
Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...
Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...
Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...Codemotion
 

Semelhante a Building a Pipeline for State-of-the-Art Natural Language Processing Using Hugging Face Tools (20)

Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
 
Deep-learning based Language Understanding and Emotion extractions
Deep-learning based Language Understanding and Emotion extractionsDeep-learning based Language Understanding and Emotion extractions
Deep-learning based Language Understanding and Emotion extractions
 
DotNet 2019 | Pablo Doval - Recurrent Neural Networks with TF2.0
DotNet 2019 | Pablo Doval - Recurrent Neural Networks with TF2.0DotNet 2019 | Pablo Doval - Recurrent Neural Networks with TF2.0
DotNet 2019 | Pablo Doval - Recurrent Neural Networks with TF2.0
 
Learning to Translate with Joey NMT
Learning to Translate with Joey NMTLearning to Translate with Joey NMT
Learning to Translate with Joey NMT
 
NLTK
NLTKNLTK
NLTK
 
Diving deep into NLP
Diving deep into NLPDiving deep into NLP
Diving deep into NLP
 
YOUR VOICE IS MY PASSPORT
 YOUR VOICE IS MY PASSPORT YOUR VOICE IS MY PASSPORT
YOUR VOICE IS MY PASSPORT
 
05-transformers.pdf
05-transformers.pdf05-transformers.pdf
05-transformers.pdf
 
Building a Neural Machine Translation System From Scratch
Building a Neural Machine Translation System From ScratchBuilding a Neural Machine Translation System From Scratch
Building a Neural Machine Translation System From Scratch
 
SCGPT : Few-shot Natural Language Generation for Task-Oriented Dialog
SCGPT : Few-shot Natural Language Generation for Task-Oriented DialogSCGPT : Few-shot Natural Language Generation for Task-Oriented Dialog
SCGPT : Few-shot Natural Language Generation for Task-Oriented Dialog
 
Building your own open-source voice assistant
Building your own open-source voice assistantBuilding your own open-source voice assistant
Building your own open-source voice assistant
 
DIY Jarvis All Things Open 2019
DIY Jarvis All Things Open 2019DIY Jarvis All Things Open 2019
DIY Jarvis All Things Open 2019
 
Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...
Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...
Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...
 
Deep network notes.pdf
Deep network notes.pdfDeep network notes.pdf
Deep network notes.pdf
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
 
Simplifying training deep and serving learning models with big data in python...
Simplifying training deep and serving learning models with big data in python...Simplifying training deep and serving learning models with big data in python...
Simplifying training deep and serving learning models with big data in python...
 
Chatbot ppt
Chatbot pptChatbot ppt
Chatbot ppt
 
Machine Learning in NLP
Machine Learning in NLPMachine Learning in NLP
Machine Learning in NLP
 
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
 
Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...
Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...
Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...
 

Mais de Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 

Mais de Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Último

Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 

Último (20)

Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 

Building a Pipeline for State-of-the-Art Natural Language Processing Using Hugging Face Tools