Piero Molino / Uber AI / piero@uber.com
Ludwig: A code-free deep learning toolbox
Deep Learning experimentation
toolbox based on TensorFlow
No coding required
!3
UNDERSTANDABLE
We provide standard visualizations to
understand model performance and
compare model predictions.
EASY
No coding skills are required
to train a model and use it
for obtaining predictions.
OPEN
Released under
the open source
Apache License 2.0.
GENERAL
A new data-type-based approach to deep
learning model design makes the tool
suited for many different applications.
FLEXIBLE
Experienced users have deep control
over model building and training, while
newcomers will find it easy to use.
EXTENSIBLE
Easy to add new model
architectures and new
feature data types.
Simple Example
Example
!5
NEWS CLASS
Toronto Feb 26 - Standard Trustco said it expects earnings in 1987
to increase at least 15 to 20 pct from the 9 140 000 dlrs or 252
dlrs...
earnings
New York Feb 26 - American Express Co remained silent on market
rumors it would spinoff all or part of its Shearson Lehman Brothers
Inc...
acquisition
BANGKOK March 25 - Vietnam will resettle 300 000 people on
state farms known as new economic zones in 1987 to create jobs
and grow more high-value export crops...
coffee
Example experiment command
!6
The Experiment command:
1. splits the data into training, validation and test sets
2. trains a model on the training set
3. validates on the validation set (early stopping)
4. predicts on the test set
ludwig experiment
--data_csv reuters-allcats.csv
--model_definition "{input_features: [{name: news, type: text}], output_features:
[{name: class, type: category}]}"
# you can specify a --model_definition_file file_path argument instead
# that reads a YAML file
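For reference, the equivalent model definition passed with --model_definition_file would be a small YAML document along these lines (the file name model_definition.yaml is just an example):
input_features:
    -
        name: news
        type: text
output_features:
    -
        name: class
        type: category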
Epoch 1
Training: 100%|██████████████████████████████| 23/23 [00:02<00:00, 9.86it/s]
Evaluation train: 100%|██████████████████████| 23/23 [00:00<00:00, 46.39it/s]
Evaluation vali : 100%|████████████████████████| 4/4 [00:00<00:00, 29.05it/s]
Evaluation test : 100%|████████████████████████| 7/7 [00:00<00:00, 43.06it/s]
Took 2.8826s
╒═════════╤════════╤════════════╤══════════╕
│ class   │   loss │   accuracy │   hits@3 │
╞═════════╪════════╪════════════╪══════════╡
│ train   │ 0.2604 │     0.9306 │   0.9899 │
├─────────┼────────┼────────────┼──────────┤
│ vali    │ 0.3678 │     0.8792 │   0.9769 │
├─────────┼────────┼────────────┼──────────┤
│ test    │ 0.3532 │     0.8929 │   0.9903 │
╘═════════╧════════╧════════════╧══════════╛
Validation accuracy on class improved, model saved
!7
First epoch of training
!8
Epoch 4
Training: 100%|██████████████████████████████| 23/23 [00:01<00:00, 14.87it/s]
Evaluation train: 100%|██████████████████████| 23/23 [00:00<00:00, 48.87it/s]
Evaluation vali : 100%|████████████████████████| 4/4 [00:00<00:00, 59.50it/s]
Evaluation test : 100%|████████████████████████| 7/7 [00:00<00:00, 51.70it/s]
Took 1.7578s
╒═════════╤════════╤════════════╤══════════╕
│ class   │   loss │   accuracy │   hits@3 │
╞═════════╪════════╪════════════╪══════════╡
│ train   │ 0.0048 │     0.9997 │   1.0000 │
├─────────┼────────┼────────────┼──────────┤
│ vali    │ 0.2560 │     0.9177 │   0.9949 │
├─────────┼────────┼────────────┼──────────┤
│ test    │ 0.1677 │     0.9440 │   0.9964 │
╘═════════╧════════╧════════════╧══════════╛
Validation accuracy on class improved, model saved
Fourth epoch of training,
validation accuracy improved
!9
Epoch 14
Training: 100%|██████████████████████████████| 23/23 [00:01<00:00, 14.69it/s]
Evaluation train: 100%|██████████████████████| 23/23 [00:00<00:00, 48.67it/s]
Evaluation vali : 100%|████████████████████████| 4/4 [00:00<00:00, 61.29it/s]
Evaluation test : 100%|████████████████████████| 7/7 [00:00<00:00, 50.45it/s]
Took 1.7749s
╒═════════╤════════╤════════════╤══════════╕
│ class │ loss │ accuracy │ hits@3 │
╞═════════╪════════╪════════════╪══════════╡
│ train │ 0.0001 │ 1.0000 │ 1.0000 │
├─────────┼────────┼────────────┼──────────┤
│ vali │ 0.4208 │ 0.9100 │ 0.9974 │
├─────────┼────────┼────────────┼──────────┤
│ test │ 0.2081 │ 0.9404 │ 0.9988 │
╘═════════╧════════╧════════════╧══════════╛
Last improvement of accuracy on class happened 10 epochs ago
Fourteenth epoch of training,
validation accuracy has not
improved in a while, let’s
EARLY STOP
!10
===== class =====
accuracy: 0.94403892944
hits_at_k: 0.996350364964
Confusion Matrix
( { 'avg_f1_score_macro': 0.6440659449587669,
'avg_f1_score_micro': 0.94403892944038914,
'avg_f1_score_weighted': 0.94233823910531345,
'avg_precision_macro': 0.71699990923354218,
'avg_precision_micro': 0.94403892944038925,
'avg_precision_weighted': 0.94403892944038925,
'avg_recall_macro': 0.60875299574797059,
'avg_recall_micro': 0.94403892944038925,
'avg_recall_weighted': 0.94403892944038925,
'kappa_score': 0.91202440199068868,
'overall_accuracy': 0.94403892944038925},)
Test statistics overall
!11
{ earn: {'accuracy': 0.95377128953771284,
'f1_score': 0.95214105793450876,
'fall_out': 0.046948356807511749,
'false_discovery_rate': 0.050251256281407031,
'false_negative_rate': 0.045454545454545414,
'false_negatives': 18,
'false_omission_rate': 0.042452830188679291,
'false_positive_rate': 0.046948356807511749,
'false_positives': 20,
'hit_rate': 0.95454545454545459,
'informedness': 0.90759709773794284,
'markedness': 0.90729591352991368,
'matthews_correlation_coefficient': 0.90744649313843584,
'miss_rate': 0.045454545454545414,
'negative_predictive_value': 0.95754716981132071,
'positive_predictive_value': 0.94974874371859297,
'precision': 0.94974874371859297,
'recall': 0.95454545454545459,
'sensitivity': 0.95454545454545459,
'specificity': 0.95305164319248825,
'true_negative_rate': 0.95305164319248825,
'true_negatives': 406,
'true_positive_rate': 0.95454545454545459,
'true_positives': 378},
Test statistics per class
How does it work?
Training
!13
Fields mappings, model hyperparameters
and weights are saved during training
Diagram: Raw Data (CSV) → Preprocessing → Processed Data (HDF5 / NumPy) + Fields Mappings (JSON) → Training → Model (Weights as tensors, Hyperparameters as JSON)
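As a sketch of this training stage alone, the train command mirrors the experiment command shown earlier (same CSV and model definition as above):
ludwig train
--data_csv reuters-allcats.csv
--model_definition "{input_features: [{name: news, type: text}], output_features:
[{name: class, type: category}]}"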
Prediction
!14
The same fields mappings obtained during training are used to preprocess each
datapoint and to postprocess each prediction of the model, in order to map back to labels
Diagram: during training, Raw Data (CSV) → Preprocessing → Processed Data (HDF5 / NumPy) → Training → Model (Weights, Hyperparameters) + Fields Mappings (JSON); at prediction time, Raw Data (CSV) → Preprocessing (same Fields Mappings) → Processed Data (HDF5 / NumPy) → Model Prediction → Predictions and Scores (NumPy) → Postprocessing (Fields Mappings) → Labels (CSV)
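A minimal sketch of the corresponding prediction step, assuming a previously trained model (file and directory names are illustrative):
ludwig predict
--data_csv new_news.csv
--model_path results/experiment_run/model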
Running Experiments
!15
The Experiment trains a model on the training set and predicts on the test set. It outputs predictions
and training and test statistics that can be used by the Visualization component to obtain graphs.
Diagram: Raw Data (CSV) → Experiment → Model (Weights, Hyperparameters), Fields Mappings (JSON), Training Statistics (JSON), Test Statistics (JSON) and Labels (CSV); the statistics feed the Visualization component, which produces Graphs (PNG / PDF)
Under the hood
The magic lies in:
Data type abstraction
Model definition YAML file
Smart use of **kwargs
Back to the example
!18
ludwig experiment
--data_csv reuters-allcats.csv
--model_definition "{input_features: [{name: news, type: text}], output_features:
[{name: class, type: category}]}"
Model Definition
!19
ludwig experiment
--data_csv reuters-allcats.csv
--model_definition "{input_features: [{name: news, type: text}], output_features:
[{name: class, type: category}]}"
Merged with the defaults, it resolves to:
!20
{
input_features: [
    { encoder: parallel_cnn,
      level: word,
      name: news,
      tied_weights: None,
      type: text}],
combiner: {type: concat},
output_features: [
    { dependencies: [],
      loss: { class_distance_temperature: 0,
              class_weights: 1,
              confidence_penalty: 0,
              distortion: 1,
              labels_smoothing: 0,
              negative_samples: 0,
              robust_lambda: 0,
              sampler: None,
              type: softmax_cross_entropy,
              unique: False,
              weight: 1},
      name: class,
      reduce_dependencies: sum,
      reduce_input: sum,
      top_k: 3,
      type: category}],
training: {
    batch_size: 128,
    bucketing_field: None,
    decay: False,
    decay_rate: 0.96,
    decay_steps: 10000,
    dropout_rate: 0.0,
    early_stop: 3,
    epochs: 20,
    gradient_clipping: None,
    increase_batch_size_on_plateau: 0,
    increase_batch_size_on_plateau_max: 512,
    increase_batch_size_on_plateau_patience: 5,
    increase_batch_size_on_plateau_rate: 2,
    learning_rate: 0.001,
    learning_rate_warmup_epochs: 5,
    optimizer: { beta1: 0.9,
                 beta2: 0.999,
                 epsilon: 1e-08,
                 type: adam},
    reduce_learning_rate_on_plateau: 0,
    reduce_learning_rate_on_plateau_patience: 5,
    reduce_learning_rate_on_plateau_rate: 0.5,
    regularization_lambda: 0,
    regularizer: l2,
    staircase: False,
    validation_field: combined,
    validation_measure: loss},
preprocessing: {
    text: {
        char_format: characters,
        char_most_common: 70,
        char_sequence_length_limit: 1024,
        fill_value: ,
        lowercase: True,
        missing_value_strategy: fill_with_const,
        padding: right,
        padding_symbol: <PAD>,
        unknown_symbol: <UNK>,
        word_format: space_punct,
        word_most_common: 20000,
        word_sequence_length_limit: 256},
    category: {
        fill_value: <UNK>,
        lowercase: False,
        missing_value_strategy: fill_with_const,
        most_common: 10000},
    force_split: False,
    split_probabilities: (0.7, 0.1, 0.2),
    stratify: None}
}
The resolved model definition has five sections:
!22
INPUT FEATURES
COMBINER
OUTPUT FEATURES
TRAINING
PREPROCESSING
!23
Diagram: every input data type (Text, Category, Numerical, Binary, Sequence, Set, Bag, Image, Time series, Audio / Speech, H3, Date) goes through an Encoder, a Combiner merges the encodings, and Decoders map the combined representation to the output data types (same list of types).
!24
Text Classifier: Text → Encoder → Combiner → Decoder → Category
!25
Image Captioning: Image → Encoder → Combiner → Decoder → Text
!26
Classic Regression: Category + Numerical + Binary → Encoders → Combiner → Decoder → Numerical
!27
Visual Question Answering: Text + Image → Encoders → Combiner → Decoder → Binary
Each output type can have multiple
decoders to choose from
Each input type can have multiple
encoders to choose from
Text input features encoders
Parallel CNN: Words / Char → 4 parallel 1D Conv (widths 2, 3, 4, 5) → 2 x FC → Vector Encoding
Stacked CNN: Words / Char → 6 x 1D Conv → 2 x FC → Vector Encoding
Stacked Parallel CNN: Words / Char → 3 x Parallel CNN → 2 x FC → Vector Encoding
RNN: Words / Char → 2 x RNN → 2 x FC → Vector Encoding
CNN RNN: Words / Char → 3 x Conv → 2 x RNN → 2 x FC → Vector Encoding
Transformer / BERT: Words / Char → 6 x Transformer → 2 x FC → Vector Encoding
ParallelCNN encoding
!30
name: text_csv_column_name
type: text
encoder: parallel_cnn
level: word / char
tied_weights: null
representation: dense / sparse
embedding_size: 256
embeddings_on_cpu: false
pretrained_embeddings: null
embeddings_trainable: true
conv_layers: null
num_conv_layers: null
filter_size: 3
num_filters: 256
pool_size: null
fc_layers: null
num_fc_layers: null
fc_size: 256
activation: relu
norm: null
dropout: false
regularize: true
reduce_output: sum
Grey parameters
are optional
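As an example of overriding the defaults, any of the parameters listed above can be set directly on the feature in the model definition (the values here are illustrative):
name: news
type: text
encoder: parallel_cnn
level: word
num_filters: 512
filter_size: 5
num_fc_layers: 2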
ParallelCNN encoding
!31
name: text_csv_column_name,
type: text,
encoder: parallel_cnn,
level: word / char,
tied_weights: null
representation: dense
embedding_size: 256
embeddings_on_cpu: false
pretrained_embeddings: null
embeddings_trainable: true
conv_layers: null
num_conv_layers: null
filter_size: 3
num_filters: 256
pool_size: null
fc_layers: null
num_fc_layers: null
fc_size: 256
activation: relu
norm: null
dropout: false
regularize: true
reduce_output: sum
class ParallelCNN(object):
def __init__(
self,
should_embed=True,
vocab=None,
representation='dense',
embedding_size=256,
embeddings_trainable=True,
pretrained_embeddings=None,
embeddings_on_cpu=False,
conv_layers=None,
num_conv_layers=None,
filter_size=3,
num_filters=256,
pool_size=None,
fc_layers=None,
num_fc_layers=None,
fc_size=256,
norm=None,
activation='relu',
dropout=False,
initializer=None,
regularize=True,
reduce_output='max',
**kwargs):
Code that builds
a parallel_cnn
StackedCNN encoding
!32
Grey parameters
are optional
name: text_csv_column_name
type: text
encoder: stacked_cnn
level: word / char
tied_weights: null
representation: dense / sparse
embedding_size: 256
embeddings_on_cpu: false
pretrained_embeddings: null
embeddings_trainable: true
conv_layers: null
num_conv_layers: null
filter_size: 3
num_filters: 256
pool_size: null
fc_layers: null
num_fc_layers: null
fc_size: 256
activation: relu
norm: null
dropout: false
initializer: null
regularize: true
reduce_output: max
StackedParallelCNN encoding
!33
Grey parameters
are optional
name: text_csv_column_name
type: text
encoder: stacked_parallel_cnn
level: word / char
tied_weights: null
representation: dense / sparse
embedding_size: 256
embeddings_on_cpu: false
pretrained_embeddings: null
embeddings_trainable: true
stacked_layers: null
num_stacked_layers: null
filter_size: 3
num_filters: 256
pool_size: null
fc_layers: null
num_fc_layers: null
fc_size: 256
norm: null
activation: relu
regularize: true
reduce_output: max
RNN encoding
!34
Grey parameters
are optional
name: text_csv_column_name
type: text
encoder: rnn
level: word / char
tied_weights: null
representation: dense / sparse
embedding_size: 256
embeddings_on_cpu: false
pretrained_embeddings: null
embeddings_trainable: true
num_layers: 1
cell_type: rnn
state_size: 256
bidirectional: false
dropout: false
initializer: null
regularize: true
reduce_output: sum
CNN-RNN encoding
!35
Grey parameters
are optional
name: text_csv_column_name
type: text
encoder: cnn_rnn
level: word / char
tied_weights: null
representation: dense / sparse
embedding_size: 256
embeddings_on_cpu: false
pretrained_embeddings: null
embeddings_trainable: true
conv_layers: null
num_conv_layers: null
filter_size: 3
num_filters: 256
pool_size: null
norm: null
activation: relu
num_rec_layers: 1
cell_type: rnn
state_size: 256
bidirectional: false
dropout: false
initializer: null
regularize: true
reduce_output: last
Sequence, Time Series and Audio / Speech encoders
Parallel CNN: Sequence → 4 parallel 1D Conv (widths 2, 3, 4, 5) → 2 x FC → Vector Encoding
Stacked CNN: Sequence → 6 x 1D Conv → 2 x FC → Vector Encoding
Stacked Parallel CNN: Sequence → 3 x Parallel CNN → 2 x FC → Vector Encoding
RNN: Sequence → 2 x RNN → 2 x FC → Vector Encoding
CNN RNN: Sequence → 3 x Conv → 2 x RNN → 2 x FC → Vector Encoding
Transformer: Sequence → 6 x Transformer → 2 x FC → Vector Encoding
Image feature encoders
!37
Stacked CNN: Image → 3 x 2D Conv → 2 x FC → Vector Encoding
ResNet: Image → 8 x Res Block → 2 x FC → Vector Encoding
Category features embedding encoding
!38
Grey parameters
are optional
name: category_csv_column_name
type: category
representation: dense (sparse = one-hot encoding)
embedding_size: 256
Numerical and binary features
encoders
Numerical features are encoded with one
FC layer + Norm for scaling purposes or
are used as they are
Binary features are used as they are
Set and bag features encoders
They are both encoded by embedding
the items, aggregating them, and
passing them through FC layers
In the case of bags, the aggregation is
a weighted sum
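A sketch of how such features are declared in the model definition (column names are illustrative; embedding_size is assumed to apply to these encoders as it does to the other embedding-based ones above):
input_features:
    -
        name: order
        type: bag
    -
        name: tags
        type: set
        embedding_size: 64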
Concat combiner
!41
combiner:
type: concat
fc_layers: null
num_fc_layers: 0
fc_size: 256
activation: relu
norm: null
dropout: false
initializer: null
regularize: true
Grey parameters
are optional
Diagram: Feature Encoding (one per input feature) → Concat → FC layers
Category features decoding
!42
name: category_csv_column_name
type: category
reduce_inputs: sum
loss:
type: softmax_cross_entropy
confidence_penalty: 0
robust_lambda: 0
class_weights: 1
class_distances: null
class_distance_temperature: 0
labels_smoothing: 0
negative_samples: 0
sampler: null
distortion: 1
unique: false
fc_layers: null
num_fc_layers: 0
fc_size: 256
activation: relu
norm: null
dropout: false
initializer: null
regularize: true
top_k: 3
Grey parameters
are optional
Numerical features decoding
!43
Grey parameters
are optional
name: numerical_csv_column_name
type: numerical
reduce_inputs: sum
loss:
type: mean_squared_error
fc_layers: null
num_fc_layers: 0
fc_size: 256
activation: relu
norm: null
dropout: false
initializer: null
regularize: true
Binary features decoding
!44
Grey parameters
are optional
name: binary_csv_column_name
type: binary
reduce_inputs: sum
loss:
type: cross_entropy
confidence_penalty: 0
robust_lambda: 0
fc_layers: null
num_fc_layers: 0
fc_size: 256
activation: relu
norm: null
dropout: false
initializer: null
regularize: true
threshold: 0.5
Sequence features decoding
!45
Grey parameters
are optional
name: sequence_csv_column_name
type: sequence
decoder: generator / tagger
reduce_inputs: sum
loss:
type: softmax_cross_entropy
confidence_penalty: 0
robust_lambda: 0
class_weights: 1
class_distances: null
class_distance_temperature: 0
labels_smoothing: 0
negative_samples: 0
sampler: null
distortion: 1
unique: false
fc_layers: null
num_fc_layers: 0
fc_size: 256
activation: relu
norm: null
dropout: false
initializer: null
regularize: true
Set features decoding
!46
Grey parameters
are optional
name: set_csv_column_name
type: set
reduce_inputs: sum
loss:
type: sigmoid_cross_entropy
fc_layers: null
num_fc_layers: 0
fc_size: 256
activation: relu
norm: null
dropout: false
initializer: null
regularize: true
threshold: 0.5
Output features dependencies DAG
!47
Diagram: the Combiner output feeds the decoders of of_1, of_2 and of_3; of_1 additionally depends on of_2, and of_3 depends on both of_2 and of_1; each output feature has its own loss, and the losses are summed into a Combined Loss
[{name: of_1,
type: any_type,
dependencies: [of_2]},
{name: of_2,
type: any_type,
dependencies: []},
{name: of_3,
type: any_type,
dependencies: [of_2, of_1]}]
Training parameters
!48
training:
batch_size: 128
epochs: 100
early_stop: 5
optimizer: {type: adam, beta1: 0.9, beta2: 0.999, epsilon: 1e-08}
learning_rate: 0.001
decay: false
decay_rate: 0.96
decay_steps: 10000
staircase: false
regularization_lambda: 0
dropout_rate: 0.0
reduce_learning_rate_on_plateau: 0
reduce_learning_rate_on_plateau_patience: 5
reduce_learning_rate_on_plateau_rate: 0.5
increase_batch_size_on_plateau: 0
increase_batch_size_on_plateau_patience: 5
increase_batch_size_on_plateau_rate: 2
increase_batch_size_on_plateau_max: 512
validation_field: combined
validation_measure: accuracy
bucketing_field: null
Everything
is optional
Preprocessing parameters
!49
preprocessing:
force_split: False,
split_probabilities: (0.7, 0.1, 0.2),
stratify: None
text:
char_format: characters,
char_most_common: 70,
char_sequence_length_limit: 1024,
fill_value: ,
lowercase: True,
missing_value_strategy: fill_with_const,
padding: right,
padding_symbol: <PAD>,
unknown_symbol: <UNK>,
word_format: space_punct,
word_most_common: 20000,
word_sequence_length_limit: 256,
category:
fill_value: <UNK>,
lowercase: False,
missing_value_strategy: fill_with_const,
most_common: 10000
sequence:
...
set:
...
...
Everything
is optional
Supports text preprocessing in
English, Italian, Spanish, German,
French, Portuguese, Dutch, Greek
and Multi-language
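A sketch of selecting a language-specific tokenizer; it is assumed here that this is controlled through the word_format preprocessing parameter:
preprocessing:
    text:
        word_format: english_tokenize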
Custom Encoder Example
!51
class MySequenceEncoder:
    def __init__(
            self,
            my_param=whatever,
            **kwargs
    ):
        self.my_param = my_param

    def __call__(
            self,
            input_sequence,
            regularizer,
            is_training=True
    ):
        # input_sequence = [b x s] int32
        # every element of s is the
        # int32 id of a token

        # your code goes here,
        # it can use self.my_param

        # hidden to return is
        # [b x s x h] or [b x h]
        # hidden_size is h
        return hidden, hidden_size
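Once the class is registered under an encoder name (my_sequence_encoder here is a hypothetical name), it can be selected from the model definition like any built-in encoder, with extra keys such as my_param forwarded through **kwargs:
input_features:
    -
        name: utterance
        type: sequence
        encoder: my_sequence_encoder
        my_param: some_value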
Example Models
ludwig experiment
--data_csv reuters-allcats.csv
--model_definition "{input_features:
[{name: news, type: text, encoder:
parallel_cnn, level: word}],
output_features: [{name: class, type:
category}]}"
Text Classification
!53
NEWS CLASS
Toronto Feb 26 - Standard Trustco
said it expects earnings in 1987 to
increase at least 15...
earnings
New York Feb 26 - American Express
Co remained silent on market
rumors...
acquisition
BANGKOK March 25 - Vietnam will
resettle 300 000 people on state
farms known as new economic...
coffee
Diagram: Text → 4 x CNN → FC → Softmax → Class
ludwig experiment
--data_csv imagenet.csv
--model_definition "{input_features:
[{name: image_path, type: image, encoder:
stacked_cnn}], output_features: [{name:
class, type: category}]}"
Image Object Classification (MNIST, ImageNet, ..)
!54
IMAGE_PATH CLASS
imagenet/image_000001.jpg car
imagenet/image_000002.jpg dog
imagenet/image_000003.jpg boat
Diagram: Image → CNNs → FC → Softmax → Class
ludwig experiment
--data_csv eats_etd.csv
--model_definition "{input_features: [{name:
restaurant, type: category, encoder: embed},
{name: order, type: bag}, {name: hour, type:
numerical}, {name: min, type: numerical}],
output_features: [{name: etd, type: numerical,
loss: {type: mean_absolute_error}}]}"
Expected Time of Delivery
!55
RESTAURANT | ORDER | HOUR | MIN | ETD
Halal Guys | Chicken kebab, Lamb kebab | 20 | 00 | 20
Il Casaro | Margherita | 18 | 53 | 15
CurryUp | Chicken Tikka Masala, Veg Curry | 21 | 10 | 35
Diagram: Restaurant → Embed, Order → Embed Set, ... → FC → Regress → ETD
ludwig experiment
--data_csv chitchat.csv
--model_definition "{input_features:
[{name: user1, type: text, encoder: rnn,
cell_type: lstm}], output_features: [{name:
user2, type: text, decoder: generator,
cell_type: lstm, attention: bahdanau}]}"
Sequence2Sequence Chit-Chat Dialogue Model
!56
USER1 | USER2
Hello! How are you doing? | Doing well, thanks!
I got promoted today | Congratulations!
Not doing well today | I’m sorry, can I do something to help you?
Diagram: User1 → RNN encoder → RNN decoder → User2
ludwig experiment
--data_csv sequence_tags.csv
--model_definition "{input_features:
[{name: utterance, type: text, encoder: rnn,
cell_type: lstm}], output_features: [{name:
tag, type: sequence, decoder: tagger}]}"
Sequence Tagging for sequences or time series
!57
UTTERANCE | TAG
Blade Runner is a 1982 neo-noir science fiction film directed by Ridley Scott | N N O O D O O O O O O P P
Harrison Ford and Rutger Hauer starred in it | P P O P P O O O
Philip Dick 's novel Do Androids Dream of Electric Sheep ? was published in 1968 | P P O O N N N N N N N O O O D
Diagram: Utterance → RNN → Softmax → Tag
!58
Programmatic API
Easy to install:
pip install ludwig
Programmatic API:
from ludwig.api import LudwigModel
model = LudwigModel(model_definition)
model.train(train_data)
# or model = LudwigModel.load(path)
predictions = model.predict(other_data)
!59
Additional Goodies
Model serving:
ludwig serve --model_path <model_path>
Integration with other tools:
▪︎ Train models in a distributed fashion using Horovod
▪︎ Deploy models seamlessly using PyML
▪︎ Hyperparameter optimization using BOCS /
Quicksilver and AutoTune
Hyperparameter optimization
!60
{"task": "python",
"task_params": {
"file": "/home/work/wrapper.py",
"function": "ludwig_wrapper",
"kwargs": {
"base_model_definition": {
"input_features": [
{
"type": "text",
"encoder": "parallel_cnn"
},
{
"name": "flow_node",
"type": "categorical",
"representation": "dense"
}
],
"output_features": [
{
"name": "contact_type",
"type": "category"
}
],}}}
"strategy": "random", <- grid / random / bayesian
"strategy_params": {
"samples_limit": 50
},
"params": {
"training__dropout": {
"type": "float",
"min": 0.0,
"max": 0.5
},
"flow_node__embedding_size": {
"type": "int",
"min": 50,
"max": 500
}
}
}
Why it is great for experimentation
Easy to plug in new encoders, decoders
and combiners
Change just one string in the YAML file
to run a new experiment
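For example, swapping the text classifier's encoder from parallel_cnn to rnn is exactly such a one-string change:
input_features: [{name: news, type: text, encoder: parallel_cnn}]
becomes
input_features: [{name: news, type: text, encoder: rnn}]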
Visualizations
!63
Outputs
A directory containing the trained model
CSV file for each output feature containing predictions
NPY file for each output feature containing probabilities
JSON file containing training statistics
JSON file containing test statistics
Monitoring during training is available through TensorBoard
!64
Visualizations are one command away
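A sketch of one such command, assuming the training statistics JSON produced by the experiment above (the file path is illustrative):
ludwig visualize
--visualization learning_curves
--training_statistics results/experiment_run/training_statistics.json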
!65
Next steps
More features: vectors, nested lists, point clouds, videos
More pretrained input encoders
Adding missing decoders
Petastorm integration to work directly on Hive tables
without intermediate CSV
Documentation
http://ludwig.ai
Repository
http://uber.github.com/ludwig
Blogpost
http://eng.uber.com/introducing_ludwig
Key contributors:
Yaroslav Dudin
Sai Sumanth Miryala
Extra
Machine Learning democratization
Non-expert users:
From idea to model training in minutes
No need for coding and deep knowledge
Easy declarative model construction
hides complexity
Expert users:
Efficient exploration of model space
through quick iteration
Tweak every aspect of preprocessing,
modeling and training
Extend with your models
Ludwig ≠ AutoML

08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Último (20)

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Ludwig: A code-free deep learning toolbox | Piero Molino, Uber AI

  • 1. Piero Molino / Uber AI / piero@uber.com A code-free deep learning toolbox
  • 2. Deep Learning experimentation toolbox based on TensorFlow No coding required
  • 3. !3 UNDERSTANDABLE We provide standard visualizations to understand their performances and compare their predictions. EASY No coding skills are required to train a model and use it for obtaining predictions. OPEN Released under the open source Apache License 2.0. GENERAL A new data type-based approach to deep learning model design that makes the tool suited for many different applications. FLEXIBLE Experienced users have deep control over model building and training, while newcomers will find it easy to use. EXTENSIBLE Easy to add new model architecture and new feature data-types.
  • 5. Example !5 NEWS CLASS Toronto Feb 26 - Standard Trustco said it expects earnings in 1987 to increase at least 15 to 20 pct from the 9 140 000 dlrs or 252 dlrs... earnings New York Feb 26 - American Express Co remained silent on market rumors it would spinoff all or part of its Shearson Lehman Brothers Inc... acquisition BANGKOK March 25 - Vietnam will resettle 300 000 people on state farms known as new economic zones in 1987 to create jobs and grow more high-value export crops... coffe
  • 6. Example experiment command !6 The Experiment command: 1. splits the data in training, validation and test sets 2. trains a model on the training set 3. validates on the validation set (early stopping) 4. predicts on the test set ludwig experiment --data_csv reuters-allcats.csv --model_definition "{input_features: [{name: news, type: text}], output_features: [{name: class, type: category}]}" # you can specify a --model_definition_file file_path argument instead # that reads a YAML file
  • 7. Epoch 1
 Training: 100%|██████████████████████████████| 23/23 [00:02<00:00, 9.86it/s]
 Evaluation train: 100%|██████████████████████| 23/23 [00:00<00:00, 46.39it/s]
 Evaluation vali : 100%|████████████████████████| 4/4 [00:00<00:00, 29.05it/s]
 Evaluation test : 100%|████████████████████████| 7/7 [00:00<00:00, 43.06it/s]
 Took 2.8826s
 ╒═════════╤════════╤════════════╤══════════╕
 │ class │ loss │ accuracy │ hits@3 │
 ╞═════════╪════════╪════════════╪══════════╡
 │ train │ 0.2604 │ 0.9306 │ 0.9899 │
 ├─────────┼────────┼────────────┼──────────┤
 │ vali │ 0.3678 │ 0.8792 │ 0.9769 │
 ├─────────┼────────┼────────────┼──────────┤
 │ test │ 0.3532 │ 0.8929 │ 0.9903 │
 ╘═════════╧════════╧════════════╧══════════╛
 Validation accuracy on class improved, model saved !7 First epoch of training
  • 8. !8 Epoch 4
 Training: 100%|██████████████████████████████| 23/23 [00:01<00:00, 14.87it/s]
 Evaluation train: 100%|██████████████████████| 23/23 [00:00<00:00, 48.87it/s]
 Evaluation vali : 100%|████████████████████████| 4/4 [00:00<00:00, 59.50it/s]
 Evaluation test : 100%|████████████████████████| 7/7 [00:00<00:00, 51.70it/s]
 Took 1.7578s
 ╒═════════╤════════╤════════════╤══════════╕
 │ class │ loss │ accuracy │ hits@3 │
 ╞═════════╪════════╪════════════╪══════════╡
 │ train │ 0.0048 │ 0.9997 │ 1.0000 │
 ├─────────┼────────┼────────────┼──────────┤
 │ vali │ 0.2560 │ 0.9177 │ 0.9949 │
 ├─────────┼────────┼────────────┼──────────┤
 │ test │ 0.1677 │ 0.9440 │ 0.9964 │
 ╘═════════╧════════╧════════════╧══════════╛
 Validation accuracy on class improved, model saved Fourth epoch of training, validation accuracy improved
  • 9. !9 Epoch 14 Training: 100%|██████████████████████████████| 23/23 [00:01<00:00, 14.69it/s] Evaluation train: 100%|██████████████████████| 23/23 [00:00<00:00, 48.67it/s] Evaluation vali : 100%|████████████████████████| 4/4 [00:00<00:00, 61.29it/s] Evaluation test : 100%|████████████████████████| 7/7 [00:00<00:00, 50.45it/s] Took 1.7749s ╒═════════╤════════╤════════════╤══════════╕ │ class │ loss │ accuracy │ hits@3 │ ╞═════════╪════════╪════════════╪══════════╡ │ train │ 0.0001 │ 1.0000 │ 1.0000 │ ├─────────┼────────┼────────────┼──────────┤ │ vali │ 0.4208 │ 0.9100 │ 0.9974 │ ├─────────┼────────┼────────────┼──────────┤ │ test │ 0.2081 │ 0.9404 │ 0.9988 │ ╘═════════╧════════╧════════════╧══════════╛ Last improvement of accuracy on class happened 10 epochs ago Fourteenth epoch of training, validation accuracy has not improved in a while, let’s EARLY STOP
  • 10. !10 ===== class ===== accuracy: 0.94403892944 hits_at_k: 0.996350364964 Confusion Matrix ( { 'avg_f1_score_macro': 0.6440659449587669, 'avg_f1_score_micro': 0.94403892944038914, 'avg_f1_score_weighted': 0.94233823910531345, 'avg_precision_macro': 0.71699990923354218, 'avg_precision_micro': 0.94403892944038925, 'avg_precision_weighted': 0.94403892944038925, 'avg_recall_macro': 0.60875299574797059, 'avg_recall_micro': 0.94403892944038925, 'avg_recall_weighted': 0.94403892944038925, 'kappa_score': 0.91202440199068868, 'overall_accuracy': 0.94403892944038925},) Test statistics overall
  • 11. !11 { earn: {'accuracy': 0.95377128953771284, 'f1_score': 0.95214105793450876, 'fall_out': 0.046948356807511749, 'false_discovery_rate': 0.050251256281407031, 'false_negative_rate': 0.045454545454545414, 'false_negatives': 18, 'false_omission_rate': 0.042452830188679291, 'false_positive_rate': 0.046948356807511749, 'false_positives': 20, 'hit_rate': 0.95454545454545459, 'informedness': 0.90759709773794284, 'markedness': 0.90729591352991368, 'matthews_correlation_coefficient': 0.90744649313843584, 'miss_rate': 0.045454545454545414, 'negative_predictive_value': 0.95754716981132071, 'positive_predictive_value': 0.94974874371859297, 'precision': 0.94974874371859297, 'recall': 0.95454545454545459, 'sensitivity': 0.95454545454545459, 'specificity': 0.95305164319248825, 'true_negative_rate': 0.95305164319248825, 'true_negatives': 406, 'true_positive_rate': 0.95454545454545459, 'true_positives': 378}, Test statistics per class
  • 12. How does it work?
  • 13. Training !13 Fields mappings, model hyperparameters and weights are saved during training Raw Data Processed Data Model
 Training Pre Processing CSV HDF5 / Numpy JSON Weights Hyper parameters Tensor JSON Fields Mappings
  • 14. Prediction !14 The same fields mappings obtained during training are used to preprocess each datapoint and to postprocess each prediction of the model in order to map back to labels Raw Data Processed Data Model
 Training Weights Pre Processing Hyper parameters Raw Data Processed Data Predictions and Scores Pre Processing Labels Post Processing Model Prediction JSON Tensor JSON CSV HDF5 / Numpy Numpy CSV Fields Mappings
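To make the idea of fields mappings concrete, here is a minimal sketch of how an index-to-label mapping saved at training time can be applied at postprocessing time; the dictionary structure below is an illustration only, not Ludwig's actual JSON schema.

# Hypothetical mapping saved to JSON during preprocessing / training
class_mapping = {
    'idx2str': ['earnings', 'acquisition'],
    'str2idx': {'earnings': 0, 'acquisition': 1}
}

predicted_idx = 1                                          # raw model output (argmax of the softmax)
predicted_label = class_mapping['idx2str'][predicted_idx]  # 'acquisition'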
  • 15. Running Experiments !15 The Experiment trains a model on the training set and predicts on the test set. It outputs predictions and training and test statistics that can be used by the Visualization component to obtain graphs Raw Data Experiment Labels CSV CSV Weights Hyper parameters Tensor JSON Training Statistics JSON Test Statistics JSON Visualization Graphs PNG / PDF JSON Fields Mappings
  • 17. The magic lies in: Data type abstraction, Model definition YAML file, Smart use of **kwargs
  • 18. Back to the example !18 ludwig experiment --data_csv reuters-allcats.csv --model_definition "{input_features: [{name: news, type: text}], output_features: [{name: class, type: category}]}"
  • 19. Model Definition !19 ludwig experiment --data_csv reuters-allcats.csv --model_definition "{input_features: [{name: news, type: text}], output_features: [{name: class, type: category}]}"
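The same definition can also live in a file passed with --model_definition_file; a minimal sketch of that YAML file (the file name is illustrative):

input_features:
    -
        name: news
        type: text
output_features:
    -
        name: class
        type: category

ludwig experiment --data_csv reuters-allcats.csv --model_definition_file model_definition.yaml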
  • 20. Merging with defaults it gets resolved to !20 training: { batch_size: 128, bucketing_field: None, decay: False, decay_rate: 0.96, decay_steps: 10000, dropout_rate: 0.0, early_stop: 3, epochs: 20, gradient_clipping: None, increase_batch_size_on_plateau: 0, increase_batch_size_on_plateau_max: 512, increase_batch_size_on_plateau_patience: 5, increase_batch_size_on_plateau_rate: 2, learning_rate: 0.001, learning_rate_warmup_epochs: 5, optimizer: { beta1: 0.9, beta2: 0.999, epsilon: 1e-08, type: adam}, reduce_learning_rate_on_plateau: 0, reduce_learning_rate_on_plateau_patience: 5, reduce_learning_rate_on_plateau_rate: 0.5, regularization_lambda: 0, regularizer: l2, staircase: False, validation_field: combined, validation_measure: loss}, preprocessing: { text: { char_format: characters, char_most_common: 70, char_sequence_length_limit: 1024, fill_value: , lowercase: True, missing_value_strategy: fill_with_const, padding: right, padding_symbol: <PAD>, unknown_symbol: <UNK>, word_format: space_punct, word_most_common: 20000, word_sequence_length_limit: 256}, category: { fill_value: <UNK>, lowercase: False, missing_value_strategy: fill_with_const, most_common: 10000}, force_split: False, split_probabilities: (0.7, 0.1, 0.2), stratify: None } { input_features: [ { encoder: parallel_cnn, level: word, name: news, tied_weights: None, type: text}], combiner: {type: concat}, output_features: [ { dependencies: [], loss: { class_distance_temperature: 0, class_weights: 1, confidence_penalty: 0, distortion: 1, labels_smoothing: 0, negative_samples: 0, robust_lambda: 0, sampler: None, type: softmax_cross_entropy, unique: False, weight: 1}, name: is_name, reduce_dependencies: sum, reduce_input: sum, top_k: 3, type: category}],
  • 21. Merging with defaults it gets resolved to !21 training: { batch_size: 128, bucketing_field: None, decay: False, decay_rate: 0.96, decay_steps: 10000, dropout_rate: 0.0, early_stop: 3, epochs: 20, gradient_clipping: None, increase_batch_size_on_plateau: 0, increase_batch_size_on_plateau_max: 512, increase_batch_size_on_plateau_patience: 5, increase_batch_size_on_plateau_rate: 2, learning_rate: 0.001, learning_rate_warmup_epochs: 5, optimizer: { beta1: 0.9, beta2: 0.999, epsilon: 1e-08, type: adam}, reduce_learning_rate_on_plateau: 0, reduce_learning_rate_on_plateau_patience: 5, reduce_learning_rate_on_plateau_rate: 0.5, regularization_lambda: 0, regularizer: l2, staircase: False, validation_field: combined, validation_measure: loss}, preprocessing: { text: { char_format: characters, char_most_common: 70, char_sequence_length_limit: 1024, fill_value: , lowercase: True, missing_value_strategy: fill_with_const, padding: right, padding_symbol: <PAD>, unknown_symbol: <UNK>, word_format: space_punct, word_most_common: 20000, word_sequence_length_limit: 256}, category: { fill_value: <UNK>, lowercase: False, missing_value_strategy: fill_with_const, most_common: 10000}, force_split: False, split_probabilities: (0.7, 0.1, 0.2), stratify: None } { input_features: [ { encoder: parallel_cnn, level: word, name: news, tied_weights: None, type: text}], combiner: {type: concat}, output_features: [ { dependencies: [], loss: { class_distance_temperature: 0, class_weights: 1, confidence_penalty: 0, distortion: 1, labels_smoothing: 0, negative_samples: 0, robust_lambda: 0, sampler: None, type: softmax_cross_entropy, unique: False, weight: 1}, name: is_name, reduce_dependencies: sum, reduce_input: sum, top_k: 3, type: category}],
  • 22. preprocessing: { text: { char_format: characters, char_most_common: 70, char_sequence_length_limit: 1024, fill_value: , lowercase: True, missing_value_strategy: fill_with_const, padding: right, padding_symbol: <PAD>, unknown_symbol: <UNK>, word_format: space_punct, word_most_common: 20000, word_sequence_length_limit: 256}, category: { fill_value: <UNK>, lowercase: False, missing_value_strategy: fill_with_const, most_common: 10000}, force_split: False, split_probabilities: (0.7, 0.1, 0.2), stratify: None } training: { batch_size: 128, bucketing_field: None, decay: False, decay_rate: 0.96, decay_steps: 10000, dropout_rate: 0.0, early_stop: 3, epochs: 20, gradient_clipping: None, increase_batch_size_on_plateau: 0, increase_batch_size_on_plateau_max: 512, increase_batch_size_on_plateau_patience: 5, increase_batch_size_on_plateau_rate: 2, learning_rate: 0.001, learning_rate_warmup_epochs: 5, optimizer: { beta1: 0.9, beta2: 0.999, epsilon: 1e-08, type: adam}, reduce_learning_rate_on_plateau: 0, reduce_learning_rate_on_plateau_patience: 5, reduce_learning_rate_on_plateau_rate: 0.5, regularization_lambda: 0, regularizer: l2, staircase: False, validation_field: combined, validation_measure: loss}, { input_features: [ { encoder: parallel_cnn, level: word, name: text, tied_weights: None, type: text}], combiner: {type: concat}, output_features: [ { dependencies: [], loss: { class_distance_temperature: 0, class_weights: 1, confidence_penalty: 0, distortion: 1, labels_smoothing: 0, negative_samples: 0, robust_lambda: 0, sampler: None, type: softmax_cross_entropy, unique: False, weight: 1}, name: is_name, reduce_dependencies: sum, reduce_input: sum, top_k: 3, type: category}], Merging with defaults it gets resolved to !22 INPUT FEATURES COMBINER OUTPUT FEATURES TRAINING PREPROCESSING
  • 23. !23 INPUT COMBINER OUTPUT Combiner Encoder Encoder Encoder Encoder Encoder Encoder Encoder Encoder Text Category Numerical Binary Sequence Set Bag Image Time series Audio / Speech H3 Date Encoder Encoder Encoder Encoder Decoder Decoder Decoder Decoder Decoder Decoder Decoder Decoder Decoder Decoder Decoder Decoder Text Category Numerical Binary Sequence Set Bag Image Time series Audio / Speech H3 Date
  • 28. Each input type can have multiple encoders to choose from, and each output type can have multiple decoders to choose from.
  • 29. Text input features encoders CNN RNN Stacked CNN RNN Parallel CNN Words / Char 6 x 1D Conv 2 x FC Vector Encoding Words / Char 2 x FC Vector Encoding 1D Conv width 5 1D Conv width 4 1D Conv width 3 1D Conv width 2 Words / Char 2 x RNN 2 x FC Vector Encoding Words / Char 2 x RNN 2 x FC Vector Encoding 3 x Conv Stacked Parallel CNN Words / Char 3 x Parallel CNN 2 x FC Vector Encoding Transformer / BERT Words / Char Vector Encoding 6 x Transformer 2 x FC
  • 30. ParallelCNN encoding !30 name: text_csv_column_name type: text encoder: parallel_cnn level: word / char tied_weights: null representation: dense / sparse embedding_size: 256 embeddings_on_cpu: false pretrained_embeddings: null embeddings_trainable: true conv_layers: null num_conv_layers: null filter_size: 3 num_filters: 256 pool_size: null fc_layers: null num_fc_layers: null fc_size: 256 activation: relu norm: null dropout: false regularize: true reduce_output: sum Grey parameters are optional
  • 31. ParallelCNN encoding !31 name: text_csv_column_name, type: text, encoder: parallel_cnn, level: word / char, tied_weights: null representation: dense embedding_size: 256 embeddings_on_cpu: false pretrained_embeddings: null embeddings_trainable: true conv_layers: null num_conv_layers: null filter_size: 3 num_filters: 256 pool_size: null fc_layers: null num_fc_layers: null fc_size: 256 activation: relu norm: null dropout: false regularize: true reduce_output: sum class ParallelCNN(object): def __init__( self, should_embed=True, vocab=None, representation='dense', embedding_size=256, embeddings_trainable=True, pretrained_embeddings=None, embeddings_on_cpu=False, conv_layers=None, num_conv_layers=None, filter_size=3, num_filters=256, pool_size=None, fc_layers=None, num_fc_layers=None, fc_size=256, norm=None, activation='relu', dropout=False, initializer=None, regularize=True, reduce_output='max', **kwargs): Code that builds a parallel_cnn
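To see why **kwargs matters here, a minimal sketch (illustrative only, not Ludwig's actual internals) of how a feature definition parsed from the YAML can be passed straight into an encoder constructor, so that omitted parameters fall back to the defaults declared in __init__ and unrelated keys are simply absorbed:

class ParallelCNNSketch:
    def __init__(self, filter_size=3, num_filters=256, fc_size=256,
                 activation='relu', reduce_output='max', **kwargs):
        # only the parameters the user specified override these defaults;
        # keys like name, type and encoder end up in **kwargs and are ignored here
        self.filter_size = filter_size
        self.num_filters = num_filters
        self.fc_size = fc_size
        self.activation = activation
        self.reduce_output = reduce_output

encoder_registry = {'parallel_cnn': ParallelCNNSketch}

# feature definition as it would come out of the parsed model definition
feature_definition = {'name': 'news', 'type': 'text',
                      'encoder': 'parallel_cnn', 'num_filters': 512}

encoder_cls = encoder_registry[feature_definition['encoder']]
encoder = encoder_cls(**feature_definition)
print(encoder.num_filters, encoder.filter_size)  # 512 3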
  • 32. StackedCNN encoding !32 Grey parameters are optional name: text_csv_column_name type: text encoder: stacked_cnn level: word / char tied_weights: null representation: dense / sparse embedding_size: 256 embeddings_on_cpu: false pretrained_embeddings: null embeddings_trainable: true conv_layers: null num_conv_layers: null filter_size: 3 num_filters: 256 pool_size: null fc_layers: null num_fc_layers: null fc_size: 256 activation: relu norm: null dropout: false initializer: null regularize: true reduce_output: max
  • 33. StackedParallelCNN encoding !33 Grey parameters are optional name: text_csv_column_name type: text encoder: stacked_parallel_cnn level: word / char tied_weights: null representation: dense / sparse embedding_size: 256 embeddings_on_cpu: false pretrained_embeddings: null embeddings_trainable: true stacked_layers: null num_stacked_layers: null filter_size: 3 num_filters: 256 pool_size: null fc_layers: null num_fc_layers: null fc_size: 256 norm: null activation: relu regularize: true reduce_output: max
  • 34. RNN encoding !34 Grey parameters are optional name: text_csv_column_name type: text encoder: rnn level: word / char tied_weights: null representation: dense / sparse embedding_size: 256 embeddings_on_cpu: false pretrained_embeddings: null embeddings_trainable: true num_layers: 1 cell_type: rnn state_size: 256 bidirectional: false dropout: false initializer: null regularize: true reduce_output: sum
  • 35. CNN-RNN encoding !35 Grey parameters are optional name: text_csv_column_name type: text encoder: cnn_rnn level: word / char tied_weights: null representation: dense / sparse embedding_size: 256 embeddings_on_cpu: false pretrained_embeddings: null embeddings_trainable: true conv_layers: null num_conv_layers: null filter_size: 3 num_filters: 256 pool_size: null norm: null activation: relu num_rec_layers: 1 cell_type: rnn state_size: 256 bidirectional: false dropout: false initializer: null regularize: true reduce_output: last
  • 36. Sequence, Time Series and Audio / Speech encoders CNN RNN Stacked CNN RNN Parallel CNN Sequence 6 x 1D Conv 2 x FC Vector Encoding Sequence 2 x FC Vector Encoding 1D Conv width 5 1D Conv width 4 1D Conv width 3 1D Conv width 2 Sequence 2 x RNN 2 x FC Vector Encoding Sequence 2 x RNN 2 x FC Vector Encoding 3 x Conv Stacked Parallel CNN Sequence 3 x Parallel CNN 2 x FC Vector Encoding Transformer Sequence Vector Encoding 6 x Transformer 2 x FC
  • 37. Image feature encoders !37 Stacked CNN Image 3 x 2D Conv 2 x FC Vector Encoding Image 8 x Res Block 2 x FC Vector Encoding ResNet
  • 38. Category features embedding encoding !38 Grey parameters are optional name: category_csv_column_name type: category representation: dense, ← sparse = one hot encoding embedding_size: 256
  • 39. Numerical and binary features encoders. Numerical features are encoded with one FC layer + Norm for scaling purposes, or are used as they are; binary features are used as they are.
  • 40. Set and bag features encoders. They are both encoded by embedding the items, aggregating them, and passing them through FC layers. In the case of bags, the aggregation is a weighted sum.
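As a rough illustration of the bag case, a numpy sketch of a weighted sum of item embeddings, where the weights are the item frequencies in the bag (sizes and values are made up, not Ludwig code):

import numpy as np

vocab_size, embedding_size = 100, 8
embeddings = np.random.randn(vocab_size, embedding_size)   # one embedding per item id

bag_item_ids = np.array([3, 17, 17, 42])                    # the items in one bag
item_ids, counts = np.unique(bag_item_ids, return_counts=True)
# weighted sum of the embeddings of the distinct items, weighted by their frequency
bag_encoding = (counts[:, None] * embeddings[item_ids]).sum(axis=0)  # shape: [embedding_size]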
  • 41. Concat combiner !41 combiner: type: concat fc_layers: null num_fc_layers: 0 fc_size: 256 activation: relu norm: null dropout: false initializer: null regularize: true Grey parameters are optional Feature Encoding ... Feature Encoding Concat FC layers
  • 42. Category features decoding !42 name: category_csv_column_name type: category reduce_inputs: sum loss: type: softmax_cross_entropy confidence_penalty: 0 robust_lambda: 0 class_weights: 1 class_distances: null class_distance_temperature: 0 labels_smoothing: 0 negative_samples: 0 sampler: null distortion: 1 unique: false fc_layers: null num_fc_layers: 0 fc_size: 256 activation: relu norm: null dropout: false initializer: null regularize: true top_k: 3 Grey parameters are optional
  • 43. Numerical features decoding !43 Grey parameters are optional name: numerical_csv_column_name type: numerical reduce_inputs: sum loss: type: mean_squared_error fc_layers: null num_fc_layers: 0 fc_size: 256 activation: relu norm: null dropout: false initializer: null regularize: true
  • 44. Binary features decoding !44 Grey parameters are optional name: binary_csv_column_name type: binary reduce_inputs: sum loss: type: cross_entropy confidence_penalty: 0 robust_lambda: 0 fc_layers: null num_fc_layers: 0 fc_size: 256 activation: relu norm: null dropout: false initializer: null regularize: true threshold: 0.5
  • 45. Sequence features decoding !45 Grey parameters are optional name: sequence_csv_column_name type: sequence Decoder: generator / tagger reduce_inputs: sum loss: type: softmax_cross_entropy confidence_penalty: 0 robust_lambda: 0 class_weights: 1 class_distances: null class_distance_temperature: 0 labels_smoothing: 0 negative_samples: 0 sampler: null distortion: 1 unique: false fc_layers: null num_fc_layers: 0 fc_size: 256 activation: relu norm: null dropout: false initializer: null regularize: true
  • 46. Set features decoding !46 Grey parameters are optional name: binary_csv_column_name, type: binary, reduce_inputs: sum dependencies: [] reduce_dependencies: sum loss: type: cross_entropy confidence_penalty: 0 robust_lambda: 0 fc_layers: null num_fc_layers: 0 fc_size: 256 activation: relu norm: null dropout: false initializer: null regularize: true threshold: 0.5 name: set_csv_column_name type: set reduce_inputs: sum loss: type: sigmoid_cross_entropy fc_layers: null num_fc_layers: 0 fc_size: 256 activation: relu norm: null dropout: false initializer: null regularize: true threshold: 0.5
  • 47. Output features dependencies DAG !47 Combiner Output of_2 Decoder of_3 Decoder of_1 Decoder of_2 Loss of_1 Loss of_3 Loss Combined Loss [{name: of_1, type: any_type, dependencies: [of_2]}, {name: of_2, type: any_type, dependencies: []}, {name: of_3, type: any_type, dependencies: [of_2, of_1]}]
  • 48. training: batch_size: 128 epochs: 100 early_stop: 5 optimizer: {type: adam, beta1: 0.9, beta2: 0.999, epsilon: 1e-08} learning_rate: 0.001 decay: false decay_rate: 0.96 decay_steps: 10000 staircase: false regularization_lambda: 0 dropout_rate: 0.0 reduce_learning_rate_on_plateau: 0 reduce_learning_rate_on_plateau_patience: 5 reduce_learning_rate_on_plateau_rate: 0.5 increase_batch_size_on_plateau: 0 increase_batch_size_on_plateau_patience: 5 increase_batch_size_on_plateau_rate: 2 increase_batch_size_on_plateau_max: 512 validation_field: combined validation_measure: accuracy bucketing_field: null Training parameters !48 Everything is optional
  • 49. Preprocessing parameters !49 preprocessing: force_split: False, split_probabilities: (0.7, 0.1, 0.2), stratify: None text: char_format: characters, char_most_common: 70, char_sequence_length_limit: 1024, fill_value: , lowercase: True, missing_value_strategy: fill_with_const, padding: right, padding_symbol: <PAD>, unknown_symbol: <UNK>, word_format: space_punct, word_most_common: 20000, word_sequence_length_limit: 256, category: fill_value: <UNK>, lowercase: False, missing_value_strategy: fill_with_const, most_common: 10000 sequence: ... set: ... ... Everything is optional
  • 50. Supports text preprocessing in English, Italian, Spanish, German, French, Portuguese, Dutch, Greek and Multi-language
  • 51. Custom Encoder Example !51
 class MySequenceEncoder:
     def __init__(self, my_param=whatever, **kwargs):
         self.my_param = my_param

     def __call__(self, input_sequence, regularizer, is_training=True):
         # input_sequence = [b x s] int32
         # every element of s is the int32 id of a token
         # your code goes here, it can use self.my_param
         # the hidden tensor to return is [b x s x h] or [b x h]
         # hidden_size is h
         return hidden, hidden_size
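As a concrete, hypothetical filling of that skeleton, here is a sketch of an encoder that embeds the token ids and mean-pools them, written in the TensorFlow 1.x style implied by the interface above; vocab_size and embedding_size are illustrative parameters, not Ludwig defaults:

import tensorflow as tf

class MeanEmbeddingEncoder:
    def __init__(self, vocab_size=20000, embedding_size=256, **kwargs):
        self.vocab_size = vocab_size
        self.embedding_size = embedding_size

    def __call__(self, input_sequence, regularizer=None, is_training=True):
        # input_sequence: [b x s] int32 token ids
        embeddings = tf.get_variable(
            'embeddings',
            shape=[self.vocab_size, self.embedding_size],
            regularizer=regularizer)
        embedded = tf.nn.embedding_lookup(embeddings, input_sequence)  # [b x s x h]
        hidden = tf.reduce_mean(embedded, axis=1)                      # [b x h]
        return hidden, self.embedding_size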
  • 53. ludwig experiment --data_csv reuters-allcats.csv --model_definition "{input_features: [{name: news, type: text, encoder: parallel_cnn, level: word}], output_features: [{name: class, type: category}]}" Text Classification !53 NEWS CLASS Toronto Feb 26 - Standard Trustco said it expects earnings in 1987 to increase at least 15... earnings New York Feb 26 - American Express Co remained silent on market rumors... acquisition BANGKOK March 25 - Vietnam will resettle 300 000 people on state farms known as new economic... coffe Text FC CNN CNN CNN CNN Softmax Class
  • 54. ludwig experiment --data_csv imagenet.csv --model_definition "{input_features: [{name: image_path, type: image, encoder: stacked_cnn}], output_features: [{name: class, type: category}]}" Image Object Classification (MNIST, ImageNet, ..) !54 IMAGE_PATH CLASS imagenet/image_000001.jpg car imagenet/image_000002.jpg dog imagenet/image_000003.jpg boat Image FCCNNs Softmax Class
  • 55. ludwig experiment --data_csv eats_etd.csv --model_definition "{input_features: [{name: restaurant, type: category, encoder: embed}, {name: order, type: bag}, {name: hour, type: numerical}, {name: min, type: numerical}], output_features: [{name: etd, type: numerical, loss: {type: mean_absolute_error}}]}" Expected Time of Delivery !55 RESTAURANT ORDER HOUR MIN ETD Halal Guys Chicken kebab, Lamb kebab 20 00 20 Il Casaro Margherita 18 53 15 CurryUp Chicken Tikka Masala, Veg Curry 21 10 35 Restaurant FC Regress ETD EMBED Order EMBED SET ...
  • 56. ludwig experiment --data_csv chitchat.csv --model_definition "{input_features: [{name: user1, type: text, encoder: rnn, cell_type: lstm}], output_features: [{name: user2, type: text, decoder: generator, cell_type: lstm, attention: bahdanau}]}" Sequence2Sequence Chit-Chat Dialogue Model !56 USER1 USER2 Hello! How are you doing? Doing well, thanks! I got promoted today Congratulations! Not doing well today I’m sorry, can I do something to help you? User1 RNN User2RNN
  • 57. ludwig experiment --data_csv sequence_tags.csv --model_definition "{input_features: [{name: utterance, type: text, encoder: rnn, cell_type: lstm}], output_features: [{name: tag, type: sequence, decoder: tagger}]}" Sequence Tagging for sequences or time series !57 UTTERANCE TAG Blade Runner is a 1982 neo-noir science fiction film directed by Ridley Scott N N O O D O O O O O O P P Harrison Ford and Rutger Hauer starred in it P P O P P O O O Philip Dick 's novel Do Androids Dream of Electric Sheep ? was published in 1968 P P O O N N N N N N N O O O D Utterance Softmax Tag RNN
  • 58. !58 Programmatic API
 Easy to install: pip install ludwig
 Programmatic API:
 from ludwig.api import LudwigModel
 model = LudwigModel(model_definition)
 model.train(train_data)
 # or
 model = LudwigModel.load(path)
 predictions = model.predict(other_data)
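A sketch of an end-to-end programmatic run built on the calls above; loading the data with pandas and the file names are assumptions, and argument names may differ slightly across Ludwig versions:

import pandas as pd
from ludwig.api import LudwigModel

model_definition = {
    'input_features': [{'name': 'news', 'type': 'text'}],
    'output_features': [{'name': 'class', 'type': 'category'}]
}

# train on a DataFrame with the same columns as the Reuters CSV used earlier
train_data = pd.read_csv('reuters-allcats.csv')
model = LudwigModel(model_definition)
model.train(train_data)

# predict on new, unlabeled articles (file name is illustrative)
other_data = pd.read_csv('new_articles.csv')
predictions = model.predict(other_data)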
  • 59. !59 Additional Goodies Model serving: ludwig serve --model_path <model_path> Integration with other tools: ▪︎ Train models in a distributed fashion using Horovod ▪︎ Deploy models seamlessly using PyML ▪︎ Hyper-parameter optimization using BOCS / Quicksilver and AutoTune
  • 60. Hyperparameter optimization !60 {"task": "python", "task_params": { "file": "/home/work/wrapper.py", "function": "ludwig_wrapper", "kwargs": { "base_model_definition": { "input_features": [ { "type": "text", "encoder": "parallel_cnn" }, { "name": "flow_node", "type": "categorical", "representation": "dense" } ], "output_features": [ { "name": "contact_type", "type": "category" } ],}}} "strategy": "random", <- grid / random / bayesian "strategy_params": { "samples_limit": 50 }, "params": { "training__dropout": { "type": "float", "min": 0.0, "max": 0.5 }, "flow_node__embedding_size": { "type": "int", "min": 50, "max": 500 } } }
  • 61. Why it is great for experimentation Easy to plug in new encoders, decoders and combiners Change just one string in the YAML file to run a new experiment
  • 63. !63 Outputs A directory containing the trained model CSV file for each output feature containing predictions NPY file for each output feature containing probabilities JSON file containing training statistics JSON file containing test statistics Monitoring during training is available through TensorBoard
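To give a sense of how these outputs can be consumed, a small sketch that loads the test statistics JSON and the probabilities NPY; the file names and JSON layout are assumptions, as the actual names depend on the output feature and experiment:

import json
import numpy as np

# overall test statistics produced by the experiment (hypothetical path)
with open('results/test_statistics.json') as f:
    test_stats = json.load(f)
print(list(test_stats.keys()))   # per-output-feature statistics, e.g. 'class'

# per-datapoint probabilities for the 'class' output feature (hypothetical path)
probabilities = np.load('results/class_probabilities.npy')
print(probabilities.shape)       # one row of class probabilities per test datapoint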
  • 65. !65 Next steps More features: vectors, nested lists, point clouds, videos More pretrained input encoders Adding missing decoders Petastorm integration to work directly on Hive tables without intermediate CSV
  • 67. Extra
  • 68. Machine Learning democratization. Non-expert users: from idea to model training in minutes, no need for coding and deep knowledge, easy declarative model construction hides complexity. Expert users: efficient exploration of model space through quick iteration, tweak every aspect of preprocessing, modeling and training, extend with your models.