Deep learning can help automate the signal analysis process in power side-channel analysis. We show how typical signal-processing problems such as noise reduction and re-alignment are solved automatically by the deep learning network, and demonstrate attacks on a lightly protected AES, an AES implementation with masking countermeasures, and a protected ECC implementation. These experiments indicate that where side-channel analysis previously depended heavily on the skill of a human analyst, deep-learning automation is a first step toward lowering the attacker skill required for such attacks.
11. Points of interest selection
BLACKHAT 2018
[Figure: power trace annotated with data-leakage and noise regions]
Points of interest are selected with statistical tests (correlation, T-test, difference of means): samples showing a statistical dependency between intermediate (key-related) data and power consumption.
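As an illustration of this selection step, the sketch below computes the Pearson correlation between a Hamming-weight (HW) leakage model and every sample of a set of simulated traces. All numbers (trace count, noise level, the leaky sample position) are synthetic assumptions, not measurements from the talk:

```python
import numpy as np

# Hamming-weight lookup table for all byte values
HW = np.array([bin(v).count("1") for v in range(256)])

# Synthetic example: 2000 noisy traces with HW leakage injected at sample 42
rng = np.random.default_rng(0)
n_traces, n_samples, leaky_sample = 2000, 100, 42
data = rng.integers(0, 256, n_traces)              # intermediate (key-related) bytes
traces = rng.normal(0.0, 1.0, (n_traces, n_samples))
traces[:, leaky_sample] += HW[data]

# Pearson correlation between the HW model and every sample column
model = HW[data].astype(float)
m_c = model - model.mean()
t_c = traces - traces.mean(axis=0)
corr = (t_c * m_c[:, None]).sum(axis=0) / (
    np.sqrt((t_c ** 2).sum(axis=0)) * np.sqrt((m_c ** 2).sum())
)
poi = int(np.argmax(np.abs(corr)))   # the sample with the strongest dependency
print(poi)
```

The samples whose |correlation| rises clearly above the noise floor are the points of interest a classical attack would feed into further analysis.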
18. Deep Learning
[Diagram: supervised learning — train a machine to classify labeled data (e.g. cat/dog images), then test the trained machine on new data; it outputs class probabilities, Cat (%) / Dog (%).]
19. Deep Learning
[Diagram: training loop — train a machine to classify labeled data, then test it on new data and ask: is the classification accuracy good enough? If yes, we are done; if no, change parameters and train again.]
Machine = Deep Neural Network
20. 20
Input Layer
(the size is
equivalent to the
number of samples)
Dense Layers
(classifiers)
Output Layer
(the size is
equivalent to the
number of classes)
Conv. Layers
(feature extractor + encoding)
The convolutional layers are able to detect the features
independently of their positions
Convolutional Neural Networks (CNNs)
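A minimal sketch of why convolution gives this position independence: a filter slid over the whole trace followed by max pooling produces the same response wherever the feature sits. The kernel and trace here are toy values, not learned weights:

```python
import numpy as np

# A tiny 1-D "feature detector" kernel, as a convolutional filter might learn
kernel = np.array([1.0, -2.0, 1.0])

def conv_max(trace, k):
    """Valid cross-correlation followed by a max over positions
    (convolution + global max pooling)."""
    out = np.array([trace[i:i + len(k)] @ k for i in range(len(trace) - len(k) + 1)])
    return out.max()

base = np.zeros(50)
base[10:13] = kernel                 # plant the feature at sample 10
shifted = np.roll(base, 17)          # same feature, misaligned by 17 samples

# The pooled response is identical regardless of where the feature sits
print(conv_max(base, kernel), conv_max(shifted, kernel))
```

This is the property that later lets the network bypass trace misalignment without explicit re-alignment preprocessing.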
25. 25
Step 2: Make sure it’s capable of learning
BLACKHAT 2018
• Increase the number of training traces and observe the training and validation accuracy
• Overfitting too fast?
• Training accuracy: 100% | Validation accuracy: low
• Neural network is too big for the number of traces and samples
26. Step 3: Make it generalize
If the training accuracy/recall is increasing, the NN is learning from its training set. If the validation recall stays above the minimum threshold of 0.111, the model is generalizing (0.111 = 1/9, where 9 is the number of classes: the Hamming weight of a byte).
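The 0.111 threshold follows directly from the leakage model, as this quick check shows:

```python
import numpy as np

# Hamming weights of all 256 byte values
hw = np.array([bin(v).count("1") for v in range(256)])
classes = np.unique(hw)          # Hamming weights 0..8 → 9 classes
baseline = 1 / len(classes)      # random-guess recall floor
print(len(classes), round(baseline, 3))
```

Any validation recall consistently above this floor means the model has learned something beyond chance.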
27. Step 3: Make it generalize
Regularization techniques:
• L1, L2 (penalty applied to the weights)
• Dropout
• Data augmentation (extra traces)
• Early stopping
[Figure: three decision boundaries on (x1, x2) scatter data — linear separation: low training accuracy, low validation accuracy; good regularization: good training accuracy, good validation accuracy; overfitting: high training accuracy, low validation accuracy]
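For side-channel traces, the data-augmentation technique from the list above is typically a random time shift. A minimal sketch (the function name and parameters are illustrative, not from the talk):

```python
import numpy as np

def augment_shift(traces, max_shift, rng):
    """Return randomly time-shifted copies of `traces`.

    Shifting simulates acquisition jitter, giving the network extra
    'free' training traces and forcing it to become shift-invariant."""
    shifts = rng.integers(-max_shift, max_shift + 1, size=len(traces))
    return np.stack([np.roll(t, s) for t, s in zip(traces, shifts)]), shifts

rng = np.random.default_rng(1)
traces = rng.normal(size=(4, 20))
augmented, shifts = augment_shift(traces, max_shift=3, rng=rng)
print(augmented.shape)   # same shape as the input, different alignment
```

In practice the augmented copies are concatenated with the originals so the training set grows while the labels stay unchanged.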
28. Step 4: Key Recovery
In this analysis, we only need slightly-above-coin-flip accuracy!
[Figure: key byte rank vs. number of traces]
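The reason slightly-above-chance accuracy suffices is that per-trace classifier outputs are accumulated as log-likelihoods across many traces, so a small bias compounds. The simulation below sketches this with a toy invertible byte substitution standing in for the AES S-box and a synthetic classifier that is only 5% better than the 1/9 chance level; all numbers are assumptions for illustration:

```python
import numpy as np

HW = np.array([bin(v).count("1") for v in range(256)])
# Toy invertible byte substitution standing in for the AES S-box
SBOX = np.array([((7 * v) ^ 0x53) & 0xFF for v in range(256)])

rng = np.random.default_rng(2)
true_key, n_traces, bias = 0xA7, 3000, 0.05    # 5% above the 1/9 chance level
pts = rng.integers(0, 256, n_traces)

# Simulated classifier output: uniform over the 9 HW classes, plus a small
# extra mass on the true class — i.e. slightly-above-coin-flip accuracy
probs = np.full((n_traces, 9), (1 - bias) / 9)
probs[np.arange(n_traces), HW[SBOX[pts ^ true_key]]] += bias

# Rank all 256 key guesses by accumulated log-likelihood over the traces
scores = np.array([
    np.log(probs[np.arange(n_traces), HW[SBOX[pts ^ k]]]).sum()
    for k in range(256)
])
rank = int((scores > scores[true_key]).sum())
print(rank)    # rank 0 = the correct key byte comes out on top
```

This accumulated score is what the key-byte-rank curve plots: as traces are added, the rank of the correct candidate falls toward 0.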
31. 31
Bypassing Misalignment with CNNs
Neural Network: Input Layer > ConvLayer > 36 > 36 > 36 > Output Layer
Training/validation/test sets: 90000/5000/5000 traces of 500 samples
Leakage Model: HW of S-Box Out (Round 1) → 9 classes
Results for key byte 0:
Use Data Augmentation
as regularization
technique to improve
generalization
BLACKHAT 2018
Number of traces
KeyByteRank
32. Breaking protected ECC on Piñata
Supervised deep learning attack:
- Curve25519, Montgomery ladder, scalar blinding
- Messy signal
- Brute-force methods for ECC are needed if test accuracy < 100%
- Need to get (almost) all bits from one trace!
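To see why near-100% single-trace accuracy is the requirement, consider the brute-force cost of fixing residual bit errors. The figures below (255 scalar bits, 99.5% per-bit accuracy, up to 3 flips corrected) are hypothetical, chosen only to illustrate the arithmetic:

```python
import math

# Hypothetical numbers: 255 scalar bits extracted with 99.5% per-bit accuracy
n_bits, acc = 255, 0.995
expected_errors = n_bits * (1 - acc)          # a little over one wrong bit per trace

def bruteforce_candidates(n, max_flips):
    """Number of candidate scalars when correcting up to `max_flips` bit errors."""
    return sum(math.comb(n, k) for k in range(max_flips + 1))

print(round(expected_errors, 3), bruteforce_candidates(n_bits, 3))
```

A few expected errors keeps the candidate set in the millions, which is searchable; as per-bit accuracy drops further, the binomial terms explode and brute force stops being feasible.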
37. Breaking AES with First-Order Masking
Overfitting can be verified by checking where the NN is learning:
• Correct key byte candidate: the CNN learns from specific, leaky samples.
• Wrong key byte candidate: the CNN overfits because it cannot distinguish leaky samples from noise.
38. 38
Breaking AES with First-Order Masking
Neural Network: Input Layer > ConvLayer > 50 > 50 > 50 > Output Layer
Training/validation/test sets: 36000/2000/2000 traces
Leakage Model: HW of S-Box Out (Round 1) → 9 classes
1/9
BLACKHAT 2018
Results for key byte 0:
The processing of 8
traces is sufficient to
recover the key
42. 42
I want to learn more!
BLACKHAT 2018
Deeplearningbook.org introtodeeplearning.com bookstores nostarch
?
By Colin & Jasper
43. Key takeaways
• Deep learning turns side-channel analysis from an art into a science, and it scales.
• DL still requires fiddling with the network: the skill bar is lowered, but not yet at zero.
• Automation is needed to put a dent in embedded insecurity.
44. References
• I. Goodfellow, Y. Bengio, A. Courville, "Deep Learning", http://www.deeplearningbook.org/
• S. Haykin, "Neural Networks and Learning Machines"
• H. Maghrebi et al., "Breaking Cryptographic Implementations Using Deep Learning Techniques", https://eprint.iacr.org/2016/921.pdf
• R. Benadjila et al., "Study of Deep Learning Techniques for Side-Channel Analysis and Introduction to ASCAD Database", https://eprint.iacr.org/2018/053.pdf
• E. Cagli et al., "Convolutional Neural Networks with Data Augmentation Against Jitter-Based Countermeasures", https://eprint.iacr.org/2017/740
• C. Zhang et al., "Understanding Deep Learning Requires Rethinking Generalization", https://arxiv.org/abs/1611.03530
• N. Keskar et al., "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima", https://arxiv.org/pdf/1609.04836.pdf
• R. Shwartz-Ziv and N. Tishby, "Opening the Black Box of Deep Neural Networks via Information", https://arxiv.org/abs/1703.00810
45. Challenge your security
Riscure B.V.
Frontier Building, Delftechpark 49
2628 XJ Delft
The Netherlands
Phone: +31 15 251 40 90
www.riscure.com
Riscure North America
550 Kearny St., Suite 330
San Francisco, CA 94108 USA
Phone: +1 650 646 99 79
inforequest@riscure.com
Riscure China
Room 2030-31, No. 989, Changle Road, Shanghai 200031
China
Phone: +86 21 5117 5435
inforcn@riscure.com
jasper@riscure.com
@jzvw