Deep Learning is everywhere these days: image and sound recognition, text generation, and more. Following the recent announcements of the Android Neural Networks API and TensorFlow Lite, and the release of Apple's Core ML framework, everything is pushing us towards on-device intelligence.
Although the techniques and frameworks are becoming widely accessible, concrete applications in companies remain hard to find, and even more so on mobile apps. We therefore decided to build a Proof of Concept to tackle the challenges of the field.
Through an educational mobile application that uses Deep Learning for object recognition, we will cover the impact of this kind of model on smartphones, the architecture for training and deploying models on a Cloud service, and how to build the mobile application with the latest announced tools.
XebiCon'17 : Faites chauffer les neurones de votre Smartphone avec du Deep Learning on-device - Qian Jin, Yoann Benoit et Sylvain Lequeux
1. Heat the Neurons of Your
Smartphone with Deep Learning
Qian Jin | @bonbonking | qjin@xebia.fr
Yoann Benoit | @YoannBENOIT | ybenoit@xebia.fr
Sylvain Lequeux | @slequeux | slequeux@xebia.fr
35. Deep Convolutional Neural Network
& Inception Architecture
Credit: http://nicolovaligi.com/history-inception-deep-learning-architecture.html
36. Deep Convolutional Neural Network
Image Credit: https://github.com/tensorflow/models/tree/master/research/inception
Visualisation of the Inception v3 model architecture: edges → shapes → high-level features → classifiers
40. Transfer Learning
• Use a pre-trained Deep Neural Network
• Keep all operations but the last one
• Re-train only the last operation to specialize your network to your classes
(All weights are kept identical, except those of the final layer.)
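The idea can be sketched numerically: pretend the frozen pre-trained network already maps each image to a feature vector, and retrain only the final linear layer on those features. This is an illustrative NumPy sketch of the principle, not the TensorFlow retraining script; the data and every name in it are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the "bottleneck" features the frozen network would
# produce for 200 images, and their (linearly separable) labels.
features = rng.normal(size=(200, 4))
true_w = np.array([1.0, -2.0, 0.5, 3.0])
labels = (features @ true_w > 0).astype(float)  # two classes: 0 / 1

# Only the last layer's weights are trainable; everything upstream is frozen.
w = np.zeros(4)
lr = 0.5
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(features @ w)))          # sigmoid output
    w -= lr * features.T @ (p - labels) / len(labels)  # gradient step

accuracy = ((features @ w > 0).astype(float) == labels).mean()
```

Because only a small linear layer is learned, a few hundred images per class are enough, which is exactly why transfer learning works on modest datasets.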
42. Retrain a Model
Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/
python -m tensorflow.examples.image_retraining.retrain \
  --bottleneck_dir=tf_files/bottlenecks \
  --how_many_training_steps=500 \
  --model_dir=tf_files/models/ \
  --summaries_dir=tf_files/training_summaries/ \
  --output_graph=tf_files/retrained_graph.pb \
  --output_labels=tf_files/retrained_labels.txt \
  --image_dir=tf_files/fruit_photos
43. Obtain the Retrained Model
• 2 outputs:
• Model as a protobuf file: contains a version of the selected network with a final layer retrained on your categories
• Labels as a plain text file
model.pb label.txt
44.
public class ClassifierActivity extends CameraActivity implements
OnImageAvailableListener {
private static final int INPUT_SIZE = 224;
private static final int IMAGE_MEAN = 117;
private static final float IMAGE_STD = 1;
private static final String INPUT_NAME = "input";
private static final String OUTPUT_NAME = "output";
private static final String MODEL_FILE = "file:///android_asset/tensorflow_inception_graph.pb";
private static final String LABEL_FILE = "file:///android_asset/imagenet_comp_graph_label_strings.txt";
}
Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/android/src/org/tensorflow/demo/ClassifierActivity.java
51. Pre-Google I/O 2017
• Use nightly build
• Library .so
• Java API jar
android {
//…
sourceSets {
main {
jniLibs.srcDirs = ['libs']
}
}
}
52. Post-Google I/O 2017
Source: Android Meets TensorFlow: How to Accelerate Your App with AI (Google I/O '17) https://www.youtube.com/watch?v=25ISTLhz0ys
Currently: TensorFlow 1.4.0
68. Converts YUV420 (NV21) to ARGB8888
public static native void
convertYUV420ToARGB8888(
byte[] y,
byte[] u,
byte[] v,
int[] output,
int width,
int height,
int yRowStride,
int uvRowStride,
int uvPixelStride,
boolean halfSize
);
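To make the native signature above concrete, here is what the conversion does for a single pixel, sketched in Python with the common ITU-R BT.601 integer formula. This is an assumption about the arithmetic, not the actual JNI implementation, which may differ in rounding and clamping details.

```python
def yuv_to_argb(y, u, v):
    """Convert one YUV pixel (as in NV21 planes) to a packed ARGB8888 int,
    using the common BT.601 integer approximation."""
    c, d, e = y - 16, u - 128, v - 128

    def clamp(x):
        return max(0, min(255, x))

    r = clamp((298 * c + 409 * e + 128) >> 8)
    g = clamp((298 * c - 100 * d - 208 * e + 128) >> 8)
    b = clamp((298 * c + 516 * d + 128) >> 8)
    # Alpha is always opaque in ARGB8888 output.
    return (0xFF << 24) | (r << 16) | (g << 8) | b
```

The native version does the same per-pixel arithmetic over the whole frame, using the row and pixel strides of the U/V planes to locate the chroma samples, which is why the signature takes `yRowStride`, `uvRowStride` and `uvPixelStride`.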
69.
/**
* Initializes a native TensorFlow session for classifying images.
*
* @param assetManager The asset manager to be used to load assets.
* @param modelFilename The filepath of the model GraphDef protocol buffer.
* @param labels The list of labels
 * @param inputSize The input size. A square image of inputSize x inputSize is assumed.
* @param imageMean The assumed mean of the image values.
* @param imageStd The assumed std of the image values.
* @param inputName The label of the image input node.
* @param outputName The label of the output node.
* @throws IOException
*/
public static Classifier create(
AssetManager assetManager,
String modelFilename,
List<String> labels,
int inputSize,
int imageMean,
float imageStd,
String inputName,
String outputName) {
}
70.
@Override
public List<Recognition> recognizeImage(final Bitmap bitmap) {
// Preprocess bitmap
bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(),
bitmap.getHeight());
for (int i = 0; i < intValues.length; ++i) {
final int val = intValues[i];
floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd;
floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd;
floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd;
}
// Copy the input data into TensorFlow.
inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3);
// Run the inference call.
inferenceInterface.run(outputNames, logStats);
// Copy the output Tensor back into the output array.
inferenceInterface.fetch(outputName, outputs);
(continued...)
Preprocess Bitmap / Create Tensor
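The bit shifts in the loop above unpack each packed ARGB int into its red, green and blue channels, then normalize each channel with the model's assumed mean and std. The same arithmetic in Python (the mean/std values match the Inception defaults shown on the earlier slide):

```python
IMAGE_MEAN, IMAGE_STD = 117, 1.0  # Inception defaults from the earlier slide

def pixel_to_floats(val):
    """Unpack one packed 0xAARRGGBB pixel into normalized [r, g, b] floats."""
    return [
        (((val >> 16) & 0xFF) - IMAGE_MEAN) / IMAGE_STD,  # red byte
        (((val >> 8) & 0xFF) - IMAGE_MEAN) / IMAGE_STD,   # green byte
        ((val & 0xFF) - IMAGE_MEAN) / IMAGE_STD,          # blue byte
    ]
```

Doing this for every pixel yields the flat `floatValues` array that is fed to the graph as a `1 x inputSize x inputSize x 3` tensor.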
74.
(continued...)
// Find the best classifications.
PriorityQueue<Recognition> pq =
new PriorityQueue<>(
3,
(lhs, rhs) -> {
// Intentionally reversed to put high confidence at the head of the queue.
return Float.compare(rhs.getConfidence(), lhs.getConfidence());
});
for (int i = 0; i < outputs.length; ++i) {
if (outputs[i] > THRESHOLD) {
pq.add(
new Recognition(
"" + i, labels.size() > i ? labels.get(i) : "unknown", outputs[i], null));
}
}
//...
return recognitions;
}
Find the Best Classification
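The PriorityQueue logic above amounts to "keep the classes whose confidence exceeds a threshold, best first". A minimal Python equivalent, with an illustrative threshold and made-up labels:

```python
import heapq

THRESHOLD = 0.1  # illustrative; the demo uses its own constant

def best_classifications(outputs, labels, k=3):
    """Return up to k (label, confidence) pairs above THRESHOLD, best first."""
    candidates = [
        (labels[i] if i < len(labels) else "unknown", score)
        for i, score in enumerate(outputs)
        if score > THRESHOLD
    ]
    # nlargest sorts by descending confidence, like the reversed comparator.
    return heapq.nlargest(k, candidates, key=lambda pair: pair[1])
```

The Java comparator is "intentionally reversed" for the same reason `nlargest` is used here: the queue must surface the highest confidence first.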
78. Model Fusion
• Start from previous model to keep all specific operations in the graph
• Specify all operations to keep when optimizing for inference
graph_util.convert_variables_to_constants(sess, graph.as_graph_def(),
    ["final_result_fruits", "final_result_vegetables"])
85. Currently with Project Magritte…
Training
• Model debugging done by overheating a laptop
• Model built on personal GPU
• Files uploaded manually
Model serving
• API available
• Deployed on AWS, currently migrating to Google Cloud
89. public TensorFlowInferenceInterface(AssetManager assetManager, String model) {
prepareNativeRuntime();
this.modelName = model;
this.g = new Graph();
this.sess = new Session(g);
this.runner = sess.runner();
final boolean hasAssetPrefix = model.startsWith(ASSET_FILE_PREFIX);
InputStream is = null;
try {
String aname = hasAssetPrefix ? model.split(ASSET_FILE_PREFIX)[1] : model;
is = assetManager.open(aname);
} catch (IOException e) {
if (hasAssetPrefix) {
throw new RuntimeException("Failed to load model from '" + model + "'", e);
}
// Perhaps the model file is not an asset but is on disk.
try {
is = new FileInputStream(model);
} catch (IOException e2) {
throw new RuntimeException("Failed to load model from '" + model + "'", e2);
}
}
}
Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/android/java/org/tensorflow/contrib/android/TensorFlowInferenceInterface.java
103. On-device Inferencing
• Latency: you don't need to send a request over a network connection and wait for a response. This can be critical for video applications that process successive frames coming from a camera.
• Availability: the application runs even outside of network coverage.
• Speed: new hardware dedicated to neural network processing provides significantly faster computation than a general-purpose CPU alone.
• Privacy: the data does not leave the device.
• Cost: no server farm is needed when all the computations are performed on the device.
Source: https://developer.android.com/ndk/guides/neuralnetworks/index.html
104. Source: https://www.tensorflow.org/mobile/tflite/
TensorFlow Lite
• New model file format: based on FlatBuffers; no parsing/unpacking step and a much smaller footprint
• Mobile-optimized interpreter: uses a static graph ordering and a custom (less dynamic) memory allocator
• Hardware acceleration
105. Federated Learning
Collaborative Machine Learning without Centralized Training Data
Source: https://research.googleblog.com/2017/04/federated-learning-collaborative.html
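In Federated Learning, each device trains on its local data and only the resulting model updates travel to the server, where they are averaged. A toy NumPy sketch of that server-side averaging step; the function name is made up, and the real protocol adds secure aggregation, update clipping, and much more.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Average client model weights, weighted by each client's dataset size
    (the core of the Federated Averaging idea)."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Two toy clients with different local models and dataset sizes.
clients = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
sizes = [3, 1]
global_w = federated_average(clients, sizes)
```

Only `global_w` ever leaves the server; the raw training examples stay on each device, which is the privacy argument of the approach.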