3. How Can You Get Started with Machine Learning?
Three ways, with varying complexity:
(1) Use a Cloud-based or Mobile API (Vision, Natural Language,
etc.)
(2) Use an existing model architecture, and retrain it or fine tune
on your dataset
(3) Develop your own machine learning models for new
problems
More
flexible,
but more
effort
required
7. Faces
Faces, facial landmarks, emotions
OCR
Read and extract text, with
support for > 10 languages
Label
Detect entities from furniture to
transportation
Logos
Identify product logos
Landmarks & Image Properties
Detect landmarks & dominant
color of image
Safe Search
Detect explicit content - adult,
violent, medical and spoof
Cloud Vision API
8. API Usage: Detect Objects in an Image
Image Detected
Items
Vision API
Create JSON
request with the
image or pointer
to an image
Process
the JSON
response
Call the
REST API1 2 3
9. Confidential & ProprietaryGoogle Cloud Platform 9
Cloud Natural Language API
Extract sentence, identify parts of
speech and create dependency parse
trees for each sentence.
Identify entities and label by types such
as person, organization, location, events,
products and media.
Understand the overall sentiment of a
block of text.
Syntax Analysis Entity Recognition
Sentiment Analysis
11. Confidential & ProprietaryGoogle Cloud Platform 11
Cloud Speech API
Automatic Speech Recognition (ASR)
powered by deep learning neural
networking to power your
applications like voice search or
speech transcription.
Recognizes over 80
languages and variants
with an extensive
vocabulary.
Returns partial
recognition results
immediately, as they
become available.
Filter inappropriate
content in text results.
Audio input can be captured by an application’s
microphone or sent from a pre-recorded audio
file. Multiple audio file formats are supported,
including FLAC, AMR, PCMU and linear-16.
Handles noisy audio from many
environments without requiring
additional noise cancellation.
Audio files can be uploaded in the
request and, in future releases,
integrated with Google Cloud
Storage.
Automatic Speech Recognition Global Vocabulary Inappropriate Content
Filtering
Streaming Recognition
Real-time or Buffered Audio Support Noisy Audio Handling Integrated API
15. Face API
faces, facial landmarks, eyes
open, smiling
Barcode API
1D and 2D barcodes
Text API
Latin-based text / structure
Common Mobile Vision API
Support for fast image and video on-device detection and tracking.
NEW!
16. Googly Eyes Android App
Video credit Google
1. Create a face detector for facial landmarks (e.g., eyes)
3. For each face, draw the eyes
FaceDetector detector = new FaceDetector.Builder()
.setLandmarkType(FaceDetector.ALL_LANDMARKS)
.build();
SparseArray<Face> faces = detector.detect(image);
for (int i = 0; i < faces.size(); ++i) {
Face face = faces.valueAt(i);
for (Landmark landmark : face.getLandmarks()) {
// Draw eyes
2. Detect faces in the image
37. Data Flow Graphs
Computation is defined as a directed acyclic graph
(DAG) to optimize an objective function
● Graph is defined in high-level language (Python)
● Graph is compiled and optimized
● Graph is executed (in parts or fully) on available low
level devices (CPU, GPU)
● Data (tensors) flow through the graph
● TensorFlow can compute gradients automatically
38. Architecture
● Core in C++
● Different front ends
○ Python and C++ today, community may add more
Core TensorFlow Execution System
CPU GPU Android iOS ...
C++ front end Python front end ...
54. tensorflow.org
github.com/tensorflow
Want to learn more?
Udacity class on Deep Learning, goo.gl/iHssII
Guides, codelabs, videos
MNIST for Beginners, goo.gl/tx8R2b
TF Learn Quickstart, goo.gl/uiefRn
TensorFlow for Poets, goo.gl/bVjFIL
ML Recipes, goo.gl/KewA03
TensorFlow and Deep Learning without a PhD, goo.gl/pHeXe7
What's Next