The document provides an overview of various Google Cloud machine learning and artificial intelligence services including BigQuery, Cloud Vision API, Cloud Natural Language API, Cloud Speech API, Cloud Video Intelligence API, AutoML, and Cloud ML Engine. It also includes code examples demonstrating how to use these services to analyze images, text, audio and video by extracting metadata and insights. The speaker is introduced as a Developer Advocate at Google Cloud whose mission is to help developers be successful using Google Cloud tools and platforms.
Easy Path to Machine Learning (2019)
1. 2018 | Confidential and Proprietary
Easy path to Machine Learning
Wesley Chun (@wescpy)
Developer Advocate, Google Cloud
G Suite Dev Show
goo.gl/JpBQ40
About the speaker
● Developer Advocate, Google Cloud
● Mission: enable current and future developers to be successful using
Google Cloud and other Google developer tools, APIs, and platforms
● Videos: host of the G Suite Dev Show on YouTube
● Blogs: developers.googleblog.com & gsuite-developers.googleblog.com
● Twitter: @wescpy, @GoogleDevs, @GSuiteDevs
● Background
● Software engineer & architect for 20+ years
● One of the original Yahoo! Mail engineers
● Author of bestselling "Core Python" books (corepython.com)
● Teacher and technical instructor since 1983
● Fellow of the Python Software Foundation
3. Storing and Analyzing Data: BigQuery
Google BigQuery: a fast, highly
scalable, cost-effective, and
fully-managed data warehouse in the
cloud for analytics with built-in
machine learning; issue SQL queries
across multiple terabytes of data
cloud.google.com/bigquery
Machine Learning: Cloud Vision
Google Cloud Vision API
enables developers to extract
metadata & understand the
content of an image
cloud.google.com/vision
4. Machine Learning: Cloud Natural Language
Google Cloud Natural Language API
reveals the structure & meaning
of text; also performs content
classification and sentiment
analysis; multi-lingual
cloud.google.com/language
Machine Learning: Cloud Speech
Google Cloud Speech APIs enable
developers to convert
speech-to-text and vice versa
cloud.google.com/speech
cloud.google.com/text-to-speech
5. Machine Learning: Cloud Video Intelligence
Google Cloud Video Intelligence
API makes videos searchable and
discoverable by extracting
metadata. Other features: object
tracking, shot change detection,
and text detection
cloud.google.com/video-intelligence
BigQuery: querying Shakespeare words
TITLE = "The top 10 most common words in all of Shakespeare's works"
QUERY = '''
    SELECT LOWER(word) AS word, sum(word_count) AS count
    FROM [bigquery-public-data:samples.shakespeare]
    GROUP BY word ORDER BY count DESC LIMIT 10
'''
# BQ and PROJ_ID are defined elsewhere in the script (not shown on the slide)
rsp = BQ.query(body={'query': QUERY}, projectId=PROJ_ID).execute()
print('\n*** Results for %r:\n' % TITLE)
for col in rsp['schema']['fields']: # HEADERS
    print(col['name'].upper(), end='\t')
print()
for row in rsp['rows']: # DATA
    for col in row['f']:
        print(col['v'], end='\t')
    print()
6. Top 10 most common Shakespeare words
$ python bq_shake.py
*** Results for "The most common words in all of Shakespeare's works":
WORD COUNT
the 29801
and 27529
i 21029
to 20957
of 18514
a 15370
you 14010
my 12936
in 11722
that 11519
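The printing loop in the BigQuery snippet depends on the shape of the query response: column names sit under `schema.fields`, and each row is a dict whose `f` list holds `{'v': value}` cells. A minimal offline sketch of that parsing follows. `SAMPLE_RSP` is a hand-built stand-in (an assumption, trimmed to the first three rows of the output above) and `rows_to_dicts` is a hypothetical helper, not part of any Google library.

```python
# Hand-built stand-in for a BigQuery query response (assumption: trimmed
# to three rows; the REST API returns cell values as strings).
SAMPLE_RSP = {
    'schema': {'fields': [{'name': 'word'}, {'name': 'count'}]},
    'rows': [
        {'f': [{'v': 'the'}, {'v': '29801'}]},
        {'f': [{'v': 'and'}, {'v': '27529'}]},
        {'f': [{'v': 'i'}, {'v': '21029'}]},
    ],
}

def rows_to_dicts(rsp):
    """Hypothetical helper: turn a query response into a list of dicts."""
    names = [col['name'] for col in rsp['schema']['fields']]
    return [
        dict(zip(names, (cell['v'] for cell in row['f'])))
        for row in rsp.get('rows', [])  # 'rows' is absent for empty results
    ]

if __name__ == '__main__':
    for rec in rows_to_dicts(SAMPLE_RSP):
        print('%s\t%s' % (rec['word'], rec['count']))
```

Keying each row by column name avoids relying on column order when the query changes.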
labeling = VISION.images().annotate(body=body).execute().get('responses')
for labels in labeling:
    if 'labelAnnotations' in labels:
        print('** Labels detected (and confidence score):')
        for label in labels['labelAnnotations']:
            print(('%.2f%%' % (
                    label['score']*100.)).ljust(10), label['description'])
    if 'faceAnnotations' in labels:
        print('\n** Facial features detected (and likelihood):')
        for label, value in labels['faceAnnotations'][0].items():
            if label.endswith('Likelihood'):
                print(label.split('Likelihood')[0].ljust(16),
                        value.lower().replace('_', ' '))
Vision: image analysis & metadata extraction
7. $ python viz_demo.py
** Labels detected (and confidence score):
89.94% Sitting
86.09% Interior design
82.08% Furniture
81.52% Table
80.85% Room
79.04% White-collar worker
76.19% Office
68.18% Conversation
60.96% Window
60.07% Desk
** Facial features detected (and likelihood):
anger very unlikely
joy very likely
underExposed very unlikely
sorrow very unlikely
surprise very unlikely
headwear very unlikely
blurred very unlikely
Vision: image analysis & metadata extraction
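The Vision snippet passes a `body` that the slides do not show. One plausible way to build it is sketched below, assuming the image is sent inline as base64 and that label and face detection are the two requested features (matching the output above). `build_vision_body` and its `max_results` parameter are hypothetical names introduced for illustration.

```python
import base64

def build_vision_body(image_bytes, max_results=10):
    """Hypothetical helper: build an images.annotate request body."""
    return {
        'requests': [{
            # Inline image content must be base64-encoded text.
            'image': {
                'content': base64.b64encode(image_bytes).decode('utf-8'),
            },
            'features': [
                {'type': 'LABEL_DETECTION', 'maxResults': max_results},
                {'type': 'FACE_DETECTION', 'maxResults': max_results},
            ],
        }]
    }

if __name__ == '__main__':
    # Placeholder bytes stand in for a real image file read in 'rb' mode.
    body = build_vision_body(b'not-a-real-image')
    print(sorted(body['requests'][0].keys()))
```

With a real file, the bytes would come from something like `open(path, 'rb').read()` before being handed to the helper.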
Simple sentiment & classification analysis
text = '''Google, headquartered in Mountain View, unveiled the new Android
phone at the Consumer Electronics Show. Sundar Pichai said in
his keynote that users love their new Android phones.'''
data = {'type': 'PLAIN_TEXT', 'content': text}
# sentiment analysis followed by content classification
sentiment = NL.documents().analyzeSentiment(
        body={'document': data}).execute().get('documentSentiment')
print('TEXT:', text)
print('\nSENTIMENT: score (%s), magnitude (%s)' % (
        sentiment['score'], sentiment['magnitude']))
print('\nCATEGORIES:')
categories = NL.documents().classifyText(
        body={'document': data}).execute().get('categories')
for cat in categories:
    print('* %s (%s)' % (cat['name'][1:], cat['confidence']))
8. Simple sentiment & classification analysis
$ python nl_sent_class.py
TEXT: Google, headquartered in Mountain View, unveiled the new Android
phone at the Consumer Electronics Show. Sundar Pichai said in
his keynote that users love their new Android phones.
SENTIMENT: score (0.3), magnitude (0.6)
CATEGORIES:
* Internet & Telecom (0.76)
* Computers & Electronics (0.64)
* News (0.56)
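The sentiment result has two numbers: `score` runs from -1.0 (negative) to 1.0 (positive), while `magnitude` is a non-negative measure of overall emotional strength. A rough way to turn the pair into a label is sketched below. `describe_sentiment` is a hypothetical helper, and the cut-off values are illustrative assumptions rather than constants defined by the API.

```python
def describe_sentiment(score, magnitude, score_cut=0.25, mag_cut=1.0):
    """Hypothetical helper: map (score, magnitude) to a rough label.

    The cut-off values are illustrative assumptions, not API constants.
    """
    if score >= score_cut:
        return 'positive'
    if score <= -score_cut:
        return 'negative'
    # Near-zero score: a strong magnitude suggests mixed feelings,
    # a weak magnitude suggests neutral text.
    return 'mixed' if magnitude >= mag_cut else 'neutral'

if __name__ == '__main__':
    # Values taken from the sample output above.
    print(describe_sentiment(0.3, 0.6))
```

Under these assumed thresholds, the sample result above (score 0.3, magnitude 0.6) reads as mildly positive.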
Text-to-Speech: synthesizing audio text
import base64
from googleapiclient import discovery

# request body (with text body using 16-bit linear PCM audio encoding)
body = {
    'input': {'text': text},
    'voice': {
        'languageCode': 'en-US',
        'ssmlGender': 'FEMALE',
    },
    'audioConfig': {'audioEncoding': 'LINEAR16'},
}
# call Text-to-Speech API to synthesize text (write to text.wav file)
T2S = discovery.build('texttospeech', 'v1', developerKey=API_KEY)
audio = T2S.text().synthesize(body=body).execute().get('audioContent')
with open('text.wav', 'wb') as f:
    f.write(base64.b64decode(audio))
9. Speech-to-Text: transcribing audio text
# request body (16-bit linear PCM audio content, i.e., from text.wav)
body = {
    'audio': {'content': audio},
    'config': {
        'languageCode': 'en-US',
        'encoding': 'LINEAR16',
    },
}
# call Speech-to-Text API to recognize text
S2T = discovery.build('speech', 'v1', developerKey=API_KEY)
rsp = S2T.speech().recognize(
        body=body).execute().get('results')[0]['alternatives'][0]
print('** %.2f%% confident of this transcript:\n%r' % (
        rsp['confidence']*100., rsp['transcript']))
Speech-to-Text: transcribing audio text
$ python s2t_demo.py
** 92.03% confident of this transcript:
'Google headquarters in Mountain View unveiled the new
Android phone at the Consumer Electronics Show Sundar
pichai said in his keynote that users love their new
Android phones'
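The Speech-to-Text snippet reuses the base64 `audio` string returned by Text-to-Speech. When starting from a WAV file on disk instead, the bytes need to be base64-encoded first. The sketch below shows one way to do that, using a generated silent clip so it runs without any audio file. `make_silent_wav` and `build_recognize_body` are hypothetical helpers, and passing `sampleRateHertz` explicitly is a choice made here, not something the slides do.

```python
import base64
import io
import wave

def make_silent_wav(seconds=1, rate=16000):
    """Return the bytes of a mono 16-bit PCM WAV file containing silence."""
    buf = io.BytesIO()
    with wave.open(buf, 'wb') as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(rate)
        wav.writeframes(b'\x00\x00' * rate * seconds)
    return buf.getvalue()

def build_recognize_body(wav_bytes, language='en-US'):
    """Hypothetical helper: build a speech.recognize request body."""
    # Read the sample rate from the WAV header rather than hard-coding it.
    with wave.open(io.BytesIO(wav_bytes), 'rb') as wav:
        rate = wav.getframerate()
    return {
        'audio': {'content': base64.b64encode(wav_bytes).decode('utf-8')},
        'config': {
            'languageCode': language,
            'encoding': 'LINEAR16',
            'sampleRateHertz': rate,
        },
    }

if __name__ == '__main__':
    body = build_recognize_body(make_silent_wav())
    print(body['config'])
```

For a real recording, `wav_bytes` would come from reading the file in `'rb'` mode.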
10. Video intelligence: make videos searchable
import random
import time

# request body (single payload, base64 binary video)
body = {
    "inputContent": video,
    "features": ['LABEL_DETECTION', 'SPEECH_TRANSCRIPTION'],
    "videoContext": {"speechTranscriptionConfig": {"languageCode": 'en-US'}},
}
# perform video shot analysis followed by speech analysis
VINTEL = discovery.build('videointelligence', 'v1', developerKey=API_KEY)
resource = VINTEL.videos().annotate(body=body).execute().get('name')
while True:
    results = VINTEL.operations().get(name=resource).execute()
    if results.get('done'):
        break
    time.sleep(random.randrange(8)) # expo-backoff probably better
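The polling loop's own comment suggests exponential backoff as an improvement over a random sleep. A minimal sketch of that idea follows: the wait doubles after each unfinished check, is capped, and gets a little random jitter. `poll_until_done` is a hypothetical helper, and every default (base delay, cap, number of tries) is an illustrative assumption.

```python
import random
import time

def poll_until_done(fetch, base=1.0, cap=32.0, max_tries=10,
                    sleep=time.sleep):
    """Call fetch() until it returns a dict with a truthy 'done' key.

    Waits min(cap, base * 2**attempt) seconds plus up to one second of
    random jitter between attempts. All defaults are illustrative.
    """
    for attempt in range(max_tries):
        result = fetch()
        if result.get('done'):
            return result
        delay = min(cap, base * 2 ** attempt) + random.random()
        sleep(delay)
    raise TimeoutError('operation not done after %d tries' % max_tries)

if __name__ == '__main__':
    # Fake operation that reports done on the third check.
    state = {'checks': 0}
    def fake_fetch():
        state['checks'] += 1
        return {'done': state['checks'] >= 3}
    print(poll_until_done(fake_fetch, sleep=lambda s: None))
```

In the script above, the `fetch` argument could be `lambda: VINTEL.operations().get(name=resource).execute()`. Injecting `sleep` keeps the helper easy to test without real waiting.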
Video intelligence: make videos searchable
# loop through all annotation results
for labels in results['response']['annotationResults']:
    # display shot labels (and confidence score)
    if 'shotLabelAnnotations' in labels:
        for shot in labels['shotLabelAnnotations']:
            seg = shot['segments'][0]
            print(' - %s (%.2f%%)' % (
                    shot['entity']['description'],
                    seg['confidence']*100.,
            ))
    # display speech labels (and confidence)
    if 'speechTranscriptions' in labels:
        speech = labels['speechTranscriptions'][0]['alternatives'][0]
        print(' - %.2f%% confidence transcript is: %r' % (
                speech['confidence']*100., speech['transcript']))
11. Video intelligence: make videos searchable
$ python3 vid_demo.py you-need-a-hug.mp4
** Video shot analysis labeling
- vacation (30.62%)
- fun (61.53%)
- interaction (38.93%)
- summer (57.10%)
** Speech analysis labeling
- 'you need a hug come here' (79.27%)
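To make the "searchable" claim concrete, the extracted labels can feed a small inverted index that maps each label to the videos it appears in. The sketch below assumes the response shape used in the loop above. `SAMPLE_RESULTS` is hand-built from the four labels in the sample output, and `index_labels` and its `min_conf` threshold are hypothetical.

```python
from collections import defaultdict

# Hand-built stand-in for an annotation response (assumption: only the
# fields read by the loop above, with the scores from the sample output).
SAMPLE_RESULTS = {'response': {'annotationResults': [{
    'shotLabelAnnotations': [
        {'entity': {'description': 'vacation'},
         'segments': [{'confidence': 0.3062}]},
        {'entity': {'description': 'fun'},
         'segments': [{'confidence': 0.6153}]},
        {'entity': {'description': 'interaction'},
         'segments': [{'confidence': 0.3893}]},
        {'entity': {'description': 'summer'},
         'segments': [{'confidence': 0.5710}]},
    ],
}]}}

def index_labels(video_name, results, index=None, min_conf=0.5):
    """Hypothetical helper: add a video's shot labels to an inverted index."""
    index = defaultdict(set) if index is None else index
    for ann in results['response']['annotationResults']:
        for shot in ann.get('shotLabelAnnotations', []):
            conf = shot['segments'][0]['confidence']
            if conf >= min_conf:  # skip low-confidence labels
                index[shot['entity']['description'].lower()].add(video_name)
    return index

if __name__ == '__main__':
    idx = index_labels('you-need-a-hug.mp4', SAMPLE_RESULTS)
    print(sorted(idx))
```

Looking up a label in the index then returns the set of matching video names; passing the same `index` back in accumulates labels across several videos.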
Machine Learning: AutoML
AutoML: a suite of cloud APIs for
developers with limited machine
learning expertise; chooses the best
models & allows for further training
of those models for your data
(Translation, Vision, Natural Language,
Video Intelligence, Tables)
cloud.google.com/automl
cloud.google.com/automl-tables
12. Machine Learning: Cloud ML Engine
Google Cloud Machine Learning Engine
is a managed service that lets you
build, train, and deploy machine
learning models (scikit-learn,
XGBoost, Keras, TensorFlow), then make
predictions with trained models
cloud.google.com/ml-engine
Thank you!
Wesley Chun
@wescpy