Some machine learning algorithms work better for certain types of problems than others. However, nearly all of them start out with a blank slate (or with random values for their intermediate parameters), and then analyze the dataset to find which attributes of each example provide the most information toward making the correct classification on the training data. Learning happens iteratively, converging over time on an ever more accurate answer.
Chatbots need machine learning in order to parse the conversation into its grammatical components, and to understand the context of the conversation so they can come up with an appropriate response.
Pepper the robot is used for a variety of emotion-based tasks such as teaching, helping kids with autism stay focused, and training people who are not well-adjusted socially to make standard responses to events.
I hate to say this, because my mom works in Accounts Payable, but eventually her job will be automated by a system that can automatically scan an invoice, look for the amount due, and facilitate the payment over a payment network electronically. Document recognition is important for banks too, as statements which prove income for purposes of granting a loan come in many different forms.
HPE Haven – text analysis, speech recognition, image recognition, face detection, recommendation engine, knowledge graph analysis
Amazon – industry-standard ML algorithms on top of your own data sets with listeners to re-evaluate upon new data (kind of like the now-deprecated Google Cloud Prediction API)
Amazon also just rolled out SageMaker, which is a cloud-based service to help users visualize data and select the best ML algorithm, then deploy it at scale for the inference stage
Amazon Rekognition – image & video analysis that ties in with the DeepLens camera
Azure APIs – language analytics, face & emotion & explicit content detection, speech recognition, and recommendation APIs
IBM Watson
Facebook – Ported Caffe for several platforms to run light & efficiently
Things above the line are pre-defined models that Google has already tuned for you to give what they feel are the best results. Things below the line are where you have to provide the data and possibly the tuning yourself in order to get the best outcome.
TensorFlow Serving falls under Local because you have to set up the parameters of the infrastructure yourself, whereas Cloud ML Engine will size the model inference environment for you automatically. However, TF Serving can also be run on any hardware you have, not just inside GCP.
The native implementations are brought to you by the Mobile Vision API. The RESTful API, Cloud Vision, does not feature barcode reading capabilities, since that’s something a device should be able to handle on its own natively.
Safe search includes Adult, Medical, Spoof, and Violence.
There is no Barcode feature because there’s no reason to send an image over the network just to find out whether it’s a barcode, when chances are it’s not.
Before we get the response, there’s one more detail we need to form our request: the authentication.
Google provides several ways to allow use of the Machine Learning APIs, and the exact ways you implement them depend on what level of user data you seek to access in your application.
The Cloud Vision API does not require you to pass along OAuth prompts to your users. You can simply make a request by finding your browser API key (usually starting with the letters AIza) and then appending that to your query string. However, this precludes you from using Google’s handy dandy API libraries for your programming language of choice. (Then again, if you’re using ALGOL 68, SmallTalk, or 6502 assembly, maybe it’s your only option.)
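For illustration, here is a minimal Python sketch of what that API-key-in-the-query-string request might look like; the key value, file name, and feature choice are placeholders and not code from the workshop repo.

import base64
import requests

API_KEY = "AIza...your-browser-key-here"  # placeholder API key
ENDPOINT = "https://vision.googleapis.com/v1/images:annotate?key=" + API_KEY

# Read a local image and base64-encode it for the request body
with open("face.jpg", "rb") as f:  # placeholder image file
    image_content = base64.b64encode(f.read()).decode("utf-8")

body = {
    "requests": [{
        "image": {"content": image_content},
        "features": [{"type": "FACE_DETECTION", "maxResults": 5}]
    }]
}

response = requests.post(ENDPOINT, json=body)
print(response.json())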
To use Google’s libraries, you need to at least set up a Service Account. This type of authentication scheme is usually used for server-to-server communication and requires minimal end-user intervention. Also, the typical controls you might find on accounts in a Google Apps domain do not apply to service accounts, so if you are not careful with the permissions you grant them, you could inadvertently give your users the chance to do something dangerous, like sharing documents and data outside of your domain.
Nevertheless, to save ourselves some time and trouble, we are going to use the basic Service Account approach for the sake of this demo. The instructions on how to create an appropriate service account are on this workshop’s GitHub repo.
importanceFraction is the “fraction of importance of this salient region with respect to the original image.” Not really sure how they come to that conclusion.
This demo consists of running the Googly Eyes Node.js application provided in my GitHub repo. With some files in my Google Cloud Storage bucket, I could refer to that storage bucket name and the file name to load up files with human faces in them, and then this program would use the parameters returned by the Face Detection API and do some mathematics to superimpose googly eyes on top of the image using HTML5 Canvas.
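The actual demo is Node.js plus HTML5 Canvas, but as a rough Python sketch of the idea, this is how the eye positions might be pulled out of a Face Detection response before drawing; the field names follow the faceAnnotations structure, while the helper itself is hypothetical.

def eye_centers(annotate_response):
    """Yield (left_eye, right_eye) pixel coordinates for each detected face."""
    for face in annotate_response["responses"][0].get("faceAnnotations", []):
        landmarks = {lm["type"]: lm["position"] for lm in face["landmarks"]}
        left = landmarks.get("LEFT_EYE")
        right = landmarks.get("RIGHT_EYE")
        if left and right:
            yield (left["x"], left["y"]), (right["x"], right["y"])

# A sensible googly-eye radius can then be derived from the distance between
# the two eye centers before compositing the circles onto the image.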
One would presume the native libraries would be taking advantage of lower-level calls to get to the bare metal of the phone and make the analysis faster.
In the past, Sentiment Analysis was only available for English text. The changelog (https://cloud.google.com/natural-language/release-notes) entry from 11/15/16 indicates that Japanese & Spanish support is available for sentiment analysis, but the How-To guide for “Analyzing Sentiment” indicates only English is supported. What’s interesting is that when an English string is scored, then sent through Translate and scored again, you can see totally different sentiment scores.
There are client libraries for C#, Go, Java, and other languages, plus RPC calls, available for interaction with the NL API. For RESTful calls, this is the syntax to use if you wish to just run one particular NL API query at a time.
analyzeEntities gives you phrases in the text that are known entities, such as persons, locations, or organizations, and reports the salience of each entity to the article as well as how many times it was mentioned.
analyzeSentiment gives you sentiment scores for the entire text body, and analyzeEntitySentiment will give you sentiment scores for text associated with each entity and its mentions. More on this in the next slide.
analyzeSyntax gives you all the parts of speech and grammatical considerations for each word.
classifyText will give you an array of potential categories for the text as well as the confidence of that category classification.
In this construct, with the “annotateText” endpoint invoked, you specify in the “features” object which of the NLP analyses you want to run (see the sketch below).
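As a hedged sketch (not code from the workshop repo), an annotateText request over REST might look roughly like this in Python; the API key is a placeholder and the feature flags shown are the ones I believe the v1 API accepts.

import requests

API_KEY = "AIza...your-key-here"  # placeholder
ENDPOINT = "https://language.googleapis.com/v1/documents:annotateText?key=" + API_KEY

body = {
    "document": {
        "type": "PLAIN_TEXT",
        "content": "Google's Natural Language API is surprisingly easy to call."
    },
    "features": {
        "extractEntities": True,
        "extractDocumentSentiment": True,
        "extractSyntax": True
    },
    "encodingType": "UTF8"
}

print(requests.post(ENDPOINT, json=body).json())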
The score (formerly known as polarity) is a [-1, 1] scale of the emotional sentiment (tone) of the article, from negative/scathing/blistering to positive/gushing/cheerful.
The magnitude of the article is a scale of how emotional the document comes across. Since each expression in the text contributes to the magnitude, longer articles will have a higher magnitude, regardless of whether the tone of each individual emotional word in the text is positive or negative in sentiment.
The small snippet of JSON here is now contained within an object called “documentSentiment”. A sibling of this object in the returned JSON structure is called “sentences”, which is an array consisting of objects containing the text and offset of each sentence, plus a “sentiment” object containing the individual magnitude and score for that piece.
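To make that structure concrete, here is a small hypothetical helper (my own illustration) that walks a response of that shape; the field names follow the documentSentiment/sentences structure described above.

def summarize_sentiment(nl_response):
    doc = nl_response["documentSentiment"]
    print("Overall score: %.2f, magnitude: %.2f" % (doc["score"], doc["magnitude"]))
    for sentence in nl_response.get("sentences", []):
        text = sentence["text"]["content"]
        offset = sentence["text"]["beginOffset"]
        s = sentence["sentiment"]
        print("  [%d] %r -> score %.2f, magnitude %.2f"
              % (offset, text, s["score"], s["magnitude"]))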
In this demo, I simply navigated to the Web sandbox at https://cloud.google.com/natural-language/ to run various strings of text through the tool.
Types of models:
- Regression: estimate a numeric answer based on the examples along a continuous curve, akin to a formula that you would solve for
- Categorization: group possible answers into buckets that don’t have a meaningful notion of distance between them (see the training-data sketch after this list)
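To make the distinction concrete, here is an illustrative pair of training rows in the format the Prediction API expects as I recall it (the feature values are made up): the leftmost column is the target, and a numeric target is treated as a regression while a text target is treated as a categorization.

42.7,104042,0,77,m,s     <- numeric first column: regression target
yes,104042,0,77,m,s      <- text first column: categorization target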
Here again, we need to discuss authentication, because it’s a little bit stricter for the Prediction API than for Cloud Vision.
Users must be authenticated into the Prediction API in order to use it. We can still use Service Accounts to authenticate ourselves into the app, but you have to specify a few more settings when creating them, as specified in this workshop’s GitHub repository.
However, to push the authentication screen to your end users, go into the API Manager (at console.developers.google.com — note the “developers” subdomain), open up the Credentials page, and create an OAuth Client ID credential. It will guide you through steps based on the type of devices your application is targeting; then use the client secrets file it creates when you define the client object in your application. Depending on the language you are using, the objects you create to authenticate a user through a Web-based service or a standalone application might look different from those used for Service Accounts.
In this example, I took the data set from Problem Set #2 from my old Machine Learning homework from Fall 2010 (at http://users.eecs.northwestern.edu/~ahu340/eecs349-ps2/ ), and loaded in the train.csv file into my Google Cloud Storage Bucket. I had to massage the CSV file into the exact format it was looking for. This includes things like removing all double quotes and changing the outcome of the example to a text value rather than a numeric value so it would classify the outcome as a categorization rather than a regression, since we probably don’t want to force the values of “0” for no and “1” for yes to be the regression we solve for.
After loading in the training data, I took some of the classification examples and pasted them into the Prediction API sample app running locally, downloadable from this workshop’s GitHub repo.
If you want to try this in API Explorer, break out the attributes of each example into JSON in the following format (note: if copying the following, watch out for unintended curly double quotes):
{
  "input": {
    "csvInstance": [
      "104042",
      "0",
      "77",
      "m",
      "s",
      "0",
      "9",
      "7",
      "523",
      "428824",
      "24",
      "115315"
    ]
  }
}
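For completeness, here is a heavily hedged sketch of what the same prediction looked like through the old Python client library for the now-deprecated Prediction API; the project ID, model name, and credentials file are placeholders, and the method surface is reconstructed from memory rather than copied from the workshop repo.

from googleapiclient.discovery import build
from google.oauth2 import service_account

# Authenticate with a service account key file (placeholder path and scope)
credentials = service_account.Credentials.from_service_account_file(
    "service-account.json",
    scopes=["https://www.googleapis.com/auth/prediction"])

service = build("prediction", "v1.6", credentials=credentials)

body = {"input": {"csvInstance": ["104042", "0", "77", "m", "s", "0",
                                  "9", "7", "523", "428824", "24", "115315"]}}

result = service.trainedmodels().predict(
    project="my-project-id",    # placeholder GCP project
    id="my-trained-model",      # placeholder model name
    body=body).execute()

print(result.get("outputLabel"), result.get("outputMulti"))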
The ReLU (rectified linear unit) function is a popular function to use inside perceptrons.
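For reference, ReLU is just max(0, x); here is a quick NumPy sketch (my own illustration, not from the slides). TensorFlow provides it as tf.nn.relu.

import numpy as np

def relu(x):
    # Element-wise max(0, x): negative inputs are clipped to zero
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 3.0])))  # -> [0. 0. 0. 3.]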
We will describe the Softmax function in great detail later.
The old way, initialize_all_variables(), was deprecated and scheduled for removal in March 2017; global_variables_initializer() is the replacement.
The softmax function is one that attempts to normalize the vector of guesses so that all the outputs sum to exactly 1, thus converting the output of guesses into a probability distribution. It also serves to multiplicatively increase the weight given to any particular outcome per additional unit of evidence, but also reduce that weight with each unit of evidence withheld. Thus, it is often used to represent categorical distributions. After all, we are categorizing pictures of digits as 0 through 9; that doesn’t mean they will exist linearly in space away from each other, or even in the same order, if you were to look at the values of each attribute that makes the system think it’s one particular digit compared to another.
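Here is a small NumPy sketch of the idea (my own illustration, not from the example code): exponentiate the scores, then normalize so the outputs sum to 1. Subtracting the maximum first is a standard trick for numerical stability.

import numpy as np

def softmax(logits):
    shifted = logits - np.max(logits)   # stability: avoid overflow in exp()
    exps = np.exp(shifted)
    return exps / np.sum(exps)

scores = np.array([2.0, 1.0, 0.1])
print(softmax(scores))        # roughly [0.659 0.242 0.099]
print(softmax(scores).sum())  # 1.0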
Cross-entropy is used as a function to compute the cost of encoding a particular event drawn from a set. The less information required to make the distinction, the better. In this case, we can also represent how far apart the guess is from the correct answer on the training set. Cross-entropy helps perform learning faster by avoiding slowdowns that could be encountered by other functions like the quadratic cost function, where learning is impaired when it gets close to the correct answer.
Incidentally, the formula for cross-entropy has changed slightly and has been updated in this presentation. It was noted in the comments of the example code that the previous way could become numerically unstable, and so it has been changed to what is listed here. This also simplifies the model because the new function will handle taking the softmax of y.
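Here is a hedged reconstruction of the before/after, following the standard TensorFlow MNIST tutorial that this example code appears to be based on (the variable names are assumptions carried over from that tutorial).

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W) + b                       # logits: softmax not yet applied
y_ = tf.placeholder(tf.float32, [None, 10])   # one-hot correct labels

# Old way (can be numerically unstable): apply the softmax yourself, then
# compute H(y_, y) = -sum(y_ * log(softmax(y))) per example:
# cross_entropy = tf.reduce_mean(
#     -tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y)), reduction_indices=[1]))

# New way: the combined op applies the softmax to y internally, in a
# numerically stable fashion.
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))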
The first code snippet takes the argmax along dimension 1 of both y and y_ — that is, the index of the most likely digit for the prediction and for the label — and compares them. It generates a new vector with one element per example, where each result is a Boolean indicating whether the two were equal.
The second code snippet converts each Boolean into a 0 or 1, and then takes the average of all the elements in the vector to give you accuracy as a fraction.
The third code snippet actually causes the accuracy to be calculated. Note this used to be print(sess.run(accuracy, feed_dict=…)) and it’s still like that in the code examples, but the way listed here is in the latest documentation and it works fine.
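For reference, here is what those three snippets look like when reconstructed from the standard MNIST example; treat variable names like mnist, x, y, and y_ as assumptions carried over from that tutorial (and from the sketch above).

# 1. Compare the predicted digit (argmax along dimension 1 of y) with the
#    correct digit (argmax of y_), yielding one Boolean per example.
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))

# 2. Cast the Booleans to 0.0/1.0 and average them to get a fraction correct.
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# 3. Actually evaluate the accuracy on the test set. This was previously
#    written as print(sess.run(accuracy, feed_dict=...)); accuracy.eval()
#    assumes a default session, e.g. tf.InteractiveSession().
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))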
In this demo, I launched TensorFlow through a Docker instance and ran my Python code, which uses a previously generated model to classify one of the images represented in one of the datasets.
If you happened to terminate a previously-running Docker instance, you can reactivate it and get into it again with the following commands on the command line:
docker start `docker ps -q -l`    # restart the most recently created container
docker attach `docker ps -q -l`   # reattach your terminal to it
Note there are several versions of Margaret’s talk available to be found on the Internet, including one from Big Android BBQ 2016 that’s a couple months newer and might have slightly different content (like how this slide deck is different from November’s deck).