Proprietary and Confidential
Building a Neural Network
State-of-the-art computer vision techniques were researched, prototyped, and then iterated on until the service began to take shape.
As the team iterated and iterated, the similarity results improved.
Let’s take a journey...
Source: 2012, Krizhevsky et al.
Hello - my name is Catherine Ulrich and I am the Chief Product Officer at Shutterstock.
Shutterstock is a technology company that enables marketers, designers, businesses and creative professionals to find and license royalty-free images, video and music for use on websites, ad campaigns, marketing materials and movies.
More than 100,000 photographers, artists, videographers and musicians upload and license their work via Shutterstock.
Uploading more than 800,000 images/video/music clips a week.
1.5 million customers.
I’m going to share some insights from leveraging artificial intelligence for visual search - I’ll focus on
How we taught the technology
How we leveraged that for mobile
The biggest challenge faced by people searching for images.
I found this image on the internet and I’d really like to use it in a marketing campaign I’m working on.
Since I don’t own the rights to this image, I need to purchase a licensed version of it so I’ll try to describe this image to find something similar on Shutterstock.
I might use the words stained, glass, window, church, angel.
As you can see it is challenging.
More keywords do not necessarily help improve the accuracy, and in fact about 65% of searches are just 1 or 2 keywords on Shutterstock - it’s not a lot for us to go off of.
The results coming up are all relevant to the keywords, but they don’t look like the image I want.
I’m not sure the words exist to describe what it is I like about this image to put into the search box.
1. Language accuracy challenge
2. Articulating how an image looks
3. Relevancy/popularity based on downloads
To help customers with this issue we created a Computer Vision team to leverage artificial intelligence for improved search functionality.
In 2012, there was a huge advancement in computer vision with the publication of the Krizhevsky paper on ImageNet classification. This is the first model our team used to build an initial service that was poor and slow, but showed promise.
A Neural Network is a system modeled after the human brain, and the way it works is that you feed in a large number of images with labels, such as cat pictures, and the system literally “learns” what a cat looks like.
The more training pictures you give it, the more it can learn. The more the system understands what is in an image, the better it is able to compute similarity. And that’s great for us, because at Shutterstock we have a LOT of high-quality data.
We sell about 5 images a second, or about 400,000 images a day.
Each of these are preceded by searches, clicks, detail page views, etc, which sums up to a LOT of training data.
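The core idea behind "computing similarity" from a trained network can be sketched very simply: each image is reduced to a vector of learned features (an embedding), and images whose vectors point in nearly the same direction are considered visually similar. A minimal illustration, with hypothetical hand-made embeddings standing in for the output of a real trained network:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two feature vectors:
    # 1.0 means identical direction, 0.0 means unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 4-dimensional embeddings; a real network would
# produce much higher-dimensional vectors from image pixels.
query = np.array([0.9, 0.1, 0.3, 0.7])
candidates = {
    "stained_glass_angel": np.array([0.85, 0.15, 0.35, 0.65]),
    "church_exterior": np.array([0.1, 0.9, 0.8, 0.2]),
}

# Rank candidate images by similarity to the query image.
ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(query, candidates[name]),
    reverse=True,
)
print(ranked[0])  # the candidate whose embedding is closest to the query
```

This is only a sketch of the ranking step; the hard part, which the training data described above makes possible, is learning embeddings in which "looks similar" actually corresponds to "small angle between vectors."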
Over the past year they iterated on making it smarter and faster, until it reached the quality level we were happy to share publicly.
I’ll show you the process of refining this and how we continue to improve this model.
The computer vision team within Shutterstock harnessed this data and were given the explicit mission of advancing our understanding of images to power next generation search experiences.
Here’s how it began to evolve over time.
Constantly refining, almost like two dials - trying to balance the pixel data that we’re feeding the model.
Picking up the colors. But only one face. No geometric pattern - but understanding the abstract.
Hyper focused on the facial elements and colors, artistic style
Too far on the geometric side of the scale - meant the facial features are lost.
This is the model we launched with and continue to refine with more data every day as customers search and download.
Both images are really some of the more complicated images we fed the model in terms of composition. This is to show how far it's come.
Now, with reverse image search for desktop launched in March, these are the results.
If you remember from the beginning, this is the image I wanted to describe and these were the results I got.
Launched on web in March.
Yesterday we launched reverse image search for mobile.
This is revolutionary: our customers can now search for images on the go.
We’re the only stock photo provider to have launched the search/download computer vision technology on mobile.
Step by step demo -
As we hoped, reverse image search is benefiting customers in non-English speaking countries most, as we can see by the share of reverse image downloads vs. traditional text search downloads by country:
For our on-demand customers, this is vastly outperforming text search.
Continue to improve and evolve as the collection grows.
Different asset types and Android later in the summer.
Transcending language.