TED: consider using the word “interesting” instead of “anomalous”… people may think you are talking about anomaly detection…
Old joke: all the world can be divided into 2 categories: Scotch tape and non-Scotch tape… This is a way to think about the co-occurrence
The only important co-occurrence is that puppy follows apple
*Take that row of matrix and combine with all the meta data we might have…
*The important thing to get from the co-occurrence matrix is this indicator.
Cool thing: analogous to what a lot of recommendation engines do
*This row forms the indicator field in a Solr document containing meta-data (you do NOT have to build a separate index for the indicators)
Find the useful co-occurrence and get rid of the rest.
Sparsify and get the anomalous co-occurrence
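Note to trainer: if the audience wants to see what "sparsify" could look like in practice, here is a minimal sketch. It assumes the interesting co-occurrences are picked with a log-likelihood ratio (LLR) test on a 2x2 contingency table; the slides do not say which test is used. The function names, the toy histories and the threshold of 3.0 are all made up for illustration and are not taken from Mahout.

```python
import math
from collections import Counter
from itertools import combinations


def xlogx(x):
    # x * ln(x), with the usual convention that 0 * ln(0) = 0
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    # Unnormalized entropy of a list of counts
    return xlogx(sum(counts)) - sum(xlogx(c) for c in counts)


def llr(k11, k12, k21, k22):
    # k11: histories with both items, k12: A without B,
    # k21: B without A, k22: neither item
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))


def indicators(histories, threshold=3.0):
    # threshold is a hypothetical cutoff, chosen only for this toy example
    n = len(histories)
    item_counts = Counter(i for h in histories for i in set(h))
    pair_counts = Counter()
    for h in histories:
        for a, b in combinations(sorted(set(h)), 2):
            pair_counts[(a, b)] += 1
    result = {item: [] for item in item_counts}
    for (a, b), k11 in pair_counts.items():
        k12 = item_counts[a] - k11
        k21 = item_counts[b] - k11
        k22 = n - k11 - k12 - k21
        if llr(k11, k12, k21, k22) > threshold:
            # Keep only the interesting co-occurrence; drop the rest
            result[a].append(b)
            result[b].append(a)
    return result


# Toy data: apple and puppy always appear together, pony is unrelated noise
histories = (
    [{"apple", "puppy", "pony"}] * 5
    + [{"apple", "puppy"}] * 5
    + [{"pony"}] * 5
    + [{"candy"}] * 5
)
ind = indicators(histories)
```

In this toy data pony co-occurs with apple just as often as chance would predict, so it is dropped, and the only indicator left for apple is puppy.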
Note to trainer: take a little time to explore this here and on the next couple of slides. Details enlarged on next slide
*This indicator field is where the output of the Mahout recommendation engine is stored (the row from the indicator matrix that identified significant or interesting co-occurrence).
*Keep in mind that this recommendation indicator data is added to the same original document in the Solr index that contains meta data for the item in question
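Note to trainer: a rough sketch of what "same document, extra field" might look like, in case it helps the discussion. The field names (`id`, `name`, `indicators`), the item id 1001 and the indicator id 1710 are placeholders invented for this example; only artist 303 comes from the slides. A real schema would define its own fields.

```python
import json


def build_solr_doc(item_id, metadata, indicator_ids):
    # Start from the item's ordinary meta-data...
    doc = {"id": str(item_id)}
    doc.update(metadata)
    # ...and add the indicator row as one space-delimited text field,
    # so no separate index is needed for the indicators.
    doc["indicators"] = " ".join(str(i) for i in indicator_ids)
    return doc


# 1001 and 1710 are placeholder ids; 303 is the artist id mentioned in the notes
doc = build_solr_doc(1001, {"name": "Chuck Berry"}, [303, 1710])

# A JSON array of documents is one shape Solr's JSON update handler accepts
payload = json.dumps([doc])
```

The point of the sketch is only that the indicators travel with the meta-data in a single document, so one index serves both search and recommendation.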
This is a diagnostics window in the LucidWorks Solr index (not the web interface a user would see). It’s a way for the developer to do a rough evaluation (laugh test) of the choices offered by the recommendation engine.
In other words, do these indicator artists, represented by their indicator IDs, make reasonable recommendations?
Note to trainer: artist 303 happens to be The Beatles. Is that a good match for Chuck Berry?