twitter.com/tudorcristea
Pies, pipes, or any
combination thereof are not
involved.
Refers to the story of the
Pied Piper of Hamelin: follow
the music.
Root problem: music
information retrieval.

ID3 information
(CDDB/Gracenote) is usually
flawed.
Everybody likes music.
What kind of music?
Difficult to say? That is
because it is a difficult (as in
subjective) question.
Usually neither we nor the
artists we listen to limit our
range to just one music
genre.
We like stuff that sounds like
the stuff we like.
Tough problem: teach that to
a machine.
Tough, but not impossible:
teach it what "sounds like"
means.
Luckily, music…
…has hues.
First of all, we’re going to
need some (legal) teaching
material:
ccmixter.org/query-api
Second, a way to extract
some meaningful features
out of those MP3 files:
ifs.tuwien.ac.at/mir/audiofeatureextraction.html
Rhythm Histograms   Rhythm Patterns
3.243435282592375 1.994601142999418 1.602942870377543 2.580839981133314 1.543740353999544 2.068105015908046 3.632687882087414
3.263291634847903 3.044904881070487 4.5829566086821 1.956934469923299 2.580545989450239 2.429407925390764 2.788994073797176
1.690166884934422 1.555860154231844 2.538332189091759 3.308823909719808 2.901616552714908 2.822014163802716 3.057580859936266
4.119539882459613 5.453411739995226 5.700965142221574 3.644340096592682 2.396144855618434 2.668955574617245 3.084844387219049
2.390710363180224 3.76382419342901 5.030486764491426 4.300433534348324 4.821028562923828 4.280883335826844 2.190830721855284
2.961580313318945 4.34391019602777 2.951321599170762 2.449444142519419 4.812316305000328 3.951498800499126 3.917810868262246
3.715130504668615 3.998629684199756 5.181464895638284 6.084726075190199 7.913084886632093 3.706451224577581 4.698003385499159
3.824291021844686 5.384668543825263 3.953003636302575 3.428872039025952 4.824143429813525 4.541042340953487 4.264937057101506
7.33094855091972 4.580084845659833 6.295917883436843 4.14291707342239 4.954729182869467 5.254899441852804 7.756213636796876
7.677458410851154 8.173543349122873 7.614100030473213 7.518159783076418 7.843224352722845 7.125038216943838 5.195612434552322
5.151940097341069 3.0640282849477 5.918468920797019 6.15884358569437 4.81452719506864 5.800933882702959 5.744174364919134
7.569921298684167 6.712979004287242 7.717992870245721 5.684438150875594 7.127374728497071 4.792119949151986 5.20195100901256
4.384040088867235 7.480543165254396 6.068555547993299 7.616701708848335 7.766169335357228 11.382122364726827 12.858904597857029
19.849080334434777 7.158426099912344 4.517127293813476 6.274332358430609 8.472280658535901 7.801381678697753 8.758855739959486
8.997139007750159 9.00666057300392 7.949432386638614 8.794768734764487 8.989550890287987 8.302698807962054 14.182123963447646
8.391117712214388 11.11758041425602 9.466868578773333 8.624599196425805 9.839857864388469 10.01151957620144 14.268381887311374
13.453466400101796 13.307992403194998 14.386721104633946 8.597042271214999 5.38266610228167 8.56467105950794 9.081842335644312
8.751284030192279 7.571968441155724 8.989899908097717 10.798020955384793 10.240883714858885 9.95734131838057 13.880290227708793
9.695548728433801 8.659073120491788 8.454564226658066 15.185547619830412 10.20496195273381 9.444081140439875 9.5477974200691
9.295915277584736 13.426555546246881 14.028824815440144 12.40471194483655 15.310328654641358 3.859719187222005 5.016493848604693
4.94214997156282 8.578907909909004 6.94284649966117 5.369714403322543 9.600482742973435 9.755788157713518 10.433168122908866
10.848681227375263 7.679992449798913 8.094613338358835 10.093465358580273 10.102689258506887 10.20504928557266 10.207752963407055
10.160430268333766 10.649981306839111 9.65257694380794 8.186103579325769 10.115554717636922 9.81887110074179 14.470708806486687
6.260998589366658 7.145961791822931 4.400459800176191 6.432841170064679 7.207109587185632 10.054109307587924 8.148560378534384
9.481813014405802 9.496118214594299 10.386204280440149 12.76674363352633 12.460836669545115 15.961674114984211 12.717438599448013
13.695306246773496 16.741151349256675 11.890804211283207 13.184462990253897 12.343617842695329 15.58978678059538
11.602280125984032 12.516441905610472 13.799204673707482 7.101041266172024 4.993316589832442 5.644348388281271 4.676078264590855
4.546299225476101 6.93578128842586 8.276079288197144 8.23367669095205 10.748422886150713 10.771386105217127 11.032099291318998
7.812005415842901 9.178449512371232 10.23516826121661 10.90301593546188 10.189220210347054 8.810998620905725 11.787183547206608
9.296236912340708 10.738220788891837 9.522449017668217 12.557179627621437 14.192937597513765 5.996579734461314 4.551063833905184
4.831507524016216 6.339155247104576 6.183923078107966 8.454775808034872 10.800599276392719 12.00146662496045 14.061079731055917
11.240781910608634 10.14613611148149 8.220503778510432 7.665807831411472 7.629925985769293 5.676148468868258 6.507623392477133
7.136358910584341 10.375461251213538 8.949849488418403 11.611373119952349 11.95095051883064 15.7706208019365 16.856335961144552
12.191199445332309 7.788081303313621 11.66089295005103 14.184085123792396 15.043473556928557 11.762577736576716 25.905857397783336
16.716924329462138 25.499979343490466 20.730583684244632 25.659206437972436 22.267453929132937 31.362526063314156
30.022068712065376 35.37839821990554 28.467015190336838 23.240537780206445 27.036056141332217 22.285161424867912 41.07112692673061
59.47331283268421 78.34108682145228 98.71614105494565 8.977171134544522 6.904445013840824 5.079352087117484 6.767011796959547
4.529028036594085 6.974298791708502 8.923714823963904 8.243909316285263 10.317986757880721 8.695841266592888 10.717309150680364
12.49861281405036 10.461037143501649 8.995606768277778 9.136664427796598 12.38754496496926 10.635964588452383 8.69771967108946
7.503901567503874 10.248588034536084 11.686036205118295 14.447685802137656 15.301764309325383 7.355452352186035 4.931358344681889
4.96446501860343 6.604293070535694 5.817617856710792 5.595589306370536 10.169040335952701 7.656965890729936…




Sounds good?
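That wall of numbers is what a track looks like once its features are extracted. A minimal sketch of reading such a dump back into a vector, assuming the values arrive as whitespace-separated decimals. `parse_vector` is a hypothetical helper, not part of the extractor, and the sample is only the first seven values shown above:

```python
def parse_vector(text):
    """Turn whitespace-separated decimals into a list of floats."""
    return [float(token) for token in text.split()]


# First seven values from the dump above.
sample = (
    "3.243435282592375 1.994601142999418 1.602942870377543 "
    "2.580839981133314 1.543740353999544 2.068105015908046 "
    "3.632687882087414"
)

vector = parse_vector(sample)
print(len(vector), max(vector))
```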
Lastly, a way to store,
process, and interpret the
results in a distributed fashion:
hbase.apache.org
hadoop.apache.org
mahout.apache.org
Step 1


         offset = 0            offset = 10              offset = 20



           Mapper                         Mapper                      Mapper


                      for index in offset .. limit do
                        download(index)
                      end for




                                                   pp/downloads/$tag
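Step 1 can be sketched roughly as follows: each mapper gets a starting offset and asks for its own page of tracks. The endpoint path and the parameter names (`tags`, `f`, `offset`, `limit`) are assumptions to be checked against ccmixter.org/query-api, and the helper names are hypothetical. Nothing is downloaded here; the sketch only shows how the work is split.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against ccmixter.org/query-api before use.
QUERY_ENDPOINT = "http://ccmixter.org/api/query"


def mapper_offsets(total, page_size):
    """Starting offset for each mapper: 0, page_size, 2 * page_size, ..."""
    return list(range(0, total, page_size))


def query_url(tag, offset, limit):
    """URL one mapper would request to list its slice of tracks for a tag."""
    params = {"tags": tag, "f": "json", "offset": offset, "limit": limit}
    return QUERY_ENDPOINT + "?" + urlencode(params)


def download_path(tag):
    """Target directory, following the slide's pp/downloads/$tag layout."""
    return "pp/downloads/" + tag


# Three mappers, ten tracks each, as in the diagram above.
offsets = mapper_offsets(total=30, page_size=10)
urls = [query_url("jazz", offset, 10) for offset in offsets]
```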
Step 2


         index = 0            index = 10            index = 20



           Mapper                       Mapper                   Mapper


                     for mp3 in index .. limit do
                       extract(mp3)
                     end for




                                                         features table
Rows and columns in HBase
(Table, RowKey, Family, Column, Timestamp) → Value
A time-oriented view
 into parts of a row
$tag + $url = { <- row key
  data = { <- column family
    ssd = …, <- column qualifier
    rp = …,
    rh = …,
    artist = …,
    title = …
  }
}
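The (Table, RowKey, Family, Column, Timestamp) → Value mapping can be pictured as nested maps. The sketch below is an in-memory stand-in for that data model, not an HBase client; the `Table` class, the example URL, and the values are hypothetical.

```python
import time


class Table:
    """In-memory stand-in for the HBase data model (not an HBase client)."""

    def __init__(self):
        # row key -> column family -> column qualifier -> {timestamp: value}
        self.rows = {}

    def put(self, row_key, family, qualifier, value, timestamp=None):
        """Store one version of a cell; default timestamp is now, in ms."""
        ts = timestamp if timestamp is not None else int(time.time() * 1000)
        family_map = self.rows.setdefault(row_key, {}).setdefault(family, {})
        family_map.setdefault(qualifier, {})[ts] = value

    def get(self, row_key, family, qualifier, timestamp=None):
        """Newest version at or before `timestamp` (newest overall if None)."""
        versions = self.rows[row_key][family][qualifier]
        candidates = [ts for ts in versions if timestamp is None or ts <= timestamp]
        return versions[max(candidates)]


features = Table()
row_key = "jazz" + "http://example.org/track.mp3"  # $tag + $url (made-up values)
features.put(row_key, "data", "artist", "first", timestamp=1)
features.put(row_key, "data", "artist", "second", timestamp=2)
```

Asking for the cell without a timestamp returns the latest version; passing `timestamp=1` gives the time-oriented view of the same cell.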
Step 3
???




                       for example in features[$tag] do
                         train(example)
                       end for




$tag = { data = { sgd = … } }                             models table
Gradient descent visualization
Gradient descent algorithm
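A minimal sketch of what training one per-tag model by stochastic gradient descent could look like, assuming a plain logistic regression with label 1 for "has the tag" and 0 otherwise. This is an illustration in pure Python, not Mahout's API; the function names, learning rate, epoch count, and toy data are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_sgd(examples, dims, rate=0.1, epochs=200):
    """Logistic regression by stochastic gradient descent.

    examples: list of (features, label) pairs, label is 1 or 0.
    Returns the weights, with the bias stored as the last element.
    """
    weights = [0.0] * (dims + 1)
    for _ in range(epochs):
        for features, label in examples:
            x = list(features) + [1.0]          # constant input for the bias
            predicted = sigmoid(sum(w * v for w, v in zip(weights, x)))
            error = predicted - label           # gradient of the log loss w.r.t. z
            for i, v in enumerate(x):
                weights[i] -= rate * error * v  # step against the gradient
    return weights


def score(weights, features):
    """Model output in (0, 1) for one feature vector."""
    x = list(features) + [1.0]
    return sigmoid(sum(w * v for w, v in zip(weights, x)))


# Toy two-dimensional data standing in for real feature vectors.
toy = [([0.0, 0.1], 0), ([0.2, 0.0], 0), ([0.9, 1.0], 1), ([1.0, 0.8], 1)]
model = train_sgd(toy, dims=2)
```

Each update looks at a single example, which is what makes the loop in Step 3 easy to feed from a table one row at a time.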
Step 4           Upload a file

Profit!




                Extract features




          for model in models do
            classify(example, model)
          end for




                 Display results
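The classification loop in Step 4 can be sketched as scoring the uploaded track against every tag's model and sorting the results. The model layout (`weights`, `bias`), the hand-written numbers, and the helper names are hypothetical; real models would be read from the models table.

```python
import math


def classify(example, model):
    """Score in (0, 1) of one tag's linear model for one track."""
    z = model["bias"] + sum(w * x for w, x in zip(model["weights"], example))
    return 1.0 / (1.0 + math.exp(-z))


def rank_tags(example, models):
    """Score the example against every tag's model, best match first."""
    scores = {tag: classify(example, model) for tag, model in models.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical, hand-written models for illustration only.
models = {
    "rock": {"weights": [2.0, -1.0], "bias": 0.0},
    "jazz": {"weights": [-1.0, 2.0], "bias": 0.0},
}
results = rank_tags([1.0, 0.0], models)
```

Because every tag has its own model, a track can score high on several tags at once, which fits the earlier point that music rarely sticks to one genre.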
PiedPiper – Distributed Machine Learning Applied In Music Genre Recognition