12.
<ul><li>Specifying the audio format: </li></ul>
private AudioFormat getFormat() {
    float sampleRate = 44100;
    int sampleSizeInBits = 8;
    int channels = 1;
    boolean signed = true;
    boolean bigEndian = true;
    return new AudioFormat(sampleRate, sampleSizeInBits, channels, signed, bigEndian);
}
<ul>The microphone </ul>
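A minimal sketch of what this format means for the amount of data captured, assuming the same values as the slide. The class name `AudioFormatSketch` and the helper `bytesPerSecond` are hypothetical and only serve the illustration.

```java
import javax.sound.sampled.AudioFormat;

class AudioFormatSketch {
    // Same values as the slide: 44.1 kHz, 8-bit samples, mono, signed, big-endian.
    static AudioFormat getFormat() {
        return new AudioFormat(44100f, 8, 1, true, true);
    }

    // Raw PCM bytes produced per second: frame size (bytes) times frames per second.
    static long bytesPerSecond(AudioFormat format) {
        return (long) (format.getFrameSize() * format.getFrameRate());
    }

    public static void main(String[] args) {
        AudioFormat format = getFormat();
        System.out.println(format + " -> " + bytesPerSecond(format) + " bytes/s");
    }
}
```

With 8-bit mono samples every frame is a single byte, so one second of recording is 44100 bytes.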
13.
<ul><li>Accessing the microphone: </li></ul>
final AudioFormat format = getFormat();
DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
final TargetDataLine line = (TargetDataLine) AudioSystem.getLine(info);
line.open(format);
line.start();
<ul>The microphone </ul>
14.
<ul><li>Reading the sound: </li></ul>
out = new ByteArrayOutputStream();
running = true;
try {
    while (running) {
        int count = line.read(buffer, 0, buffer.length);
        if (count > 0) {
            out.write(buffer, 0, count);
        }
    }
    out.close();
} catch (IOException e) {
    throw new RuntimeException(e);
}
<ul>The microphone </ul>
27.
<ul><li>Excellent explanation by Stuart Riffle </li><ul><li>http://altdevblogaday.com </li></ul></ul><ul>Fourier Transformation </ul>
28.
<ul><li>We've lost track of time! </li></ul><ul>Frequency domain </ul>
29.
<ul><li>Solution: Apply the transformation to pieces </li></ul>
byte[] audio = out.toByteArray();
final int amountSlices = audio.length / SLICE_SIZE;
Complex[][] results = new Complex[amountSlices][];
for (int slice = 0; slice < amountSlices; slice++) {
    Complex[] complex = new Complex[SLICE_SIZE];
    for (int i = 0; i < SLICE_SIZE; i++) {
        // Real part is the sample, imaginary part is zero
        complex[i] = new Complex(audio[(slice * SLICE_SIZE) + i], 0);
    }
    results[slice] = FFT.fft(complex);
}
<ul>Windowing </ul>
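A self-contained sketch of the same slicing idea, to show how per-slice spectra bring time back into the picture. It does not use the `Complex` and `FFT` classes from the slide; a naive O(n²) DFT stands in for `FFT.fft`, and the names `SliceSpectrumSketch`, `magnitudes`, `spectrumPerSlice` and `strongestBin` are hypothetical.

```java
class SliceSpectrumSketch {
    // Naive DFT magnitudes of one slice; a slow stand-in for a real FFT.
    static double[] magnitudes(byte[] audio, int start, int size) {
        double[] mags = new double[size];
        for (int k = 0; k < size; k++) {
            double re = 0;
            double im = 0;
            for (int n = 0; n < size; n++) {
                double angle = -2 * Math.PI * k * n / size;
                re += audio[start + n] * Math.cos(angle);
                im += audio[start + n] * Math.sin(angle);
            }
            mags[k] = Math.hypot(re, im);
        }
        return mags;
    }

    // One spectrum per slice; leftover samples at the end are ignored.
    static double[][] spectrumPerSlice(byte[] audio, int sliceSize) {
        int amountSlices = audio.length / sliceSize;
        double[][] results = new double[amountSlices][];
        for (int slice = 0; slice < amountSlices; slice++) {
            results[slice] = magnitudes(audio, slice * sliceSize, sliceSize);
        }
        return results;
    }

    // Index of the strongest bin in 1..size/2 (bin 0 is the DC offset,
    // the upper half mirrors the lower half for real-valued input).
    static int strongestBin(double[] mags) {
        int best = 1;
        for (int k = 2; k <= mags.length / 2; k++) {
            if (mags[k] > mags[best]) {
                best = k;
            }
        }
        return best;
    }
}
```

Each row of the result belongs to one slice, so a change in the strongest bin from one row to the next tells you roughly when the sound changed.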
30.
<ul><li>From Wikipedia: </li></ul>Spectrum Analyzer
A spectrum analyzer or spectral analyzer is a device used to examine the spectral composition of some electrical, acoustic, or optical waveform. <ul>Spectrum Analyzer </ul>
43.
38 42 113 131 245 </li></ul>(etc...) <ul>Matching the song </ul>
44.
<ul><li>Playing/decoding MP3 files: </li><ul><li>JLayer (real time MP3 decoder) </li><ul><li>jl1.0.1.jar </li></ul><li>MP3SPI (Java plugin, based on JLayer) </li><ul><li>mp3spi1.9.4.jar </li></ul><li>Tritonus (implementation of Java Sound API) </li><ul><li>tritonus_share.jar </li></ul></ul></ul><ul>Something to match against </ul>
45.
<ul><li>Harvesting my music collection: </li></ul>
public void harvest(File rootDirectory) {
    String[] itemsInDirectory = rootDirectory.list();
    if (itemsInDirectory == null) {
        return; // Not a directory, or it could not be read
    }
    for (String itemInDirectory : itemsInDirectory) {
        File item = new File(rootDirectory, itemInDirectory);
        if (itemInDirectory.endsWith(".mp3")) {
            // Assume mp3 file
            captureAudio(item);
        } else if (item.isDirectory()) {
            // Directory? Recurse!
            harvest(item);
        }
    }
}
<ul>Something to match against </ul>
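As an alternative to the hand-written recursion, the same walk can be sketched with `java.nio.file.Files.walk`. This is not the talk's code: the class `Mp3Finder` and method `findMp3Files` are hypothetical, and the sketch only collects the paths instead of calling `captureAudio`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class Mp3Finder {
    // Returns every regular file below root whose name ends in ".mp3" (case-insensitive).
    static List<Path> findMp3Files(Path root) throws IOException {
        // Files.walk keeps directory handles open, so close the stream when done.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".mp3"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

Each returned path could then be handed to the fingerprinting step one at a time.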
46.
<ul><li>We have: </li></ul><ul><ul><li>Set of +/- 3000 files of reference data (songs)
47.
Way of capturing key moments with microphone </li></ul></ul><ul><li>Let's do some matching! </li></ul><ul>What we have now </ul>
48.
<ul><li>Create a single hash per slice </li></ul>
private static final int FUZ_FACTOR = 2;

private long hash(String line) {
    String[] p = line.split(" ");
    long p1 = Long.parseLong(p[0]);
    long p2 = Long.parseLong(p[1]);
    long p3 = Long.parseLong(p[2]);
    long p4 = Long.parseLong(p[3]);
    // long p5 = Long.parseLong(p[4]); // Not using the fifth point currently
    // Round each point down to a multiple of FUZ_FACTOR, then pack the points into one number
    return (p4 - (p4 % FUZ_FACTOR)) * 100000000
         + (p3 - (p3 % FUZ_FACTOR)) * 100000
         + (p2 - (p2 % FUZ_FACTOR)) * 100
         + (p1 - (p1 % FUZ_FACTOR));
}
<ul>Hash function </ul>
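A small sketch of what the fuzz factor buys: points that differ by less than `FUZ_FACTOR` can round down to the same value and then produce the same hash. The class name `FuzzyHashSketch` and the sample lines are made up for illustration, and the input is assumed to be space-separated points.

```java
class FuzzyHashSketch {
    private static final int FUZ_FACTOR = 2;

    // Same arithmetic as the slide; only the first four points are used.
    static long hash(String line) {
        String[] p = line.trim().split("\\s+");
        long p1 = Long.parseLong(p[0]);
        long p2 = Long.parseLong(p[1]);
        long p3 = Long.parseLong(p[2]);
        long p4 = Long.parseLong(p[3]);
        return (p4 - (p4 % FUZ_FACTOR)) * 100000000L
             + (p3 - (p3 % FUZ_FACTOR)) * 100000L
             + (p2 - (p2 % FUZ_FACTOR)) * 100L
             + (p1 - (p1 % FUZ_FACTOR));
    }

    public static void main(String[] args) {
        // Each point of the second line is one higher, which rounds down to the same values.
        System.out.println(hash("40 80 120 180 245")); // 18012008040
        System.out.println(hash("41 81 121 181 245")); // 18012008040
    }
}
```

A larger fuzz factor tolerates more noise in the recording, at the cost of more accidental collisions between unrelated slices.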
49.
<ul><ul><ul><ul><li>Load all the reference hashes
64.
<ul><li>Now we group the results: </li></ul>
2x: Song 6 with offset 8
1x: Song 4 with offset 2
1x: Song 4 with offset 3
1x: Song 4 with offset 4
1x: Song 8 with offset 2
1x: Song 5 with offset -1
Matching algorithm #2
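The grouping step can be sketched as counting how often each (song, offset) pair occurs and keeping the most frequent one. This is a guess at the shape of the step, not the talk's implementation; the class `MatchGroupingSketch`, the method `bestMatch` and the `int[]{songId, offset}` representation are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

class MatchGroupingSketch {
    // Each match is {songId, offset}. Returns {songId, offset, count} for the
    // most frequent pair, or null if there are no matches.
    static int[] bestMatch(int[][] matches) {
        Map<String, Integer> counts = new HashMap<>();
        int[] best = null;
        int bestCount = 0;
        for (int[] m : matches) {
            String key = m[0] + ":" + m[1];
            int count = counts.merge(key, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = new int[] {m[0], m[1], count};
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // The matches from the slide, in arbitrary order.
        int[][] matches = {{6, 8}, {4, 2}, {4, 3}, {4, 4}, {8, 2}, {5, -1}, {6, 8}};
        int[] best = bestMatch(matches);
        System.out.println(best[2] + "x: Song " + best[0] + " with offset " + best[1]);
    }
}
```

Song 4 appears three times but with three different offsets, so it loses to song 6, which appears twice with the same offset.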
69.
<ul><li>What are other uses for this algorithm? </li><ul><li>Speech recognition? </li><ul><li>Probably not. </li></ul><li>Detecting duplicate songs in your music collection? </li><ul><li>Yes! Took 5 minutes for a crude implementation </li></ul><li>Subtitle synchronisation in India! </li></ul></ul>Other uses for this algorithm
70.
... Landmark Digital Services owns the patents that cover the algorithm used as the basis for your recently posted “Creating Shazam In Java”. While it is not Landmark’s intention to alienate those in the Open Source and Music Information Retrieval community, Landmark must request that you do not ship, deploy or post the code presented in your post. Landmark also requests that in the future you do not ship, deploy or post any portions or versions of this code in its current state or in any modified state. We hope you understand our position and that we would be legally remiss not to make this request. We appreciate your immediate attention and response. ... Landmark Digital Services
71.
<ul><li>After this email I contacted: </li></ul><ul><ul><li>Arnoud Engelfriet (Dutch IT lawyer, patent attorney)
73.
And others. </li></ul></ul>Getting information
74.
<ul><li>From another email: </li></ul><ul><ul><li>As I'm sure you are aware, your blogpost may be viewed internationally. As a result, you may contribute to someone infringing our patents in any part of the world. While we trust your good intentions, yes, we would like you to refrain from releasing the code at all and to remove the blogpost explaining the algorithm. </li></ul></ul>Now the blogpost?