Jesper Sindahl Nielsen, State and University Library, Denmark, presented algorithms for automated quality assurance of audio files in the context of preservation actions and
access. Cross correlation is used to compare the sound waves.
In: iPRES 2012 – Proceedings of the 9th International Conference on Preservation of Digital Objects. Toronto 2012, 144-149.
ISBN 978-0-9917997-0-1
Audio QA Using Cross Correlation
1. SCAPE
Audio Quality Assurance
An application of cross correlation
Jesper Sindahl Nielsen
The State and University Library & MADALGO
iPRES
Toronto, 2012
This work was partially supported by the SCAPE Project.
The SCAPE project is co-funded by the European Union under FP7 ICT-2009.4.1 (Grant Agreement number 270137).
2. SCAPE
The State and University Library
• Large national collections
• Radio & television:
• More than 1,000,000 hours – approx. 1 PB of data
• Web archive
• More than 8 billion pages – approx. 300 TB of data
• Upcoming newspaper digitization project
• 32 million pages – approx. 800 TB of data
• Many other collections of almost any kind and size
• Digital preservation challenges at large scale
• Fits perfectly with the overall objectives of SCAPE
3. SCAPE
The Two Problems
• Overlap
• Input: 2 Audio (wav) files
• Guarantee: They overlap within the last 6 minutes
• Output: The exact timestamp where the overlap starts
• Quality Assurance of Migration
• Input: 2 Audio (wav) files
• Guarantee: Playback of the same file with different players
• Output: Whether the two files are ‘enough’ alike
4. SCAPE
Cross Correlation
• Main component of our solutions
• Well known technique (folklore by now)
• An audio (wav) file is just a function (signal)
• At sample 1 we have an amplitude, f(1)
• At sample 2 we have an amplitude, f(2)
• ... At sample n, we have an amplitude, f(n)
• What it does: given two functions, it computes how far to shift one function along the x-axis so that the two have the highest correlation.
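As an illustration of the idea, here is a minimal sketch of cross correlation done the direct way. The function name `best_shift`, the `max_shift` bound and the toy signals are assumptions made for this example, not taken from the talk, and the score is a plain (unnormalised) sum of products.

```python
def best_shift(f, g, max_shift):
    """Return the shift s (in samples) that maximises sum_i f[i + s] * g[i]."""
    best_s, best_score = 0, float("-inf")
    for s in range(-max_shift, max_shift + 1):
        score = 0.0
        for i, g_i in enumerate(g):
            j = i + s
            if 0 <= j < len(f):  # samples outside f count as silence
                score += f[j] * g_i
        if score > best_score:
            best_s, best_score = s, score
    return best_s


# Toy signals: g holds the same pattern as f, two samples earlier.
f = [0, 0, 0, 1, 3, -2, 4, 0, 0, 0]
g = [0, 1, 3, -2, 4, 0, 0, 0, 0, 0]
print(best_shift(f, g, max_shift=5))  # prints 2
```

Every candidate shift costs one pass over the signal, which is where the quadratic running time of the direct approach comes from.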
10. SCAPE
Cross Correlation: Example
[Figure: example cross correlation, plotted over samples. The marked alignment had the highest correlation, thus the output is '2'.]
11. SCAPE
Cross Correlation
• Naively running this algorithm is slow
• Running time O(n^2)
• Using Fourier transforms it is much faster
• Running time O(n log n)
• Any textbook on signal processing will describe this procedure.
• A short summary can be found in the paper as well
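The Fourier route can be sketched as follows: transform both zero-padded signals, multiply one spectrum by the complex conjugate of the other, and transform back. This is only a sketch under assumptions: the small recursive FFT and the function names are written for this example so that it runs without any library, whereas a real tool would call an FFT library instead.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def cross_correlate_fft(f, g):
    """Return c where c[s % len(c)] = sum_i f[i + s] * g[i]."""
    size = 1
    while size < len(f) + len(g):  # pad so shifts do not wrap around
        size *= 2
    F = fft(list(f) + [0.0] * (size - len(f)))
    G = fft(list(g) + [0.0] * (size - len(g)))
    product = [a * b.conjugate() for a, b in zip(F, G)]
    return [v.real / size for v in fft(product, inverse=True)]


def best_shift_fft(f, g):
    """Shift with the highest correlation; negative shifts sit at the end of c."""
    c = cross_correlate_fft(f, g)
    k = max(range(len(c)), key=lambda i: c[i])
    return k if k < len(f) else k - len(c)


f = [0, 0, 0, 1, 3, -2, 4, 0, 0, 0]
g = [0, 1, 3, -2, 4, 0, 0, 0, 0, 0]
print(best_shift_fft(f, g))  # prints 2
```

All shifts are scored by three transforms of length O(n), which gives the O(n log n) bound.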
12. SCAPE
The Overlap problem
• We have 15+ years of radio broadcast from DR
(Danish Radio) on tape in 2-hour chunks.
• Recorded using 2 tape recorders.
• Overlap occurs
• Then digitized
• Situation: Someone wants to listen to a program that
spans two tapes (files)
• Don’t want to listen to the same clip twice
13. SCAPE
The Overlap problem
• Solutions?
• Find the longest suffix of the first file that is a prefix of the
second file (excluding metadata etc.) – bitwise comparison.
• Does not work. Audio files sounding the same do not necessarily
have the same bit pattern.
• Fingerprinting techniques
• Seems excessive.
• Some of them even calculate correlation as a subroutine.
• Cross Correlation
• It finds exactly what we want.
• Cut out the last 6 minutes of the first file and the first 6 minutes of
the second file, use cross correlation on the two clips.
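The clip-and-correlate step can be sketched like this. It is a toy illustration, not the implemented tool: the function names, the in-memory sample lists, the 100 Hz sample rate and the 2-second window are all assumptions chosen to keep the example small (the talk uses 6-minute clips), and the correlation is computed the slow, direct way.

```python
import random


def best_shift(f, g):
    """Shift s maximising sum_i f[i + s] * g[i], over all shifts (direct method)."""
    best_s, best_score = 0, float("-inf")
    for s in range(-(len(g) - 1), len(f)):
        lo, hi = max(0, -s), min(len(g), len(f) - s)
        score = sum(f[i + s] * g[i] for i in range(lo, hi))
        if score > best_score:
            best_s, best_score = s, score
    return best_s


def overlap_start_seconds(first, second, sample_rate, window_seconds):
    """Time in `first` (seconds) at which the content of `second` begins."""
    window = int(window_seconds * sample_rate)
    tail = first[-window:]          # end of the first recording
    head = second[:window]          # start of the second recording
    tail_start = len(first) - len(tail)
    shift = best_shift(tail, head)  # head[i] lines up with tail[i + shift]
    return (tail_start + shift) / sample_rate


# Toy data: one long "broadcast" cut into two recordings that overlap.
rng = random.Random(1)
program = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
first, second = program[:600], program[450:]
print(overlap_start_seconds(first, second, sample_rate=100, window_seconds=2))
```

By construction the second recording starts at sample 450 of the first, that is 4.5 seconds in, which is the timestamp the sketch is meant to recover.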
14. SCAPE
The Overlap problem
• We implemented the procedure
• It becomes quite simple when relying on FFT libraries
• So it relies on FFTW (“Fastest Fourier Transform in the West”)
• Results
• Has been run on approximately 3 months of broadcasts
• Around 1000 files, took around 85 hours
• Found errors in the collection (missing files, wrong channel, ...; ~3%)
• The rest has been nicely cut (91% of the 3 months)
• 5 minutes per overlap, including cutting the files and using a
Quality Assurance check
15. SCAPE
The Migration Problem
• Over time file types become endangered
• When did you last listen to a ‘real audio’ clip?
• How many ‘gif’ images do you encounter today versus 5 or
10 years ago?
• We still want to be able to hear/view them in fifty
years
• Solution: Migrate to a different, more preservation
friendly format.
• Real Audio → WAV files
16. SCAPE
The Quality Assurance Problem
• How can we be sure the content is the same?
• We need methods for doing QA for audio
• Simple methods: Is the length the same before and after?
• Better: extract more sophisticated features from the content.
• Do they have the same average sound level?
• Our suggestion: Use two different migration programs and cross
correlate the output
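To see why the simple checks are not enough on their own, here is a small sketch of the length and sound-level comparison. Using RMS as the "average sound level", the 5% level tolerance and the name `quick_checks` are assumptions made for this example.

```python
import math


def rms(samples):
    """Root-mean-square level of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def quick_checks(a, b, length_tolerance=0, level_tolerance=0.05):
    """Return (same_length, same_level) for two sample lists."""
    same_length = abs(len(a) - len(b)) <= length_tolerance
    level_a, level_b = rms(a), rms(b)
    # Compare levels relative to the louder of the two signals.
    limit = level_tolerance * max(level_a, level_b, 1e-12)
    same_level = abs(level_a - level_b) <= limit
    return same_length, same_level


audio = [0.0, 0.5, -0.5, 1.0]
print(quick_checks(audio, list(reversed(audio))))  # prints (True, True)
print(quick_checks(audio, [0.0] * 4))              # prints (True, False)
```

The reversed signal passes both checks although it would sound nothing like the original, which is the gap a sample-by-sample comparison such as cross correlation is meant to close.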
18. SCAPE
The Quality Assurance Algorithm
• Split the two output files into blocks of ~5 seconds.
• If all blocks have the same offset (found by cross correlating
corresponding blocks), within a fixed parameter, we conclude the two
files are equal and the migration went as it should
• Otherwise report an error
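The block-wise check can be sketched as below. This is an illustration under assumptions, not the implemented tool: the names `block_offset` and `files_match`, the `max_shift` search bound, the block length in samples and the synthetic test signals are all invented for the example, and the per-block correlation is computed directly rather than with an FFT.

```python
import random


def block_offset(a, b, max_shift):
    """Shift s in [-max_shift, max_shift] maximising sum_i a[i + s] * b[i]."""
    best_s, best_score = 0, float("-inf")
    for s in range(-max_shift, max_shift + 1):
        lo, hi = max(0, -s), min(len(b), len(a) - s)
        score = sum(a[i + s] * b[i] for i in range(lo, hi))
        if score > best_score:
            best_s, best_score = s, score
    return best_s


def files_match(a, b, block_len, max_shift, tolerance):
    """True if every pair of blocks reports the same offset, within tolerance."""
    n = min(len(a), len(b))
    offsets = [
        block_offset(a[start:start + block_len], b[start:start + block_len], max_shift)
        for start in range(0, n - block_len + 1, block_len)
    ]
    return bool(offsets) and max(offsets) - min(offsets) <= tolerance


rng = random.Random(3)
original = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
delayed = [0.0, 0.0, 0.0] + original[:-3]  # same audio, 3 samples late
# Second half replaced by noise, standing in for a partly failed migration.
damaged = delayed[:1000] + [rng.uniform(-1.0, 1.0) for _ in range(1000)]

print(files_match(original, delayed, block_len=200, max_shift=10, tolerance=0))
print(files_match(original, damaged, block_len=200, max_shift=10, tolerance=0))
```

A constant delay between the two outputs is accepted because every block reports the same offset. Blocks of noise have no meaningful best offset, so their reported offsets scatter and the spread exceeds the tolerance.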
19. SCAPE
QA algorithm Results
• We needed a data set
• We do not actually have any migration errors (yet!)
• Solution: make an artificial data set
• Some files turned into complete garbage (except header)
• Some files had some parts replaced by garbage
• Rest were OK
• The data set was 70 file pairs where
• 3 files were complete garbage
• 5 files were partly garbage
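One way such an artificial data set could be produced is sketched here. How the actual files were damaged is not described in the slides, so everything in this block is an assumption: the function names, the 44-byte header size (the canonical PCM WAV layout, which real files may not follow), and the choice to garble the middle 10% for the partial case.

```python
import random

HEADER_BYTES = 44  # assumption: canonical PCM WAV header size


def garble(data, start, end, seed=0):
    """Replace data[start:end] with pseudo-random bytes; length is unchanged."""
    rng = random.Random(seed)
    noise = bytes(rng.randrange(256) for _ in range(end - start))
    return data[:start] + noise + data[end:]


def complete_garbage(data, seed=0):
    """Keep the header, turn everything after it into noise."""
    return garble(data, HEADER_BYTES, len(data), seed)


def partial_garbage(data, fraction=0.1, seed=0):
    """Turn a stretch in the middle of the audio data into noise."""
    body = len(data) - HEADER_BYTES
    length = int(body * fraction)
    start = HEADER_BYTES + (body - length) // 2
    return garble(data, start, start + length, seed)


wav_bytes = bytes(1000)  # stand-in for the contents of a real WAV file
broken = complete_garbage(wav_bytes)
print(len(broken) == len(wav_bytes), broken[:HEADER_BYTES] == wav_bytes[:HEADER_BYTES])
```

Keeping the header intact means the damaged file still opens as audio, so only a content-level check can tell it apart from a good migration.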
20. SCAPE
QA algorithm Results
• How did we do?
• The tool reported all the introduced errors
• It reported false positives
• These can be removed by tweaking parameters
• How long did it take?
• 70 files – 4 hours and 45 minutes (each file is 2 hours).
• ~4-5 minutes per case
• How much memory does it use?
• Almost nothing (less than 10 MB, no matter the input size)
21. SCAPE
Summary
• Overlap
• On average about 5 minutes per file.
• Seems to be very accurate
• Can we do it faster? – integrate with SCAPE platform?
• Migration
• On average about 5 minutes per case
• Caught the errors we thought of
• Which errors actually occur?
• Can we apply these ideas in other contexts?
• Video? Images? Is it too slow for this?
22. SCAPE
Questions?
Thank you
Code can be found at:
https://github.com/openplanets/scape-xcorrsound
A blog post about xcorrSound can be found on
http://www.openplanetsfoundation.org/blogs/