Core Audio in iOS 6 provides frameworks for low-level audio processing, including capture, playback, and effects. Its engines include Audio Units for real-time processing and Audio Queue for simpler playback and recording. Helpers handle file and stream I/O, format conversion, and audio session management. New in iOS 6 are the AUNewTimePitch unit for independent pitch and time shifting, AUSplitter and AUMatrixMixer for routing audio, and AudioQueueProcessingTap for accessing decoded PCM audio. The talk covers the components and engines of Core Audio, with examples of effects, rate shifting, and parsing web radio streams.
7. Legitimate copies!
• Amazon (paper or Kindle)
• Barnes & Noble (paper or Nook)
• Apple (iBooks)
• Direct from InformIT (paper, eBook [.epub
+ .mobi + .pdf], or Bundle)
• 35% off with code COREAUDIO3174
8. What You’ll Learn
• What Core Audio does and doesn’t do
• When to use and not use it
• What’s new in Core Audio for iOS 6
12. AV Foundation,
Media Player
Simple things should be simple,
complex things should be possible.
–Alan Kay
Core Audio
13. Core Audio
• Low-level C framework for processing
audio
• Capture, play-out, real-time or off-line
processing
• The “complex things should be possible”
part of audio on OS X and iOS
14. Chris’ CA Taxonomy
• Engines: process streams of audio
• Capture, play-out, mixing, effects
processing
• Helpers: deal with formats, encodings, etc.
• File I/O, stream I/O, format conversion,
iOS “session” management
15. Helpers: Audio File
• Read from / write to multiple audio file
types (.aiff, .wav, .caf, .m4a, .mp3) in a
content-agnostic way
• Get metadata (data format, duration,
iTunes/ID3 info)
16. Helpers: Audio File
Stream
• Read audio from non-random-access
source like a network stream
• Discover encoding and encapsulation on
the fly, then deliver audio packets to client
application
17. Helpers: Converters
• Convert buffers of audio to and from
different encodings
• One side must be in an uncompressed
format (i.e., Linear PCM)
18. Helpers: ExtAudioFile
• Combine file I/O and format conversion
• Read a compressed file into PCM buffers
• Write PCM buffers into a compressed file
19. Helpers: Audio Session
• iOS-only API to negotiate use of audio
resources with the rest of the system
• Determine whether your app mixes with
other apps’ audio, honors ring/silent
switch, can play in background, etc.
• Get notified of audio interruptions
• See also AVAudioSession
20. Engines: Audio Units
• Low-latency (~10ms) processing of
capture/play-out audio data
• Effects, mixing, etc.
• Connect units manually or via an AUGraph
• Much more on this topic momentarily…
21. Engines: Audio Queue
• Convenience API for recording or play-out,
built atop audio units
• Rather than processing on-demand and on
Core Audio’s thread, your callback provides
or receives buffers of audio (at whatever size
is convenient to you)
• Higher latency, naturally
• Supports compressed formats (MP3, AAC)
22. Engines: OpenAL
• API for 3D spatialized audio, implemented
atop audio units
• Set a source’s properties (x/y/z
coordinates, orientation, audio buffer, etc.),
OpenAL renders what it sounds like to the
listener from that location
23. Engines and Helpers
• Engines: Audio Units, Audio Queue, OpenAL
• Helpers: Audio File, Audio File Stream, Audio Converter,
ExtAudioFile, Audio Session
33. AURemoteIO
• Output unit used for play-out, capture
• A Core Audio thread repeatedly and
automatically calls AudioUnitRender()
• Must set EnableIO property to explicitly
enable capture and/or play-out
• Capture requires setting appropriate
AudioSession category
47. The problem with effect
units
• Audio Units available since iPhone OS 2.0
prefer int formats
• Effect units arrived with iOS 5 (ARMv7 era)
and only work with float format
• Have to set the AUEffect unit’s format on
AURemoteIO
52. AUNewTimePitch
parameters
• Rate: kNewTimePitchParam_Rate takes a
Float32 rate from 1/32 speed to 32x
speed.
• Use powers of 2: 1/32, 1/16, …, 2, 4, 8…
• Pitch: kNewTimePitchParam_Pitch takes
a Float32 representing cents, meaning
1/100 of a musical semitone
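The parameter ranges above can be sketched in plain C. This is a minimal illustration, not Core Audio code: snap_rate() and semitones_to_cents() are hypothetical helpers, and the ±2400 cent clamp is an assumption about the pitch parameter's range that should be checked against the AudioUnitParameters.h header.

```c
#include <assert.h>
#include <stddef.h>

/* Rates the slide recommends: powers of two from 1/32 to 32. */
static const float kAllowedRates[] = {
    1.0f / 32, 1.0f / 16, 1.0f / 8, 1.0f / 4, 1.0f / 2,
    1.0f, 2.0f, 4.0f, 8.0f, 16.0f, 32.0f
};

/* Hypothetical helper: snap a requested rate to the closest allowed
   power of two, comparing by ratio rather than by absolute difference. */
float snap_rate(float requested) {
    assert(requested > 0.0f);
    size_t count = sizeof kAllowedRates / sizeof kAllowedRates[0];
    float best = kAllowedRates[0];
    float bestRatio = 0.0f;
    for (size_t i = 0; i < count; i++) {
        float candidate = kAllowedRates[i];
        /* ratio is always >= 1; smaller means closer */
        float ratio = requested > candidate ? requested / candidate
                                            : candidate / requested;
        if (i == 0 || ratio < bestRatio) {
            best = candidate;
            bestRatio = ratio;
        }
    }
    return best;
}

/* Hypothetical helper: convert semitones to cents (100 cents per
   semitone), clamped to an assumed range of +/- 2400 cents. */
float semitones_to_cents(float semitones) {
    float cents = semitones * 100.0f;
    if (cents > 2400.0f) cents = 2400.0f;
    if (cents < -2400.0f) cents = -2400.0f;
    return cents;
}
```

In an app, the resulting values would presumably be handed to AudioUnitSetParameter() with kNewTimePitchParam_Rate or kNewTimePitchParam_Pitch in the global scope.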
53. Pitch shifting
• Pitch can vary, time does not
• Suitable for real-time sources, such as audio
capture
55. Rate shifting
• Rate can vary, pitch does not
• Think of 1.5x and 2x speed modes in
Podcasts app
• Not suitable for real-time sources, as data
will be consumed faster. Files work well.
• Sources must be able to map time
systems with
kAudioUnitProperty_InputSamplesInOutput
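To see why rate shifting does not suit real-time sources, it helps to count frames. The sketch below is plain C arithmetic with a hypothetical helper name, input_frames_needed(); it assumes the unit consumes input in direct proportion to the rate, ignoring any internal buffering or latency.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many input frames a rate shifter must
   consume to produce outputFrames frames, rounded up. */
uint64_t input_frames_needed(uint32_t outputFrames, double rate) {
    assert(rate > 0.0);
    double exact = (double)outputFrames * rate;
    uint64_t whole = (uint64_t)exact;   /* truncates toward zero */
    if ((double)whole < exact) {
        whole++;                        /* round partial frames up */
    }
    return whole;
}
```

At 2x, each 1024-frame render consumes 2048 input frames, so a microphone delivering audio at 1x can never keep up, while a file can simply be read further ahead.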
60. AudioQueue
• Easier than AURemoteIO - provide data
when you want to, less time pressure, can
accept or provide compressed formats
(MP3, AAC)
• Recording queue - receive buffers of
captured audio in a callback
• Play-out queue - enqueue buffers of audio
to play, optionally refill in a callback
62. Common AQ scenarios
• File player - Read from file and “prime”
queue buffers, start queue, when called
back with used buffer, refill from next part
of file
• Synthesis - Maintain state in your own
code, write raw samples into buffers during
callbacks
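The file-player flow (prime, start, refill on callback) can be modeled without any Audio Queue calls. Everything here is a stand-in: ModelBuffer, ModelPlayer, refill(), and play_file() are hypothetical names, the "file" is a byte array in memory, and the loop plays the role of the queue calling back with each used buffer in order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUFFER_COUNT 3
#define BUFFER_SIZE 4

typedef struct {
    unsigned char data[BUFFER_SIZE];
    size_t used;                 /* valid bytes in data */
} ModelBuffer;

typedef struct {
    const unsigned char *file;   /* in-memory stand-in for the audio file */
    size_t fileLength;
    size_t filePosition;
} ModelPlayer;

/* Copy the next chunk of the file into a buffer.
   Returns the number of bytes copied (0 at end of file). */
size_t refill(ModelPlayer *player, ModelBuffer *buffer) {
    size_t remaining = player->fileLength - player->filePosition;
    size_t n = remaining < BUFFER_SIZE ? remaining : BUFFER_SIZE;
    memcpy(buffer->data, player->file + player->filePosition, n);
    player->filePosition += n;
    buffer->used = n;
    return n;
}

/* Prime the buffers, then keep refilling each one as it is "used".
   Returns the total number of bytes played. */
size_t play_file(const unsigned char *file, size_t length) {
    assert(file != NULL);
    ModelPlayer player = { file, length, 0 };
    ModelBuffer buffers[BUFFER_COUNT];
    size_t queued = 0;           /* buffers currently in flight */
    size_t played = 0;

    /* Prime: fill buffers before the queue starts. */
    for (size_t i = 0; i < BUFFER_COUNT; i++) {
        if (refill(&player, &buffers[i]) == 0) break;
        queued++;
    }

    /* "Running" queue: buffers complete in the order they were enqueued. */
    size_t head = 0;
    while (queued > 0) {
        ModelBuffer *buffer = &buffers[head];
        played += buffer->used;              /* queue plays the buffer */
        if (refill(&player, buffer) == 0) {  /* callback: refill from file */
            queued--;                        /* nothing left to enqueue */
        }
        head = (head + 1) % BUFFER_COUNT;
    }
    return played;
}
```

Three small buffers are used only to keep the model readable; a real player would size its buffers from the file's data format.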
63. Web Radio
• Thursday class’ third project
• Use Audio File Stream Services to pick out
audio data from a network stream
• Enqueue these packets as new AQ buffers
• Dispose used buffers in callback
65. Parsing web radio
• NSURLConnection delivers NSData buffers, containing audio
and framing info. We pass them to Audio File Stream Services.
• Audio File Stream Services calls us back with parsed packets
of audio data.
• We create an AudioQueueBuffer with those packets and
enqueue it for play-out.
68. A complex thing!
• What if we want to see that data after it’s
been decoded to PCM and is about to be
played?
• e.g., spectrum analysis, effects, visualizers
• AudioQueue design is “fire-and-forget”
70. AudioQueueProcessingTap
• Set as a property on the Audio Queue
• Calls back to your function with decoded
(PCM) audio data
• Three types: pre- or post- effects (that the
AQ performs), or siphon. First two can
modify the data.
• Only documentation is in AudioQueue.h
71. Creating an AQ Tap
    // create the tap
    UInt32 maxFrames = 0;
    AudioStreamBasicDescription tapFormat = {0};
    AudioQueueProcessingTapRef tapRef;
    CheckError(AudioQueueProcessingTapNew(audioQueue,
                                          tapProc,
                                          (__bridge void *)(player),
                                          kAudioQueueProcessingTap_PreEffects,
                                          &maxFrames,
                                          &tapFormat,
                                          &tapRef),
               "couldn't create AQ tap");
Notice that you receive maxFrames and tapFormat. These do not appear to be settable.
77. AudioUnitRender()
• Last argument is an AudioBufferList, whose
AudioBuffer members have mData pointers
• If mData != NULL, audio unit does its
thing with those samples
• If mData == NULL, audio unit pulls from
whatever it’s connected to
• So we just call with AudioBufferList ioData
we got from tap callback, right?
78. Psych!
• AQ tap provides data as signed ints
• Effect units only work with floating point
• We need to do an on-the-spot format
conversion
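The format mismatch is easier to see with the conversion written out. This is a plain C sketch, not the AUConverter's implementation: it assumes 16-bit signed samples (the slide only says "signed ints") and the common convention of scaling by 32768, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert 16-bit signed samples to floats in roughly [-1.0, 1.0). */
void int16_to_float(const int16_t *in, float *out, size_t count) {
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[i] = (float)in[i] / 32768.0f;
    }
}

/* Convert floats back to 16-bit signed samples, clamping values
   that fall outside the representable range. */
void float_to_int16(const float *in, int16_t *out, size_t count) {
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        float scaled = in[i] * 32768.0f;
        if (scaled > 32767.0f) scaled = 32767.0f;
        if (scaled < -32768.0f) scaled = -32768.0f;
        out[i] = (int16_t)scaled;   /* truncates toward zero */
    }
}
```

In the recipe that follows, the two AUConverter units do this job inside the graph, so the tap never has to convert samples by hand.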
79. invalidname’s convert-
and-effect recipe
OSStatus converterInputRenderCallback (void *inRefCon,
                                       AudioUnitRenderActionFlags *ioActionFlags,
                                       const AudioTimeStamp *inTimeStamp,
                                       UInt32 inBusNumber,
                                       UInt32 inNumberFrames,
                                       AudioBufferList *ioData) {
    CCFWebRadioPlayer *player = (__bridge CCFWebRadioPlayer*) inRefCon;
    // read from buffer
    ioData->mBuffers[0].mData = player.preRenderData;
    return noErr;
}
Graph: AUConverter → AUEffect → AUConverter → AUGenericOutput
Note: in the original diagram, red arrows are float format and yellow arrows are int.
80. How it works
• AUGraph: AUConverter → AUEffect →
AUConverter → AUGenericOutput
• Top AUConverter is connected to a render
callback function
81. The trick!
• Copy mData pointer to a state variable and
NULL it in ioData
• Call AudioUnitRender() on output unit.
The NULL makes it pull from the graph.
• Top of the graph pulls on render callback,
which gives it back the mData we copied
off.
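The pointer hand-off can be modeled in a few lines of plain C. All names here (ModelBuffer, ModelPlayer, model_tap(), and so on) are hypothetical stand-ins for the AudioBufferList, the player object, and the tap and render callbacks; the "effect" is a simple gain, and the int/float conversion is left out to keep the focus on the NULL trick.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a one-buffer AudioBufferList. */
typedef struct {
    float *mData;
    size_t frames;
} ModelBuffer;

/* Stand-in for the player's state. */
typedef struct {
    float *preRenderData;   /* pointer saved by the tap */
    float gain;             /* stand-in "effect" */
} ModelPlayer;

/* Stand-in for the render callback at the top of the graph:
   hand back the pointer the tap copied off. */
static void model_render_callback(ModelPlayer *player, ModelBuffer *ioData) {
    ioData->mData = player->preRenderData;
}

/* Stand-in for rendering the graph's output unit: a NULL mData makes
   it pull from upstream, then the effect runs on the samples. */
static void model_graph_render(ModelPlayer *player, ModelBuffer *ioData) {
    if (ioData->mData == NULL) {
        model_render_callback(player, ioData);
    }
    for (size_t i = 0; i < ioData->frames; i++) {
        ioData->mData[i] *= player->gain;
    }
}

/* Stand-in for the tap callback: save the pointer, NULL it, render. */
void model_tap(ModelPlayer *player, ModelBuffer *ioData) {
    assert(player != NULL && ioData != NULL && ioData->mData != NULL);
    player->preRenderData = ioData->mData;
    ioData->mData = NULL;
    model_graph_render(player, ioData);
}
```

After model_tap() returns, the buffer points at the same memory it started with, now holding the processed samples, which is what the queue expects to play.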
82. Yes, really
This is the rest of tapProc()
    // copy off the ioData so the graph can read from it
    // in render callback
    player.preRenderData = ioData->mBuffers[0].mData;
    ioData->mBuffers[0].mData = NULL;

    OSStatus renderErr = noErr;
    AudioUnitRenderActionFlags actionFlags = 0;
    renderErr = AudioUnitRender(player.genericOutputUnit,
                                &actionFlags,
                                player.renderTimeStamp,
                                0,
                                inNumberFrames,
                                ioData);
    NSLog (@"AudioUnitRender, renderErr = %ld", (long)renderErr);
}
87. Multi-Route
• Ordinarily, one input or output is active:
earpiece, speaker, headphones, dock-
connected device
• “Last in wins”
• With the AVAudioSession “multi-route” category,
you can use several at once
• WWDC 2012 session 505
88. Utility classes moved
again
• C++ utilities, including the CARingBuffer
• < Xcode 4.3, installed into /Developer
• Xcode 4.3-4.4, optional download from
developer.apple.com
• ≧ Xcode 4.5, sample code project “Core
Audio Utility Classes”
89. Takeaways
• Core Audio fundamentals never change
• New stuff is added as properties, typedefs,
enums, etc.
• Watch the SDK API diffs document to find
the new stuff
• Hope you like header files and
experimentation
90. Q&A
• Slides will be posted to slideshare.net/
invalidname
• Code will be linked from there and my blog
• Watch CocoaConf PDX glassboard,
@invalidname on Twitter/ADN, or [Time
code]; blog for announcement
• Thanks!