This document summarizes key techniques used in the Dubsmash iOS app for video creation and editing: using AVFoundation for video capture, stitching together multiple video parts while keeping them synchronized to audio, and rendering additional layers such as text on top of the video. It also covers common pitfalls, such as crashes from doing AV work while the app is in the background and unhelpful AVFoundation errors.
2. Unavoidable intro
Who the fu*k am I?
● First off - Gasper
● Founded my own mobile dev company back in Slovenia
● Spent the past half a year driving the crazy train that is Dubsmash’s iOS app
● Dubsmash is aiming to revolutionize the way we do video communication
3. Video creation process in Dubsmash app
Overview
● Video capture
● Stitching together several video parts
● Merging sound with video
● Rendering additional layers on top of the video
5. Video capture
It all starts here...
Key components
● Audio player
● Video camera object
Obstacles to overcome
● Different screen sizes and aspect ratios
● Keeping video capture in sync with sound
● Making the video personal by maintaining eye contact
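As a rough sketch of the capture side, the session setup might look like this (Swift 2 era API to match the other snippets; `SimpleVideoCamera` and its error handling are illustrative assumptions, not the actual Dubsmash class):

```swift
import AVFoundation

class SimpleVideoCamera {
    let captureSession = AVCaptureSession()

    func setUp() throws {
        captureSession.sessionPreset = AVCaptureSessionPresetHigh
        // Prefer the front camera so the user keeps facing the screen
        // (helps with the "eye contact" goal)
        let devices = AVCaptureDevice.devicesWithMediaType(AVMediaTypeVideo) as! [AVCaptureDevice]
        guard let frontCamera = devices.filter({ $0.position == .Front }).first else {
            throw NSError(domain: "SimpleVideoCamera", code: -1, userInfo: nil)
        }
        let input = try AVCaptureDeviceInput(device: frontCamera)
        // Batch configuration changes between begin/commitConfiguration
        captureSession.beginConfiguration()
        if captureSession.canAddInput(input) {
            captureSession.addInput(input)
        }
        captureSession.commitConfiguration()
        captureSession.startRunning()
    }
}
```

Handling different screen sizes and aspect ratios then mostly comes down to how the preview layer and export composition are sized, which the rendering slides below touch on.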
6. Video capture
VideoCamera class
Just a few publicly exposed functions
● Initialize a new AVCaptureMovieFileOutput
● Rotate camera
● Start and stop capture
7. Video capture
VideoCamera class
func initializeNewMovieFileOutput() -> AVCaptureMovieFileOutput {
    resetCurrentMovieFileOutput()
    captureSession.beginConfiguration()
    let newMovieFileOutput = AVCaptureMovieFileOutput()
    if captureSession.canAddOutput(newMovieFileOutput) {
        captureSession.addOutput(newMovieFileOutput)
    }
    captureSession.commitConfiguration()
    return newMovieFileOutput
}
9. Initial video rendering
RenderEngine class
RenderEngine class with several video rendering capabilities
● Merging video files
● Adding sound to a video object
● Drawing additional layers on top of video
● Compressing video
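Of these capabilities, compression is the only one not walked through below. It can be sketched as a re-export with a lower-quality preset (Swift 2 era API; `compressVideo` and the error domain are illustrative assumptions, not the actual RenderEngine API):

```swift
import AVFoundation

// Illustrative sketch of the "compressing video" capability: re-export the
// asset with a medium-quality preset to shrink the file size.
func compressVideo(sourceURL: NSURL, outputURL: NSURL, completion: (NSError?) -> Void) {
    let asset = AVURLAsset(URL: sourceURL)
    guard let session = AVAssetExportSession(asset: asset,
        presetName: AVAssetExportPresetMediumQuality) else {
        completion(NSError(domain: "RenderEngine", code: -1, userInfo: nil))
        return
    }
    session.outputURL = outputURL
    session.outputFileType = AVFileTypeMPEG4
    session.shouldOptimizeForNetworkUse = true
    session.exportAsynchronouslyWithCompletionHandler {
        completion(session.error)
    }
}
```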
10. Initial video rendering
Merging video parts
● Length of the output video is constrained to audio asset’s duration
● Stitching the video parts together using AVMutableComposition
○ Leads to possible discrepancies between the video and audio length, so we
need to make sure to scale video parts to appropriate lengths
● Using AVMutableVideoCompositionInstruction which contains an
AVMutableVideoCompositionLayerInstruction for every video part in
its layerInstructions property
● In the end export through AVAssetExportSession
11. Initial video rendering
Merging video parts
do {
    let mutableComposition = AVMutableComposition()
    for videoAsset in videoAssets {
        let videoTrack = mutableComposition.addMutableTrackWithMediaType(
            AVMediaTypeVideo, preferredTrackID: kCMPersistentTrackID_Invalid)
        guard let videoAssetTrack = videoAsset.tracksWithMediaType(AVMediaTypeVideo).first
            else { throw NSError(...) }
        ...
1. Wrapping everything in a do-catch statement
2. Creating an AVMutableComposition
3. Extracting an AVAssetTrack for every video part
12. Initial video rendering
Merging video parts
let adjustedDuration = CMTime(
    seconds: videoAsset.duration.seconds + durationDifferencePerClip,
    preferredTimescale: videoAsset.duration.timescale)
try videoTrack.insertTimeRange(
    CMTimeRange(start: kCMTimeZero, duration: videoAsset.duration),
    ofTrack: videoAssetTrack,
    atTime: timeOffset)
videoTrack.scaleTimeRange(
    CMTimeRange(start: timeOffset, duration: videoAsset.duration),
    toDuration: adjustedDuration)
4. Scaling AVAssetTrack to appropriate length
13. Initial video rendering
Merging video parts
let layerInstruction = AVMutableVideoCompositionLayerInstruction(assetTrack: videoTrack)
mainInstruction.layerInstructions.append(layerInstruction)
layerInstruction.setOpacity(
mainInstruction.layerInstructions.count < videoAssets.count ? 0 : 1,
atTime: timeOffset + adjustedDuration)
var layerTransform = videoAssetTrack.preferredTransform
// ... stuff with transforms
layerInstruction.setTransform(layerTransform, atTime: kCMTimeZero)
5. Setting appropriate layer instruction
14. Initial video rendering
Merging video parts
exportSession = AVAssetExportSession(asset: asset, presetName: exportPreset)
exportSession?.videoComposition = videoComposition
exportSession?.outputFileType = AVFileTypeMPEG4
exportSession?.shouldOptimizeForNetworkUse = true
taskCompletion = BFTaskCompletionSource()
let appEnteredBackgroundSignal = // Signal for UIApplicationDidEnterBackgroundNotification
appEnteredBackgroundSignal.subscribeNext { _ in
cancelExport()
}
exportSession?.exportAsynchronouslyWithCompletionHandler { … }
6. Exporting through AVAssetExportSession
15. Initial video rendering
Adding sound to video
● Sound and video playing together from different sources results in apparent lack
of synchronization between the two
● To mitigate this problem we add sound to the video through creating an
AVMutableComposition object with two AVAssetTracks
let audioAssetTrack = audioAsset.tracksWithMediaType(AVMediaTypeAudio).first
let videoAssetTrack = videoAsset.tracksWithMediaType(AVMediaTypeVideo).first
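The two-track composition described above can be sketched like this (Swift 2 era API; `addSound(_:toVideo:)` and its error handling are illustrative assumptions):

```swift
import AVFoundation

// Sketch of merging sound and video into one AVMutableComposition, so both
// play back in sync from a single source.
func addSound(audioAsset: AVAsset, toVideo videoAsset: AVAsset) throws -> AVMutableComposition {
    let composition = AVMutableComposition()
    let videoTrack = composition.addMutableTrackWithMediaType(
        AVMediaTypeVideo, preferredTrackID: kCMPersistentTrackID_Invalid)
    let audioTrack = composition.addMutableTrackWithMediaType(
        AVMediaTypeAudio, preferredTrackID: kCMPersistentTrackID_Invalid)
    guard let videoAssetTrack = videoAsset.tracksWithMediaType(AVMediaTypeVideo).first,
          audioAssetTrack = audioAsset.tracksWithMediaType(AVMediaTypeAudio).first else {
        throw NSError(domain: "RenderEngine", code: -1, userInfo: nil)
    }
    // Insert both tracks over the same time range, anchored at kCMTimeZero
    let range = CMTimeRange(start: kCMTimeZero, duration: videoAsset.duration)
    try videoTrack.insertTimeRange(range, ofTrack: videoAssetTrack, atTime: kCMTimeZero)
    try audioTrack.insertTimeRange(range, ofTrack: audioAssetTrack, atTime: kCMTimeZero)
    return composition
}
```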
17. Rendering additional layers
Rendering text on top of video
● Users can enrich the video with text, filters, and stickers
● Code snippet for rendering text:
let textImage = textField.screenshot()
// Convert the text field’s frame into the video container’s coordinate space
let textFrame = videoContainer.convertRect(textField.frame, fromView: textField.superview)
let videoLayer = addLayerOverlayToVideoComposition(videoComposition)
let textLayer = CALayer()
textLayer.bounds = CGRect(x: 0, y: 0,
    width: textImage.size.width, height: textImage.size.height)
textLayer.contents = textImage.CGImage
videoLayer.addSublayer(textLayer)
18. Rendering additional layers
Rendering text on top of video
let parentLayer = CALayer()
parentLayer.bounds = CGRect(origin: CGPointZero, size: exportSize)
parentLayer.anchorPoint = CGPointZero
parentLayer.position = CGPointZero
let videoLayer = CALayer()
videoLayer.bounds = parentLayer.bounds
parentLayer.addSublayer(videoLayer)
videoLayer.position = CGPoint(x: parentLayer.bounds.width / 2,
y: parentLayer.bounds.height / 2)
let layer = CALayer()
layer.frame = parentLayer.bounds
parentLayer.addSublayer(layer)
videoComposition.animationTool = AVVideoCompositionCoreAnimationTool(
postProcessingAsVideoLayer: videoLayer, inLayer: parentLayer)
return layer
20. Common pitfalls
Love / hate relationship with AVFoundation
● Doing practically anything AV related while the app is in the background → Crash
○ Your best friend is now:
UIApplication.sharedApplication().applicationState == .Active
● Setting a max recorded duration on an AVCaptureMovieFileOutput? Who
cares! AVFoundation certainly doesn’t…
● Something is bound to go wrong at some point; thankfully, we have these
descriptive errors popping up:
Error Domain=NSOSStatusErrorDomain Code=-12780
"The operation couldn’t be completed. (OSStatus error -12780.)"
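The applicationState check mentioned above can be sketched as a guard around any AV work (Swift 2 era API; `startExportIfPossible` is an illustrative assumption):

```swift
import UIKit
import AVFoundation

// Only kick off AV work while the app is active; otherwise bail out and
// retry later, instead of crashing or getting an opaque OSStatus error.
func startExportIfPossible(session: AVAssetExportSession) {
    guard UIApplication.sharedApplication().applicationState == .Active else {
        // App is backgrounded or inactive - starting the export now would
        // likely fail with an unhelpful NSOSStatusErrorDomain error
        return
    }
    session.exportAsynchronouslyWithCompletionHandler {
        // Inspect session.status / session.error here
    }
}
```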