AV Foundation Programming Guide
Contents
About AV Foundation 5
At a Glance 6
Representing and Using Media with AV Foundation 6
Concurrent Programming with AV Foundation 8
Prerequisites 9
Using Assets 10
Creating an Asset Object 10
Options for Initializing an Asset 10
Accessing the User's Assets 11
Preparing an Asset for Use 12
Getting Still Images From a Video 13
Generating a Single Image 14
Generating a Sequence of Images 15
Trimming and Transcoding a Movie 16
Playback 19
Playing Assets 19
Handling Different Types of Asset 21
Playing an Item 23
Changing the Playback Rate 23
Seeking: Repositioning the Playhead 23
Playing Multiple Items 24
Monitoring Playback 25
Responding to a Change in Status 26
Tracking Readiness for Visual Display 27
Tracking Time 27
Reaching the End of an Item 28
Putting it all Together: Playing a Video File Using AVPlayerLayer 28
The Player View 29
A Simple View Controller 29
Creating the Asset 30
Responding to the Player Item's Status Change 32
Playing the Item 33
Editing 34
Creating a Composition 37
Options for Initializing a Composition Track 38
Adding Audiovisual Data to a Composition 38
Retrieving Compatible Composition Tracks 39
Generating a Volume Ramp 39
Performing Custom Video Processing 40
Changing the Composition's Background Color 40
Applying Opacity Ramps 40
Incorporating Core Animation Effects 41
Putting it all Together: Combining Multiple Assets and Saving the Result to the Camera Roll 42
Creating the Composition 43
Adding the Assets 43
Checking the Video Orientations 44
Applying the Video Composition Layer Instructions 45
Setting the Render Size and Frame Duration 46
Exporting the Composition and Saving it to the Camera Roll 47
Media Capture 49
Use a Capture Session to Coordinate Data Flow 50
Configuring a Session 51
Monitoring Capture Session State 52
An AVCaptureDevice Object Represents an Input Device 52
Device Characteristics 53
Device Capture Settings 54
Configuring a Device 57
Switching Between Devices 58
Use Capture Inputs to Add a Capture Device to a Session 58
Use Capture Outputs to Get Output from a Session 59
Saving to a Movie File 60
Processing Frames of Video 63
Capturing Still Images 64
Showing the User What's Being Recorded 66
Video Preview 66
Showing Audio Levels 67
Putting it all Together: Capturing Video Frames as UIImage Objects 68
Create and Configure a Capture Session 68
Create and Configure the Device and Device Input 69
Create and Configure the Data Output 69
Export 71
Reading an Asset 71
Creating the Asset Reader 71
Setting Up the Asset Reader Outputs 72
Reading the Asset's Media Data 74
Writing an Asset 75
Creating the Asset Writer 76
Setting Up the Asset Writer Inputs 76
Writing Media Data 78
Reencoding Assets 79
Putting It All Together: Using an Asset Reader and Writer in Tandem to Reencode an Asset 81
Handling the Initial Setup 81
Initializing the Asset Reader and Writer 83
Reencoding the Asset 88
Handling Completion 93
Handling Cancellation 94
About AV Foundation
AV Foundation is one of several frameworks that you can use to play and create time-based audiovisual media.
It provides an Objective-C interface you use to work on a detailed level with time-based audiovisual data. For
example, you can use it to examine, create, edit, or reencode media files. You can also get input streams from
devices and manipulate video during realtime capture and playback.
You should typically use the highest-level abstraction available that allows you to perform the tasks you want.
For example, in iOS:
If you simply want to play movies, you can use the Media Player Framework (MPMoviePlayerController
or MPMoviePlayerViewController), or for web-based media you could use a UIWebView object.
To record video when you need only minimal control over format, use the UIKit framework
(UIImagePickerController).
Note, however, that some of the primitive data structures that you use in AV Foundation (including time-related
data structures and opaque objects to carry and describe media data) are declared in the Core Media framework.
AV Foundation is available in iOS 4 and later, and OS X 10.7 and later. This document describes AV Foundation
as introduced in iOS 4.0. To learn about changes and additions to the framework in subsequent versions, you
should also read the appropriate release notes:
AV Foundation Release Notes (iOS 4.3) describe changes made for iOS 4.3 and included in OS X 10.7.
At a Glance
There are two facets to the AV Foundation framework: APIs related just to audio, which were available prior
to iOS 4, and APIs introduced in iOS 4 and later. The older audio-related classes provide easy ways to deal with
audio. They are described in the Multimedia Programming Guide, not in this document.
You can also configure the audio behavior of your application using AVAudioSession; this is described in
Audio Session Programming Guide .
Playback
AV Foundation allows you to manage the playback of assets in sophisticated ways. To support this, it separates
the presentation state of an asset from the asset itself. This allows you to, for example, play two different
segments of the same asset at the same time rendered at different resolutions. The presentation state for an
asset is managed by a player item object; the presentation state for each track within an asset is managed by
a player item track object. Using the player item and player item tracks you can, for example, set the size at
which the visual portion of the item is presented by the player, set the audio mix parameters and video
composition settings to be applied during playback, or disable components of the asset during playback.
You play player items using a player object, and direct the output of a player to a Core Animation layer. In iOS 4.1
and later, you can use a player queue to schedule playback of a collection of player items in sequence.
Relevant Chapters: Playback (page 19)
Thumbnails
To create thumbnail images of video presentations, you initialize an instance of AVAssetImageGenerator
using the asset from which you want to generate thumbnails. AVAssetImageGenerator uses the default
enabled video tracks to generate images.
Editing
AV Foundation uses compositions to create new assets from existing pieces of media (typically, one or more
video and audio tracks). You use a mutable composition to add and remove tracks, and adjust their temporal
orderings. You can also set the relative volumes and ramping of audio tracks, and set the opacity and opacity
ramps of video tracks. A composition is an assemblage of pieces of media held in memory. When you export
a composition using an export session, it's collapsed to a file.
In iOS 4.1 and later, you can also create an asset from media such as sample buffers or still images using an
asset writer.
Relevant Chapters: Editing (page 34)
Concurrent Programming with AV Foundation
You should dispatch any significant work that AV Foundation callbacks trigger to an appropriate queue, either the main queue for UI tasks or a queue you have set up for concurrent operations. For
more about concurrent operations, see Concurrency Programming Guide; for more about blocks, see Blocks
Programming Topics.
Prerequisites
AV Foundation is an advanced Cocoa framework. To use it effectively, you must have:
A solid understanding of fundamental Cocoa development tools and techniques
A basic grasp of blocks
A basic understanding of key-value coding and key-value observing
For playback, a basic understanding of Core Animation (see Core Animation Programming Guide)
Using Assets
Assets can come from a file or from media in the user's iPod library or Photo library. Simply creating an asset
object, though, does not necessarily mean that all the information that you might want to retrieve for that
item is immediately available. Once you have a movie asset, you can extract still images from it, transcode it
to another format, or trim the contents.
If you only intend to play the asset, either pass nil instead of a dictionary, or pass a dictionary that contains
the AVURLAssetPreferPreciseDurationAndTimingKey key and a corresponding value of NO
(contained in an NSValue object).
If you want to add the asset to a composition (AVMutableComposition), you typically need precise
random access. Pass a dictionary that contains the AVURLAssetPreferPreciseDurationAndTimingKey
key and a corresponding value of YES (contained in an NSValue object; recall that NSNumber inherits
from NSValue):
NSURL *url = <#A URL that identifies an audiovisual asset such as a movie file#>;
NSDictionary *options = @{ AVURLAssetPreferPreciseDurationAndTimingKey : @YES };
AVURLAsset *anAssetToUseInAComposition = [[AVURLAsset alloc] initWithURL:url options:options];
To access the iPod Library, you create an MPMediaQuery instance to find the item you want, then get its
URL using MPMediaItemPropertyAssetURL.
For more about the Media Library, see Multimedia Programming Guide .
To access the assets managed by the Photos application, you use ALAssetsLibrary.
The following example shows how you can get an asset to represent the first video in the Saved Photos Album.
ALAssetsLibrary *library = [[ALAssetsLibrary alloc] init];
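// A hedged sketch of the enumeration this example describes: find the first video
// in the Saved Photos group and create an AVURLAsset from its URL.
[library enumerateGroupsWithTypes:ALAssetsGroupSavedPhotos usingBlock:^(ALAssetsGroup *group, BOOL *stop) {
    // Within the group enumeration block, filter to enumerate just videos.
    [group setAssetsFilter:[ALAssetsFilter allVideos]];
    // For this example, we're only interested in the first item.
    [group enumerateAssetsAtIndexes:[NSIndexSet indexSetWithIndex:0] options:0
                         usingBlock:^(ALAsset *alAsset, NSUInteger index, BOOL *innerStop) {
        // The end of the enumeration is signaled by a nil asset.
        if (alAsset) {
            ALAssetRepresentation *representation = [alAsset defaultRepresentation];
            NSURL *url = [representation url];
            AVURLAsset *avAsset = [AVURLAsset URLAssetWithURL:url options:nil];
            // Do something interesting with the AV asset.
        }
    }];
} failureBlock:^(NSError *error) {
    // Handle the failure, for example by informing the user.
    NSLog(@"Could not access the assets library: %@", [error localizedDescription]);
}];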
Preparing an Asset for Use
If you want to prepare an asset for playback, you should load its tracks property. For more about playing
assets, see Playback (page 19).
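A minimal sketch of loading the tracks key asynchronously and checking its status; the asset variable is assumed to be an asset you created earlier:
NSString *tracksKey = @"tracks";
AVAsset *asset = <#An asset#>;
[asset loadValuesAsynchronouslyForKeys:@[tracksKey] completionHandler:^{
    NSError *error = nil;
    AVKeyValueStatus tracksStatus = [asset statusOfValueForKey:tracksKey error:&error];
    switch (tracksStatus) {
        case AVKeyValueStatusLoaded:
            // The tracks property is now safe to use.
            break;
        case AVKeyValueStatusFailed:
            // Respond to the error appropriately.
            break;
        case AVKeyValueStatusCancelled:
            // Do whatever is appropriate for cancelation.
            break;
        default:
            break;
    }
}];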
Getting Still Images From a Video
You can configure several aspects of the image generator; for example, you can specify the maximum dimensions
for the images it generates and the aperture mode using maximumSize and apertureMode respectively. You
can then generate a single image at a given time, or a series of images. You must ensure that you keep a strong
reference to the image generator until it has generated all the images.
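Generating a Single Image
A minimal sketch of generating a single image at the asset's midpoint; the 600-unit timescale and the myAsset placeholder are assumptions, and the if statement that follows checks the result:
AVAsset *myAsset = <#An asset#>;
AVAssetImageGenerator *imageGenerator = [[AVAssetImageGenerator alloc] initWithAsset:myAsset];
Float64 durationSeconds = CMTimeGetSeconds([myAsset duration]);
CMTime midpoint = CMTimeMakeWithSeconds(durationSeconds/2.0, 600);
NSError *error;
CMTime actualTime;
CGImageRef halfWayImage = [imageGenerator copyCGImageAtTime:midpoint actualTime:&actualTime error:&error];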
if (halfWayImage != NULL) {
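    // A hedged completion of this block: log the time the image was actually
    // generated for, then release the image when you are done with it.
    NSString *actualTimeString = (NSString *)CFBridgingRelease(CMTimeCopyDescription(NULL, actualTime));
    NSLog(@"Got half-way image at %@", actualTimeString);
    // Do something interesting with the image.
    CGImageRelease(halfWayImage);
}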
Generating a Sequence of Images
To generate a series of images, you send the image generator a generateCGImagesAsynchronouslyForTimes:completionHandler: message. The first argument is an array of NSValue objects, each containing a CMTime, specifying the asset times for which you want images
to be generated. The second argument is a block that serves as a callback invoked for each image that is
generated. The block arguments provide a result constant that tells you whether the image was created
successfully or if the operation was canceled, and, as appropriate:
The image.
The time for which you requested the image and the actual time for which the image was generated.
In your implementation of the block, you should check the result constant to determine whether the image
was created. In addition, you must ensure that you keep a strong reference to the image generator until it has
finished creating the images.
AVAsset *myAsset = <#An asset#>;
// Assume: @property (strong) AVAssetImageGenerator *imageGenerator;
self.imageGenerator = [AVAssetImageGenerator assetImageGeneratorWithAsset:myAsset];
NSArray *times = <#An array of NSValue objects, each wrapping a CMTime#>;
[self.imageGenerator generateCGImagesAsynchronouslyForTimes:times
    completionHandler:^(CMTime requestedTime, CGImageRef image, CMTime actualTime,
                        AVAssetImageGeneratorResult result, NSError *error) {
        NSString *requestedTimeString = (NSString *)
            CFBridgingRelease(CMTimeCopyDescription(NULL, requestedTime));
        NSString *actualTimeString = (NSString *)
            CFBridgingRelease(CMTimeCopyDescription(NULL, actualTime));
        NSLog(@"Requested: %@; actual %@", requestedTimeString, actualTimeString);
if (result == AVAssetImageGeneratorSucceeded) {
// Do something interesting with the image.
}
if (result == AVAssetImageGeneratorFailed) {
NSLog(@"Failed with error: %@", [error localizedDescription]);
}
if (result == AVAssetImageGeneratorCancelled) {
NSLog(@"Canceled");
}
}];
You can cancel the generation of the image sequence by sending the image generator a
cancelAllCGImageGeneration message.
Trimming and Transcoding a Movie
You can transcode a movie from one format to another, and trim a movie, using an AVAssetExportSession object. You can check whether an asset can be exported with a given preset using exportPresetsCompatibleWithAsset:, as illustrated in this example:
AVAsset *anAsset = <#Get an asset#>;
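// A hedged continuation of this example: the preset choice is an assumption, and
// the exportSession variable matches the configuration code that follows.
NSArray *compatiblePresets = [AVAssetExportSession exportPresetsCompatibleWithAsset:anAsset];
AVAssetExportSession *exportSession = nil;
if ([compatiblePresets containsObject:AVAssetExportPresetLowQuality]) {
    exportSession = [[AVAssetExportSession alloc] initWithAsset:anAsset
                                                     presetName:AVAssetExportPresetLowQuality];
}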
You complete configuration of the session by providing the output URL (the URL must be a file URL).
AVAssetExportSession can infer the output file type from the URL's path extension; typically, however,
you set it directly using outputFileType. You can also specify additional properties such as the time range,
a limit for the output file length, whether the exported file should be optimized for network use, and a video
composition. The following example illustrates how to use the timeRange property to trim the movie:
exportSession.outputURL = <#A file URL#>;
exportSession.outputFileType = AVFileTypeQuickTimeMovie;
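// A hedged reconstruction of the trimming and export steps this section describes;
// the specific start time and duration below are assumptions for illustration.
CMTime start = CMTimeMakeWithSeconds(1.0, 600);
CMTime duration = CMTimeMakeWithSeconds(3.0, 600);
exportSession.timeRange = CMTimeRangeMake(start, duration);
[exportSession exportAsynchronouslyWithCompletionHandler:^{
    switch ([exportSession status]) {
        case AVAssetExportSessionStatusFailed:
            NSLog(@"Export failed: %@", [[exportSession error] localizedDescription]);
            break;
        case AVAssetExportSessionStatusCancelled:
            NSLog(@"Export canceled");
            break;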
default:
break;
}
}];
You can cancel the export by sending the session a cancelExport message.
The export will fail if you try to overwrite an existing file, or write a file outside of the application's sandbox. It
may also fail if:
There is an incoming phone call
Your application is in the background and another application starts playback
In these situations, you should typically inform the user that the export failed, then allow the user to restart
the export.
Playback
To control the playback of assets, you use an AVPlayer object. During playback, you can use an AVPlayerItem
object to manage the presentation state of an asset as a whole, and an AVPlayerItemTrack to manage the
presentation state of an individual track. To display video, you use an AVPlayerLayer object.
Playing Assets
A player is a controller object that you use to manage playback of an asset, for example starting and stopping
playback, and seeking to a particular time. You use an instance of AVPlayer to play a single asset. In iOS 4.1
and later, you can use an AVQueuePlayer object to play a number of items in sequence (AVQueuePlayer is
a subclass of AVPlayer).
A player provides you with information about the state of the playback so, if you need to, you can synchronize
your user interface with the player's state. You typically direct the output of a player to a specialized Core
Animation layer (an instance of AVPlayerLayer or AVSynchronizedLayer). To learn more about layers,
see Core Animation Programming Guide.
Multiple player layers: You can create arbitrarily many AVPlayerLayer objects from a single
AVPlayer instance, but only the most-recently-created such layer will display any video content
on-screen.
Although ultimately you want to play an asset, you don't provide assets directly to an AVPlayer object. Instead,
you provide an instance of AVPlayerItem. A player item manages the presentation state of an asset with
which it is associated. A player item contains player item tracks (instances of AVPlayerItemTrack) that
correspond to the tracks in the asset.
This abstraction means that you can play a given asset using different players simultaneously, but rendered
in different ways by each player. Using the item tracks, you can, for example, disable a particular track during
playback (you might not want to play the sound component).
You can initialize a player item with an existing asset, or you can initialize a player item directly from a URL so
that you can play a resource at a particular location (AVPlayerItem will then create and configure an asset
for the resource). As with AVAsset, though, simply initializing a player item doesn't necessarily mean it's ready
for immediate playback. You can observe (using key-value observing) an item's status property to determine
if and when it's ready to play.
When the asset has loaded its tracks, create an instance of AVPlayerItem using the asset.
Wait until the item's status indicates that it's ready to play (typically you use key-value observing to
receive a notification when the status changes).
This approach is illustrated in Putting it all Together: Playing a Video File Using AVPlayerLayer (page 28).
To create and prepare an HTTP live stream for playback, initialize an instance of AVPlayerItem using the
URL. (You cannot directly create an AVAsset instance to represent the media in an HTTP Live Stream.)
NSURL *url = [NSURL URLWithString:@"<#Live stream URL#>"];
// You may find a test stream at <http://devimages.apple.com/iphone/samples/bipbop/bipbopall.m3u8>.
self.playerItem = [AVPlayerItem playerItemWithURL:url];
[self.playerItem addObserver:self forKeyPath:@"status" options:0 context:&ItemStatusContext];
self.player = [AVPlayer playerWithPlayerItem:playerItem];
When you associate the player item with a player, it starts to become ready to play. When it is ready to play,
the player item creates the AVAsset and AVAssetTrack instances, which you can use to inspect the contents
of the live stream.
To get the duration of a streaming item, you can observe the duration property on the player item. When
the item becomes ready to play, this property will change to the correct value for the stream.
Note: Using the duration property on the player item requires iOS 4.3 or later. An approach that
is compatible with all versions of iOS involves observing the status property of the player item.
When the status becomes AVPlayerItemStatusReadyToPlay, the duration can be fetched with
the following line of code:
[[[[[playerItem tracks] objectAtIndex:0] assetTrack] asset] duration];
If you simply want to play a live stream, you can take a shortcut and create a player directly using the URL:
self.player = [AVPlayer playerWithURL:<#Live stream URL#>];
[player addObserver:self forKeyPath:@"status" options:0
context:&PlayerStatusContext];
As with assets and items, initializing the player does not mean it's ready for playback. You should observe the
player's status property, which changes to AVPlayerStatusReadyToPlay when it is ready to play. You
can also observe the currentItem property to access the player item created for the stream.
If you don't know what kind of URL you have, follow these steps:
1. Try to initialize an AVURLAsset using the URL, then load its tracks key. If the tracks load successfully, you create a player item for the asset.
2. If the initialization of the AVURLAsset fails, try instead to initialize an AVPlayerItem directly from the URL and observe its status property to determine whether it becomes playable.
If either route succeeds, you end up with a player item that you can then associate with a player.
Playing an Item
To start playback, you send a play message to the player.
- (IBAction)play:sender {
[player play];
}
In addition to simply playing, you can manage various aspects of the playback, such as the rate and the location
of the playhead. You can also monitor the play state of the player; this is useful if you want to, for example,
synchronize the user interface to the presentation state of the asset (see Monitoring Playback (page 25)).
Changing the Playback Rate
You change the rate of playback by setting the player's rate property. A value of 1.0 means play at the natural rate of the current item. Setting the rate to 0.0 is the same as pausing
playback; you can also use pause.
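For example, a hedged one-liner, assuming a player variable as in the surrounding examples:
player.rate = 0.5; // Play at half the natural rate; a rate of 0.0 pauses playback.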
Seeking: Repositioning the Playhead
To move the playhead to a particular time, you use seekToTime:. The seekToTime: method, however, is tuned for performance rather than precision. If you need to move the
playhead precisely, you instead use seekToTime:toleranceBefore:toleranceAfter:.
CMTime fiveSecondsIn = CMTimeMake(5, 1);
[player seekToTime:fiveSecondsIn toleranceBefore:kCMTimeZero
toleranceAfter:kCMTimeZero];
Using a tolerance of zero may require the framework to decode a large amount of data. You should only use
zero if you are, for example, writing a sophisticated media editing application that requires precise control.
After playback, the player's head is set to the end of the item, and further invocations of play have no effect.
To position the playhead back at the beginning of the item, you can register to receive an
AVPlayerItemDidPlayToEndTimeNotification from the item. In the notification's callback method, you
invoke seekToTime: with the argument kCMTimeZero.
// Register with the notification center after creating the player item.
[[NSNotificationCenter defaultCenter]
addObserver:self
selector:@selector(playerItemDidReachEnd:)
name:AVPlayerItemDidPlayToEndTimeNotification
object:<#The player item#>];
- (void)playerItemDidReachEnd:(NSNotification *)notification {
[player seekToTime:kCMTimeZero];
}
Playing Multiple Items
In iOS 4.1 and later, you can use an AVQueuePlayer object to create a queue of player items to play in sequence (see the sketch below). You can then play the queue using play, just as you would an AVPlayer object. The queue player plays each
item in turn. If you want to skip to the next item, you send the queue player an advanceToNextItem message.
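A minimal sketch of creating a queue player, assuming you already have an NSArray of AVPlayerItem objects named items:
NSArray *items = <#An array of player items#>;
AVQueuePlayer *queuePlayer = [[AVQueuePlayer alloc] initWithItems:items];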
You can modify the queue using insertItem:afterItem:, removeItem:, and removeAllItems. When
adding a new item, you should typically check whether it can be inserted into the queue, using
canInsertItem:afterItem:. You pass nil as the second argument to test whether the new item can be
appended to the queue:
AVPlayerItem *anItem = <#Get a player item#>;
if ([queuePlayer canInsertItem:anItem afterItem:nil]) {
[queuePlayer insertItem:anItem afterItem:nil];
}
Monitoring Playback
You can monitor a number of aspects of the presentation state of a player and the player item being played.
This is particularly useful for state changes that are not under your direct control, for example:
If the user uses multitasking to switch to a different application, a player's rate property will drop to 0.0.
If you are playing remote media, a player item's loadedTimeRanges and seekableTimeRanges properties
will change as more data becomes available.
These properties tell you what portions of the player items timeline are available.
A player's currentItem property changes as a player item is created for an HTTP live stream.
A player item's tracks property may change while playing an HTTP live stream.
This may happen if the stream offers different encodings for the content; the tracks change if the player
switches to a different encoding.
A player or player item's status may change if playback fails for some reason.
You can use key-value observing to monitor changes to values of these properties.
Important: You should register for KVO change notifications and unregister from KVO change notifications
on the main thread. This avoids the possibility of receiving a partial notification if a change is being made
on another thread. AV Foundation invokes observeValueForKeyPath:ofObject:change:context:
on the main thread, even if the change operation is made on another thread.
Tracking Time
To track changes in the position of the playhead in an AVPlayer object, you can use
addPeriodicTimeObserverForInterval:queue:usingBlock: or
addBoundaryTimeObserverForTimes:queue:usingBlock:. You might do this to, for example, update
your user interface with information about time elapsed or time remaining, or perform some other user interface
synchronization.
Both of the methods return an opaque object that serves as an observer. You must keep a strong reference to
the returned object as long as you want the time observation block to be invoked by the player. You must also
balance each invocation of these methods with a corresponding call to removeTimeObserver:.
With both of these methods, AV Foundation does not guarantee to invoke your block for every interval or
boundary passed. AV Foundation does not invoke a block if execution of a previously-invoked block has not
completed. You must make sure, therefore, that the work you perform in the block does not overly tax the
system.
// Assume a property: @property (strong) id playerObserver;
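// A hedged sketch of installing a periodic observer; the one-second interval,
// the player variable, and the log statement are assumptions for illustration.
self.playerObserver = [player addPeriodicTimeObserverForInterval:CMTimeMake(1, 1)
                                                            queue:dispatch_get_main_queue()
                                                       usingBlock:^(CMTime time) {
    NSString *timeDescription = (NSString *)
        CFBridgingRelease(CMTimeCopyDescription(NULL, time));
    NSLog(@"Current time: %@", timeDescription);
}];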
Putting it all Together: Playing a Video File Using AVPlayerLayer
This brief code example illustrates how you can use an AVPlayer object to play a video file. It shows you how to:
Configure a view to use an AVPlayerLayer
Create an AVPlayerItem object for a file-based asset, and use key-value observing to observe its status
Play the item, then restore the player's head to the beginning.
Note: To focus on the most relevant code, this example omits several aspects of a complete
application, such as memory management, and unregistering as an observer (for key-value observing
or for the notification center). To use AV Foundation, you are expected to have enough experience
with Cocoa to be able to infer the missing pieces.
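The Player View
To play the visual component of an asset, you need a view containing an AVPlayerLayer layer to which the output of an AVPlayer object can be directed. A minimal sketch of the view's interface, consistent with the implementation below:
#import <UIKit/UIKit.h>
#import <AVFoundation/AVFoundation.h>

@interface PlayerView : UIView
@property (nonatomic) AVPlayer *player;
@end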
@implementation PlayerView
+ (Class)layerClass {
return [AVPlayerLayer class];
}
- (AVPlayer*)player {
return [(AVPlayerLayer *)[self layer] player];
}
- (void)setPlayer:(AVPlayer *)player {
[(AVPlayerLayer *)[self layer] setPlayer:player];
}
@end
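A Simple View Controller
Assume you have a simple view controller declared along the following lines; this is a sketch of its interface, with property and method names chosen as assumptions that match the code used in the rest of the example (playerView, playButton, playerItem, player, and loadAssetFromFile:):
@class PlayerView;

@interface PlayerViewController : UIViewController

@property (nonatomic) AVPlayer *player;
@property (nonatomic) AVPlayerItem *playerItem;
@property (nonatomic, weak) IBOutlet PlayerView *playerView;
@property (nonatomic, weak) IBOutlet UIButton *playButton;

- (IBAction)loadAssetFromFile:sender;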
- (IBAction)play:sender;
- (void)syncUI;
@end
The syncUI method synchronizes the button's state with the player's state:
- (void)syncUI {
if ((self.player.currentItem != nil) &&
([self.player.currentItem status] == AVPlayerItemStatusReadyToPlay)) {
self.playButton.enabled = YES;
}
else {
self.playButton.enabled = NO;
}
}
You can invoke syncUI in the view controller's viewDidLoad method to ensure a consistent user interface
when the view is first displayed.
- (void)viewDidLoad {
[super viewDidLoad];
[self syncUI];
}
The other properties and methods are described in the remaining sections.
Creating the Asset
In the completion block, you create an instance of AVPlayerItem for the asset, and set it as the player for
the player view. As with creating the asset, simply creating the player item does not mean it's ready to use. To
determine when it's ready to play, you can observe the item's status. You trigger its preparation to play when
you associate it with the player.
// Define this constant for the key-value observation context.
static const NSString *ItemStatusContext;
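// A hedged reconstruction of the surrounding method, which loads the asset's
// tracks key asynchronously; the file-name placeholders are assumptions.
- (IBAction)loadAssetFromFile:sender {

    NSURL *fileURL = [[NSBundle mainBundle]
        URLForResource:<#@"VideoFileName"#> withExtension:<#@"extension"#>];
    AVURLAsset *asset = [AVURLAsset URLAssetWithURL:fileURL options:nil];
    NSString *tracksKey = @"tracks";

    [asset loadValuesAsynchronouslyForKeys:@[tracksKey] completionHandler:^{
        // The completion block may run on an arbitrary queue, so hop to the main
        // queue before touching the UI; the code below checks the load status.
        dispatch_async(dispatch_get_main_queue(), ^{
            NSError *error;
            AVKeyValueStatus status = [asset statusOfValueForKey:tracksKey error:&error];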
if (status == AVKeyValueStatusLoaded) {
self.playerItem = [AVPlayerItem playerItemWithAsset:asset];
[self.playerItem addObserver:self forKeyPath:@"status"
options:0 context:&ItemStatusContext];
[[NSNotificationCenter defaultCenter] addObserver:self
selector:@selector(playerItemDidReachEnd:)
name:AVPlayerItemDidPlayToEndTimeNotification
object:self.playerItem];
self.player = [AVPlayer playerWithPlayerItem:self.playerItem];
[self.playerView setPlayer:self.player];
}
else {
// You should deal with the error appropriately.
NSLog(@"The asset's tracks were not loaded:\n%@", [error
localizedDescription]);
}
        });
    }];
}

Responding to the Player Item's Status Change
When the player item's status changes, the view controller receives a key-value observing change notification. The following implementation responds by dispatching to the main queue and synchronizing the user interface:

- (void)observeValueForKeyPath:(NSString *)keyPath ofObject:(id)object
                        change:(NSDictionary *)change context:(void *)context {

    if (context == &ItemStatusContext) {
dispatch_async(dispatch_get_main_queue(),
^{
[self syncUI];
});
return;
}
[super observeValueForKeyPath:keyPath ofObject:object
change:change context:context];
return;
}
Playing the Item
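Playing the item involves sending the player a play message, as in the earlier example:
- (IBAction)play:sender {
    [self.player play];
}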
This only plays the item once, though. After playback, the player's head is set to the end of the item, and further
invocations of play will have no effect. To position the playhead back at the beginning of the item, you can
register to receive an AVPlayerItemDidPlayToEndTimeNotification from the item. In the notification's
callback method, invoke seekToTime: with the argument kCMTimeZero.
// Register with the notification center after creating the player item.
[[NSNotificationCenter defaultCenter]
addObserver:self
selector:@selector(playerItemDidReachEnd:)
name:AVPlayerItemDidPlayToEndTimeNotification
object:[self.player currentItem]];
- (void)playerItemDidReachEnd:(NSNotification *)notification {
[self.player seekToTime:kCMTimeZero];
}
Editing
The AV Foundation framework provides a feature-rich set of classes to facilitate the editing of audiovisual
assets. At the heart of AV Foundation's editing API are compositions. A composition is simply a collection of
tracks from one or more different media assets. The AVMutableComposition class provides an interface for
inserting and removing tracks, as well as managing their temporal orderings. You use a mutable composition
to piece together a new asset from a combination of existing assets. If all you want to do is merge multiple
assets together sequentially into a single file then that is as much detail as you need. If you want to perform
any custom audio or video processing on the tracks in your composition, you need to incorporate an audio
mix or a video composition, respectively.
Using the AVMutableAudioMix class, you can perform custom audio processing on the audio tracks in your
composition. Currently, you can specify a maximum volume or set a volume ramp for an audio track.
You can use the AVMutableVideoComposition class to work directly with the video tracks in your composition
for the purposes of editing. With a single video composition, you can specify the desired render size and scale,
as well as the frame duration, for the output video. Through a video composition's instructions (represented
by the AVMutableVideoCompositionInstruction class), you can modify the background color of your
video and apply layer instructions. These layer instructions (represented by the
AVMutableVideoCompositionLayerInstruction class) can be used to apply transforms, transform ramps,
opacity and opacity ramps to the video tracks within your composition. The video composition class also gives
you the ability to introduce effects from the Core Animation framework into your video using the
animationTool property.
To combine your composition with an audio mix and a video composition, you use an AVAssetExportSession
object. You initialize the export session with your composition and then simply assign your audio mix and
video composition to the audioMix and videoComposition properties respectively.
Creating a Composition
To create your own composition, you use the AVMutableComposition class. To add media data to your
composition, you must add one or more composition tracks; represented by the AVMutableCompositionTrack
class. The simplest case is creating a mutable composition with one video track and one audio track:
AVMutableComposition *mutableComposition = [AVMutableComposition composition];
// Create the video composition track.
AVMutableCompositionTrack *mutableCompositionVideoTrack = [mutableComposition
addMutableTrackWithMediaType:AVMediaTypeVideo
preferredTrackID:kCMPersistentTrackID_Invalid];
// Create the audio composition track.
AVMutableCompositionTrack *mutableCompositionAudioTrack = [mutableComposition
addMutableTrackWithMediaType:AVMediaTypeAudio
preferredTrackID:kCMPersistentTrackID_Invalid];
Adding Audiovisual Data to a Composition
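Once your composition has one or more tracks, you can add media data from source assets to them. A minimal sketch using insertTimeRange:ofTrack:atTime:error:, assuming firstVideoAssetTrack is a video AVAssetTrack obtained from a source asset:
AVAssetTrack *firstVideoAssetTrack = <#A video AVAssetTrack from a source asset#>;
NSError *insertError = nil;
// Insert the whole source track at the beginning of the composition's video track.
[mutableCompositionVideoTrack insertTimeRange:CMTimeRangeMake(kCMTimeZero, firstVideoAssetTrack.timeRange.duration)
                                      ofTrack:firstVideoAssetTrack
                                       atTime:kCMTimeZero
                                        error:&insertError];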
Note: Placing multiple video segments on the same composition track can potentially lead to
dropping frames at the transitions between video segments, especially on embedded devices.
Choosing the number of composition tracks for your video segments depends entirely on the design
of your app and its intended platform.
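Generating a Volume Ramp
Using an AVMutableAudioMix object, you can apply a volume ramp to an audio track in your composition. A minimal sketch that fades the audio out over the composition's duration, assuming the mutableCompositionAudioTrack created earlier:
AVMutableAudioMix *mutableAudioMix = [AVMutableAudioMix audioMix];
// Create the audio mix parameters for the composition's audio track.
AVMutableAudioMixInputParameters *mixParameters =
    [AVMutableAudioMixInputParameters audioMixInputParametersWithTrack:mutableCompositionAudioTrack];
// Fade the volume from full to silent across the whole composition.
[mixParameters setVolumeRampFromStartVolume:1.f toEndVolume:0.f
                                  timeRange:CMTimeRangeMake(kCMTimeZero, mutableComposition.duration)];
mutableAudioMix.inputParameters = @[mixParameters];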
Performing Custom Video Processing
For example, the following code applies an opacity ramp that fades out the first video track over its duration, and sets up a second instruction so that the second video track is shown afterward:
// Create the video composition and the first video composition instruction.
AVMutableVideoComposition *mutableVideoComposition = [AVMutableVideoComposition videoComposition];
AVMutableVideoCompositionInstruction *firstVideoCompositionInstruction = [AVMutableVideoCompositionInstruction videoCompositionInstruction];
// Set its time range to span the duration of the first video track.
firstVideoCompositionInstruction.timeRange = CMTimeRangeMake(kCMTimeZero, firstVideoAssetTrack.timeRange.duration);
// Create the layer instruction and associate it with the composition video track.
AVMutableVideoCompositionLayerInstruction *firstVideoLayerInstruction = [AVMutableVideoCompositionLayerInstruction videoCompositionLayerInstructionWithAssetTrack:mutableCompositionVideoTrack];
// Create the opacity ramp to fade out the first video track over its entire duration.
[firstVideoLayerInstruction setOpacityRampFromStartOpacity:1.f toEndOpacity:0.f timeRange:CMTimeRangeMake(kCMTimeZero, firstVideoAssetTrack.timeRange.duration)];
// Create the second video composition instruction so that the second video track isn't transparent.
AVMutableVideoCompositionInstruction *secondVideoCompositionInstruction = [AVMutableVideoCompositionInstruction videoCompositionInstruction];
// Set its time range to span the duration of the second video track.
secondVideoCompositionInstruction.timeRange = CMTimeRangeMake(firstVideoAssetTrack.timeRange.duration, CMTimeAdd(firstVideoAssetTrack.timeRange.duration, secondVideoAssetTrack.timeRange.duration));
// Create the second layer instruction and associate it with the composition video track.
AVMutableVideoCompositionLayerInstruction *secondVideoLayerInstruction = [AVMutableVideoCompositionLayerInstruction videoCompositionLayerInstructionWithAssetTrack:mutableCompositionVideoTrack];
// Attach the first layer instruction to the first video composition instruction.
firstVideoCompositionInstruction.layerInstructions = @[firstVideoLayerInstruction];
// Attach the second layer instruction to the second video composition instruction.
secondVideoCompositionInstruction.layerInstructions = @[secondVideoLayerInstruction];
// Attach both of the video composition instructions to the video composition.
mutableVideoComposition.instructions = @[firstVideoCompositionInstruction, secondVideoCompositionInstruction];
Putting it all Together: Combining Multiple Assets and Saving the Result to the Camera Roll
Among other things, the example shows you how to:
Check the preferredTransform property of a video asset track to determine the video's orientation.
Set appropriate values for the renderSize and frameDuration properties of a video composition.
Use a composition in conjunction with a video composition when exporting to a video file.
Note: To focus on the most relevant code, this example omits several aspects of a complete app,
such as memory management and error handling. To use AV Foundation, you are expected to have
enough experience with Cocoa to infer the missing pieces.
Note: This part assumes that you have two assets that each contain at least one video track, and
a third asset that contains at least one audio track. The videos can be retrieved from the camera roll,
and the audio track can be retrieved from the music library or from the videos themselves, for example.
All AVAssetTrack objects have a preferredTransform property that contains the orientation information
for that asset track. This transform is applied whenever the asset track is displayed onscreen. In the previous
code, the layer instruction's transform is set to the asset track's transform so that the video in the new
composition displays properly once you adjust its render size.
Media Capture
To manage the capture from a device such as a camera or microphone, you assemble objects to represent
inputs and outputs, and use an instance of AVCaptureSession to coordinate the data flow between them.
Minimally you need:
An instance of a concrete subclass of AVCaptureInput to configure the ports from the input device
An instance of a concrete subclass of AVCaptureOutput to manage the output to a movie file or still
image
An instance of AVCaptureSession to coordinate the data flow from the input to the output
To show the user what a camera is recording, you can use an instance of AVCaptureVideoPreviewLayer
(a subclass of CALayer).
You can configure multiple inputs and outputs, coordinated by a single session:
For many applications, this is as much detail as you need. For some operations, however (if you want to monitor
the power levels in an audio channel, for example), you need to consider how the various ports of an input
device are represented and how those ports are connected to the output.
Use a Capture Session to Coordinate Data Flow
A connection between a capture input and a capture output in a capture session is represented by an
AVCaptureConnection object. Capture inputs (instances of AVCaptureInput) have one or more input ports
(instances of AVCaptureInputPort). Capture outputs (instances of AVCaptureOutput) can accept data
from one or more sources (for example, an AVCaptureMovieFileOutput object accepts both video and
audio data).
When you add an input or an output to a session, the session greedily forms connections between all
compatible capture input ports and capture outputs.
You can use a capture connection to enable or disable the flow of data from a given input or to a given output.
You can also use a connection to monitor the average and peak power levels in an audio channel.
You use an AVCaptureSession instance to coordinate the flow of data from AV input devices to outputs. You add the capture devices and
outputs you want to the session, then start data flow by sending the session a startRunning message, and
stop recording by sending a stopRunning message.
Configuring a Session
You use a preset on the session to specify the image quality and resolution you want. A preset is a constant
that identifies one of a number of possible configurations; in some cases the actual configuration is
device-specific:
Symbol: Resolution (Comments)
AVCaptureSessionPresetHigh: High
AVCaptureSessionPresetMedium: Medium
AVCaptureSessionPresetLow: Low
AVCaptureSessionPreset640x480: 640x480 (VGA)
AVCaptureSessionPreset1280x720: 1280x720 (720p HD)
AVCaptureSessionPresetPhoto: Photo
For examples of the actual values these presets represent for various devices, see Saving to a Movie File (page
60) and Capturing Still Images (page 64).
If you want to set a size-specific configuration, you should check whether it is supported before setting it:
if ([session canSetSessionPreset:AVCaptureSessionPreset1280x720]) {
session.sessionPreset = AVCaptureSessionPreset1280x720;
}
else {
// Handle the failure.
}
In many situations, you create a session and the various inputs and outputs all at once. Sometimes, however,
you may want to reconfigure a running session, perhaps as different input devices become available, or in
response to user request. This can present a challenge, since, if you change them one at a time, a new setting
may be incompatible with an existing setting. To deal with this, you use beginConfiguration and
commitConfiguration to batch multiple configuration operations into an atomic update. After calling
beginConfiguration, you can for example add or remove outputs, alter the sessionPreset, or configure
individual capture input or output properties. No changes are actually made until you invoke
commitConfiguration, at which time they are applied together.
[session beginConfiguration];
// Remove an existing capture device.
// Add a new capture device.
// Reset the preset.
[session commitConfiguration];
An AVCaptureDevice Object Represents an Input Device
You can register to receive AVCaptureDeviceWasConnectedNotification and
AVCaptureDeviceWasDisconnectedNotification notifications to be alerted when the list of available
devices changes.
You add a device to a capture session using a capture input (see Use Capture Inputs to Add a Capture Device
to a Session (page 58)).
Device Characteristics
You can ask a device about several different characteristics. You can test whether it provides a particular media
type or supports a given capture session preset using hasMediaType: and
supportsAVCaptureSessionPreset: respectively. To provide information to the user, you can find out
the position of the capture device (whether it is on the front or the back of the unit they're using), and its
localized name. This may be useful if you want to present a list of capture devices to allow the user to choose
one.
The following code example iterates over all the available devices and logs their name, and for video devices
their position on the unit.
NSArray *devices = [AVCaptureDevice devices];
for (AVCaptureDevice *device in devices) {
    NSLog(@"Device name: %@", [device localizedName]);
    if ([device hasMediaType:AVMediaTypeVideo]) {
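        // A hedged completion of this loop body: report whether the video device
        // is on the back or the front of the unit (the log text is an assumption).
        if ([device position] == AVCaptureDevicePositionBack) {
            NSLog(@"Device position: back");
        }
        else {
            NSLog(@"Device position: front");
        }
    }
}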
In addition, you can find out the device's model ID and its unique ID.
Which capture settings are available depends on the device:
Feature: iPhone 3G / iPhone 3GS / iPhone 4 (Back) / iPhone 4 (Front)
Focus mode: No / Yes / Yes / No
Focus point of interest: No / Yes / Yes / No
Exposure mode: Yes / Yes / Yes / Yes
Exposure point of interest: No / Yes / Yes / Yes
White balance mode: Yes / Yes / Yes / Yes
Flash mode: No / No / Yes / No
Torch mode: No / No / Yes / No
The following code fragment shows how you can find video input devices that have a torch mode and support
a given capture session preset:
NSArray *devices = [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
NSMutableArray *torchDevices = [[NSMutableArray alloc] init];
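// A hedged completion of this example: collect the devices that have a torch and
// support the 640x480 preset (the specific preset is an assumption).
for (AVCaptureDevice *device in devices) {
    if ([device hasTorch] &&
        [device supportsAVCaptureSessionPreset:AVCaptureSessionPreset640x480]) {
        [torchDevices addObject:device];
    }
}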
If you find multiple devices that meet your criteria, you might let the user choose which one they want to use.
To display a description of a device to the user, you can use its localizedName property.
You use the various different features in similar ways. There are constants to specify a particular mode, and
you can ask a device whether it supports a particular mode. In several cases you can observe a property to be
notified when a feature is changing. In all cases, you should lock the device before changing the mode of a
particular feature, as described in Configuring a Device (page 57).
Note: Focus point of interest and exposure point of interest are mutually exclusive, as are focus
mode and exposure mode.
Focus modes
There are three focus modes:
AVCaptureFocusModeLocked: the focal length is fixed.
This is useful when you want to allow the user to compose a scene then lock the focus.
AVCaptureFocusModeAutoFocus: the camera does a single scan focus then reverts to locked.
This is suitable for a situation where you want to select a particular item on which to focus and then
maintain focus on that item even if it is not the center of the scene.
AVCaptureFocusModeContinuousAutoFocus: the camera continuously auto-focuses as needed.
You use the isFocusModeSupported: method to determine whether a device supports a given focus mode,
then set the mode using the focusMode property.
In addition, a device may support a focus point of interest. You test for support using
focusPointOfInterestSupported. If it's supported, you set the focal point using focusPointOfInterest.
You pass a CGPoint where {0,0} represents the top left of the picture area, and {1,1} represents the bottom
right in landscape mode with the home button on the right; this applies even if the device is in portrait mode.
You can use the adjustingFocus property to determine whether a device is currently focusing. You can
observe the property using key-value observing to be notified when a device starts and stops focusing.
If you change the focus mode settings, you can return them to the default configuration as follows:
if ([currentDevice isFocusModeSupported:AVCaptureFocusModeContinuousAutoFocus]) {
CGPoint autofocusPoint = CGPointMake(0.5f, 0.5f);
[currentDevice setFocusPointOfInterest:autofocusPoint];
[currentDevice setFocusMode:AVCaptureFocusModeContinuousAutoFocus];
}
Exposure modes
There are three exposure modes:
AVCaptureExposureModeLocked: the exposure level is fixed.
AVCaptureExposureModeAutoExpose: the device automatically adjusts the exposure once and then reverts to locked.
AVCaptureExposureModeContinuousAutoExposure: the device continuously adjusts the exposure level as needed.
You use the isExposureModeSupported: method to determine whether a device supports a given exposure
mode, then set the mode using the exposureMode property.
In addition, a device may support an exposure point of interest. You test for support using
exposurePointOfInterestSupported. If it's supported, you set the exposure point using
exposurePointOfInterest. You pass a CGPoint where {0,0} represents the top left of the picture area,
and {1,1} represents the bottom right in landscape mode with the home button on the right; this applies
even if the device is in portrait mode.
You can use the adjustingExposure property to determine whether a device is currently changing its
exposure setting. You can observe the property using key-value observing to be notified when a device starts
and stops changing its exposure setting.
If you change the exposure settings, you can return them to the default configuration as follows:
if ([currentDevice
isExposureModeSupported:AVCaptureExposureModeContinuousAutoExposure]) {
CGPoint exposurePoint = CGPointMake(0.5f, 0.5f);
[currentDevice setExposurePointOfInterest:exposurePoint];
[currentDevice setExposureMode:AVCaptureExposureModeContinuousAutoExposure];
}
Flash modes
There are three flash modes:
AVCaptureFlashModeOff: the flash never fires.
AVCaptureFlashModeOn: the flash always fires.
AVCaptureFlashModeAuto: the flash fires depending on the ambient light conditions.
You use hasFlash to determine whether a device has a flash. You use the isFlashModeSupported: method
to determine whether a device supports a given flash mode, then set the mode using the flashMode property.
Torch mode
Torch mode is where a camera uses the flash continuously at a low power to illuminate a video capture. There
are three torch modes:
AVCaptureTorchModeOff: the torch is always off.
AVCaptureTorchModeOn: the torch is always on.
AVCaptureTorchModeAuto: the torch is automatically switched on and off as needed.
You use hasTorch to determine whether a device has a torch. You use the isTorchModeSupported: method
to determine whether a device supports a given torch mode, then set the mode using the torchMode property.
For devices with a torch, the torch only turns on if the device is associated with a running capture session.
White balance
There are two white balance modes:
AVCaptureWhiteBalanceModeLocked: the white balance mode is fixed.
AVCaptureWhiteBalanceModeContinuousAutoWhiteBalance: the camera continuously adjusts the white balance as needed.
You use the isWhiteBalanceModeSupported: method to determine whether a device supports a given white balance mode, then set the mode using the whiteBalanceMode property.
Configuring a Device
To set capture properties on a device, you must first acquire a lock on the device using
lockForConfiguration:. This avoids making changes that may be incompatible with settings in other
applications. The following code fragment illustrates how to approach changing the focus mode on a device
by first determining whether the mode is supported, then attempting to lock the device for reconfiguration.
The focus mode is changed only if the lock is obtained, and the lock is released immediately afterward.
if ([device isFocusModeSupported:AVCaptureFocusModeLocked]) {
NSError *error = nil;
if ([device lockForConfiguration:&error]) {
device.focusMode = AVCaptureFocusModeLocked;
[device unlockForConfiguration];
}
    else {
        // Respond to the failure as appropriate.
    }
}
You should only hold the device lock if you need settable device properties to remain unchanged. Holding
the device lock unnecessarily may degrade capture quality in other applications sharing the device.
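Switching Between Devices
Sometimes you may want to allow users to switch between input devices, for example, switching from the front-facing to the back-facing camera. You can reconfigure a running session on the fly using beginConfiguration and commitConfiguration. A sketch of the lines leading into the fragment below; the session and device-input variables are assumptions:
AVCaptureSession *session = <#A capture session#>;
[session beginConfiguration];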
[session removeInput:frontFacingCameraDeviceInput];
[session addInput:backFacingCameraDeviceInput];
[session commitConfiguration];
When the outermost commitConfiguration is invoked, all the changes are made together. This ensures a
smooth transition.
Use Capture Inputs to Add a Capture Device to a Session
You add inputs to a session using addInput:. If appropriate, you can check whether a capture input is
compatible with an existing session using canAddInput:.
AVCaptureSession *captureSession = <#Get a capture session#>;
AVCaptureDeviceInput *captureDeviceInput = <#Get a capture device input#>;
if ([captureSession canAddInput:captureDeviceInput]) {
[captureSession addInput:captureDeviceInput];
}
else {
// Handle the failure.
}
See Configuring a Session (page 51) for more details on how you might reconfigure a running session.
An AVCaptureInput vends one or more streams of media data. For example, input devices can provide both
audio and video data. Each media stream provided by an input is represented by an AVCaptureInputPort
object. A capture session uses an AVCaptureConnection object to define the mapping between a set of
AVCaptureInputPort objects and a single AVCaptureOutput.
Use Capture Outputs to Get Output from a Session
To get output from a capture session, you add one or more outputs, instances of concrete subclasses of AVCaptureOutput; for example, AVCaptureVideoDataOutput if you want to process frames from the video being captured.
You add outputs to a capture session using addOutput:. You check whether a capture output is compatible
with an existing session using canAddOutput:. You can add and remove outputs as you want while the
session is running.
Saving to a Movie File
You save movie data to a file using an AVCaptureMovieFileOutput object.
The resolution and bit rate for the output depend on the capture session's sessionPreset. The video encoding
is typically H.264 and audio encoding AAC. The actual values vary by device, as illustrated in the following
table.
Preset: iPhone 3G / iPhone 3GS / iPhone 4 (Back) / iPhone 4 (Front)
High: No video (Apple Lossless audio) / 640x480, 3.5 mbps / 1280x720, 10.5 mbps / 640x480, 3.5 mbps
Medium: No video (Apple Lossless audio) / 480x360, 700 kbps / 480x360, 700 kbps / 480x360, 700 kbps
Low: No video (Apple Lossless audio) / 192x144, 128 kbps / 192x144, 128 kbps / 192x144, 128 kbps
640x480: No video (Apple Lossless audio) / 640x480, 3.5 mbps / 640x480, 3.5 mbps / 640x480, 3.5 mbps
1280x720: No video (Apple Lossless audio) / No video (64 kbps AAC audio) / 1280x720 / No video (64 kbps AAC audio)
Photo: Not supported for video output on any device
Starting a Recording
You start recording a QuickTime movie using startRecordingToOutputFileURL:recordingDelegate:.
You need to supply a file-based URL and a delegate. The URL must not identify an existing file, as the movie
file output does not overwrite existing resources. You must also have permission to write to the specified
location. The delegate must conform to the AVCaptureFileOutputRecordingDelegate protocol, and
must implement the
captureOutput:didFinishRecordingToOutputFileAtURL:fromConnections:error: method.
AVCaptureMovieFileOutput *aMovieFileOutput = <#Get a movie file output#>;
NSURL *fileURL = <#A file URL that identifies the output location#>;
[aMovieFileOutput startRecordingToOutputFileURL:fileURL recordingDelegate:<#The delegate#>];
In the implementation of
captureOutput:didFinishRecordingToOutputFileAtURL:fromConnections:error:, the delegate
might write the resulting movie to the camera roll. It should also check for any errors that might have occurred;
to determine whether the file was written successfully, you check not only the error, but also the value of the
AVErrorRecordingSuccessfullyFinishedKey in the error's user info dictionary:
- (void)captureOutput:(AVCaptureFileOutput *)captureOutput
didFinishRecordingToOutputFileAtURL:(NSURL *)outputFileURL
fromConnections:(NSArray *)connections
error:(NSError *)error {
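    // A hedged sketch of the checks described below: a recording may have finished
    // successfully even if an error is reported (for example, it hit a size limit).
    BOOL recordedSuccessfully = YES;
    if ([error code] != noErr) {
        id value = [[error userInfo] objectForKey:AVErrorRecordingSuccessfullyFinishedKey];
        if (value) {
            recordedSuccessfully = [value boolValue];
        }
    }
    // Continue as appropriate, for example by writing the movie to the camera roll.
}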
You should check the value of the AVErrorRecordingSuccessfullyFinishedKey in the error's user info
dictionary because the file might have been saved successfully, even though you got an error. The error might
indicate that one of your recording constraints was reached, for example AVErrorMaximumDurationReached
or AVErrorMaximumFileSizeReached. Other reasons the recording might stop are:
The recording device was disconnected (for example, the microphone was removed from an iPod touch): AVErrorDeviceWasDisconnected.
The session was interrupted (for example, a phone call was received): AVErrorSessionWasInterrupted.
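You can set metadata for the movie file output, even while recording. A hedged sketch of creating a metadata item to append to the output's metadata array (the key and value chosen here are assumptions); the fragment that follows adds the item and assigns the array:
AVMutableMetadataItem *item = [[AVMutableMetadataItem alloc] init];
item.keySpace = AVMetadataKeySpaceCommon;
item.key = AVMetadataCommonKeyDescription;
item.value = @"A description of the recording";
NSMutableArray *newMetadataArray = aMovieFileOutput.metadata ?
    [aMovieFileOutput.metadata mutableCopy] : [[NSMutableArray alloc] init];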
[newMetadataArray addObject:item];
aMovieFileOutput.metadata = newMetadataArray;
Processing Frames of Video
An AVCaptureVideoDataOutput object uses delegation to vend video frames; the frames are represented by the CMSampleBuffer
opaque type (see Representations of Media (page 100)). By default, the buffers are emitted in the camera's
most efficient format. You can use the videoSettings property to specify a custom output format. The video
settings property is a dictionary; currently, the only supported key is kCVPixelBufferPixelFormatTypeKey.
The recommended pixel format choices for iPhone 4 are
kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange or kCVPixelFormatType_32BGRA; for iPhone
3G the recommended pixel format choices are kCVPixelFormatType_422YpCbCr8 or
kCVPixelFormatType_32BGRA. You should use the CVOpenGLESTextureCacheRef type defined in the
Core Video framework if you want to work with the 420v format directly. Both Core Graphics and OpenGL work
well with the BGRA format:
AVCaptureVideoDataOutput *videoDataOutput = <#Get a video data output#>;
NSDictionary *newSettings =
    @{ (NSString *)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA) };
videoDataOutput.videoSettings = newSettings;
Your implementation of the sample buffer delegate method must be able to process each frame within
the amount of time allotted to a frame. If it takes too long and you hold onto the video frames, AV Foundation
stops delivering frames, not only to your delegate but also to other outputs such as a preview layer.
You can use the capture video data output's minFrameDuration property to ensure you have enough time
to process a frame, at the cost of having a lower frame rate than would otherwise be the case. You might
also ensure that the alwaysDiscardsLateVideoFrames property is set to YES (the default). This ensures
that any late video frames are dropped rather than handed to you for processing. Alternatively, if you are
recording and it doesn't matter if the output frames are a little late because you would prefer to get all of them,
you can set the property value to NO. This does not mean that frames will not be dropped (that is, frames may
still be dropped), but they may not be dropped as early, or as efficiently.
Capturing Still Images
Use an AVCaptureStillImageOutput output if you want to capture still images with accompanying metadata. The resolution of a captured image depends on the session preset and the device:
Preset: iPhone 3G / iPhone 3GS / iPhone 4 (Back) / iPhone 4 (Front)
High: 400x304 / 640x480 / 1280x720 / 640x480
Medium: 400x304 / 480x360 / 480x360 / 480x360
Low: 400x304 / 192x144 / 192x144 / 192x144
640x480: N/A / 640x480 / 640x480 / 640x480
1280x720: N/A / N/A / 1280x720 / N/A
Photo: 1600x1200 / 2048x1536 / 2592x1936 / 640x480
You can find out what pixel and codec types are supported using availableImageDataCVPixelFormatTypes
and availableImageDataCodecTypes respectively. You set the outputSettings dictionary to specify
the image format you want, for example:
AVCaptureStillImageOutput *stillImageOutput = [[AVCaptureStillImageOutput alloc]
init];
NSDictionary *outputSettings = @{ AVVideoCodecKey : AVVideoCodecJPEG};
[stillImageOutput setOutputSettings:outputSettings];
If you want to capture a JPEG image, you should typically not specify your own compression format. Instead,
you should let the still image output do the compression for you, since its compression is hardware-accelerated.
If you need a data representation of the image, you can use jpegStillImageNSDataRepresentation: to
get an NSData object without re-compressing the data, even if you modify the image's metadata.
Capturing an Image
When you want to capture an image, you send the output a
captureStillImageAsynchronouslyFromConnection:completionHandler: message. The first
argument is the connection you want to use for the capture. You need to look for the connection whose input
port is collecting video:
AVCaptureConnection *videoConnection = nil;
for (AVCaptureConnection *connection in stillImageOutput.connections) {
for (AVCaptureInputPort *port in [connection inputPorts]) {
if ([[port mediaType] isEqual:AVMediaTypeVideo] ) {
videoConnection = connection;
break;
}
}
if (videoConnection) { break; }
}
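Having found the connection, a minimal sketch of issuing the capture; the JPEG-data handling in the completion block is an assumption consistent with the discussion above:
[stillImageOutput captureStillImageAsynchronouslyFromConnection:videoConnection
    completionHandler:^(CMSampleBufferRef imageSampleBuffer, NSError *error) {
        if (imageSampleBuffer) {
            // For JPEG output, get an NSData representation without recompressing.
            NSData *jpegData =
                [AVCaptureStillImageOutput jpegStillImageNSDataRepresentation:imageSampleBuffer];
            // Do something with the data, for example write it to a file.
        }
}];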
Showing the User What's Being Recorded
Video Preview
You can provide the user with a preview of what's being recorded using an AVCaptureVideoPreviewLayer
object. AVCaptureVideoPreviewLayer is a subclass of CALayer (see Core Animation Programming Guide).
You don't need any outputs to show the preview.
Unlike a capture output, a video preview layer maintains a strong reference to the session with which it is
associated. This is to ensure that the session is not deallocated while the layer is attempting to display video.
This is reflected in the way you initialize a preview layer:
AVCaptureSession *captureSession = <#Get a capture session#>;
CALayer *viewLayer = <#Get a layer from the view in which you want to present the
preview#>;
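// A hedged completion of this example: create the preview layer with the session
// and add it to the view's layer (you would typically also set its frame).
AVCaptureVideoPreviewLayer *previewLayer =
    [[AVCaptureVideoPreviewLayer alloc] initWithSession:captureSession];
[viewLayer addSublayer:previewLayer];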
In general, the preview layer behaves like any other CALayer object in the render tree (see Core Animation
Programming Guide ). You can scale the image and perform transformations, rotations and so on just as you
would any layer. One difference is that you may need to set the layer's orientation property to specify how
it should rotate images coming from the camera. In addition, on iPhone 4 the preview layer supports mirroring
(this is the default when previewing the front-facing camera).
The preview layer supports three gravity modes that you set using the videoGravity property:
AVLayerVideoGravityResizeAspect: This preserves the aspect ratio, leaving black bars where the video does not fill the available screen area.
AVLayerVideoGravityResizeAspectFill: This preserves the aspect ratio, but fills the available screen area, cropping the video when necessary.
AVLayerVideoGravityResize: This simply stretches the video to fill the available screen area, even if doing so distorts the image.
Putting it all Together: Capturing Video Frames as UIImage Objects
This brief code example illustrates how you can capture video and convert the frames you get to UIImage objects. It shows you how to:
Create an AVCaptureSession object to coordinate the flow of data from an AV input device to an output
Find the AVCaptureDevice object for the input type you want
Implement a function to convert the CMSampleBuffer received by the delegate into a UIImage object
Note: To focus on the most relevant code, this example omits several aspects of a complete
application, including memory management. To use AV Foundation, you are expected to have
enough experience with Cocoa to be able to infer the missing pieces.
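Much of the setup for this example is summarized in the following sketch of the session, device input, and data output; the preset choice and the error handling are assumptions:
AVCaptureSession *session = [[AVCaptureSession alloc] init];
session.sessionPreset = AVCaptureSessionPresetMedium;

// Find a video capture device and wrap it in a device input.
AVCaptureDevice *device = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
NSError *error = nil;
AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:device error:&error];
if (input) {
    [session addInput:input];
}

// Create a video data output that vends uncompressed BGRA frames.
AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];
output.videoSettings = @{ (NSString *)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA) };
[session addOutput:output];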
The data output object uses delegation to vend the video frames. The delegate must adopt the
AVCaptureVideoDataOutputSampleBufferDelegate protocol. When you set the data output's delegate,
you must also provide a queue on which callbacks should be invoked.
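For example, a sketch of setting the delegate and its callback queue (the queue label is an assumption):
dispatch_queue_t queue = dispatch_queue_create("MyQueue", NULL);
[output setSampleBufferDelegate:self queue:queue];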
You use the queue to modify the priority given to delivering and processing the video frames.
Remember that the delegate method is invoked on the queue you specified in
setSampleBufferDelegate:queue:; if you want to update the user interface, you must invoke any relevant
code on the main thread.
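For example, a delegate implementation might convert each frame and then hop to the main queue to update the user interface. The imageFromSampleBuffer function and the imageView property here are illustrative names.
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
     // This runs on the queue passed to setSampleBufferDelegate:queue:.
     UIImage *image = imageFromSampleBuffer(sampleBuffer);
     dispatch_async(dispatch_get_main_queue(), ^{
          // Touch UIKit only on the main thread.
          self.imageView.image = image;
     });
}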
Export
To read and write audiovisual assets, you must use the export APIs provided by the AV Foundation framework.
The AVAssetExportSession class provides an interface for simple exporting needs, such as modifying the
file format or trimming the length of an asset (see Trimming and Transcoding a Movie (page 16)). For more
in-depth exporting needs, use the AVAssetReader and AVAssetWriter classes.
Use an AVAssetReader when you want to perform an operation on the contents of an asset. For example,
you might read the audio track of an asset to produce a visual representation of the waveform. To produce an
asset from media such as sample buffers or still images, use an AVAssetWriter object.
Note: The asset reader and writer classes are not intended to be used for real-time processing. In
fact, an asset reader cannot even be used for reading from a real-time source like an HTTP live stream.
However, if you are using an asset writer with a real-time data source, such as an AVCaptureOutput
object, set the expectsMediaDataInRealTime property of your asset writer's inputs to YES.
Setting this property to YES for a non-real-time data source will result in your files not being
interleaved properly.
Reading an Asset
Each AVAssetReader object can be associated only with a single asset at a time, but this asset may contain
multiple tracks. For this reason, you must assign concrete subclasses of the AVAssetReaderOutput class to
your asset reader before you begin reading in order to configure how the media data is read. There are three
concrete subclasses of the AVAssetReaderOutput base class that you can use for your asset reading needs:
AVAssetReaderTrackOutput, AVAssetReaderAudioMixOutput, and
AVAssetReaderVideoCompositionOutput.
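Creating the asset reader itself might look like the following sketch; the outError variable here is the one referred to in the note below.
NSError *outError = nil;
AVAsset *someAsset = <#AVAsset that you want to read#>;
// Create the asset reader; if creation fails, assetReader is nil and outError describes the problem.
AVAssetReader *assetReader = [AVAssetReader assetReaderWithAsset:someAsset error:&outError];
BOOL success = (assetReader != nil);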
Note: Always check that the asset reader returned to you is non-nil to ensure that the asset reader
was initialized successfully. Otherwise, the error parameter (outError in the previous example) will
contain the relevant error information.
Note: To read the media data from a specific asset track in the format in which it was stored, pass
nil to the outputSettings parameter.
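For example, a track output that reads the asset's audio track and decompresses it to Linear PCM might be set up as follows; this is a sketch that assumes the assetReader created earlier.
AVAsset *localAsset = assetReader.asset;
// Get the audio track to read.
AVAssetTrack *audioTrack = [[localAsset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];
// Decompression settings for Linear PCM.
NSDictionary *decompressionAudioSettings = @{ AVFormatIDKey : [NSNumber numberWithUnsignedInt:kAudioFormatLinearPCM] };
// Create the output with the audio track and decompression settings.
AVAssetReaderOutput *trackOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:audioTrack outputSettings:decompressionAudioSettings];
// Add the output to the reader if possible.
if ([assetReader canAddOutput:trackOutput])
     [assetReader addOutput:trackOutput];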
Note: Passing nil for the audioSettings parameter tells the asset reader to return samples in a
convenient uncompressed format. The same is true for the
AVAssetReaderVideoCompositionOutput class.
The video composition output behaves in much the same way: You can read multiple video tracks from your
asset that have been composited together using an AVVideoComposition object. To read the media data
from multiple composited video tracks and decompress it to ARGB, set up your output as follows:
AVVideoComposition *videoComposition = <#An AVVideoComposition that specifies how the video tracks from the AVAsset are composited#>;
// Assumes assetReader was initialized with an AVComposition.
AVComposition *composition = (AVComposition *)assetReader.asset;
// Get the video tracks to read.
NSArray *videoTracks = [composition tracksWithMediaType:AVMediaTypeVideo];
// Decompression settings for ARGB.
NSDictionary *decompressionVideoSettings = @{ (id)kCVPixelBufferPixelFormatTypeKey : [NSNumber numberWithUnsignedInt:kCVPixelFormatType_32ARGB],
                                              (id)kCVPixelBufferIOSurfacePropertiesKey : [NSDictionary dictionary] };
// Create the video composition output with the video tracks and decompression settings.
AVAssetReaderOutput *videoCompositionOutput = [AVAssetReaderVideoCompositionOutput assetReaderVideoCompositionOutputWithVideoTracks:videoTracks videoSettings:decompressionVideoSettings];
// Associate the video composition used to composite the video tracks being read with the output.
videoCompositionOutput.videoComposition = videoComposition;
// Add the output to the reader if possible.
if ([assetReader canAddOutput:videoCompositionOutput])
     [assetReader addOutput:videoCompositionOutput];
Writing an Asset
Use the AVAssetWriter class to write media data from multiple sources to a single file of a specified file format.
You don't need to associate your asset writer object with a specific asset, but you must use a separate asset
writer for each output file that you want to create. Because an asset writer can write media data from multiple
sources, you must create an AVAssetWriterInput object for each individual track that you want to write to
the output file. Each AVAssetWriterInput object expects to receive data in the form of CMSampleBufferRef
objects, but if you want to append CVPixelBufferRef objects to your asset writer input, use the
AVAssetWriterInputPixelBufferAdaptor class.
For example, to create output settings that compress audio as 128 kbps AAC, you might use the following; the channel layout setup here matches the stereo layout used in the reencoding example later in this chapter.
AudioChannelLayout stereoChannelLayout = {
     .mChannelLayoutTag = kAudioChannelLayoutTag_Stereo,
     .mChannelBitmap = 0,
     .mNumberChannelDescriptions = 0
};
NSData *channelLayoutAsData = [NSData dataWithBytes:&stereoChannelLayout length:offsetof(AudioChannelLayout, mChannelDescriptions)];
NSDictionary *compressionAudioSettings = @{
     AVFormatIDKey         : [NSNumber numberWithUnsignedInt:kAudioFormatMPEG4AAC],
     AVEncoderBitRateKey   : [NSNumber numberWithInteger:128000],
     AVSampleRateKey       : [NSNumber numberWithInteger:44100],
     AVChannelLayoutKey    : channelLayoutAsData,
     AVNumberOfChannelsKey : [NSNumber numberWithUnsignedInteger:2]
};
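You would then create the input with these settings and add it to your asset writer, along the lines of this sketch (assetWriter is assumed to have been created already):
AVAssetWriterInput *assetWriterInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio outputSettings:compressionAudioSettings];
if ([assetWriter canAddInput:assetWriterInput])
     [assetWriter addInput:assetWriterInput];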
Note: If you want the media data to be written in the format in which it was stored, pass nil in the
outputSettings parameter. Pass nil only if the asset writer was initialized with a fileType of
AVFileTypeQuickTimeMovie.
Your asset writer input can optionally include some metadata or specify a different transform for a particular
track using the metadata and transform properties respectively. For an asset writer input whose data source
is a video track, you can maintain the video's original transform in the output file by doing the following:
AVAsset *videoAsset = <#AVAsset with at least one video track#>;
AVAssetTrack *videoAssetTrack = [[videoAsset tracksWithMediaType:AVMediaTypeVideo] objectAtIndex:0];
assetWriterInput.transform = videoAssetTrack.preferredTransform;
Note: Set the metadata and transform properties before you begin writing with your asset writer
for them to take effect.
When writing media data to the output file, sometimes you may want to allocate pixel buffers. To do so, use
the AVAssetWriterInputPixelBufferAdaptor class. For greatest efficiency, instead of adding pixel buffers
that were allocated using a separate pool, use the pixel buffer pool provided by the pixel buffer adaptor. The
following code creates a pixel buffer adaptor, working in the RGB domain, that will use CGImage objects to create
its pixel buffers.
NSDictionary *pixelBufferAttributes = @{
     kCVPixelBufferCGImageCompatibilityKey : [NSNumber numberWithBool:YES],
     kCVPixelBufferCGBitmapContextCompatibilityKey : [NSNumber numberWithBool:YES],
     kCVPixelBufferPixelFormatTypeKey : [NSNumber numberWithInt:kCVPixelFormatType_32ARGB]
};
AVAssetWriterInputPixelBufferAdaptor *inputPixelBufferAdaptor = [AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:self.assetWriterInput sourcePixelBufferAttributes:pixelBufferAttributes];
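Once writing has started, you might then draw into pixel buffers taken from the adaptor's pool and append them, roughly as follows; the presentation time and the drawing code are placeholders.
CVPixelBufferRef pixelBuffer = NULL;
// Create a pixel buffer from the adaptor's pool rather than allocating one separately.
CVPixelBufferPoolCreatePixelBuffer(kCFAllocatorDefault, inputPixelBufferAdaptor.pixelBufferPool, &pixelBuffer);
if (pixelBuffer) {
     // <#Render your CGImage-based content into pixelBuffer#>
     if (self.assetWriterInput.isReadyForMoreMediaData)
          [inputPixelBufferAdaptor appendPixelBuffer:pixelBuffer withPresentationTime:<#The frame's presentation time#>];
     CVPixelBufferRelease(pixelBuffer);
}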
Normally, to end a writing session you must call the endSessionAtSourceTime: method. However, if your
writing session goes right up to the end of your file, you can end the writing session simply by calling the
finishWriting method. To start up an asset writer with a single input and write all of its media data, do the
following:
// Prepare the asset writer for writing.
[self.assetWriter startWriting];
// Start a sample-writing session.
[self.assetWriter startSessionAtSourceTime:kCMTimeZero];
// Specify the block to execute when the asset writer is ready for media data and the queue to call it on.
[self.assetWriterInput requestMediaDataWhenReadyOnQueue:myInputSerialQueue usingBlock:^{
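     // The body of this block is sketched here: it pulls sample buffers from a source you define
     // (the copyNextSampleBufferToWrite stub discussed below) and appends them to the input.
     while ([self.assetWriterInput isReadyForMoreMediaData])
     {
          // Get the next sample buffer.
          CMSampleBufferRef nextSampleBuffer = [self copyNextSampleBufferToWrite];
          if (nextSampleBuffer)
          {
               // If it exists, append the next sample buffer to the output file.
               [self.assetWriterInput appendSampleBuffer:nextSampleBuffer];
               CFRelease(nextSampleBuffer);
               nextSampleBuffer = NULL;
          }
          else
          {
               // Assume that a lack of a next sample buffer means the source is out of samples; mark the input as finished.
               [self.assetWriterInput markAsFinished];
               break;
          }
     }
}];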
The copyNextSampleBufferToWrite method in the code above is simply a stub. The location of this stub
is where you would need to insert some logic to return CMSampleBufferRef objects representing the media
data that you want to write. One possible source of sample buffers is an asset reader output.
Reencoding Assets
You can use an asset reader and asset writer object in tandem to convert an asset from one representation to
another. Using these objects, you have more control over the conversion than you do with an
AVAssetExportSession object. For example, you can choose which of the tracks you want to be represented
in the output file, specify your own output format, or modify the asset during the conversion process. The first
step in this process is just to set up your asset reader outputs and asset writer inputs as desired. After your
asset reader and writer are fully configured, you start up both of them with calls to the startReading and
startWriting methods, respectively. The following code snippet shows how to use a single asset writer
input to write media data supplied by a single asset reader output:
// Create a serialization queue on which to vend the media data.
dispatch_queue_t serializationQueue = dispatch_queue_create("Serialization queue", NULL);
// Specify the block to execute when the asset writer is ready for media data and the queue to call it on.
[self.assetWriterInput requestMediaDataWhenReadyOnQueue:serializationQueue usingBlock:^{
     while ([self.assetWriterInput isReadyForMoreMediaData])
     {
          // Get the asset reader output's next sample buffer.
          CMSampleBufferRef sampleBuffer = [self.assetReaderOutput copyNextSampleBuffer];
          if (sampleBuffer != NULL)
          {
               // If it exists, append this sample buffer to the output file.
               BOOL success = [self.assetWriterInput appendSampleBuffer:sampleBuffer];
               CFRelease(sampleBuffer);
               sampleBuffer = NULL;
               // Stop if there was a problem writing the sample buffer.
               if (!success)
                    break;
          }
          else
          {
               // The asset reader output must have vended all of its samples. Mark the input as finished.
               [self.assetWriterInput markAsFinished];
               break;
          }
     }
}];
Putting It All Together: Using an Asset Reader and Writer in Tandem to Reencode an Asset
This example shows how to:
Use serialization queues to handle the asynchronous nature of reading and writing audiovisual data
Initialize an asset reader and configure two asset reader outputs, one for audio and one for video
Initialize an asset writer and configure two asset writer inputs, one for audio and one for video
Use an asset reader to asynchronously supply media data to an asset writer through two different output/input combinations
Note: To focus on the most relevant code, this example omits several aspects of a complete
application. To use AV Foundation, you are expected to have enough experience with Cocoa to be
able to infer the missing pieces.
The main serialization queue is used to coordinate the starting and stopping of the asset reader and writer
(perhaps due to cancellation) and the other two serialization queues are used to serialize the reading and
writing by each output/input combination with a potential cancellation.
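A sketch of creating these queues might look like the following; the property names match those used throughout the rest of the example, while the queue labels are illustrative.
// Create the main serialization queue.
self.mainSerializationQueue = dispatch_queue_create("Main serialization queue", NULL);
// Create the serialization queue to use for reading and writing the audio data.
self.rwAudioSerializationQueue = dispatch_queue_create("Audio serialization queue", NULL);
// Create the serialization queue to use for reading and writing the video data.
self.rwVideoSerializationQueue = dispatch_queue_create("Video serialization queue", NULL);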
Now that you have some serialization queues, load the tracks of your asset and begin the reencoding process.
self.asset = <#AVAsset that you want to reencode#>;
self.cancelled = NO;
self.outputURL = <#NSURL representing desired output URL for file generated by asset writer#>;
// Asynchronously load the tracks of the asset you want to read.
[self.asset loadValuesAsynchronouslyForKeys:@[@"tracks"] completionHandler:^{
     // Once the tracks have finished loading, dispatch the work to the main serialization queue.
     dispatch_async(self.mainSerializationQueue, ^{
          // Due to asynchronous nature, check to see if user has already cancelled.
          if (self.cancelled)
               return;
          BOOL success = YES;
          NSError *localError = nil;
          // Check for success of loading the asset's tracks.
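          // (A sketch of the remainder of this listing; the first two helper method names are illustrative
          // stand-ins for the custom methods referred to in the text below.)
          success = ([self.asset statusOfValueForKey:@"tracks" error:&localError] == AVKeyValueStatusLoaded);
          // If the tracks loaded successfully, initialize the asset reader and writer.
          if (success)
               success = [self setUpReaderAndWriterReturningError:&localError];
          // If the reader and writer were initialized successfully, kick off the reencoding work.
          if (success)
               success = [self startReadingAndWritingReturningError:&localError];
          // If anything failed along the way, handle the failure.
          if (!success)
               [self readingAndWritingDidFinishSuccessfully:success withError:localError];
     });
}];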
When the track loading process finishes, whether successfully or not, the rest of the work is dispatched to the
main serialization queue to ensure that all of this work is serialized with a potential cancellation. Now all that's
left is to implement the cancellation process and the three custom methods at the end of the previous code
listing.
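The method that initializes the asset reader and writer might begin as follows. This is a sketch: the method name is the illustrative one used in the listing above, and the output is assumed to be written as a QuickTime movie file.
- (BOOL)setUpReaderAndWriterReturningError:(NSError **)outError
{
     BOOL success = YES;
     NSError *error = nil;
     // Create the asset reader for the asset you want to reencode.
     self.assetReader = [[AVAssetReader alloc] initWithAsset:self.asset error:&error];
     success = (self.assetReader != nil);
     if (success)
     {
          // If the reader was created successfully, create the asset writer for the output URL.
          self.assetWriter = [[AVAssetWriter alloc] initWithURL:self.outputURL fileType:AVFileTypeQuickTimeMovie error:&error];
          success = (self.assetWriter != nil);
     }
     if (!success && outError)
          *outError = error;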
     if (success)
     {
          // If the reader and writer were successfully initialized, grab the audio and video asset tracks that will be used.
          AVAssetTrack *assetAudioTrack = nil, *assetVideoTrack = nil;
          NSArray *audioTracks = [self.asset tracksWithMediaType:AVMediaTypeAudio];
          if ([audioTracks count] > 0)
               assetAudioTrack = [audioTracks objectAtIndex:0];
          NSArray *videoTracks = [self.asset tracksWithMediaType:AVMediaTypeVideo];
          if ([videoTracks count] > 0)
               assetVideoTrack = [videoTracks objectAtIndex:0];
          if (assetAudioTrack)
          {
               // If there is an audio track to read, set the decompression settings to Linear PCM and create the asset reader output.
               NSDictionary *decompressionAudioSettings = @{ AVFormatIDKey : [NSNumber numberWithUnsignedInt:kAudioFormatLinearPCM] };
               self.assetReaderAudioOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:assetAudioTrack outputSettings:decompressionAudioSettings];
               [self.assetReader addOutput:self.assetReaderAudioOutput];
               // Then, set the compression settings to 128kbps AAC and create the asset writer input.
               AudioChannelLayout stereoChannelLayout = {
                    .mChannelLayoutTag = kAudioChannelLayoutTag_Stereo,
                    .mChannelBitmap = 0,
                    .mNumberChannelDescriptions = 0
               };
               NSData *channelLayoutAsData = [NSData dataWithBytes:&stereoChannelLayout length:offsetof(AudioChannelLayout, mChannelDescriptions)];
               NSDictionary *compressionAudioSettings = @{
                    AVFormatIDKey         : [NSNumber numberWithUnsignedInt:kAudioFormatMPEG4AAC],
                    AVEncoderBitRateKey   : [NSNumber numberWithInteger:128000],
                    AVSampleRateKey       : [NSNumber numberWithInteger:44100],
                    AVChannelLayoutKey    : channelLayoutAsData,
                    AVNumberOfChannelsKey : [NSNumber numberWithUnsignedInteger:2]
               };
               // Create the asset writer input with the compression settings and add it to the asset writer.
               self.assetWriterAudioInput = [AVAssetWriterInput assetWriterInputWithMediaType:[assetAudioTrack mediaType] outputSettings:compressionAudioSettings];
               [self.assetWriter addInput:self.assetWriterAudioInput];
          }
          if (assetVideoTrack)
          {
               // If there is a video track to read, set the decompression settings for YUV and create the asset reader output.
               NSDictionary *decompressionVideoSettings = @{
                    (id)kCVPixelBufferPixelFormatTypeKey     : [NSNumber numberWithUnsignedInt:kCVPixelFormatType_422YpCbCr8],
                    (id)kCVPixelBufferIOSurfacePropertiesKey : [NSDictionary dictionary]
               };
               self.assetReaderVideoOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:assetVideoTrack outputSettings:decompressionVideoSettings];
               [self.assetReader addOutput:self.assetReaderVideoOutput];
               // Grab the video format descriptions from the video track and grab the first one if it exists.
               CMFormatDescriptionRef formatDescription = NULL;
               NSArray *formatDescriptions = [assetVideoTrack formatDescriptions];
               if ([formatDescriptions count] > 0)
                    formatDescription = (CMFormatDescriptionRef)[formatDescriptions objectAtIndex:0];
               // Grab the track dimensions and, if a format description exists, the clean aperture and pixel aspect ratio to use as compression settings.
               CGSize trackDimensions = [assetVideoTrack naturalSize];
               NSDictionary *compressionSettings = nil;
               if (formatDescription)
               {
                    NSDictionary *cleanAperture = nil;
                    NSDictionary *pixelAspectRatio = nil;
                    CFDictionaryRef cleanApertureFromCMFormatDescription = CMFormatDescriptionGetExtension(formatDescription, kCMFormatDescriptionExtension_CleanAperture);
                    if (cleanApertureFromCMFormatDescription)
                    {
                         cleanAperture = @{
                              AVVideoCleanApertureWidthKey            : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureWidth),
                              AVVideoCleanApertureHeightKey           : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureHeight),
                              AVVideoCleanApertureHorizontalOffsetKey : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureHorizontalOffset),
                              AVVideoCleanApertureVerticalOffsetKey   : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureVerticalOffset)
                         };
}
                    CFDictionaryRef pixelAspectRatioFromCMFormatDescription = CMFormatDescriptionGetExtension(formatDescription, kCMFormatDescriptionExtension_PixelAspectRatio);
                    if (pixelAspectRatioFromCMFormatDescription)
                    {
                         pixelAspectRatio = @{
                              AVVideoPixelAspectRatioHorizontalSpacingKey : (id)CFDictionaryGetValue(pixelAspectRatioFromCMFormatDescription, kCMFormatDescriptionKey_PixelAspectRatioHorizontalSpacing),
                              AVVideoPixelAspectRatioVerticalSpacingKey   : (id)CFDictionaryGetValue(pixelAspectRatioFromCMFormatDescription, kCMFormatDescriptionKey_PixelAspectRatioVerticalSpacing)
                         };
}
                    // Add whichever settings we could grab from the format description to the compression settings dictionary.
if (cleanAperture || pixelAspectRatio)
{
NSMutableDictionary *mutableCompressionSettings =
[NSMutableDictionary dictionary];
if (cleanAperture)
[mutableCompressionSettings setObject:cleanAperture
forKey:AVVideoCleanApertureKey];
if (pixelAspectRatio)
[mutableCompressionSettings setObject:pixelAspectRatio
forKey:AVVideoPixelAspectRatioKey];
compressionSettings = mutableCompressionSettings;
}
}
               // Create the video settings dictionary for H.264.
               NSMutableDictionary *videoSettings = [NSMutableDictionary dictionaryWithDictionary:@{
                    AVVideoCodecKey  : AVVideoCodecH264,
                    AVVideoWidthKey  : [NSNumber numberWithDouble:trackDimensions.width],
                    AVVideoHeightKey : [NSNumber numberWithDouble:trackDimensions.height]
               }];
               // Put the compression settings into the video settings dictionary if we were able to grab them.
               if (compressionSettings)
                    [videoSettings setObject:compressionSettings forKey:AVVideoCompressionPropertiesKey];
               // Create the asset writer input and add it to the asset writer.
               self.assetWriterVideoInput = [AVAssetWriterInput assetWriterInputWithMediaType:[assetVideoTrack mediaType] outputSettings:videoSettings];
               [self.assetWriter addInput:self.assetWriterVideoInput];
}
}
return success;
}
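The method that starts the reader and writer and drives the reencoding might begin along the following lines; again a sketch, with the illustrative method name used earlier and abbreviated error handling.
- (BOOL)startReadingAndWritingReturningError:(NSError **)outError
{
     BOOL success = YES;
     // Attempt to start the asset reader.
     success = [self.assetReader startReading];
     if (!success && outError)
          *outError = [self.assetReader error];
     if (success)
     {
          // If the asset reader started successfully, attempt to start the asset writer.
          success = [self.assetWriter startWriting];
          if (!success && outError)
               *outError = [self.assetWriter error];
     }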
if (success)
{
          // If the asset reader and writer both started successfully, create the dispatch group where the reencoding will take place and start a sample-writing session.
self.dispatchGroup = dispatch_group_create();
[self.assetWriter startSessionAtSourceTime:kCMTimeZero];
self.audioFinished = NO;
self.videoFinished = NO;
          if (self.assetWriterAudioInput)
          {
               // If there is audio to reencode, enter the dispatch group before beginning the work.
               dispatch_group_enter(self.dispatchGroup);
               // Specify the block to execute when the asset writer is ready for audio media data, and specify the queue to call it on.
               [self.assetWriterAudioInput requestMediaDataWhenReadyOnQueue:self.rwAudioSerializationQueue usingBlock:^{
                    // Because the block is called asynchronously, check to see whether its task is complete.
                    if (self.audioFinished)
                         return;
                    BOOL completedOrFailed = NO;
                    // If the task isn't complete yet, make sure that the input is actually ready for more media data.
                    while ([self.assetWriterAudioInput isReadyForMoreMediaData] && !completedOrFailed)
                    {
                         // Get the next audio sample buffer, and append it to the output file.
                         CMSampleBufferRef sampleBuffer = [self.assetReaderAudioOutput copyNextSampleBuffer];
                         if (sampleBuffer != NULL)
                         {
                              BOOL success = [self.assetWriterAudioInput appendSampleBuffer:sampleBuffer];
                              CFRelease(sampleBuffer);
                              sampleBuffer = NULL;
completedOrFailed = !success;
}
else
{
completedOrFailed = YES;
}
}
if (completedOrFailed)
{
                         // Mark the input as finished, but only if we haven't already done so, and then leave the dispatch group (since the audio work has finished).
BOOL oldFinished = self.audioFinished;
self.audioFinished = YES;
if (oldFinished == NO)
{
[self.assetWriterAudioInput markAsFinished];
}
dispatch_group_leave(self.dispatchGroup);
}
}];
}
          if (self.assetWriterVideoInput)
          {
               // If we had video to reencode, enter the dispatch group before beginning the work.
               dispatch_group_enter(self.dispatchGroup);
               // Specify the block to execute when the asset writer is ready for video media data, and specify the queue to call it on.
               [self.assetWriterVideoInput requestMediaDataWhenReadyOnQueue:self.rwVideoSerializationQueue usingBlock:^{
                    // Because the block is called asynchronously, check to see whether its task is complete.
                    if (self.videoFinished)
                         return;
                    BOOL completedOrFailed = NO;
                    // If the task isn't complete yet, make sure that the input is actually ready for more media data.
                    while ([self.assetWriterVideoInput isReadyForMoreMediaData] && !completedOrFailed)
                    {
                         // Get the next video sample buffer, and append it to the output file.
                         CMSampleBufferRef sampleBuffer = [self.assetReaderVideoOutput copyNextSampleBuffer];
if (sampleBuffer != NULL)
{
BOOL success = [self.assetWriterVideoInput
appendSampleBuffer:sampleBuffer];
CFRelease(sampleBuffer);
sampleBuffer = NULL;
completedOrFailed = !success;
}
else
{
completedOrFailed = YES;
}
}
if (completedOrFailed)
{
                         // Mark the input as finished, but only if we haven't already done so, and then leave the dispatch group (since the video work has finished).
BOOL oldFinished = self.videoFinished;
self.videoFinished = YES;
if (oldFinished == NO)
{
[self.assetWriterVideoInput markAsFinished];
}
dispatch_group_leave(self.dispatchGroup);
}
}];
}
          // Set up the notification that the dispatch group will send when the audio and video work have both finished.
          dispatch_group_notify(self.dispatchGroup, self.mainSerializationQueue, ^{
BOOL finalSuccess = YES;
NSError *finalError = nil;
// Check to see if the work has finished due to cancellation.
if (self.cancelled)
{
// If so, cancel the reader and writer.
[self.assetReader cancelReading];
[self.assetWriter cancelWriting];
}
else
{
                    // If cancellation didn't occur, first make sure that the asset reader didn't fail.
if ([self.assetReader status] == AVAssetReaderStatusFailed)
{
finalSuccess = NO;
finalError = [self.assetReader error];
}
                    // If the asset reader didn't fail, attempt to stop the asset writer and check for any errors.
if (finalSuccess)
{
finalSuccess = [self.assetWriter finishWriting];
if (!finalSuccess)
finalError = [self.assetWriter error];
}
}
               // Call the method to handle completion, and pass in the appropriate parameters to indicate whether reencoding was successful.
               [self readingAndWritingDidFinishSuccessfully:finalSuccess withError:finalError];
});
}
     // Return success here to indicate whether the asset reader and writer were started successfully.
return success;
}
The reencoding of the audio and video tracks is handled asynchronously on individual serialization
queues to increase the overall performance of the process, but both queues lie within the same dispatch
group. By placing the work for each track within the same dispatch group, the group can send a notification
when all of the work is done and the success of the reencoding process can be determined.
Handling Completion
To handle the completion of the reading and writing process, the
readingAndWritingDidFinishSuccessfully: method is called with parameters indicating whether
or not the reencoding completed successfully. If the process didn't finish successfully, the asset reader and
writer are both canceled and any UI-related tasks are dispatched to the main queue.
- (void)readingAndWritingDidFinishSuccessfully:(BOOL)success withError:(NSError
*)error
{
if (!success)
{
          // If the reencoding process failed, we need to cancel the asset reader and writer.
[self.assetReader cancelReading];
[self.assetWriter cancelWriting];
dispatch_async(dispatch_get_main_queue(), ^{
// Handle any UI tasks here related to failure.
});
}
else
{
// Reencoding was successful, reset booleans.
self.cancelled = NO;
self.videoFinished = NO;
self.audioFinished = NO;
dispatch_async(dispatch_get_main_queue(), ^{
// Handle any UI tasks here related to success.
});
}
}
Handling Cancellation
Using multiple serialization queues, you can allow the user of your app to cancel the reencoding process with
ease. On the main serialization queue, messages are asynchronously sent to each of the asset reencoding
serialization queues to cancel their reading and writing. When these two serialization queues complete their
cancellation, the dispatch group sends a notification to the main serialization queue where the cancelled
property is set to YES. You might associate the cancel method from the following code listing with a button
on your UI.
- (void)cancel
{
// Handle cancellation asynchronously, but serialize it with the main queue.
dispatch_async(self.mainSerializationQueue, ^{
// If we had audio data to reencode, we need to cancel the audio work.
if (self.assetWriterAudioInput)
{
               // Handle cancellation asynchronously again, but this time serialize it with the audio queue.
               dispatch_async(self.rwAudioSerializationQueue, ^{
                    // Update the Boolean property indicating the task is complete and mark the input as finished if it hasn't already been marked as such.
BOOL oldFinished = self.audioFinished;
self.audioFinished = YES;
if (oldFinished == NO)
{
[self.assetWriterAudioInput markAsFinished];
}
                    // Leave the dispatch group since the audio work is finished now.
dispatch_group_leave(self.dispatchGroup);
});
}
if (self.assetWriterVideoInput)
{
               // Handle cancellation asynchronously again, but this time serialize it with the video queue.
               dispatch_async(self.rwVideoSerializationQueue, ^{
                    // Update the Boolean property indicating the task is complete and mark the input as finished if it hasn't already been marked as such.
BOOL oldFinished = self.videoFinished;
self.videoFinished = YES;
if (oldFinished == NO)
{
[self.assetWriterVideoInput markAsFinished];
}
                    // Leave the dispatch group, since the video work is finished now.
dispatch_group_leave(self.dispatchGroup);
});
}
          // Set the cancelled Boolean property to YES to cancel any work on the main queue as well.
self.cancelled = YES;
});
}
Time and Media Representations
Time-based audiovisual data, such as a movie file or a video stream, is represented in the AV Foundation
framework by AVAsset. Its structure dictates much of the way the framework works. Several low-level data structures
that AV Foundation uses to represent time and media, such as sample buffers, come from the Core Media
framework.
Representation of Assets
AVAsset is the core class in the AV Foundation framework. It provides a format-independent abstraction of
time-based audiovisual data, such as a movie file or a video stream. In many cases, you work with one of its
subclasses: you use the composition subclasses when you create new assets (see Editing (page 8)), and you
use AVURLAsset to create a new asset instance from media at a given URL (including assets from the MPMedia
framework or the Asset Library framework; see Using Assets (page 10)).
An asset contains a collection of tracks that are intended to be presented or processed together, each of a
uniform media type, including (but not limited to) audio, video, text, closed captions, and subtitles. The asset
object provides information about the whole resource, such as its duration or title, as well as hints for presentation,
such as its natural size. Assets may also have metadata, represented by instances of AVMetadataItem.
A track is represented by an instance of AVAssetTrack. In a typical simple case, one track represents the
audio component and another represents the video component; in a complex composition, there may be
multiple overlapping tracks of audio and video.
A track has a number of properties, such as its type (video or audio), visual and/or audible characteristics (as
appropriate), metadata, and timeline (expressed in terms of its parent asset). A track also has an array of format
descriptions. The array contains CMFormatDescriptions (see CMFormatDescriptionRef), each of which
describes the format of media samples referenced by the track. A track that contains uniform media (for
example, all encoded using the same settings) will provide an array with a count of 1.
A track may itself be divided into segments, represented by instances of AVAssetTrackSegment. A segment
is a time mapping from the source to the asset track timeline.
Representations of Time
Time in AV Foundation is represented by primitive structures from the Core Media framework. A CMTime
represents time as a rational number, with a numerator (an int64_t value) and
a denominator (an int32_t timescale). Conceptually, the timescale specifies the fraction of a second each unit
in the numerator occupies. Thus if the timescale is 4, each unit represents a quarter of a second; if the timescale
is 10, each unit represents a tenth of a second, and so on. You frequently use a timescale of 600, since this is
a common multiple of several commonly-used frame-rates: 24 frames per second (fps) for film, 30 fps for NTSC
(used for TV in North America and Japan), and 25 fps for PAL (used for TV in Europe). Using a timescale of 600,
you can exactly represent any number of frames in these systems.
In addition to a simple time value, a CMTime can represent non-numeric values: +infinity, -infinity, and indefinite.
It can also indicate whether the time has been rounded at some point, and it maintains an epoch number.
Using CMTime
You create a time using CMTimeMake, or one of the related functions such as CMTimeMakeWithSeconds
(which allows you to create a time using a float value and specify a preferred time scale). There are several
functions for time-based arithmetic and for comparing times, as illustrated in the following example.
CMTime time1 = CMTimeMake(200, 2); // 200 half-seconds
CMTime time2 = CMTimeMake(400, 4); // 400 quarter-seconds
// time1 and time2 both represent 100 seconds, but using different timescales.
if (CMTimeCompare(time1, time2) == 0) {
NSLog(@"time1 and time2 are the same");
}
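Other functions, such as CMTimeAdd and CMTimeSubtract, follow the same pattern. For example:
// Add the two times; the sum again represents 200 seconds.
CMTime sum = CMTimeAdd(time1, time2);
CMTimeShow(sum);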
You should not compare the value of an arbitrary CMTime with kCMTimeInvalid.
Epochs
The epoch number of a CMTime is usually set to 0, but you can use it to distinguish unrelated timelines. For
example, the epoch could be incremented each cycle through a presentation loop, to differentiate between
time N in loop 0 from time N in loop 1.
A CMTimeRange is a structure that has a start time and a duration, both expressed as CMTime structures. A time range
does not include the time that is the start time plus the duration.
You create a time range using CMTimeRangeMake or CMTimeRangeFromTimeToTime. There are constraints
on the values of the CMTime epochs:
The epoch in a CMTime that represents a timestamp may be non-zero, but you can only perform range
operations (such as CMTimeRangeGetUnion) on ranges whose start fields have the same epoch.
The epoch in a CMTime that represents a duration should always be 0, and the value must be non-negative.
For example, because a time range does not include the time that is its start time plus its duration, the following expression always evaluates to false:
CMTimeRangeContainsTime(range, CMTimeRangeGetEnd(range))
You should not compare the value of an arbitrary CMTimeRange with kCMTimeRangeInvalid.
Representations of Media
Video data and its associated metadata is represented in AV Foundation by opaque objects from the Core
Media framework. Core Media represents video data using CMSampleBuffer (see CMSampleBufferRef).
CMSampleBuffer is a Core Foundation-style opaque type; an instance contains the sample buffer for a frame
of video data as a Core Video pixel buffer (see CVPixelBufferRef). You access the pixel buffer from a sample
buffer using CMSampleBufferGetImageBuffer:
CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(<#A CMSampleBuffer#>);
From the pixel buffer, you can access the actual video data. For an example, see Converting a CMSampleBuffer
to a UIImage (page 101).
In addition to the video data, you can retrieve a number of other aspects of the video frame:
Timing information: You get accurate timestamps for both the original presentation time and the decode
time using CMSampleBufferGetPresentationTimeStamp and CMSampleBufferGetDecodeTimeStamp
respectively.
Metadata: Metadata are stored in a dictionary as an attachment. You use CMGetAttachment to retrieve
the dictionary:
CMSampleBufferRef sampleBuffer = <#Get a sample buffer#>;
CFDictionaryRef metadataDictionary = (CFDictionaryRef)CMGetAttachment(sampleBuffer, CFSTR("MetadataDictionary"), NULL);
if (metadataDictionary) {
// Do something with the metadata.
}
Converting a CMSampleBuffer to a UIImage
The following function shows how you can convert a CMSampleBuffer to a UIImage object; it assumes the pixel buffer uses a 32-bit BGRA pixel format.
UIImage *imageFromSampleBuffer(CMSampleBufferRef sampleBuffer) {
    // Get a CMSampleBuffer's Core Video image buffer for the media data.
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    // Lock the base address of the pixel buffer.
    CVPixelBufferLockBaseAddress(imageBuffer, 0);
    // Get the number of bytes per row for the pixel buffer.
    size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
    // Get the pixel buffer width and height.
    size_t width = CVPixelBufferGetWidth(imageBuffer);
    size_t height = CVPixelBufferGetHeight(imageBuffer);
    // Get the base address of the pixel buffer.
    void *baseAddress = CVPixelBufferGetBaseAddress(imageBuffer);
    // Create a device-dependent RGB color space.
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    // Create a bitmap graphics context with the sample buffer data.
    CGContextRef context = CGBitmapContextCreate(baseAddress, width, height, 8, bytesPerRow, colorSpace,
                                                 kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
    // Create a Quartz image from the pixel data in the bitmap graphics context.
    CGImageRef cgImage = CGBitmapContextCreateImage(context);
    // Free up the context and color space.
    CGContextRelease(context);
    CGColorSpaceRelease(colorSpace);
    // Create an image object from the Quartz image.
    UIImage *image = [UIImage imageWithCGImage:cgImage];
    // Release the Quartz image and unlock the pixel buffer.
    CGImageRelease(cgImage);
    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
    return image;
}
Document Revision History
Date          Notes
2013-10-22
2013-08-08    Added a new chapter centered around the editing API provided by the AV Foundation framework.
2011-10-12
2011-04-28
2010-09-08    TBD
2010-08-16
Apple Inc.
Copyright © 2013 Apple Inc.
All rights reserved.
No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any
form or by any means, mechanical, electronic,
photocopying, recording, or otherwise, without
prior written permission of Apple Inc., with the
following exceptions: Any person is hereby
authorized to store documentation on a single
computer for personal use only and to print
copies of documentation for personal use
provided that the documentation contains
Apple's copyright notice.
No licenses, express or implied, are granted with
respect to any of the technology described in this
document. Apple retains all intellectual property
rights associated with the technology described
in this document. This document is intended to
assist application developers to develop
applications only for Apple-labeled computers.
Apple Inc.
1 Infinite Loop
Cupertino, CA 95014
408-996-1010
Apple, the Apple logo, Aperture, Cocoa, iPhone,
iPod, iPod touch, Mac, Mac OS, Objective-C, OS
X, Quartz, and QuickTime are trademarks of Apple
Inc., registered in the U.S. and other countries.
OpenGL is a registered trademark of Silicon
Graphics, Inc.
Times is a registered trademark of Heidelberger
Druckmaschinen AG, available from Linotype
Library GmbH.
iOS is a trademark or registered trademark of
Cisco in the U.S. and other countries and is used
under license.
Even though Apple has reviewed this document,
APPLE MAKES NO WARRANTY OR REPRESENTATION,
EITHER EXPRESS OR IMPLIED, WITH RESPECT TO THIS
DOCUMENT, ITS QUALITY, ACCURACY,
MERCHANTABILITY, OR FITNESS FOR A PARTICULAR
PURPOSE. AS A RESULT, THIS DOCUMENT IS PROVIDED
AS IS, AND YOU, THE READER, ARE ASSUMING THE
ENTIRE RISK AS TO ITS QUALITY AND ACCURACY.
IN NO EVENT WILL APPLE BE LIABLE FOR DIRECT,
INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL
DAMAGES RESULTING FROM ANY DEFECT OR
INACCURACY IN THIS DOCUMENT, even if advised of
the possibility of such damages.
THE WARRANTY AND REMEDIES SET FORTH ABOVE
ARE EXCLUSIVE AND IN LIEU OF ALL OTHERS, ORAL
OR WRITTEN, EXPRESS OR IMPLIED. No Apple dealer,
agent, or employee is authorized to make any
modification, extension, or addition to this warranty.
Some states do not allow the exclusion or limitation
of implied warranties or liability for incidental or
consequential damages, so the above limitation or
exclusion may not apply to you. This warranty gives
you specific legal rights, and you may also have other
rights which vary from state to state.