Matroska/WebM in MPEG DASH
This document defines how to use Matroska/WebM with the MPEG DASH adaptive streaming system defined in ISO/IEC 23009-1 [DASH].
Matroska and WebM share the same file structure; Matroska has some extra features, and WebM is restricted to the VP8 video codec and the Vorbis audio codec. The Matroska specification can be found at http://www.matroska.org/technical/specs/index.html [MATROSKA] and the WebM specification can be found at http://www.webmproject.org/code/specs/container/ [WEBM]. There is also an overview of all the structures found in Matroska/WebM for beginners at http://www.matroska.org/technical/diagram/index.html. The MPEG DASH terminology appears in bold and the Matroska terminology appears in italic, for example Segment in DASH and Segment in Matroska.
1 Segment formats for Matroska/WebM
1.1 Preliminaries: Refinements of generic concepts
1.1.1 Subsegments
Media subsegments are defined as one or more consecutive Clusters.
1.1.2 Media stream access points
Media stream access points correspond to a CuePoint contained in a Cues. Typically CuePoints reference Blocks that are marked as key frames within a single stream.
For WebM all SAPs shall be of type 1. In this case a SAP shall not reference any Blocks preceding the SAP and the SAP shall be the first frame rendered.
1.1.3 Segment Index
The Segment Index corresponds to the Cues.
1.2 Initialization Segment Format
The Initialization Segment shall contain the EBML header, Segment header, Segment Information and Tracks. The Initialization Segment may contain other level 1 elements and padding.
If the Media Segment contains a Cues that is placed after the Clusters then the Initialization Segment shall contain a SeekHead with a reference to the Cues.
The Initialization Segment shall not contain Clusters or Cues.
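The rules above can be sketched as a small check over the names of the elements found in a candidate Initialization Segment. This is only an illustration: the function name and the idea of passing a flat list of Matroska element names (EBML, Segment, SeekHead, Info, Tracks, Cluster, Cues) are assumptions made for the example, and the sketch does not verify that the SeekHead actually references the Cues.

```python
def check_initialization_segment(elements, cues_after_clusters=False):
    """Return a list of rule violations (an empty list means none were found).

    `elements` is a list of Matroska element names, a simplified stand-in
    for a parsed file. `cues_after_clusters` says whether the Media Segment
    places its Cues after the Clusters.
    """
    problems = []
    # EBML header, Segment header, Segment Information and Tracks are required.
    for name in ("EBML", "Segment", "Info", "Tracks"):
        if name not in elements:
            problems.append("missing required element: " + name)
    # Clusters and Cues are never part of the Initialization Segment.
    for name in ("Cluster", "Cues"):
        if name in elements:
            problems.append("forbidden element present: " + name)
    # A SeekHead is needed so a reader can locate Cues stored after the Clusters.
    if cues_after_clusters and "SeekHead" not in elements:
        problems.append("SeekHead required when Cues follows the Clusters")
    return problems
```

For instance, `check_initialization_segment(["EBML", "Segment", "Info", "Tracks"])` reports no problems, while the same list with `cues_after_clusters=True` reports the missing SeekHead.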
1.3 Media Segment Format
Media Segments can be of three different types: Basic Media Segments, Self-Initializing Media Segments, and Self-Initializing Indexed Media Segments.
1.3.1 General format type
- All Media Segments must contain one or more Clusters.
1.3.2 Basic Media Segment
- The Media Segment shall contain only one or more Clusters.
- The Initialization Segment that references these Media Segments should have a Segment with an unknown size (-1).
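In EBML, an unknown size is written as a size field whose value bits are all set to 1. The sketch below assumes the standard EBML variable-length size encoding (a leading 1 bit marks the field length, up to 8 bytes) and uses a hypothetical helper name, `read_size`; it decodes a size field and reports an unknown size as `None`.

```python
# Element ID of the Matroska Segment element.
SEGMENT_ID = bytes.fromhex("18538067")


def read_size(data, pos=0):
    """Decode an EBML size field at data[pos].

    Returns (size, field_length); size is None when the field encodes
    the reserved "unknown size" value (all value bits set to 1).
    """
    first = data[pos]
    if first == 0:
        raise ValueError("size fields longer than 8 bytes are not handled here")
    # The position of the first 1 bit gives the field length in bytes.
    length = 9 - first.bit_length()
    if pos + length > len(data):
        raise ValueError("truncated size field")
    # Drop the length marker bit, then append the remaining bytes.
    value = first & ((1 << (8 - length)) - 1)
    for byte in data[pos + 1:pos + length]:
        value = (value << 8) | byte
    unknown = (1 << (7 * length)) - 1
    return (None if value == unknown else value), length


# A Segment header with an 8-byte unknown size.
header = SEGMENT_ID + bytes.fromhex("01FFFFFFFFFFFFFF")
size, field_length = read_size(header, len(SEGMENT_ID))
```

Here `size` is `None` and `field_length` is 8, which is how a reader can recognise that the Segment has no declared end and that Clusters may keep arriving.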
1.3.3 Self-Initializing Media Segment
- Each Media Segment shall contain an Initialization Segment.
- The Initialization Segment shall be placed before the Clusters.
1.3.4 Self-Initializing Indexed Media Segment
- Each Media Segment shall contain an Initialization Segment.
- The Initialization Segment shall be placed before the Clusters.
- Each Media Segment shall contain a Cues that references the Clusters in the Media Segment.
- The Cues shall not be part of the Initialization Segment.
- The Cues shall be placed after the Initialization Segment.
- If the Cues is placed after the Clusters then a SeekHead shall be contained in the Initialization Segment with a reference to the Cues.
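The three Media Segment types can be told apart from the order of their top-level elements. The following sketch is illustrative only: it works on a flat list of Matroska element names rather than on a parsed file, the function name is hypothetical, and it approximates the end of the Initialization Segment by the position of Tracks.

```python
def classify_media_segment(elements):
    """Return the Media Segment type for an ordered list of element names.

    Raises ValueError when the ordering breaks one of the rules above.
    """
    if "Cluster" not in elements:
        raise ValueError("a Media Segment must contain one or more Clusters")
    if all(name == "Cluster" for name in elements):
        return "Basic Media Segment"

    # Everything before the first Cluster is treated as initialization data.
    first_cluster = elements.index("Cluster")
    before_clusters = elements[:first_cluster]
    for name in ("EBML", "Segment", "Info", "Tracks"):
        if name not in before_clusters:
            raise ValueError(name + " must precede the first Cluster")

    if "Cues" not in elements:
        return "Self-Initializing Media Segment"

    cues = elements.index("Cues")
    if cues < elements.index("Tracks"):
        raise ValueError("Cues must be placed after the Initialization Segment")
    if cues > first_cluster and "SeekHead" not in before_clusters:
        raise ValueError("Cues after the Clusters requires a SeekHead")
    return "Self-Initializing Indexed Media Segment"
```

A list such as `["EBML", "Segment", "SeekHead", "Info", "Tracks", "Cluster", "Cluster", "Cues"]` is classified as a Self-Initializing Indexed Media Segment; removing the SeekHead from that list raises an error because the Cues could no longer be located from the Initialization Segment.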
1.4 Media Presentation based on Matroska/WebM formats
1.4.1 MIME types
The MIME types for Segments are as follows:
video/webm for WebM video files
audio/webm for WebM audio-only files
video/x-matroska for Matroska video files
audio/x-matroska for Matroska audio-only files
video/x-matroska-3d for Matroska files with a stereoscopic video track
The MIME types for codecs are as defined in RFC 4281 [MIMETYPE]. For WebM only video/vp8 and audio/vorbis are possible.
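The Segment MIME types listed above amount to a lookup keyed by container and content kind. A minimal sketch follows; the key names ("webm", "matroska", "video", "audio", "3d") and the function name are assumptions chosen for the example, not identifiers from either specification.

```python
# Segment MIME types, keyed by (container, kind).
SEGMENT_MIME_TYPES = {
    ("webm", "video"): "video/webm",
    ("webm", "audio"): "audio/webm",
    ("matroska", "video"): "video/x-matroska",
    ("matroska", "audio"): "audio/x-matroska",
    ("matroska", "3d"): "video/x-matroska-3d",
}


def segment_mime_type(container, kind):
    """Return the Segment MIME type, or raise ValueError if none is listed."""
    try:
        return SEGMENT_MIME_TYPES[(container, kind)]
    except KeyError:
        raise ValueError(
            "no MIME type listed for %s/%s" % (container, kind)
        ) from None
```

A combination that the list does not cover, such as a stereoscopic WebM file, raises an error instead of guessing a type.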
1.4.2 General
- In the case a Representation contains only one Media Segment then the Media Segment must be a Self-Initializing Indexed Media Segment.
- An Index Segment shall not be present. However an @indexRange attribute may be present to signal the byte range for a Segment Index within a Media Segment.
- If an Initialization Segment is contained within a Self-Initializing Indexed Media Segment then the @range attribute of the Initialization Segment shall be present to signal the byte range for the Initialization Segment within the Media Segment.
1.4.3 Authoring Rules for specific MPD flags
1.4.3.1 Segments starting with media stream access points
- If the @startWithRAP attribute is not set to '0', the conditions in 5.5.3.2 [DASH] shall apply.
1.4.3.2 Subsegments starting with media stream access points
- If the @subsegmentStartsWithRAP attribute is not set to '0', the conditions in 5.5.3.2 [DASH] shall apply.
- Each subsegment shall start on a Cluster boundary.
- The first Block in the subsegment shall be a key Block.
1.4.3.3 Subsegment Alignment
- If the @segmentAlignmentFlag is set to 'true', the conditions in 5.5.3.2 [DASH] shall apply.
- The Cues of each Representation within an AdaptationSet shall have the same timecodes.
- Each CuePoint within a Cues should reference the first Block of the TrackNumber within a Cluster.
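The requirement that every Representation in an AdaptationSet carries the same Cues timecodes can be checked by comparing the cue time lists directly. The sketch below assumes the CueTime values have already been extracted into plain lists keyed by a Representation identifier; the function name and the identifiers are hypothetical.

```python
def cues_aligned(cue_times_by_representation):
    """Return True when every Representation has identical cue timecodes.

    `cue_times_by_representation` maps a Representation id to the ordered
    list of CueTime values read from that Representation's Cues.
    """
    timecode_lists = iter(cue_times_by_representation.values())
    reference = next(timecode_lists, None)
    if reference is None:
        return True  # nothing to compare
    return all(times == reference for times in timecode_lists)


aligned = cues_aligned({
    "video_500k": [0, 5000, 10000],
    "video_1000k": [0, 5000, 10000],
})
misaligned = cues_aligned({
    "video_500k": [0, 5000, 10000],
    "video_1000k": [0, 4960, 10000],
})
```

In this example `aligned` is `True` and `misaligned` is `False`, since the second pair of Representations disagrees on the middle cue.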
1.4.3.4 Bitstream switching
- If the @bitstreamSwitchingFlag is set to 'true', for a set of Representations within an AdaptationSet, the conditions in 5.5.3.2 [DASH] shall apply and the Bitstream Switching Segment shall not be present.
- For a set of Representations within an AdaptationSet, the TrackNumber, CodecID and CodecPrivate shall contain the same value.
- The codec shall support transparent resolution switching (e.g. VP8).
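The second rule above can be sketched as a comparison of three track fields across the Representations. The dictionary layout and function name are assumptions made for illustration; the values stand for whatever a demuxer reports for the TrackNumber, CodecID and CodecPrivate elements.

```python
def tracks_allow_bitstream_switching(tracks):
    """Return True when all tracks share TrackNumber, CodecID and CodecPrivate.

    `tracks` holds one dict per Representation, with the values of the
    Matroska TrackNumber, CodecID and CodecPrivate elements.
    """
    signatures = {
        (track["TrackNumber"], track["CodecID"], track["CodecPrivate"])
        for track in tracks
    }
    # Zero or one distinct signature means nothing differs.
    return len(signatures) <= 1


same = tracks_allow_bitstream_switching([
    {"TrackNumber": 1, "CodecID": "V_VP8", "CodecPrivate": b""},
    {"TrackNumber": 1, "CodecID": "V_VP8", "CodecPrivate": b""},
])
different = tracks_allow_bitstream_switching([
    {"TrackNumber": 1, "CodecID": "V_VP8", "CodecPrivate": b""},
    {"TrackNumber": 2, "CodecID": "V_VP8", "CodecPrivate": b""},
])
```

The check says nothing about whether the codec supports transparent resolution switching; that property has to be known separately.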
2 Matroska/WebM On-Demand profile
2.1 General
- The conditions in 8.3.1 [DASH] shall apply.
- For WebM the On-Demand profile is identified by the URN “urn:mpeg:dash:profile:webm-on-demand:2012”.
2.2 Media Presentation Description Constraints
Most of the constraints are the same as those outlined in 8.3.2 [DASH].
- The rules for the MPD and the segments as defined in Media Presentation based on Matroska/WebM formats shall apply.
- Representations not inferred to have @profiles equal to the profile identifier as defined in On-Demand General may be ignored.
- MPD@type shall be “static”.
- The Subset element may be ignored.
- Neither the Period.SegmentList element nor the Period.SegmentTemplate element shall be present.
- If either the AdaptationSet.SegmentList or the AdaptationSet.SegmentTemplate element is present in an AdaptationSet element then this AdaptationSet element may be ignored.
- If either the Representation.SegmentList or the Representation.SegmentTemplate element is present in a Representation element then this Representation element may be ignored.
- If the Representation element does not contain a BaseURL element then this Representation element may be ignored.
- AdaptationSet elements with AdaptationSet@subsegmentAlignment not present, or set to 'false', may be ignored.
- Representation elements with an @subsegmentStartWithSAP value (either supplied directly or inherited from the containing AdaptationSet) that does not equal 1 may be ignored if the containing AdaptationSet contains more than one Representation.
- Elements using the @xlink attribute may be ignored from the MPD. The Representations conforming to this profile are those not accessed through an AdaptationSet that uses an @xlink.
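A client could apply the "may be ignored" rules above as a single predicate per Representation. The sketch below is an illustration under stated assumptions: the MPD is assumed to be already parsed into plain dictionaries, and every key name (`base_url`, `segment_list`, `segment_template`, `subsegment_alignment`, `subsegment_start_with_sap`, `representation_count`, `xlink`) is hypothetical rather than taken from [DASH].

```python
def may_ignore_representation(rep, adaptation_set):
    """Return True when the constraints above allow ignoring `rep`."""
    # SegmentList or SegmentTemplate at either level.
    for element in (adaptation_set, rep):
        if element.get("segment_list") or element.get("segment_template"):
            return True
    # Elements that use @xlink.
    if adaptation_set.get("xlink") or rep.get("xlink"):
        return True
    # A Representation without a BaseURL.
    if not rep.get("base_url"):
        return True
    # Subsegment alignment missing or set to 'false'.
    if adaptation_set.get("subsegment_alignment") is not True:
        return True
    # SAP value, supplied directly or inherited from the AdaptationSet.
    sap = rep.get(
        "subsegment_start_with_sap",
        adaptation_set.get("subsegment_start_with_sap"),
    )
    if sap != 1 and adaptation_set.get("representation_count", 1) > 1:
        return True
    return False


adaptation_set = {
    "subsegment_alignment": True,
    "subsegment_start_with_sap": 1,
    "representation_count": 2,
}
keep = not may_ignore_representation({"base_url": "video_500k.webm"}, adaptation_set)
```

Here `keep` is `True`: the Representation has a BaseURL, inherits a SAP value of 1, and sits in an aligned AdaptationSet.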
2.3 Segment Format Constraints
- Each Representation shall contain one Segment that complies with the Self-Initializing Indexed Media Segment.
3 Matroska/WebM Live profile
TODO(fgalligan)
[DASH] DASH specification
www.itscj.ipsj.or.jp/sc29/open/29view/29n12313t.doc
[MATROSKA] Matroska specification
http://www.matroska.org/technical/specs/index.html
[MIMETYPE] RFC mimetypes for codecs
http://www.ietf.org/rfc/rfc4281.txt
[WEBM] WebM specification
http://www.webmproject.org/code/specs/container/