Network Working Group                                        S. Pfeiffer
Internet-Draft                                                 C. Parker
Intended status: Informational                                   Annodex
Expires: May 4, 2008                                       November 2007


             The "skeleton" meta information track for Ogg
                     draft-pfeiffer-oggskeleton-00

Status of this Memo

This document is an Internet-Draft and is subject to all provisions
of Section 3 of RFC 3667. By submitting this Internet-Draft, each
author represents that any applicable patent or other IPR claims of
which he or she is aware have been or will be disclosed, and any of
which he or she become aware will be disclosed, in accordance with
RFC 3668.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

This Internet-Draft will expire on May 4, 2008.

Pfeiffer & Parker         Expires May 4, 2008                   [Page 1]

Internet-Draft                  SKELETON                   November 2007

Abstract

This specification defines "Skeleton", a logical bitstream for the
Ogg encapsulation format version 0 [Ogg]. Skeleton is a header-style
bitstream that describes the content of the other logical bitstreams
encapsulated inside an Ogg container. Its purpose is to remove
codec-specific information requirements from the multiplexing/
demultiplexing process. It provides default structure and semantic
information to describe multitrack physical Ogg bitstreams. There is
also a mechanism through which more information than the default can
be provided.

Please note that this document assumes that the reader understands
the Ogg encapsulation format version 0 [Ogg]. The specification of
Skeleton is not encumbered by patents.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [rfc2119].

Table of Contents

1. Features of Ogg and Skeleton
2. The Ogg skeleton logical bitstream
   2.1. The format of the skeleton ident header
   2.2. The format of the skeleton secondary headers
   2.3. Media mapping of skeleton into Ogg
3. Handling time in an Ogg format bitstream
   3.1. Conceptual overview
   3.2. Mapping a granule position to a time position
   3.3. Seeking into the bitstream
   3.4. Remultiplexing an Ogg bitstream using Skeleton
4. Security considerations
5. References
Appendix A. Acknowledgments
Authors' Addresses
Intellectual Property and Copyright Statements

1. Features of Ogg and Skeleton

Ogg is a container format for encapsulation of several tracks of
temporally interleaved bitstreams of time-continuous data. It
enables encapsulation of any type of time-continuous data stream as
long as it is streamable. Each track represents codec data for only
one type of time-continuous data stream. Ogg is designed to be used
both as a persistent file format and as a streaming format to
exchange temporally addressable bitstreams.

Skeleton adds to Ogg a means to describe the codec tracks contained
inside Ogg. It reasonably assumes that each logical bitstream has a
regular data sampling rate (called granulerate). For bitstreams with
a variable sampling rate, it assumes that a common multiple of the
sampling rates in use serves as the granulerate.

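As a small illustration of the common-multiple assumption, the sketch
below derives a granulerate for a track that switches between integer
sampling rates. The function name `common_granulerate` and the two
example rates are hypothetical and serve only as an illustration;
Skeleton does not prescribe how a multiplexer picks the value.

```python
import math
from functools import reduce

def common_granulerate(rates):
    """Return the least common multiple of integer sampling rates in Hz."""
    if not rates or any(r <= 0 for r in rates):
        raise ValueError("rates must be a non-empty list of positive integers")
    # lcm(a, b) = a * b / gcd(a, b), folded over all rates.
    return reduce(lambda a, b: a * b // math.gcd(a, b), rates)

# Hypothetical track that alternates between 44100 Hz and 48000 Hz.
granulerate = common_granulerate([44100, 48000])
print(granulerate)                                  # 7056000
# One sample period covers a whole number of granules at either rate.
print(granulerate // 44100, granulerate // 48000)   # 160 147
```

With such a granulerate every sample boundary falls on an integer
granule position, whichever of the rates is currently in use.
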
Codec tracks generally contain the following information:

o setup information for a codec

o content data

The setup information is inserted at the start of a data bitstream
before any content data. Skeleton pulls out the key information
about the codecs from their headers and puts it into a defined
location in a defined manner, such that no decoding of logical
bitstreams is required to find out about the tracks of content
encapsulated inside Ogg.

An Ogg physical bitstream with a Skeleton track has the following
mandatory order of Ogg pages:

1. skeleton bos page.

2. bos pages of the other logical bitstreams.

3. secondary header pages of all logical bitstreams, including
   fisbone.

4. skeleton eos page.

5. data and eos pages of logical bitstreams, excluding skeleton,
   multiplexed in a time-synchronous fashion.

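The five-step page order can be checked with a small state machine.
The following is a minimal sketch, assuming pages have already been
classified into (stream, kind) tuples; that representation, the stream
name "skeleton" and the function name `check_page_order` are
assumptions made for this example and are not part of the
specification.

```python
def check_page_order(pages):
    """Return None if `pages` follows the mandatory order, else an error string.

    `pages` is a list of (stream, kind) tuples. The stream name "skeleton"
    marks the Skeleton track; kind is "bos", "header", "eos" or "data".
    """
    state = "expect_skeleton_bos"
    for index, (stream, kind) in enumerate(pages):
        is_skeleton = stream == "skeleton"
        if state == "expect_skeleton_bos":
            if not (is_skeleton and kind == "bos"):
                return f"page {index}: the first page must be the skeleton bos page"
            state = "other_bos"                    # step 1 done
        elif state == "other_bos" and kind == "bos" and not is_skeleton:
            pass                                   # step 2: remaining bos pages
        elif state in ("other_bos", "headers") and kind == "header":
            state = "headers"                      # step 3: secondary headers
        elif state in ("other_bos", "headers") and is_skeleton and kind == "eos":
            state = "content"                      # step 4: skeleton eos page
        elif state == "content" and not is_skeleton and kind in ("data", "eos"):
            pass                                   # step 5: data and eos pages
        else:
            return f"page {index}: unexpected {stream} {kind} page"
    if state != "content":
        return "the skeleton eos page is missing"
    return None

pages = [("skeleton", "bos"), ("vorbis", "bos"),
         ("skeleton", "header"), ("vorbis", "header"),
         ("skeleton", "eos"), ("vorbis", "data"), ("vorbis", "eos")]
print(check_page_order(pages))   # None
```

The sketch only looks at page order. It does not verify that the data
pages in step 5 are interleaved in a time-synchronous fashion.
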
2. The Ogg skeleton logical bitstream

The purpose of Ogg skeleton is to provide codec-specific knowledge
that allows parsing, demultiplexing and remultiplexing of Ogg
bitstreams without having to decode them.

While the Ogg encapsulation format by itself is capable of
interleaving an unlimited number of time-continuous bitstreams, it is
not possible to identify the type of bitstreams (e.g. audio or video)
and their encoding format (e.g. Vorbis or Speex or Theora) without
decoding at least the bos page of the logical bitstreams. Also,
further general media type information such as the image dimensions
of a frame in a video bitstream or the language of a speech bitstream
may be provided in skeleton. Another limitation of Ogg is that each
logical bitstream defines its own mapping of granule_position to
time, which is therefore also given in the skeleton.

This section specifies the content of the "skeleton" logical
bitstream and how it is mapped into Ogg. Knowledge of the Ogg
bitstream format as specified in the Ogg RFC [Ogg] is presumed.
Please also refer to that document for descriptions of the terms used
in this document.

The skeleton bitstream has the ability to generically describe Ogg
bitstreams that consist of one or more time-continuous data
bitstreams and one or more time-instantaneous data bitstreams
concurrently interleaved (in Ogg terms: multiplexed). It does not
describe sequentially multiplexed Ogg bitstreams, but rather expects
that a sequentially multiplexed bitstream has its own skeleton
logical bitstream.

The skeleton logical bitstream provides the following functionality
on top of Ogg:

o allows for the identification of the codec format and the content
  type of encapsulated logical bitstreams without the need to decode
  that bitstream's headers or data.

o allows for extraction of a temporal interval of the Ogg physical
  bitstream while retaining the original start time offset of that
  interval.

o allows for attachment of a real-world wall-clock time and a date
  to the Ogg physical bitstream, thus e.g. retaining creation date/
  time or first broadcast date/time.

o allows for temporal offset operations into an Ogg physical
  bitstream without a need to decode any data.

o allows generally for handling of content without a need to decode
  it, such as is necessary in a caching Web proxy.

o allows for attachment of message header fields given as name-value
  pairs that contain some sort of protocol messages about the
  logical bitstream, e.g. the screen size for a video bitstream or
  the number of channels for an audio bitstream.

2.1. The format of the skeleton ident header

The skeleton logical bitstream starts with an ident header containing
information for the complete Ogg physical bitstream. The ident
header has the following format:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identifier 'fishead\0'                                        | 0-3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 4-7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version major                 | Version minor                 | 8-11
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Presentationtime numerator                                    | 12-15
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 16-19
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Presentationtime denominator                                  | 20-23
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 24-27
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Basetime numerator                                            | 28-31
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 32-35
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Basetime denominator                                          | 36-39
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 40-43
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| UTC                                                           | 44-47
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 48-51
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 52-55
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 56-59
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 60-63
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Fields with more than one Byte length are encoded LSB (least
significant Byte) first.

The fields in the skeleton ident header have the following meaning:

1. Identifier: an 8 Byte field that identifies this bitstream as a
   skeleton. It contains the magic numbers:

      0x66 'f'
      0x69 'i'
      0x73 's'
      0x68 'h'
      0x65 'e'
      0x61 'a'
      0x64 'd'
      0x00 '\0'

2. Version major: 2 Byte unsigned integer signifying the major
   version number of the skeleton bitstream. This document
   specifies the major version 3.

3. Version minor: 2 Byte unsigned integer signifying the minor
   version number of the skeleton bitstream. This document
   specifies the minor version 0.

4. Presentationtime numerator & denominator: 8 Byte signed integer
   each. Together they represent the time at which to start
   presenting the Ogg physical bitstream, given as a rational
   number. The denominator represents the temporal resolution at
   which the presentationtime is given. E.g. a numerator of 5 and a
   denominator of 1000 result in a presentationtime of 0.005 sec.
   This enables a very high temporal resolution without having to
   store floating point numbers. In a newly created physical
   bitstream presentationtime and basetime are the same. When
   remultiplexing a subpart of the stream, this number MUST be
   adapted to the requested start time offset of the newly created
   stream. Presentationtime must always be greater than or equal to
   zero.

5. Basetime numerator & denominator: 8 Byte signed integer each.
   Together they represent the basetime of the Ogg physical
   bitstream, given as a rational number like the presentationtime.
   This number is fixed once the physical bitstream is created and
   provides a mapping to time for the beginning of the physical
   bitstream when it starts with a granule position of 0.

6. UTC [ISO8601]: a 20 Byte string containing a UTC time in the form
   of YYYYMMDDTHHMMSS.sssZ. It associates a calendar date and a
   wall-clock time with the basetime. It is a sequence of 20 NUL
   Bytes if not in use, making this ident packet and thus the bos
   page of the skeleton bitstream of constant length.

Please note: The possible temporal resolution of the presentation-
and basetime is on the order of 2^-64. For example, the time formats
in use for media that are described in this document range from 1/24
to 1/60 for the different SMPTE formats [SMPTE]. This resolution is
enough for any one of these. It is also expected to accommodate any
future needs of time resolution for any other time format and time-
continuously sampled data.

Please note further: A denominator of 0 in either presentationtime or
basetime is regarded as a special value and sets the respective time
to 0, no matter what the value of the numerator.

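The 64 Byte layout of the ident header can be decoded with a fixed
little-endian structure. The following is a minimal sketch, not a
reference implementation: the names `parse_fishead` and
`rational_time` and the dictionary keys are hypothetical, and the
sketch assumes that the UTC field, when in use, holds plain ASCII in
the form YYYYMMDDTHHMMSS.sssZ.

```python
import struct
from datetime import datetime, timezone
from fractions import Fraction

# "<" selects LSB-first encoding without padding: 8+2+2+8+8+8+8+20 = 64 Bytes.
FISHEAD = struct.Struct("<8sHHqqqq20s")

def rational_time(numerator, denominator):
    """A denominator of 0 is the special value that sets the time to 0."""
    return Fraction(0) if denominator == 0 else Fraction(numerator, denominator)

def parse_fishead(packet):
    """Decode a skeleton ident header packet into a dictionary."""
    if len(packet) < FISHEAD.size:
        raise ValueError("fishead packet too short")
    (ident, major, minor, pt_num, pt_den,
     bt_num, bt_den, utc_raw) = FISHEAD.unpack_from(packet)
    if ident != b"fishead\x00":
        raise ValueError("not a skeleton ident header")
    utc = None
    if utc_raw != b"\x00" * 20:                 # 20 NUL Bytes: field not in use
        utc = datetime.strptime(utc_raw.decode("ascii"), "%Y%m%dT%H%M%S.%fZ")
        utc = utc.replace(tzinfo=timezone.utc)
    return {
        "version": (major, minor),
        "presentationtime": rational_time(pt_num, pt_den),
        "basetime": rational_time(bt_num, bt_den),
        "utc": utc,
    }

# Example header: version 3.0, presentationtime 5/1000, basetime 0/1000, no UTC.
packet = FISHEAD.pack(b"fishead\x00", 3, 0, 5, 1000, 0, 1000, b"\x00" * 20)
info = parse_fishead(packet)
print(info["version"], float(info["presentationtime"]))   # (3, 0) 0.005
```

Keeping the times as exact fractions mirrors the intent of storing a
numerator and a denominator instead of a floating point number.
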
2.2. The format of the skeleton secondary headers

The skeleton secondary headers are a sequence of packets that each
contain information about one of the other time-continuous or time-
instantaneous logical bitstreams contained within the Ogg physical
bitstream. A skeleton secondary header packet has the following
format:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identifier 'fisbone\0'                                        | 0-3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 4-7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Offset to message header fields                               | 8-11
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Serial number                                                 | 12-15
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Number of header packets                                      | 16-19
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Granulerate numerator                                         | 20-23
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 24-27
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Granulerate denominator                                       | 28-31
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 32-35
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Startgranule                                                  | 36-39
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 40-43
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Preroll                                                       | 44-47
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Granuleshift  | Padding/future use                            | 48-51
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Message header fields ...                                     | 52-
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Fields with more than one Byte length are encoded LSB (least
significant Byte) first.

The fields in a skeleton secondary header packet have the following
meaning:

1. Identifier: an 8 Byte field that identifies this packet as a
   skeleton secondary header for identifying other logical
   bitstreams. It contains the magic numbers:

      0x66 'f'
      0x69 'i'
      0x73 's'
      0x62 'b'
      0x6f 'o'
      0x6e 'n'
      0x65 'e'
      0x00 '\0'

2. Offset to message header fields: 4 Byte unsigned integer that
   contains the number of Bytes used in this packet before the
   message header fields. For the version of the skeleton
   bitstream described in this document this number is fixed to 44.
   This field accommodates future changes to the skeleton bitstream
   by allowing message header fields to be parsed even if more
   fields get inserted before them.

3. Serial number: 4 Byte unsigned integer containing the
   bitstream_serial_number of the Ogg logical bitstream described
   by this skeleton secondary header packet and thus connecting it
   to the logical bitstream.

4. Number of header packets: a 4 Byte unsigned integer that
   contains the number of header packets of that particular logical
   bitstream, consisting of the bos page and the secondary header
   pages.

5. Granulerate numerator & denominator: 8 Byte signed integer each.
   They represent the temporal resolution of the logical bitstream
   in Hz given as a rational number in the same way as the basetime
   attribute above.

6. Startgranule: 8 Byte signed integer that represents the granule
   number with which this logical bitstream starts, which is
   originally 0, but will be a positive offset when only a subpart
   of the stream is requested.

7. Preroll: 4 Byte unsigned integer that contains the number of
   packets to pre-roll in order to decode a current packet
   correctly. This is for example the case with Ogg Vorbis, which
   requires a pre-roll of 2 packets.

8. Granuleshift: a 1 Byte unsigned integer describing whether to
   partition the granule_position into two for that logical
   bitstream, and how many of the lower bits to use for the
   partitioning. The upper bits signify a time-continuous granule
   position for an independently decodable and presentable data
   granule. The lower bits are generally used to specify the
   relative offset of dependent packets, such as predicted frames
   of a video. Hence these can be addressed, but they cannot be
   decoded without tracing back to the last fully decodable data
   granule. This is the case with Ogg Theora; the general
   procedure is given in section 3.2.

9. Padding/future use: 3 Bytes of padding that may be used for
   future requirements and are required to be zero in this revision.

10. Message header fields: header fields, following the generic
    Internet Message Format defined in RFC 2822 [Headers]. Each
    header field consists of a name followed by a colon (":") and
    the field value. Field names are case-insensitive. The field
    value MAY be preceded by any amount of LWS, though a single SP
    is preferred. Header fields can be extended over multiple lines
    by preceding each extra line with at least one SP or HT.

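The fixed part of a fisbone packet can be decoded in the same way as
the ident header, and the granuleshift can then be applied to a
granule_position. The sketch below is illustrative only. The names
`parse_fisbone` and `split_granule` are hypothetical. It also assumes
that the offset value 44 is counted from the start of the offset field
itself (Byte 8), which is the reading that lands on Byte 52 of the
diagram; the text above does not state the reference point explicitly.

```python
import struct

# LSB-first, no padding: 8+4+4+4+8+8+8+4+1 Bytes plus 3 padding Bytes = 52.
FISBONE = struct.Struct("<8sIIIqqqIB3x")

def parse_fisbone(packet):
    """Decode the fixed fields of a skeleton secondary header packet."""
    if len(packet) < FISBONE.size:
        raise ValueError("fisbone packet too short")
    (ident, offset, serialno, num_headers, gr_num, gr_den,
     startgranule, preroll, granuleshift) = FISBONE.unpack_from(packet)
    if ident != b"fisbone\x00":
        raise ValueError("not a skeleton secondary header")
    # Assumption: the offset is relative to the offset field at Byte 8.
    fields_start = 8 + offset
    return {
        "serialno": serialno,
        "num_headers": num_headers,
        "granulerate": (gr_num, gr_den),
        "startgranule": startgranule,
        "preroll": preroll,
        "granuleshift": granuleshift,
        "message_header_fields": packet[fields_start:],
    }

def split_granule(granule_position, granuleshift):
    """Split a granule_position into its upper and lower bit fields."""
    upper = granule_position >> granuleshift
    lower = granule_position & ((1 << granuleshift) - 1)
    return upper, lower

# Example values for an Ogg Vorbis track; the serial number is made up.
packet = FISBONE.pack(b"fisbone\x00", 44, 1234, 3, 44100, 1, 0, 2, 0)
packet += b"Content-type: audio/x-vorbis\r\n"
bone = parse_fisbone(packet)
print(bone["preroll"], bone["message_header_fields"])
print(split_granule((10 << 6) | 3, 6))   # (10, 3)
```

With a granuleshift of 0 the lower field is always 0 and the whole
granule_position is the time-continuous part.
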
There is one mandatory message header field for all of the logical
bitstreams: the "Content-type" header field. For an application that
is parsing the Ogg bitstream, this field contains the MIME type and
the character encoding of the data in the logical bitstream. E.g.
for a bitstream containing Ogg Vorbis data the value is
"Content-type: audio/x-vorbis". The Content-type message header
field MUST come first among the message header fields such that it
can be found at a fixed location in the skeleton fisbone packet.

As per RFC 2277 [I18N], message header fields are considered protocol
data, i.e. they are not expected to contain human-readable text, and
they MUST be entirely encoded in UTF-8. In addition, the mandatory
header fields MUST be encoded in US-ASCII and it is recommended to
also use US-ASCII code points as much as possible for the optional
header fields.

User-defined optional message header fields MUST follow the naming
standard given in RFC 2822 [Headers].

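The message header rules can be sketched as a small parser. This is
an illustration under stated assumptions rather than a complete
RFC 2822 implementation: the function name `parse_message_headers` and
the field name "X-Example" are hypothetical, lines are assumed to end
in CRLF (a bare LF is tolerated), and a folded line is joined to the
previous value with a single space.

```python
import re

def parse_message_headers(raw):
    """Parse fisbone message header fields into (name, value) pairs."""
    text = raw.decode("utf-8")              # fields MUST be encoded in UTF-8
    fields = []
    for line in re.split(r"\r?\n", text):
        if not line:
            continue
        if line[0] in " \t" and fields:     # continuation line: SP or HT first
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
            continue
        name, colon, value = line.partition(":")
        if not colon:
            raise ValueError(f"malformed header field: {line!r}")
        # Field names are case-insensitive, so normalise them to lower case.
        fields.append((name.strip().lower(), value.strip()))
    if not fields or fields[0][0] != "content-type":
        raise ValueError("Content-type must be the first message header field")
    return fields

raw = b"Content-type: audio/x-vorbis\r\nX-Example: one\r\n two\r\n"
print(parse_message_headers(raw))
# [('content-type', 'audio/x-vorbis'), ('x-example', 'one two')]
```

Returning an ordered list instead of a dictionary keeps the position
of Content-type visible, which is the property the text requires.
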
2.3. Media mapping of skeleton into Ogg

The media mapping for skeleton into Ogg is as follows:

o The skeleton ident (fishead) header is mapped into the skeleton
  bos page.

o The secondary header pages of a skeleton logical bitstream consist
  of the fisbone header packets that each describe one particular
  logical data bitstream within the Ogg physical bitstream.

o There are no content pages or data packets. As the skeleton eos
  page is included before the first data page of any logical
  bitstream, there actually cannot be any content data packets.

o The skeleton eos page MUST contain one packet of length zero.

When using a skeleton logical bitstream in Ogg, a further restriction
on the order in which Ogg pages appear is introduced to allow for
easier identification:

1. The skeleton bos page is the very first bos page. This allows
   its differentiation from other Ogg bitstreams that don't contain
   a skeleton logical bitstream.

2. The bos pages of the other logical bitstreams come next, as
   required by the Ogg bitstream format.

3. The secondary header pages of all the logical bitstreams in the
   Ogg physical bitstream come next, as is also required by Ogg.
   The skeleton secondary header pages are also included here.

4. Before any data pages of any of the logical bitstreams appear in
   the Ogg physical bitstream, the skeleton eos page MUST end the
   skeleton logical bitstream. This is necessary to end the control
   section of the bitstream. If an Ogg stream parser reaches the
   skeleton eos page, it knows that it has received all the bos and
   secondary header pages and can start setting up its decoding or
   parsing environment.

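Because the skeleton bos page is the very first page, a parser can
tell from the first few Bytes whether a physical bitstream carries
Skeleton. The following sketch relies on the 27 Byte page header
layout of the Ogg format [Ogg]. The function name
`starts_with_skeleton` is hypothetical, and the sketch deliberately
skips the page checksum, which a real parser would verify.

```python
import struct

# Fixed part of an Ogg page header: capture pattern, version, header type,
# granule position, serial number, sequence number, checksum, segment count.
OGG_PAGE_HEADER = struct.Struct("<4sBBqIIIB")   # 27 Bytes
BOS_FLAG = 0x02                                 # header type bit for a bos page

def starts_with_skeleton(data):
    """True if the first page in `data` is a bos page holding a fishead packet."""
    if len(data) < OGG_PAGE_HEADER.size:
        return False
    (capture, version, flags, granulepos,
     serialno, seqno, checksum, nsegs) = OGG_PAGE_HEADER.unpack_from(data)
    if capture != b"OggS" or not flags & BOS_FLAG:
        return False
    # The page body follows the segment table of `nsegs` lacing values.
    body_start = OGG_PAGE_HEADER.size + nsegs
    return data[body_start:body_start + 8] == b"fishead\x00"

# Hand-built first page: one 64 Byte packet, checksum left at 0 for brevity.
ident = b"fishead\x00" + bytes(56)
page = OGG_PAGE_HEADER.pack(b"OggS", 0, BOS_FLAG, 0, 1, 0, 0, 1) + bytes([64]) + ident
print(starts_with_skeleton(page))   # True
```

A stream without Skeleton fails the identifier comparison, since its
first bos page starts with the ident header of another codec.
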
3. Handling time in an Ogg format bitstream

With time-continuous data inside Ogg, one needs to handle data at
four different levels:

o at the Byte level, upon seeking.

o at the packet level, upon encapsulating.

o at the granule level, upon recomposing.

o at the time level, upon displaying and addressing.

This section explains how they all fit together.

3.1. Conceptual overview

Ogg bitstreams inherently represent one timeline only, where the
different logical bitstreams can be thought of as content tracks on
that timeline. All of these tracks relate to the same timeline which
starts at a certain time point and ends when the last bitstream ends.

An example bitstream can be seen in the following figure. It
consists of an Ogg bitstream that contains 4 media bitstreams. The
picture is a conceptual representation of the time intervals covered
by the different logical bitstreams and the Ogg pages used to
encapsulate the data. In the flat representation these are
multiplexed such that the data packets of each of these bitstreams
occur at the correct time.

| 732 |
t_uri |
|---|
| 733 |
| |
|---|
| 734 |
t_0 v t_n |
|---|
| 735 |
|------------------------------------------------------------------->| |
|---|
| 736 |
---------------------------------------------- |
|---|
| 737 |
| | | | | | | | | | |//| | | | | |
|---|
| 738 |
---------------------------------------------- |
|---|
| 739 |
audio bitstream 1 |
|---|
| 740 |
------------------------------------------------------------- |
|---|
| 741 |
| | | |/////| | | | | | | |
|---|
| 742 |
------------------------------------------------------------- |
|---|
| 743 |
video bitstream 1 |
|---|
| 744 |
---------------------------------------------------- |
|---|
| 745 |
| | | | |//| | | | | | | | | | | | | |
|---|
| 746 |
---------------------------------------------------- |
|---|
| 747 |
audio bitstream 2 |
|---|
| 748 |
------------------------------- |
|---|
| 749 |
| |/////| | | | |
|---|
| 750 |
------------------------------- |
|---|
| 751 |
video bitstream 2 |
|---|
| 752 |
|
|---|
| 753 |
The time point at which an Ogg bitstream starts (t_0 in the above |
|---|
| 754 |
diagram) is called the "basetime" and represents the time in seconds |
|---|
| 755 |
associated with the granule position of 0 on all logical bitstreams. |
|---|
| 756 |
Typically, a newly created Ogg file starts all its logical bitstreams |
|---|
| 757 |
at granule position 0, and a typical extract of an Ogg bitstream, |
|---|
| 758 |
such as the one starting at t_uri in the image above, starts each of |
|---|
| 759 |
its logical bitstreams at a different granule position. These |
|---|
| 760 |
granule positions are stored in the "startgranule" field of the |
|---|
| 761 |
skeleton secondary header packets. |
|---|
| 762 |
|
|---|
| 763 |
The "basetime" of an Ogg bitstream may be 0, but it can also be any |
|---|
| 764 |
positive time. For example, in professional video production, the |
|---|
| 765 |
first frame of video of a program normally refers to a SMPTE basetime |
|---|
| 766 |
[SMPTE] of 01:00:00:00, not 00:00:00:00 (see also the temporal URI |
|---|
| 767 |
addressing [timedURI] specification). Associating such a practice to |
|---|
| 768 |
a digital video resource requires a way to store that basetime with |
|---|
| 769 |
the resource and to interpret it correctly when addressing offsets |
|---|
| 770 |
such as t_uri. Skeleton provides such a mapping through the basetime |
|---|
| 771 |
field in the skeleton ident header. |
|---|
| 772 |
|
|---|
| 773 |
Also associated with the basetime is a calendar date [ISO8601] and |
|---|
| 774 |
wall-clock time (a "UTC base") which represent a real-world time |
|---|
| 775 |
giving some meaningful calendar date association to the content such |
|---|
| 776 |
as the creation time or the first presentation time. The UTC base is |
|---|
| 777 |
specified in the UTC field of the skeleton ident header. |
|---|
| 778 |
|
|---|
| 788 |
3.2. Mapping a granule position to a time position |
|---|
| 789 |
|
|---|
| 790 |
Each of the encapsulated data bitstreams has its own temporal |
|---|
| 791 |
resolution at which it provides data to cover the given timeline. |
|---|
| 792 |
This temporal resolution is usually given through the sampling rate |
|---|
| 793 |
of the particular bitstream. For example, a raw audio bitstream at |
|---|
| 794 |
CD quality is sampled with a sampling rate of 44100 Hz. A video |
|---|
| 795 |
bitstream may be sampled with a frame rate of 25 frames per second. |
|---|
| 796 |
|
|---|
| 797 |
This temporal resolution is called the "granulerate". A granule is a |
|---|
| 798 |
data element that is based on a regular data rate specific to the |
|---|
| 799 |
content type, such as the frame rate for video or the sampling rate |
|---|
| 800 |
for audio. It even exists for bitstreams that are not sampled at a |
|---|
| 801 |
regular rate - then it is the highest resolution of any of the used |
|---|
| 802 |
sampling rates. The granulerate is specified in the skeleton |
|---|
| 803 |
secondary header packets for each logical bitstream. |
|---|
| 804 |
|
|---|
| 805 |
Each of the bitstreams inserts data into the Ogg bitstream through |
|---|
| 806 |
packets which have an associated temporal duration based on the |
|---|
| 807 |
encoder packaging. Packets are packaged into Ogg pages, which have a |
|---|
| 808 |
granule position associated with them. Not taking the special case |
|---|
| 809 |
of a granuleshift into account, the granule position specifies the |
|---|
| 810 |
number of granules that have been encapsulated since the implicit |
|---|
| 811 |
start of the original bitstream until and including the given Ogg |
|---|
| 812 |
page. |
|---|
| 813 |
|
|---|
| 814 |
The granule position together with the granulerate and granuleshift |
|---|
| 815 |
information of the skeleton secondary header packets for the |
|---|
| 816 |
particular logical bitstream are used for the calculation of the time |
|---|
| 817 |
position for which a data packet of the logical bitstream completes |
|---|
| 818 |
data. A granule position of -1 indicates a special case and MUST NOT |
|---|
| 819 |
be used for calculation of a mapping to time. |
|---|
| 820 |
|
|---|
| 821 |
In principle, the granule position of an Ogg page divided by the |
|---|
| 822 |
granulerate of this page's logical bitstream provides the time |
|---|
| 823 |
position that is reached in that bitstream after decoding all data |
|---|
| 824 |
packets finished on this page. However, the granule_position field |
|---|
| 825 |
in an Ogg page allows for a more finely-grained description of the |
|---|
| 826 |
temporal position. The following image explains the composition of |
|---|
| 827 |
the granule_position field in an Ogg page: |
|---|
| 828 |
|
|---|
| 829 |
granule_position |
|---|
| 830 |
------------------------------------------------ |
|---|
| 831 |
| keyindex | keyoffset | |
|---|
| 832 |
------------------------------------------------ |
|---|
| 833 |
|
|---|
| 834 |
The granuleshift field of the skeleton secondary header packets |
|---|
| 835 |
describes how many of the granule_position's 64 bits are being used |
|---|
| 844 |
for the keyoffset. The keyoffset part of the granule_position is |
|---|
| 845 |
commonly used when the logical bitstream consists of packets that can |
|---|
| 846 |
only be fully decoded when referring back to a previous packet. For |
|---|
| 847 |
example, video streams often consist of inter and intra coded frames, |
|---|
| 848 |
where the intra frames are fully decodable and the inter frames are |
|---|
| 849 |
intermediate frames that require backtracking to the last intra frame |
|---|
| 850 |
for accurate decoding. Another example is a logical bitstream that |
|---|
| 851 |
is mapped as instantaneous information (i.e. its granule position |
|---|
| 852 |
represents the start time and the end time of the packet data), but |
|---|
| 853 |
actually has a duration associated to it, which is provided through a |
|---|
| 854 |
subsequent packet. CMML is such an example. The keyindex part of |
|---|
| 855 |
the granule_position is then used to provide the temporal position of |
|---|
| 856 |
the reference packet and the keyoffset part provides a counter for |
|---|
| 857 |
the data in between. |
|---|
| 858 |
|
|---|
| 859 |
The calculation of the temporal position of an Ogg page using |
|---|
| 860 |
Skeleton is thus specified through the following formula: |
|---|
| 861 |
|
|---|
| 862 |
t_page = basetime + ((keyindex + keyoffset) / granulerate) |
|---|
| 863 |
|
|---|
| 864 |
The basetime provides the time offset used at the beginning of the |
|---|
| 865 |
logical bitstream for the first data packet and thus MUST be added |
|---|
| 866 |
for a correct calculation of the temporal position. |
|---|
| 867 |
|
|---|
| 868 |
As an example, consider an audio bitstream that has a granulerate of |
|---|
| 869 |
44100 (i.e. 44100 samples per 1 sec), a granuleshift of 0, and starts |
|---|
| 870 |
at 4 sec. When reaching a granule_position of 88200, this maps to a |
|---|
| 871 |
time position of 6 seconds: |
|---|
| 872 |
|
|---|
| 873 |
t_page = 4 + ((88200 + 0) / 44100) = 6 |
|---|
| 874 |
|
|---|
| 875 |
This signifies that the bitstream has reached the 2 sec mark of the |
|---|
| 876 |
audio bitstream after the end of decoding this page's packets, but |
|---|
| 877 |
maps to 6 seconds because of the basetime. |
|---|
| 878 |
|
|---|
| 879 |
As another example, consider a video bitstream that has a granulerate |
|---|
| 880 |
of 25 (i.e. 25 frames per 1 second), a granuleshift of 3 (because it |
|---|
| 881 |
encodes - say - 7 partial frames between each fully encoded frame), |
|---|
| 882 |
and starts at 0 sec. When reaching a granule_position of 501, i.e. a |
|---|
| 883 |
keyindex of 62 and a keyoffset of 5, this maps to a fully decodable |
|---|
| 884 |
time position of 2.68 seconds: |
|---|
| 885 |
|
|---|
| 886 |
t_page = 0 + ((62 + 5) / 25) = 2.68 sec |
|---|
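The formula and the two examples above can be sketched in Python as follows. This is a minimal illustration, not part of the specification: the function names are hypothetical, and the sketch assumes that the keyoffset occupies the lowest "granuleshift" bits of the granule_position, as the field layout shown earlier suggests.

```python
from fractions import Fraction

def split_granule_position(granule_position, granuleshift):
    """Split a granule_position into (keyindex, keyoffset).

    Assumption: the lowest `granuleshift` bits hold the keyoffset and
    the remaining upper bits hold the keyindex.
    """
    keyoffset = granule_position & ((1 << granuleshift) - 1)
    keyindex = granule_position >> granuleshift
    return keyindex, keyoffset

def pack_granule_position(keyindex, keyoffset, granuleshift):
    """Inverse of split_granule_position (illustrative helper)."""
    return (keyindex << granuleshift) | keyoffset

def page_time(granule_position, granulerate, granuleshift=0, basetime=0):
    """t_page = basetime + ((keyindex + keyoffset) / granulerate)."""
    if granule_position == -1:
        # -1 is the special case that must not be mapped to a time.
        raise ValueError("granule position -1 cannot be mapped to a time")
    keyindex, keyoffset = split_granule_position(granule_position, granuleshift)
    return basetime + Fraction(keyindex + keyoffset) / Fraction(granulerate)

# Audio example: granulerate 44100, granuleshift 0, basetime 4 sec.
audio_time = page_time(88200, 44100, granuleshift=0, basetime=4)

# Video example: granulerate 25, granuleshift 3, keyindex 62, keyoffset 5.
video_granule = pack_granule_position(62, 5, 3)
video_time = page_time(video_granule, 25, granuleshift=3, basetime=0)
```

Exact fractions are used so that rates such as 30/1.001 do not accumulate rounding errors; a real implementation may well choose integer arithmetic instead.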
| 887 |
|
|---|
| 888 |
The granulerate of a time-instantaneous bitstream such as a CMML |
|---|
| 889 |
bitstream can be chosen arbitrarily by the bitstream multiplexer. |
|---|
| 890 |
By default, a granulerate of 1000 is used, which is the resolution |
|---|
| 891 |
of npt. The resolution of all the time schemes is given as: |
|---|
| 892 |
|
|---|
| 900 |
o npt: 1000 (milliseconds) |
|---|
| 901 |
|
|---|
| 902 |
o smpte-24: 24 (24 fps) |
|---|
| 903 |
|
|---|
| 904 |
o smpte-24-drop: 24/1.001 = 23.976 (approx. as per SMPTE) |
|---|
| 905 |
|
|---|
| 906 |
o smpte-25: 25 |
|---|
| 907 |
|
|---|
| 908 |
o smpte-30: 30 |
|---|
| 909 |
|
|---|
| 910 |
o smpte-30-drop: 30/1.001 = 29.970 (approx. as per SMPTE) |
|---|
| 911 |
|
|---|
| 912 |
o smpte-50: 50 |
|---|
| 913 |
|
|---|
| 914 |
o smpte-60: 60 |
|---|
| 915 |
|
|---|
| 916 |
o smpte-60-drop: 60/1.001 = 59.940 (approx. as per SMPTE) |
|---|
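Because the drop-frame resolutions listed above are not integers, an implementation may prefer to keep them as exact ratios rather than rounded decimals. The sketch below assumes that a granulerate is carried as a numerator/denominator pair; the table name and the helper function are illustrative only.

```python
from fractions import Fraction

# Hypothetical lookup table: time scheme name -> granules per second.
TIME_SCHEME_RESOLUTION = {
    "npt": Fraction(1000),
    "smpte-24": Fraction(24),
    "smpte-24-drop": Fraction(24000, 1001),  # 24/1.001
    "smpte-25": Fraction(25),
    "smpte-30": Fraction(30),
    "smpte-30-drop": Fraction(30000, 1001),  # 30/1.001
    "smpte-50": Fraction(50),
    "smpte-60": Fraction(60),
    "smpte-60-drop": Fraction(60000, 1001),  # 60/1.001
}

def granulerate_pair(scheme):
    """Return the resolution of a time scheme as (numerator, denominator)."""
    rate = TIME_SCHEME_RESOLUTION[scheme]
    return rate.numerator, rate.denominator
```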
| 917 |
|
|---|
| 918 |
The granule position of the page finishing data of a time- |
|---|
| 919 |
instantaneous bitstream packet MUST signify the start time of that |
|---|
| 920 |
packet. For example, a CMML bitstream with a granulerate of 1000, a |
|---|
| 921 |
basetime of 0, and a clip that lasts from npt=12.020 till npt=15.0 |
|---|
| 922 |
will get a granule_position of 12020. In contrast, the |
|---|
| 923 |
granule_position of the page finishing data of e.g. an audio |
|---|
| 924 |
bitstream with granulerate 44100, basetime 0 and containing data from |
|---|
| 925 |
npt=12.020 to npt=15.0 will be 661500. |
|---|
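The two granule positions in this example can be reproduced with a short calculation. The helper below is a hypothetical sketch that assumes a granuleshift of 0 and truncates to a whole granule; times are passed as decimal strings to avoid floating point rounding.

```python
from fractions import Fraction

def granule_at(time, granulerate, basetime=0):
    """Number of granules between basetime and `time` (both in seconds)."""
    elapsed = Fraction(time) - Fraction(basetime)
    return int(elapsed * Fraction(granulerate))

clip_start, clip_end = "12.020", "15.0"

# Time-instantaneous bitstream (CMML): the page carries the START time.
cmml_granule = granule_at(clip_start, 1000)

# Time-continuous bitstream (audio): the page carries the END of its data.
audio_granule = granule_at(clip_end, 44100)
```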
| 926 |
|
|---|
| 927 |
A note about field overflows: an overflow of the granule position |
|---|
| 928 |
field can destroy the temporal integrity of the Ogg physical |
|---|
| 929 |
bitstream. In this case, a multiplexer MUST end the Ogg physical |
|---|
| 930 |
bitstream and restart a new one resetting the counter to 0 and |
|---|
| 931 |
adjusting the basetime appropriately. This is also called sequential |
|---|
| 932 |
multiplexing in Ogg. The same measure MUST be taken in case of an |
|---|
| 933 |
overflow of the page_sequence_number on one of the logical |
|---|
| 934 |
bitstreams. |
|---|
| 935 |
|
|---|
| 936 |
3.3. Seeking into the bitstream |
|---|
| 937 |
|
|---|
| 938 |
Seeking to a time offset inside an Ogg logical bitstream is a |
|---|
| 939 |
fundamental activity frequently performed on media data. Time inside |
|---|
| 940 |
an Ogg with a Skeleton track is specified as a temporal offset from |
|---|
| 941 |
the "beginning" of the stream, making use of the basetime field. |
|---|
| 942 |
Time offsets can also be specified as calendar dates and times. The |
|---|
| 943 |
UTC base is then used as a basis for offsetting. |
|---|
| 944 |
|
|---|
| 945 |
The basetime makes it possible to correctly map a temporal offset point such as |
|---|
| 946 |
a temporal URI to a Byte position in the stream. In the above figure |
|---|
| 947 |
take t_uri=npt:14.0 as the temporal offset addressed on a stream with |
|---|
| 956 |
t_0=npt:5.0 as the basetime - this requires a stream offset of |
|---|
| 957 |
only 9 sec to the appropriate granule position in each of the |
|---|
| 958 |
bitstreams, in the figure marked through patterned pages. |
|---|
| 959 |
|
|---|
| 960 |
The seeking action is performed on the interleaved bitstream, in |
|---|
| 961 |
which the data packets occur in a temporally consecutive order based |
|---|
| 962 |
on the time at which their data ends. These times are represented in |
|---|
| 963 |
the granule positions of the Ogg pages, which are only allowed to |
|---|
| 964 |
monotonically increase within one logical bitstream. This implies |
|---|
| 965 |
that when having found an Ogg page with a granule position that maps |
|---|
| 966 |
to a given seek time (i.e. covers the time or ends at it), the seek |
|---|
| 967 |
has found the right location. This applies over all logical |
|---|
| 968 |
bitstreams. In the above example, this means that the Byte position |
|---|
| 969 |
of the first occurring page of the patterned pages has been found. |
|---|
| 970 |
|
|---|
| 971 |
There is a complication to the seeking: some logical bitstreams have |
|---|
| 972 |
backwards dependencies in their data packets and these have to be |
|---|
| 973 |
taken into account for seeking. For example, a logical bitstream may |
|---|
| 974 |
require several of its previous packets to allow a correct and |
|---|
| 975 |
complete decoding of the actual packet that occurs at the seektime. |
|---|
| 976 |
This is the case for Theora, which requires going back to the previous |
|---|
| 977 |
keyframe when decoding from a time offset. It is also the case for |
|---|
| 978 |
Vorbis which requires the previous 2 packets for accurate setup of |
|---|
| 979 |
the frequency transform - Speex needs approximately 2 packets for |
|---|
| 980 |
similar reasons. Even instantaneous bitstreams such as CMML may |
|---|
| 981 |
require going back to a previous packet to recover the last state |
|---|
| 982 |
information - the currently active clip in the case of CMML. |
|---|
| 983 |
|
|---|
| 984 |
Therefore, once seeking has located the correct Byte position that |
|---|
| 985 |
refers to the given temporal offset, it MUST seek back. For logical |
|---|
| 986 |
bitstreams that have a non-zero "granuleshift" in the skeleton, it |
|---|
| 987 |
MUST seek back to the Ogg page that has a "keyindex" granule |
|---|
| 988 |
position. For logical bitstreams that have a non-zero "preroll" in |
|---|
| 989 |
the skeleton, it MUST seek back that many packets. The earliest Byte |
|---|
| 990 |
position that satisfies all these requirements is the correct seek |
|---|
| 991 |
position. |
|---|
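The seek-back rules above can be illustrated with a simplified sketch. It is not a definitive implementation and rests on several assumptions that the specification does not make: every Ogg page carries exactly one packet (so preroll can be counted in pages), a page "that has a keyindex granule position" is taken to be one whose keyoffset is 0, and the Page and Stream records as well as the function names are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass
class Stream:
    granulerate: Fraction
    granuleshift: int = 0
    preroll: int = 0

@dataclass
class Page:
    offset: int      # byte position of the page in the physical bitstream
    serial: int      # serial number of the logical bitstream
    granulepos: int  # granule_position of the page

def page_time(page, stream, basetime):
    mask = (1 << stream.granuleshift) - 1
    granules = (page.granulepos >> stream.granuleshift) + (page.granulepos & mask)
    return basetime + Fraction(granules) / stream.granulerate

def seek_position(pages, streams, basetime, seek_time):
    """Return the earliest byte offset from which decoding has to start."""
    candidates = []
    for serial, stream in streams.items():
        own = [p for p in pages if p.serial == serial]
        # First page of this stream that covers or ends at the seek time.
        hit = next((i for i, p in enumerate(own)
                    if page_time(p, stream, basetime) >= seek_time), None)
        if hit is None:
            continue  # this stream ends before the seek time
        # Non-zero granuleshift: back up to the page whose keyoffset is 0.
        mask = (1 << stream.granuleshift) - 1
        while hit > 0 and own[hit].granulepos & mask:
            hit -= 1
        # Non-zero preroll: back up by that many packets (here: pages).
        hit = max(0, hit - stream.preroll)
        candidates.append(own[hit].offset)
    return min(candidates) if candidates else None

# Video: 25 fps, granuleshift 3. Audio: 44100 Hz, preroll of 2 packets.
streams = {1: Stream(Fraction(25), granuleshift=3),
           2: Stream(Fraction(44100), preroll=2)}
pages = [Page(0, 2, 8820), Page(100, 2, 13230), Page(200, 1, 64),
         Page(300, 1, 65), Page(400, 1, 66), Page(500, 2, 17640),
         Page(600, 1, 67), Page(700, 1, 68), Page(800, 2, 22050),
         Page(900, 2, 26460)]
position = seek_position(pages, streams, 0, Fraction(12, 25))  # seek to 0.48 sec
```

In this example the video stream alone would allow decoding to start at byte 200 (the page with keyoffset 0), but the audio preroll pushes the start back to byte 100, which is therefore the value of position.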
| 992 |
|
|---|
| 993 |
A player that presents from an offset MUST take into account that the |
|---|
| 994 |
bitstream may contain some packets that are only there to allow |
|---|
| 995 |
accurate decoding of the seek time. When the backwards dependencies |
|---|
| 996 |
were resolved for a specific logical bitstream, several non-relevant |
|---|
| 997 |
Ogg pages may also have ended up in between. These have |
|---|
| 998 |
to be skipped by a player. The time that a player MUST start |
|---|
| 999 |
presenting from is given in the "presentationtime" in the skeleton |
|---|
| 1000 |
ident header. |
|---|
| 1001 |
|
|---|
| 1012 |
3.4. Remultiplexing an Ogg bitstream using Skeleton |
|---|
| 1013 |
|
|---|
| 1014 |
Ogg with a Skeleton track allows for the creation of mashups of a |
|---|
| 1015 |
file without actual decoding and re-encoding. A mashup in the sense |
|---|
| 1016 |
used here is when a subpart of an Ogg physical bitstream is required, |
|---|
| 1017 |
such as a temporal sub-interval from the whole file. Skeleton allows |
|---|
| 1018 |
the creation of the mashup bitstream through recomposition and |
|---|
| 1019 |
remultiplexing. There are several aims for performing the |
|---|
| 1020 |
remultiplexing with as little effort and therefore as little delay as |
|---|
| 1021 |
possible: |
|---|
| 1022 |
|
|---|
| 1023 |
o no decoding of the logical bitstreams is performed. |
|---|
| 1024 |
|
|---|
| 1025 |
o no changes to the pages, in particular to the granule positions |
|---|
| 1026 |
are made. |
|---|
| 1027 |
|
|---|
| 1028 |
o changes occur only to the control section. |
|---|
| 1029 |
|
|---|
| 1030 |
The fields of the skeleton track allow achievement of all these aims. |
|---|
| 1031 |
Remultiplexing is essentially achieved by seeking to the position as |
|---|
| 1032 |
described above and then including from each logical bitstream only |
|---|
| 1033 |
the relevant Ogg pages into the new stream. Changes to fields in the |
|---|
| 1034 |
bitstream are restricted to the control section: |
|---|
| 1035 |
|
|---|
| 1036 |
o the "presentationtime" MUST be adjusted to the requested start |
|---|
| 1037 |
time |
|---|
| 1038 |
|
|---|
| 1039 |
o the "startgranule" for each logical bitstream MUST be adjusted to |
|---|
| 1040 |
the granule position at which each logical bitstream starts. This |
|---|
| 1041 |
is not the first granule position of the Ogg pages included into |
|---|
| 1042 |
the bitstream, but rather the last one that did not get included, |
|---|
| 1043 |
as it represents the start time of the bitstream. |
|---|
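The rule for the "startgranule" can be sketched as follows. The function name and its arguments are hypothetical; the sketch further assumes that the original startgranule is kept when no page of the logical bitstream was dropped.

```python
def startgranule_for_extract(granule_positions, first_included,
                             original_startgranule=0):
    """Pick the startgranule of one logical bitstream in an extract.

    granule_positions: granule positions of this bitstream's pages, in
                       stream order, as found in the source file.
    first_included:    index of the first page copied into the new stream.
    """
    if first_included == 0:
        # Nothing was dropped (assumption: keep the original value).
        return original_startgranule
    # The last page that did NOT get included marks the start time.
    return granule_positions[first_included - 1]

audio_pages = [8820, 13230, 17640, 22050]
new_startgranule = startgranule_for_extract(audio_pages, first_included=2)
```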
| 1044 |
|
|---|
| 1045 |
Everything else, and in particular the Ogg pages, stays the same. |
|---|
| 1046 |
This is important also to allow caching of such files as is required |
|---|
| 1047 |
for Web proxies and described in temporal URI addressing [timedURI]. |
|---|
| 1048 |
|
|---|
| 1068 |
4. Security considerations |
|---|
| 1069 |
|
|---|
| 1070 |
Ogg format bitstreams contain several multiplexed binary and non- |
|---|
| 1071 |
binary data bitstreams. There is no generic encryption or signing |
|---|
| 1072 |
mechanism provided for the complete bitstream or any one of its parts. |
|---|
| 1073 |
As the format of the encapsulated media bitstreams is not prescribed |
|---|
| 1074 |
and is identified through the "Content-type" Message header field in |
|---|
| 1075 |
that bitstream's skeleton secondary header packet, it is possible to |
|---|
| 1076 |
encrypt or sign that media bitstream and then mark it accordingly |
|---|
| 1077 |
with a MIME type that signifies the encryption. It is up to the |
|---|
| 1078 |
applications that use this bitstream to provide an appropriate codec |
|---|
| 1079 |
to handle such bitstreams. |
|---|
| 1080 |
|
|---|
| 1081 |
As Ogg format bitstreams generally contain binary media bitstreams, |
|---|
| 1082 |
it is possible to include executable content in them. This can be an |
|---|
| 1083 |
issue with applications that decode these bitstreams, especially when |
|---|
| 1084 |
they are used in a network scenario. Such applications MUST ensure |
|---|
| 1085 |
correct handling of manipulated bitstreams, of buffer overflow and |
|---|
| 1086 |
the like. |
|---|
| 1087 |
|
|---|
| 1088 |
|
|---|
| 1089 |
|
|---|
| 1090 |
|
|---|
| 1091 |
|
|---|
| 1092 |
|
|---|
| 1093 |
|
|---|
| 1094 |
|
|---|
| 1095 |
|
|---|
|
|---|