HTML5 Audio and Video

Supported Browsers: Opera 10.5+, Firefox 3.5+, Safari 3.0+, IE 9.0+, Chrome 3.0+

By Alexander Jones,

Introduction

Before implementing the HTML5 audio and video tags, it's worth looking at the history of multimedia on the web. Multimedia in web pages originated with MIDI sound files and animated GIFs. As bandwidth and connection speeds increased and file compression improved, real video and MP3 music began to take off.

Different companies quickly started to develop their own playback systems, including RealPlayer and Windows Media Player, but the real success came in 2005 with Adobe Flash, which uses a plugin and was the playback system of choice for the video-streaming conqueror of the web: YouTube.

In 2007, Anne van Kesteren wrote to the Working Group about a new 'native' implementation of multimedia:

Opera has some internal experimental builds with an implementation of a <video> element. The element exposes a simple API (for the moment) much like the Audio() object: play(), pause(), stop(). The idea is that it works like <object> except that it has special <video> semantics much like <img> has image semantics.

While the API has increased in complexity, van Kesteren's original announcement is now implemented across all major browsers, and Microsoft has recently (December 2010) announced support in the forthcoming Internet Explorer 9.
The <video> element goes hand in hand with the <audio> element and they share many similar features, so it's important to discuss the implementation techniques of both:

1. Why not just use the object tag?

Previously, if a developer wanted to include a video in a web page, they had to make use of the <object> element, which is a generic HTML container for 'foreign objects'. A developer would also need to use the <embed> element for cross-browser compatibility. This resulted in HTML code that looked like the following:

<object width="425" height="344">
    <param name="movie" value="http://www.youtube.com/video" />
    <param name="allowFullScreen" value="true" />
    <param name="allowscriptaccess" value="always" />
    <embed src="http://www.youtube.com/video" 
    type="application/x-shockwave-flash" allowscriptaccess="always" 
    allowfullscreen="true" width="425" height="344">
    </embed>
</object>

The code above is complicated to write and understand, and the browser has to verify that the user has a third-party plugin, or else the user needs the knowledge and ability to download it. Plugins can also be a significant cause of browser instability, and being prompted to download something in order to view the desired content can cause problems and worry for less technically 'savvy' users.

Whenever you include a plugin in your pages, an area is reserved that the browser delegates to the plugin. As far as the browser is concerned, the plugin’s area remains a black box and the browser does not process or interpret anything that is happening there.

Issues can arise when a layout overlaps the plugin's drawing area, for example with CSS-based dropdown menus that need to unfold over the video. By default, the plugin's drawing area will always sit on top of the web page and will mess up the layout. HTML5 provides a standardized way to play video directly in the browser, with no plugins required.

So now with HTML5, video and audio have become officially part of the web, as stated by Bruce Lawson and Remy Sharp on Oct 25th 2010:

One of the major advantages of the HTML5 video element is that, finally, video is a full-fledged citizen on the Web. It’s no longer shunted off to the hinterland of <object> or the non-validating <embed> element.

2. The Markup

At the lowest level, adding a video onto a page in HTML5 requires the following simple code:

<video src=videofile.ogv></video>

The .ogv file extension is used here to point to an Ogg Theora video. The above code will be the most commonly used form once all browsers support one common video format, but at the moment they don't, so it's important to concentrate on providing a fallback option.

Similar to <object>, you can put fallback markup between the tags, for older Web browsers that do not support native video. You should at least supply a link to the video so users can download it to their hard drives and watch it later on the operating system’s media player.

<h1>Video and legacy browser fallback</h1>
<video src=videofile.ogv>
    Download the <a href=videofile.ogv>HTML5 video</a>
</video>

This implementation is okay, but what about the browsers that don't support .ogv files but do support other formats? The following would be much better:

<video id="video_with_controls" width="620" controls autobuffer>
    <source src="videofile.ogv" type="video/ogg" />
    <source src="videofile.webm" type="video/webm" />
    <source src="videofile.mp4" type="video/mp4" />
    <p>Your browser doesn’t support video.
    Please download the video in <a href="videofile.ogv">Ogg</a>
    or <a href="videofile.mp4">mp4</a> format.</p>
</video>

The above code will work in all browsers apart from IE8 and below, where the fallback is there so that the user can download the video. It's important to explain the attributes I've added to the <video> element, which let you show playback controls, buffer the video, specify the height and width and add an image to the front of the video:

Autoplay

You can tell the browser to play the video or audio file automatically, which in most cases is not a great idea. For example, most mobile phone users would not like to have their bandwidth allowance used without their permission. It is, however, used by streaming sites like YouTube, which in most cases play their videos as soon as the page loads. Here is an example of how to do it:

<video src=videofile.ogv autoplay></video>

Controls

The controls attribute does what its name suggests: it adds controls to your video (play, pause, volume etc.). This is a much better approach than autoplaying a video, and the browser provides the controls automatically:

<video src=videofile.ogv controls></video>

These controls vary between browsers and naturally look a bit different, as any form controls might, but the fundamental play/pause, volume and seek bar will still be there. The <audio> element also has the controls attribute, but if you don't use it nothing will be seen on the page.
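
When the controls attribute is left off, playback has to be driven from script instead. The following is a minimal sketch, assuming a hypothetical helper named togglePlayback that is handed an <audio> or <video> element; it relies only on the element's paused property and its play() and pause() methods.

```javascript
// Hypothetical helper (not part of any library): start a paused media
// element, or pause one that is currently playing.
function togglePlayback(media) {
  if (media.paused) {
    media.play();
  } else {
    media.pause();
  }
}
```

In a page, a function like this could be called from the click handler of your own button, passing in the element found with document.getElementById().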

Poster

The poster attribute (not applicable to <audio>) tells the browser to display a chosen image while the video is downloading or until the user presses play. This means a developer no longer needs to display an image and then remove it via JavaScript when the video starts. If the poster attribute is not used, the browser shows the first frame of the video.

Loop

The loop attribute does what it sounds like: it loops the playback. It is a boolean attribute (its presence on the video element represents the true value, and its absence represents the false value) that, if specified, tells the browser to seek back to the beginning of the video when it reaches the end.

Sizing

To help with the sizing of the video in the browser you can add height and width attributes (unfortunately not applicable to <audio>). If you specify only one of them, say the height and not the width, the browser will adjust the other dimension to maintain the video's aspect ratio. If you specify neither, the browser will use the natural width of the video, or the width of the poster frame, or, if neither is available, 300 pixels.

If you set both width and height to an aspect ratio that doesn’t match that of the video, the video is not stretched to those dimensions but is rendered "letter-boxed" inside the video element of your specified size while retaining the aspect ratio.
(Bruce Lawson and Remy Sharp on Oct 25th 2010)
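
The arithmetic behind these sizing rules can be sketched as follows. This is only an illustration: the function names heightForWidth and letterboxedSize are made up for this example, and a real browser performs the calculation internally.

```javascript
// Height the picture gets when only a width is specified: the video's
// own aspect ratio is preserved.
function heightForWidth(width, videoWidth, videoHeight) {
  return Math.round(width * videoHeight / videoWidth);
}

// Size of the picture drawn inside a box whose aspect ratio does not match
// the video's. The picture is scaled to fit entirely inside the box, and
// the leftover space forms the "letter-box" bars.
function letterboxedSize(boxWidth, boxHeight, videoWidth, videoHeight) {
  const scale = Math.min(boxWidth / videoWidth, boxHeight / videoHeight);
  return {
    width: Math.round(videoWidth * scale),
    height: Math.round(videoHeight * scale),
  };
}
```

For a 1280 by 720 video, heightForWidth(640, 1280, 720) gives 360, and placing the same video in a 640 by 480 box gives a 640 by 360 picture with bars above and below.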


This is what the code should look like with the above attributes added to the video element:

<video src=videofile.ogv poster="posterpic.png" controls loop width="620"></video>

Preload

There's also a preload attribute that can be used to download the video in the background when the web page loads, even if the video hasn't started playing, to save time in the expectation that the user will activate the controls.
The attribute can be written in the following ways:

  • Putting preload on its own is treated in the same way as preload=auto. If the attribute is left out altogether, the browser decides whether to preload or not depending on the user's browser or device (a mobile browser might never preload data, to save bandwidth).
  • Adding preload=auto suggests that the browser should start downloading the video in full as soon as the page loads, although the browser may still ignore this depending on, for example, the signal strength.
  • preload=none means that the browser should not preload any data until the user activates the controls.
  • Putting preload=metadata (in my opinion the most appropriate way of using preload) tells the browser that it should prefetch metadata (sizing, first frame, duration etc.) but that it shouldn't download anything else until the user activates the controls.
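
The same hint can also be set from script through the element's preload property. The sketch below assumes a hypothetical helper called setPreloadHint; it accepts only the three keywords none, metadata and auto, and rejects anything else so that a mistyped value is caught early.

```javascript
// The three keyword values the preload attribute understands.
const PRELOAD_HINTS = ["none", "metadata", "auto"];

// Hypothetical helper: validate the hint, then apply it to an <audio> or
// <video> element. The browser is still free to ignore the hint.
function setPreloadHint(media, hint) {
  if (!PRELOAD_HINTS.includes(hint)) {
    throw new RangeError("Unknown preload hint: " + hint);
  }
  media.preload = hint;
  return media;
}
```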

3. Formats

A video format is a file that contains the encoded video and audio streams. A video container format only defines how to store things within it, not what kind of data is being stored. The three important container formats used on the web are WebM, MP4 and Ogg.

  • .mp4 uses H.264(video) + AAC(audio) codecs.
  • .ogg/.ogv uses Theora(video) + Vorbis(audio) codecs.
  • .webm uses VP8 (Google's open source video) + Vorbis (audio) codecs.

Early drafts of the HTML5 specification mandated that all browsers should at least have built-in support for multimedia in two codecs: Ogg Vorbis for audio and Ogg Theora for movies. However, these codecs were dropped from the HTML5 spec after Apple and Nokia objected, so the spec makes no recommendations about codecs at all.
(Bruce Lawson and Remy Sharp on Oct 25th 2010)

At the time of writing, all modern browsers support the <video> tag, including the upcoming IE9. Codec support, however, is somewhat convoluted and can be confusing. It's important to understand that at the moment Safari does not support the .webm format, preferring to support the H.264 video codec and MP3 audio.

At the current time the following list defines the browser support for different codecs:

  • Theora + Vorbis + Ogg is supported by the following browsers:
    • Firefox 3.5+, Safari*, Chrome 5.0+ and Opera 10.5+
  • H.264 + AAC + MP4 is supported by the following browsers:
    • IE 9.0+, Safari 3.0+, iPhone 3.0+ and Android 2.0+
  • .WebM is supported by the following browsers:
    • IE 9.0+**, Firefox 4.0+, Safari*, Chrome 6.0+, Opera 10.6+, Android 2.3+***
*Safari will play whatever QuickTime can play, and QuickTime only comes with H.264 + AAC + MP4 support pre-installed.
**Internet Explorer 9 will only support WebM if the codec is installed on the user's computer (hopefully this will change and Microsoft will ship the codec with IE9).
***Android 2.3 supports WebM, but because mobile hardware cannot yet decode it, playback can limit battery life.
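
Because support differs so much, it can be useful to ask the browser directly rather than rely on a version list. Media elements expose a canPlayType() method that returns "probably", "maybe" or an empty string for a given MIME type. The following sketch assumes a hypothetical helper named reportSupport; the type strings describe the three container and codec combinations covered in this section.

```javascript
// MIME types for the three container/codec combinations discussed above.
const FORMATS = {
  ogg: 'video/ogg; codecs="theora, vorbis"',
  mp4: 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"',
  webm: 'video/webm; codecs="vp8, vorbis"',
};

// Hypothetical helper: ask a media element about each format and collect
// the answers ("probably", "maybe" or "" for no support).
function reportSupport(media) {
  const report = {};
  for (const name of Object.keys(FORMATS)) {
    report[name] = media.canPlayType(FORMATS[name]);
  }
  return report;
}
```

In a page, the media argument would typically be a freshly created element, for example reportSupport(document.createElement("video")).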

Format Solution

So what's the solution to enable maximum compatibility at the current time? As Mark Pilgrim puts it in Dive into HTML5:

There is no single combination of containers and codecs that works in all HTML5 browsers. To make your video watchable across all of these devices and platforms, you’re going to need to encode your video more than once.

So at the moment (February 2011) the solution is to encode your video three times: one version that uses WebM (VP8 + Vorbis), one that uses H.264 baseline video and AAC audio in an MP4 container, and one that uses Theora video and Vorbis audio in an Ogg container. Then link to all three of these sources in a single <video> element, and fall back to a Flash-based player and/or a download option, like the following:

<video>
    <source src="video.webm" type='video/webm; codecs="vp8, vorbis"' />
    <source src="video.mp4" type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"' />
    <source src="video.ogv" type='video/ogg; codecs="theora, vorbis"' />
    <embed src="http://youtube.com/video"
    type="application/x-shockwave-flash"
    allowscriptaccess="always" allowfullscreen="true" width="425" height="344">
</video>

Note: Because of a bug in the current iPad, you need to list the .mp4 source first if you want the video to load on that device, until the bug is fixed.
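
Source order matters because the browser works through the <source> elements in the order they are written and settles on the first one it thinks it can play. A simplified sketch of that selection is shown here; the helper name firstPlayableSource is made up for illustration, and a real browser does more (such as actually attempting to load the file).

```javascript
// Hypothetical helper: walk the sources in document order and return the
// src of the first one the media element claims it can play.
function firstPlayableSource(media, sources) {
  for (const source of sources) {
    // canPlayType() returns "" when the type is definitely unsupported.
    if (media.canPlayType(source.type) !== "") {
      return source.src;
    }
  }
  return null; // nothing playable: the fallback content would be used
}
```

Called with the three sources from the markup above, a browser that only understands MP4 would skip video.webm and settle on video.mp4.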

Encoding Videos

Some useful resources for encoding videos into the desired formats are as follows:

  • Miro Video Converter is an open source, GPL-licensed program for encoding video in multiple formats.
  • Firefogg is a Firefox extension that provides easy web-based conversion.
  • TinyOgg converts Flash video to Ogg for download and can use a YouTube URL.
  • Batch Encoding Ogg Video with ffmpeg2theora describes a wrapper for ffmpeg and ffmpeg2theora that can be used from the command line to specify more parameters.

There's also a great guide on this by Mark Pilgrim at Dive into HTML5, which gives a full list of useful programs and how-tos.

4. Working around current versions of IE

Current versions of IE unfortunately don't support the video tag, and at the moment there are two workarounds available:

Chrome Frame

Chrome Frame is a plugin for Internet Explorer which effectively runs Google Chrome inside Internet Explorer. The disadvantage is that we're no longer using native IE and its intended features, and users have to download a plugin, which some people don't like doing and see as a 'hassle'.

Despite this, it has the obvious advantage that all the latest HTML, JavaScript and CSS features that IE lacks will be supported. If the user has the plugin, a developer's life is a lot easier because they don't need to provide lots of fallback options, but it's a big if, and good practice is to always provide backups.

Flash Fallback

As already mentioned in the 'Format Solution' section above, Flash can be used as a fallback, although the video may need to be encoded again to suit Flash. Luckily, Adobe has committed to supporting the WebM format in a future version of its Flash player.
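
A workaround is only needed when native video is missing, and a script can check for that before deciding what to show. The sketch below assumes a hypothetical helper named supportsNativeVideo that takes the page's document object. It uses a common detection technique: browsers without HTML5 video still create an element for an unknown tag, but that element has no canPlayType() method.

```javascript
// Hypothetical helper: true when the given document can create a <video>
// element that exposes canPlayType(), i.e. native video is available.
function supportsNativeVideo(doc) {
  const element = doc.createElement("video");
  return typeof element.canPlayType === "function";
}
```

In a page this would be called as supportsNativeVideo(document), with the Flash player or download link shown only when it returns false.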

5. Conclusion

In my opinion, there's no doubt that having the video tag as a native HTML element gives a great platform to integrate video with the rest of a web site. It brings so many customising options to the table, with its attributes and JavaScript capabilities.

Currently, of course, the downside is cross-browser codec support: it is difficult to implement a <video> element that works across different browsers, and comparatively easier just to use a plugin. But over the next year this situation should improve as browsers increase their support for different formats.

The Next Step

The next step is to go out and develop! Try different things and experiment with <audio> and <video>. There's a huge wealth of features to get stuck into, including using JavaScript to create your own players and using the Canvas API to increase interactivity. For more information on HTML5 <video> and <audio>, be sure to check out these great resources: