Hint tracks

The MPEG-4 team has adopted the QuickTime concept of hint tracks for control of the stream delivery. A streaming file is called a movie. The movie file container contains tracks, which could be video, audio, or other clip data. Each track consists of control information that references the media data (or objects) that constitute the track. This means that several different movie files could reference the same video media object, which can be very useful for rich media presentations. One video file can be repurposed into several different presentations, perhaps in multiple languages or at a number of levels of detail (introduction, overview, and in-depth). A movie is not the video and audio media files; it is the metadata or instructions for a specific presentation of the media data. The files are flattened into a single file when the stream is encoded.

A streamable movie has a hint track in addition to the video and audio tracks (MPEG-4 files are not limited to streaming media applications). The hint track gives the server software pointers to the RTP information in order to serve the relevant media chunks. This information allows the server to deliver the correct video material in the sequence stipulated in the track file, and at the correct rate for the player display.

The Technology of Video and Audio Streaming

Figure 11.4 Typical streaming file format. (Diagram labels: hinted movie; movie (moov) containing a video track and an RTP hint track (trak); chunks; header, sample pointer, and RTP packet hint; media data (mdat) frames.)
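The moov, trak, and mdat labels in Figure 11.4 are the four-character types of the boxes (atoms) that make up the file. As a minimal sketch, assuming the usual QuickTime/MPEG-4 container layout in which each box begins with a 32-bit big-endian size followed by a four-character type, the top-level boxes of a file can be listed as shown here. The function name list_boxes and the sample bytes are hypothetical and only for illustration; a real hinted movie would need the nested boxes inside moov to be walked in the same way.

```python
import struct

def list_boxes(data: bytes):
    """Return (type, size) for each top-level box in an MP4/QuickTime byte string."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the file.
            size = len(data) - offset
        if size < header:
            break  # malformed box; stop rather than loop forever
        boxes.append((box_type.decode("ascii", "replace"), size))
        offset += size
    return boxes

# Hypothetical sample: an empty 'moov' box followed by an 'mdat' box with 4 payload bytes.
sample = struct.pack(">I4s", 8, b"moov") + struct.pack(">I4s", 12, b"mdat") + b"\x00" * 4
print(list_boxes(sample))
```

Reading only the headers keeps the sketch cheap: the media data in mdat is skipped rather than loaded, which is the same separation of metadata and media that the movie concept relies on.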
Web server and streaming server

If you already have a web site with web server capacity, and you stream only occasionally, it is possible to use that web server to deliver streaming files. The web server will use HTTP over TCP/IP. There will be no control of the stream delivery rate beyond TCP's own flow control, which only prevents buffer overflow in the TCP/IP stack.

Figure 11.2 The streaming protocol stack. (Diagram labels: server and media player; control channel carrying control and information over RTSP, TCP, and IP; media data channel carrying video and audio over RTP, UDP, and IP; network.)

Figure 11.3 Typical streaming file format. (Diagram labels: ASF file; Header Object; Data Object [fixed-size packets]; Index Object(s).)
then automatically select the optimum rate for the propagation conditions. This switching between different rate files is another task for the server.

One of the great attractions of streaming is the interactivity. The user can navigate the clip with VCR controls. The server has to locate and serve the correct portions of the clip using an index.

From these examples, it can be seen that the streaming server has several additional functions over a standard web server: real-time flow control, intelligent stream switching, and interactive clip navigation.

HTTP does not support any of this functionality, so new protocols were developed for streaming media. Under the auspices of the IETF, several new protocols were developed for multimedia real-time file exchange: RTSP, RTP, and RTCP. There are also a number of proprietary protocols using similar principles. Windows Media originally used the Microsoft Media Server (MMS) protocol for the delivery framework (but now supports RTSP); the stream is in Advanced Systems Format (ASF).

The Real-Time Streaming Protocol (RTSP) is the framework that can be used for the interactive VCR-like control of the playback (Play, Pause, etc.). It is also used to retrieve the relevant media file from the disk storage array. RTSP also can be used to announce the availability of additional media streams in, for example, a live webcast.

The Real-time Transport Protocol (RTP) is used for the media data packets. The RTP Control Protocol (RTCP) provides feedback from the player to indicate the quality of the stream. It can report packet loss and out-of-order packets. The server can then react to congested network conditions by lowering the video frame rate or gear-shifting to a file encoded at a lower bit rate.

The real-time media stream can be delivered by UDP or TCP over IP; the choice depends upon propagation conditions. The control protocols use TCP/IP for the bidirectional client-server connection.
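The gear-shifting between files encoded at different bit rates can be sketched as a simple selection rule: serve the highest encoded rate that fits the throughput currently being achieved, and fall back to the lowest rate when none fits. This is an illustrative sketch, not the algorithm of any particular server; the function name select_rate, the set of rates, and the 10 percent safety margin are all assumptions.

```python
def select_rate(encoded_rates_kbps, measured_kbps, margin=0.9):
    """Pick the highest encoded rate that fits within the usable throughput."""
    usable = measured_kbps * margin  # hypothetical headroom for protocol overhead
    candidates = [r for r in sorted(encoded_rates_kbps) if r <= usable]
    return candidates[-1] if candidates else min(encoded_rates_kbps)

ladder = [20, 40, 100, 300]  # hypothetical files encoded at different rates (kbit/s)
print(select_rate(ladder, 120))  # 100: fits within the 108 kbit/s usable
print(select_rate(ladder, 30))   # 20: the 40 kbit/s file would starve at 27 kbit/s usable
print(select_rate(ladder, 10))   # 20: nothing fits, so fall back to the lowest rate
```

In practice the measured throughput would come from the RTCP reports sent back by the player, and a server would normally avoid switching on every report so that the picture quality does not flutter.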
Streaming file formats

To stream media files in real-time, they must be wrapped by one of the streaming formats. These formats have timing control information that can be used by the server to manage the flow rate. If the client is using interactive control, the file index aids the navigation. The main formats are MPEG-4 (.mp4), the Microsoft Advanced Systems Format (.wmv and .wma extensions if created by Windows Media codecs, .asf if not), RealNetworks (.rm and .ra), and QuickTime hinted movies (.mov extension).
Streaming

What is a streaming server?

The most-used server for the delivery of multimedia content is the web server, typified by Apache. Web servers use HTTP over TCP/IP to deliver HTML pages and their associated image files. TCP/IP is used as the transport layer over the Internet. The files are downloaded to the web browser cache as fast as the system allows. TCP incorporates flow control to manage the download rate. There is no predetermined rate for delivery. TCP will increase the data rate until network packet loss indicates that the network is congested. At this point, the rate backs off. Another constraint is the receive buffer. TCP uses a sliding window of data in transit. The receiver processes packets as they arrive. If data arrives too fast, the receive buffer will overflow. To stop the buffer from filling, the receiver sends messages to the transmitter to slow down.

Suppose that you want to stream a clip encoded at 40 kbit/s. The TCP transmissions could start at 10 kbit/s. The transmitter then ramps up to 100 kbit/s, where network congestion sets the upper limit. Suppose other users come on to the network, and the transmission throttles back to 30 kbit/s. At no time has the data rate matched the data rate at which the stream was encoded.

Now consider the file size. If this clip lasts for 30 seconds, the complete file size is 150 kbytes. This is downloaded to the browser cache, which is not a great problem. Now suppose we move up to a 20-minute presentation encoded at 300 kbit/s. Now the file size is 45 Mbytes, which is very large for the cache. This has been the way that the Flash player handled video files, but Flash was limited to short clips.

When you stream content in real-time, the media packets are processed by the player as they arrive. There is no local caching, so the local storage issues are solved. This may not seem an issue to PC users, but many media players have very limited memory, for example set-top boxes and mobile devices.
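The file sizes quoted for the two clips follow directly from the bit rate multiplied by the duration, divided by eight bits per byte. A quick check of the arithmetic, using decimal kbytes and Mbytes as the text does (the function name is only illustrative):

```python
def file_size_kbytes(rate_kbit_s: float, duration_s: float) -> float:
    """Size in kbytes of a clip encoded at a constant bit rate."""
    return rate_kbit_s * duration_s / 8  # 8 bits per byte

print(file_size_kbytes(40, 30))                # 150.0 kbytes for the 30-second clip
print(file_size_kbytes(300, 20 * 60) / 1000)   # 45.0 Mbytes for the 20-minute presentation
```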
The problem with Flash has also gone: Macromedia has now developed the Flash player to support streaming of long-form video, and the content is rendered then discarded.

There is still the rate control problem. If the stream is encoded at 40 kbit/s, it must be delivered at that rate for satisfactory viewing. One of the functions of the transport layer protocol is to regulate the stream rate. But what happens in the example where the network is congested and the best rate is 30 kbit/s? The player runs out of data and stops, which is one of the main complaints about streaming. There are ways around this. The first is to encode at a rate low enough to suit the worst-case network conditions. That may be hard to predict, so there are more sophisticated ways; the usual is to encode at several rates,
In the case of interactive content, the client or player requests the files from the server. With the simulated-live webcast, the server runs a playlist, which streams files to the player at the scheduled time.

Table 11.1 Web Server versus Streaming Server

               Web server                              Streaming server
Advantages     Part of existing infrastructure         Optimized media delivery
               No additional expertise or training     Dynamic stream control
               for IT staff                            Interactive media control
                                                       Multicast support
                                                       Improved server hardware utilization
                                                       Supports live webcasting
Disadvantages  None of the streaming server            Additional equipment required
               advantages
               Only supports progressive download

Figure 11.1 Webcasting and on-demand. (Diagram labels: live, simulated live, and interactive; live event and prerecorded content; encode, FTP, servers; stream for the webcast and serve for on-demand; source, distribution, and delivery.)
11 Stream serving

Introduction

What happens once you have successfully encoded your multimedia content? Much like publishing a web page, the file is uploaded to the delivery server. That is where things diverge. A conventional web server simply downloads the media file. A streaming server has to manage the delivery rate of the stream to give real-time playback. In addition, the streaming server supports VCR-like control of the media clip.

When a browser requests a web page, the files are delivered as fast as the network connection allows. TCP manages an error-free transmission by retransmitting lost packets, but the download time depends upon the intervening bandwidth available. TCP starts at a low rate then ramps up to the maximum that can be achieved. An accurate delivery is ensured, but timely delivery cannot be guaranteed. Streaming media has the opposite requirements: the delivery must be in real-time, but reasonable levels of transmission errors can be accepted.

Streaming servers can be proprietary to an architecture or designed to handle standard formats like MPEG-4. The system architecture can vary from a single machine serving a small corporate training site to large distributed server farms capable of serving hundreds of thousands of streams for live events like breaking news footage, fashion shows, and rock concerts.

Streaming can be delivered as a push or pull process. Push is used to stream live or prerecorded content as a webcast; this is the television model. Push streaming can be used for web channels or live events. Alternatively, the user can pull prerecorded content on-demand. This interactive experience is akin to using a CD-ROM or a web browser.

A webcast can be a mix of live and prerecorded content. With live events the server is acting as a distribution point, just echoing the stream on to the viewers. For the prerecorded content the server has two functions. The first is to recall the content from the local disk storage arrays and the second is to control the stream delivery rate.
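To illustrate the second function, controlling the delivery rate, here is a minimal sketch of how a server might schedule packet send times so that the average rate matches the encoded bit rate instead of sending as fast as the network allows. A real server would take its timing from the timing information in the streaming file rather than from a fixed packet size; the function name, packet size, and rate below are assumptions for illustration.

```python
def send_schedule(total_bytes: int, packet_bytes: int, rate_kbit_s: float):
    """Return the send time in seconds of each packet for constant-rate delivery."""
    bytes_per_second = rate_kbit_s * 1000 / 8
    interval = packet_bytes / bytes_per_second   # seconds between packets
    n_packets = -(-total_bytes // packet_bytes)  # ceiling division
    return [i * interval for i in range(n_packets)]

# Hypothetical example: 5,000 bytes of media at 40 kbit/s in 1,000-byte packets.
times = send_schedule(5000, 1000, 40)
print(times)  # five send times, one packet roughly every 0.2 s
```

A web server has no equivalent of this schedule: it hands the whole file to TCP and lets the transport decide the rate.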
correction also may be used; the rendering of color is different from the composite television systems.

The audio usually will benefit from dynamic range compression as well as noise reduction. The dynamics compression will make speech much easier to understand when it is played back on a PC's small speaker. The processors work in three areas: peaks are limited to avoid clipping, the general audio is compressed by an AGC, and then noise below a threshold is suppressed.

Last of all, do not forget to watch and listen to the stream after the selected treatments. It is the best way to judge the improvements that preprocessing can make to your streams.
Processing hardware

Preprocessing is not limited to software applications. Some manufacturers have hardware developed specifically to clean up video and audio. The dedicated hardware products process in real-time. Companies like Orban and Omnia have been making high-performance compressors for the radio business for many years, and now have products targeted at the webcasting market.

The Orban product line includes the Optimod-PC and the Opticodec-PC. The Optimod is a PCI card with on-board DSP that gives automatic gain control (AGC), equalization, multiband gain control, and peak-level control. The Opticodec is a complete solution for compression and encoding aimed at the Internet radio station. The preprocessing includes AGC, multiband dynamics compression, and look-ahead limiting. For data compression, it uses the aacPLUS codec from Coding Technologies.

An alternative vendor is Omnia, a division of Telos Systems. The Omnia-3net digital audio processor is designed especially for the webcaster and has AGC, limiting, and multiband processing.

Monitoring

It is vital that you monitor the results of any audio processing before the file is encoded. Although big monitor speakers will make the audio sound impressive, it is worth listening to some of the small speakers often used with desktop workstations; this is what your audience is probably using.

Summary

Noise is a killer for codecs. So before encoding, it makes sense to try to clean up as much of the noise as possible. That way you can ensure that you have the optimal encoding for a given stream bandwidth. This noise reduction applies equally to both video and audio. The preprocessing products have a veritable toolbox of noise filters to deal with the different types of noise that you will encounter with different sources. The sparkle from a satellite link will need a different treatment from the tape noise from an analog VCR.
Video requires additional processing to remove some of the characteristics of the television system. These include interlace scanning and the 3:2 cadence of content originally shot on film then transferred to video by a telecine. To cut down unnecessary data, the outside edge of a television picture usually can be cropped without losing relevant content. Television frame composition allows for over-scanning in the television receiver, which is not a problem with PCs. Some color
Discreet Cleaner

Cleaner is not just a clean-up tool. It has all the facilities to take raw content and publish to the streaming servers. It processes both the video and audio. Cleaner 6/XL can perform the basic de-interlace and reverse telecine on the video. The signal levels and color can be adjusted and the noise cleaned up. The adaptive noise reducer uses spatio-temporal processing to clean up flat areas of color without degrading the edges of objects.

Figure 10.11 Discreet Cleaner 6 audio controls. Autodesk Inc.

Cleaner 6/XL also has powerful audio processing capabilities. The product has a wide range of filters, including high- and low-pass filters with adjustable turnover frequencies. It has control of dynamic range, and a noise removal filter to reduce unwanted background noise. A useful addition is the deep notch filter that can be set to remove power line hum, which is often a problem at live events. If you want to change the sampling rate of the audio, it can resample an input file. Cleaner can add reverberation to the audio to give a presentation the sound of a big auditorium. Once a file is processed it then can be encoded on the same platform.
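To show what a deep notch filter for power line hum does, here is a minimal sketch using a common second-order (biquad) notch design. It is not Cleaner's implementation, which is not documented here; the 50 Hz hum frequency, the 8 kHz sample rate, the Q of 10, and all function names are assumptions chosen for illustration (use 60 Hz where the mains supply runs at that frequency).

```python
import math

def notch_coefficients(f0_hz, sample_rate_hz, q=10.0):
    """Biquad notch coefficients, normalized so that a[0] = 1."""
    w0 = 2 * math.pi * f0_hz / sample_rate_hz
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = [1 / a0, -2 * math.cos(w0) / a0, 1 / a0]
    a = [1.0, -2 * math.cos(w0) / a0, (1 - alpha) / a0]
    return b, a

def biquad(samples, b, a):
    """Apply a second-order IIR filter (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def peak_after_settling(freq_hz, sample_rate_hz=8000, seconds=2.0):
    """Peak level of a unit sine at freq_hz after the 50 Hz notch has settled."""
    n = int(sample_rate_hz * seconds)
    tone = [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]
    b, a = notch_coefficients(50.0, sample_rate_hz)
    filtered = biquad(tone, b, a)
    return max(abs(v) for v in filtered[n // 2:])  # ignore the start-up transient

print(peak_after_settling(50.0))    # close to zero: the hum is removed
print(peak_after_settling(1000.0))  # close to one: speech-band content passes
```

A higher Q makes the notch narrower, so less of the wanted audio around the hum frequency is affected, at the cost of a longer settling time.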
seconds. This prevents the modulation effects that you would get with a very short release time.

Usually all these controls are used together. The peak limiter is set to remove transients, then louder sounds are passed at unity gain. Below a set threshold, the quieter sounds are compressed to improve intelligibility. Everything below a lower threshold is treated as noise and gated out. This may seem like a lot of adjustment, but once you arrive at some good settings they can be stored for reuse with similar content.

Processing applications

Typical streaming media encoder applications include many facilities to preprocess audio. Some are implemented in software, some in hardware. The hardware products are best if you want real-time processing for live webcasts. A good example of a comprehensive encoding package is Discreet's Cleaner.

Figure 10.10 Compressor limiter composite characteristics. (Plot of output level against input level, both in dB from -60 to 0, with the regions labeled noise gate, compression threshold, 4:1 compression, unity gain, and peak limiter.)
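The composite characteristic of Figure 10.10 can be written as a static function from input level to output level. This is a sketch under stated assumptions: the 4:1 ratio comes from the figure, but the gate threshold (-50 dB), compression threshold (-20 dB), and limiter ceiling (-3 dB) are hypothetical settings, and attack and release timing is ignored entirely.

```python
import math

def output_level_db(input_db, gate_db=-50.0, comp_threshold_db=-20.0,
                    ratio=4.0, limit_db=-3.0):
    """Static input/output characteristic of a gate, compressor, and limiter."""
    if input_db < gate_db:
        return -math.inf  # noise gate: treated as noise and muted
    if input_db < comp_threshold_db:
        # Quiet sounds: the output moves 1 dB for every `ratio` dB of input,
        # which lifts them toward the threshold.
        return comp_threshold_db + (input_db - comp_threshold_db) / ratio
    if input_db > limit_db:
        return limit_db   # peak limiter: transients are clamped
    return input_db       # unity gain

for level in (0, -10, -40, -60):
    print(level, output_level_db(level))
```

With these settings a -40 dB passage comes out at -25 dB, a 15 dB lift that makes quiet speech easier to hear on a small speaker, while anything under -50 dB is removed as noise.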