Storage Requirements

We can also use units to guide our computations of space requirements for multimedia applications. Consider the question of how much video RAM (Random Access Memory) is required for a given monitor (or display) setting. The display setting is specified by its resolution (the width and height in pixels) and its color depth (the number of colors available for each pixel).

The ratio of the width in pixels to the height in pixels is often referred to as the aspect ratio. Common aspect ratios include:

• 4 : 3 = 1.33 : 1, used for some digital cameras, films before the 1950s (and some after), analog television, and some computer monitors;
• 1.44 : 1, used for IMAX films;
• 3 : 2 = 1.5 : 1, used for some digital cameras;
• 16 : 9 = 1.78 : 1, used for High Definition (HD) television and monitors;
• 1.85 : 1, used for many films;
• 2.39 : 1, also used for many films;
• 2.59 : 1, used for Cinerama films (see How the West Was Won; no, really, see it!).

A typical display resolution might be 1024 pixels wide by 768 pixels in height, with a depth of 65,536 (about 65 thousand) colors per pixel.
Vertical resolution is sometimes given in lines, so that the above would be denoted as 768 lines, each of which is 1024 pixels wide. One can think of the horizontal resolution in units of "pixels per line", and vertical resolution as "lines per image". In any case, a two-dimensional video image is represented as a simple list of pixels; in this example, the computer would start a new line after every 1024 pixels.
Since 65,536 is 2^16, we will need 16 bits for each pixel. We then compute the amount of video memory required:

1. ? bytes = number of pixels * bytes per pixel
2. number of pixels = 1024 * 768 = 786,432 pixels
3. bytes per pixel = 16 bits / pixel / ( 8 bits / byte ) = 2 bytes / pixel
4. bytes of video RAM = 786,432 pixels * 2 bytes / pixel = 1,572,864 bytes
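
The four steps above can be sketched in Python. This is only an illustration under stated assumptions: the function names bits_per_pixel and video_ram_bytes are hypothetical helpers introduced here, not part of any library.

```python
def bits_per_pixel(num_colors: int) -> int:
    """Smallest whole number of bits that can distinguish num_colors values."""
    # (n - 1).bit_length() is an exact integer form of ceil(log2(n)).
    return (num_colors - 1).bit_length()


def video_ram_bytes(width: int, height: int, num_colors: int) -> int:
    """Bytes needed to hold one full image at the given display setting."""
    pixels = width * height                    # step 2: number of pixels
    total_bits = pixels * bits_per_pixel(num_colors)
    return (total_bits + 7) // 8               # steps 3-4: 8 bits per byte, rounded up


print(bits_per_pixel(65_536))                  # 16
print(video_ram_bytes(1024, 768, 65_536))      # 1572864
```

Keeping the intermediate quantities in named variables mirrors the unit bookkeeping in the hand computation: pixels, then bits, then bytes.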

We can again use proportionalities to do additional problems:

• For a 1920 by 1080 display with 16 bits per pixel, we need
1,572,864 bytes * ( 1920 * 1080 pixels / ( 1024 * 768 pixels))
= 4,147,200 bytes / ( 2^20 bytes / MB )
= 3.96 MB, or about 4 MB.
• A Truecolor (24 bits per pixel) digital camera image which is 3648 by 2736 pixels requires
1,572,864 bytes * 24 bits / pixel / ( 16 bits / pixel) * 3648 * 2736 pixels / ( 1024 * 768 pixels)
= 29,942,784 bytes / ( 2^20 bytes / MB )
= 28.56 MB.

Here we have used powers of 2 for conversions to MB, since we are dealing with storage requirements.
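
The proportionality approach can also be written as a short Python sketch. It scales the 1024 by 768, 16 bit baseline by the ratio of pixel counts and the ratio of bit depths. The names BASE_BYTES, scale_from_base and to_mb are illustrative assumptions, not established terms.

```python
BYTES_PER_MB = 2 ** 20          # powers of 2, since this is a storage requirement

# Baseline from the worked example: 1024 x 768 pixels at 16 bits per pixel.
BASE_BYTES = 1_572_864
BASE_PIXELS = 1024 * 768
BASE_BITS = 16


def scale_from_base(width: int, height: int, bits: int) -> float:
    """Scale the baseline by the pixel-count ratio and the bit-depth ratio."""
    return BASE_BYTES * (width * height / BASE_PIXELS) * (bits / BASE_BITS)


def to_mb(num_bytes: float) -> float:
    """Convert bytes to MB using 2^20 bytes per MB."""
    return num_bytes / BYTES_PER_MB


print(f"{to_mb(scale_from_base(1920, 1080, 16)):.2f} MB")   # 3.96 MB
print(f"{to_mb(scale_from_base(3648, 2736, 24)):.2f} MB")   # 28.56 MB
```

Each ratio is dimensionless (pixels over pixels, bits over bits), so the result stays in bytes until the final conversion.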

In computer displays, double buffering (or multiple buffering) is used to prepare an image off-screen before displaying it. This means that the amount of RAM used for displays is often an integer multiple of the amounts we have computed here.

The space computations for audio files can be done in much the same way. A microphone converts sound into changing voltages:
In order to digitize the sound, we need to convert the changing voltages into a series of numbers. To do this, we sample the voltages periodically:

The sample rate determines the "fidelity" of the recorded sound (how similar it sounds to the original).

For CD quality sound, the sample rate is 44,100 samples per second. Of course, for stereo recordings we need twice that many samples per second, one for each channel. We also need to decide how many bits are necessary to hold the voltage for each sample:

The sample depth determines the dynamic range of the recording: the range of "loudnesses" which can be discerned. Note that loudness is related to the logarithm of voltage.

For CD quality sound, 16 bits are used. This corresponds to 2^16 (65,536) voltage divisions in the graph above.
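
To make sampling and sample depth concrete, here is a minimal Python sketch that samples a pure sine wave and rounds each sample to one of the available levels. It assumes a signal normalized to the range -1 to 1 and signed integer samples; the function name sample_and_quantize and the 440 Hz test tone are choices made here for illustration only.

```python
import math


def sample_and_quantize(frequency_hz: float, duration_s: float,
                        sample_rate: int = 44_100, bit_depth: int = 16) -> list[int]:
    """Sample a unit-amplitude sine wave and round each sample to a signed integer."""
    max_level = 2 ** (bit_depth - 1) - 1          # 32,767 when bit_depth is 16
    count = round(duration_s * sample_rate)       # number of samples taken
    samples = []
    for n in range(count):
        t = n / sample_rate                       # time of the n-th sample, in seconds
        voltage = math.sin(2 * math.pi * frequency_hz * t)   # value in [-1, 1]
        samples.append(round(voltage * max_level))
    return samples


samples = sample_and_quantize(440, 0.01)          # 10 ms of a 440 Hz tone
print(len(samples))                               # 441
print(samples[:5])
```

A higher sample rate produces more numbers per second; a larger bit depth makes each number finer grained.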
Of course, actual sounds are much more complex than these simple sine waves. Here is a plot of the author speaking the word "bison":

The amount of space required per minute of CD quality sound is then
( 60 seconds / minute) * ( 44,100 samples / second / channel) * 2 channels * ( 16 bits / sample) / ( 8 bits / byte)
= 10,584,000 bytes / minute / ( 2^20 bytes / MB)
= 10.09 MB / minute.
For a 3:21 (3 minutes, 21 seconds or 201 seconds) mono (one channel) audio file with a sample rate of 22,050 Hz and a sample depth of 8 bits, we require
10.09 MB * ( 201 seconds / 60 seconds) * (1 channel / 2 channels) * ( 8 bits / sample / ( 16 bits / sample)) *
(22,050 samples / second / (44,100 samples / second))
= 4.23 MB.
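
Both audio computations can be checked with a small Python sketch that multiplies the same factors directly. The function name audio_bytes and its default arguments (CD quality stereo) are assumptions made for this illustration.

```python
BYTES_PER_MB = 2 ** 20


def audio_bytes(seconds: int, sample_rate: int = 44_100,
                channels: int = 2, bits_per_sample: int = 16) -> int:
    """Bytes of uncompressed audio: seconds * samples/second * channels * bits / 8."""
    total_bits = seconds * sample_rate * channels * bits_per_sample
    return total_bits // 8


cd_minute = audio_bytes(60)                        # one minute of CD quality stereo
mono_clip = audio_bytes(201, sample_rate=22_050, channels=1, bits_per_sample=8)

print(cd_minute, f"{cd_minute / BYTES_PER_MB:.2f} MB")   # 10584000 10.09 MB
print(mono_clip, f"{mono_clip / BYTES_PER_MB:.2f} MB")   # 4432050 4.23 MB
```

Computing the 3:21 clip directly gives the same 4.23 MB as scaling the per-minute figure by the three ratios.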
Typical sample rates include 8,000, 11,025, 22,050, 44,100 and 48,000 samples / second. Lower rates are often used for spoken word recordings, while 48,000 is typically used in DVD audio (although rates up to 192 kHz, depths up to 24 bits and up to 8 channels are also supported).

We can perform a similar computation for full motion video. Using the NTSC (National Television System Committee) standard and VHS quality, full motion video involves

• a resolution of 352 pixels in width by 240 lines in height for each field;
• two fields are shown for each frame (they are interlaced: the odd lines are shown from one field, followed by the even lines from the next), and
• there are 29.97 frames per second.

If we use 24 bits for each pixel (TrueColor), we can compute the space requirements per minute of VHS video:

( 352 pixels / line ) * ( 240 lines / field ) * ( 2 fields / frame ) * ( 29.97 frames / second ) * ( 60 seconds / minute ) *
( 24 bits / pixel ) / ( 8 bits / byte ) / ( 2^20 bytes / MB )
= 869.25 MB / minute.
For a 3:21 full-motion video clip, we would need
869.25 MB * ( 201 seconds / 60 seconds )
= 2912 MB / ( 2^10 MB / GB)
= 2.844 GB.
In progressive video, the fields have been de-interlaced, so there are no longer separate fields. With DVD quality images, the resolution is 720 by 480 pixels, with 30 frames per second, so for each minute we require
( 720 * 480 pixels / frame ) * ( 30 frames / second ) * ( 60 seconds / minute ) * ( 24 bits / pixel ) / ( 8 bits / byte ) / ( 2^30 bytes / GB )
= 1.74 GB / minute.
With HDTV images, the resolution is 1920 by 1080 pixels, with 60 frames per second, requiring
1.74 GB * ( 1920 * 1080 pixels / frame * 60 frames / second ) / ( 720 * 480 pixels / frame * 30 frames / second )
= 20.88 GB / minute.
(Note that most films are shot at 24 frames per second.)
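
The video figures above can be reproduced with one Python function. This is a sketch under assumptions: the name video_bytes is hypothetical, and progressive video is modeled as one field per frame. Computed directly, the HDTV figure comes out near 20.86 GB per minute; the 20.88 in the text results from scaling the already rounded 1.74.

```python
BYTES_PER_MB = 2 ** 20
BYTES_PER_GB = 2 ** 30


def video_bytes(seconds: float, width: int, lines: int, fields_per_frame: int,
                frames_per_second: float, bits_per_pixel: int = 24) -> float:
    """Bytes of uncompressed video for the given duration and format."""
    pixels_per_frame = width * lines * fields_per_frame
    total_bits = pixels_per_frame * frames_per_second * seconds * bits_per_pixel
    return total_bits / 8


vhs_minute = video_bytes(60, 352, 240, 2, 29.97)    # interlaced: 2 fields per frame
vhs_clip = video_bytes(201, 352, 240, 2, 29.97)     # the 3:21 clip
dvd_minute = video_bytes(60, 720, 480, 1, 30)       # progressive: 1 field per frame
hdtv_minute = video_bytes(60, 1920, 1080, 1, 60)

print(f"{vhs_minute / BYTES_PER_MB:.2f} MB")    # 869.25 MB
print(f"{vhs_clip / BYTES_PER_GB:.3f} GB")      # 2.844 GB
print(f"{dvd_minute / BYTES_PER_GB:.2f} GB")    # 1.74 GB
print(f"{hdtv_minute / BYTES_PER_GB:.2f} GB")   # 20.86 GB
```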

It is clear that capturing full motion video without some sort of compression is a daunting task. It is not even possible to do so on many computers; for one second of full motion HD video, we need

20.88 GB / minute / ( 60 seconds / minute) * ( 2^10 MB / GB )
= 356 MB / second
throughput; many PCs cannot achieve the disk throughput necessary to store full motion video.
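
The throughput requirement can also be estimated straight from the frame size. The following Python sketch assumes 2^20 bytes per MB, which gives roughly 356 MB per second for HD video. The function names and the two disk speeds in the usage lines are hypothetical values chosen only to illustrate the comparison.

```python
BYTES_PER_MB = 2 ** 20


def required_mb_per_second(width: int, height: int, frames_per_second: float,
                           bits_per_pixel: int = 24) -> float:
    """Sustained write rate needed to store uncompressed video, in MB per second."""
    bytes_per_second = width * height * frames_per_second * bits_per_pixel / 8
    return bytes_per_second / BYTES_PER_MB


def can_capture(disk_mb_per_second: float, width: int, height: int,
                frames_per_second: float, bits_per_pixel: int = 24) -> bool:
    """True if the disk's sustained throughput meets the video's requirement."""
    needed = required_mb_per_second(width, height, frames_per_second, bits_per_pixel)
    return disk_mb_per_second >= needed


print(round(required_mb_per_second(1920, 1080, 60)))   # 356
print(can_capture(150, 1920, 1080, 60))                # False (hypothetical 150 MB/s disk)
print(can_capture(500, 1920, 1080, 60))                # True  (hypothetical 500 MB/s disk)
```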

For compressed audio with a bit rate of 128 K bits per second, the space required per minute is
( 60 seconds / minute) * ( 128,000 bits / second) / ( 8 bits / byte)
= 960,000 bytes / minute / ( 2^10 bytes / KB)
= 937.5 KB / minute.
Note that we used 128,000 for 128 K, since it is a frequency and not a storage requirement.
Typical bit rates are 128 K, 160 K, 192 K, 224 K, 256 K and 320 K (although lower rates are sometimes used in streaming). In general, data loss in music makes itself heard during passages in which many instruments or voices are heard, or with instruments with a large amount of high-frequency information (e.g., cymbals).

The MPEG 1 standard is used for video CDs and the MPEG 2 standard is used for DVDs; the bit rate is 1.15 Mbits per second (fixed) for MPEG 1 and ranges between 5 and 9.8 Mbits per second (variable) for MPEG 2. So, for example:
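
As a rough illustration, the space needed per minute at a given bit rate can be sketched in Python. This assumes that K and M in bit rates mean 1,000 and 1,000,000 (as with 128 K above), while storage is converted with powers of 2. The function name bytes_per_minute is hypothetical, and real variable bit rate streams will differ from these constant-rate estimates.

```python
BYTES_PER_KB = 2 ** 10
BYTES_PER_MB = 2 ** 20


def bytes_per_minute(bits_per_second: int) -> float:
    """Bytes produced in one minute by a stream with a constant bit rate."""
    return 60 * bits_per_second / 8


audio_128k = bytes_per_minute(128_000)        # compressed audio at 128 K
mpeg1 = bytes_per_minute(1_150_000)           # MPEG 1 at 1.15 Mbits per second
mpeg2_low = bytes_per_minute(5_000_000)       # MPEG 2, low end of the range
mpeg2_high = bytes_per_minute(9_800_000)      # MPEG 2, high end of the range

print(f"{audio_128k / BYTES_PER_KB:.1f} KB / minute")    # 937.5 KB / minute
print(f"{mpeg1 / BYTES_PER_MB:.2f} MB / minute")         # 8.23 MB / minute
print(f"{mpeg2_low / BYTES_PER_MB:.2f} MB / minute")     # 35.76 MB / minute
print(f"{mpeg2_high / BYTES_PER_MB:.2f} MB / minute")    # 70.10 MB / minute
```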

Software also plays an important role: MPEG 4 (and the H.264 codec, or coder/decoder, standard), which allows a reduction of target bit rates by a factor of two without reducing video quality, has enabled the introduction of Blu-ray discs and has fueled a significant increase in the amount of video found on the Internet.

Our next topic is an introduction to statistics useful in understanding computers.