Storage Requirements

We can also use units to guide our computations of space requirements for multimedia applications. Consider the question of how much video RAM (Random Access Memory) is required for a given monitor (or display) setting. The display setting is specified by its resolution (width and height in pixels) and its color depth (bits per pixel).

Common color depths are 16, 24 and 32 bits per pixel.

The ratio of the width in pixels to the height in pixels is often referred to as the aspect ratio. Common aspect ratios include:

A typical display resolution might be 1024 pixels wide by 768 pixels in height, with a depth of 65 thousand (65,536) colors per pixel.
Vertical resolution is sometimes given in lines, so that the above would be denoted as 768 lines, each of which is 1024 pixels wide. One can think of the horizontal resolution in units of "pixels per line", and vertical resolution as "lines per image". In any case, a two dimensional video image is represented as a simple list of pixels; in this example, the computer would start a new line after every 1024 pixels.
Since 65,536 is 2^16, we will need 16 bits for each pixel. We then compute the amount of video memory required:

  1. ? bytes = number of pixels * bytes per pixel
  2. number of pixels = 1024 * 768 = 786,432 pixels
  3. bytes per pixel = 16 bits / pixel / ( 8 bits / byte ) = 2 bytes / pixel
  4. bytes of video RAM = 786,432 pixels * 2 bytes / pixel = 1,572,864 bytes
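The same steps can be sketched in Python. This is only an illustration: the function names `bits_for_colors` and `video_ram_bytes` are made up for this example and do not come from any library.

```python
MB = 2**20  # powers of 2, since this is a storage requirement


def bits_for_colors(colors):
    """Smallest number of bits that can distinguish `colors` different values."""
    return (colors - 1).bit_length()


def video_ram_bytes(width_px, height_px, bits_per_pixel):
    """Bytes needed to hold one full-screen image."""
    pixels = width_px * height_px        # number of pixels
    return pixels * bits_per_pixel // 8  # 8 bits / byte


depth = bits_for_colors(65_536)             # 16 bits / pixel
size = video_ram_bytes(1024, 768, depth)    # 1,572,864 bytes
print(depth, size, size / MB)               # size / MB is 1.5
```

Writing the units next to each factor, as in the comments, makes it easy to check that everything except bytes cancels.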

We can again use proportionalities to do additional problems:

Here we have used powers of 2 for conversions to MB, since we are dealing with storage requirements.

In computer displays, double buffering (or multiple buffering) is used to prepare an image off-screen before displaying it. This means that the amount of RAM used for displays is often an integer multiple of the amounts we have computed here.


The space computations for audio files can be done in much the same way. A microphone converts sound into changing voltages:
In order to digitize the sound, we need to convert the changing voltages into a series of numbers. To do this, we sample the voltages periodically:

The sample rate determines the "fidelity" of the recorded sound (how similar it sounds to the original).

For CD quality sound, the sample rate is 44,100 samples per second. Of course, for stereo recordings we need twice that many samples per second, one for each channel. We also need to decide how many bits are necessary to hold the voltage for each sample:

The sample depth determines the dynamic range of the recording: the range of "loudnesses" which can be discerned. Note that loudness is related to the logarithm of voltage.

For CD quality sound, 16 bits are used. This corresponds to 2^16 voltage divisions in the graph above.
Of course, actual sounds are much more complex than these simple sine waves. Here is a plot of the author speaking the word "bison":

The amount of space required per minute of CD quality sound is then
( 60 seconds / minute) * ( 44,100 samples / second / channel) * 2 channels * ( 16 bits / sample) / ( 8 bits / byte)
= 10,584,000 bytes / minute / ( 2^20 bytes / MB)
= 10.09 MB / minute.
For a 3:21 (3 minutes, 21 seconds or 201 seconds) mono (one channel) audio file with a sample rate of 22,050 Hz and a sample depth of 8 bits, we require
10.09 MB * ( 201 seconds / 60 seconds) * (1 channel / 2 channels) * ( 8 bits / sample / ( 16 bits / sample)) *
(22,050 samples / second / (44,100 samples / second))
= 4.23 MB.
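As a cross-check, both audio results can be computed directly from the sample rate, channel count and sample depth rather than by proportion. The sketch below is illustrative only; `audio_bytes` is a hypothetical helper name.

```python
MB = 2**20  # bytes per MB, using powers of 2 for storage


def audio_bytes(seconds, samples_per_second, channels, bits_per_sample):
    """Uncompressed size of a recording, in bytes."""
    return seconds * samples_per_second * channels * bits_per_sample // 8


cd_minute = audio_bytes(60, 44_100, 2, 16)   # one minute of CD quality stereo
clip = audio_bytes(201, 22_050, 1, 8)        # the 3:21 mono clip

print(f"{cd_minute:,} bytes = {cd_minute / MB:.2f} MB / minute")
print(f"{clip:,} bytes = {clip / MB:.2f} MB")
```

The direct computation agrees with the proportional one: 10.09 MB per minute for CD quality sound and 4.23 MB for the clip.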
Typical sample rates include 8,000, 11,025, 22,050, 44,100 and 48,000 samples / second. Lower rates are often used for spoken word recordings, while 48,000 is typically used in DVD audio (although rates up to 192 kHz, depths up to 24 bits and up to 8 channels are also supported).

We can perform a similar computation for full motion video. Using the NTSC (National Television System Committee) standard and VHS quality, full motion video involves 352 pixels per line, 240 lines per field, 2 fields per frame and 29.97 frames per second.

If we use 24 bits for each pixel (TrueColor), we can compute the space requirements per minute of VHS video:

( 352 pixels / line ) * ( 240 lines / field ) * ( 2 fields / frame ) * ( 29.97 frames / second ) * ( 60 seconds / minute ) *
( 24 bits / pixel ) / ( 8 bits / byte ) / ( 2^20 bytes / MB )
= 869.25 MB / minute.
For a 3:21 full-motion video clip, we would need
869.25 MB * ( 201 seconds / 60 seconds )
= 2912 MB / ( 2^10 MB / GB)
= 2.844 GB.
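The VHS computation can be sketched in Python as a product of the same factors. The function name `interlaced_video_bytes` and its parameter names are assumptions made for this illustration.

```python
MB = 2**20
GB = 2**30


def interlaced_video_bytes(seconds, pixels_per_line, lines_per_field,
                           fields_per_frame, frames_per_second,
                           bits_per_pixel):
    """Uncompressed size of interlaced video, in bytes."""
    pixels_per_frame = pixels_per_line * lines_per_field * fields_per_frame
    bits = pixels_per_frame * frames_per_second * seconds * bits_per_pixel
    return bits / 8  # 8 bits / byte


vhs_minute = interlaced_video_bytes(60, 352, 240, 2, 29.97, 24)
vhs_clip = interlaced_video_bytes(201, 352, 240, 2, 29.97, 24)

print(f"{vhs_minute / MB:.2f} MB / minute")
print(f"{vhs_clip / GB:.3f} GB for a 3:21 clip")
```

Because 29.97 is not an integer, the result is a floating point number; the printed values are rounded to the precision used in the text.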
In progressive video, the fields have been de-interlaced, so there are no longer separate fields. With DVD quality images, the resolution is 720 by 480 pixels, with 30 frames per second, so for each minute we require
( 720 * 480 pixels / frame ) * ( 30 frames / second ) * ( 60 seconds / minute ) * ( 24 bits / pixel ) / ( 8 bits / byte ) / ( 2^30 bytes / GB )
= 1.74 GB / minute.
With HDTV images, the resolution is 1920 by 1080 pixels, with 60 frames per second, requiring
1.74 GB * ( 1920 * 1080 pixels / frame * 60 frames / second ) / ( 720 * 480 pixels / frame * 30 frames / second )
= 20.88 GB / minute.
(Note that most films are shot at 24 frames per second.)

It is clear that capturing full motion video without some sort of compression is a daunting task. It is not even possible to do so on many computers; for one second of full motion HD video, we need

20.88 GB / minute / ( 60 seconds / minute) * ( 2^10 MB / GB)
= 356 MB / second
throughput; many PCs cannot achieve the disk throughput necessary to store full motion video.
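A short Python sketch of the progressive video computations follows. It works from the raw pixel counts rather than by scaling a rounded result; `progressive_bytes_per_second` is a hypothetical helper, and 24 bits per pixel is assumed as in the text.

```python
MB = 2**20
GB = 2**30


def progressive_bytes_per_second(width_px, height_px, frames_per_second,
                                 bits_per_pixel=24):
    """Uncompressed data rate of progressive video, in bytes per second."""
    return width_px * height_px * frames_per_second * bits_per_pixel // 8


dvd = progressive_bytes_per_second(720, 480, 30)
hdtv = progressive_bytes_per_second(1920, 1080, 60)

print(f"DVD:  {dvd * 60 / GB:.2f} GB / minute")
print(f"HDTV: {hdtv * 60 / GB:.2f} GB / minute, {hdtv / MB:.0f} MB / second")
```

Computed this way, HDTV comes out at about 20.86 GB per minute; the 20.88 in the text results from multiplying the already rounded 1.74 GB by 12. Either way the required throughput is roughly 356 MB per second.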

A series of compression standards has been established by the Moving Picture Experts Group (MPEG), making the use of digital audio and video practical for today's computers. The compression techniques are "lossy" in that information is lost during compression; they are variable in that you can choose the degree of loss. In each case, a target bit rate is established, and the space requirement is then the product of the bit rate and the length of the recording. For instance, an MP3 audio file (technically MPEG audio layer 3) recorded at 128 Kbits per second requires

( 60 seconds / minute) * ( 128,000 bits / second) / ( 8 bits / byte)
= 960,000 bytes / minute / ( 2^10 bytes / KB)
= 937.5 KB / minute.
Note that we used 128,000 for 128 K, since it is a rate (bits per second) and not a storage requirement.
Typical bit rates are 128 K, 160 K, 192 K, 224 K, 256 K and 320 K bits per second (although lower rates are sometimes used in streaming). In general, data loss in music makes itself heard during passages in which many instruments or voices are heard, or with instruments with a large amount of high-frequency information (e.g., cymbals).
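The space needed per minute at each of these typical bit rates can be tabulated with a few lines of Python. This is a sketch only; `kb_per_minute` is a made-up name, and it follows the convention above of powers of 10 for the rate and powers of 2 for storage.

```python
KB = 2**10  # bytes per KB (storage, so a power of 2)


def kb_per_minute(kbits_per_second):
    """KB of storage per minute of audio at the given bit rate."""
    bits_per_second = kbits_per_second * 1000  # rate, so a power of 10
    bytes_per_minute = 60 * bits_per_second / 8
    return bytes_per_minute / KB


for rate in (128, 160, 192, 224, 256, 320):
    print(f"{rate} K bits / second: {kb_per_minute(rate):.1f} KB / minute")
```

The first line of output reproduces the 937.5 KB per minute computed above; the remaining rates scale in direct proportion.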

The MPEG 1 standard is used for video CDs and the MPEG 2 standard is used for DVDs; the bit rate is 1.15 Mbits per second (fixed) for MPEG 1 and between 5 and 9.8 Mbits per second (variable) for MPEG 2. So, for example:
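One way to work such an example is sketched below in Python. The 3:21 clip length is borrowed from the earlier examples purely as an assumption, and `compressed_mb` is a hypothetical helper name; the bit rates are the ones quoted above.

```python
MB = 2**20  # bytes per MB (storage, so a power of 2)


def compressed_mb(seconds, mbits_per_second):
    """MB of storage for a recording at a constant target bit rate."""
    bits = seconds * mbits_per_second * 1_000_000  # rate uses powers of 10
    return bits / 8 / MB


clip_seconds = 201  # 3 minutes, 21 seconds (assumed for illustration)

print(f"MPEG 1 at 1.15 Mbits / second: {compressed_mb(clip_seconds, 1.15):.1f} MB")
print(f"MPEG 2 at 5 Mbits / second:    {compressed_mb(clip_seconds, 5):.1f} MB")
print(f"MPEG 2 at 9.8 Mbits / second:  {compressed_mb(clip_seconds, 9.8):.1f} MB")
```

For MPEG 2 the bit rate is variable, so the two MPEG 2 figures should be read as rough lower and upper bounds rather than as exact file sizes.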

It is interesting to compare the storage requirements for a similar amount of multimedia data; it is clear that computer multimedia is not practical without compression technologies, and that multimedia fuels the trend toward larger disk drives for computers.

Software also plays an important role: MPEG 4 (and the H.264 codec, or coder/decoder, standard), which allows a reduction of target bit rates by a factor of two without reducing video quality, has enabled the introduction of Blu-ray discs and has fueled a significant increase in the amount of video found on the Internet.

Our next topic is an introduction to statistics useful in understanding computers.



©2012, Kenneth R. Koehler. All Rights Reserved. This document may be freely reproduced provided that this copyright notice is included.
