2009-04-28 Collecting and organising audio

For further details, please refer to: IPT: Collecting & Organising Multimedia

Measures of performance: audio data

Three factors affect the perceived quality of collected audio data: sampling rate, bit resolution, and the number of channels (stereo or mono).
Sampling rate: To collect audio data, samples must be taken. Each sample is a measurement of the sound wave's amplitude (intensity) at the instant of sampling; pitch is not measured directly, but emerges from how the amplitude changes from one sample to the next. If enough samples are taken every second, the resulting recording, when played back to a human listener, sounds like a continuous sequence rather than a set of discrete measurements. The sampling rate must therefore be high enough: the higher it is, the more "convincing" the collected audio data, and the higher the pitches that can be captured. In CD-quality recordings, the sampling rate is 44.1 kilohertz, that is, 44,100 samples per second.
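As a rough illustration of sampling, the sketch below takes amplitude measurements of a pure tone at the CD-quality rate. The function name sample_sine and the 440 Hz test tone are assumptions made for this example only; a real recording would read values from a microphone rather than compute them.

```python
import math

SAMPLE_RATE_HZ = 44_100  # CD-quality sampling rate from the notes above


def sample_sine(frequency_hz, duration_s, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return one amplitude measurement of a sine tone per sampling instant."""
    sample_count = round(duration_s * sample_rate_hz)
    return [
        math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(sample_count)
    ]


# One second of a 440 Hz tone (hypothetical test signal).
samples = sample_sine(frequency_hz=440, duration_s=1.0)
print(len(samples))  # 44100 measurements for one second of audio
```

Halving the sampling rate would halve the number of measurements per second, giving a coarser picture of the same sound.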

Resolution: The bit resolution of an audio recording can be compared with the bit depth of an image. The bit resolution is the number of bits used to store each sample, and it determines how finely the loudness of the sound can be measured: more bits give more distinct levels and a wider range between the quietest and loudest sounds that can be recorded. (The range of pitches that can be captured is set by the sampling rate, not by the bit resolution.) DVD audio is generally 24-bit, CD audio is 16-bit, and for many other purposes (particularly human speech), 8-bit audio is adequate.
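To make the effect of bit resolution concrete, here is a minimal sketch that counts the levels available at each bit resolution and rounds an amplitude to the nearest level. The helper names are hypothetical, and the sketch assumes amplitudes between -1.0 and 1.0 stored as signed integers, which is one common convention rather than the only one.

```python
def quantisation_levels(bits):
    """Number of distinct values a sample of the given bit resolution can hold."""
    return 2 ** bits


def quantise(amplitude, bits):
    """Round an amplitude in [-1.0, 1.0] to the nearest signed integer level."""
    max_level = 2 ** (bits - 1) - 1            # e.g. 32767 for 16-bit
    clipped = max(-1.0, min(1.0, amplitude))   # out-of-range input is clipped
    return round(clipped * max_level)


for bits in (8, 16, 24):
    print(bits, "bits:", quantisation_levels(bits), "levels")
```

An 8-bit sample can take only 256 values, while a 16-bit sample can take 65,536, so quiet details that fall between two 8-bit levels can still be told apart at 16 bits.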
Stereo/mono: Humans use their ears to sense direction, which is why there are two of them. If sound gets to one ear before the other, the relative location of the source of the sound can be approximated. In order to trick the human ears and hence convince the human brain that the sound is not coming from a set of speakers, audio data can be recorded (and played back) in more than one channel - this is known as stereo, which does not mean "two", but instead means "solid". If only one channel is used, then the audio is said to be "mono" - meaning "one".

There are different ways of collecting audio data in stereo. The channels can be recorded either from the same location (coincident pair) or from different locations (spaced pair). In a coincident pair, either two directional microphones are placed together and angled apart, typically at about 90 degrees ("x/y setup", also known as "intensity stereophony"), or a forward-facing microphone is combined with a sideways-facing bidirectional microphone ("m/s setup", also known as "mid-side setup"). If the channels are recorded from different locations, the setup is known as a "time of arrival" setup.
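A mid-side recording is not heard directly as left and right; it is commonly converted using a simple sum and difference of the two signals. The sketch below shows that conversion and its inverse. The function names are made up for this example, and it assumes both signals are lists of equal length with equal gain.

```python
def ms_to_lr(mid, side):
    """Convert mid/side sample lists to left/right by sum and difference."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def lr_to_ms(left, right):
    """Inverse conversion: recover mid/side sample lists from left/right."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


# Hypothetical sample values, chosen only to show the arithmetic.
left, right = ms_to_lr([0.5, 0.25, -0.5], [0.25, 0.0, -0.25])
print(left)   # [0.75, 0.25, -0.75]
print(right)  # [0.25, 0.25, -0.25]
```

Where the side signal is zero, left and right are identical, which is the same as mono.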

Jacaranda questions 7.5

1. List two scanner settings that will affect image file sizes.
Bit depth and DPI (dots per inch)
2. List the digitising settings that will affect the quality of an audio recording.
Sampling rate, bit resolution, stereo/mono
3. Which storyboard layout gives the user the most control over a presentation?
Non-linear storyboard layout gives the user the most control over the running of the presentation.
4. Most web video capture cameras produce very 'jerky' motion. Explain why.
Most web video capture cameras (webcams) produce 'jerky' motion because they capture at a low frame rate. A higher frame rate would produce larger video files and use more bandwidth, which is undesirable because webcam video is generally transmitted over internet connections with limited bandwidth.
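The link between frame rate and bandwidth can be sketched with some simple arithmetic. The 640 × 480 frame size, 24-bit colour and the two frame rates below are assumptions chosen for illustration, and the figures are for uncompressed video; real webcams compress their output, so actual rates are much lower.

```python
def raw_video_bits_per_second(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed data rate: bits in one frame multiplied by frames per second."""
    bits_per_frame = width * height * bits_per_pixel
    return bits_per_frame * frames_per_second


# Assumed example settings, not taken from any particular webcam.
smooth = raw_video_bits_per_second(640, 480, 24, 30)
jerky = raw_video_bits_per_second(640, 480, 24, 10)
print(smooth)  # 221184000
print(jerky)   # 73728000
```

Dropping from 30 to 10 frames per second cuts the data to one third, at the cost of visibly less smooth motion.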

5. Why is an audio CD restricted to about an hour of recorded music?
The Red Book audio standard used by most audio CDs requires tracks to be stored with 2 channels of audio, each with 16-bit values sampled at 44100 Hz.

Hence, 2 × 16 × 44100 bits are produced every second
2 × 16 × 44100 = 1,411,200
Hence, 1,411,200 bits are produced every second

Compact discs typically have a storage capacity of 700 MB,
700 × 1024 × 1024 × 8 = 5,872,025,600
Hence, 5,872,025,600 bits can be stored on a CD

5,872,025,600 ÷ 1,411,200 = 4,161.01587
Hence, 4,161.01587 seconds of audio can be stored on an audio CD.

4,161.01587 ÷ 60 = 69.3502645
69.3502645 minutes = 69 minutes, 21.016 seconds
Hence, about 69 minutes and 21 seconds of audio can be stored on a standard audio CD, which is a little over an hour.
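
The same working can be checked with a short script. This is only a sketch of the calculation above, using the same assumptions (2 channels, 16 bits, 44,100 Hz, and a 700 MB disc counted as 700 × 1024 × 1024 bytes); the variable names are chosen for this example.

```python
CHANNELS = 2
BITS_PER_SAMPLE = 16
SAMPLE_RATE_HZ = 44_100
CAPACITY_MB = 700  # assumed disc capacity, as in the working above

# Bits produced by one second of audio.
bits_per_second = CHANNELS * BITS_PER_SAMPLE * SAMPLE_RATE_HZ

# Total bits the disc can hold.
capacity_bits = CAPACITY_MB * 1024 * 1024 * 8

# Playing time, split into whole minutes and leftover seconds.
total_seconds = capacity_bits / bits_per_second
minutes, seconds = divmod(total_seconds, 60)

print(bits_per_second)                        # 1411200
print(capacity_bits)                          # 5872025600
print(f"{int(minutes)} min {seconds:.0f} s")  # 69 min 21 s
```

Changing CAPACITY_MB shows how the playing time scales with the size of the disc.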
6. Most consumer video equipment produces 'composite video' signals instead of the RGB used by computer VDUs. What is 'composite video'?
Composite video is a method of transmitting analog video in which the brightness, colour and synchronisation information are combined into a single signal carried on one cable, rather than being sent as separate red, green and blue signals as in RGB. Audio is not part of the composite video signal and is carried separately.
7. Digital cameras use CCD instead of film. What is CCD and how does it work?
CCD stands for Charge-Coupled Device. The CCD inside a camera works by hosting many photosensitive sensors on a flat surface. Each sensor produces a small electrical charge when exposed to light. The charge varies with the intensity of the light that the sensor is exposed to; the sensors do not distinguish colour by themselves, so for colour photography each sensor is covered by a red, green or blue filter. The charge output of each sensor is converted into a digital signal, and then interpreted as a pixel in the resulting digital image.