SAA Early Born Digital Formats
Early born-digital audio formats (Slide Notes)
Compiled by George Blood

SLIDE 1 GBA logo
SLIDE 2 GBA logo audio
SLIDE 3 GBA audio video
SLIDE 4 SAA logo
SLIDE 5 SAA presents
SLIDE 6 Title Slide: Early born-digital audio formats

SLIDE 7 First Commercially Available Formats: PCM-1, PCM-10, PCM-F1, PCM1600/1610/1630 ...DAT
As a practical matter, I'll be speaking about the last three of these early formats. Unless you work for a record label, it is highly unlikely you'll see PCM-1 or PCM-10 tapes. In 25 years I've only seen one PCM-10. (Hint: it plays like a PCM1600, but uses a PCM-F1 decoder.)

SLIDE 8 citation: "The Dawn of Commercial Digital Recording," ARSC Journal, Tom Fine
There are earlier formats, mostly based on data recorders or custom hardware. For an excellent introduction to this Jurassic period of digital recording, see Tom Fine's article in the Spring 2008 issue of the Journal of the Association for Recorded Sound Collections.

SLIDE 8b Principles of Digital Audio, Ken Pohlmann
As with most materials we face in preservation, it helps to understand the properties of our quarry. I'm going to do a very brief introduction to digital audio. For anyone who wishes to dig deeper, I highly recommend this book. Ken's writing is extremely clear. You can look up individual topics, or read entire chapters. It is a standard desk reference at my facility.

SLIDE 9a Quantization: "The process of converting analog signals to digital." syn: digitization
For our purposes, in preservation, we are talking about Pulse Code Modulation: PCM.

SLIDE 9b Pulse Code Modulation: PCM
At regular intervals of time, the amount of voltage ("quanta") is measured and assigned a numerical value.
Or: at a given frequency (many times a second), we measure how loud the sound wave is and assign it a number.
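The measure-and-assign-a-number step described above can be sketched in a few lines of Python. This is only an illustration, not how any particular converter works: the function name `pcm_encode`, the 1kHz test tone, the 8kHz sample rate and the 16-bit word length are all assumptions chosen to keep the output short.

```python
import math

def pcm_encode(signal, sample_rate, num_samples, bit_depth):
    """Measure signal(t) at regular intervals and assign each value an integer.

    `signal` is any function of time (seconds) returning a level from -1.0 to 1.0.
    All names here are hypothetical, for illustration only.
    """
    max_code = 2 ** (bit_depth - 1) - 1      # largest positive code, e.g. 32767 for 16 bits
    codes = []
    for i in range(num_samples):
        t = i / sample_rate                  # the regular interval of time
        level = max(-1.0, min(1.0, signal(t)))   # keep the measurement in range
        codes.append(round(level * max_code))    # assign it a number
    return codes

def tone(t):
    """A 1kHz sine wave standing in for the analog input (an assumption)."""
    return math.sin(2 * math.pi * 1000 * t)

# One full cycle of the 1kHz tone, sampled 8,000 times a second at 16 bits.
samples = pcm_encode(tone, 8000, 8, 16)
print(samples)
```

Each number in the printed list is one sample: how loud the wave was at that instant, expressed as a signed 16-bit value.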
SLIDE 10 [picture of sine wave]
SLIDE 11 [picture of sine wave with sampling]

SLIDE 12 PCM≈TIFF
TIFF congruent to PCM
DPI congruent to kHz
range of colors congruent to range of volume
This is, conceptually, very straightforward. It is exactly analogous to a TIFF in imaging. At regular intervals (dots per inch vs. samples per second), you assign a value to the amount of information (light vs. volume). There are other methods of quantization.

SLIDE 13 Other quantization methods
PWM: Pulse width modulation
Delta-Sigma: sum of change
Delta-Modulation: change in value (used in SACD's "direct stream digital")
Instead of defining a fixed period in time and encoding the value of the volume, define a fixed amount of change in amplitude and encode the length of time it took: Pulse Width Modulation. Or set a starting point (typically zero), and record how much the signal has changed since the last sampling (Delta-Sigma). Or simply record whether the signal has increased or decreased in volume since the last sampling (as used in SACD).

All of these other methods have the same problem, however. It is impossible to process the information in their native encoding. You can't change the volume, you can't combine signals, you can't put simple fades at the beginning and end, you can't edit. They lack fixed reference points at any random point in their continuum. All forms of digitized audio must be converted to PCM to do any of these things. The same goes for compressed formats, such as MP3 and AAC. Your cell phone conversations travel through the ether as highly compressed digital signals, but they have to be reconstituted as PCM to be converted to analog sound. Same with your iPod. As such you can think of PCM as the "lingua franca" of digital signal processing. Always has been since 1932. Many people have attempted to get their PhDs solving this transcoding problem. All the money in Sony's R&D lab couldn't come up with a way to process DSD in its native form.
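To see why PCM is so easy to process, here is a minimal sketch of the three operations just mentioned (volume change, combining signals, a fade) applied directly to PCM sample values. The function names are hypothetical, and real tools also deal with clipping and dither, which this sketch ignores.

```python
def apply_gain(samples, gain):
    """Change the volume: scale every sample by the same factor."""
    return [round(s * gain) for s in samples]

def mix(a, b):
    """Combine two signals: add them sample by sample."""
    return [x + y for x, y in zip(a, b)]

def fade_in(samples):
    """Simple linear fade from silence up to full level."""
    n = len(samples)
    if n < 2:
        return list(samples)
    return [round(s * i / (n - 1)) for i, s in enumerate(samples)]

quieter = apply_gain([100, -200, 300], 0.5)   # half the level
combined = mix([1, 2, 3], [10, 20, 30])       # two signals summed
faded = fade_in([100, 100, 100, 100, 100])    # ramp up over five samples
```

Every sample is an absolute level at a known instant, so each operation is plain arithmetic on one number at a time. A stream that only records "up" or "down" since the last sample offers no such fixed value to scale or add.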
For preservation this means that PCM is an extremely stable format. We just have to figure out a way to store it. And that's where things get challenging.

SLIDE 14a 1's & 0's
Digital data is stored as ones and zeros, numerical values represented in binary, base 2. This is very convenient because we only need two states to represent each fundamental datum:

SLIDE 14b (read slide)
light on / light off
positive voltage / negative voltage
positive magnetic flux / negative magnetic flux
lands (light reflects) / pits (light doesn't reflect)
OK, so how many bits are we talking about?

SLIDE 15 Nyquist formula: the highest frequency that can be captured in PCM is exactly one half the sample rate
fN = fs/2
where fN is the Nyquist frequency and fs is the sampling frequency
We could spend all day on this (and many other topics I'll breeze by). For the purposes of this presentation we'll stick to basic theory. The basis is straightforward: we need to represent both halves of each cycle of the highest frequency, the top and bottom of the sine wave. Like this:

SLIDE 16 Nyquist in Action [picture of sine wave with single sample at top and bottom]
The upper limit of human hearing is classically quoted as 20kHz. So we'd need 40kHz. For reasons we'll not delve into, we need to build in some margin, say 10%. And here we have 44kHz.

SLIDE 17 44 kHz, 16 bits
20kHz target upper limit * 2 = 40kHz
10% margin = 44kHz
According to basic theory 16 bits is enough to encode all the amplitude information. Each bit equals a 6dB change in level, therefore (read slide)

SLIDE 17a 16 bits * 6dB/bit = 96dB of dynamic range
This is also convenient, because it works out to two 8-bit blocks and we can piggyback onto silicon that is available to handle 8-bit blocks, called bytes! How much data are we talking about?

SLIDE 18
44,000 samples per second
16 bits per sample
2 channels (stereo)
44,000 * 16 * 2 = 1,408,000Hz (1.4MHz)
This presents a bit of a problem for us.
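The arithmetic on the last few slides can be checked with a short script. This is just a worked example of the figures quoted above; the function names are made up for illustration, and the 6dB-per-bit figure is the rule of thumb used in the talk.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency PCM can capture: half the sample rate (fN = fs/2)."""
    return sample_rate_hz / 2

def sample_rate_with_margin(upper_limit_hz, margin_percent):
    """Twice the highest frequency we want, plus a safety margin."""
    return 2 * upper_limit_hz * (100 + margin_percent) / 100

def dynamic_range_db(bits, db_per_bit=6):
    """Rule of thumb from the talk: each bit is worth about 6dB."""
    return bits * db_per_bit

def bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Raw bits per second in the PCM stream."""
    return sample_rate_hz * bits_per_sample * channels

print(sample_rate_with_margin(20_000, 10))   # 20kHz hearing limit, 10% margin
print(nyquist_frequency(44_000))
print(dynamic_range_db(16))
print(bit_rate(44_000, 16, 2))               # the round 44kHz figure
print(bit_rate(44_100, 16, 2))               # the 44,100 rate actually used on CD
```

With the round 44,000 figure the stream is 1,408,000 bits per second; at 44,100 samples per second it is 1,411,200. Either way it is roughly 1.4MHz.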
Audio tape recorders are only designed to record frequencies up to the upper limit of human hearing, about 20kHz, or maybe with a 10% margin, 22kHz. In fact, there's a very good reason NOT to go any higher: you're getting close to the range of AM radio. In addition to the "sound" you wish to record, you start picking up (and recording) AM radio signals. Analog tape machines deliberately limit signals above 22kHz or so. There should be no input signals above 22kHz to capture (at, say, 96kHz digitization). I'm making trouble, and there are good reasons to consider 96kHz; just not because there is frequency information to capture on the original!

But I digress. We still have a problem. Our analog recording technology will only record a signal about 1/64th as high in frequency as we need to capture our simple data stream. The problem is not especially complicated.

SLIDE 19 [picture of magnetic tape head]
The "gap" in a record or playback head, that narrow slit that imprints or reads the recorded signal, can only "resolve" a frequency whose wavelength is longer than twice the size of the gap. If the wavelength is shorter than this, both the positive and negative sides of the wave appear to the head at the same time, canceling each other out. Only bigger, longer wavelengths can be read.

SLIDE 20 [superimpose sine wave over gap to show +&- together over gap]

SLIDE 21 (read slide) "How can we increase the size of the signal relative to the head gap?"

SLIDE 21B We could move the tape faster.

SLIDE 21C Or we could move the head in relation to the tape!
This was the fundamental breakthrough of the Ampex team (including one Ray Dolby, of noise reduction fame) that gave us the Quadruplex video recorder.

SLIDE 22 [picture of helical scan signal on tape]
The tape moves at a linear speed, and the head spins, leaving a signal recorded across the tape. The tape moves on a little bit, and the head makes another pass, leaving these stripes.
If you look down on the spinning head and if you could tape the top of the tape to the bottom, it would form a helix. This is how video is recorded. The color carrier frequency is 3.58MHz, more than twice the frequency of our lowly digital audio stream.

SLIDE 23 [picture of helix] "Helical Scan"
Let's rewind a little bit:

SLIDE 24
1 / 0
light on / light off
positive voltage / negative voltage
positive magnetic flux / negative magnetic flux
lands (light reflects) / pits (light doesn't reflect)
Let's add

SLIDE 24b
white / black
Take the zeros and ones, convert them to black and white video, and voila! A signal you can send to an off-the-shelf machine. This greatly simplifies the system, piggybacking on existing, proven technology. Manufacturers build analog-to-digital-to-video interfaces for video recorders. Moan all you want about obsolete video formats that carry digital audio in this way. This is MANY times better than if we were dealing with dedicated, proprietary hardware for these early born-digital formats!

SLIDE 25 [picture of a video monitor showing digital audio tape played]
When this first came out, we used to talk about watching the video monitor, as if it would tell us something about the digital audio stream. The only thing you could ever tell was whether you had good-enough video to get the data back!

SLIDE 26 [video]

SLIDE 27
As a consequence of this marriage to video, our 44kHz sample rate becomes 44,100Hz (44.1kHz), in case you ever wondered where that came from.
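For the curious, the commonly cited arithmetic behind 44,100 can be sketched as below. The figures are not from these notes: they are the usual textbook account of how samples were packed into video lines (three samples per line, on the lines of each field that carry picture information), and the function name is made up for illustration.

```python
def samples_per_second(fields_per_second, usable_lines_per_field, samples_per_line):
    """Audio samples carried per second when stored as lines of video."""
    return fields_per_second * usable_lines_per_field * samples_per_line

# 60-field (525-line monochrome) video: 245 usable lines per field, 3 samples per line
ntsc_rate = samples_per_second(60, 245, 3)

# 50-field (625-line) video: 294 usable lines per field, 3 samples per line
pal_rate = samples_per_second(50, 294, 3)

print(ntsc_rate, pal_rate)
```

Both television systems land on the same number, which is the usual explanation for why 44,100 was chosen: the same sample rate could be carried by video recorders on either standard.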