The soundtrack to “Naked Came the Stranger” has exactly as many channels of audio as most non-blockbuster films of its day: one. Distribpix includes this “Original Mono” track on its “Naked Came the Stranger” DVD and it sounds spectacular: crisp, clean, and surprisingly dynamic. It was transferred from the director’s own 35mm blow-up internegative, and we’re proud to say that it sounds better than on any prior release. It’s the audio that plays if you just drop the DVD in the player and press “play.”
So why did Distribpix include Dolby Digital 5.1 audio on the DVD as well? And how in the hell do you get 5 channels of audio out of a one-channel soundtrack?
The surprising answer is that while the film’s soundtrack was mixed FOR monaural (one-channel) audio, it wasn’t mixed IN monaural audio. At least not entirely. Hiding at the bottom of a box in Radley’s archive was a stack of four “3-stripe mag” tapes labeled “Naked Reels 1-4”. While the final mix for “Naked” was destined to be monaural, the original audio engineers used multi-track tape for the mixing process. (The “tape” in question is actually clear 35mm film stock, coated with stripes of magnetic oxide that function as audio tracks.) Standard practice is to store dialogue on one stripe, sound effects on another, and music on the one or two remaining stripes (either mono or stereo).
When such sources (generically called “3-stripe mags”) can be located for a film, the shortest path to 5.1-channel audio is to simply specify a relatively conservative position for each source track on the soundstage, tell the mixing software to render out Dolby Digital 5.1 audio, and call it a day. This sort of “upmix” can feel a little less constrained than a monaural track, and the separated elements from the high-fidelity multi-track tapes often allow for improved sound quality as well.
But surround sound is meant to be immersive. If a car whizzes past, the sound of its motor should too. If a scene is set in the middle of a party, the chatter of guests and clink of cocktail glasses should surround the listener on all sides. Making sounds reflect the position and motion of their on-screen sources helps draw the audience into the film.
Mixing positional audio from the original elements is labor-intensive; it is rarely done for older catalog titles. But Metzger’s Henry Paris films are well-loved and historically important. Distribpix is committed to making its releases of these films reflect that importance. We wanted to include a proper 5.1 audio mix for “Naked”: positional and immersive, but entirely true to the original mono mix. We used only original elements, with no music or sound-effect substitutions. We hewed strictly to the editorial decisions of the original mix. And we included the original mono track for posterity and for viewers who might prefer it.
So how did we get from a stack of 3-stripe mags to a Dolby Digital 5.1 mix? Let’s walk through a little of the process.
First, Steven took the 3-stripe mag tapes to Duart Labs in New York City. Sound engineers at Duart dug through their old equipment bins to locate a head capable of reading the format, then transferred the tracks to lossless audio files on a DVD-ROM. The dialogue, effects, and two-channel music tracks on each of the four tapes produced 16 digital audio files (four tracks on each of four reels).
At this point, Ian Culmell stepped in, importing the files into Adobe Premiere Pro and carefully aligning the audio in each file with the visuals. When frame-accurate alignment of each file was complete, the 16 files were stitched together to form 4 continuous audio tracks.
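The bookkeeping behind that stitching step can be sketched in a few lines of Python. This is only an illustration of how 16 reel files collapse into 4 continuous tracks, not the workflow actually used in Premiere Pro; the file names, track labels, and the `stitch_order` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical labels: one file per reel per track.
ROLES = ["dialogue", "effects", "music_left", "music_right"]
REELS = [1, 2, 3, 4]

def reel_files():
    """List the 16 transfer files as (reel, role, filename) tuples."""
    return [(reel, role, f"naked_reel{reel}_{role}.wav")
            for reel in REELS for role in ROLES]

def stitch_order(files):
    """Group files by role, keeping each group in reel order."""
    tracks = defaultdict(list)
    for reel, role, name in sorted(files):  # tuples sort by reel number first
        tracks[role].append(name)
    return dict(tracks)

tracks = stitch_order(reel_files())
for role, names in tracks.items():
    print(role, "->", names)
```

Each of the four resulting lists names the reel files that make up one continuous track, in playback order.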
Next, each track was previewed to locate “rogue elements.” Ambient sounds (traffic, party chatter, etc.) are usually stored on the “effects” track. This necessitates storing other, overlapping sound effects on less appropriate tracks (like “music (left)” or “dialogue”). Likewise, ADR (dialogue re-recorded after filming) can show up on the wrong track if it overlaps with existing dialogue. None of this presents a problem in a mono mix, but these misplaced elements are best sorted out before multichannel mixing begins.
The mixing process itself involves specifying the volume and soundstage “position” for each track at each moment in the film. In the final mix, each of the 5 speakers (left, center, right, left-surround and right-surround) has its own “channel” on the DVD (plus an additional low-frequency channel for the subwoofer, the “.1” in the 5.1 designation). The mixing software does the work of calculating what percentage of a given sound should come out of each speaker to place it in the requested location relative to the audience.
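To give a feel for that calculation, here is a minimal sketch of one common approach, constant-power panning, written in Python. It is not the algorithm any particular mixing package uses; the `pan_gains` function, its coordinate ranges, and the speaker labels are assumptions made for illustration.

```python
import math

def pan_gains(x, y):
    """Return a gain for each of the five main speakers.

    x: -1.0 (far left) to 1.0 (far right)   -- assumed convention
    y:  0.0 (front)    to 1.0 (rear)        -- assumed convention
    """
    half_pi = math.pi / 2
    # Split the signal between the front and rear rows of speakers.
    front = math.cos(y * half_pi)
    rear = math.sin(y * half_pi)
    # Front row: pan between left and center, or between center and right.
    if x <= 0:
        t = x + 1.0
        left, center, right = math.cos(t * half_pi), math.sin(t * half_pi), 0.0
    else:
        left, center, right = 0.0, math.cos(x * half_pi), math.sin(x * half_pi)
    # Rear row: pan directly between the two surround speakers.
    p = (x + 1.0) / 2.0
    return {
        "L": front * left,
        "C": front * center,
        "R": front * right,
        "Ls": rear * math.cos(p * half_pi),
        "Rs": rear * math.sin(p * half_pi),
    }

gains = pan_gains(0.5, 0.25)  # a little right of center, slightly back
total_power = sum(g * g for g in gains.values())
```

Because the gains come from sine/cosine pairs, the squared gains always sum to 1, so a sound keeps roughly the same perceived loudness as it moves around the room.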
Sound effects are positioned on the soundstage, so it’s natural to assume that dialogue is positioned this way as well. In the earliest surround-sound films (the original audio mix for Spartacus, for example), it was. Doing so has a disorienting side effect, however. Any time the camera cuts to a different angle, characters’ voices jump as well. The apparent direction of a character’s voice might snap from left to right and then front to back within the span of a single sentence.
These sorts of directional jumps are jarring, not immersive. So in practice, dialogue is most often anchored mostly or completely at the center speaker. (This is an assumption modern receivers exploit when offering “dialogue enhancement.”) Exceptions are made in certain situations, but dialogue remains at the front and center of the soundstage for the majority of most films. The listener’s brain uses visual cues to compensate, often perceiving the dialogue as positional even when it is not.
The first few minutes of “Naked” provide an illustrative example of what is involved in mixing. There’s an intermittently chirping bird, a continuously playing television (blasting out Antony Balch’s 1970 film “Secrets of Sex”), two alarm clocks going off, a switch getting flipped, the bed cover getting pulled back, Gilly blowing across a jar of menthol, and some dialogue. All of these sound elements must be placed on the soundstage using keyframes to specify their positions on two panners (“front-back” and “left-right”) over time as the camera tracks and pans around the room in near-constant motion. Volume levels for each track are specified the same way. A rough count places the number of audio keyframes in the first two minutes of “Naked” at over forty.
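For readers curious what a keyframe amounts to, the idea can be sketched in Python as a timestamped value with straight-line interpolation in between. Real mixing software typically offers other curve shapes as well; the `value_at` helper and the sample keyframe values below are hypothetical, not taken from the actual mix.

```python
from bisect import bisect_right

def value_at(keyframes, t):
    """Linearly interpolate (time_seconds, value) keyframes sorted by time."""
    times = [k[0] for k in keyframes]
    if t <= times[0]:
        return keyframes[0][1]   # hold the first value before the first keyframe
    if t >= times[-1]:
        return keyframes[-1][1]  # hold the last value after the last keyframe
    i = bisect_right(times, t)   # index of the first keyframe later than t
    (t0, v0), (t1, v1) = keyframes[i - 1], keyframes[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

# Hypothetical left-right panner for one effect: hard left at 0 s,
# hard right at 4 s (-1.0 = left, 1.0 = right).
left_right = [(0.0, -1.0), (4.0, 1.0)]
print(value_at(left_right, 1.0))  # -0.5
```

A second keyframe list of the same shape would drive the front-back panner, and a third the volume, which is how a two-minute scene quickly accumulates dozens of keyframes.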
After mixing, each scene is played back to check for errors in position and relative volume. Notes are taken, corrections are made, and the process repeats until the scene’s audio is deemed satisfactory. Even small oversights can be highly distracting. In Billy and Phyllis’s first sex scene, additional moans (recorded as ADR, but missed in the first pass looking for rogue elements) appeared on the sound effects track. Because the preceding sound effects had been positioned off-screen to the left, half of Phyllis’s vocalizations seemed to come from her and the other half from a disembodied “other woman” to her left. Identifying and correcting these issues is time-consuming, but essential.
At the end of the process, the mixing software generates an audio file for each speaker. These files are fed into a Dolby Digital encoder, in our case Minnetonka Audio’s SurCode. DialNorm (to ensure volume parity with other audio tracks) and other esoteric settings are specified, another round of QC is performed, and the end result is imported as a DVD asset in the authoring process.
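As a rough illustration of how DialNorm produces volume parity: in Dolby Digital the setting states the track’s average dialogue level as a value from 1 to 31 (meaning -1 to -31 dB relative to full scale), and the decoder turns the track down by the difference from 31. The Python sketch below assumes that behavior; the `dialnorm_attenuation_db` name and the sample values are ours for illustration, not settings from this disc.

```python
def dialnorm_attenuation_db(dialnorm):
    """Attenuation in dB a decoder applies for a dialnorm value of 1-31."""
    if not 1 <= dialnorm <= 31:
        raise ValueError("dialnorm must be between 1 and 31")
    return 31 - dialnorm

# Hypothetical example: two tracks whose dialogue was measured at
# different levels end up at the same playback level.
mono_cut = dialnorm_attenuation_db(27)      # dialogue at -27 dB
surround_cut = dialnorm_attenuation_db(24)  # dialogue at -24 dB
print(mono_cut, surround_cut)  # 4 7
```

With those settings, dialogue on both tracks would play back at the same -31 dB reference, so switching audio tracks mid-film should not produce a jump in volume.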
If that all sounds tedious… it is. Listening straight through to each of the four source tracks in an hour-and-a-half film is a six-hour undertaking. Positioning each sound over time, adjusting relative volumes between elements, comparing the new mix to the original, and other QC all add dozens of hours to the process. But we believe the end results sound amazing and were well worth the effort. To check it out, select the “Remastered DD 5.1” audio track from the “Setup” menu.