19 Mar
So, when is it appropriate to record at 48.048 kHz, and how should you set your recorder when doing so? The answer lies in your project’s sound and picture workflow – not just on set, but all the way through film-to-video transfer and post-production. If you understand the workflow, and how your audio files are being used, you will better understand when 48.048 kHz sound can benefit the post-production process, and ultimately the quality of sound in the finished film or television program.
There are 4 rules of thumb that determine whether or not 48.048 kHz workflow is possible in your project, and how it must be used. These are:
Rule #1: A 48.048 kHz workflow only works if picture is being shot at true 24 fps, and picture and sound editorial are being done in NTSC 29.97 or 23.98 HD video.
Rule #2: A 48.048 kHz workflow only works when the project is finishing and releasing on video (not film).
Rule #3: 48.048 kHz files are only useful if they are stamped using –F mode.
Rule #4: Sound editorial must agree that it’s a good idea to use 48.048 kHz sound, and the producers and post supervisor must agree with them.
The first rule eliminates anything being shot at 23.98 fps or 29.97 fps video. For these shoots you would normally use 48 kHz and 29.97 NDF settings (or 23.98 sometimes, but that’s a whole other column!). The second rule eliminates feature film shoots from the list of possibilities. The third rule applies to how you need to set your recorder in order to ensure that the production gets the editorial benefits of shooting sound at 48.048 kHz. The last rule is probably the most important – if sound editorial is not on board, or worse, not even chosen on the project yet, 48.048 kHz workflow is absolutely not recommended.
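The four rules can be read as a simple checklist. The sketch below is only an illustration of that reading: the `Project` fields and function names are hypothetical, invented for this example, and the frame rates are written in the rounded form used in this column (23.98 and 29.97).

```python
from dataclasses import dataclass
from typing import List

# Editorial frame rates this column treats as "video speed" (already pulled down).
PULLED_DOWN_RATES = {23.98, 29.97}


@dataclass
class Project:
    """Hypothetical description of a shoot, for illustration only."""
    shoot_fps: float           # camera frame rate on set
    edit_fps: float            # picture and sound editorial frame rate
    releases_on_video: bool    # True for TV/video release, False for a film finish
    recorder_has_f_mode: bool  # recorder can stamp files in -F mode
    editorial_agrees: bool     # sound editorial, producers and post supervisor agree


def failed_rules(p: Project) -> List[int]:
    """Return the numbers of the rules of thumb that the project fails."""
    failed = []
    if not (p.shoot_fps == 24.0 and p.edit_fps in PULLED_DOWN_RATES):
        failed.append(1)  # Rule 1: true 24 fps shoot, pulled-down editorial
    if not p.releases_on_video:
        failed.append(2)  # Rule 2: must finish and release on video
    if not p.recorder_has_f_mode:
        failed.append(3)  # Rule 3: files must be stamped in -F mode
    if not p.editorial_agrees:
        failed.append(4)  # Rule 4: everyone in post must be on board
    return failed


def recommended_sample_rate(p: Project) -> int:
    """48048 only when every rule passes; otherwise fall back to plain 48000."""
    return 48048 if not failed_rules(p) else 48000


tv_show = Project(24.0, 29.97, True, True, True)
feature = Project(24.0, 29.97, False, True, False)
print(recommended_sample_rate(tv_show), failed_rules(tv_show))  # 48048 []
print(recommended_sample_rate(feature), failed_rules(feature))  # 48000 [2, 4]
```

Falling back to 48 kHz whenever any rule fails mirrors the advice here that 48 kHz is the safe default when the workflow is not fully confirmed.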
At this point, some of you may be wondering why all these rules apply, and possibly why anyone would want to go to the trouble of using a 48.048 kHz sound workflow in the first place. To answer these questions, we need to better understand what recording at 48.048 kHz really means.
Sound recorded at 48.048 kHz is pulled up, that is to say, it’s recorded at a sample rate that is 0.1% higher than standard 48 kHz sound. This is the same 0.1% difference that governs the relationship between 24 frame film or HD video and 29.97 fps NTSC standard definition video or 23.98 fps HD. In the NTSC and Sony HDCam world, picture is usually shot at 24 fps, but must be transferred to video at 23.98 fps. This 0.1% slowdown is known as video pull-down, and must govern everything in the chain, including sound. Sound recorded at 48 kHz is pulled down to a sample rate of 47.952 kHz, so that it plays slower and in sync with the video. For the analog heads among us, this manipulation of sample rates is how digital audio is vari-sped, and works just like slowing down your analog tape transport (or digital tape transport, for that matter).
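For anyone who wants to check the numbers, here is a small sketch of the arithmetic. It assumes the usual NTSC ratio of 1000/1001 for pull-down, which is what the rounded figures "0.1%", 23.98 and 29.97 stand for; the function names are just for illustration.

```python
from fractions import Fraction

# The "0.1%" slowdown, expressed exactly as the NTSC ratio 1000/1001.
PULL_DOWN = Fraction(1000, 1001)
# Pull-up as used in this column: a flat 0.1% increase (48 kHz -> 48.048 kHz).
PULL_UP = Fraction(1001, 1000)


def pull_down(rate):
    """Slow a frame rate or sample rate by the video pull-down ratio."""
    return Fraction(rate) * PULL_DOWN


def pull_up(rate):
    """Raise a sample rate by 0.1%."""
    return Fraction(rate) * PULL_UP


print(round(float(pull_down(24)), 3))     # 23.976  (the "23.98" HD rate)
print(round(float(pull_down(30)), 3))     # 29.97   (NTSC)
print(round(float(pull_down(48000)), 3))  # 47952.048  (the "47.952 kHz" above)
print(pull_up(48000))                     # 48048
print(pull_down(48048))                   # 48000
```

The last line is the whole point of the workflow: audio recorded at 48.048 kHz and then pulled down lands on exactly 48 kHz, with no rounding left over.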
When sound is recorded at 48.048 kHz, a 0.1% pull-down results in sound that plays in sync at a true 48 kHz, instead of 47.952 kHz. This fact alone explains both the first and second rules. Projects shot at video speed, that is to say, at frame rates that are already pulled down like 23.98 and 29.97 fps, do not get pulled down when transferred for dailies or work tapes. For 48.048 kHz sound to be useful, it must be pulled down to 48 kHz. Similarly, projects that finish on film will have their sound pulled back up 0.1% when the sound goes to the optical track. Once again, for 48.048 kHz sound to be useful, it must end up at 48 kHz. Television programs and video releases are transferred to video and remain in their pulled-down, 0.1% slower state.
The upshot of all this appears to be that recording sound at 48.048 kHz allows the post process to maintain a digital signal path at 48 kHz when using video pull-down. Some of you may be wondering why it is so important to end up at 48 kHz without sample rate conversion. After all, we use sample rate converters all the time, especially in Dailies film-to-video transfer. All transfer facilities do this when they transfer picture and sound in real-time to video. It’s the most common workflow there is in this business, and it makes no difference if you are transferring 48.048 kHz material or 48 kHz material. Actually, it does make some difference – 48.048 kHz material may actually be slightly more difficult to transfer in sync because the workflow and setup are different than the usual.
In fact, most film-to-video transfer people I’ve talked to prefer sound at a straight 48 kHz. It is simpler and more straightforward for them to deal with when preparing dailies. Why would anyone choose to do something which could make Dailies more difficult to produce? Why indeed? Because the quality and speed benefits down the road in sound editorial are often worth the trouble.
Editorial is where 48.048 kHz audio can really make a difference. The key here is that this is truly a file-based workflow. None of this sound transfer needs to be in real-time, and to do so would actually be detrimental, regardless of whether we are working with 48 kHz or 48.048 kHz material. The only way to ensure that the editor gets the full benefit of your file names and metadata is to ensure that the files are transferred to the workstation in their original state. This brings us to the third rule, that all files must be stamped using –F mode (pronounced dash eff mode).
The F in dash F stands for Fostex, and refers to the Fostex-designed method for recording 48.048 kHz material. If you are recording on a Fostex machine, then you are already there. Other machines, like the Zaxcom DEVA and Sound Devices 744T, include the option of recording your files in –F mode. Files recorded at 48.048 kHz with this method are stamped in the Broadcast WAV header as being recorded at 48 kHz. This is incorrect, and possibly confusing to others, but is necessary to make the files useful in Pro Tools. It is important that Pro Tools identify the files as being at a true 48 kHz, so that it does not sample rate convert them on import. Done correctly, importing these files into a 29.97 fps, 48 kHz session plays every recorded sample at 48 kHz, which pulls the audio down automatically.
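To see why a "wrong" header produces the pull-down, consider this rough sketch using Python's standard `wave` module. It is not how any recorder implements –F mode, and it writes a plain WAV rather than a full Broadcast WAV (no bext chunk); it only assumes, as described above, that samples captured at 48.048 kHz are stored in a file whose header says 48 kHz. The function names are made up for the example.

```python
import io
import wave

CAPTURE_RATE = 48048  # rate the audio was actually sampled at on set
STAMPED_RATE = 48000  # rate declared in the file header, as -F mode does


def write_f_mode_style_wav(seconds_on_set: int) -> bytes:
    """Write silent 16-bit mono audio captured at CAPTURE_RATE but stamped STAMPED_RATE."""
    n_samples = CAPTURE_RATE * seconds_on_set
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)              # 2 bytes = 16-bit samples
        w.setframerate(STAMPED_RATE)   # the header says 48 kHz
        w.writeframes(b"\x00\x00" * n_samples)
    return buf.getvalue()


def playback_seconds(wav_bytes: bytes) -> float:
    """Duration a player computes if it trusts the header sample rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as r:
        return r.getnframes() / r.getframerate()


data = write_f_mode_style_wav(10)   # 10 seconds of real time on set
print(playback_seconds(data))       # 10.01
```

Ten seconds on set become 10.01 seconds on playback, 0.1% slower, which matches the slowed-down picture without a single sample being recalculated.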
At this point, I think I should explain a few things about Broadcast WAV files and Digidesign Pro Tools. Generally speaking, the preferred sound delivery file format for post-production is Broadcast WAV Polyphonic, often shortened to BWF-P. These multi-track files are preferred over monophonic Broadcast WAV files because they are easier to handle (although some equipment may only deal with the mono variety, so be sure to check with the post house before delivering). While BWF-P files may be the best way to deliver to Pro Tools, the software cannot deal with them directly. Files must be converted to mono on import and it is this conversion process that makes –F mode so important. If these files are identified by Pro Tools as being 48.048 kHz, then they will be automatically sample rate converted to 48 kHz on the way in, negating the pull-down effect.
The good news is that Digidesign understands that this is a problem and has offered a solution in their newest software release, Pro Tools version 7.3 for both HD and LE systems. The audio file import dialog now includes a Sample Rate Conversion option, allowing editors to choose if and how sample rate conversion takes place. I should also note here that Digidesign has finally made Broadcast WAV metadata available to editors. This feature was actually introduced in Pro Tools 7.2 HD, but few editors have the chance to work on the latest Pro Tools TDM hardware, so they couldn’t take advantage of it. This brings me to an important point – the persistence of legacy hardware in this business. Older Digidesign hardware will not run the new upgrade, and so it will take some time for these new capabilities to move their way into and through the system. It is important to note that –F mode files will work just fine on the new systems as well as the older ones.
Rule number four states that sound editorial must agree that it’s a good idea to use 48.048 kHz sound. Many editors and facilities will edit and mix television at a true 48 kHz because this is how they will have to lay it back to video for delivery. This is a more traditional way of working which once again helps maintain the integrity of the original location sound by avoiding sample rate conversion as much as possible. Recording at 48.048 kHz is undesirable if sound is to be edited and mixed in pull-down.
Assuming that sound editorial is to take place at a true 48 kHz, perhaps some of you are still unsure of the benefits of the automatic pull-down that occurs when audio is recorded at 48.048 –F. If, however, you have ever had to load 300 track hours of material (that’s only 25 days of 4-track material on a fairly busy show) into Pro Tools and watch it sample rate convert every second of it, you’ll understand. You’ll understand even better if you’ve had to listen to that audio after the process is completed. I have worked with editors who prefer to cut around the 0.1% drift rather than bother losing the time and quality to sample rate conversion.
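To put a number on that 0.1% drift, here is a quick back-of-the-envelope sketch. It assumes the audio is left at its original speed while the picture is slowed by 1000/1001, and takes one NTSC frame as 1001/30000 of a second; the function names are illustrative only.

```python
from fractions import Fraction

SLOWDOWN = Fraction(1001, 1000)       # pulled-down picture runs this much longer
NTSC_FRAME = Fraction(1001, 30000)    # duration of one 29.97 fps frame, in seconds


def drift_seconds(shot_seconds):
    """Sync error when audio is NOT pulled down but the picture is."""
    picture = Fraction(shot_seconds) * SLOWDOWN
    audio = Fraction(shot_seconds)
    return picture - audio


def shot_seconds_until_drift(target_drift):
    """How much material plays before the sync error reaches target_drift."""
    return Fraction(target_drift) / (SLOWDOWN - 1)


print(float(drift_seconds(3600)))                             # 3.6
print(round(float(shot_seconds_until_drift(NTSC_FRAME)), 3))  # 33.367
```

In other words, un-pulled-down audio slips a full frame roughly every 33 seconds and 3.6 seconds over an hour, which is why an editor cutting around the drift can only do so on short clips.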
I have also worked with productions who specified very early on and without prompting that their workflow would include 48.048 kHz audio. These were generally shows with a very savvy post supervisor that were working on very tight production schedules with a lot of material. They had their sound and picture editorial teams all ready to go, and they were sure to test the workflow before they started shooting.
Of course, this kind of organized production is not always the case. Oftentimes projects must start shooting before producers and post supervisors have had a chance to choose or confirm a sound post house. In this case, it is almost always (remember, no rule is truly absolute in this business) correct to start shooting at 48 kHz. This is the most standard and compatible workflow, and everyone understands it.
Besides the four rules of thumb I’ve set out here, there are other things to consider when deciding whether 48.048 kHz recording is right for you and your project. The first is your equipment – is your recorder capable of delivering correctly formatted 48.048 kHz material? Will the files be timestamped correctly when the recording is pulled up on your machine? Older file-based recorders like the Nagra V will not work correctly, and software development on these units has long since stopped. It’s important to keep in mind that newer recorders may work fine under one software version, and incorrectly under another newer or older version. The most important element in any decision to record at 48.048 kHz is testing. You must do a test recording with picture to ensure that everything stays in sync, and that sound shows up at the correct timecode.
These are very cost-conscious times in this business, and film and television producers appreciate anything that saves them money. More than that, however, they appreciate something that works. For the right projects, 48.048 kHz recording can do both. The keys are preparation, education, and most importantly, communication. Maybe I shouldn’t keep my mouth shut after all.
One comment
thank you so much for this!!!!!