Surround 3D [Archive] - Doom9's Forum

tebasuna51

14th April 2025, 13:05

In this post, I'm going to discuss topics that may be useful for beginners and perhaps debatable for experts.
I'll say that these are simply my opinions, and they may or may not be taken into account.

I'm going to use the speaker names and positions suggested by MediaInfo on this page: https://mediaarea.net/AudioChannelLayout

Mathematically, 3 speakers (Left, Right, and BackCenter) would be enough to achieve a 2D surround sound experience.
But considering we have two ears, one on each side of our head and oriented toward hearing frontal sounds, it's better to use 4:
L, R, LeftSurround, and RightSurround (LS and RS, not to be confused with SideLeft and SideRight in M$ WAV).
Additionally, to emphasize the dialogue normally coming from the screen, we can use a front Center speaker with less high-frequency demands.
We'll leave the front L-R speakers, also used for stereo music, as the highest-quality speakers for high frequencies.
Finally, taking advantage of the fact that low frequencies are non-directional, we can add a single and expensive SubWoofer specialized in low frequencies.
Note that I haven't used the LFE name because the SW reproduces all the low frequencies present in any channel, not just the LFE.

With all of them, we have the standard 5.1 2D surround sound. See first image.
[Image: attachment 18934]
But 6.1 and 7.1 systems emerged. Why?

Do we need more rear sources for a 2D surround sound experience? Mathematically, no.
Do we need more rear volume (4 speakers) than front (3)? Only in exceptional cases; the volume of the LS-RS-LeftBack-RightBack channels is usually very low compared to the front ones.
Do we have ears on the back of our heads? We already know we don't.

I can only think of one valid answer: the desire to sell more speakers.
In my opinion, the 7.1 surround system is completely unnecessary.

But we can improve 2D Surround with 3D by installing speakers in the ceiling.
The second figure shows the possible alternatives.
[Image: attachment 18935]
I've added some speakers not included on the mediaarea page, which I'll discuss below:
Front Height speakers, which we'll discuss in the next post.
FDL-FDR and SDL-SDR speakers, which, due to their reflection in the ceiling, can replace TopFront or TopSurround speakers when they can't be installed in the ceiling.

I therefore recommend installing a 5.1.2 system, without or with ceiling reflection, as shown in the last figure.
[Image: attachment 18936]
In fact, I would recommend that the default configuration for 8-channel audio be L, R, C, LFE, LS, RS, TFL, TFR, disregarding the obsolete (and with differences in order) L, R, C, LFE, LS, RS, LB, RB.
Currently, I only know of one DD+ encoder (Audition 2017) that offers this configuration (TFL-TFR, referred to as Vhl-Vhr), although the ability to encode to EAC3 (https://forum.doom9.org/showthread.php?p=2017412#post2017412) with this configuration using ffmpeg is under development.
It would be desirable for all encoders to offer this configuration (Surround 3D, hereafter S3D) for 8 channels by default, including the most efficient ones like AAC and OPUS.

There are amplifiers (AVRs) on the market that are prepared for this and capable of decoding Dolby Digital Plus (DD+), Atmos, and other sources.

There are even tools to convert some formats into more compatible or space-saving ones.
Thus, it is possible to decode Atmos (THD and EAC3) with Dolby Reference Player (https://forum.doom9.org/showthread.php?t=183995) or Cavernize (https://forum.doom9.org/showthread.php?t=184364).

I will soon include some examples of how to perform the conversions, as well as how to convert a 7.1 audio system to 5.1.2.

tebasuna51

14th April 2025, 13:11

Conversion Examples

When converting audio for playback on an 8-speaker S3D system, you also need to consider the player and its connection.
The minimum required would be a player capable of sending audio over an optical/coaxial SPDIF connection.
TV players are usually capable of this, although the SPDIF connection limits audio to PCM stereo and DD+.
If you have a PC connected via HDMI, there's a lot of flexibility, as you can send certain audio unencoded or decode it on the PC to send it as multi-channel PCM.

At the moment, there are no PC players that decode Atmos audio (or the DTS:X equivalent...) on the fly, so this type of 3D audio must be sent via HDMI to the AVR for decoding.
However, if the (old) AVR or the player (TV) doesn't support Atmos, or we simply want to reduce the audio size, we should consider converting the audio.

The minimum conversion (TV player, SPDIF connection) is DD+ 5.1.2 (must be supported by the AVR).
However, it can also be converted to AAC/OPUS, which provide maximum compression (minimum size), with the player (PC) responsible for decoding to 5.1.2 multichannel PCM to send via HDMI to the AVR (which must support it).

For the conversions, I'll assume a minimum level of knowledge of running commands from the command line, using .bat or .cmd files, and that the encoders (ffmpeg, qaac, opusenc) are located in the system's %PATH%.
I'll attach .bat files into which I can drag and drop the sources to convert.

It'll take me a while to prepare all the .bat files, as well as the examples I plan to add. Please be patient.

DOWNMIX 7.1 -> 5.1

First, when we send 7.1 audio to a properly configured 5.1.2 AVR, it will convert it to the available speakers.
However, we may want to do our own conversion either because we trust our conversion method more, or if we want to re-encode it to save size, or simply because our AVR is only 5.1.
I recommend using a mix of the Side (100°) and Back (140°) channels to obtain the Surround (120°) channels, which will provide an identical 2D surround sound experience.
The method I use in BeHappy can be implemented with ffmpeg:

ffmpeg -hide_banner -drc_scale 0 -bitexact -i INPUT_71 -vn -filter_complex^
"asplit [f][s]; [f] pan=3.1|c0=c0|c1=c1|c2=c2|c3=c3 [r]; [s] pan=stereo|c0=0.5*c4+0.5*c6|c1=0.5*c5+0.5*c7, compand=attacks=0:decays=0:points=-90/-84|-8/-2|-6/-1|-0/-0.1, aformat=channel_layouts=stereo [d]; [r][d] amerge [a]"^
-map "[a]" OUTPUT_51
Where:
INPUT_71 can be any 7.1 audio file or even a container, in which case it converts the first audio track.
OUTPUT_51 can be any format supported by ffmpeg or even an external encoder. Examples:

-acodec eac3 -ab 640k output51.eac3
-acodec libopus -ab 512k output51.opus
-acodec aac -aq 1.0 output51.aac
-acodec libfdk_aac -vbr 4 output51.aac
-acodec pcm_s24le -f wav - | qaac -V 91 --ignorelength --adts --no-delay -o output51.aac -
-acodec pcm_s24le -f wav - | opusenc --ignorelength --bitrate 512 - output51.opus
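
As a rough illustration of what the pan part of the filter above does to each sample frame, here is a minimal Python sketch. It assumes the input follows ffmpeg's default 7.1 channel order (FL FR FC LFE BL BR SL SR) and it leaves out the compand stage entirely; the function name is made up for this example and is not taken from BeHappy or ffmpeg.

```python
# Sketch only: mirrors the pan coefficients of the ffmpeg filter above.
# Assumption: input order is FL FR FC LFE BL BR SL SR (ffmpeg's default 7.1).
# The compand (limiter) stage of the original filter is NOT reproduced here.

def downmix_71_to_51(frame):
    """Return one 5.1 frame (FL FR FC LFE Ls Rs) from one 7.1 frame."""
    if len(frame) != 8:
        raise ValueError("expected 8 samples per frame")
    fl, fr, fc, lfe, bl, br, sl, sr = frame
    ls = 0.5 * bl + 0.5 * sl  # same as c0=0.5*c4+0.5*c6
    rs = 0.5 * br + 0.5 * sr  # same as c1=0.5*c5+0.5*c7
    # Front channels and LFE pass through untouched (c0=c0|c1=c1|c2=c2|c3=c3).
    return [fl, fr, fc, lfe, ls, rs]


if __name__ == "__main__":
    print(downmix_71_to_51([1, 2, 3, 4, 0.5, 0.25, 1.5, 0.75]))
```

Each surround output is the plain average of the Back and Side sample on the same side, which is why the result sits between the 100° and 140° positions.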

Let's see a size comparison:

Track 2:16:57.084     Size in bytes  File              Min / Avg / Max Kb/s
--------------------  -------------  ----------------  --------------------
SOURCE thd Atmos      4.837.954.028  SOURCE.thd        4710
decoded 24 int 8 ch   9.466.080.896  OUTPUT_71.w64     9216 uncompressed
decoded 24 int 6 ch   7.099.560.822  OUTPUT_51.wav     6912 uncompressed
thd without Atmos     3.313.912.088  SOURCE-Atmos.thd  3226 lossless 71
8 chan 24 int flac    2.960.033.365  OUTPUT_71.flac    2882 lossless 71
6 chan 24 int flac    2.790.928.430  OUTPUT_51.flac    2717 lossless 51
8 chan ffmpeg eac3    1.051.787.264  OUTPUT_71.eac3    1024 core 640
6 chan ffmpeg eac3      657.367.040  OUTPUT_51.eac3    640 (SPDIF)
6 chan ffmpeg opus      521.902.457  OUTPUT_51.opus      ? / 508 / ?
6 chan ffmpeg aac q1    524.102.620  OUTPUT_51.aac     225 / 510 / 1525
6 chan ffmpeg fdk 4     423.961.695  OUTPUT_51_k.aac    12 / 412 / 1007
6 chan qaac -V 91       469.152.015  OUTPUT_51_q.aac    14 / 456 / 1099
6 chan opusenc 512      463.098.524  OUTPUT_51_o.opus  215 / 451 / 1002

- An EAC3 7.1 at 1024 kb/s with a 640 kb/s core provides the same quality and 2D surround sound as 5.1.
Therefore, EAC3 5.1 will be the preferred method when we are limited to an SPDIF connection with the AVR.

- When we have an HDMI connection and want to reduce the size, we have the lossless FLAC 5.1 option.
Or the lossy option with VBR compression (using the bitrate only when needed) with much higher quality/size than EAC3.
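
The Kb/s column in the comparison can be cross-checked from the file sizes and the track duration. The sketch below is only arithmetic; the helper names are invented for this example, and it assumes 1 Kb/s = 1000 bit/s and that header overhead is negligible.

```python
# Sketch: reproduce the average bitrates of the size comparison.
# Assumptions: 1 Kb/s = 1000 bit/s, container/header overhead ignored.

def duration_seconds(hms):
    """Convert 'h:mm:ss.sss' to seconds."""
    hours, minutes, seconds = hms.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def average_kbps(size_bytes, seconds):
    """Average bitrate of a file of size_bytes lasting the given seconds."""
    return size_bytes * 8 / seconds / 1000


def pcm_kbps(sample_rate, bits, channels):
    """Bitrate of uncompressed PCM."""
    return sample_rate * bits * channels / 1000


if __name__ == "__main__":
    track = duration_seconds("2:16:57.084")
    print(round(average_kbps(4837954028, track)))  # SOURCE.thd
    print(round(average_kbps(657367040, track)))   # OUTPUT_51.eac3
    print(pcm_kbps(48000, 24, 8))                  # 8 ch 24 int PCM
    print(pcm_kbps(48000, 24, 6))                  # 6 ch 24 int PCM
```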

DECODING ATMOS WITH DOLBY REFERENCE PLAYER (https://forum.doom9.org/showthread.php?t=183995) (DRP)

If our AVR doesn't support Atmos, or we want to reduce the audio track size, we can decode it to our specific speaker configuration.
Although there are many possible configurations, the same reasoning applies as for 2D surround sound: a 5.1.2 configuration is more than sufficient for a 3D experience, so we'll stick to that.

Leaving details about DRP to the existing forum thread, we'll explain how to use it.
Once installed, I recommend creating a folder dedicated to decoding, for example C:\Temp\x\, and place the Atmos file to be converted there.

The decoder needs a complex syntax, so if our file to be converted is C:\Temp\x\Z.thd, the following .bat would produce a Z_tmp.wav:

set TEM=C:\\Temp\\x\\
set DECODER=C:\Program Files\Dolby\Dolby Reference Player
"%DECODER%\gst-launch-1.0.exe" --gst-plugin-path "%DECODER%/gst-plugins" filesrc location=%TEM%Z.thd ! dlbtruehdparse align-major-sync=false ! dlbaudiodecbin truehddec-presentation=16 out-ch-config=5.1.2 ! audio/x-raw,format=S32LE ! wavenc ! filesink location=%TEM%Z_tmp.wav

The obtained wav has 8 channels (5.1.2) with 32int samples (S32LE) with a header:

File: C:\Temp\x\Z_tmp.wav Size: 12621441104 bytes
---------------------------------------------- Header Info
ChunkID .....: RIFF
RiffLength ..: 4031506504 (ERROR: Must be Size - 8 = 12621441096)
Container ...: WAVE
SubchunkID ..: fmt (Length: 40)
AudioFormat .: 65534 (WAVE_FORMAT_EXTENSIBLE)
NumChannels .: 8
SampleRate ..: 48000
ByteRate ....: 1536000
BlockAlign ..: 32
BitsPerSample: 32
ValidBitsPS .: 32
MaskChannels : 0 (ERROR: Invalid ChannelMask)
SubType .....: 1 (Integer)
SubchunkID ..: fact (Length: 4)
SampleLength : 394420032 (fact Duration: 8217.084 sec.)
SubchunkID ..: data (Length: 4031506432)
Offset data .: 80 (WARNING: Assumed DataLength = 12621441024)
Duration ....: 8217.084 sec., (2h. 16m. 57.084s.)
------------------------------------------------- End Info

Problems:
1) Since it's > 4GB, the RiffLength is incorrect; it should have a W64 or RF64 header.
2) MaskChannels 0, without reporting the channels present.
3) We don't know if DRC is applied, so the volume may be lower than the original source.
4) It uses 32-bit samples, which is unnecessary precision and not always supported by other software.
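
Problem 1 can be checked numerically: the wrong values in the header are exactly what remains when the real lengths are stored in 32-bit fields. A small sketch, assuming the writer simply keeps the low 32 bits (the numbers from the header dump above agree with that assumption); the function name is invented for this example.

```python
# Sketch: why a plain RIFF header breaks above 4 GB.
# Assumption: the writer keeps only the low 32 bits of each length field.

UINT32_WRAP = 2 ** 32  # RIFF length fields are unsigned 32-bit


def riff_lengths(file_size, data_offset):
    """Return the true and the 32-bit-truncated RIFF and data lengths."""
    true_riff = file_size - 8          # everything after 'RIFF' + length field
    true_data = file_size - data_offset
    return {
        "riff_true": true_riff,
        "riff_stored": true_riff % UINT32_WRAP,
        "data_true": true_data,
        "data_stored": true_data % UINT32_WRAP,
    }


if __name__ == "__main__":
    for name, value in riff_lengths(12621441104, 80).items():
        print(name, value)
```

With the Z_tmp.wav values (size 12621441104, data offset 80) the stored lengths come out as 4031506504 and 4031506432, the same figures shown in the header dump.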

We can find the volume (only 35 seconds on my PC with the 12GB wav) with:

ffmpeg -hide_banner -ignore_length true -i Z_tmp.wav -af volumedetect -vn -f null - 2>&1 | FINDSTR "max_volume:"

If you detect, for example:

[Parsed_volumedetect_0 @ 00000186a3bbbf80] max_volume: -6.3 dB

We can leave the file correct, and with normalized peak, with:

ffmpeg -hide_banner -ignore_length true -i Z_tmp.wav -af aformat=channel_layouts=5.1.2,volume=6.2dB -c:a pcm_s24le Z_tmp.w64

And it produces the correct 9GB w64 file we mentioned earlier.
This file can be loaded into Audition 2017 to encode a DD+ 5.1.2 file, or we can use the new Hellgauss process to obtain an EAC3 L,R,C,LFE,Ls,Rs,Tfl,Tfr file.
We'll look at other possible encodings later.
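
The 6.2 dB used in the normalization command above is just the distance from the detected peak to the target peak. A minimal sketch, assuming a target of -0.1 dB (the value used later in this post) and the volumedetect line format shown above; the helper names are made up for this example.

```python
# Sketch: derive the volume= gain from a volumedetect "max_volume" line.
# Assumption: target peak is -0.1 dB, as used elsewhere in this post.
import re


def parse_max_volume(log_text):
    """Extract the max_volume value (in dB) from ffmpeg volumedetect output."""
    match = re.search(r"max_volume:\s*(-?\d+(?:\.\d+)?)\s*dB", log_text)
    if match is None:
        raise ValueError("no max_volume line found")
    return float(match.group(1))


def normalizing_gain(max_volume_db, target_db=-0.1):
    """Gain in dB that moves the detected peak to target_db."""
    return round(target_db - max_volume_db, 1)


def db_to_linear(gain_db):
    """Amplitude factor equivalent to a gain in dB."""
    return 10 ** (gain_db / 20)


if __name__ == "__main__":
    line = "[Parsed_volumedetect_0 @ 00000186a3bbbf80] max_volume: -6.3 dB"
    gain = normalizing_gain(parse_max_volume(line))
    print(f"volume={gain}dB")
```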

Notes on DRP:
- If we use the DRP GUI (C:\Program Files\Dolby\Dolby Reference Player\drpv.exe), we'll see that with the 5.1.2 layout, it uses the Ltm and Rtm channels, not Ltf and Rtf, which would be equivalent to Tfl and Tfr.
Unfortunately, there are no channels equivalent to Ltm,Rtm in a WAV (Tsl,Tsr in MediaInfo), so we'll make an approximation.
[Image: attachment 18938]
- Sometimes the decoder fails or produces empty channels. We'll use the DRP thread to discuss these issues, not here.
However, we can find out the volume of the Top channels to determine whether we really have 3D Surround or just 2D with:

ffmpeg -hide_banner -i Z_tmp.w64 -af "pan=stereo|FL=c6|FR=c7, volumedetect" -f null -

When the maximum volume is reported as -91 dB, the channels are empty; above that, it is up to the user to decide what minimum volume is worthwhile.

DECODING DD+ ATMOS WITH CAVERNIZE (https://forum.doom9.org/showthread.php?t=184364)

Cavernize now also supports TrueHD (https://cavern.sbence.hu/cavern/downloading.php?get=cavernize_gui), and its GUI requires minimal configuration:
[Image: attachment 18939]
- Render target: 5.1.2 front, which produces an L, R, C, LFE, Ls, Rs, Tfl, Tfr output.
There are more, but we're only interested in that one. With 5.1.2 side, you get L, R, C, LFE, Ls, Rs, Tbl, Tbr, which also doesn't match a recommended speaker setup.
- Locate ffmpeg: not necessary; we'll process the decoded audio ourselves.
- After loading (Open or Drag & Drop) an eac3 Atmos, select:
Output: PCM, float [EDIT] Better: 24 int PCM, previously selected in the Rendering menu
- By clicking 'Render now', we can select our working folder (C:\Temp\x) and the RIFF WAVE type (*.wav).

When finished, we get a wav (Coda movie):
File: C:\Temp\x\Coda.wav Size: 10298572904 bytes
---------------------------------------------- Header Info
ChunkID .....: RF64
RiffLength ..: 4294967295
Container ...: WAVE
SubchunkID ..: ds64 (Length: 28)
RiffSize ....: 10298572896
DataSize ....: 10298572800
SampleCount .: 321830400
SubchunkID ..: fmt (Length: 40)
AudioFormat .: 65534 (WAVE_FORMAT_EXTENSIBLE)
NumChannels .: 8
SampleRate ..: 48000
ByteRate ....: 1536000
BlockAlign ..: 32
BitsPerSample: 32
ValidBitsPS .: 32
MaskChannels : 22031 (FL FR FC LF SL SR TFL TFR)
SubType .....: 3 (Float)
SubchunkID ..: data (Length: 4294967295)
Offset data .: 104
Duration ....: 6704.8 sec., (1h. 51m. 44.8s.)
------------------------------------------------- End Info

It's a WAV file with an RF64 header and all the correct values. It can be loaded directly into Audition 2017 and handled by many audio software.
However, we can convert it to w64 24 int (precision equivalent to 32 float) and normalize it to a smaller file size.
[EDIT] I recommend the menu option Rendering -> Force 24 int PCM rendering, to avoid any conversion, because the decode already seems normalized.
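
The header values of Coda.wav are consistent with each other, and the same arithmetic shows how much smaller the 24 int version is. This is only a sketch of the usual PCM relations; the function name is invented for this example.

```python
# Sketch: relations between the PCM header fields shown above.

def pcm_fields(data_size, channels, bits, sample_rate):
    """Derive BlockAlign, ByteRate, sample count and duration for PCM data."""
    block_align = channels * bits // 8       # bytes per sample frame
    byte_rate = block_align * sample_rate    # bytes per second
    frames = data_size // block_align        # SampleCount in the ds64 chunk
    return {
        "block_align": block_align,
        "byte_rate": byte_rate,
        "frames": frames,
        "seconds": frames / sample_rate,
    }


if __name__ == "__main__":
    info = pcm_fields(10298572800, channels=8, bits=32, sample_rate=48000)
    print(info)
    # Data size of the same audio stored as 24 int (3 bytes x 8 channels):
    print(info["frames"] * 8 * 24 // 8)
```

For this file the 24 int data would take 7723929600 bytes instead of 10298572800, i.e. three quarters of the 32-bit size.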

Cavernize doesn't usually apply DRC; this particular file had a maximum volume of -1.6 dB, while decoded with DRP, it was -5.1 dB.
When both decodes were normalized to -0.1 dB, the Top channels (Tsl, Tsr) yielded -24 dB.

A movie doesn't necessarily have meaningful 3D audio (this one is an example), and then it doesn't make much sense to encode it as Atmos.
In this case, I would recommend a simple 5.1.

3D AUDIO RECODING

When deciding how to recode it, we must take into account several factors:

- If we're going to transmit it via SPDIF (optical or coaxial), we're limited to using EAC3.
- Even if the connection is HDMI, our player and AVR may be limited to certain formats.
Thus, a TV player or similar device may not support OPUS or FLAC, or an AVR may not support DD+ 5.1.2, only 7.1.
It's also not guaranteed that even if the best player (a PC) can send PCM via HDMI with the correct MaskChannel, the AVR will be able to understand it.

We'll start by looking at which encoders support the 5.1.2 layout, meaning that encoding a WAV with L,R,C,LFE,Ls,Rs,Tfl,Tfr layout will result in the same layout when decoded.

FLAC
It's the ideal format; any MaskChannel is accepted, saved, and restored without problems.
The pity is that it has to be decoded to PCM, which is obviously only recommended for TrueHD sources and to reduce their size somewhat.

EAC3
We've already seen good results with both Audition 2017 and the Hellgauss method.
It remains to be seen whether our AVR is fully compatible with DD+ and recognizes the channels.

OPUS
Unfortunately, neither ffmpeg nor opusenc supports this channel mask by default.
To encode it with ffmpeg, you must add the parameter -mapping_family 255.
With opusenc, you must add --channels discrete, which prevents the use of coupled channels (this will reduce efficiency).
In both cases the channel information is not stored, and the decoded output has no channel mask.

AAC
It's incomprehensible that, for 8 channels, this codec defaults to:
kAudioChannelLayoutTag_AAC_7_1 (https://developer.apple.com/documentation/coreaudiotypes/1572101-audio_channel_layout_tags#1593828)-> C,Lc,Rc,L,R,Ls,Rs,LFE (Who uses Lc,Rc speakers?)
So an AAC with an ADTS header and channel count of 8 will be decoded using that Channel Mask.

Although it has other options:
kAudioChannelLayoutTag_AAC_7_1_B -> C,L,R,Ls,Rs,Lb,Rb,LFE (useless 2D 7.1 surround)
Interestingly, an AAC with an ADTS header and channel count of 0 will be decoded as 7.1.

And also:
kAudioChannelLayoutTag_AAC_7_1_C -> C,L,R,Ls,Rs,Tfl,Tfr,LFE (3D surround 5.1.2)
Unfortunately, no encoder I've tested supports that channel mask, with one exception:
ffmpeg with libfdk_aac and .m4a output stores the correct channel mask and decodes as such.
MediaInfo reports the Channel layout as: ChannelLayout14 (another name for AAC_7_1_C)
[EDIT] The fdkaac v1.0.7 (https://forum.doom9.org/showthread.php?p=2030859#post2030859) encoder now also supports that ChannelLayout (https://github.com/nu774/fdkaac/issues/63)
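
To make the difference in channel order concrete, this sketch computes the permutation between the WAV order used in this thread (L,R,C,LFE,Ls,Rs,Tfl,Tfr) and the AAC_7_1_C order listed above. It only illustrates the reordering an encoder has to do internally; it is not code from any encoder, and the names are invented for this example.

```python
# Sketch: reorder one sample frame from WAV order to AAC_7_1_C order.
# Both orders are taken from the text above; nothing here is encoder code.

WAV_ORDER = ["L", "R", "C", "LFE", "Ls", "Rs", "Tfl", "Tfr"]
AAC_7_1_C_ORDER = ["C", "L", "R", "Ls", "Rs", "Tfl", "Tfr", "LFE"]


def permutation(source_order, target_order):
    """For each target position, the index of that channel in the source."""
    return [source_order.index(name) for name in target_order]


def reorder(frame, perm):
    """Apply a permutation to one frame of samples."""
    return [frame[i] for i in perm]


if __name__ == "__main__":
    perm = permutation(WAV_ORDER, AAC_7_1_C_ORDER)
    print(perm)
    print(reorder(WAV_ORDER, perm))
```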

These problems would be solved if all encoders, decoders, AVRs, etc. used S3D 5.1.2 by default for 8 channels, and 7.1 or other layouts were considered exceptional cases.
In the meantime, other solutions will have to be found.

DESIRABLE SITUATION

All encoders, even if they use a different order internally, must support a WAV/RF64/W64 input by default with 8 channels: L, R, C, LFE, Ls, Rs, Tfl, Tfr.
ChannelMask 20543 (FL FR FC LFE BL BR TFL TFR) and 22031 (FL FR FC LFE SL SR TFL TFR) will be accepted as equivalents.
Obviously, for 6 channels, the same applies to the first 6 (without TFL, TFR) 2D surround.
Any other situation will be exceptional.

All decoders will output a WAV/RF64/W64 22031 (FL FR FC LFE SL SR TFL TFR) for 8 channels by default unless otherwise specified.
SL, SR (Side 100º) are assumed to be equivalent to Ls, Rs (Surround 120º); other layouts, such as 1599 (FL FR FC LFE BL BR SL SR), will be output only when specifically requested.

All 8-channel AVRs must assume, unless otherwise configured, that the PCM channels they receive are L, R, C, LFE, Ls, Rs, Tfl, Tfr and not L, R, C, LFE, Lb, Rb, Ls, Rs.
When receiving 6 channels, and unless otherwise configured, they should not use the excess speakers (currently, they usually send the surround to the Side and Back speakers).
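
The ChannelMask numbers quoted above (20543, 22031, 1599) are sums of the speaker bits defined for WAVE_FORMAT_EXTENSIBLE. A short sketch to build or decode them; the short speaker names follow the abbreviations used in this thread, and the function names are invented for this example.

```python
# Sketch: build and decode WAVE_FORMAT_EXTENSIBLE channel masks.
# Bit values are the standard dwChannelMask speaker positions.

SPEAKER_BITS = {
    "FL": 0x1, "FR": 0x2, "FC": 0x4, "LFE": 0x8,
    "BL": 0x10, "BR": 0x20, "FLC": 0x40, "FRC": 0x80,
    "BC": 0x100, "SL": 0x200, "SR": 0x400, "TC": 0x800,
    "TFL": 0x1000, "TFC": 0x2000, "TFR": 0x4000,
    "TBL": 0x8000, "TBC": 0x10000, "TBR": 0x20000,
}


def channel_mask(names):
    """Combine speaker names into a channel mask."""
    mask = 0
    for name in names:
        mask |= SPEAKER_BITS[name]
    return mask


def mask_names(mask):
    """List the speakers present in a mask, in bit order."""
    return [name for name, bit in SPEAKER_BITS.items() if mask & bit]


if __name__ == "__main__":
    print(channel_mask(["FL", "FR", "FC", "LFE", "SL", "SR", "TFL", "TFR"]))
    print(channel_mask(["FL", "FR", "FC", "LFE", "BL", "BR", "TFL", "TFR"]))
    print(mask_names(1599))
```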

CURRENT SITUATION

We can only encode S3D 5.1.2 to EAC3 with Audition 2017 or the Hellgauss method, but we need to verify if our AVR supports it.
We can also encode it to m4a with ffmpeg/libfdk. Even if our player decodes correctly, we must also verify that our AVR correctly redirects the channels to the appropriate speakers.
If our AVR doesn't support it, we can use the alternative explained in the following post.

tebasuna51

14th April 2025, 13:11

Converting a 7.1 system to 5.1.2.

If we have a 7.1 audio system connected via HDMI to our player capable of sending multi-channel PCM audio (PC or other capable), we can convert it to a 5.1.2.

Physically, you'll need to place the obsolete back speakers (LB-RB) in the FHL-FHR position (in the second figure) and trick your system into sending the real TFL-TFR audio through the LB-RB channels.

My 5.1.2 AVR doesn't recognize EAC3 5.1.2 or the PCM TFL TFR channels sent via HDMI with any configuration. I can only configure the speakers as LB-RB and also use this method.
I use the mpc-hc player's ability to, whenever I have 8 audio channels, redirect channels 5-6 to SL-SR and 7-8 to BL-BR (they actually have TFL-TFR), as shown in the figure:
[Image: attachment 18948]

When there are only 6 channels, 5-6 still go to SL-SR, and nothing is sent to BL-BR because channels 7-8 don't exist.
This way, whether mpc-hc decodes a correct EAC3/m4a/flac 5.1.2 or an AAC/Opus 7.1 with inverted 5.1.2 channels, the result will be the desired one: hearing the top channels through the fake back channels.
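
The redirection described above can be pictured as a simple routing table. The sketch below is only an illustration of that idea, not mpc-hc's actual implementation; the table and function names are invented, and the speaker outputs are assumed to be the eight of a 7.1 AVR (FL FR FC LFE BL BR SL SR).

```python
# Sketch: channel-to-speaker redirection as described above.
# Assumption: the AVR exposes FL FR FC LFE BL BR SL SR outputs, with the
# physical BL-BR speakers moved to the front height position.

OUTPUT_SPEAKERS = ["FL", "FR", "FC", "LFE", "BL", "BR", "SL", "SR"]

# Input channel n (by position) -> speaker output, per input channel count.
ROUTING = {
    8: ["FL", "FR", "FC", "LFE", "SL", "SR", "BL", "BR"],  # 7-8 carry Tfl-Tfr
    6: ["FL", "FR", "FC", "LFE", "SL", "SR"],
}


def route(frame):
    """Send each input sample to its speaker; unused speakers stay silent."""
    targets = ROUTING[len(frame)]
    out = dict.fromkeys(OUTPUT_SPEAKERS, 0.0)
    for sample, speaker in zip(frame, targets):
        out[speaker] = sample
    return out


if __name__ == "__main__":
    print(route([1, 2, 3, 4, 5, 6, 7, 8]))  # 5.1.2 source
    print(route([1, 2, 3, 4, 5, 6]))        # 5.1 source
```

No samples are mixed here: every input channel goes to exactly one output, which is the difference between redirecting and downmixing.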

Of course, a real 7.1 can't be played like this; any real 7.1 must be converted to 5.1, saving space and with identical 2D surround sound (use the included DragNDrop71-51.bat).
The other included .bat (AtmosTo512.bat) attempts to automate the process, although it's not simple. I'll try to explain the process:

- If you have Dolby Reference Player installed, we need to unzip the attached file to the C:\Temp\x folder (you could choose another one by modifying the bat file) using a special syntax for DRP.
We'll also place the Atmos files to be converted in that folder, which we'll drag over AtmosTo512.bat.
When the conversion is complete, you'll get a temporary wav file.

- If only Cavernize is available, the resulting wav file (see previous figure, it doesn't matter where it is) will be dragged over AtmosTo512.bat.

- You need at least ffmpeg.exe (recommended with libfdk) in the folder or %PATH% of the system, or define it in the .bat file.

- The peak volume is analyzed from both sources; even with large files, it shouldn't take more than a minute.
You can normalize it and convert it to RF64 24 int (recommended format).

- A window (LeeWav.exe) will open with information about the wav file and the option to modify the Mask Channel.
If you want to encode it to EAC3/m4a/Flac, you can use the real 22031 (FL FR FC LFE SL SR TFL TFR).
However, if you want to encode it to AAC/Opus, the fake 1599 (FL FR FC LFE BL BR SL SR) is better.
Press EXIT.

- The following menu will appear:
Select option over source: C:\Temp\x\out_8w3D.wav
==========================
1 = Encode to EAC3 with Hellgauss method
2 = Encode to opus with libopus -ab 512k
3 = Encode to m4a with libfdk -vbr 4
4 = Encode to opus with opusenc --bitrate 512
5 = Encode to aac with qaac -V 91

6 = Test volume in Surround Channels (4,5)
7 = Test volume in Top Channels (6,7)
8 = Set Opus bitrate
9 = Set fdk quality
10 = Set Qaac quality
11 = Info and Set MaskChannels

[0-11] (leave blank and [Enter] to exit):
It is recommended to check the volume of the Top Channels because, due to either conversion failure or negligible volume, 3D surround sound may be useless.
To encode with Hellgauss, you need ffmpeg_eac371.exe and md71.exe in the folder or in the system %PATH% or define it in the .bat file.
To encode with opusenc, you need opusenc.exe in the folder or in the system %PATH% or define it in the .bat file.
To encode with qaac, you need qaac.exe (and the necessary DLLs) in the folder or in the system %PATH% or define it in the .bat file.

- A channel test file, out_8w3D.m4a, is included; for testing, it can be decoded by dragging it over AtmosTo512.bat.

SquallMX

18th April 2025, 14:29

Great information.
Waiting for the full guide!

j7n

19th April 2025, 01:58

What are the advantages and shortcomings of placing speakers in surround vs back? I believe that the earlier Quad layout was supposed to be further back than the Surround (side) positions. If you want speech in the center to have good clarity without overly boosting it, it must have full ~16k bandwidth in high frequencies. Surely the sibilants of speech became more distinct moving from MW to FM radio.

tebasuna51

17th May 2025, 11:19

What are the advantages and shortcomings of placing speakers in surround vs back? I believe that the earlier Quad layout was supposed to be further back than the Surround (side) positions.
The first Quad layout placed the speakers at 135º (back), but the first 5.1 surround systems (AC3 and DTS) placed the Surround speakers at 120º: not Side (90-100º) and not Back or rear (135º), but in the middle.

If you want speech in the center to have good clarity without overly boosting it, it must have full ~16k bandwidth in high frequencies. Surely the sibilants of speech became more distinct moving from MW to FM radio.
Of course it's better with more bandwidth, but while HiFi FL-FR speakers must reach 20 kHz, Center speakers normally don't need all of it (my old ears don't need more than 10 kHz).

@To all
It's taken me a while, but I've finally finished what I promised.

I hope it helps someone.

FranceBB

17th May 2025, 16:03

It's taken me a while, but I've finally finished what I promised.

I hope it helps someone.

It definitely does. It's been an interesting reading. :)
Also interesting the bit about encoding with the right layout in AAC but luckily fdk_aac got the job done. I'm personally still with my old 5.1 setup and as such I generally play everything back from my PC and carry it via HDMI as PCM 5.1 48000Hz to my TV which then redirects it to the 5.1. That is unless I'm watching our own linear channels in which case it would be the good old AC3 5.1 directly going to the 5.1. :)

At work it's a different thing, but even there I don't actually do anything with Atmos or any other "fancy" audio channel layouts. Unlike the normal 5.1 which can be carried in a DolbyE stream and it uses the same bitrate as a 2ch stereo PCM 24bit 48000Hz stream, 5.1.4 has to be carried as a DolbyED stream and it uses the same bitrate as two stereos PCM 24bit 48000Hz, so it uses 4ch out of the 8 maximum available by the old SDI cables (and 16 maximum available by the "new" SDI cables).

The problem is that when you get a Dolby Atmos file you get a DAMF, basically a folder with the beds, the actual audio and the metadata, but unfortunately the Dolby Reference Encoder (or anything else for that matter) can't re-encode it to DolbyED2. There are hardware encoders for things like live events but not software encoders. Unbelievable. Anyway, the Dolby Reference Encoder can, however, re-encode it to E-AC3, which is why it's generally included by streaming companies in the final renditions, like you get the sidecar .ec3 file that then gets muxed in the final mp4 rendition before it goes through packaging and encryption. Problem is that for linear channels you need DolbyED2 muxed in an mxf container alongside the video, which is not feasible most of the time as there are really no encoders out there. This is why linear channels are almost always just 5.1 (the good old classic) or even just stereo.
Unfortunately SDI has its own limitations and so does the mxf container, add to that the complexity of multi language and the cost to upgrade and you can easily see how easy it is to run out of channels.

Sunspark

20th May 2025, 17:03

Quick question, if one sets input channels to 2 and places a check only in Front Left-1 and Front Right-2, is the player down-mixing from 5.1 or 7.1 or whatever configuration to 2.0, or is it dropping everything except FL and FR?

The output seems fine, but I can't easily tell if it's down-mixing or dropping channels.

tebasuna51

21st May 2025, 14:16

You can use a channel test (like the one included in x.7z) to hear which channels are played (or when they are downmixed).

In Audio Switcher -> "Speaker configuration for:", there is one configuration for each number of input channels.
If, for 8 channels (as in the image), you set only 1-FL and 2-FR and leave the rest empty, you will hear only FL-FR and the rest are dropped.
Check what happens if you also check 5->FL, 6->FR ...
It is not a downmix; it redirects channels to speakers.

For a downmix you don't need the Audio Switcher, use Audio Decoder -> Mixing tab

Busty

23rd May 2025, 12:51

Definitely interesting thread.

Regarding your statement that anything above 5.1 is overkill: Imagine a big cinema screening room with minimum reflections from walls and ceiling. The additional discrete channels between front and rear might help listeners to imagine a defined location of a sound. Of course you can always have a phantom location located between front and back, but then the perceived location of a sound will shift with your location in the room. Just like the phantom center shifts when only using a stereo setup. AFAIK, that's the main reason for a center speaker; so it does not matter if you sit in the middle or on the side of the room to locate sounds coming from the middle of the screen.

That's why I would think that 7.1 does have its use cases, at least in bigger rooms. For most home cinema setups I agree that it's probably unnecessary. Depending on listening location it might even be distracting when you're sitting too near to side speakers.

Does this make some sense?

tebasuna51

23rd May 2025, 22:56

Obviously, I'm referring to a home theater system.
The perfect listening spot is in the center; naturally, if you're on the side, the effect is lost.
I don't think having four surround speakers is useful, but that's my opinion.

tebasuna51

18th August 2025, 13:12

There is a new Cavernize version (https://cavern.sbence.hu/cavern/downloads.php) with support for TrueHD.

After testing it, I'll modify my second post.

tebasuna51

4th May 2026, 07:15

The fdkaac v1.0.7 (https://forum.doom9.org/showthread.php?p=2030859#post2030859) encoder now also supports the 5.1.2 ChannelLayout (https://github.com/nu774/fdkaac/issues/63)

Warning: use only in m4a container, ADTS header can't support it.

SeeMoreDigital

4th May 2026, 08:19

tebasuna51

4th May 2026, 08:50

But how can such 5.1.2 AAC audio streams be played back correctly?

It's a problem with the audio equipment.

My Denon (configurable as 5.1.2 or 7.1) doesn't decode AAC, but it also doesn't understand EAC3 5.1.2, so in both cases I'm decoding from MPC-HC, as I explained in my previous post (https://forum.doom9.org/showthread.php?p=2017560#post2017560).

SeeMoreDigital

4th May 2026, 09:41

It's a problem with the audio equipment.

My Denon (configurable as 5.1.2 or 7.1) doesn't decode AAC, but it also doesn't understand EAC3 5.1.2, so in both cases I'm decoding from MPC-HC, as I explained in my previous post (https://forum.doom9.org/showthread.php?p=2017560#post2017560).

Yes, it's a great shame AVR manufacturers never really embraced onboard AAC bit-streaming and multi-channel decoding...

raquete

16th May 2026, 16:01

Great thread, bookmarked!

DOWNMIX 7.1 -> 5.1 is fantastic cos i don't have Dolby Atmos.

:thanks:

microchip8

16th May 2026, 17:38

Great thread, bookmarked!

DOWNMIX 7.1 -> 5.1 is fantastic cos i don't have Dolby Atmos.

:thanks:

You can always upmix using Dolby Surround and DTS Neural:X on the AVR. Not as good as the real thing, but I like it. - I have upfiring speakers to simulate Top ones.

Balling

2nd June 2026, 13:13

Yes, it's a great shame AVR manufacturers never really embraced onboard AAC bit-streaming and multi-channel decoding...

Not really a shame, it is a standard to use base layer 7.1 or 5.1 and steganographically encode objects at arbitrary positions in it using Atmos (EAC3 Joint object coding). AAC PCE is a very niche thing and even ffmpeg still does not support decoding it; here is the PCE enablement for 5.1.2: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/23254