downmix matrix values [Archive]


nuked

9th July 2003, 00:59

I couldn't help responding to this thread even if it's a little old:

-------------------------------quoting
http://forum.doom9.org/showthread.php?s=&threadid=40346

Midas:
Here are the downmix equations :

Lt=0.3225*fl+0.2280*c-0.2633*sl-0.1862*sr;
Rt=0.3225*fr+0.2280*c+0.1862*sl+0.2633*sr;

Does anyone have a more theoretical/mathematical basis for those numbers?
Or were the numbers deduced by trial and error?
I mean I use 32-bit float arithmetic in azid and to make DPL II work
properly I need more accurate numbers. There are only roughly 13 bits of
information in those numbers/constants above -- I need at least 20 bits
or 6 significant digits.

DSPguru:
imho, the values should represent 10db, 11.5db, 13db, 14.5db.
-----------------end quoting

Those dB levels are about right, but if they really are the theoretical
values then the mixing numbers are already only right to barely 2 digits; do the math.
It really shouldn't matter though. The amplitude (square root of power)
detected by your ear depends on 1/r, where r is the distance to the
speaker. If you know the listener's distance to 6 digits (about 3
micrometers tolerance for a speaker a few meters away) I'm quite impressed.
I'm not saying 32 bit sound is useless, but it's for relative detail and
waveform, not for mixing channels... there it doesn't matter. I think all
these dB levels are approximate anyway. The 3 dB level that's used
everywhere, for instance, happens to be almost exactly a factor of 2
(1.9953) in power (or square root of 2 in amplitude),
which makes tons of sense when mixing across 2 speakers to maintain
constant total power. Again, though, I think 3 dB is used because it's close
enough. I'm a physicist, not a programmer or a sound engineer. I'm not
terribly experienced in this stuff, but this is how I see it. I've made
my own hardware surround up-mixer once too, and I promise there was no
exactly right place to set the mixing gain. +/- 5% would probably make
less difference than leaning over to grab your beer. In the end this
stuff is as much art as theory.
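
For anyone who wants to "do the math", here is a minimal Python sketch. It assumes the intended levels really are DSPguru's -10, -11.5, -13 and -14.5 dB (nothing in this thread confirms that), and the pairing of each level with a coefficient is a guess made for illustration. The channel labels are made up too.

```python
# Quoted downmix coefficients (magnitudes) and the dB levels DSPguru suggested.
# The pairing of each level with a coefficient is an assumption for illustration.
quoted = {"front": 0.3225, "center": 0.2280,
          "near surround": 0.2633, "far surround": 0.1862}
levels_db = {"front": -10.0, "center": -13.0,
             "near surround": -11.5, "far surround": -14.5}

def db_to_amplitude(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

relative_errors = {}
for name, coeff in quoted.items():
    ideal = db_to_amplitude(levels_db[name])
    relative_errors[name] = abs(coeff - ideal) / ideal
    print(f"{name:14s} quoted={coeff:.4f} from dB={ideal:.4f} "
          f"error={100 * relative_errors[name]:.1f}%")

# 3 dB as a power ratio: very nearly, but not exactly, a factor of 2.
power_ratio_3db = 10.0 ** (3.0 / 10.0)
print(f"3 dB power ratio = {power_ratio_3db:.4f}")
```

Under those assumptions the mismatches come out at roughly 1 to 2 percent, which is what "right to barely 2 digits" means here.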

nuked

nuked

9th July 2003, 01:09

By the way, there is some theory. If you square all the coefficients on one row and add them up you get roughly .25. Why? Because you're mixing from 4 speakers to one.

nuked

nuked

9th July 2003, 01:29

OK, now I'm talking to myself but that's ok... anyway, I might be wrong about that last statement... it does seem to work out pretty close though, hmmmm. What's even closer is the total amplitude being normalized to 1, but I guess everyone knows that already. Total power addition actually depends on phase coherence and stuff anyway, right? So maybe it's not so simple. Oh well, I'm obviously too hungry to think now.
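
The two sums mentioned in the last couple of posts are easy to check. A small Python sketch, using the Lt row exactly as quoted at the top of the thread (the variable names are only for illustration):

```python
# Coefficients of the Lt row as quoted at the top of the thread.
lt_row = [0.3225, 0.2280, -0.2633, -0.1862]

# "Power" sum: how incoherent (uncorrelated) signals would add.
sum_of_squares = sum(c * c for c in lt_row)

# Amplitude sum: the worst case when all inputs line up at full scale.
sum_of_magnitudes = sum(abs(c) for c in lt_row)

print(f"sum of squares    = {sum_of_squares:.4f}")
print(f"sum of magnitudes = {sum_of_magnitudes:.4f}")
```

The squares add up to about .26 rather than exactly .25, while the magnitudes add up to 1 within rounding. With the magnitudes normalized to 1, the mix cannot exceed full scale even if every input is at full scale at once.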

nuked

frank

9th July 2003, 16:27

Read my thread
BeSweet v1.4b12 & Dolby Surround II matrix (http://forum.doom9.org/showthread.php?s=&threadid=27936)

nuked

9th July 2003, 19:01

Very nice, thanks for the link.

Still, it's not as scientific, or at least as obvious, as it seems. Adding C to L + R for instance, Dolby assumes that power adds as the sum of the squares. This is true for incoherent sound, but for coherent sound it adds as the square of the waveform sum, which depends on the phase relation and is... VERY different. 1 + .707 = 1.5 (incoherent) or 2.9 (in phase) or .09 (out of phase). Of course they would have added more or less that way originally when they hit your ear anyway (depending on the frequency and the positional phase shift where you were standing), so the error may not be as great as it seems, but still I think one COULD argue (I'm not) that straight amplitude addition is better and a factor of .5 would be just as "correct"; at the least it's not so obvious.

But of course downmixing is not perfect. If it were, we wouldn't need 5.1 sound storage. I imagine Dolby has figured out what sounds the best in stereo and compromised with what can be decoded the best. All in keeping with my claim that close enough is close enough. Do these numbers actually come from Dolby? Do they release their mixing matrices?

nukeD

nuked

9th July 2003, 19:08

My bad, for anyone keeping track (not likely)...
those sums on the left are amplitude, but the results on the right of the equals are power.
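
Spelled out as a small Python sketch: amplitudes 1 and .707 go in, power comes out. The `mixed_power` helper is just an illustration for two sines of the same frequency, not something taken from Dolby or the AC-3 documents.

```python
import math

a, b = 1.0, 0.707  # amplitudes of the two signals being mixed

incoherent_power = a**2 + b**2      # uncorrelated signals: powers add
in_phase_power = (a + b) ** 2       # identical signals in phase: amplitudes add
out_of_phase_power = (a - b) ** 2   # identical signals 180 degrees apart

def mixed_power(a, b, phase):
    """Time-averaged power of two equal-frequency sines with a phase offset,
    relative to a unit-amplitude sine (the common factor of 1/2 is dropped)."""
    return a**2 + b**2 + 2 * a * b * math.cos(phase)

print(f"incoherent   = {incoherent_power:.2f}")
print(f"in phase     = {in_phase_power:.2f}")
print(f"out of phase = {out_of_phase_power:.2f}")
print(f"90 degrees   = {mixed_power(a, b, math.pi / 2):.2f}")
```

The 90 degree case lands on the incoherent value, so the sum-of-squares rule amounts to assuming the cross term averages to zero.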

nukeD

frank

11th July 2003, 12:06

You have much to learn.
Read Digital Audio Compression (AC-3) Standard, Rev. A (http://www.atsc.org/standards.html)
The downmix equations are discussed there.
And study Dolby Pro Logic first before going on to Pro Logic II.

Much better: make your own sound tests with BeSweet.

nuked

11th July 2003, 17:57

OK, I'm learning a few things, but nothing that's turning my view
of things on its head, and most of them confirm my thoughts.

I've learned (with #4 being the most relevant):
1) There's a standard so sound programmers know what to expect,
and that seems good.

2) There are words available to the programmers to tweak the downmix
a little for the LoRo downmix (but not LtRt as far as I can tell,
though something indicated, I think, that a newer standard again adds
some new level controls for Ls vs Rs mixing?)

** This indicates there are indeed no perfect values for stereo
playback, or it wouldn't need to be left up to the sound programmer.

3) Since LtRt does not seem to allow these tweaks, a surround-encoded
downmix will probably not be as optimal when played back in stereo
as an LoRo downmix, but this isn't surprising. Even an LtRt downmix
marked as such can never be converted back exactly to the LoRo downmix,
even if standard values are assumed. Thus so-called stereo
"compatibility" is not exact.

4) A tolerance of .25 dB is QUOTED on the mixing levels, indicating
that 32 bit mixing precision is indeed not required, as I argued above.

5) clev and slev values of .5 are indeed the default for stereo
downmixing, so again the argument about power alone doesn't
lead rigorously, or at least not simply,
to values of .707; it has something to do with
compromising so that a better surround decode can be reproduced.
For the surround one could argue that encoding the signal
out of phase justifies the boost to .707. But for clev it seems
harder to make that argument.

All seems to be more about compromises and standardization than exact
theory.
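
To put a number on point 4, here is a rough Python sketch. It takes the .25 dB tolerance at face value from my reading above (not re-checked against the standard) and asks how finely a mixing coefficient has to be specified to stay inside it.

```python
import math

tolerance_db = 0.25  # tolerance on mixing levels, as read from the standard above

# Relative amplitude change corresponding to +0.25 dB.
relative_error = 10.0 ** (tolerance_db / 20.0) - 1.0

# Bits / decimal digits needed to resolve a coefficient to that tolerance.
bits_needed = math.log2(1.0 / relative_error)
digits_needed = math.log10(1.0 / relative_error)

print(f"relative error = {100 * relative_error:.2f}%")
print(f"bits needed    = {bits_needed:.1f}")
print(f"digits needed  = {digits_needed:.1f}")
```

That works out to about 3 percent, or roughly 5 bits, which is far below both the 13 bits in the quoted constants and the 20 bits asked for in the original thread.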

I didn't look much at pro-logic II stuff but I don't think it's a
terribly radical extension of the ideas...

nukeD

nuked

11th July 2003, 18:09

cheated (off subject soap-box)

By the way... this means all of us with really nice, powerful yet clean
stereo speakers, as opposed to 5 6 7 9 or whatever little Bose, are
getting cheated a bit by everything encoded in Dolby. My stereo
speakers can play back Pink Floyd recordings and make stuff sound like
it's reverberating off the Grand Canyon behind me. And stereo is
more realistic for stage music anyway than these artificial
instrumental 5.1 separations that are produced. We only have 2 ears
(I know there's spatial stability to worry about too, unless you're in
your lazy boy).

nukeD

bleo

14th July 2003, 17:18

I'm not too sure what the objective of this thread is so my post may be a little off topic, but I am quite interested in the Dolby Pro Logic II downmix matrix.

In my tests with the Azid surround2 downmix, I have found the left-right rear channel separation to be quite disappointing. Alright, so my test setup is not optimal, being the Cyberlink Audio Effect filter for DPL2 decoding and Dolby Headphone or Dolby Virtual Speaker emulation. However, it does provide a decent rendition of the original AC3 5.1 test samples such as 'Avia', 'THX Tex EX' and the DPL2 encoded 'Dolby Fire'.

I started experimenting with my own downmix matrix by increasing the rear downmix separation to 6 dB, i.e.:

Lt = L + 0.707 C + 0.707 LFE - 0.894 SL - 0.447 SR
Rt = R + 0.707 C + 0.707 LFE + 0.447 SL + 0.894 SR

The coefficients may be normalised to prevent arithmetic overload, however, I haven't heard any clipping yet.
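
As a rough illustration of that normalisation, here is a Python sketch for the Lt row above. Dividing by the sum of magnitudes is only one possible choice (the most conservative one), and the variable names are made up for the example.

```python
import math

# Lt row of the experimental matrix above, in the order L, C, LFE, SL, SR.
lt_row = [1.0, 0.707, 0.707, -0.894, -0.447]

# Dividing by the sum of magnitudes guarantees |Lt| <= 1 for full-scale inputs.
scale = sum(abs(c) for c in lt_row)
lt_normalized = [c / scale for c in lt_row]

# Rear separation: level difference between SL and SR within the same row.
rear_separation_db = 20.0 * math.log10(abs(lt_row[3]) / abs(lt_row[4]))

# Level given up by this worst-case normalisation, in dB.
attenuation_db = 20.0 * math.log10(scale)

print("normalised row:", [round(c, 4) for c in lt_normalized])
print(f"rear separation = {rear_separation_db:.2f} dB")
print(f"attenuation     = {attenuation_db:.2f} dB")
```

The worst case needs every channel at full scale with the right signs at the same instant, which real material rarely does. That may be why the unnormalised matrix has not audibly clipped so far.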

This matrix yields much better rear channel separation, though I suspect it decreases front-rear separation. Thus I was wondering, is there a 'sweet spot' for the rear channel downmix for optimal left-rear right-rear and front-rear separation?

nuked

15th July 2003, 02:26

Hmmm... of course you're looking for a better opinion than mine, but... under the assumption of identical signals in the front speakers, i.e. basically mono, I could make some useful statements about the front-to-back separation (which could in principle then approach perfect with the right upmix technique). Without that assumption, though, it's mathematically IMPOSSIBLE (I do mean impossible) to say in general, although maybe something can be said in practice. The amount that the front speakers leak into the back and vice versa depends not only on the downmix, but on the original asymmetry of the signal in the front speakers as well. Then you add DPLII non-linear steering into the mess and who knows what exactly will happen. Actually, much of the point of adding the steering was to artificially clean up leakage between speakers. As for left-right separation, I can't take a stab at any kind of equation, not with all the steering involved, but as I mentioned DPLII is not my strong point.
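
A toy Python sketch of the front-to-back point. It assumes a plain passive decode where the surround feed is simply Lt - Rt, which is a big simplification and NOT what a DPLII decoder does; `passive_surround` is a hypothetical helper, and the downmix uses the coefficients quoted at the top of the thread.

```python
def downmix(fl, fr, c, sl, sr):
    """Lt/Rt downmix with the coefficients quoted at the top of the thread."""
    lt = 0.3225 * fl + 0.2280 * c - 0.2633 * sl - 0.1862 * sr
    rt = 0.3225 * fr + 0.2280 * c + 0.1862 * sl + 0.2633 * sr
    return lt, rt

def passive_surround(lt, rt):
    """Hypothetical passive decode: the surround feed is the difference signal."""
    return lt - rt

# Front-only content, no surround signal at all.
mono_front = passive_surround(*downmix(fl=1.0, fr=1.0, c=1.0, sl=0.0, sr=0.0))
hard_left = passive_surround(*downmix(fl=1.0, fr=0.0, c=0.0, sl=0.0, sr=0.0))

print(f"leak into surround, identical fronts: {mono_front:.4f}")
print(f"leak into surround, hard-left front:  {hard_left:.4f}")
```

With identical fronts nothing leaks into the difference signal, while a hard-left source leaks at its full downmix level. So in this toy model the leak is set by the front asymmetry of the material, not by the matrix alone.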

Oh yeah... I did have a useful reason for replying... :P ... sounds like what you want is an easier way to experiment. Play with AC3Filter. You can change the downmix on the fly during playback. You can even use MatrixMixer at the same time and play with different combinations of downmixes and upmixes! Of course MatrixMixer only supports linear matrix upmixing... no DPLII steering, but anyway... AC3Filter.

nukeD

PS: my only real point for this admittedly poor excuse of a thread was to point out that 32 bit precision isn't needed, and I stand FIRMLY by that, but of course it won't hurt... it's just a kind of pet peeve; us research scientists are known for making fun of that sort of thing... like when someone quotes their gas mileage with 10 digits of precision... WOW! :)

bleo

15th July 2003, 05:01

haha, ironically I've been feeding 32 bit audio from Ac3Filter into Trombettworks Channel Downmixer just so I could use downmix coefficients with 32 significant figures... :p

In case anyone was wondering, the exact Azid surround downmix coefficients are SQRT(2/3) and SQRT(1/3) divided by the sum of the coefficients (3.101).
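
A quick Python check of that recipe. The front gain of 1 and the centre gain of sqrt(1/2) are assumptions on my part (the usual -3 dB centre); they are chosen because they make the sum come out at 3.101. The channel labels are only for illustration.

```python
import math

# Unnormalised gains: front 1 and centre sqrt(1/2) are assumptions,
# the surround gains sqrt(2/3) and sqrt(1/3) are as stated above.
raw = {
    "front": 1.0,
    "center": math.sqrt(1.0 / 2.0),
    "near surround": math.sqrt(2.0 / 3.0),
    "far surround": math.sqrt(1.0 / 3.0),
}

total = sum(raw.values())
exact = {name: gain / total for name, gain in raw.items()}

print(f"sum of raw gains = {total:.4f}")
for name, gain in exact.items():
    print(f"{name:14s} {gain:.7f}")
```

Rounded to four digits these reproduce .3225, .2280, .2633 and .1862. The ratio of the two surround gains is sqrt(2), i.e. 3 dB of rear separation.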

Anyway, I guess the point of my post was that I was wondering how those surround downmix coefficients were derived. In fact, I am trying to make a DPL2 downmix that sounds as close as possible to the original AC3 5.1.

nuked

15th July 2003, 15:08

I seriously doubt you can easily "derive" these values beyond the simple power conservation ideas that you seem to know... it depends too much on the upmixer, which is not such a simple thing. Read the links Frank posted here though... they are pretty good... in that thread they even mention some real testing of matrices to reproduce 5.1. I wonder if the story is different for music and movies. I imagine it would be a little, since spatial separation and channel correlations are very different.

bleo

16th July 2003, 06:01

I read frank's thread and also some threads at HydrogenAudio a while ago, and that is why I started experimenting with my own downmixes.

I agree with all of the principles there except for the bit that says:

We set the amount of acoustic power of Rs in Lt equal to the half of Ls. Or, Rs has a level of -3 dB referred to Ls. Same to Ls in Rt.

It appears that Valex and ux-3 also question what seems to be an arbitrary setting. However, I also agree with frank's response:

If you enhance the difference of Ls, Rs you'll get lower compatibility to DS, if you lower the difference you'll get less separation of the rear channels.

I have done some experiments and found the separation of the rear channels to be inadequate. Since I do not need compatibility to DS, I enhanced the difference of Ls and Rs in my downmix. Of course, my 6 dB separation is arbitrary too but I guess what we are trying to achieve is a downmix that, when upmixed by a certified DPL2 decoder with all its fancy nonlinear steering etc, sounds as close as possible to the original AC3 5.1. The current Azid surround2 downmix does not achieve this.

I suppose the easy way to find out the correct downmix is to take the official Dolby DP563 or SurCode encoder, feed it one channel at a time and measure the outputs... But since I don't have either of these, I guess I'll just have to keep experimenting... :p

PS sorry nuked for apparently hijacking your thread, but you seem quite knowledgeable and I thought some general DPL2 discussion would not hurt...

nuked

16th July 2003, 19:11

Well, don't take my word for anything. Seems like you know as much as me, and it's not hard to hijack a plane with no pilot. As for single channel testing, unfortunately I think it's not the whole picture, especially for people interested in surround for music (which I'm not a huge fan of anyway; I'd usually rather it sound like the band is really in the front of my living room than like I'm really in the middle of a stage in a concert hall, but that's me). I think DPLII should be much better at localizing single point sources than a mix of stuff. Still, for movie sound that's probably good enough. I've read, and I imagine it's true, that if you have music in front and a sound effect in, say, left-rear, the steering will average the two and kind of steer to the middle. The "steering" tries to determine the primary source of a sound, enhance it there and silence it elsewhere to reduce the cross-channel leakage fundamentally inherent in DPL, but I don't think there's any frequency analysis to separate out different sounds or anything quite that fancy, so sources can get combined like that. You should probably find info on that for yourself, though, as it's just something I ran across on a link from Google in passing. I agree that many of the numbers are fairly arbitrary. I do also think, though, that it's good for standard software to use standards. To make things more fun (and more off forum), there are other brands of decoders that claim they do things better than Dolby... I think SRS has one... but then they kind of have at least some ties with Dolby, I think.

As you see from my other thread, I'm more interested in a better solution now that we have the technology to do it. This whole thing of using analog trickery to compress digital information is mostly silly, but maybe this is just not the right forum for that type of talk.

bleo

18th July 2003, 06:00

argh! :eek: It appears that the CyberLink DPL2 decoder I was using was operating in DPL 1 mode... Hence I hereby retract all my comments about lack of rear channel separation in the Azid surround2 downmix... :(