Another flexible and extensible way of multi channel audio encoding using Avisynth. - Page 9

Dark-Cracker · 6th July 2006, 14:49

perhaps this thread could help you.

http://forum.doom9.org/showthread.php?t=101259

++

Rockaria · 6th July 2006, 18:25

Quote:

perhaps this thread could help you.

No, never! And I don't really get the purpose of your post...

The DPL II encoding methods are fully examined here.
http://forum.doom9.org/showthread.php?t=112122
http://forum.doom9.org/showthread.php?t=111603

All the invert-only methods (including some of my models) are just partial, incomplete implementations.
Full DPL II encoding requires 90deg phase shifts across the full frequency range (all-pass) in addition to the inverts.
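To make the difference between the invert-only model and the phase-shifted model concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names are made up, the matrix coefficients are commonly quoted values and are not taken from this thread or verified against Dolby's specification, the sign convention of the 90deg shift is arbitrary, and the LFE channel is left out. The phase shifter is a naive DFT-based Hilbert transform that is only practical for short test signals.

```python
import cmath
import math

# Assumed matrix coefficients (commonly quoted values, not from this thread).
CENTER = math.sqrt(0.5)
SURR_NEAR = math.sqrt(2.0 / 3.0)   # surround folded into its own side
SURR_FAR = math.sqrt(1.0 / 3.0)    # surround folded into the opposite side


def phase_shift_90(x):
    """Shift every frequency component of x by 90 degrees (cos -> sin)
    with a naive O(n^2) DFT. Only meant for short test signals."""
    n = len(x)
    spec = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]
    for k in range(n):
        if k == 0 or 2 * k == n:
            spec[k] = 0.0        # DC and Nyquist cannot be shifted by 90 degrees
        elif 2 * k < n:
            spec[k] *= -1j       # positive frequencies
        else:
            spec[k] *= 1j        # negative frequencies
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def invert_only(x):
    """Invert-only model: no phase shift, only the matrix signs invert."""
    return list(x)


def dpl2_encode(left, right, center, ls, rs, shift=phase_shift_90):
    """Fold L, R, C, Ls, Rs into a Lt/Rt pair (LFE omitted in this sketch)."""
    ls_s, rs_s = shift(ls), shift(rs)
    lt = [l + CENTER * c - (SURR_NEAR * a + SURR_FAR * b)
          for l, c, a, b in zip(left, center, ls_s, rs_s)]
    rt = [r + CENTER * c + (SURR_FAR * a + SURR_NEAR * b)
          for r, c, a, b in zip(right, center, ls_s, rs_s)]
    return lt, rt


# A 4-cycle cosine in the left surround only, everything else silent.
n = 64
tone = [math.cos(2 * math.pi * 4 * t / n) for t in range(n)]
silence = [0.0] * n
lt, rt = dpl2_encode(silence, silence, silence, tone, silence)
print(lt[4], rt[4])
```

Passing `shift=invert_only` gives the invert-only variant, so both models can be compared on the same input.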

If you are interested and cannot read all the related threads, I can include the links to explain why those approaches are misleading.

BTW, Avisynth is based on streaming I/O. So to find the peak volume needed for an Amplify() gain, as other tools do, it must scan the entire stream first. That is what Normalize(1.0) does : max gain with no clipping.

Also, you will have to check whether Normalize() works separately on each channel (using each channel's own peak volume). If it does, use Amplify(x) on all channels instead, with the ReplayGain value x taken from another tool such as foobar2k.
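Why the per-channel question matters can be shown with a few numbers. The Python sketch below is only an illustration with made-up helper names; it is not how Avisynth implements Normalize() or Amplify(). It compares one gain derived from the overall peak against a separate gain per channel.

```python
def peak(samples):
    """Largest absolute sample value in one channel."""
    return max(abs(s) for s in samples)


def linked_gain(channels, target=1.0):
    """One gain for all channels, taken from the loudest peak overall."""
    overall = max(peak(ch) for ch in channels)
    return target / overall if overall > 0 else 1.0


def amplify(channels, gain):
    """Apply the same linear gain to every channel."""
    return [[s * gain for s in ch] for ch in channels]


def normalize_per_channel(channels, target=1.0):
    """Each channel gets its own gain, which changes the channel balance."""
    out = []
    for ch in channels:
        p = peak(ch)
        g = target / p if p > 0 else 1.0
        out.append([s * g for s in ch])
    return out


front = [0.5, -0.25]
rear = [0.125, 0.0625]

gain = linked_gain([front, rear])
linked = amplify([front, rear], gain)
separate = normalize_per_channel([front, rear])

print(linked)     # rear stays 12 dB below front
print(separate)   # rear is pushed up to the same peak as front
```

With the linked gain, the rear channel keeps its original level relative to the front. With per-channel gains, both channels end up at the same peak and the mix balance is lost.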

NorthPole · 22nd September 2006, 16:07

Quote:

Originally Posted by Rockaria

.
Full DPL II encoding requires 90deg phase shifts across the full frequency range (all-pass) in addition to the inverts.

I've read the above referenced posts. If I understand this correctly, you are using ffdshow to decode 5.1. I've never used ffdshow, can you use it to upmix to 5.1 from stereo?

I still have not found an Avisynth filter capable of the 90 deg phase shift to use in an Avisynth/BePipe script.

Rockaria · 23rd September 2006, 02:09

Unfortunately, me neither yet..

There are some VST (PhaseBug, etc.) and Winamp (Stereo Tool, etc.) plugins that perform the 90 deg phase shift on certain channel(s). They could possibly be used through avs::grf:: (DC-DSP or ffdshow::winamp) on the rear channels, to be aligned and mixed with the fronts before the DPL II mix.
But aligning the delay of the shifted channels would still be a problem unless we have more accurate control over the phase shift algorithms.

I have found that Csound has a C++ hilbert() and FMOD has a built-in DPL (II) mix function, either of which could reasonably be integrated into an Avisynth plugin. But unfortunately it's a far-future plan for me (if I have to do it at all) because I am no longer into development (my fingers have lost their touch).
It's just strange that such a relatively simple plugin has not been integrated into Avisynth yet. Maybe that's partly because it can only improve the separation quality by 10~30% over the invert-only model and can never be better than the current discrete multichannel encodings (DD, DTS, AAC...).
However, it is an intended multichannel encoding (made from multichannel sources), which differentiates it from artificial multichannel effects (any kind of upmix from 2ch sources), and it can still be applied to higher channel counts as DPL IIx.

The ffdshow DSPs (including upmix, downmix, HRTF, resample... but no decoding) can be used within an avs script via ffavisynth.dll. But as Avisynth development has become complicated (with different branches), there seem to be some problems with the dynamic linking, probably because of the different interface versions used in ffavisynth.dll (I remember Avisynth v2.55 worked fine with some old versions of ffavisynth.dll), so it cannot be used as described in the Avisynth manual.

Currently, the only way to use ffdshow as a decoder and DSP container together is through a GraphEdit file (*.grf) loaded in an avs script with directshowsource("a.grf"). To reuse the GRF, the simplest way is to copy the sources and rename them to the predefined file names.
But I believe that, since Avisynth has a very open architecture and so many types of plugins can be attached, it's a matter of time rather than of technology.

Meanwhile, you can evaluate true DPL II with the many existing professional wav editors that support VST (or phase-shift DSPs), to see whether it really satisfies your taste, as tested in the mentioned link.

NorthPole · 23rd September 2006, 14:34

@Rockaria

Thanks for the info, I'll post if I run across anything. Until then, I think I may try the foobar approach with the ATsurround dsp plugin and the aften encoder.

Rockaria · 23rd September 2006, 20:36

Yes, as I mentioned in the related thread(http://forum.doom9.org/showpost.php?...3&postcount=6),
the foobar2k+ATSurround+any-encoder solution seems to be the most economic DPL II encoding solution ATM.

They say in their forum that it does the 90deg phase shift. But when I inspected the image, it looked slightly altered but very close to the original.
So I suspect it might be doing a frequency-selective (10k?) phase shift, at least producing un-biased playback in general.

NorthPole · 23rd September 2006, 22:35

Quote:

Originally Posted by Rockaria

the foobar2k+ATSurround+any-encoder solution seems to be the most economic DPL II encoding solution ATM.

Just curious about what the channel mixer is doing for your mix? (Note about LFE volume control?)
And I think you use the hard limiter or the Winamp DSP to do DRC?
I'm just using the ATsurround plugin on RG'd files.

Sorry, this is a bit off topic.

Rockaria · 23rd September 2006, 23:27

No problem, it's related and also mentioned in this thread before.

ATSurround with no channel gain control gave me too much LFE. I used the channel mixer to control the channel gains of the 6ch source before feeding it to the ATSurround DPL II encoding.

DRC compresses the dynamic range: the volume range is usually reduced around the typical average sound level. The more it is applied, the more distortion is created.

The hard limiter deals with the upper area of the waveform to soften the clipping (over 0dB); it is simpler and gives better fidelity than just allowing the clipping or using any DRC or DN (Dynamic Normalization) solution. Applying RG in decoding may cause clipping, and I believe any RG-enabled player is designed to prevent this clipping when decoding (check the new foobar RG application options).
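The "gain first, then tame only the peaks" idea can be sketched in Python as below. This is a stand-in under stated assumptions: it is a memoryless soft limiter (tanh knee) rather than the algorithm of foobar's hard limiter or of any RG-enabled player, which typically work with a time-varying gain, and the function names and the 0.8 threshold are made up for illustration.

```python
import math


def db_to_gain(db):
    """Convert a gain in dB (e.g. a ReplayGain value) to a linear factor."""
    return 10.0 ** (db / 20.0)


def soft_limit(sample, threshold=0.8):
    """Leave samples below the threshold untouched; squeeze anything above it
    smoothly into the remaining headroom so the output never exceeds 1.0."""
    mag = abs(sample)
    if mag <= threshold:
        return sample
    knee = 1.0 - threshold
    limited = threshold + knee * math.tanh((mag - threshold) / knee)
    return math.copysign(limited, sample)


def apply_gain_with_limiter(samples, gain_db, threshold=0.8):
    """Apply a fixed gain, then limit only the peaks that would clip."""
    g = db_to_gain(gain_db)
    return [soft_limit(s * g, threshold) for s in samples]


boosted = apply_gain_with_limiter([0.1, 0.9, -1.0], 6.0)
print(boosted)
```

Unlike DRC, the quiet and medium parts of the signal pass through with the plain gain; only the samples that would go over full scale are reshaped.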

The same reasoning goes for AC3 coding. Even if we can encode the AC3 with proper dialnorm and DRC, encoding by frequency levels without clipping, the decoder might still have to hard-limit decoded (frequency-summed) areas that go over peak (through the dialnorm application), especially if the user chose not to use DRC (for more fidelity, because of a quiet listening environment). In this case (with or without considering DRC), pre-attenuation or hard limiting is, to me, necessary for safer transcoding.

[edit] some additions to ac3 dialnorm
The decoder effectively brings the dialnorm level to -31dB. So when given -1dB, it attenuates by 30dB when decoding, and when given -31dB, it won't adjust the volume level. So theoretically, the decoded waveform should not exceed the original volume level (more conservative than RG).
But in practice it is known that the decoded waveform can exceed 0dB because of the psychoacoustic processing, and for an originally clipped source it has also been observed that the decoder rebuilds the clipped area, depending on the decoder logic.
So if we suppose those effects are less than 3dB, the dialnorm range -1dB ~ -28dB will be safe if there are no other decoder-specific constraints.
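The arithmetic above fits in a few lines of Python. This is only a sketch of the reasoning in this post, with hypothetical function names; the 3dB overshoot figure is the supposition made above, not a measured value.

```python
MAX_DIALNORM_DB = -1
MIN_DIALNORM_DB = -31


def dialnorm_attenuation_db(dialnorm_db):
    """Attenuation the decoder applies so that dialog lands at -31dB."""
    if not MIN_DIALNORM_DB <= dialnorm_db <= MAX_DIALNORM_DB:
        raise ValueError("dialnorm must be between -31 and -1 dB")
    return dialnorm_db - MIN_DIALNORM_DB


def has_headroom(dialnorm_db, overshoot_db=3):
    """True if the attenuation covers the assumed decoder overshoot."""
    return dialnorm_attenuation_db(dialnorm_db) >= overshoot_db


for value in (-1, -27, -28, -29, -31):
    print(value, dialnorm_attenuation_db(value), has_headroom(value))
```

With the 3dB assumption, -28dB is the lowest dialnorm value that still leaves enough attenuation, which is where the -1dB ~ -28dB range comes from.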

However, in my case with DDLive, a dialnorm of -31dB delivers a transparent (equal), steady volume level on my two receivers regardless of clipping.

[edit2]some remarks on AC3 DialNorm & DRC for transcoding

Quote:

Originally Posted by Rockaria

Quote:

My problem is my output often has low volume and I was wondering why this is.
..
Im using ProCoder 2.0 which has built in filters for audio.

The filter Normalize has two options:
1. Normalize to mean RMS of sources
2. Normalize peak to specified DB level
..
use Dynamic Range Compression then(and after) normalize.
..
I don't like wide dynamic ranges (and the neighbourhood too)

The first issue, the often-low volume, is covered by Dolby's normalize-to-mean-RMS (aka. dialnorm), which is designed to keep the average perceived dialog level consistent between sources.

Secondly, the DRC in Dolby is applied in two steps :
. encoding time : the encoder scans the audio and averages the dynamics using certain predefined patterns; this looks a bit closer to the mastering concept
. decoding (play/transcoding) time : the decoder usually has an application level (0~1; the higher, the less fidelity) for the scanned DRC, to personalize playback to the listening environment (listener + neighborhood + devices). This (personalized) flexibility is lost by transcoding.

So 'preferred DRC decoding level + normalize(-3dB) | RG(scan)+limiter' are the expected steps for normal ac3 transcoding.
If needed, 'RG(scan)+limiter' can be used to further boost the average perceived sound level, so the volume knob need not be touched when switching between sources.

A reminder : Dolby's DialNorm mean RMS sound level is expected to be altered by the 'normalize(-3dB)' here.
