Go Back   Doom9's Forum > Video Encoding > New and alternative video codecs

Old 6th March 2011, 14:52   #13141  |  Link
SamuriHL
Registered User
 
Join Date: May 2004
Posts: 5,351
Quote:
Originally Posted by madshi View Post

What did happen to albain, btw?
That's a damn good question.
__________________
HTPC: Windows 11, AMD 5900X, RTX 3080, Pioneer Elite VSX-LX303, LG G2 77" OLED
Old 6th March 2011, 15:15   #13142  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by JEEB View Post
Please do, the ffmpeg process was streamlined and I'd bet they'd be more realistic about their current progress on many accounts.
Ok, will put that on my to do list.

Quote:
Originally Posted by JEEB View Post
Also, ever since I saw this audiophile herp derp on these threads I've been thinking, don't the specifications for audio decoders specify what is right and what is wrong to do with an encoded audio stream? Or am I one of those happy fellows who has gotten used to standards like H.264 that standardize the decoder to be bit-exact? All this "You should have X instead of Y to have better output" stuff just doesn't make muchos sense, coming from such a background.
Video codecs work *very* differently compared to lossy audio codecs. Video codecs use motion estimation to try to minimize the difference between video frames and then store the motion vectors together with changed pixels (well, very much simplified). Lossy audio compression is worlds away from that in technical implementation. There's no such thing as "motion estimation" for audio compression.

Audio decoders are standardized in a way, too. However, what you need to be aware of is that standards only tell us how to decode video and audio. They don't tell us how to do post processing. E.g. does the h264 standard tell you how to upsample chroma from 4:2:0 to 4:4:4? No, it doesn't, that's outside of the decoder, it's a post process. Does the h264 standard tell you how to downconvert 8bit per component video to 7bit per component video? Nope, it doesn't, because that's got nothing to do with encoding/decoding. The same thing applies to audio: How you post process the audio has got nothing to do with decoding. So it's not specified in the codec specs. You can downconvert 24bit audio to 4bit, if you like. You shouldn't do that, but you can. So do you expect the decoder specs to contain information on how to downconvert 24bit audio to 4bit?
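To make the decoder versus post-processing split concrete, here is a minimal sketch in C. It is not taken from any decoder or from the H.264 spec. It assumes an 8-bit 4:2:0 chroma plane of width `cw`, with chroma samples co-sited with even luma columns; the function names `chroma_nearest` and `chroma_linear_h` are made up for illustration. Both functions read the same decoded plane and still return different 4:4:4 values, which is the kind of choice a codec spec leaves open.

```c
#include <stdint.h>

/* Nearest-neighbour upsampling: every luma position (x, y) simply reuses
 * the chroma sample of the 2x2 block it belongs to. */
uint8_t chroma_nearest(const uint8_t *c, int cw, int x, int y)
{
    return c[(y / 2) * cw + (x / 2)];
}

/* Horizontal linear interpolation (assumption: chroma is co-sited with even
 * luma columns). Odd columns get the rounded average of the two neighbouring
 * chroma samples; the right edge falls back to the last sample. */
uint8_t chroma_linear_h(const uint8_t *c, int cw, int x, int y)
{
    const uint8_t *row = c + (y / 2) * cw;
    int cx = x / 2;

    if ((x & 1) == 0 || cx + 1 >= cw)
        return row[cx];
    return (uint8_t)((row[cx] + row[cx + 1] + 1) / 2);
}
```

For a 2x2 chroma plane {10, 20, 30, 40}, the luma position (1, 0) gets 10 from the first function and 15 from the second, although the decoded data is identical.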

Quote:
Originally Posted by JEEB View Post
And if it's something like dithering post-decoding, I don't really get why it can't be decoded with int to get the exact output that was meant to be gotten (given if the specification specifies this -- and I would guess it actually might specify it given the fact that float math's results depend highly on the system/architecture etc.), and then converted to float with dithering for output/filtering/whatever your cat wants to do to it.

But maybe lossy audio codecs just make less sense than I thought.
AC3 and DTS do not compress in time domain. They do not compress PCM audio. They convert PCM to frequency domain, IIRC, and then compress the data in the frequency domain. When decompressing, the frequency data needs to be converted back to PCM, which usually results in floating point data. If you look at the AC3 and DTS decoder source code, you'll notice that it natively decodes to floating point data.

So the final question is: *After* the decoders have completed their task, what further processing is done to the decoded audio data? libavcodec and liba52/libdts do not differ so much in how they decode. They differ on how they post process. libavcodec violates processing laws by forcedly rounding down to 16bit integer. liba52 doesn't do that, it outputs the decoding result untouched. So the problem with libavcodec is not the decoding itself, it's the post processing, which can't be turned off. Not even via compiler switches or #defines.

Last edited by madshi; 6th March 2011 at 15:18.
Old 6th March 2011, 15:41   #13143  |  Link
JEEB
もこたんインしたお!
 
Join Date: Jan 2008
Location: Finland / Japan
Posts: 512
Quote:
Originally Posted by madshi View Post
...So the problem with libavcodec is not the decoding itself, it's the post processing, which can't be turned off. Not even via compiler switches or #defines.
Finally, the first person to actually make sense for me in this. So the problem really isn't in the decoding itself, but, as I was already kind of thinking, the output a post processor gives. Thank you for making this clear for me, as the multiple levels of audio hipsters have made this problem look like an actual decoder problem, which of course makes me go wee on the "How much does this make sense" scale.

+1 reason for ffmpeg not to reject a patch that lets it handle more than one type of output. Or something that would just let the calling application get the decoded output and do whatever it wants to the output, so the next generation of audio hipsters can get their float64 or float128 instead without concerning the ffmpeg project itself (*grin*).
__________________
[I'm human, no debug]
Old 6th March 2011, 15:49   #13144  |  Link
Kovensky
Registered User
 
Join Date: Oct 2008
Posts: 5
Quote:
Originally Posted by madshi View Post
A couple of years ago I tried getting patches in to allow floating point output via #define.
From AVCodecContext's definition:
Code:
/**
* audio sample format
* - encoding: Set by user.
* - decoding: Set by libavcodec.
*/
enum AVSampleFormat sample_fmt;  ///< sample format
A better approach for a patch would be to allow the user to set sample_fmt for decoding too. There are several audio functions in the API that deal with int16_t* btw, but that does *not* mean they only return int16_t.
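A rough sketch of what "int16_t* does not mean int16_t samples" implies for the caller: the buffer has to be interpreted according to the reported sample format. To keep the snippet self-contained it does not include the libavcodec headers; `DemoSampleFormat`, `demo_bytes_per_sample` and `demo_read_sample` are hypothetical stand-ins, not ffmpeg API. In real code you would switch on the `sample_fmt` field quoted above instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for libavcodec's enum AVSampleFormat. */
enum DemoSampleFormat { DEMO_FMT_U8, DEMO_FMT_S16, DEMO_FMT_S32, DEMO_FMT_FLT };

/* Size of one sample in bytes for the given format. */
size_t demo_bytes_per_sample(enum DemoSampleFormat fmt)
{
    switch (fmt) {
    case DEMO_FMT_U8:  return sizeof(uint8_t);
    case DEMO_FMT_S16: return sizeof(int16_t);
    case DEMO_FMT_S32: return sizeof(int32_t);
    case DEMO_FMT_FLT: return sizeof(float);
    }
    return 0;
}

/* Read sample i from a buffer that the API hands over as int16_t*,
 * reinterpreting it according to fmt and normalizing to [-1.0, 1.0). */
double demo_read_sample(const int16_t *buf, enum DemoSampleFormat fmt, size_t i)
{
    const void *raw = buf;

    switch (fmt) {
    case DEMO_FMT_U8:  return (((const uint8_t *)raw)[i] - 128) / 128.0;
    case DEMO_FMT_S16: return buf[i] / 32768.0;
    case DEMO_FMT_S32: return ((const int32_t *)raw)[i] / 2147483648.0;
    case DEMO_FMT_FLT: return ((const float *)raw)[i];
    }
    return 0.0;
}
```

The point of the design is that the pointer type only says where the data lives; the format field says what it is. A caller that always indexes the buffer as int16_t would read garbage as soon as the decoder reports a float format.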

The audio decoding API will soon be changed anyway to return AVFrames instead of the user having to manage buffers.
Old 6th March 2011, 16:29   #13145  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by Kovensky View Post
A better approach for a patch would be to allow the user to set sample_fmt for decoding too.
Yes, that would be nice.

Quote:
Originally Posted by Kovensky View Post
The audio decoding API will soon be changed anyway
Deja vu. I've been told that 3 years ago...
Old 6th March 2011, 17:34   #13146  |  Link
Kovensky
Registered User
 
Join Date: Oct 2008
Posts: 5
Quote:
Originally Posted by madshi View Post
Deja vu. I've been told that 3 years ago...
This time there are actual patches

EDIT: and this wouldn't affect or block any possible patches you make for sample format; it's just a user interface change, not internal ffmpeg stuff

Last edited by Kovensky; 6th March 2011 at 17:42.
Old 6th March 2011, 17:37   #13147  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Well, that sounds good...
Old 6th March 2011, 17:52   #13148  |  Link
ranpha
Registered User
 
Join Date: Feb 2008
Posts: 335
libdts at least should not be removed until libavcodec can properly decode the DTS core in a DTS-HD MA track. My test MKV files with DTS-HD MA tracks will not start playing if I use libavcodec to play the core audio track, but with libdts playback works perfectly.
Old 6th March 2011, 18:09   #13149  |  Link
STaRGaZeR
4:2:0 hater
 
Join Date: Apr 2008
Posts: 1,302
Quote:
Originally Posted by yesgrey View Post
If you also agree that it might be useful outputting with the same bit depth used on internal processing, why insisting in removing the other decoders? It was you, not me, who said they were redundant, and I only consider something to be redundant when there is another one which does exactly the same, and that's not what's happening.
Sorry if I put words in your mouth that weren't exact. I have no hidden agenda, I just gave my opinion.
Because a decoder is a lot more than what it outputs. Accepting this, the decoders are redundant. But it seems that this fact is not acknowledged here. As you can see there's still not a single argument against the removal based on audio quality or features supported, the ultimate goals of an audio decoder.

Since clsid is with the placebo guys, I'm outta this debate. At least I hope that the guy in charge of the patching does it right: by patching AAC, AC3, DTS, Vorbis, WMA and Nellymoser. That's another thing I've noticed, the guys complaining only complain because it affects them directly. ffdshow is a lot more than an AC3/DTS decoder for playing your DVDs. I don't remember anyone complaining when tremor (32-bit int output) was removed, leaving only libavcodec (16-bit int). A word exists for this: hypocrisy.

Quote:
Originally Posted by madshi View Post
Actually I just checked out the liba52 source code and it *DOES* decode to full floating point. So it is definitely better quality than (unpatched) ffmpeg/libavcodec. QED.
As I said, working in float or integer, by itself, means *NOTHING* to the final audio quality. QED.

And as always, you don't answer the key questions, only what's best for your interests. I will do the same with you from now on.

Quote:
Originally Posted by madshi View Post
AC3 and DTS do not compress in time domain. They do not compress PCM audio. They convert PCM to frequency domain, IIRC, and then compress the data in the frequency domain. When decompressing, the frequency data needs to be converted back to PCM, which usually results in floating point data. If you look at the AC3 and DTS decoder source code, you'll notice that it natively decodes to floating point data.
You're wrong. DTS doesn't work in the frequency domain. DTS compresses in the time domain, encoding filtered PCM audio as ADPCM or APCM. For details, see here: http://www.mp3-tech.org/programmer/d...whitepaper.pdf, pages 5 and 7. You don't seem to have informed yourself properly, and at the same time you like to judge others. I won't comment on that, it's pretty self explanatory.

Quote:
Originally Posted by ranpha View Post
libdts at least should not be removed until libavcodec can properly decode the DTS core in a DTS-HD MA track. My test MKV files with DTS-HD MA tracks will not start playing if I use libavcodec to play the core audio track, but with libdts playback works perfectly.
I've fixed that in my last build. Search a few pages back for it. And report if it works for you too!
__________________
Specs, GTX970 - PLS 1440p@96Hz
Quote:
Originally Posted by Manao View Post
That way, you have xxxx[p|i]yyy, where xxxx is the vertical resolution, yyy is the temporal resolution, and 'i' says the image has been irremediably destroyed.

Last edited by STaRGaZeR; 6th March 2011 at 19:29.
Old 6th March 2011, 18:25   #13150  |  Link
yesgrey
Registered User
 
Join Date: Sep 2004
Posts: 1,295
Quote:
Originally Posted by STaRGaZeR View Post
At least I hope that the guy in charge of the patching does it right: by patching AAC, AC3, DTS, Vorbis, WMA and Nellymoser.
Agreed.

Quote:
Originally Posted by STaRGaZeR View Post
I don't remember anyone complaining when tremor (32-bit int output) was removed, leaving only libavcodec (16-bit int). A word exists for this: hypocrisy.
There is another word for that (3 words to be exact): "lack of knowledge"

I don't know every single feature of ffdshow. It contains a lot more decoders than I will ever need, so it's natural for me to complain only about the parts that I know.
Old 6th March 2011, 18:50   #13151  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by STaRGaZeR View Post
As you can see there's still not a single argument against the removal based on audio quality
Oh yes, there is, you're just choosing to ignore it.

Quote:
Originally Posted by STaRGaZeR View Post
That's another thing I've noticed, the guys complaining only complain because it affects them directly. ffdshow is a lot more than an AC3/DTS decoder for playing your DVDs. I don't remember anyone complaining when tremor (32-bit int output) was removed, leaving only libavcodec (16-bit int). A word exists for this: hypocrisy.
Huh? I can only comment on things I know anything about. I've no clue what "tremor" is, never heard of it.

Quote:
Originally Posted by STaRGaZeR View Post
As I said, working in float or integer, by itself, means *NOTHING* to the final audio quality. QED.
Violating audio processing laws, resulting in measurable addition of quantization noise, does very much mean something to the final audio quality.

Quote:
Originally Posted by STaRGaZeR View Post
You're wrong. DTS doesn't work in the frequency domain. DTS compresses in the time domain, encoding filtered PCM audio as ADPCM or APCM.
Ok, didn't know that. But it doesn't change the fact that the libavcodec DTS decoder natively produces floating point samples. Rounding them to 16bit integer samples introduces quantization noise.
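The "measurable" part can be sketched in a few lines of C. This is an illustration only, not libavcodec code: the helper names `rounding_error_lsb` and `ramp_rounding_mse` are made up, and the test signal is an arbitrary slow ramp. For plain round-to-nearest the error is spread roughly uniformly over half an LSB either way, so the mean squared error should come out near 1/12 LSB² (about 0.29 LSB RMS, roughly -101 dB relative to full scale at 16 bit).

```c
#include <stddef.h>

/* Round half away from zero without needing libm. */
static long round_nearest(float v)
{
    return (long)(v >= 0.0f ? v + 0.5f : v - 0.5f);
}

/* Error, in 16-bit LSBs, introduced by rounding one float sample
 * (nominal range [-1.0, 1.0)) to the nearest 16-bit integer step. */
double rounding_error_lsb(float x)
{
    float scaled = x * 32768.0f;
    return (double)round_nearest(scaled) - (double)scaled;
}

/* Mean squared rounding error, in LSB^2, over a slow ramp that starts at
 * -1000 LSB and advances by step_lsb per sample (arbitrary test signal). */
double ramp_rounding_mse(size_t n, double step_lsb)
{
    double sum_sq = 0.0;

    for (size_t i = 0; i < n; i++) {
        float x = (float)((-1000.0 + (double)i * step_lsb) / 32768.0);
        double e = rounding_error_lsb(x);
        sum_sq += e * e;
    }
    return sum_sq / (double)n;
}
```

Calling `ramp_rounding_mse(100000, 0.0137)` should give a value close to 0.083. That number says the noise exists and can be measured; it says nothing by itself about whether anyone can hear it.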
Old 6th March 2011, 19:10   #13152  |  Link
STaRGaZeR
4:2:0 hater
 
Join Date: Apr 2008
Posts: 1,302
Quote:
Originally Posted by madshi View Post
Oh yes, there is, you're just choosing to ignore it.
Maybe I'm missing something, but I don't remember you saying anywhere that liba52, libdts, libfaad or libmad sound better than unpatched libavcodec? Can you point me to it? All I read is "32 to 16 sucks, adds shit to the decoded PCM". While this is 100% true, you're comparing patched libavcodec with unpatched libavcodec. And the debate is not there.

Quote:
Originally Posted by madshi View Post
Huh? I can only comment on things I know anything about. I've no clue what "tremor" is, never heard of it.
That's the problem, there's a lot of things you don't know about ffdshow and stuff in general, yet you're here pretending to be some kind of I-know-it-all guy. Just to put you in perspective: it was the same situation, nobody objected. tremor sounded like crap compared to libavcodec, despite outputting 32-bit int samples.

Quote:
Originally Posted by madshi View Post
Violating audio processing laws, resulting in measurable addition of quantization noise, does very much mean something to the final audio quality.
Aha. Since I'm sure you're not talking out of your a** when you say it does very much mean something to the final audio quality, where's that audio quality comparison between liba52/dts/faad/mad and unpatched libavcodec? Where are the blind tests?

Quote:
Originally Posted by madshi View Post
Ok, didn't know that. But it doesn't change the fact that the libavcodec DTS decoder natively produces floating point samples. Rounding them to 16bit integer samples introduces quantization noise.
Read the first quote: the debate is not here.
Old 6th March 2011, 19:40   #13153  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by STaRGaZeR View Post
All I read is "32 to 16 sucks, adds shit to the decoded PCM".
Not really. If libavcodec *dithered* down to 16bit, I would agree with removing liba52 and libdts. The problem is not doing a conversion from 32bit float to 16bit integer. That's fine with me. The problem is that libav is doing the conversion in a bad way.
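For readers wondering what "dithered down to 16bit" would look like compared to a plain conversion, here is a minimal sketch in C. It is not how libavcodec, liba52 or libdts do it; the function names are invented, the dither is simple TPDF (difference of two uniform random values, spanning ±1 LSB) without noise shaping, and the random source is a small fixed-seed LCG chosen only to keep the example deterministic.

```c
#include <stdint.h>

/* Clamp a scaled sample to the signed 16-bit range. */
static int16_t clamp_s16(long v)
{
    if (v > 32767) return 32767;
    if (v < -32768) return -32768;
    return (int16_t)v;
}

/* Round half away from zero without needing libm. */
static long round_nearest(float v)
{
    return (long)(v >= 0.0f ? v + 0.5f : v - 0.5f);
}

/* Plain conversion: scale, round, clamp. Anything below half an LSB is lost. */
int16_t float_to_s16_plain(float x)
{
    return clamp_s16(round_nearest(x * 32768.0f));
}

/* Small LCG, fixed seed, returns a value in [0, 1). Illustration only. */
static uint32_t rng_state = 1u;
static float uniform01(void)
{
    rng_state = rng_state * 1664525u + 1013904223u;
    return (float)(rng_state >> 8) / 16777216.0f;
}

/* Dithered conversion: add triangular (TPDF) noise of +/-1 LSB before rounding. */
int16_t float_to_s16_tpdf(float x)
{
    float a = uniform01();
    float b = uniform01();
    return clamp_s16(round_nearest(x * 32768.0f + (a - b)));
}

/* Average converter output over n repetitions of the same input sample. */
double mean_output(int16_t (*convert)(float), float x, int n)
{
    double sum = 0.0;

    for (int i = 0; i < n; i++)
        sum += convert(x);
    return sum / n;
}
```

Feed both converters a constant input of 0.3 LSB (`0.3f / 32768.0f`): the plain version returns 0 every time, so the signal is gone, while the dithered version should average close to 0.3 over many samples. The low-level information survives as noise instead of being truncated away.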

Quote:
Originally Posted by STaRGaZeR View Post
That's the problem, there's a lot of things you don't know about ffdshow and stuff in general, yet you're here pretending to be some kind of I-know-it-all guy.
I don't "know it all". But I do know some things, one of them is how audio and video data processing should be done. And I know that libav's conversion from 32fp to 16int is done in a bad way, which adds quantization noise.

Quote:
Originally Posted by STaRGaZeR View Post
Aha. Since I'm sure you're not talking out of your a** when you say it does very much mean something to the final audio quality, where's that audio quality comparison between liba52/dts/faad/mad and unpatched libavcodec? Where are the blind tests?
I don't trust in blind tests I haven't faked myself.

I think you're misunderstanding me a bit: I do not explicitly claim that liba52 sounds noticeably better than libavcodec. I don't know that for a fact. *However*, there's a known problem with libavcodec decoding quality, while there is no known problem with liba52 decoding quality. This alone is IMHO a key argument to not remove liba52 until the libavcodec problem is fixed. I don't see any sense in removing a decoder which has no known audio quality problems in favor of another decoder which does have known audio quality problems. And that's basically what I was trying to say from the start.
Old 6th March 2011, 19:53   #13154  |  Link
fastplayer
Registered User
 
Join Date: Nov 2006
Posts: 799
Quote:
Originally Posted by STaRGaZeR View Post
I've fixed the last issue I had with libavcodec and DTS streams. So here's a test build, so you guys can torture it. What to do:

- Test everything, DTS, all variants of DTS-HD (only the DTS core will be decoded obviously), switching, retarded splitters, etc.
- Test only software decoding. No bitstreaming.
- If you're going to report anything, try libdts and confirm you don't have the issue with it before reporting.
- Since someone is going to say it, 16-bit output is not an issue. Decoding failures are.

And if possible, do the same with AC3, AAC, MP1/2/3.

http://www.mediafire.com/?ac6bc5s3wb1st9u
Just a bump in case people missed it. Please test!
Old 6th March 2011, 20:48   #13155  |  Link
clsid
*****
 
Join Date: Feb 2005
Posts: 5,643
Here is a test build with fp32 output for libavcodec AC3 and DTS:
http://www.sendspace.com/file/lrjn02
__________________
MPC-HC 2.1.7.2
Old 6th March 2011, 20:50   #13156  |  Link
STaRGaZeR
4:2:0 hater
 
Join Date: Apr 2008
Posts: 1,302
Quote:
Originally Posted by madshi View Post
Not really. If libavcodec *dithered* down to 16bit, I would agree with removing liba52 and libdts. The problem is not doing a conversion from 32bit float to 16bit integer. That's fine with me. The problem is that libav is doing the conversion in a bad way.
I know that. But we're not talking about ifs. I'll paste you here the question you didn't selectively answer, since you accused me of something I didn't do, and it's also very relevant to this debate:

Quote:
Originally Posted by STaRGaZeR View Post
Maybe I'm missing something, but I don't remember you saying anywhere that liba52, libdts, libfaad or libmad sound better than unpatched libavcodec? Can you point me to it?
Thanks.

Quote:
Originally Posted by madshi View Post
I don't "know it all". But I do know some things, one of them is how audio and video data processing should be done. And I know that libav's conversion from 32fp to 16int is done in a bad way, which adds quantization noise.
I know that too. There's no need to repeat it in every post. However, thanks.

Quote:
Originally Posted by madshi View Post
I don't trust in blind tests I haven't faked myself.

I think you're misunderstanding me a bit: I do not explicitly claim that liba52 sounds noticeably better than libavcodec. I don't know that for a fact. *However*, there's a known problem with libavcodec decoding quality, while there is no known problem with liba52 decoding quality. This alone is IMHO a key argument to not remove liba52 until the libavcodec problem is fixed. I don't see any sense in removing a decoder which has no known audio quality problems in favor of another decoder which does have known audio quality problems. And that's basically what I was trying to say from the start.
Just in case it wasn't clear enough already, I'll tell you again what I think is THE flaw in your argument: the bolded part. Nobody here should be thinking about what happens internally in ffdshow, you should only care about the final result. You base your claim on problems you can't hear, on numbers you can't hear. Well, let's rephrase that a bit: on stuff I can't hear. I'm human, I have limitations. A lot of them. I've been asking you to prove me wrong since the very beginning. You're constantly ignoring that simple request, for example when I ask you for blind tests, you talk about faking and all that. Do we trust the numbers, or do we trust our ears? This is not an academic signal processing exercise. Prove me wrong on the field. I know what should be done from the academic point of view, and I'll apply it once it's done in the proper place to do it: the ffmpeg repository. Until then, I'll trust my ears unless someone comes up with an audio quality objection (based on his or someone's ears, of course).

And just for the record: the libs removal was in no way immediate. Time is needed to fix any possible bugs in ffdshow's implementation of libavcodec. You, in the meantime, should try to get the ffmpeg stuff done. That's where you excel, and with ffmpeg's new direction I see it as doable. Go for it.
Old 6th March 2011, 21:00   #13157  |  Link
STaRGaZeR
4:2:0 hater
 
Join Date: Apr 2008
Posts: 1,302
Quote:
Originally Posted by clsid View Post
Here is a test build with fp32 output for libavcodec AC3 and DTS:
http://www.sendspace.com/file/lrjn02
Works just fine here.
Old 6th March 2011, 21:03   #13158  |  Link
pirlouy
_
 
Join Date: May 2008
Location: France
Posts: 692
Quote:
Originally Posted by STaRGaZeR View Post
You base your claim in problems you can't hear, in numbers you can't hear. Well, let's rephrase that a bit: in stuff I can't hear. I'm human, I have limitations.
I'm a noob, so you can ignore this, but on the Hydrogenaudio forum (which I dislike because it's too harsh), they ban you if you base things on what you hear and not on numbers, graphs, etc.

In my case, I don't hear any difference between sources in 16 bits, 24 bits, 44100Hz, 96000Hz etc.

But even if it's boring for you, it's a good thing that there is a discussion. Better now than later.
Old 6th March 2011, 21:26   #13159  |  Link
Astrophizz
Registered User
 
Join Date: Jul 2008
Posts: 184
Quote:
Originally Posted by pirlouy View Post
I'm a noob, so you can ignore this, but on the Hydrogenaudio forum (which I dislike because it's too harsh), they ban you if you base things on what you hear and not on numbers, graphs, etc.
AFAIK on Hydrogenaudio they actually prefer if you can provide ABX results for your comparisons. That's based on what you hear and not on numbers. But that's for more minute differences than 32 bit downconverted with and without dithering, for which they have an agreed upon stance (the same as madshi's) based on past ABX tests. That's also why they don't like certain people who post here on Doom9 that use company press releases as evidence for superior audio quality and don't provide ABX comparisons for their claims that (eg.) upsampling 44.1 kHz to 96 kHz makes audio sound better.
Old 6th March 2011, 21:42   #13160  |  Link
Gleb Egorych
Registered User
 
Join Date: Aug 2008
Posts: 231
Quote:
Originally Posted by STaRGaZeR View Post
AAC, AC3, DTS, Vorbis, WMA and Nellymoser
E-AC3 as well.
Quote:
Originally Posted by STaRGaZeR View Post
hypocrisy
I guess very few people need Vorbis in ffdshow. In fact only the AAC, (E-)AC3 and DTS decoders need to be patched. Everything else either works as it should or is simply not used. BTW I don't know who needs 12 (twelve) software deinterlacers (and +1 HW) in ffdshow.

Quote:
Originally Posted by STaRGaZeR View Post
DTS doesn't work in the frequency domain. DTS compresses in the time domain, encoding filtered PCM audio as ADPCM or APCM. For details, see here: http://www.mp3-tech.org/programmer/d...whitepaper.pdf, pages 5 and 7.
I've read that paper. DTS encoder clearly works in frequency domain. To be exact, it operates in joint time-frequency domain, for every given time-window it decomposes input signal into frequency sub-bands and then works with that decomposition.

Quote:
Originally Posted by clsid View Post
Here is a test build with fp32 output for libavcodec AC3 and DTS
Thanks, clsid

Last edited by Gleb Egorych; 6th March 2011 at 22:04.
Tags
ffdshow, ffdshow tryouts, ffdshow-mt, ffplay, icl
