Old 11th October 2007, 22:09   #721  |  Link
surfer63
Registered User
 
Join Date: Sep 2007
Posts: 5
@DarkAvenger

The complete output of make clean and make verbose=1 is too long to fit in a post, so I will send it to you in a private message (via this forum).
Old 12th October 2007, 08:32   #722  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by wisodev View Post
I have fixed the crash problem when using my builds (actually there is one problem with the libaften.dll built by Kurtnoise13 when used on a system without SSE3, but Kurtnoise13's aften.exe build works perfectly) and released updated binaries! My aften.exe and libaften.dll builds are now working without any problems.
Thank you for the updated binaries! Two little questions:

(1) Is there a specific reason why the libaften.dll is much bigger than the aften.exe? Shouldn't it be the other way round in theory?

(2) Those different builds with SSE(1,2,3) : Which build should I distribute with my software if I don't know if SSE will be available or not? Why are there different builds in the first place? Wouldn't it make more sense to have only one build which internally switches between different branches, depending on what the current CPU supports?

Thanks!
Old 12th October 2007, 09:23   #723  |  Link
wisodev
Registered User
 
Join Date: Nov 2006
Posts: 161
Quote:
Originally Posted by madshi View Post
Thank you for the updated binaries! Two little questions:

(1) Is there a specific reason why the libaften.dll is much bigger than the aften.exe? Shouldn't it be the other way round in theory?

(2) Those different builds with SSE(1,2,3) : Which build should I distribute with my software if I don't know if SSE will be available or not? Why are there different builds in the first place? Wouldn't it make more sense to have only one build which internally switches between different branches, depending on what the current CPU supports?

Thanks!
(1) Actually it's the opposite: aften.exe is bigger than libaften.dll. I'm talking about the aften-0.0.8-icl10 release (this is the recommended build).

E.g.
aften_x86\aften.exe - 281 KB
libaftendll_x86\libaften.dll - 233 KB

You only need libaften.dll from the libaftendll_* directory. The other files are needed when you link your program dynamically with libaften.dll. aften.exe is there to test libaften.dll (this exe is linked dynamically with libaften.dll, and I also use it for PGO optimizations when building with the Intel C++ Compiler).

(2) I know it's confusing, but I explained earlier in this thread why I do it this way. The aften_x86_SSE, aften_x86_SSE2 and aften_x86_SSE3 builds (and the corresponding libaften.dll builds) were built with special compiler switches that enable some optimizations for newer CPUs, and they will not run on CPUs that lack the specific instruction support (e.g. the aften_x86_SSE3 build will not run on a machine without SSE3 support). These builds are a bit faster than the aften_x86 builds (not on all machines, but on mine they are). This also applies to the *_AMD64 builds.

So you should use the aften_x86 and libaftendll_x86 builds (or aften_AMD64 and libaftendll_AMD64, respectively). These are universal binaries for Win32 (Win64) machines. They have built-in MMX, SSE, SSE2 and SSE3 optimizations but use them only when the CPU supports them.
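
To illustrate the idea behind the universal builds (one binary that picks an optimized code path at run time), here is a minimal sketch. It is not Aften's actual code: CpuCaps, detect_caps, scale_c, scale_sse and select_scale are made-up names, and the "SSE" routine is only a stand-in that calls the plain C loop. The detection assumes the GCC/Clang builtin __builtin_cpu_supports on x86; with any other compiler or architecture the sketch simply falls back to the plain C path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability flags; not the layout of Aften's cpu_caps.h. */
typedef struct {
    int has_sse;
    int has_sse2;
    int has_sse3;
} CpuCaps;

/* Query the CPU once at startup. Everything is reported as absent when
   the GCC/Clang x86 builtins are not available. */
CpuCaps detect_caps(void)
{
    CpuCaps caps = {0, 0, 0};
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    caps.has_sse  = __builtin_cpu_supports("sse")  != 0;
    caps.has_sse2 = __builtin_cpu_supports("sse2") != 0;
    caps.has_sse3 = __builtin_cpu_supports("sse3") != 0;
#endif
    return caps;
}

typedef void (*scale_fn)(float *buf, size_t n, float gain);

/* Portable reference implementation. */
void scale_c(float *buf, size_t n, float gain)
{
    assert(buf != NULL);
    for (size_t i = 0; i < n; i++)
        buf[i] *= gain;
}

/* Stand-in for a hand-optimized SSE routine; a real one would use
   intrinsics or assembly. Here it just reuses the C loop. */
void scale_sse(float *buf, size_t n, float gain)
{
    scale_c(buf, n, gain);
}

/* Pick the fastest routine the current CPU can run. */
scale_fn select_scale(const CpuCaps *caps)
{
    assert(caps != NULL);
    return caps->has_sse ? scale_sse : scale_c;
}
```

A caller would run detect_caps() once, keep the function pointer returned by select_scale(), and call through it for every buffer. The same binary then runs on CPUs with and without SSE, which is the property the aften_x86 builds are described as having.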

wisodev

Last edited by wisodev; 12th October 2007 at 09:29.
Old 12th October 2007, 09:39   #724  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Thank you!
Old 12th October 2007, 14:28   #725  |  Link
DarkAvenger
HeadAC3he coder
 
Join Date: Oct 2001
Posts: 413
@surfer63

I looked through the output and I don't understand why you get the error. Either your version of gcc (could you give me the output of gcc -v?) or your binutils seems to be buggy.

Could you test whether commenting out

TEST_COMPILER_VISIBILITY()

in the CMakeLists.txt helps? Do you by chance have some older version of aften installed?

Last edited by DarkAvenger; 12th October 2007 at 14:30.
Old 12th October 2007, 16:03   #726  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
I think there's a bug in aften.c:

Code:
    if(!opts.pad_start) {

      [...]   // code adds padding here

    }
Or am I missing/misunderstanding something?
Old 12th October 2007, 16:48   #727  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,890
Quote:
Originally Posted by madshi View Post
Or am I missing/misunderstanding something?
It isn't a bug:
Code:
[-pad #] Start-of-stream padding
            The AC-3 format uses an overlap/add cycle for encoding
            each block.  By default, Aften pads the delay buffer
            with a block of silence to avoid inaccurate encoding
            of the first frame of audio.  If this behavior is not
            wanted, it can be disabled.  The pad value can be a
            1 (default) to use padding or 0 to not use padding.
Each frame uses the last 256 samples of the preceding frame to encode the current frame (time -> frequency domain). This is a problem for the first frame, and there are two ways to handle it:

With -pad 1 (default), the first 256 samples (the delay buffer) are filled with silence, which introduces a 5.33 ms delay at 48 kHz, and the real samples are encoded properly.

With -pad 0 [if(!opts.pad_start)], the first 256 samples are filled with real samples: there is no delay, but there is something like a fade-in over the first 5.33 ms.
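
The 5.33 ms figure is just 256 samples divided by the sample rate. A small sketch of that arithmetic, with names (samples_to_ms, A52_DELAY_SAMPLES) invented for the example rather than taken from the Aften source:

```c
#include <assert.h>

#define A52_DELAY_SAMPLES 256  /* overlap carried over from the previous frame */

/* Duration of `samples` audio samples, in milliseconds, at `sample_rate` Hz. */
double samples_to_ms(int samples, int sample_rate)
{
    assert(samples >= 0);
    assert(sample_rate > 0);
    return 1000.0 * (double)samples / (double)sample_rate;
}
```

samples_to_ms(A52_DELAY_SAMPLES, 48000) gives about 5.33 ms; at 44100 Hz the same 256 samples last about 5.8 ms, so the delay depends on the sample rate of the stream.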

Last edited by tebasuna51; 12th October 2007 at 16:51.
Old 12th October 2007, 17:57   #728  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by tebasuna51 View Post
With -pad 1 (default), the first 256 samples (delay buffer) are already filled with silence (and introduce a 5.33 ms delay if 48 KHz.) and the real samples are encoded properly.

With -pad 0 [if(!opts.pad_start)] the first 256 samples are filled with real samples, without delay but with something like a fade-in in first 5.33 ms.
I don't see that in the source code. When "opts.pad_start" is 0, there is an additional "aften_encode_frame" call in the source code, where the first 1280 samples are set to zero and only the last 256 samples are real samples. When "opts.pad_start" is 1, there is no padding at all, as far as I can see from the source code.

Or is there some magic going on behind the scenes? But how can an additional "aften_encode_frame" call result in less padding?



I think the code should read "if(opts.pad_start)". But well, maybe I'm embarrassing myself right now...
Old 12th October 2007, 18:12   #729  |  Link
surfer63
Registered User
 
Join Date: Sep 2007
Posts: 5
@DarkAvenger

I have been working on the "not-compiling" of 0.0.8 for some weeks now. Some of my (more clever) fellow programmers found the solution.
Code:
export CFLAGS=-fno-common
cmake -DSHARED=1 ..
make
sudo make install
Apparently, global variables that are defined in different object files need to be initialised on the Mac. The -fno-common option will initialise these variables to zero.
Sorry for taking your time and then solving it myself.
Old 12th October 2007, 18:44   #730  |  Link
DarkAvenger
HeadAC3he coder
 
Join Date: Oct 2001
Posts: 413
@surfer63

Ah, thx for the hint. I remember now I had this problem once with OpenAL. A shame that I forgot about it...

Could you try this patch:
Code:
Index: libaften/exponent.c
===================================================================
--- libaften/exponent.c (Revision 563)
+++ libaften/exponent.c (Arbeitskopie)
@@ -29,7 +29,7 @@

 #include "cpu_caps.h"

-uint16_t expstr_set_bits[6][256];
+uint16_t expstr_set_bits[6][256] = {{0}};

 static void process_exponents(A52ThreadContext *tctx);

Index: libaften/window.c
===================================================================
--- libaften/window.c   (Revision 563)
+++ libaften/window.c   (Arbeitskopie)
@@ -33,7 +33,7 @@
 #include "cpu_caps.h"


-ALIGN16(FLOAT) a52_window[512];
+ALIGN16(FLOAT) a52_window[512] = {0};

 static void
 apply_a52_window(FLOAT *samples)
Please delete CMakeCache.txt, don't set -fno-common, run cmake and make, and report back whether it worked. Thx!

Last edited by DarkAvenger; 12th October 2007 at 18:48.
Old 12th October 2007, 19:06   #731  |  Link
surfer63
Registered User
 
Join Date: Sep 2007
Posts: 5
@DarkAvenger.

The patch does work for a plain "cmake ..".

However, I would like to have a dynamic library. When I use "cmake -DSHARED=1 .." I get the following error.

Code:
Linking C shared library libaften.dylib
ld: common symbols not allowed with MH_DYLIB output format with the -multi_module option
CMakeFiles/aften.dir/libaften/a52enc.o private external definition of common _nexpgrptab (size 3072)
/usr/bin/libtool: internal link edit command failed
make[2]: *** [libaften.0.0.8.dylib] Error 1
make[1]: *** [CMakeFiles/aften.dir/all] Error 2
make: *** [all] Error 2
When I reapply export CFLAGS=-fno-common, cmake -DSHARED=1 .. and make work again.
Old 12th October 2007, 21:55   #732  |  Link
DarkAvenger
HeadAC3he coder
 
Join Date: Oct 2001
Posts: 413
Well, if you look at the error message, you'll see it complains about another variable.

Try putting this patch on top. I wonder why no other mac user complained before...
Code:
Index: libaften/a52enc.c
===================================================================
--- libaften/a52enc.c   (Revision 563)
+++ libaften/a52enc.c   (Arbeitskopie)
@@ -46,7 +46,7 @@
  * LUT for number of exponent groups present.
  * expsizetab[exponent strategy][number of coefficients]
  */
-int nexpgrptab[3][256];
+int nexpgrptab[3][256] = {{0}};

 /**
  * Pre-defined sets of exponent strategies. A strategy set is selected for
Old 13th October 2007, 08:41   #733  |  Link
surfer63
Registered User
 
Join Date: Sep 2007
Posts: 5
@DarkAvenger

This second patch does the job. Aften compiles/builds fine now, also when building a shared library. I will start using/testing.


Thanks a lot for your help and good work!

Last edited by surfer63; 13th October 2007 at 08:52.
Old 13th October 2007, 16:10   #734  |  Link
DarkAvenger
HeadAC3he coder
 
Join Date: Oct 2001
Posts: 413
Thx, committed.

@madshi

I think you are right. I don't understand it either. Justin?

Last edited by DarkAvenger; 13th October 2007 at 16:23.
Old 13th October 2007, 16:57   #735  |  Link
jruggle
Registered User
 
Join Date: Jul 2006
Posts: 276
Quote:
Originally Posted by madshi View Post
I don't see that in the source code. When "opts.pad_start" is 0, there is an additional "aften_encode_frame" call in the source code, where the first 1280 samples are set to zero and only the last 256 samples are real samples. When "opts.pad_start" is 1, there is no padding at all, as far as I can see from the source code.

Or is there some magic going on behind the scenes? But how can an additional "aften_encode_frame" call result in less padding?



I think the code should read "if(opts.pad_start)". But well, maybe I'm embarassing myself right now...
I'll try to explain the process better.

The encoder reads 1536 samples (1 frame) at a time, but due to the overlap/add process it needs 256 samples from the previous frame. At the start of encoding, those "delay" samples are just initialized to zero. This is the standard way of doing things, but will end up delaying the decoded output. When the option to get rid of the delay is turned on, Aften reads 256 input samples to prime the delay instead of zeros. That eliminates the decoding delay. The call to encode_frame() is really only used to get those samples into the delay buffer. The resulting frame is not actually written to the file output.
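
The priming step described above can be sketched with a toy model. This is not the code from aften.c: ToyEncoder, toy_init, toy_encode_frame and toy_prime_delay are hypothetical names, and the "encoder" does no transform at all. It only keeps the last 256 samples of each frame as the delay for the next one, which is the one property the explanation relies on.

```c
#include <assert.h>
#include <string.h>

#define FRAME_SAMPLES 1536  /* samples read per frame */
#define DELAY_SAMPLES 256   /* samples carried over from the previous frame */

/* Toy stand-in for the encoder state: only the overlap ("delay") buffer. */
typedef struct {
    float delay[DELAY_SAMPLES];
} ToyEncoder;

/* Standard start-up: the delay buffer begins as silence. */
void toy_init(ToyEncoder *enc)
{
    assert(enc != NULL);
    for (int i = 0; i < DELAY_SAMPLES; i++)
        enc->delay[i] = 0.0f;
}

/* "Encode" one frame. A real encoder would transform delay + frame;
   the toy only remembers the frame's tail as the next delay. */
void toy_encode_frame(ToyEncoder *enc, const float *frame)
{
    assert(enc != NULL && frame != NULL);
    memcpy(enc->delay, frame + FRAME_SAMPLES - DELAY_SAMPLES,
           sizeof enc->delay);
}

/* Prime the delay buffer with the first 256 real input samples by
   encoding one frame of 1280 zeros followed by those samples. The
   caller discards whatever that frame would produce. Returns the
   number of input samples consumed. */
int toy_prime_delay(ToyEncoder *enc, const float *input, int available)
{
    float frame[FRAME_SAMPLES] = {0};

    assert(input != NULL);
    assert(available >= DELAY_SAMPLES);
    memcpy(frame + FRAME_SAMPLES - DELAY_SAMPLES, input,
           DELAY_SAMPLES * sizeof(float));
    toy_encode_frame(enc, frame);
    return DELAY_SAMPLES;
}
```

After toy_prime_delay() the delay buffer holds real audio instead of silence, so the first frame that is actually written starts 256 samples into the input and the decoded output is not shifted by a block of silence.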

Hope that helps.
Old 13th October 2007, 17:05   #736  |  Link
jruggle
Registered User
 
Join Date: Jul 2006
Posts: 276
Quote:
Originally Posted by DarkAvenger View Post
Thx, commited.

@madshi

I think you are right. I don't understand it, as well. Justin?
I don't fully understand linking, and even less with shared libs. But yeah, it just looks like gcc on mac complains about uninitialized globals. I always thought they were initialized to zero by default, but maybe that's just the static ones...?

FFmpeg probably has 50+ of these. I wonder why they don't have the same complaints... Maybe something in the build system turns on the right compiler/linker flags?
Old 13th October 2007, 17:33   #737  |  Link
DarkAvenger
HeadAC3he coder
 
Join Date: Oct 2001
Posts: 413
Oh, I was rather referring to the padding issue. [Edit] Forget about it, I hadn't seen your first post, but now I have.

Regarding the linker: it seems to be some feature of the Mach-O binary format or the like. Yes, as far as I have learnt, it can be avoided by a compiler flag (-fno-common), by a linker flag (something with single module), or by explicitly initializing the variables. I am not sure what the right way would be. According to what I read on a mailing list, the linker flag should be the right way, but well... I think fixes in the C code are more robust than forcing compiler/linker flags.
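
For reference, a small sketch of the difference between the two kinds of global definition involved here. The names lut_tentative, lut_defined and table_is_zero are invented for the example. As far as the C language is concerned both arrays start out all zero; what differs is how a compiler that defaults to -fcommon (as gcc did at the time) may emit them: the one without an initializer can become a "common" symbol, which is the thing the Mac linker error above complains about, while the explicitly initialized one is an ordinary definition.

```c
#include <assert.h>
#include <stddef.h>

/* Tentative definition: no initializer. Still zero at program start,
   but may be emitted as a common symbol when compiling with -fcommon. */
int lut_tentative[3][256];

/* Explicit initializer: an ordinary definition, never a common symbol.
   This is the style used by the patches in this thread. */
int lut_defined[3][256] = {{0}};

/* Return 1 if every entry of a rows x 256 table is zero, else 0. */
int table_is_zero(int (*table)[256], size_t rows)
{
    assert(table != NULL);
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < 256; c++)
            if (table[r][c] != 0)
                return 0;
    return 1;
}
```

Both table_is_zero(lut_tentative, 3) and table_is_zero(lut_defined, 3) return 1, so adding the initializer does not change program behaviour; it only changes how the symbol is handed to the linker.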

Last edited by DarkAvenger; 13th October 2007 at 17:42.
Old 13th October 2007, 20:56   #738  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by jruggle View Post
I'll try to explain the process better.

The encoder reads 1536 samples (1 frame) at a time, but due to the overlap/add process it needs 256 samples from the previous frame. At the start of encoding, those "delay" samples are just initialized to zero. This is the standard way of doing things, but will end up delaying the decoded output. When the option to get rid of the delay is turned on, Aften reads 256 input samples to prime the delay instead of zeros. That eliminates the decoding delay. The call to encode_frame() is really only used to get those samples into the delay buffer. The resulting frame is not actually written to the file output.

Hope that helps.
Thank you for the explanation! But I'm still a bit confused. For me the code in "aften.c" reads like this:

pad 0:
Code:
encode_frame(1280 zero samples + 256 real samples);
repeat
  encode_frame(1536 real samples);
until stream_end;
pad 1:
Code:
repeat
  encode_frame(1536 real samples);
until stream_end;
Do I read that correctly? Here's what I have problems with:

(1) With "pad 0": How does the encoder know that the "1280+256" frame is only meant to initialize the delay buffer and not meant to be output?

(2) With "pad 1": How does the encoder know that the first encode_frame call (which has 1536 real samples in it) is meant to be output?

If your explanation is correct (which it surely is), the encoder behaves differently with "pad 0" and "pad 1". With "pad 0" the encoder just eats the first frame and doesn't output it. With "pad 1" the first frame is output. I just don't see how the encoder can distinguish between "pad 0" and "pad 1", because I don't see anything in the code where the encoder is told which pad setting is used!
Old 13th October 2007, 21:29   #739  |  Link
DarkAvenger
HeadAC3he coder
 
Join Date: Oct 2001
Posts: 413
The mdct keeps a buffer of the last 256 samples by itself, thus it works as explained.

Regarding (1): The encoder doesn't know it, but the front-end doesn't write the encoded frame. Your pseudo code is missing the write_encoded_frame call, which only happens in the loop. I also hadn't noticed this until Justin explained it...

Last edited by DarkAvenger; 13th October 2007 at 21:33.
Old 13th October 2007, 22:20   #740  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by DarkAvenger View Post
The mdct keeps a buffer of the last 256 samples by itself, thus it works as explained.

reg. (1) The encoder doesn't know it. But the front-end doesn't write the encoded frame. In your pseudo code you are missing the write_encoded_frame, which only happens in the loop. I also haven't noticed this until Justin explained it...
The first two "encode_frame" calls always return an output length of 0. So the missing write_encoded_frame has no effect.