Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. Domains: forum.doom9.org / forum.doom9.net / forum.doom9.se |
|
|
#41 | Link |
|
Registered User
Join Date: Aug 2024
Location: Between my two ears
Posts: 959
|
Oops, looks like the old CELT has better (lower) generational loss than libopus.
https://workupload.com/archive/HEpHj2AnfA
That might have a slight chance of explaining why FFmpeg's experimental native Opus encoder also does pretty well: it only implements CELT. BUUUUUT... when I tested libopus with a frame length under 10 ms (I used 5 ms), which makes SILK impossible, that whooshing sound was still there. So, who knows...
Last edited by Z2697; 14th September 2025 at 19:34. |
|
|
|
|
|
#42 | Link |
|
Donor
Join Date: Jun 2024
Location: South Africa
Posts: 680
|
Sounds pretty good.
Who knows where the culprit lies with libopus. Perhaps it's a bug or oversight, masked in ordinary usage. At high bitrates, does Opus use SILK at all for certain components?
Last edited by GeoffreyA; 13th September 2025 at 11:27. |
|
|
|
|
|
#43 | Link |
|
Registered User
Join Date: Aug 2024
Location: Between my two ears
Posts: 959
|
I think it should use CELT-only mode for our "normal" bitrates...
But the structure of libopus and opus-tools is, uhh, a bit too much for me, so I can only be 99% sure. (Look: you need (lib)opus, libopusenc, opusfile and opus-tools, 4 repos, just to build this opusenc.exe tool!)
Code:
/* Threshold bit-rates for switching between mono and stereo */
static const opus_int32 stereo_voice_threshold = 19000;
static const opus_int32 stereo_music_threshold = 17000;
/* Threshold bit-rate for switching between SILK/hybrid and CELT-only */
static const opus_int32 mode_thresholds[2][2] = {
/* voice */ /* music */
{ 64000, 10000}, /* mono */
{ 44000, 10000}, /* stereo */
};
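A rough sketch of how a table like this could be read, in Python for brevity. The real decision in libopus has more inputs than this, so `pick_mode` and its arguments are hypothetical names, and the block only illustrates the table lookup implied by the comment above, not the encoder's actual logic.

```python
# Hypothetical, simplified reading of the mode_thresholds table quoted above.
# Values are copied from the snippet; everything else is an assumption.
MODE_THRESHOLDS = {
    ("mono", "voice"): 64000,
    ("mono", "music"): 10000,
    ("stereo", "voice"): 44000,
    ("stereo", "music"): 10000,
}


def pick_mode(bitrate_bps, channels, content):
    """Return "CELT-only" at or above the threshold, "SILK/hybrid" below it."""
    threshold = MODE_THRESHOLDS[(channels, content)]
    return "CELT-only" if bitrate_bps >= threshold else "SILK/hybrid"


if __name__ == "__main__":
    for content in ("voice", "music"):
        print(content, pick_mode(96000, "stereo", content))
```

Under this reading, 96 kbit/s stereo sits above every threshold in the table, which fits the guess that CELT-only mode is used at "normal" bitrates.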
|
|
|
|
|
|
#45 | Link |
|
Registered User
Join Date: Aug 2024
Location: Between my two ears
Posts: 959
|
IIRC, in automatic mode (which is the default) the voice / music type detection runs on every frame. There's also no "lowest bitrate constraint" unless you use hard CBR, so there's no guarantee of CELT-only mode even if the content type detection is overridden.
Even after I modified the encoder to always use CELT mode, the artifact is still there. Well... maybe it's time to git bisect. Let's go. Oh no, need to sleep first.
Last edited by Z2697; 13th September 2025 at 21:22. |
|
|
|
|
|
#46 | Link |
|
Donor
Join Date: Jun 2024
Location: South Africa
Posts: 680
|
It's interesting how the audio would be unfolded into voice and music, but I suppose that's a solved problem.
If I remember correctly, there were changes when it became Opus and wasn't a straight port of CELT. In other words, it might be like finding the needle in a haystack. Well, I hope you slept well and are ready to vanquish the code
|
|
|
|
|
|
#47 | Link |
|
Registered User
Join Date: Aug 2024
Location: Between my two ears
Posts: 959
|
*Vanquished by the code*
Luckily there's a "primitive" opus_demo.exe encoding/decoding CLI tool available right in the main libopus repo, so I don't have to go through all the other repos at each step. Look what I have found! I said there's some filter stuff going on, didn't I? (Need to complete the rest of the steps to confirm, of course)
Code:
Bisecting: 2 revisions left to test after this (roughly 2 steps)
[0869829f343f85935fac22462d228a065d0ba320] Adds a 3 Hz high-pass filter and boost allocation on leakage
Last edited by Z2697; 14th September 2025 at 19:46. |
|
|
|
|
|
#48 | Link |
|
Registered User
Join Date: Aug 2024
Location: Between my two ears
Posts: 959
|
0869829f343f85935fac22462d228a065d0ba320 is the first bad commit
Yeah... I'm not surprised.
Code:
git bisect start
# status: waiting for both good and bad commits
# bad: [bf52802a47dafd6408aa972dd593599d0d431847] abyss thresh
# (this is the local commit where I sent the mode threshold to the abyss so that it will always use CELT only mode... there's probably a better way to do that)
git bisect bad bf52802a47dafd6408aa972dd593599d0d431847
# status: waiting for good commit(s), bad commit known
# good: [63590897db35326cd1ce7784806f9b89a98631ea] Initial commit with the autotools stuff and files taken from Speex and Vorbis.
git bisect good 63590897db35326cd1ce7784806f9b89a98631ea
# bad: [a6d663c6ae089b0a9682c848f3ff7701b316526f] Disables tf_analysis() for hybrid mode
git bisect bad a6d663c6ae089b0a9682c848f3ff7701b316526f
# good: [5c80391b3529bcb3fc8afd3b83d7562bae4991ae] Comments, low bit-rate busting avoidance
git bisect good 5c80391b3529bcb3fc8afd3b83d7562bae4991ae
# skip: [07f884042eccd913c1e96c1503cc15dfe6af9d2b] Wrapping all allocation within opus_alloc() and opus_free()
git bisect skip 07f884042eccd913c1e96c1503cc15dfe6af9d2b
# good: [fc8b605c1712a9b64b587835dde75920e0df71c6] Eliminate signed overflow in constant, minor makefile.draft updates.
git bisect good fc8b605c1712a9b64b587835dde75920e0df71c6
# skip: [dac1b4fc92b0581874af28f84f013b8618ec315c] Make vararray and restrict checks fail GCC 2.95.3's broken implementation.
git bisect skip dac1b4fc92b0581874af28f84f013b8618ec315c
# bad: [0918365b5141780969e2a09608a11d91228b502e] Fixes a VBR bug with 2.5 ms frames
git bisect bad 0918365b5141780969e2a09608a11d91228b502e
# skip: [5609cec9a5e1ea8fcb056f2306a115cb3b61c4c9] Fixes two minor issues found in random testing at ridiculously low rate.
git bisect skip 5609cec9a5e1ea8fcb056f2306a115cb3b61c4c9
# skip: [217cdae98e44699648a215ebb527c4c3f4f171ac] Make it possible for run_vectors.sh to fail on the mono tests.
git bisect skip 217cdae98e44699648a215ebb527c4c3f4f171ac
# good: [6619a736376221f2782cecff55d051c3ecfc2ff7] Move nbits_total initialize before renormalization.
git bisect good 6619a736376221f2782cecff55d051c3ecfc2ff7
# good: [7143b2d0ff61690b698cc8a8b0e61852ba74d984] Merge branch 'tmp_draft'
git bisect good 7143b2d0ff61690b698cc8a8b0e61852ba74d984
# good: [9881484dbde25707b93d988cff6316d2f375727a] test_opus_api: Fix valgrind expectations broken by last commit.
git bisect good 9881484dbde25707b93d988cff6316d2f375727a
# skip: [747c817d96482883527c592b48367975c6f9a1a2] Adds MFCC standard deviation features
git bisect skip 747c817d96482883527c592b48367975c6f9a1a2
# good: [9cf62baafc02e53ffb86e498ac3dc0e6b1f8e03e] Implements a better transient metric for VBR
git bisect good 9cf62baafc02e53ffb86e498ac3dc0e6b1f8e03e
# good: [70d90d115d01cb47426295efca7a5a3ffcafd130] VBR tuning
git bisect good 70d90d115d01cb47426295efca7a5a3ffcafd130
# bad: [ac2e623d251bc336ca1d401b95ee47e6ffef0c51] Converting most of the new code to fixed-point (not complete yet)
git bisect bad ac2e623d251bc336ca1d401b95ee47e6ffef0c51
# bad: [0869829f343f85935fac22462d228a065d0ba320] Adds a 3 Hz high-pass filter and boost allocation on leakage
git bisect bad 0869829f343f85935fac22462d228a065d0ba320
# good: [2a9fdbc93dd048bb8fe5d992c1d4502fe3510666] Transient/VBR tuning, give more bits to frames where pitch changes
git bisect good 2a9fdbc93dd048bb8fe5d992c1d4502fe3510666
# good: [96d7a079425d39712f0214a868beb415c46b8061] Dynalloc based on a bands that stand out of the "noise floor"
git bisect good 96d7a079425d39712f0214a868beb415c46b8061
# first bad commit: [0869829f343f85935fac22462d228a065d0ba320] Adds a 3 Hz high-pass filter and boost allocation on leakage
Last edited by Z2697; 14th September 2025 at 20:59. |
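For anyone who hasn't used it, the idea behind the bisect log can be sketched as a plain binary search. This is a simplification under stated assumptions: a linear history, no skipped commits, and a hypothetical `is_bad` callback standing in for "build, encode, listen". Real git bisect works on the commit graph and handles skips, which this does not.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the culprit is at mid or earlier
        else:
            lo = mid  # the culprit is after mid
    return commits[hi]


if __name__ == "__main__":
    history = list(range(1000))  # stand-in for commit hashes
    print(first_bad(history, lambda c: c >= 637))
```

Each test halves the remaining range, which is why a long history only needs a couple of dozen builds.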
|
|
|
|
|
#49 | Link |
|
Registered User
Join Date: Aug 2024
Location: Between my two ears
Posts: 959
|
I changed the HP filter cutoff from 3 to 0, trying not to break anything... (this is done on the current main branch)
The problem is mostly gone.
https://workupload.com/archive/b2dMuedNrn
(Somehow I still think the old CELT is better. Is this just something in my head?)
Code:
diff --git a/src/opus_encoder.c b/src/opus_encoder.c
index 276dc58d..e87a5379 100644
--- a/src/opus_encoder.c
+++ b/src/opus_encoder.c
@@ -1897,7 +1897,7 @@ static opus_int32 opus_encode_frame_native(OpusEncoder *st, const opus_res *pcm,
}
#endif
} else {
- dc_reject(pcm, 3, &pcm_buf[total_buffer*st->channels], st->hp_mem, frame_size, st->channels, st->Fs);
+ dc_reject(pcm, 0, &pcm_buf[total_buffer*st->channels], st->hp_mem, frame_size, st->channels, st->Fs);
}
#ifndef FIXED_POINT
if (float_api)
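To show what a cutoff argument of 3 versus 0 means for a filter of this kind, here is a generic one-pole DC blocker in Python. It is a textbook filter used for illustration, not the libopus dc_reject() code, and whether the real function degenerates to a pass-through at cutoff 0 in exactly this way is an assumption.

```python
import math


def dc_block(samples, cutoff_hz, sample_rate):
    """One-pole DC blocker: subtract a slowly tracking estimate of the mean.

    Generic textbook filter for illustration, NOT the libopus dc_reject() code.
    """
    coef = 2.0 * math.pi * cutoff_hz / sample_rate  # tracking speed; 0 disables tracking
    mean = 0.0
    out = []
    for x in samples:
        out.append(x - mean)       # remove the current DC estimate
        mean += coef * (x - mean)  # move the estimate towards the input
    return out


if __name__ == "__main__":
    fs = 48000
    offset = [0.1] * fs                  # one second of pure DC offset
    print(dc_block(offset, 3, fs)[-1])   # decays towards zero
    print(dc_block(offset, 0, fs)[-1])   # input passes through unchanged
```

In this sketch a cutoff of 0 leaves the mean estimate stuck at zero, so the input passes through untouched, while a cutoff of 3 Hz slowly pulls a constant offset down to nothing.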
|
|
|
|
|
|
#50 | Link |
|
Registered User
Join Date: Apr 2006
Posts: 185
|
The highpass filter also impacts gapless playback when DC or rumble is present. I think it should cause bass content to be delayed, maybe enough to be noticeable after 100 encoding passes. The encoder doesn't expose any switch for this kind of quality tweak; only people who can edit the source code can toggle it.
|
|
|
|
|
|
#51 | Link | ||
|
Registered User
Join Date: Aug 2015
Posts: 354
|
Quote:
BTW, here's an explanation of this filter: https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml Quote:
|
||
|
|
|
|
|
#52 | Link | ||
|
Donor
Join Date: Jun 2024
Location: South Africa
Posts: 680
|
Quote:
Quote:
|
||
|
|
|
|
|
#53 | Link |
|
Registered User
Join Date: Aug 2024
Location: Between my two ears
Posts: 959
|
I think the filter is meant to remove the DC shift, but, uhh, audio that has a DC shift is problematic to begin with.
Should it be dealt with in a hard-coded way? I don't know. Filtering everything just for a few problematic samples? Yes, 3 Hz is inaudible, and repeated encoding is not a general use case, but still... I guess we "tolerate" the in-loop deblocking filter for a reason. I'm not sure about the masking, or the effect on general sound quality, but here's an example of how it affects energy analysis, and thus rate control, I think. (A 1 Hz cutoff is enough to remove the DC shift in this case, and the generational loss is also much lower.)
Code:
>ffmpeg -v 0 -i Roundabout.flac -f wav -vn -sn -ar 48k -af dcshift=0.1 - | .\org\opusenc - org.opus
Skipping chunk of type "LIST", length 124
Encoding using libopus 1.5.2-213-gb5dc74f (audio)
-----------------------------------------------------
Input: WAV, 48 kHz, 2 channels, stereo
Output: Opus, 2 channels (2 coupled), stereo
20ms packets, 96 kbit/s VBR
Preskip: 312
Encoding complete
-----------------------------------------------------
Encoded: 8 minutes and 35.22 seconds
Runtime: 2 seconds
(257.6x realtime)
Wrote: 6735550 bytes, 25761 packets, 518 pages
Bitrate: 103.805 kbit/s (without overhead)
Instant rates: 1.2 to 196.4 kbit/s
(3 to 491 bytes per packet)
Overhead: 0.746% (container+metadata)
>ffmpeg -v 0 -i Roundabout.flac -f wav -vn -sn -ar 48k -af dcshift=0.1 - | .\3to0\opusenc - 3to0.opus
Skipping chunk of type "LIST", length 124
Encoding using libopus 1.5.2-214-gca9ec9b (audio)
-----------------------------------------------------
Input: WAV, 48 kHz, 2 channels, stereo
Output: Opus, 2 channels (2 coupled), stereo
20ms packets, 96 kbit/s VBR
Preskip: 312
Encoding complete
-----------------------------------------------------
Encoded: 8 minutes and 35.22 seconds
Runtime: 2 seconds
(257.6x realtime)
Wrote: 6909963 bytes, 25761 packets, 518 pages
Bitrate: 106.507 kbit/s (without overhead)
Instant rates: 65.6 to 196.8 kbit/s
(164 to 492 bytes per packet)
Overhead: 0.733% (container+metadata)
>ffmpeg -v 0 -i Roundabout.flac -f wav -vn -sn -ar 48k -af dcshift=0.1 - | .\3to1\opusenc - 3to1.opus
Skipping chunk of type "LIST", length 124
Encoding using libopus 1.5.2-214-g8a35163 (audio)
-----------------------------------------------------
Input: WAV, 48 kHz, 2 channels, stereo
Output: Opus, 2 channels (2 coupled), stereo
20ms packets, 96 kbit/s VBR
Preskip: 312
Encoding complete
-----------------------------------------------------
Encoded: 8 minutes and 35.22 seconds
Runtime: 2 seconds
(257.6x realtime)
Wrote: 6748345 bytes, 25761 packets, 518 pages
Bitrate: 104.003 kbit/s (without overhead)
Instant rates: 1.2 to 196.8 kbit/s
(3 to 492 bytes per packet)
Overhead: 0.745% (container+metadata)
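The figures in these three logs can be cross-checked with a little arithmetic. The sketch below assumes every packet is 20 ms long (as the logs state) and that the "Overhead" percentage is taken from the total bytes written; the function names are made up for this example.

```python
PACKET_SECONDS = 0.020  # "20ms packets" in the logs


def instant_rate_kbps(bytes_per_packet):
    """Bitrate of a single packet of the given size."""
    return bytes_per_packet * 8 / PACKET_SECONDS / 1000.0


def average_rate_kbps(total_bytes, packets, overhead_percent):
    """Average payload bitrate, assuming overhead is a share of total_bytes."""
    duration = packets * PACKET_SECONDS
    payload_bytes = total_bytes * (1.0 - overhead_percent / 100.0)
    return payload_bytes * 8 / duration / 1000.0


if __name__ == "__main__":
    print(instant_rate_kbps(3), instant_rate_kbps(164))
    print(average_rate_kbps(6735550, 25761, 0.746))  # original build (3 Hz)
    print(average_rate_kbps(6909963, 25761, 0.733))  # cutoff changed to 0
    print(average_rate_kbps(6748345, 25761, 0.745))  # cutoff changed to 1
```

With the 3 Hz and 1 Hz cutoffs the smallest packet is 3 bytes (1.2 kbit/s); with the cutoff at 0 it never drops below 164 bytes (65.6 kbit/s), and the average rises from about 103.8 to 106.5 kbit/s.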
We can restore it to the code before that filter was introduced.
Code:
diff --git a/src/opus_encoder.c b/src/opus_encoder.c
index 276dc58d..e8b9f5f5 100644
--- a/src/opus_encoder.c
+++ b/src/opus_encoder.c
@@ -1897,7 +1897,8 @@ static opus_int32 opus_encode_frame_native(OpusEncoder *st, const opus_res *pcm,
}
#endif
} else {
- dc_reject(pcm, 3, &pcm_buf[total_buffer*st->channels], st->hp_mem, frame_size, st->channels, st->Fs);
+ for (i=0;i<frame_size*st->channels;i++)
+ pcm_buf[total_buffer*st->channels + i] = pcm[i];
}
#ifndef FIXED_POINT
if (float_api)
Last edited by Z2697; 15th September 2025 at 14:51. |
|
|
|
|
|
#54 | Link |
|
Donor
Join Date: Jun 2024
Location: South Africa
Posts: 680
|
I hear you. The less filtering, the better, especially if it's of no benefit for most samples.
So, the bitrate is going up slightly. I'd like to try this at one-iteration low-bitrate and see if there's any effect. |
|
|
|
|
|
#55 | Link |
|
Registered User
Join Date: Apr 2006
Posts: 185
|
Significant DC is uncommon. You have to zoom all the way in to see it when it exists. Rumble is only found on vinyl and maybe some wind or malfunctioning equipment (I've seen it on a radio recording with one source). AC-3 lets you toggle it. Other codecs just encode it without problems. Highpass requires 1/3rd of a second (for 3 Hz) to catch up at the start, causing a click with the preceding track. An audio interface with its "coupling" has a highpass filter already.
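To put a rough number on the "catching up" time: for a first-order high-pass, a DC step decays exponentially with time constant 1/(2*pi*fc). The sketch below assumes a first-order filter and an arbitrary 1% residual as the definition of "settled"; the actual filter in any given encoder may differ.

```python
import math


def settle_time_seconds(cutoff_hz, residual=0.01):
    """Time for a one-pole high-pass to shrink a DC step to `residual` of its size."""
    tau = 1.0 / (2.0 * math.pi * cutoff_hz)  # time constant of a first-order filter
    return -tau * math.log(residual)


if __name__ == "__main__":
    for fc in (1, 3):
        print(fc, "Hz:", round(settle_time_seconds(fc), 3), "s")
```

For 3 Hz this lands at roughly a quarter of a second, the same ballpark as the 1/3 s figure above, and a 1 Hz cutoff takes three times as long.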
Last edited by j7n; 15th September 2025 at 13:57. |
|
|
|
|
|
#56 | Link | |
|
Donor
Join Date: Jun 2024
Location: South Africa
Posts: 680
|
Quote:
|
|
|
|
|
|
|
#57 | Link |
|
Registered User
Join Date: Aug 2024
Location: Between my two ears
Posts: 959
|
I think it's more about the encoder's internal state / decision making than anything else. If you remove the filter (or use old CELT, or FFmpeg's native Opus encoder), you can see an increase in low-frequency intensity when there's a DC offset. This is likely due to some fundamental design aspect of CELT; other codecs are probably not going to have this issue, so they don't "have to" hard-code a filter.
(libopus doesn't have to either! It's completely out-of-loop and DC offset is uncommon to begin with.)
I've opened an issue on libopus' GitHub repo, by the way: https://github.com/xiph/opus/issues/434
Last edited by Z2697; 17th September 2025 at 14:18. |
|
|
|
|
|
#58 | Link |
|
Donor
Join Date: Jun 2024
Location: South Africa
Posts: 680
|
Good that you reported our findings.
I'm trying to understand the difference between CELT and other transform codecs like MP3 and AAC. As far as I can tell as a layman, the main differences are the size of the blocks, the filter bank, and how the frequency coefficients are quantised, along with delta encoding and pyramid quantisation. CELT discards all bands below 8 Hz, so I wonder how the DC rejection is having an effect; or perhaps, since it comes before the MDCT, it makes a difference whether one filters or not. |
|
|
|
|
|
#60 | Link | ||
|
Donor
Join Date: Jun 2024
Location: South Africa
Posts: 680
|
Quote:
Quote:
Last edited by GeoffreyA; 18th September 2025 at 13:09. |
||
|
|
|