13th July 2024, 00:15   #1
bencuri
Registered User
 
Join Date: Apr 2024
Posts: 4
Video converted with x264.exe lagging on Youtube

I have noticed recently that the videos I converted with x264.exe tend to lag when played back on Youtube. The lagging happens when you move the mouse on the screen, whether it is a momentary move or a continuous one. The lagging is mostly in the video, but sometimes the sound stops for a moment as well. This is what I captured in this video:
https://drive.google.com/file/d/1YbY...ew?usp=sharing

The videos in question are 1080 50p, and this option is selected in Youtube as well. When I select 360p on Youtube, the video no longer lags.

You could conclude: Okay this is a laptop performance error, but I checked 1080 50p videos uploaded by others, and in those cases I did not experience this lagging. I captured it here:
https://drive.google.com/file/d/1ML7...ew?usp=sharing

So the error has to be in my conversion process. What am I doing wrong to cause this? Could it be caused by the fact that the profile of my videos is High L4.2 and not Main or Baseline?

Here is the full video that you can see in the attachments:
https://drive.google.com/file/d/1go7...ew?usp=sharing

I have used this mod to convert recently:
https://github.com/jpsdr/x264/releases

However, I checked my older videos as well, and the lagging happens with those, too. I created those with the x264.exe versions available on the VLC Player website. Those are mkv files of course. In their case the lagging is not that heavy, though: it occurs here and there, but is less noticeable.

By the way, I noticed certain errors while playing back these videos on my laptop as well. With the mkv's there is no problem. But the mp4's that I converted with the mod behave strangely when played back on the laptop, too. Depending on what software you use to mux them, they may or may not lag. For example, if you mux with Yamb, they lag heavily when you click to various positions on the timeline during playback, and the same happens when you mux with MeGUI. The videos seem to have no lagging, or just minimal lagging, when using this software to mux:
https://www.videohelp.com/software/MP4-tool

So based on this I am not sure whether the error comes from x264.exe or the muxing process. The outcome is, however, that no matter what software I use to mux, the lagging is there in every case when I watch these videos on Youtube.
15th July 2024, 17:53   #2
FranceBB
Broadcast Encoder
 
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 3,019
Hi Bencuri,
the issue isn't with x264, but rather with whatever you're using to feed it.
Strictly speaking, x264 is just an encoder: it encodes whatever you pass to it, which is generally an uncompressed A/V stream generated by frameservers like Avisynth/VapourSynth or other tools like FFmpeg.
In Avisynth, I've indexed your file like this:

Quote:
video=LWLibavVideoSource("Obligatoire_Low.mp4")
audio=LWLibavAudioSource("Obligatoire_Low.mp4")
AudioDub(video, audio)
and then I started going frame by frame to see whether there were any dups or not and sure enough there were.

If we take a look at frame 1266, for instance, at 00:00:25.32, it's the same as frame 1267 at 00:00:25.34.

Below you're gonna see the following frames (scaled to 1024x576):

1264 -> ok
1265 -> dup
1266 -> ok
1267 -> dup
1268 -> ok
1269 -> dup

In other words, your video has 1 true frame and 1 duplicated frame, so it's not 50p, but rather 25p.
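
As a rough illustration of that check (this is not what Avisynth or TDecimate does internally), here's a small Python sketch that flags a frame as a duplicate when it matches the previous one and then estimates the real frame rate. The frames are made-up 4-pixel lists and the function names `find_duplicates` and `effective_fps` are hypothetical.

```python
def find_duplicates(frames, tolerance=0):
    """Return one flag per frame: True if it matches the previous frame."""
    flags = [False]  # the first frame has nothing before it
    for prev, cur in zip(frames, frames[1:]):
        # Mean absolute difference between the two frames
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        flags.append(diff <= tolerance)
    return flags


def effective_fps(container_fps, flags):
    """Scale the container frame rate by the share of unique frames."""
    unique = flags.count(False)
    return container_fps * unique / len(flags)


# Toy clip following the ok/dup/ok/dup pattern listed above
frames = [[10, 10, 10, 10], [10, 10, 10, 10],
          [20, 20, 20, 20], [20, 20, 20, 20],
          [30, 30, 30, 30], [30, 30, 30, 30]]
flags = find_duplicates(frames)
print(flags)
print(effective_fps(50, flags))
```

On a real encode the duplicates are rarely bit-identical after lossy compression, so a small non-zero `tolerance` would likely be needed.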
To fix that, the first thing we need to do is remove the duplicated frames by using TDecimate(), in other words:

Quote:
video=LWLibavVideoSource("Obligatoire_Low.mp4")
audio=LWLibavAudioSource("Obligatoire_Low.mp4")
AudioDub(video, audio)

TDecimate(mode=2, rate=25)
I have then encoded the result of that with:

Quote:
x264.exe "Obligatoire_Low.mp4.avs" --crf 22 --preset medium --ref 4 --level 4.1 --profile High --vbv-maxrate 25000 --vbv-bufsize 25000 --deblock -1:-1 --overscan show --colormatrix bt709 --range tv --log-level info --thread-input --opencl --transfer bt709 --colorprim bt709 --videoformat component --nal-hrd vbr --output "raw_video.h264"

ffmpeg.exe -hide_banner -i "Obligatoire_Low.mp4.avs" -vn -sn -c:a aac -b:a 550k -ar 48000 -y "audio.aac"

mp4box.exe -add "raw_video.h264" -add "audio.aac" "Obligatoire_Low_25p.mp4"

pause
You can find the result here: https://we.tl/t-yfE0RtBvvd
(link available for 7 days)

Now, if for whatever reason you actually need a 50fps output that isn't simply 25p duplicated to 50p, then you have two ways:

1) Blending
2) Linear Interpolation

The first technique consists of taking the image immediately before and the one immediately after and overlaying them one on top of the other to create the missing frame. The idea is that this overlay, although horrible when you step frame by frame, creates the illusion of movement during playback.
To do that, we can use the following script:

Quote:
video=LWLibavVideoSource("Obligatoire_Low.mp4")
audio=LWLibavAudioSource("Obligatoire_Low.mp4")
AudioDub(video, audio)

TDecimate(mode=2, rate=25)

ConvertFPS(50)

Of course this time round you're gonna have blended frames instead of duplicated ones, so, if we go back to the same group of frames as before we're gonna have:

1264 -> ok
1265 -> blended
1266 -> ok
1267 -> blended
1268 -> ok
1269 -> blended

The resulting file is called Obligatoire_Low_Blending_50p.mp4 and you can find it here: https://we.tl/t-U5Zu7CiQIa
(link available for 7 days)
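
The blending idea described above can be sketched in a few lines of Python. This is only a toy model with made-up 2-pixel frames and hypothetical function names; it uses a plain 50/50 mix and doesn't try to reproduce the exact weighting ConvertFPS applies.

```python
def blend(frame_a, frame_b, weight=0.5):
    """Mix two frames pixel by pixel; weight is the share of frame_b."""
    return [round(a * (1 - weight) + b * weight)
            for a, b in zip(frame_a, frame_b)]


def double_rate_by_blending(frames):
    """Insert a 50/50 blend between every pair of neighbouring frames."""
    out = []
    for cur, nxt in zip(frames, frames[1:]):
        out.append(cur)              # original frame ("ok")
        out.append(blend(cur, nxt))  # in-between frame ("blended")
    out.append(frames[-1])           # last frame has no successor to blend with
    return out


frames_25p = [[0, 100], [50, 200], [100, 100]]
frames_50p = double_rate_by_blending(frames_25p)
print(frames_50p)
```

Every second output frame is a mix of its two neighbours, which is why it looks ghosted when stepping frame by frame.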

The second technique consists of using linear interpolation, so basically we're dividing the frame into blocks of 16x16 pixels, assigning a value to each one of them and then performing the same calculation on the following frames. This way, when the values "match", we know where things are moving. For instance, if a person is moving their hand upwards, then in theory we're gonna have a match on the blocks comprising the hand in the next frame, so we know that it moved from point A to point B and we're able to linearly fill the gap.
Although this feels amazing on paper as it uses backwards and forward motion vectors to recreate the missing frames, it can also lead to artifacts whenever it doesn't find the references (like when there's grain/noise or because of other reasons), so be very careful with linear interpolation.
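
To make the block-matching idea a bit more concrete, here's a minimal Python sketch. It assumes a brute-force search with a sum of absolute differences (SAD) as the match score, which is a much simpler scheme than what MVTools actually does (hierarchical search, sub-pixel precision and so on). The tiny frames and the function names are made up.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
               for j in range(size) for i in range(size))


def find_motion(prev, nxt, x, y, size, radius):
    """Search nxt around (x, y) for the block that best matches prev's block."""
    height, width = len(nxt), len(nxt[0])
    best = (0, 0)
    best_cost = sad(prev, nxt, x, y, x, y, size)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            nx, ny = x + dx, y + dy
            # Skip candidate blocks that would fall outside the frame
            if 0 <= nx <= width - size and 0 <= ny <= height - size:
                cost = sad(prev, nxt, x, y, nx, ny, size)
                if cost < best_cost:
                    best, best_cost = (dx, dy), cost
    return best, best_cost


# A bright 2x2 "hand" moves two pixels to the right between frames
prev = [[0, 0, 0, 0, 0, 0],
        [9, 9, 0, 0, 0, 0],
        [9, 9, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0]]
nxt = [[0, 0, 0, 0, 0, 0],
       [0, 0, 9, 9, 0, 0],
       [0, 0, 9, 9, 0, 0],
       [0, 0, 0, 0, 0, 0]]

vector, cost = find_motion(prev, nxt, x=0, y=1, size=2, radius=2)
# Halfway along the vector is where the block goes in the in-between frame
halfway = (0 + vector[0] / 2, 1 + vector[1] / 2)
print(vector, cost, halfway)
```

When noise or grain stops any candidate from matching well, the lowest SAD can point at the wrong block, which is one way the artifacts mentioned above can come about.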

To do this we can use the following script:

Quote:
video=LWLibavVideoSource("Obligatoire_Low.mp4")
audio=LWLibavAudioSource("Obligatoire_Low.mp4")
AudioDub(video, audio)

TDecimate(mode=2, rate=25)

super=MSuper(pel=1, hpad=0, vpad=0)
backward_1=MAnalyse(super, chroma=false, isb=true, blksize=16, blksizev=16, searchparam=3, plevel=0, search=3, badrange=(-24))
forward_1 =MAnalyse(super, chroma=false, isb=false, blksize=16, blksizev=16, searchparam=3, plevel=0, search=3, badrange=(-24))
backward_2 = MRecalculate(super, chroma=false, backward_1, blksize=8, blksizev=8, searchparam=0, search=3)
forward_2 = MRecalculate(super, chroma=false, forward_1, blksize=8, blksizev=8, searchparam=0, search=3)
MBlockFps(super, backward_2, forward_2, num=50000, den=1000, mode=0)

Going back to the same group of frames, we're gonna have

1264 -> ok
1265 -> interpolated
1266 -> ok
1267 -> interpolated
1268 -> ok
1269 -> interpolated

The resulting file is called Obligatoire_Low_LinearInterpolation_50p.mp4 and you can find it here: https://we.tl/t-izZXiy2IsB
(link available for 7 days)

Last edited by FranceBB; 15th July 2024 at 17:59.
17th July 2024, 01:42   #3
bencuri
Registered User
 
Join Date: Apr 2024
Posts: 4
Wow. Thanks for the detailed insight! Lots of useful information.

The video is 50p because the original was 50i, I deinterlaced it with QTGMC, so it became 50p. This is the original:
https://drive.google.com/file/d/1qUP...ew?usp=sharing

The lagging I referred to however is not the lower motion quality caused by duplicate frames, but momentary slowdowns and skipping when moving the mouse on the screen during playback from Youtube. But I have no problem when I let the video play and don't move the mouse, then it plays fine.

Meanwhile I investigated that the video that does not do that momentary lagging is streamed in VP9 format by Youtube. My videos are streamed in AVC. So maybe if I exported my videos in VP9 I could eliminate this problem.

I found that ffmpeg can do it, and I was recommended this command line:

ffmpeg -i source.mp4 -g 240 -quality good -crf 18 -c:v libvpx-vp9 -c:a libopus -speed 4 -y output.webm

However, I am not sure how many threads I should use for the conversion. I read that ffmpeg does not always respect thread settings. The processor I would use is:

Amd Ryzen 5 3600xt

6 cores 12 threads
17th July 2024, 12:41   #4
FranceBB
Broadcast Encoder
 
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 3,019
Quote:
Originally Posted by bencuri View Post
The video is 50p because the original was 50i, I deinterlaced it with QTGMC, so it became 50p. This is the original:
https://drive.google.com/file/d/1qUP...ew?usp=sharing
Well, I've downloaded the original sample and mediainfo shows the following:

Quote:
General
Complete name : A:\MEDIA\temp\Obligatoire.mpg
Format : MPEG-PS
File size : 332 MiB
Duration : 8 min 24 s
Overall bit rate mode : Variable
Overall bit rate : 5 518 kb/s
Frame rate : 25.000 FPS

Video
ID : 224 (0xE0)
Format : MPEG Video
Format version : Version 2
Format profile : Main@Main
Format settings : CustomMatrix / BVOP
Format settings, BVOP : Yes
Format settings, Matrix : Custom
Format settings, GOP : M=3, N=15
Format settings, picture structure : Frame
Duration : 8 min 24 s
Bit rate mode : Variable
Bit rate : 5 024 kb/s
Maximum bit rate : 7 000 kb/s
Width : 720 pixels
Height : 576 pixels
Display aspect ratio : 4:3
Frame rate : 25.000 FPS
Standard : PAL
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Interlaced
Scan order : Bottom Field First
Compression mode : Lossy
Bits/(Pixel*Frame) : 0.485
Time code of first frame : 00:00:00:00
Time code source : Group of pictures header
GOP, Open/Closed : Closed
Stream size : 302 MiB (91%)
Writing library : (dvd5: Oct 27 2015)
Color primaries : BT.601 PAL
Transfer characteristics : BT.470 System B/G
Matrix coefficients : BT.601

Audio
ID : 189 (0xBD)-128 (0x80)
Format : AC-3
Format/Info : Audio Coding 3
Commercial name : Dolby Digital
Muxing mode : DVD-Video
Duration : 8 min 24 s
Bit rate mode : Constant
Bit rate : 384 kb/s
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 48.0 kHz
Frame rate : 31.250 FPS (1536 SPF)
Compression mode : Lossy
Stream size : 23.1 MiB (7%)
Service kind : Complete Main
Dialog Normalization : -27 dB
compr : -0.28 dB
dialnorm_Average : -27 dB
dialnorm_Minimum : -27 dB
dialnorm_Maximum : -27 dB

Once I indexed I noticed that - although being divided in fields - it's actually 25fps 'cause the even and odd fields are exactly the same, but slightly shifted vertically.

This is immediately noticeable once you try to bob it:

Quote:
video=LWLibavVideoSource("Obligatoire.mpg")
audio=LWLibavAudioSource("Obligatoire.mpg")
AudioDub(video, audio)

AssumeTFF()
Bob()
Looking at frame 1264 at 00:00:25.28 and the one immediately after, we can see that they're the same:




Subtracting them from one another also shows no movement at all as they're literally the same but slightly shifted upwards:

Quote:
video=LWLibavVideoSource("Obligatoire.mpg")
audio=LWLibavAudioSource("Obligatoire.mpg")
AudioDub(video, audio)

AssumeTFF()
my_bobbed=Bob()

my_even=SelectEven(my_bobbed)
my_odd=SelectOdd(my_bobbed)

Subtract(my_even, my_odd)
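
The same field comparison can be mimicked outside Avisynth. The Python sketch below is only a toy model: it splits a tiny made-up frame into its two fields and measures how much they differ, ignoring the one-line vertical offset between fields that Bob() compensates for. The function names are hypothetical.

```python
def split_fields(frame):
    """Split an interlaced frame (list of rows) into its two fields."""
    top = frame[0::2]     # rows 0, 2, 4, ...
    bottom = frame[1::2]  # rows 1, 3, 5, ...
    return top, bottom


def mean_abs_diff(field_a, field_b):
    """Average absolute pixel difference between two equally sized fields."""
    total = sum(abs(a - b)
                for row_a, row_b in zip(field_a, field_b)
                for a, b in zip(row_a, row_b))
    count = sum(len(row) for row in field_a)
    return total / count


# Progressive content stored as fields: both fields show the same moment
static = [[10, 10, 80, 80],
          [10, 10, 80, 80],
          [10, 10, 80, 80],
          [10, 10, 80, 80]]
# True interlaced content: the bright edge has moved by the second field
moving = [[10, 10, 80, 80],
          [10, 80, 80, 80],
          [10, 10, 80, 80],
          [10, 80, 80, 80]]

print(mean_abs_diff(*split_fields(static)))
print(mean_abs_diff(*split_fields(moving)))
```

A difference near zero suggests both fields come from the same moment in time, while a clearly larger value suggests real motion between fields. Running it per frame over a whole clip would show which sections fall into which category.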


For reference, if we look at the differences between frames instead of fields (i.e if we look at the actual movement) we get something much more meaningful:




So, to encode this, a simple TDeint() should suffice.
The content is also flagged as 4:3 but it's actually letterboxed, so we can easily crop and resize to 16:9 (1.77) full frame. It's still gonna be SD but at least we don't get black borders on modern displays.
Unfortunately, the problems don't stop here.
The content has overshooting going above 0.7V (i.e 235 in 8bit) which we can easily see with VideoTek().




The white shirts are way too high, so we need to compensate for that and bring them down to make them fall within the legal range to get a proper Limited TV Range output.
Unfortunately the cutoff in both blacks at the bottom and highlights at the top isn't clear 'cause the MPEG-2 encoder used created plenty of compression artifacts which led to over and under-shooting, so I tried to stay a bit conservative with Levels().



and then I added Limiter() for good measure.
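
Roughly speaking, this is what the Levels() plus Limiter() combination does to each luma value. The Python sketch below is a simplified take on the commonly documented Levels formula, not Avisynth's exact implementation (rounding details and chroma handling are ignored), and the function names are made up. The numbers are the ones used in the final script further down in this post.

```python
def levels(value, in_low, gamma, in_high, out_low, out_high):
    """Simplified Levels-style remap of one 8-bit luma value."""
    scaled = (value - in_low) / (in_high - in_low)
    scaled = min(max(scaled, 0.0), 1.0)  # keep the ratio inside 0..1
    return round(scaled ** (1 / gamma) * (out_high - out_low) + out_low)


def limiter(value, low=16, high=235):
    """Clamp a luma value to the limited TV range."""
    return min(max(value, low), high)


for luma in (0, 16, 235, 242, 255):
    remapped = levels(luma, 0, 1, 255, 1, 235)
    print(luma, remapped, limiter(remapped))
```

The remap pulls the highlights down so that the brightest input lands on 235, and the clamp then catches whatever still falls outside 16-235.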
The second thing is that the audio is all over the place and in some regions it even goes to 0 dB, thus clipping.
I know that in AC3 you can specify dialnorm/DRC metadata and have the decoder output the audio at that level for you, but having it set to -27 (not a standard that I'm aware of) and relying on the decoder to do the hard work while leaving the stream inside totally uncapped is definitely not a best practice.



In other words, we need to do something to tackle this and although there are many ways to do that, the very least we can do inside Avisynth is use Normalize().
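
For a feel of what peak normalization does to the numbers, here's a small Python sketch. It assumes float samples where 1.0 is full scale and simply scales everything so the loudest sample lands on the target, which is the general idea behind a peak normalizer. The sample values and function names are made up, and 0.22 is the value used in the final script below.

```python
import math


def to_dbfs(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to dBFS."""
    return 20 * math.log10(amplitude)


def normalize(samples, target_peak):
    """Scale all samples so the loudest one lands on target_peak."""
    peak = max(abs(s) for s in samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


samples = [0.1, -0.5, 1.0, -1.0, 0.25]  # made-up float samples hitting full scale
quieter = normalize(samples, 0.22)
print(max(abs(s) for s in quieter))      # new peak
print(round(to_dbfs(0.22), 2))           # the same peak expressed in dBFS
```

Note that this only moves the peak. Samples that were already flattened in the source stay flattened, just at a lower level.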


Anyway, the final script would look like this:

Quote:
#Indexing
video=LWLibavVideoSource("Obligatoire.mpg")
audio=LWLibavAudioSource("Obligatoire.mpg")
AudioDub(video, audio)

#Deinterlacing
AssumeTFF()
TDeint()

#Resizing to 16x9 1.77 FF
Crop(4, 60, -4, -60)
Spline64Resize(848, 480)

#Limited TV Range
Levels(0, 1, 255, 1, 235, coring=false)
Limiter(min_luma=16, max_luma=235, min_chroma=16, max_chroma=240)

#Loudness Correction
ConvertAudioToFloat()
Normalize(0.22)


so we can encode with

Quote:
x264.exe "A:\MEDIA\temp\Obligatoire.avs" --crf 18 --preset medium --profile Main --level 3.1 --deblock -1:-1 --overscan show --colormatrix bt470bg --range tv --log-level info --thread-input --opencl --transfer bt470bg --colorprim bt470bg --videoformat component --nal-hrd vbr --vbv-maxrate 25000 --vbv-bufsize 25000 --output "A:\MEDIA\temp\raw_video.h264"

ffmpeg.exe -hide_banner -i "A:\MEDIA\temp\Obligatoire.avs" -vn -sn -c:a aac -b:a 550k -ar 48000 "A:\MEDIA\temp\audio.aac"

mp4box.exe -add "A:\MEDIA\temp\raw_video.h264" -add "A:\MEDIA\temp\audio.aac" "A:\MEDIA\temp\final_output.mp4"

pause
If you wanna perform loudness correction in FFmpeg instead, you can use the loudnorm audio filter like so: -af loudnorm=I=-24:LRA=12:tp=-2

Final result here: https://we.tl/t-p94F6vEo5a
(link valid for 7 days)


Quote:
Originally Posted by bencuri View Post
The lagging I referred to however is not the lower motion quality caused by duplicate frames, but momentary slowdowns and skipping when moving the mouse on the screen during playback from Youtube. But I have no problem when I let the video play and don't move the mouse, then it plays fine.
That sounds like a hardware acceleration issue on your PC, though, and has nothing to do with x264.

Quote:
Originally Posted by bencuri View Post
Meanwhile I investigated that the video that does not do that momentary lagging is streamed in VP9 format by Youtube. My videos are streamed in AVC. So maybe if I exported my videos in VP9 I could eliminate this problem.
Nope, encoding the file in a different way has pretty much no effect on the way YouTube serves it to its users. This is because YouTube re-encodes every file you upload anyway to create the various renditions. It generally creates the H.264 ones for backwards compatibility with very old devices, the VP9 ones for normal devices and the AV1 ones for a handful of modern devices. Audio is encoded in AAC 128 kbit/s for compatibility reasons and Opus 152 kbit/s for modern devices. How YouTube serves you the video depends on a bunch of factors, including which OS and which browser you're using and of course which resolution you're picking (if you force it from the gear icon in the player). In your case, what's probably happening is that the AVC stream is using hardware decoding and not behaving correctly, while VP9 is probably relying on software decoding and is thus not affected.

Quote:
Originally Posted by bencuri View Post
The processor I would use is:
Amd Ryzen 5 3600xt
6 cores 12 threads
Since you have a fairly new CPU, you should be more than capable of handling AV1 software decoding, so if you have a YouTube account go to settings, playback and performance and select "Always prefer AV1" like I have done here in this screenshot using my account:



Obviously there's no guarantee that YouTube will be serving AV1 only from now on, but at least it will know that you prefer to get AV1.
18th July 2024, 01:46   #5
bencuri
Registered User
 
Join Date: Apr 2024
Posts: 4
Wow, the outcome is very good. Much easier for the eye to watch.

So in case I want this in 50p, I need to use the interpolation script you included earlier?

Can you suggest something to upscale this to Full HD? I used this script. The frame sizing is different because the sizing of this original seems strange and seems to vary scene by scene, so I decided to go with 14/9 as a compromise. I also used Resize8 because I liked the outcome better than with SplineResize:

Code:
nnedi3_rpow2(2, cshift="Spline36Resize", fwidth=1120, fheight=720)
aWarpSharp2(depth=5)
CAS(0.5)
nnedi3_rpow2(2, cshift="Spline36Resize")
CropResize(1680,1080, InDAR=14.0/9.0, Resizer="Resize8")
aWarpSharp2(depth=5)
CAS(0.5)
AddBorders(120,0,120,0)
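
As a quick sanity check of the numbers in that script, the geometry can be worked out in Python. This only covers the arithmetic (14/9 picture area plus side borders) and says nothing about the quality of the resizers. The helper name is hypothetical.

```python
from fractions import Fraction


def width_for_dar(height, dar):
    """Width in square pixels that gives the requested display aspect ratio."""
    return int(height * dar)


dar = Fraction(14, 9)
active_width = width_for_dar(1080, dar)  # picture area after CropResize
borders = 120 + 120                       # left + right AddBorders
print(active_width, active_width + borders)
print(width_for_dar(720, dar))            # intermediate nnedi3_rpow2 size
```

So the 1680 wide picture plus 2 x 120 pixels of border fills a 1920x1080 frame, and the 1120x720 intermediate step has the same 14/9 shape.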
I don't understand how you could tell that the audio is clipping? When I open it in Goldwave the volume seems to be okay, with still plenty of space until max. How is it that it is clipping in spite of that?



Could you improve the whites in this one below as well? This one is even worse in quality, so it would benefit a lot from better white/background balance. I understood what you described previously, but I have no idea how to read some of the diagrams you presented, or how to work out the right values from the things that I can interpret. I am not that much of an expert.

https://drive.google.com/file/d/17bK...ew?usp=sharing
18th July 2024, 15:03   #6
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,424
Obligatoire.mpg is mostly interlaced content, so it should be double-rate deinterlaced to 50p. You don't want to discard the real motion data that you already have, just to interpolate it later with artifacts.

There is the short problem section around 25 seconds, which you can probably fix by splicing in a 25p to 50p interpolation using RIFE or similar. There might be other problem sections, but >95% is 50 fields/s motion, just looking at it quickly.
19th July 2024, 20:13   #7
Emulgator
Big Bit Savings Now !
 
Join Date: Feb 2007
Location: close to the wall
Posts: 1,656
Quote:
I don't understand how you could tell that the audio is clipping? When I open it in Goldwave the volume seems to be okay, with still plenty of space until max. How is it that it is clipping in spite of that?
Zoom in and watch the waveform. Has been clipped @ -4dBFS before encoding.
BTW, Luma boosted out of range, chroma boosted out of range, a very bright party.
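
To illustrate how audio can be clipped while the peak meter still shows headroom, here's a small Python sketch. It builds a synthetic sine wave, hard-limits it at -4 dBFS and then looks for runs of consecutive samples stuck at the peak (the flat tops you see when zooming in). The signal and the function name are made up; this is not an analysis of the actual file.

```python
import math


def longest_peak_run(samples, tolerance=1e-6):
    """Return the peak level and the longest run of samples sitting on it."""
    peak = max(abs(s) for s in samples)
    longest = run = 0
    for s in samples:
        if peak - abs(s) <= tolerance:
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return peak, longest


clip_level = 10 ** (-4 / 20)  # -4 dBFS as a linear amplitude
# A sine wave that is too hot, then hard-limited at the clip level
wave = [1.5 * math.sin(2 * math.pi * n / 40) for n in range(80)]
clipped = [max(-clip_level, min(clip_level, s)) for s in wave]

peak, run = longest_peak_run(clipped)
print(round(20 * math.log10(peak), 1), run)
print(longest_peak_run(wave)[1])  # the unclipped sine only touches its peak briefly
```

The clipped signal never gets anywhere near full scale, yet many samples in a row sit at exactly the same level, which is the signature of clipping that happened before the level was lowered.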
__________________
"To bypass shortcuts and find suffering...is called QUALity" (Die toten Augen von Friedrichshain)
"Data reduction ? Yep, Sir. We're that issue working on. Synce invntoin uf lingöage..."

Last edited by Emulgator; 19th July 2024 at 20:31.
31st July 2024, 09:23   #8
Balling
Registered User
 
Join Date: Feb 2020
Posts: 552
"Unfortunately, the problems don't stop here.
The content has overshooting going above 0.7V (i.e 235 in 8bit) which we can easily see with VideoTek()."


Superwhite is a thing (here we do indeed have 242 white in that scene). Limited range is limited precisely so as to allow > 100% white: 235, 128, 128 is 100% white, and 236, 128, 128 up to 254, 128, 128 are used for values above it. 255 is not allowed, at least in HDMI and friends.
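
As a small worked example of those numbers, the Python sketch below maps 8-bit limited-range luma codes to percent of nominal white and to the 0-700 mV scale, assuming the usual 16 = black and 235 = white anchors. The function names are made up.

```python
def luma_to_percent(code):
    """Map an 8-bit limited-range luma code to percent of nominal white."""
    return (code - 16) / 219 * 100


def luma_to_millivolts(code):
    """Same mapping expressed on the analogue 0-700 mV scale."""
    return (code - 16) / 219 * 700


for code in (16, 235, 242, 254):
    print(code, round(luma_to_percent(code), 1), round(luma_to_millivolts(code)))
```

With this mapping, 242 sits only a few percent above nominal white and 254 a bit under 9 percent above it.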

That is even less surprising in legacy SD content but still happens in HD. Besides this is a strange 2.8 gamma System B/G...

Last edited by Balling; 31st July 2024 at 10:39.