Old 16th December 2020, 11:36   #1  |  Link
nji
Registered User
 
Join Date: Mar 2018
Location: Germany
Posts: 230
Variable Frame Rate (VFR) - basic questions

I recently stumbled upon a movie that was encoded with VFR.
That is quite rare, I found out, so I would like to ask:

1. Why isn't VFR used more often?
I made some tests and found that it is MUCH smaller
for movies with lots of still scenes (as you often find
in the "audio movies" on YouTube).
If I build a movie (H.264) from one identical image (1024 x 736),
the VFR version uses 80kB (independent of the number of frames).
The CFR version uses 1.3kB/frame (which gives large files
once you factor in the fps, as in the typical "audio movies" on YouTube).

2. Conversion CFR to VFR - is there an easy way?
(This question is usually asked the other way round:
if you want to use the timeline in a video editor,
the editor has to handle VFR,
which most of them don't.)
Is there an easy way to turn a CFR movie into a VFR one?
One parameter would be a similarity limit:
frames that differ by less than it would count as a "still scene".
Or do I have to do it all manually
(identify the dupes (ffmpeg), build a timecode file from that,
remove the dupes, update the timecodes, and hope the audio stays in sync)?
Old 16th December 2020, 14:20   #2  |  Link
nji
Registered User
 
Join Date: Mar 2018
Location: Germany
Posts: 230
Finally found these sources:
http://forum.doom9.org/showthread.php?t=149339
http://avisynth.nl/index.php/Categor...rame_Detectors
Old 16th December 2020, 17:57   #3  |  Link
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,435
Quote:
Originally Posted by nji View Post


1. Why isn't vfr used more often?
I made some tests and found out that it is MUCH smaller
for movies with lots of still scenes (as you find often
at audios at youtube).
If I build a movie (H264) with one identical image (1024 x 736)
the vfr version uses 80kB (independent number of frames).
The cfr version uses 1.3kB/frame (which gives large files
if you consider fps, or the typical "audio movies" on youtube).


It's not completely independent - the size also depends on the number of frames and on the max keyframe interval in your encoding settings. The max keyframe interval is the point at which the encoder must place a new keyframe. E.g. if your fps was 24 and you had a 10 second video (240 original frames), you could theoretically represent it with 1 frame if your keyframe interval was 240 or more. But some devices and scenarios limit the max keyframe interval by specification, e.g. a typical "24p" Blu-ray can only have a max keyframe interval of 1 second (so 24 frames). The default x264 setting is 250. You can set it to "infinite".
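The relation between frame count and max keyframe interval can be sketched in a few lines of Python. This is only illustrative arithmetic, not anything taken from x264: `min_keyframes` is a hypothetical helper, and it assumes the encoder places a keyframe only when the interval forces it (no scene-cut keyframes).

```python
import math


def min_keyframes(total_frames: int, keyint: int) -> int:
    """Smallest number of keyframes if one is forced at least every `keyint` frames."""
    if total_frames <= 0 or keyint <= 0:
        raise ValueError("total_frames and keyint must be positive")
    return math.ceil(total_frames / keyint)


# 10 s at 24 fps = 240 frames
print(min_keyframes(240, 240))  # keyint covers the whole clip
print(min_keyframes(240, 24))   # Blu-ray style limit of 1 second
print(min_keyframes(240, 250))  # x264 default mentioned above
```

With a 1-second limit the still image has to be re-sent as a keyframe ten times, which is why the size is not fully independent of the frame count.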


VFR can be OK for some types of end distribution, but:

- Impaired navigation. The ability to seek in a video is determined by keyframe placement. The entire GOP has to be decoded, so seeking can be slow. E.g. if you have a slideshow presentation with long periods of time represented by 1 keyframe, you might have to wait a long time, and sometimes you cannot seek at all (it depends partly on the playback software).

- It won't be "much smaller" for typical content using modern compression. There is some filesize reduction, but a typical movie with many duplicates (e.g. low-fps animation) might only save 5-15%, and a typical Hollywood movie maybe 0.5-1%. The reason is that b-frames do not "cost" a lot to encode. Yes, if you have a slideshow/presentation with long periods of no change, the savings can be higher.

- The min fps of the YT re-encode was 6 FPS (it might have changed). Theoretically you could upload a 1-frame video at 0.000001 FPS (or some small number or functional equivalent), but YT re-encodes it to 6 FPS anyway. It can save you some bandwidth on the upload, and the YT version is still seekable to an extent, but there is no massive saving on the YT end.

- Difficult to edit (so preferably not used for acquisition).

- Potential loss of real data. Small movements might not be detected in the VFR conversion, depending on how it's done. E.g. subtle changes like eye movement might be lost by automatic conversions if they are not QCed. Usually there is a threshold control for duplicates, but lossy compression or added grain in a movie can already cause differences between duplicate frames (i.e. it can be difficult to determine automatically what is supposed to be a duplicate). Often the cutoff is a grey overlapping area and needs human eyes to be accurate.



Quote:
2. Conversion cfr to vfr - is there an easy way?
(This question is often asked the other way round,
as if you want to use the timeline in any way in
a video editor it needs to deal with vfr
which most of them don't.)
Is there an easy way to turn a cfr to a vfr?
A parameter would be a similarity limit,
so if below should be taken as "still scene"?
Or do I have to do it all manually
(Identify the dupes (ffmpeg), build timecode file from that,
remove dupes, update the timecode, hopefully audio stays sync)?


2) I think HandBrake can do it the "easy" way, automatically and without options.

I prefer AviSynth DeDup, because you can adjust thresholds (your "similarity limits"), debug with overlays (you can "see" what you're doing and adjust), and decide which frame to keep (e.g. sometimes the 1st duplicate frame is higher quality, sometimes the last). It has many options, like limiting strings of duplicates, and it generates timecodes for you. The main negative is that strings of duplicates are limited to a maximum of 20. It's meant for typical sources, so for slideshows/presentations with long gaps with no activity it's not as useful.

Manual timecodes are not bad when you have long gaps such as presentations, because there will only be a few entries.
Old 16th December 2020, 21:12   #4  |  Link
nji
Registered User
 
Join Date: Mar 2018
Location: Germany
Posts: 230
Thank you (once again) for your very valuable information!

To give an example - some time ago I downloaded this
https://youtu.be/NOIbJFCa2xk
although in better quality.
Size was 19kB, 75% of it the "video stream".
Although the frames are identical, YT uses 25fps.

For this special case (only 1 frame)
there was no need for VFR, as I can lower the constant
frame rate until the one and only frame spans the complete length.
I did that with VD's filter "remove frames" (from shekh himself).

And I actually ran into the seek problem.
It doesn't play at all if I remove all but one frame;
about 20 frames have to be left.

For the general case the question remains for me
what the workflow looks like when there are a few different frames
at arbitrary positions (not only one as above).

I will use one of AviSynth's dedupe filters then.
They will generate a timecode file, OK.
After removing the dupes from the movie,
will it have an adjusted constant fps
so that it still has the same length as the audio?
And after saving that I use the timecode file to
make it VFR, right?
Old 16th December 2020, 21:21   #5  |  Link
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,435
Quote:
Originally Posted by nji View Post
To give an example - some time ago I downloaded this
https://youtu.be/NOIbJFCa2xk
Although with better quality.
Size was 19kB, 75% of it the "video stream".
Although the frames are identical, YT uses 25fps.

That was probably uploaded at 25fps, so YT keeps the same framerate and number of frames as the uploaded original.

What I mean is that the minimum FPS of the YT re-encode was 6fps. E.g. if you upload a 1 fps video of 10 seconds, that's 10 frames. YouTube will re-encode it to 6fps, 60 frames (with duplicates), 6x the number. That theoretically takes more bits to encode. You can upload an equivalent 0.000001 fps video (1 frame), but YouTube will re-encode it to 6fps (maybe ten thousand more frames). In those situations the savings are potentially on the user upload side, not as much on the YouTube streaming side, because of the min fps. YT probably does this for good reason (seeking / navigation). Maybe people want to skip to the chorus or other parts of a song or slideshow.
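As a rough sketch of that arithmetic in Python: the 6 fps floor is only the figure quoted in this thread (it may have changed), and `frames_after_reencode` is a hypothetical helper, not part of any YouTube API.

```python
# Minimum output frame rate as reported in this thread; an assumption, not a documented value.
ASSUMED_MIN_FPS = 6.0


def frames_after_reencode(duration_s: float, upload_fps: float,
                          min_fps: float = ASSUMED_MIN_FPS) -> int:
    """Frame count after a re-encode that never goes below `min_fps`."""
    output_fps = max(upload_fps, min_fps)
    return round(duration_s * output_fps)


print(frames_after_reencode(10, 1))   # 10 uploaded frames become 60
print(frames_after_reencode(10, 25))  # a 25 fps upload keeps its 250 frames
```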



Quote:

I will use one of avisynth's dedupe filters then.
They will generate a timecode file ok.
After removing the dupes from the movie
it will have the same length but with adjusted constant fps
so it has still the same length as the audio?
And after saving that I use the timecode file to
make it vfr, right?
Yes, same audio duration, equivalent CFR (out of sync). You add the timecode file to make it VFR and it will be in sync.
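To illustrate why the intermediate "equivalent CFR" file is out of sync and the timecodes fix it, here is a small Python sketch. The frame numbers are made up for the example and both helpers are hypothetical; real dedup filters write the timecode file themselves.

```python
def cfr_display_times_ms(kept_count: int, duration_s: float) -> list[int]:
    """Display times if the kept frames are spread evenly over the duration."""
    step = duration_s * 1000 / kept_count
    return [round(i * step) for i in range(kept_count)]


def source_timestamps_ms(kept_indices: list[int], source_fps: float) -> list[int]:
    """Display times taken from each kept frame's position in the CFR source."""
    return [round(i * 1000 / source_fps) for i in kept_indices]


# Hypothetical clip: 25 fps, 10 s, only source frames 0, 100 and 225 survive deduping.
kept = [0, 100, 225]
print(cfr_display_times_ms(len(kept), 10))  # evenly spaced: wrong moments
print(source_timestamps_ms(kept, 25))       # original moments: in sync with the audio
```

Both versions last 10 seconds, so the audio length matches either way; only the second list shows each picture at the moment it appeared in the source.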

Last edited by poisondeathray; 16th December 2020 at 21:25.
Old 17th December 2020, 15:56   #6  |  Link
nji
Registered User
 
Join Date: Mar 2018
Location: Germany
Posts: 230
Quote:
Originally Posted by poisondeathray View Post
...
Manual timecodes is not bad when you have long gaps such as presentations, because there will only be a few entries
But how do I do that?

Say the whole length is 30 min.

I prepare a timecode file manually for, say, 5 frames.
The fps of the CFR "pre-movie" is so small (5/1800)
that I probably cannot set it exactly enough to match the audio length.

But that is necessary to combine the movie with the audio,
and after that to turn it into VFR with the timecode file.
Old 17th December 2020, 22:47   #7  |  Link
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,435
Quote:
Originally Posted by nji View Post
But how to do that?

If the whole length is say 30min.

I prepare a timecode file manually for say 5 frames.
The fps of the cfr "pre movie" are so small (5/1800)
I probably cannot set it exact to match the audio length.

But that is necessary to combine the movie with the audio.
And after that turn to vfr by the timecode file.

5 frames total for 30 min? That's roughly 1 frame per 6 min (not necessarily spaced evenly). I think that is quite large - most players will probably not play it properly (a very low equivalent FPS), so you might need to increase the "granularity" by adding a few duplicates. Also, most players will have difficulty seeking.

Look at the mkvtoolnix (mkvmerge) documentation on the formatting and types of timestamps under the "external timestamp files" section ("timecodes" and "timestamps" are the same thing; the name has changed, they used to be called timecodes). v2 timestamps require one entry per frame, and the entries have to be sorted in ascending order. v4 timestamps do not have to be sorted, but sometimes they don't work correctly with some players.
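For a presentation with only a handful of frames, a v2 file can be written by hand or generated. The Python sketch below is one way to do it, under these assumptions: the header line uses the newer "timestamp" spelling (older tools and guides use "# timecode format v2"), the values are milliseconds with one line per frame, and the slide times are invented for the example. `format_timestamps_v2` is a hypothetical helper, not an mkvtoolnix function.

```python
def format_timestamps_v2(timestamps_ms: list[int]) -> str:
    """Return the text of a v2 timestamp file: a header, then one ms value per frame."""
    # Strictly increasing is a choice made here so that no frame has zero duration.
    if any(a >= b for a, b in zip(timestamps_ms, timestamps_ms[1:])):
        raise ValueError("timestamps must be strictly increasing")
    lines = ["# timestamp format v2"]
    lines += [str(t) for t in timestamps_ms]
    return "\n".join(lines) + "\n"


def mmss_to_ms(minutes: int, seconds: int) -> int:
    """Convert a minutes:seconds position to milliseconds."""
    return (minutes * 60 + seconds) * 1000


# Hypothetical 30 min talk with 5 slides appearing at these times.
slides = [mmss_to_ms(0, 0), mmss_to_ms(4, 0), mmss_to_ms(11, 30),
          mmss_to_ms(19, 0), mmss_to_ms(26, 15)]
print(format_timestamps_v2(slides), end="")
```

The text can then be saved to a file and given to the muxer; whether a player copes with gaps this long is a separate question.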

You add the video and audio, select the video track, and add the timestamps file in the timestamps box. You can also add chapters to navigate by in the player and use chapter names. Those chapter points must be I-frames to be seekable.

Other ways are to use mp4fpsmod for the MP4 container (but it does not support chapters directly), or to add the timecodes with --tcfile-in while encoding with x264.
Old 17th December 2020, 23:22   #8  |  Link
nji
Registered User
 
Join Date: Mar 2018
Location: Germany
Posts: 230
Quote:
Originally Posted by poisondeathray View Post
...
Other ways are to use mp4fpsmod for mp4 container (but does not support chapters directly) , and to add the timecodes in with tcfile-in while encoding with x264
At this point I'm confused.
What is the application I do the muxing with? Surely not VirtualDub?
Do I have to use the codec kind of stand-alone?
Old 17th December 2020, 23:38   #9  |  Link
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,435
Quote:
Originally Posted by nji View Post
At this point I'm confused.
What is the application I do the muxing with? Surely not VirtualDub?
Do I have to use the codec kind of stand-alone?
mkvtoolnix (GUI) or mkvmerge (CLI) to mux.

If you're using x264 with --tcfile-in, that's a command-line-only option.
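For the command-line route, the two invocations might look roughly like the argument lists built below. This is a sketch under assumptions: all file names are placeholders, the video is assumed to be track 0 of its source file, and the option spellings (`--timestamps TID:file` for mkvmerge, formerly `--timecodes`; `--tcfile-in` for x264) should be checked against the versions you have installed.

```python
def mkvmerge_args(video: str, audio: str, timestamps: str, output: str,
                  track_id: int = 0) -> list[str]:
    """Arguments to mux video + audio and apply an external timestamp file to the video track."""
    return ["mkvmerge", "-o", output,
            "--timestamps", f"{track_id}:{timestamps}",  # applies to the next source file
            video, audio]


def x264_args(source: str, timecodes: str, output: str) -> list[str]:
    """Arguments to encode with x264 while reading frame times from a timecode file."""
    return ["x264", "--tcfile-in", timecodes, "-o", output, source]


# Placeholder file names; pass a list to subprocess.run(...) once the tools are installed.
print(" ".join(mkvmerge_args("video.264", "audio.mp3", "timestamps.txt", "out.mkv")))
print(" ".join(x264_args("deduped.avs", "timestamps.txt", "video.264")))
```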

I tried a very low fps like your 5 frames / 30 min example - it doesn't work for most players. At around 1 min per frame it starts to work fairly consistently for common players. 30 seconds is even better. I think the frame duration is just too long for most players past 1 minute. Those files play in sync and with chapter markers; I placed audio tones at specific spots to check.
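The "add a few duplicates" idea can be sketched as follows in Python: repeat each picture often enough that no single frame lasts longer than a chosen limit. The 60 second default simply mirrors the observation above and is not a hard rule; `fill_long_gaps` and the example times are hypothetical.

```python
def fill_long_gaps(timestamps_ms: list[int], end_ms: int,
                   max_gap_ms: int = 60_000) -> list[tuple[int, int]]:
    """Return (timestamp_ms, source_frame_index) pairs with no frame longer than max_gap_ms.

    Each original frame is repeated every max_gap_ms until the next frame (or the end) starts.
    """
    result = []
    for index, start in enumerate(timestamps_ms):
        stop = timestamps_ms[index + 1] if index + 1 < len(timestamps_ms) else end_ms
        t = start
        while t < stop:
            result.append((t, index))
            t += max_gap_ms
    return result


# Two slides in a 5 min clip: the second one appears at 2:30.
for timestamp, source in fill_long_gaps([0, 150_000], end_ms=300_000):
    print(timestamp, "-> repeat of frame", source)
```

The source index tells you which picture to duplicate at each timestamp; the timestamps themselves go into the timecode file.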
Old 18th December 2020, 00:19   #10  |  Link
nji
Registered User
 
Join Date: Mar 2018
Location: Germany
Posts: 230
Thank you very much for your help.

I was focusing on manual timecode editing because AviSynth's DeDup has that 20-frame sequence length limitation.

In conjunction with shekh's Remove Frame filter I can still expand my personal CLI app to handle not only dupe patterns but also dupe sequences, and to produce a timecode file.
I will work on that.
Old 18th December 2020, 01:31   #11  |  Link
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,435
Quote:
Originally Posted by nji View Post
I was focussing on manual timecode edit as AviSynth's DeDup has that 20 frames sequence length limitation.
It would be nice if someone could mod that to something larger.


I'm seeing compatibility issues with some player and container combinations when you use very low equivalent FPS.

I got 1 min frames (30 frames for the 30 min example) to work in MPC-HC and PotPlayer in the MKV container, but it needed to be remuxed to MP4 for VLC and mpv. It's a bit flaky behaviour. Going lower than this gets worse.

Pure duplicates, such as slideshows, don't "cost" much. Just increase --bframes to 16.

A low FPS, e.g. 1 FPS, with duplicates, --bframes 16 and a long GOP will not have those flaky compatibility issues. Your audio stream will often be larger than your video stream.


In my test example:

- 1 min frames, all I-frames so seekable each minute, 30 frames, crf 20: video stream 397kB. Some players have issues with some containers (e.g. MP4 might not work for some, MKV for others).

- 1fps CFR, 1800 frames, keyframes every min (every 60 frames), crf 20, 16 b-frames: video stream 669kB. That is 60x more frames at about 1.7x the size, with no compatibility issues because it is CFR.

- 1fps CFR, 1800 frames, keyframes every 5 min (every 300 frames), crf 20, 16 b-frames: video stream 227kB. So this was smaller than the VFR I-frame encode, because of fewer keyframes.

I could have used a long GOP for the VFR encode, and that would have made it smaller, but it was already showing some compatibility issues in some players.

In this example the audio was a 7.55MB low-bitrate MP3, so proportionally much larger than the video streams.

It will vary a bit depending on what the content is exactly, but pure duplicates "cost" very little.
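The ratios behind those three results can be checked with a few lines of Python. The sizes are the ones reported above, read as kilobytes and megabytes (an assumption about the units), and the dictionary keys are just labels for this example.

```python
# (video stream size in kB, number of coded frames) for each test encode
results = {
    "VFR, 1 min frames, all I-frames": (397, 30),
    "CFR 1 fps, keyframe every 60 frames": (669, 1800),
    "CFR 1 fps, keyframe every 300 frames": (227, 1800),
}
audio_kb = 7.55 * 1000  # assumes decimal megabytes

baseline_kb, _ = results["VFR, 1 min frames, all I-frames"]
for label, (size_kb, frames) in results.items():
    print(f"{label}: {size_kb / baseline_kb:.2f}x the VFR size, "
          f"{size_kb / frames:.2f} kB per frame, "
          f"audio is {audio_kb / size_kb:.0f}x larger")
```

The per-frame figures make the point of the post: the 1 fps encodes carry 60 times as many frames, but each duplicate costs only a fraction of a kilobyte.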
Old 18th December 2020, 10:35   #12  |  Link
nji
Registered User
 
Join Date: Mar 2018
Location: Germany
Posts: 230
Wow - thank you for providing these results.

I would like to add just one aspect I observed
when the dupes are not pure (bit-wise identical).

Strangely enough, this is the case for many
already existing "still frame" movies,
where one might want to get rid of the uselessly huge amount
of video data.
I would have bet that it works for them too,
but a closer look reveals that the codecs produced
slightly different frames.

So AviSynth's ExactDeDup (no sequence length limit)
won't do.
(BTW StainlessS pointed out that Lagarith produces
null frames for pure dupes.)
So it would be necessary to generate pure dupes
from the similar ones first (with the original Dup filter).

I wonder why all these conditions (especially the ones
poisondeathray worked out) aren't handled by a good
CFR > VFR converter, so that nobody would need to care about
compatibility and everybody could save huge amounts of space.

Last edited by nji; 18th December 2020 at 10:40.
Old 18th December 2020, 12:47   #13  |  Link
nji
Registered User
 
Join Date: Mar 2018
Location: Germany
Posts: 230
Quote:
Originally Posted by poisondeathray View Post
It would be nice if someone could mod [20 frames sequence length limitation] to something larger
Moreover, DeDup seems to be for AviSynth v2.5 and not for v2.6.

... and it also seems to be outdated for some other reasons that are discussed
in connection with the other deduping filters...
So: which dedupe filter should be used?

And one more thing I wonder:
Do they all take into account that similarity is not transitive?
Take the sequence
10 11 12 13 14 15 16: it has a max. neighbour distance of 1, but the distance between 10 and 16...
and the sequence
10 11 9 12 8 13 7 14 6: it has a max. neighbour distance of 8, but the distance between 10 and 6...

Last edited by nji; 18th December 2020 at 20:09. Reason: typo + clearification
Old 18th December 2020, 15:59   #14  |  Link
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,435
Quote:
Originally Posted by nji View Post

I would like to add just one aspect I observed
if the dupes are not pure (bit-wise identical).

Strangely enough this is the case for many
already existing "still frame" movies
where one might want to get rid of the useless huge amount
of video data.
I would have bet it works for them too,
but a closer look reveals that the codecs produced
slightly different frames.

Yes - for non-pure duplicates, the noisier they are, the more space is saved by removing them. It's pure duplicates that don't "cost" much. This can be a significant saving if there are many duplicates.

This is also the "grey" area mentioned earlier about the accuracy of duplicate detection. The threshold is not always accurate because of grain, noise, wobbly transfers and lossy compression. It takes some human eyes to get it right, or you might drop good frames or keep noisy duplicates. Pure or very similar duplicates are much easier to detect accurately.


Quote:
I wonder why all these conditions (especially the ones
poisondeathray worked out) aren't provided by a fine
converter cfr > vfr, so nobody needed care about
compatibility, and saved huge amounts of space.
Pros / cons were mentioned earlier.

It's not huge amounts of space for typical "clean" scenarios. And for 1-frame "songs" or lectures of several frames, being able to seek within a song or a lecture is arguably more important than saving a few MB.

Playback is a player/splitter/decoder issue. It varies. They all work for "normal conditions" with variable FPS, such as DeDup output where you have strings of <20 duplicates. Basically all theatrical movies will be covered. It's only when you have very large gaps (very small equivalent FPS) that some begin to have issues.





Quote:
Originally Posted by nji View Post
Moreover DeDup seems to be for AviSynth v.2.5 and not for v.2.6.

It works with all versions, but it's x86 only. You can use it with mp_pipeline for avs+ x64 (avisynth 3.6.2)

Quote:

And one more thing I wonder:
Do all they take into account that similariy is not associative?
Take the sequence
10 11 12 13 13 15 16 has max. neighbour distance of 1, but the distance of 10 and 16...
and the sequence
10 11 9 12 8 13 7 14 6 has max. neighbour distance of 8, but the distance of 10 and 6...

If you're referring to duplicates out of order, or later in a sequence - dedup only works for consecutive duplicate frames . It keeps a reference placeholder frame for each string of duplicates
Old 18th December 2020, 19:12   #15  |  Link
nji
Registered User
 
Join Date: Mar 2018
Location: Germany
Posts: 230
Just a suspicion:

Maybe the reason for AviSynth DeDup's 20-frame sequence length limitation
is to prevent the "shifting scenario" from my first sequence above?
That would be a bad algorithm.
Old 18th December 2020, 19:38   #16  |  Link
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,435
Maybe I don't understand the notation ?

In the 1st sequence, 13 and 13 are duplicates, and they are neighboring . This is ok . The 2nd "13" will be dropped, timecodes adjusted by dedup. 10 and 16 are different frames

The 2nd sequence has no duplicates? Or is this referring to something else ?
Old 18th December 2020, 20:09   #17  |  Link
nji
Registered User
 
Join Date: Mar 2018
Location: Germany
Posts: 230
I'm terribly sorry ... I made a typo in the first sequence ...
I corrected it just now.
And yes, you're misunderstanding.
It's not about frames,
but simply about natural numbers and their differences.

Last edited by nji; 18th December 2020 at 21:08.
Old 18th December 2020, 20:20   #18  |  Link
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,435
Quote:
Originally Posted by nji View Post
and the sequence
10 11 9 12 8 13 7 14 6 has max. neighbour distance of 8, but the distance of 10 and 6...


Quote:
Originally Posted by nji View Post
its not about frames,
but simply natural numbers and their differences.
Sorry, I do not understand what that refers to -

If it's "not about frames",what do the numbers refer to ?

This implies each number has a relative "position", because you mention "neighbor distance of 8"

Is it just about misordered frames ?

6,7,8,9,10,11,12,13,14 ?
Old 18th December 2020, 21:04   #19  |  Link
nji
Registered User
 
Join Date: Mar 2018
Location: Germany
Posts: 230
Think of the numbers as the "characteristic values" of the frames they stand for.
The question is how the deduping algorithms define a "sequence of similar frames":
Are neighbouring frames compared, or is the first frame of the sequence
the reference that the others are compared to?
If it's the first definition, then a "sliding change" with a difference of only 1
(10 11 12 13 etc.) has to be "cut off" at some point, as the difference between the first (10)
and the last (13 or whatever) has become too large ALTHOUGH the difference
between neighbours is only 1.
My suspicion is that this might be the reason for AviSynth DeDup's limit of 20.
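The two definitions can be compared on the number sequences from this thread with a short Python sketch. It treats each frame as a single number, as in the post, and both grouping functions are hypothetical illustrations, not how DeDup or any other filter is implemented.

```python
def group_by_neighbour(values: list[int], threshold: int) -> list[list[int]]:
    """Start a new group whenever a value differs from its predecessor by more than threshold."""
    if not values:
        return []
    groups = [[values[0]]]
    for previous, current in zip(values, values[1:]):
        if abs(current - previous) <= threshold:
            groups[-1].append(current)
        else:
            groups.append([current])
    return groups


def group_by_reference(values: list[int], threshold: int) -> list[list[int]]:
    """Start a new group whenever a value differs from the group's first value by more than threshold."""
    if not values:
        return []
    groups = [[values[0]]]
    for current in values[1:]:
        if abs(current - groups[-1][0]) <= threshold:
            groups[-1].append(current)
        else:
            groups.append([current])
    return groups


drifting = [10, 11, 12, 13, 14, 15, 16]
print(group_by_neighbour(drifting, 1))   # one long "duplicate" run despite the drift
print(group_by_reference(drifting, 1))   # the drift forces a cut every two values

jittering = [10, 11, 9, 12, 8, 13, 7, 14, 6]
print(group_by_neighbour(jittering, 4))
print(group_by_reference(jittering, 4))
```

With neighbour comparison the slowly drifting sequence collapses into a single run, which is exactly the "sliding change" concern; with a fixed reference the jittering sequence stays in one group although its neighbours differ by up to 8.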

Last edited by nji; 18th December 2020 at 21:07.
Old 18th December 2020, 21:25   #20  |  Link
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,435
Quote:
Originally Posted by nji View Post
Think of the numbers as the "characteristic values" of the frames they stand for.
The question is how the deduping algorithms define "sequence of similar":
Are the neighboured frames compared, or is the first frame of the sequence
the reference to compare the others to?

They usually compare similarity characteristics, either the difference from the previous frame or the difference to the next. Usually only small areas are sampled, not the entire frame, for speed. The "%" value is what is used for the threshold.

As mentioned earlier, for DeDup it's only the directly adjacent frame.

Position
10 difference from 11
11 difference from 12
.
.
.
An example from a log looks like this

Code:
frm 22: diff from frm 23 = 2.1809% at (160,352)
frm 23: diff from frm 24 = 3.5938% at (256,448)
frm 24: diff from frm 25 = 2.9563% at (128,352)
frm 25: diff from frm 26 = 3.1458% at (160,352)
frm 26: diff from frm 27 = 3.4811% at (256,448)
frm 27: diff from frm 28 = 0.2851% at (416,256)
frm 28: diff from frm 29 = 3.4487% at (256,448)
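A log in this shape can be scanned for likely duplicates with a short Python script. The line pattern is inferred from the excerpt above, and the 0.5% threshold is an arbitrary example value, not a recommended setting; `parse_log` and `near_duplicates` are hypothetical helpers.

```python
import re

# Pattern inferred from the log excerpt: frame, compared frame, difference in percent, block position.
LOG_LINE = re.compile(r"frm (\d+): diff from frm (\d+) = ([\d.]+)% at \((\d+),(\d+)\)")

LOG_EXCERPT = """\
frm 22: diff from frm 23 = 2.1809% at (160,352)
frm 23: diff from frm 24 = 3.5938% at (256,448)
frm 24: diff from frm 25 = 2.9563% at (128,352)
frm 25: diff from frm 26 = 3.1458% at (160,352)
frm 26: diff from frm 27 = 3.4811% at (256,448)
frm 27: diff from frm 28 = 0.2851% at (416,256)
frm 28: diff from frm 29 = 3.4487% at (256,448)
"""


def parse_log(text: str) -> list[tuple[int, int, float]]:
    """Return (frame, compared_frame, difference_percent) for every matching line."""
    rows = []
    for line in text.splitlines():
        match = LOG_LINE.search(line)
        if match:
            rows.append((int(match[1]), int(match[2]), float(match[3])))
    return rows


def near_duplicates(rows: list[tuple[int, int, float]],
                    threshold_pct: float) -> list[tuple[int, int]]:
    """Adjacent frame pairs whose difference is below the threshold."""
    return [(a, b) for a, b, diff in rows if diff < threshold_pct]


print(near_duplicates(parse_log(LOG_EXCERPT), threshold_pct=0.5))  # [(27, 28)]
```

Only the pair 27/28 falls under the example threshold; raising it towards 2-3% would start to flag frames that are probably real motion, which is the grey area discussed earlier in the thread.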