Variable Frame Rate with RV9 [Archive]

iwod

19th June 2003, 18:23

Ever since I thought about reducing the number of frames in an anime to reduce the size of the movie, I have been researching, and I found MarcFD's copysame and neuron2's Dup 2.x AviSynth filters.

At first, MarcFD had the same idea as I did (but he was way ahead of me :D) and made copysame. But then he, strangely, concluded that copysame won't bring any improvement to anime, and that B-frames would do the job. I found no other resources with more details on the subject, so I decided to post here.

Kaiousama wanted to do the same thing with RV9 as well, since the codec itself supports variable framerate, although somebody said this feature was only intended for streaming, where a frame is dropped to reduce bandwidth.
I am sure that with some tweaks we could do it more manually.

In Kaiousama's previous post, Pamel suggested modifying Dup 2.0 in AviSynth to drop a frame when the frames are similar. But I found out that since AVI does not support variable frame rate, there is no way to do that.

I am opening this topic for people to post ideas.

Recently Drak Cracker posted a tool that detects whether motion is high, normal, or slow. In anime, if we could tell x to drop to 15fps during slow motion, wouldn't that be great?

Or could somebody think of a way to input a variable frame rate source to Helix Producer?

Any suggestion would be more than welcome.

Atamido

19th June 2003, 19:02

Let me clarify what I said earlier. AVI does not support Variable FrameRate, but it does support dropped frames. Dropped frames exist for video capture: if the computer falls behind, it can skip a frame. When it does this, though, it has to put down a dummy frame so that the video and audio stay in sync. The dummy frame tells the reader to show the same frame over again.

It is also used by many anime rippers dealing with DVDs that switch between 24fps and 30fps video. Because AVI is Constant FrameRate only, there isn't a way to store video that switches between the two framerates. So what they do is increase each portion of video to 120fps (24fps x 5 = 30fps x 4 = 120fps). To do this they create extra frames that are drop frames. So there is a real frame, and then 3-4 dropped frames that just keep showing the same thing. Of course, adding all of these fake frames does have a fairly significant amount of overhead.
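
As a rough illustration of the 120fps padding, here is a minimal Python sketch. The names `common_rate` and `pad_to_common_rate`, and the use of `None` as a stand-in for an AVI drop frame, are assumptions made for the example; this is not how any particular ripping tool does it.

```python
from functools import reduce
from math import gcd


def common_rate(rates):
    """Lowest framerate that every input rate divides evenly (24 and 30 give 120)."""
    return reduce(lambda a, b: a * b // gcd(a, b), rates)


def pad_to_common_rate(segments):
    """segments: list of (fps, frames). Returns (common_fps, padded_stream).

    None stands in for a drop frame (a hypothetical marker for this sketch).
    """
    target = common_rate([fps for fps, _ in segments])
    stream = []
    for fps, frames in segments:
        slots = target // fps  # e.g. 120 // 24 = 5 slots per source frame
        for frame in frames:
            stream.append(frame)                 # the real frame
            stream.extend([None] * (slots - 1))  # dummies that repeat it
    return target, stream


fps, stream = pad_to_common_rate([(24, ["a", "b"]), (30, ["c"])])
print(fps, len(stream), stream.count(None))
```

Each real frame is followed by (common rate / source rate - 1) dummies, which is where the overhead comes from.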

How is this useful for you? Well, you could create a filter that passes a dropped frame when it detects no movement in the video. I don't know though if this is possible with AVISynth filters.

If you were encoding through DirectShow to Matroska, you could create a filter that simply removed a frame when it detected no movement. Maybe it could even adjust timecodes to compensate in low motion video.
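
A minimal sketch of that idea in Python, assuming a container that stores one timecode per frame. `drop_static_frames` and the `is_same` callback are hypothetical names for this example, and millisecond timecodes are an assumption, not a statement about any real muxer API.

```python
def drop_static_frames(frames, fps, is_same):
    """Return [(timecode_ms, frame)], keeping only frames that differ from the last kept one.

    is_same(a, b) is a caller-supplied "no movement" test (hypothetical).
    """
    kept = []
    for index, frame in enumerate(frames):
        if kept and is_same(kept[-1][1], frame):
            continue  # nothing is written; the previous frame just stays on screen longer
        # The timecode is still derived from the original constant-rate position.
        kept.append((round(index * 1000 / fps), frame))
    return kept


source = ["A", "A", "A", "B", "B", "C"]
print(drop_static_frames(source, 25, lambda a, b: a == b))
```

The kept frames carry their original presentation times, so the gaps between timecodes grow wherever the picture is still.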

For a lot of anime, it would be pretty easy to drop down to 1-2fps for those shots where there is absolutely no change for a good second or more. That would translate to a lot of bytes saved over the course of an hour.

Yusaku

24th June 2003, 02:11

I think that this would be a great feature for anime encodes, but you should file this as a request to XviD (or other video format) developers, as it cannot be done on the container level. (and there is not much of a chance of it getting into encoding applications for RM, IMO)

Since you always have some level of noise on the DVD, no two frames will be exactly the same, but these frames could be distinguished by their very small size, under some threshold, in XviD (after compression). If the drop frame falls on a B-frame, you can just drop it at the application level (a small hack in VDub would do that easily). But if it falls on a P-frame, you need to tell the codec to continue motion estimation from the previous I/P frame instead of the actual one, otherwise you'll see some serious quantization errors (thus the need to support this at the codec level).

The other problem might be in MPEG4 specifications - even though you can easily put this into AVI/OGM/Matroska fileformats, I am not sure whether the resulting videostream still will be MPEG-4 compatible. In MPEG-2 PS it is possible, though, so it should not be a problem.

Do not expect wonders, though - you'll be eliminating only the smallest frames and I think that you'll save at most around 5% of total videostream size.

karl_lillevold

24th June 2003, 03:54

Originally posted by Yusaku
I think that this would be a great feature for anime encodes, but you should file this as a request to XviD (or other video format) developers, as it cannot be done on the container level. (and there is not much of a chance of it getting into encoding applications for RM, IMO)
Never say never :sly: The times they are a-changin'. Maybe the Producer team can't add it to Producer, but the only thing stopping me from adding such a feature to RV9, the codec, is finding time to do it, and with your "challenge" above, I think I might just make time!

Yusaku

24th June 2003, 11:47

Hehe. Actually, I was hoping a bit for such a reply :devil:

I'm always glad to be proven wrong :)

iwod

25th June 2003, 07:05

Thanks for that, Yusaku... I have personally never persuaded Karl to do anything, and now he is doing an ANIME MODE!!! :devil:

@Karl, on the RV9 info thread you said you are having trouble with detecting noise. There is an AviSynth filter that detects this kind of thing, i.e. whether frames are nearly the same. It is called Dup.

I don't know if it would be of any use to you to look at how it works.

karl_lillevold

25th June 2003, 07:13

@iwod: when did you ask for anything, and I did not listen :) ? I am looking into using existing parameters and calculations inside the codec to determine whether or not a frame can be dropped. If the one parameter I am thinking of does not work, I may look into other options, so thanks for the pointer. This is very experimental..

Yusaku

25th June 2003, 15:51

Another algorithm that may be used is the interfield difference computation in Decomb (careful, GPL ;) ), which works pretty nice.

Also, if this is implemented, there should be some noise filter applied to the output (for each frame), to give people the idea that they are watching a movie and not a frame lasting five seconds (especially valid for South Park/Pokemon style animation).

Another practical observation: many programs (including Decomb), when deciding which frame to drop, keep the first frame in the run and drop the rest of the duplicates. For new digital-era shots it is not a problem, but on some old material with crappy telecine this means that the frame with blending artifacts is preserved and the (better looking) frames that follow it are dropped.

karl_lillevold

25th June 2003, 16:19

Yes, there are many ways to detect duplicated frames, and I am pretty sure a good solution can be found. a first experiment would just be a rough cut and then determine if there is any measurable improvement, /and/ if the result is pleasant to watch. Since I can not change the bitstream or the player at this point, playback "comfort noise" can not easily be added, and I think dropping more than every other frame might lead to the impression of frozen video. People are after all used to a slight degree of full framerate noise.

Kaiousama

26th June 2003, 11:39

First of all, thanks iwod for collecting the information from the VFR posts, and thanks Karl for your interest in anime-oriented codec development :)

I agree with Pamel's idea of dropping frames; I think it could be the right way to go.
From my experience I can tell you that the Dup filter is currently the best example to learn from for detecting lower-framerate anime sequences. It does have a small problem with its threshold, which sometimes leads to false-positive duplicates during slow CG zooming scenes: if you set the threshold too high, it detects these scenes as repeated frames, but if you lower it, it detects noise as a significant change. A tweak on this issue would make the filter fit perfectly for the duplicate detection part of this codec tweaking.

I also agree that this implementation isn't possible via the container. It could be done on the AviSynth side, but in that case it would still need codec support, so modifying the codec directly and leaving the AviSynth side unchanged is, in my opinion, the better solution.

a first experiment would just be a rough cut and then determine if there is any measurable improvement, /and/ if the result is pleasant to watch.

The weak point of the dup-copysame filter was slightly non-fluid movement when watching the resulting video, more noticeable when the framerate slows down / speeds up between two parts painted at different framerates.
I don't know if the improvements will be very measurable; surely it will bring a bitrate reduction, and surely it'll help to get a cleaner video, closer to what anime producers have before the DVD encoding process.

@Pamel:
It is also used by many anime rippers dealing with DVDs that switch between 24fps and 30fps video. Because AVI is Constant FrameRate only, there isn't a way to store video that switches between the two framerates. So what they do is increase each portion of video to 120fps (24fps x 5 = 30fps x 4 = 120fps). To do this they create extra frames that are drop frames. So there is a real frame, and then 3-4 dropped frames that just keep showing the same thing. Of course, adding all of these fake frames does have a fairly significant amount of overhead.

I've only once seen a video like this. Can you tell me how this process is done in practice (if you think it's too OT, please send me a PM, I'm very interested)?

@all:
I think RV9, first with EHQ, and then with this possible new feature, is becoming very effective on anime content, a lot more than other codecs.... :eek: :D

karl_lillevold

26th June 2003, 14:53

I have a version working, using SSD comparisons of the current and previous frame, currently averaged over the whole image. This is very effective for the sources I have tried, which were both pretty clean. One was T01 that Sirber provided; I will try T03 later. I will run some PSNR/bitrate measurements to determine effectiveness. Even if the gains are not huge, encoding is significantly sped up, since the SSD comparison is much quicker than a full encode. I can also exit early if the difference is large.
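
A minimal Python sketch of a whole-frame SSD check with an early exit. It only illustrates the general technique and is not the code inside the codec. The function name `is_duplicate`, the flat list of luma samples, and the `max_avg_ssd` parameter are assumptions made for the example.

```python
def is_duplicate(prev, cur, max_avg_ssd):
    """True if the average squared difference per sample stays within max_avg_ssd.

    prev and cur are equal-length flat sequences of luma samples (an assumption
    of this sketch; a real encoder would work on its own frame buffers).
    """
    budget = max_avg_ssd * len(cur)  # total SSD allowed over the whole frame
    total = 0
    for a, b in zip(prev, cur):
        diff = a - b
        total += diff * diff
        if total > budget:
            return False  # early exit: the frames are already too different
    return True


print(is_duplicate([10, 10, 10, 10], [10, 11, 10, 9], 1.0))
print(is_duplicate([0, 0, 0, 0], [0, 0, 0, 10], 1.0))
```

Comparing the running total against a precomputed budget is what lets clearly different frames bail out after a few samples, while true duplicates cost one pass over the frame.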

I will also add a threshold for SSD max per area, and tune the thresholds for more noisy sequences. However, like Kaiousama mentioned, even if such frame dropping were to be made to work for noisy sequences, I am afraid the visual effect of frame drops would be less fluidity, which may not be desirable.

P.S. If someone were so inclined, they could also write an input transform plugin for Helix Producer that removes duplicated frames. You can get example source code from the Helix Community and, for instance, use the Dup example mentioned previously. The source code for our Inverse Telecine / De-interlace filter is available from producersdk/plugins/transform/videoprogressive and is a good starting point.
My version lives inside the codec for now, but thinking out loud, maybe I will indeed move it to such a plugin. That way anyone can get the source code for it, and tweak it themselves.

Sirber

26th June 2003, 14:59

@Karl

Could you compress T03 with the Brand New Golden Frame Removal Mega Engine and send me a link? I'll add it to my comparison. Use the same settings and Helix Producer GUI. :D

Thanks!!!

karl_lillevold

26th June 2003, 15:02

"working" != "complete", but I will let you know if or when, there is an improvement worth including in the comparison... and today there is the dentist, meetings, I will write up info about usage of RA M-Channel, but soon.

Sirber

26th June 2003, 15:05

Ok

But working is enough for me :D. In computer-related stuff, nothing can be "complete", there is always room for improvements.

Examples:

Windows, HDD, codecs, speaker sound quality, etc.

Atamido

26th June 2003, 19:05

@Kaiousama:

I'm not sure how the Dup filter does it, but you may be able to tell me. A good way to implement this kind of technique would be to do some frame buffering where a frame is not compared to the following frame, but rather to the frame 2-3 frames later. This is because the anime and CG scenes where it would be proper to use dropped frames are always going to be longer than 2-3 frames. So, if a frame is determined to be the same as the frame 3 frames later, then you can drop/copy those 3 frames because you know they are all going to be the same.

It would prevent a false positive on slow CG zoom sequences because while a frame might be almost identical to the following frame, it is going to be much more different than a frame 2-3 down the line.

Also, you could increase the buffering to 10+ frames to make sure that you only get those scenes where it is still for at least 1/3 of a second.
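
The look-ahead comparison described above could be sketched in Python roughly as follows. `find_droppable` and `is_same` are hypothetical names, frames are reduced to single numbers to keep the example small, and the sketch takes on trust that the frames between two matching frames are also the same.

```python
def find_droppable(frames, lookahead, is_same):
    """Indices of frames that could be dropped.

    Frame i is compared to frame i + lookahead rather than to its neighbour.
    On a match, frames i+1 .. i+lookahead are marked as droppable.
    """
    drop = set()
    i = 0
    while i + lookahead < len(frames):
        if is_same(frames[i], frames[i + lookahead]):
            drop.update(range(i + 1, i + lookahead + 1))
            i += lookahead  # continue from the last frame of the matched run
        else:
            i += 1
    return sorted(drop)


def close(a, b):
    return abs(a - b) <= 1  # stand-in threshold test


still_then_cut = [5, 5, 5, 5, 5, 9, 9]
slow_zoom = [0, 1, 2, 3, 4, 5, 6, 7]  # each frame barely differs from the next

print(find_droppable(still_then_cut, 3, close))
print(find_droppable(slow_zoom, 3, close))  # look-ahead of 3: nothing dropped
print(find_droppable(slow_zoom, 1, close))  # neighbour comparison: false positives
```

With a look-ahead of 1 the slow zoom is treated as one long duplicate run, while a look-ahead of 3 leaves it alone. The cost is that the trailing frames of a still scene shorter than a full look-ahead window are kept.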

About the 120fps issue, I'm afraid I don't know any more about it. I have never even seen one of these clips. I have only heard people talk about having to deal with already created 120fps clips.

RadicalEd

27th June 2003, 09:15

Originally posted by Pamel
This is because the anime and CG scenes where it would be proper to use dropped frames are always going to be longer than 2-3 frames.

That's not necessarily true, a lot of the time animation is drawn at 12 or 6 fps and the frames merely copied once or twice to create 24 fps. If more than half the anime is at 12fps, that's quite a lot of frames you'd be missing that could be dropped.

bill_baroud

27th June 2003, 12:01

@Kaiousama : just read the thread I started about AviSynth handling of those 120fps clips ... there are several interesting links.

http://forum.doom9.org/showthread.php?s=&threadid=49561

sorry for the OT

Kurosu

27th June 2003, 12:42

Originally posted by Pamel

I'm not sure how the Dup filter does it, but you may be able to tell me.

If I recall correctly, Dup computes a Sum of Absolute Differences (quicker but less effective than the Sum of Squared Differences used by Karl) between each 32x32 block of the picture being processed and a reference frame. I don't know how the missing pixels are handled (when a dimension isn't MOD32).

Then a threshold is compared against the normalized SAD (i.e. SAD/(32x32)) for each block to see if there were noticeable changes in it: if any block has a higher change, the frame isn't modified. Otherwise, it's either averaged with the reference frame or simply copied. There is a maximum of 30 consecutive copies (I think) in a run.
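
A minimal Python sketch of a block-wise normalized SAD test along those lines. This is not Dup's actual source: the function name `is_dup_blockwise` is hypothetical, and normalizing partial edge blocks by their real pixel count is an assumption made here, since the handling of non-MOD32 dimensions is unknown.

```python
def is_dup_blockwise(ref, cur, threshold, block=32):
    """True if no block's normalized SAD exceeds threshold.

    ref and cur are equal-sized 2-D lists (rows of luma samples).
    Edge blocks smaller than block x block are normalized by their own
    pixel count, which is an assumption of this sketch.
    """
    height, width = len(cur), len(cur[0])
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            sad = sum(abs(ref[y][x] - cur[y][x]) for y in rows for x in cols)
            if sad / (len(rows) * len(cols)) > threshold:
                return False  # one changed block is enough to keep the frame
    return True


ref = [[0] * 4 for _ in range(4)]
local_change = [row[:] for row in ref]
local_change[3][3] = 8  # a change confined to one corner block

print(is_dup_blockwise(ref, local_change, threshold=1, block=2))
```

The per-block test catches a small local change (here an average of 2 in one 2x2 block) that a whole-frame average (8/16 = 0.5) would let through.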

I would also recommend reading this (http://forum.doom9.org/showthread.php?threadid=47203) thread, although nothing really came out of it: I believe the video stream must be delivered through DirectShow/DMO/whatever to achieve VFR controllable by the user. But as you can guess, it's such a leap compared to an Avisynth implementation that the conclusion was: not for now.

Maybe the coming of matroska and VirtualDub2 (and such threads) may change this. It will still be a lot of effort, but it might be extremely useful for some of us. Empty delta frames (as in: no difference with the previous one) aren't that costly, it seems (around 1KB for a 720x576 frame in MPEG-4).

RadicalEd

27th June 2003, 14:01

If the codec is doing its job I suppose there shouldn't be a big gain, but then again, even at 0.9 KB per empty frame, if a, say, 2 hour anime runs at 12 fps on average, that's 75 megabytes saved. Which is... only something like 85 kbps extra... but hey, that's enough for another Ogg or RA audio stream.
hmm
screw theory, we'll see what happens :|
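
The arithmetic behind that estimate, as a small Python sketch. It reads "runs at 12 fps" as half of the frames in a 24 fps stream being duplicates, which is an interpretation, and `empty_frame_savings` is a hypothetical helper name.

```python
def empty_frame_savings(hours, fps, dup_fraction, kb_per_empty_frame):
    """Return (duplicate_frames, saved_megabytes, saved_kilobits_per_second)."""
    seconds = hours * 3600
    dup_frames = seconds * fps * dup_fraction
    saved_kb = dup_frames * kb_per_empty_frame
    saved_mb = saved_kb / 1024
    saved_kbps = saved_kb * 8 / seconds  # spread over the whole running time
    return dup_frames, saved_mb, saved_kbps


frames, megabytes, kbps = empty_frame_savings(2, 24, 0.5, 0.9)
print(f"{frames:.0f} frames, {megabytes:.1f} MB, {kbps:.1f} kbps")
```

This gives 86400 duplicate frames, a little under 76 MB, and a bitrate in the mid-80s kbps, in line with the figures above.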

Atamido

27th June 2003, 21:38

Originally posted by RadicalEd
That's not necessarily true, a lot of the time animation is drawn at 12 or 6 fps and the frames merely copied once or twice to create 24 fps. If more than half the anime is at 12fps, that's quite a lot of frames you'd be missing that could be dropped.
Oops, I should have been clearer. I was only referring to still scenes, where there would be an excessive number of identical frames.

Of course, being able to configure thresholds and number of frames to compare would fix most of these issues.

Also, it looks like DirectShow is the only practical method to implement this. If someone wants to try integrating the Dup filter into DS, then by all means, do.

Sirber

27th June 2003, 22:22

Are we still talking about RV9? :confused:

karl_lillevold

28th June 2003, 01:21

I moved my frame duplication detection code into a separate pre-filter for use in Producer 9.2. It's still very rough, with no adjustable thresholds, but will add those next, probably maxAvgSSD, maxAreaSSD, and maxDistance or maxNumberOfFrames. It is checked into producersdk/plugins/transform/videodupframedropper, if anyone wants to take a look. This is open source. Go grab it :cool:

The good news is that it already works very well, and the improvement is nice. The main reason is that, in addition to the 5-10% of bits saved by not encoding the duplicated frames, the adaptivity in the RV9 encoder is not really optimized for encoding duplicated frames, and this pre-filter takes care of that problem automatically.

In addition, it will of course encode much faster, about 2X faster when every 2nd frame is a duplicate. The MMX optimized SSD is pretty fast, and early exits mean it takes almost no time on non-duplicated frames.

For a section of the T01 clip at 250 kbps the PSNR improvement is 1.5 dB. This is a high number, and generally corresponds to about 30% bitrate reduction at the same quality (10% per 0.5 dB).

I also tried the full T03 from this codec test, but the RV9-EHQ version of T03 already looks very nice, so it's hard to spot much improvement there. bill_baroud mentioned he thought the bitrate in Sirber and Ramirez' test is not fair. I think I agree, it should perhaps have been even lower :rolleyes:

Sirber

28th June 2003, 04:15

I wanted 300kbps for T03.avi, but Ramirez told me it would kill XviD :D

The comparison will begin (me and Ramirez) tomorrow. Stay tuned!

karl_lillevold

28th June 2003, 04:20

maybe I will post a version of T03 at 250 kbps or so compressed with RV9-EHQ/Helix Animation DropDupe Pre-filter :sly:

Sirber

28th June 2003, 04:40

Would be great. :D

Do you think your filter can work with real movies also? Maybe not... :(

[edit]

Do you want the whole TXX Series? 8 episodes :D Hours and hours of fun!!! :D :D :D

Atamido

28th June 2003, 04:50

From looking at the Real Container, I don't recall any particular reason that you couldn't drop out a frame altogether. The equivalent of the BlockGroup in Matroska is the DataPacket in Real. The structure is just:

Version
Size of the packet
Stream number
Timestamp
Flags
Data itself.

From this, it seems that you could have any variable framerate by setting the timestamp to whatever you like. If this is the case, then it would make sense that if you detect a duplicate frame, you wouldn't need to save it at all, but rather skip the frame and leave it completely out of the file. This would save 13 bytes, plus the data needed to say to repeat the frame, for every frame you dropped out.
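
To illustrate why timestamps alone are enough, here is a minimal Python sketch of the reader's side. The `DataPacket` fields follow the list above, but the Python types, the millisecond unit, and the helper `display_durations` are assumptions made for the example, not the actual Real file format definition.

```python
from dataclasses import dataclass


@dataclass
class DataPacket:
    # Field names follow the list in the post; types and units are assumed.
    version: int
    size: int
    stream_number: int
    timestamp_ms: int
    flags: int
    data: bytes


def display_durations(packets, end_ms):
    """How long each packet's frame stays on screen.

    A frame is shown until the next packet's timestamp, or until end_ms
    for the last one, so omitted duplicates simply lengthen the gap.
    """
    stamps = [p.timestamp_ms for p in packets] + [end_ms]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


packets = [
    DataPacket(0, 3, 0, 0, 0, b"AAA"),    # held for three frame periods at 25 fps
    DataPacket(0, 3, 0, 120, 0, b"BBB"),
    DataPacket(0, 3, 0, 200, 0, b"CCC"),
]
print(display_durations(packets, end_ms=240))
```

No "repeat previous frame" packet is needed anywhere in this model; the skipped duplicates exist only as longer gaps between timestamps.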

From what RadicalEd said, 50-60% of animation could be duped frames. And the number rises if there are a lot of still scenes. A 2 hour animation video with a modest 50% duped frames at 24fps would result in 86400 duped frames. Assuming you could make a duped frame only take 15 bytes, you would still have to dedicate 1.3MB, as opposed to 0MB if you didn't write them to the file at all.

@ karl_lillevold: I know you're a codec guy and don't deal much with the container, but that is some pretty significant instant savings for such a small action.

karl_lillevold

28th June 2003, 05:02

Originally posted by Pamel
From looking at the Real Container, I don't recall any particular reason that you couldn't drop out a frame altogether.
This is true, and exactly what I am doing in the pre-filter. The frame is dropped before encode, and never to be seen again, it does not exist in the output RM container.

@ karl_lillevold: I know you're a codec guy and don't deal much with the container, but that is some pretty significant instant savings for such a small action.
The improvement is variable, but it consists of two components. The first is the savings from the dropped frames. The second is something I found when working with these animations: the adaptivity within RV9 was tuned towards linear motion from frame to frame, and with duplicated frames this is just not true, so a few more bits than necessary were sometimes spent on the duped frames. Still, RV9 as it is works great on animation, as can be seen from Sirber's anime codec comparison (http://forum.doom9.org/showthread.php?s=&threadid=56392), but it can be made even more effective if the duplicated frames are dropped.

karl_lillevold

28th June 2003, 21:56

See this new thread for how to try out this new pre-filter for Helix Producer:
http://forum.doom9.org/showthread.php?s=&threadid=56564