Old 28th May 2003, 16:32   #41  |  Link
Acaila
Retired
 
 
Join Date: Jan 2002
Location: Netherlands
Posts: 1,529
You're not the first to think of this; DCTune already does per-frame QM optimization during encoding. Even though the program seems to be gone by now, they did prove it was possible. But I bet it required a lot of horsepower as well.

Doing it with an external program adds some extra difficulties. I-frames wouldn't be a problem, but P-/B-frames require a reference frame, as you all know. Normally this reference frame is a decoded version of a previously encoded frame, on which motion estimation is performed.
If you take two videos, each encoded with a different matrix, this will mean that all frames that are used as reference will be different between those two videos. If you take an I-frame out of the first video and combine it with the following P-frame of the second video, the P-frame would act on other data than it was created for and the result would be a total mess.

The only way I can see your idea working is if, by analysing multiple videos, you could output a logfile that says which QM to use for each frame, based on which one results in the smallest frame. During encoding this logfile would then be used to encode the video with the appropriate QM.
But it would be faster to just calculate multiple matrices during encoding and pick the one that results in the smallest frame. That would have to be incorporated into a future codec someday, though.
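
A minimal Python sketch of that logfile idea, assuming each analysis encode can report its per-frame sizes. The function name `build_qm_log`, the matrix names and the byte counts are made up for illustration. It also inherits the limitation described above: P-/B-frame sizes measured in one encode are only a rough guide once the reference frames change.

```python
def build_qm_log(frame_sizes):
    """frame_sizes maps a matrix name to a list of per-frame sizes in bytes.

    Returns one matrix name per frame: the one that gave the smallest frame.
    """
    names = list(frame_sizes)
    n_frames = len(frame_sizes[names[0]])
    log = []
    for i in range(n_frames):
        # min() keeps the first name on ties, so the choice is deterministic.
        log.append(min(names, key=lambda name: frame_sizes[name][i]))
    return log


# Hypothetical per-frame sizes from two analysis encodes.
sizes = {
    "h263": [5200, 1800, 1750, 4900],
    "mpeg": [5000, 1900, 1700, 4900],
}
qm_log = build_qm_log(sizes)
```

The resulting list could be written out one entry per line and read back by an encoder that supports switching matrices, which is the part that does not exist yet.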
Old 28th May 2003, 19:07   #42  |  Link
crusty
Ahhh....Nuts!!
 
 
Join Date: May 2002
Location: Holland
Posts: 309
Quote:
If you take two videos, each encoded with a different matrix, this will mean that all frames that are used as reference will be different between those two videos. If you take an I-frame out of the first video and combine it with the following P-frame of the second video, the P-frame would act on other data than it was created for and the result would be a total mess.
Well a quick workaround for that would be to only cut those parts with keyframes on the same frames.

Say movie 1 has keyframes on these:
1 112 229 300 330 400 567 1000.....
And movie 2 has keyframes on these:
1 96 229 300 334 401 567 1000.....
You see that it would be possible to swap out sections 1-229, 1-300, 229-300, 1-567, 229-567, 300-567 etc etc ...
Nothing difficult here, really.
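
As a rough sketch in Python, the swappable sections are the spans between keyframes that both encodes share. The helper names are hypothetical; the keyframe lists are the ones given above.

```python
def common_keyframes(*keyframe_lists):
    """Return the sorted frame numbers that are keyframes in every encode."""
    shared = set(keyframe_lists[0])
    for frames in keyframe_lists[1:]:
        shared &= set(frames)
    return sorted(shared)


def swappable_segments(shared):
    """Pair up consecutive shared keyframes as (start, end) segments.

    A segment runs from start up to, but not including, end.
    """
    return list(zip(shared, shared[1:]))


movie1 = [1, 112, 229, 300, 330, 400, 567, 1000]
movie2 = [1, 96, 229, 300, 334, 401, 567, 1000]

shared = common_keyframes(movie1, movie2)
segments = swappable_segments(shared)
```

Longer spans such as 1-300 or 229-567 are just runs of adjacent segments. Passing more than two lists shows how quickly the shared set shrinks as files are added.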

Quote:
The only way I can see your idea working is if, by analysing multiple videos, you could output a logfile that says which QM to use for each frame, based on which one results in the smallest frame. During encoding this logfile would then be used to encode the video with the appropriate QM.
This would indeed be a better but far more complex approach. Even more so because you would have to take into account the I/B/P-frame decision and the use of different QMs for I- and B/P-frames.
So, if you did an analysis like that on each frame, you would have to take into account whether the current frame would be an I-frame, in which case the calculations aren't that complex, or a B/P-frame, in which case you have to recalculate the preceding frames back to the I-frame used for reference. This would be both inefficient and far more CPU-intensive.
If you just calculated all frames as I-frames, you would ignore the effects of a separate B/P-frame QM. It would be far simpler, though.
Quote:
But it would be faster to just calculate multiple matrices during encoding and pick the one that results in the smallest frame.
True. That's why I suggested to put it in the codec in the first place.
But the crude way I mentioned above, splitting only when I-frames match, would be far easier to implement.
It would allow people to modulate QM on a much more intelligent basis, than just the current Quantizer-based decision. It would also allow modulation on more than 2 QM's provided you have the cpu and time to spare, and you would be able to use other QM's than just h263 and Mpeg.
__________________
Core 2 Duo intel, overclocked at something (can't remember what i set it at), about 2 TB in hard storage, nvidia 9800 GT and a pretty old 19tftscreen.
Crusty007 at en.Wikipedia.org
Old 28th May 2003, 19:21   #43  |  Link
MrBunny
Registered User
 
Join Date: Oct 2002
Posts: 82
Quote:
Originally posted by crusty

MrBunny, I couldn't find the thread you posted.

You mean, because a large first pass tells the codec that the end result would be bigger than allowed if it used a lower quantizer, right? Just to get this straight...
Sorry, I don't know what happened to the url, must have copy&pasted it wrong.
http://forum.doom9.org/showthread.ph...827#post305827

Quote:
Originally posted by crusty

I was just thinking....
When I started encoding a few years ago I had a really hard time encoding the movie '2010: The year we made contact' and I ended up using both the DivX 3 Fast-motion and Low-motion codecs.
It's a space movie and on the whole it looked better with the Low-motion codec, but there were two or three scenes of fast motion that looked blocky.
I encoded the whole movie with both codecs and then used a tool (cannot remember its name) that would merge the best parts of both.
The end result was very good.

Now I also have used different QM's recently inside one movie, by using a Heavy Compression matrix on the end credits and using H263 or modulated on the main movie.
It may not be mpeg4 compliant, but it works like a charm.
So using different QM's in one clip is quite possible.

This leads me to the following idea (you're probably guessing it already):

Encode the entire movie with several different QM's and then use a tool to switch between the best looking/best compressed parts and merge those. Then you would have a modulated end result.
And with this process you can easily alter the algorithm for choosing between the QM's without altering the codec itself, as it's just a standalone tool.
This way you could use up to a zillion different QM's in one clip, depending on how you make the decision. So you could automatically use different QM's for end credits, low motion and fast motion scenes and dark and light content etc etc.

Any thoughts on this?
Ahhh, the old manual MM4 days. I remember using a tool called ProjectDivX for it, but your point is well taken. Like that tool, the QM/cutscene points would need to be per I-frame so as not to break Xvid compatibility (much for the reasons Acaila described).

I personally would love a tool like ProjectDivX just for cutting and pasting between two xvid files. Sometimes I enjoy tweaking a scene or two (or many more...) that the curve just doesn't deal with right. Using it for clips with different QMs would also be an option. Multiple-QM xvid files are possible as long as they are cut at I-frames; I've done one myself. But that was before I knew about MPEG-4 compatibility issues, and right now I prefer to stay MPEG-4 compliant, even at the cost of a little quality.

Sigh...new post as I was writing this
My major problem with auto-selection of QM by any algorithm is that it can't "see" what the end result is. It's quite possible that matrix A results in slightly smaller frame sizes than matrix B, but encoding with matrix A looks really ugly relative to B. I think such an algorithm would have to be very carefully written.
Old 28th May 2003, 19:33   #44  |  Link
Acaila
Retired
 
 
Join Date: Jan 2002
Location: Netherlands
Posts: 1,529
Quote:
Well a quick workaround for that would be to only cut those parts with keyframes on the same frames.
No, you misunderstood me. What I meant was that a frame encoded with one matrix cannot serve as a reference for the following frame when that frame expects a reference frame encoded with some other matrix.

Suppose you have one video encoded with H263 and another with the MPEG matrix. If you take an I-frame from the first video and put a P-frame from the second after it, the P-frame will not have the correct data because it expects an I-frame encoded with MPEG, but gets one with H263.
The fact that a frame is encoded with either H263 or MPEG makes it look completely different, and mismatched I-frames like that will corrupt all P- and B-frames referencing them.
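
A toy illustration of that mismatch in Python, not real codec behaviour: frames are four-pixel lists, there is no motion compensation, and the two decoded I-frames are invented values standing in for the same picture quantized with different matrices.

```python
# Toy 1-D "frames": four pixel values each.
source_frame2 = [101, 103, 99, 102]

# Hypothetical decoded I-frames: same source picture, different matrices,
# so slightly different reconstruction errors.
iframe_a = [100, 100, 100, 100]   # from video A (say, H263)
iframe_b = [98, 104, 96, 102]     # from video B (say, MPEG)

# Encoder B stores the P-frame as a residual against ITS OWN I-frame.
residual_b = [s - r for s, r in zip(source_frame2, iframe_b)]


def decode_p(reference, residual):
    """Rebuild a P-frame by adding the stored residual to the reference."""
    return [r + d for r, d in zip(reference, residual)]


correct = decode_p(iframe_b, residual_b)   # reference matches the encoder's
spliced = decode_p(iframe_a, residual_b)   # I-frame swapped in from video A
drift = [s - c for s, c in zip(spliced, correct)]
```

In this toy model the error in the spliced frame is exactly the difference between the two I-frames, and every later frame predicted from it would carry that error along.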

So it's simply not possible to cut&paste a video together to get the most efficient matrices. Not without decoding-reencoding anyway.
Old 28th May 2003, 20:12   #45  |  Link
Defiler
Asker of Questions
 
Join Date: Oct 2001
Location: Florida
Posts: 433
This sounds like a job for... KLUDGE!
http://sourceforge.net/projects/kludge
Old 29th May 2003, 12:31   #46  |  Link
crusty
Ahhh....Nuts!!
 
 
Join Date: May 2002
Location: Holland
Posts: 309
Quote:
No, you misunderstood me. What I meant was that a frame encoded with one matrix cannot serve as a reference for the following frame when that frame expects a reference frame encoded with some other matrix.
Ok, I see. But that doesn't change the workaround because you can still cut on keyframes, because they have no reference.
Old 29th May 2003, 13:15   #47  |  Link
mf
·
 
 
Join Date: Jan 2002
Posts: 1,729
Quote:
Originally posted by Acaila
The only way I can see your idea working is if, by analysing multiple videos, you could output a logfile that says which QM to use for each frame, based on which one results in the smallest frame. During encoding this logfile would then be used to encode the video with the appropriate QM.
But it would be faster to just calculate multiple matrices during encoding and pick the one that results in the smallest frame. That would have to be incorporated into a future codec someday, though.
Why smallest frame? If I have a matrix that destroys lots of detail, it will almost always create the smallest frames, and will "win" the decision over all the other matrices I might have. I'd say something like PSNR coupled to size, but then again, PSNR would probably always be related to size (since in this stage all you're doing is reducing coefficients, which will always kill detail and thus reduce PSNR), so you'd have to involve some psychovisual mathematics in it.
Old 29th May 2003, 13:31   #48  |  Link
Acaila
Retired
 
 
Join Date: Jan 2002
Location: Netherlands
Posts: 1,529
I followed the smallest frame scenario because that is what crusty originally mentioned when he brought that idea up. But you're right, something based on an optimal rate-distortion curve would probably be preferable. That way you try to achieve the highest PSNR possible while still keeping the same total bitrate as the original video.
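
One common way to phrase that trade-off is a Lagrangian cost, distortion plus lambda times bits, minimised per segment. The sketch below assumes distortion is measured as MSE (which PSNR is derived from); the function names, the (bits, mse) figures and the lambda values are all hypothetical.

```python
def pick(candidates, lam):
    """candidates: list of (bits, mse), one per matrix.

    Returns the index of the candidate minimising mse + lam * bits.
    """
    return min(range(len(candidates)),
               key=lambda i: candidates[i][1] + lam * candidates[i][0])


def choose_under_budget(segments, budget_bits, lams):
    """Try increasing lambda values until the total size fits the budget."""
    for lam in sorted(lams):
        choice = [pick(cands, lam) for cands in segments]
        total = sum(cands[i][0] for cands, i in zip(segments, choice))
        if total <= budget_bits:
            return choice, total
    return None, None


# Hypothetical (bits, mse) for two matrices on three segments.
segments = [
    [(9000, 4.0), (7000, 6.5)],
    [(5000, 3.0), (4600, 3.2)],
    [(8000, 5.0), (6000, 9.5)],
]
choice, total = choose_under_budget(
    segments, budget_bits=20000, lams=[0.0, 0.001, 0.002, 0.003])
```

With lambda at zero the sketch simply takes the lowest-distortion candidate everywhere; raising it gives up quality first where that saves the most bits.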
Old 29th May 2003, 14:29   #49  |  Link
mf
·
 
 
Join Date: Jan 2002
Posts: 1,729
Well OK, in the 2nd pass it's doable. But at constant quant a decision would be harder.
Old 29th May 2003, 15:39   #50  |  Link
Acaila
Retired
 
 
Join Date: Jan 2002
Location: Netherlands
Posts: 1,529
In the 2nd pass? We were talking about an external program to cut up videos that were already finished (which would always be suboptimal compared to encoder based optimization, but that's beside the point). Whether that video was created with 2-pass or 1-pass is no longer relevant at that point.
Old 29th May 2003, 17:14   #51  |  Link
crusty
Ahhh....Nuts!!
 
 
Join Date: May 2002
Location: Holland
Posts: 309
Of course it would be suboptimal. It would be relatively easy to implement too. I'm no programmer, but I'd say that one experienced person could probably program something like this in a few days.
That's a lot quicker than getting it through a CVS thread in an open-source project. And just because something isn't in the codec doesn't have to mean it's not good. Nandub SBC anyone?

You could indeed have two 1-pass files encoded at constant quantizer or constant quality and then have the tool merge the best parts based on different criteria. These could be PSNR, filesize, quantizer, PSNR/filesize ratio, etc., whatever you can think of basically.
You could even add intelligent filters that would base the decision on motion estimation or luminance variations, for instance to switch QMs intelligently between fast-motion and low-motion scenes.
You could also do this with 2-pass files. Or Nth-pass if you like DivX, except of course that you can't use different QMs with DivX.
Of course, the more files you have to merge from, the fewer mergeable parts there will be, because there will be fewer and fewer identical keyframes with every extra file.

You also do not necessarily have to go for the smallest filesize, but instead for a certain quantizer range or PSNR. Smallest filesize for the same quality or the best possible quality at a given filesize are however my greatest interests.

I have to think a bit about how this would affect the end result of pre-processing noise filters. Say for instance you used a different filter-set for two 2-pass files.
You could devise a way to take the best parts of both files, those parts with the fewest artifacts, and still use heavy filtering.
One point I often find is that you cannot set filters at the strength you would like the most without introducing horrible artifacts in some scenes in your movie. That's why I always encode end credits separately, even if it's just one minute, because you can use much heavier filtering on end credits.
You could take filtered and unfiltered files and simply take the best parts of both and merge them, if your decision algorithm is intelligent enough.
Maybe this is not possible, but maybe it is. A decision like that would probably not be based on raw PSNR, because a filtered file would probably have a lower PSNR. Maybe you could add a PSNR offset to the filtered file and simply switch whenever the filtered file has a better PSNR (adjusted with the offset) than the unfiltered file... everybody still getting this??
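
In Python, the offset rule could look something like this. It is only a sketch: it assumes both encodes have a per-segment PSNR measured against the same source, and the function name, the PSNR figures and the 1.0 dB offset are invented.

```python
def pick_source(psnr_unfiltered, psnr_filtered, offset_db):
    """Per segment, take the filtered encode when its PSNR plus a
    handicap (offset_db) beats the unfiltered encode's PSNR."""
    choices = []
    for plain, filt in zip(psnr_unfiltered, psnr_filtered):
        if filt + offset_db > plain:
            choices.append("filtered")
        else:
            choices.append("unfiltered")
    return choices


unfiltered = [44.0, 41.5, 38.0]   # dB per segment, made-up values
filtered = [43.2, 40.0, 37.6]
choices = pick_source(unfiltered, filtered, offset_db=1.0)
```

Picking a sensible offset is the hard part; too large and the filtered file always wins, too small and it never does.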

You can even take this one step further and make it a kind of general multi-file merge tool, with plugin modules that add different merge-decision-algorithms into the tool. And in the end you could even try to merge this tool itself with VirtualDubMod.
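
A minimal sketch of the plug-in idea in Python, where each merge-decision algorithm is just a function handed to the tool. Everything here is hypothetical: the file names, the stats layout and the two example criteria (smallest size, PSNR per byte).

```python
def merge_plan(segment_stats, decide):
    """segment_stats: one dict per segment, mapping file name -> stats dict.
    decide: plug-in function that takes that mapping and returns a file name.
    """
    return [decide(stats) for stats in segment_stats]


def smallest_size(stats):
    """Plug-in: pick the file with the fewest bytes for this segment."""
    return min(stats, key=lambda name: stats[name]["bytes"])


def best_psnr_per_byte(stats):
    """Plug-in: pick the file with the highest PSNR/filesize ratio."""
    return max(stats, key=lambda name: stats[name]["psnr"] / stats[name]["bytes"])


stats = [
    {"a.avi": {"bytes": 90000, "psnr": 44.0},
     "b.avi": {"bytes": 70000, "psnr": 41.0}},
    {"a.avi": {"bytes": 50000, "psnr": 42.0},
     "b.avi": {"bytes": 52000, "psnr": 45.0}},
]
plan_size = merge_plan(stats, smallest_size)
plan_ratio = merge_plan(stats, best_psnr_per_byte)
```

The two plug-ins disagree on the second segment, which is the point: the tool stays the same and only the decision function changes.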
Old 29th May 2003, 17:45   #52  |  Link
JimiK
just me
 
Join Date: Sep 2002
Posts: 158
@Acaila
Did you compare the quality of the clips when you confirmed Didée's test? Because as we all know, higher PSNR does not necessarily mean better visual quality (that's why a test by the German computer magazine c't used another method called JNI, which is supposed to take the human visual system into account). Of course I know that Didée has been here for a long time and I don't doubt that this matrix is great, but it just makes me wonder why.
@all
Some thoughts about different matrices:
1. you would no longer be able to use bframes (as you might know, Modulated and bframes don't mix). Do you really think you could achieve the same quality without bframes, by just using better fitting matrices? (I mean quality at low bitrates).
2. you would have to store all these matrices (the decoder has to know the matrix the encoder used). Then you would have to set a flag saying which matrix is used. I don't know how modulated is implemented. But in the best-case scenario, you would have to set a flag every time the matrix changes. In the worst case, you would need a flag for every frame that tells the decoder which matrix to use (well, the real worst case would be if the matrix were not stored at a single point in the file, but had to be repeated every time it changes. But that is highly unlikely). Even in the best case, you would waste some bits. Say you have up to 256 different matrices. The flag would need 8 bits = 1 byte to say which matrix to use.
@crusty
You don't have to encode your credits separately. Just use the trim command in avisynth and filter this section more than the other. Then you can also use the xvid credits section and compress this part more than the others.
Best regards,
JimiK
Old 29th May 2003, 18:18   #53  |  Link
crusty
Ahhh....Nuts!!
 
 
Join Date: May 2002
Location: Holland
Posts: 309
Quote:
Some thoughts about different matrices: 1. you would no longer be able to use bframes (as you might know, Modulated and bframes don't mix). Do you really think you could achieve the same quality without bframes, by just using better fitting matrices? (I mean quality at low bitrates).
You're both right and wrong here.
Modulated and B-frames don't mix because, in modulated mode, the Xvid codec doesn't restrict QM changes to keyframes. (This should be a quick fix for modulated, BTW.)
Xvid takes h263 when the average quantizer is under one value and mpeg when it is over one value. It doesn't do this just at keyframes but whenever it feels like doing that. So altering the QM at a B or P frame would break predictability in the codec. Ergo, no modulated QM with B-frames.
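
To make the difference concrete, here is a small Python sketch of the behaviour as described in this post. It is not Xvid's actual code: the threshold of 4 is invented, and the h263-below / mpeg-above direction simply follows the description above.

```python
def matrix_for(avg_quant, threshold):
    """H263 below the threshold, MPEG at or above it (as described above)."""
    return "h263" if avg_quant < threshold else "mpeg"


def modulate(frames, threshold, keyframes_only):
    """frames: list of (frame_type, avg_quant). Returns the matrix per frame.

    With keyframes_only=True the matrix may only change on an I-frame.
    """
    current = None
    used = []
    for frame_type, avg_quant in frames:
        if current is None or not keyframes_only or frame_type == "I":
            current = matrix_for(avg_quant, threshold)
        used.append(current)
    return used


frames = [("I", 3), ("P", 5), ("B", 5), ("P", 3), ("I", 6), ("P", 2)]
anywhere = modulate(frames, threshold=4, keyframes_only=False)
at_keyframes = modulate(frames, threshold=4, keyframes_only=True)
```

In the first result the matrix flips in the middle of an I-B-P sequence; in the second every frame shares the matrix of the keyframe that starts its sequence.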

But the method I mentioned only alters the QM in complete I-B-P frame sequences, so there should be no problem.
So yes, you could still use B-frames. You could even alter all the B-frame settings (ratio, offset and threshold) for every separate clip because those are only ENcoding settings and do not matter when DEcoding.

@point 2:
Good question. I have NO idea how big a QM actually is inside a mpeg4 stream. Anyone got the answer?
Quote:
Say you have up to 256 different matrices
A bit overkill probably, I doubt anyone would ever use more than 8 matrices.

I encode my end credits using separate avs files and the trim function. And I do filter them much more heavily and I DO use the lowest bitrate that doesn't introduce too many artifacts. That's what I said.
Old 29th May 2003, 19:09   #54  |  Link
Acaila
Retired
 
 
Join Date: Jan 2002
Location: Netherlands
Posts: 1,529
Quote:
Originally posted by JimiK
Did you compare the quality of the clips when you confirmed Didée's test? Because as we all know, higher PSNR does not necessarily mean better visual quality (that's why a test by the German computer magazine c't used another method called JNI, which is supposed to take the human visual system into account). Of course I know that Didée has been here for a long time and I don't doubt that this matrix is great, but it just makes me wonder why.
I quickly scanned through them and couldn't see any differences, however I doubt anyone could see the difference between e.g. a PSNR of 45 and a PSNR of 45.5 though (yes I know it's possible, but I couldn't do it). I did my tests at a compressibility of about 70%, so either will look great. And when both videos look great, I tend to label the one with the highest PSNR as the best.

Quote:
Originally posted by crusty
Good question. I have NO idea how big a QM actually is inside a mpeg4 stream. Anyone got the answer?
All I know is that QM values are 8-bit, so those 64 values would take up 64 bytes. I don't know if they have to be repeated at every QM change, or can be set with a flag with all matrices written out in a header somewhere.
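
The arithmetic for the two storage scenarios can be sketched in Python. This only uses the figures from this thread (64 values of 8 bits each, a flag wide enough to index the matrices); how the bitstream really stores matrices is left open here, and the function names are made up.

```python
COEFFS_PER_MATRIX = 64   # one 8x8 block
BITS_PER_COEFF = 8       # "QM values are 8-bit"


def matrix_bytes():
    """Size of one matrix in bytes."""
    return COEFFS_PER_MATRIX * BITS_PER_COEFF // 8


def index_bits(n_matrices):
    """Bits needed for a flag that selects one of n_matrices."""
    return max(1, (n_matrices - 1).bit_length())


def overhead_bytes(n_changes, n_matrices, stored_once):
    """Rough overhead for n_changes matrix switches.

    stored_once=True: all matrices written once, then one flag per change.
    stored_once=False: the full matrix is repeated at every change.
    """
    if stored_once:
        bits = n_matrices * matrix_bytes() * 8 + n_changes * index_bits(n_matrices)
        return (bits + 7) // 8   # round up to whole bytes
    return n_changes * matrix_bytes()


flag_case = overhead_bytes(100, 8, stored_once=True)
repeat_case = overhead_bytes(100, 8, stored_once=False)
```

For 100 switches between 8 matrices that comes to 550 bytes with flags versus 6400 bytes with repeated matrices, so either way the cost is small next to a full movie.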
Old 30th May 2003, 00:32   #55  |  Link
crusty
Ahhh....Nuts!!
 
 
Join Date: May 2002
Location: Holland
Posts: 309
Quote:
I quickly scanned through them and couldn't see any differences, however I doubt anyone could see the difference between e.g. a PSNR of 45 and a PSNR of 45.5 though (yes I know it's possible, but I couldn't do it).
That's probably just a matter of training.
After ripping 400 Audio CD's I could differentiate easily between a 192 kbps mp3 and the original WAV...hell I could even hear the difference between CD's and records by then!
And I found lately when doing a lot of difficult audio syncing (the Enterprise series, from mpeg to xvid) that I could 'feel' the difference between 10 and 20 ms of delay. That's 1/100th of a second.
I guess that with enough training people would probably be able to see PSNR drops like you mentioned.
Quote:
All I know is that QM values are 8-bit, so those 64 values would take up 64 bytes. I don't know if they have to be repeated at every QM change, or can be set with a flag with all matrices written out in a header somewhere.
Well that would mean that at every QM change 64 bytes extra are used, because I doubt that this would be implemented by flags.
Old 30th May 2003, 06:23   #56  |  Link
mf
·
 
 
Join Date: Jan 2002
Posts: 1,729
Actually a .5 PSNR increase is quite a lot. It's noticeable on a full movie.
Old 30th May 2003, 21:44   #57  |  Link
OUTPinged_
MooPolice 1st division
 
 
Join Date: Dec 2001
Location: VIlnius,LT
Posts: 448
That's strange. I am getting pretty bad results with the provided matrices on a test clip.

It is actually a "torture" one; the codec is expected to provide a good-looking picture with a ~30% 1/2_pass_ratio. Koepi's latest build is used, vhq4+bframes+chroma etc etc.

The new matrices were working in a quantizer range of 4-10 while h263 was using the 2-6 range. h263 was looking pretty good everywhere, while the custom matrices looked like shit on hi-motion scenes where quantizers jumped up to 8 and beyond.

The interesting thing was that if I used a custom matrix with a minimum value of 16 (increasing to 32 the same way Acaila's does), it looked OK again. The quantizer range used was 3-7.

So I picked a frame which was referred by KF and had practically the same size for all matrices.

It had: quant 6 for h263, quant 7 for the "16-32" matrix and quant 9 for Acaila's (I was using it since they both give pretty much the same results and incompatibility with ffdshow is a big no-no).

Here are images that show this very well:
Code:
"http://www.mif.vu.lt/~dmku1330/lowkoeffbug-16-32 matrix at quant7.jpg"
"http://www.mif.vu.lt/~dmku1330/lowkoeffbug-akaila's matrix at quant9.jpg"
"http://www.mif.vu.lt/~dmku1330/lowkoeffbug-h263 matrix at quant6.jpg"
Notice how the 16-32 and h263 matrices look pretty much the same, while "akaila" shows up blocky.

Screens were taken in vdub.

Does that occur everywhere higher quantizer ranges are used? (Literally, am I supposed to get crap quality with those low-coefficient matrices for low 1/2 pass ratios?) Maybe there is a bug someplace I missed?


Also, here is a small OT question: I thought that h263 was equivalent to an "all 16" matrix, but 16s are giving me a higher compression ratio. Any ideas which matrix will represent h263 most closely?
__________________
___________________MooPolice is watching you!____.o/________
Old 30th May 2003, 21:46   #58  |  Link
OUTPinged_
MooPolice 1st division
 
 
Join Date: Dec 2001
Location: VIlnius,LT
Posts: 448
@Acaila

I believe the QM is stored inside each intra frame.
Old 31st May 2003, 08:05   #59  |  Link
Acaila
Retired
 
 
Join Date: Jan 2002
Location: Netherlands
Posts: 1,529
I don't know why these custom matrices performed badly in your high compression tests. The only reason I can think of is that they quantize low coefficients roughly the same way as high coefficients, whereas the standard matrices quantize the high coefficients much more strongly than the low ones. Once you get to high quants (>6) this could have a great effect on the image detail because you'll start losing a lot of low coefficients as well. But that's just a guess.
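
A rough Python sketch of that guess. It uses a simplified model in which the quantization step is matrix value times quantizer divided by 8, and ignores rounding offsets, dead zones and special DC handling. The coefficient values and both matrices are invented and are not the matrices tested in this thread.

```python
def quantize(coeffs, matrix, quant):
    """Simplified quantizer: step size = matrix value * quant / 8.

    int() truncates toward zero, so small coefficients drop to 0.
    """
    return [int(c * 8 / (w * quant)) for c, w in zip(coeffs, matrix)]


# Hypothetical coefficients of one block, ordered low to high frequency.
coeffs = [120, 40, 12, 6]

flat = [16, 16, 16, 16]      # every frequency treated alike
sloped = [8, 16, 32, 64]     # high frequencies quantized harder

flat_q2 = quantize(coeffs, flat, 2)
flat_q9 = quantize(coeffs, flat, 9)
sloped_q4 = quantize(coeffs, sloped, 4)
```

In this model the flat matrix at quant 9 and the sloped matrix at quant 4 both zero the two high-frequency values, but the flat one also cuts the lowest-frequency level from 30 to 6, while the sloped one leaves it at 30.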

One thing I did notice is that the picture from the frame encoded with my matrix looked a lot clearer (less blurry) than the other two. Was the rest of the video also like that or was it just that one frame?
Old 31st May 2003, 16:44   #60  |  Link
crusty
Ahhh....Nuts!!
 
 
Join Date: May 2002
Location: Holland
Posts: 309
Quote:
I believe the QM is stored inside each intra frame.
Damn, I keep forgetting the difference between inter and intra frame
That's why I talk about Key-frames and B/P-frames, cause that's a lot clearer.

Do you mean at every Key-frame?

Quote:
One thing I did notice is that the picture from the frame encoded with my matrix looked a lot clearer (less blurry) than the other two.
Yeah I noticed that too. It also looks like most of this particular frame is ok but there is distortion in part of the picture, which gets amplified by Acaila's QM.
Was this distortion also present in the original? Please check.

Maybe if you could post a small clip (<250 frames) we could take a look at it.

EDIT: is it me or has this forum slowed down a bit the last few weeks?

Last edited by crusty; 1st June 2003 at 19:42.