29th April 2007, 21:17 | #1 | Link |
Registered User
Join Date: Apr 2007
Posts: 61
The quest for true constant quality with x264
There has been lots of threads about these things, and I think I read through the most relevant ones. They still didn't give me the answers I wanted.
However, I feel this topic is important, since I think constant perceived quality is what people usually really want, since there really is no reason to go to higher bitrates if you can achieve acceptable quality on a lower bitrate. (Unless you want to fit exactly one file per CD etc.) First I will list some bits of information I have gathered and on which I am basing my assumptions and questions. I may have lots of misinformation and wrond ideas, so if I'm wrong about them, I hope someone could fix my flawed facts. For a long time I was wondering if 2pass is really better than "Constant Quality" of the same bitrate. I mean logically it shouldn't be: it tries to achieve constant perceived quality, but 2pass does exactly the same. Only difference is that 2pass calculates the level of quality that gives the wanted filesize. (Approximately.) So if 2pass really is better than CFR, it must mean 2pass's definition of "constant quality" is different from CFR's definition. I really can't understand why 2pass encoding is inherently so much better than one pass. The only reason I've gotten for it is because in 2pass the 1st pass makes a stats file for later use in 2nd pass. On the other hand one pass constant quality can't predict what are the properties of the frames ahead of the frame currently worked on. This gives 2pass only one advantage: If there is a frame ahead that needs more quantizer and that frame is using the current frame as a reference, the current one should be given more too. I don't want to use the expression "more complex frame" since I'm still not quite convinced if complex frames need more quantizier or not. If there is some other reason, PLEASE tell me. All I've seen is stuff like "2-pass mode is superior given an equal file size..". A few times even something like "I tried both, and 2pass was slightly better." Never the reason. (Other than the one I already mentioned.) 
I'd REALLY like to know what this magical quality is that makes 2-pass encoding "superior". I'm under the impression that as the bitrate goes up, CRF comes very close to 2-pass, and I guess it wouldn't be too much of a quality loss if I just did CRF. But being the perfectionist I am, I want the best possible quality while still maintaining the philosophy of constant quality across all encodes (of one series, for example). The way I see it there are four ways to solve this:

1) A first pass in CRF mode with a stats file, then somehow figure out a good bitrate from that and do a second pass at that bitrate. I still have pretty much no idea how to get a good bitrate estimate from the stats file, so if someone could tell me, it would be nice, as this seems to be the most feasible way at the moment.

2) A normal one-pass CRF encode outputting a stats file, then take the final bitrate of the resulting video file and do a normal second pass at that bitrate. This is the only one I really know how to do with reasonable accuracy, but it seems like a lot of wasted CPU time.

Now we move into the hypothetical methods.

3) A first pass in constant quality mode with a stats file, then a second pass also in constant quality mode. I read that "crf inherently doesn't work in 2pass", but is there a reason for that? When a normal second pass is given a stats file and a bitrate, it somehow magically decides on a quality everything is going to be encoded at. Is it really impossible to give that magical quality factor as an argument to x264?

4) This suggestion makes even more assumptions, mainly that I'm right about what makes 2-pass encoding better than one pass; specifically, that there is really no reason to know anything about frames that aren't near the one currently in progress. (If there is, please tell me why.) Given that, would it be completely impossible to calculate stats for just the few frames ahead?
There might be reference chains (frames referencing frames that in turn reference other frames, and so on), but I'm not sure how far that can actually reach, and even if a chain could carry from the first frame to the last, a reasonably sized buffer should take care of it. Now, this isn't too different from #3, so why go through the extra difficulty? Well, there are situations where rendering the video is really expensive, like crazy filters in Avisynth or some other frameserver, or you might be encoding in real time. So why would it be impossible to render some frames in advance, store them in memory, calculate stats for those, and then encode the current frame (which is many frames before the one rendered last) using the given CMF? (Constant Magic Factor)
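Method 2 above hinges on one small calculation: reading the average bitrate off the finished CRF encode so it can be fed to a normal second pass. A minimal sketch of that step (the file size and duration here are made-up numbers, not from any real encode):

```python
import os

def average_bitrate_kbps(size_bytes: int, duration_seconds: float) -> float:
    """Average bitrate of a finished encode in kbit/s, the unit
    x264's --bitrate option expects."""
    return size_bytes * 8 / duration_seconds / 1000.0

# For a real file you'd pass os.path.getsize("episode.mkv").
# Hypothetical 25-minute episode that came out at 175 MiB:
print(round(average_bitrate_kbps(175 * 1024 * 1024, 25 * 60), 1))  # 978.7
```

That figure then goes straight into the second pass as the target bitrate.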
29th April 2007, 22:26 | #2 | Link |
x264 developer
Join Date: Sep 2004
Posts: 2,392
|
Quote:
* --direct=auto doesn't work very well in 1pass. In particular, 2pass direct=auto can choose the mode of each B-frame based on that frame's own stats, while 1pass direct=auto chooses the mode of each B-frame based on the previous frames' stats. This could be solved by re-encoding the B-frame if its optimal mode differs from the guess. This would be somewhat faster than 2pass, though still many frames would be encoded twice. This also has some issues when combined with pyramid + threads.

* The 2pass "complexity" metric is bits, while the 1pass "complexity" metric is based on SATD. This may or may not be an improvement at all.

The combined effect of the above reasons is somewhere between 0 and 0.1 dB PSNR. So you might be better off just spending the extra CPU time on slightly slower 1pass settings.

Quote:
Though that doesn't mean it can't be done. After all, CRF doesn't inherently have anything to do with QP either; the values of --crf are only similar to the QP they produce because I tuned a formula to translate the value from --crf to the internal rate_factor. So I suppose I could write a similar translation for the 2nd pass, and a given --crf in 1pass would then usually produce a similar bitrate to the same value of --crf in 2pass. Or I could just store the SATD values used in 1pass CRF in the 2pass stats file and use them; then 2pass CRF would be identical to 1pass CRF except for the points specified above.

Quote:
Actually, "called" is a bit strong. It's stored in a variable named "rate_factor", hence the name "constant rate factor".

Last edited by akupenguin; 29th April 2007 at 22:35.
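To give a feel for the translation being described: the QP-to-quantizer-scale mapping in H.264-style ratecontrol is exponential (every 6 QP steps doubles the quantizer scale), and a CRF-style rate factor divides a damped per-frame complexity by a constant so that a frame at "baseline" complexity gets exactly the quantizer the CRF value suggests. The sketch below mimics that shape; the 0.85 constant, the baseline complexity, and the qcompress exponent are illustrative assumptions about the internals, not a quote of x264's source:

```python
import math

def qp2qscale(qp: float) -> float:
    # Exponential mapping: +6 QP doubles the quantizer scale.
    return 0.85 * 2 ** ((qp - 12) / 6)

def qscale2qp(qscale: float) -> float:
    return 12 + 6 * math.log2(qscale / 0.85)

def crf_qscale(crf: float, blurred_cplx: float, base_cplx: float,
               qcompress: float = 0.6) -> float:
    """Per-frame qscale for a CRF-style constant rate factor: a frame
    at the baseline complexity gets exactly qp2qscale(crf); a more
    complex frame gets a higher qscale, damped by qcompress."""
    rate_factor = base_cplx ** (1 - qcompress) / qp2qscale(crf)
    return blurred_cplx ** (1 - qcompress) / rate_factor

# At baseline complexity, CRF 20 behaves like QP 20:
print(round(qscale2qp(crf_qscale(20, 8000.0, 8000.0)), 3))  # 20.0
```

With qcompress = 1 every frame would get the same qscale (constant quantizer); with qcompress = 0 complexity would be fully compensated (constant bitrate-ish); values in between are the "constant quality" compromise the thread is about.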
30th April 2007, 03:27 | #3 | Link |
Registered User
Join Date: Apr 2007
Posts: 61
|
First, thanks for answering, it really cleared some suspicions and gave me hope. Also, congrats on the 1337th post. ;P
Quote:
Quote:
Quote:
However, even though it's the same quality, it's a lot faster... Let's not forget expensive rendering either, I have an Avisynth script I've used that renders at the rate of 3fps. And not only that... It would also help a lot in trying to figure out the best encoding settings... Quote:
As I said, what I was after in this thread was a way to create "true" predefined constant quality. The suggestions/possibilities you gave look very promising indeed, and I guess all I could hope for was that my meditations on this could be proven to be possible/correct. Still, if suggestion #4, or at least #3, ever became part of the actual implementation, it would be the best thing since the DCT. I feel it wouldn't be too hard to implement, since you already have the 2nd pass code ready with all the bells and whistles for those things, and the only things needed are the two I mentioned.
30th April 2007, 03:42 | #4 | Link |
x264 developer
Join Date: Sep 2004
Posts: 2,392
|
Quote:
Quote:
Quote:
Quote:
Last edited by akupenguin; 30th April 2007 at 16:53.
30th April 2007, 05:58 | #5 | Link |
Registered User
Join Date: Apr 2007
Posts: 61
|
Quote:
http://www.undercut.org/Nandub_OnePass/ No idea how well it works though; I didn't look into it too much. Quote:
That was never the point, though. The point wasn't really how to fix MY next encode; it was that I wondered how it could be fixed altogether, for everyone. With just the 20-frame buffer thing and one-pass encoding, I don't really see a reason for decoding to Huffyuv anymore. Quote:
Thank you for answering all these questions. I really hope you got at least some ideas from my ramblings.
30th April 2007, 15:47 | #6 | Link |
WiLD CaRD
Join Date: Mar 2007
Location: Toronto Canada
Posts: 258
|
Fascinating topic if I may jump in.
This debate could go on forever, because it doesn't really come down to quantitative metrics such as PSNR to settle a fair comparison between the two, especially when the measured difference is almost negligible anyway. Other than 1-pass's advantages being less CPU time and a rather accurate quality forecast, and 2-pass's being a very predictable file size and (maybe) optimal quality per file size, there may not be a definite quality advantage to either, especially with x264.

A believer in 2-pass can argue that the bits are being allocated to the areas of the clip that need them most. However, by the same logic, 1-pass has more margin to delegate bits to areas that might be deprived in 2-pass (assuming the same amount of bitrate). So what would the decision criteria be between these schemes in data distribution?

Akupenguin mentioned the older codecs, DivX, Xvid, etc., having a definite advantage with a second pass. Very true. However, using that as a model, wouldn't some of the theory apply to x264 as well? Please let me explain. With those older codecs, the difference was most noticeable in scenes of higher motion. One pass was obviously short-sighted and would allocate bits on a frame-by-frame basis: every frame, without knowledge of the next, would be given its "due". With the advantage of foresight, a 2-pass scheme could better allocate data to the higher-motion scenes in the video, something a frame-by-frame encoding scheme could not recognize. This would especially matter when such activity lasts for hundreds of frames at a time and 1-pass cannot predict it in advance. Even a 20-frame head start can be a significant advantage in motion scenes.
Qualitatively speaking, given this hypothesis, the result was video with a wider spread of visual quality among the more stationary frames in 1-pass (maybe even the majority of the frames in total), while 2-pass had far less blur in the scenes with more movement. I do understand that x264 is better designed and its algorithms reduce this variance, but wouldn't this play a role as well? Forgive me if I'm wrong, but I can't help thinking that a 2-pass scheme, even with x264, may have an edge, albeit small, in higher-motion scenes. Thanks for reading.

Last edited by PuzZLeR; 5th May 2007 at 18:26.
30th April 2007, 16:15 | #7 | Link |
x264 developer
Join Date: Sep 2004
Posts: 2,392
|
Quantitative metrics such as PSNR are sufficient for comparing 1pass to 2pass, because they have exactly the same target bit distribution and differ only in how close they get to that target. PSNR can't compare macroscopically different bit distributions, but those aren't the issue here. I'm not saying x264's 2pass is the optimal distribution, only that you can get the same distribution in 1pass.
The reason xvid, divx, etc have a definite advantage with 2pass is that they have no unrestricted 1pass VBR (aside from CQP, which is suboptimal for other reasons). The difference between 1pass ABR and 2pass is most visible in high motion because ABR doesn't give each frame its due; it limits the bitrate.

1pass doesn't have to be short-sighted. A small lookahead buffer is sufficient to get all the benefit of 2pass bitrate distribution, because the inter-prediction dependencies between frames (and psy motion masking, if your ratecontrol uses that) are limited to a reasonably small radius of effect. Even if you don't believe my estimate of 20 frames for x264, inter prediction is definitely limited to 1 GOP, and psy effects can't possibly last more than a few seconds. If even a 20-frame head start helps, then that's evidence that you don't need 2pass, because you can get a 20-frame head start in 1pass just by spending a little extra RAM.

Last edited by akupenguin; 30th April 2007 at 16:17.
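The "small radius of effect" argument can be made concrete with a toy model (this is not x264 code, just an illustration): suppose each frame's importance is boosted by how many later frames reference it, and references never reach farther than some fixed distance. Then a lookahead window at least as long as that distance computes exactly the same importance as a full second pass, because no dependency extends past the window:

```python
def ref_weights(n, ref_dist, horizon=None):
    """Toy per-frame 'importance': 1 plus a bonus for each later frame
    (within ref_dist) that references this one. horizon=None means full
    2-pass knowledge; otherwise only the next `horizon` frames are seen."""
    w = []
    for i in range(n):
        last = n if horizon is None else min(n, i + 1 + horizon)
        deps = sum(1 for j in range(i + 1, last) if j - i <= ref_dist)
        w.append(1 + 0.1 * deps)
    return w

full = ref_weights(200, ref_dist=16)              # 2-pass: sees everything
look = ref_weights(200, ref_dist=16, horizon=20)  # 1-pass + 20-frame lookahead
print(full == look)  # True: a window >= the reference radius loses nothing
```

A window shorter than the reference radius (say 8 frames here) would differ, which is exactly why the buffer only needs to cover the GOP/psy radius, not the whole clip.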
30th April 2007, 16:35 | #8 | Link |
Registered User
Join Date: Apr 2007
Posts: 61
|
Quote:
Quote:
Quote:
Quote:
Quote:
2pass encoding doesn't really look farther than 20 frames either. Why should it? How on earth would a frame 100 frames away affect the current one? And even if it did, how would it matter for 1pass vs 2pass if 2pass doesn't care either? It doesn't. Quote:
Last edited by Kuukunen; 30th April 2007 at 16:36. Reason: aku finished before I did, oh well...
30th April 2007, 16:40 | #9 | Link |
Registered User
Join Date: Apr 2002
Location: Germany
Posts: 4,926
|
Yep, indeed the concept of 2-pass is outdated. I do most of my encodes in ABR (especially the high-bitrate HDTV ones, since H.264), and the difference between an x264 ABR encode with more tools enabled and a 2-pass encode with fewer tools is almost non-existent, while both need the same time. Also, I have never yet managed to overflow with H.264, if you can even call +-5 kB an overflow/underflow. (That happens from source to source more often with XviD and others, and there the overflow is usually in the MB range.)

Lookahead would be perfect, so 2-pass could finally retire once and for all :P
Last edited by CruNcher; 30th April 2007 at 16:52.
30th April 2007, 16:56 | #10 | Link |
x264 developer
Join Date: Sep 2004
Posts: 2,392
|
x264's ABR does still have some disadvantages. Even if the extra tools you can enable with the time saved make up for the worse bit distribution, a compressibility test followed by CRF will be better yet. i.e. the optimal method when you do have a filesize constraint is 2pass but with a sparsely sampled 1st pass. I would integrate that into x264 if it weren't so dependent on avisynth.
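The "compressibility test followed by CRF" idea could be sketched as: encode a sparse sample of the source at some CRF, scale the sampled size up to the full length, and adjust the CRF until the projection hits the target file size. The code below is a hypothetical illustration of that loop (the exponential cost model especially is a made-up stand-in for an actual test encode), not anything x264 or a frontend actually contains:

```python
def project_size(sample_bits, sample_frames, total_frames):
    """Scale the size of a sparsely sampled test encode to the full clip."""
    return sample_bits * total_frames / sample_frames

def pick_crf(sample_bits_at, target_bits, lo=10.0, hi=40.0, iters=40):
    """Binary-search the CRF whose projected size hits the target, given a
    callable sample_bits_at(crf) that is monotonically decreasing in CRF."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sample_bits_at(mid) > target_bits:
            lo = mid   # projection too big -> raise CRF
        else:
            hi = mid   # projection small enough -> try a lower CRF
    return (lo + hi) / 2

# Hypothetical cost model: +6 CRF halves the size (exponential, like QP).
model = lambda crf: 8e9 * 0.5 ** (crf / 6)
crf = pick_crf(model, target_bits=1e9)
print(round(crf, 2))  # 18.0, since 8e9 * 0.5**(18/6) == 1e9
```

In practice `sample_bits_at` would run a quick encode of, say, every 50th GOP, which is where the dependence on a frameserver like avisynth comes in.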
30th April 2007, 17:49 | #11 | Link |
WiLD CaRD
Join Date: Mar 2007
Location: Toronto Canada
Posts: 258
|
Old ASP habits do die hard don’t they?
In my experiments with x264, I admit I have failed to notice any meaningful quality advantage for either 1-pass or 2-pass (at the same file size). I only looked for one because of my experience with DivX, where it was obvious. It appears the 2-pass mindset is a thing of the past, especially given the abilities of the H.264 standard.

In fact, even with x264, I would first encode in 1-pass and then feed that bitrate into 2-pass. Sometimes, to alleviate the long H.264 encode times, I would even encode with the DivX quantizers and use the resulting bitrate as an index for a proportionally equal bitrate in a 2-pass x264 encode. I am grateful to hear from this thread that this is more a waste of time than productive. Even if I did get any added quality from 2-pass, it would be so minute as not to be worth the extra couple of steps.

2-pass is king when you need to fit a certain amount of video into a given amount of storage, and I will continue to use it as such. However, after reading this thread, when a certain "fit" is not important and bitrate calculators are not necessary, I'm going 1-pass all the way. Thank you to all.
30th April 2007, 18:14 | #12 | Link |
Registered User
Join Date: Apr 2007
Posts: 464
|
Quote:
I'd essentially tried only 2-pass, following the general wisdom that "it's the best", but all my testing points to the fact that I'd get 200% of the encode time for less than a 1% quality increase (SSIM) and visually identical content. I'd like to know if/how you managed to automate that in Windows: I don't think x264 sets environment variables with the output parameters, at least in win32. That said, for 99% of my encodes I don't even specify a bitrate; I let x264 go with the "quality" settings I give it.
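On automating it in Windows: x264 doesn't export its results as environment variables, but builds of this era print a final summary line on stderr that can be scraped from a batch wrapper. A hedged sketch (the exact wording of the summary line may differ between builds, so verify the pattern against your own build's output):

```python
import re

# x264 ends its stderr output with a line shaped roughly like:
#   encoded 615 frames, 21.83 fps, 1042.92 kb/s
SUMMARY = re.compile(
    r"encoded\s+(\d+)\s+frames,\s+([\d.]+)\s+fps,\s+([\d.]+)\s+kb/s")

def final_bitrate_kbps(stderr_text: str) -> float:
    """Pull the final average bitrate out of captured x264 stderr."""
    m = SUMMARY.search(stderr_text)
    if not m:
        raise ValueError("no x264 summary line found")
    return float(m.group(3))

print(final_bitrate_kbps("encoded 615 frames, 21.83 fps, 1042.92 kb/s"))
# 1042.92
```

That number can then be passed to the 2-pass run as --bitrate without any manual step.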
30th April 2007, 18:35 | #13 | Link |
x264 developer
Join Date: Sep 2004
Posts: 2,392
|
Quote:
Quote:
Last edited by akupenguin; 30th April 2007 at 18:37.
30th April 2007, 19:58 | #15 | Link |
WiLD CaRD
Join Date: Mar 2007
Location: Toronto Canada
Posts: 258
|
Quote:
Actually, regarding HandBrake, a GUI for x264, you may find this thread in their forums interesting. They are attempting to accomplish just that: an automated process that combines the "benefits" of 1-pass with 2-pass: http://handbrake.m0k.org/forum/viewtopic.php?t=287

It does make "philosophical" sense to believe that going over the content a second time gives better results, as I still somewhat do. However, hypothetically speaking, even if a 1% improvement is the result, as an analogy, I now realize it would be foolish to devote 300%, 283%, 200% or even 183% more effort at my job for a 1% pay increase, even if I'm doing it while asleep.
1st May 2007, 06:25 | #16 | Link |
Registered User
Join Date: Apr 2007
Posts: 142
|
Why not do a 2-pass encode with x264? I do a lot of 1080p/1080i to 720p conversions. For a 43-minute episode of Firefly, using megui's HD-Slowest profile, the first pass takes an hour and the second pass takes 12. In my case, I might as well do both, since the extra time added is negligible and the quality is better.
1st May 2007, 08:25 | #17 | Link |
Registered User
Join Date: Apr 2007
Posts: 61
|
Quote:
Quote:
Quote:
On the 1% quality increase: usually analogies suck, and so does this one. :p When you work, you get money, but money can only be spent once; when you encode, the final product might be watched many times. And my last 2-pass encode didn't take 183% of the 1-pass time: the first pass ran at 3.2 fps and the second at 0.71 fps. (I'm not sure, but I would think the second pass is pretty close to a one-pass encode in speed.) Also, if I could get 1% more money for work I do while I sleep, of course I'd take it. And finally, it's usually not about the amount of quality increase but about the sizes of files that look the same. I don't know how big that difference is with current x264 CRF vs. 2-pass. (But I've understood it has a lot to do with the bitrate and/or the level of quality.) Quote:
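For what it's worth, those pass speeds make the real overhead easy to compute: if the first pass runs much faster than the second, 2-pass costs far less than "double the time". A quick check with the figures quoted above (3.2 fps and 0.71 fps):

```python
def twopass_overhead(fps_pass1: float, fps_pass2: float) -> float:
    """Extra time of 2-pass relative to a single pass running at the
    second pass's speed. Algebraically this reduces to fps2/fps1."""
    single = 1 / fps_pass2
    both = 1 / fps_pass1 + 1 / fps_pass2
    return both / single - 1

print(round(100 * twopass_overhead(3.2, 0.71)))  # 22 (percent)
```

So under the assumption that the second pass runs at roughly one-pass speed, the first pass adds about 22% here, nowhere near the 183-200% figures quoted earlier in the thread.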
1st May 2007, 13:59 | #18 | Link |
Emperor building empire
Join Date: Mar 2007
Location: ZAR
Posts: 674
|
Any thoughts on Absolute Quality ???
I generally use 18-CQ-CRF, but I have also experimented with insane 2-pass settings... yet the overall quality is still some way from the original... possibly only 60%...
I took 'Band of Brothers' and tried Q18-CQ-CRF at the display resolution, DAR (1024x576), comparing the results to a Q18-CQ-CRF anamorphic (720x576) encode... Anamorphic Text ... DAR ... VOB

Nobody is interested in great-looking text, but it is a relative indicator of the quality of the rest of the encode... look especially at the bottom of the angled leg of the capital R... Anamorphic ... DAR ... VOB

If you download the VLC snapshots and watch the jeep, particularly between the anamorphic and DAR encodes, the clarity and detail will just jump right out at you... This DAR quality, however, does not come cheap: the anamorphic encode is 45% smaller than the VOB, while the DAR encode is only 24% smaller than the VOB original, and encode times are also 25% slower.

A further consideration is that your movie has been correctly resized during encoding rather than in real time on playback... but larger files and higher resolutions can also impact playback negatively unless you have a dual-core system with RAID, especially if seeking is involved. I used the Lanczos4 resize filter, though there may well be a filter better suited, since the movie hasn't been resized so much as re-mapped to its originally intended resolution.

I get the impression that Q18 reaches a quality threshold per pixel, and that the only way to achieve further gains is by 'spreading the love' and letting more pixels carry the load... I've just finished 'Lord of the Rings 1' at 1152x480 (2.4:1), originally 720x576 PAL / 16:9 with black space top and bottom... The MKV is 5 GB (with a direct copy of audio and subs) compared to the 6.3 GB MPEG-2 original... Hardly a great saving, but an interesting experiment in the quest for Absolute Quality ... Pascal

BTW I took this a step further and over-sampled a BoB clip to 1280x720 and there was still a significant increase in quality... though it's not quite there with the original... Original Post

Last edited by delacroixp; 2nd May 2007 at 00:33. Reason: Last word
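Since the comparison above holds Q18 constant while changing the resolution, a resolution-neutral yardstick helps explain why the bigger encode costs more bits: the same bitrate spread over more pixels is thinner. The helper below is just standard bits-per-pixel arithmetic with made-up numbers for illustration (the post doesn't give exact bitrates):

```python
def bits_per_pixel(bitrate_kbps: float, width: int, height: int,
                   fps: float = 25.0) -> float:
    """Bits spent per pixel per frame; a resolution-neutral density."""
    return bitrate_kbps * 1000 / (width * height * fps)

# A hypothetical 2000 kbit/s PAL stream at the two resolutions compared:
print(round(bits_per_pixel(2000, 720, 576), 3))   # 0.193 (anamorphic)
print(round(bits_per_pixel(2000, 1024, 576), 3))  # 0.136 (DAR)
```

So holding the quantizer fixed while adding ~42% more pixels naturally yields a larger file, which matches the 45% vs 24% size reductions reported above.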
1st May 2007, 14:35 | #19 | Link |
x264 developer
Join Date: Sep 2004
Posts: 2,392
|
Quote:
Quote:
Pre-scaling could help if and only if all of the following are satisfied: you're encoding only for your own monitor, so you know the playback resolution in advance; you use a non-realtime scaling algorithm that's even slower than decoding high-resolution h264; and you use a high enough bitrate that the extra sharpness introduced by the scaler isn't lost in the encoding artifacts. Quote:
1st May 2007, 19:33 | #20 | Link |
Emperor building empire
Join Date: Mar 2007
Location: ZAR
Posts: 674
|
@akupenguin
Quote:
However, for the same file size, by increasing the resolution while maintaining Q18, there was, and is, a clear increase in quality... Sure, it's not 10 movie encodes on a single 4.5 GB DVD with beautiful efficiency... but a DAResolution encode does bridge the not-inconsiderable quality gap between an anamorphic encode and the original VOB material... all else being equal... Quote:
Quote:
I can only comment on what I've seen and perceived in the short time I've used H.264, and given my limited encoding experience... It certainly seems to me that you have a better chance of increasing quality beyond Q18-CQ-CRF by increasing the resolution past that point, even beyond anamorphic or DAResolution if you so choose... which is a far cry from DVDShrink if all you need is a 10% or 20% reduction in file size... You're the Guru... you explain the anomaly... Pascal

BTW I had another look at the PNGs and the difference in quality is almost palpable... Even if over-sampling is a total crap-shoot, Q18 anamorphic, and Q18 encodes in general, have a long way to go before they match the original quality.

Last edited by delacroixp; 1st May 2007 at 19:47. Reason: Review