31st July 2016, 21:33 | #4082 | Link |
Registered User
Join Date: Mar 2015
Location: New Zealand
Posts: 45
|
Hmmm, odd. Mostly by coincidence, my settings are virtually the same as yours, except I use rdoq 1.1 and qg-size 32. I presume you are using the StaxRip video comparison to compare? It may be due to input source quality; I am currently testing sg1, NTSC.
|
31st July 2016, 22:49 | #4084 | Link | |
Registered User
Join Date: Jun 2016
Posts: 116
|
Quote:
@burfadel, thanks for all of the info! I really appreciate the detailed response. I ran a quick test of your settings compared to mine. Here are a couple of screenshots. Your settings + medium profile: 6068 Kbps. I took another look at my previous settings; I use those for older video or video that I'm trying to get through a little faster. Code:
--crf 21 --profile main10 --output-depth 10 --ctu 32 --bframes 8 --rc-lookahead 80 --scenecut 40 --ref 5 --limit-refs 0 --me 3 --merange 27 --subme 5 --no-rect --no-amp --limit-modes --max-merge 4 --no-early-skip --b-intra --no-sao --signhide --weightp --weightb --aq-mode 3 --aq-strength 0.9 --cutree --rd 4 --tu-intra-depth 3 --tu-inter-depth 3 --psy-rd 2 --psy-rdoq 2 --rdoq-level 2 --lookahead-slices 4 --qcomp 0.65 --no-strong-intra-smoothing --deblock -1:-1 --qg-size 32
I admit your settings do offer a lower bitrate; I used crf 21 for both encodes. Your settings were almost twice as fast as mine. I did a couple of quick comparisons purely for speed by adjusting bframes, tu-inter-depth and me. I gained about 0.2-0.3 fps when I removed tu-inter-depth. The others were worth about 0.1 fps each.
Regarding merange: I read a post from x265Project about how they determine their default of 57 as it pertains to their profiles. It would take me some time to find it. They took the ctu size of 64 and subtracted 7 for various reasons. I don't recall all of the reasons that were stated. I essentially applied the same logic to a ctu size of 32: 32 - 7 = 25. I recently upped it to 27.
With that said, I admit that watching the two encodes side by side I could barely see a difference. It was only with the screenshots that I could easily identify the differences. I will need to reevaluate some of my settings from a speed perspective.
@Rogatti I went back and played with x264 a bit and I'm shocked to see the quality difference is far more noticeable than I remember. I know my tests weren't apples to apples, but I tried x265 medium vs x264 slow and slower. In both cases x265 quality was noticeably better, even in motion. My question, "is x265 worth it?", has been answered!
@littlepox Why is ctu 32 in your recommended film tune?
@ anyone willing to help: Would anyone be willing to help me adjust my settings?
I'd like to increase or maintain the quality of my settings while removing the options that are slowing my encodes down with little to no benefit. The idea is that I can then increase or decrease crf to achieve the ideal bitrate/file size.
On a side note, I'd like to thank the community as a whole for being awesome. I've never been on a forum that is as willing to help, and most importantly as nice about it, as doom9. Last edited by brumsky; 31st July 2016 at 23:14. Reason: fixed images |
|
1st August 2016, 04:55 | #4085 | Link | |
Registered User
Join Date: Aug 2006
Posts: 2,229
|
Quote:
Another 'trick' to save a bit of bitrate, which in turn means you can lower the CRF slightly, is to use: --nr-intra 400 --nr-inter 400 or whatever decent-sized number you choose. Don't go too high or too low though, otherwise you defeat the purpose. This cleans up a little of the low-level noise. It shouldn't be used by itself though; you need to combine it with reducing the CRF a bit. You save a bit of bitrate by setting NR, and in turn you can use this recovered bitrate to improve the quality of the encode. Maybe you could try the encode again, using settings similar to what I listed before but with the addition of the NR and a lower CRF. Note also the lookahead range; I don't know why I wrote 40 before since I use 50! Seeing as you are using 1920x1080, I would suggest a higher merange. Say 57, but let's for argument's sake say 40: Quote:
Can you explain your reasons for the above settings? Yes, quality is the goal, but that needs to be balanced with speed and output size. Your output size is a little high; you should be able to achieve similar results with a lower CRF based on the settings I suggested. Last edited by burfadel; 1st August 2016 at 06:05. |
|
1st August 2016, 18:12 | #4086 | Link | |
Registered User
Join Date: Jun 2016
Posts: 116
|
Quote:
--ctu 32: comes from various posts on Doom9. The short version is that ctu 64 causes the bitrate to increase to compensate for the increased compression. I've also read that there is increased quality. It is also in littlepox's suggested film tune.
--bframes 8: looking for increased compression with minimal increase in encoding time. littlepox again.
--limit-refs 0: looking for increased compression.
--no-early-skip: fear of decreased compression. I have changed to --early-skip for increased performance.
--subme 5: better motion estimation. I've been bouncing back and forth between 3 and 5 though. >=3 includes chroma residual cost...
--no-sao: I've heard it called the "blur all objects" option. haha
--no-strong-intra-smoothing: blurs.
--aq-mode 3: pulled from littlepox's film tune. I've been back and forth on this one as well.
--qg-size 32: I've changed to 16 as well. I did some testing, which showed minimal savings for 32 compared to 16. 16 does appear to be sharper.
--merange 25: the x265 docs state the following. "The default is derived from the default CTU size (64) minus the luma interpolation half-length (4) minus maximum subpel distance (2) minus one extra pixel just in case the hex search method is used. If the search range were any larger than this, another CTU row of latency would be required for reference frames." 64 - 4 - 2 - 1 = 57. I applied the same logic to a CTU of 32: 32 - 4 - 2 - 1 = 25. I made the change to avoid the additional CTU row of latency.
Your settings have given me a lot to think about and test. I spent several hours testing and tweaking my settings compared to yours. I'm currently using these. Code:
--crf 21 --profile main10 --output-depth 10 --ctu 32 --bframes 6 --rc-lookahead 40 --scenecut 40 --ref 5 --limit-refs 0 --me 3 --merange 26 --subme 3 --no-rect --no-amp --limit-modes --max-merge 3 --early-skip --b-intra --no-sao --signhide --weightp --weightb --aq-mode 2 --aq-strength 1 --cutree --rd 4 --tu-intra-depth 3 --tu-inter-depth 1 --psy-rd 2 --psy-rdoq 1.5 --rdoq-level 2 --lookahead-slices 4 --qcomp 0.65 --no-strong-intra-smoothing --deblock -1:-1 --qg-size 16
Take your settings, add --ctu 32 --merange 25 and give it a go. I did and found that ctu 32 gives a slightly smaller file size. My 60 second test clip is 182 MB. Using your settings it comes out to 46.5 MB. With --ctu 32 added, 44.8 MB. Nothing crazy, I know, but it encodes about 1 - 1.5 fps faster for a 3.5% smaller file. |
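The merange arithmetic quoted from the x265 docs above can be written out as a small Python sketch. The three constants come straight from the quoted text; applying the same subtraction to a CTU of 32 is the poster's extrapolation, not something the docs state, and the function names here are made up for illustration. The size comparison uses the 46.5 MB and 44.8 MB figures from the post.

```python
# Constants taken from the x265 documentation text quoted in the post.
LUMA_INTERP_HALF_LENGTH = 4  # luma interpolation half-length
MAX_SUBPEL_DISTANCE = 2      # maximum subpel distance
HEX_SEARCH_MARGIN = 1        # one extra pixel in case hex search is used


def merange_for_ctu(ctu_size: int) -> int:
    """Search range derived from the CTU size using the quoted rule.

    Assumption: the rule documented for the default CTU (64) carries over
    unchanged to smaller CTU sizes.
    """
    if ctu_size not in (16, 32, 64):
        raise ValueError("expected a CTU size of 16, 32 or 64")
    return (ctu_size - LUMA_INTERP_HALF_LENGTH
            - MAX_SUBPEL_DISTANCE - HEX_SEARCH_MARGIN)


def percent_smaller(reference_mb: float, candidate_mb: float) -> float:
    """How much smaller the candidate file is, as a percentage of the reference."""
    return (reference_mb - candidate_mb) / reference_mb * 100.0


if __name__ == "__main__":
    for ctu in (64, 32):
        print(f"--ctu {ctu} -> --merange {merange_for_ctu(ctu)}")
    # File sizes reported in the post: 46.5 MB (ctu 64) vs 44.8 MB (ctu 32).
    print(f"{percent_smaller(46.5, 44.8):.1f}% smaller")
```

For the reported sizes this works out to about 3.7%, in line with the roughly 3.5% mentioned in the post.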
|
1st August 2016, 19:35 | #4087 | Link |
Registered User
Join Date: Nov 2011
Posts: 66
|
@brumsky: Can you repeat the last test you talked about, but adding the --rect, --amp, --no-early-skip and --tu-inter-depth 3 options when encoding with --ctu 64?
It defies logic that a video encoded with a limiting option like --ctu 32 results in a smaller file. All the options I suggested replacing should enable the encoder to analyze bigger CUs more thoroughly, and reuse more material from them. |
1st August 2016, 20:36 | #4088 | Link | |
Registered User
Join Date: Jun 2016
Posts: 116
|
Quote:
Code:
--crf 21 --output-depth 10 --rd 4 --tu-intra-depth 3 --tu-inter-depth 3 --rdoq-level 2 --rect --amp --no-early-skip --fast-intra --b-intra --tskip --tskip-fast --limit-modes --aq-mode 2 --qg-size 16 --me star --merange 25 --max-merge 3 --weightb --bframes 6 --rc-lookahead 40 --ref 6 --psy-rdoq 1.38
Code:
--crf 21 --output-depth 10 --rd 4 --ctu 32 --tu-intra-depth 3 --tu-inter-depth 3 --rdoq-level 2 --rect --amp --no-early-skip --fast-intra --b-intra --tskip --tskip-fast --limit-modes --aq-mode 2 --qg-size 16 --me star --merange 25 --max-merge 3 --weightb --bframes 6 --rc-lookahead 40 --ref 6 --psy-rdoq 1.38
CTU 64 is about 19.1% smaller. Each encode was about 50-60% slower than before. If you compare the pics, I actually see more detail in the ctu 32 pic. Look around the mouth and chin; it is noticeably more blurred in the ctu 64 pic. I was never trying to say that ctu 64 can never be smaller, just that compared to burfadel's original settings, adding ctu 32 was faster and smaller. burfadel's original settings -> 46.5 MB - 6068 Kbps. I'm admittedly trading size for speed. For comparison, here are my current settings with CRF 22. I'm trying to get closer to the size and bitrate of the ctu 64 encode with the amp and rect settings. Code:
--crf 22 --profile main10 --output-depth 10 --ctu 32 --bframes 6 --rc-lookahead 40 --scenecut 40 --ref 5 --limit-refs 3 --me 3 --merange 26 --subme 3 --no-rect --no-amp --limit-modes --max-merge 3 --early-skip --b-intra --no-sao --signhide --weightp --weightb --aq-mode 2 --aq-strength 1 --cutree --rd 4 --tu-intra-depth 3 --tu-inter-depth 1 --psy-rd 2 --psy-rdoq 1.38 --rdoq-level 2 --qcomp 0.65 --no-strong-intra-smoothing --deblock -1:-1 --qg-size 16
Results: 187 MB -> 37.7 MB - 4835 Kbps. To my eye my settings look a little better than both encodes, and I averaged closer to 7 fps... I shouldn't use the term "my settings", as those are largely burfadel's suggestions with a few minor tweaks: --no-sao --no-strong-intra-smoothing --ctu 32 Last edited by brumsky; 1st August 2016 at 20:42. |
|
1st August 2016, 23:08 | #4089 | Link |
Registered User
Join Date: Nov 2011
Posts: 66
|
Hey, brumsky, thanks a lot
That test meant a lot to me, as I could verify that the --ctu 64 option can indeed give noticeably more efficient encoding; however, it also brings an unexpectedly big speed penalty. My logic tells me that the difference in speed should shrink significantly with further x265 optimizations, as the biggest CUs are present in low-detail areas, not many get created because of their large size, and I expect them to "crumble down" to smaller blocks fast.
Also, could there be some sort of bug in the x265 logic when using --ctu 64, since your first encode has extremely reduced detail and a big reduction in bitrate, as if you had encoded using completely different settings? I wouldn't expect to have that many largest CUs in the frame you showed.
--amp is probably the most useless option quality-wise of the ones I recommended. It is also, very probably, the one slowing down the process the most.
--rect is more useful for quality and brings less of a speed penalty.
--no-early-skip influenced speed dramatically in my tests, but likewise had a large influence on quality.
--tu-inter/intra-depth 3 are options I recently added to my encodes, as they did prove to increase quality. I lost about 10-15% speed. I encode using a slightly modified slower profile. |
2nd August 2016, 00:01 | #4090 | Link | |
Registered User
Join Date: Jun 2016
Posts: 116
|
Quote:
burfadel had --tskip --tskip-fast --fast-intra; those could be responsible for the decreased visual quality. Although they were in both the ctu 64 and 32 tests, the 32 looked better to me. I don't use those. These were my old settings. Code:
--profile main10 --output-depth 10 --ctu 32 --bframes 8 --rc-lookahead 80 --scenecut 40 --ref 5 --limit-refs 0 --me 1 --merange 25 --subme 3 --no-rect --no-amp --limit-modes --max-merge 4 --no-early-skip --b-intra --no-sao --signhide --weightp --weightb --aq-mode 3 --aq-strength 0.8 --cutree --rd 4 --tu-intra-depth 3 --tu-inter-depth 3 --psy-rd 2.0 --psy-rdoq 2.0 --rdoq-level 2 --lookahead-slices 4 --qcomp 0.65 --no-strong-intra-smoothing --deblock -1:-1 --qg-size 32
I can't tell a difference between my old settings and the new ones. I use StaxRip's video comparison and they are indistinguishable to me. I'd imagine a trained pro could pick out the changes, but I can't.
Try burfadel's recommendation, --tu-inter-depth 1 --tu-intra-depth 3. I couldn't tell a difference and gained 0.5 - 0.75 fps.
Also, I tested limit-refs a bit. 3 is obviously the fastest, with little to no discernible difference, to my eyes. 2 was slower than 1, and 0 was the slowest of course. My guess is that limiting by depth (1) is faster than limiting by CU (2). I'm sticking with 3 for now; I may consider testing 1 further from a quality perspective.
Try these settings and let me know what you think. Change the crf to meet your ideal bitrate. Based on my testing I'd rather raise crf than go with ctu 64... Code:
--crf 19.75 --profile main10 --output-depth 10 --ctu 32 --bframes 6 --rc-lookahead 40 --scenecut 40 --ref 5 --limit-refs 3 --me 3 --merange 26 --subme 3 --no-rect --no-amp --limit-modes --max-merge 3 --early-skip --b-intra --no-sao --signhide --weightp --weightb --aq-mode 2 --aq-strength 1 --cutree --rd 4 --tu-intra-depth 3 --tu-inter-depth 1 --psy-rd 2 --psy-rdoq 1.38 --rdoq-level 2 --qcomp 0.65 --no-strong-intra-smoothing --deblock -1:-1 --qg-size 16 |
|
2nd August 2016, 04:29 | #4091 | Link |
Registered User
Join Date: Aug 2006
Posts: 2,229
|
Yeah, the settings --tskip --tskip-fast --fast-intra are purely a speed consideration. If you want to balance out a slightly lower bitrate for comparison you could use --crf 21.7, for example. Also try a small amount of inter and intra noise reduction, like 400, and make up for the lower output size with a lower crf. Since I put 21.7 above, try 21.2. It's about trying to maximise efficiency.
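The comparison suggested above (a baseline at --crf 21.7 against the same encode with NR 400 and --crf 21.2) could be scripted roughly as follows. This is only a sketch: the helper name and the file names clip.y4m / test_*.hevc are hypothetical, and the command is printed rather than executed so the remaining settings can be added by hand.

```python
import shlex


def build_x265_command(source, output, crf, nr=0, extra=()):
    """Assemble an x265 command line for one test encode.

    nr is applied to both --nr-intra and --nr-inter, as suggested in the
    post; nr=0 leaves noise reduction off entirely.
    """
    args = ["x265", "--input", source, "--output", output, "--crf", str(crf)]
    if nr > 0:
        args += ["--nr-intra", str(nr), "--nr-inter", str(nr)]
    args += list(extra)
    return args


# (crf, nr) pairs from the post: baseline vs. NR 400 with a lower CRF.
TRIALS = [(21.7, 0), (21.2, 400)]

if __name__ == "__main__":
    for crf, nr in TRIALS:
        # Hypothetical file names; substitute your own test clip.
        command = build_x265_command(
            "clip.y4m", f"test_crf{crf}_nr{nr}.hevc", crf, nr)
        print(shlex.join(command))
```

Comparing the resulting file sizes and screenshots for each pair shows whether the bitrate recovered by NR is really enough to pay for the lower CRF on a given source.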
|
2nd August 2016, 06:33 | #4092 | Link | ||
Registered User
Join Date: Aug 2006
Posts: 2,229
|
I just did some further testing. When I originally did my testing with --tskip, --tskip-fast, and --fast-intra they were beneficial speed-wise. However, x265 has undergone improvements since then and I am no longer seeing the speed increase. Things like recursion skip etc. were added. Of course, a lot of that could be dependent on the source material.
Don't forget that any setting that changes the output can also affect the speed of the encode. For instance, even if --fast-intra processes a particular area faster, the changed output affects other areas of the encode. Now that there have been changes in other areas such as --limit-modes, recursion skip, limit references etc., I found that in some cases it was actually faster without the speed settings! It's all about synergy of settings.
This synergy also applies to the noise reduction argument, keeping in mind that the noise reduction in x265 is very mild. I don't suggest using it by itself; to benefit you need to use it in conjunction with reducing the CRF. If you are using a CRF of 16 or something it probably wouldn't be worth it, but at a higher CRF it is. In the last lot of testing I did, without noise reduction I got almost the same file size at a CRF of 22 as I did with both inter and intra NR on (400) at a CRF of 21.2. As I said, you don't have to go to 400; you could test with 200 on both inter and intra and a CRF of, say, 21.6, or whatever the bitrate equivalent is, and work out the balance that suits best. I do believe though that you can achieve a higher output quality once you take into account the ability to use a lower CRF for a given bitrate.
I also tested without --early-skip. I only used one clip for this, but the speed drop was 21 percent. That is quite a lot, so I don't recommend it; I only tried it to see exactly how much slower it actually is. If you are okay with the speed loss of --no-early-skip for quality, then you probably wouldn't mind --rd 5 and using --rd-refine. The picture quality is much nicer in a direct comparison and it retains picture texture very well, but it is slow. So, my new settings: Quote:
If you want to try something that isn't practical speed wise, but gives a nicer picture for roughly the same file size (it retains flat areas well): Quote:
Last edited by burfadel; 2nd August 2016 at 06:40. |
||
2nd August 2016, 09:05 | #4094 | Link |
Registered User
Join Date: Aug 2006
Posts: 2,229
|
I use the StaxRip video comparison tool, but I also just play the encodes in Windows Media Player and compare them that way as well, since that includes motion. I use WMP because if I used MPC-HC with my custom madVR settings it would throw off any comparison.
|
2nd August 2016, 10:21 | #4095 | Link |
Registered User
Join Date: Nov 2011
Posts: 66
|
brumsky: I haven't encoded using main10 so far. My target playback hardware is some future STBs, TVs and built-in GPU decoders, namely the cheapest decoding hardware available, and I'm concerned about compatibility. Besides, I don't have powerful hardware, so speed matters to an extent, and sometimes the quality of my sources is not the best either.
I use these:
--bframes 9, as I hadn't noticed a significant reduction in speed over 6, so I kept it for maximum coding efficiency
--ref 6
--rect, as it improved quality slightly without hurting speed too much, unlike amp, which brings a slight improvement (if any) at a very high cost
--no-sao, --no-strong-intra-smoothing, --deblock -2 (-3)
--early-skip brought obvious quality loss despite big gains in speed, so I discarded it in some of my previous tests
--max-merge 2, --limit-modes, --limit-refs 3, --no-weightb, all chosen for speed
--qg-size 16 was accumulating artifacts in blocks containing "important" details to my eyes, so I discarded it to allow the encoder to reduce the quality of the largest CUs as well.
However, I also use --no-cutree, as my encodes are of a kind where the quality of the background is not that important, unlike areas of intense movement and changes in the foreground. So even options that reduce encoding efficiency, by discarding some of the textures in such areas, can seem "optimal" for me, as motion estimation tends to blur the retained visual material. Precise motion estimation that avoids a harsh and "blocky" look, and well-defined edges, especially ones with lower contrast, mean the most to me. Most encoders (x264 included) tend to "dissolve" less pronounced edges, so they almost become gradients. Only well-defined objects keep their outlines, while everything else becomes smoothed.
For comparing images I use "old school" methods: save frames for comparison as BMP using MPC (preferably B-frames), then open both full screen in separate image viewer windows and alternate between them. Last edited by gamebox; 2nd August 2016 at 10:23. |
4th August 2016, 06:46 | #4097 | Link | |
Registered User
Join Date: Oct 2014
Posts: 476
|
Quote:
So you might want to take a look at your logs to see what the b-frame usage actually is, and do some runs to see what the penalty is. I haven't done any work with x265 so I don't know if the increased computational complexity overshadows the b-frame computation or if it adds to them. |
|
4th August 2016, 07:22 | #4098 | Link |
German doom9/Gleitz SuMo
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 6,783
|
Long-range B-frames also decrease compatibility with consumer players. The more consecutive B-frames there may be between I or P frames, the more frames the decoder has to handle until a GOP is done, especially when B-frame pyramid and references among B-frames are used; that may go beyond hardware limits.
|
4th August 2016, 10:19 | #4099 | Link |
Registered User
Join Date: Nov 2011
Posts: 66
|
@ kuchikirukia, Ligh:
In my logs, indeed, B-frame usage falls off sharply after 6 or 7. Over 6 I generally see only 1-2 %. I might reconsider that option soon. I'm currently struggling with increased encoding time after I added --tu-intra/inter-depth 3. Quality did increase considerably, but encoding time seems to have suffered more than I previously estimated in tests. I'll try lower depths and different combinations, aiming to preserve most of the quality gain. --no-rskip also proved useful, but increased encoding time as well. I hope to offset the slowdown with new RAM planned for my system - 1866MHz DDR3. I'm temporarily using 1333MHz modules from previous configuration. CPU is AMD FX-8320, octo-core, AVX capable. |
4th August 2016, 10:54 | #4100 | Link | |
Registered User
Join Date: Aug 2006
Posts: 2,229
|
I did some more tests. It seems that without SAO the picture is not only nicer, but the encode is also faster, as others have found. The speedup is several percent and repeatable, so it's not a margin-of-error thing! SAO is probably not a bad thing in theory; perhaps the internal smoothing amount is just too high?
I did some further testing out of interest, regarding a couple of key things that directly affect output quality. Quote:
I ended up with a smaller file as a result on the clip I used!... Yes, when I thought about it, this can happen because the change affects everything else. I then thought I'd try upping the quality of the B-frames a bit. The default pbratio is 1.30; I tried 1.25.
I then thought about the B-frame usage beyond 6 frames; this is easily adjusted using the --bframe-bias option. Yes, 35 is probably fairly high, but it let me select a much higher number of bframes. I tested 9 and still got double-digit B-frame usage at the 9th consecutive frame, whereas by default 6 seems to be the limit. Even if you stayed at 6, the B-frame usage would be higher. Of course, B-frames are lower quality, but you can adjust the ratios to compensate and work it out properly. By that I mean much less extreme changes. If the file size in the end is smaller than that of a clip encoded without these changes, and the drop is significant enough, try dropping the CRF by 0.1.
So I didn't go into fully testing these settings, but I believe small tweaks to them could help with some of the smoothing issues. The current ratios just seem to be pulled from x264, and there's nothing to say those ratios were perfected either. That said, because they were pulled from x264, they are probably much further off in x265.
Now there's some testing for you to do! Try small variations for a start; things like 1.20 for the ipratio probably wouldn't be helpful. I am referring more to, say, between 1.34 and 1.42. Likewise with the pbratio. If the pbratio is adjusted, then a slight B-frame bias can be applied, since the B-frames will be higher quality. So the fine-tuning fun isn't over yet! Last edited by burfadel; 4th August 2016 at 10:56. |
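The small-step ratio testing suggested above could be organised as a simple sweep. The sketch below only generates the option combinations; the 0.02 step and the helper names are assumptions made for illustration, while the 1.34 to 1.42 ipratio range and the 1.25 / 1.30 pbratio values come from the post.

```python
from itertools import product


def ratio_steps(start, stop, step):
    """Inclusive list of ratio values from start to stop, rounded to 2 decimals."""
    count = round((stop - start) / step)
    return [round(start + i * step, 2) for i in range(count + 1)]


def ratio_grid(ipratios, pbratios):
    """One x265 option list per (ipratio, pbratio) combination."""
    return [["--ipratio", f"{ip:.2f}", "--pbratio", f"{pb:.2f}"]
            for ip, pb in product(ipratios, pbratios)]


if __name__ == "__main__":
    ipratios = ratio_steps(1.34, 1.42, 0.02)  # range suggested in the post
    pbratios = [1.25, 1.30]                   # tested value and the default
    for options in ratio_grid(ipratios, pbratios):
        # Append these to the rest of your settings for each test encode.
        print(" ".join(options))
```

Each printed line is one candidate to append to an otherwise fixed command line; a --bframe-bias value can be added the same way once a pbratio has been settled on.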
|
|
|