Digeridoo
2nd March 2019, 00:33
Hi. I need help compressing action cam footage of a bicycle ride. It was shot at an ultra-wide angle, with a lot of grass and road grain. About 95% of the footage is constant fast motion.
Source: MP4 1080p@50 High@4.2 VBR at 47 Mbit/s, without B frames.
Target: the same format as the source, with quality as close to the source as possible at the same bitrate (±5%).
I have run some tests, exporting the video from Adobe Premiere:
1. Using ffmpeg and x264 with CRF 22 and default settings gives me an avg bitrate of 47 Mbit/s, the same as the source. There is probably no visible loss of quality when watching the video. However, in a frame-by-frame comparison, some details have changed (a lot of single pixels or small areas/blocks).
2. Encoding with CRF 17-18 shows no visible changes on frames, but the avg bitrate rises to 75 Mbit/s, which is more than 50% higher than the source.
3. Using slower presets increases the avg bitrate compared to the default (medium) preset.
4. Using tune film increases the bitrate by up to 1%, while tune grain increases it by up to 50%.
5. The encoder does not use more than 4-5 consecutive B frames or more than 4-5 ref frames. Usually, 85-95% of references go to the first two positions.
6. Using CRF 19 without B frames gives me a bitrate of about 49 Mbit/s, which is close to the source/target.
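To make the ±5% target concrete, here is a minimal Python sketch that checks the bitrates measured above against the 47 Mbit/s source. The function names and the labels in `results` are only illustrative; the numbers are the ones from the tests.

```python
# Compare measured average bitrates against the source bitrate.
SOURCE_MBIT = 47.0     # source avg bitrate, Mbit/s
TOLERANCE_PCT = 5.0    # allowed deviation from the source, percent


def deviation_pct(measured, source=SOURCE_MBIT):
    """Signed deviation of a measured bitrate from the source, in percent."""
    return (measured - source) / source * 100.0


def within_target(measured, source=SOURCE_MBIT, tol=TOLERANCE_PCT):
    """True if the measured bitrate is within +/- tol percent of the source."""
    return abs(deviation_pct(measured, source)) <= tol


# Bitrates reported in the tests above (Mbit/s); labels are illustrative.
results = {
    "CRF 22, defaults": 47.0,
    "CRF 17-18, defaults": 75.0,
    "CRF 19, no B frames": 49.0,
}

for label, mbit in results.items():
    print(f"{label}: {deviation_pct(mbit):+.1f}% within target: {within_target(mbit)}")
```

With a 5% tolerance the acceptable window is 44.65 to 49.35 Mbit/s, so the CRF 19 encode without B frames still fits while CRF 17-18 does not.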
To summarize a lot of different tests (some of them here - http://grabilla.com/09301-8095f7a6-2d45-4cd3-a8c2-1b39f5935a59.png):
1. Best match is: crf=19 / tune=film / bframes=0 / ref=4 / rc-lookahead=50 / me=umh / merange=24 / subme=9
2. Disabling B frames allows me to lower CRF from 22 to 19 and get a better avg QP for I/P frames within the target avg bitrate.
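For anyone who wants to reproduce the best-match settings outside Premiere, a rough Python sketch that assembles an equivalent standalone ffmpeg command could look like this. It assumes an ffmpeg build with libx264; the file names and the `-c:a copy` audio handling are placeholders and not part of my actual export setup.

```python
import shlex

# x264 options from the "best match" above, passed through -x264-params.
X264_PARAMS = {
    "bframes": 0,
    "ref": 4,
    "rc-lookahead": 50,
    "me": "umh",
    "merange": 24,
    "subme": 9,
}


def build_command(src, dst, crf=19, tune="film", params=X264_PARAMS):
    """Return the ffmpeg argument list for a one-pass CRF encode."""
    x264 = ":".join(f"{key}={value}" for key, value in params.items())
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264",
        "-crf", str(crf),
        "-tune", tune,
        "-x264-params", x264,
        "-c:a", "copy",   # assumption: audio is passed through untouched
        dst,
    ]


# Hypothetical file names, for illustration only.
cmd = build_command("ride_source.mp4", "ride_crf19.mp4")
print(shlex.join(cmd))
```

Keeping the options in a dictionary makes it easy to change a single value (for example `ref` or `subme`) between test encodes and compare the resulting logs.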
With Adobe Premiere and the exporter I use, I am limited in using some features, such as CRF and 2 pass, and output of SSIM/PSNR metrics to the log file. So I can compare results only visually in motion (useless at high bitrate), frame by frame (useless at some point, since I can’t encode with 2 pass), or by statistics in the log file such as the avg QP of frames.
As a newbie not only to x264 but to encoding in general, I have a lot of questions to make sure I am on the right track.
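One possible workaround for the missing SSIM/PSNR output is to measure the exported file against the source afterwards with ffmpeg's `ssim` and `psnr` filters. The sketch below only builds the command lines; it assumes a standalone ffmpeg is available, and the file names and log names are hypothetical.

```python
def metric_command(encoded, source, metric):
    """Build an ffmpeg command that compares `encoded` against `source`.

    `metric` must be "ssim" or "psnr"; per-frame values go to <metric>.log
    and the summary is printed by ffmpeg at the end of the run.
    """
    if metric not in ("ssim", "psnr"):
        raise ValueError(f"unsupported metric: {metric}")
    return [
        "ffmpeg",
        "-i", encoded,   # first input: the encode under test
        "-i", source,    # second input: the reference
        "-lavfi", f"{metric}=stats_file={metric}.log",
        "-f", "null", "-",   # decode and measure only, write no output file
    ]


# Hypothetical file names, for illustration only.
for name in ("ssim", "psnr"):
    print(" ".join(metric_command("ride_crf19.mp4", "ride_source.mp4", name)))
```

Both files need the same resolution and frame count for the numbers to be meaningful, which should hold here because the target format is the same as the source.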
1. Is disabling B frames a common practice for preserving quality (visually lossless)? What would you suggest in my case?
2. Which other options/values/combinations can I play with to reach possibly better quality or compression? Should I try trellis=2, --no-dct-decimate, --no-fast-pskip? Would they be useful for a one-pass CRF encode?
3. Should I exclude b8x8 from partitions? Or is there no correlation, and b8x8 can be used/useful without B frames?
4. Which GOP structure is more suitable for video without B frames and with constant fast motion? Fixed or adaptive (with scene detection), long (5-10+ sec) or short (1-2 sec)?
5. For cross fades I usually use the "Cross Dissolve" transition effect. Sometimes scene detection does not recognize such cross fades as a scene change and does not insert an IDR/I frame - http://grabilla.com/09301-6827e343-7185-4676-b060-4468de23a2a4.png. Is that OK?
6. CRF 19 gives me avg QP 23 and 25.8 for I and P frames respectively. Are those more or less normal values? As I understand it, bigger values mean worse quality and compression (hard to compress). Which other metrics from ffmpeg’s statistics in the log can be useful for comparing quality/compression?
7. Is using VBV good practice, or should it be avoided unless there is a strong reason to use it?
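To put question 4 in x264 terms: at 50 fps the GOP lengths mentioned there translate into keyframe intervals as in the sketch below. Pinning `min-keyint` to `keyint` and setting `scenecut=0` for a fixed GOP is a common convention that I am assuming here, not something I have verified in my tests.

```python
def gop_params(seconds, fps=50, adaptive=True):
    """Return an x264 option string for a GOP of at most `seconds` seconds.

    adaptive=True keeps x264's default scene detection, so keyframes may
    also be placed earlier than the maximum interval.
    adaptive=False forces a fixed GOP (assumed convention, see above).
    """
    keyint = int(round(seconds * fps))
    if adaptive:
        return f"keyint={keyint}"
    return f"keyint={keyint}:min-keyint={keyint}:scenecut=0"


# GOP lengths from question 4, at the 50 fps of the source.
for sec in (1, 2, 5, 10):
    print(sec, "sec ->", gop_params(sec), "|", gop_params(sec, adaptive=False))
```

So "short" means a keyint of 50-100 frames and "long" means 250-500 frames for this footage.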