Old 7th April 2021, 20:55   #441  |  Link
Yups
Registered User
 
Join Date: Sep 2011
Posts: 286
Quote:
Originally Posted by benwaggoner View Post
But anyone giving a VMAF score needs to state what the comparison resolution was and what VMAF version was used.

All my VMAF results are based on VMAF 2.0.0 (model 0.6.1) and resolution is unchanged, original resolution for all.

I found an Iris Xe HEVC/AVC comparison from Intel btw: https://dgpu-docs.intel.com/devices/...des/media.html


Old 7th April 2021, 21:34   #442  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 3,595
Quote:
Originally Posted by Yups View Post
All my VMAF results are based on VMAF 2.0.0 (model 0.6.1) and resolution is unchanged, original resolution for all.

I found an Iris Xe HEVC/AVC comparison from Intel btw: https://dgpu-docs.intel.com/devices/...des/media.html
The metrics from that article are based on luma PSNR, which isn't something x265 is optimized for (unless you use --tune psnr). The BD-rate differences in luma PSNR are within the range where subjective quality could be quite different; it just isn't that great a metric. And x265 defaults to a lot of psychovisual optimizations that reduce PSNR in favor of improving subjective quality.

That said, these suggest a generally competent encoder for high speed use (presumably why --preset medium was the top option). If you used x265 with --preset slower --tune psnr, x265 likely would win by a fair margin.
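
For reference, luma PSNR is just a log-scaled mean squared error on the Y plane, which is why it rewards minimizing raw pixel error rather than perceived quality. A minimal Python sketch, assuming 8-bit samples passed as flat lists; the function name and the sample values are made up for illustration and are not taken from the Intel article:

```python
import math

def luma_psnr(ref, dist, max_value=255):
    """PSNR in dB between two equal-length sequences of luma samples."""
    if len(ref) != len(dist) or not ref:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all luma samples.
    mse = sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)
    if mse == 0:
        return math.inf  # identical planes
    return 10 * math.log10(max_value ** 2 / mse)

# Illustrative 2x2 "frames", not real measurement data.
reference = [100, 120, 140, 160]
encoded = [101, 119, 142, 158]
print(round(luma_psnr(reference, encoded), 2))
```

A psychovisual encoder may deliberately accept a higher MSE to keep grain or texture, which lowers this number without necessarily looking worse.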
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 7th April 2021, 21:51   #443  |  Link
Yups
Registered User
 
Join Date: Sep 2011
Posts: 286
That being said, Intel didn't use the highest quality preset in this (there is another image with quality preset). But of course x265 slower would win in almost every case unless their Ubuntu FFMPEG environment is better than my Windows QSVEnc environment which I don't think it is.
Old 8th April 2021, 17:12   #444  |  Link
Yups
Registered User
 
Join Date: Sep 2011
Posts: 286
I could test a GTX 1660 Super next week. It already runs the current 7th gen Nvenc generation, and I'm curious how it compares to Iris Xe. Is there a settings tutorial somewhere? Is it correct that 5 bframes is the maximum number of bframes on Nvenc?
Old 8th April 2021, 20:14   #445  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 3,595
Quote:
Originally Posted by Yups View Post
That being said, Intel didn't use the highest quality preset in this (there is another image with quality preset). But of course x265 slower would win in almost every case unless their Ubuntu FFMPEG environment is better than my Windows QSVEnc environment which I don't think it is.
I'm surprised they didn't use their best setting.

An always-interesting question is where the crossover point in speed/quality is between HW and SW encoders.

A key use of GPU encoders is for game streaming, where even 25% CPU utilization would hurt FPS in a lot of games.
Old 8th April 2021, 21:38   #446  |  Link
Yups
Registered User
 
Join Date: Sep 2011
Posts: 286
They did use the best setting in the other chart: average BDRATE computed across 27 standard short sequences generated in both CBR and VBR


Old 8th April 2021, 22:51   #447  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 3,595
Quote:
Originally Posted by Yups View Post
They did use the best setting in the other chart: average BDRATE computed across 27 standard short sequences generated in both CBR and VBR
Are the axes mislabeled or am I misreading? I really doubt that x265 efficiency gets worse with slower presets!

Although faster presets do use less psychovisual optimization, and mainly make choices based on SAD, which maps to PSNR better...
Old 8th April 2021, 23:04   #448  |  Link
Yups
Registered User
 
Join Date: Sep 2011
Posts: 286
Quote:
Originally Posted by benwaggoner View Post
Are the axes mislabeled or am I misreading? I really doubt that x265 efficiency gets worse with slower presets!

Left side of the chart: Bit-rate savings (higher is better).


13.7% bitrate saving for x265 slow over medium and 11.0% higher bitrate required for very fast preset over medium. VME quality is the best Quicksync preset.
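
As a quick worked example of how these two figures relate, the sketch below puts both presets on a common scale with medium as the anchor (1.0). It assumes the chart values are BD-rates relative to medium and that they can be chained multiplicatively, which is only an approximation for BD-rate; the function name is hypothetical.

```python
def relative_bitrate(bdrate_percent):
    """Bitrate relative to the anchor at equal quality (anchor = 1.0)."""
    return 1.0 + bdrate_percent / 100.0

# Figures from the chart, with x265 medium as the anchor.
slow = relative_bitrate(-13.7)       # 13.7% bitrate saving vs medium
very_fast = relative_bitrate(+11.0)  # 11.0% more bitrate vs medium

# Approximate saving of slow relative to very fast.
saving = (1.0 - slow / very_fast) * 100.0
print(f"slow vs very fast: about {saving:.1f}% less bitrate")
```

Note that percentages like these do not simply add up: the saving of slow over very fast is smaller than 13.7 + 11.0 because the two figures use different baselines.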
Old 9th April 2021, 17:03   #449  |  Link
Yups
Registered User
 
Join Date: Sep 2011
Posts: 286
Earlier than expected I got this:




I will try CQP + lookahead 32 + bframes 5 + quality preset later; if there is any other important setting I should use, let me know. Bframes 5 is indeed the maximum on Turing.
Old 10th April 2021, 00:29   #450  |  Link
Yups
Registered User
 
Join Date: Sep 2011
Posts: 286
I have finished my first GTX 1660 test from my last video sample. I have tried lots of different settings and this is the best I could find (b-frame ref middle gave me a nice score boost).

Code:
 
Intel Demo Clip 1080p         VMAF    PSNR    SSIM     VQM     speed    bitrate

Quicksync H265
Iris Xe CQP best              91.76   41.85   0.9748   0.790   67 fps   2438 kbit

NVENC H265
GTX 1660S CQP best            90.62   41.28   0.9699   0.885   150 fps  2439 kbit

x265 (Staxrip 2.1.9.0)
i7-1165G7 CRF slow            93.20   41.90   0.9754   0.800   8 fps    2430 kbit
i7-1165G7 CRF medium          91.05   41.35   0.9744   0.861   19 fps   2440 kbit
i7-1165G7 CRF very fast       89.99   40.97   0.9726   0.894   32 fps   2440 kbit

x264 (Staxrip 2.1.9.0)
i7-1165G7 CRF slow            89.38   40.21   0.9693   0.974   30 fps   2425 kbit
https://drive.google.com/file/d/1Nzl...ew?usp=sharing
https://drive.google.com/file/d/1onL...ew?usp=sharing


The NVENC metric scores are a mixed bag: respectable VMAF and PSNR scores, but not that good on VQM and especially SSIM. In a subjective frame-by-frame comparison, it's obvious that detail preservation is a lot worse than with Iris Xe (VME/GPU) and x265.
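
To compare the rows per metric, here is a small Python sketch that ranks the six encodes on each metric using the numbers from the table. It assumes that lower VQM is better (the row ordering suggests so, but the post does not state it); the dictionary keys and the helper name are just labels chosen for this example.

```python
# Rows copied from the table above: (VMAF, PSNR, SSIM, VQM).
results = {
    "Iris Xe CQP best":   (91.76, 41.85, 0.9748, 0.790),
    "GTX 1660S CQP best": (90.62, 41.28, 0.9699, 0.885),
    "x265 CRF slow":      (93.20, 41.90, 0.9754, 0.800),
    "x265 CRF medium":    (91.05, 41.35, 0.9744, 0.861),
    "x265 CRF very fast": (89.99, 40.97, 0.9726, 0.894),
    "x264 CRF slow":      (89.38, 40.21, 0.9693, 0.974),
}
METRICS = ("VMAF", "PSNR", "SSIM", "VQM")
LOWER_IS_BETTER = {"VQM"}  # assumption, see the note above

def rank_of(encoder, metric):
    """1-based rank of an encoder on one metric (1 = best)."""
    idx = METRICS.index(metric)
    best_first = sorted(
        results,
        key=lambda name: results[name][idx],
        reverse=metric not in LOWER_IS_BETTER,
    )
    return best_first.index(encoder) + 1

for metric in METRICS:
    print(metric, rank_of("GTX 1660S CQP best", metric))
```

Ranks from a single clip at a single bitrate are only a rough indication, so this is best read alongside the frame-by-frame comparison rather than instead of it.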
Old 10th April 2021, 15:35   #451  |  Link
Yups
Registered User
 
Join Date: Sep 2011
Posts: 286
Blender Open Movie from here: https://forum.doom9.org/showpost.php...&postcount=423

Code:
    
HERO - Blender Open Movie            VMAF      speed       bitrate

Quicksync H265 (27.20.100.9316)
Iris Xe CQP FF best                  86.56     550 fps     199 kbit
Iris Xe CQP FF balanced              85.66     850 fps     201 kbit
Iris Xe CQP FF speed                 76.25     1550 fps    200 kbit

Iris Xe CQP best                     88.59     167 fps     200 kbit
Iris Xe CQP balanced                 87.18     280 fps     201 kbit
Iris Xe CQP speed                    86.22     520 fps     200 kbit

NVENC H265 (470.14)
GTX 1660S CQP best                   84.52     430 fps     200 kbit
GTX 1660S CQP default                83.55     990 fps     201 kbit
GTX 1660S CQP performance            79.22     1130 fps    200 kbit

x265 (Staxrip 2.1.9.0)
i7-1165G7 CRF slower                 90.25     8 fps       200 kbit
i7-1165G7 CRF slow                   88.18     34 fps      200 kbit
i7-1165G7 CRF medium                 85.72     56 fps      200 kbit
i7-1165G7 CRF very fast              83.36     73 fps      200 kbit

x264 (Staxrip 2.1.9.0)
i7-1165G7 CRF slower                 75.37     81 fps      200 kbit

Disabling b-adapt is better for this video. Turing CQP cannot reach Iris Xe CQP quality; both subjectively and objectively the difference is large.

Turing has two downsides: only 5 bframes versus 16 bframes on Iris Xe, and no equivalent of the GPU mode, which is more flexible than a fully fixed-function solution. However, even the FF mode of Iris Xe looks better. It might look different with CBR vs CBR, which I haven't tried. That said, the H265 CQP results from Turing are really good for a hardware encoder, something like x265 fast to faster with extremely fast encoding times; the CQP quality from Iris Xe is just insane.
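
One way to look at the speed/quality crossover question raised earlier in the thread is to list which runs are not beaten on both VMAF and fps by any other run (a Pareto front). The sketch below uses only the numbers from the table above; it covers this one clip at about 200 kbit, considers VMAF only, and the function name is hypothetical.

```python
# (name, VMAF, fps) copied from the table above.
runs = [
    ("Iris Xe CQP FF best", 86.56, 550),
    ("Iris Xe CQP FF balanced", 85.66, 850),
    ("Iris Xe CQP FF speed", 76.25, 1550),
    ("Iris Xe CQP best", 88.59, 167),
    ("Iris Xe CQP balanced", 87.18, 280),
    ("Iris Xe CQP speed", 86.22, 520),
    ("GTX 1660S CQP best", 84.52, 430),
    ("GTX 1660S CQP default", 83.55, 990),
    ("GTX 1660S CQP performance", 79.22, 1130),
    ("x265 CRF slower", 90.25, 8),
    ("x265 CRF slow", 88.18, 34),
    ("x265 CRF medium", 85.72, 56),
    ("x265 CRF very fast", 83.36, 73),
    ("x264 CRF slower", 75.37, 81),
]

def pareto_front(points):
    """Names of runs that no other run matches or beats on both axes."""
    front = []
    for name, vmaf, fps in points:
        dominated = any(
            o_vmaf >= vmaf and o_fps >= fps and (o_vmaf > vmaf or o_fps > fps)
            for _, o_vmaf, o_fps in points
        )
        if not dominated:
            front.append(name)
    return front

for name in pareto_front(runs):
    print(name)
```

A run that drops off this list is both slower and lower-scoring than some other run in the table, so for this clip it is never the best trade-off by VMAF.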

Last edited by Yups; 10th April 2021 at 15:47.
Old 10th April 2021, 17:41   #452  |  Link
Tenkei
Registered User
 
Join Date: Jan 2021
Posts: 9
Is there any reason to use CQP instead of ICQ with QuickSync? I've never used it, but it seems that ICQ is the CRF equivalent. Did you try --ctu 64 and --ref X?

Last edited by Tenkei; 10th April 2021 at 17:46.
Old 10th April 2021, 20:03   #453  |  Link
Yups
Registered User
 
Join Date: Sep 2011
Posts: 286
CQP with a custom offset offers higher quality than ICQ; this old Quicksync bitrate method overview is still valid:

Quote:
Constant QP (CQP) provides the most control and best performance. Without question, the best coding efficiency with Intel codecs can be obtained via CQP plus custom content analysis. CQP often has significant performance advantages as well. CQP operates most closely to reference implementations. It is the most direct way to access codec capabilities and measure the effects of encoder parameter/algorithm trade-offs and also is the clearest way to evaluate against other codec algorithm implementations.
https://software.intel.com/content/w...media-sdk.html


For a basic user ICQ is easier to handle: there is just one global setting and that's it. Furthermore, ICQ does not really scale beyond 5 bframes (16 bframes can be worse than 5 at low bitrate), whereas CQP scales really well beyond 5 bframes even at low bitrate. I included both ICQ and CQP here: https://forum.doom9.org/showpost.php...&postcount=369
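
To illustrate the general idea behind CQP with offsets, here is a rough Python sketch: every frame gets a fixed QP that depends only on its type, with P- and B-frames offset from the I-frame QP. This is one possible reading of "custom offset" and not the actual Quicksync implementation; the function name, the QP values, and the frame pattern are all hypothetical.

```python
def cqp_for_frames(frame_types, qp_i, p_offset=0, b_offset=0):
    """Fixed QP per frame, chosen only by frame type (I, P or B)."""
    qp_by_type = {
        "I": qp_i,
        "P": qp_i + p_offset,
        "B": qp_i + b_offset,
    }
    return [qp_by_type[t] for t in frame_types]

# Hypothetical mini-GOP with 5 consecutive B-frames and made-up QP values.
pattern = "IBBBBBP"
for frame_type, qp in zip(pattern, cqp_for_frames(pattern, qp_i=24, p_offset=2, b_offset=4)):
    print(frame_type, qp)
```

With more B-frames in a row, a larger share of frames is coded at the B-frame QP, which may be part of why the choice of offsets matters more as the bframe count goes up.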

Iris Xe automatically uses ctu 64 (Gen 9: ctu 32); this can't be changed at the moment. Tskip and SAO are also enabled on Tigerlake, and I can't disable them. Reference frames are best left on auto: with 16 bframes + bpyramid Intel sets it to 6 reference frames. I've tried 8 reference frames, but there is no improvement.