Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
22nd January 2017, 15:45 | #101 | Link |
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
Don't post anything here until JohnLai opens a thread regarding HW encoding
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
22nd January 2017, 15:48 | #102 | Link |
Registered User
Join Date: Apr 2002
Location: Germany
Posts: 4,926
|
__________________
all my compares are riddles so please try to decipher them yourselves :) It is about Time Join the Revolution NOW before it is to Late ! http://forum.doom9.org/showthread.php?t=168004 |
22nd January 2017, 16:15 | #103 | Link | |
Registered User
Join Date: Mar 2008
Posts: 448
|
Quote:
By the way, why does rigaya explicitly mention --vbaq (H.264 only)? AMF 1.4 simply states "By default, disable VBAQ". Shouldn't it be possible to enable it for HEVC, or is this just an oversight by rigaya? It appears Long Term Reference picture support is not implemented either. EDIT: If AMD VCE HEVC P frames can refer to multiple preceding frames, LTR support is even more crucial. Last edited by JohnLai; 22nd January 2017 at 16:21. |
|
22nd January 2017, 17:12 | #104 | Link | ||
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
Quote:
I really, really, really can't understand you... Quote:
"VBAQ is not supported with HEVC encoding, disabled."
Yes, I think the first version of VCEEnc supporting AMF has some small bugs and is missing a few things.
While I'm downloading the 10bit HEVC file, I did some tests with this source:
ftp://helpedia.com/pub/multimedia/te...f5-108Mbps.mkv
For all my tests I used --cqp 25 along with --preset-analysis auto. I only changed -u (quality), using the three options fast, balanced and slow.
The results (all HEVC encoded samples are half the size of the original H.264 file; you can judge the quality yourselves):

Ducks_HEVC_CQP25_fast.mkv
https://www.sendspace.com/file/9djrh3
DX9: List of adapters:
0: Device ID: 67DF [Radeon (TM) RX 470 Graphics]
DX9 : Chosen Device 0: Device ID: 67DF [Radeon (TM) RX 470 Graphics]
VCEEnc 3.00 (x64) / Windows 10 (x64)
CPU: Intel Core i5-2400 @ 3.10GHz [TB: 3.20GHz] (4C/4T)
GPU: \\.\DISPLAY1 [Ellesmere 1300MHz (2236.10)]
Input Info: avcodec video: H.264/AVC, 1920x1080, 30000/1001 fps
Output: H.265/HEVC main @ Level 4.1
1920x1080p 1:1 29.970fps (30000/1001fps)
avwriter: hevc => matroska
Quality: balanced
CQP: I:25, P:25
VBV Bufsize: 20000 kbps
Bframes: 0 frames
Motion Est: Q-pel
Slices: 1
GOP Len: 300 frames
Others: deblock hrd pre-analysis:auto
encoded 500 frames, 62.77 fps, 57915.24 kbps, 115.18 MB
encode time 0:00:08, CPULoad: 26.13%
frame type IDR 2
frame type I 2, total size 0.60 MB
frame type P 498, total size 114.58 MB

Ducks_HEVC_CQP25_balanced.mkv
https://www.sendspace.com/file/yfii43
DX9: List of adapters:
0: Device ID: 67DF [Radeon (TM) RX 470 Graphics]
DX9 : Chosen Device 0: Device ID: 67DF [Radeon (TM) RX 470 Graphics]
VCEEnc 3.00 (x64) / Windows 10 (x64)
CPU: Intel Core i5-2400 @ 3.10GHz [TB: 3.30GHz] (4C/4T)
GPU: \\.\DISPLAY1 [Ellesmere 1300MHz (2236.10)]
Input Info: avcodec video: H.264/AVC, 1920x1080, 30000/1001 fps
Output: H.265/HEVC main @ Level 4.1
1920x1080p 1:1 29.970fps (30000/1001fps)
avwriter: hevc => matroska
Quality: balanced
CQP: I:25, P:25
VBV Bufsize: 20000 kbps
Bframes: 0 frames
Motion Est: Q-pel
Slices: 1
GOP Len: 300 frames
Others: deblock hrd pre-analysis:auto
encoded 500 frames, 55.80 fps, 58366.81 kbps, 116.08 MB
encode time 0:00:09, CPULoad: 25.84%
frame type IDR 2
frame type I 2, total size 0.60 MB
frame type P 498, total size 115.48 MB

Ducks_HEVC_CQP25_quality.mkv
https://www.sendspace.com/file/cteioc
DX9: List of adapters:
0: Device ID: 67DF [Radeon (TM) RX 470 Graphics]
DX9 : Chosen Device 0: Device ID: 67DF [Radeon (TM) RX 470 Graphics]
VCEEnc 3.00 (x64) / Windows 10 (x64)
CPU: Intel Core i5-2400 @ 3.10GHz [TB: 3.30GHz] (4C/4T)
GPU: \\.\DISPLAY1 [Ellesmere 1300MHz (2236.10)]
Input Info: avcodec video: H.264/AVC, 1920x1080, 30000/1001 fps
Output: H.265/HEVC main @ Level 4.1
1920x1080p 1:1 29.970fps (30000/1001fps)
avwriter: hevc => matroska
Quality: balanced
CQP: I:25, P:25
VBV Bufsize: 20000 kbps
Bframes: 0 frames
Motion Est: Q-pel
Slices: 1
GOP Len: 300 frames
Others: deblock hrd pre-analysis:auto

During the encoding of all three, GPU utilisation and GPU memory controller load were 0%, the GPU clock was at a low 751MHz, but the GPU memory speed was at its max of 2000MHz.
GPU core only power consumption was about 7.7W to 12.3W.
As you can see, Quality always says balanced, which is probably a bug.
Also, the size of the balanced and slow samples is exactly the same.
Using other 1080p H.264 samples as a source (~45Mbps), I managed a speed of ~100fps for HEVC encoding.
Bframes reported are always 0, no matter what.
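As a quick sanity check on the logged numbers, the file size can be recomputed from the average bitrate, the frame count and the frame rate. This is only a sketch: it assumes the log's kbps means 1000 bit/s and its MB means 1024*1024 bytes, which is not stated anywhere in the output, and the function name is made up for illustration.

```python
from fractions import Fraction


def expected_size_mib(frames, fps, kbps):
    """Size implied by an average bitrate, in MiB.

    Assumption (not confirmed by the tool's output):
    1 kbps = 1000 bit/s and 1 MB = 1024 * 1024 bytes.
    """
    duration_s = Fraction(frames) / fps                   # clip length in seconds
    total_bits = Fraction(str(kbps)) * 1000 * duration_s  # bitrate times duration
    return float(total_bits / 8 / (1024 * 1024))


FPS = Fraction(30000, 1001)  # frame rate reported in the logs

# (label, frames, kbps, size in MB as printed in the log)
RUNS = [
    ("-u fast", 500, 57915.24, 115.18),
    ("-u balanced", 500, 58366.81, 116.08),
]

for label, frames, kbps, logged_mb in RUNS:
    computed = expected_size_mib(frames, FPS, kbps)
    print(f"{label}: computed {computed:.2f} MiB, logged {logged_mb} MB")
```

Under those assumptions both computed sizes agree with the logged ones to two decimals, so the fps, kbps and MB figures in the two complete logs are at least consistent with each other.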
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all Last edited by NikosD; 22nd January 2017 at 17:14. |
||
22nd January 2017, 18:35 | #105 | Link |
Registered User
Join Date: Mar 2008
Posts: 448
|
Darn, I forgot to send a request to rigaya about adding AMF_VIDEO_ENCODER_HEVC_MAX_NUM_REFRAMES support after enumerating AMFCaps interface of AMF_VIDEO_ENCODER_HEVC_CAP_MAX_REFERENCE_FRAMES.
Anyway, currently checking Ducks_HEVC_CQP25_quality.mkv.
Intra PU sizes: 4x4 8x8 16x16 32x32
Inter PU sizes: 8x8 8x16 16x8 16x16 16x32 32x16 32x32 32x64 64x32 64x64
Hmm......

VCE "quality" sample from NikosD
max_transform_hierarchy_depth_inter 4
max_transform_hierarchy_depth_intra 4
transform_skip_enabled_flag 0
cu_qp_delta_enabled_flag 0
pps_loop_filter_across_slices_enabled_flag 0

NVENC using GTX 970
max_transform_hierarchy_depth_inter 3
max_transform_hierarchy_depth_intra 0
transform_skip_enabled_flag 1
cu_qp_delta_enabled_flag 1
pps_loop_filter_across_slices_enabled_flag 1

Well, as usual, no SAO for Polaris.

QSV TU1 Skylake
max_transform_hierarchy_depth_inter 2
max_transform_hierarchy_depth_intra 2
transform_skip_enabled_flag 0
cu_qp_delta_enabled_flag 1
pps_loop_filter_across_slices_enabled_flag 0 |
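To make the three parameter dumps easier to compare, here is a small sketch that holds the values from the post in a table and looks up which encoders share a given setting. The dictionary keys and the helper name are made up for illustration; only the flag names and values come from the post.

```python
# SPS/PPS values as listed in the post, one entry per encoder sample.
FLAGS = {
    "VCE (RX 470)": {
        "max_transform_hierarchy_depth_inter": 4,
        "max_transform_hierarchy_depth_intra": 4,
        "transform_skip_enabled_flag": 0,
        "cu_qp_delta_enabled_flag": 0,
        "pps_loop_filter_across_slices_enabled_flag": 0,
    },
    "NVENC (GTX 970)": {
        "max_transform_hierarchy_depth_inter": 3,
        "max_transform_hierarchy_depth_intra": 0,
        "transform_skip_enabled_flag": 1,
        "cu_qp_delta_enabled_flag": 1,
        "pps_loop_filter_across_slices_enabled_flag": 1,
    },
    "QSV TU1 (Skylake)": {
        "max_transform_hierarchy_depth_inter": 2,
        "max_transform_hierarchy_depth_intra": 2,
        "transform_skip_enabled_flag": 0,
        "cu_qp_delta_enabled_flag": 1,
        "pps_loop_filter_across_slices_enabled_flag": 0,
    },
}


def encoders_where(flag, value):
    """Return the encoders whose sample has `flag` set to `value`."""
    return [name for name, params in FLAGS.items() if params[flag] == value]


for flag in FLAGS["VCE (RX 470)"]:
    row = ", ".join(f"{name}={params[flag]}" for name, params in FLAGS.items())
    print(f"{flag}: {row}")

print("CU-level QP delta disabled on:", encoders_where("cu_qp_delta_enabled_flag", 0))
```

The lookup shows that the VCE sample is the only one with cu_qp_delta_enabled_flag at 0, meaning QP is not varied per coding unit in that stream. That fits a plain CQP encode with VBAQ disabled.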
22nd January 2017, 18:37 | #106 | Link |
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
I will send him a thorough email listing the various small bugs I have found.
Tell me which features from AMF I should ask him to add.
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
22nd January 2017, 18:56 | #107 | Link | |
Registered User
Join Date: Mar 2008
Posts: 448
|
Quote:
HEVC_DE_BLOCKING_FILTER_DISABLE: should this one be set to true or false if I want to keep deblocking active? Judging from the AMF SDK, that's about it. |
|
22nd January 2017, 19:36 | #108 | Link |
Registered User
Join Date: Apr 2002
Location: Germany
Posts: 4,926
|
NVEnc 3.02 (x64), using NVENC API v7.0
OS Version Windows 7 (x64)
CPU Intel Core i5-2400 @ 3.10GHz [TB: 3.30GHz] (4C/4T)
GPU #0: GeForce GTX 970 (13 EU) @ 1266 MHz (375.63)
Input Buffers CUDA, 32 frames
Input Info avsw: hevc(yv12(10bit))->nv12 [SSE2], 3840x2160, 60000/1001 fps
Vpp Filters copyHtoD
Output Info H.265/HEVC main @ Level auto
3840x2160p 1:1 59.940fps (60000/1001fps)
avwriter: hevc => mp4
Rate Control VBR2
Bitrate 25000 kbps (Max: 30000 kbps)
Initial QP I:20 P:23 B:25
VBV buf size auto
Lookahead on, 16 frames, Adaptive I, B Insert
GOP length 600 frames
B frames 0 frames
Ref frames 3 frames, LTR: on
AQ off
MV Quality Q-pel
CU max / min 32 / 8
encoded 6646 frames, 30.14 fps, 22650.44 kbps, 299.38 MB
encode time 0:03:40 / CPU Usage: 72.37%
frame type IDR 31
frame type I 31, avgQP 23.52, total size 15.46 MB
frame type P 6615, avgQP 28.61, total size 283.93 MB

For highest quality playback efficiency, use a renderer with a realtime debanding option (MadVR/MPDotNet/MPV) or push it through a TV PP decoder pipeline (Sony/Samsung etc.)
https://www.sendspace.com/file/declpg
__________________
all my compares are riddles so please try to decipher them yourselves :) It is about Time Join the Revolution NOW before it is to Late ! http://forum.doom9.org/showthread.php?t=168004 |
22nd January 2017, 19:38 | #109 | Link | |
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
Quote:
Let's see what he could manage to add and fix. thanks!
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
|
22nd January 2017, 21:36 | #110 | Link | |
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
Quote:
1) HW decoding in VCEEnc v3.0 is not working the way it did in v2.0: it uses ~25% of a 4C/4T CPU, which means that one core is used at 100%, while v2.0 uses HW decoding with ~2% CPU. But v3.0 is slightly faster than v2.0.
2) The Quality reported by the runtime info is always balanced, no matter what. I mean, even if I choose fast or slow (quality), the Quality info line says "Balanced". I thought it was cosmetic, but when I tried H.264 encoding it says Quality fast, balanced or slow, and the variation in speed between presets is a lot larger than with H.265 encoding.
So, hold your horses about the "quality" sample until rigaya replies about what's really going on.
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
|
22nd January 2017, 23:37 | #111 | Link |
Registered User
Join Date: Apr 2002
Location: Germany
Posts: 4,926
|
375.95 installed, checking for result differences.
So indeed, only a unification of things and no change on the HEVC side, as the release notes also stated. VBR 2-pass is now named after Nvidia's internal naming convention, VBR High Quality, and so on.

NVEnc 3.02 (x64), using NVENC API v7.0
OS Version Windows 7 (x64)
CPU Intel Core i5-2400 @ 3.10GHz [TB: 3.30GHz] (4C/4T)
GPU #0: GeForce GTX 970 (13 EU) @ 1266 MHz (375.63)
Input Buffers CUDA, 32 frames
Input Info avsw: hevc(yv12(10bit))->nv12 [SSE2], 3840x2160, 60000/1001 fps
Vpp Filters copyHtoD
Output Info H.265/HEVC main @ Level auto
3840x2160p 1:1 59.940fps (60000/1001fps)
avwriter: hevc => mp4
Rate Control VBR2
Bitrate 25000 kbps (Max: 30000 kbps)
Initial QP I:20 P:23 B:25
VBV buf size auto
Lookahead on, 16 frames, Adaptive I, B Insert
GOP length 600 frames
B frames 0 frames
Ref frames 3 frames, LTR: on
AQ off
MV Quality Q-pel
CU max / min 32 / 8
encoded 6646 frames, 30.14 fps, 22650.44 kbps, 299.38 MB
encode time 0:03:40 / CPU Usage: 72.37%
frame type IDR 31
frame type I 31, avgQP 23.52, total size 15.46 MB
frame type P 6615, avgQP 28.61, total size 283.93 MB

NVEnc 3.02 (x64), using NVENC API v7.0
OS Version Windows 7 (x64)
CPU Intel Core i5-2400 @ 3.10GHz [TB: 3.30GHz] (4C/4T)
GPU #0: GeForce GTX 970 (13 EU) @ 1266 MHz (375.95)
Input Buffers CUDA, 32 frames
Input Info avsw: hevc(yv12(10bit))->nv12 [SSE2], 3840x2160, 60000/1001 fps
Vpp Filters copyHtoD
Output Info H.265/HEVC main @ Level auto
3840x2160p 1:1 59.940fps (60000/1001fps)
avwriter: hevc => mp4
Rate Control VBR2
Bitrate 25000 kbps (Max: 30000 kbps)
Initial QP I:20 P:23 B:25
VBV buf size auto
Lookahead on, 16 frames, Adaptive I, B Insert
GOP length 600 frames
B frames 0 frames
Ref frames 3 frames, LTR: on
AQ off
MV Quality Q-pel
CU max / min 32 / 8
encoded 6646 frames, 29.87 fps, 22650.44 kbps, 299.38 MB
encode time 0:03:42 / CPU Usage: 70.54%
frame type IDR 31
frame type I 31, avgQP 23.52, total size 15.46 MB
frame type P 6615, avgQP 28.61, total size 283.93 MB

NVEnc 3.05 (x64), using NVENC API v7.1
OS Version Windows 7 (x64)
CPU Intel Core i5-2400 @ 3.10GHz [TB: 3.30GHz] (4C/4T)
GPU #0: GeForce GTX 970 (13 EU) @ 1266 MHz (375.95)
Input Buffers CUDA, 32 frames
Input Info avsw: hevc(yv12(10bit))->nv12 [SSE2], 3840x2160, 60000/1001 fps
Vpp Filters copyHtoD
Output Info H.265/HEVC main @ Level auto
3840x2160p 1:1 59.940fps (60000/1001fps)
avwriter: hevc => mp4
Rate Control VBRHQ
Bitrate 25000 kbps (Max: 30000 kbps)
Initial QP I:20 P:23 B:25
VBV buf size auto
Lookahead on, 16 frames, Adaptive I, B Insert
GOP length 600 frames
B frames 0 frames
Ref frames 3 frames, LTR: on
AQ off
MV Quality Q-pel
CU max / min 32 / 8
encoded 6646 frames, 30.27 fps, 22650.44 kbps, 299.38 MB
encode time 0:03:39 / CPU Usage: 72.38%
frame type IDR 31
frame type I 31, avgQP 23.52, total size 15.46 MB
frame type P 6615, avgQP 28.61, total size 283.93 MB

btw
[mpegts @ 00000000003779a0] start time for stream 1 is not set in estimate_timings_from_pts
[mpegts @ 00000000003779a0] Could not find codec parameters for stream 1 (Audio: aac ([15][0][0][0] / 0x000F), 0 channels, fltp): unspecified sample rate
Consider increasing the value for the 'analyzeduration' and 'probesize' options
so it's not really rigaya's fault.
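To put a number on "no change", the three runs can be compared directly. The sketch below only uses the fps and size figures from the logs in this post; the run labels and the helper name are made up for illustration.

```python
from statistics import mean

# (run label, fps, output size in MB) taken from the three logs in this post
RUNS = [
    ("NVEnc 3.02 / driver 375.63", 30.14, 299.38),
    ("NVEnc 3.02 / driver 375.95", 29.87, 299.38),
    ("NVEnc 3.05 / driver 375.95", 30.27, 299.38),
]


def fps_spread_percent(runs):
    """Difference between fastest and slowest run, as a percentage of the mean fps."""
    fps = [r[1] for r in runs]
    return (max(fps) - min(fps)) / mean(fps) * 100


sizes = {r[2] for r in RUNS}
print(f"fps spread: {fps_spread_percent(RUNS):.2f}% of the mean")
print("identical output size:", len(sizes) == 1)
```

The speed differs by a little over 1% between runs while the output size is identical, which is in line with the driver update not changing the HEVC results. With a single run per configuration, a difference that small could simply be run-to-run noise.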
__________________
all my compares are riddles so please try to decipher them yourselves :) It is about Time Join the Revolution NOW before it is to Late ! http://forum.doom9.org/showthread.php?t=168004 Last edited by CruNcher; 23rd January 2017 at 01:00. |
23rd January 2017, 15:14 | #112 | Link |
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
OK, I got some interesting replies from rigaya.
He will add REF in the next version and will probably add LTR some time later.
Regarding HRD, it is always enabled; there is no need to disable it.
Deblocking filter: always enabled, otherwise bad video quality.
VBAQ for HEVC: always disabled, otherwise bad video quality.
Regarding the HEVC -u quality options always showing "balanced", he had no clue but will investigate. It could be a bug or a limitation.
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
23rd January 2017, 15:20 | #113 | Link |
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
I also told him to add:
HEVC tier (main, high)
Full range color (H.264 only)
HW HEVC decoding
He could probably add them later.
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
23rd January 2017, 16:05 | #114 | Link |
Registered User
Join Date: Mar 2008
Posts: 448
|
Gotcha, NikosD.
Once rigaya-san adds the ref + LTR support, we can finally know whether multiple reference frames are used. I thought VBAQ was an acronym for Variable Bitrate Adaptive Quantization; clearly it is not, since the developer said it produces bad quality. Hmm, reading through the AMF SDK source code... where is the two-pass encoding AMD promised? There is nothing about two-pass in the AMF SDK. |
23rd January 2017, 20:39 | #115 | Link |
Registered User
Join Date: Apr 2002
Location: Germany
Posts: 4,926
|
NikosD, how does the Samsung 10->8bit re-transcode look for you in a direct comparison with the Nvidia one I posted, at the 2x reduction target? Any watchable results yet?
I'm currently trying to get some results out of ffmpeg, but that is tricky with the parsing issue, not as easy as I thought. I wonder if that .ts is corrupt overall, but LAV Splitter has zero issues with it, MPV and others show no issues either, and adjusting the probesize doesn't fix it. Very weird. I haven't got any NVENC encoding result out of ffmpeg yet; it behaves crazily with that input. It also tells me it can't find any NVENC device at all. I tried different pixel formats and other things, but it acts totally weird, even though -gpu list detects the card correctly. Pretty frustrating.
__________________
all my compares are riddles so please try to decipher them yourselves :) It is about Time Join the Revolution NOW before it is to Late ! http://forum.doom9.org/showthread.php?t=168004 Last edited by CruNcher; 23rd January 2017 at 20:50. |
23rd January 2017, 20:44 | #116 | Link |
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
I'll wait a little for some basic fixes of rigaya regarding VCEENC before I try that sample.
You can see the results of HEVC encoding of the source and the samples I posted.
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
23rd January 2017, 21:17 | #117 | Link |
Registered User
Join Date: Apr 2002
Location: Germany
Posts: 4,926
|
@NikosD
OK, so only ducks, jellyfish and some desktop results to compare directly for now. Not ideal, but better than nothing. Let's hope some Intel user with Skylake/Kaby Lake joins in; then we can throw each other's results around and compare, though obviously neither of us would have any chance against Quick Sync overall.
What the ????
Code:
[graph 0 input from stream 0:0 @ 0000000000357140] Setting 'video_size' to value '3840x2160'
[graph 0 input from stream 0:0 @ 0000000000357140] Setting 'pix_fmt' to value '72'
[graph 0 input from stream 0:0 @ 0000000000357140] Setting 'time_base' to value '1/90000'
[graph 0 input from stream 0:0 @ 0000000000357140] Setting 'pixel_aspect' to value '1/1'
[graph 0 input from stream 0:0 @ 0000000000357140] Setting 'sws_param' to value 'flags=2'
[graph 0 input from stream 0:0 @ 0000000000357140] Setting 'frame_rate' to value '60000/1001'
[graph 0 input from stream 0:0 @ 0000000000357140] w:3840 h:2160 pixfmt:yuv420p10le tb:1/90000 fr:60000/1001 sar:1/1 sws_param:flags=2
[format @ 0000000000358620] compat: called with args=[yuv420p|nv12|p010le|yuv444p|yuv444p16le|bgr0|rgb0|cuda]
[format @ 0000000000358620] Setting 'pix_fmts' to value 'yuv420p|nv12|p010le|yuv444p|yuv444p16le|bgr0|rgb0|cuda'
[auto_scaler_0 @ 0000000000358b80] Setting 'flags' to value 'bicubic'
[auto_scaler_0 @ 0000000000358b80] w:iw h:ih flags:'bicubic' interl:0
[format @ 0000000000358620] auto-inserting filter 'auto_scaler_0' between the filter 'Parsed_null_0' and the filter 'format'
[AVFilterGraph @ 0000000002f661e0] query_formats: 4 queried, 2 merged, 1 already done, 0 delayed
[auto_scaler_0 @ 0000000000358b80] picking p010le out of 7 ref:yuv420p10le alpha:0
[auto_scaler_0 @ 0000000000358b80] w:3840 h:2160 fmt:yuv420p10le sar:1/1 -> w:3840 h:2160 fmt:p010le sar:1/1 flags:0x4
[hevc_nvenc @ 0000000003269020] Loaded Nvenc version 7.1
[hevc_nvenc @ 0000000003269020] Nvenc initialized successfully
[hevc_nvenc @ 0000000003269020] 1 CUDA capable devices found
[hevc_nvenc @ 0000000003269020] [ GPU #0 - < GeForce GTX 970 > has Compute SM 5.2 ]
[hevc_nvenc @ 0000000003269020] 10 bit encode not supported
[hevc_nvenc @ 0000000003269020] No NVENC capable devices found
[hevc_nvenc @ 0000000003269020] Nvenc unloaded
Stream mapping:
Stream #0:0 -> #0:0 (hevc (native) -> hevc (hevc_nvenc))
Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
[AVIOContext @ 00000000003bf520] Statistics: 0 seeks, 0 writeouts
[AVIOContext @ 00000000006b90a0] Statistics: 29381488 bytes read, 8 seeks

OK, got it: ffmpeg's picky parser. The first result doesn't give the same impression of quality as rigaya's NVEncC output I posted above. I need to tweak the options to get it to the same level first; -preset slow alone doesn't seem to be on that level.
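The post does not say what the actual fix was. Going by the log, the auto-inserted scaler picked p010le (10-bit) and hevc_nvenc then refused with "10 bit encode not supported", so one plausible workaround is to request an 8-bit pixel format explicitly. The sketch below only builds such a command line and does not run it; the file names and the helper name are hypothetical, and whether this matches what was really done is an assumption.

```python
import shlex


def build_nvenc_8bit_cmd(src, dst, preset="slow"):
    """Build an ffmpeg command that converts to 8-bit 4:2:0 before hevc_nvenc.

    Assumption: the card cannot encode 10-bit HEVC, as the log above reports,
    so -pix_fmt yuv420p is requested instead of letting ffmpeg pick p010le.
    """
    return [
        "ffmpeg",
        "-i", src,              # 10-bit HEVC input
        "-c:v", "hevc_nvenc",   # NVENC HEVC encoder
        "-pix_fmt", "yuv420p",  # force 8-bit 4:2:0 output format
        "-preset", preset,      # the post mentions trying -preset slow
        dst,
    ]


# Hypothetical file names, for illustration only.
cmd = build_nvenc_8bit_cmd("samsung_10bit.ts", "out_8bit.mp4")
print(shlex.join(cmd))
```

Building the argument list separately keeps the 8-bit choice visible in one place and makes it easy to pass the list to subprocess.run later without shell quoting problems.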
__________________
all my compares are riddles so please try to decipher them yourselves :) It is about Time Join the Revolution NOW before it is to Late ! http://forum.doom9.org/showthread.php?t=168004 Last edited by CruNcher; 23rd January 2017 at 23:51. |
24th January 2017, 20:53 | #118 | Link |
Registered User
Join Date: Jan 2002
Posts: 332
|
For your information, there is a new Media Server Studio 2017 R2 for Intel QSV:
https://software.intel.com/en-us/for...k/topic/708917 |
25th January 2017, 16:25 | #119 | Link |
Registered User
Join Date: Mar 2008
Posts: 448
|
http://rigaya34589.blog135.fc2.com/blog-entry-891.html
VCEEnc 3.01 is out. Google Translate version of the changelog:
Added functions and fixed bugs.
[Common]
· Check the function of VCE at the time of execution and check the parameters.
· Added option to specify reference distance. (--ref <int>)
· Added option to specify the number of LTR frames. (--ltr <int>)
· Added H.264 Level 5.2.
· Version of AMF added to version information.
[VCEEncC]
· Fixed spelling error etc in help.
· Added option to check the function of VCE. (--check-features) The function of HEVC can not be displayed normally. Is this ...?
· Added HW decoding of HEVC (8 bit).
· Since wmv3's HW decoding does not work properly, it is deleted. |
25th January 2017, 16:30 | #120 | Link |
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
I know.
I've already exchanged a few emails with rigaya
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
|
|