Old 17th September 2019, 15:30   #921  |  Link
redbtn
Registered User
 
Join Date: Jan 2019
Location: Russia
Posts: 105
Quote:
Originally Posted by poisondeathray View Post
It depends on your scenario

e.g. if the bottleneck is your encoder/encoding settings, or some filters, offloading the decoding to the GPU might not make much of a difference

e.g. if decoding only uses 0.5% of your CPU (maybe SD footage), it might not make much of a difference either

Other scenarios might be different, e.g. decoding UHD/4K footage might take significant CPU resources. Offloading that decoding task to the GPU should free up CPU cycles to encode faster (if using a "CPU encoder")
Thank you for explaining. I encode 4K HDR > 1080p HDR using x265 and VapourSynth, so I will try again; maybe I will see a difference. Is it right that software and hardware decoders work the same, with no difference in quality or stability? Can I safely choose the hardware decoder for all my encodes and not worry about something going wrong?
Old 17th September 2019, 15:41   #922  |  Link
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,377
Quote:
Originally Posted by redbtn View Post
Thank you for explaining. I encode 4K HDR > 1080p HDR using x265 and VapourSynth, so I will try again; maybe I will see a difference. Is it right that software and hardware decoders work the same, with no difference in quality or stability? Can I safely choose the hardware decoder for all my encodes and not worry about something going wrong?
Money back guarantee

There were differences in earlier versions with CUVID, but this version looks to have fixed it

I don't think there has been enough testing to ensure everything works 100%

If something goes wrong, report it. That's the only way stuff gets fixed
Old 17th September 2019, 15:48   #923  |  Link
stax76
Registered User
 
Join Date: Jun 2002
Location: On thin ice
Posts: 6,837
Quote:
Can I ask whether it matters if I use a hardware decoder? I have an Nvidia RTX 2060, but I ran a test and didn't notice a difference in encoding speed.
Which decoder is preferred in my case? Thank you!
How many CPU cores? I don't think you gain anything substantial if you have more than 4 cores; that's what my encoding test showed: SW 60 fps, HW 61 fps.

StaxRip doesn't have the most efficient drawing implementation, so when dealing with 4K in the crop and preview dialogs users get a noticeable improvement with a HW decoder.

Last edited by stax76; 17th September 2019 at 15:57.
Old 17th September 2019, 15:57   #924  |  Link
redbtn
Registered User
 
Join Date: Jan 2019
Location: Russia
Posts: 105
Quote:
Originally Posted by stax76 View Post
How many CPU cores? I don't think you gain anything substantial if you have more than 4 cores; that's what my encoding test showed.

StaxRip doesn't have the most efficient drawing implementation, so when dealing with 4K in the crop and preview dialogs users get a noticeable improvement with a HW decoder.
I have an i5-9400F 6-core processor. With the x265 slower preset and some minor changes I get 2.4-2.6 fps.
So if I can't see any difference in my tests, the best choice is the software decoder, right?

Last edited by redbtn; 17th September 2019 at 15:59.
Old 17th September 2019, 17:44   #925  |  Link
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 7,815
Quote:
Originally Posted by redbtn View Post
I have an i5-9400F 6-core processor. With the x265 slower preset and some minor changes I get 2.4-2.6 fps.
So if I can't see any difference in my tests, the best choice is the software decoder, right?
No wonder you see no difference with hardware decoding if the encoder is only requesting, on average, 1 frame every 400 ms.
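
The 400 ms figure follows directly from the encoding speed. Here is a minimal sketch of that arithmetic, assuming the encoder pulls frames at a steady rate. The function names are hypothetical, and the 130 fps decode rate is only an example input (the Kepler figure quoted elsewhere in this thread).

```python
def frame_interval_ms(encode_fps: float) -> float:
    """Average time between frame requests from the encoder, in milliseconds."""
    return 1000.0 / encode_fps


def decoder_busy_fraction(encode_fps: float, decode_fps: float) -> float:
    """Fraction of wall-clock time a decoder capable of decode_fps is busy
    when the encoder only asks for encode_fps frames per second."""
    return encode_fps / decode_fps


if __name__ == "__main__":
    for fps in (2.4, 2.6):
        print(f"{fps} fps -> one frame every {frame_interval_ms(fps):.0f} ms")
    # Example input only: 2.5 fps encode against a decoder that can do 130 fps.
    print(f"decoder busy: {decoder_busy_fraction(2.5, 130.0):.1%}")
```

At 2.4-2.6 fps the interval works out to roughly 385-417 ms per frame, so a decoder that can deliver a frame in a few milliseconds spends almost all of its time waiting.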
Old 17th September 2019, 17:45   #926  |  Link
MeteorRain
結城有紀
 
Join Date: Dec 2003
Location: NJ; OR; Shanghai
Posts: 894
A hardware decoder can easily reach 200 fps on a 1080p source.
__________________
Projects
x265 - Yuuki-Asuna-mod Download / GitHub
TS - ADTS AAC Splitter | LATM AAC Splitter | BS4K-ASS
Neo AviSynth+ filters - F3KDB | FFT3D | DFTTest | MiniDeen | Temporal Median
Old 17th September 2019, 17:48   #927  |  Link
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 7,815
Quote:
Originally Posted by MeteorRain View Post
A hardware decoder can easily reach 200 fps on a 1080p source.
Decoding speed with a Blu-ray AVC source is around 130 fps on Kepler.
For comparison:
Q8200@2.8GHz reaches 100 fps (~85% CPU usage)
Xeon E5-2690@3.2GHz reaches 440 fps (~75% CPU usage)

Last edited by Atak_Snajpera; 17th September 2019 at 18:13.
Old 18th September 2019, 01:06   #928  |  Link
MeteorRain
結城有紀
 
Join Date: Dec 2003
Location: NJ; OR; Shanghai
Posts: 894
Thanks for the correction. Still, those fps are "free" fps that don't compete with encoding at all. When transcoding 4K HEVC, a hardware decoder can free up a large portion of CPU resources and leave them for encoding. I'd still use a hardware decoder whenever I can.
Old 18th September 2019, 09:04   #929  |  Link
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 7,815
Quote:
Originally Posted by MeteorRain View Post
Thanks for the correction. Still, those fps are "free" fps that don't compete with encoding at all. When transcoding 4K HEVC, a hardware decoder can free up a large portion of CPU resources and leave them for encoding. I'd still use a hardware decoder whenever I can.
In theory yes, but in practice no. The encoder will still be responsible for about 98% of CPU time. You will only see a difference if you encode 100 Mbps 4K HEVC to a low resolution with x264 and preset superfast.

Last edited by Atak_Snajpera; 18th September 2019 at 09:10.
Old 18th September 2019, 09:07   #930  |  Link
DJATOM
Registered User
 
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 377
If you're using heavy filters (like eedi3 or MDegrain), you'll definitely see an improvement in speed with a HW decoder.
__________________
Me on GitHub
PC Specs: Ryzen 5950X, 64 GB RAM, RTX 2070
Old 18th September 2019, 09:13   #931  |  Link
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 7,815
Quote:
Originally Posted by DJATOM View Post
If you're using heavy filters (like eedi3 or MDegrain), you'll definitely see an improvement in speed with a HW decoder.
Nope, because those filters run on the CPU, creating an additional bottleneck.
Old 18th September 2019, 09:54   #932  |  Link
DJATOM
Registered User
 
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 377
Quote:
Originally Posted by Atak_Snajpera View Post
Nope, because those filters run on the CPU, creating an additional bottleneck.
Yeah, and offloading the decoder to the GPU saves CPU cycles for the encoder and filters.
Old 18th September 2019, 10:08   #933  |  Link
sneaker_ger
Registered User
 
Join Date: Dec 2002
Posts: 5,565
Atak_Snajpera is right. The more of my CPU time "cake" is spent on filtering and encoding, the less impact HW decoding will have on speed. That doesn't mean HW decoding won't make encoding faster; it's just that the percentage goes down. (The exception is if my CPU isn't 100% utilized.)

Imagine you use an AV1 encoder with placebo settings, very slow filtering (QTGMC+waifu2x) and software decoding of the source. Then 99% of CPU time is spent on encoding+filtering and 1% on SW decoding. If you replace SW decoding with HW decoding, you only free up that 1%. If you use no filtering and very fast encoder settings (x264 preset ultrafast), maybe you have 70% encoding and 30% SW decoding. Then replacing SW with HW decoding can increase speed much more.
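
The two cases above can be turned into numbers with an Amdahl-style estimate. This is a rough sketch under stated assumptions, not a measurement: the CPU is fully utilized, HW decoding costs no CPU at all, and the freed CPU time is used perfectly by the encoder and filters. The function name is hypothetical.

```python
def speedup_from_offload(decode_share: float) -> float:
    """Upper bound on the speedup from moving decoding off the CPU.

    decode_share is the fraction of total CPU time spent on software
    decoding (0.01 for 1%, 0.30 for 30%).
    """
    if not 0.0 <= decode_share < 1.0:
        raise ValueError("decode_share must be in [0, 1)")
    return 1.0 / (1.0 - decode_share)


if __name__ == "__main__":
    for share in (0.01, 0.30):
        gain = speedup_from_offload(share) - 1.0
        print(f"decode share {share:.0%} -> at most {gain:.1%} faster")
```

With a 1% decode share the bound is about 1% faster; with a 30% share it is about 43% faster. Real gains will be lower, since copying frames back from the GPU is not free.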
Old 18th September 2019, 10:11   #934  |  Link
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 7,815
Quote:
Originally Posted by DJATOM View Post
Yeah, and offloading the decoder to the GPU saves CPU cycles for the encoder and filters.
You do not realize how the encoding chain works. The encoder determines how fast IT needs frames from the decoder. Any filtering in AviSynth will only slow down frame requests.
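
The pull-driven chain described here can be illustrated with a toy model built on Python generators. This is only a sketch of the idea that the decoder works when the encoder asks for a frame; it is not how AviSynth is implemented and it ignores caching and prefetching. All names are hypothetical.

```python
from typing import Iterator, List, Tuple

Event = Tuple[str, int]


def decoder(n_frames: int, log: List[Event]) -> Iterator[int]:
    """Yield frame numbers lazily; a frame is decoded only when requested."""
    for i in range(n_frames):
        log.append(("decode", i))
        yield i


def slow_filter(frames: Iterator[int], log: List[Event]) -> Iterator[int]:
    """Stand-in for a CPU filter sitting between decoder and encoder."""
    for f in frames:
        log.append(("filter", f))
        yield f


def encoder(frames: Iterator[int], n_requests: int, log: List[Event]) -> None:
    """The encoder drives the chain by pulling one frame per request."""
    for _ in range(n_requests):
        f = next(frames)
        log.append(("encode", f))


def run(n_source_frames: int, n_requests: int) -> List[Event]:
    log: List[Event] = []
    encoder(slow_filter(decoder(n_source_frames, log), log), n_requests, log)
    return log


if __name__ == "__main__":
    for event in run(n_source_frames=1000, n_requests=3):
        print(event)
```

Even with 1000 source frames available, only three are decoded, because the encoder only requested three. A faster decoder does not change how often it is asked for a frame.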
Old 18th September 2019, 10:17   #935  |  Link
ChaosKing
Registered User
 
Join Date: Dec 2005
Location: Germany
Posts: 1,795
Code:
# Latest lsmash Nvidia gpu
Output 3001 frames in 34.80 seconds (86.24 fps) # SMDegrain(clip, tr=1)
Output 3001 frames in 88.24 seconds (34.01 fps) # SMDegrain(clip, tr=3)
# CPU
Output 3001 frames in 40.83 seconds (73.51 fps) # SMDegrain(clip, tr=1)
Output 3001 frames in 95.05 seconds (31.57 fps) # SMDegrain(clip, tr=3)
Tested on ryzen 2600, GTX 1070 in vapoursynth
source is 1080p AVC
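
For reference, the relative gains in the figures above can be computed as follows. This is a small sketch that only reuses the four fps values posted; the dictionary layout and function name are made up for illustration.

```python
# fps values copied from the benchmark output above
results = {
    "SMDegrain tr=1": {"gpu_fps": 86.24, "cpu_fps": 73.51},
    "SMDegrain tr=3": {"gpu_fps": 34.01, "cpu_fps": 31.57},
}


def relative_gain(gpu_fps: float, cpu_fps: float) -> float:
    """Speed gain of GPU decoding over CPU decoding, as a fraction."""
    return gpu_fps / cpu_fps - 1.0


if __name__ == "__main__":
    for name, r in results.items():
        gain = relative_gain(r["gpu_fps"], r["cpu_fps"])
        print(f"{name}: {gain:+.1%}")
```

That is about +17% with tr=1 and about +8% with tr=3: the heavier the filtering, the smaller the relative gain.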
__________________
AVSRepoGUI // VSRepoGUI - Package Manager for AviSynth // VapourSynth
VapourSynth Portable FATPACK || VapourSynth Database

Last edited by ChaosKing; 18th September 2019 at 10:34.
Old 18th September 2019, 13:06   #936  |  Link
DJATOM
Registered User
 
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 377
Quote:
Originally Posted by Atak_Snajpera View Post
You do not realize how the encoding chain works. The encoder determines how fast IT needs frames from the decoder. Any filtering in AviSynth will only slow down frame requests.
I'm talking about CPU usage, not how frame requests work. Obviously a software decoder will leave less room for other stuff, and that was my point. You can keep repeating your mantra about frame requests, but you can't say "software decoder is free for CPU", right?
Old 18th September 2019, 14:49   #937  |  Link
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 7,815
Quote:
Originally Posted by DJATOM View Post
I'm talking about CPU usage, not how frame requests work. Obviously a software decoder will leave less room for other stuff, and that was my point. You can keep repeating your mantra about frame requests, but you can't say "software decoder is free for CPU", right?
Check this out!

CPU: Intel Q8200@2.8GHz
GPU: NVidia GT 710 (Kepler)
SSD: Yes

Source Blu-ray John Carter (first 10 minutes)
Code:
Video
ID                                       : 1
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : High@L4.1
Format settings                          : CABAC / 4 Ref Frames
Format settings, CABAC                   : Yes
Format settings, ReFrames                : 4 frames
Format settings, GOP                     : M=3, N=18
Muxing mode                              : Container profile=@0.0
Codec ID                                 : V_MPEG4/ISO/AVC
Duration                                 : 10 min 0 s
Bit rate mode                            : Variable
Bit rate                                 : 26.9 Mb/s
Maximum bit rate                         : 40.0 Mb/s
Width                                    : 1 920 pixels
Height                                   : 1 080 pixels
Display aspect ratio                     : 16:9
Frame rate mode                          : Constant
Frame rate                               : 23.976 (24000/1001) FPS
Standard                                 : NTSC
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.541
Stream size                              : 1.88 GiB (98%)
Default                                  : No
Forced                                   : No
Color range                              : Limited
Color primaries                          : BT.709
Transfer characteristics                 : BT.709
Matrix coefficients                      : BT.709
ScriptSW.avs
Code:
LoadPlugin("C:\Program Files (x86)\RipBot264\Tools\AviSynth plugins\lsmash\LSMASHSource.dll")
video=LWLibavVideoSource("C:\Temp\Video.mkv",cachefile="C:\Temp\Video.mkv.lwi",prefer_hw=0)
return video
ScriptHW.avs
Code:
LoadPlugin("C:\Program Files (x86)\RipBot264\Tools\AviSynth plugins\lsmash\LSMASHSource.dll")
video=LWLibavVideoSource("C:\Temp\Video.mkv",cachefile="C:\Temp\Video.mkv.lwi",prefer_hw=1)
return video
Decoding speed test in AVSMeter
Software Decoding
Code:
Log file created with:      AVSMeter 2.9.6 (x64)
Script file:                C:\Temp\scriptSW.avs

[OS/Hardware info]
Operating system:           Windows 7 (x64) Service Pack 1.0 (Build 7601)

CPU:                        Intel(R) Core(TM)2 Quad CPU Q8200 @ 2.80GHz / Yorkfield (Core 2 Quad) 2M
                            MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1
                            4 physical cores / 4 logical cores

Video card:                 NVIDIA GeForce GT 710
GPU version:                GK208
Video memory size:          1024 MiB
OpenCL version:             OpenCL 1.2 CUDA
Graphics driver version:    26.21.14.3615 (NVIDIA 436.15) / Win7 64


[Avisynth info]
VersionString:              AviSynth+ 0.1 (r2772, MT, x86_64)
VersionNumber:              2.60
File / Product version:     0.1.0.0 / 0.1.0.0
Interface Version:          6
Multi-threading support:    Yes
Avisynth.dll location:      C:\Windows\system32\avisynth.dll
Avisynth.dll time stamp:    2018-12-20, 12:55:18 (UTC)
PluginDir2_5 (HKLM, x64):   C:\Program Files (x86)\AviSynth+\plugins64
PluginDir+   (HKLM, x64):   C:\Program Files (x86)\AviSynth+\plugins64+


[Clip info]
Number of frames:                14405
Length (hh:mm:ss.ms):     00:10:00.809
Frame width:                      1920
Frame height:                     1080
Framerate:                      23.976 (24000/1001)
Colorspace:                       i420
Audio channels:                    n/a
Audio bits/sample:                 n/a
Audio sample rate:                 n/a
Audio samples:                     n/a


[Runtime info]
Frames processed:               14405 (0 - 14404)
FPS (min | max | average):      76.92 | 256.0 | 124.1
Process memory usage (max):     85 MiB
Thread count:                   10
CPU usage (average):            92.8%

GPU usage (average):            1%
VPU usage (average):            0%
GPU memory usage:               130 MiB

Time (elapsed):                 00:01:56.122


[Script]
LoadPlugin("C:\Program Files (x86)\RipBot264\Tools\AviSynth plugins\lsmash\LSMASHSource.dll")
video=LWLibavVideoSource("C:\Temp\Video.mkv",cachefile="C:\Temp\Video.mkv.lwi",prefer_hw=0)
return video
Hardware Decoding
Code:
Log file created with:      AVSMeter 2.9.6 (x64)
Script file:                C:\Temp\scriptHW.avs

[OS/Hardware info]
Operating system:           Windows 7 (x64) Service Pack 1.0 (Build 7601)

CPU:                        Intel(R) Core(TM)2 Quad CPU Q8200 @ 2.80GHz / Yorkfield (Core 2 Quad) 2M
                            MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1
                            4 physical cores / 4 logical cores

Video card:                 NVIDIA GeForce GT 710
GPU version:                GK208
Video memory size:          1024 MiB
OpenCL version:             OpenCL 1.2 CUDA
Graphics driver version:    26.21.14.3615 (NVIDIA 436.15) / Win7 64


[Avisynth info]
VersionString:              AviSynth+ 0.1 (r2772, MT, x86_64)
VersionNumber:              2.60
File / Product version:     0.1.0.0 / 0.1.0.0
Interface Version:          6
Multi-threading support:    Yes
Avisynth.dll location:      C:\Windows\system32\avisynth.dll
Avisynth.dll time stamp:    2018-12-20, 12:55:18 (UTC)
PluginDir2_5 (HKLM, x64):   C:\Program Files (x86)\AviSynth+\plugins64
PluginDir+   (HKLM, x64):   C:\Program Files (x86)\AviSynth+\plugins64+


[Clip info]
Number of frames:                14405
Length (hh:mm:ss.ms):     00:10:00.809
Frame width:                      1920
Frame height:                     1080
Framerate:                      23.976 (24000/1001)
Colorspace:                       i420
Audio channels:                    n/a
Audio bits/sample:                 n/a
Audio sample rate:                 n/a
Audio samples:                     n/a


[Runtime info]
Frames processed:               14405 (0 - 14404)
FPS (min | max | average):      38.94 | 148.9 | 122.6
Process memory usage (max):     91 MiB
Thread count:                   8
CPU usage (average):            13.2%

GPU usage (average):            22%
VPU usage (average):            99%
GPU memory usage:               230 MiB

Time (elapsed):                 00:01:57.449


[Script]
LoadPlugin("C:\Program Files (x86)\RipBot264\Tools\AviSynth plugins\lsmash\LSMASHSource.dll")
video=LWLibavVideoSource("C:\Temp\Video.mkv",cachefile="C:\Temp\Video.mkv.lwi",prefer_hw=1)
return video
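
One way to read the two AVSMeter logs side by side is to convert them into CPU time spent per decoded frame. The sketch below uses only the frame count, elapsed time and average CPU usage from the logs above. It assumes that "CPU usage (average)" is a percentage of the total capacity of all 4 cores, which is an assumption about AVSMeter's reporting; the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Run:
    frames: int
    elapsed_s: float
    cpu_usage: float  # average usage as a fraction of all cores (assumed)
    cores: int

    def core_ms_per_frame(self) -> float:
        """CPU time consumed per frame, in core-milliseconds."""
        core_seconds = self.cpu_usage * self.cores * self.elapsed_s
        return core_seconds / self.frames * 1000.0


# Values taken from the software and hardware decoding logs above.
sw = Run(frames=14405, elapsed_s=116.122, cpu_usage=0.928, cores=4)
hw = Run(frames=14405, elapsed_s=117.449, cpu_usage=0.132, cores=4)

if __name__ == "__main__":
    print(f"SW decode: {sw.core_ms_per_frame():.1f} core-ms per frame")
    print(f"HW decode: {hw.core_ms_per_frame():.1f} core-ms per frame")
```

Under that assumption the software path costs roughly 30 core-ms per frame and the hardware path roughly 4, while the decoding speed is nearly identical (124.1 vs 122.6 fps).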
Results when encoding in x264 (logs -> https://www.mediafire.com/file/9cs3p...x/Logs.7z/file )


The slower the encoder, the less you get from hardware decoding! It would be even worse if I added any filtering in AviSynth, like MDegrain, not to mention the much slower x265. Deal with it! Most of the time you get a placebo effect.

Last edited by Atak_Snajpera; 18th September 2019 at 20:35.
Old 18th September 2019, 16:54   #938  |  Link
MeteorRain
結城有紀
 
Join Date: Dec 2003
Location: NJ; OR; Shanghai
Posts: 894
Time saved is time saved. However slow a preset you choose, the amount of CPU resources the HW decoder saves you is basically the same.

The CPU resources saved always equal the total cost of the SW decoder, minus the effort to copy pictures from the graphics card buffer, right?

Saving 5 minutes off a 10-minute ultrafast encode is great, but saving 5 minutes off a 50-minute medium encode is not bad either.
It's like upgrading your CPU from a 3600 to a 3600X for free. It's a free 5 minutes, I'll take it.
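
The same absolute saving looks very different as a share of the total job. A tiny sketch of that arithmetic, using the illustrative 5-minute figure from this post (not a measurement) and a hypothetical helper name:

```python
def percent_saved(total_min: float, saved_min: float) -> float:
    """Share of the original encode time that the saving represents."""
    return saved_min / total_min * 100.0


if __name__ == "__main__":
    for total in (10, 50):
        print(f"5 min off a {total} min encode = {percent_saved(total, 5):.0f}% saved")
```

So the saving is 50% of the fast encode and 10% of the slow one: constant in minutes, shrinking in percent.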
Old 18th September 2019, 17:07   #939  |  Link
DJATOM
Registered User
 
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 377
Fortunately I'm using VapourSynth nowadays. It looks like it's fine with both HW and SW decoding (no penalty from HW decoding with heavy filtering).
Old 18th September 2019, 17:46   #940  |  Link
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 7,815
Quote:
Originally Posted by MeteorRain View Post
Time saved is time saved. However slow a preset you choose, the amount of CPU resources the HW decoder saves you is basically the same.

The CPU resources saved always equal the total cost of the SW decoder, minus the effort to copy pictures from the graphics card buffer, right?

Saving 5 minutes off a 10-minute ultrafast encode is great, but saving 5 minutes off a 50-minute medium encode is not bad either.
It's like upgrading your CPU from a 3600 to a 3600X for free. It's a free 5 minutes, I'll take it.
Keep in mind that this was the BEST-case scenario. Add some MDegrain and/or some HDR->SDR tonemapping plus x265 and you will be lucky if you even see 1 minute saved! It's simple! The more tasks you throw at the CPU, the smaller the speed gap between software and hardware decoding.

Last edited by Atak_Snajpera; 18th September 2019 at 17:50.