#21041 | Link | |
Registered User
Join Date: Mar 2011
Posts: 470
|
Quote:
Looking at the 16-core CPU, the same problem was happening there, as it had 16 AviSynth threads and 32 x265 threads. Raising the AviSynth threads to 20 seems to keep the threads filled instead of starving. I'll have to monitor the 16-core server a bit more, as I also have Topaz Video AI running on it, but that mostly uses the GPU. When I get these jobs done in a week or so, I'll take the 4K Blade Runner clip that I used for testing last time, run some benchmarks with it, and see what the results are.
The default of 8192 sure seems to have fixed the problem of having to run multiple encoding servers to fully utilize the CPU, so I like that. Having to run only a single encoding server is nice. I'm not sure I fully understand your question, though. I haven't tested any 4K yet, so I can't fully answer it yet.
#21042 | Link | |
RipBot264 author
Join Date: May 2006
Location: Poland
Posts: 7,915
|
Quote:
__________________
Windows 7 Image Updater - SkyLake\KabyLake\CoffeLake\Ryzen Threadripper |
#21043 | Link | |
Registered User
Join Date: Aug 2020
Location: Pennsylvania
Posts: 172
|
Quote:
Perhaps in a future update it would be beneficial to add that cache memory value to the main RipBot settings. Default it to 8192, but allow us to change it. A tooltip explaining what it does and why you might change it would also be helpful.
#21044 | Link |
RipBot264 author
Join Date: May 2006
Location: Poland
Posts: 7,915
|
Or maybe it should scale automatically with the number of cores? That's why I'm asking.
__________________
Windows 7 Image Updater - SkyLake\KabyLake\CoffeLake\Ryzen Threadripper |
#21045 | Link |
Registered User
Join Date: Mar 2011
Posts: 470
|
Well, I spoke too soon. It looks like if I'm doing pure x265 with no filters, I only get about 40% processor usage. Running two encoding servers with 24 cores each gets it to 85% CPU. I tried both 8192 and 16384. I'm only doing 2K right now, 2560x1440. The same thing was happening with plain x265 on the 16-core processor, and I had to move back to using two encoding servers.
I should be able to test my Blade Runner 4K clip with and without SMDegrain in a couple of days. |
#21046 | Link | |
Registered User
Join Date: Aug 2020
Location: Pennsylvania
Posts: 172
|
Quote:
My only concern is this: say you have a 32-core machine as your client and, just to pick a number, 16 GB is the optimum cache for it. If that 16 GB cache value is sent to a lower-powered server, say an 8-core with only 16 GB of total memory, what is going to happen to the 8-core machine when its memory gets pegged just by the encoding cache? A machine falling back on a swap file, even on SSDs, would not be a good thing IMO. Scaling automatically would be ideal (and we still need to wait and see whether the cache has to scale up on machines with more than 16 cores doing full-frame 4K), but my thinking is that the core count and especially the total memory of each individual machine in a distributed encoding farm would need to be taken into consideration, and that might be difficult to accomplish.
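To make the concern concrete, here is a minimal Python sketch of one way a server could guard itself: it caps whatever cache value the client sends at what the local machine can spare. This is not how RipBot264 works; the function name, the 6 GB reserve for the OS and encoder, and the 1 GB floor are all assumptions picked for illustration.

```python
def clamp_cache_mb(requested_mb: int, total_ram_mb: int, reserve_mb: int = 6144) -> int:
    """Cap a client-requested cache size at what this server can spare.

    reserve_mb is a hypothetical allowance for the OS and the encoder itself;
    the 1024 MB floor is likewise an arbitrary illustrative value.
    """
    budget_mb = max(total_ram_mb - reserve_mb, 1024)
    return min(requested_mb, budget_mb)


# The client asks every server for a 16 GB cache.
servers = {"32-core, 64 GB RAM": 65536, "8-core, 16 GB RAM": 16384}
for name, ram_mb in servers.items():
    print(f"{name}: use {clamp_cache_mb(16384, ram_mb)} MB")
```

With these assumed numbers the 64 GB machine keeps the full 16384 MB, while the 16 GB machine is held to 10240 MB instead of having its memory pegged.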
#21047 | Link |
Registered User
Join Date: Jan 2010
Posts: 480
|
I understood that the cache was set automatically, according to the number of cpus. What is the problem you have?
__________________
E5 2697 v2 @ 3.0GHz on P9X79 Deluxe 24GB Xeon E5-2680 v2 @ 3.1GHz 16GB Sony Vaio VPC-F13Z1E/B |
#21048 | Link | |
Registered User
Join Date: Aug 2020
Location: Pennsylvania
Posts: 172
|
Quote:
Now that Ryushin is running an Epyc system with a much larger core count, we are discussing whether 8192 for the cache is going to be enough with all the added cores and avisynth-prefetch-threads. Before we figured out what was going on and the cache update arrived, anything above 12 for avisynth-prefetch-threads just killed 4K performance on 16-core computers. It may or may not be enough, depending on some future testing. If it needs to be bumped up, we are just throwing out ideas on the best way to handle that in a distributed environment.
#21049 | Link | |
Registered User
Join Date: Jan 2010
Posts: 480
|
Quote:
__________________
E5 2697 v2 @ 3.0GHz on P9X79 Deluxe 24GB Xeon E5-2680 v2 @ 3.1GHz 16GB Sony Vaio VPC-F13Z1E/B |
#21050 | Link | |
Registered User
Join Date: Aug 2020
Location: Pennsylvania
Posts: 172
|
Quote:
Setting the cache to 8192 works for avisynth-prefetch-threads of 16 when doing 4K, so we don't have to "cripple" the 16-core Ryzens by either using a lower prefetch thread count of 12 or using an affinity mask to limit RipBot to 12 cores. It is yet to be determined whether that will need to be a higher number with avisynth-prefetch-threads of 24 or 32 on a CPU with more cores.
#21051 | Link |
Registered User
Join Date: Jan 2010
Posts: 480
|
So AviSynth can't handle the memory required above a certain number of processors, even if there are 2 DE servers?
__________________
E5 2697 v2 @ 3.0GHz on P9X79 Deluxe 24GB Xeon E5-2680 v2 @ 3.1GHz 16GB Sony Vaio VPC-F13Z1E/B |
#21052 | Link | |
Registered User
Join Date: Aug 2020
Location: Pennsylvania
Posts: 172
|
Quote:
The default cache size for AviSynth (I do not know exactly what it is) is not large enough for all the extra work involved in 4K encoding once the avisynth-prefetch-threads setting hits 16 or higher. Encoding performance drops off a cliff unless we either A: set the cache to at least 8192, or B: tell RipBot to use only 12 of the 16 cores, either by reducing avisynth-prefetch-threads or by disabling cores with an affinity mask. Either variant of B turns your system into a 12-core machine as far as RipBot performance goes.
Before we figured out that the cache was the issue, I had to do 4K encoding with my 5950 and 7950s essentially performing as 5900s and 7900s. Without crippling them, the 16-core Ryzens would perform at about half the fps of a 12-core Ryzen (hence falling off the cliff). Of course a crippled 16-core would not perform as it should for 1080p either, so I had to keep two different encoding server profiles, one for 1080p and below and a crippled one for 4K, and switch between them depending on what I was encoding.
Now, with the cache being set to 8192 automatically in RipBot, I no longer have to worry about anything. The 16-core Ryzens perform like they should, automatically using all the cores regardless of the resolution.
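For anyone who wants to see what the two settings look like in a script, here is a small Python sketch that wraps a filter chain with AviSynth+'s `SetMemoryMax()` (value in MB) at the top and `Prefetch()` at the end. The helper name `wrap_script` and the `BlankClip` stand-in source are made up for illustration; how RipBot264 actually generates its scripts is not shown in this thread.

```python
def wrap_script(filter_body: str, cache_mb: int = 8192, prefetch_threads: int = 16) -> str:
    """Return AviSynth+ script text with the cache and prefetch settings applied.

    SetMemoryMax() goes first so it is in effect before any filter is created;
    Prefetch() goes last so it applies to the finished filter chain.
    """
    lines = [
        f"SetMemoryMax({cache_mb})",
        filter_body.strip(),
        f"Prefetch({prefetch_threads})",
    ]
    return "\n".join(lines)


# BlankClip is only a stand-in for a real 4K source plus its filters.
print(wrap_script("BlankClip(width=3840, height=1600)"))
```

The defaults mirror option A from the post (cache 8192, all 16 prefetch threads); option B would correspond to leaving the cache line out and passing `prefetch_threads=12`.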
#21053 | Link | |
Registered User
Join Date: Mar 2014
Posts: 11
|
Quote:
I don't see it making an impact in Linux (still getting issues with Wine binding the port though for DE mode). |
#21055 | Link |
Registered User
Join Date: Mar 2011
Posts: 470
|
High Core Count
EPYC Turin 9355P 32-Core Processor
24 cores and 48 threads passed to the VM
Blade Runner 4K 15-minute clip, 3840x1600

| Encoding Servers | SetMemoryMax() | Prefetch | x265-threads | SMDegrain | Chunk Size | CPU | Time |
|---|---|---|---|---|---|---|---|
| 1 | 8192 | 24 | 48 | None | 20 | 70% | 22m:29s |
| 1 | 16384 | 24 | 48 | None | 20 | 70% | 22m:49s |
| 1 | 8192 | 28 | 48 | None | 20 | 70% | 22m:46s |
| 1 | 16384 | 28 | 48 | None | 20 | 70% | 23m:03s |
| 1 | 16384 | 28 | 48 | None | 1 | 70% | 24m:50s |
| 1 | 8192 | 24 | 48 | Hard | 20 | 75% | 27m:23s |
| 1 | 16384 | 24 | 48 | Hard | 20 | 88% | 21m:56s |
| 1 | 16384 | 27 | 48 | Hard | 20 | 90% | 20m:10s |
| 1 | 16384 | 28 | 48 | Hard | 20 | 93% | 19m:45s |
| 1 | 16384 | 29 | 48 | Hard | 20 | 98% | 19m:51s |
| 1 | 16384 | 32 | 48 | Hard | 20 | 98% | 20m:22s |
| 2 | 8192 | 12 | 24 | None | 1 | 85% | 24m:16s |
| 2 | 16384 | 12 | 24 | None | 1 | 85% | 22m:34s |
| 2 | 8192 | 12 | 24 | Hard | 1 | 83% | 24m:38s |
| 2 | 16384 | 12 | 24 | Hard | 1 | 85% | 24m:30s |
| 2 | 8192 | 14 | 24 | None | 1 | 88% | 20m:46s |
| 2 | 16384 | 14 | 24 | None | 1 | 88% | 21m:16s |
| 2 | 8192 | 14 | 24 | Hard | 1 | 95% | 24m:08s |
| 2 | 16384 | 14 | 24 | Hard | 1 | 95% | 24m:25s |
| 1 | 12288 | 28 | 48 | Hard | 20 | 90% | 20m:40s |

Okay, I think I've finished my testing. In summary, setting SetMemoryMax to 16384 is a big help with 48 threads (with SMDegrain Hard at Prefetch 24, the time dropped from 27m:23s to 21m:56s). Using a single encoding server was faster than using two when encoding 4K. So it would be beneficial to have the option to set SetMemoryMax per machine, or at a minimum to set it globally. The Prefetch also needs to be increased from half of the thread count to about 60% of the thread count for optimum performance.
I guess for me, I'll move to a single encoding server for 4K content and run dual encoding servers for HD content, though I still need to test HD content. I know that for 2560x1440, two encoding servers for x265 were about 30-50% faster. Another test to do later.
Last edited by Ryushin; 23rd January 2025 at 15:22.
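For anyone who wants to slice these results further, the runs can be loaded into a few lines of Python. The figures are copied from the benchmark post; the tuple layout and the helper names `to_seconds` and `fastest` are just one illustrative way to organise them.

```python
# (servers, SetMemoryMax MB, prefetch, x265 threads, SMDegrain, chunk size, CPU %, time)
RUNS = [
    (1, 8192, 24, 48, "None", 20, 70, "22m:29s"),
    (1, 16384, 24, 48, "None", 20, 70, "22m:49s"),
    (1, 8192, 28, 48, "None", 20, 70, "22m:46s"),
    (1, 16384, 28, 48, "None", 20, 70, "23m:03s"),
    (1, 16384, 28, 48, "None", 1, 70, "24m:50s"),
    (1, 8192, 24, 48, "Hard", 20, 75, "27m:23s"),
    (1, 16384, 24, 48, "Hard", 20, 88, "21m:56s"),
    (1, 16384, 27, 48, "Hard", 20, 90, "20m:10s"),
    (1, 16384, 28, 48, "Hard", 20, 93, "19m:45s"),
    (1, 16384, 29, 48, "Hard", 20, 98, "19m:51s"),
    (1, 16384, 32, 48, "Hard", 20, 98, "20m:22s"),
    (2, 8192, 12, 24, "None", 1, 85, "24m:16s"),
    (2, 16384, 12, 24, "None", 1, 85, "22m:34s"),
    (2, 8192, 12, 24, "Hard", 1, 83, "24m:38s"),
    (2, 16384, 12, 24, "Hard", 1, 85, "24m:30s"),
    (2, 8192, 14, 24, "None", 1, 88, "20m:46s"),
    (2, 16384, 14, 24, "None", 1, 88, "21m:16s"),
    (2, 8192, 14, 24, "Hard", 1, 95, "24m:08s"),
    (2, 16384, 14, 24, "Hard", 1, 95, "24m:25s"),
    (1, 12288, 28, 48, "Hard", 20, 90, "20m:40s"),
]


def to_seconds(stamp: str) -> int:
    """Convert a time such as '19m:45s' to whole seconds."""
    minutes, seconds = stamp.rstrip("s").split("m:")
    return int(minutes) * 60 + int(seconds)


def fastest(servers: int, smdegrain: str):
    """Return the quickest run for a given server count and SMDegrain setting."""
    rows = [r for r in RUNS if r[0] == servers and r[4] == smdegrain]
    return min(rows, key=lambda r: to_seconds(r[7]))


for servers in (1, 2):
    for smdegrain in ("None", "Hard"):
        run = fastest(servers, smdegrain)
        print(f"{servers} server(s), SMDegrain {smdegrain}: "
              f"{run[7]} at SetMemoryMax {run[1]}, Prefetch {run[2]}")
```

The quickest single-server run with SMDegrain Hard uses Prefetch 28 against 48 x265 threads, roughly 58% of the thread count, which lines up with the "about 60%" observation in the post.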
#21056 | Link |
Registered User
Join Date: Aug 2020
Location: Pennsylvania
Posts: 172
|
So it looks to me like a cache estimate of 8 GB per 16 avisynth-prefetch-threads is reasonable as it scales up. I would imagine that when you had prefetch threads set to 24, moving the cache down to 12 GB would probably have been fine as well.
I'll do some testing over the next couple of days, bumping the cache up to 16384 and even beyond, and see how the couple of servers that only have 16 GB react. I have a 5950X and a 5900X with 16 GB of total system memory, so it will be interesting to see the results using prefetch threads of 16 and 12 respectively. Having a 16 GB cache sent down to them may work without any issues, or it may bog the system down.
Just curious: with the hard SMDegrain, what were the approximate best fps numbers you were seeing in the encoding server window?
Last edited by rlev11; 23rd January 2025 at 01:33.
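As a quick worked version of that estimate, 8 GB per 16 prefetch threads comes to 512 MB per thread. The Python sketch below applies it linearly; the function name is made up, and this is only the rule of thumb from the post, not a tested or recommended setting.

```python
MB_PER_PREFETCH_THREAD = 8192 // 16  # 8 GB per 16 prefetch threads = 512 MB each


def linear_cache_mb(prefetch_threads: int) -> int:
    """Scale the cache linearly with the avisynth-prefetch-threads value."""
    return prefetch_threads * MB_PER_PREFETCH_THREAD


for threads in (12, 16, 24, 32):
    print(f"prefetch {threads}: {linear_cache_mb(threads)} MB")
```

For 16, 24 and 32 prefetch threads this gives 8192, 12288 and 16384 MB, so the 12 GB guess for 24 threads is simply the linear value.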
#21057 | Link | |
Registered User
Join Date: Mar 2011
Posts: 470
|
Quote:
8 GB per 16 threads is probably a good rule. I just did another run with 12288 MB, and the CPU graph was spikier because some threads were waiting (I added it to the previous post). So for 24 cores, 16384 MB is better; 32 cores might need another 8 GB. I suppose, for testing's sake, I can run two more tests giving the full 32 cores to the VM and see what that gives us.
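One way to read this result is that the cache should be rounded up to the next whole 8 GB step rather than scaled exactly. The Python sketch below does that; the function name is hypothetical, and the prefetch value of 38 used for a 32-core, 64-thread VM is only a guess at about 60% of the thread count, not something measured in this thread.

```python
import math


def stepped_cache_mb(prefetch_threads: int, step_threads: int = 16, step_mb: int = 8192) -> int:
    """Round the cache up to a whole number of 8 GB steps, one step per 16 prefetch threads."""
    return math.ceil(prefetch_threads / step_threads) * step_mb


# 38 is an assumed prefetch value for a 32-core / 64-thread VM (about 60% of 64).
for threads in (16, 24, 28, 38):
    print(f"prefetch {threads}: {stepped_cache_mb(threads)} MB")
```

Under this rounding, prefetch values of 24 and 28 both land on 16384 MB, and the assumed 32-core case lands on 24576 MB, which is the "another 8 GB" mentioned above.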
#21058 | Link |
Registered User
Join Date: Mar 2011
Posts: 470
|
Encoding Server Crash with CPU Cores 26 or Higher
So it looks like I hit a limit in RB when trying to select 32 cores in my VM. I narrowed it down: selecting 26 cores with two threads each (52 threads total) results in RB crashing.
Starting RipBot gives me a Division by Zero error; clicking OK on the error then shows the RipBot window. Starting the encoding server gives me an Invalid Pointer Operation, followed by two more error windows:

    ---------------------------
    Encodingserver
    ---------------------------
    Access violation at address 00401EBA in module 'EncodingServer.exe'. Write of address 00000000.
    ---------------------------

and

    ---------------------------
    Application Error
    ---------------------------
    Exception EAccessViolation in module EncodingServer.exe at 00001EBA.
    Access violation at address 00401EBA in module 'EncodingServer.exe'. Write of address 00000000.
    ---------------------------

It's almost like Atak couldn't imagine, all those years ago, that we would have so many cores in a chip. With the latest EPYC having 192 cores and 384 threads, might as well use a 16-bit integer now. LOL
#21059 | Link |
RipBot264 author
Join Date: May 2006
Location: Poland
Posts: 7,915
|
That sucks. Does EncodingClient.exe also crash in the same way?
If you disable SMT (26 cores/26 Threads) will it crash as well?
__________________
Windows 7 Image Updater - SkyLake\KabyLake\CoffeLake\Ryzen Threadripper Last edited by Atak_Snajpera; 23rd January 2025 at 20:04. |
#21060 | Link |
Registered User
Join Date: Mar 2011
Posts: 470
|
I disabled SMT and just passed cores to the VM. As soon as I enable 51 cores, the problem occurs. I had one job in the queue, but it does not show up because RipBot tried to divide by zero, so I can't test the Encoding Client.