17th May 2012, 21:58 | #21 | Link |
soy sauce buyer
Join Date: Mar 2010
Location: United Kingdom
Posts: 164
|
A test build
Compiled with the AMD APP SDK. It can also run on NVIDIA cards, but I'm not sure whether it would be faster on NV cards if compiled with the CUDA SDK instead.... |
17th May 2012, 22:50 | #25 | Link |
soy sauce buyer
Join Date: Mar 2010
Location: United Kingdom
Posts: 164
|
Oops, that's terrible....
The only case where it crashed for me was when I used Ctrl+C to terminate the process, which I have now found always crashes. My friend's GTX460 hasn't crashed in normal usage. Not sure whether 12.4 has issues with the APP runtime or whether there is a problem with my build configuration.... Last edited by 06_taro; 18th May 2012 at 00:06. |
18th May 2012, 00:57 | #27 | Link |
Registered User
Join Date: Nov 2003
Posts: 1,281
|
Looks like the patch has disappeared.
Quick tests on the test build: I'm finding that the higher the encoded bitrate, the lower the performance increase.
dgnv 1440x1080 source
preset veryslow:
Around 60% @ 350kbps
45% @ 1500kbps
8% @ 6500kbps
preset medium (default):
12% @ 1500kbps
6% @ 5000kbps
__________________
http://www.7-zip.org/ |
18th May 2012, 04:02 | #28 | Link |
Registered User
Join Date: Apr 2009
Posts: 478
|
I just tested it. Thanks for the build, 06_taro!
The source file is 4000 frames out of a 720p H.264 file. The source filter is FFVideoSource. No crashes during the test. Settings were --crf 21 with preset "slower". That's it.
Speed-wise:
OpenCL: 17.83 fps
Normal: 11.12 fps
The OpenCL version did produce a slightly bigger file, as Anandtech noted. The OpenCL file was 61.6MB and the normal file 60.8MB.
My system:
i7-2600K
16GB RAM
Radeon 7850
Edit: I'm doing more comprehensive tests to determine whether the crashes are driver related or GPU generation related. Perhaps GCN is more stable? Last edited by aegisofrime; 18th May 2012 at 04:04. |
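For reference, the relative differences implied by these figures can be worked out directly. This is a minimal sketch and not part of the original post; `pct_change` is a hypothetical helper name, and the inputs are simply the fps and file-size numbers quoted above.

```shell
#!/bin/bash
# pct_change OLD NEW: percentage change from OLD to NEW, one decimal place.
# (Hypothetical helper, for illustration only.)
pct_change() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b - a) / a * 100 }'
}

# Encoding speed: normal 11.12 fps vs OpenCL 17.83 fps
echo "speed gain: $(pct_change 11.12 17.83)%"      # prints "speed gain: 60.3%"

# Output size: normal 60.8MB vs OpenCL 61.6MB
echo "size increase: $(pct_change 60.8 61.6)%"     # prints "size increase: 1.3%"
```

So in this one test the OpenCL build was roughly 60% faster at the cost of a file about 1.3% larger.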
18th May 2012, 05:20 | #31 | Link |
soy sauce buyer
Join Date: Mar 2010
Location: United Kingdom
Posts: 164
|
Here's a step-by-step guide: How to Compile x264 on 32 & 64 Bit Windows. The only thing not mentioned in this article is that building x264 with OpenCL support requires an OpenCL SDK to be installed on your system: either the CUDA SDK or the AMD APP SDK. The OpenCL libs are checked for during configure.
Last edited by 06_taro; 18th May 2012 at 05:28. |
18th May 2012, 05:33 | #32 | Link | |
Registered User
Join Date: Apr 2009
Posts: 478
|
Quote:
Basically what I did was to save the patch as OpenCL.diff. I then moved this file to my x264 folder. Here's my build script: Code:
#!/bin/bash -x
set -e
#git clone git://git.videolan.org/x264.git "C:/x264"
patch -p1 < "C:/x264/OpenCL.diff"
cd "C:/x264"
CFLAGS=-march=corei7-avx ./configure --cross-prefix=x86_64-w64-mingw32- --host=x86_64-pc-mingw32 --enable-win32thread --bit-depth=10
make fprofiled VIDS="C:/fprofile.avs"
Edit: Hmmm maybe my problem was that I didn't have the AMD APP SDK installed? I will install that and try again. |
|
18th May 2012, 05:34 | #33 | Link |
Registered User
Join Date: Nov 2003
Posts: 1,281
|
Can someone also re-upload the patch please.
__________________
http://www.7-zip.org/ |
18th May 2012, 05:38 | #34 | Link |
Registered User
Join Date: Apr 2009
Posts: 478
|
|
18th May 2012, 05:50 | #35 | Link | |
soy sauce buyer
Join Date: Mar 2010
Location: United Kingdom
Posts: 164
|
Quote:
Also note that the OpenCL lookahead doesn't support high bit depth, so don't build a 10-bit version. |
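Applied to the build script from post #32, that advice could look roughly like the sketch below. It is an illustration under stated assumptions, not a tested recipe: the paths are the ones from that script, `DRY_RUN` and the `run` wrapper are hypothetical additions that only print the steps by default, the `cd` is moved ahead of `patch -p1` on the assumption that the diff is relative to the source root, and the machine-specific CFLAGS and the profiled `make fprofiled` step are left out for brevity.

```shell
#!/bin/bash
# Sketch only: paths come from the script in post #32; DRY_RUN is a
# hypothetical switch added for illustration (1 = print steps, 0 = run them).
SRC_DIR="C:/x264"
PATCH_FILE="$SRC_DIR/OpenCL.diff"
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"    # show the step without executing it
    else
        "$@"
    fi
}

build_x264_opencl() {
    # Enter the source tree first so that patch -p1 resolves paths inside it.
    run cd "$SRC_DIR" || return 1
    run patch -p1 -i "$PATCH_FILE" || return 1
    # No --bit-depth=10: the OpenCL lookahead does not support high bit depth.
    run ./configure --cross-prefix=x86_64-w64-mingw32- \
        --host=x86_64-pc-mingw32 --enable-win32thread || return 1
    run make
}

build_x264_opencl
```

With the default `DRY_RUN=1` this only lists the four steps; setting `DRY_RUN=0` would actually execute them inside the source tree.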
|
18th May 2012, 06:39 | #36 | Link |
The speed of stupid
Join Date: Sep 2011
Posts: 317
|
Well, it's a bit faster here @ --preset Medium & --crf 25.05 (and lots of other stuff), though only by about 1 fps. (Strangely enough, --rc-lookahead 80 was almost 2 fps faster than non-OCL.)
Of course, I only compared it to an x264 build without OCL that I already had. My silly machine: C2D E8600 @ 3.67 GHz and a 9800 GT. If it crashes, then that's a bit unfortunate for it, since the video I tested it on (I was lazy, so it was a 0:29 clip) also turned out to be 3 MB bigger than with the normal x264. But yeah, it's a nice feature and all, but it's not there yet. :/ |
18th May 2012, 09:31 | #37 | Link | |
もこたんインしたお!
Join Date: Jan 2008
Location: Finland / Japan
Posts: 512
|
Quote:
One of the problems with GPU lookahead seems to have been that you'd have to implement something completely new that would run on the GPU with at least some amount of speed (new in the context of this application, even if not in the sense of there being no prior art). I should try building this version of the OpenCL patch with NVIDIA's SDK, but I'm too lazy to use dlltool on the .lib files for mingw (to create .a files) .-.
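For anyone who wants to try that step, here is a rough sketch of one common way to get a MinGW import library. Note that it takes a different route from the one mentioned above: instead of converting the SDK's .lib file, it generates a .def file from the runtime DLL with gendef and feeds that to dlltool. The file names and the `import_lib_name` and `make_import_lib` helpers are assumptions for illustration; nothing here has been checked against either vendor's SDK.

```shell
#!/bin/bash
# Sketch only. Assumptions: OpenCL.dll (the vendor runtime) has been copied into
# the current directory, and gendef and dlltool from a mingw-w64 toolchain are on PATH.

# Hypothetical helper: derive the MinGW import library name from a DLL name.
import_lib_name() {
    printf 'lib%s.a\n' "$(basename "$1" .dll)"
}

make_import_lib() {
    dll="$1"
    def="$(basename "$dll" .dll).def"
    # gendef writes <name>.def, listing the DLL's exports, into the current directory.
    gendef "$dll" || return 1
    # dlltool builds the import library that the MinGW linker can use.
    dlltool --dllname "$(basename "$dll")" --input-def "$def" \
        --output-lib "$(import_lib_name "$dll")"
}

# Only attempt the conversion when the tools and the DLL are actually present.
if command -v gendef >/dev/null 2>&1 && command -v dlltool >/dev/null 2>&1 \
        && [ -f OpenCL.dll ]; then
    make_import_lib OpenCL.dll
fi
```

The resulting libOpenCL.a would then need to be somewhere the linker searches when configure checks for the OpenCL libs.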
__________________
[I'm human, no debug]
|
|
18th May 2012, 14:43 | #38 | Link | |
Registered User
Join Date: Jun 2009
Location: Poland
Posts: 125
|
Quote:
http://vr-zone.com/articles/from-gtc...ts-/15903.html |
|
18th May 2012, 15:11 | #39 | Link |
Registered User
Join Date: May 2004
Posts: 5
|
I must be doing something wrong, because the OpenCL build is about 2-3% slower with OpenCL activated than with the --no-opencl switch.
My PC is a Core2Duo E8200 @ 3.4 GHz with an ATI 5850 and Windows XP Professional 32-bit. I'm running the Catalyst 11.12 drivers, which are the last to support OpenCL under XP. I can see my card speeding up from idle when running the OpenCL build, but the load stays at 0%. I have tried OpenCL with other tools and benchmarks (to make sure it works), and there the GPU load goes to 100%. My x264 settings are:
x264.exe --level 3.1 --preset slow --tune film --crf 20 --vbv-bufsize 14000 --vbv-maxrate 17500 --vbv-bufsize 14000 --vbv-maxrate 17500 -o e:\trailer.mkv e:\trailer.avs |
18th May 2012, 16:33 | #40 | Link | |
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
|
Quote:
In other words: It's very easy to port your CPU-based software to the GPU and get something that runs a lot slower than the original. At the same time getting something that actually runs faster is very difficult and sometimes impossible! (There are some calculations that are "sequential" by nature and therefore will never run fast on a massively parallel processor, such as a GPU)
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊ Last edited by LoRd_MuldeR; 18th May 2012 at 16:41. |
|