16th October 2011, 21:58 | #181 | Link |
QuickSync Decoder author
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
|
Any special reason to use 64 bit?
Can you try the 32 bit version? My decoder will be a little faster in 32 bit, as I've optimized the copy function in ASM. The 64 bit build uses intrinsic functions, but the compiler isn't 100% efficient with them.
__________________
Eric Gur, Processor Application Engineer for Overclocking and CPU technologies Intel QuickSync Decoder author Intel Corp. |
17th October 2011, 20:35 | #183 | Link |
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,251
|
ProcessExplorer?
And, if you want a more detailed analysis of how many CPU cycles have been spent in each function, you could have a look at Code Analyst: http://developer.amd.com/tools/CodeA...s/default.aspx (Although it is an AMD tool, it works on Intel CPUs just as well. Just make sure you use it with a Debug build if you want function names!)
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊ Last edited by LoRd_MuldeR; 17th October 2011 at 20:38. |
19th October 2011, 13:48 | #186 | Link |
QuickSync Decoder author
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
|
Multi GPU setup
I managed to solve the multi GPU problem without cables. You'll need v0.16 or newer to make this work.
1) Set up another (fake) screen: right click on the desktop->screen resolution.
2) Click the Detect button. Unconnected screens will appear.
3) Extend the desktop to a VGA connection on the Intel GPU (screen 2 in the image).
4) Drag the 2nd screen to the corner of the primary screen so the mouse boundaries of the primary screen remain (almost) the same.
5) Click OK/Apply. A reboot is recommended.
6) Open your favorite player and select MadVR or another GPU demanding renderer to test the setup.
You can test further by selecting EVR as the renderer, opening the control panel for your AMD/Nvidia GPU and overriding the color settings (e.g. kill the saturation).
Here's a working setup
__________________
Eric Gur, Processor Application Engineer for Overclocking and CPU technologies Intel QuickSync Decoder author Intel Corp. Last edited by egur; 22nd December 2011 at 09:13. |
19th October 2011, 14:08 | #187 | Link |
Registered User
Join Date: Apr 2002
Location: Germany
Posts: 4,926
|
Awesome, I just love NT6: now we can mix input and output like crazy without needing any 3rd party solutions. Great work, egur.
I wonder, though: is DXVA also working, or does the decoder specifically need to support this? And what happens if you open a DXVA session, and where does it get rendered?
__________________
all my compares are riddles so please try to decipher them yourselves :) It is about Time Join the Revolution NOW before it is to Late ! http://forum.doom9.org/showthread.php?t=168004 Last edited by CruNcher; 19th October 2011 at 14:13. |
19th October 2011, 14:53 | #188 | Link | |
Registered Developer
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,368
|
Quote:
Besides, if you already use DXVA, why not use the DXVA of your primary video card?
__________________
LAV Filters - open source ffmpeg based media splitter and decoders |
|
19th October 2011, 16:24 | #189 | Link |
QuickSync Decoder author
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
|
New version released 0.16
New and improved version. The zip files contain the installer and documentation, please read.
Download version 0.16 alpha:
32 bit http://www.multiupload.com/Z4PX2UFGB4
64 bit http://www.multiupload.com/QH5ZZXINCQ
Source code http://www.multiupload.com/06IZWGH4T0
Revision highlights:
v0.16:
* Support for multi GPU setups. The decoder can now run on separate HW than the renderer, even without connecting the Intel GPU to a screen. See Multi GPU below for details.
* This version will be the first version on SourceForge.
* Updated to ffdshow build 3996.
* Some fixes to the timestamp code. Now supporting streams with no frame rate.
* Fixed several aspect ratio issues.
* Very initial support for DVD playback. Menus are not displayed correctly yet. WIP. Recommended only for testing purposes.
* Changed the mechanism for handling flush & seek events. The code is faster and more robust. A critical stage for playing DVDs.
* Added a new callback for FFDShow’s internal decoders – EndFlush. This is needed for DVD playback. Other decoders do not need to implement it.
* Enhanced FFDShow’s code with a faster memcpy function (SSE2 based). This replaces calling memcpy. The original source code would use ffmpeg to do it, but it crashes on NV12 images.
__________________
Eric Gur, Processor Application Engineer for Overclocking and CPU technologies Intel QuickSync Decoder author Intel Corp. Last edited by egur; 19th October 2011 at 16:31. |
19th October 2011, 16:45 | #190 | Link | |
QuickSync Decoder author
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
|
Quote:
My decoder is mostly aimed at low power, but it was a nice problem to solve. I'm not aware of similar solutions. Since I copy the frames from the GPU to the CPU very quickly, it makes sense to use it with your favorite SW setup. The pipeline is File->CPU->GPU1->CPU->GPU2->Screen. This opens up a way for fast HW decoding with super strong programmable video processing on a discrete GPU. I wish Windows 7 were easier to use in the sense of utilizing the various HW resources.
__________________
Eric Gur, Processor Application Engineer for Overclocking and CPU technologies Intel QuickSync Decoder author Intel Corp. |
|
19th October 2011, 17:24 | #191 | Link |
RipBot264 author
Join Date: May 2006
Location: Poland
Posts: 7,886
|
Quicksync in official ffdshow r4000 would be epic
__________________
Windows 7 Image Updater - SkyLake\KabyLake\CoffeLake\Ryzen Threadripper |
19th October 2011, 20:33 | #193 | Link |
QuickSync Decoder author
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
|
This part wasn't ready for this release. Currently it falls back to libavcodec if the platform can't support QuickSync or for unsupported H264 formats.
If something else is unsupported, ffdshow will decline the connection. I'll fix this for the next release, hopefully after I integrate into the main ffdshow trunk on SourceForge.
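The fallback behaviour described above can be modelled roughly as follows. This is only a sketch of my reading of the post, not the actual ffdshow code: the names `Decoder` and `choose_decoder` and the boolean parameters are hypothetical.

```python
from enum import Enum


class Decoder(Enum):
    """Possible outcomes when ffdshow is asked to decode a stream (hypothetical names)."""
    QUICKSYNC = "QuickSync"
    LIBAVCODEC = "libavcodec"
    DECLINE = "decline connection"


def choose_decoder(codec: str, platform_has_quicksync: bool, format_supported: bool) -> Decoder:
    """Model of the v0.16 fallback rules as described in the post (assumed reading)."""
    if platform_has_quicksync and format_supported:
        return Decoder.QUICKSYNC
    if not platform_has_quicksync:
        # Platform can't support QuickSync: fall back to software decoding.
        return Decoder.LIBAVCODEC
    if codec == "h264":
        # Unsupported H264 format: fall back to software decoding.
        return Decoder.LIBAVCODEC
    # Anything else that is unsupported: the connection is declined.
    return Decoder.DECLINE


if __name__ == "__main__":
    print(choose_decoder("h264", True, False).value)
    print(choose_decoder("vc1", True, False).value)
```

The last branch is the case the post says will be fixed in the next release.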
__________________
Eric Gur, Processor Application Engineer for Overclocking and CPU technologies Intel QuickSync Decoder author Intel Corp. |
20th October 2011, 16:30 | #194 | Link |
Registered User
Join Date: Sep 2011
Posts: 22
|
I'm sorry. I don't speak English very well.
Audio/video out of sync (with Gabest Splitter). Sample files:
(2011.09.28) Hyun Young 조현영 _A_ @ Gachon University Festival Celebration Fancam(720p_H.264-AAC).mp4
http://o-o.preferred.fra02s05.v5.lsc...89491d982d9386
(2011.10.06) Hyun Young 조현영 _Mach_ @ Gyeonggi University of S&T Festival Fancam(720p_H.264-AAC).mp4
http://o-o.preferred.fra02s05.v7.lsc...30d6979f680c4c
QuickSync = 30.303fps
libavcodec = 29.97xfps
Please improve your timestamp code further.
Last edited by pulbitz; 20th October 2011 at 19:25. |
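A note on the two numbers reported above: 30.303 fps is exactly what you get when a 29.97 fps frame duration (about 33.37 ms) is truncated to a whole 33 ms. The thread does not confirm that this is the cause, so treat the following as a sketch of one possible explanation. The names `NTSC_FPS` and `fps_from_duration_ms` are made up for the illustration.

```python
import math
from fractions import Fraction

# 29.97 fps is really 30000/1001 fps (assumption: the samples are NTSC-rate video).
NTSC_FPS = Fraction(30000, 1001)


def fps_from_duration_ms(duration_ms) -> Fraction:
    """Frame rate implied by a given per-frame duration in milliseconds."""
    return Fraction(1000) / Fraction(duration_ms)


exact_ms = Fraction(1000) / NTSC_FPS   # 1001/30 ms, about 33.3667 ms per frame
truncated_ms = math.floor(exact_ms)    # 33 ms if fractional milliseconds are dropped

if __name__ == "__main__":
    print(f"exact duration:     {float(exact_ms):.4f} ms -> {float(fps_from_duration_ms(exact_ms)):.3f} fps")
    print(f"truncated duration: {truncated_ms} ms -> {float(fps_from_duration_ms(truncated_ms)):.3f} fps")
```

If timestamps are derived from a rounded frame duration like this, the error accumulates by about 0.37 ms per frame, which would show up as audio/video drift over a long clip.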
20th October 2011, 21:11 | #195 | Link | |
Registered User
Join Date: Oct 2010
Location: The Netherlands
Posts: 1,083
|
Quote:
Allocating a buffer explicitly in video memory has always been possible. Proper memory resource management is even a key feature of any graphics rendering engine. Sharing resources through the DirectX API is relatively new: http://msdn.microsoft.com/en-us/libr...=vs.85%29.aspx and http://msdn.microsoft.com/en-us/libr...=vs.85%29.aspx . The usual DXVA helper device for EVR uses a shared handle system to give the main rendering device access to DXVA output surfaces. The extra device runs mostly asynchronously from the main device.
File->CPU->GPU1->GPU2->Screen is completely allowed, but I don't know which would be faster: a render target in GPU1's memory or in GPU2's memory. Making GPU1 render to system memory, or doing an extra copy operation from video memory to system memory, will most certainly slow things down. It's actually not the copy operation itself that's the issue. It's usually the wait for the lock operation. Scheduled transfers without locking surfaces/textures in video memory are a lot more efficient.
__________________
development folder, containing MPC-HC experimental tester builds, pixel shaders and more: http://www.mediafire.com/?xwsoo403c53hv |
|
20th October 2011, 21:54 | #196 | Link |
Registered User
Join Date: Apr 2002
Location: Germany
Posts: 4,926
|
I got Cyberlink HAM working on Intel. It's basically nothing other than renderless DXVA (not bound to the renderer), which PotPlayer's DXVA decoder also makes use of.
The big question is: do we really need APIs from every vendor for NT6 if Microsoft integrated the possibility to use renderless DXVA from the beginning, and why integrate each vendor's own API if one for all exists (in terms of interoperability)?
DXVA Renderless (supports everyone)
AMD OpenVideo (supports AMD)
Intel MediaSDK (supports Intel)
Nvidia Nvcuvid (supports Nvidia)
Is there really such a big performance difference that it would justify implementing each vendor's own API for its specific hardware (or is there even a performance loss from wrapping from a to b)?
__________________
all my compares are riddles so please try to decipher them yourselves :) It is about Time Join the Revolution NOW before it is to Late ! http://forum.doom9.org/showthread.php?t=168004 Last edited by CruNcher; 20th October 2011 at 22:33. |
20th October 2011, 22:24 | #197 | Link | |
QuickSync Decoder author
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
|
Quote:
The links you've posted are not working - "access denied" for both. You can share very quickly on http://www.multiupload.com
__________________
Eric Gur, Processor Application Engineer for Overclocking and CPU technologies Intel QuickSync Decoder author Intel Corp. |
|
20th October 2011, 22:42 | #198 | Link | |
QuickSync Decoder author
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
|
Quote:
Hopefully this chaos will converge to a single API at some point. A user friendly API that abstracts enough details while remaining high performing. Performance is a very important issue in the mobile world - battery life. Every minute of video playback is worth a lot of R&D, validation and enabling resources. The architecture "war" with ARM (starting with Windows 8) will probably help to push HW acceleration forward on many fronts so small devices can compete with ARM based SOCs. Nvidia plays both sides of the fence in this war (GPU for x86 platforms as well as ARM CPU maker) so one can expect them to fork out a cross platform API for HW acceleration. This would probably be the best kind of API - abstract the HW completely - no need to be a DirectX expert to do complex stuff.
__________________
Eric Gur, Processor Application Engineer for Overclocking and CPU technologies Intel QuickSync Decoder author Intel Corp. |
|
20th October 2011, 22:49 | #199 | Link | |
QuickSync Decoder author
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
|
Quote:
I can take your word for it but it's probably extremely complicated to accomplish. In the Intel GPU, I don't think there's any DMA going on when copying surfaces back and forth to the CPU. It's the same memory sitting on the same memory controller. A special SSE4 instruction was introduced in Penryn to address the complex mapping to solve the speed issues.
__________________
Eric Gur, Processor Application Engineer for Overclocking and CPU technologies Intel QuickSync Decoder author Intel Corp. |
|
21st October 2011, 01:02 | #200 | Link |
Registered User
Join Date: Apr 2002
Location: Germany
Posts: 4,926
|
I just recorded my first 3D gaming session with my Low Latency H.264 Quicksync Encoder Framework. It runs rather smoothly in the 3D engine (id Tech 5), at least playable. Entirely on GT1 (playing + recording).
__________________
all my compares are riddles so please try to decipher them yourselves :) It is about Time Join the Revolution NOW before it is to Late ! http://forum.doom9.org/showthread.php?t=168004 Last edited by CruNcher; 21st October 2011 at 03:29. |
Tags |
ffdshow, h264, intel, mpeg2, quicksync, vc1, zoom player |