Old 12th January 2012, 17:37   #8081  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by nevcairiel View Post
I managed to increase the performance quite a bit
How? (If I may ask...)
Old 12th January 2012, 17:38   #8082  |  Link
fastplayer
Registered User
 
Join Date: Nov 2006
Posts: 799
When it comes to DXVA, always assume it's the driver's fault!
Old 12th January 2012, 17:40   #8083  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,344
Quote:
Originally Posted by madshi View Post
How? (If I may ask...)
I remembered something which I did for the CUVID decoder waaaay back.

I just store each surface in a queue, and only once the queue is full do I start processing them, one at a time. This gives the GPU time to finish rendering to a surface before I access it. A queue of 2 frames makes all the difference: just a tiny bit of delay to give the GPU some breathing room.
Speed is now 99% that of CUVID.
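
A minimal sketch of that delay-queue idea, not the actual LAV Filters code. The class name `DelayQueue`, its `push`/`flush` methods and the generic `Surface` type are assumptions made up for illustration. A new surface is queued, and only once more than `depth` surfaces are pending is the oldest one handed back for processing.

```cpp
#include <cstddef>
#include <deque>
#include <optional>
#include <utility>

// Hypothetical helper: holds back decoded surfaces for `depth` frames so the
// GPU has time to finish writing them before the CPU reads them back.
template <typename Surface>
class DelayQueue {
public:
    explicit DelayQueue(std::size_t depth) : depth_(depth) {}

    // Queue a newly decoded surface. Returns the oldest pending surface once
    // the queue already holds `depth` entries, otherwise nothing.
    std::optional<Surface> push(Surface surface) {
        pending_.push_back(std::move(surface));
        if (pending_.size() <= depth_) {
            return std::nullopt;  // still filling up, nothing to process yet
        }
        return pop_oldest();
    }

    // At end of stream (or on a seek), drain what is left, one call per surface.
    std::optional<Surface> flush() {
        if (pending_.empty()) {
            return std::nullopt;
        }
        return pop_oldest();
    }

private:
    Surface pop_oldest() {
        Surface oldest = std::move(pending_.front());
        pending_.pop_front();
        return oldest;
    }

    std::size_t depth_;
    std::deque<Surface> pending_;
};
```

With `DelayQueue<int> q(2)`, the first two `push` calls return nothing and the third returns the first surface, so every frame is processed two frames late. This needs C++17 for `std::optional`.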
__________________
LAV Filters - open source ffmpeg based media splitter and decoders

Last edited by nevcairiel; 12th January 2012 at 17:44.
Old 12th January 2012, 17:46   #8084  |  Link
SamuriHL
Registered User
 
Join Date: May 2004
Posts: 5,351
Quote:
Originally Posted by nevcairiel View Post
I remembered something which I did for the CUVID decoder waaaay back.

I just store each surface in a queue, and only once the queue is full do I start processing them, one at a time. This gives the GPU time to finish rendering to a surface before I access it. A queue of 2 frames makes all the difference: just a tiny bit of delay to give the GPU some breathing room.
Speed is now 99% that of CUVID.
Well done!!
__________________
HTPC: Windows 11, AMD 5900X, RTX 3080, Pioneer Elite VSX-LX303, LG G2 77" OLED
Old 12th January 2012, 17:50   #8085  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by nevcairiel View Post
I remembered something which I did for the CUVID decoder waaaay back.

I just store each surface in a queue, and only once the queue is full do I start processing them, one at a time. This gives the GPU time to finish rendering to a surface before I access it. A queue of 2 frames makes all the difference: just a tiny bit of delay to give the GPU some breathing room.
Speed is now 99% that of CUVID.
So how many FPS do you get with ATI cards? I'm wondering why I got such miserable results with my madNV12Test tool. All I did there was LockRect + memcpy in a loop, with no real decoding going on...
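
For reference, a rough portable stand-in for that kind of copy loop; this is not madNV12Test itself. It copies an NV12-sized buffer repeatedly and reports copies per second. The names `nv12_frame_bytes` and `measure_copy_fps` are hypothetical, and plain system memory is used in place of a surface locked with `LockRect`, so it only shows the shape of the test and will not reproduce slow read-back from video memory.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// NV12 stores a full-resolution luma plane plus a half-height interleaved
// chroma plane, i.e. 1.5 bytes per pixel.
std::size_t nv12_frame_bytes(std::size_t width, std::size_t height) {
    return width * height * 3 / 2;
}

// Copies `src` into `dst` `iterations` times and returns copies per second.
// Assumption: ordinary system memory stands in for the locked surface here;
// reading back from a real D3D surface can be far slower.
double measure_copy_fps(const std::vector<std::uint8_t>& src,
                        std::vector<std::uint8_t>& dst,
                        int iterations) {
    if (src.empty() || iterations <= 0) {
        return 0.0;  // nothing to measure
    }
    dst.resize(src.size());
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        std::memcpy(dst.data(), src.data(), src.size());
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    if (elapsed.count() <= 0.0) {
        return 0.0;  // clock too coarse to time this run
    }
    return iterations / elapsed.count();
}
```

A real test would replace the `memcpy` source with the pointer returned by locking the decoder surface, which is where the driver and memory type start to matter.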
Old 12th January 2012, 17:51   #8086  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,344
Quote:
Originally Posted by madshi View Post
So how many FPS do you get with ATI cards? I'm wondering why I got such miserable results with my madNV12Test tool. All I did there was LockRect + memcpy in a loop, with no real decoding going on...
I have no ATI card, but when I post a first real test version, I'm sure people will be glad to benchmark it.
Old 12th January 2012, 17:54   #8087  |  Link
SamuriHL
Registered User
 
Join Date: May 2004
Posts: 5,351
I'll be able to test it later for sure. Very interested in this.
Old 12th January 2012, 17:56   #8088  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Ah sorry, I misunderstood. I thought you had 99% of CUVID performance with ATI DXVA!
Old 12th January 2012, 17:57   #8089  |  Link
SamuriHL
Registered User
 
Join Date: May 2004
Posts: 5,351
He might. That's what we need to test.
Old 12th January 2012, 18:00   #8090  |  Link
goldie
Registered User
 
Join Date: Jun 2009
Posts: 22
Quote:
Originally Posted by SamuriHL View Post
I'll be able to test it later for sure. Very interested in this.
+1
Old 12th January 2012, 18:09   #8091  |  Link
CruNcher
Registered User
 
Join Date: Apr 2002
Location: Germany
Posts: 4,926
@nev
Ehh, how does it compare with the IMSDK layer (QuickSync decoder overhead)? Do you think that's not really needed anymore if this runs fine generically on every GPU, or did you mean that you currently reach the same speed as CUVID (on your NVIDIA VPx), which would still be less than with Intel's MSDK DXVA2 implementation?
__________________
all my compares are riddles so please try to decipher them yourselves :)

It is about Time

Join the Revolution NOW before it is to Late !

http://forum.doom9.org/showthread.php?t=168004

Last edited by CruNcher; 12th January 2012 at 18:22.
Old 12th January 2012, 18:25   #8092  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,344
Intel has some issues with the "generic" DXVA2; I would have to figure out exactly which settings it needs to run properly. So why not just use the MSDK instead?
Old 12th January 2012, 18:27   #8093  |  Link
SamuriHL
Registered User
 
Join Date: May 2004
Posts: 5,351
Is this stuff checked in to the repository yet? If so, I can start building it and take a look.
Old 12th January 2012, 18:36   #8094  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,344
Ok, here is a more polished version without debugging.

http://files.1f0.de/lavf/LAVFilters-...2-perftest.zip

It would be great if some people could benchmark this using the following clip:

http://xhmikosr.1f0.de/samples/2160p...x264.CRF23.mkv

I would recommend benchmarking with GraphStudio, mostly because it's so easy.
It would also be great if you could mention which kind of CPU you're running, so I know which kind of memory copy is being used.

As a reference, my NVIDIA with a VP4 decoder does around 73fps, both in DXVA2 and CUVID.
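
To put such an fps figure into perspective, here is a small sketch that converts it into the amount of data copied back per second. It assumes the clip is 3840x2160 and that frames are read back as 8-bit NV12 (1.5 bytes per pixel); neither is stated above, and the function name is hypothetical.

```cpp
#include <cstdint>

// Bytes per second that must be copied back from the GPU when every decoded
// frame is read out as 8-bit NV12 (1.5 bytes per pixel, an assumption here).
std::uint64_t nv12_readback_bytes_per_second(std::uint64_t width,
                                             std::uint64_t height,
                                             std::uint64_t fps) {
    const std::uint64_t frame_bytes = width * height * 3 / 2;
    return frame_bytes * fps;
}
```

For the assumed 3840x2160 frame size at 73 fps this comes to 908,236,800 bytes, roughly 0.9 GB, per second.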

Disclaimer:
- This version has not been tested much, and might blow up (and possibly take your PC with it).
- DXVA2 is a Vista/7 technology; don't expect it to work on XP.
- Fallback to software decoding is still unfinished, and will most likely crash.
- Seeking in VC-1 is still somewhat rough; H264 seems to work better, however.
- VC-1 interlaced decoding is not yet implemented (if I can manage to finish it at all).

Last edited by nevcairiel; 12th January 2012 at 18:50.
Old 12th January 2012, 18:48   #8095  |  Link
hoborg
Registered User
 
Join Date: Nov 2008
Posts: 454
Radeon HD 6750: (it stutters a little)
__________________
Working machine: Win10x64 + Intel Skull Canyon
My HTPC.

How to start with Bitcoin
Old 12th January 2012, 18:48   #8096  |  Link
Sebastiii
Registered User
 
Join Date: Oct 2009
Location: France
Posts: 615
Amazing, thanks!
__________________
HTPC : i7 920 6Go Win10(x64) / Nvidia 1050Ti / P6T Deluxe / Harman-Kardon AVR-355.
Old 12th January 2012, 18:59   #8097  |  Link
SamuriHL
Registered User
 
Join Date: May 2004
Posts: 5,351
@hoborg....what is that? And where can I find it?
Old 12th January 2012, 19:00   #8098  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,344
I would guess that's GraphStudio Next (http://code.google.com/p/graph-studio-next/)
Old 12th January 2012, 19:00   #8099  |  Link
CruNcher
Registered User
 
Join Date: Apr 2002
Location: Germany
Posts: 4,926
Yup, it stresses user as well as kernel time a little more and thus isn't as efficient as the QuickSync implementation on Intel hardware, but it's not bad at all. For 1080p at 60 fps, the differences during playback (including render overhead, so not raw benchmark performance) were:
- Cyberlink and Arcsoft DXVA: 2%
- CoreAVC DXVA: 4%
- Intel (Egur/Nev) DXVA2: 12%
- Generic DXVA2 (Nev): 17%
- Libav (Nev): 25%

Last edited by CruNcher; 12th January 2012 at 19:18.
Old 12th January 2012, 19:06   #8100  |  Link
sneaker_ger
Registered User
 
Join Date: Dec 2002
Posts: 5,565
Radeon HD 5850
~50.3 fps