Old 7th September 2011, 09:37   #21  |  Link
CruNcher
Registered User
 
Join Date: Apr 2002
Location: Germany
Posts: 4,949
@egur
Samples coming: these are the ones that crash only with LAV Splitter (there are many more, but I guess most of those crashes come from the same issue).

http://www.mediafire.com/?94f02bvzqhask37 <- Crash on Load with Lav Splitter
http://www.mediafire.com/?aocp4j26pj6i2qw <- Crash in the middle of playback

Here is something else (not as severe, but it should be looked at anyway):

http://www.mediafire.com/?7ob1wsdt1aon1ou <- Sync issue with Lav Audio

Please don't fix those, though, if the fix could cause problems for other splitters (or if you think it could); in that case talk with nevcairiel first.
__________________
all my compares are riddles so please try to decipher them yourselves :)

It is about Time

Join the Revolution NOW before it is to Late !

http://forum.doom9.org/showthread.php?t=168004

Last edited by CruNcher; 7th September 2011 at 10:35.
Old 7th September 2011, 12:28   #22  |  Link
Gser
Registered User
 
Join Date: Apr 2008
Posts: 410
Well, it seems this isn't supported on the Core i7-860, as the graphics driver won't install.
Old 7th September 2011, 12:38   #23  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 9,733
Quote:
Originally Posted by CruNcher View Post
Please don't fix those, though, if the fix could cause problems for other splitters (or if you think it could); in that case talk with nevcairiel first.
Since basically all other codecs play fine, it's doubtful at best.

I do things the way I think they are meant to work, not how old stuff was done. It's the only way to break the cycle of old bugs being re-introduced in every new component, just because they took something old as a template.
By doing this, I can play a lot more files that just fail on other splitters. If that means ruffling some feathers on some codecs, so be it. I provide my own audio and video codecs anyway.

With CPUs getting ever faster and more efficient, the time of hardware video decoders in PCs is nearly over, IMHO.
The only thing missing really is a good way to use the GPU for deinterlacing without relying on EVR. That's basically the only reason I still use LAV CUVID myself: for the deinterlacing (and interlaced VC-1 decoding).
__________________
LAV Filters - open source ffmpeg based media splitter and decoders

Last edited by nevcairiel; 7th September 2011 at 12:47.
Old 7th September 2011, 13:53   #24  |  Link
Blight
Software Developer
 
Join Date: Oct 2001
Location: Israel
Posts: 995
nev:
I agree that it's more important for videos to play correctly than to support buggy code. This is especially true when dealing with new decoders whose developer is still active.
With regard to CPU vs. hardware acceleration, you're only right on the desktop. With laptops/tablets/cellphones, hardware acceleration allows for a longer battery life.
__________________
Yaron Gur
Zoom Player . Lead Developer
Old 7th September 2011, 14:04   #25  |  Link
sneaker_ger
Registered User
 
Join Date: Dec 2002
Posts: 5,383
Quote:
Originally Posted by Gser View Post
Well, it seems this isn't supported on the Core i7-860, as the graphics driver won't install.
Core i7-860 has the old Lynnfield architecture, not Sandy Bridge, so it was to be expected.
Old 7th September 2011, 14:12   #26  |  Link
nevcairiel
Quote:
Originally Posted by Blight View Post
With laptops/tablets/cellphones, hardware acceleration allows for a longer battery life.
With tablets and cellphones that may be true; for laptops, however, I'm not 100% sure. Maybe in this generation that's still true, but for the future....

For CPUs, efficiency is also becoming a key factor, while GPUs apparently always go for raw power. On a desktop PC, the power usage difference today between DXVA2 and a software codec isn't all that big to begin with, so if the CPU gets faster and more efficient at the same time, there might be a point where the power argument is invalid (on PC/laptop parts; SoC parts for tablets and phones are still far away from that).

Anyhow, I have quite some hope for Intel's future CPUs; the Tri-Gate transistors will be quite a nice boost in both performance and efficiency.
Old 7th September 2011, 17:32   #27  |  Link
CruNcher
Quote:
Originally Posted by nevcairiel View Post
With tablets and cellphones that may be true; for laptops, however, I'm not 100% sure. Maybe in this generation that's still true, but for the future....

For CPUs, efficiency is also becoming a key factor, while GPUs apparently always go for raw power. On a desktop PC, the power usage difference today between DXVA2 and a software codec isn't all that big to begin with, so if the CPU gets faster and more efficient at the same time, there might be a point where the power argument is invalid (on PC/laptop parts; SoC parts for tablets and phones are still far away from that).

Anyhow, I have quite some hope for Intel's future CPUs; the Tri-Gate transistors will be quite a nice boost in both performance and efficiency.
~5 W (SB decoder) vs. ~12-15 W (the best software decoders) is a real difference, also for normal Blu-ray playback, where you would have to add the whole player overhead (Java, decryption-related tasks, though not the decryption itself) too; it adds up.
And yep, Tri-Gate will push that further down, nearer to SoC decoders.
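
To put the ~5 W vs. ~12-15 W figures in perspective, here is a back-of-the-envelope sketch in Python. The two-hour film length and the 60 Wh laptop battery are assumptions made up for illustration, not figures from this thread, and the calculation covers the decoder's draw only (no display, player overhead, or the rest of the system).

```python
def energy_wh(power_w: float, hours: float) -> float:
    """Energy in watt-hours drawn at a constant power over a duration."""
    return power_w * hours


FILM_HOURS = 2.0    # assumed film length
BATTERY_WH = 60.0   # assumed laptop battery capacity

# Power figures quoted in the post above.
decoders = {
    "SB hardware decoder (~5 W)": 5.0,
    "software decoder, low (~12 W)": 12.0,
    "software decoder, high (~15 W)": 15.0,
}

usage = {name: energy_wh(watts, FILM_HOURS) for name, watts in decoders.items()}

for name, wh in usage.items():
    share = 100.0 * wh / BATTERY_WH
    print(f"{name}: {wh:.0f} Wh, {share:.0f}% of the assumed battery")
```

Under these assumptions the gap between hardware and software decoding is 14 to 20 Wh per film, which is why the argument matters more on battery than on a desktop.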
Last edited by CruNcher; 7th September 2011 at 17:43.
Old 7th September 2011, 19:03   #28  |  Link
egur
QuickSync Decoder author
 
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
New version released!

Download version 0.11 alpha:
32 bit http://www.multiupload.com/FCBQAAARUI
64 bit http://www.multiupload.com/6O3BXXXPAC

Revision history:
v0.11:
* Fixed skipping issues. Seeks are now instant.
* Fixed handling of sequence header for all supported formats. Fixes image corruption in some clips.
* Created 64bit version. Very limited testing was done with this one.
__________________
Eric Gur,
Processor Application Engineer for Overclocking and CPU technologies
Intel QuickSync Decoder author
Intel Corp.
Old 7th September 2011, 19:05   #29  |  Link
egur
CruNcher, I'll look into the crashes tomorrow. Thanks a lot for your help
Old 7th September 2011, 20:33   #30  |  Link
egur
Quote:
Originally Posted by nevcairiel View Post
...
The only thing missing really is a good way to use the GPU for deinterlacing without relying on EVR. That's basically the only reason I still use LAV CUVID myself: for the deinterlacing (and interlaced VC-1 decoding).
I've considered adding HW deinterlacing to the decoder; it's not too complicated. But the extra copying I'd have to do makes this a bad solution (ATM).
If you want, I can export the D3D surface w/o copying and apply some post-processing to it, including DI. Enabling video post-processing is high on my list after root-causing the current bugs.

CruNcher, I'll look into the crashes tomorrow. Thanks a lot for your help
Last edited by egur; 8th September 2011 at 22:57.
Old 9th September 2011, 16:23   #31  |  Link
CruNcher
@egur
The 64-bit version doesn't work: it falls back to another decoder in the DirectShow chain if ffdshow-quicksync is selected for the format (libavcodec works).

Here is another stream that has problems with ffdshow-quicksync:

http://www.mediafire.com/download.php?cla9ncy0m1tb89w <- stops at start, no playback possible
Last edited by CruNcher; 9th September 2011 at 16:33.
Old 9th September 2011, 16:27   #32  |  Link
egur
Quote:
Originally Posted by CruNcher View Post
@egur
The 64-bit version doesn't work: it falls back to another decoder in the DirectShow chain if ffdshow-quicksync is selected for the format (libavcodec works).
I did very limited testing, only in GraphEdit, and it worked for a few clips. I'll try MPC-HC x64. Any other players to test? BTW, what was the setup (filters, content, etc.) so I can reproduce quickly?

BTW, I reproduced the crashes with LAV splitter on your samples but didn't have the time to debug yet.
Old 9th September 2011, 16:40   #33  |  Link
CruNcher
MPC-HC 64 Bit 3704
MPC-HC splitter standalone 64 bit (not internal, though should be the same as MPC-HC 3704 internal)
Lav Splitter 0.35 64 bit
Lav Audio 0.35 64 Bit
Renderer: Stability testing EVR (default)
Renderer: Shader Processing tests: EVR-CP

MPC-HC 64/32 3704 + standalone filters (binaries) can be found here http://xhmikosr.1f0.de/index.php?folder=bXBjLWhj

The 64-bit ffdshow build I used (replacing its components with your QuickSync ones) was http://sourceforge.net/projects/ffds...4.exe/download

Most of the important DirectShow players are based on these components anyway.

You could also test with the in-development AVSplitter (http://avsplitter.avmedia.su/en); like LAV Splitter, it is based on libavformat (uni*), whereas MPC-HC is native Windows-based code.

Nothing connects to the ffdshow-quicksync 64-bit decoder via the MPC-HC 64-bit internal or the standalone 64-bit filters (the container doesn't matter, and neither does the format).

With 32-bit and the same framework there are no connection issues.


BTW, here is a result from Intel's MFT decoder (copy overhead):

Renderer: Enhanced Video Renderer (Media Foundation)
Decoder: Intel® Hardware H.264 Decoder MFT
Decoder Device: ModeH264_VLD_NoFGT
Processor Device: ProgressiveDevice
Time: 00:05.685
Average FPS: 177,130
Min/Max FPS: Min: 170 Max: 178
CPU Usage (%): Avg: 36 Min: 33 Max: 38

For comparison, DXVA2 with no copy overhead:

Renderer: Enhanced Video Renderer (Media Foundation)
Decoder: Microsoft H264 Video Decoder MFT
Decoder Device: ModeH264_VLD_NoFGT_ClearVideo
Processor Device: ProgressiveDevice
Time: 00:02.238
Average FPS: 367,161
Min/Max FPS: Min: 343 Max: 386
CPU Usage (%): Avg: 09 Min: 07 Max: 12

In direct comparison, CyberLink's DXVA2:

Renderer: Enhanced Video Renderer (DirectShow)
Decoder: CyberLink Video Decoder
Decoder Device: ModeH264_VLD_NoFGT_ClearVideo
Processor Device: ProgressiveDevice
Time: 00:02.655
Average FPS: 379,183
Min/Max FPS: Min: 368 Max: 383
CPU Usage (%): Avg: 03 Min: 02 Max: 04

Current ffdshow-quicksync (copy overhead):

Renderer: Enhanced Video Renderer (DirectShow)
Decoder: ffdshow Video Decoder
Decoder Device: -
Processor Device: ProgressiveDevice
Time: 00:22.403
Average FPS: 44,948
Min/Max FPS: Min: 44 Max: 45
CPU Usage (%): Avg: 24 Min: 23 Max: 25

egur, you should ask the guys who made the MFT decoder (I guess they are part of the driver and/or SDK team) how they optimized the performance at only slightly higher overhead; I would say that is currently the furthest you could get in performance optimization with ffdshow-quicksync.


For comparison, here is the libavcodec decoder's efficiency on the 4 cores:

Renderer: Enhanced Video Renderer (DirectShow)
Decoder: LAV Video Decoder
Decoder Device: -
Processor Device: ProgressiveDevice
Time: 00:03.456
Average FPS: 291,299
Min/Max FPS: Min: 285 Max: 285
CPU Usage (%): Avg: 83 Min: 83 Max: 83
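
The runs above are easier to compare once normalised. Below is a small Python sketch with the numbers transcribed from the results above; the benchmark tool prints decimal commas, so "177,130" is read as 177.130 fps, and the time stamps are assumed to be in MM:SS.mmm form. The helper names are made up for this illustration. It derives the frame count of each run, the speed relative to the current ffdshow-quicksync build, and a crude fps-per-percent-CPU figure.

```python
# (decoder, elapsed time, average FPS, average CPU %), transcribed from the runs above.
RUNS = [
    ("Intel H.264 MFT (copy)", "00:05.685", "177,130", 36),
    ("Microsoft H264 MFT (DXVA2)", "00:02.238", "367,161", 9),
    ("CyberLink (DXVA2)", "00:02.655", "379,183", 3),
    ("ffdshow-quicksync (copy)", "00:22.403", "44,948", 24),
    ("LAV Video (libavcodec)", "00:03.456", "291,299", 83),
]
BASELINE = "ffdshow-quicksync (copy)"


def parse_seconds(stamp: str) -> float:
    """Convert an 'MM:SS.mmm' stamp to seconds (the format is an assumption)."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


def parse_fps(text: str) -> float:
    """Convert a decimal-comma number such as '177,130' to a float."""
    return float(text.replace(",", "."))


def summarize(runs, baseline=BASELINE):
    """Return one dict per run with derived, comparable figures."""
    base_fps = next(parse_fps(fps) for name, _, fps, _ in runs if name == baseline)
    rows = []
    for name, stamp, fps_text, cpu in runs:
        fps = parse_fps(fps_text)
        rows.append({
            "name": name,
            "frames": parse_seconds(stamp) * fps,  # frames decoded in the run
            "speedup": fps / base_fps,             # relative to the baseline
            "fps_per_cpu": fps / cpu,              # crude efficiency figure
        })
    return rows


for row in summarize(RUNS):
    print(f"{row['name']:<28} frames~{row['frames']:6.0f}  "
          f"x{row['speedup']:.2f}  {row['fps_per_cpu']:6.1f} fps per CPU %")
```

Time multiplied by average FPS comes to roughly 1007 frames for four of the runs but only about 822 for the Microsoft MFT run, so that run's figures may not be directly comparable with the others.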
Last edited by CruNcher; 9th September 2011 at 20:32.
Old 12th September 2011, 12:50   #34  |  Link
iwod
Registered User
 
Join Date: Apr 2002
Posts: 749
CruNcher is an SNSD fan.

And what movie is test.ts?
Old 12th September 2011, 21:54   #35  |  Link
egur
Hi CruNcher,

I've fixed some of the problems and I'll release a new version tomorrow.
• The 64-bit version is working. It was built incorrectly; that is fixed, and it now works in MPC-HC x64 (using the latest version, which is older than yours, BTW).
• Optimized CPU usage (faster copying from GPU to CPU). Changed memcpy to an SSE4.1 implementation, which is much faster, but I don't have numbers yet (it's now faster than libavcodec on an average 720p movie).
• More stable with LAV Splitter. The previous version crashed on several MPEG-2 transport streams with AVC1 (H.264) video. AVC header parsing is more robust (either a Media SDK bug or a LAV filter bug).
• Added time stamp stabilizing (transport stream issues).
• Added adaptive inverse telecine (29.97 --> 23.976) when the stream reports it. It falls back to the original frame rate when the content is "normal" (no repeating fields). This works great on the sample you've sent.
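
The adaptive inverse telecine item can be sketched as follows. This is a minimal Python illustration, not the decoder's actual code: it assumes the stream signals pulldown with a per-frame repeat-field flag (as MPEG-2 does with `repeat_first_field`), and the function names and the four-frame window are hypothetical. While the flags show the 3:2 pattern, frames get film-rate timestamps; when no fields repeat, the original video rate is kept.

```python
from fractions import Fraction

FIELD_RATE = Fraction(60000, 1001)  # NTSC field rate, ~59.94 fields/s
VIDEO_RATE = Fraction(30000, 1001)  # ~29.97 fps
FILM_RATE = Fraction(24000, 1001)   # ~23.976 fps


def coded_frame_rate(repeat_flags):
    """Frame rate implied by the flags: every coded frame is shown for
    two fields, plus one extra field when its repeat flag is set."""
    fields = sum(2 + int(flag) for flag in repeat_flags)
    return len(repeat_flags) * FIELD_RATE / fields


def choose_output_rate(repeat_flags, window=4):
    """Pick the film rate while half of the last `window` frames repeat a
    field (the 3:2 pulldown pattern); otherwise keep the video rate."""
    recent = list(repeat_flags)[-window:]
    if len(recent) == window and sum(recent) * 2 == window:
        return FILM_RATE
    return VIDEO_RATE


telecined = [True, False, True, False]   # 3 + 2 + 3 + 2 = 10 fields for 4 frames
normal = [False, False, False, False]    # 8 fields for 4 frames

print(float(coded_frame_rate(telecined)), float(choose_output_rate(telecined)))
print(float(coded_frame_rate(normal)), float(choose_output_rate(normal)))
```

Using exact fractions avoids the rounding drift that creeps in when 23.976 and 29.97 are handled as decimal approximations over long runs of timestamps.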

BTW, using shader/GPGPU video processing is asking for trouble with the HD 2000/3000. It's not comparable to mainstream or high-end cards. Even the simple Haali Video Renderer produces 7(!) fps (scaling 720p to a slightly higher resolution) on my laptop, regardless of decoder.

If you can point me to some VC1 clips, I'd appreciate it.
Old 12th September 2011, 21:58   #36  |  Link
egur
Quote:
Originally Posted by Superb View Post
That's great news. Btw, you might wanna look at VLC's git repository... They use DXVA2 acceleration and copy the frames back too. (under modules\codec\avcodec\dxva2.c)
Thanks!
I looked at the VLC code and found out they use an SSE4.1 instruction to copy from the GPU memory. I had to rewrite it using SSE4.1 intrinsics so that 64-bit builds would work. The results are nice: now I'm always faster than libavcodec on 720p (and larger) videos.
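
For a rough sense of how much data the GPU-to-CPU copy has to move (the SSE4.1 instruction in question is presumably the MOVNTDQA streaming load), here is a back-of-the-envelope Python sketch. It assumes NV12 output at 12 bits per pixel, the usual DXVA2 decoder surface format, and ignores row padding (pitch); the resolutions and frame rates are illustrative, not measurements from this thread.

```python
def nv12_frame_bytes(width: int, height: int) -> int:
    """Bytes in one NV12 frame: a full-size luma plane plus a half-size
    interleaved chroma plane, i.e. 1.5 bytes per pixel."""
    return width * height * 3 // 2


def copy_bandwidth_mb_s(width: int, height: int, fps: float) -> float:
    """Megabytes per second that must be copied to system RAM at a given rate."""
    return nv12_frame_bytes(width, height) * fps / 1e6


# Illustrative cases: real-time playback and a faster-than-real-time run.
for label, w, h, fps in [
    ("720p at 24 fps", 1280, 720, 24.0),
    ("1080p at 24 fps", 1920, 1080, 24.0),
    ("1080p at 180 fps", 1920, 1080, 180.0),
]:
    print(f"{label}: {copy_bandwidth_mb_s(w, h, fps):.0f} MB/s")
```

At real-time rates the volume is modest; the copy mainly becomes the bottleneck because plain reads from uncached GPU-visible memory are slow, which is the case the streaming load is meant to help with.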
Old 12th September 2011, 23:11   #37  |  Link
ajp_anton
Registered User
 
Join Date: Aug 2006
Location: Stockholm/Helsinki
Posts: 775
What exactly does this do?
With all this talk about copying frames from GPU to main memory, I get the impression that it's kind of like Nvidia's "CUDA" decoding, but it doesn't seem to be working properly.

CPU usage on a ~30Mbit 1080p video (i7-2600K):
"Quicksync": 10%
ffdshow (libavcodec): 7%
LAV: 6%
DXVA (MPC-HC): 0%

Last edited by ajp_anton; 12th September 2011 at 23:16.
Old 13th September 2011, 06:37   #38  |  Link
nevcairiel
Quote:
Originally Posted by egur View Post
Thanks!
I looked at the VLC code and found out they use an SSE4.1 instruction to copy from the GPU memory. I had to rewrite it using SSE4.1 intrinsics so that 64-bit builds would work. The results are nice: now I'm always faster than libavcodec on 720p (and larger) videos.
Yeah, that SSE4.1 instruction is great for this task. Intel really knows what they're doing.
Old 13th September 2011, 10:38   #39  |  Link
Blight
anton:
That is exactly what this is: the Intel Sandy Bridge equivalent of NVIDIA's "CUDA" decoding.
It's an initial build; things will get better as more content is tested.
Old 13th September 2011, 14:18   #40  |  Link
nevcairiel
I've been thinking about this thing today, and I've been wondering: what exactly does the Media SDK offer over a DXVA2 decoder (assuming you copy the frame back into system RAM as well)?
I'm only interested in actual user-visible advantages. I realize coding might be simpler with the SDK, but then DXVA2 works with more GPUs.

PS:
It's not the same as CUDA decoding; CUDA is handled quite differently. As I understand it, the MSDK is just a "wrapper" around DXVA2, hence my question.

Tags
ffdshow, h264, intel, mpeg2, quicksync, vc1, zoom player

All times are GMT +1.
Copyright ©2000 - 2019, vBulletin Solutions Inc.