View Full Version : Intel QuickSync Decoder - HW accelerated FFDShow decoder with video processing


CruNcher
2nd January 2012, 19:48
@ Egur
Good news: the lock problem is history :) Crazy, it's as noiseless as DXVA now. I can see the same peaking on the keyframes that I saw with Cyberlink's DXVA decoder (without any other background noise in the graph) :)
No lock-lost problem on that sample anymore; it's perfectly telecined and smooth as butter with MPC-HC Splitter as well as Lav Splitter :D

Really good work :) The peaks are even lower now than what Cyberlink produces. Most results are the same with MPC-HC and Lav Splitter, though Arcsoft's decoder doesn't like Lav Splitter at all, and Mainconcept's DXVA is either broken or tuned exclusively for their framework :D


Test Setup:
Win7 Aero
MPC-HC TS Splitter (Source)
Lav Audio (Sync Correction)
EVR Custom dfr3882i (Experimental by JanWillem32)

Cyberlink Peaks (GPU)

http://img46.imageshack.us/img46/2943/cyberlinkdxvapeaks.png

FFdshow Quicksync Peaks (GPU)

http://img718.imageshack.us/img718/2454/ffdshowquicksyncpeaks.png

Lav Video Peaks (CPU)

http://img810.imageshack.us/img810/3343/lavvideopeaks.png

Mainconcept SDK 9 Peaks (GPU)

http://img269.imageshack.us/img269/1192/mainconceptsdk9dxvapeak.png

Mainconcept SDK 9 Peaks (CPU)

http://img856.imageshack.us/img856/7070/mainconceptsdk9cpupeaks.png

Arcsoft Peaks (GPU)

http://img18.imageshack.us/img18/3952/arcsoftdxvapeaks.png

Arcsoft Peaks (GPU) (LAV Splitter): out of sync, too fast, a complete disaster; audio is way behind (though it seems to be able to get those keyframes right)

http://img683.imageshack.us/img683/73/arscoftpeaksdxvalavspli.png

2 ms is really wow, Egur (and that is with those peaks; without them it's in the range of the best DXVA implementations, 0.4xx ms) :)

Initially I just wanted to make sure the lock-lost issue is history, and yep, it is. That was a big problem experience-wise. I'm going to continue and will report if I find other issues :)

egur
2nd January 2012, 21:17
Thanks CruNcher :thanks:
Can you explain what we're seeing here?
Are you referring to the low render latency?

egur
2nd January 2012, 21:30
Hey Eric,

how did you ever manage to compile a release build of the decoder? :D

1>d:\dev\multimedia\lavfsplitter\intel qs\qsdecoder\intelquicksyncdecoder\quicksyncdecoder.cpp(643): error C2220: warning treated as error - no 'object' file generated
1>d:\dev\multimedia\lavfsplitter\intel qs\qsdecoder\intelquicksyncdecoder\quicksyncdecoder.cpp(643): warning C4715: 'CQuickSyncDecoder::SetD3DDeviceManager' : not all control paths return a value

In the ffdshow codebase there's a different project file, which for some reason was lowered to warning level 3 and had "treat warnings as errors" removed.

update
SVN rev12 fixes the compilation issue and contains your requested features.
ffdshow was updated to rev4218.
I will not release a build since the functionality difference is zero (in ffdshow).
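
For anyone who has not run into C4715 before, here is a minimal sketch of the pattern behind it. The function name, the Status enum and the parameters are hypothetical stand-ins; the real CQuickSyncDecoder::SetD3DDeviceManager code is not shown in this thread.

```cpp
#include <cassert>

// Hypothetical status codes, for illustration only.
enum class Status { Ok, InvalidArg, Unsupported };

// Stand-in for a setter with early returns. Dropping the last return
// statement is what makes MSVC report C4715, and with "treat warnings as
// errors" enabled the build then stops with C2220.
Status SetDeviceManager(const void* manager, bool hwAvailable)
{
    if (manager == nullptr)
        return Status::InvalidArg;

    if (hwAvailable)
        return Status::Ok;

    // The path that is easy to forget: every control path needs a value.
    return Status::Unsupported;
}

// Exercises all three paths.
void CheckAllPaths()
{
    int device = 0;
    assert(SetDeviceManager(nullptr, true) == Status::InvalidArg);
    assert(SetDeviceManager(&device, true) == Status::Ok);
    assert(SetDeviceManager(&device, false) == Status::Unsupported);
}
```

Keeping "treat warnings as errors" on in every project file that builds the decoder would catch this kind of omission in all configurations.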

nevcairiel
2nd January 2012, 22:00
I found something odd.

I did a very basic integration of your decoder now, and when i feed it some 1920x1080 movies, it reports to me that they are 1919x1079.
This happens for example with the files 7-10 from that DXVA test set earlier.

Any idea what's going on?
Doubt I could've broken something, there aren't really any screws to turn. :)
I did not give it any configuration yet, just running with whatever it has for defaults.

PS:
Thanks for the changes!

CruNcher
2nd January 2012, 22:04
Thanks CruNcher :thanks:
Can you explain what we're seeing here?
Are you referring to the low render latency?

Jep :) on that clip with the lock lost issue it was crazy (see video recording of previous result) :D

egur
2nd January 2012, 22:30
I found something odd.

I did a very basic integration of your decoder now, and when i feed it some 1920x1080 movies, it reports to me that they are 1919x1079.
This happens for example with the files 7-10 from that DXVA test set earlier.

Any idea what's going on?
Doubt I could've broken something, there aren't really any screws to turn. :)
I did not give it any configuration yet, just running with whatever it has for defaults.

PS:
Thanks for the changes!

The output rects follow the Windows convention (the way you pass dimensions to the Win32 API):
So 1919,1079 is the bottom-right pixel.
This allows simple conversion to CRect, etc.
If it's confusing, I'll add a comment in the struct's code.

nevcairiel
2nd January 2012, 22:35
So, I can assume that width/height are always +1? :D
For the record, I have never seen a rect used that way; the rects in the DShow media type say 0,0,1920,1080

Edit:
Checked the code, so yeah, always +1.
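
To make the two conventions concrete, here is a small sketch. The struct and function names are made up for illustration and are not the decoder's actual types; the only facts taken from the thread are that the decoder reports the last pixel (1919,1079) while the DirectShow media type says 0,0,1920,1080. Win32 RECT values are normally read the second way, with right/bottom one past the last pixel.

```cpp
#include <cassert>

// Inclusive rectangle: (right, bottom) is the last pixel inside the picture,
// which is how the decoder's output rects are described above.
struct InclusiveRect { int left, top, right, bottom; };

// Exclusive rectangle: (right, bottom) lies one past the last pixel, as in
// the 0,0,1920,1080 rects of a DirectShow media type.
struct ExclusiveRect { int left, top, right, bottom; };

int Width(const InclusiveRect& r)  { return r.right - r.left + 1; }
int Height(const InclusiveRect& r) { return r.bottom - r.top + 1; }

ExclusiveRect ToExclusive(const InclusiveRect& r)
{
    assert(r.right >= r.left && r.bottom >= r.top); // at least one pixel
    return ExclusiveRect{ r.left, r.top, r.right + 1, r.bottom + 1 };
}
```

With the rect from the post, Width gives 1920 and Height gives 1080, and ToExclusive yields 0,0,1920,1080.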

CruNcher
2nd January 2012, 23:48
Egur

I started testing in progressive mode (forcing all interlaced content to be displayed as progressive), and so far in the .ts part I found 1 sample that behaves rather strangely: very high latency, but only at the beginning (looks crazy, 6 ms for a 30 fps 1080i stream), plus audio sync issues. What's really crazy is that as soon as I change the player size (letting Jan's renderer do a refresh) it gets in sync ;) I set that stream aside for now for further testing (very interesting, I never saw a stream cause something like this with the renderer). Might be another locking issue :(

I found 2 other streams: one causes a black screen and the other even a hang of ffdshow or MPC-HC (closing doesn't unload it, I have to kill it). The good old beyonce tssplit is also failing, with audio only :(

Also, the stream posted recently by someone experiencing strange behavior with it on Android and iPad, dongle_3 http://forum.doom9.org/showthread.php?t=163695 , is out of sync from the start with ffdshow quicksync (MPC-HC splitter and Lav Audio)

MC.ts still fails. I guess we can slowly say it's a hardware issue and won't be fixed? (See the start of this thread, the VC-1 special interlace case.) Sad thing, seeing that Nvidia was able to support it very fast: one driver release after the report to Donald it was supported :)

OK, I finished the MPC-HC splitter run-through and am now continuing with a Lav Splitter round and comparing issues :) Then I'll make some tests, post samples and issues, and then test the deinterlacing part of things :)

egur
3rd January 2012, 00:01
Egur
I started testing in progressive mode (forcing all interlaced content to be displayed as progressive), and so far in the .ts part I found 1 sample that behaves rather strangely: very high latency, but only at the beginning (looks crazy, 6 ms for a 30 fps 1080i stream), plus audio sync issues. What's really crazy is that as soon as I change the player size (letting Jan's renderer do a refresh) it gets in sync ;) I set that stream aside for now for further testing (very interesting, I never saw a stream cause something like this with the renderer)
Strange, I never saw such a thing. Please upload it (multiupload please).

I found 2 other streams: one causes a black screen and the other even a hang of ffdshow or MPC-HC (closing doesn't unload it, I have to kill it). The good old beyonce tssplit is also failing, with audio only :(

Also, the stream posted recently by someone experiencing strange behavior with it on Android and iPad, dongle_3 http://forum.doom9.org/showthread.php?t=163695 , is out of sync from the start with ffdshow quicksync (MPC-HC splitter and Lav Audio)
Please share the problematic clips.

OK, I finished the MPC-HC splitter run-through and am now continuing with a Lav Splitter round and comparing issues :) Then I'll make some tests and post samples and issues :)
Great! Your problematic clips sometimes find hard-to-find bugs in my code. They really help.

CruNcher
3rd January 2012, 02:24
WOW, I'm baffled: almost my entire test database (205 tests) works with Lav Splitter (TS part) + ffdshow quicksync. It doesn't come from nothing, I bugged nev hard with samples and cases ;) but I really feel that was worth it. Great job nev, great job libav/ffmpeg team, great job egur :)

Yep, I'm going to upload the 3 problems I found that I think might be fixable and that fail with both MPC-HC and Lav Splitter. One is a hard unsync issue, but it's a corrupt stream, and I already tested a VideoRedoed version that works flawlessly; Mplayer itself also gets in sync fast, same for Lav Video (though obviously it would be a big bonus if you could get it working without hurting the current overall stability, which is excellent in combination with .ts, Lav Splitter 0.43 and EVR Custom) :)

The other issue makes me more nervous, though: it seems to be an H.264 (x264) decoding error :( that doesn't happen with software decoding or with another DXVA decoder.

Anyway, I'm preparing 3 smaller samples that show the issues. I'm not sure about the refresh issue, which could be an encoder error (maybe that logo animation is the cause?), as this switch in frame rate looks strange: it seems to switch from 30 fps (unsync state) to 29 fps (sync state). It takes a while to recover, though refreshing or seeking seems to force it faster :)

So here we go

Decoding Error = http://www.mediafire.com/?rld8gnlh52f03ud (not good) (Cyberlink DXVA = Fail, CoreAVC DXVA = OK, CoreAVC = OK, Arcsoft DXVA = OK, ffdshow-quicksync = Fail, Lav Video = OK, Mainconcept DXVA = Fail, Mainconcept = Fail, Microsoft DTV = Fail, DivX = Fail, Mirillis = Fail, Potplayer DXVA = OK)
Refresh Lock Problem = http://www.mediafire.com/?8u0sn24cfggx6dq (that is the strange behaving one, we might need jan here also involved as refreshing the experimental renderer also locks the what seems correct fps 29.97 seeking though seems to have the same effect i wonder if its related to the corrupt unsync issue)
Sync issue = http://ibc.cdngc.net/Avidan/dongle_3.ts (Sync can be forced by seeking)

Hehe, I guess the sync issue dubilev's stream is suffering from has the same cause as mine, so if his sample gets fixed, mine should be fixed too. I already found out it's hard to get such a sample: most of the time when you cut, you either create a new hard unsync that can't sync on seek anymore, or you fix it entirely (depending on the muxer) ;)

So here is another sample of the unsync issue additionally = http://www.multiupload.com/QL5F0FRM7O (Sync can be forced by seeking)

egur
3rd January 2012, 14:16
@CruNcher,
I'll take a look at the clips in the following days and root cause the individual problems. I'll report back.

nevcairiel
3rd January 2012, 15:57
I did some more tests, and it looks like i get the progressive flag on every frame of some interlaced H264 streams.

Seems to only affect H264 so far, and from the looks of it, the files i tested are MBAFF.
PAFF seems to be sporadically wrong (it has interlaced flags for a while, and then not, and then again)
"Normal" interlaced is fine.

MBAFF: http://files.1f0.de/samples/Test_clip_avc.1080i59.94.ac3.5.1.mkv
PAFF: http://files.1f0.de/samples/premiere-paff.ts

Sorry if this was discussed before, i remember vague mentions of such problems, but nothing conclusive. :
Seems like a bug in the Media SDK APIs to me.

For the record, NVIDIA had similar issues at the beginning of their CUDVID API, but they worked it out pretty good.

CruNcher
3rd January 2012, 16:28
Yep, I also reported these MBAFF issues once, but currently I'm not focusing on deinterlacing but on stability as a whole first. The next step is again the whole line of deinterlacing issues, though with issues like the refresh lock problem in an interlaced stream, IMHO it makes no sense to test deinterlacing, as it will give a failing result anyway until the main issue is fixed :)

hhb97b
3rd January 2012, 19:38
@ Egur
Good news: the lock problem is history :) Crazy, it's as noiseless as DXVA now. I can see the same peaking on the keyframes that I saw with Cyberlink's DXVA decoder (without any other background noise in the graph) :)
No lock-lost problem on that sample anymore; it's perfectly telecined and smooth as butter with MPC-HC Splitter as well as Lav Splitter :D

Really good work :) The peaks are even lower now than what Cyberlink produces. Most results are the same with MPC-HC and Lav Splitter, though Arcsoft's decoder doesn't like Lav Splitter at all, and Mainconcept's DXVA is either broken or tuned exclusively for their framework :D


CruNcher: do you know what the green and red line represent in the graph? I can't find a definition anywhere.

egur
3rd January 2012, 19:54
I did some more tests, and it looks like i get the progressive flag on every frame of some interlaced H264 streams.
...

I'll take a look.
The flagging issue was something I was working on that shouldn't have been committed :(
I want to report that although the image is interlaced, it is packed as a progressive frame.
The DS flags are a bit confusing in this regard - I didn't see an option to report this.
Please sync to latest revision (14).

Unfortunately, I'm very short on time for proper testing, so if this breaks something, comment out the following code in QuickSync.cpp (line 668)

    // Frame has progressive structure but might be film type as well
    else if (picStruct & MFX_PICSTRUCT_PROGRESSIVE)
    {
        flags |= AM_VIDEO_FLAG_WEAVE;
    }
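
For context, here is a sketch of what a complete picStruct-to-sample-flags mapping could look like. Only the progressive branch comes from the snippet above; the other branches are an assumption and not the actual QuickSync.cpp code. The constants are local copies whose values are assumed to match the Media SDK (MFX_PICSTRUCT_*) and DirectShow (AM_VIDEO_FLAG_*) headers.

```cpp
#include <cassert>
#include <cstdint>

// Local copies of the flag values, assumed to match the Media SDK
// (MFX_PICSTRUCT_*) and DirectShow (AM_VIDEO_FLAG_*) definitions.
constexpr std::uint32_t kPicProgressive   = 0x01;
constexpr std::uint32_t kPicFieldTff      = 0x02;
constexpr std::uint32_t kPicFieldBff      = 0x04;
constexpr std::uint32_t kPicFieldRepeated = 0x10;

constexpr std::uint32_t kAmField1First = 0x04;
constexpr std::uint32_t kAmWeave       = 0x08;
constexpr std::uint32_t kAmRepeatField = 0x40;

// Assumed mapping; only the progressive branch is taken from the post.
std::uint32_t ToSampleFlags(std::uint32_t picStruct)
{
    std::uint32_t flags = 0;

    if (picStruct & kPicFieldTff) {
        flags |= kAmField1First;   // interlaced, top field first
    } else if (picStruct & kPicFieldBff) {
        // interlaced, bottom field first: FIELD1FIRST stays clear
    } else if (picStruct & kPicProgressive) {
        flags |= kAmWeave;         // progressive frame: weave, do not bob
    }

    if (picStruct & kPicFieldRepeated)
        flags |= kAmRepeatField;   // repeated field (soft telecine)

    return flags;
}

// Small usage check of the progressive and top-field-first paths.
void CheckMapping()
{
    assert(ToSampleFlags(kPicProgressive) == kAmWeave);
    assert(ToSampleFlags(kPicFieldTff) == kAmField1First);
}
```

Commenting out the progressive branch, as suggested above, would make progressive frames come out with no flags set in this sketch.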

CruNcher
3rd January 2012, 20:05
CruNcher: do you know what the green and red line represent in the graph? I can't find a definition anywhere.

http://www.ostrogothia.com/?page_id=1218

Though I leave the whole syncing to Aero, which does a great job, so most of this isn't valid for Aero. If both lines match each other you have the best sync; anything that causes interference becomes immediately visible as peaks, be it issues from outside or inside. The OSD itself needs some time, though: if you just need the sync graph, you can completely disable the real-time statistics updating, which will improve the whole playback a tad depending on the configuration. :)
If you change something you also get peaks, as the experimental renderer refreshes after some msec (this is the biggest drawback of the current trunk: every change becomes a small interruption in playback, also shown as a peak)

nevcairiel
3rd January 2012, 20:57
I'll take a look.
The flagging issue was something I was working on that shouldn't have been committed :(
I want to report that although the image is interlaced, it is packed as a progressive frame.
The DS flags are a bit confusing in this regard - I didn't see an option to report this.
Please sync to latest revision (14).

You're right, there are no options for this, which is why most algorithms just work on the assumption that it's interlaced chroma, which works OK (but not perfectly) for progressive, and doesn't screw up on interlaced.

Anyhow, r14 seems to be working properly. Progressive streams are still interlaced 0, and interlaced streams look good.

Thanks for the fast fix. :)
Seems to be an odd setup in the Media SDK to combine these flags into one, but oh well. :)

egur
4th January 2012, 08:11
You're right, there are no options for this, which is why most algorithms just work on the assumption that it's interlaced chroma, which works OK (but not perfectly) for progressive, and doesn't screw up on interlaced.
Great.
Now you have the option to know which way is optimal to convert to RGB (or 4:2:2 or 4:4:4) within LAV decoder. But probably most people will have LAV output 4:2:0 YCbCr so it doesn't matter...
Did you finish the integration?
Please tell me how long it took you. My goal was <1 work week integration (5 days).

nevcairiel
4th January 2012, 08:32
Did you finish the integration?
Please tell me how long it took you. My goal was <1 work week integration (5 days).
It's not completely finished, there are some improvements to be done, but it took less than a work day so far. Granted, my architecture was already meant to be able to plug in different decoders, and I had some experience with the CUVID decoder, which works somewhat similarly.

I do plan to include it in LAV 0.44 though, which should be done "soon".

egur
4th January 2012, 16:54
It's not completely finished, there are some improvements to be done, but it took less than a work day so far. Granted, my architecture was already meant to be able to plug in different decoders, and I had some experience with the CUVID decoder, which works somewhat similarly.

I do plan to include it in LAV 0.44 though, which should be done "soon".
Very well.
A few pointers you need to check:
1) A D3D device (QuickSync) needs a D3D device manager from the renderer to work in full-screen exclusive mode. Look at the proxy class (TVideoCodecQuicksync within ffdshow) to see how to extract the device manager. Passing the device manager improves init time, but it's not critical in other cases.
2) WMC will call your decoder to create thumbnails without providing the D3D device manager. You will need to use SW fallback. In ffdshow, I looked at the GUID of the renderer for this purpose (TffdshowDecVideo::CompleteConnect). If you have a more elegant solution, let me know.
3) The ffdshow proxy class checks whether the decoder is able to run (see the check function). I may improve it in the future to make it quicker (e.g. not load the decoder DLL)

Anyway, please let me know what parts of integration were not simple. I'll also make a guide in the future.

nevcairiel
4th January 2012, 16:56
I found another small annoyance. :)

Your decoder currently checks the dwProfile value in the media type for compatibility, however that value is not guaranteed to be present (or correct).
It's certainly possible that the source does not set or know the value, especially in the case of live streaming, when the media type usually doesn't contain extradata either. It has also happened before that these values were wrong, which is why I changed to actually checking the bitstream (SPS for H264, and the sequence info for MPEG2).

I would be happy with a flag to disable your own checks, in case the DS decoder around it already does them, but if you want to rework and improve your own checks, that's good too. :)
Personally, I would've designed your whole DLL independent of DirectShow-specific things anyway, so no IMediaSample input and no DirectShow media types, but that's another thing. :)

PS:
While reading around in your code, i found this: (made me laugh a bit :))

    // Discard audio NALUs
    if (NALU_TYPE_AUD == naluType)
        continue;


AUD is Access Unit Delimiter, not Audio. No reason to drop it. :D
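
For reference, a minimal sketch of how the NAL unit type is read from an H.264 NAL header, with the type numbers from the H.264 specification. The enum and function names here are made up for the example and are not the decoder's own identifiers.

```cpp
#include <cassert>
#include <cstdint>

// nal_unit_type values as defined by the H.264 specification.
enum NaluType : std::uint8_t {
    NALU_SLICE_IDR = 5,
    NALU_SEI       = 6,
    NALU_SPS       = 7,
    NALU_PPS       = 8,
    NALU_AUD       = 9   // access unit delimiter, unrelated to audio
};

// The type is stored in the low 5 bits of the first byte after the start code.
inline std::uint8_t GetNaluType(std::uint8_t headerByte)
{
    return static_cast<std::uint8_t>(headerByte & 0x1F);
}

// True when the NAL unit is an access unit delimiter, i.e. a marker placed
// at the start of an access unit (one coded picture).
inline bool IsAccessUnitDelimiter(std::uint8_t headerByte)
{
    return GetNaluType(headerByte) == NALU_AUD;
}

// Small usage check with typical header bytes.
void CheckNaluTypes()
{
    assert(GetNaluType(0x67) == NALU_SPS);
    assert(IsAccessUnitDelimiter(0x09));
}
```

Since video elementary streams contain no audio NAL units at all, a filter keyed on type 9 only ever removes these delimiters.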

nevcairiel
4th January 2012, 19:37
I also have a bug report:

Enabling multi-threading causes a freeze on MPEG-2 files when ReClock is used (possibly because of its own graph it builds), doesn't happen without ReClock or when multi-threading is off.
As a workaround, i disabled any HW decoding when i detect that ReClocks fake graph is building the graph, and this solves it for now.

I didn't test with ffdshows QS decoder, but i can try it later.

egur
4th January 2012, 20:34
I also have a bug report:

Enabling multi-threading causes a freeze on MPEG-2 files when ReClock is used (possibly because of its own graph it builds), doesn't happen without ReClock or when multi-threading is off.
As a workaround, i disabled any HW decoding when i detect that ReClocks fake graph is building the graph, and this solves it for now.

I didn't test with ffdshows QS decoder, but i can try it later.

I'll check it out. I haven't tested with ReClock or gotten any feedback on freezes yet.
Any specific setting? Just MPEG2? All MPEG2 files?

BTW don't enable DVD decoding, it's not functional.

Regarding checks, this must be done as a workaround for various issues I ran into in the past and that may still exist today. The decoder can expect the media samples to report something valid or close to it.
Reporting a supported profile when it's not can lead to a severe decoder failure during init.
If the sequence headers come within the stream, this will be OK. If you misreport the format to the MSDK (e.g. profile/level/width/height), then as long as it's still the same decoder, the HW decoder will fail gracefully and operation will continue after a reset. This is transparent to you.
If the QS decoder refuses to play a clip, there's no need to force it. You can "fix" the media sample if you want, but this is probably a rare case.
I didn't get any reports on this issue yet.

Also, parsing various headers will cause a lot of code bloat, which I don't want ATM. I want to keep the code size to a minimum. No problems with ffdshow so far.
Supported formats might grow or shrink (not likely :) ) on different models so it's best that compatibility checks are black boxed.

nevcairiel
4th January 2012, 20:47
Regarding checks, this must be done as a workaround for various issues I ran into in the past and that may still exist today. The decoder can expect the media samples to report something valid or close to it.

I have sophisticated checks that don't rely on the source setting those values in the media type, instead they parse the extradata and get the real values - and if the extradata is not present, they read the actual frames until they find the info they need. I just want to be able to make use of those functions so i can offer a 100% working software fallback. It would be great to make use of these functions, because they work in many more cases where a pure checking of the media type fields fails.

Anyhow, i could probably produce an example pretty easily of a DVB application which does not set dwProfile (leaves it at 0).
Doesn't matter, i'll just fix up the media types, easy enough to do.
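
As an illustration of checking the bitstream instead of the media type, here is a minimal sketch for H.264. It is not LAV's actual code: it assumes an Annex B stream (00 00 01 start codes, so not the length-prefixed AVC1 layout), all names are made up, and it only reads the two bytes that need no bit-level parsing. profile_idc is the first byte after the SPS NAL header and level_idc is the third.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct H264ProfileLevel {
    bool found = false;
    int profileIdc = 0;   // e.g. 66 Baseline, 77 Main, 100 High
    int levelIdc = 0;     // e.g. 41 for level 4.1
};

// Scans an Annex B byte stream for the first SPS (nal_unit_type 7) and
// returns its profile_idc and level_idc.
H264ProfileLevel FindH264Profile(const std::vector<std::uint8_t>& data)
{
    H264ProfileLevel result;
    // Need start code (3 bytes) + NAL header + profile + constraints + level.
    for (std::size_t i = 0; i + 6 < data.size(); ++i) {
        if (data[i] != 0 || data[i + 1] != 0 || data[i + 2] != 1)
            continue;                       // not a start code
        if ((data[i + 3] & 0x1F) != 7)
            continue;                       // some other NAL unit
        result.found = true;
        result.profileIdc = data[i + 4];
        result.levelIdc = data[i + 6];
        return result;
    }
    return result;                          // no SPS seen yet
}

// Usage check: a High profile, level 4.1 SPS behind a 4-byte start code.
void CheckSpsParsing()
{
    const std::vector<std::uint8_t> sps = { 0, 0, 0, 1, 0x67, 0x64, 0x00, 0x29, 0xAC };
    const H264ProfileLevel info = FindH264Profile(sps);
    assert(info.found && info.profileIdc == 100 && info.levelIdc == 41);
}
```

When no SPS has been seen yet, the caller can keep feeding frames and retry, which matches the "read the actual frames until they find the info" approach described above.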

egur
4th January 2012, 22:06
I have sophisticated checks that don't rely on the source setting those values in the media type, instead they parse the extradata and get the real values - and if the extradata is not present, they read the actual frames until they find the info they need. I just want to be able to make use of those functions so i can offer a 100% working software fallback. It would be great to make use of these functions, because they work in many more cases where a pure checking of the media type fields fails.

Anyhow, i could probably produce an example pretty easily of a DVB application which does not set dwProfile (leaves it at 0).
Doesn't matter, i'll just fix up the media types, easy enough to do.

dwProfile ==0 in MPEG2 is 422 profile which isn't supported. Adding a kill switch for the init check is simple but I need to test this. Otherwise we'll go back and forth with crashes.
I can add a feature to create fake parameters if the info header is empty or invalid.
Can you supply me a clip (or flow) that demonstrates this behavior?
The Media SDK should be able to handle this situation.

Try sending a fake VIDEOINFOHEADER with fake width/height so it will pass inspection. The decoder should realloc resources and re-init the SDK when the actual stream arrives.

Update:
With ReClock, playback didn't start for several seconds the first few times I tested after installation. The debugger reported that my DLL wasn't loaded yet, and LAV splitter wasn't loaded either.
A few more runs and it stopped freezing. This is very odd. I ran my entire MPEG2 collection and apart from a slower system initialization (because of reclock BTW) everything was fine.
I need to know your exact setup: player, 32/64 bit, splitter setting (I use the defaults in LAV), renderer.

nevcairiel
4th January 2012, 22:22
dwProfile ==0 in MPEG2 is 422 profile which isn't supported.
It could also just be a value which was never set because the demuxer doesn't extract that from MPEG-2. In my brief checks, it seems to be set at least for MPEG-TS demuxers, but who knows what could happen in an MKV or so. I still remember the bug reports, which is why I switched to bitstream analysis.


Try sending a fake VIDEOINFOHEADER with fake width/height so it will pass inspection.
I just modify the existing header and set a profile which is valid, seems to be fine. AVC1 needs its MPEG2VIDEOINFO, or stuff breaks, so i modify the dwProfile in it. Seems like the safest approach.

Anyhow, such a flag would be nice, just document it that it may cause breakage if used improperly.
Personally, i can go without it now, though.

I just fixed my last issues with VC-1 playback, and i'm about ready to post a test version.

nevcairiel
4th January 2012, 22:31
I found another issue.

http://files.1f0.de/samples/20100816_1715_-_TV3_(N)_-_According_to_Jim.ts

At around 0:27 its changing its aspect ratio from 16:9 to 4:3 (transition from the ads back to the show), however your decoder doesn't seem to notice the change, and keeps sending 16:9 in QsFrameData
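
A sketch of the kind of per-frame bookkeeping that catches such a switch. It is not the decoder's code; the names are invented for the example and the 720x576 numbers are only illustrative. The display aspect ratio is derived from the coded size and the sample aspect ratio, and a tracker reports when it differs from the previous frame.

```cpp
#include <cassert>

struct Ratio { long long num; long long den; };

inline bool SameRatio(const Ratio& a, const Ratio& b)
{
    return a.num == b.num && a.den == b.den;
}

inline long long Gcd(long long a, long long b)
{
    while (b != 0) { const long long t = a % b; a = b; b = t; }
    return a;
}

// Display aspect ratio = (width * sarNum) : (height * sarDen), reduced.
Ratio DisplayAspect(int width, int height, int sarNum, int sarDen)
{
    assert(width > 0 && height > 0 && sarNum > 0 && sarDen > 0);
    const long long n = static_cast<long long>(width) * sarNum;
    const long long d = static_cast<long long>(height) * sarDen;
    const long long g = Gcd(n, d);
    return Ratio{ n / g, d / g };
}

// Remembers the last ratio seen and reports when a frame carries a new one.
class AspectTracker {
public:
    // Returns true on the first frame and whenever the ratio changes.
    bool Update(const Ratio& current)
    {
        const bool changed = !m_valid || !SameRatio(current, m_last);
        m_last = current;
        m_valid = true;
        return changed;
    }

private:
    Ratio m_last{ 0, 0 };
    bool m_valid = false;
};
```

For example, 720x576 with a 64:45 sample aspect ratio reduces to 16:9 and with 16:15 to 4:3, so feeding both through the tracker signals a change on the second frame even though the coded size stays the same.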

egur
4th January 2012, 22:50
I found another issue.

http://files.1f0.de/samples/20100816_1715_-_TV3_(N)_-_According_to_Jim.ts

At around 0:27 its changing its aspect ratio from 16:9 to 4:3 (transition from the ads back to the show), however your decoder doesn't seem to notice the change, and keeps sending 16:9 in QsFrameData

I'll take a look.
BTW, VS2010 reports that when Reclock is used Haali media splitter is loaded before LAV splitter (along with a few other DLLs). But when DirectSound is used, it's not. Using ZoomPlayer 8.

update
Well, the Media SDK doesn't report an aspect ratio change.
FFDShow with either libmpeg2 or libavcodec does the same - no aspect ratio change. Tried with Haali splitter as well.
LAV video decoder + splitter (0.43) doesn't change AR either (32 bit anyway).

pirlouy
4th January 2012, 23:21
Sorry if it has been asked.
Why does QuickSync need the Intel GPU to be active? From what I've read, I thought it was independent. Is it possible to use QuickSync to decode, and then send the result to RAM for a third-party GPU, for example?
Is it an Intel limitation, or just something you haven't looked into yet?

egur
4th January 2012, 23:56
...
So here we go

Decoding Error = http://www.mediafire.com/?rld8gnlh52f03ud (not good) (Cyberlink DXVA = Fail, CoreAVC DXVA = OK, CoreAVC = OK, Arcsoft DXVA = OK, ffdshow-quicksync = Fail, Lav Video = OK, Mainconcept DXVA = Fail, Mainconcept = Fail, Microsoft DTV = Fail, DivX = Fail, Mirillis = Fail, Potplayer DXVA = OK)
Refresh Lock Problem = http://www.mediafire.com/?8u0sn24cfggx6dq (that is the strange behaving one, we might need jan here also involved as refreshing the experimental renderer also locks the what seems correct fps 29.97 seeking though seems to have the same effect i wonder if its related to the corrupt unsync issue)
Sync issue = http://ibc.cdngc.net/Avidan/dongle_3.ts (Sync can be forced by seeking)
...
So here is another sample of the unsync issue additionally = http://www.multiupload.com/QL5F0FRM7O (Sync can be forced by seeking)

The 1st shows corruption after 11 seconds. Strangely, it only shows corruption if the clip is played from the start. Seeking anywhere between 0.5 s and 11 s plays the clip perfectly. This will be hard to root cause.
The second clip played fine - the decoder reported 29.97 from the start (in debug builds I have trace prints) and this doesn't change. Tried with Haali as well. Tried 10 times to make it fail. Need more details to reproduce.
The 3rd clip's issues are deterministic. Need to root cause. Plays somewhat OK with libavcodec. Seems to be a broken stream.
4th clip - a lot of corruption at the start of the clip - probably what's causing the sync issue. Maybe a broken sequence header? Need to root cause.

egur
5th January 2012, 00:00
Sorry if it has been asked.
Why does QuickSync need the Intel GPU to be active? From what I've read, I thought it was independent. Is it possible to use QuickSync to decode, and then send the result to RAM for a third-party GPU, for example?
Is it an Intel limitation, or just something you haven't looked into yet?

It's part of the Intel GPU (physically) and operated by the Intel GPU driver.
My decoder copies the frames back to system memory so you can use a renderer on another GPU. You'll need an H67/Z68 chipset for this to work. See this post on how to enable multi GPU setup: http://forum.doom9.org/showthread.php?p=1532786#post1532786

hajj_3
5th January 2012, 00:22
Egur, I don't suppose you know whether it is technically possible for Sandy Bridge's QuickSync to hardware-decode HEVC (a.k.a. H.265) when that launches in 2013, or will the hardware on the GPU not be powerful enough to do it? It would be great if Sandy/Ivy Bridge could hardware-decode it with a software upgrade; otherwise current PCs likely won't have the CPU power to decode it, as it looks to be extremely CPU intensive.

CruNcher
5th January 2012, 01:38
@ Egur
I won't have the time to test both Lav Video and ffdshow-quicksync (though I will do certain cross-comparisons on strange issues and cases); primarily I will do all further tests with Lav Video. I hope nev can help improve the timestamping code in ffdshow-quicksync, so that in the end both will work perfectly with Lav Splitter, at least for .ts for now :)

If you can't reproduce it, I guess we really need Jan here too; I also get the same with Lav Video :) When exactly it recovers the lock seems non-deterministic, except when you seek :)

I'll make a video of it (I hope I can capture it) :)

nevcairiel
5th January 2012, 08:02
update
Well, the Media SDK doesn't report an aspect ratio change.
FFDShow with either libmpeg2 or libavcodec does the same - no aspect ratio change. Tried with Haali splitter as well.
LAV video decoder + splitter (0.43) doesn't change AR either (32 bit anyway).

Works for me in software decoding or CUVID decoding in LAV Video. The splitter plays no role here (or should not, its possible that Haali breaks it, because of its ugly stream AR overwriting)
Just confirmed that its indeed working just fine with LAV Splitter + LAV Video, both Software (avcodec) and CUVID. Seeking anywhere after the 0:27 mark switches AR instantly, seeking back to the start switches AR back. Playing the file from start to end switches at the proper time as well. (Make sure "Use Stream AR" option is on)

Also tested ffdshow (a rather old version i had around, r3967), and both avcodec and libmpeg2 switch the AR properly.

All tests in MPC-HC with vanilla EVR as well as EVR Custom.

egur
5th January 2012, 08:56
Works for me in software decoding or CUVID decoding in LAV Video. The splitter plays no role here (or should not, its possible that Haali breaks it, because of its ugly stream AR overwriting)
Just confirmed that its indeed working just fine with LAV Splitter + LAV Video, both Software (avcodec) and CUVID. Seeking anywhere after the 0:27 mark switches AR instantly, seeking back to the start switches AR back. Playing the file from start to end switches at the proper time as well. (Make sure "Use Stream AR" option is on)

Also tested ffdshow (a rather old version i had around, r3967), and both avcodec and libmpeg2 switch the AR properly.

All tests in MPC-HC with vanilla EVR as well as EVR Custom.

I tested with MPC-HC and saw what you described. ZoomPlayer has an issue with this clip, as it doesn't change the renderer's AR.
I'll dig in deeper, but a quick fix is not probable.

CruNcher
5th January 2012, 15:13
I did some tests with this lock problem at two refresh rates, and indeed it locks faster at 60 Hz than at 75 Hz :P
At 75 Hz it can be problematic, in that it takes longer or doesn't lock at all. I'm not sure what this depends on (Aero, I guess), though I'm almost sure now this is more something on the side of Jan's renderer than a splitter/decoder issue :) Also, the behavior of both at 60 Hz under Aero is pretty much identical: it deterministically never misses the lock and always locks after 20 seconds; only the way to the lock differs a little.

60 Hz (lock takes 20 seconds)

ffdshow Quicksync = http://www.multiupload.com/F3NDGFYDF7
Lav Video Quicksync = http://www.multiupload.com/NDDB6U5YU8

75 Hz (lock doesn't happen or takes very long, non-deterministic, seeking forces it)

ffdshow Quicksync = http://www.multiupload.com/PMFRYQEWPI
Lav Video Quicksync = http://www.multiupload.com/VPG09HCV1H

PS: By the way, trying to capture this with my GPU recording framework (Quicksync) failed, same for the software-based x264 counterpart; this needs very low latency to be captured in real time. The Mirillis low-latency I-frame codec was capable of achieving it without altering the result :)

NikosD
5th January 2012, 16:47
1) After the installation of latest driver Intel 15.22.52.2559 I found 3 MFT decoders by Intel at C:\Program Files\Common Files\Intel\Media SDK\s1\2.0\

The names are Intel Hardware H.264/MPEG-2/VC-1 Decoder MFT.

But during the enumeration of available codecs in DXVA Checker when I try to benchmark a video file, those decoders never show up.

Why?

Also in their properties they don't seem to have a DXVA option (enable/disable)



I don't know - I'm not part of the Media SDK dev team nor the graphics driver team. I'll forward your question.



4) Why does Intel restrict such a POWERFUL DECODER as QS to 1920x1080 only?

I think the driver team should "open" the driver up to 4K x 2K, which QS could handle with ease.

And of course your decoder and every other decoder using QS must be updated too, to include 4K x 2K.



See answer #1. My guess would be that it made the HW more expensive and not worth the cost. I'll forward the question.


Eric hi.

Any feedback from the Intel media and driver teams on the above questions?

Now that the latest PotPlayer beta, v1.5.31323, supports DXVA H.264 at 4K x 2K resolutions with its internal codec, it would be useful for us to check it out.

We only need Intel to update the drivers.

You could add those resolutions to your project too.

Blight
5th January 2012, 16:55
Thanks to an insight from nev, Zoom Player v8.1 final will support dynamic aspect ratio changes with EVR.
I just verified that it works against the test clip posted earlier with both LAV and Haali as the source filters :)

ETA to v8.1 release, ~4h

CruNcher
5th January 2012, 17:10
Eric hi.

Any feedback from the Intel media and driver teams on the above questions?

Now that the latest PotPlayer beta, v1.5.31323, supports DXVA H.264 at 4K x 2K resolutions with its internal codec, it would be useful for us to check it out.

We only need Intel to update the drivers.

You could add those resolutions to your project too.

4K works, @ least non-DXVA, it seems

http://img266.imageshack.us/img266/720/ffdshowquicksync4k.png

Though I'm not sure whether it might have fallen back to Intel's software decoding core or libav? (I'll try to check)

hajj_3
5th January 2012, 17:28
wouldn't 4k be 2160p (3840x2160)?

4096 x 2304 seems an odd size?

CruNcher
5th January 2012, 17:41
3840x2160 that's QFHD (2160p)
4096 x 2304 is 4K @ 16:9 AR
Full 4K 4096 × 3112
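
To put numbers on these labels, here is a quick Python sketch. The function names are made up for illustration, and the labels are simply the ones used in this post, not official definitions. It reduces each resolution to its aspect ratio and compares the pixel count with a 1080p frame.

```python
from math import gcd

# Resolutions named in this post; the labels are the poster's own wording.
RESOLUTIONS = {
    "1080p": (1920, 1080),
    "QFHD (2160p)": (3840, 2160),
    "4K @ 16:9": (4096, 2304),
    "Full 4K": (4096, 3112),
}


def aspect_ratio(width, height):
    """Reduce width:height to lowest terms."""
    d = gcd(width, height)
    return width // d, height // d


def pixels_vs_1080p(width, height):
    """How many times more pixels a frame has than a 1920x1080 frame."""
    return (width * height) / (1920 * 1080)


for name, (w, h) in RESOLUTIONS.items():
    ar_w, ar_h = aspect_ratio(w, h)
    ratio = pixels_vs_1080p(w, h)
    print(f"{name}: {w}x{h}, AR {ar_w}:{ar_h}, {ratio:.2f}x the pixels of 1080p")
```

Both 3840x2160 and 4096x2304 reduce to 16:9, so the "odd" size is just a slightly wider and taller frame of the same shape.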

http://www.youtube.com/view_play_list?p=5BF9E09ECEC8F88F

PS: I found out that ArcSoft's decoder on Intel is by default (when called from DirectShow directly; not sure about TMT5), @ least for VC-1, a "copy-back decoder". Its performance gets very poor the heavier the stream gets, comparable to what ffdshow-quicksync reached in the beginning without all the copy improvements :P

@Egur

So with this, my testing of playback stability for .ts (LAV Splitter) with Quicksync (ffdshow-quicksync, LAV Video Quicksync) has ended. I will finish the deinterlace test on LAV Video and then finally move on to .mp4/.mov/.mkv/.wmv/.mpg/.m2ts :D So only 2 issues are left: the sync problems with those damaged streams that work fine with software decoding (libav), and this strange decoding error. There is also the MC.ts VC-1 interlace issue, which I guess won't be fixed on the driver side anymore :)
I especially hope someone is able to root-cause the decoding error. The sync issue for damaged streams seems less critical, as it can be fixed manually by a short seek during playback (though I haven't tested whether it goes out of sync again after some time). I'm also not sure what happens when encoding; I guess the end result would be out of sync, though :)

Remaining decoding stability issues (LAV Splitter + ffdshow-quicksync/LAV Video Quicksync + LAV Audio Decoder):

Decoding Error = http://www.mediafire.com/?rld8gnlh52f03ud (not good) (Cyberlink DXVA = Fail, CoreAVC DXVA = OK, CoreAVC = OK, Arcsoft DXVA = OK, ffdshow-quicksync = Fail, Lav Video = OK, Lav Video Quicksync = Fail, Mainconcept DXVA = Fail, Mainconcept = Fail, Microsoft DTV = Fail, DivX = Fail, Mirillis = Fail, Potplayer DXVA = OK)
Sync issue = http://ibc.cdngc.net/Avidan/dongle_3.ts (Sync can be forced by seeking)

Nev should also look @ those last 2 issues from the parser/splitter side :)

egur
5th January 2012, 19:49
@NikosD
I had a crazy week and didn't get to do it.
Did you try posting on the driver or media sdk support forums?

I can't manage to download any of the 4k clips, can anyone help/share?

CruNcher
5th January 2012, 20:22
http://www.mediafire.com/?yn7pa6xe0cdp5cx

but they are rather easy to decode, since they're from YouTube and low bitrate; harder are the QFHD samples with enormous bitrates of 50 Mbps and up

egur
5th January 2012, 20:24
PS: I found out that ArcSoft's decoder on Intel is by default (when called from DirectShow directly; not sure about TMT5), @ least for VC-1, a "copy-back decoder". Its performance gets very poor the heavier the stream gets, comparable to what ffdshow-quicksync reached in the beginning without all the copy improvements :P

This is very odd. How do they use DXVA?
BTW, with frame copying, the resolution and frame rate are what make the difference, not the bitrate.
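
A rough back-of-the-envelope sketch of that point in Python. It assumes the decoder outputs NV12 (12 bits, i.e. 1.5 bytes, per pixel) and that every decoded frame is copied exactly once; both are assumptions for illustration, and the function name is hypothetical. Real copy cost also depends on the type of memory being read, which this ignores.

```python
def copy_bandwidth_mb_per_s(width, height, fps, bytes_per_pixel=1.5):
    """Megabytes (10^6 bytes) moved per second if each decoded frame is copied once.

    bytes_per_pixel=1.5 assumes NV12 output (full-size luma plane plus
    half-size interleaved chroma plane).
    """
    return width * height * bytes_per_pixel * fps / 1e6


# Bitrate is not an input at all: only frame size and frame rate matter.
for label, (w, h, fps) in {
    "1080p30": (1920, 1080, 30),
    "1080p60": (1920, 1080, 60),
    "2160p30": (3840, 2160, 30),
}.items():
    print(f"{label}: {copy_bandwidth_mb_per_s(w, h, fps):.1f} MB/s")
```

Under these assumptions a 2160p stream needs four times the copy bandwidth of a 1080p stream at the same frame rate, whatever the bitrate is.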

So with this, my testing of playback stability for .ts (LAV Splitter) with Quicksync (ffdshow-quicksync, LAV Video Quicksync) has ended. I will finish the deinterlace test on LAV Video and then finally move on to .mp4/.mov/.mkv/.wmv/.mpg/.m2ts :D So only 2 issues are left: the sync problems with those damaged streams that work fine with software decoding (libav), and this strange decoding error. There is also the MC.ts VC-1 interlace issue, which I guess won't be fixed on the driver side anymore :)
I especially hope someone is able to root-cause the decoding error. The sync issue for damaged streams seems less critical, as it can be fixed manually by a short seek during playback (though I haven't tested whether it goes out of sync again after some time). I'm also not sure what happens when encoding; I guess the end result would be out of sync, though :)


Well, you know the last 1% is always the hardest...

On another matter, I'll release another update in a few days. There's a small bug related to multithreading. Afterwards, I want to really optimize the multithreaded path.
When this is done, it's time to put the video processing in.

dukey
5th January 2012, 20:25
Egur, since you work for Intel, can you get someone to fix the OpenGL drivers? There's a list of serious problems with them... In fact, it's so bad it's enough to make developers want to abandon OpenGL.

e.g.
Intel -> http://i126.photobucket.com/albums/p95/dukeeeey/gfx%20stuff/image1.png
ATI/Nvidia -> http://i126.photobucket.com/albums/p95/dukeeeey/gfx%20stuff/Image2.png

egur
5th January 2012, 20:29
Egur, since you work for Intel, can you get someone to fix the OpenGL drivers? There's a list of serious problems with them... In fact, it's so bad it's enough to make developers want to abandon OpenGL.

e.g.
Intel -> http://i126.photobucket.com/albums/p95/dukeeeey/gfx%20stuff/image1.png
ATI/Nvidia -> http://i126.photobucket.com/albums/p95/dukeeeey/gfx%20stuff/Image2.png

Sorry, please post in the driver support forum.

NikosD
5th January 2012, 20:32
@NikosD
I had a crazy week and didn't get to do it.
Did you try posting on the driver or media sdk support forums?

I can't manage to download any of the 4k clips, can anyone help/share?

I didn't post there because we have Intel here :D

Here are the samples:

http://xhmikosr.1f0.de/index.php?folder=c2FtcGxlcy8yMTYwcA==

CruNcher
5th January 2012, 20:35
This is very odd. How do they use DXVA?
BTW, with frame copying, the resolution and frame rate are what make the difference, not the bitrate.



Well, you know the last 1% is always the hardest...

On another matter, I'll release another update in a few days. There's a small bug related to multithreading. Afterwards, I want to really optimize the multithreaded path.
When this is done, it's time to put the video processing in.

Yep, that could maybe explain why it crashes very strangely @ a 720p 50 fps clip (Microsoft's 2011 Build keynote), and inside TMT5 it gets totally out of sync :P

With higher-res 1080p it works outside TMT5, but I get 15 fps for a 30 fps clip, and I'm pretty sure those are no parser issues ;)

Anyway, here is a QFHD sample (Mainconcept) http://115.com/file/be83t4l4# which is heavier than the Life in the Garden 4K bitrate-wise

This really puts heavy pressure on Quicksync and gives it a run for its money. LAV Video Quicksync seems to perform a tad better than ffdshow-quicksync, but mostly the performance is the same, with audio breakups and fps drops. When changing to software decoding it's flawless, so I'm pretty sure Quicksync decoded it (and IO was no issue either). Though I guess it doesn't even have to do with the decoder but with the memory copy; I guess the hardware would be capable of playing this flawlessly?
Trying some of the DXVA decoders on it

Nope, DXVA fails, as others already said, or it falls back to software decoding

QFHD 50Mbps H 5.1

Lav Video

http://img810.imageshack.us/img810/176/lavvideo.png

Lav Video Quicksync

http://img42.imageshack.us/img42/1662/lavvideoquicksync.png

So for now, with this performance, I would suggest falling back to software (libav) for content this complex. The YouTube 4K clip causes fewer problems, though, so I guess it won't be enough to decide this based on resolution alone ;)
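
A minimal Python sketch of what such a fallback check could look like, just to illustrate the idea of combining resolution with bitrate. Everything here is hypothetical: the names, and especially the two thresholds, are placeholders and not measured limits of Quicksync or of any decoder.

```python
from dataclasses import dataclass

# Hypothetical placeholder thresholds, NOT measured hardware limits.
MAX_HW_PIXELS = 1920 * 1080   # frame size assumed safe for the hardware path
MAX_HW_PEAK_MBPS = 40.0       # assumed peak bitrate limit above that frame size


@dataclass
class StreamInfo:
    width: int
    height: int
    peak_bitrate_mbps: float


def should_fall_back_to_software(info: StreamInfo) -> bool:
    """True only when both frame size and peak bitrate exceed the assumed limits."""
    oversized = info.width * info.height > MAX_HW_PIXELS
    too_fast = info.peak_bitrate_mbps > MAX_HW_PEAK_MBPS
    return oversized and too_fast


qfhd_sample = StreamInfo(3840, 2160, 50.0)   # the 50 Mbps QFHD clip
youtube_4k = StreamInfo(4096, 2304, 19.4)    # the YouTube 4K clip, max 19.4 Mbps

print(should_fall_back_to_software(qfhd_sample))  # True with these placeholders
print(should_fall_back_to_software(youtube_4k))   # False with these placeholders
```

A resolution-only rule would send both clips to software; adding the bitrate condition keeps the lighter YouTube clip on the hardware path.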

Lav Video Quicksync "Youtube 4K" H 5.1 Max 19.4 Mbps

http://img40.imageshack.us/img40/2609/lavvideoquicksync4k.png

So that runs rather OK, but seeing those spikes, I guess that's where the bitrate shoots higher, and indeed the Quicksync decoder seems to have problems coping with bitrate spikes + resolution. So if that starts here already, it's no wonder it's dying with the QFHD sample. Still, let's analyse the GPU MFX pressure (that's where it would be nice to have it directly in the OSD + power consumption, craziest of course as another realtime graph :))

So there are 2 possibilities

Either the decoder can't cope with the bitrate (most probable), or the +~6 fps are enough pressure on the copy side to cause the playback to end up like this (doubtful)

And so in the first case no DXVA would help here, and Sandy Bridge would only be able to play very restricted 4K, if @ all (even YouTube looks slightly too much for it ;) )

egur
5th January 2012, 21:09
Anyway, here is a QFHD sample (Mainconcept) http://115.com/file/be83t4l4# which is heavier than the Life in the Garden 4K bitrate-wise

Can you share on multiupload? The download breaks all the time.

fano
5th January 2012, 21:25
Hi egur :D

I'm in a lot of trouble:
I've downloaded the latest ffdshow compiled by you: QuickSync is present during installation, but not after installation (on the normal ffdshow Video Codecs page). MediaPortal plays the file, but the ffdshow icon says it's using libavcodec (CPU usage is high!)... In ffdshow video I can't configure QuickSync anymore :mad:

Is it a bug, or is my system not supported?

Strange, as my hardware is an ASRock Core 100HT:
http://www.asrock.com/nettop/overview.asp?Model=Core%20100HT#Specifications

Shouldn't it be supported?
If not, why does the installer mislead me?
The official ffdshow has QuickSync selectable, but in the end it used libavcodec...

Do I have to install the famous QuickSync.dll by hand (?) There's none on my system :p

:thanks: for your support!