
View Full Version : Intel QuickSync Decoder - HW accelerated FFDShow decoder with video processing



CruNcher
22nd April 2012, 17:04
ehh wait 1920x792 :D
though mod8 should be no problem, not 2.40:1 so someone wanted to save every pixel here i guess ;)
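
A quick way to check the alignment and aspect ratio being discussed is sketched below. It is only an illustration: the helper name `describe_frame` is hypothetical, square pixels are assumed, and the "coded height" assumes progressive H.264 with 16x16 macroblocks, where the padding rows are cropped again on output.

```python
def describe_frame(width, height):
    """Report mod-8/mod-16 alignment and aspect ratio (square pixels assumed)."""
    return {
        "mod8": width % 8 == 0 and height % 8 == 0,
        "mod16": width % 16 == 0 and height % 16 == 0,
        "aspect": round(width / height, 3),
        # Height rounded up to a whole number of 16-line macroblock rows
        # (assumption: progressive coding; the extra rows are cropped on output).
        "coded_height": -(-height // 16) * 16,
    }

info = describe_frame(1920, 792)
print(info)
```

For 1920x792 this reports mod8 but not mod16, and an aspect ratio slightly wider than 2.40:1.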

egur
22nd April 2012, 21:25
I use MPC-HC, LAV splitter and madVR/EVR, but it looks very much like a decoder-only problem.

Glitch confirmed. I'll try newer drivers and if all else fails, I'll report this clip.

Update
Newest driver doesn't solve this. I'll report this clip.

ajp_anton
22nd April 2012, 22:14
Well it's a very old encode (edit: 5 years), so it's not a big deal. IIRC CoreAVC also had this problem with some x264 encodes back then.
I could just re-encode it to fix the problem, but thought I'd see if it could also be fixed on the decoder side.

egur
23rd April 2012, 07:04
Well it's a very old encode (edit: 5 years), so it's not a big deal. IIRC CoreAVC also had this problem with some x264 encodes back then.
I could just re-encode it to fix the problem, but thought I'd see if it could also be fixed on the decoder side.

Broken clips such as this one will help produce a better HW decoder/driver. Keep 'em coming. Thanks.

NikosD
24th April 2012, 08:20
Eric,

is it possible to give us a direct QS1(sandy) vs QS2(ivy) comparison on H.264 dxva benchmarks of my collection ?

I think both native DXVA and MSDK modes would be useful (maybe together with DXVA-CP)

What about 4K H.264 decoding playback utilization of GPU and some benchmark performance ?

Can Ivy decode complex 4K H.264 clips of today like the Crowd Run- 3840x2160@50fpsRef4-275Mbps ?

Some figures would be useful.

If you have a clue - that you can share in public - about HDMI limitation of motherboards, it would be very welcomed to share it with us.

Is the lack of 4K output something that will be fixed in the future by M/B manufacturers, or we must forget about 4K output in current generation of Ivy processors ?

I was thinking of upgrading to Ivy platform but I think I'll wait for:

1) Drivers to include OpenCL functionality (and check out the performance)

2) Motherboards to include Thunderbolt

3) Motherboards to include 4K HDMI output ? (or DisplayPort ?)

4) Prices to drop ?

egur
24th April 2012, 12:39
Eric,
is it possible to give us a direct QS1(sandy) vs QS2(ivy) comparison on H.264 dxva benchmarks of my collection ?

I don't have comparable systems to check. I can run tests on a reference board but I'm too busy lately and the setup is not quick.

What about 4K H.264 decoding playback utilization of GPU and some benchmark performance ?
IVB is about 50% faster. I'm saying "about" because some clips will stress the memory subsystem so IVB might get less than 50%. Low bitrate clips will do that.

Can Ivy decode complex 4K H.264 clips of today like the Crowd Run- 3840x2160@50fpsRef4-275Mbps ?
Yes. As long as they are H264 high profile (or lower).

If you have a clue - that you can share in public - about HDMI limitation of motherboards, it would be very welcomed to share it with us.

Is the lack of 4K output something that will be fixed in the future by M/B manufacturers, or we must forget about 4K output in current generation of Ivy processors ?
I don't know the details. Some connectors (screens) are driven through the PCH, this might be one problem. Maybe it's an HDMI limitation.
HDMI 1.4 supports up to 4K (4096x2160) but only at 24fps.
Unless you have money to burn, you should wait for two things to happen:
1) Screens become mainstream ($$$).
2) Real content (movies/tv) becomes available.
IVB will be old by then :(
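
The 24fps limit can be sanity-checked with rough arithmetic. The sketch below assumes a 340 MHz ceiling on the HDMI 1.4 pixel clock at 8 bits per channel and, by default, ignores blanking intervals, so it is optimistic; the function names and the `blanking_overhead` parameter are illustrative only.

```python
# Assumed ceiling for the HDMI 1.4 pixel (TMDS) clock at 8 bits per channel.
HDMI_14_MAX_PIXEL_CLOCK_HZ = 340e6

def active_pixel_rate(width, height, fps):
    """Active pixels per second, without horizontal/vertical blanking."""
    return width * height * fps

def fits_hdmi_14(width, height, fps, blanking_overhead=1.0):
    """True if the (optionally blanking-scaled) pixel rate fits the assumed ceiling."""
    rate = active_pixel_rate(width, height, fps) * blanking_overhead
    return rate <= HDMI_14_MAX_PIXEL_CLOCK_HZ

print(fits_hdmi_14(4096, 2160, 24))  # 4K at 24fps
print(fits_hdmi_14(3840, 2160, 50))  # 4K at 50fps
```

Even with blanking ignored, 4K at 50fps exceeds the assumed ceiling, while 4K at 24fps stays well under it.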

I was thinking of upgrading to Ivy platform but I think I'll wait for:

1) Drivers to include OpenCL functionality (and check out the performance)
Do you really care? What OCL apps do you use?

2) Motherboards to include Thunderbolt
Nice to have but mostly relevant to laptops. It's a nice way to connect multiple high speed devices through a single cable.
More and more products are launching that use Thunderbolt but their price might be high.

3) Motherboards to include 4K HDMI output ? (or DisplayPort ?)
You'll need to wait for DisplayPort 1.2 capable motherboards or GPUs. IVB is DP 1.1.
The Radeon HD 7000 series support 4K displays.

4) Prices to drop ?
Prices are always dropping and products are getting better. The longer you wait the better deal you'll get but in the meanwhile you're stuck with an aging platform...

NikosD
24th April 2012, 15:02
Yes. As long as they are H264 high profile (or lower).


A good sample is this also:

http://xhmikosr.1f0.de/samples/2160p/OldTownCross/OldTownCross_2160p50.x264.CRF24.mkv

If someone can try to play it in DXVA mode with Ivy and report GPU utilization during normal playback.

Also it would be useful to benchmark Ivy with that 4K sample.


Unless you have money to burn, you should wait for two things to happen:
1) Screens become mainstream ($$$).
2) Real content (movies/tv) becomes available.
IVB will be old by then :(


No I'm OK with 4K decoding and downscale to 1080p.
I don't really care - right now - for 4K output.


I was thinking of upgrading to Ivy platform but I think I'll wait for:


Do you really care? What OCL apps do you use?


Mostly password auditing :) ...and some others too.

My final decision of buying Ivy will be 4K decoding performance and Thunderbolt.
I would like to see OpenCL performance too.

I think I have to wait for a few months for Thunderbolt and OpenCL and I hope normal H.264 and 4K H.264 benchmark results to come sooner than the first two.

@Jakmal

Maybe Jakmal could help with benchmark results of 4K H.264 decoding and direct comparison of QS1 (Sandy) and QS2 (Ivy) on 1080p H.264 clips.

I have a nice collection of 4K clips I could share for benchmark reasons.

hajj_3
24th April 2012, 17:28
thunderbolt support will be built into Intel Haswell.

egur
24th April 2012, 19:16
Just to clear things up:
IvyBridge can decode 4K H264 video in HW easily.
IvyBridge has nothing to do with Thunderbolt (it's a separate chip). Thunderbolt existence is up to the board manufacturer.

andyvt
24th April 2012, 19:19
Just to clear things up:
IvyBridge can decode 4K H264 video in HW easily.


Do you know why your QS decoder refuses to do this? I took a brief look at it after finding the section in LAV Video which disabled DXVA2 for > 1080p, but didn't see anything obvious.

egur
24th April 2012, 19:25
Do you know why your QS decoder refuses to do this? I took a brief look at it after finding the section in LAV Video which disabled DXVA2 for > 1080p, but didn't see anything obvious.

I think nev mentioned he disabled QS when the resolution is greater than 1080p - it's a legacy workaround. One user reported he changed the restriction and got it working.
He should fix his code.

egur
24th April 2012, 19:35
Some users asked for the compiled Media SDK DirectShow filters. Here (http://www.multiupload.nl/RZ2OWTLJCL) they are.

andyvt
24th April 2012, 19:44
I think nev mentioned he disabled QS when the resolution is greater than 1080p - it's a legacy workaround. One user reported he changed the restriction and got it working.
He should fix his code.

I just took another look with the latest LAV and QS loads (not sure why it wasn't before), but the CPU utilization is much higher than DXVA2 (3-4x). Obviously not a huge issue since 4K doesn't matter yet :)

egur
24th April 2012, 21:19
I just took another look with the latest LAV and QS loads (not sure why it wasn't before), but the CPU utilization is much higher than DXVA2 (3-4x). Obviously not a huge issue since 4K doesn't matter yet :)

CPU utilization will always be higher than DXVA since I copy the frames back to system memory. This simplifies the SW stack and removes several DXVA limitations. Video can then be easily processed by SW and using a subtitles filter is possible.
The LAV Video decoder has a DXVA option and so does ffdshow.
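
To get a feel for how much data the copy-back moves, here is a rough estimate. It assumes the decoded frames are NV12 (12 bits, i.e. 1.5 bytes, per pixel) and counts only a single copy of each frame; the function name and the frame rates used are illustrative, not measurements from the decoder.

```python
def copy_back_bytes_per_second(width, height, fps, bytes_per_pixel=1.5):
    """Bytes per second copied from video memory to system memory.

    bytes_per_pixel=1.5 assumes NV12 output (an assumption, not taken from the thread).
    """
    return int(width * height * bytes_per_pixel * fps)

uhd = copy_back_bytes_per_second(3840, 2160, 24)
fhd = copy_back_bytes_per_second(1920, 1080, 24)
print(f"4K:    {uhd / 1e6:.1f} MB/s")
print(f"1080p: {fhd / 1e6:.1f} MB/s")
print(f"ratio: {uhd / fhd:.1f}x")
```

Under these assumptions a 4K stream copies four times as much data per second as a 1080p stream at the same frame rate, which is extra work that pure DXVA playback does not have to do.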

andyvt
24th April 2012, 21:22
CPU utilization will always be higher than DXVA since I copy the frames back to system memory. This simplifies the SW stack and removes several DXVA limitations. Video can then be easily processed by SW and using a subtitles filter is possible.
The LAV Video decoder has a DXVA option and so does ffdshow.

I understand that, but it shouldn't be 3-4x (the difference b/w 3-4% and 12-15% on a i7-3770K) higher, right?

egur
24th April 2012, 21:26
I understand that, but it shouldn't be 3-4x (the difference b/w 3-4% and 12-15% on a i7-3770K) higher, right?

12-15% seems a little high.
Do you mean the player's process CPU utilization not the entire system?
Which clip did you use (bitrate, codec, resolution)?
RAM type?
OS?
32 bit?

andyvt
24th April 2012, 21:45
12-15% seems a little high.
Do you mean the player's process CPU utilization not the entire system?
Which clip did you use (bitrate, codec, resolution)?
RAM type?
OS?
32 bit?

Yes
Timescapes 4K
DDR3 1333
W7 x64
GSN 32-bit.

http://babgvant.com/images/qs4k.jpg

nevcairiel
24th April 2012, 22:25
I think nev mentioned he disabled QS when the resolution is greater than 1080p - it's a legacy workaround. One user reported he changed the restriction and got it working.
He should fix his code.

That limit has been removed for ages, ever since you added the option to disable software fallback.
There is only a resolution limit for DXVA2 Native (not even CB), because auto-detection of 4K support is not working yet.

I also did a (very quick) test with QuickSync and 4K decoding, and it does not activate itself for me.
Might be a driver thing, i dunno. Need sleep, more testing when i have time.

CruNcher
25th April 2012, 04:31
That limit has been removed for ages, ever since you added the option to disable software fallback.
There is only a resolution limit for DXVA2 Native (not even CB), because auto-detection of 4K support is not working yet.

I also did a (very quick) test with QuickSync and 4K decoding, and it does not activate itself for me.
Might be a driver thing, i dunno. Need sleep, more testing when i have time.

Nev, still no idea why 720p.mpg is crashing with LAV Video DXVA2 on Intel? It definitely seems to be an implementation issue, as every player that somehow implements Laurent's DXVA2 code seems to crash.

PotPlayer seems to be the only one that avoids this crash now

egur
25th April 2012, 13:19
Nev, still no idea why 720p.mpg is crashing with LAV Video DXVA2 on Intel? It definitely seems to be an implementation issue, as every player that somehow implements Laurent's DXVA2 code seems to crash.

PotPlayer seems to be the only one that avoids this crash now

I didn't see a link for this anywhere. Can you share it?

CruNcher
25th April 2012, 14:12
http://www.mediafire.com/?y2dbekgemoeo28m <- sure here it is Mpeg-2

but now the really crazy thing comes this one http://www.mediafire.com/?xr7ynkl2fg59g1c is a cuda transcode (Nvidia H.264) of it and it also crashes with Intel and Laurents DXVA 2 implementation but also with Potplayers DXVA, (though the Potplayer guys doesn't know of this yet the Mpeg-2 crash they fixed but it's still not perfect playing back ;)

i really want to know how this is connected or just a coincidence (transcode crashing the same way) ;)

Both clips play absolutely fine with QuickSync and some other third-party DXVA decoders (ArcSoft, CyberLink), though all decoders based on Laurent Aimar's DXVA2 implementation (VLC) crash with both clips on Intel hardware :)

The crash happens in ntdll and igdumd most probably a buffer overflow ;)

egur
25th April 2012, 16:19
http://www.mediafire.com/?y2dbekgemoeo28m <- sure here it is Mpeg-2

but now the really crazy thing comes this one http://www.mediafire.com/?xr7ynkl2fg59g1c is a cuda transcode (Nvidia H.264) of it and it also crashes with Intel and Laurents DXVA 2 implementation but also with Potplayers DXVA, (though the Potplayer guys doesn't know of this yet the Mpeg-2 crash they fixed but it's still not perfect playing back ;)

i really want to know how this is connected or just a coincidence (transcode crashing the same way) ;)

Both clips play absolutely fine with QuickSync and some other third-party DXVA decoders (ArcSoft, CyberLink), though all decoders based on Laurent Aimar's DXVA2 implementation (VLC) crash with both clips on Intel hardware :)

The crash happens in ntdll and igdumd most probably a buffer overflow ;)

They work fine using QS like you said.
BTW, the 2nd clip (cuda) is H264 not mpeg2 (avc1 fourcc).

nevcairiel
25th April 2012, 17:19
However, as always, when a piece of software crashes on corrupt input, it's still the software's fault. The driver should not crash if it's fed data it doesn't like. In fact, it should never crash.

Anyway, crashes somewhere deep inside the intel driver are basically impossible to debug, so no clue where to start.
Especially hard because it seems to cause some kind of heap corruption, that isn't always easy to find even when the code is fully available!

pankov
25th April 2012, 21:30
Some users wanted the compiled Media SDK Direct show filter. Here (http://www.multiupload.nl/RZ2OWTLJCL) they are.
A while ago there was talk in the LAV Filters thread that Intel has some samples which could allow 3D playback. I then asked Eric if he could provide compiled versions of these sample filters, and yesterday he did, so I decided to give them a try.
Sadly there are some problems:
1. When I try to register the custom_evr_presenter.dll with regsvr32.exe it looks like it succeeds but I can't find it in the filter list in GraphStudio/GraphStudioNext or Zoom Player. I do see it in the registry in Wow6432Node\CLSID\{29FAB022-F7CC-4819-B2B8-D9B6BCFB6698} as "Intel® Media SDK Custom EVR Presenter" but it's not present. What did I do wrong? Do you see it in these applications or am I supposed to use it in a different way?
2. After I registered the mvc_dec_filter.dll with regsvr32 it appeared nicely in GraphStudioNext as "Intel® Media SDK MVC Decoder" but I'm not able to insert it as a filter in any graph. Again, do you have any idea what I do wrong?

Can someone else try them and share his/her experience?

I really hope these filters will be the first steps to fulfilling my dreams for open source 3D playback. (preferably in Zoom Player)

Currently I've tried both NVIDIA's and Intel's 3D HDMI output using TMT5 and I find the one from Intel a bit more hassle-free and easy to use, so I have high hopes that this is the right way to go.

egur
26th April 2012, 10:19
However, as always, when a piece of software crashes on corrupt input, it's still the software's fault. The driver should not crash if it's fed data it doesn't like. In fact, it should never crash.
I agree. Driver crashes are taken seriously.
Please provide me the simplest way to reproduce (I already have the samples).

a while ago there was talk in LAV filters thread that Intel have some examples which could allow 3D playback. Then I asked Eric if he can provide compiled versions of these sample filters and yesterday that he did so I decided to give them a try.
Sadly there are some problems:
1. When I try to register the custom_evr_presenter.dll with regsvr32.exe it looks like it succeeds but I can't find it in the filter list in GraphStudio/GraphStudioNext or Zoom Player. I do see it in the registry in Wow6432Node\CLSID\{29FAB022-F7CC-4819-B2B8-D9B6BCFB6698} as "Intel® Media SDK Custom EVR Presenter" but it's not present. What did I do wrong? Do you see it in these applications or am I supposed to use it in a different way?
2. After I registered the mvc_dec_filter.dll with regsvr32 it appeared nicely in GraphStudioNext as "Intel® Media SDK MVC Decoder" but I'm not able to insert it as a filter in any graph. Again, do you have any idea what I do wrong?

Can someone else try them and share his/her experience?

I really hope these filters will be the first steps to fulfilling my dreams for open source 3D playback. (preferably in Zoom Player)

Currently I've tried both NVIDIA's and Intel's 3D HDMI output using TMT5 and I find the one from Intel a bit more hassle-free and easy to use, so I have high hopes that this is the right way to go.

I looked at the setup:
EVR custom presenter is not a DS filter, it's an MFT filter. That's why GraphEdit doesn't show it. I'll ask the Media SDK team if they can produce a DS version. This is outside my bandwidth...

The MVC decoder only accepts H264 fourcc. Most content is AVC1 fourcc. Maybe this is the problem. This is something I can patch myself.
Please share 3D media files for testing purposes.

pankov
26th April 2012, 10:42
Eric,
the problem with the MVC decoder is not with connecting it - it's not possible to add it to the graph at all.

About the samples - every 3D blu-ray can be used - you just have to use Haali's or the patched MPC MPEG Splitter as I've mentioned here
http://forum.doom9.org/showthread.php?p=1567409#post1567409
If you don't have such 3D blurays I can try to find a short sample ... or try to cut one for you tonight because I'm at work now.

egur
26th April 2012, 10:52
Eric,
the problem with the MVC decoder is not with connecting it - it's not possible to add it to the graph at all.

About the samples - every 3D blu-ray can be used - you just have to use Haali's or the patched MPC MPEG Splitter as I've mentioned here
http://forum.doom9.org/showthread.php?p=1567409#post1567409
If you don't have such 3D blurays I can try to find a short sample ... or try to cut one for you tonight because I'm at work now.

I managed to add it to an empty graph (in GraphStudioNext 32 bit).
Also managed to build a complete and working graph for an H264 file (that has H264 fourcc), connect the decoder to EVR and play.
I also checked its dependency on other DLLs - it only depends on standard Windows DLLs (no VS2010 dependencies) .

nevcairiel
26th April 2012, 11:06
An EVR Custom Presenter is not something you can ever manually add to a graph. You need to define a custom frontend that tells EVR which presenter to use.

EVR is a Media Foundation technology, that's why the Custom Presenter acts like an MF object, but it can be used by a DirectShow EVR as well. You just cannot use it in GraphStudio without writing a wrapper around it.

PS:
What makes you think the Intel MVC decoder receives the 3D data the same way as CoreCodec's decoder? (which Haali was designed to deliver data for)
It's a rather wild assumption, tbh. :) There is no official standard that defines how this should be done. Heck, CoreCodec's decoder isn't even available!

nevcairiel
26th April 2012, 11:15
I agree. Driver crashes are taken seriously.
Please provide me the simplest way to reproduce (I already have the samples).

Just play it with LAV Video in DXVA2 Native mode with EVR, it'll crash immediately.

pankov
26th April 2012, 11:31
I managed to add it to an empty graph (in GraphStudioNext 32 bit).
Also managed to build a complete and working graph for an H264 file (that has H264 fourcc), connect the decoder to EVR and play.
I also checked its dependency on other DLLs - it only depends on standard Windows DLLs (no VS2010 dependencies) .
:(
I did exactly the same.
Any ideas what could have I done wrong? I simply used "regsvr32 mvc_dec_filter.dll" to register it and that's all.
And I tried both GraphStudio and GraphStudioNext (both 32bit) and I do see the filter but when I try to add it to a graph nothing happens.

...
PS:
What makes you think the Intel MVC decoder uses the same way of receiving the 3D data as CoreCodecs decoder? (which Haali was designed to deliver data for)
Its a rather wild assumption, tbh. :) There is no official standard that defines how this should be done.
I was simply hoping something will work.
Sadly I don't know any other MVC decoder.
In Stereoscopic Player I see CoreMVC filter but it's not available as DS filter. On the other hand the modified MPC MPEG splitter is available at their site and I think/hope it can be used to check how it sends the data to the CoreCodec decoder. On the other hand the source for the Intel decoder is available ... if I'm not wrong ... so it can be changed / adapted to accept this input ... or simply add a "translation" filter.

Guys,
this is way out of my league so if I'm talking nonsense please ignore me.

pankov
26th April 2012, 11:44
I found some time and managed to upload a small SIFF file for testing
http://www.mediafire.com/?hwvxscmms4mkbhp

NikosD
26th April 2012, 12:02
http://www.mediafire.com/?y2dbekgemoeo28m <- sure here it is Mpeg-2

but now the really crazy thing comes this one http://www.mediafire.com/?xr7ynkl2fg59g1c is a cuda transcode (Nvidia H.264) of it and it also crashes with Intel and Laurents DXVA 2 implementation but also with Potplayers DXVA, (though the Potplayer guys doesn't know of this yet the Mpeg-2 crash they fixed but it's still not perfect playing back ;)


Using the latest PotPlayer version and ATI hardware in DXVA mode, I couldn't play either of them properly on the system in my signature.

Playback of the MPEG-2 clip is awful - I can't see even one frame correct.

Playback of the H.264 file is better - the only problem is a green horizontal stripe at the top of the film.

BTW, WMP12 plays both clips perfectly in DXVA mode.

wanezhiling
26th April 2012, 14:46
Using PotPlayer latest version and ATi hardware in DXVA mode I couldn't play both of them at the signature system.

Playback of the MPEG-2 clip is awful - I can't see even one frame correct.

Playback of the H.264 file is better - the only problem is a green horizontal stripe at the top of the film.

BTW, WMP12 plays both clips perfect in DXVA mode.

I can reproduce on my uvd2.2 card too.

For the H.264 file, MPC-HC DXVA has same playback (http://forum.doom9.org/showpost.php?p=1571889&postcount=19414).

For the MPEG-2 clip, NVIDIA is still OK; not sure about UVD 3.0, which supports MPEG-2_VLD.
Edit:I tested HD6850, it's ok though also has a green horizontal stripe at the top like the h264 file.

NikosD
26th April 2012, 21:37
Eric,

it's not your field or responsibility but I'm really frustrated with Intel's policy about Ivy.

I look at the models and none of them is suitable for me.

3770K is the fastest and overclockable, but misses every other feature.

3770 is a little slower, with all features on, but it's not overclockable.

I wanted to invest on a big expandable tower with an optional water cooling system (for the future), to experiment in overclocking with Ivy, but at the same time I need VT-d because I run also server systems with Hyper-V.

So I want maximum speed and overclocking capability, along with features like VT-d and security in HW.

There should be a fully featured CPU for someone who wants everything, shouldn't there?

egur
27th April 2012, 07:49
Eric,

it's not your field or responsibility but I'm really frustrated with Intel's policy about Ivy.

I look at the models and none of them is suitable for me.

3700K is the fastest and overclockable, but misses every other feature.

3700 is a little slower, with all features on but it's not overclockable.

I wanted to invest on a big expandable tower with an optional water cooling system (for the future), to experiment in overclocking with Ivy, but at the same time I need VT-d because I run also server systems with Hyper-V.

So I want maximum speed and overclocking capability, along with features like VT-d and security in HW.

It should exist a fully featured CPU, for someone who wants everything, shouldn't it ?

What features are missing from the 3700K family that you need?
The K's don't have vPro, TXT and VT-d. These features are aimed at the enterprise segment and have no use in consumer systems.
Most of them require a different chipset (e.g. for vPro) like the B or Q series. These chipsets are not common outside the enterprise market segment.

NikosD
27th April 2012, 08:02
As I wrote above, mainly VT-d.

I need to use Hyper-V, the Virtualization technology of Microsoft OS server editions and soon Microsoft will put Hyper-V even in Windows 8 client editions, too

nevcairiel
27th April 2012, 08:16
As Eric mentioned, VT-d does only work if you also have a chipset which supports it. Those chipsets also don't support overclocking, so putting VT-d into a "K" CPU does not make sense, as you wouldn't be able to use both at the same time anyway, because of the Chipset requirements.

For that matter, VT-d is not required to use Hyper-V. VT-x is the important part, and that is present in all CPUs and all chipsets. VT-d is only for I/O MMU virtualization, which is only required for enterprise-level virtualization, where you have truly dedicated hardware for every VM. The VM in Windows 8 "client" will not support VT-d either.

PS:
I also found some interesting quote on the web:
Currently (as of Windows Server 2008 R2 SP1), Hyper-V doesn't use the Intel VT-d hardware features (more information is available on this Intel site). The official guidance is to disable Intel VT-d in the BIOS.

NikosD
27th April 2012, 08:32
I/O Virtualization (VT-d) allows native-speed access to dedicated hardware from a guest operating system, including DMA-capable hardware.

True, I/O Virtualization is not performed by the CPU, but instead by the chipset.

So the question I put at my first post alters to this:

There should be a combination of CPU+chipset (platform) that includes everything (overclockability, all features).

So 3770K and Z77 (the top processor and top chipset) should be a complete platform, I think.

nevcairiel
27th April 2012, 08:35
Z77 is a consumer chipset, it does not contain Enterprise features.

As mentioned before, Hyper-V does not even use VT-d.

NikosD
27th April 2012, 08:58
I change my system usually every three years and I'm sure that in the next few months we'll have:

1) M/Bs with Thunderbolt

2) OpenGL 4.x drivers

3) Hyper-V supporting VT-d.

I don't know for sure if I'll wait for Ivy or for Haswell.

I'm good enough with Core2Duo right now.

egur
27th April 2012, 09:36
I change my system usually every three years and I'm sure that in the next few months we'll have:

1) M/Bs with Thunderbolt

2) OpenGL 4.x drivers

3) Hyper-V supporting VT-d.

I don't know for sure if I'll wait for Ivy or for Haswell.

I'm good enough with Core2Duo right now.
What do you plan to do with this system? What's your budget?

NikosD
27th April 2012, 10:04
I started to plan my next system as a "no-limits" machine, with only logical restrictions, but even without discrete graphics card the total cost went to 1100€ !

I don't want something less, but I was a little disappointed when I saw that clock for clock Ivy is about 5% faster on average than Sandy in CPU only.

GPU graphics is theoretically 2x faster in synthetic benchmarks, but in games it's about 50% faster (from 20% to 80%).

QuickSync 2.0 seems to be 50% faster in transcoding than QS 1.0.

I'm still waiting for H.264 benchmarks in decoding only, from you or Hendrik.

I don't have that money right now, so most probably I'll wait for Haswell which I will buy next year in drachmas or U.S dollars :D

pulbitz
28th April 2012, 10:47
APPCRASH, IntelQuickSyncDecoder.dll

sample file http://www.sendspace.com/file/k68n9h
(I don't use editing tool. Just a 5MB split.)

I tested with ffdshow QS, PotPlayer QS. (libav, DXVA is OK.)

egur
28th April 2012, 13:21
APPCRASH, IntelQuickSyncDecoder.dll

sample file http://www.sendspace.com/file/k68n9h
(I don't use editing tool. Just a 5MB split.)

I tested with ffdshow QS, PotPlayer QS. (libav, DXVA is OK.)

Confirmed. It has a broken H264 header. I'm working on it.
Update
Actually a fourcc mismatch - it's AVC1 when it should be H264.

Update 2
Fixed - will be part of next release - after some more testing.

egur
28th April 2012, 17:53
Version 0.31 beta is out with the following changes:
* Fixed AVC1 streams with AnnexB format.
* Fixed rare case where H264 was corrupt and reported a huge frame rate along with horribly bad time stamps (previous commit)
* Changed behavior with corrupted frames - they get discarded at the start of a stream.
* Updated H264 NALU parsing code from LAV filters.
* FFDShow rev4438
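
For background on the first item: streams labeled AVC1 conventionally carry length-prefixed NAL units, while the H264 fourcc implies Annex B start codes, and the crash sample mixed the two. The sketch below shows one way to detect and normalize the payload. It is not the decoder's actual code: the function names are hypothetical, a 4-byte length prefix is assumed, and the start-code check is only a heuristic (a length-prefixed buffer whose first NAL unit is exactly 1 byte long would look like a start code).

```python
START_CODE = b"\x00\x00\x00\x01"

def is_annexb(data: bytes) -> bool:
    """Heuristic: does the buffer begin with an Annex B start code?"""
    return data.startswith(b"\x00\x00\x01") or data.startswith(START_CODE)

def length_prefixed_to_annexb(data: bytes, length_size: int = 4) -> bytes:
    """Rewrite length-prefixed NAL units as start-code-delimited NAL units."""
    out = bytearray()
    pos = 0
    while pos + length_size <= len(data):
        nal_len = int.from_bytes(data[pos:pos + length_size], "big")
        pos += length_size
        if nal_len == 0 or pos + nal_len > len(data):
            raise ValueError("malformed or truncated NAL unit")
        out += START_CODE + data[pos:pos + nal_len]
        pos += nal_len
    if pos != len(data):
        raise ValueError("trailing bytes after last NAL unit")
    return bytes(out)

def normalize_to_annexb(data: bytes) -> bytes:
    """Pass Annex B data through unchanged; convert length-prefixed data."""
    return data if is_annexb(data) else length_prefixed_to_annexb(data)

# Two tiny made-up NAL units (2 and 3 bytes) in length-prefixed form.
sample = b"\x00\x00\x00\x02\x65\x88" + b"\x00\x00\x00\x03\x41\x9a\x00"
converted = normalize_to_annexb(sample)
print(converted.hex(" "))
```

Running the result through `normalize_to_annexb` a second time leaves it unchanged, which is the behavior wanted for a stream that claims AVC1 but already contains start codes.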

Downloads
* For the latest cutting-edge FFDShow builds, download my builds from the Intel QuickSync Decoder SourceForge home page (http://sourceforge.net/projects/qsdecoder/)
* FFDShow-tryout site (http://ffdshow-tryout.sourceforge.net/download.php)
* LAV Splitter builds (http://forum.doom9.org/showthread.php?t=156191)

edwrap
29th April 2012, 21:40
w.r.t. the sandy bridge video scaler improvements detailed earlier in this thread, are these dependent on specific settings in the driver options? I'd only just returned to using Windows 7 on an HD 3000 last night, and was somewhat horrified to see almost every video "enhancement" option ticked by default, having been trained for years to disable them. Some like adaptive contrast are instantly noticeable, but others like skin tone are much more subtle. Are there agreed upon best settings?

Apologies if this has been asked before, search was unhelpful.

andyvt
30th April 2012, 01:35
w.r.t. the sandy bridge video scaler improvements detailed earlier in this thread, are these dependent on specific settings in the driver options? I'd only just returned to using Windows 7 on an HD 3000 last night, and was somewhat horrified to see almost every video "enhancement" option ticked by default, having been trained for years to disable them. Some like adaptive contrast are instantly noticeable, but others like skin tone are much more subtle. Are there agreed upon best settings?

Apologies if this has been asked before, search was unhelpful.

Personally, I don't like skin tone correction (tends to make people's faces look pasty or wearing an overly heavy amount of makeup and often detail is lost) or contrast enhancement so I disable those, but I've found the default values for NR and sharpness to be pretty good because they aren't that aggressive, but since the "best" results are based on individual preference it's hard to choose a one-size-fits-all set.

Changes are made in real time, so it's easy to play with it and find something that you like.

egur
30th April 2012, 11:08
Video processing algorithms settings are subjective and should be tested by each user.
The only option exposed with respect to the scaler is how to display 4:3 content on a 16:9 screen (Non Linear Adaptive Scaling). The more advanced scaler is used for video sources automatically (used mostly by EVR and probably custom renderers from CyberLink and Arcsoft).

My settings:
Some noise reduction and sharpness.
Auto contrast is off. If you're watching a movie on a poor display (e.g. laptop) you should consider switching this on as it will display dark scenes better. If you have good watching conditions, leave it off.
Standard color correction - defaults (do nothing).
Total color control - mild enhancements to RGB, even smaller enhancements to CMY. This is not a simple procamp (standard color correction) filter it will not over-saturate colors.
Skin tone enhancement - off. Recommend testing with mild settings.

The latter is very cultural based (hue and saturation of desired skin). So some users will love it and others will hate it.
Hollywood film makers hate it - it changes the director's artistic intentions (e.g. green tint in the Matrix movies). Movie makers in many cases modify the colors so the movie will not appear too real. This causes the viewers to immerse themselves in the film and not feel that they are watching a play.

egur
3rd May 2012, 12:35
I did some testing on IvyBridge today.
System:
* CPU: i7 class engineering sample (E0) @ 2.6GHz.
* DDR3@1333 sodimm.
* Windows 7
* 2712 graphics driver
* ZoomPlayer
* CoreTemp
* LAV splitter
* Decoders: LAV 0.50.2 and my own latest ffdshow build.

Good news:
* Managed CrowdedRun (4K@50fps, ~122Mbps) with 30% CPU@3.2GHz using both ffdshow and LAV. libavcodec (ffdshow) took 85% CPU@2600 (jumps to 2700).
* This clip is the worst case scenario - very high bitrate, high resolution, high frame rate - stresses all the subsystems (memory, decoder, CPU). Most 4K clips have 1/4 of the bitrate and half the frame rate.
* Other clips played fine no surprises so far.
* A transposed 720p clip (720x1280) played very well. SandyBridge's QS can't play it since the line count is >1080.

The bad news:
* The reference board used didn't have proper cooling and the CPU hit 103C (SW or QS). At these temperatures it activates throttling to cool itself down. This might explain the high CPU usage. I need to rig it with a better fan and test again.

Even with these far-from-optimal conditions the video playback was smooth.

Also, started working on adding HW video processing. Some stuff already works but some issues are too severe for a proper release :(
I'll commit to SVN my changes soon so Nev can start playing with it.
Added:
* Deinterlacing (half/full rate output)
* Detail filter
* Denoise filter

Not working:
* 50i sources
* Telecined sources

Didn't add procamp (Hue, Saturation, Brightness, Contrast) yet, I'll add it too (very simple to do but not so useful).

Update
Fixed the cooling solution and QS behaves the same. libavcodec now raises CPU frequency to 3.2GHz and uses all cores at 82%.

CruNcher
3rd May 2012, 15:20
I did some testing on IvyBridge today.
System:
* CPU: i7 class engineering sample (E0) @ 2.6GHz.
* DDR3@1333 sodimm.
* Window 7
* 2712 graphics driver
* ZoomPlayer
* CoreTemp
* LAV splitter
* Decoders: LAV 0.50.2 and my own latest ffdshow build.

Good news:
* Managed CrowdedRun (4K@50fps, ~122mbps) with 30% CPU@3.2GHz using both ffdshow and LAV. libavcodec (ffdshow) took 85% CPU@2600 (jumps to 2700).
* This clip is the worst case scenario - very high bitrate, high resolution, high frame rate - stresses all the subsytems (memory, decoder, CPU). Most 4K clips have 1/4 of the bitrate and half the frame rate.
* Other clips played fine no surprises so far.
* A transposed 720p clip (720x1280) played very well. SandyBridge's QS can't play it since the line count is >1080.

The bad news:
* The reference board used didn't have proper cooling and the CPU hit 103C (SW or QS). At these temperatures it activates throttling to cool itself down. This might explain the high CPU usage. I need to rig it with a better fan and test again.

Even with these far-from-optimal conditions the video playback was smooth.

Also, started working on adding HW video processing. Some stuff already works but some issues are too severe for a proper release :(
I'll commit to SVN my changes soon so Nev can start playing with it.
Added:
* Deinterlacing (half/full rate output)
* Detail filter
* Denoise filter

Not working:
* 50i sources
* Telecined sources

Didn't add procamp (Hue, Saturation, Brightness, Contrast) yet, I'll add it too (very simple to do but not so useful).

Update
Fixed the cooling solution and QS behaves the same. libavcodec now raises CPU frequency to 3.2GHz and uses all cores at 82%.

How long will it survive 103C :D and how much can it cool down with throttling? Is it unbreakable, so that whatever load you put on it the throttling will always keep it safe, or will it at least still switch off? ;) For a long time now no one in the media has done the typical test of just removing the cooler while the system is running under full load. I would really like to know whether today's Intel systems (SB, Ivy Bridge) can survive this indefinitely when throttling down from something like 103C. I guess everyone remembers this: http://www.youtube.com/watch?v=y39D4529FM4 http://www.youtube.com/watch?v=cAqlA9EJ4ME http://www.youtube.com/watch?v=5umDJhIfrt0 ;) Of course, testing with a game is not really adequate these days; a full x264 encoding load on all cores would be the kicks ;)