Intel QuickSync Decoder - HW accelerated FFDShow decoder with video processing
egur
15th November 2012, 08:08
The problem was in the driver not LAV. The next driver version will have this fix. I'll check the new clip and report back.
Update:
Latest driver (15.28.9.64.2884) plays fine with the Bourne clip (last link). Just a matter of time until it's released (don't know when).
NikosD
15th November 2012, 15:12
Latest drivers (v2875) have disabled H/W acceleration of MFT decoders (transcoders) (H.264, VC-1) - at least with DXVAChecker v2.9.1
corporalgator
16th November 2012, 02:25
The problem was in the driver not LAV. The next driver version will have this fix. I'll check the new clip and report back.
Update:
Latest driver (15.28.9.64.2884) plays fine with the Bourne clip (last link). Just a matter of time until it's released (don't know when).
Do you want me to upload more? I think all I have left is Bourne Supremacy and Bourne Legacy. I think it's just beating a dead horse at this point since we know what the fix is.
egur
16th November 2012, 22:30
Do you want me to upload more? I think all I have left is Bourne Supremacy and Bourne Legacy. I think it's just beating a dead horse at this point since we know what the fix is.
Clips from the same source probably exhibit the same issues, so if you have clips from other sources, then yes, please share.
Edit:
The previous 64 bit driver (15.28.8.64.2875) is available from Intel here (http://downloadcenter.intel.com/Detail_Desc.aspx?DwnldId=22083).
ryrynz
17th November 2012, 07:09
I noticed FFDshow revision 4493 listed @ SourceForge but there's no link or update here.
egur
17th November 2012, 09:23
I noticed FFDshow revision 4493 listed @ SourceForge but there's no link or update here.
My last build was with ffdshow r4490 (QS v0.40).
Afterwards I added an AVX2 copy function (Haswell). It's not relevant to anyone yet, and it compiles under VS2012 (VS2010 has no AVX or AVX2 intrinsic functions). Intel compiler v13 and probably 12.1 can do the job too.
I ran performance tests on Haswell to check whether AVX2 is faster than SSE4.1 (with respect to GPU memcpy). So far SSE4.1 is a little faster.
nn007
18th November 2012, 22:18
Windows Media Player (version 12.0.9200.16420) with Windows 8 cannot use ffdshow.
Anyone using Windows 8?
http://en.wikipedia.org/wiki/Ffdshow
After installation of ffdshow, compatible DirectShow or VFW media players such as Media Player Classic, Winamp, and Windows Media Player (not 12) will use the ffdshow decoder automatically, thus avoiding the need to install separate codecs for the various formats supported by ffdshow.
egur
19th November 2012, 08:23
I think Media player uses MFT codecs, not DirectShow.
There's a tweak tool that disables MFT for Media player and Windows Media Center. The tool was designed for Win7 but it should work with Win8.
Here's a link (http://codecguide.com/windows7_preferred_filter_tweaker.htm) to the tool I use.
You can disable MFT playback and assign your favorite DS filters instead. It applies to Windows Media Center too.
ryrynz
19th November 2012, 11:50
So far SSE4.1 is a little faster.
And you still don't have a production sample yet, huh? There could be some improvements when that comes out; when you can disclose some info on it, please do.
nn007
19th November 2012, 18:23
Decoding of which codec consumes most CPU cycles?
1. SMPTE VC-1
2. H.264 or MPEG 4 Part 10
3. H.262 or MPEG 2 Part 2
egur
19th November 2012, 19:13
Decoding of which codec consumes most CPU cycles?
1. SMPTE VC-1
2. H.264 or MPEG 4 Part 10
3. H.262 or MPEG 2 Part 2
In HW decoding, it's hard to tell - depending what the driver does.
In SW, VC-1 is the most CPU hungry for a given combination of resolution and bitrate.
MPEG2 is the simplest of all.
kwlee
20th November 2012, 12:09
Hi egur,
When I decode an H.264 file and CQuickSync::Decode(IMediaSample* pSample) starts,
I find that sts in the first 16 iterations of the decoding loop returns MFX_ERR_MORE_DATA. After that,
the first decoded frame is OK. Is this the same thing you discussed previously?
http://forum.doom9.org/showthread.php?t=162442&page=86
egur
20th November 2012, 12:54
The Media SDK decoder decodes several frames before returning the first one. I also add some delay because I want to measure a few time stamps. Generally speaking, the MSDK decoder caches the number of frames that can be used for reference by that codec/profile, e.g. H264 high profile can use 16 frames, so the delay is 16 frames.
This is far from optimal but that's the current state. The real impact is visible when there are a lot of seeks - this happens when certain players perform fast playback by seeking very quickly.
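To make the 16-frame delay concrete, here is a toy model in Python. It is only a sketch of the buffering behaviour described above, not Media SDK code: the class, the function names and the MORE_DATA string are all made up for illustration, and the real decoder's internals may well differ.

```python
from collections import deque

# Stand-in for MFX_ERR_MORE_DATA; a hypothetical marker, not the real SDK constant.
MORE_DATA = "MORE_DATA"


class DelayedDecoder:
    """Toy model of a decoder that holds back `delay` frames before output."""

    def __init__(self, delay=16):
        self.delay = delay
        self.pending = deque()

    def decode(self, frame):
        """Feed one frame; return MORE_DATA until more than `delay` frames are buffered."""
        self.pending.append(frame)
        if len(self.pending) <= self.delay:
            return MORE_DATA
        return self.pending.popleft()

    def flush(self):
        """Drain the buffered frames, e.g. at end of stream."""
        while self.pending:
            yield self.pending.popleft()


def first_output_index(decoder, frames):
    """Return how many calls returned MORE_DATA before the first frame came out."""
    for calls, frame in enumerate(frames):
        if decoder.decode(frame) != MORE_DATA:
            return calls
    return None
```

With delay=16 the first 16 calls return MORE_DATA and the 17th call hands back frame 0, which matches what kwlee sees. A player that seeks repeatedly starts with an empty buffer each time, so it pays that delay on every seek.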
nn007
20th November 2012, 17:26
In HW decoding, it's hard to tell - depending what the driver does.
In SW, VC-1 is the most CPU hungry for a given combination of resolution and bitrate.
MPEG2 is the simplest of all.
Microsoft claims otherwise
http://www.microsoft.com/windows/windowsmedia/howto/articles/vc1techoverview.aspx
Because they are more sophisticated, VC-1 and H.264 are both more complex to decode than MPEG-2. Yet VC-1 is more than twice as efficient to decode as H.264. A study by 3GPP, a collaboration group that is setting 3G mobile phone standards, found that VC-1 Main Profile requires 25 percent fewer cycles than H.264 Baseline. It should be noted that H.264 Main Profile requires even more cycles than Baseline, because it includes highly complex arithmetic coding, also known as CABAC.
nevcairiel
20th November 2012, 17:38
The big problem in software decoders is that H.264 is usually very well optimized, but VC-1 decoders are not.
It also depends on the type of content: VC-1 Advanced Profile, especially field-interlaced content, is much more complex than VC-1 Main Profile.
Anyway, there is no argument that H.264 is the computationally most complex (it also compresses the best), but the implementation plays a big role.
egur
20th November 2012, 17:54
I agree with Nev and I'd like to add that VC-1 main profile is usually called WMV9 not VC1. VC1 content (WVC1 fourcc) is VC1 advanced profile.
Most H264 content is not baseline. The most common is high profile (all pirate copies of TV shows and movies).
BTW, H264 is the first format agreed by both MPEG and ITU-T, hence the double name: H264 and MPEG4 AVC (MPEG4 part 10).
VC1 was introduced by Microsoft, not by a video standards expert committee. It has many good features but most SW implementations are pretty bad.
Google has VP8 which is yet another advanced codec.
nevcairiel
20th November 2012, 18:06
VC-1 is also known as SMPTE 421M, it has some kind of standard attached to it. ;)
nn007
20th November 2012, 21:08
All three are Official Standards for Blu-Ray Video (Maximum Video Bit Rate is 40 Mbps):
1. SMPTE VC-1
2. H.264 or MPEG 4 Part 10
3. H.262 or MPEG 2 Part 2 (to maintain backward compatibility with DVD Video)
Hence, all three are supported by current version of Intel Quick Sync Video - Sandy Bridge Video Decoding:
http://www.tomshardware.com/gallery/decoder,0101-274956-0-2-3-1-jpg-.html
egur
20th November 2012, 21:59
I think BluRays can hit 54mbps but that's not a problem with QuickSync either on any SandyBridge or newer processor.
JeanMarc
21st November 2012, 05:41
I apologize if this has been addressed already, but I couldn't find the specific question. I am planning to build a new system with an Ivy Bridge processor (i5-3570K or i7-3770K) which can potentially use Quicksync acceleration, primarily for the purpose of encoding videos.
Is there a way to incorporate the quicksync decoder in ffmpeg, for example as part of libx264, or as a specific library, so that it would be possible to transcode to x264 without being forced to use Media Espresso or ArcSoft Media Converter?
Thank you for any help.
egur
21st November 2012, 08:23
I'm not aware of a way to use QuickSync in ffmpeg. FYI, using a HW decoder in transcoding will provide very limited benefits (if any) due to the copying and frame-format conversions.
Fixed typo - I'm not aware...
JeanMarc
22nd November 2012, 20:33
Thanks for the reply.
nn007
24th November 2012, 22:34
I am looking for a sample HD Video file encoded in H.264 or MPEG4/AVC at full HD resolution (1080p) and maximum frame rate (60 fps Progressive Scan)
1080p60 (1920 x 1080 at 60 fps Progressive)
Encoded in H.264 or MPEG4/AVC
I only find 24 fps or 30 fps HD content on the internet
egur
25th November 2012, 15:15
I am looking for a sample HD Video file encoded in H.264 or MPEG4/AVC at full HD resolution (1080p) and maximum frame rate (60 fps Progressive Scan)
1080p60 (1920 x 1080 at 60 fps Progressive)
Encoded in H.264 or MPEG4/AVC
I only find 24 fps or 30 fps HD content on the internet
Here's one of my test clips (http://www.mediafire.com/?bo70xnw2xl9n30d), H264 1080p@60p.
BTW, H264 and AVC are the same thing.
nn007
26th November 2012, 00:20
Here's one of my test clips (http://www.mediafire.com/?bo70xnw2xl9n30d), H264 1080p@60p.
BTW, H264 and AVC are the same thing.
Thanks for the link (the given clip is animated video)
Can you also provide links for
1. SMPTE 421M or VC1 - 1080p @ 60 fps
2. H.262 or MPEG2 - 1080p @ 60 fps
egur
26th November 2012, 08:46
I don't have ready-made 60fps clips for mpeg2 and vc1. You can make clips yourself by creating an AVS script for an existing clip and manually setting the fps in the AVS script. You'll need to send it to both an MPEG2 encoder and a VC1 encoder (Microsoft).
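As a rough illustration of the AVS approach, the Python sketch below generates a two-line AviSynth script. The choice of DirectShowSource as the source filter and the file name are assumptions made for the example; the post does not say which source filter to use. AssumeFPS keeps every frame and only relabels the frame rate, so a 24 or 30 fps clip will simply play faster at 60.

```python
def make_avs_script(source_path, fps=60):
    """Build AviSynth script text that loads a clip and relabels its frame rate.

    DirectShowSource is an assumed source filter; swap in whatever
    source filter suits the input clip.
    """
    lines = [
        f'DirectShowSource("{source_path}")',
        f"AssumeFPS({fps})",  # same frames, new frame rate label
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical input file name, for illustration only.
    with open("clip_60fps.avs", "w") as script:
        script.write(make_avs_script("clip.mkv", fps=60))
```

The resulting .avs file is what you would then feed to the MPEG2 and VC1 encoders.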
hoborg
7th December 2012, 14:21
Hi.
I am not sure if this is a good place to ask, but I didn't find an answer anywhere.
Is Intel HD 2000 capable of Full HD deinterlacing on Win7? For some reason it seems it is not working, at least for me. SD is working, but Full HD is not.
Or did I miss some setting?
http://hobring.esero.net/saf/intel/hd2000_deinterlacing.png
Tested on latest mpc-hc (EVR custom), Intel HD 2000 (512RAM set in BIOS), driver 9.17.10.2875, PDVD12/LAVF/FFDshow (QuickSync) decoders
Source video (h.264 recording): Test-CT-HD.zip (http://hobring.esero.net/saf/samples/Test-CT-HD.zip)
egur
7th December 2012, 18:44
I'll check it out.
Did you enable deinterlacing in ffdshow->Intel QuickSync or did you rely on the renderer to do it?
hoborg
7th December 2012, 18:57
I'll check it out.
Did you enable deinterlacing in ffdshow->Intel QuickSync or did you rely on the renderer to do it?
Renderer. As I wrote before, SD is deinterlacing just fine.
hoborg
7th December 2012, 19:10
Enabling deinterlacing did the trick. But I have had a bad experience using FFDshow as a LiveTV decoder (I am on the PDVD12 decoder right now), so renderer deinterlacing would be better in that situation.
egur
7th December 2012, 19:13
Very strange clip.
I also have an HD2000, BTW, not that it matters.
If I turn on deinterlacing in my decoder it looks OK.
If I let EVR do it, it shows a lot of tearing. This is also true when I use my AMD Radeon 6950 output.
The thing about this clip is that it fails both Intel's and AMD's film cadence detection (hard inverse telecine). That's what I think anyway.
In any case, your system is fine, this is just a hard clip.
Just to make it clear - HD2000 and HD3000 have the same deinterlacer and same driver. They only differ in core count.
hoborg
7th December 2012, 19:26
Very strange clip.
I also have an HD2000, BTW, not that it matters.
If I turn on deinterlacing in my decoder it looks OK.
If I let EVR do it, it shows a lot of tearing. This is also true when I use my AMD Radeon 6950 output.
The thing about this clip is that it fails both Intel's and AMD's film cadence detection (hard inverse telecine). That's what I think anyway.
In any case, your system is fine, this is just a hard clip.
Just to make it clear - HD2000 and HD3000 have the same deinterlacer and same driver. They only differ in core count.
Thanks for the test.
Well, that is just a cut of a recording broadcast here in the Czech Rep. I have an even worse h.264 recording from another multiplex, where even the deinterlacing in ffdshow fails to do it correctly. On a real TV, all is OK. I can cut a sample if you are interested.
hoborg
7th December 2012, 22:35
egur, I am not sure if this is a known issue, but FFDshow "ffdshow_rev4490_20121107_egur.exe" does not remember these settings (it is possible to choose them, but they are not saved):
"mpegAVI"=dword:00000013
"em2v"=dword:00000013
egur
8th December 2012, 09:37
You can download a newer version of ffdshow from the ffdshow tryout homepage. My builds allow users to enjoy the latest version of my decoder without waiting for an official ffdshow release. Once there's a newer official build, you'll get everything.
hoborg
9th December 2012, 21:24
Hi.
I just hit another issue.
FFDshow will crash when playing 4:4:4 h.264 (i444compressed.mkv) with the Intel QuickSync decoder enabled for h.264 and resize enabled. The decoder will fall back to Libavcodec and crash.
The strange thing is that setting h.264 to Libavcodec + resize plays it just fine. It looks like the QuickSync to Libavcodec fallback somehow causes it.
egur
10th December 2012, 09:52
Please share a failing clip and I'll look into it. I don't have a 4:4:4 clip.
Also specify the resize parameters.
hoborg
10th December 2012, 09:58
Please share a failing clip and I'll look into it. I don't have a 4:4:4 clip.
Also specify the resize parameters.
Hi.
I will upload it when I return home.
And resize - simply default settings, just enable it.
hoborg
10th December 2012, 16:44
Here is the 4:4:4 sample (http://hobring.esero.net/saf/samples/i444compressed.zip).
egur
10th December 2012, 18:59
Did you try the last version of ffdshow?
I couldn't reproduce using ffdshow latest source code. Tried both debug and release builds and everything looks fine.
I used LAV splitter (0.54.1) under ZoomPlayer (32 bit).
Please add more details on your setup so I can reproduce.
hoborg
10th December 2012, 19:48
Did you try the last version of ffdshow?
I couldn't reproduce using ffdshow latest source code. Tried both debug and release builds and everything looks fine.
I used LAV splitter (0.54.1) under ZoomPlayer (32 bit).
Please add more details on your setup so I can reproduce.
?
I have the same LAVF splitter as you.
Tested on your ffdshow_rev4494_20121128_clsid.exe or ffdshow_rev4490_20121107_egur.exe.
Just install ffdshow (reset all settings during install).
Then set the Intel QuickSync decoder for h.264/avc and enable resize.
I am testing in graphstudio (EVR renderer) or MPC-HC.
Crash info:
Problem signature:
Problem Event Name: APPCRASH
Application Name: graphstudio.exe
Application Version: 0.5.0.1
Application Timestamp: 503fc06e
Fault Module Name: ffmpeg.dll
Fault Module Version: 0.0.0.0
Fault Module Timestamp: 50b620dc
Exception Code: c0000005
Exception Offset: 002c5111
OS Version: 6.1.7601.2.1.0.768.3
Locale ID: 1029
Additional Information 1: 0a9e
Additional Information 2: 0a9e372d3b4ad19135b953a78882e789
Additional Information 3: 0a9e
Additional Information 4: 0a9e372d3b4ad19135b953a78882e789
egur
10th December 2012, 21:04
OK, managed to reproduce. Needed to check "resize to screen resolution".
Yes it crashes and I also found another problem. When the resize feature is enabled, and QS is the decoder, some clips lose their color.
BTW, it crashes when libavcodec is selected in the codec tab and QS doesn't load at all. This happens all the time for me.
This clip's surface type is 444P10. Maybe libavcodec doesn't handle this format well when it needs to output to NV12.
It's not new that ffmpeg doesn't like NV12 colorspace very much. I patched ffdshow a while back to stop crashing on NV12->NV12 copying.
QS always outputs NV12 like all HW decoders and EVR likes NV12 too since this is the native HW format.
ffmpeg decoders output YV12 instead, and the video processing features there are validated better than the NV12 flows.
I'll look into it but I'm doubtful there's any chance of fixing this in ffdshow.
edit
If limiting outputs to 10 or 12 bit surfaces, it also crashes even when QS is disabled. I didn't find a workaround. Should be reported to libavcodec/swscale dev team.
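For anyone wondering what the NV12 vs YV12 difference mentioned above actually is, here is a small Python sketch of the two 8-bit 4:2:0 layouts. It is illustration only, not ffdshow or ffmpeg code, and it assumes a tightly packed frame with no stride padding, which real HW surfaces usually do not guarantee.

```python
def yv12_to_nv12(data, width, height):
    """Repack a YV12 frame (Y, then V, then U planes) as NV12 (Y, then interleaved UV).

    Assumes a tightly packed 8-bit frame, i.e. pitch == width.
    """
    if width % 2 or height % 2:
        raise ValueError("4:2:0 frames need even width and height")
    y_size = width * height
    c_size = y_size // 4  # each chroma plane is quarter size in 4:2:0
    if len(data) != y_size + 2 * c_size:
        raise ValueError("buffer size does not match a tightly packed 4:2:0 frame")
    y = data[:y_size]
    v = data[y_size:y_size + c_size]  # YV12 stores the V plane before U
    u = data[y_size + c_size:]
    uv = bytearray(2 * c_size)
    uv[0::2] = u  # NV12 interleaves chroma as U, V, U, V, ...
    uv[1::2] = v
    return bytes(y) + bytes(uv)
```

Both formats carry the same samples; only the chroma arrangement differs, so a conversion is a pure repack. The 444P10 surface of the failing clip is a different matter, since it also needs chroma downsampling and a bit-depth reduction on the way to NV12.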
NikosD
12th December 2012, 14:02
This is funny...
How many times is Intel going to announce 4K support in Ivy drivers?
The other thing about 4 displays, I don't know...
http://semiaccurate.com/2012/12/07/intel-to-update-ivy-gpu-drivers-for-quadhd
nevcairiel
12th December 2012, 14:17
I don't see any official Intel announcement, and SemiAccurate usually fails at linking to their sources. In general, it's a rather terrible tech news site (biased and full of trolling articles).
NikosD
12th December 2012, 14:21
It's an interesting site called Semi Accurate - not fully accurate.
Most of the time it's one step ahead of the others on forthcoming news, with the risk that comes with that step.
Let's see...
Ah, one more thing...Intel has already announced 2 or 3 times the 4k support of Ivy drivers...
nevcairiel
12th December 2012, 14:26
I don't think they actually released drivers with 4K output support yet, so until that happens, they might as well keep mentioning it.
Keep in mind that this is about actual 4K output, not about decoding (which has been working for a long time already)
andyvt
12th December 2012, 14:45
Ah, one more thing...Intel has already announced 2 or 3 times the 4k support of Ivy drivers...
IIRC, they demonstrated 4k output at IDF. They are probably making sure that they get full PR coverage. The target demographic for IDF isn't that broad.
odditory
15th December 2012, 07:15
@Eric or anyone else that's running QS in headless mode successfully: I cannot get QS to work headlessly. I followed Eric's instructions (http://software.intel.com/en-us/forums/topic/311872) to a tee, but the only way any QS based codec or transcoding app will engage QS is if the physical display or headless display connected to the HD4000 is set to "Make this my main display". As soon as the display on the NVIDIA GPU is set as "Make this my main display", QS is unreachable. At this point I'm assuming it's just an anomaly with my particular dGPU and motherboard combo and have given up on getting headless to work.
Searched this thread and then the forum but couldn't find an answer; if anyone has any insight I'd appreciate it.
My specs: i5-3570k, Asus Gene V Z77, NVIDIA GTX 680 w/ latest driver, Intel HD4000 latest driver 15.28.8.64.2875, Lucid VirtuMVP NOT installed, Windows 8 x64
wanezhiling
15th December 2012, 07:39
Headless mode? What is it?
odditory
15th December 2012, 08:06
Headless mode in the context of Quicksync enabled apps refers to utilizing the on-die Intel GPU without the need to physically connect a monitor to the motherboard display out ports, usually because you've got your display(s) attached to a discrete GPU (PCIe card) instead.
Or in my case, because I have no choice but to use a discrete GPU with my 2560x1440 monitors, since the motherboard has no DVI out, my monitor has no DisplayPort or HDMI in, and I can't use DisplayPort/HDMI to DVI converter cables since 1440p is a resolution requiring a dual-link connection.
LucidVirtu has attempted to bridge this gap for discrete GPU users, but it's very bleeding edge, hit and miss, and still maturing.
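On the dual-link point above, the arithmetic can be checked quickly in Python. Single-link DVI tops out at a 165 MHz pixel clock. The 60 Hz refresh rate is an assumption here (the post does not state one), and the function names are made up for the example. Blanking is ignored, so the computed figure is only a lower bound on the pixel clock actually needed.

```python
SINGLE_LINK_DVI_MAX_HZ = 165_000_000  # single-link DVI pixel clock ceiling


def active_pixel_rate(width, height, refresh_hz):
    """Pixels per second for the visible area only (blanking ignored)."""
    return width * height * refresh_hz


def needs_dual_link(width, height, refresh_hz):
    """True if even the visible pixels alone exceed the single-link limit.

    Because blanking is ignored, a False result near the limit is not
    conclusive; a True result is.
    """
    return active_pixel_rate(width, height, refresh_hz) > SINGLE_LINK_DVI_MAX_HZ


if __name__ == "__main__":
    print(active_pixel_rate(2560, 1440, 60))  # 221184000
    print(needs_dual_link(2560, 1440, 60))    # True
    print(needs_dual_link(1920, 1080, 60))    # False
```

So 2560x1440 at 60 Hz needs roughly 221 MHz before any blanking is added, which is well past what a single link can carry.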
egur
15th December 2012, 09:05
See how to do headless setup here (http://forum.doom9.org/showthread.php?p=1532786#post1532786).