Doom9's Forum > Video Encoding > New and alternative video codecs

Old 7th November 2011, 16:14   #241  |  Link
egur
QuickSync Decoder author
 
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
Open source project has been set up

SourceForge homepage:
http://sourceforge.net/p/qsdecoder
Currently only useful for source control (SVN).

FFDshow code changes were merged into FFDshow's code trunk. They will be part of the next official FFDshow release (very similar to 0.18 alpha).

Next on my task list (v0.19):
* Create configuration to enable/disable certain features as asked by several developers for easy integration.
* Fix fullscreen problem in WMC (not loading ffdshow for some reason).
* Export D3D surfaces (DXVA2 samples) instead of system memory buffers. Will provide DXVA speed without actually dealing with DXVA...

If all goes well, version 0.20 will add video postprocessing (deinterlacing, film cadence correction, noise reduction, sharpness, etc.)
__________________
Eric Gur,
Processor Application Engineer for Overclocking and CPU technologies
Intel QuickSync Decoder author
Intel Corp.
Old 7th November 2011, 17:14   #242  |  Link
rsd78
Registered User
 
Join Date: Jan 2009
Posts: 73
Hi Eric,

Very interested in your work here! I'll definitely check it out once the WMC fullscreen issue is fixed, since I'm a WMC-only user. One question: since this uses ffdshow, will the Mediacontrol plugin continue to work as well? Mediacontrol is huge for me (easy sub/audio stream control and ff/rew), so I'm hoping so.

Thanks for your great work!
Old 7th November 2011, 17:31   #243  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 8,915
I just looked over your ffdshow changes, and for the record: it's always much nicer to keep changes separated across multiple commits. For example, the bug fixes and the addition of the QS decoder should've been at least two commits, or more. Just sayin', it's not my project or anything.

One thing I noticed, though: your SSE2 memcpy seems superfluous. If ffdshow is configured to use function intrinsics, the MS compiler will already use an optimized memcpy using SSE2 if available. I did some testing along those lines recently, and a custom SSE2 memcpy was actually not faster.
In addition to that, I don't think ffdshow had a hard dependency on SSE2 before.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 7th November 2011, 18:52   #244  |  Link
egur
QuickSync Decoder author
 
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
Quote:
Originally Posted by nevcairiel View Post
I just looked over your ffdshow changes, and for the record: it's always much nicer to keep changes separated across multiple commits. For example, the bug fixes and the addition of the QS decoder should've been at least two commits, or more. Just sayin', it's not my project or anything.
Usually, yes, but it was hard to separate everything since a lot has changed.

Quote:
Originally Posted by nevcairiel View Post
One thing I noticed, though: your SSE2 memcpy seems superfluous. If ffdshow is configured to use function intrinsics, the MS compiler will already use an optimized memcpy using SSE2 if available. I did some testing along those lines recently, and a custom SSE2 memcpy was actually not faster.
In addition to that, I don't think ffdshow had a hard dependency on SSE2 before.
Maybe VS2010 got it right. I just copied the function from another program that was compiled with VS2005. Back then, it was 2x faster (on Core2Duo and P4).
SSE2 implies a Pentium 4 or later, if I remember correctly (the Pentium III only has SSE). Not a crazy dependency.

I'll run a few more tests and kill it if performance is not gained.
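
For illustration, a minimal sketch of how such a comparison could be run. It assumes nothing about the actual function in ffdshow: `sse2_copy` and `time_copy_ms` are hypothetical names, the SSE2 path uses the standard `<emmintrin.h>` intrinsics, and the code falls back to plain `memcpy` when SSE2 is not compiled in.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

// Hypothetical helper, NOT the function from ffdshow or the QuickSync decoder.
// Copies `size` bytes from src to dst, 16 bytes at a time where SSE2 is available.
inline void sse2_copy(void* dst, const void* src, std::size_t size)
{
    assert(dst != nullptr && src != nullptr);
    std::uint8_t* d = static_cast<std::uint8_t*>(dst);
    const std::uint8_t* s = static_cast<const std::uint8_t*>(src);
#if HAVE_SSE2
    // Unaligned loads/stores keep the sketch valid for any pointer alignment.
    while (size >= 16) {
        const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(s));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(d), v);
        s += 16; d += 16; size -= 16;
    }
#endif
    // Tail bytes (or the whole buffer when SSE2 is not compiled in).
    std::memcpy(d, s, size);
}

// Hypothetical timing helper: runs `copy` `repeats` times, returns elapsed milliseconds.
template <class CopyFn>
double time_copy_ms(CopyFn copy, void* dst, const void* src, std::size_t size, int repeats)
{
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i)
        copy(dst, src, size);
    const std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

To compare, time `sse2_copy` against a small lambda that calls `std::memcpy` on a frame-sized buffer (1920x1080 NV12 is roughly 3 MB). An optimizer may drop repeated copies whose result is never read, so the destination should be inspected after each run, and results will vary by compiler and CPU.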

Last edited by egur; 7th November 2011 at 19:07.
Old 7th November 2011, 22:06   #245  |  Link
egur
QuickSync Decoder author
 
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
Quote:
Originally Posted by rsd78 View Post
Hi Eric,

Very interested in your work here! I'll definitely check it out once the WMC fullscreen issue is fixed, since I'm a WMC-only user. One question: since this uses ffdshow, will the Mediacontrol plugin continue to work as well? Mediacontrol is huge for me (easy sub/audio stream control and ff/rew), so I'm hoping so.
I'll try the Media Control plugin, shouldn't be a problem as I didn't change the ffdshow API. I'll probably fix the WMC issue soon. Been busy lately.

Quote:
Originally Posted by rsd78 View Post
Thanks for your great work!
10x
Old 8th November 2011, 00:14   #246  |  Link
CruNcher
Registered User
 
Join Date: Apr 2002
Location: Germany
Posts: 4,950
Quote:
Originally Posted by egur View Post
Relatively high cpu usage, what are the details (resolution, CPU type, output surface format, etc.)
Video: NV12 1920x1080 59.94fps, Intel Core i5-2400

General
Complete name : G:\WipEout_HD_English_1080p.mp4
Format : MPEG-4
Format profile : Base Media / Version 2
Codec ID : mp42
File size : 171 MiB
Duration : 1mn 11s
Overall bit rate : 19.9 Mbps
Encoded date : UTC 2008-08-01 17:57:37
Tagged date : UTC 2008-08-01 17:57:43

Video
ID : 2
Format : AVC
Format/Info : Advanced Video Codec
Format profile : Main@L4.2
Format settings, CABAC : No
Format settings, ReFrames : 2 frames
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 1mn 11s
Bit rate : 19.8 Mbps
Width : 1 920 pixels
Height : 1 080 pixels
Display aspect ratio : 16:9
Frame rate mode : Constant
Frame rate : 59.940 fps
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.159
Stream size : 169 MiB (99%)
Language : English
Encoded date : UTC 2008-08-01 17:57:05
Tagged date : UTC 2008-08-01 17:57:43
Color primaries : BT.709-5, BT.1361, IEC 61966-2-4, SMPTE RP177
Transfer characteristics : BT.709-5, BT.1361
Matrix coefficients : BT.709-5, BT.1361, IEC 61966-2-4 709, SMPTE RP177

Audio
ID : 1
Format : AAC
Format/Info : Advanced Audio Codec
Format profile : LC
Codec ID : 40
Duration : 1mn 11s
Bit rate mode : Constant
Bit rate : 144 Kbps
Nominal bit rate : 160 Kbps
Channel(s) : 2 channels
Channel positions : Front: L R
Sampling rate : 48.0 KHz
Compression mode : Lossy
Stream size : 1.25 MiB (1%)
Language : English
Encoded date : UTC 2008-08-01 17:57:04
Tagged date : UTC 2008-08-01 17:57:43
Material_Duration : 71851
Material_StreamSize : 1314528

Interesting: with Cyberlink's decoder I additionally get the Line 21 Decoder 2 loaded. Wasn't Line 21 an analog thing?

This gets loaded on EVR Input 1 with Cyberlink DXVA

Filter: Line 21 Decoder 2
Pin: XForm Out

- Connection media type:

Video: AI44 720x480 29.97fps 82861kbps

Hmm, the filter is in quartz.dll, so it's coming from Microsoft. I never saw this one on XP.

ffdshow-quicksync doesn't load it, though.

http://msdn.microsoft.com/en-us/libr...=vs.85%29.aspx

Yep, it's the analog line that was used to transmit different types of data back in the analog days (videodat, captions). It seems Cyberlink loads it by default with some data (strange), or there is some hidden analog data in this Sony stream :P.

Filter : CyberLink Video Decoder (PDVD11) - CLSID : {9699092D-91FC-4DA1-8A63-112D865EB1D2}

- Connected to:

CLSID: {E4206432-01A1-4BEE-B3E1-3702C8EDC574}
Filter: Line 21 Decoder 2
Pin: XForm In

- Connection media type:

Unknown

AM_MEDIA_TYPE:
majortype: MEDIATYPE_AUXLine21Data {670AEA80-3A82-11D0-B79B-00AA003767A7}
subtype: MEDIASUBTYPE_Line21_GOPPacket {6E8D4A23-310C-11D0-B79A-00AA003767A7}
formattype: FORMAT_None {0F6417D6-C318-11D0-A43F-00A0C9223196}
bFixedSizeSamples: 1
bTemporalCompression: 1
lSampleSize: 200
cbFormat: 0

- Enumerated media type 0:

Set as the current media type

It seems Cyberlink indeed opens it by default now; I get it loaded for every stream (I never experienced this before, so it seems to be new behavior).

PS: Egur, I checked the overhead more carefully. Without Quick Sync recording, CPU usage is 10% during playback with ffdshow-quicksync, and it increases to 20% while recording. Cyberlink DXVA CPU usage also increases while recording, but not by such a heavy amount: just 2% more, from 1% to 3%. So recording with Quick Sync at the same time currently seems to lower the efficiency of the decoder. This is somewhat expected, as recording time-critical stuff (D2D browser demos) also shows a slowdown (low-latency recording helps here a little).
Yup, it seems more GPU resources are allocated to the recording and are lost to the decoding, so CPU usage increases.
__________________
all my compares are riddles so please try to decipher them yourselves :)

It is about Time

Join the Revolution NOW before it is to Late !

http://forum.doom9.org/showthread.php?t=168004

Last edited by CruNcher; 8th November 2011 at 10:39.
Old 8th November 2011, 12:30   #247  |  Link
egur
QuickSync Decoder author
 
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
Quote:
Originally Posted by CruNcher View Post
PS: Egur, I checked the overhead more carefully. Without Quick Sync recording, CPU usage is 10% during playback with ffdshow-quicksync, and it increases to 20% while recording.
That's more aligned with what I see for 1080p@60.
BTW, when I add HW deinterlacing, this is the expected performance (for outputting 1080p@60).

I can live with this level of performance but maybe driver improvements can lower CPU usage - more than half the CPU usage within my decoder DLL (not in FFDSHOW) goes into locking the D3D surface - slower than copying the surface back to system memory...
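
For illustration, a rough sketch of the copy step mentioned above (not the lock itself), assuming an NV12 frame whose rows are padded to a pitch, as reported by `IDirect3DSurface9::LockRect` in `D3DLOCKED_RECT::Pitch`. The function name `copy_nv12_pitched` is hypothetical and this is not the decoder's actual code. It also assumes the interleaved UV plane starts right after `height` luma rows, which real drivers may not guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies one NV12 frame from a pitched (row-padded) source, such as a locked
// surface, into a tightly packed destination of width * height * 3 / 2 bytes.
// `src` points at the first luma row; `pitch` is the byte stride between rows.
inline void copy_nv12_pitched(std::uint8_t* dst, const std::uint8_t* src,
                              std::size_t width, std::size_t height, std::size_t pitch)
{
    assert(pitch >= width && height % 2 == 0);
    // Luma: `height` rows. Chroma: height / 2 rows of interleaved U and V,
    // each row again `width` bytes wide.
    const std::size_t rows = height + height / 2;
    if (pitch == width) {
        // No padding: the whole frame is contiguous, so one bulk copy is enough.
        std::memcpy(dst, src, width * rows);
        return;
    }
    // Padded rows: copy only the visible `width` bytes of each row.
    for (std::size_t y = 0; y < rows; ++y)
        std::memcpy(dst + y * width, src + y * pitch, width);
}
```

For 1920x1080 this moves about 3 MB per frame, or roughly 186 MB/s at 59.94 fps, which gives a sense of how much work the copy alone represents.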
Old 8th November 2011, 21:19   #248  |  Link
egur
QuickSync Decoder author
 
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
Help needed

Current version has a limitation that was exposed by Windows Media Center.
It can't initialize in full screen exclusive mode.
The D3D surfaces are allocated through a D3D9 device created with IDirect3D9::CreateDevice(). It's not used to display anything.

Only in full screen exclusive mode (which WMC seems to use) does CreateDevice fail.

Does anyone have a workaround?
Old 8th November 2011, 21:47   #249  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 8,915
What you could try is to ask the EVR for the device. Part of the DXVA API is an interface for getting the device from the renderer.

Luckily for me, CUVID also functions without a D3D device, so I have never had to try (yet).

Edit:
Specifically this: http://msdn.microsoft.com/en-us/libr...(v=vs.85).aspx

Last edited by nevcairiel; 8th November 2011 at 21:49.
Old 8th November 2011, 22:00   #250  |  Link
JanWillem32
Registered User
 
Join Date: Oct 2010
Location: The Netherlands
Posts: 1,084
Indeed, that's the regular DXVA helper. Note that it's not actually an EVR object. It inherits from DXVA2.dll, and calls mfplat.dll.
Code:
	// Resolve DXVA2CreateDirect3DDeviceManager9 at run time, so dxva2.dll is not a hard link-time dependency.
	typedef HRESULT (WINAPI *DXVA2CreateDirect3DDeviceManager9Ptr)(__out UINT *pResetToken, __out IDirect3DDeviceManager9 **ppDXVAManager);
	DXVA2CreateDirect3DDeviceManager9Ptr	pfDXVA2CreateDirect3DDeviceManager9;
	m_hDXVA2Lib = LoadLibrary(L"dxva2.dll");
	// GetProcAddress returns NULL if the export is missing; check the pointer before calling it.
	if (m_hDXVA2Lib) pfDXVA2CreateDirect3DDeviceManager9 = reinterpret_cast<DXVA2CreateDirect3DDeviceManager9Ptr>(GetProcAddress(m_hDXVA2Lib, "DXVA2CreateDirect3DDeviceManager9"));
	else {
		_Error += L"Could not find dxva2.dll\n";
		hr = E_FAIL;
		return;}
edit:
If you're using a device passed to DXVA2CreateVideoService (http://msdn.microsoft.com/en-us/libr...=VS.85%29.aspx), make sure that the HWND pointer used when creating the device isn't linked to a monitor that will be used for exclusive mode.
__________________
development folder, containing MPC-HC experimental tester builds, pixel shaders and more: http://www.mediafire.com/?xwsoo403c53hv

Last edited by JanWillem32; 8th November 2011 at 22:27.
Old 8th November 2011, 23:00   #251  |  Link
egur
QuickSync Decoder author
 
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
The problem is that my decoder is not connected to the renderer; it's unaware of the graph. I can redesign.
Currently I create my own device, but it fails in fullscreen (only when the decoder is instantiated during fullscreen).
Old 9th November 2011, 07:28   #252  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 8,915
It's the only way. In FSE mode you cannot create a new device; you have to use the one the EVR gives you.
Old 9th November 2011, 09:46   #253  |  Link
egur
QuickSync Decoder author
 
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
Quote:
Originally Posted by nevcairiel View Post
It's the only way. In FSE mode you cannot create a new device; you have to use the one the EVR gives you.
Thanks.
Can you point me to the relevant reading material? I didn't see this anywhere.

Update:
I was referred to this article:
http://msdn.microsoft.com/en-us/libr...(v=vs.85).aspx

Basically this means I can create a device when these conditions are met:
* They are created by the same Direct3D object that created the device that is full-screen.
* They have the same focus window as the device that is full-screen.
* They represent a different adapter from any full-screen device.

So nevcairiel is right. I need to postpone decoder initialization until after the graph is connected, which is a little ugly for current use cases and requires shotgun surgery in ffdshow's code. Future use cases (outputting DXVA samples) will benefit from this design change, as I'll be able to know how many surfaces are queued in the renderer.
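
For illustration, the three conditions listed above can be written as a small predicate. This is only a model, assuming the conditions are jointly required (that reading is an assumption; the MSDN article is the authority). `DeviceRequest` and `can_create_while_fullscreen` are hypothetical names, not Direct3D API.

```cpp
#include <cassert>

// Hypothetical description of a device that is about to be created,
// relative to the device that currently owns full-screen exclusive mode.
struct DeviceRequest {
    bool same_d3d_object;    // created by the same Direct3D object as the full-screen device
    bool same_focus_window;  // uses the same focus window as the full-screen device
    bool different_adapter;  // targets an adapter that no full-screen device is using
};

// True when all three conditions hold (assumed here to be jointly required).
inline bool can_create_while_fullscreen(const DeviceRequest& r)
{
    return r.same_d3d_object && r.same_focus_window && r.different_adapter;
}

// Usage example: a decoder that creates its own Direct3D object and its own
// window fails the first two conditions, even if it targets a second GPU.
inline void check_examples()
{
    const DeviceRequest standalone_decoder = {false, false, true};
    assert(!can_create_while_fullscreen(standalone_decoder));
}
```

Under this reading, a decoder that builds its own Direct3D object cannot satisfy the first condition on its own, which is consistent with taking the device from the renderer instead.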

Last edited by egur; 10th November 2011 at 10:27.
Old 9th November 2011, 20:11   #254  |  Link
Blight
Software Developer
 
Join Date: Oct 2001
Location: Israel
Posts: 992
CruNcher:
Line21 still exists in digital format (see DVDs or M2TS content grabbed from DTV streams), mainly for Closed Captions.
__________________
Yaron Gur
Zoom Player . Lead Developer
Old 10th November 2011, 10:33   #255  |  Link
egur
QuickSync Decoder author
 
Join Date: Apr 2011
Location: Atlit, Israel
Posts: 916
Quote:
Originally Posted by JanWillem32 View Post
If you're using a device passed to DXVA2CreateVideoService (http://msdn.microsoft.com/en-us/libr...=VS.85%29.aspx), make sure that the HWND pointer used when creating the device isn't linked to a monitor that will be used for exclusive mode.
Thanks for your help, but I'm not sure this is a viable option in all cases:
* When there's just 1 GPU and one screen.
* Even in multi-GPU setups, I don't know how to do this.

My home setup has 2 GPUs, Intel + AMD. The screen is connected to the AMD and I still can't create the device on the Intel GPU.
Maybe I'm passing an HWND that's associated with the AMD-connected monitor. But I don't know how to create a generic HWND on the other monitor that will actually result in a functioning D3D device...
Old 10th November 2011, 13:06   #256  |  Link
dukey
Registered User
 
Join Date: Dec 2005
Posts: 560
EVR sets things up like this: it creates a window 1x1 pixels in size.

D3DPRESENT_PARAMETERS pp;
ZeroMemory(&pp, sizeof(pp));

pp.BackBufferWidth = 1;
pp.BackBufferHeight = 1;
pp.Windowed = TRUE;
pp.SwapEffect = D3DSWAPEFFECT_COPY;
pp.BackBufferFormat = D3DFMT_UNKNOWN;
pp.hDeviceWindow = hwnd;
pp.Flags = D3DPRESENTFLAG_VIDEO;
pp.PresentationInterval = D3DPRESENT_INTERVAL_DEFAULT;

It doesn't use that for rendering. It creates additional swap chains which it renders into, and presents when they are done. Don't know if that helps.
Old 10th November 2011, 18:52   #257  |  Link
JanWillem32
Registered User
 
Join Date: Oct 2010
Location: The Netherlands
Posts: 1,084
That points out the hwnd handle nicely. Creating the window handle with WS_MINIMIZE|WS_POPUP and possibly WS_DISABLED should work. http://msdn.microsoft.com/en-us/libr...=VS.85%29.aspx
If creating it minimized is a problem, using CloseWindow should also do the trick of minimizing it. http://msdn.microsoft.com/en-us/libr...=VS.85%29.aspx
Also, structs can be assigned on creation. Using ZeroMemory is more something for class members (and also only if you can't zero them on class initialization).
D3DPRESENT_PARAMETERS pp = {1, 1, D3DFMT_X8R8G8B8, 1, D3DMULTISAMPLE_NONE, 0, D3DSWAPEFFECT_DISCARD, hWnd, TRUE, FALSE, D3DFMT_UNKNOWN, 0, 0, D3DPRESENT_INTERVAL_IMMEDIATE};
Old 10th November 2011, 19:05   #258  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 8,915
For readability, everyone should always favor the syntax as dukey posted it. For complex structs like that, using the inline initializers is just asking for trouble.
Old 10th November 2011, 20:22   #259  |  Link
JanWillem32
Registered User
 
Join Date: Oct 2010
Location: The Netherlands
Posts: 1,084
That would most certainly extend the size of the renderers I'm working on considerably (the color management section declares dozens of various structs). Also, I often declare structs like this as static const. Using ZeroMemory might inhibit making elements "rommable" (a syntax of: D3DPRESENT_PARAMETERS pp = {0}; is illegal in this case). I usually just add comments to mark the interesting bits (most elements of this type of struct are 0 or 1). In this case only the hWnd parameter is worth noting; the rest just describes parameters for a 1x1 pixel backbuffer without extras.
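
For illustration, a side-by-side sketch of the two initialization styles being debated. To build without the DirectX headers it uses `PresentParams`, a hypothetical, trimmed-down stand-in for `D3DPRESENT_PARAMETERS`: the field names follow the real struct, but the types and field list are simplified, and the helper names are invented for this example.

```cpp
#include <cassert>

// Hypothetical stand-in for D3DPRESENT_PARAMETERS (simplified on purpose).
struct PresentParams {
    unsigned BackBufferWidth;
    unsigned BackBufferHeight;
    unsigned BackBufferCount;
    void*    hDeviceWindow;
    int      Windowed;
    unsigned PresentationInterval;
};

// Style 1: zero everything, then assign the interesting fields by name.
inline PresentParams make_by_name(void* hwnd)
{
    PresentParams pp = {};          // value-initialization zeroes every member
    pp.BackBufferWidth  = 1;
    pp.BackBufferHeight = 1;
    pp.BackBufferCount  = 1;
    pp.hDeviceWindow    = hwnd;
    pp.Windowed         = 1;
    return pp;
}

// Style 2: positional aggregate initializer, with comments marking each slot.
inline PresentParams make_positional(void* hwnd)
{
    const PresentParams pp = {
        1, 1,   // BackBufferWidth, BackBufferHeight
        1,      // BackBufferCount
        hwnd,   // hDeviceWindow
        1,      // Windowed
        0       // PresentationInterval
    };
    return pp;
}

// Field-by-field comparison, used to confirm both styles build the same value.
inline bool same(const PresentParams& a, const PresentParams& b)
{
    return a.BackBufferWidth == b.BackBufferWidth &&
           a.BackBufferHeight == b.BackBufferHeight &&
           a.BackBufferCount == b.BackBufferCount &&
           a.hDeviceWindow == b.hDeviceWindow &&
           a.Windowed == b.Windowed &&
           a.PresentationInterval == b.PresentationInterval;
}

inline void check_styles_agree()
{
    int dummy_window = 0;           // placeholder for a real HWND
    assert(same(make_by_name(&dummy_window), make_positional(&dummy_window)));
}
```

Both forms produce the same value. The positional form can be declared `const`, but it silently depends on the field order, while the named form stays correct if fields are reordered or added.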
Old 10th November 2011, 20:23   #260  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 8,915
If the code is already too long, using these things to "shorten" it is the worst idea ever. It'll just make already long code even harder to read/understand.
Tags
ffdshow, h264, intel, mpeg2, quicksync, vc1, zoom player
