View Full Version : Intel QuickSync Decoder - HW accelerated FFDShow decoder with video processing


RBG
15th January 2012, 23:19
* HW Video processing: deinterlacing, film detection (3:2, 2:2 pulldowns, etc), noise reduction, sharpness, scaling, etc.

nevcairiel
15th January 2012, 23:39
Though I guess no one has yet compared the quality of Intel's deinterlacing algorithm against Yadif. In terms of quality, Yadif is at least comparable to where Nvidia currently is with GPU deinterlacing

Both NVIDIA's and AMD's hardware deinterlacing look much better than yadif on some content (especially sports).
Intel's is a bit worse than NVIDIA's/AMD's, but a comparison to yadif is missing here.

CruNcher
16th January 2012, 03:32
Here are some utilization results for WVC1 1920x1080 50fps 14 Mbit/s (I got VLD to work with ArcSoft's/CyberLink's decoders and PotPlayer for virtually any WVC1 in .wmv, though some files fail with a black screen) :)

Lav Video = 22%
Quicksync = 11%
IDCT = 8% (Potplayer)
VLD = 2% (Arcsoft/Cyberlink)

The biggest boosts are from software to QuickSync (50%) and from QuickSync to VLD.
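
A quick sketch of the arithmetic behind those figures, for anyone who wants to redo it with their own measurements. The numbers are the ones listed above; the function name and the dictionary keys are just illustrative.

```python
# Utilization figures quoted above (percent).
usage = {
    "LAV Video": 22,
    "QuickSync": 11,
    "IDCT": 8,
    "VLD": 2,
}


def relative_reduction(before, after):
    """Fraction of the load removed when going from `before` to `after`."""
    return (before - after) / before


# Pairs to compare: software -> QuickSync, QuickSync -> VLD, software -> VLD.
steps = [("LAV Video", "QuickSync"), ("QuickSync", "VLD"), ("LAV Video", "VLD")]
for a, b in steps:
    r = relative_reduction(usage[a], usage[b])
    print(f"{a} -> {b}: {r:.0%} lower utilization")
```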

NikosD
16th January 2012, 09:20
Eric,

I have contacted PotPlayer's developer and he told me the same thing nevcairiel said:
VC-1 VLD HW acceleration is not standard DXVA, although VC-1 IDCT is standard DXVA.
Moreover, PotPlayer's developer told me that VC-1 VLD is not open.

I found these statements odd, if they are true of course.

Both MPEG-2 and H.264 are far more common codecs than VC-1, and Intel supports them by implementing standard DXVA VLD mode for both.

Why do you have this "special treatment" for VC-1 ?
Non standard DXVA and not open.

Is it something to do with Microsoft / ISV's ?

How is it possible for an independent developer like PotPlayer's or nevcairiel to implement DXVA VC-1 VLD?

One last question:
You have mentioned, and I have tested, that the FFDShow QS decoder doesn't provide WMV3 acceleration, which essentially is the VC-1 Simple and VC-1 Main profiles.
It seems that only VC-1 Advanced profile is HW accelerated.

Is that a HW limitation or a software limitation ? (Intel MSDK or driver).

goldie
16th January 2012, 09:28
Let's have a little poll.
What should be the next big feature?
* HW Video processing: deinterlacing, film detection (3:2, 2:2 pulldowns, etc), noise reduction, sharpness, scaling, etc.
* Output native DXVA surfaces (hybrid setups will not be supported)
* Other - please specify.

Deinterlacing & film detection. :thanks:

betaking
16th January 2012, 09:50
@egur: does Intel SandyBridge hardware acceleration support MPEG-4 ASP?

egur
16th January 2012, 10:28
OK then. everyone wants deinterlacing + film detection. I'll start with that. Scaling and other post processing features will follow.
I won't implement a custom DI myself, I'll use the one supplied by the driver. Personally I've tested it to be better than Nvidia/AMD but your mileage may vary.

@CruNcher
The scaling algorithm in SandyBridge is superior to both Nvidia and AMD in both upscaling and downscaling. Since I invented the algorithm for the video scaler, I have deep knowledge on the matter. Like many other parts of the video engine, it's implemented as ASIC and has very high performance.

@NikosD
I have relatively little knowledge of DXVA and what the driver supports or not. Frankly I don't want to deep dive on the matter, I'd rather have a tooth pulled out :(
That's why I use the Media SDK, it simplifies (significantly) HW video decode/process and also adds encode as a bonus.
This means that dealing with DXVA is handled by the Media SDK developers and not me :)
Some features may be possible when using native DXVA that don't exist in the MSDK, but I can live with that.
Anyway, complaints about the driver (or feature requests) should be posted in the driver forum (http://communities.intel.com/community/tech/graphics).

@Betaking
The MSDK doesn't support MPEG4-ASP as far as I know. I don't know about driver support. Does DXVA support this format (any GPU)?

nevcairiel
16th January 2012, 10:39
Does DXVA support this format (any GPU)?

Yes, it's supported on both NVIDIA and AMD (at least on recent GPUs, it's one of the more recent additions).
Intel does not expose support for such a mode in DXVA, so I guess it's not supported by the hardware. To be honest, it's not really required anyway: MPEG4-ASP SD can be decoded by CPUs that are 10 years old, and HD material is somewhat rare (and even then it's still very simple to decode).

The old LAV CUVID supported MPEG4-ASP, and LAV Video 0.45 will regain that ability. I may also add it to the DXVA2 decoder for AMD/ATI, if I ever feel really bored (it's not implemented in ffmpeg yet, so it's more work than just flipping a switch)

@VC-1:
It's possible to intercept the calls from the MSDK into DXVA and check what it's doing differently (I used that a lot the last few days to figure out VC-1 interlaced DXVA2 with other DXVA2 decoders), but since we have the MSDK - why bother? :d

NikosD
16th January 2012, 10:42
@NikosD
I have relatively little knowledge of DXVA and what the driver supports or not. Frankly I don't want to deep dive on the matter, I'd rather have a tooth pulled out :(
That's why I use the Media SDK, it simplifies (significantly) HW video decode/process and also adds encode as a bonus.
This means that dealing with DXVA is handled by the Media SDK developers and not me :)
Some features may be possible when using native DXVA that don't exist in the MSDK, but I can live with that.


Thanks for the reply.

So your proposal to "Output native DXVA surfaces" doesn't involve you in DXVA itself?
Could you output native DXVA surfaces through the MSDK?

I thought the proposal "Output native DXVA surfaces" was actually a proposal for building a direct DXVA decoder.

RBG
16th January 2012, 10:58
egur
Hello Eric.

Can you explain a little bit more about scaling and what you mean by that? AFAIK video scaling (chroma, luma upscaling, downscaling) is something that is usually done at the renderer level, and you are developing a decoder. Also, how good is Intel scaling compared to madVR?



The old LAV CUVID supported MPEG4-ASP, and LAV Video 0.45 will regain that ability

Great news, I don't have to use old CUVID any more. :)

NikosD
16th January 2012, 11:00
The old LAV CUVID supported MPEG4-ASP, and LAV Video 0.45 will regain that ability. I may also add it to the DXVA2 decoder for AMD/ATI, if I ever feel really bored (it's not implemented in ffmpeg yet, so it's more work than just flipping a switch)


Good to know.
For Nvidia HW why isn't it possible to accelerate MPEG4-ASP in DXVA ? (Direct or Frame copy)
Why do you have to use LAV CUVID ?


@VC-1:
It's possible to intercept the calls from the MSDK into DXVA and check what it's doing differently (I used that a lot the last few days to figure out VC-1 interlaced DXVA2 with other DXVA2 decoders), but since we have the MSDK - why bother? :d

Direct is always faster and more efficient in terms of power consumption, a requirement for laptops mainly.
Especially with Intel's MSDK implementation and 60fps VC-1 clips, there is the same Turbo CPU frequency problem as with H.264 60fps files, even in normal playback mode.

betaking
16th January 2012, 11:10
OK then. everyone wants deinterlacing + film detection. I'll start with that. Scaling and other post processing features will follow.
I won't implement a custom DI myself, I'll use the one supplied by the driver. Personally I've tested it to be better than Nvidia/AMD but your mileage may vary.
@Betaking
The MSDK doesn't support MPEG4-ASP as far as I know. I don't know about driver support. Does DXVA support this format (any GPU)?

Thanks for the reply.;)

nevcairiel
16th January 2012, 11:13
For Nvidia HW why isn't it possible to accelerate MPEG4-ASP in DXVA ? (Direct or Frame copy)
Why do you have to use LAV CUVID ?

Who says you have to?
DXVA is also possible.

NikosD
16th January 2012, 11:17
Nobody has done it, yet.
Including you.

nevcairiel
16th January 2012, 11:19
And i already said why i didn't do it yet, because ffmpeg doesn't support MPEG4 DXVA2 yet, and implementing that is quite a bit of work for very little benefit. It is however planned for some time in the future.

Also, my DXVA2 decoder is only a few days old, i rather focused on issues like VC-1 interlaced DXVA, which required you to use a commercial DXVA decoder up to now. :d

NikosD
16th January 2012, 11:23
So, PotPlayer's (which is based on FFMpeg) UVD3 support of MPEG4-ASP VLD is just a work of their own ?

Or FFMpeg implemented MPEG4 ASP only for ATI's HW ?

BTW, DivX codec itself has DXVA MPEG4 ASP support only for UVD3.

Update:
PotPlayer is free and has been able to accelerate DXVA VC-1 interlaced for at least a year!

betaking
16th January 2012, 11:26
So, PotPlayer's (which is based on FFMpeg) UVD3 is just a work of their own ?

Or FFMpeg implemented MPEG4 ASP only for ATI's HW ?

BTW, DivX codec itself has DXVA MPEG4 ASP support only for UVD3.

I don't have UVD3 to test it, but the ArcSoft video codec supports DXVA MPEG4 ASP only for UVD3 too!

nevcairiel
16th January 2012, 11:33
So, PotPlayer's (which is based on FFMpeg) UVD3 support of MPEG4-ASP VLD is just a work of their own ?

Or FFMpeg implemented MPEG4 ASP only for ATI's HW ?

It's not implemented at all. If they added it, it's their own work - and they "forgot" to contribute it back to ffmpeg, like the license mandates.


PotPlayer is free and has been able to accelerate DXVA VC-1 interlaced for at least a year!

A decoder limited to one player is not useful to many people.

It's just one example of why their attitude is not really productive. They take open source code, then add their own features, and claim to have more features than everyone else.
If you base your work on open source, it's mandatory to also contribute any changes back to the project, or at least make the changes available to the public. It's not only a "nice thing to do", it's also required by copyright law!

NikosD
16th January 2012, 11:40
So, they look like the "bad" guys of Open Source community.
They take things from others, but they give nothing back.

I don't know if that's true - I have heard it from others too - but I know that they sure have the most complete Video Player out there especially regarding DXVA video codecs support for all HW (ATI, Nvidia, Intel)

I consider PotPlayer and LAV filters among the best "new" free software for multimedia (MPC-HC, FFMpeg are the "grandfathers")

egur
16th January 2012, 14:41
Thanks for the reply.

So your proposal to "Output native DXVA surfaces" doesn't involve you in DXVA itself?
Could you output native DXVA surfaces through the MSDK?
...

The MSDK outputs Direct3D9 surfaces (which I allocate BTW). This type of resource is used by DXVA for video frames. The MSDK, according to its documentation, is an abstraction layer on top of DXVA2. BTW, the overhead of the MSDK is extremely low. I add an overhead in my decoder (frame copying) so I can impersonate a SW decoder with the added benefits.

egur
Hello Eric.

Can you explain a little bit more about scaling and what you mean by that? AFAIK video scaling (chroma, luma upscaling, downscaling) is something that is usually done at the renderer level, and you are developing a decoder. Also, how good is Intel scaling compared to madVR?

Scaling
Assumption: an image is a discrete representation of the continuous world. This means that pixels (samples) are integrals over an area in the real world. This is similar to audio samples, which represent an integral over time.

Scaling, AKA resampling, can be described as converting the discrete samples to a continuous signal and getting the value of (actually integrating) the signal at new positions. If you create more sample points from the continuous signal, you up-scale the image (more pixels); if you sample fewer points, you're performing down-scaling.

Signal processing theory describes the process (some signal processing knowledge required :) ):
* Create a continuous signal from the discrete samples. This is done by adding zeros between the samples. The continuous signal is all zeroes with spikes where the samples were.
* Low-pass the signal (weaken or eliminate high frequencies).
* Sample the values of the low-passed signal at new positions.

Actual resampling implementations (nearest neighbor, bi-linear, bi-cubic, Lanczos) do just that. Instead of integrating a signal that is mostly zero, you can simply sample the low-pass function and multiply the sampled values with the original pixels.
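
To make the "sample the low-pass function and multiply" idea concrete, here is a minimal 1D sketch in plain Python. It is only an illustration under my own assumptions: the function names, the border clamping and the pixel-center convention are choices made for the example, not anything taken from a real scaler. It uses a tent (bi-linear) kernel as the low-pass function.

```python
import math


def triangle(x):
    """Tent (bi-linear) low-pass kernel, non-zero only for |x| < 1."""
    return max(0.0, 1.0 - abs(x))


def resample_1d(samples, out_len, kernel, support):
    """Resample `samples` to `out_len` values by sampling a low-pass kernel.

    `kernel(x)` is assumed to be zero for |x| >= support.
    """
    in_len = len(samples)
    scale = in_len / out_len          # > 1 means down-scaling
    # When down-scaling, widen the kernel so it also removes frequencies
    # the smaller output cannot represent.
    stretch = max(scale, 1.0)
    out = []
    for i in range(out_len):
        # Position of output sample i in input coordinates (pixel centers).
        center = (i + 0.5) * scale - 0.5
        first = math.ceil(center - support * stretch)
        last = math.floor(center + support * stretch)
        acc = 0.0
        weight_sum = 0.0
        for j in range(first, last + 1):
            w = kernel((j - center) / stretch)
            src = samples[min(max(j, 0), in_len - 1)]   # clamp at the borders
            acc += w * src
            weight_sum += w
        out.append(acc / weight_sum)  # normalize so flat areas stay flat
    return out


print(resample_1d([0.0, 10.0], 4, triangle, 1))
```

Swapping `triangle` for a different kernel (and a matching `support`) changes the sharpness/artifact trade-off without touching the loop.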

When down-scaling, the low-pass function must be designed so it will remove high frequencies that cannot exist in the output image. Every discrete signal has a Nyquist frequency, which is half its size (one for horizontal and one for vertical).

From a signal processing point of view, the perfect low-pass filter is a sinc (sin(x)/x). A sinc will clip all high frequencies and retain the amplitude (strength) of the low frequencies.

For performance reasons the number of samples used to derive a new pixel value is limited. This is called the sampling window width.
For down-scaling a sinc is perfect (if all the pixels in the input image are used to create each and every output pixel).

For up-scaling things are not so easy. Using a sinc or a modified, trimmed version of it (Lanczos) will result in ripples near edges. This is unpleasant to the eye (false edges and mosquito noise).

A variety of sampling functions exist; they are always a compromise between performance, sharpness and artifacts.
* Lanczos - the sharpest. Exhibits strong edge artifacts. The more taps used (sampling window size), the sharper the output and the more artifacts.
* Bi-cubic - fewer artifacts, less sharp.
* Bi-linear - not sharp, geometric artifacts.
* Nearest neighbor - sharp, heavy geometric artifacts.
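
For reference, the kernels in the list can be written down in a few lines. This is a sketch of the textbook definitions only; I use the normalized sinc, sin(pi*x)/(pi*x), which is the usual convention for Lanczos, and the helper `has_negative_lobes` is a made-up name for the example. The negative lobes it detects are what produce the ripples near edges.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def lanczos(x, a=4):
    """Lanczos kernel: a sinc windowed by a wider sinc, zero for |x| >= a.

    a=4 gives a window of 2*a = 8 taps.
    """
    if abs(x) >= a:
        return 0.0
    return sinc(x) * sinc(x / a)


def bilinear(x):
    """Tent kernel: never negative, so no ringing, but soft."""
    return max(0.0, 1.0 - abs(x))


def nearest(x):
    """Box kernel: picks the closest source sample."""
    return 1.0 if -0.5 <= x < 0.5 else 0.0


def has_negative_lobes(kernel, support, steps=1000):
    """True if the kernel dips below zero somewhere inside its support."""
    xs = (-support + 2 * support * k / steps for k in range(steps + 1))
    return any(kernel(x) < -1e-9 for x in xs)


for name, k, s in [("lanczos4", lanczos, 4), ("bilinear", bilinear, 1), ("nearest", nearest, 1)]:
    print(name, "negative lobes:", has_negative_lobes(k, s))
```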

There are some sampling algorithms that work a little differently. They can guess the value of missing samples by some kind of heuristics or statistics (e.g. NEDI algorithm). They are computationally very heavy and the results are not worth the effort.

SandyBridge's advanced video scaler takes a different approach: a context adaptive scaler.
It will use a Lanczos4 scaler (8 taps) in order to create very sharp images. In order to avoid (most of) the artifacts, it will perform an analysis of the area and blend between the sharp scaler and a smooth scaler, depending on whether the analysis finds the target pixel prone to artifacts.
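
The blending idea can be sketched as follows. To be clear, this is NOT the SandyBridge analysis (that part is not described here); the overshoot test below is a generic anti-ringing heuristic used as a stand-in, and the function names and the `gain` value are assumptions made purely for illustration.

```python
def artifact_likelihood(sharp_value, neighbourhood, gain=4.0):
    """Toy stand-in for the analysis step (hypothetical, not Intel's).

    Returns 0.0 while the sharp result stays within the range of the nearby
    source pixels and rises towards 1.0 as it overshoots that range.
    `gain` is an arbitrary tuning value.
    """
    lo, hi = min(neighbourhood), max(neighbourhood)
    overshoot = max(lo - sharp_value, sharp_value - hi, 0.0)
    span = max(hi - lo, 1e-6)
    return min(gain * overshoot / span, 1.0)


def adaptive_pixel(sharp_value, smooth_value, neighbourhood):
    """Blend the sharp and smooth scaler outputs for one target pixel."""
    t = artifact_likelihood(sharp_value, neighbourhood)
    return (1.0 - t) * sharp_value + t * smooth_value


# No overshoot: keep the sharp (Lanczos-like) result.
print(adaptive_pixel(120.0, 110.0, [100.0, 130.0]))
# Strong overshoot past the local maximum: fall back to the smooth result.
print(adaptive_pixel(140.0, 125.0, [100.0, 120.0]))
```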

Context adaptive scaling is not a new idea, but this implementation's quality and performance are probably among the best.
Some companies perform context adaptive scaling using a different paradigm: use a soft scaler like bi-cubic and apply a post-processing sharpness filter on edges that were very strong in the source image.

BTW, these tricks are used only for upscaling. For downscaling, the optimal filter is Lanczos for a given sampling window size.

Regarding luma and chroma scaling: luma is the grey levels of the image (called Y) and chroma is the color information (called UV or CbCr). In the YUV color space, which most videos are encoded with, the UV color components are usually at a lower resolution and thus not fully aligned with the luma (Y) component. A scaler algorithm must make sure that chroma scaling produces a pleasing result. Most of the time, chroma values are resampled using a softer scaler (a bi-cubic variant).
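
As a small illustration of "lower resolution" chroma, assuming the common 4:2:0 layout (the text above does not name a specific subsampling scheme, so that is my assumption, as is the function name):

```python
def plane_sizes_420(width, height):
    """Return ((luma_w, luma_h), (chroma_w, chroma_h)) for 4:2:0 video.

    Each chroma plane (U and V) has half the luma resolution in both
    directions, rounded up for odd sizes.
    """
    luma = (width, height)
    chroma = ((width + 1) // 2, (height + 1) // 2)
    return luma, chroma


luma, chroma = plane_sizes_420(1920, 1080)
print("Y plane:", luma, "U/V planes:", chroma)
```

So for the same output size, the chroma planes need twice the scale factor of the luma plane, which is one reason chroma scaling gets separate treatment.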

MadVR currently implements a wide variety of scaling algorithms, all of them known textbook algorithms, and allows selecting different algorithms for Y and UV scaling so the user can get the results he/she likes best.
Since there's usually a trade-off between sharpness and various artifacts, some users will sacrifice one for the other.

NikosD
16th January 2012, 16:07
The MSDK outputs Direct3D9 surfaces (which I allocate BTW). This type of resource is used by DXVA for video frames. The MSDK, according to its documentation, is an abstraction layer on top of DXVA2. BTW, the overhead of the MSDK is extremely low. I add an overhead in my decoder (frame copying) so I can impersonate a SW decoder with the added benefits.


It's clear now.

But then, I think it is extremely easy to implement a "direct" DXVA decoder through the MSDK, just by sending the decoded frame directly to the EVR renderer.

So, I think the "direct" - through MSDK - DXVA decoder can be implemented earlier and easier than video scaling procedures. :D

CruNcher
16th January 2012, 16:16
OK then. everyone wants deinterlacing + film detection. I'll start with that. Scaling and other post processing features will follow.
I won't implement a custom DI myself, I'll use the one supplied by the driver. Personally I've tested it to be better than Nvidia/AMD but your mileage may vary.

@CruNcher
The scaling algorithm in SandyBridge is superior to both Nvidia and AMD in both upscaling and downscaling. Since I invented the algorithm for the video scaler, I have deep knowledge on the matter. Like many other parts of the video engine, it's implemented as ASIC and has very high performance.

@NikosD
I have relatively little knowledge of DXVA and what the driver supports or not. Frankly I don't want to deep dive on the matter, I'd rather have a tooth pulled out :(
That's why I use the Media SDK, it simplifies (significantly) HW video decode/process and also adds encode as a bonus.
This means that dealing with DXVA is handled by the Media SDK developers and not me :)
Some features may be possible when using native DXVA that don't exist in the MSDK, but I can live with that.
Anyway, complaints about the driver (or feature requests) should be posted in the driver forum (http://communities.intel.com/community/tech/graphics).

@Betaking
The MSDK doesn't support MPEG4-ASP as far as I know. I don't know about driver support. Does DXVA support this format (any GPU)?

I fully believe you on that :) though that wasn't the question. On the scaling it was more: how does it compare to NNEDI3 ;)

There are some sampling algorithms that work a little differently. They can guess the value of missing samples by some kind of heuristics or statistics (e.g. NEDI algorithm). They are computationally very heavy and the results are not worth the effort.

Yes, extremely slow, and that's the question: how does yours compare in performance/quality, being implemented in hardware and even adaptive :)

SandyBridge's advanced video scaler takes a different approach: a context adaptive scaler.
It will use a Lanczos4 scaler (8 taps) in order to create very sharp images. In order to avoid (most of) the artifacts, it will perform an analysis of the area and blend between the sharp scaler and a smooth scaler, depending on whether the analysis finds the target pixel prone to artifacts.

It sounds good on paper (did you ever release one?), and surely Intel wouldn't have bought it if it hadn't looked valuable to them ;)

nevcairiel
16th January 2012, 16:32
But then, I think it is extremely easy to implement a "direct" DXVA decoder through the MSDK, just by sending the decoded frame directly to the EVR renderer

It's not that easy, there are a number of annoying factors to deal with. Personally, I think the copy-back solution is easier, which is why I started with it. :D

RBG
16th January 2012, 16:51
egur

Thanks for your reply, I appreciate it a lot.:) I want to clear something up, will SB scaling work on hybrid systems and what are the conditions of it? For example, there is no real display connected to my Intel HD graphics, I made a fake one, like you suggested here (http://forum.doom9.org/showpost.php?p=1532786&postcount=186).

CruNcher
16th January 2012, 17:05
@Eric
Since this is one of your professions, I would like to let you know that we also have Robidoux, Madshi, Tritical and others on Doom9 who research in that field over in the Avisynth area ;) (I know you are fully occupied with the decoder, but maybe once you get to the scaling implementation you could say hello ;) )

http://forum.doom9.org/showthread.php?t=160038
http://forum.doom9.org/showthread.php?t=145358
http://forum.doom9.org/showthread.php?t=160610
http://forum.doom9.org/showthread.php?t=154143


Context adaptive scaling is not a new idea, but this implementation's quality and performance are probably among the best.
Some companies perform context adaptive scaling using a different paradigm - use a soft scaler like bi-cubic and perform post processing sharpness filter on edges that were very strong in the source image.

This is also what I currently prefer for realtime and use in my Avisynth framework via GPU shaders, though not with bi-cubic :)
Would really love to see your implementation's results, especially the speed of a native ASIC rather than a shader, though copy-back will hurt that again :)
But yeah, I voted for deinterlacing + IVTC and those are more important for now :D

STaRGaZeR
16th January 2012, 17:08
Good post right there egur ;)

egur
16th January 2012, 20:42
@CruNcher
SandyBridge's Advanced Video Scaler (AVS) is a programmable fixed function scaler (ASIC) used either by the EVR or by renderers from Cyberlink and Arcsoft and maybe other companies.
I've confirmed that the Media SDK uses the AVS for scaling. Older GPUs had simpler scalers.

I didn't release a paper/patent since the actual implementation is trade secret (the analysis part). But again, context adaptive scaling (or context adaptive algorithms in general) is not new.

The performance of the AVS will vary on GPU clock speed, but it can do several 1080p60 streams simultaneously.

The best way to test upscaling is by scaling DVD resolution to 1080p (720p-->1080p is a small scale factor). Downscaling can be checked by shrinking the player/render and playing test patterns - look for aliasing.
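
To put numbers on "small scale factor", a tiny sketch. The DVD frame sizes used are the standard 720x480 (NTSC) and 720x576 (PAL) storage resolutions; display aspect ratio correction is ignored here for simplicity, and the function name is just illustrative.

```python
def scale_factors(src, dst):
    """Return (horizontal, vertical) scale factors from `src` to `dst` sizes."""
    (src_w, src_h), (dst_w, dst_h) = src, dst
    return dst_w / src_w, dst_h / src_h


full_hd = (1920, 1080)
for name, size in [("720p", (1280, 720)), ("DVD NTSC", (720, 480)), ("DVD PAL", (720, 576))]:
    fx, fy = scale_factors(size, full_hd)
    print(f"{name} -> 1080p: x{fx:.2f} horizontal, x{fy:.2f} vertical")
```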

@RBG
EVR will use the video processing features (DI, scaling, etc.) available on the GPU connected to the screen showing the video. So with hybrid setups, you get what AMD/Nvidia gives you.

RBG
16th January 2012, 21:28
@RBG
EVR will use the video processing features (DI, scaling ,etc) available on the GPU connected to the screen showing the video. So with hybrid setups, you get what AMD/Nvidia gives you.

Ah... That's sad.:(

Now I don't understand what you meant by writing:

What should be the next big feature?
* HW Video processing: deinterlacing, film detection (3:2, 2:2 pulldowns, etc), noise reduction, sharpness, scaling, etc.

Internal hw deinterlacing, edge-enhancement, all that works in LAV video(CUVID), and when I saw "scaling" in your list, I thought you are going to implement it somehow in the decoder itself, that is why I asked you about it.

egur
16th January 2012, 22:02
Ah... That's sad.:(

Now I don't understand what you meant by writing:

What should be the next big feature?
* HW Video processing: deinterlacing, film detection (3:2, 2:2 pulldowns, etc), noise reduction, sharpness, scaling, etc.

Internal hw deinterlacing, edge-enhancement, all that works in LAV video(CUVID), and when I saw "scaling" in your list, I thought you are going to implement it somehow in the decoder itself, that is why I asked you about it.

I do plan to implement it internally and it will work on Intel HW regardless of the renderer. That's the beauty of the QS decoder's design, it cares very little about the renderer.
The decoder is (traditionally) not responsible for scaling, the renderer is. But I can expose the feature anyway, like ffdshow does (scale to a fixed resolution).

RBG
16th January 2012, 23:40
I do plan to implement it internally and it will work on Intel HW regardless of the renderer. That's the beauty of the QS decoder's design, it cares very little about the renderer.
The decoder is (traditionally) not responsible for scaling, the renderer is. But I can expose the feature anyway, like ffdshow does (scale to a fixed resolution).

Yes, scaling is done by the renderer, I know that, but IMO it brings some inconveniences, especially on hybrid systems. It means that if I want a high quality picture, I have to either stick with MadVR, which is obviously not very stable, or use the SB hw scaler, which is limited to vanilla EVR and needs a physical display connection. In this situation internal hq hw scaling could be a real option. It would be totally awesome if you implemented this feature someday. Also I wonder if it is possible to make internal hw scaling work dynamically, like it works on EVR, scale to the actual window size?

nevcairiel
17th January 2012, 07:49
Also I wonder if it is possible to make internal hw scaling work dynamically, like it works on EVR, scale to the actual window size?

I would doubt that this is feasible.

egur
17th January 2012, 08:45
...
Also I wonder if it is possible to make internal hw scaling work dynamically, like it works on EVR, scale to the actual window size?
The thing is that the decoder receives events/callbacks only for decoding frames.
When the player is paused and the window is resized, only the renderer will receive a notification that the size has changed. Changing resolutions within the decoder implies a (implicit) notification from the decoder to the renderer. Frequent notifications might cause the renderer to misbehave.
So technically it's not possible to do dynamic resolution change within the decoder.
If there were an open source renderer project, then it would be possible.

For your kind of setup, if you choose not to use MadVR, you should check if you can use Lucid Virtu (Google it). It will copy the frames from EVR to the actual display with relatively small overhead.

RBG
17th January 2012, 09:26
The thing is that the decoder receives events/callbacks only for decoding frames.
When the player is paused and the window is resized, only the renderer will receive a notification that the size has changed. Changing resolutions within the decoder implies a (implicit) notification from the decoder to the renderer. Frequent notifications might cause the renderer to misbehave.
So technically it's not possible to do dynamic resolution change within the decoder.
If there were an open source renderer project, then it would be possible.

Thanks for the clarification. Well, fixed resolution resize should be fine too.;)


For your kind of setup, if you choose not to use MadVR, you should check if you can use Lucid Virtu (Google it). It will copy the frames from EVR to the actual display with relatively small overhead.

Lucid Virtu runs on my motherboard only in trial mode, already tried it, and vanilla EVR itself is no good due to poor subtitle support and lack of custom shaders.

NikosD
17th January 2012, 10:48
Eric,

Your professional knowledge of scalers and video processing in general could give Intel an initial boost towards the next generation video processing capabilities supported by DXVA-HD.

There is no driver available by any company supporting DXVA-HD, which BTW has nothing to do with decoding.
It's pure Video Processing enhanced compared to DXVA-VP.

Here are some key points from Microsoft:

Improvements over DXVA-VP

DXVA-HD expands the set of features provided by DXVA-VP. Enhancements include:

•RGB and YUV mixing. Any stream can be either RGB or YUV. There is no longer a distinction between the primary stream and the substreams.
•Deinterlacing of multiple streams. Any stream can be either progressive or interlaced. Moreover, the cadence and frame rate can vary from one input stream to the next.
•RGB background colors. Previously, only YUV background colors were supported.
•Luma keying. When luma keying is enabled, luma values that fall within a designated range become transparent.
•Dynamic switching between deinterlace modes.

DXVA-HD also defines some advanced features that drivers can support. However, applications should not assume that all drivers will support these features. The advanced features include:

•Inverse telecine (for example, 60i to 24p).
•Frame-rate conversion (for example, 24p to 120p).
•Alpha-fill modes.
•Noise reduction and edge enhancement filtering.
•Anamorphic non-linear scaling.
•Extended YCbCr (xvYCC).
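
For readers unfamiliar with the 60i to 24p case (inverse telecine of 3:2 pulldown), here is a toy sketch of the cadence. It works on frame labels only; real film detection has to compare actual field content, which this does not attempt. The function names are made up for the example and have nothing to do with the DXVA-HD API.

```python
def pulldown_32(film_frames):
    """Telecine: spread film frames over fields in a 3,2,3,2,... cadence.

    Returns interlaced video frames as (top_field_source, bottom_field_source).
    """
    fields = []
    for i, frame in enumerate(film_frames):
        repeat = 3 if i % 2 == 0 else 2
        fields.extend([frame] * repeat)
    # Consecutive fields are paired into one interlaced video frame.
    return [(fields[k], fields[k + 1]) for k in range(0, len(fields) - 1, 2)]


def inverse_telecine(video_frames):
    """Recover the film frames from clean 3:2 material by dropping repeats.

    Assumes neighbouring film frames carry different labels.
    """
    film = []
    for top, bottom in video_frames:
        for source in (top, bottom):
            if not film or film[-1] != source:
                film.append(source)
    return film


video = pulldown_32(list("ABCD"))   # 4 film frames -> 5 video frames
print(video)
print(inverse_telecine(video))
```

Four film frames become five interlaced frames (ten fields), which is how 24 frames per second turns into 60 fields per second; frames 2 and 3 of each group mix fields from two different film frames and would show combing if displayed as-is.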

nevcairiel
17th January 2012, 11:30
There is no driver available by any company supporting DXVA-HD,

That's not true, NVIDIA supports DXVA-HD.
Intel does too, but Intel's support is rather limited (and slow!)

AMD does not offer it at all.

I evaluated it a while ago to use it for deinterlacing, but because only NVIDIA really offered a working solution, i discarded the code again.

Obviously even NVIDIA doesn't offer every single mode, especially the IVTC support is very limited through DXVA-HD.

egur
17th January 2012, 12:16
Eric,

Your professional knowledge of scalers and Video Processing in general, could be an initial boost for Intel to move forward next generation Video Processing capabilities supported by DXVA-HD.

I'm in a different position, other people are responsible for video processing capabilities.
There's active R&D going on in video processing at Intel. Processors/GPUs after SandyBridge will have improved algorithms. It's an evolutionary process.
BTW, most of the features you specified exist in SandyBridge as far as I know.
I think an open source renderer would be a good place to advance the video processing field. LAV Renderer perhaps? :D

nevcairiel
17th January 2012, 12:51
LAV Renderer perhaps? :D

No thanks, i'm happy with madVR.

patul
17th January 2012, 12:51
I think an open source renderer would be a good place to advance the video processing field. LAV Renderer perhaps? :D

<OT>
That would be nice, might be reinventing the wheel as madVR (even though it's closed source) is doing a great job. I asked for full blown LAV Player (MF-based) few weeks ago, and nevcairiel said that I will have to lock him up in a room with computer for a year or two :D

</OT>

egur
17th January 2012, 12:55
No thanks, i'm happy with madVR.
I was kidding. You already do so much.

RBG
17th January 2012, 13:54
I was kidding. You already do so much.

Maybe you were kidding, but you're absolutely right here: a new open source renderer is indeed needed. MadVR is good, but it is nowhere near as stable as EVR, and it's a closed source project.

CruNcher
17th January 2012, 13:57
Eric,Jan,Madshi and Nev together that would be crazy (in the good sense) EriJaMaNev Player or why not finally do the "Doom9 Player" ;)

Blight
19th January 2012, 03:29
I wonder if dynamic scaling can be done in the decoder if the player used a function to inform the decoder of the resolution change.

egur
19th January 2012, 17:22
I wonder if dynamic scaling can be done in the decoder if the player used a function to inform the decoder of the resolution change.

Due to queuing on both decoder and renderer this will not be smooth...

CharlieCL
20th January 2012, 01:04
Is it possible to wrap the encoder into ffdshow, so we can record a video from the frame buffer? This may remove a big disadvantage of the PC, which cannot record video from HDMI because of HDCP.

CharlieCL
20th January 2012, 01:19
Let's have a little poll.
What should be the next big feature?
* HW Video processing: deinterlacing, film detection (3:2, 2:2 pulldowns, etc), noise reduction, sharpness, scaling, etc.
* Output native DXVA surfaces (hybrid setups will not be supported)
* Other - please specify.

Eric,

Could you please set the H.26, VC1 and MPEG2 codecs to Intel QuickSync by default in your ffdshow distribution? In Windows 7 I have to run another program to disable the Windows codec even if QuickSync was selected. This is not convenient. Since your distribution is especially for SB/IB, it would be better to have the defaults optimized for it.

Blight
20th January 2012, 02:00
Many users will accept jerky scaling while resizing a window if the end result is a better image quality once the window finishes resizing.

Due to queuing on both decoder and renderer this will not be smooth...

I vote deinterlacing in the poll, added as a new option in ffdshow's deinterlace setting and accessible through ffdshow's API.
Right now Zoom Player is setting ffdshow to 'yadif' if a user turns on deinterlacing, but it would be trivial to make SB deinterlacing an option if the API supported it.

CruNcher
20th January 2012, 02:04
Many users will accept jerky scaling while resizing a window if the end result is a better image quality once the window finishes resizing.



I vote deinterlacing in the poll, added as a new option in ffdshow's deinterlace setting and accessible through ffdshow's API.
Right now Zoom Player is setting ffdshow to 'yadif' if a user turns on deinterlacing, but it would be trivial to make SB deinterlacing an option if the API supported it.

Depends, I guess, on how heavy the latency would be when switching between regular window, full window and full screen. See for example Jan's experimental renderer: its latency is just acceptable, and if it took longer it would feel odd. It's already a completely different feeling compared to trunk with that 7ms latency.

PS: Though I agree that having Lanczos4 upscaling quality in realtime is not so shabby, with equal quality for both down/up due to the blending :)

RBG
20th January 2012, 02:07
Eric,

Could you please set the H.264, VC1 and MPEG2 codecs to Intel QuickSync as the defaults in your ffdshow distribution? In Windows 7 I have to run another program to disable the Windows codec even if QuickSync was selected. This is not convenient. Since your distribution is specifically for SB/IB, it would be better for the defaults to be optimized for it.

Another program? You mean you changed decoder merit?

Libavcodec can be switched to QuickSync without any third-party software, right in the ffdshow configuration screen in a few clicks. From my point of view this problem is not even worth mentioning.

Many users will accept jerky scaling while resizing a window if the end result is a better image quality once the window finishes resizing.


As a user I will absolutely accept non-smooth scaling if it ends in decent picture quality. :D I think most people use only about two resolutions that are scaled by the renderer: when they open a movie in a window as a preview, and of course when they extend it to full screen. So smooth resolution change is indeed not that much needed.

egur
I've got a question: what is the best way to deliver the internally scaled image to the renderer (EVR)? I mean, what processing queue and output color space should ffdshow use so as not to ruin the HW-resampled video?

egur
20th January 2012, 14:37
Eric,

Could you please set the H.264, VC1 and MPEG2 codecs to Intel QuickSync as the defaults in your ffdshow distribution? In Windows 7 I have to run another program to disable the Windows codec even if QuickSync was selected. This is not convenient. Since your distribution is specifically for SB/IB, it would be better for the defaults to be optimized for it.

Ever since my decoder was integrated into ffdshow's official builds (a few months back), I've set the default codecs to their original values (mostly libavcodec). If you install ffdshow on top of an existing version, your settings are not touched. So you need to configure ffdshow only once and updates will not touch this.
Disabling Windows codecs has nothing to do with ffdshow or my decoder. I don't know what problems you have and why you need to disable them again and again. This doesn't occur on my PC and I haven't received complaints from other users.

@Blight
First version will have DI on by default for interlaced streams and ffdshow will not be aware that the stream is interlaced as it will receive progressive frames.
As for configuration of the video processing, I'll add a new tab in ffdshow's window for this purpose, or expose an interface from ffdshow, or both. Please define what is needed from such an interface. No hurry, it will be a few weeks before VP works.

@RBG
I agree with RBG that full screen scaling makes more sense and windowed mode can use the renderer's scaler, this will simplify things a lot.

For best performance use only NV12, it's the one and only supported format in HW (ATM). All other formats cause conversions to occur.
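
For readers unfamiliar with NV12, a minimal sketch of its memory layout: a full-resolution 8-bit Y plane followed by one half-height plane of interleaved U/V bytes, each U/V pair shared by a 2x2 block of pixels. The sketch assumes a tightly packed buffer (row pitch equal to width), which real hardware surfaces usually do not guarantee. The function names are illustrative and not part of ffdshow or the Intel SDK.

```python
def nv12_layout(width, height):
    """Return (y_size, uv_size, total) in bytes for a tightly packed NV12 frame."""
    if width <= 0 or height <= 0 or width % 2 or height % 2:
        raise ValueError("NV12 needs positive, even width and height")
    y_size = width * height          # one luma byte per pixel
    uv_size = width * (height // 2)  # interleaved U,V bytes, one pair per 2x2 block
    return y_size, uv_size, y_size + uv_size


def nv12_offsets(width, height, x, y):
    """Return byte offsets (y_off, u_off, v_off) for pixel (x, y).

    Assumes pitch == width; a real surface may pad each row.
    """
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("pixel outside the frame")
    y_size, _, _ = nv12_layout(width, height)
    y_off = y * width + x
    # Chroma row y // 2; each 2-pixel column pair owns two bytes (U then V).
    u_off = y_size + (y // 2) * width + (x // 2) * 2
    return y_off, u_off, u_off + 1


if __name__ == "__main__":
    print(nv12_layout(1920, 1080))
    print(nv12_offsets(4, 2, 3, 1))
```

A 1080p frame therefore takes 1.5 bytes per pixel, and any other output format means an extra conversion pass over that data.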

@CruNcher
What do you mean by "and equal quality for both down/up due to the blending"? What blending?

BTW, I'll be on vacation for a week starting tomorrow, so I might not be able to answer in the following week (till Jan 28th).

CruNcher
20th January 2012, 15:03
Adaptively changing kernels for downscaling/upscaling :)
Do you have any numbers on the raw speed of the ASIC, just for the jaw-dropping effect ;)