View Full Version : LAV Filters - DirectShow Media Splitter and Decoders



nevcairiel
20th February 2014, 08:08
I just found out that my GPU Ati 4850 with 512MB Ram cannot handle DXVA-CB on

Older AMD cards suck at Copy-Back. Anything before the HD7000 series is basically unusable with Copy-Back decoding.
Especially if you try to use high framerate material, you'll really notice how the card bottle-necks the whole system.

NikosD
20th February 2014, 08:29
My Radeon 5750 card on PCI-E x16 has almost the same performance using DXVA-CB or Native.
The problem is the high framerate.

60fps are unreachable even with HD7000 series - I haven't tested myself, I have read it though.

mhourousha
20th February 2014, 09:22
My HD7770 can handle 1080p60 clips with bitrate around 10Mbps.
To marsovac: you can create a custom EVR presenter that supports both DXVA2_Native and xySubfilter.

NikosD
20th February 2014, 09:24
And what about the clip posted above or any other AVCHD clip which usually is ~25Mbps ?

Can you handle the 10Mbps 1080p60fps clip in both modes, DXVA-CB and native, or in native only?

mhourousha
20th February 2014, 09:33
And what about the clip posted above or any other AVCHD clip which usually is ~25Mbps ?
I tested the 'ducks take off' clip (1080p version, 100Mbps) some months ago; the decode performance was around 45fps IIRC, in Native mode.

NikosD
20th February 2014, 09:43
Thanks for the info, but "the ducks" clip is not a 1080p60fps clip, and it's not an AVCHD clip.

Can you try the clip posted above or this one
ftp://helpedia.com/pub/multimedia/x264/testvideos/2011%20-%2002%20-%20H.264%20CPU%20DXVA%20codec%20comparison%20-%20Core2Duo%20vs%20UVD%202.2/6.Cat-1080p60fpsRef4-25Mbps.m2ts

fagoatse
20th February 2014, 09:44
Older AMD cards suck at Copy-Back. Anything before the HD7000 series is basically unusable with Copy-Back decoding.
Especially if you try to use high framerate material, you'll really notice how the card bottle-necks the whole system.

Shouldn't APUs be better at that task though?

Soukyuu
20th February 2014, 11:17
If you're talking about HSA, I'm not sure it will work out of the box. I pretty much expect the code to duplicate data in memory because it doesn't expect both the GPU and CPU to have access to the whole memory.

mhourousha
20th February 2014, 12:19
Thanks for the info, but "the ducks" clip is not a 1080p60fps, it's not an AVCHD clip.

Can you try the clip posted above or this one
ftp://helpedia.com/pub/multimedia/x264/testvideos/2011%20-%2002%20-%20H.264%20CPU%20DXVA%20codec%20comparison%20-%20Core2Duo%20vs%20UVD%202.2/6.Cat-1080p60fpsRef4-25Mbps.m2ts
avg 76 fps for both DXVA2_N and DXVA2_CB mode

NikosD
20th February 2014, 12:34
Shouldn't APUs be better at that task though?

If you're talking about HSA, I'm not sure it will work out of the box. I pretty much expect the code to duplicate data in memory because it doesn't expect both the GPU and CPU to have access to the whole memory.

Yes they should, but only in DXVA-CB, not DXVA native.
Memory copies are faster on an APU, so I'm pretty sure that DXVA-CB should be faster than on discrete cards of the same generation, at least.

But DXVA native should be the same, because basically it depends on the HW decoder.

avg 76 fps for both DXVA2_N and DXVA2_CB mode

Thanks!
It seems that the performance of HD7000 series is on par with Nvidia VP4 and my overclocked UVD2.2

Can you see the clock of GPU/Memory during benchmarking of your HD 7770?

mhourousha
20th February 2014, 13:51
Thanks!
It seems that the performance of HD7000 series is on par with Nvidia VP4 and my overclocked UVD2.2

Can you see the clock of GPU/Memory during benchmarking of your HD 7770?
GPU load is 30%, and the core/mem clocks are at full speed (1100/1200) during the benchmark.

NikosD
20th February 2014, 15:16
@mhourousha

One last question:

Can you benchmark those two 1080p120fps files ?

1) http://120hz.net/hypermatrix/120fpsvideo/bf.mp4

2) http://120hz.net/hypermatrix/120fpsvideo/bf1.mp4

Can you decode them in realtime using HD 7770 (CB or N)?

mhourousha
20th February 2014, 17:55
@mhourousha

One last question:

Can you benchmark those two 1080p120fps files ?

1) http://120hz.net/hypermatrix/120fpsvideo/bf.mp4

2) http://120hz.net/hypermatrix/120fpsvideo/bf1.mp4

Can you decode them in realtime using HD 7770 (CB or N)?
As expected, neither clip can reach full speed.
First clip: avg 77 fps
Second clip: avg 79 fps
DXVA2_N
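
A rough way to read these averages, as a sketch that is not from the thread: the function names are made up for illustration, and it assumes decode speed is the only bottleneck.

```cpp
#include <cassert>

// Fraction of real-time speed. 1.0 or more means the decoder keeps up
// with the clip's frame rate.
double realtimeRatio(double decodeFps, double clipFps) {
    assert(clipFps > 0.0);
    return decodeFps / clipFps;
}

// Smallest share of frames that cannot be delivered on time when the
// decoder is slower than the clip. Returns 0.0 if the decoder keeps up.
double minDropFraction(double decodeFps, double clipFps) {
    const double ratio = realtimeRatio(decodeFps, clipFps);
    return ratio >= 1.0 ? 0.0 : 1.0 - ratio;
}
```

With the figures above, 77 fps against a 120 fps clip is about 0.64x real time, so roughly 36% of frames or more would arrive late. The earlier 76 fps result against a 60 fps clip leaves headroom.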

NikosD
20th February 2014, 20:00
It could go higher, because these clips are easier to decode than AVCHD clips; VP4 averages 90fps.

Thanks for your time.

marsovac
21st February 2014, 15:48
http://120hz.net/hypermatrix/120fpsvideo/bf1.mp4

With DXVA native full speed.

With DXVA CB, extremely laggy. Like 50% dropped frames.

Phenom II X4 3Ghz, 8GB DDR3 1666 C8
ATi HD4850 512MB

Soukyuu
21st February 2014, 17:52
I'm pretty sure I saw someone reporting the same before and the answer was that ati's 4000 series don't have high enough memory bandwidth to cope with copying data back and forth.

nevcairiel
21st February 2014, 18:37
It's a problem with the architecture more than anything; copying from the GPU to system memory is just dead-slow on anything before HD7000. It's not a common use-case for a graphics card tbh, unless you're using it for compute - and that wasn't very big back then.
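
For a rough sense of how much data copy-back has to move, here is a back-of-the-envelope sketch. It assumes the decoder outputs NV12 (12 bits per pixel) with no surface padding, which the thread does not state; the function names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Bytes in one decoded frame, assuming NV12: a full-resolution luma plane
// plus a half-size interleaved chroma plane (1.5 bytes per pixel).
// Assumption: no row or height padding, which real surfaces often have.
std::uint64_t nv12FrameBytes(std::uint32_t width, std::uint32_t height) {
    assert(width > 0 && height > 0);
    return static_cast<std::uint64_t>(width) * height * 3 / 2;
}

// Bytes per second that have to travel from video memory to system memory
// when every decoded frame is copied back.
std::uint64_t copyBackBytesPerSecond(std::uint32_t width,
                                     std::uint32_t height,
                                     std::uint32_t fps) {
    return nv12FrameBytes(width, height) * fps;
}
```

Under those assumptions a 1080p frame is about 3.1 MB, so copy-back moves roughly 187 MB/s at 60 fps and 373 MB/s at 120 fps.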

marsovac
21st February 2014, 20:37
So as I was told: if I want to get DXVA native I need to connect LAV video to a compatible renderer without any filters in between.

And I did that. Connecting directly to EVR.

Enumeration of all filters in my graph:
Renderer: EnhancedVideoRenderer
Default DirectSound Device
LavVideo
LavAudio
LavSplitter


I'm doing this:

if (lavVideoSettings.CheckHWAccelSupport(LAVHWAccel.HWAccel_DXVA2Native) != 0)
{
    // Select DXVA2 native and allow it for SD, HD and UHD resolutions.
    hr = lavVideoSettings.SetHWAccel(LAVHWAccel.HWAccel_DXVA2Native);
    hr = lavVideoSettings.SetHWAccelResolutionFlags(LAVHWResFlag.SD | LAVHWResFlag.HD | LAVHWResFlag.UHD);
}


And connecting renderer and video:


hr = m_graph.Connect(DsFindPin.ByDirection(_video, PinDirection.Output, 0),
                     DsFindPin.ByDirection(_renderer, PinDirection.Input, 0));


Result: no luck:
http://i.imgur.com/qeOoRl1.jpg

I tried with various files which should work with DXVAn, none worked.

Any other suggestions?

EDIT: After removing my custom presenter from the EVR, DXVAn is applied.
What are the requirements to have DXVAn and a custom EVR presenter?

NikosD
21st February 2014, 20:43
You have a very old version of LAV video.

Is it on purpose ?

I don't know though if a newer version solves your problem.

marsovac
21st February 2014, 20:53
You have a very old version of LAV video.

Is it on purpose ?

I don't know though if a newer version solves your problem.

It seems that a custom presenter that I put on the EVR makes it fall back to avcodec.

Without a presenter I get DXVAn, but of course I cannot make that useful in any way :D

PS. I'm using LAV 0.60.1, but I don't know why the dialog says otherwise.

nevcairiel
21st February 2014, 21:08
I haven't actually written a presenter myself, but I assume it needs to somehow handle the DXVA2 mode. I'm afraid I cannot help you there.

PS:
The property page thing is a common issue if you use an unregistered/private copy of LAV Video while having another version of it installed in the system - if that's what you're actually doing. The way the code looks up the COM object for the property dialog, it'll always use the system registered page.

marsovac
21st February 2014, 21:12
The property page thing is a common issue if you use an unregistered/private copy of LAV Video while having another version of it installed in the system - if that's what you're actually doing. The way the code looks up the COM object for the property dialog, it'll always use the system registered page.

That's what I am doing. Loading the COM by using the class factory in the DLL.

It probably uses the system registered version.

About the presenter... I'll review the code from MPC-HC and see what I am missing...

Thanks.

mhourousha
22nd February 2014, 03:37
What are the requirements to have DXVAn and a custom EVR presenter?
refer to the following tutorial
http://msdn.microsoft.com/en-us/library/windows/desktop/bb530107%28v=vs.85%29.aspx
If you want to display subtitles, you can use xySubfilter.
Create a filter that implements ISubRenderConsumer, declared in http://madshi.net/SubRenderIntf.h. Connect it to xySubfilter; then you can request a subtitle image (when the presenter queues a video sample) using the ISubRenderProvider provided by xySubfilter. Receive the image data and store it in a queue. When the presenter draws a queued video sample, fetch the subtitle frame with the corresponding timestamp from the queue, copy it to a texture, and blend it with the video frame. That's all.
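
The queue part of that description could be sketched roughly as below. This is only an illustration under stated assumptions: SubtitleFrame and SubtitleQueue are hypothetical names, the pixel buffer stands in for whatever ISubRenderProvider actually delivers, and none of the real SubRenderIntf.h signatures are used. It is also not thread-safe, so a real presenter would probably need a lock around it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one rendered subtitle image.
struct SubtitleFrame {
    std::int64_t timestamp;            // same time base as the video samples
    std::vector<std::uint8_t> pixels;  // placeholder for the bitmap data
};

// Keeps subtitle images keyed by timestamp until the matching video
// sample is drawn.
class SubtitleQueue {
public:
    // Called when a requested subtitle image arrives.
    void store(SubtitleFrame frame) {
        const std::int64_t key = frame.timestamp;
        frames_[key] = std::move(frame);
    }

    // Called when the presenter draws a queued video sample. Entries older
    // than the sample are discarded. Returns nullptr if no image has exactly
    // this timestamp. The pointer stays valid until a later fetch() with a
    // newer timestamp or a store() with the same timestamp.
    const SubtitleFrame* fetch(std::int64_t videoTimestamp) {
        frames_.erase(frames_.begin(), frames_.lower_bound(videoTimestamp));
        const auto it = frames_.find(videoTimestamp);
        return it == frames_.end() ? nullptr : &it->second;
    }

    std::size_t size() const { return frames_.size(); }

private:
    std::map<std::int64_t, SubtitleFrame> frames_;
};
```

The presenter would then copy the returned image to a texture and blend it over the video frame, as described above; that step depends on the Direct3D setup and is left out here.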

marsovac
22nd February 2014, 19:01
refer to the following tutorial
http://msdn.microsoft.com/en-us/library/windows/desktop/bb530107%28v=vs.85%29.aspx
If you want to display subtitles, you can use xySubfilter.
Create a filter that implements ISubRenderConsumer, declared in http://madshi.net/SubRenderIntf.h. Connect it to xySubfilter; then you can request a subtitle image (when the presenter queues a video sample) using the ISubRenderProvider provided by xySubfilter. Receive the image data and store it in a queue. When the presenter draws a queued video sample, fetch the subtitle frame with the corresponding timestamp from the queue, copy it to a texture, and blend it with the video frame. That's all.

Very good info mhourousha.
Might get it done in a few days if C++ doesn't create problems :)

Thank you!

mzso
22nd February 2014, 20:34
Hello!

This little test video causes a crash with latest LAV video and mpc-hc or potplayer. With or without HW acceleration. With Potp internal decoder and MPV it plays fine.

https://drive.google.com/file/d/0ByfdfPvnoDuzSlZqY1JxeDdXNkU/edit?usp=sharing

JEEB
22nd February 2014, 20:37
mzso, Did you try current git HEAD or the latest release? At least the 32bit current git HEAD seems to work fine for me with a standard MPC-HC setup (splitter->decoder (LAV Audio does downmixing to 2ch)->EVR|DirectSound).

mzso
22nd February 2014, 21:34
mzso, Did you try current git HEAD or the latest release? At least the 32bit current git HEAD seems to work fine for me with a standard MPC-HC setup (splitter->decoder (LAV Audio does downmixing to 2ch)->EVR|DirectSound).

Hmm....
Could it be that it's a LAV Video+madVR issue?
Apparently changing either circumvents the issue.

nevcairiel
22nd February 2014, 21:38
The crash is inside madVR. I believe I reported a similar issue to madshi before, but to be safe you should tell him again.

marsovac
23rd February 2014, 03:56
Thanks to mhourousha I got DXVAn to work with my presenter.

Now the big step, integrate xyVSFilter into the presenter instead of attaching to LAV...

One strange thing I found out:

- In DXVA native the bf1 video lags badly (bf1.mp4 (http://120hz.net/hypermatrix/120fpsvideo/bf1.mp4))
- In LAV software mode it works without any trouble

Is this supposed to happen for 120 FPS videos?

I have both of my builds (DXVA, software) in this zip. Can anyone test and see if they have the same issue?

http://marino.boletus.hr/TestBuilds.zip

You can check that dxva is enabled in the lav config dialog in the tray icon.

PS. I tried to play the video in MPC-HC and it lags in DXVAn, even though a little less than in mine.

mhourousha
23rd February 2014, 06:35
No AMD GPU can currently handle 1080p120 in HW decoding mode, so slowdown will occur.
But your player produces some additional lag, IMHO. The stats display flashes all the time, and the player crashed near the end of the clip :(
I suggest you try using a separate device and a separate thread for rendering (UI, frame, stats, etc.). D3D9Ex has a 'sharing resources between devices' feature, so the EVR presenter could draw only the frame to a shared render target, and then the rendering thread and device use this RT to draw the front end.

Aleksoid1978
23rd February 2014, 08:03
No AMD GPU can currently handle 1080p120 in HW decoding mode, so slowdown will occur.

ATI 7750 plays 1080p120 perfectly in DXVA 2.0 mode.

mhourousha
23rd February 2014, 08:07
ATI 7750 plays 1080p120 perfectly in DXVA 2.0 mode.
My HD7770 simply can't.

wanezhiling
23rd February 2014, 09:33
My HD7770 simply can't.

Win7 or Win8/8.1?

Try disabling Aero if it is Win7.

NikosD
23rd February 2014, 10:40
ATI 7750 plays 1080p120 perfectly in DXVA 2.0 mode.

I think you are the first ever reporting this.

Are you sure you are not using your CPU and/or you are actually decoding 120fps ?

Do you play 4K H.264 video too, in DXVA and 7750 ?

mhourousha
23rd February 2014, 11:00
Win7 or Win8/8.1?

Try disabling Aero if it is Win7.
My OS is Win7. Aero was already disabled when benchmarking.
LAV DXVA2N:avg~79fps
MS Decoder MFT: avg~81fps

NikosD
23rd February 2014, 11:02
Check your Catalyst video settings and make sure that everything - besides "Automatic Deinterlacing" - is disabled.

marsovac
23rd February 2014, 11:19
No AMD GPU can currently handle 1080p120 in HW decoding mode, so slowdown will occur.
But your player produces some additional lag, IMHO.

Ok that explains what I am having with HD4850.

The additional lag I'm having is exactly what you said.

I am actually drawing the EVR surface into a separate render target, and then I'm passing that into my application over to a WPF D3DImage, which is where the image is presented in the UI.

There is more overhead than just drawing the surface on a form, but the main issue is that the whole UI in WPF is GPU accelerated, so my UI introduces even more lag on the already stressed GPU. You can see that when hovering in and out of the window while that video is playing. The transparency animation effect lags too.

But I'm not concerned about that. I don't want to make a player in GDI or native DX since it requires too much work to make the UI nice :)

About the player crash: I may have some leaks left in those builds, since I was experimenting with a lot of things to make the presenter handle DXVA2. Will check it.

mhourousha
23rd February 2014, 11:34
Check your Catalyst video settings and make sure that everything - besides "Automatic Deinterlacing" - is disabled.
Turning off all the post-processing effects lowers the GPU load (30% -> 4%), but it also lowers the GPU core clock (1100 MHz -> 400 MHz most of the time), so there is a small drop in fps (80 fps -> 76 fps).

NikosD
23rd February 2014, 11:42
Strange.

In normal playback that would be normal.

But in benchmarking mode, the driver should clock the GPU as high as possible.
400 MHz is the standard clock of every UVD decoder, since Radeon 5000 series (UVD2.2)

I don't know if there is a way to find the real UVD clock, instead of GPU clock, although it should be the same if the GPU is using one clock for both.

wanezhiling
23rd February 2014, 11:50
It is still a long way for GCN to achieve NV VP5 capability in decoding, let alone the decoding *monster* Intel, sigh.

nevcairiel
23rd February 2014, 14:33
I have a NVIDIA 750 here for testing (with the new decoder, VP6 if you want to call it that), didn't install it in a system yet though, we'll see how well it performs and blows away AMD even more.
NVIDIAs claims would put it on par with Intel, but we'll see what really happens.

NikosD
23rd February 2014, 14:36
Nice!

I bet the 750's video decoding performance will be half that of your Haswell iGPU.

DragonQ
23rd February 2014, 14:51
Quite interested in results for the Maxwell GPUs; nVidia's claims are rather outlandish (8-10x faster video decoding, 2x faster video encoding) and the new range of cards should be really nice with their much lower temperatures and significantly reduced power consumption.

NikosD
23rd February 2014, 14:57
I have read that claim of x8-x10 video decoding performance and I think there is a HUGE misunderstanding.

The x8-x10 claim is relative to real-time decoding, not to the previous-generation VP5.

VP6 (?) is x2 - x2.5 faster than VP5, which translates to 4K60fps video decoding capability.

My bet.

DragonQ
23rd February 2014, 15:02
Yeah, makes more sense. 2160p @ 60 fps AVC would be great, but it's a shame they only have partial HEVC acceleration. Most 2160p content will likely be HEVC in the near future.

I'd be interested to see if they've updated their deinterlacing algorithms at all, but I guess with the eventual move to 2160p they don't need to worry much about it any more.

nevcairiel
23rd February 2014, 20:19
First testing shows it's around 3x as fast as VP5, reaching close to 400 fps in various 1080p samples, or 80 fps on 4K. But that was only a quick and dirty test with very limited samples; more later, and maybe I can also squeeze more out of it by modifying the decoder a bit to be more aggressive.
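
To put the 1080p and 4K figures on one scale, a quick pixel-throughput sketch. It assumes 1920x1080 and 3840x2160 frames, since the actual sample resolutions are not given here, and the function names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Decoded pixels per second at a given resolution and frame rate.
std::uint64_t pixelsPerSecond(std::uint32_t width, std::uint32_t height,
                              std::uint32_t fps) {
    return static_cast<std::uint64_t>(width) * height * fps;
}

// Frame rate at the target resolution that moves the same number of
// pixels per second as `fps` at the source resolution.
double equivalentFps(double fps,
                     std::uint32_t fromWidth, std::uint32_t fromHeight,
                     std::uint32_t toWidth, std::uint32_t toHeight) {
    assert(toWidth > 0 && toHeight > 0);
    const double fromPixels = static_cast<double>(fromWidth) * fromHeight;
    const double toPixels = static_cast<double>(toWidth) * toHeight;
    return fps * fromPixels / toPixels;
}
```

Under those assumptions, 400 fps at 1080p is about 829 Mpixel/s and 80 fps at 4K is about 664 Mpixel/s, so by pixel count alone the 4K result corresponds to 320 fps at 1080p.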

NikosD
23rd February 2014, 20:26
Not a surprise for me.

I was thinking about 4K@70 fps, but Nvidia decided to give it more room just in case.

They have fully covered HDMI 2.0 which is 4K@60.

Still 4K@80 fps is about half decoding performance of Haswell iGPU.

nevcairiel
23rd February 2014, 20:27
My Haswell did 100 fps on the 4K sample I tested (and its usual 500 on the 1080p samples), so hardly half.

NikosD
23rd February 2014, 20:32
Have you disabled all VPP functions in drivers ?

Because even the hardest 4K ducks@370Mbps can be decoded by Ivy HD4000 at about ~130 fps.

So your 20-EU iGPU should go higher.

In renderless mode, using the decoder only, it could go above 150fps.

GTPVHD
23rd February 2014, 21:57
Thanks nev for testing the new Maxwell hardware decoder, seems to be a big jump from 120FPS in VP5 to 400FPS & from 4K24P to 4K80P in Maxwell. Intel's 4K120P decoder is still pretty ridiculous in performance.