
View Full Version : LAV Filters - DirectShow Media Splitter and Decoders



wanezhiling
12th January 2012, 07:23
1. As an ATI user, thanks first, nev. :)

2. I simply tested all the 1080p and 1080i files I have (BD .m2ts files).

3. My Mobility HD4650 (UVD 2.0) supports H.264_VLD, VC1(WMV3)_VLD and MPEG2_IDCT. The ATI build only works with H.264 and VC1 (WMV3); MPEG2 failed, just like in MPC-HC, whose DXVA is only based on VLD (Variable Length Decoding). :)

4. Compared with H.264, VC1 files perform poorly and are not very smooth (same with CUVID). Yep, 1080p playback is better than 1080i too.

5. Everything (except VC1) seems to be running smoothly. I only found one 1080i/H.264 clip where the image is broken (http://img165.poco.cn/mypoco/myphoto/20120112/13/5905971020120112131401080.jpg). Maybe you can try it here (http://www.gokuai.com/f/2igXjZBs6v96Rbt0). :)



Mobility HD4650, Catalyst 12.1a preview, win7 x86
H.264 1080I (http://pastebin.com/FXNcJbmw)(Image broken)

VC1 (http://pastebin.com/g5pJk1L3)

WMV3 (http://pastebin.com/1rbzyeLk)

MPEG2 (http://pastebin.com/nqG5g96Q)

nevcairiel
12th January 2012, 07:46
From all these numbers it looks like anything up to 60i will most likely play properly, however 60p might be too slow, somewhat similar to my own results.

We'll see if i can find a better way to lock and retrieve the video frame. I didn't even optimize the memory copying yet because all the time is spent in LockRect, not copying the data.
I must be doing something wrong still, because even on my NVIDIA the performance isn't close to what i would expect. :)

Does anyone have any experience with copying the data from the GPU?
Right now i'm just LockRect'ing the DXVA surface and copying the data over - but the LockRect can quite frequently take a long time. I read that an alternative is using GetRenderTargetData to copy the data to system memory, and because that method is async in the background, i could potentially gain some benefits from it?
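
For anyone following along, the "copy the data over" step can be sketched roughly like this. It is only an illustration, not the actual LAV Video code. It assumes LockRect has already succeeded and handed back a base pointer and a pitch, that the surface is NV12, and that the chroma plane starts at pitch * surfaceHeight (the allocated height, which can be larger than the visible one). `CopyNV12` and all of its parameter names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not LAV Filters code): copies an NV12 frame out of a
// pitched buffer, such as the memory a locked surface exposes, into a tightly
// packed buffer of width * height * 3 / 2 bytes.
//
// Assumptions: `pitch` is the row stride in bytes, `surfaceHeight` is the
// allocated surface height, and the interleaved CbCr plane begins at
// src + pitch * surfaceHeight.
std::vector<std::uint8_t> CopyNV12(const std::uint8_t* src, std::size_t pitch,
                                   std::size_t surfaceHeight,
                                   std::size_t width, std::size_t height) {
    assert(src != nullptr && width > 0 && pitch >= width);
    assert(surfaceHeight >= height && height % 2 == 0);

    std::vector<std::uint8_t> dst(width * (height + height / 2));
    std::uint8_t* out = dst.data();

    // Luma plane: `height` rows, dropping the padding at the end of each row.
    for (std::size_t y = 0; y < height; ++y, out += width)
        std::memcpy(out, src + y * pitch, width);

    // Chroma plane: height / 2 rows of interleaved Cb/Cr samples.
    const std::uint8_t* chroma = src + pitch * surfaceHeight;
    for (std::size_t y = 0; y < height / 2; ++y, out += width)
        std::memcpy(out, chroma + y * pitch, width);

    return dst;
}
```

The copy itself is just a row-by-row memcpy, which fits the observation that the time goes into the lock and not into moving the bytes.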

PS:
Partial acceleration is annoyingly complicated to support for only a small gain, which is why it's not supported and never will be. Even Microsoft got the hint, and any new formats don't have a partial acceleration type (and won't have).

Superb
12th January 2012, 10:13
Not that I'm defending the copy/pasting of the guide here, but every one of these was explicitly answered in the text.

1. "LAV Splitter v0.42 still doesn’t support segment linking yet, so playing series releases that have OP/ED as seperate video files won’t playback in each of the episodes"
2. "The 64-bit version is incompatible with madVR"
3. "Under “Internal Filters”, disable everything. You can leave some of the ones on the left active"
4. "Bonus: Adding ffdshow (Optional) ... Adding ffdshow will allow you to use its audio decoder (for filters + a more customizable mixer), ... Especially for older 480p and worse content (which isn’t likely to be encoded in AVC1), you can use the ffdshow deblocking or debanding filters to attempt to improve image quality this way."
5. "I recommend leaving this off unless you are having playback problems, because it prevents you from taking screenshots and makes the transition to fullscreen very ugly (It also messes up MPC-HC’s interface)"

You obviously read enough to make a snarky, unhelpful reply... would it have been much more work to try to comprehend anything that was written there instead of just being a jerk?

1. 99% of the releases don't use this feature. And LAV Splitter supports MANY other containers much much better than Haali or MPC-HC's filters.
2. The 64 bit version of MPC-HC. Yes. We all know that 64bit player cannot run a 32bit renderer. I was asking: "why not install the 64bits components of LAV Filters?" (so that they will be available for other [64bit] programs)
3. But why? LAV Splitter does a much better job.
4. Well, wasn't this a "newbie" guide? Adding ffdshow to the filter chain would be useless for 99.999% of the users, since they don't even know what video post processing is or how to set it up properly.
5. It doesn't mess up the MPC-HC GUI. The move to exclusive full screen is just fine (does the quick "flash" really bother anyone?). They must have forgotten exclusive mode results in a much smoother playback (which is one of the best features of madVR).

blexley
12th January 2012, 10:13
LOL :p

I spent a week going around in circles trying to find an easy step by step install guide and I've just found it at the very page I first started at, nevcairiel's own homepage. :D :D :D

http://1f0.de/lav-cuvid/guide/

nevcairiel is there any chance you could make this guide more obvious in the first post of this thread and at the top of your webpage, so that newbies and people with low computer skills such as myself can avoid the obnoxious comments from users when asking for guidance, as if LAV filters were somehow for the elite and you needed to search and read about codecs and splitters before you can use it? All we need to know is what to adjust and tick/untick so we can just get on and use it, and your guide does that perfectly.

Plus the more accessible it is and the quicker we can get going the quicker newbies can hit the paypal donate. :)

Thanks

madshi
12th January 2012, 10:15
We'll see if i can find a better way to lock and retrieve the video frame. I didn't even optimize the memory copying yet because all the time is spent in LockRect, not copying the data.
I must be doing something wrong still, because even on my NVIDIA the performance isn't close to what i would expect. :)

Does anyone have any experience with copying the data from the GPU?
Right now i'm just LockRect'ing the DXVA surface and copying the data over - but the LockRect can quite frequently take a long time. I read that an alternative is using GetRenderTargetData to copy the data to system memory, and because that method is async in the background, i could potentially gain some benefits from it?
Using LockRect is the same method I'm using in madNV12Test. GetRenderTargetData() is what madNV12Test lists as "trick download" and it usually fails for NV12 surfaces for me simply because it would require me to create an NV12 surface in system memory and neither CreateOffscreenPlainSurface() nor IDirectXVideoAccelerationService::CreateSurface() seem to be willing to do that on my PC. FWIW, with NVidia I'm getting higher download speeds by allocating the NV12 decoding surface with IDirectXVideoAccelerationService::CreateSurface() instead of CreateOffscreenPlainSurface().

nevcairiel
12th January 2012, 10:21
FWIW, with NVidia I'm getting higher download speeds by allocating the NV12 decoding surface with IDirectXVideoAccelerationService::CreateSurface() instead of CreateOffscreenPlainSurface().

That's what i'm using, well technically IDirectXVideoDecoderService, but that one just inherits from IDirectXVideoAccelerationService. The copy speed is fine, it's just the LockRect operation that takes its time.
The weird thing is that the locking speed is not consistent, it's going like 20ms, 1ms, 1ms, 20ms, 1ms, 1ms .. as if it's still decoding a frame there and waiting.
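
A pattern like this can be measured by timing each lock call on its own. Below is a minimal sketch of such a measurement, assuming nothing about the decoder itself: `TimeMs` and `AverageMs` are hypothetical helper names, and the operation being timed is whatever callable gets passed in (in the real case that would be the LockRect call).

```cpp
#include <cassert>
#include <chrono>
#include <numeric>
#include <vector>

// Hypothetical helper: runs `op` once and returns the elapsed wall-clock
// time in milliseconds, measured with a monotonic clock.
template <typename Op>
double TimeMs(Op&& op) {
    const auto start = std::chrono::steady_clock::now();
    op();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical helper: arithmetic mean of a set of per-call timings.
double AverageMs(const std::vector<double>& samples) {
    assert(!samples.empty());
    return std::accumulate(samples.begin(), samples.end(), 0.0) /
           static_cast<double>(samples.size());
}
```

Feeding the 20ms, 1ms, 1ms pattern into `AverageMs` gives roughly 7.3ms per lock, so the average alone would hide the periodic stall. Logging the individual samples is what makes it visible.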

I'll see about running more tests later.

cengizhan
12th January 2012, 10:29
nevcairiel,

i have an ATI 6850. new dxva works ok with mpc-hc but there is a problem with smartdvb:

http://postimage.org/image/93aj13v8r/

nevcairiel
12th January 2012, 10:36
nevcairiel is there any chance you could make this guide more obvious in the first post of this thread

No, because the guide is outdated, and LAV CUVID has been deprecated.

blexley
12th January 2012, 10:45
No, because the guide is outdated, and LAV CUVID has been deprecated.

If we are not supposed to use that guide then will you be making a new one for MPC-HC and LAV filters when you have time. ?

Thanks

nevcairiel
12th January 2012, 11:03
If we are not supposed to use that guide then will you be making a new one for MPC-HC and LAV filters when you have time. ?


I write code, not guides.

blexley
12th January 2012, 11:09
Then why have the guide on your webpage if it's beneath you as a coder. ?

What is it with the abrasive rude attitude towards newbies in this thread , do you want people to use your filter and to donate money or not. ?

madshi
12th January 2012, 11:22
@blexley, the reason people are being rude to you has nothing to do with you being a newbie. Rudeness in these forums is usually just a reaction to annoying posts/behaviour. That said, I don't see nevcairiel's post as rude at all. He's just clearly saying that he's not interested in writing guides. What's rude about that? Nothing. He's just stating his preferences. However, insinuating that nevcairiel would feel that something were "beneath him" is quite rude in itself. So from my point of view, you're the one being rude here. And furthermore you're being rude to a developer who spends LOADS of his time to create free software for you. You should reconsider your style of posting if you don't want to end up on the ignore lists of all the most active developers in this forum.

blexley
12th January 2012, 11:43
What you have just posted is quite frankly ridiculous and illogical as how is a newbie supposed to enjoy all this so called time spent creating free software if there isn't a guide on how to use it correctly let alone donate money. ?

Are people supposed to be grateful and donate money for something they haven't even used yet and can't use until they know what exact settings to alter. ?


.

Koepi
12th January 2012, 11:57
blexley,

if a coder just wants to code, just let him do that. Website maintenance isn't real fun either, and as a starting point the old guide may be still of use for someone.

As you are that demanding - why don't _you_, who got 'a free lunch', take the time and write a guide and offer that?

Demanding is simple (and annoying, I can tell). Doing the real work however, is much more satisfying for everyone. So instead of just demanding updates of documentation and website, write that guide. Which will bring you in a much better position to demand anything then, by the way.

nevcairiel
12th January 2012, 11:58
You seem to think we all do this so people donate money, as you keep bringing up that point. But you're dead wrong.

I do this because i consider working on it fun.
Because of that, i only do things that are fun for me, and writing guides isn't one of these things, and dealing with obnoxious people isn't fun either - so i just ignore them.

I only put up the donation button because people asked if/how they could donate, and who am i to say no if they offer. :)
It always feels good getting a donation, as a recognition for my work, however i would still continue if i never got any donations.

There are plenty guides out there, and since using it "correctly" is rather subjective and depends on your goals, just pick one that suits your needs and go with it.
Just don't copy/paste some random guide into this thread or ask people to hand-hold you through the setup, this isn't the place for that.

Also, if you're such a "newbie", then use a codec pack, there are plenty of them that come with my filters, pre-configured with all the settings you need.

blexley
12th January 2012, 12:09
blexley,

if a coder just wants to code, just let him do that. Website maintenance isn't real fun either, and as a starting point the old guide may be still of use for someone.

As you are that demanding - why don't _you_, who got 'a free lunch', take the time and write a guide and offer that?

Demanding is simple (and annoying, I can tell). Doing the real work however, is much more satisfying for everyone. So instead of just demanding updates of documentation and website, write that guide. Which will bring you in a much better position to demand anything then, by the way.

Is this some sort of joke. ?

How is somebody with low computer skills, who can't work out what settings are supposed to be altered in MPC-HC or whether i'm supposed to run MadVR as well, supposed to write the very step by step guide that i'm asking for. ?



There are plenty guides out there, and since using it "correctly" is rather subjective and depends on your goals, just pick one that suits your needs and go with it. Just don't copy/paste some random guide into this thread or ask people to hand-hold you through the setup, this isn't the place for that.

Also, if you're such a "newbie", then use a codec pack, there are plenty of them that come with my filters, pre-configured with all the settings you need.

So you say use a guide without actually pointing to one and the one i do paste up and ask if it is accurate and can be used is hand holding. ?

The ridiculous logic and reasoning and contradictions to these posts instead of just helping is beyond belief.

entrecour
12th January 2012, 12:20
blexley, Here is IMHO a pretty comprehensive guide

Watching H.264 videos using Compute Unified Device Architecture (CUDA) (http://imouto.my/watching-h264-videos-using-compute-unified-device-architecture-cuda/)

To simplify the setup further you can easily leave out Haali, madFlac, and ffdshow unless you have a specific need for them.

blexley
12th January 2012, 12:24
Thanks entrecour :thanks:

You sir are a gent. :)

PeQuE
12th January 2012, 12:33
nev, how is lav video decoder designed to handle acceleration of vc-1 on Nvidia cards that support only VDPAU feature set A. My GTX 295 only supports Set A which according to wikipedia is as follows

Feature Set A
Complete acceleration for H.264
Partial acceleration for MPEG-1, MPEG-2, VC-1/WMV9

When I watch vc-1, with cuda enabled all is fine for the most part, until certain scenes where the image will appear blocky for a second or so.

Of course I would love to leave cuda on for vc1 so in my case should I manually disable it, or is there anything that can be done from the lav video side to allow me to use this "partial acceleration" and still see a valid image? Cheers

Its a bug in the driver, seems to have appeared a while ago. You could try different drivers, otherwise there isn't anything that can be done.

258.96 is the last ver to work right. And I think its actually hardware deinterlacing related. cause if i set CUVID to weave with newer drivers the blocking goes away.

It still is somehow related to partial acceleration, because it doesn't happen on cards with full acceleration capabilities. It might just be coincidence and depend on the series of the card, but somewhere is another factor in there. :)

I quote some messages that I consider have something to do with what I want to say...

Yesterday I upgraded my nVidia GT240 drivers from 258.96 to 285.62... Well, these drivers have broken in some way (framedrops, irregular fps and render times) the CUDA hardware acceleration of H264 interlaced material (HD livetv mainly).

- Interlaced MPEG2 is ok.
- Progressive H264 is ok.
- Switched Lav Video (0.44) to Cyberlink PowerDVD 11 H264 (so no CUDA), ok.
- Downgrade to Lav Video 0.43, still nok.
- Downgrade nVidia drivers to 280.26, ok (but I would say not as stable decoding as 258.96 anyway...)

ty, you're right. If anyone's interested, here are some nvidia drivers I tried, showing vc1 decoding was broken somewhere between 280.36 and 285.27 on these cards at least. (I tried using the working cuda related dlls with the newer drivers but that doesn't seem to work very well :P )

Nvidia Driver - Cuda dll version - Status
285.62 - 8.17.12.8562 - broken
285.38 - 8.17.12.8538 - broken
285.27 - 8.17.12.8527 - broken
280.36 - 8.17.12.8036 - working
280.26 - 8.17.12.8026 - working

Definitely, 285.xx series drivers have some problems...

Nev, I'm not asking you to do anything, as this is clearly driver related. I only want to leave this comment so that anyone else who runs into the same thing knows where the problem is.


bye!

madshi
12th January 2012, 12:45
Finally I can put this forum's ignore list feature to good use.

Alabanda
12th January 2012, 12:59
nevcairiel, you're awesome! Finally, the possibility of having quality hardware decoding in my HTPC-intended setup. I'm running an AMD Fusion board, an E350.

(pastebin was under heavy load) http://dl.dropbox.com/u/26229491/LAVVideo.txt

Again, thanks a lot!

SamuriHL
12th January 2012, 13:44
Finally I can put this forum's ignore list feature to good use.

Seriously unreal. I honestly don't understand people sometimes. Just know that the majority of us appreciate what you guys do.


VipZ
12th January 2012, 13:48
Cool! Would you mind running this test, too?

http://madshi.net/madNV12Test.zip

Will do when I get home.

Bit OT, but may be useful for others to know. HDMI bitstreaming on the 7970 negotiates pretty much instantly for all formats, where it used to take up to 3 secs on my 5850. And on pausing it doesn't drop the stream immediately but keeps it showing as no channels on the AVR for around 2 secs before dropping.

STaRGaZeR
12th January 2012, 15:41
DXVA? Sounds nice :D

Let's hope drivers don't ruin the party.

CruNcher
12th January 2012, 15:50
Now a real DXVA (GPU independent, no frame copy) and DXVA support for MadVR is left ;)

goldie
12th January 2012, 16:38
Sorry for my late feedback. :p
LAVFilters 0.44 debug log for ATI Radeon HD 6850 with Catalyst 11.12 is here.
http://pastebin.com/RgdMrTcH

Execution environment:
Win 7 Pro 64bit
MPC-HC 1.5.3.3958 (32bit)
madVR 0.80

Playing an H264 L4.2 1080p60 clip was confirmed to be not so smooth,
but I'm so excited at the news about hw decoding for ATI users!
Good job, and :thanks: nev.

CruNcher
12th January 2012, 16:51
could you test vs the Performance of Cyberlink HAM it's the same :)

goldie
12th January 2012, 17:15
I tested the Cyberlink HAM video decoder before,
but I think it always fell back to software decoding.
(I saw that the CPU utilization didn't come down in HAM mode compared to software mode.)
So I uninstalled it a few months ago,
maybe I don't know how to set it up properly. :o

CruNcher
12th January 2012, 17:26
hmm, does PotPlayer's copy decoder work then ?

Not Ati but ;)

1080p @60 fps

IMSDK = 60 fps
Nev DXVA2 = 40 fps

Not that bad for the start :)

nevcairiel
12th January 2012, 17:30
Good News everyone!

I managed to increase the performance quite a bit, and it's now close to the CUVID decoder on my system.
Just to figure out why one of the first frames is always corrupted now... :)

madshi
12th January 2012, 17:37
I managed to increase the performance quite a bit
How? (If I may ask...) :D

fastplayer
12th January 2012, 17:38
When it comes to DXVA, always assume it's the driver's fault! :D

nevcairiel
12th January 2012, 17:40
How? (If I may ask...) :D

I remembered something which i did for the CUVID decoder waaaay back.

I just store the surface in a queue, and only if the queue is full, i start processing them, one at a time. This gives the GPU time to finish rendering to the surface before i access it. A queue of 2 frames makes all the difference, just a tiny bit of delay to give the GPU some breathing room.
Speed is now 99% that of CUVID. :)
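
The queueing idea above can be sketched in a few lines. This is an illustration only, not the code from LAV Video: `DelayQueue`, `push` and `flush` are made-up names, and the surface type is left as a template parameter so the sketch stays independent of Direct3D. A surface only comes back out once `depth` newer ones have been queued behind it, which is the "tiny bit of delay" that lets the GPU finish before the surface gets locked.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical helper (not LAV Filters code): delays access to decoded
// surfaces by `depth` frames so that the GPU has time to finish writing a
// surface before the CPU locks it.
template <typename Surface>
class DelayQueue {
public:
    explicit DelayQueue(std::size_t depth) : depth_(depth) {}

    // Queues a freshly decoded surface. While the queue is still filling up,
    // returns false. Once it is full, moves the oldest pending surface into
    // `ready` and returns true.
    bool push(Surface decoded, Surface& ready) {
        pending_.push_back(std::move(decoded));
        if (pending_.size() <= depth_) return false;
        ready = std::move(pending_.front());
        pending_.pop_front();
        return true;
    }

    // Drains one pending surface, oldest first (end of stream or seek).
    // Returns false when nothing is left.
    bool flush(Surface& ready) {
        if (pending_.empty()) return false;
        ready = std::move(pending_.front());
        pending_.pop_front();
        return true;
    }

private:
    std::size_t depth_;
    std::deque<Surface> pending_;
};
```

With a depth of 2, pushing frames 1, 2, 3 hands back nothing for the first two calls and frame 1 on the third. The cost is two frames of extra latency, plus the need to flush the queue at the end of the stream so the last frames are not lost.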

SamuriHL
12th January 2012, 17:46
I remembered something which i did for the CUVID decoder waaaay back.

I just store the surface in a queue, and only if the queue is full, i start processing them, one at a time. This gives the GPU time to finish rendering to the surface before i access it. A queue of 2 frames makes all the difference, just a tiny bit of delay to give the GPU some breathing room.
Speed is now 99% that of CUVID. :)

Well done!!

madshi
12th January 2012, 17:50
I remembered something which i did for the CUVID decoder waaaay back.

I just store the surface in a queue, and only if the queue is full, i start processing them, one at a time. This gives the GPU time to finish rendering to the surface before i access it. A queue of 2 frames makes all the difference, just a tiny bit of delay to give the GPU some breathing room.
Speed is now 99% that of CUVID. :)
So how many FPS do you get with ATI cards? I'm wondering why I got so miserable results with my madNV12Test tool. All I did there was LockRect + memcpy in a loop, with no real decoding going on...

nevcairiel
12th January 2012, 17:51
So how many FPS do you get with ATI cards? I'm wondering why I got so miserable results with my madNV12Test tool. All I did there was LockRect + memcpy in a loop, with no real decoding going on...

I have no ATI card, but when i post a first real test version, i'm sure people will be glad to benchmark it.

SamuriHL
12th January 2012, 17:54
I'll be able to test it later for sure. Very interested in this.

madshi
12th January 2012, 17:56
Ah sorry, misunderstood. Thought you had 99% of CUVID performance with ATI DXVA!

SamuriHL
12th January 2012, 17:57
He might. :D That's what we need to test. ;)

goldie
12th January 2012, 18:00
I'll be able to test it later for sure. Very interested in this.
+1 :thanks:

CruNcher
12th January 2012, 18:09
@ nev
Ehh, how does it compare with the IMSDK layer (Quicksync Decoder overhead)? Do you think that one isn't really needed anymore if this runs generically fine on every GPU? :) Or did you mean that you currently reach the same speed as CUVID (on your Nvidia VPx), which would still be less than with Intel's MSDK DXVA2 implementation?

nevcairiel
12th January 2012, 18:25
Intel has some issues with the "generic" DXVA2, i would have to figure out what settings exactly it needs to run properly - so instead just use the MSDK? :)

SamuriHL
12th January 2012, 18:27
Is this stuff checked in to the repository yet? If so I can start building it and take a look.

nevcairiel
12th January 2012, 18:36
Ok, here is a more polished version without debugging.

http://files.1f0.de/lavf/LAVFilters-0.44-dxva2-perftest.zip

It would be great if some people could benchmark this using the following clip:

http://xhmikosr.1f0.de/samples/2160p/CrowdRun/CrowdRun_1080p50.x264.CRF23.mkv

I would recommend to benchmark with GraphStudio, mostly because its so easy. ;)
It would also be great if you could mention which kind of CPU you're running, so i know which kind of memory copy is being used.

As a reference, my NVIDIA with a VP4 decoder does around 73fps, both in DXVA2 and CUVID.
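
To put a number like that in perspective, the amount of data copied back from the GPU can be estimated with a bit of arithmetic. This is a rough sketch under stated assumptions: the decoder outputs NV12 (12 bits per pixel), and the frame is exactly 1920x1080 with no padding, whereas real decoder surfaces may be allocated larger. The function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// NV12 stores a full-resolution 8-bit luma plane plus a half-resolution
// interleaved chroma plane, i.e. 1.5 bytes per pixel.
std::uint64_t NV12FrameBytes(std::uint64_t width, std::uint64_t height) {
    assert(width % 2 == 0 && height % 2 == 0);
    return width * height * 3 / 2;
}

// Bytes that have to be copied to system memory per second at a given
// frame rate, ignoring any row padding on the surface.
std::uint64_t CopyBackBytesPerSecond(std::uint64_t width, std::uint64_t height,
                                     std::uint64_t fps) {
    return NV12FrameBytes(width, height) * fps;
}
```

Under those assumptions a 1920x1080 frame is 3,110,400 bytes, so 73fps works out to about 227 MB/s of copy-back traffic.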

Disclaimer:
- This version is generally not all that much tested, and might blow up (and possibly take your PC with it)
- DXVA2 is a Vista/7 tech, don't expect it to work on XP
- Fallback to software decoding is still unfinished, and will most likely crash.
- Seeking in VC-1 still is somewhat rough, H264 seems to work better, however.
- VC-1 interlaced decoding is not yet implemented (if i can pull it off to finish it)

hoborg
12th January 2012, 18:48
Radeon HD 6750: (it stutters a little)
http://hobring.esero.net/saf/lavf/dxva2_perf.png

Sebastiii
12th January 2012, 18:48
Amazing :) Thanks

SamuriHL
12th January 2012, 18:59
@hoborg....what is that? And where can I find it? :)

nevcairiel
12th January 2012, 19:00
I would guess that's GraphStudio Next (http://code.google.com/p/graph-studio-next/)

CruNcher
12th January 2012, 19:00
Yup it stresses User as well as Kernel time a little more and thus isn't as efficient as the Quicksync Implementation on Intel Hardware but not that bad @ all for 1080p 60 fps the difference was 2%/4%/12%/17%/25% (includes render overhead) for Cyberlink and Arcsoft DXVA/CoreAVC DXVA/Intel(Egur/Nev) DXVA2/Generic DXVA2(Nev)/Libav(Nev) @ playback (not raw benchmark performance) :)

sneaker_ger
12th January 2012, 19:06
Radeon HD 5850
~50.3 fps