Old 31st January 2014, 02:57   #22221  |  Link
asmo42
Registered User
 
Join Date: Dec 2012
Posts: 5
Quote:
Originally Posted by madshi View Post
You probably only have enabled NNEDI3 doubling for luma, right? In that case the chroma channels are upscaled using the "image upscaling" algorithm. That's the explanation for what you're seeing.
You're right, that's it. I was under the impression that if I did luma doubling beyond the output resolution, the "chroma upscaling" setting would control all the chroma upscaling and the "image upscaling" setting wouldn't have any effect. But I realize now that's not how it works. After re-reading the thread I saw the quote below, which explains everything. Sorry for taking up your time.

Quote:
Originally Posted by madshi View Post
The "chroma upscaling" settings are only for 4:2:0/4:2:2 to 4:4:4 conversion, for nothing else. Once 4:4:4 is reached, all other scaling operations are decided by "image doubling/upscaling/downscaling" options.
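To make the split in the quote above concrete, here is a minimal sketch of the chroma plane sizes for each subsampling format. The function and table names are illustrative assumptions and not madVR code; the ratios are the standard definitions of 4:2:0, 4:2:2 and 4:4:4.

```python
# (horizontal divisor, vertical divisor) of the chroma planes relative to luma
SUBSAMPLING = {
    "4:2:0": (2, 2),  # half width, half height
    "4:2:2": (2, 1),  # half width, full height
    "4:4:4": (1, 1),  # same size as luma
}


def chroma_plane_size(width: int, height: int, fmt: str) -> tuple:
    """Return the (width, height) of each chroma plane for a given luma size."""
    dx, dy = SUBSAMPLING[fmt]
    return (width // dx, height // dy)


if __name__ == "__main__":
    for fmt in SUBSAMPLING:
        print(fmt, chroma_plane_size(1920, 1080, fmt))
```

For a 1920x1080 4:2:0 source the chroma planes are 960x540, so the "chroma upscaling" step only has to bring them up to 1920x1080. Anything beyond that point (doubling, upscaling, downscaling) is handled by the image scaling settings, as the quote says.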
Old 31st January 2014, 02:57   #22222  |  Link
sajara
Registered User
 
Join Date: Jan 2013
Posts: 18
Quote:
Originally Posted by leeperry View Post
Madshi said that it wasn't all that demanding on his 7770(or was it 7790?) and I can confirm that it's a breeze on my factory overclocked 7850.
Quote:
Originally Posted by DragonQ View Post
Just because it's not that demanding for you on your rig doesn't mean it isn't in general
I can only say that with my test video H264 720x404 25fps with HD5730M:

upscale settings
chroma: lanczos 3 taps
image: jinc 3 taps
debanding: low
fade: high

laptop monitor res:1366x768@60hz
FSE: on
smooth motion: off

random dithering: rendering 13.84ms/gpu 43%
OpenCL error diffusion: rendering 28.36ms / gpu 70%

I think it scales very nicely for a low end mobile GPU upscaling to a bit over HD res.

One thing I must note is that when I moved from Win 7 to 8 I noticed vast improvements on seeking times with lav+madvr and much better performance switching from FSE to desktop. Maybe and just maybe that can make a difference in overall performance.
Old 31st January 2014, 03:25   #22223  |  Link
JarrettH
Registered User
 
Join Date: Aug 2004
Posts: 838
Is it worth it to have debanding on the low setting all the time? I don't watch animation very much, just good quality films. I can think of instances where I might have seen banding (light sources, logo fades), but it wasn't that severe

Last edited by JarrettH; 31st January 2014 at 03:31.
Old 31st January 2014, 04:17   #22224  |  Link
cyberbeing
Broadband Junkie
 
Join Date: Oct 2005
Posts: 1,859
When playing around with the NNEDI3 doubling on 720p anime viewed at 1080p, I came to the conclusion that 32 neurons is not worthwhile. You don't get a big step-up in edge quality from 16 neurons until you use 64 neurons.

Quote:
Originally Posted by madshi View Post
That said, I recommend not to go lower than 32 neurons for luma upscaling, because 16 neurons leaves too many artifacts in the image for my taste with *some* videos.
....
I wouldn't go lower than 16 for doubling, unless your GPU absolutely can't handle 32 neurons and you still prefer NNEDI3 with 16 neurons over Jinc/Lanczos.
But this interests me. Could you expand on the *some* videos which have artifacts using NNEDI3 16 neurons? Better yet, take a PNG screenshot of the source video at original resolution so I can test myself in madVR.

Do source resolution and quality matter? Have you actually seen these artifacts doing 1280x720 -> 2560x1440 or 1920x1080 -> 3840x2160 scaling?

Quote:
Originally Posted by madshi View Post
In my tests I've found little reason to enable NNEDI3 for the chroma channels. So I would recommend to leave the right side of the image doubling settings page unchecked.
My testing agrees with this. NNEDI3 image doubling for chroma doesn't seem to be worth it, at least until you've hit at least 64 neurons for luma doubling.

One addition though. The 'chroma upscaling' setting for NNEDI3 does have a noticeable impact on chroma quality, especially when dealing with DVD resolution content and below. I'd consider it worthwhile for at least SD content if you can afford it.

Last edited by cyberbeing; 31st January 2014 at 04:20.
Old 31st January 2014, 04:32   #22225  |  Link
XMonarchY
Registered User
 
Join Date: Jan 2014
Posts: 489
Quote:
Originally Posted by cyberbeing View Post
No, it's an added cost.
Open GPU-Z. If OpenCL is not checked, then you've broken OpenCL support in the driver.
You mean using a previous version of nvopencl.dll with newer drivers will leave the OpenCL box un-ticked, right? Because with the 344.67 drivers and the 344.67 nvopencl.dll files my box IS ticked, but the screen goes black.

I wonder if it's possible to edit that file verification you are talking about. Just how deeply are OpenCL drivers integrated with regular drivers? I'm sure with some time and patience it's possible to force drivers to use the older nvopencl.dll files...
Old 31st January 2014, 04:33   #22226  |  Link
Audionut
Registered User
 
Join Date: Nov 2003
Posts: 1,264
Well I was getting by with just an Intel i7 2600k with onboard GPU, until all these new features turned up.
Turns out I was actually getting the odd frame drop with none of the quality stuff turned on, resizing all being done as bilinear and using a 3dlut as described here: http://www.avsforum.com/t/1471169/madvr-argyllcms
I was even using a heap of the quality tradeoffs.

Bit the bullet and got an HD 7790.

Now I have none of the quality tradeoffs enabled, debanding enabled, NNEDI3 (32 neurons) chroma upscaling and it purrs like a kitten. This is in 23.976 playback as I prefer the odd motion judder to the current smooth motion algorithms.

One thing I am not entirely certain of, the rendering times drop when I output 16bit 4:4:4 from LAV, I assume this is because LAV is doing some of the processing? I also assume it's best to just output 8bit 4:2:0 from LAV and let mVR do all the grunt work?

edit: Ok seems better to output 4:2:0 from LAV and give control to mVR. What about the bit depth though?
__________________
http://www.7-zip.org/

Last edited by Audionut; 31st January 2014 at 05:07.
Old 31st January 2014, 04:51   #22227  |  Link
cyberbeing
Broadband Junkie
 
Join Date: Oct 2005
Posts: 1,859
Quote:
Originally Posted by XMonarchY View Post
You mean using a previous version of nvopencl.dll with newer drivers will leave the OpenCL box un-ticked
Yes, that's correct.

Quote:
Originally Posted by XMonarchY View Post
I wonder if it's possible to edit that file verification you are talking about. Just how deeply are OpenCL drivers integrated with regular drivers? I'm sure with some time and patience it's possible to force drivers to use the older nvopencl.dll files...
Who knows, but considering madshi has stated that OpenCL <-> D3D9 interop fails (black screen) while the OpenCL processing itself is successful, I'd suspect that the Direct3D driver is bugged rather than the OpenCL driver.
Old 31st January 2014, 05:57   #22228  |  Link
The 8472
Registered User
 
Join Date: Jan 2014
Posts: 51
Quote:
Originally Posted by madshi View Post
That's probably due to the combination of using Smooth Motion FRC and Error diffusion. Using both means that Smooth Motion FRC increases the cost of Error diffusion, because Error diffusion must be applied to every output frame, and Smooth Motion FRC increases the output frames from 24fps to nearly 60fps. So basically Smooth Motion FRC means that the cost of Error diffusion increases by a factor of almost 2.5x.
Yeah, I was just pointing out that high target resolutions + SM can make this quite expensive, since you seemed surprised by someone mentioning the high performance impact.
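The arithmetic behind the "almost 2.5x" figure in the quote can be sketched as follows. This is a back-of-the-envelope model that only assumes error diffusion runs once per output frame, as madshi describes; the function names and the 8 ms per-frame cost in the usage note are hypothetical placeholders.

```python
def output_frames_factor(source_fps: float, display_hz: float) -> float:
    """How many output frames are produced per source frame with Smooth Motion on."""
    return display_hz / source_fps


def dither_ms_per_second(cost_ms_per_frame: float, output_fps: float) -> float:
    """Total GPU time (ms) spent on dithering per second of playback."""
    return cost_ms_per_frame * output_fps


if __name__ == "__main__":
    print(output_frames_factor(24, 60))    # cost multiplier with Smooth Motion
    print(dither_ms_per_second(8.0, 24))   # Smooth Motion off, 24 Hz output
    print(dither_ms_per_second(8.0, 60))   # Smooth Motion on, 60 Hz output
```

With a hypothetical 8 ms per dithered frame, 24 Hz output spends 192 ms of every second on dithering while 60 Hz output spends 480 ms. A higher target resolution raises the per-frame cost on top of that, which is why the combination gets expensive quickly.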


Quote:
I don't think it's as high quality as error diffusion?
Correct, I meant it as middle ground between random and error diffusion.

Quote:
Currently I'm doing Error diffusion in RGB. Maybe doing it in YCbCr would produce nicer results
Based on some toying around with imagemagick I think doing color reductions in L*a*b* or similar linear color spaces gives visually better results, but doing the dithering before non-linear colorspace transforms is difficult to get right.

Quote:
I'm not sure. Other than that I don't have an algorithm which would spread errors over multiple color channels. All the algorithms I know strictly work on one channel at a time, only.
The find-closest-color step of the algorithm can be done based on colorspace-aware distance metrics, but that requires iterative searches. That's probably prohibitively expensive to do in realtime.
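For readers unfamiliar with the technique, here is a minimal single-channel sketch using the classic Floyd-Steinberg kernel. It is not madVR's implementation, which the thread does not describe; the function name, the list-of-rows image layout and the `levels` parameter are assumptions made for illustration. The "find-closest-color" step is the rounding line, and in this one-channel form it is a simple nearest-level lookup rather than a color-space-aware search.

```python
def floyd_steinberg(channel, levels=4):
    """Quantize one channel (rows of floats in 0..1) to `levels` evenly spaced values."""
    height = len(channel)
    width = len(channel[0])
    work = [list(row) for row in channel]        # working copy, input is left untouched
    out = [[0.0] * width for _ in range(height)]
    step = 1.0 / (levels - 1)
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            # find-closest-color step: nearest representable level, clamped to range
            new = min(max(round(old / step), 0), levels - 1) * step
            out[y][x] = new
            err = old - new
            # push the quantization error onto pixels that are not processed yet
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


if __name__ == "__main__":
    gray = [[0.5] * 8 for _ in range(4)]
    for row in floyd_steinberg(gray, levels=2):
        print("".join("#" if v else "." for v in row))
```

Because each pixel depends on the error left behind by its already processed neighbors, the loop is inherently sequential within a channel. Swapping the rounding line for a perceptual nearest-color search over all three channels is the step described above as probably too expensive for realtime use.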
Old 31st January 2014, 06:38   #22229  |  Link
turbojet
Registered User
 
Join Date: May 2008
Posts: 1,840
madshi: I now know that SM doesn't work well in window mode, although it's worked fine up until now. I'm trying to set up a profile that uses overlay only with PotPlayer and window mode otherwise. I've put if (mediaPlayer = "F:\Tools\PotPlayer\PotPlayerMini.exe") "overlay" else "window" but it always chooses window. I also tried "PotPlayermini.exe" and "Potplayer". Any ideas?

Does your GTX 650 drop frames when using nnedi3 from 720p to 1080? It does here but can't figure out why, gpu load is fine and SD -> 1080 works fine with about the same gpu load.

Andrey /MAG/: you need to downgrade to 327 drivers.
__________________
PC: FX-8320 GTS250 HTPC: G1610 GTX650
PotPlayer/MPC-BE LAVFilters MadVR-Bicubic75AR/Lanczos4AR/Lanczos4AR LumaSharpen -Strength0.9-Pattern3-Clamp0.1-OffsetBias2.0
Old 31st January 2014, 07:19   #22230  |  Link
omarank
Registered User
 
Join Date: Nov 2011
Posts: 180
Quote:
Originally Posted by madshi View Post
Are we talking forced film mode? With forced film mode disabled, deintFps will be around 50.0 for PAL content.
Although I have set forced film mode, that doesn't seem to be related to my problem. Here I am talking about a profiling script for smooth motion. Yes, with DXVA deinterlacing, deintFps will be 50.0 for PAL content, but for 50 fps content I would like to enable smooth motion. I don't want it to be enabled for still images, which LAV outputs as 25 fps content.

As you said that you couldn't confirm the bug which I reported, I wrote this script which can perhaps let you reproduce the bug with smooth motion profiling.

Quote:
Originally Posted by madshi View Post
The algorithm itself seems to be quite fast. About 2.5ms per 1080p frame on my HD7770. The majority of the cost comes from using OpenCL <-> D3D9 interop, at least when using AMD. On Intel GPUs interop is almost free, but the Intel GPU itself is rather slow with OpenCL. I don't know about NVidia.
Now I have a very strong urge to make this request to you even though you have stated that you would be sticking to OpenCL for all compute related things. Could you please consider doing an alternative CUDA implementation *ONLY* for error diffusion (for the rest of the things OpenCL is fine)? With CUDA implementation, perhaps nvidia users with moderate GPU can also use error diffusion.
Old 31st January 2014, 07:24   #22231  |  Link
drew_afx
Registered User
 
Join Date: Dec 2012
Posts: 12
Error Diffusion test

source is 1080p 23.976fps
no hardware decoding (avcodec active in Lav Video Decoder), outputs 8bit NV12 4:2:0
no processing
chroma upscaling: Jinc3/AR
no image doubling
image upscaling: Jinc3/AR
image downscaling: Catmull-Rom/AR,LL
delay playback start until render queue is full, everything else unchecked
CPU queue size: 8
GPU queue size: 8
windowed mode backbuffers: 4
no smooth motion

GPU used is GeForce 650 Ti Boost GDDR5 on PCI-E 3.0x16
Clock speed Core/Memory 1032MHz(1136MHz @boost) / 1502MHz
Quick look at OpenCL benchmark results:
Benchmark            650 Ti Boost     Radeon HD 7770
Luxmark              334 score        789 score
RatGPU               73.20 sec        101.37 sec
Bitmining            55.8 MHash/s     170.6 MHash/s
GPU Caps - PostFX    93 FPS           145 FPS

read avg rendering time when the value was stable for about 10sec

<540p> - 518400 pixels
6.95ms w/o opencl error diffusion
9.35ms with opencl error diffusion

2.40ms diff

4.63ns per pixel

<1080p> - 2073600 pixels
5.53ms w/o opencl error diffusion
13.45ms with opencl error diffusion

7.92ms diff

3.82ns per pixel

<2160p> - 8294400 pixels
23.77ms w/o opencl error diffusion
50.86ms with opencl error diffusion

27.09ms diff

3.266ns per pixel
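The per-pixel figures above can be re-derived from the measured rendering times. The sketch below only uses the numbers posted here; the function and dictionary names are illustrative.

```python
# resolution label -> (pixels, ms without error diffusion, ms with error diffusion)
MEASUREMENTS = {
    "540p": (518400, 6.95, 9.35),
    "1080p": (2073600, 5.53, 13.45),
    "2160p": (8294400, 23.77, 50.86),
}


def ns_per_pixel(pixels: int, without_ms: float, with_ms: float) -> float:
    """Added error diffusion cost per output pixel, in nanoseconds."""
    diff_ms = with_ms - without_ms
    return diff_ms * 1e6 / pixels  # 1 ms = 1e6 ns


if __name__ == "__main__":
    for label, (pixels, off_ms, on_ms) in MEASUREMENTS.items():
        print(f"{label}: {on_ms - off_ms:.2f} ms diff, "
              f"{ns_per_pixel(pixels, off_ms, on_ms):.3f} ns per pixel")
```

The per-pixel cost falls from roughly 4.6 ns at 540p to roughly 3.3 ns at 2160p, which would be consistent with a fixed per-frame overhead being spread over more pixels, although these three samples alone cannot confirm the cause.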
Old 31st January 2014, 07:59   #22232  |  Link
The 8472
Registered User
 
Join Date: Jan 2014
Posts: 51
Quote:
Originally Posted by Farfie View Post
http://puu.sh/6EgZs.jpg

This is without error diffusion.

Something tells me AMD performance isn't nearly this bad. I don't see how I could ever do 4k upscale like this at anything above 16 neurons. Is this expected? I'd be willing to bet, from the reports in this thread, that AMD performance is much better.
Not trying to complain at all, but I love NNEDI3 and I'd also love to always use it for upscaling at at least 32 neurons, and I feel like a 680 should be able to do so much more.
1080p30 -(nnedi)-> 2160p30 -(downscaler)-> 1440p30 actually is more work than direct doubling to 4k.

You could pick nearest neighbor as downscaler to get an approximate idea how it would perform on a 4k screen. You also have to keep in mind that it first needs to do a 4:4:4 conversion and then upscale the chroma separately too. That's lots of data shuffling involved there.

I can get nnedi with 32 neurons and 1080p24 -> 4k -> 1440p24 to work on my GTX670 - barely so - by bumping up the rendering queues and skimping on all the other scaling algorithms.

Next-generation cards might be able to handle 4k scaling reasonably well. And the generation after that maybe for HFR content.

That or pester madshi to get SLI working / distribute workload among devices.
Old 31st January 2014, 08:12   #22233  |  Link
baii
Registered User
 
Join Date: Dec 2011
Posts: 180
For 720p and 1080p content on a 1440p monitor, is nnedi3 better ("sharper" w/o unwanted artifacts) than jinc3 in most situations?

Considering a switch to AMD if so, as AMD performs 2-3x better at the same price point according to the benchmark results from the nnedi3 opencl thread.

Last edited by baii; 31st January 2014 at 08:19.
Old 31st January 2014, 08:23   #22234  |  Link
XMonarchY
Registered User
 
Join Date: Jan 2014
Posts: 489
Quote:
Originally Posted by cyberbeing View Post
Yes, that's correct.



Who knows, but considering madshi has stated that OpenCL <-> D3D9 interop fails (black screen) while the OpenCL processing itself is successful, I'd suspect that the Direct3D driver is bugged rather than the OpenCL driver.

I think using Direct3D .dll files from a previous set could be quite problematic and break way more things... but just in case - which .dll files control D3D9???

They say nVidia is aware of the issue - is that true? Since when has it been aware of the bug? Any updates on when it will be fixed?
Old 31st January 2014, 08:43   #22235  |  Link
Ver Greeneyes
Registered User
 
Join Date: May 2012
Posts: 445
Hard to say how monolithic Nvidia's driver team is, but someone at Nvidia was informed about the problem about a day before the forum went down, so 3-4 days I'd say.
Old 31st January 2014, 09:30   #22236  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 5,961
Quote:
Originally Posted by Werewolfy View Post
No, some Philips TVs don't even support 4:4:4: http://www.hdtvtest.co.uk/news/phili...1309193320.htm (see the benchmark results)

Have you considered this one? http://www.hdtvtest.co.uk/news/kdl32w653-201310313413.htm It supports 4:4:4 at every refresh rate and it has everything except 3D; I don't know if that's a requirement for you.
3d is a must have and should work with intel, nvidia and amd.

the pf8008 doesn't have pc-mode in its manual and that's not a cheap model...

the pf4508 is the smallest Philips with pc-mode and 3d but i'm not sure anymore...
Old 31st January 2014, 09:48   #22237  |  Link
Gagorian
Registered User
 
Join Date: Jul 2013
Posts: 27
Quote:
Originally Posted by Gagorian View Post
Is it normal that OpenCL error diffusion increases average rendering time 3x? My GPU is a r9 270 (basically a 7870?) and using latest madVR.

I tried playing a few 1080p24 standard x264 blu-ray movies presented at 23.976 Hz.

Average rendering time (using Jinc 3 AR for both Luma and Chroma, Debanding low, all trade quality for performance options except OpenCL error diffusion disabled) was around 8-10 ms. The rendering time is about the same for 720p movies, so scaling for instance is rather cheap even with Jinc 3 AR.

With OpenCL error diffusion the average rendering time was raised to around 28-30 ms.

Should it really be that demanding?
Quote:
Originally Posted by madshi View Post
Is smooth motion on or off? What is your display resolution? On my PC the cost of Error diffusion seems to be quite a bit lower than on yours, even though I only have an 7770. Is your display resolution much higher than 1080p, maybe?
Smooth motion is off (like I said presented at 23.976 Hz) and display resolution is 1080p. Newest catalyst drivers and AMD APP SDK 2.9 installed.
Old 31st January 2014, 10:53   #22238  |  Link
Qaq
AV heretic
 
Join Date: Nov 2009
Posts: 422
I'm thinking of OpenCL GPGPU benchmarks: http://forums.aida64.com/topic/1539-opencl-gpgpu-benchmarks/
Worth it?
Old 31st January 2014, 11:11   #22239  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
Quote:
Originally Posted by Werewolfy View Post
Have you considered this one? http://www.hdtvtest.co.uk/news/kdl32w653-201310313413.htm It supports 4:4:4 at every refresh rate and it has everything except 3D; I don't know if that's a requirement for you.
3D is a key requirement, and it must work with all GPUs, not be an NVidia only solution. How am I supposed to ever add 3D support to madVR if my development monitor can't do 3D with all GPUs?

Quote:
Originally Posted by Farfie View Post
This is without error diffusion.

Something tells me AMD performance isn't nearly this bad. I don't see how I could ever do 4k upscale like this at anything above 16 neurons. Is this expected? I'd be willing to bet, from the reports in this thread, that AMD performance is much better.
Not trying to complain at all, but I love NNEDI3 and I'd also love to always use it for upscaling at at least 32 neurons, and I feel like a 680 should be able to do so much more.
Which scaling algorithms are you using in the image upscaling and image downscaling pages? See my recommendations in the v0.87.4 announcement post. That might allow this to work on your GPU. If all else fails you could check whether maybe 16 neurons still produces a better image than not using NNEDI3 at all.

Quote:
Originally Posted by iSunrise View Post
I tried to narrow it down further, it seems to be related to the deinterlacing/NNEDI3 settings (the clip itself is interlaced) and madVR uses IVTC/deinterlacing for it. Please always test with filmrez.ts.

So, for some reason NNEDI3 triggers that same problem, when deinterlacing is at some specific settings.

Please play around with the following:

1) automatically activate deinterlacing when needed (checked)
1.1) if in doubt, deactivate deinterlacing (checked)
- don't check anything else in the deinterlacing tab -
2) enable NNEDI3 chroma upscaling
3) if you now suddenly check automatic source type in the deinterlacing tab, the bug goes away (grey levels instead of yellow)
4) If there's no bug at this stage, enable/disable NNEDI3 chroma upsampling again
5) If there's no bug at this stage, enable/disable deinterlacing again
I've already ripped out the NVidia GPU in my development PC again. You know, it totally refuses to even acknowledge the presence of my LCD monitor when connecting via DVI, while my Intel and AMD GPUs have no problems with that. FWIW, the NVidia does like my projector when using DVI. I have to use a VGA connection to get an image with the NVidia 650 on my LCD monitor. I can put the NVidia GPU back in, but doing so costs time and effort. So I'd need a 100% sure way to reproduce any specific problem.

So can anybody else with an NVidia GPU confirm/deny iSunrise's problem?

Quote:
Originally Posted by DragonQ View Post
Just because it's not that demanding for you on your rig doesn't mean it isn't in general
Correct.

Quote:
Originally Posted by DragonQ View Post
even with your decent AMD card you can't use it with Smooth Motion, right?
No, that's not right at all. Actually I've just tested the following:

- Blu-Ray 1080p24 playback
- tested on 1080p projector @ 60Hz, 100% view
- tested on 1680x1050 monitor @ 60Hz, downscaled
- NNEDI3 chroma upscaling, 16 neurons
- Catmull-Rom AR image downscaling with linear light
- image debanding with gradient angle detection
- error diffusion
- smooth motion FRC with 24fps @ 60Hz

Basically all "trade quality for performance" options unchecked. Highest possible settings/algorithms everywhere (!!), including debanding and smooth motion FRC. It plays perfectly smooth on both my projector and my LCD monitor. And this is the main purpose of madVR: Playing Blu-Rays perfectly.

Yes, rendering times are quite high, around 40ms actually, so it was really close, but it worked. And yes, using a higher res monitor would increase the cost so much that it would not run smoothly, anymore. So Error Diffusion can't always be used, it does require a decent GPU, and it increases rendering times quite noticeably, but it *can* be used just fine, even with Smooth Motion turned on, plus NNEDI3 chroma upscaling, debanding and all the bells and whistles, with a mid-range GPU like my HD7770.

FYI, when switching my projector to 24Hz (so Smooth Motion turned itself off automatically), rendering times with the same maxed out settings as above were 29ms. So I still have about 11-12ms headroom for future algorithms.

Quote:
Originally Posted by DragonQ View Post
Dunno about nVidia. Surely that leaves a minority of hardware combinations where it would be feasible? Maybe we have a different definition of "demanding". I forgot about the new profiles though, I suppose they mean more people can use it for some video types.

I didn't conclude anything madshi, I used the word suggest.
Please understand that it's frustrating to me if you suggest that one of the new features in madVR might be unusable for the majority of users.

Quote:
Originally Posted by cyberbeing View Post
That's unfortunate. It may be worth trying it on YCbCr, or maybe even better converting RGB -> L*a*b* channels and doing error-diffusion dither on that.
Actually my idea doesn't work. I could apply Error Diffusion in YCbCr color space, but that's not what madVR outputs. I'd have to convert back to RGB afterwards. And after the conversion I'd again have RGB floating point data. So I'd have to apply Error Diffusion yet again.

Quote:
Originally Posted by leeperry View Post
Basically, if I set mVR to automatically roll refresh rates when playback starts then everything's fine but if I set it to roll refresh rates when going FS then FSE is more likely to fail. Actually I can easily reproduce the problem if I wait for mVR's OSD to appear and instantly switch to FS.....then I get this in mVR's logs:
Code:
00003601 Render   COsd::DisplayMessage(self: 0C9D1FB8, message: exclusive mode failed, milliseconds: 2000)
00003601 Render   CQueue::IsStillImage();
00003601 Render   CQueue::IsRunGraphWhenQueueFull(self: 0C9F4188) -> no
00003601 Render   CQueue::IsStillImage() -> no
00003601 Render   COsd::DisplayMessage(self: 0C9D1FB8) -> +
00003601 Render   CDirect3D::ResetDevice(self: 0C9D01E0) -> switching to fullscreen failed (8876086c)
I've taken the liberty of uploading some working and failing logs at PotP+mVR.rar (864 KB)

I've tried hard but I can't get MPC-HC to fail.
Well, it seems to have something to do with what PotP does. You could try the various PotP settings to see if any of those fixes the problem. Maybe those new D3D9 PotP options to render the GUI in FSE mode are causing this problem? I don't know. Since this only seems to occur with PotP, I don't know if I can do much to fix it. This might be PotP's job to fix. In any case, I currently don't have the time to look into this.

Quote:
Originally Posted by leeperry View Post
Well, I think the BenQ GW2760HS might be right up your alley......
I don't see any indication that it supports 3D? And if it does, doesn't BenQ usually do that NVidia only 3D stuff? I need 3D to work with all GPUs.

<sigh> Finding a usable PC monitor for my needs seems to be really hard...

Quote:
Originally Posted by Andrey /MAG/ View Post
nnedi3 doubling makes picture very dark when I use the last NVIDIA WHQL driver 332.
Please re-read the v0.87.4 announcement post carefully.

Quote:
Originally Posted by sajara View Post
I can only say that with my test video H264 720x404 25fps with HD5730M:

upscale settings
chroma: lanczos 3 taps
image: jinc 3 taps
debanding: low
fade: high

laptop monitor res:1366x768@60hz
FSE: on
smooth motion: off

random dithering: rendering 13.84ms/gpu 43%
OpenCL error diffusion: rendering 28.36ms / gpu 70%

I think it scales very nicely for a low end mobile GPU upscaling to a bit over HD res.
Thanks. Of course having Smooth Motion FRC off and having a relatively low display resolution helps. To be honest, in your situation I think I'd prefer using Smooth Motion FRC over Error Diffusion, if you can only use one of the two features. But it's good to know that Error Diffusion might still be fast enough in some situations even on mobile GPUs.

Quote:
Originally Posted by JarrettH View Post
Is it worth it to have debanding on the low setting all the time? I don't watch animation very much, just good quality films. I can think of instances where I might have seen banding (light sources, logo fades), but it wasn't that severe
It's your decision, really. Maybe if you find a movie scene where you can see banding, you could check whether debanding would help in that scene. Probably on "low" setting it might only reduce the problem, but not fully remove it, but it depends on the situation...

Quote:
Originally Posted by cyberbeing View Post
When playing around with the NNEDI3 doubling on 720p anime viewed at 1080p, I came to the conclusion that 32 neurons is not worthwhile. You don't get a big step-up in edge quality from 16 neurons until you use 64 neurons.

But this interests me. Could you expand on the *some* videos which have artifacts using NNEDI3 16 neurons? Better yet, take a PNG screenshot of the source video at original resolution so I can test myself in madVR.
Simple test: Use MS Paint, light gray background. Some diagonal black lines on top. Upscale this with 16 neurons. Ouch.

Generally, with some of the usual resampling test images, I've found that every step up in neurons improves quality a little bit. With some test images one specific step might seem bigger than another. I think it depends on the exact edge angles etc.

Quote:
Originally Posted by cyberbeing View Post
One addition though. The 'chroma upscaling' setting for NNEDI3 does have a noticeable impact on chroma quality, especially when dealing with DVD resolution content and below. I'd consider it worthwhile for at least SD content if you can afford it.
I can imagine that it could help for some DVDs. At least there it should be much more useful (and less power hungry, too) compared to using it for the chroma channels in the image doubling settings page.

Quote:
Originally Posted by Audionut View Post
Bit the bullet and got an HD 7790.

Now I have none of the quality tradeoffs enabled, debanding enabled, NNEDI3 (32 neurons) chroma upscaling and it purrs like a kitten. This is in 23.976 playback as I prefer the odd motion judder to the current smooth motion algorithms.

One thing I am not entirely certain of, the rendering times drop when I output 16bit 4:4:4 from LAV, I assume this is because LAV is doing some of the processing? I also assume it's best to just output 8bit 4:2:0 from LAV and let mVR do all the grunt work?
You should check all the output format options in LAV. That allows LAV to output the decoded video in its native format, which is the best solution. LAV can do chroma upscaling and it can also "increase" the bitdepth (it just adds some zeros), but madVR can do all that in better quality.

If you send 4:4:4 from LAV, madVR doesn't have to do chroma upscaling, so the GPU has less work to do. That explains the drop in the rendering times.
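As a small illustration of what "it just adds some zeros" means for the bit depth, here is a sketch of zero-padding an 8-bit sample to 16 bits. It assumes the zeros are appended as low-order bits (a plain left shift); the function name is hypothetical and this is not LAV code.

```python
def pad_bit_depth(value: int, src_bits: int = 8, dst_bits: int = 16) -> int:
    """Widen a sample by appending zero bits on the low-order side."""
    return value << (dst_bits - src_bits)


if __name__ == "__main__":
    for v in (0, 16, 235, 255):
        print(f"{v:3d} -> {pad_bit_depth(v):5d} ({pad_bit_depth(v):#06x})")
    # Still only 256 distinct levels after padding, so no precision is gained.
    print(len({pad_bit_depth(v) for v in range(256)}))
```

The padded output carries no more information than the 8-bit input, which fits the advice above to let LAV output the native format and leave any real processing to madVR.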

Quote:
Originally Posted by The 8472 View Post
Yeah, I was just pointing out that high target resolutions + SM can make this quite expensive
True.

Quote:
Originally Posted by turbojet View Post
madshi: Now knowing that SM doesn't work well in window mode, although it's worked fine up until now.
Well, it works, as long as you keep the rendering times low enough (lower than ~ 16ms for 60Hz).

Quote:
Originally Posted by turbojet View Post
if (mediaPlayer = "F:\Tools\PotPlayer\PotPlayerMini.exe") "overlay" else "window" is what I've put but it always chooses window. Also tried "PotPlayermini.exe" and "Potplayer" any ideas?
madVR strips the path before comparing. Use only the exe file name.
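A rough Python sketch of why a rule written with the full path can never match when only the file name is compared. The helper names are hypothetical, and the exact comparison madVR performs (including the case-insensitive match used here) is an assumption and not documented behavior.

```python
import ntpath  # Windows path handling that works on any platform


def exe_name(player_path: str) -> str:
    """Strip the directory part of a Windows path, keeping only the file name."""
    return ntpath.basename(player_path)


def rule_matches(rule_value: str, player_path: str) -> bool:
    """Compare a profile rule value against the player's exe file name only."""
    # Case-insensitive comparison is an assumption made for this sketch.
    return rule_value.lower() == exe_name(player_path).lower()


if __name__ == "__main__":
    path = r"F:\Tools\PotPlayer\PotPlayerMini.exe"
    print(exe_name(path))
    print(rule_matches(path, path))                 # full path in the rule
    print(rule_matches("PotPlayerMini.exe", path))  # exe file name in the rule
```

Under these assumptions the full-path rule falls through to "window" every time, while the bare file name selects "overlay".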

Quote:
Originally Posted by turbojet View Post
Does your GTX 650 drop frames when using nnedi3 from 720p to 1080?
Haven't tested that.

Quote:
Originally Posted by omarank View Post
Although I have set forced film mode, that doesn't seem to be related to my problem. Here I am talking about a profiling script for smooth motion. Yes, with DXVA deinterlacing, deintFps will be 50.0 for PAL content, but for 50 fps content I would like to enable smooth motion. I don't want it to be enabled for still images, which LAV outputs as 25 fps content.

As you said that you couldn't confirm the bug which I reported, I wrote this script which can perhaps let you reproduce the bug with smooth motion profiling.
I'm sorry, but I don't really have the time to test "perhaps" bugs. If you can make a step-by-step guide to reproducing a specific problem, ideally with a small video sample to help, then I'd be happy to look into this.

Quote:
Originally Posted by omarank View Post
Now I have a very strong urge to make this request to you even though you have stated that you would be sticking to OpenCL for all compute related things. Could you please consider doing an alternative CUDA implementation *ONLY* for error diffusion (for the rest of the things OpenCL is fine)? With CUDA implementation, perhaps nvidia users with moderate GPU can also use error diffusion.
No, sorry. It's not just the kernel which would need to be converted. I'd have to implement a full blown CUDA framework in madVR with all the interop stuff etc. That's quite a lot of work. I do plan to look into DirectCompute as a possible OpenCL alternative/replacement, though. It's supported by all GPUs.

Quote:
Originally Posted by drew_afx View Post
GPU used is GeForce 650 Ti Boost GDDR5 on PCI-E 3.0x16
Clock speed Core/Memory 1032MHz(1136MHz @boost) / 1502MHz
Quick look at OpenCL benchmark results:
Benchmark            650 Ti Boost     Radeon HD 7770
Luxmark              334 score        789 score
RatGPU               73.20 sec        101.37 sec
Bitmining            55.8 MHash/s     170.6 MHash/s
GPU Caps - PostFX    93 FPS           145 FPS

read avg rendering time when the value was stable for about 10sec

<540p> - 518400 pixels
6.95ms w/o opencl error diffusion
9.35ms with opencl error diffusion

2.40ms diff

4.63ns per pixel

<1080p> - 2073600 pixels
5.53ms w/o opencl error diffusion
13.45ms with opencl error diffusion

7.92ms diff

3.82ns per pixel

<2160p> - 8294400 pixels
23.77ms w/o opencl error diffusion
50.86ms with opencl error diffusion

27.09ms diff

3.266ns per pixel
Thanks. Makes a lot of sense. 8ms cost on a 1080p display is more than I would have liked, but I believe it's still very much in the "usable" area. After all, if you have a 1080p@24Hz capable display, you can spend up to 41ms on each frame. So 8ms is not great, but "ok", IMHO.
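The "up to 41ms" budget follows directly from the refresh rate. A minimal sketch, assuming one rendered frame per display refresh; the function names are illustrative.

```python
def frame_budget_ms(refresh_hz: float) -> float:
    """Time available to render one frame at the given refresh rate."""
    return 1000.0 / refresh_hz


def headroom_ms(refresh_hz: float, render_ms: float) -> float:
    """Budget left over after the measured average rendering time."""
    return frame_budget_ms(refresh_hz) - render_ms


if __name__ == "__main__":
    print(f"24 Hz budget: {frame_budget_ms(24):.2f} ms")
    print(f"60 Hz budget: {frame_budget_ms(60):.2f} ms")
    print(f"headroom at 24 Hz with 29 ms rendering: {headroom_ms(24, 29.0):.2f} ms")
```

At 24 Hz the budget is about 41.7 ms, so an 8 ms error diffusion cost uses roughly a fifth of it. At 60 Hz the budget shrinks to about 16.7 ms, which matches the "lower than ~ 16ms for 60Hz" rule of thumb given earlier in this post.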

Quote:
Originally Posted by baii View Post
For 720p and 1080p content on a 1440p monitor, is nnedi3 better ("sharper" w/o unwanted artifacts) than jinc3 in most situations?

Considering a switch to AMD if so, as AMD performs 2-3x better at the same price point according to the benchmark results from the nnedi3 opencl thread.
It's a little bit a matter of taste. Look at the v0.87.0 announcement post and the post after that to read about advantages and disadvantages of NNEDI compared to Jinc.

I wouldn't replace the GPU right now. madVR v0.87.x has only been out for a week. Give it a bit of time. Yes, AMD seems to have an advantage in OpenCL performance. But I plan to try DirectCompute, maybe it will close the gap. I don't know...

Quote:
Originally Posted by huhn View Post
the pf8008 doesn't have pc-mode in its manual and that's not a cheap model...

the pf4508 is the smallest Philips with pc-mode and 3d but i'm not sure anymore...
Hmmmm... That might be an option, will look into it, thanks!

Last edited by madshi; 31st January 2014 at 14:27.
Old 31st January 2014, 11:13   #22240  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
Quote:
Originally Posted by Gagorian View Post
Smooth motion is off (like I said presented at 23.976 Hz) and display resolution is 1080p. Newest catalyst drivers and AMD APP SDK 2.9 installed.
Weird, seems the cost for Error Diffusion is more than twice as high on your PC compared to mine, although your GPU should be around twice as fast as mine. I have no explanation for that right now.
Tags
direct compute, dithering, error diffusion, madvr, ngu, nnedi3, quality, renderer, scaling, uhd upscaling, upsampling
