MPC-HC tester builds for internal renderer fixes
JanWillem32
28th January 2015, 17:56
When not resizing, the renderer will not compile the resizer shaders and will just skip the resizing pass or use nearest neighbor sampling to make a copy. What is the window size and video area indicated by the stats screen when you play back the sample (with one of the resizers that seems to work)?
ts1
28th January 2015, 18:01
Tried with bilinear; the stats screen says the display size is 376x301, but the window size seems to be 800x600. And this error appears at the very beginning, while I'm not doing anything with the player.
Edit:
376x301 is in the monitor EDID section.
Window size, video area and video size all show 800x600.
JanWillem32
28th January 2015, 18:30
The display size of 376x301 is expressed in mm, not in pixels. A 1308x981 video area on a 1280x1024 window is possible if you zoom in a bit (the sides will be clipped off, though).
I suspect it's the down-sizing kernel that's giving problems. Would you disable the "Auto-zoom" function on the "Options", "Playback" tab? It might be the opening size that's giving problems. (Auto-zoom is applied some time after opening a video.)
ts1
28th January 2015, 18:40
No, without auto-zoom it's the same video area size in full screen. With auto-zoom it works on the second try, though, right after the error, if I press the play button. Without auto-zoom it doesn't work at all.
Edit: And it works if I first open a video with a different resolution and then the 800x600 video.
JanWillem32
28th January 2015, 18:52
So, the error only happens when rendering video on the small (310×200) video size (the size the player starts with when no video is playing)? Then it's something with the down-sampling kernel.
ts1
28th January 2015, 19:04
Hah, it works if I make the player a little bit smaller or larger (with auto-zoom disabled); it only fails at the size at which the player starts.
Edit: figured out why the video area was 1308x981. Zoom1 was chosen in the video frame menu; I switched to "touch window from inside" and now it's correct.
JanWillem32
28th January 2015, 19:54
I've attempted to fix the resizer. I hope it works.
ts1
28th January 2015, 20:13
It works now.
While testing, I noticed that the green line in the stats screen is very unstable with the Alternative scheduler (GPU: NV 660).
JanWillem32
28th January 2015, 20:52
The green line is the recorded sample time versus the presentation time (middle bar). Note that it can be very imprecise; when pulldown is applied on a video, for example, some frames will last twice as long as others, making large spikes in the graph.
When presenting in a window you also have to deal with the secondary present to the screen from the desktop. That's why the D3D fullscreen mode is more stable.
XRyche
28th January 2015, 22:59
With your latest "regular builds (just small updates this time)" x64, the following "upscaling chroma fix for AMD/Intel" resizers are giving "compiling initial pass pixel shader failed": B-spline4, Catmull-Rom spline6, and B-spline8. Lanczos4 doesn't appear to be displaying colours. All the image resizers appear to be working as they should. I honestly don't know if it started with this build or before. I've hardly ever used anything but Lanczos 2 or 3, and until recently never the chroma upscaling fix resizers.
JanWillem32
29th January 2015, 00:49
I replaced the release-type builds in my earlier post and I deleted the debug builds.
XRyche, thank you for noticing these bugs. This stuff must have been broken for a while now, as I haven't edited these routines recently at all.
XRyche
29th January 2015, 00:58
Oh, this wasn't the debug build, I made sure of that. Should I go ahead and d/l the build from here again : http://forum.doom9.org/showthread.php?p=1707384#post1707384 or do I already have it?
JanWillem32
29th January 2015, 01:30
You can download the new builds using the old links. I just update the files if only minor changes have been made to the builds.
ts1
29th January 2015, 08:10
The presentation method for Windows Vista with Desktop composition enabled is indeed slow to adapt to changes.
It's slow with or without Desktop composition. And it was ok before the fix of the black screen with Desktop composition disabled and crash with VSync enabled.
JanWillem32
29th January 2015, 12:14
One of the reasons the initialization for the monitor section is slow can be read from the log you posted earlier:
00000120 0.15050118 [816] Video renderer attempting to use the generic Ex mode method to read the display refresh rate
00000121 0.26801983 [816] Video renderer GetTimingReport() failed the first try with error 0xc0262589: An operation failed because a DDC/CI message had an invalid value in its command field.
00000122 0.26801983 [816] , the renderer is giving the driver half a second and will try again
00000123 0.87145257 [816] Video renderer GetTimingReport() failed the second try with error 0xc0262589: An operation failed because a DDC/CI message had an invalid value in its command field.
00000124 0.87145257 [816] , the renderer is giving the driver half a second and will try again
00000125 1.47645974 [816] Video renderer GetTimingReport() failed the third try with error 0xc0262589: An operation failed because a DDC/CI message had an invalid value in its command field.
00000126 1.47645974 [816] , the renderer is giving the driver half a second and will try again
00000127 2.08050847 [816] Video renderer GetTimingReport() failed the fourth try with error 0xc0262589: An operation failed because a DDC/CI message had an invalid value in its command field.
00000128 2.08050847 [816] , the renderer is giving the driver half a second and will try again
00000129 2.68555593 [816] Video renderer GetTimingReport() failed the fifth try with error 0xc0262589: An operation failed because a DDC/CI message had an invalid value in its command field.
00000130 2.68555593 [816] , the renderer is giving the driver half a second and will try again
00000131 3.29060459 [816] Video renderer GetTimingReport() failed the sixth try with error 0xc0262589: An operation failed because a DDC/CI message had an invalid value in its command field.
00000132 3.29060459 [816] , the renderer is giving the driver half a second and will try again
00000133 3.89564085 [816] Video renderer GetTimingReport() failed the seventh try with error 0xc0262589: An operation failed because a DDC/CI message had an invalid value in its command field.
00000134 3.89564085 [816] , the renderer is giving the driver half a second and will try again
00000135 4.50069332 [816] Video renderer GetTimingReport() failed the eighth try with error 0xc0262589: An operation failed because a DDC/CI message had an invalid value in its command field.
00000136 4.50069332 [816] , the renderer is giving the driver half a second and will try again
00000137 5.10676622 [816] Video renderer GetTimingReport() failed the eighth try with error 0xc0262589: An operation failed because a DDC/CI message had an invalid value in its command field.
00000138 5.10676622 [816] , the renderer is giving the driver half a second and will try again
00000139 5.10684347 [816] Video renderer GetTimingReport() failed the ninth try with error 0xc0262589: An operation failed because a DDC/CI message had an invalid value in its command field.
It seems that your monitor isn't responding to DDC/CI calls. Are you sure it's not only at initialization time but also at minor resets because of resizing the window? (Note that I can lower the amount of time given in between GetTimingReport() calls for cases like this. That might help either way.)
ts1
29th January 2015, 12:36
It happens every time I resize the window, enable/disable VSync, or switch the bit depth of the surface; enabling/disabling the Alternative scheduler is fine, though. It's the same on Win 7, at least with DWM disabled.
JanWillem32
29th January 2015, 13:29
Resets are detected every 0.5 seconds, and the Alternative VSync runs its own thread that needs to be shut down for every resize or other reset condition. These things may take a while. I don't see anything really problematic at the moment. Do the resets take more than two seconds for you?
ts1
29th January 2015, 15:02
Yes, 5 seconds, whether on the netbook or on the i5 with the NV 660.
JanWillem32
30th January 2015, 01:21
I replaced the builds in my earlier post. I made the timeout after failures from GetTimingReport() much shorter to solve the long (re-)initialization time. Unfortunately, I don't have a replacement for this timing report functionality. I'll have to look up some other methods.
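To illustrate why the pause between failed GetTimingReport() calls dominates the reset time, here is a minimal Python sketch of a retry loop. It is not the renderer's code: the function names, the attempt count of ten and the shorter 0.1 s delay are assumptions chosen for illustration. Only the half-second pause and the nine pauses between ten failure reports are taken from the log above.

```python
import time

def query_with_retries(query, attempts, delay_s, sleep=time.sleep):
    """Call query() up to `attempts` times, pausing delay_s after each failure.

    Returns (result, total_wait_s); result is None if every attempt failed.
    """
    waited = 0.0
    for attempt in range(attempts):
        result = query()
        if result is not None:
            return result, waited
        if attempt < attempts - 1:  # no point in waiting after the last try
            sleep(delay_s)
            waited += delay_s
    return None, waited

def always_failing_query():
    # Hypothetical stand-in for a monitor that never answers the timing request.
    return None

def no_sleep(seconds):
    # Skip real sleeping so the demo finishes immediately.
    pass

for delay in (0.5, 0.1):
    _, waited = query_with_retries(always_failing_query, 10, delay, sleep=no_sleep)
    print(f"delay {delay} s: worst-case wait about {waited:.1f} s")
```

With a monitor that never answers, the worst-case wait scales directly with the pause length, so shortening the pause shortens every reset by the same factor.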
ts1
30th January 2015, 07:14
Yeah, better now. It takes about 2 seconds.
XRyche
30th January 2015, 12:00
JanWillem32, did you fix the AMD/Intel Chroma bug in the latest build already? I hope you did or I've got gnomes in my PC.
JanWillem32
30th January 2015, 18:28
I just double-checked it, and all 4:2:0 and all 4:2:2 passes seem to work fine.
ts1, I might be able to lower it a bit more. I'll work on it later.
XRyche
30th January 2015, 19:01
Originally posted by JanWillem32:
"I just double-checked it, and all 4:2:0 and all 4:2:2 passes seem to work fine."
Meaning that it was a problem on my end and not with the renderer.
JanWillem32
31st January 2015, 13:21
I replaced the builds in my earlier post. I added a source code link and fixed a bug with the general Lanczos resizers.
XRyche
31st January 2015, 15:13
The Lanczos4 image resizer is failing in the new build. All others appear to be working.
v0lt
31st January 2015, 16:34
The latest builds jerk strongly when the window is resized.
JanWillem32
31st January 2015, 17:19
I replaced the builds in my earlier post.
XRyche, I thought I tested all resizers I edited. Oh well, it was just one typo. Thank you for reporting.
I also reverted the refresh rate calculation function I edited. The edited version was prone to fail and output an infinite refresh rate.
v0lt, it should be partially fixed now. I'll see if I can further optimize the re-initialization speed.
XRyche
31st January 2015, 18:38
All the resizers are pristine and working as they should again. Thanks for fixing it so fast.
Hera
31st January 2015, 18:41
I reset the player's settings and tried again (x64 Jan 31 2015 build),
~ 0 CPU when playing
~ 40 CPU when paused
EDIT: AH!
The Alternative scheduler is disabled in the default settings. Based on your previous post, shouldn't the default settings have it enabled?
JanWillem32
31st January 2015, 19:33
As the alternative scheduler was an experimental item, I made it opt-in instead of the default. In a later stage I might eliminate the option and just make it default-enabled.
Hera
31st January 2015, 19:55
Originally posted by JanWillem32:
"As the alternative scheduler was an experimental item, I made it opt-in instead of the default. In a later stage I might eliminate the option and just make it default-enabled."
Yeah... just had my mouse pointer jerky... closing IE11 helped...:confused:
JanWillem32
31st January 2015, 20:49
The alternative scheduler forces the desktop composition to always compose at the full display refresh rate (the only way to do reliable scheduling). Other applications can slow down the system if they have to compose at higher rates than they normally do.
v0lt
31st January 2015, 21:00
I did some tests of bicubic interpolation in MPC-BE. One-pass mode works well, but two-pass mode has strong distortion when A = -1.0.
Your builds give similar results as MPC-BE with two-pass interpolation.
If you're interested, here's the test results - https://yadi.sk/d/KN4d6JzPeNVyK (1pass VS 2pass VS madVR).
These results were obtained on Intel. I was told that this problem does not occur on other cards. :confused:
Hera
1st February 2015, 00:00
Originally posted by JanWillem32:
"The alternative scheduler forces the desktop composition to compose at always the full display refresh rate (the only way to do reliable scheduling). Other applications can slow down the system if they have to compose at higher rates than they normally do."
That sounds like a major tradeoff.
Anyway, I will crank up dithering and whatnot with alternative scheduler and watch some shows and get back if I see any other issues :)
JanWillem32
1st February 2015, 00:24
v0lt, it indeed does seem like it's only on Intel graphics adapters. My old ATi HD4890 doesn't give such artifacts as in the sample picture for two-pass Bicubic A=-1 resizing. The kernel for the cubic filters can be written in three forms, however. Here are the two other forms; maybe these will work.
Note 1: these builds feature new window handling code to remedy the jerky resizing you mentioned earlier. It might work, but it's probably still unfinished.
Note 2: do not down-size with the bicubic A=-1 filter. I didn't check the down-sizing kernel with these methods yet.
Variant B is less efficient. I'm not sure if I'd want to make it default. So, I'm hoping that the code in variant A will work for all cases.
Hera, it's how the system handles on-screen windows with high output frame rates. It's usually sufficient to minimize or cover up a whole window to prevent it from rendering (and save some processing time on the CPU and GPU). Enabling D3D fullscreen exclusive mode also frees up extra resources because the desktop doesn't have to render for a monitor anymore. Another factor is memory and video memory. If either is low because of many open programs, Windows may lag as well.
v0lt
1st February 2015, 07:20
JanWillem32, I see a problem in many modes except B-spline. On Bicubic -1.0 distortions are most noticeable.
x86 SSE2 variant A: http://i.imgur.com/Ief3PLys.png (http://i.imgur.com/Ief3PLy.png)
x86 SSE2 variant B: http://i.imgur.com/1oRrpqFs.png (http://i.imgur.com/1oRrpqF.png)
ts1
1st February 2015, 12:20
Difference in size between nearest neighbor/bilinear and all other resizers: http://s000.tinyupload.com/index.php?file_id=00943912200448698260 (not with these variant A/B builds)
JanWillem32
1st February 2015, 15:29
v0lt, that's too bad. I assume the artifacts are only less visible on the B-spline types because these are rather blurry. Let's try another type of fix.
ts1, that is actually normal. The nearest neighbor and bilinear resizers are implemented using StretchRect() instead of a pixel-shading pass. The vertices for StretchRect() have to be rounded to the nearest integer numbers, but the vertices for a shader pass are real numbers (within the limits of the single-precision floating-point format). There should never be more than one pixel worth of rounding difference.
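The size difference can be shown with a small Python sketch. It only illustrates the rounding and is not the renderer's code: the helper names and the example video and window sizes are made up. A real-number destination rectangle is compared with the same rectangle after each edge is rounded to the nearest integer.

```python
import math

def fit_rect(video_w, video_h, window_w, window_h):
    """Largest centered (left, top, right, bottom) rectangle with the video's
    aspect ratio that fits in the window, in real-number pixel coordinates."""
    scale = min(window_w / video_w, window_h / video_h)
    w, h = video_w * scale, video_h * scale
    left, top = (window_w - w) / 2.0, (window_h - h) / 2.0
    return (left, top, left + w, top + h)

def round_rect(rect):
    """Round every edge to the nearest integer, as an integer-only rectangle needs."""
    return tuple(math.floor(edge + 0.5) for edge in rect)

# Hypothetical example: a 720x576 video shown in a 1000x563 window.
exact = fit_rect(720, 576, 1000, 563)
snapped = round_rect(exact)
print("real-number rectangle:", exact)
print("integer rectangle    :", snapped)
print("width difference     :", (snapped[2] - snapped[0]) - (exact[2] - exact[0]))
```

Each edge moves by at most half a pixel, so the width or height of the two rectangles can differ by up to one pixel, which matches the limit stated above.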
v0lt
1st February 2015, 15:54
x86 SSE2 variant C: http://i.imgur.com/6habbpws.png (http://i.imgur.com/6habbpw.png)
madshi
1st February 2015, 16:03
@JanWillem32, double check to make sure your HLSL code is addressing the center of each pixel. I've seen this kind of artifact in my shaders when my addressing accidentally addressed the pixel borders instead of the center. Just a thought, though; it could be something completely different...
JanWillem32
1st February 2015, 16:33
This situation sure is weird. On other graphics adapters you get a rift from the top-right to the bottom-left corner if you disable the half-pixel vertex offsets (variant C). (For AMD/ATi drivers you can enable "Alternative Pixel Centers" to counter this effect.) The bottom-right figure in the sample video is supposed to still look relatively normal.
Let's try software vertex processing. I'm not sure that this will make a difference, as all vertices in the renderer are pre-processed anyway.
madshi, did these problems also only happen on Intel graphics adapters for you?
v0lt
1st February 2015, 17:09
x86 SSE2 variant D: http://i.imgur.com/8Tvjjgqs.png (http://i.imgur.com/8Tvjjgq.png)
Again, only two-pass interpolation has this problem. For example, the single-pass Perlin "Smootherstep" works well.
madshi
1st February 2015, 18:04
I think I had something like this with an AMD GPU, but not with NVidia. I don't remember if I tested with Intel at that time, it was a long time ago. Anyway, as far as I remember, the problem in my case was caused by my HLSL code not properly addressing the center of each source pixel when calling tex2D. I think I addressed the pixel border instead, which resulted in the GPU driver sometimes flipping to the left/top pixel and sometimes to the bottom/right pixel, when using point filtering.
Are you using point sampling, too? Try linear filtering, just as a test to see whether that "fixes" the issue. If it does, then incorrect pixel addressing is the likely cause of the problem.
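A minimal Python sketch of why border addressing is fragile with point sampling. It is a simplified model, not HLSL or actual driver behaviour: the texture width of 8, the error size `eps` and the helper names are assumptions for illustration.

```python
import math

def point_sample_index(u, width):
    """Texel index picked by point (nearest) sampling for a normalized coordinate u."""
    return min(width - 1, max(0, math.floor(u * width)))

def texel_center(i, width):
    return (i + 0.5) / width

def texel_border(i, width):
    return i / width  # the edge shared by texel i-1 and texel i

width = 8
eps = 1e-6  # stand-in for a tiny interpolation or rounding error on the GPU
for name, coord in (("center", texel_center(3, width)),
                    ("border", texel_border(3, width))):
    picks = {point_sample_index(coord + d, width) for d in (-eps, 0.0, eps)}
    print(name, "of texel 3 selects texel(s):", sorted(picks))
```

A coordinate on the center of texel 3 keeps selecting texel 3 under small errors, while a coordinate on its border flips between texels 2 and 3, which is the kind of left/right flipping described above.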
v0lt
1st February 2015, 19:37
I have converted the bw18x18_rgb.avi (https://yadi.sk/i/g7to4JZid2yqx) (0,255) to grey18x18_rgb.avi (https://yadi.sk/i/W0pQ-H0WePjXx) (64,191) and got the correct result!
http://i.imgur.com/tdbKw6Ws.png (http://i.imgur.com/tdbKw6W.png) :-)
JanWillem32
1st February 2015, 19:40
I can't replicate the issue myself, but I can certainly try to change the offsets in this case. It could be that I mistranslated the resizers from their original DirectX 10 format. (DirectX 10+ vertex formats are different than the types used in earlier versions of the API.)
ts1
1st February 2015, 19:52
Why is B-spline4 sharper than B-spline6, and B-spline6 sharper than B-spline8? Shouldn't it be the other way around? And I can't see the difference between Catmull-Rom spline4-8 at all. I can see the difference, for example, between Lanczos2-4.
JanWillem32
1st February 2015, 20:35
B-splines blur a lot, and they blur progressively more the more samples you feed the kernel. Catmull-Rom splines sharpen slightly more the more samples you feed the kernel (you can see the difference with the black and white sample posted earlier, for example).
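A small Python sketch of the blurring trend, under a loud assumption: it treats the N-tap B-spline resizer as the cardinal B-spline of degree N-1, which may not be how the renderer defines B-spline6 and B-spline8. The Catmull-Rom function is the standard 4-tap kernel only. The sketch prints the weight given to a source pixel that lies exactly on the output position.

```python
from math import comb, factorial

def bspline(degree, x):
    """Centered cardinal B-spline of the given degree, evaluated at x."""
    n = degree
    total = 0.0
    for k in range(n + 2):
        base = x + (n + 1) / 2.0 - k
        if base > 0.0:
            total += (-1) ** k * comb(n + 1, k) * base ** n
    return total / factorial(n)

def catmull_rom(x):
    """Standard 4-tap Catmull-Rom kernel (cubic convolution with a = -0.5)."""
    x = abs(x)
    if x <= 1.0:
        return 1.5 * x**3 - 2.5 * x**2 + 1.0
    if x < 2.0:
        return -0.5 * x**3 + 2.5 * x**2 - 4.0 * x + 2.0
    return 0.0

for taps in (4, 6, 8):
    weight = bspline(taps - 1, 0.0)
    print(f"B-spline, {taps} taps: weight of the aligned pixel = {weight:.4f}")
print(f"Catmull-Rom, 4 taps: weight of the aligned pixel = {catmull_rom(0.0):.4f}")
```

Under that assumption the aligned pixel gets a weight of 2/3, 0.55 and about 0.479 for 4, 6 and 8 taps, so more of the result comes from the neighbours as the tap count grows, which reads as more blur. The Catmull-Rom kernel gives the aligned pixel a weight of exactly 1, so it passes source pixels through unchanged.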
Note that the player also allows plain .BMP files for rendering samples. I've used that function a lot in the past to evaluate resizers and such.
Shiandow
1st February 2015, 20:48
Originally posted by v0lt:
"I have converted the bw18x18_rgb.avi (https://yadi.sk/i/g7to4JZid2yqx) (0,255) to grey18x18_rgb.avi (https://yadi.sk/i/W0pQ-H0WePjXx) (64,191) and got the correct result!
http://i.imgur.com/tdbKw6Ws.png (http://i.imgur.com/tdbKw6W.png) :-)"
Ah, that seems to be it. For some reason, the intermediate result is clamped to values between 0 and 1, which makes the process asymmetric. I can reproduce similar images by doing this manually.
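A minimal Python sketch of the clamping effect, under stated assumptions: it is not the renderer's shader code, it applies the A = -1 cubic kernel at a fixed half-pixel offset instead of doing a real resize, and the 4x4 test pattern and the grey levels 0.25/0.75 (roughly the 64 and 191 levels) are made up for illustration. It runs a horizontal pass and then a vertical pass, once with the intermediate result clamped to [0, 1] and once without.

```python
def cubic_weight(x, a=-1.0):
    """Cubic convolution (bicubic) kernel with parameter a."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0

# Tap weights for an output sample halfway between two source pixels.
WEIGHTS = [cubic_weight(d) for d in (-1.5, -0.5, 0.5, 1.5)]

def clamp01(v):
    return min(1.0, max(0.0, v))

def filter_rows(img, clamp):
    """Filter every row at a half-pixel offset, repeating edge pixels."""
    out = []
    for row in img:
        last = len(row) - 1
        new_row = []
        for i in range(len(row)):
            taps = [row[min(last, max(0, i - 1 + k))] for k in range(4)]
            v = sum(w * t for w, t in zip(WEIGHTS, taps))
            new_row.append(clamp01(v) if clamp else v)
        out.append(new_row)
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def two_pass(img, clamp_intermediate):
    """Horizontal pass, then vertical pass; the final result is always clamped."""
    horizontal = filter_rows(img, clamp_intermediate)
    vertical = transpose(filter_rows(transpose(horizontal), False))
    return [[clamp01(v) for v in row] for row in vertical]

bw = [[0.0, 1.0, 1.0, 0.0],
      [1.0, 0.0, 0.0, 1.0],
      [0.0, 0.0, 1.0, 1.0],
      [0.0, 0.0, 1.0, 1.0]]
grey = [[0.25 + 0.5 * p for p in row] for row in bw]  # roughly the (64,191) range

for name, img in (("black/white", bw), ("grey", grey)):
    same = two_pass(img, True) == two_pass(img, False)
    print(f"{name}: clamped intermediate matches unclamped: {same}")
```

With the full-range pattern the horizontal pass overshoots below 0 and above 1, so clamping it before the vertical pass changes the final picture. With the reduced grey range the overshoot stays inside [0, 1] and both variants agree, which is consistent with the grey sample rendering correctly.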
v0lt
2nd February 2015, 16:26
x86 SSE2 variant E: http://i.imgur.com/qi1Ymyus.png (http://i.imgur.com/qi1Ymyu.png)
Originally posted by Shiandow:
"Ah, that seems to be it, for some reason the intermediate result is clamped to values between 0 and 1 which makes the process asymmetric. I can reproduce similar images by doing this manually."
I also think that this is due to rounding. I see a black screen when I choose float textures on Intel, so I can only select 8-bit textures, and these are also used for the intermediate results of two-pass interpolation.
JanWillem32
2nd February 2015, 18:55
Too bad variant E isn't working either. I'll try some more. Note that 16- and 32-bit textures do work on Intel graphics adapters when using VMR-9 instead of EVR.
About the asymmetry when using 8-bit integer intermediate storage, could AMD and Nvidia possibly be cheating with the intermediate storage type? I've been able to replicate the original asymmetric image from the resizer using the DirectX 9 reference device. (I'll happily share a debug build with those that have access to the D3D debug runtime. Just forgive that it's only working for the 8-bit integer surfaces option and is really slow to initialize.) I'll look forward to the results for testing with the 16- and 32-bit surfaces options. If nothing really seems wrong with the original resizers, I'll just revert the code used for the variant types. I don't mind a minor truncation issue when using the performance mode, as long as the quality mode doesn't suffer from it.