Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
20th December 2014, 09:43 | #83 | Link | |
Registered Developer
Join Date: Sep 2006
Posts: 9,140
|
Quote:
Code:
"sampler SourceSampler : register(s0);\n"
"sampler WeightSampler : register(s2);\n"
"float4 floatConsts1 : register(c0);\n"
"#define pixSizeX (floatConsts1[0])\n"
"#define pixSizeY (floatConsts1[1])\n"
"static float1 SumWeights1[nns] = (float1[nns]) packedSumWeights1Array;\n"
"static float1 SumWeights2[nns] = (float1[nns]) packedSumWeights2Array;\n"
"static float4x4 rgbToHd = {+0.2126000000000000, +0.7152000000000000, +0.0722000000000000, 0,\n"
"                           -0.1145721060573400, -0.3854278939426600, +0.5000000000000000, 0,\n"
"                           +0.5000000000000000, -0.4541529083058166, -0.0458470916941834, 0, 0, 0, 0, 0};\n"
"\n"
"float4 main(float2 Tex : TEXCOORD0) : COLOR0\n"
"{\n"
"  float input[32];\n"
"  float mstd0, mstd1, mstd2;\n"
"  {\n"
"    float sum = 0;\n"
"    float sumsq = 0;\n"
"    int index = 0;\n"
"    float xpos = Tex.x - 1.0 * pixSizeX;\n"
"    for (int ix = 0; ix < 4; ix++)\n"
"    {\n"
"      float ypos = Tex.y - 3.0 * pixSizeY;\n"
"      for (int iy = 0; iy < 8; iy++)\n"
"      {\n"
"        float4 sample = tex2Dlod(SourceSampler, float4(xpos, ypos, 0, 0));\n"
"        sample = (sample - 16.0f / 255.0f) / (219.0f / 255.0f);\n"   // d3d9Float8 16-235 -> 0-255
"        sample = mul(rgbToHd, sample) * 255.0;\n"
"        ypos += pixSizeY;\n"
"        input[index++] = sample[0];\n"
"        sum += sample[0];\n"
"        sumsq += sample[0] * sample[0];\n"
"      }\n"
"      xpos += pixSizeX;\n"
"    }\n"
"    mstd0 = sum / 32.0;\n"
"    mstd1 = sumsq / 32.0 - mstd0 * mstd0;\n"
"    mstd1 = (mstd1 <= 1.19209290e-07) ? 0.0 : sqrt(mstd1);\n"
"    mstd2 = (mstd1 > 0) ? (1.0 / mstd1) : 0.0;\n"
"  }\n"
"  float vsum = 0;\n"
"  float wsum = 0;\n"
"  {\n"
"    float ypos = 0.5 / nns;\n"
"    for (int i1 = 0; i1 < nns; i1++)\n"
"    {\n"
"      float xpos = 0.5 / 16.0;\n"
"      float sum1 = 0;\n"
"      float sum2 = 0;\n"
"      int index = 0;\n"
"      for (int i2 = 0; i2 < 16; i2++)\n"
"      {\n"
"        float4 weights = tex1Dlod(WeightSampler, float4(xpos, ypos, 0, 0));\n"
"        xpos += 1.0 / 16.0;\n"
"        float sample = input[index++];\n"
"        sum1 += sample * weights[0];\n"
"        sum2 += sample * weights[1];\n"
"        sample = input[index++];\n"
"        sum1 += sample * weights[2];\n"
"        sum2 += sample * weights[3];\n"
"      }\n"
"      ypos += 1.0 / nns;\n"
"      float temp1 = sum1 * mstd2 + SumWeights1[i1];\n"
"      float temp2 = sum2 * mstd2 + SumWeights2[i1];\n"
"      temp1 = exp(clamp(temp1, -80.0, +80.0));\n"
"      vsum += temp1 * (temp2 / (1.0 + abs(temp2)));\n"
"      wsum += temp1;\n"
"    }\n"
"  }\n"
"  float result = (mstd0 + ((wsum > 1e-10) ? (((5.0 * vsum) / wsum) * mstd1) : 0.0)) / 255.0;\n"
"  return result * (219.0f / 255.0f) + 16.0f / 255.0f;\n"   // d3d9Float8 0-255 -> 16-235
"}";
|
22nd December 2014, 06:52 | #84 | Link |
YAP author
Join Date: Jul 2014
Location: Russian Federation
Posts: 111
|
madshi,
thanks, will look into it.
Shiandow, I have thought a bit about it. If I add support for a simple script like the example below, will that allow you to replace the RenderScript part without changing your HLSL files? For example, say we have a shader pack of 5 HLSL scripts (Shader1.hlsl, Shader2.hlsl, Shader3.hlsl, Shader4.hlsl, Shader5.hlsl). The default generated script would be a single pass:
Shader1(Source)->Shader2(Shader1)->Shader3(Shader2)->Shader4(Shader3)->Shader5(Shader4);
but you could change it to something like this:
Shader1(Source)->Shader2(Shader1)->Shader3(Shader2); // first pass
Shader4(Source); // second pass
Shader5(Source, Shader3, Shader4); // third pass
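To make the proposed script format concrete, here is a minimal Python sketch of how such a script could be read into a list of passes. It assumes the syntax exactly as written in the example above (`;` ends a pass, `->` chains shaders, `//` starts a comment, one statement per line). The name `parse_passes` and the tuple layout are hypothetical and are not taken from YAP.

```python
import re

# One stage looks like "Shader5(Source, Shader3, Shader4)".
STAGE = re.compile(r"^\s*(\w+)\s*\(([^)]*)\)\s*$")

def parse_passes(script):
    """Return a list of passes; each pass is a list of
    (shader_name, [input_names]) tuples in execution order."""
    passes = []
    for line in script.splitlines():
        line = line.split("//", 1)[0]          # drop trailing comment
        for statement in line.split(";"):
            if not statement.strip():
                continue                       # skip blank remainders
            stages = []
            for stage in statement.split("->"):
                match = STAGE.match(stage)
                if match is None:
                    raise ValueError("cannot parse stage: %r" % stage)
                name, args = match.groups()
                inputs = [a.strip() for a in args.split(",") if a.strip()]
                stages.append((name, inputs))
            passes.append(stages)
    return passes

example = """
Shader1(Source)->Shader2(Shader1)->Shader3(Shader2); // first pass
Shader4(Source); // second pass
Shader5(Source, Shader3, Shader4); // third pass
"""
passes = parse_passes(example)
```

With this layout a renderer could walk `passes` in order and keep each shader's output texture available by name, so a later stage such as `Shader5` can bind several earlier results as inputs.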
22nd December 2014, 11:27 | #85 | Link | |
Registered User
Join Date: Dec 2013
Posts: 753
|
Quote:
|
|
23rd December 2014, 04:48 | #89 | Link | |
YAP author
Join Date: Jul 2014
Location: Russian Federation
Posts: 111
|
Quote:
v0lt, uncheck the "Explorer context menu entry" option and YAP will remove it. But if you only plan to move the executable to some other place, don't worry, YAP will update the path automatically.
|
23rd December 2014, 09:51 | #91 | Link |
Registered Developer
Join Date: Sep 2006
Posts: 9,140
|
JFMI: What do you need to scale for? I thought SuperRes would look at the original/unscaled image and at the final/scaled image and then post-process the scaled image, based on analyzing both images? Do you need to manually scale another time to make SuperRes work?
|
23rd December 2014, 13:38 | #92 | Link | |
Registered User
Join Date: Dec 2013
Posts: 753
|
Quote:
- D^t (A - D B)
where D^t is the transpose of the downscaling operator, which (surprisingly) is the corresponding upscaling operator. For instance, if D performs bicubic downscaling then D^t performs bicubic upscaling. This means that you can calculate this part by downscaling 'B', subtracting the result from A, and then upscaling the difference. You could technically do this in one go, but that is several orders of magnitude slower.
This method does effectively invert the downscaling operation, but it is ill-behaved: it creates lots of ringing and aliasing. To avoid that, it is necessary to do some post-processing to remove those artifacts, but of course this may cause the image to deviate from the original again, so you have to correct that again. This goes back and forth a few times and (hopefully) converges onto a final image. In practice 2 iterations seem to be enough to get reasonable results.
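The claim that the transpose of a downscaler acts as an upscaler can be checked on a toy case. The Python sketch below is an illustration only: it assumes a 1D signal and a simple 2x box filter for D (not the bicubic kernel mentioned above), and the function names are made up for this example. In this toy case D^t equals nearest-neighbour upscaling up to a constant factor of 1/2.

```python
def downscale(b):
    """D: halve a 1D signal by averaging each pair of neighbouring samples."""
    return [(b[2 * i] + b[2 * i + 1]) / 2.0 for i in range(len(b) // 2)]

def downscale_transpose(a):
    """D^t: transpose of `downscale`. Each low-res sample is spread over the
    two high-res positions it was averaged from, with weight 1/2 each."""
    out = []
    for value in a:
        out.extend([value / 2.0, value / 2.0])
    return out

def dot(u, v):
    """Plain inner product of two equally long lists."""
    return sum(x * y for x, y in zip(u, v))

x = [1.0, 3.0, 2.0, 6.0, 5.0, 7.0]   # arbitrary high-res signal
y = [4.0, 0.0, 2.0]                  # arbitrary low-res signal

# Defining property of a transpose: <D x, y> == <x, D^t y>.
lhs = dot(downscale(x), y)
rhs = dot(x, downscale_transpose(y))
```

Both inner products come out equal, which is what makes `downscale_transpose` the transpose of `downscale`, and its output has the high-res length, i.e. it behaves as an upscaler.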
|
24th December 2014, 11:49 | #93 | Link |
Registered Developer
Join Date: Sep 2006
Posts: 9,140
|
Ok, so let me try to sum that up:
1) SuperRes needs access to the original image A and the upscaled image B.
2) SuperRes downscales B internally to the resolution of A.
3) SuperRes calculates the difference between A and the downscaled B.
4) SuperRes upscales the difference to the resolution of B.
5) SuperRes applies the upscaled difference to B.
Is that correct? I suppose the algorithms for steps 2) and 4) should be "identical" (e.g. both Bicubic AR)? Should they also be identical to the original upscaling algorithm used to upscale A to B? Or is that not necessary?
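The five steps listed above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the SuperRes shader code: images are 1D lists, the scale factor is 2, the downscaler is a box filter, the upscaler is nearest-neighbour, and no post-processing is done. The name `superres_step` is hypothetical.

```python
def downscale(img):
    """Stand-in downscaler: average each pair of neighbouring samples."""
    return [(img[2 * i] + img[2 * i + 1]) / 2.0 for i in range(len(img) // 2)]

def upscale(img):
    """Stand-in upscaler: nearest-neighbour 2x (each sample is repeated)."""
    out = []
    for value in img:
        out.extend([value, value])
    return out

def superres_step(a, b):
    """One correction pass; a is the original, b the upscaled guess (step 1)."""
    b_down = downscale(b)                          # step 2
    diff = [x - y for x, y in zip(a, b_down)]      # step 3
    diff_up = upscale(diff)                        # step 4
    return [x + d for x, d in zip(b, diff_up)]     # step 5

a = [10.0, 20.0, 30.0]                    # original (low-res) image
b = [8.0, 14.0, 18.0, 26.0, 27.0, 29.0]   # some initial upscale of a
corrected = superres_step(a, b)
```

With these particular stand-in filters a single pass already makes the corrected image downscale back to `a` exactly; with other kernels the match is generally only approximate, which is one reason the correction may be repeated.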
24th December 2014, 12:05 | #94 | Link | |
Registered User
Join Date: Dec 2013
Posts: 753
|
Quote:
Anyway, according to the mathematics, steps 2) and 4) should have "identical" scaling algorithms, but in practice using a better algorithm for step 4) has far more benefit than using a better algorithm for step 2). My current favorite combination is bilinear for downscaling and Gaussian for upscaling (low aliasing and ringing). In theory you can use whatever algorithm you want for the initial scaling of A to B, but it's generally better to use one without too much aliasing; NEDI is almost ideal in that regard.
|
25th December 2014, 02:18 | #97 | Link |
Angel of Night
Join Date: Nov 2004
Location: Tangled in the silks
Posts: 9,559
|
Anima123, it looks like it hands off to the player/renderer/next in chain to make any final adjustments, like plain NEDI. Without NEDI it'll go direct to the output resolution (unless you have a weird chain and force something else), so no further resize should be done.
Shiandow, something I'm curious about with SuperRes: once the algorithm is pretty locked in, will converting it to OpenCL make a big difference? Also, would you eventually be willing to make an AviSynth or VapourSynth filter out of it? (NEDI-based upsizing is definitely better than NNEDI for some things.) If not, at least having the code available makes it possible for others.
It just keeps getting better; I really like how well it works!
Also, if you guys don't mind, I think it's best to split this discussion out of the YAP thread.
25th December 2014, 11:03 | #98 | Link | |
Registered User
Join Date: Dec 2013
Posts: 753
|
Quote:
Anyway, this discussion is deviating quite a bit from the original topic, so I agree that it would probably be better to move it to its own thread.
|
26th December 2014, 04:41 | #99 | Link |
YAP author
Join Date: Jul 2014
Location: Russian Federation
Posts: 111
|
Shiandow, to be completely sure:
For SuperRes: NEDI-pre -> <Upscale>-I -> <Upscale>-II -> SuperRes-pre -> SuperRes -> [SuperRes-inf -> SuperRes] -> NEDI-pst
Does Upscale mean a 2x upscale in both directions? Where should the final scaling happen, and which algorithm is used (or is better to use) for it? And how does MPC-HC know that it should reallocate and resize the output texture on steps 2 and 3?
And can you please draw the same scheme for MPDN NEDI?
Last edited by Orf; 26th December 2014 at 04:51.
26th December 2014, 14:36 | #100 | Link | |
Registered User
Join Date: Dec 2013
Posts: 753
|
Quote:
Code:
       /---------------------------------------\
       |                                       |
       v                                       |
Initial Guess ---> Downscale ---> Diff ---> SuperRes
                                   ^           ^
                                   |           |
Original --------------------------+-----------/
For the NEDI shaders the diagram is something like:
Code:
NEDI-pre -> NEDI-I -> NEDI-II -> NEDI-pst
Code:
Input (w,h)--->NEDI-Hinterleave (2w,h)--->NEDI-Vinterleave(2w,2h)
     |              ^         |                 ^
     V              |         V                 |
NEDI-I (w,h)--------/     NEDI-II (2w,h)--------/
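The texture sizes in the last diagram can be traced with a small Python sketch. It only illustrates the size bookkeeping of the two interleave stages. The two `interpolate_*` functions are placeholders that average neighbouring pixels; they do not reproduce the actual NEDI interpolation or its sample positions, and all function names are hypothetical.

```python
def interpolate_between_columns(img):
    """Placeholder for NEDI-I: one new sample per pixel, (w,h) -> (w,h)."""
    return [[(row[x] + row[min(x + 1, len(row) - 1)]) / 2.0
             for x in range(len(row))] for row in img]

def h_interleave(original, interpolated):
    """(w,h) + (w,h) -> (2w,h): alternate original and new columns."""
    out = []
    for row_o, row_i in zip(original, interpolated):
        merged = []
        for o, i in zip(row_o, row_i):
            merged.extend([o, i])
        out.append(merged)
    return out

def interpolate_between_rows(img):
    """Placeholder for NEDI-II: one new row per row, (2w,h) -> (2w,h)."""
    h = len(img)
    return [[(img[y][x] + img[min(y + 1, h - 1)][x]) / 2.0
             for x in range(len(img[y]))] for y in range(h)]

def v_interleave(original, interpolated):
    """(2w,h) + (2w,h) -> (2w,2h): alternate original and new rows."""
    out = []
    for row_o, row_i in zip(original, interpolated):
        out.append(list(row_o))
        out.append(list(row_i))
    return out

def size(img):
    """Return (width, height) of a list-of-rows image."""
    return (len(img[0]), len(img))

source = [[0.0, 2.0], [4.0, 6.0]]             # (w,h) = (2,2)
step1 = interpolate_between_columns(source)   # (w,h)
wide = h_interleave(source, step1)            # (2w,h)
step2 = interpolate_between_rows(wide)        # (2w,h)
result = v_interleave(wide, step2)            # (2w,2h)
```

The point of the sketch is that each interpolation stage keeps the size of its input, and only the interleave stages double one dimension, so the output texture has to be reallocated at those two stages.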
|