NNEDI - intra-field deinterlacing filter
tritical
15th September 2007, 03:41
Well, here is nnedi v1.3 (http://bengal.missouri.edu/~kes25c/nnedi_v1.3.zip). It isn't perfect yet, but I think it definitely proves that this method can work well. A v2.0 is already in the works. The filter operation is pretty simple... it throws away one field of each input frame and then interpolates the missing pixels. There is a parameter called 'field' to control which field is kept and double vs same rate output (same as the field parameter in eedi2). Then there are boolean Y, U, and V parameters to control which planes are processed.
This filter turned out to be pretty good for resizing as well (limited to powers of 2 enlargement). Using it for resizing is pretty easy... pointresize the height to 2x, use nnedi, rotate left or right, pointresize again, use nnedi a second time, etc... It is slightly more difficult for YUY2 because turnleft()/turnright() will mess up (blur/interpolate) the chroma. So you will need to use utoy() and vtoy() to pull the chroma planes out and then process each of the 3 clips separately. Example functions for 2x resizing:
function nnediresize2x(clip c, bool pY, bool pU, bool pV)
{
    v = c.nnedi(dh=true,Y=pY,U=pU,V=pV).turnleft()
    v = v.nnedi(dh=true,Y=pY,U=pU,V=pV).turnright()
    return v
}
function nnediresize_YUY2(clip c)
{
    cy = c
    cu = c.utoy()
    cv = c.vtoy()
    cy = nnediresize2x(cy,true,false,false)
    cu = nnediresize2x(cu,true,false,false)
    cv = nnediresize2x(cv,true,false,false)
    return ytouv(cu,cv,cy)
}
function nnediresize_YV12(clip c)
{
    return nnediresize2x(c,true,true,true)
}
This will result in a shifted image, the direction being dependent on the rotations used.
As an example, 4x enlargement (http://bengal.missouri.edu/~kes25c/t0.png) of clown image from http://www.general-cathexis.com/interpolation.html. No pre or post processing.
Any feedback is welcome, and thanks again to everyone who contributed cpu time :thanks:.
Dark Shikari
15th September 2007, 04:01
Wow, that is a nice filter :eek:
Great work.
Revgen
15th September 2007, 06:44
I decided to use it as a bob filter and compare it to other bob filters. Since I tend to encode sports, it's what I'm most interested in.
Here's my interpretation:
NNEDI: Very Good Quality, Terrible Stability
MVBOB: Very Good Quality, Very Good Stability
MCBOB: Best Quality, Best Stability
NNEDI+TDeint: Good Quality, Good Stability
NNEDI didn't allow too many stray interlaced lines to come in, but the video was flickering and jerking too much and wasn't stable. Pairing it with TDeint improved stability but allowed more stray interlacing artifacts in. If there are any suggestions on improving quality for the NNEDI scripts, let me know.
Here's my Lagarith sample. http://www.mediafire.com/?42y1jis3mbg
NNEDI Settings:
NNEDI = nnedi(field=3,y=true,u=true,v=true,threads=2,opt=0)
NNEDI+TDeint =
interp = nnedi(field=3,y=true,u=true,v=true,threads=2,opt=0)
tdeint(mode=1,order=1,edeint=interp)
MVBOB = Default
MCBOB = Default
foxyshadis
15th September 2007, 07:56
nnedi is a replacement for eedi2, not a smart bob on its own. Swap it for eedi2 in mvbob and then compare it to eedi2's performance, using either securebob, mvbob, or mcbob as you prefer.
Function NNEDIbob(clip Input)
{
    Input.nnedi(Field = -2)
    AssumeFrameBased()
    GetParity(Input) ? AssumeTFF() : AssumeBFF()
}
Add this as type == 4 to SecureBob. Make it default if you want, currently eedi2 is. Same thing in mcbob, but the relevant line to change there is edibobbed = clp.EEDIbob().
tritical
15th September 2007, 10:11
As foxyshadis said, nnedi is just an interpolator like eedi2. It would never be able to beat a motion compensated or motion adaptive bobber on content with static logos and writing.
Anyways, I noticed a bug which was causing the incorrect lines of the chroma planes to be kept in yv12 (yuy2 was fine). I modified the link above to point to version 1.1.
Revgen
15th September 2007, 10:24
nnedi is a replacement for eedi2, not a smart bob on its own. Swap it for eedi2 in mvbob and then compare it to eedi2's performance, using either securebob, mvbob, or mcbob as you prefer.
Function NNEDIbob(clip Input)
{
    Input.nnedi(Field = -2)
    AssumeFrameBased()
    GetParity(Input) ? AssumeTFF() : AssumeBFF()
}
Add this as type == 4 to SecureBob. Make it default if you want, currently eedi2 is. Same thing in mcbob, but the relevant line to change there is edibobbed = clp.EEDIbob().
Thanks. I'll do it tomorrow.
tritical
15th September 2007, 10:41
I bobbed your sample using yadif with nnedi for spatial prediction. Result: test.avi (http://bengal.missouri.edu/~kes25c/test.avi)
scharfis_brain
15th September 2007, 11:34
@tritical: wow!
I am really impatient right now to see a yadif+nnedi.dll so I can implement it in mvbob. This should yield a massive improvement in stability.
scharfis_brain
15th September 2007, 14:17
I tested nnedi now and found that it creates garbage with SSE.
It works fine if I force it to use C-Code.
tritical
15th September 2007, 17:28
What is the script you're using, and what cpu does your computer have? c code and sse produce exactly the same results on my laptop and desktop.
scharfis_brain
15th September 2007, 17:48
loadplugin("c:\x\nnedi.dll")
avisource("60i-YUY2-Huffy.avi").assumetff()
nnedi(opt=0, field=-2)
the resulting image looks like this:
(opt=2 also produces this result)
http://home.arcor.de/scharfis_brain/samples/nnedi-opt0.jpg
when I set opt=1 I receive a pretty nice interpolated result:
http://home.arcor.de/scharfis_brain/samples/nnedi-opt1.jpg
the source video is 640x480@29.97fps YUY2
converting it to YV12 results in the same weird image.
I use an Athlon XP 2600+ (Barton Core) with an ASUS A7N8X-XE Mainboard and 2 Gigs of RAM.
The source image looks like this:
http://home.arcor.de/scharfis_brain/samples/nnedi-source.jpg
tritical
15th September 2007, 19:08
scharfis, could you run [link removed] with debugview open to capture the output. It should show which sse routines aren't working correctly on your computer.
The output log might get really big really fast.
Chainmax
15th September 2007, 19:24
The 4x enlargement looks amazing; it's better than most results on that page and at least comparable to Zhao Xin-LI and LAD Deconvolution :eek:. Great work, tritical!
MfA
15th September 2007, 21:17
BTW, what kind of downsampling (or rather PSF) are you optimizing for? Straight bilinear (box) like Aruzinsky?
tritical
15th September 2007, 21:47
yadifmod v1.0 (http://bengal.missouri.edu/~kes25c/yadifmod_v1.zip). I've had this for a while, but never got it together for release. It is the same as Fizick's port, except that spatial predictions are taken from a user supplied clip. Also, it is not an Avisynth_C plugin. It works with YV12 and YUY2 input.
@MfA
None really. The primary purpose of the filter is interpolation for deinterlacing not image enlargement. The training set for v1.0 consisted of 220 frames taken from about 30-35 dvd sources (many of them being anime, probably 5-10 were real life sources) and some random images. The filter simply learns to predict a pixel value given only the pixels in the opposite field surrounding its location. For v2.0 I am increasing it to ~250-270 frames. Most of the new ones are from real life images and test clips I found on the internet. There are some other internal changes being made for the next version as well.
Revgen
15th September 2007, 22:14
Just previewed both MVBob and MCBob with NNEDI and it looks pretty good so far looking at still frames. I experimented with this section in MCBOB:
# If requested, do additional PP via EEDI2
# ----------------------------------------
oweave.mt_merge(last,notstatic,luma=false,U=3,V=3)
AssumeTFF()
edisingle = eedi2().LanczosResize(ox,oy,0,-0.5,ox,2*oy+0.001,taps=3)
edidouble = merge(SeparateFields().SelectEven().eedi2(field=1),SeparateFields().SelectOdd().EEDI2(field=0),0.5)
(EdiPost==1) ? edisingle : \
(EdiPost==2) ? edidouble : last
and changed it to
# If requested, do additional PP via NNEDI
# ----------------------------------------
oweave.mt_merge(last,notstatic,luma=false,U=3,V=3)
AssumeTFF()
edisingle = nnedi()
edidouble = merge(nnedi(field=1),nnedi(field=0),0.5)
(EdiPost==1) ? edisingle : \
(EdiPost==2) ? edidouble : last
It looked okay, but it didn't smooth jagged lines as well as the EEDI2 one, so I kept the former.
I'll let you know more once they are fully encoded.
tritical
15th September 2007, 22:56
Changing
edisingle = eedi2().LanczosResize(ox,oy,0,-0.5,ox,2*oy+0.001,taps=3)
to
edisingle = nnedi()
can't be right. eedi2 is taking in a frame and doubling the height. Whereas, nnedi is taking in the same frame, throwing out half the lines and then interpolating them. To get the height doubling behavior with nnedi you need to pointresize to 2x vertically prior to calling nnedi. It should be:
edisingle = pointresize(width,2*height).nnedi().LanczosResize(ox,oy,0,-0.5,ox,2*oy+0.001,taps=3)
Revgen
15th September 2007, 23:01
Changing
edisingle = eedi2().LanczosResize(ox,oy,0,-0.5,ox,2*oy+0.001,taps=3)
to
edisingle = nnedi()
can't be right. eedi2 is taking in a frame and doubling the height. Whereas, nnedi is taking in the same frame, throwing out half the lines and then interpolating them. To get the height doubling behavior with nnedi you need to pointresize to 2x vertically prior to calling nnedi. It should be:
edisingle = pointresize(width,2*height).nnedi().LanczosResize(ox,oy,0,-0.5,ox,2*oy+0.001,taps=3)
Okay, I'll try that out then.
scharfis_brain
16th September 2007, 00:59
@tritical:
the special version of nnedi.dll you gave me for testing with debugview neither shows a correct result with opt=2 nor with opt=1.
debugview's only (over and over repeated) message is this:
[2876] findCluster doesn't match!
however, your officially posted nnedi.dll works fine with opt=1.
Revgen
16th September 2007, 01:24
Okay I've now looked at MVBob and MCBob, and it appears that NNEDI makes a definite difference on edges. With EEDI2 the edges display something I call "blur bubbles" on straight lines. Replacing EEDI2 with NNEDI seems to greatly reduce if not eliminate these artifacts.
Here's an example of MVBob in its regular state.
http://img118.imageshack.us/img118/6697/blurbubblecw9.png (http://imageshack.us)
Here's MVBob paired with NNEDI instead of EEDI2
http://img297.imageshack.us/img297/713/noblurbubblehr0.png (http://imageshack.us)
The differences are hard to notice while in motion though.
scharfis_brain
16th September 2007, 01:27
@revgen: if you cannot use something else than Paint then just ensure to set the image size to 1x1 pixels via
Image -> Attributes
before pasting an image!
this will avoid the white borders!
Revgen
16th September 2007, 01:30
@revgen: if you cannot use something else than Paint then just ensure to set the image size to 1x1 pixels via
Image -> Attributes
before pasting an image!
this will avoid the white borders!
I have no idea how to use paint. :p
I'll do it next time.
tritical
16th September 2007, 01:48
scharfis, I put up a new nnedi.dll at the same location as before. Can you download it and see if it fixes the problems with sse? The last one I put up always used sse (then compared the results for each routine to the C version), so opt didn't do anything.
scharfis_brain
16th September 2007, 01:58
this version behaves like the original one:
- no debugview output
- opt=1 produces a nice output
- opt=2 produces garbage
btw.: I am working with AVS 2.58
tritical
16th September 2007, 02:13
One more time, same link as before. If it still doesn't work I'm out of ideas.
scharfis_brain
16th September 2007, 05:30
Still the same:
(to quote myself)
this version behaves like the original one:
- no debugview output
- opt=1 produces a nice output
- opt=2/0 produces garbage
EDIT: I just tested it in Microsoft VirtualPC on a fresh, virgin-like install of WindowsXP.
The result was the same:
- opt=1 OK
- opt=2/0 Garbage
Is it possible, that my CPU is faulty and processes SSE commands in a wrong way?
Are there programs to check for correct execution of commands (or command sets like SSE)?
tritical
16th September 2007, 08:04
I don't know of any programs to check correct execution of sse, but I also haven't looked for one. The only thing that makes the findCluster sse routine (which is the only one that doesn't work correctly on your computer) different from the other sse routines is that it uses the 'comiss' instruction. The rest of it is almost exactly the same as one of the other routines which works correctly.
Maybe someone else with an athlon xp can test?
Fizick
16th September 2007, 09:05
same bug with my AthlonXP 1800+
tritical
16th September 2007, 11:11
Here are the C/sse routines, maybe someone can see something I can't:
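#include <float.h> // FLT_MAX
int findCluster_C(const float *input, const float *clusters, const int n)
{
    int idx = 0; // index of the closest prototype found so far
    float mdiff = FLT_MAX;
    for (int i=0; i<n; ++i)
    {
        // squared Euclidean distance to prototype i (100 floats each)
        float diff = 0.0f;
        for (int j=0; j<100; ++j)
            diff += (input[j]-clusters[j])*(input[j]-clusters[j]);
        if (diff < mdiff)
        {
            mdiff = diff;
            idx = i;
        }
        clusters += 100; // advance to the next prototype
    }
    return idx;
}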
__declspec(align(16)) const float sse_floatmax[4] =
    { FLT_MAX, FLT_MAX, FLT_MAX, FLT_MAX };
int findCluster_SSE(const float *input, const float *clusters, const int n)
{
    int idx;
    __asm
    {
        xor eax,eax                 // i = 0
        mov edx,n
        mov esi,clusters
        movaps xmm7,sse_floatmax    // smallest distance so far
    i_loop:
        mov edi,input
        mov ecx,5                   // 5 passes x 20 floats = 100 per prototype
        xorps xmm0,xmm0             // two accumulators
        xorps xmm1,xmm1
    twenty_loop:
        movaps xmm2,[esi]
        movaps xmm3,[esi+16]
        movaps xmm4,[esi+32]
        movaps xmm5,[esi+48]
        movaps xmm6,[esi+64]
        subps xmm2,[edi]
        subps xmm3,[edi+16]
        subps xmm4,[edi+32]
        subps xmm5,[edi+48]
        subps xmm6,[edi+64]
        mulps xmm2,xmm2
        mulps xmm3,xmm3
        mulps xmm4,xmm4
        mulps xmm5,xmm5
        mulps xmm6,xmm6
        addps xmm1,xmm2
        addps xmm3,xmm4
        addps xmm5,xmm6
        addps xmm0,xmm3
        addps xmm1,xmm5
        add esi,80
        add edi,80
        sub ecx,1
        jnz twenty_loop
        addps xmm0,xmm1             // combine the accumulators
        movhlps xmm1,xmm0           // horizontal sum of the four lanes
        addps xmm0,xmm1
        movaps xmm1,xmm0
        psrlq xmm1,32
        addss xmm0,xmm1
        comiss xmm0,xmm7            // compare with smallest distance
        jae check_loop
        movss xmm7,xmm0             // new smallest distance
        mov idx,eax
    check_loop:
        add eax,1
        cmp eax,edx
        jl i_loop
    }
    return idx;
}
ARDA
16th September 2007, 14:35
@tritical
First of all, thanks for this contribution. From a quick look (I didn't analyze the code), if I remember correctly,
psrlq xmm1,32 is an SSE2 instruction not supported on older SSE-only CPUs. In SSE, all xmm instructions
are floating-point ones.
I don't have my papers here, but I'm ALMOST sure about that.
Best regards for this project
ARDA
IanB
16th September 2007, 16:51
Yep, psrlq xmm1,32 is an SSE2 instruction.
A convenient reference is distrib/include/SoftWire/InstructionSet.cpp
One of SHUFPS, UNPCKLPS or UNPCKHPS is probably what you want.
Terranigma
16th September 2007, 17:14
I'm loving this filter. It's really fast and does a terrific job when used with yadifmod. :D
I couldn't ask for more. :)
Revgen
16th September 2007, 20:17
I bobbed your sample using yadif with nnedi for spatial prediction. Result: test.avi (http://bengal.missouri.edu/~kes25c/test.avi)
Oops! Looks like I missed this post.
That's not too bad at all for Yadif. I'll try it out myself later.
tritical
16th September 2007, 20:35
Thank you ARDA and IanB. I replaced psrlq with shufps. The funny thing is I originally added movaps/psrlq to replace pshufd so that it wouldn't require SSE2.
I put up a new version at the same link as before. scharfis or Fizick, could you test it when you have time?
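For reference, the horizontal add at the end of the inner loop can be done with SSE1-only instructions. The following is a minimal sketch using compiler intrinsics instead of inline asm; the function name hsum4 and the scalar fallback for non-x86 builds are assumptions made for illustration, not the code in the updated nnedi.dll.

```c
#if defined(__SSE__) || defined(_M_X64) || defined(_M_IX86)
#include <xmmintrin.h>  /* SSE1 intrinsics only, no SSE2 needed */

/* Sum four floats with movhlps + shufps + addss (all SSE1). */
float hsum4(const float *p)
{
    __m128 v    = _mm_loadu_ps(p);
    __m128 hi   = _mm_movehl_ps(v, v);   /* lanes 0,1 = v2, v3 */
    __m128 sum2 = _mm_add_ps(v, hi);     /* lanes 0,1 = v0+v2, v1+v3 */
    /* shufps moves lane 1 down to lane 0; this is the step psrlq did */
    __m128 lane1 = _mm_shuffle_ps(sum2, sum2, _MM_SHUFFLE(1, 1, 1, 1));
    return _mm_cvtss_f32(_mm_add_ss(sum2, lane1));
}
#else
/* Scalar fallback (assumption) so the sketch builds on non-x86 targets. */
float hsum4(const float *p)
{
    return (p[0] + p[2]) + (p[1] + p[3]);
}
#endif
```

Every intrinsic used here maps to an instruction available on an Athlon XP, which is the point of the shufps swap.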
scharfis_brain
16th September 2007, 20:46
@tritical: it works this way now and it is much faster!
Many thanks!
Chainmax
16th September 2007, 20:54
Revgen, could you try to include TDeint+NNEDI+TMM on your comparison?
Revgen
16th September 2007, 21:33
Revgen, could you try to include TDeint+NNEDI+TMM on your comparison?
Hmm... I didn't know about TMM until you mentioned it. I'll try it out as soon as my other encode is finished.
Revgen
17th September 2007, 07:06
Okay I checked out TDeint+TMM+NNEDI. The good news is that it rivals MVBob (with either EEDI2 or NNEDI in the script) in terms of quality and stability. The bad news is that it's about as slow as MVBob too. And this is with Threads=2 enabled for NNEDI. It doesn't come close to MCBob though, regardless of whether MCBob is using NNEDI or not.
I wonder if Tritical would be interested in adding an Emask parameter to Yadifmod.
It would be nice to see what result we get with Yadif combined with NNEDI and TMM.
tritical
17th September 2007, 08:35
If you were going to use tmm/nnedi you would get the same output as using tdeint+tmm+nnedi... there wouldn't be anything for yadif to do. It doesn't matter anyways, because yadif doesn't use a motion mask like tmm outputs. Yadif doesn't make a straight weave or don't weave decision. It starts with the spatial prediction, and then limits that value to be within 'diff' of the weaved prediction (average of pixels from the prev and next fields). 'diff' is calculated from temporal differences and spatial differences.
There is one obvious improvement that can be made to yadif, and that is to slide the temporal window. Right now it is basically a five field check that checks only the middle case... so, for example, it will never output the weaved prediction if the center field (the one being turned into a frame) is within 2 fields (ahead or back) of a scenechange. The only downside is the added computational complexity. Making it check all five cases is on my list of things to do.
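The limiting step described above can be sketched as follows. This is only an illustration: the function name limit_to_weave and the integer rounding of the average are assumptions, and 'diff' is taken as an input because its calculation from temporal and spatial differences is not spelled out here.

```c
/* Limit a spatial prediction to lie within `diff` of the weaved
 * prediction (average of the same pixel in the previous and next
 * fields). `diff` is assumed to be computed elsewhere. */
int limit_to_weave(int spatial, int prev_px, int next_px, int diff)
{
    int weaved = (prev_px + next_px) / 2;   /* rounding is an assumption */
    if (spatial > weaved + diff) return weaved + diff;
    if (spatial < weaved - diff) return weaved - diff;
    return spatial;
}
```

With a small 'diff' (static area) the output stays close to the weave; with a large 'diff' (motion) the spatial prediction, e.g. from nnedi, passes through unchanged.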
2Bdecided
17th September 2007, 11:39
Thanks for more toys to play with!
What's the difference, algorithmically, between NNEDI and EEDI2? (Apart from EEDI2 wanting the fields, and NNEDI throwing one field away from a frame?)
Should I stop using EEDI2 and start using pointresize.NNEDI?
Cheers,
David.
tritical
18th September 2007, 10:42
In terms of the basic operation, EEDI2 and NNEDI do the same thing. They just get there in different ways... EEDI2 copies every line of the input frame to every other line of the output frame and then interpolates the missing pixels. NNEDI just starts by throwing out every other line of the input frame and interpolates the missing pixels.
Algorithmically, NNEDI is a computational intelligence approach using artificial neural networks and clustering. Whereas EEDI2 uses a vector matching method to create a direction map, does some processing of the direction map, and then does linear interpolation along the determined directions. The main advantage of NNEDI is that it isn't limited to outputting the average of two pixels (one from the line above and one from the line below) like EEDI2 is. This allows it to handle conditions that EEDI2's interpolation can't, and is also the reason it can eliminate what Revgen called "Blur Bubbles," which EEDI2 produces. Atm, there are still some things EEDI2 handles better, but I'm confident NNEDI can best it on those things as well. There is still a lot of experimenting to be done as far as NNEDI is concerned.
Should I stop using EEDI2 and start using pointresize.NNEDI?
You should use whichever one looks best to you :p.
Chainmax
18th September 2007, 23:59
tritical, I used EEDI2 mostly for antialiasing and picture improvement on blocky sources (reconnecting edges). How do you expect NNEDI to behave on such cases? Also, does pointresize have a final image quality advantage over other resizing methods when pairing it with NNEDI or is it just a processing speed choice?
yup
19th September 2007, 10:58
Hi tritical!
:thanks:
Can I use this plugin to compute the pelclip for MVAnalyse (MVTools plugin)? Where do I need to use src_left=0.25 and src_top=0.25, in the first pointresize or the second?
Please advise on the right way.
With kind regards yup.
tritical
20th September 2007, 00:28
tritical, I used EEDI2 mostly for antialiasing and picture improvement on blocky sources (reconnecting edges). How do you expect NNEDI to behave on such cases? Also, does pointresize have a final image quality advantage over other resizing methods when pairing it with NNEDI or is it just a processing speed choice?
I would expect nnedi to work pretty much the same as EEDI2, but there is only one way to find out. Pointresize is the only resizing method that will work because the original pixels need to be kept intact. Basically, you just need a method that will copy the existing rows of pixels to every other line (even lines if field=1 or odd lines if field=0) of the height doubled input into nnedi. The point resize method copies to both, so it works for both field=0/1.
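As a rough illustration of why point resizing works for both field settings, here is a minimal sketch of the row copying for a single 8-bit plane. The function name double_height and the pitch parameters are assumptions for this example; it is not the code of pointresize or of nnedi's dh option.

```c
#include <string.h>

/* Copy every source row to two consecutive destination rows, so the
 * original pixels are intact on both the even and the odd lines. */
void double_height(const unsigned char *src, int width, int height,
                   int src_pitch, unsigned char *dst, int dst_pitch)
{
    for (int y = 0; y < height; ++y) {
        const unsigned char *row = src + (size_t)y * src_pitch;
        memcpy(dst + (size_t)(2 * y) * dst_pitch, row, (size_t)width);
        memcpy(dst + (size_t)(2 * y + 1) * dst_pitch, row, (size_t)width);
    }
}
```

Whichever set of lines the interpolator keeps, it sees unmodified source rows, which is not true of a filtering resizer.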
Can I use this plugin to compute the pelclip for MVAnalyse (MVTools plugin)? Where do I need to use src_left=0.25 and src_top=0.25, in the first pointresize or the second?
If I understand the documentation correctly, mvtools actually wants a shifted clip (left/up). So you can use the code from the first post, but with field set so that the image always ends up shifted left and up:
function nnediresize2x(clip c, bool pY, bool pU, bool pV)
{
    v = c.nnedi(dh=true,Y=pY,U=pU,V=pV,field=1).turnleft()
    v = v.nnedi(dh=true,Y=pY,U=pU,V=pV,field=0).turnright()
    return v
}
function nnediresize_YUY2(clip c)
{
    cy = c
    cu = c.utoy()
    cv = c.vtoy()
    cy = nnediresize2x(cy,true,false,false)
    cu = nnediresize2x(cu,true,false,false)
    cv = nnediresize2x(cv,true,false,false)
    return ytouv(cu,cv,cy)
}
function nnediresize_YV12(clip c)
{
    return nnediresize2x(c,true,true,true)
}
Call either nnediresize_YUY2 or nnediresize_YV12 depending on the colorspace, or you could make a wrapper function which checks the colorspace and chooses the right one automatically.
IanB
20th September 2007, 07:21
Hint: To double the height fast:
Interleave(last,last).AssumeFieldBased().Weave()
foxyshadis
20th September 2007, 07:51
That's actually faster than pointresize? o.O?
Fastest of all would seem to be the way eedi2 does it internally, which is just copying every line of source into every other line of output. (With suitable simd, which eedi2 doesn't have.) I'm not actually concerned so much about speed as about the overhead of making and keeping a copy of something in cache that's just going to be thrown away right away. (I use it for biiiiiiiiig stuff.) I guess MakeWriteable would prevent that.
IanB
20th September 2007, 08:35
Yes internally doing a BitBlt(..., dest_pitch*2, ....) would be twice as fast as the weave I suggested, which does the above blit twice.
The resizer core does struggle to do a point-resize efficiently; it stupidly goes through the full motion, multiplying every pixel by 1 in a loop of 1 cycle.
tritical
20th September 2007, 08:45
I could add the option to make nnedi do it internally, which would be the fastest. However, it really won't make a noticeable difference since nnedi runs more than 100 times slower than pointresize. On my laptop pointresize 720x480 -> 720x960 runs ~260-280 fps. interleave()/weave() 720x480 -> 720x960 runs ~500-600 fps. nnedi on 720x960 input runs ~1.25 fps. Even on my quadcore the ratio is still > 100 times slower (6 fps vs 750 for point and 1100 for interleave/weave).
tritical
21st September 2007, 01:30
nnedi v1.3 (http://bengal.missouri.edu/~kes25c/nnedi_v1.3.zip). foxyshadis's argument about the cache and ram usage in general convinced me to add an option to internally do the needed copying for doubling the height... so no need to call pointresize anymore. I was also using a separate filter to pad the frames prior to nnedi, and then invoking crop afterwards. That has been done away with as well. I also discovered a bug in the yuy2 padding code, which resulted in occasionally incorrect (+-3) interpolated chroma values at the left and right hand sides of the image.
I updated the code in the first post to use the new 'dh' option instead of pointresize.
Terranigma
21st September 2007, 16:12
Thanks a lot tritical for the speedy changes :)
IanB
21st September 2007, 23:33
field -
Possible settings:
...
-1 = same rate, uses avisynth's internal parity value
...
Default: -1 (int)
Which field is kept odd or even?
dh -
...
Default: false (int) <- (bool)
Wouldn't it have been simpler to fold double height mode into extra settings for field? And maybe name field as mode?
Y, U, V -
These control whether or not the specified plane is processed. Set to true to
process or false to ignore.
What state are the unprocessed planes left? Copied, zerod, trashed, ...?
Also are you going to discuss the nitty gritty of the algorithm or are you constrained because you are doing your thesis?
tritical
22nd September 2007, 03:29
Which field is kept odd or even?
If field is set to -1 then nnedi calls child->GetParity(0) during initialization. If it returns true then field is set to 1. If it returns false then field is set to 0. If field is set to -2 then the same thing happens, but instead of setting field to 1 or 0 it sets field to 3 or 2.
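Put as code, the mapping just described looks roughly like this. It is a sketch only: the function name resolve_field is hypothetical, and parity stands for the result of GetParity(0) (nonzero for true).

```c
/* Resolve the automatic field values from the clip parity. */
int resolve_field(int field, int parity)
{
    if (field == -1) return parity ? 1 : 0;  /* same rate */
    if (field == -2) return parity ? 3 : 2;  /* double rate */
    return field;                            /* explicit 0..3 is kept */
}
```

So on a clip where GetParity(0) returns true, nnedi() behaves like nnedi(field=1) and nnedi(field=-2) like nnedi(field=3).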
Wouldn't it have been simpler to fold double height mode into extra settings for field? And maybe name field as mode?
dh could have been rolled into 4 extra values for field, but I went with adding dh instead. field could be renamed mode... I named it field because eedi2 has the same parameter with the same values.
What state are the unprocessed planes left? Copied, zerod, trashed, ...?
Unprocessed planes are left trashed.
Also are you going to discuss the nitty gritty of the algorithm or are you constrained because you are doing your thesis?
I'm not going to discuss the full details just yet. I have talked to my advisor about using it for my thesis, and he said that it would be ok. So I will probably use it for that.
I'll try and update the readme tomorrow sometime... thanks for pointing out areas lacking complete descriptions.
Also, if anyone is willing to donate cpu time again, network training for version 2 has started. Same irc channel as before (#editrain on irc.freenode.net)... files are in same directory on my website as before. Based on previous experience, I would say an x2 or core 2 duo is probably the minimum processor to complete the optimization runs in a reasonable amount of time. I'm also looking into offloading some processing onto the gpu, which should speed things up a good bit.
sh0dan
23rd September 2007, 14:47
Results are very impressive. I had a look at some of the math you link to, and it seems like another case of "Math I'll never be able to understand", just like FFT math.
Anyway, this is very interesting. I'm thinking it could be very useful for a Bayer Grid de-mosaic, which has very similar problems compared to deinterlacing.
I hope the source will be available at some point. Keep up the good work!
manolito
26th September 2007, 22:00
One (probably stupid) question:
When I want to use nnedi with TDeint or Yadifmod, do I have to use the same syntax as for EEDI2
AssumeTFF (or BFF)
interp = separatefields().selecteven().nnedi(field=1,dh=true)
yadifmod(edeint=interp)
or can I use
AssumeTFF (or BFF)
interp = nnedi()
yadifmod(edeint=interp)
I tried both, and the result (and the speed) looked pretty much the same to me.
And one small request:
While playing with different deinterlacers I found it very annoying to have two Yadif plugins with almost the same functionality, one does autoload, the other one doesn't. Is it possible to make one consolidated build of Yadif which has the "edeint" parameter, but which uses its own spatial interpolation when the edeint parameter is not specified (just like TDeint)?
Cheers
manolito
tritical
27th September 2007, 07:03
separatefields().selecteven().nnedi(field=-1,dh=true)
and
nnedi()
are equivalent. The second should be faster, but the speed difference probably wouldn't be enough to notice. As far as adding internal interpolation to yadifmod, I personally think yadif's default interpolation is far too artifact prone. So I probably wouldn't add exactly the same method that Fizick's filter uses. At the very least it would include capping the prediction to the max/min +-2 of the vertical neighbors.
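The capping mentioned above could look something like the sketch below. The function name cap_to_neighbors is hypothetical and the exact form is an assumption; only the idea of limiting the prediction to the max/min of the vertical neighbors with a margin of 2 comes from the description.

```c
/* Cap an interpolated value to [min(above, below) - 2,
 * max(above, below) + 2], where above/below are the pixels directly
 * over and under the missing one. */
int cap_to_neighbors(int pred, int above, int below)
{
    int lo = (above < below ? above : below) - 2;
    int hi = (above > below ? above : below) + 2;
    if (pred < lo) return lo;
    if (pred > hi) return hi;
    return pred;
}
```

A cap like this keeps a bad directional guess from producing a pixel far outside the range of its vertical neighbors.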
IanB
27th September 2007, 07:17
... I had a look at some of the math you link to, and it seems ...
I must have missed this, anyone got a post reference.
I hope the source will be available at some point. Keep up the good work!
I second that emotion ;)
sh0dan
27th September 2007, 09:00
I must have missed this, anyone got a post reference.
http://forum.doom9.org/showthread.php?p=1038500#post1038500
You might also spend some time at wikipedia:
http://en.wikipedia.org/wiki/Artificial_neural_network
IanB
27th September 2007, 14:16
Thx! :cool:
So if I guess correctly, you classify and examine a stack of images to build tables of how "the world" works. To do the interpolation you assume the target image is a member of "the world" with every 2nd line missing. You then search your table for "a good/best match" and insert the missing lines based on that instance of experience. This should be extensible for general times N upsizing by assuming N-1 of N lines are missing and need insertion.
Also I suspect with a little lateral thinking, the tables of how "the world" works could be adapted to analyze interlaced images to classify the areas that are static (good experience match) from those that have motion (bad experience match).....:D
manolito
27th September 2007, 17:02
@ tritical
separatefields().selecteven().nnedi(field=-2,dh=true)
and
nnedi()
are equivalent.
field=-2? Typo? Nnedi won't let me do that, it insists on field being 0 or 1 if dh=true is used. Also would not make much sense IMO because I need a same rate nnedi clip.
Cheers
manolito
Terranigma
27th September 2007, 17:41
@ tritical
field=-2? Typo?
I believe so. I believe he meant -1, as it refers to the parity ;)
Field=-2 should only be used in double-rate deinterlacing, like when using yadifmod with mode 1.
tritical
27th September 2007, 23:09
The -2 was indeed a typo, it should have been -1.
So if I guess correctly, you classify and examine a stack of images to build tables of how "the world" works. To do the interpolation you assume the target image is a member of "the world" with every 2nd line missing. You then search your table for "a good/best match" and insert the missing lines based on that instance of experience. This should be extensible for general times N upsizing by assuming N-1 of N lines are missing and need insertion.
Here is basically how it works. In the first stage a small ann takes in some surrounding pixels and predicts whether or not cubic interpolation will be close to the true pixel value (basically a two class classifier). If it thinks it is then cubic is used. Otherwise, the point is matched up to the closest of 64 cluster prototypes. The pixel is then fed to the ann for that cluster prototype (there is a separate ann for each cluster prototype), which predicts (outputs) the missing pixel value using surrounding pixels as input (more this time than for the first stage classification). In this way no single ann has to learn to approximate the entire input->output space mapping, only a small piece of it. CMAES is used to find the weights for these anns (instead of a more usual training method like gradient descent, lev-mar, etc...). In the second stage CMAES is minimizing squared error.
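The control flow of the two stages can be sketched like this. Only the structure (a classifier, a cubic path, 64 cluster prototypes with one ann each) follows the description above; all type and function names are hypothetical, and the toy_* functions are trivial stand-ins so the sketch can run, not the trained networks or the real cubic filter.

```c
#define NUM_CLUSTERS 64   /* 64 cluster prototypes, one ann per prototype */

typedef float (*PredictFn)(const float *window);

typedef struct {
    int (*cubic_is_close)(const float *small_window); /* stage 1 classifier */
    PredictFn cubic;                                   /* cheap path */
    int (*find_cluster)(const float *large_window);   /* nearest prototype */
    PredictFn cluster_ann[NUM_CLUSTERS];               /* stage 2 predictors */
} Stages;

/* Predict one missing pixel from its surrounding pixels. */
float predict_pixel(const Stages *s, const float *small_window,
                    const float *large_window)
{
    if (s->cubic_is_close(small_window))
        return s->cubic(small_window);
    int k = s->find_cluster(large_window);
    return s->cluster_ann[k](large_window);
}

/* Toy stand-ins (assumptions for the demo only). */
int   toy_classifier(const float *w)   { return w[0] == w[1]; }
float toy_cubic(const float *w)        { return 0.5f * (w[0] + w[1]); }
int   toy_find_cluster(const float *w) { (void)w; return 3; }
float toy_ann(const float *w)          { return w[1]; }
```

The find_cluster slot corresponds to a routine like findCluster_C posted earlier in the thread; each ann only has to cover the inputs that land in its own cluster.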
Mtz
28th September 2007, 19:53
I made a small test on a DV source and I said: WOW!
Reasons:
Original:
http://i20.tinypic.com/mki1cm.jpg
Nnedi:
http://i23.tinypic.com/ok4q4x.jpg
Original (the camera was moving left-right or right-left):
http://i20.tinypic.com/2w65imx.jpg
Nnedi:
http://i24.tinypic.com/aafaxj.jpg
Cedocida encode & Nnedi:
http://i23.tinypic.com/avht29.jpg
Original (the camera was moving left-right or right-left and up-and-down):
http://i24.tinypic.com/15grwn8.jpg
Nnedi:
http://i22.tinypic.com/j6o568.jpg
Cedocida encode & Nnedi:
http://i24.tinypic.com/rthk79.jpg
enjoy,
Mtz
Terranigma
28th September 2007, 21:56
mtz, are those nnedi examples using just the nnedi filter, or as an external interpolator such as using nnedi with yadifmod?
Btw, those results are awesome! :)
Mtz
29th September 2007, 00:23
I used:
AVISource("I:\MiniDV\My Video-04\InterlacedSonyDCRHC96__DV.avi")
AssumeBFF()
nnedi()
The sample is at the bottom of this page. (http://www.ftyps.com/unrelated/interlace/)
enjoy,
Mtz
Boulder
4th October 2007, 20:19
Okay I've now looked at MVBob and MCBob, and it appears that NNEDI makes a definite difference on edges. With EEDI2 the edges display something I call "blur bubbles" on straight lines. Replacing EEDI2 with NNEDI seems to greatly reduce if not eliminate these artifacts.
Would you mind posting the two functions, please?
Terranigma
4th October 2007, 21:06
Would you mind posting the two functions, please?
You can grab both modded bobbers here (http://www.zshare.net/download/404129650335d2/).
Boulder
4th October 2007, 21:43
Thanks :)
ficofico
4th October 2007, 23:38
You can grab both modded bobbers here (http://www.zshare.net/download/4016525fb0727e/)
Thanks for upload....... for me mcbob works well, mvbob... no, there's some problem.
Terranigma
4th October 2007, 23:42
for me mcbob works well, mvbob... no, there's some problem.
What kind of problem? It runs just fine here.
Make sure you have all the prerequisites.
I didn't include any of them for either script since that's only what boulder asked for.
2Bdecided
5th October 2007, 12:45
I've finally jumped in with this.
NNEDI seems to have different strengths from EEDI2 - another useful tool - thank you tritical!
However, I'm having problems with mcbob downloaded from Terranigma's link. The "old" EEDI2 MCBob_v03c works beautifully, but MCBob_v03u isn't right. It doesn't crash, and I get sensible output, but the deinterlacing is bad - on every second field (so bottom field on TFF content) moving diagonals look close to plain old bob() - much worse than v03c, and worse than NNEDI on its own. On the frame generated from top fields, it's not horrible, but not great – those bobbles on diagonals that nnedi reportedly avoided are present, whereas they aren't there with v03c.
What have I broken? I assume all plug-ins are OK since mcbob v03c works beautifully.
I can post samples and screen shots if it's not an obvious/known error.
Cheers,
David.
krieger2005
5th October 2007, 13:59
If the problem is not NNEDI, why discuss it in the NNEDI thread?
Didée
5th October 2007, 14:53
Well, the problem is about correct usage of NNEDI, so it does fit here somehow.
Wrong:
Function nnEDIbob(clip Input)
{
[...]
GetParity(Input) ? Input.SeparateFields().nnEDI(dh=true,Field = 1) : Input.SeparateFields().nnEDI(dh=true,Field = 0)
AssumeFrameBased()
GetParity(Input) ? AssumeTFF() : AssumeBFF()
}
NNEDI has slightly different operating modes compared to EEDI2, hence you can not just replace EEDI2 -> NNEDI in all circumstances.
Correct:
Function nnEDIbob(clip Input)
{
[...]
Input.SeparateFields()
GetParity(Input) ? Interleave( SelectEven().nnEDI(dh=true,Field = 1), SelectOdd().nnEDI(dh=true,Field = 0) )
\ : Interleave( SelectEven().nnEDI(dh=true,Field = 0), SelectOdd().nnEDI(dh=true,Field = 1) )
AssumeFrameBased()
GetParity(Input) ? AssumeTFF() : AssumeBFF()
}
It would be less irritating if modes -2...3 were working the same way in EEDI2 and NNEDI, agreed.
Terranigma
5th October 2007, 16:04
Thanks Didee for pointing that out. :)
It would be less irritating if modes -2...3 were working the same way in EEDI2 and NNEDI, agreed.
Yes, I fully agree.
---
Link Updated.
tritical
5th October 2007, 21:13
Function nnEDIbob(clip Input)
{
Input.SeparateFields()
GetParity(Input) ? Interleave( SelectEven().nnEDI(dh=true,Field = 1), SelectOdd().nnEDI(dh=true,Field = 0) )
\ : Interleave( SelectEven().nnEDI(dh=true,Field = 0), SelectOdd().nnEDI(dh=true,Field = 1) )
AssumeFrameBased()
GetParity(Input) ? AssumeTFF() : AssumeBFF()
}
much simpler:
Function nnEDIbob(clip Input)
{
Input.nnedi(field=-2)
}
It would be less irritating if modes -2...3 were working the same way in EEDI2 and NNEDI, agreed.
They do work the same way... the only difference is that you don't have to call separatefields() or separatefields().selecteven()/selectodd() before nnedi.
EEDIbob() could be simplified as well:
Function EEDIbob(clip Input, int "maxd")
{
GetParity(Input) ? Input.SeparateFields().EEDI2(Field = 3, maxd = maxd) : Input.SeparateFields().EEDI2(Field = 2, maxd = maxd)
AssumeFrameBased()
GetParity(Input) ? AssumeTFF() : AssumeBFF()
}
to
Function EEDIbob(clip Input, int "maxd")
{
Input.SeparateFields().EEDI2(Field = -2, maxd = maxd)
}
Just make sure you're using at least v0.9.1 of eedi2, the original didn't call vi.SetFieldBased(false).
Terranigma
6th October 2007, 01:32
Thanks tritical for making things simpler and faster :D
I updated the bobbers once again, hopefully this'll be the last time :P
With mcbob, I switched the default interpolator with nnedi, changed mtnmode to 1, renamed Vinverse to VinverseD (D = Didée. Seriously Didée, I hope you're ok with this), and included tritical's much faster vinverse dynamic library file. Grab it here (http://www.zshare.net/download/404129650335d2/) or from the link above.
2Bdecided
16th October 2007, 11:14
I've always wondered with EEDI2, and this may apply even more to NNEDI...
When I use it to re-size, I call NNEDI twice, with turnright and turnleft after each call. Sometimes I do the whole process twice (so calling NNEDI four times - twice for each dimension).
Would it be possible / better if NNEDI worked in 2-D to start with, so it doubled in each dimension in one pass? I'm not worried about speed, but worried that the current two-pass method forces NNEDI to interpolate from what's already 50% interpolated data. I'm guessing that if it interpolated in 2-D in one pass, it would use only "real" data as a source for interpolation in both dimensions.
Second (easier?) question: with or without the above modification, could it be told to interpolate more than one pixel in a given dimension? I know for interlaced content, it's always 1-in-2 pixels "missing". However, when I process other things, sometimes it's 2-in-3 pixels missing (meaning I only have 1-in-3 pixels available). For now, I just call NNEDI twice, but it would be neater (and avoid interpolating from interpolation again!) if I could call it with a parameter to do 2x (like now), or 3x, or 4x etc interpolation.
I'm not asking or expecting you to do this tritical (I know you're busy), but if either part is easy/possible and you're updating NNEDI at any time anyway, please think about it. It might make it an even more interesting academic project!
Cheers,
David.
IanB
16th October 2007, 11:38
Yes, TurnLeft/Right by the nature of the algorithm are slow. They violate the "never look down" rule for fast processing. You either read the src across and write the dest down or vice versa.
MfA
16th October 2007, 14:14
Slow is a relative term, on a modern computer it shouldn't take more than a msec to transpose a SD yv12 frame from cache even with the naive C implementation in Avisynth (on mine it takes nearly 2, but that stuff ain't modern). Just a drop in the bucket really.
A bigger problem to me seems that processing U/V planes without data from the Y plane is a bit dodgy.
2Bdecided
16th October 2007, 14:27
It's not speed I'm worried about (though if it makes it faster, great!) - it's accuracy. With turnright turnleft NNEDI is being given its own output to work from - surely it would be better if we could avoid this?
Cheers,
David.
Leak
16th October 2007, 15:39
It's not speed I'm worried about (though if it makes it faster, great!) - it's accuracy. With turnright turnleft NNEDI is being given its own output to work from - surely it would be better if we could avoid this?
That's the way all other resizers in AviSynth work - scaling in two dimensions is done by scaling in one first and then scaling the result in the other.
Also, NNEDI is designed (or rather, trained) to interpolate between an image's odd or even lines (which is about the same as doubling the height) - scaling in both directions would need a total rewrite and probably a lot more training.
2Bdecided
16th October 2007, 18:19
If it's a total re-write (that's what I feared), I guess I'll use the old workarounds to get the best out of it (i.e. doing both turnright turnleft AND turnleft turnright to give it two stabs at it!).
Of course the other AviSynth resizers are (AFAIK) linear in nature, so working in one dimension then the other, or the opposite, or both together, would all give identical results. That's certainly not the case with EEDI2 and NNEDI, since they're not linear, and not designed to work in both dimensions anyway.
Cheers,
David.
davidhorman
16th October 2007, 18:45
would all give identical results.
<stickler>Nearly identical ;)</stickler>
David (a different one)
Terka
18th October 2007, 12:17
any news about the new version?
Adub
18th October 2007, 20:11
We finished training a lot of NN's for Tritical, so I assume he is running with those somewhere. He may be a little busy elsewhere with school work and what not, but be happy in the conclusion that steps are being taken in the direction of a new version.
Terka
24th October 2007, 23:37
how to correctly use yadifmod v1?
how to get the :edeint clip: in there?
manolito
25th October 2007, 07:48
Try this:
AssumeTFF()  # or AssumeBFF(), depending on the source
interp = nnedi()
yadifmod(edeint=interp)
Cheers
manolito
Terka
25th October 2007, 09:30
thank you, ill try; its working fine
superuser
26th October 2007, 02:05
thnks will try it out ... :thumbup:
yup
29th October 2007, 08:46
Hi manolito!
Please advise a script for making 50fps from a 25fps interlaced source.
AssumeTFF()  # or AssumeBFF()
interp = nnedi()
yadifmod(edeint=interp)
gives 25 fps output.
With kind regards yup.
foxyshadis
29th October 2007, 10:46
Read the manuals, they both tell you how to get double-rate output. (Hint: field and mode.) They're not just for decoration!
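A minimal sketch of what the hint points at, assuming a top-field-first source (the file name is a placeholder). It relies on nnedi's field parameter selecting double-rate output with -2, as in tritical's nnEDIbob above, and on yadifmod's mode=1 being its double-rate mode, so that both clips end up with the same number of frames:

```avisynth
# Sketch only: the file name is a placeholder and the field order must match the source.
AVISource("interlaced_25fps.avi")
AssumeTFF()                        # or AssumeBFF()
interp = nnedi(field=-2)           # double rate, field order taken from AviSynth's parity
yadifmod(mode=1, edeint=interp)    # mode=1: double rate, i.e. 50 fps from 25 fps interlaced
```

With the defaults (no field, no mode) both filters run at same rate, which is why the original three lines return 25 fps.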
tritical
29th October 2007, 13:39
any news about the new version?
Don't fear, it is still being worked on :). I've just been busy working on some other projects the last week or two. A bit of good news is that I got permission to run the training program on my university's 512 cpu cluster, and I'm doing some test runs as I write this. A new version should be ready in a week or two (no promises though :p).
Chainmax
29th October 2007, 23:28
How does one enlarge an image by 2x in both directions using NNEDI v1.3?
Adub
30th October 2007, 00:13
turnright
turnleft?
function nnediresize2x(clip c, bool pY, bool pU, bool pV)
{
v = c.nnedi(dh=true,Y=pY,U=pU,V=pV).turnleft()
v = v.nnedi(dh=true,Y=pY,U=pU,V=pV).turnright()
return v
}
Chainmax
30th October 2007, 01:18
Doesn't that only need to be done with older versions (the ones you had to put PointResize before or something like that)?
tritical
30th October 2007, 12:38
That function is correct. With dh=true, nnedi will double the height (without the need to use pointresize before nnedi). So to resize 2x you need to do that for both the width and height... hence the need to rotate.
Leak
30th October 2007, 16:59
That function is correct. With dh=true, nnedi will double the height (without the need to use pointresize before nnedi). So to resize 2x you need to do that for both the width and height... hence the need to rotate.
Are you sure? The last time I tried it (using it to resize (http://www.blanklabelcomics.com/community/viewtopic.php?t=315&start=1100) an image (http://www.wapsisquare.com/d/20071026.html) then tracing it in Inkscape for further resizing and coloring it with GIMP... :D) it also wanted a "field=0" to work...
np: Savath & Savalas - Apnea Obstructiva (Golden Pollen)
manolito
30th October 2007, 22:38
it also wanted a "field=0" to work...
Yes, this is true. Whenever you specify "dh=true" then "field=0" or "field=1" has to be present, too. "field=-1" does NOT work in this case (contrary to what tritical says).
This is even true if the source only has one field (analog TV capture @720 x 288 PAL).
Cheers
manolito
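As a sketch of the workaround described above, here is the 2x resize function from the first post with field given explicitly in both calls. The choice of 0 rather than 1 is an assumption; either value passes the v1.3 check, and it presumably only changes the direction of the resulting image shift.

```avisynth
# Same function as in the first post, but with field set explicitly,
# because v1.3 rejects dh=true unless field is 0 or 1.
function nnediresize2x(clip c, bool pY, bool pU, bool pV)
{
    v = c.nnedi(dh=true, field=0, Y=pY, U=pU, V=pV).turnleft()
    v = v.nnedi(dh=true, field=0, Y=pY, U=pU, V=pV).turnright()
    return v
}

# Usage on a YV12 clip held in "last":
nnediresize2x(last, true, true, true)
```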
Chainmax
16th November 2007, 03:15
...
There is a parameter called 'field' to control which field is kept and double vs same rate output (same as the field parameter in eedi2).
...
Sorry for my dumbness, but I wanted to update my IVTC/Deinterlacing/Bob lines I have as templates. Does the quote mean that the equivalent to these lines:
Interp = SeparateFields().SelectEven().EEDI2(field=0)
interp = separatefields().eedi2(field=3)
would be these:
Interp = SeparateFields().SelectEven().NNEDI(field=0)
interp = separatefields().NNEDI(field=3)
?
tritical
22nd November 2007, 06:25
@Leak/manolito
Yep, you both are correct. The current error checking requires field=0 or field=1 with dh=true. That's a bug.
@Chainmax
Interp = SeparateFields().SelectEven().EEDI2(field=0)
interp = separatefields().eedi2(field=3)
is the same as
Interp = nnedi(field=0)
interp = nnedi(field=3)
g_aleph_r
26th December 2007, 09:43
Is there any chance we can see a multithreaded version like eedi2mt? or maybe Cuda enabled?
Pleeeease
Boulder
26th December 2007, 10:08
The filter already supports multithreading. By default, the number of threads is the same as the number of detected CPUs.
foxyshadis
28th December 2007, 10:18
I was just curious, but did the combined power of that giant cluster ever net you a further demo that beat the pants off the current one?
tritical
28th December 2007, 12:41
I'm actually pretty close to one, but not quite there. Having access to the cluster definitely helped though. I was able to test a number of ideas I wouldn't have been able to without it. At this point a new version might only be a couple weeks away (if my current set of experiments work out) or it could be a couple months away. I plan to graduate next December so hopefully I can make it work by then :).
Chainmax
31st December 2007, 18:52
Don't forget to dedicate some time to unwinding, tritical :).
Archimedes
16th January 2008, 16:52
This filter turned out to be pretty good for resizing as well (limited to powers of 2 enlargement).
The filter does a good job on resizing images.
Original:
http://img167.imageshack.us/img167/3412/lhousevv5.png
4 x enlarged with NNEDI:
http://img136.imageshack.us/img136/3752/lhouse0592x0592nneditc1.png
badshah
26th January 2008, 11:14
MCBob_v03u is giving me speed of 0.5fps on my c2d with 1gb ram. can anyone help me ? :confused:
neuron2
26th January 2008, 11:31
MCBob_v03u is giving me speed of 0.5fps on my c2d with 1gb ram. can anyone help me ? :confused:
Wrong thread, pal. Please read and follow forum rule 3.
badshah
26th January 2008, 11:51
Wrong thread, pal. Please read and follow forum rule 3.
sorry :o
Undead Sega
31st January 2008, 22:20
is it possible to use this and replace EEDI2 in MCBob?
Adub
31st January 2008, 22:30
Yes, and it has already been done. See the MCBob thread, and scroll down a few posts, it's there.
Terranigma
31st January 2008, 22:30
is it possible to use this and replace EEDI2 in MCBob?
Amazing what one can find by doing a search (http://forum.doom9.org/showthread.php?p=1049555#post1049555)
Undead Sega
31st January 2008, 22:33
Yes, and it has already been done. See the MCBob thread, and scroll down a few posts, it's there.
really? so it has been done, literally replacing EEDI2 with NNEDI?
cause personally i think NNEDI is better than EEDI2, and from some results, EEDI2 causes 'blur bubbles', as dubbed by one of the users here, on straight lines.
how do you do it?
Adub
31st January 2008, 22:35
Read the fracking thread! Somebody did it for you!
Search is a beautiful thing.
http://forum.doom9.org/showthread.php?p=1055263#post1055263
Undead Sega
31st January 2008, 22:43
i found it of course, but now i am wondering how to implement it.
do i basically replace the whole code of the original MCBob with NNEDI+MCBob code by Merlin?
Adub
31st January 2008, 23:38
If that is what you want, then yes.
Another alternative is to create a new avs file named something like "MCBob_NNEDI.avs", and then just import it in your script every time you use MCBob.
Oh, and on a side note, I didn't write that code, I just posted it.
It was originally made by Didee, then modded by Terranigma to support NNEDI.
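A minimal sketch of the import approach, assuming the modded script is saved as "MCBob_NNEDI.avs" next to the calling script and that the function inside it is still named MCBob (both are assumptions, as is the source path):

```avisynth
# Sketch only: file names are placeholders, and the plugins MCBob depends on
# are assumed to be loaded already (e.g. from the autoload plugin directory).
Import("MCBob_NNEDI.avs")      # makes the modded MCBob function available
AVISource("interlaced.avi")
AssumeTFF()                    # or AssumeBFF()
MCBob()
```

Keeping the modded version in its own file leaves the original MCBob script untouched.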
Terranigma
31st January 2008, 23:38
i found it of course, but now i am wondering how to implement it.
do i basically replace the whole code of the original MCBob with NNEDI+MCBob code by Merlin?
Sure. The link I was pointing you to was actually the origin of the modded mcbob. If you'd scrolled down a few posts, you would've come across this (http://forum.doom9.org/showthread.php?p=1052018#post1052018) post. :)
Undead Sega
1st February 2008, 17:34
Cheers everyone!
i finally have done it (yesterday) and damn its such a slow process!
i wanted to deinterlace a 1 hour footage and encode it to huffyuv avi, but it took 15 hours to do 30mins of encoded footage! :( even on my (quite powerful) PC.
this would make the filter not efficient enough for everyone's needs, unless revisions were made to make it faster.
Dark Shikari
1st February 2008, 17:35
Cheers everyone!
i finally have done it (yesterday) and damn its such a slow process!
i wanted to deinterlace a 1 hour footage and encode it to huffyuv avi, but it took 15 hours to do 30mins of encoded footage! :( even on my (quite powerful) PC.
this would make the filter not efficient enough for everyone's needs, unless revisions were made to make it faster.
Yes, MCBob is slow. If you want speed, use Yadif or TDeint.
Undead Sega
1st February 2008, 17:39
i have used Yadif and TDeint, but the results at the end didnt quite satisfy me completely, as Yadif gives an oil painting like image to the video, which is something that took awhile for me to realise.
how can you speed up MCBob?
Dark Shikari
1st February 2008, 17:50
i have used Yadif and TDeint, but the results at the end didnt quite satisfy me completely, as Yadif gives an oil painting like image to the video, which is something that took awhile for me to realise.
how can you speed up MCBob?
Multithreading, a faster CPU, or both.
scharfis_brain
1st February 2008, 17:56
one could try
yadifmod(mode=1, edeint=nnedi(field=-2))
Undead Sega
1st February 2008, 17:57
i see, and may i ask, how does this one differ to the original? hope u dont mind explaining.
Multithreading, a faster CPU, or both.
i already have a Dual Core processor (Intel Core 2 E6600 :D) with 2GB RAM and a 500GB hard drive. isnt that fast enough?
Adub
1st February 2008, 18:25
Nope, try a quad core. I am running the exact same get up, with my e6600 overclocked to 3ghz and about a terabyte of harddrive space and it sure as hell ain't fast enough.
Undead Sega
1st February 2008, 18:47
then that basically is the filter's fault, because the Quad Core from Intel isnt literally twice as powerful as a Dual Core.
although, i have been reading of an AviSynth on Multithread and Multiprocessor, would that happen to help?
Boulder
1st February 2008, 19:07
Since you save to a lossless file, you could split the video in two halves using Trim and process both halves simultaneously. It's faster than using the multithreaded Avisynth. Then join the two halves again in your script when you encode to the final format.
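A rough sketch of the split, written as three separate script files. The file names and frame numbers are placeholders, and MCBob plus its plugins are assumed to be loaded. Trim is applied after MCBob here (an assumption, not something stated above) so that frames near the cut still see their real temporal neighbours; AviSynth only renders the frames that are requested.

```avisynth
# --- part1.avs: encode this to a lossless file in one encoder instance ---
AVISource("interlaced.avi").AssumeTFF()
MCBob()                  # double rate: the output has twice the input frame count
Trim(0, 89999)           # first half of the bobbed output (placeholder numbers)

# --- part2.avs: encode this at the same time in a second instance ---
AVISource("interlaced.avi").AssumeTFF()
MCBob()
Trim(90000, 0)           # second half; an end value of 0 means "up to the last frame"

# --- final.avs: join the two lossless intermediates for the final encode ---
AVISource("part1.avi") ++ AVISource("part2.avi")
```

On a dual core, each instance can keep one core busy without needing a multithreaded AviSynth build.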
Undead Sega
1st February 2008, 19:11
the lossless file is the final file, which then gets colour corrected (not your usual type) and turned into an MPEG-2, unless there is a way to frameserve the filter into TMPEGXpress 4, which doesnt accept it at all.
Adub
1st February 2008, 20:16
It's much faster to go straight to lossless and then convert the lossless to whatever you want afterwards. With lossless, almost all of the cpu cycles are devoted to MCBob + whatever other filters you are using, thus maximum speed ensues.
The same thing happens when you go from lossless to something else. It takes very little processing power to decode lossless, thus your final encoder gets most of the cpu, meaning more speed.
Undead Sega
1st February 2008, 20:31
well the lossless file is, as i said, is the final, and that is the straight route from virtualdub, frameserving it to TMPEGEnc Plus into a lossless avi.
from lossless to watever is something of a personal need and it does not match the speed whatsoever of MCBob.
Didée
1st February 2008, 21:25
@ UndeadSega - Short answer: If you think MCBob is too slow to justify its results, then don't use it. Simple as that.
Slightly longer: MCBob tries to do some things "better" than other bobbers do them. As a logical consequence, it's much slower than other filters, since it has to do lots and lots of calculations.
In its current state, metrics tests indicate that MCBob has a respective edge over other bobbers. If MCBob would be "simplified" to achieve competitive speed, then it would no longer be "better", but just be another small fish in the big swarm.
You could as well ask for a H.264 implementation that needs as few resources as MPEG-1 implementations do, or for a Ferrari as cheap as a bicycle. Dreamworld and reality are not always compatible.
Undead Sega
1st February 2008, 22:00
that is very understandable, but i will not refuse to use it, because it is a very good filter when it is applied with NNEDI (integrated and separately), and that was all that needed clearing up. how funny would it be if companies started using it to deinterlace their hours of 50i footage? :D
but still, it is without doubt very slow, but i also asked, i have read on about AviSynth taking advantage on multithread and core processors, i would like to know about that and i will ask, what will the performance be like using MCBob on that?
Adub
1st February 2008, 22:07
When I use MT on my e6600, I may get about 6fps instead of 3fps.
If you want information on how to use it, see the MT thread, as this is the wrong thread for it.
Inventive Software
24th February 2008, 00:48
At the risk of grave-digging, is there a theoretical advantage to doing calculations for MCBob and/or NNEDI on the GPU, a la CUDA or CTM? I ask because this filter, whilst great and gives me great results, seems rather slow and could use a new revolution.
I could attack it over the summer, if people are interested? :rolleyes:
Undead Sega
26th February 2008, 21:20
definitely!
when mentioning GPU, i have a 8800 GTS, which does seem to be pretty powerful, so it might help and relax the CPU as well.
Adub
26th February 2008, 21:53
If you can port it, do it!! It's never a bad thing to have greater speed at no loss in quality.
Undead Sega
27th February 2008, 13:05
also, since u mentioned GPU, its Anti-Aliasing algorithm is much more sufficient and complex than the filters on AviSynth (or at the moment), maybe you use the GPU's Anti-Aliasing algorithm in combination with MCBob to give almost perfect deinterlacing :D
one could try
yadifmod(mode=1, edeint=nnedi(field=-2))
and would that get rid of the oil painting effect on the video?
Didée
27th February 2008, 13:48
also, since u mentioned GPU, its Anti-Aliasing algorithm is much more sufficient and complex than the filters on AviSynth (or at the moment)
Perhaps I'm mistaken (I'm pretty much out of gaming for a longer time), but I don't think that will work. GPU antialiasing is designed for rendering 3D meshes, and for textures that are displayed with any sort of geometrical skew.
Anti-aliasing a fixed bitmap at its original size is a completely different task! Not sure if current GPUs have routines for this specific task at all ... and if they do, how the results compare.
*.mp4 guy
29th February 2008, 14:27
Anti-aliasing a fixed bitmap at its original size is a completely different task! Not sure if current GPUs have routines for this specific task at all ... and if they do, how the results compare.
To the best of my knowledge GPUs don't have any routines for fixing aliasing in a source image.
To add to what Didée said, what they do is make sure aliasing doesn't get created during 3d rendering of an image. i.e., a GPU's antialiasing filter actually works by keeping the GPU from ever creating aliasing in the first place, it doesn't actually remove aliasing that has already been introduced.
Revgen
1st March 2008, 08:09
Perhaps I'm mistaken (I'm pretty much out of gaming for a longer time), but I don't think that will work. GPU antialiasing is designed for rendering 3D meshes, and for textures that are displayed with any sort of geometrical skew.
Anti-aliasing a fixed bitmap at its original size is a completely different task! Not sure if current GPUs have routines for this specific task at all ... and if they do, how the results compare.
There's different kinds of AA. There's adaptive AA and Full Screen AA (FSAA). FSAA is typically slower since it applies to all pixels while adaptive just tries to do the edges to prevent jaggies. I'm not sure what FSAA would look like on video though.
Maccara
1st March 2008, 14:03
There's different kinds of AA. There's adaptive AA and Full Screen AA (FSAA). FSAA is typically slower since it applies to all pixels while adaptive just tries to do the edges to prevent jaggies. I'm not sure what FSAA would look like on video though.
Typically (in the past - haven't done 3d stuff for a while :)) FSAA has been implemented so that the original picture is rendered with a higher resolution internally (not interpolated) to begin with and then scaled down.
Does not work with bitmaps at all. (you can notice this, for example, when you have poor textures to begin with - manages to only smooth out the edges in those)
For video, you get the same effect by just interpolating 2x/4x/whatever and then doing simple bilinear resize back down - sure, you get a nice "smooth" picture. :)
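The enlarge-then-shrink idea can be sketched as follows, assuming a YV12 clip in "last" and using nnedi as the 2x interpolator (any interpolator would do for the point being made). The helper is the resize function from the first post with field=0 added, since v1.3 needs it with dh=true:

```avisynth
# Sketch of "interpolate 2x, then bilinear back down" on a YV12 clip.
function nnediresize2x(clip c, bool pY, bool pU, bool pV)
{
    v = c.nnedi(dh=true, field=0, Y=pY, U=pU, V=pV).turnleft()
    v = v.nnedi(dh=true, field=0, Y=pY, U=pU, V=pV).turnright()
    return v
}

w = Width()                              # remember the original size
h = Height()
nnediresize2x(last, true, true, true)    # 2x enlargement in both directions
BilinearResize(w, h)                     # simple bilinear resize back to the original size
```

The result is smoother, but only because of the resampling; no detail is added that was not already in the source.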
Undead Sega
1st March 2008, 15:26
For video, you get the same effect by just interpolating 2x/4x/whatever and then doing simple bilinear resize back down - sure, you get a nice "smooth" picture.
but that is just resizing, not AA. i would really like to see a FSAA filter for video. or in fact, get the algorithm of it and somewhat try to use it for video (AA filter of course).
Didée
1st March 2008, 15:44
i would really like to see a FSAA filter for video. or in fact, get the algorithm of it and somewhat try to use it for video (AA filter of course).
To put it in other, simple, words:
When GPUs perform AA in 3D games, they are working with information that is not present in video sources to begin with.
(The next topic could be: "How to implement the zoom filter used in the CSI TV-series?")
tritical
5th March 2008, 00:57
At the risk of grave-digging, is there a theoretical advantage to doing calculations for MCBob and/or NNEDI on the GPU, a la CUDA or CTM? I ask because this filter, whilst great and gives me great results, seems rather slow and could use a new revolution.
It depends a lot on which video card and cpu you have. NNEDI is highly parallelizable (each pixel can be computed independently). The computation for a pixel just involves the evaluation of 1 to 2 neural networks and a distance calculation to some clusters. So it wouldn't be hard to implement with CUDA. Last fall I actually spent some time writing a CUDA implementation to offload part of the calculations used for training (which are pretty much the same ones used during normal operation). I only have an 8600 GTS so even after lots of time optimizing it it didn't turn out to be worth it compared to a multithreaded sse2 implementation running on my Q6600 (and definitely not once I got access to the university's cluster). However, if you have something like an 8800 GTX, 8800 Ultra, etc... and only a single or dual core cpu then I think it would definitely benefit. Hopefully, the source code for NNEDI will be available by the summer. I actually have a new version ready, I just need a free day to update the code.
It's a pity GPUs don't support 8/16 bit SIMD in shaders or it would be no contest, as it is a quad core CPU is pretty closely matched to a modern GPU as far as arithmetic operations on 8 bit data goes (of course the GPU is doing it with single precision floating point operations).
BBugsBunny
6th March 2008, 22:06
I actually have a new version ready, I just need a free day to update the code.
Hooray :)
I've been browsing the thread every now and then - good to hear that there is a new version ready soon!
Inventive Software
7th March 2008, 14:50
God bless universities and their computer systems. :) My uni's computers have C2Ds and 8600GTs, and if I'm nice I can possibly get access to a lab after hours. I keep an eye on the NVIDIA forums for news on CUDA. My laptop's only got an Xpress 1150 (X300), so I'd need my own box for something to be working, but from what you've said, it definitely sounds possible, so I'm very interested in making it work. :) I think having groups of pixels, or possibly 4x4, 8x8 or 16x16 macroblocks (or variations on a theme) would make it faster, since from what I've read, CUDA likes many kernel threads and lots of data.
Kumo
12th March 2008, 20:10
i'm trying to deinterlace an old anime ntsc r1 (usa) dvd. dgindex reports it as interlaced. here is a sample:
http://rapidshare.com/files/97003811...muxed.m2v.html
how should i combine nnedi with tfm to convert it to 24p?
i'm trying like
DGDecode_Mpeg2Source("H:\Kimagure Orange Road Movie 1\VTS_01_1.d2v",info=3)
colormatrix(hints=true,interlaced=true)
nnedi()
tfm(order=1,pp=1,clip2=nnedi).tdecimate(mode=1,hybrid=1)
but i think it's the wrong way (nnedi affects the whole clip). how do i call it properly?
should i try a different deinterlacer?
canuckerfan
28th March 2008, 01:18
how's NNEDI 2.0 coming along, tritical?:)
TheRyuu
29th March 2008, 00:40
DGDecode_Mpeg2Source("H:\Kimagure Orange Road Movie 1\VTS_01_1.d2v",info=3)
interp=nnedi()
tfm(order=1,pp=1,clip2=interp).tdecimate(mode=1,hybrid=1)
colormatrix(hints=true,interlaced=false)
Mug Funky
30th March 2008, 07:54
hmm. we just got one of these at work:
http://store.nvidia.com/servlet/ControllerServlet?Action=DisplayPage&Env=BASE&Locale=en_US&SiteID=nvidia&id=ProductDetailsPage&productID=67049700
would love to throw something at it :)
btw, it handles 2048x1556 images with 23 layers of grading in 32 bits before it goes under 25fps :devil:
this thing really is a monster.
morsa
3rd April 2008, 23:58
Using what software?
Scratch or AfterFX?
Raere
5th April 2008, 03:33
This might have been answered in one form or another, but I couldn't get the exact syntax...
I'm trying to replace EEDI2 with NNEDI with megui's "TIVTC + TDeint(EDI)" because it says my source is hybrid film/interlaced - mostly film.
The code is
edeintted = AssumeTFF().SeparateFields().SelectEven().EEDI2(field=-1)
tdeintted = TDeint(edeint=edeintted,order=1)
tfm(order=1,clip2=tdeintted).tdecimate(hybrid=1)
I can't just replace EEDI2 with NNEDI because it complains about having a different resolution. I think I need to put PointResize in there somewhere, but I'm not sure.
TheRyuu
5th April 2008, 14:39
nnedi doesn't double the height like eedi2 did so leave out the separatefields and selecteven call.
Raere
5th April 2008, 14:43
nnedi doesn't double the height like eedi2 did so leave out the separatefields and selecteven call.
Thanks! That works.
g_aleph_r
11th April 2008, 14:41
I actually have a new version ready, I just need a free day to update the code.
I would like to test it to see if I can use nnedi to upscale to 1080p in realtime on 8800GT, and q6600
It would be veery nice
Leak
11th April 2008, 16:44
I would like to test it to see if I can use nnedi to upscale to 1080p in realtime on 8800GT, and q6600
It would be veery nice
Somehow I've got the feeling that you'd rather need a small miracle than the above hardware for that to happen...
np: The Orb - DDD (Dirty Disco Dub) (The Dream)
g_aleph_r
3rd May 2008, 08:42
Somehow I've got the feeling that you'd rather need a small miracle than the above hardware for that to happen...
np: The Orb - DDD (Dirty Disco Dub) (The Dream)
Always hope :D
BTW: CUDA is a miracle :cool:
when will the new version be ready?
g-force
5th May 2008, 16:48
tritical,
We can't wait for the update! Hoping it comes soon.
-G
Mug Funky
6th May 2008, 12:33
ssssh! don't you know it's not at all polite to ask someone how their thesis is going? :)
g-force
6th May 2008, 17:40
oops! sorry, tritical. Hope the thesis is going well!
-G
Congrats on the Netflix problem!
I can't seem to make NNEDI work with yadifmod. This is the code I'm using:
AviSource("Deint.avi")
loadplugin("D:\Video\AviSynth 2.5\plugins\nnedi.dll")
loadplugin("D:\Video\AviSynth 2.5\plugins\yadifmod.dll")
ConvertToYUY2()
AssumeBFF()
interp = nnedi()
yadifmod(edeint=interp)
It runs, but the output is the plain yadif output. When run alone, NNEDI works fine.
hi Tritical,
regarding deinterlacing using nnedi:
is it possible to add an if statement like this to nnedi:
if an object with a horizontal shape appears, the new line is computed either from the line(s) above
or from the line(s) below.
b, w = known (different colors), ? = to be computed
bbb
???
bbb
???
www
???
www
---------------
bbb
???
bbb
bbb
www
???
www
OR
bbb
???
bbb
www
www
???
www
is it possible to decide the color from neighbouring frames?
this would avoid flickering after deinterlacing (instead of lines 1 pixel wide).
I'm working with NTSC 29.97 camcorder interlaced video captured at 720x480 compressed YUY2 huffy. I'm seeing my encoding speed drop to less than a quarter of what it was before using NNEDI.
Here's my original yadif only AVS (CCE SP2 Speed = .20):
Setmemorymax(768)
Load_Stdcall_plugin("C:\Program Files\AviSynth 2.5\plugins\yadif.dll")
AviSource("G:\VideoCapture\cam1.avi")
Trim(14,19527)+Trim(19534,20380)+Trim(20523,27914)+Trim(28672,110689)
assumetff()
Crop(0,8,-16,-8,True)
yadif(mode=1, order=1)
HDRAGC(coef_gain=.5,max_gain=1,min_gain=0,black_clip=.1,reducer=2,max_sat=1.3)
FFT3DGPU(sigma=4, bt=3, bw=32, bh=32, ow=16, oh=16, sharpen=1.5, interlaced=false)
assumetff()
separatefields().selectevery(4,0,3).weave()
AddBorders(8, 8, 8, 8, $000000)
FadeOut2(55)
FadeIn2(55)
Here's my yadifmod+nnedi AVS (CCE SP2 Speed = .04):
Setmemorymax(768)
AviSource("G:\VideoCapture\cam1.avi")
Trim(14,19527)+Trim(19534,20380)+Trim(20523,27914)+Trim(28672,110689)
assumetff()
Crop(0,8,-16,-8,True)
yadifmod(mode=1, order=1, edeint=nnedi(field=-2))
HDRAGC(coef_gain=.5,max_gain=1,min_gain=0,black_clip=.1,reducer=2,max_sat=1.3)
FFT3DGPU(sigma=4, bt=3, bw=32, bh=32, ow=16, oh=16, sharpen=1.5, interlaced=false)
assumetff()
separatefields().selectevery(4,0,3).weave()
AddBorders(8, 8, 8, 8, $000000)
FadeOut2(55)
FadeIn2(55)
Notice my sig, I have an HT processor. Using default nnedi settings it starts two threads on my machine. If I force it to use just one thread and make sure it is using SSE I don't see any change in speed:
yadifmod(mode=1, order=1, edeint=nnedi(field=-2, threads=1, opt=2))
I've also tried 2 Threads + SSE = same
Am I missing something? Others have alluded that using nnedi was at least as fast, if not faster.
I've been at this for several years now but still feel like a newbie sometimes...
Thanks so much!
Am I missing something? Others have alluded that using nnedi was at least as fast, if not faster.
Using NNEDI was as fast as what?
Tar? Molasses? Glacial migration?
Sorry, NNEDI is just about the definition of slow. It's got the quality to justify it, though...
(The "NN" in NNEDI stands for neural networks, if you didn't know - running those chews a lot of CPU...)
np: Kettel - Peeksje 1994 (My Dogan)
Using NNEDI was as fast as what?
Tar? Molasses? Glacial migration?
:D I had a chuckle.
It's not THAT slow. I can get almost realtime usage out of it. I have seen a lot worse.
It's not THAT slow. I can get almost realtime usage out of it. I have seen a lot worse.
Whoa... so what supercomputer did you use to accomplish that feat, if I may ask?
np: DJ Koze - Brutalga Square (Speicher CD2)
thetoof
9th May 2008, 16:19
I can "accomplish that feat" too (almost realtime) and I've got a q6600, 2x 2GB DDR2 RAM and 8800 GTS.
I can "accomplish that feat" too (almost realtime) and I've got a q6600, 2x 2GB DDR2 RAM and 8800 GTS.
Could you please tell us what's in the script that runs almost realtime? The source is in MPEG-2 or MPEG-4 format, SD or HD?
Does it really mean (cough) almost 30 fps? Does it use MT?
What exact NNEDI setting is used?
Thank you.
Tritical, is it possible to modify nnedi to be temporal too?
Yep, I am running an E6600 with 3gb ram. It runs fine. Note: this is with SD sources.
thetoof
12th May 2008, 04:09
MPEG-2 720x480 source, nnedi alone, default parameters, 90-100% cpu usage (multithreading is used even without MT() or SetMTMode()), 165MB RAM, 175 MB virtual memory, 20-30 fps
Undead Sega
2nd June 2008, 22:18
how is running NNEDI on a GPU going? i would really want to see that happen very soon :D
especially MCBob, i need to use that filter for a 7mins 100fps video! :(
g_aleph_r
15th June 2008, 14:37
how is running NNEDI on a GPU going? i would really want to see that happen very soon :D
Me too!!:thanks:
pitch.fr
12th July 2008, 15:14
(The next topic could be: "How to implement the zoom filter used in the CSI TV-series?")
that's something I would also like to know :)
this script looks really good, but I can't get the same results on very ugly videos as with this PS script (to be used in MPC):
// +deinterlace (blend).txt=ps_3_0
sampler s0 : register(s0);
float4 p0 : register(c0);
float4 p1 : register(c1);
#define width (p0[0])
#define height (p0[1])
#define counter (p0[2])
#define clock (p0[3])
#define one_over_width (p1[0])
#define one_over_height (p1[1])
#define PI acos(-1)
float4 main(float2 tex : TEXCOORD0) : COLOR
{
float2 h = float2(0, one_over_height);
float4 c1 = tex2D(s0, tex-h);
float4 c2 = tex2D(s0, tex+h);
float4 c0 = (c1+c2)/2;
return c0;
}
on videos full of combing artefacts, this script does miracles :eek:
any idea how to get the same results with AVS ?
thanks,
Leak
12th July 2008, 20:39
any idea how to get the same results with AVS ?
Bob().ConvertFPS(last.framerate)
You might probably want to replace Bob with something more advanced like LeakKernelBob (okay, I'm biased... :p) or something.
But why not simply apply bobbing on playback? That gives you the full temporal resolution instead of the above blended mess at half the temporal resolution...
(ffdshow will happily apply a KernelBob on playback - for best results set your monitor's refresh rate to an integer multiple of the produced framerate, i.e. 48, 50 or 60 Hz according to your source...)
np: Underworld & Gabriel Yared - Happy Toast (Breaking & Entering OST)
pitch.fr
13th July 2008, 00:35
thanks for the tip, but I've already tried all the ffdshow deinterlacers(including yours), and none of them appealed to me.
I'm talking about "artefacts" due to poorly or non-interlaced heavily interlaced videos.
these combing effects have no pattern, they are just erratic.
the MPC PS script I posted above literally does miracles on such poorly encoded AVI's or MPEG files :eek:
just looking for something just as powerful at this point.......and I can't find something that does miracles like this in ffdshow or AVS :(
Didée
13th July 2008, 01:35
I'm not used to reading such pixelshader code, but it seems it's simply replacing each pixel with the average of its top & bottom neighbor?
Blur(0,1)
Ah, the miracle.
Leak
13th July 2008, 01:41
I'm talking about "artefacts" due to poorly or non-interlaced heavily interlaced videos.
"non-interlaced heavily interlaced videos" :confused:
If you mean interlaced videos that have been resized without proper handling so the interlaced frames got smeared together - do yourself a favour and throw them into the recycle bin; they're well beyond repair...
If that's not what you meant - could you perhaps post a sample?
np: The Cinematic Orchestra - Child Song (Live At The Royal Albert Hall)
pitch.fr
13th July 2008, 01:59
yes I mean interlaced video that were resized w/o deinterlace in the first place(quite common on old avi's).
they look ugly....they really do :)
but this PS script can almost fix them up perfectly..........it looks a lot less blurry than the ffdshow "linear blending" filter.
I've tried the nnedi AVS script that was posted on 1st page, it can prolly be improved by messing around with the parameters ?!
thanks for your help anyway ;)
pitch.fr
13th July 2008, 02:00
I'm not used to reading such pixelshader code, but it seems it's simply replacing each pixel with the average of its top & bottom neighbor?
Blur(0,1)
Ah, the miracle.
thanks, I've tried that......didn't look too good
pitch.fr
13th July 2008, 21:03
actually I've been told by a PS coder that "GeneralConvolution ( 0, "0 1 0 0 0 0 0 1 0" )" would do the same as my script.
too bad this only works in RGB32, and converting in RGB32 is not multi-threaded and very slow in AVS......besides it's got "Chroma upsampling" problems, as shown here :
http://forum.doom9.org/showpost.php?p=1158550&postcount=9
any idea how to achieve this in YV12 ? :D
Didée
13th July 2008, 21:18
That can be done with MaskTools, it's a one-liner:
mt_luts(last,last,mode="average",pixels="0 -1 0 1",expr="y",Y=3,U=3,V=3)
However, you must be aware that this is NOT a general solution for progressively resized interlaced content. The exact appearance of such mistreated footage depends on the ratio OldHeight/NewHeight, and on the used resampling algorithm. The result of such mistreatment may be anywhere from something that looks almost like "correct" interlacing (i.e. still very thin miceteeth) - this is where this filtering might work okay - up to horrible, very very thick vertical bands, where this kind of filtering will have ZERO positive effect.
It's not a "magic" solution. It's a primitive filter that sometimes happens to do something useful, if you're lucky enough that the destruction is not all too severe.
Edit: corrected the pixel map in mt_luts(), there was an error in it.
Example for a case where this doesn't work out:
http://img166.imageshack.us/img166/7283/itaintworkbk6.jpg (http://imageshack.us)
(left - progressively resized interlaced content -- right: your "solution")
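A rough way to see why this filter helps only for some resize ratios: averaging the lines directly above and below is a [1 0 1]/2 vertical kernel, which cancels patterns that repeat every 4 lines, inverts 1-line combing, and barely touches thick bands. The Python sketch below is illustrative only; the function name, the clamped edges and the three test columns are assumptions, not anything taken from MaskTools or the pixel shader.

```python
def neighbour_average(column):
    """Average of the sample above and below each position (kernel [1, 0, 1] / 2).

    `column` is one vertical strip of pixel values.
    Edge handling is an assumption: positions clamp to the nearest valid sample.
    """
    n = len(column)
    return [(column[max(i - 1, 0)] + column[min(i + 1, n - 1)]) / 2 for i in range(n)]

thin = [0, 100] * 4                           # 1-line combing
medium = [0, 50, 100, 50] * 2                 # combing smeared to a 4-line period
thick = [0, 0, 0, 0, 100, 100, 100, 100] * 2  # thick bands after a large resize

for name, col in (("thin", thin), ("medium", medium), ("thick", thick)):
    # Print interior samples only, so the clamped edges do not distract.
    print(name, neighbour_average(col)[1:-1])
```

The 4-line pattern comes out flat, the 1-line pattern is merely swapped (still combed), and the thick bands survive with slightly softened edges, which matches the observation that the result depends on how the footage was resized.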
pitch.fr
13th July 2008, 21:33
oh my.....Didée you really are a life saver :)
I'm already totally sold on LSF in ffdshow in realtime, it makes the picture so much sharper....without any visible artefacts :)
this is too slow for realtime use and MT("mt_luts(last,last,mode="average",pixels="0 -1 0 0 0 1",expr="y",Y=3,U=3,V=3)",2) is giving an error msg that it's missing some "" ?!
and for interlaced WMV's with remaining combing artefacts, what would be a more suitable formula then ?
thanks again,
Didée
13th July 2008, 21:44
Note I edited my post above.
And, for the problem you're dealing with,
there is NO general solution.
("Generally", it is *impossible* to repair such defects.)
pitch.fr
13th July 2008, 21:57
well I still get an error msg that it's missing "" on :
MT("mt_luts(last,last,mode="average",pixels="0 -1 0 1",expr="y",Y=3,U=3,V=3)",2)
actually I've got some interlaced WMV's that are supposed to be deinterlaced by the WMV decoder, but it does a pretty poor job.
what would be a good AVS solution for this kind of regular combing ?
Didée
13th July 2008, 22:04
You probably need to put triple quotes: MT("""mt_luts(....)""",2)
pitch.fr
13th July 2008, 22:13
great thanks! I didn't know you could need so many """ :D
well that seems to work pretty well, but it's too slow for realtime use in ffdshow...
oh well I'll stick to EVR with this PS script I guess ;)
Didée
14th July 2008, 00:03
Just for the record, in this case here one can use the faster mt_edge() instead of mt_luts() :
mt_edge("0 1 0 0 0 0 0 1 0",0,255,0,255,Y=3,U=3,V=3)
(mt_edge is limited to a 3x3 kernel, which is sufficient here. For 3x3, it can be used like general_convolution.)
pitch.fr
14th July 2008, 01:10
ahhh, still too slow for realtime use in ffdshow.
it's already a miracle that the spline36 version of LSF works in realtime in ffdshow, AVS was really not meant for realtime use in the first place :(
and anyway what I find amazing in LSF is that it sharpens the motion blur, this is really impressive...especially when running Reclock :D
IanB
14th July 2008, 03:02
If you want fast, try this
Merge(Crop(0, 2, 0, 0), Crop(0, 0, 0, -2))
Sure it loses a line top and bottom, but...
pitch.fr
14th July 2008, 11:04
thanks for the tip IanB, but this requires "splitvertical=true" in MT, and then it ruins the AR....I'll stick to the PS script I think :D
IanB
14th July 2008, 15:36
@pitch.fr,
You are wasting your time with MT on any filter whose speed is near that of a frame BitBlt operation; Merge(0.5) is almost as fast as a BitBlt. MT does BitBlt operations to assemble all the final result pieces, so you get a net loss here. MT at best runs in (Time(bitblt)+Time(filter)/Nprocessors), so to win you need Time(filter) > N/(N-1)*Time(bitblt).
If the loss of the top and bottom line is so catastrophic you could adjust this method:
T=StackVertical(Crop(0, 0, 0, 2), Last)
B=StackVertical(Last, Crop(0, Height()-2, 0, 0))
Merge(T, B)
Crop(0, 2, 0, 0)
There are many variations on the theme.
And note the AR had not changed, you had just lost the top and bottom lines. There are lots of ways to compensate for or replace them.
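The break-even rule above can be checked with a few lines of arithmetic. This is a minimal Python sketch of the stated best-case model only; the function names and the time units are made up for illustration, and real MT overhead will differ.

```python
def mt_time(t_filter, t_bitblt, n_procs):
    """Best-case frame time with MT: the filter work splits n ways, the BitBlt does not."""
    return t_bitblt + t_filter / n_procs

def break_even(t_bitblt, n_procs):
    """Filter time above which MT starts to pay off: N/(N-1) * Time(bitblt)."""
    return n_procs / (n_procs - 1) * t_bitblt

# Times are in arbitrary units and purely illustrative.
t_bitblt = 1.0
for label, t_filter in (("Merge-like filter", 1.0), ("heavy filter", 10.0)):
    with_mt = mt_time(t_filter, t_bitblt, n_procs=2)
    verdict = "faster" if with_mt < t_filter else "slower"
    print(f"{label}: {t_filter} alone, {with_mt} with MT on 2 cores ({verdict})")
```

With two cores, a filter has to cost at least twice a BitBlt before splitting it helps, which is why a Merge-speed filter ends up slower under MT.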
pitch.fr
14th July 2008, 15:59
wow cool, you're right....MT actually slows things down in that case :eek:
it's very usable in realtime now, and I'd rather avoid missing horizontal lines because HR gives ghost lines when its built-in scaler is being used on ATi cards..
while I'm at it, do you know a way to run ConvertToRGB32() a bit faster ? like I dunno, multithreaded or only on Core0(considering Core1 is always busy decoding video)....or any other fast plugin to do that job ?
ffdshow is way faster than AVS to do YV12>RGB32 conversion.
thanks again!
IanB
14th July 2008, 16:19
Internally YV12>RGB32 is actually YV12>YUY2>RGB32 but there is no Cache between the 2 steps so if you are using SetMTMode you might get a small gain by explicitly doing the 2 steps so there is a cache between the 2 parts....
ConvertToYUY2(Interlaced=...)
ConvertToRGB32(Matrix=...)
pitch.fr
14th July 2008, 16:30
actually I was getting a script error until I did like Didee showed me :
MT("""ConvertToRGB32(matrix="rec709")""",2)
why so many quotes anyway ? :D
oh right, so I could let ffdshow convert to YUY2 and only convert to RGB32 in AVS....now we're getting somewhere :)
OTOH, I've saved this as SimpleDeinterlace.avsi :
function SimpleDeinterlace()
{
T=StackVertical(Crop(0, 0, 0, 2), Last)
B=StackVertical(Last, Crop(0, Height()-2, 0, 0))
Merge(T, B)
Crop(0, 2, 0, 0)
}
but when I do SimpleDeinterlace() in ffdshow, it says "invalid arguments to function Crop"....any idea please ?
sorry for the noobish questions :rolleyes:
Gavino
14th July 2008, 17:42
when I do SimpleDeinterlace() in ffdshow, it says "invalid arguments to function Crop"....any idea please ?
You need to do this:
function SimpleDeinterlace(clip c) {
c # sets 'last'
T=StackVertical(Crop(0, 0, 0, 2), Last)
...
Triple quotes are needed when a string literal is to include a quote character.
I suggest you spend some time studying the language basics (http://avisynth.org/mediawiki/Main_Page#AviSynth_Syntax).
Oh, and please use CODE, not QUOTE for code fragments here.
pitch.fr
14th July 2008, 18:21
ok will do, sorry! and thanks for the link and the fix :)
one last question if you don't mind.
LSF is doing some YV12 transformation in Avisynth, but it's mod4 in 2.57
latest AVS 2.58 RC2 is mod2 on YV12, but there's no MT-modified version yet.
is there some trick to add black borders to always be mod4 before feeding LSF ?
TIA,
Underground78
14th July 2008, 18:55
Width (clip) : Returns the width of the clip in pixels (type: int).
Height (clip) : Returns the height of the clip in pixels (type: int).
Modulo operator : %
Here's a silly example: AddBorders(width % 4, height % 4, 0, 0, color=color_pink) makes the last clip mod4, adding pink borders on the left and/or the top of the frame if necessary ... It's an example, you can surely do something really better (and you should) ... :p
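The border arithmetic can be checked outside AviSynth. The Python sketch below (function name hypothetical) computes the padding needed to reach the next multiple of 4. For the even sizes that YV12 requires, it gives the same value as `size % 4` used in the example above; for odd sizes the two differ, so the general form is the safer one to reuse.

```python
def padding_to_multiple(size, multiple=4):
    """Pixels to add so that size + padding is divisible by `multiple`."""
    return (multiple - size % multiple) % multiple

for width in (720, 718, 717):
    pad = padding_to_multiple(width)
    print(width, "->", width + pad, "(add", pad, "pixels)")
```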
2Bdecided
11th September 2008, 17:29
Don't fear, it is still being worked on :). I've just been busy working on some other projects the last week or two. A bit of good news is that I got permission to run the training program on my university's 512 cpu cluster, and I'm doing some test runs as I write this. A new version should be ready in a week or two (no promises though :p).
I wonder if it will make it in time for this post's first birthday? ;)
I ask because, in the UK at least, full time Masters degrees usually run for one year. I wonder if Tritical has started, or finished, or if the next we'll see of his idea is if/when it's deployed commercially.
Any chance of an update?
Cheers,
David.
tritical
17th September 2008, 20:10
If I was going full time it would take ~3 semesters (degree requires 30 hours, 9 hours is full time). However, I've only been part time, and then working some (plus I have no desire to give up the college lifestyle right now :)). I should only have one more semester after this one.
Anyways, I'm still working on NNEDI... trying new ideas, etc... It has turned out to be a rather difficult problem (in terms of achieving the type of results I think are possible). If I ever get something significantly better than the current released version working then I will definitely post it. However, so far it's just been small improvements. I'm hoping that the current idea I'm running with will show big improvements.
Adub
17th September 2008, 21:27
Good to hear from you!! I am glad that you are enjoying the college life (as I myself am) and I look forward to your future work with eagerness!
Terka
18th September 2008, 10:57
will new nnedi use also temporal information?
2Bdecided
18th September 2008, 20:02
Thanks for the update. I hope your thesis makes it on-line one day. Sadly, many universities still don't encourage this.
Cheers,
David.
tritical
21st September 2008, 03:23
@Terka
It's still spatial only for now. Once that is working well, incorporating temporal information (which will most likely have to involve separate motion compensation) is definitely the next step.
Terka
23rd September 2008, 11:04
holding the thumbs!
g_aleph_r
21st October 2008, 15:09
News about CUDA version?
I am currently working with 12 Fps on a 50Fps video, it is slooow!! :(
Adub
21st October 2008, 18:49
I don't think anyone is actually working on a CUDA version.
g_aleph_r
26th October 2008, 07:55
...Last fall I actually spent some time writing a CUDA implementation to offload part of the calculations used for training (which are pretty much the same ones used during normal operation).
...
Hopefully, the source code for NNEDI will be available by the summer. I actually have a new version ready, I just need a free day to update the code.
Sorry to insist,
but I would like to see it even if he says it is not very fast. It's still useful if it leaves the CPU free for other filters.
Terka
26th January 2009, 10:40
Hi tritical, any news regarding new version?
tritical
27th January 2009, 09:49
Not really. I still work on it when I get new ideas, but as it turns out the original formulation of nnedi was pretty good and not that easy to beat.
Dark Shikari
27th January 2009, 09:50
Not really. I still work on it when I get new ideas, but as it turns out the original formulation of nnedi was quite good and not that easy to beat.
Would it work better for resizing if it was trained specifically for resizing instead of for edge-interpolation? And how about, for example, making a version explicitly for cartoons by training on such?
Also, any news on publicizing the algorithm behind this? ;) I have a few ideas, but I'd like to know for sure.
Terka
27th January 2009, 12:55
imho users will be grateful if temporal component was added.
Sagekilla
27th January 2009, 20:30
@Dark Shikari: You mean like giving it the full resolution input then having the algorithm try to optimize towards getting the result sharp like the source, instead of trying to get the edges sharp like the source?
Just a guess, I don't know exactly how NNEDI works.
tritical
29th January 2009, 14:19
@Dark Shikari
Yes, training specifically for resizing would make it better for that, and training specifically for anime would make it better at anime. The idea is actually pretty simple. Use cubic interpolation (or some other fast method) where it won't introduce much error, split remaining pixels into similar groups based on local neighborhood, have one or more neural networks for each group that are trained to output the interpolated pixel value given the local neighborhood as input. Of course there are lots of open questions there... How much data to use, and from how many sources? How to separate local neighborhoods into groups (clustering... what method? operate on raw pixel values? do dimensionality reduction? extract specific features?). How many groups to have? What to feed to the neural networks (raw pixel values? extracted features?). What structure should the neural networks have? How should they be trained? What should the objective function be? How should overfitting be avoided?
The version of nnedi out now used pretty much the simplest methods, and took no steps to avoid overfitting aside from using lots of training data:
k-means clustering with 64 clusters, cluster on raw pixel values of local neighborhood (mean removed), local neighborhood was 4x25 (100 pixels), cluster on ~20-25 million local neighborhoods from progressive frames from ~35-40 sources
one neural network per cluster, input was raw pixel values (scaled to [-1,1] and mean of local neighborhood removed), trained with CMA-ES to minimize squared error. Each neural network had 2 hidden layers w/ 8 neurons apiece, and each neuron used Elliott activation, except for one of the neurons in the first hidden layer, which used linear activation. The nn had one output neuron with linear activation function which was connected to both hidden layers. The starting point for training the neural networks was set by solving for the linear lss weights for the cluster and sticking those into the first layer linear activation neuron (basically the networks started out predicting the linear best fit solution).
I think that is about it, or what I remember at least.
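To make the network layout described above concrete, here is a minimal Python sketch of a forward pass. It is a reading of the description, not NNEDI's actual code: the Elliott function is taken as x / (1 + |x|), the scaling (subtract the neighborhood mean, divide by 255) and the step that adds the mean back are assumptions, and all names are hypothetical. The initial network puts a set of linear weights into the one linear first-layer neuron and zeros everywhere else, so it starts out reproducing the linear predictor exactly.

```python
def elliott(x):
    """Elliott activation, assumed here to be x / (1 + |x|)."""
    return x / (1.0 + abs(x))

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def make_initial_net(linear_weights, n_hidden=8):
    """Network whose output equals the given linear predictor before any training."""
    n_in = len(linear_weights)
    return {
        # First hidden layer: neuron 0 is the linear one and holds the linear weights.
        "w1": [list(linear_weights)] + [[0.0] * n_in for _ in range(n_hidden - 1)],
        "b1": [0.0] * n_hidden,
        "w2": [[0.0] * n_hidden for _ in range(n_hidden)],
        "b2": [0.0] * n_hidden,
        # The output neuron sees both hidden layers (n_hidden + n_hidden inputs).
        "w_out": [1.0] + [0.0] * (2 * n_hidden - 1),
        "b_out": 0.0,
    }

def predict(pixels, net):
    """Interpolated value for one 4x25 neighborhood given as a flat list of 100 pixels."""
    mean = sum(pixels) / len(pixels)
    x = [(p - mean) / 255.0 for p in pixels]            # mean removed, within [-1, 1]
    h1 = [dot(w, x) + b for w, b in zip(net["w1"], net["b1"])]
    h1 = [h1[0]] + [elliott(v) for v in h1[1:]]         # neuron 0 stays linear
    h2 = [elliott(dot(w, h1) + b) for w, b in zip(net["w2"], net["b2"])]
    y = dot(net["w_out"], h1 + h2) + net["b_out"]       # linear output neuron
    return y * 255.0 + mean                             # undo the scaling (assumption)

# Illustrative linear weights: average the pixels directly above and below the
# missing one (row 1 and row 2, column 12 of the 4x25 window).
weights = [0.0] * 100
weights[1 * 25 + 12] = 0.5
weights[2 * 25 + 12] = 0.5
net = make_initial_net(weights)
window = [(i * 7) % 256 for i in range(100)]
print(predict(window, net), (window[37] + window[62]) / 2)
```

Training would then adjust all weights away from this starting point; that part (CMA-ES) is not sketched here.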
@Terka
I have thought about how to include temporal information, but it isn't all that easy. It would require accurate motion compensation, and I think training would be much more complex than spatial only.
Dark Shikari
29th January 2009, 14:29
Would it be possible to make a release of NNEDI that could be trained specifically for whatever purposes I wanted? I have enough CPU power to go for it... :p
Also, how do you recommend training--downscaling sample input, NNEDI, and comparing it to original input? Won't that to some extent lead to an NNEDI that's optimized to a specific downscaling resampler?
Also, a hunch: if you're basing the neural network on neighboring pixels, are you using the differences between the neighboring pixels as well (e.g. (T-L), (LT-T), (TT-T), (LL-L), etc, where T=Top, L=Left, TL=TopLeft, TT=TopTop [two above], etc)? I suspect this might give even better results (testing with FFV1 shows that it gives the best correlation).
(By the way, here's a recent upscale I did with NNEDI and a few other filters: Left is Lanczos, Right is NNEDI (http://www.upimage.us/image-F548_49803665.jpg))
*.mp4 guy
29th January 2009, 15:21
Could you also post the source image?
Dark Shikari
29th January 2009, 15:27
Could you also post the source image?
Linkage (http://i43.tinypic.com/dlgw9l.png)
Script:
image=ImageSource("test.png")
r=image.ShowRed("YV12").nnediresize_YV12().dfttest(sigma=1).fastlinedarken().limitedsharpenfaster()
g=image.ShowGreen("YV12").nnediresize_YV12().dfttest(sigma=1).fastlinedarken().limitedsharpenfaster()
b=image.ShowBlue("YV12").nnediresize_YV12().dfttest(sigma=1).fastlinedarken().limitedsharpenfaster()
MergeRGB(r,g,b)
ConvertToYV12()
AddGrain(1,0.1,0.1)
AddGrain(2,0.2,0.2)
AddGrain(3,0.4,0.4)
AddGrain is for dither/weak noise basically. DFTtest is to deal with the jpeg artifacts from the original (the PNG is converted from an original source JPEG). Separate upscaling for each color channel is because, IMO, it seems to work better.
tritical
29th January 2009, 23:24
Would it be possible to make a release of NNEDI that could be trained specifically for whatever purposes I wanted? I have enough CPU power to go for it...
It's possible, and I have thought about it before (allowing users to give training data). It would take a little work as the training code is scattered among multiple programs.
Also, how do you recommend training--downscaling sample input, NNEDI, and comparing it to original input? Won't that to some extent lead to an NNEDI that's optimized to a specific downscaling resampler?
It will be biased towards that resampler, but is there a better way to do it? In most of the papers I've read they test upsampling by downscaling large images (usually with basic averaging + some sharpening, trying to approximate how various imaging devices work).
Also, a hunch: if you're basing the neural network on neighboring pixels, are you using the differences between the neighboring pixels as well (e.g. (T-L), (LT-T), (TT-T), (LL-L), etc, where T=Top, L=Left, TL=TopLeft, TT=TopTop [two above], etc)? I suspect this might give even better results (testing with FFV1 shows that it gives the best correlation).
I don't do that. Theoretically, it is unnecessary/redundant, as those differences are simply linear combinations of the input variables... so the input layer neurons could learn the same mappings given the original pixel values as input vs if they were given those differences as input. It might make the learning faster though, would have to try.
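The redundancy argument can be shown with two differences. In the Python sketch below (names and numbers are illustrative), a neuron's weighted sum over (T-L) and (TL-T) is rewritten as a weighted sum over the raw pixels T, L and TL, and both give the same value.

```python
def neuron_on_differences(t, l, tl, a, b):
    """Weighted sum over the two differences (T-L) and (TL-T)."""
    return a * (t - l) + b * (tl - t)

def neuron_on_raw_pixels(t, l, tl, a, b):
    """The same sum regrouped over raw pixels: (a-b)*T - a*L + b*TL."""
    return (a - b) * t + (-a) * l + b * tl

t, l, tl = 120, 97, 133      # example pixel values
a, b = 3, -2                 # example weights on the differences
print(neuron_on_differences(t, l, tl, a, b), neuron_on_raw_pixels(t, l, tl, a, b))
```

So a first-layer neuron fed raw pixels can represent anything a neuron fed those differences can; what may change is how quickly training finds the weights.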
Dark Shikari
29th January 2009, 23:28
Also, what about using a metric other than mean squared error? SSIM might be a good one to try for, or perhaps something like x264's psy-RD metric.
MfA
30th January 2009, 15:18
There is no need to simultaneously optimize interpolation and texture synthesis; unlike encoding, there is no gain to be had from reusing artefacts as texture. You can always add texture in a separate pass.
Dark Shikari
30th January 2009, 19:22
There is no need to simultaneously optimize interpolation and texture synthesis
Why not?
MfA
31st January 2009, 20:19
Because "You can always add texture in a separate pass.". With H264 if the noise doesn't get encoded then all you have left is a smoothed result, if slightly misaligning edges etc. allows you to maintain some texture and get a better looking picture that's what you do ... because there is no better alternative. With interpolation you have the luxury to add texture afterwards, so you can concentrate on making the optimal interpolator for features which can be well predicted first (mostly edges).
Weighted predictors are not great texture synthesizers anyway.
tritical
4th February 2009, 07:03
@Dark Shikari
I have tried some other metrics, but with the current algorithm and number of data points it really has to be something completely independent of other pixels (or at least independent of pixels in other clusters) so that the weights for each cluster can be learned separately. SSIM can be computed for a single pixel change, but it doesn't work very well in my experience.
The latest idea I've been working with is to switch from a bunch of separate non-linear regression problems to one classification problem. In other words, switch from learning interpolation functions for a given set of groupings (the clusters learned through k-means combined with euclidean distance metric) to learning the grouping function for 'n' sets of linear interpolation weights.
I initialize k-means as usual (to get the initial groupings), but instead of regrouping the pixels based on euclidean distance to the mean of each cluster, I calculate the linear least squares interpolation coefficients for each cluster. I then reassign pixels to the cluster whose interpolation coefficients give the minimum squared error for that pixel, and keep repeating that until convergence (overall mse stops dropping significantly). I've found that it only needs ~16-32 sets of coefficients to get very nice results (I used about 5 million points from my 740 frame video to cluster, then tested it on the whole video by choosing the best set of weights for each pixel). Now, the problem becomes creating a classifier to choose which set of weights to use.
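The alternating procedure described above (solve least squares per cluster, then reassign each pixel to the cluster whose coefficients predict it best) can be sketched in Python on a toy problem. This is an illustration under strong simplifications, not the actual training code: each sample has only two inputs (the pixel above and the pixel below), the initial grouping is random rather than from k-means, and the loop stops when assignments stop changing rather than when the mse stops dropping significantly. All names are hypothetical.

```python
import random

def fit_lls(samples):
    """Least-squares weights (wa, wb) for target ~ wa*above + wb*below.

    Solves the 2x2 normal equations directly; returns None if they are singular.
    """
    saa = sum(a * a for a, b, t in samples)
    sab = sum(a * b for a, b, t in samples)
    sbb = sum(b * b for a, b, t in samples)
    sat = sum(a * t for a, b, t in samples)
    sbt = sum(b * t for a, b, t in samples)
    det = saa * sbb - sab * sab
    if abs(det) < 1e-9:
        return None
    return ((sat * sbb - sab * sbt) / det, (saa * sbt - sab * sat) / det)

def sq_err(w, sample):
    a, b, t = sample
    return (w[0] * a + w[1] * b - t) ** 2

def cluster_by_predictor(samples, k, max_iters=50, seed=0):
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in samples]  # stand-in for the k-means initialisation
    weights = [(0.5, 0.5)] * k
    for _ in range(max_iters):
        # Step 1: solve for each cluster's interpolation coefficients.
        for c in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            fitted = fit_lls(members) if members else None
            if fitted is not None:
                weights[c] = fitted   # otherwise keep the previous coefficients
        # Step 2: move every sample to the cluster that predicts it best.
        new_labels = [min(range(k), key=lambda c: sq_err(weights[c], s)) for s in samples]
        if new_labels == labels:
            break
        labels = new_labels
    return weights, labels

def total_error(samples, weights, labels):
    return sum(sq_err(weights[lab], s) for s, lab in zip(samples, labels))

# Toy data: half the targets copy the pixel above, half copy the pixel below.
rng = random.Random(1)
samples = []
for i in range(200):
    a, b = rng.uniform(0, 255), rng.uniform(0, 255)
    samples.append((a, b, a if i % 2 == 0 else b))

weights, labels = cluster_by_predictor(samples, k=2)
single = fit_lls(samples)
print("one set of weights :", total_error(samples, [single], [0] * len(samples)))
print("two sets of weights:", total_error(samples, weights, labels))
```

Note that the reassignment step uses the true target value, so at filter time a separate classifier is still needed to pick the set of weights, which is exactly the problem raised at the end of the post.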
MfA
4th February 2009, 19:15
A problem with MSE (and SSIM) is that it heavily punishes outliers, which is fine for a quality metric ... but not good in a classifier.
PS. I find it curious you chose to optimize interpolation without simultaneously optimizing the classifier (ie. optimizing the interpolation weights with an oracle classifier). I would have expected you to optimize both at the same time. What classifier did you end up using?
tritical
4th February 2009, 21:49
As you say, it's possible to iteratively optimize both... train classifier(s) a little, train interpolation coefficients a little (or solve if direct solution exists), keep repeating. Or did you have something different in mind?
I haven't gotten that far yet. I am still testing different classifiers. Main restriction on what can be used is computation time, since it is typically going to be run on ~25-35% of pixels in a frame (~80k-120k for a 720x480 frame). What has worked best is a basic nn trained to select classes by minimizing squared error of the resulting interpolation (if there are 16 classes, then the nn has 16 output neurons and the one with the largest value is the chosen class). Actually, it worked better to not have the nn choose a single class, but to use its outputs as linear combination weights (either after applying softmax activation or normalizing so they sum to 1). However, having it output combination weights makes iterative optimization with the interpolation coefficients more complicated.
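The two ways of using the classifier outputs mentioned above can be contrasted in a short Python sketch. The scores and per-class predictions are invented numbers, and the function names are hypothetical; only the softmax variant of the weighting is shown.

```python
import math

def softmax(scores):
    """Turn raw output-neuron values into positive weights that sum to 1."""
    m = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_hard(scores, predictions):
    """Pick the single class whose output neuron has the largest value."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return predictions[best]

def predict_soft(scores, predictions):
    """Blend all class predictions, using the softmaxed outputs as weights."""
    return sum(w * p for w, p in zip(softmax(scores), predictions))

scores = [2.0, 0.5, -1.0]          # made-up classifier outputs for 3 classes
predictions = [100.0, 120.0, 90.0] # made-up interpolated values, one per class
print(predict_hard(scores, predictions), predict_soft(scores, predictions))
```

The soft version has to evaluate every set of interpolation weights for each pixel, whereas the hard version evaluates only one, which is the cost trade-off questioned in the reply that follows.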
MfA
4th February 2009, 23:15
Actually, it worked better to not have the nn choose a single class, but to use its outputs as linear combination weights (either after applying softmax activation or normalizing so they sum to 1). However, having it output combination weights makes iterative optimization with the interpolation coefficients more complicated.
Wouldn't that only make sense if during application of the filter you also use the weighted combination of all predictors? Doesn't seem a realistic option.
By the way, why did you decide to second guess the CMA-ES algorithm? (ie. why not just let it optimize the entire system of both classifiers and predictors in one go.)
madshi
9th November 2009, 13:43
@tritical,
have you checked out iNEDI yet? It seems to be a noticeable improvement over the original NEDI algorithm. Maybe you could implement some of iNEDI's ideas into your NNEDI?
http://www.tecnick.com/pagefiles/appunti/iNEDI_tesi_Nicola_Asuni.pdf
tetsuo55
9th November 2009, 14:42
Looks like even iNEDI has been surpassed:
http://www.comp.leeds.ac.uk/bmvc2008/proceedings/papers/43.pdf
It is also up to 10 times faster than NEDI.
EDIT:
And even ICBI has been surpassed:
http://www.eurasip.org/Proceedings/Eusipco/Eusipco2009/contents/papers/1569192778.pdf
tritical
9th November 2009, 18:56
Learning the interpolation weights based on the low res image works alright for image enlargement, but isn't any good for deinterlacing because too much information is missing. For now I'm more interested in deinterlacing interpolation than enlargement. The iterative energy minimization post-processing described in the ICBI paper is interesting though, and could be useful for deinterlacing. I will try running it on the result of nnedi2/eedi2 and see how it looks.
It also doesn't appear that MEDI > ICBI > iNEDI is always the case based on the results in that last paper. Looks like it depends on the image content, which isn't surprising. It would be interesting to see how nnedi2 compares psnr/ssim wise.
In the future I plan to revisit nedi... mainly because at the time I wrote ediupsizer I didn't have a full understanding of the mathematics/concepts involved. I will definitely keep these papers in mind as well :thanks:.
tetsuo55
9th November 2009, 19:05
It also doesn't appear that MEDI > ICBI > iNEDI is always the case based on the results in that last paper. Looks like it depends on the image content, which isn't surprising. It would be interesting to see how nnedi2 compares psnr/ssim wise.
At first glance I was confused about this too.
But as I read more, it became more and more obvious that the newer ones, and especially MEDI, go for the psychovisually better result, at the cost of some PSNR and SSIM.
Also, I find it very interesting that these algos can be used for deinterlacing.
It would be great to have a universal, MEDI-based scaler-deinterlacer that always works, regardless of I or P content.
MfA
9th November 2009, 22:19
The extreme staircasing of the image in the MEDI paper for bilinear makes me think they used decimation for downsampling in their tests (which makes the results pretty much irrelevant for normal upsampling).
PS. the gaps from the spokes to the rim are pretty damning, I'm 99% sure they used decimation ... poor show.
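For anyone unsure what the difference is, a toy Python example (illustrative only, it says nothing certain about what the MEDI authors actually did): decimation just drops pixels, while a proper downsample low-pass filters first. Here the "filter" is a plain 2x2 box average, and decimate2 / box2 are names invented for this sketch.

```python
def decimate2(img):
    """Keep every second pixel in both directions, with no prefilter."""
    return [row[::2] for row in img[::2]]

def box2(img):
    """Average each 2x2 block (a simple low-pass before subsampling)."""
    h = len(img) // 2 * 2
    w = len(img[0]) // 2 * 2
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

# An 8x8 black image with a one-pixel-wide bright vertical line at column 3.
line = [[255 if x == 3 else 0 for x in range(8)] for y in range(8)]

small_decimated = decimate2(line)  # the line falls between samples and vanishes
small_box = box2(line)             # the line survives at half intensity
```

With decimation, thin features either survive at full strength or disappear depending on where they fall on the grid, which would be consistent with staircasing and gaps like the ones in the spoke image.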
tetsuo55
9th November 2009, 22:22
Yeah, I think the top 3 should be tested on real-world moving video.
madshi
13th November 2009, 10:12
For those interested, here's the Clown resampled by ICBI:
http://madshi.net/clownICBI.png
ICBI seems to be a bit soft to me, but on the positive side it looks quite smooth and natural and doesn't seem to add any artifacts (other than those already in the source, like the pole halo in the Clown image).
tetsuo55
13th November 2009, 10:30
I agree that it's very soft.
It appears that most high quality interpolators create a soft image.
But we could always add a sharpener at the end.
EDIT: actually I would describe it as slightly out of focus
Mystery Keeper
14th November 2009, 01:48
I tried to implement ICBI for deinterlacing. It doesn't work well. Well, it does work, but NNEDI2 works better and faster than my Pixel Shader 3 implementation.
@tetsuo55,
ICBI is an adjustable method. You can get a sharper image with it if you play with the parameters.
madshi
14th November 2009, 09:45
ICBI is an adjustable method. You can get a sharper image with it if you play with the parameters.
Which parameters did you change in which direction to get a sharper image? Probably choosing sharper parameters comes at the cost of curve smoothness, I guess?
MfA
14th November 2009, 15:36
Softness in an interpolator isn't a bad thing.
There is a difference between interpolation and super-resolution. An interpolator generally conserves the original pixels ... this is fundamentally incorrect if you are trying to reconstruct a non-smoothed higher resolution image. For instance just because a pixel covers an edge in the low resolution image doesn't mean it covers it in the higher resolution one, so mixing colors from both sides of the edge could be fundamentally incorrect for the non smoothed higher resolution image.
To benchmark interpolators you should compare against the smoothed version of the higher resolution image, not the original higher resolution image. Sharpening and texture synthesis are not interpolation.
Which is not to say you couldn't do a single step super resolution algorithm ... it just wouldn't be a pure interpolator and shouldn't retain the original pixels from the low resolution image.
Mystery Keeper
16th November 2009, 04:43
Which parameters did you change in which direction to get a sharper image? Probably choosing sharper parameters comes at the cost of curve smoothness, I guess?
Why, of course it does. The second parameter, beta, is there to limit the smoothing.
absence
8th February 2011, 11:13
I'm toying with the idea of implementing some kind of EDI on a GPU. Not as advanced as NNEDI, more like NEDI or MEDI. The MEDI paper (http://www.eurasip.org/Proceedings/Eusipco/Eusipco2009/contents/papers/1569192778.pdf) claims NEDI only uses the training window shown in figure 4 a (or b), but according to a more practical description of NEDI (http://chiranjivi.tripod.com/EDITut.html), all 4x4 pixels surrounding the purple unknown high resolution pixel dot on the blue grid are used, in a way that leaves the purple dot centred. The MEDI paper and the figures in the NEDI and MEDI papers make it sound like the unknown pixel is off-centre and only 3x3 surrounding pixels are used, while the formulas in both are quite clear about using all the pixels surrounding the unknown one. What am I misunderstanding here? :)
tritical
8th February 2011, 16:04
I had the same discussion with madshi about a year ago about MEDI, and the conclusion was that the MEDI paper is just weird. They talk about NEDI using 1 training window, which is how it could be implemented, but never is. All the implementations I've seen use 16 (4x4) or more training windows around the current point - and in both steps the combination of all of them is always centered about the point to be interpolated. Here is what I wrote to madshi:
I don't understand that paper either. Saying that nedi uses a single training window is weird, but I guess it could be implemented that way. Their point about covariance mismatch is true either way though. If you use all training windows available inside an NxN window (8x8, 16x16 etc...) around the current point, some of those windows may have the same structure (same linear relationship between the predictors and the center pixel) as the real edge and some may not. The ones that don't could cause bad coefficient estimates. My take is they are trying to select the training window that most closely resembles the window being used for interpolation, but their criterion of highest covariance signal energy doesn't seem like the best solution. Anyway, based on the results they report it doesn't seem like their method is much of an improvement over nedi.
absence
8th February 2011, 16:36
I had the same discussion with madshi about a year ago about MEDI, and the conclusion was that the MEDI paper is just weird.
Glad it's not just me. :)
Anyway, based on the results they report it doesn't seem like their method is much of an improvement over nedi.
MEDI does a better job than NEDI at connecting pixels in the spoke images in the paper, but I haven't seen it in action anywhere else and don't know if the difference matters much for "normal" images.
Thanks for the info!
tritical
9th February 2011, 00:40
Like I said, I would not trust the MEDI paper. If you actually implemented what they describe, which is NEDI with a 2x2 window (so you have 4 training cases per pixel, which is still too few) and then select only one training case based on the energy, I would expect very large artifacts in any type of detailed area. The reason being you are using only 1 training case to estimate 4 parameters.
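A tiny Python illustration of why 1 training case is not enough for 4 parameters (the numbers are made up): many different weight sets reproduce the single training case exactly, yet they disagree badly as soon as they are applied to a different neighborhood.

```python
def predict(w, neighbors):
    """Linear prediction of a pixel from its 4 neighbors."""
    return sum(wi * ni for wi, ni in zip(w, neighbors))

# One hypothetical training case: 4 neighbor values and the known center pixel.
train_neighbors = [10.0, 20.0, 30.0, 40.0]
train_target = 25.0

# Three weight sets that all fit the training case with zero error.
candidates = [
    [0.25, 0.25, 0.25, 0.25],
    [2.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.625],
]
train_fits = [predict(w, train_neighbors) for w in candidates]

# Applied to a different neighborhood, the same weights disagree wildly.
other_neighbors = [40.0, 10.0, 10.0, 10.0]
other_predictions = [predict(w, other_neighbors) for w in candidates]
```

With one equation and four unknowns the least squares problem has infinitely many exact solutions, so which one you end up with is essentially arbitrary.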
henryho_hk
21st February 2011, 13:37
Just curious, is the idea of gamma drift in resizing (http://www.4p8.com/eric.brasseur/gamma.html) relevant to the error calculation in NNEDI training?
tritical
21st February 2011, 20:02
The short answer is yes. I think it would relate to all image quality metrics. It would be interesting to know whether mean squared error, mean absolute error, SSIM, etc. correlate better with human perception when computed on luminance (Y - relative luminance - computed from linear rgb) versus luma (Y' - computed from gamma corrected rgb). Lightness (human perceived brightness) is not linear with respect to relative luminance though. It is roughly linear as Y^(1/3), after normalization by the reference white (CIE76).
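For reference, the three quantities can be sketched in Python as follows. Assumptions made for this example: Rec.709 weights, a plain 2.2 power law in place of the exact transfer function, and all values normalized to 0..1. The lightness function is the standard CIE 1976 L* formula.

```python
def relative_luminance(r, g, b, gamma=2.2):
    """Y: weighted sum of linear RGB (gamma removed first, power law assumed)."""
    return 0.2126 * r ** gamma + 0.7152 * g ** gamma + 0.0722 * b ** gamma

def luma(r, g, b):
    """Y': the same weighted sum applied directly to gamma corrected RGB."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def lightness(y, white=1.0):
    """CIE 1976 L* (0..100) from relative luminance y and reference white."""
    t = y / white
    delta = 6.0 / 29.0
    if t > delta ** 3:
        f = t ** (1.0 / 3.0)                   # cube-root region
    else:
        f = t / (3 * delta ** 2) + 4.0 / 29.0  # linear segment near black
    return 116.0 * f - 16.0
```

For example, a mid gray with Y = 0.18 comes out at an L* of roughly 50, which shows how far from linear the relationship between luminance and perceived brightness is.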
As far as interpolation goes, it is not quite as cut and dried as that webpage makes it out to be. Even if you undo gamma correction, there is no guarantee that the function being used for interpolation will more accurately fit (approximate) the underlying data. I could easily generate an image that is a linear ramp after gamma correction... so linear interpolation would give worse results if performed on the linear values. One could argue that such images are unlikely, or that natural images are generally better modeled by standard interpolation functions after removing gamma correction, which I guess is the argument of that webpage and is probably the case.
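The ramp case is easy to check numerically. A small Python sketch, assuming a plain 2.2 power law for the gamma curve (the helper names are made up): the stored values form a linear ramp, and each interior sample is predicted from its two neighbors, once directly on the gamma corrected values and once on the linear values.

```python
GAMMA = 2.2  # assumed pure power law, not an exact transfer function

def to_linear(v):
    return v ** GAMMA

def to_gamma(lin):
    return lin ** (1.0 / GAMMA)

def midpoint_gamma_space(a, b):
    """Linear interpolation performed directly on gamma corrected values."""
    return (a + b) / 2.0

def midpoint_linear_space(a, b):
    """Undo gamma, interpolate on linear values, redo gamma."""
    return to_gamma((to_linear(a) + to_linear(b)) / 2.0)

def mean_abs_error(signal, midpoint):
    """Predict each interior sample from its two neighbors."""
    errs = [abs(midpoint(signal[i - 1], signal[i + 1]) - signal[i])
            for i in range(1, len(signal) - 1)]
    return sum(errs) / len(errs)

# A ramp that is linear *after* gamma correction (gamma corrected values).
ramp = [i / 10.0 for i in range(11)]
err_gamma = mean_abs_error(ramp, midpoint_gamma_space)
err_linear = mean_abs_error(ramp, midpoint_linear_space)
```

On this signal the gamma space interpolation is exact up to rounding, while the linear light version overshoots every sample, simply because the model no longer matches the data.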
I should clarify, if you want to compute the average (or weighted average) luminance around a point (i.e. within some area) then you would most definitely want to work with the linear values. Interpolation is a different matter.
pbristow
24th February 2011, 17:14
@tritical: Regarding interpolation, the ramp example isn't that relevant as the differences between the two methods would be negligible when interpolating between roughly similar brightness levels.
Where this effect shows up strongest is when interpolating between a bright pixel (or several of them) and a much darker one (or several). The effect is that small bright details (e.g. stars in the night sky) become much dimmer and less visible than they should, while small dark details (e.g. speckles of dirt on a white sheet) become more obvious than they should.
tritical
24th February 2011, 22:32
Yes, you are correct that when averaging pixel values the difference between 1.) averaging the gamma corrected values directly and 2.) undoing gamma correction, averaging the linear values, and then redoing the gamma correction will be greatest when the values to be averaged are very far apart... and 1 will always result in a smaller final value than 2 (assuming the applied gamma correction factor is < 1.0). However, that is not interpolation. For example if I am given y=1.0 at x=0.0 and y=0.0 at x=1.0, and am asked to give a value for y at x=0.5, without any other knowledge about the function there is no reason to believe that averaging y at x=0.0 and x=1.0 will be anywhere close to the correct value at x=0.5. Now if I know that y is piecewise linear that is another matter. However, I would argue that most images - especially edge areas - are much closer to piecewise constant than piecewise linear... i.e. when you have two neighboring pixels that are very different, rarely in the original continuous (infinite resolution) image would the value directly between them be exactly half the luminance.
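The two averaging methods can be compared with a few lines of Python. This is a sketch that assumes a simple power law with an encoding exponent of 1/2.2; the function names are invented for the example.

```python
ENCODE = 1.0 / 2.2  # assumed gamma correction factor (< 1.0)

def encode(lin):
    """Linear light -> gamma corrected value."""
    return lin ** ENCODE

def decode(v):
    """Gamma corrected value -> linear light."""
    return v ** (1.0 / ENCODE)

def average_gamma(a, b):
    """Method 1: average the gamma corrected values directly."""
    return (a + b) / 2.0

def average_linear(a, b):
    """Method 2: undo gamma, average the linear values, redo gamma."""
    return encode((decode(a) + decode(b)) / 2.0)

# Far apart values (bright star on a black sky): large difference.
far_1, far_2 = average_gamma(1.0, 0.0), average_linear(1.0, 0.0)

# Nearby values: the two methods almost agree.
near_1, near_2 = average_gamma(0.50, 0.52), average_linear(0.50, 0.52)
```

For the 1.0 / 0.0 pair method 1 gives 0.5 and method 2 gives about 0.73, while for 0.50 / 0.52 the two results differ by only about 0.0001.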
Also in my ramp example the difference between neighboring function values (that are linear after gamma correction) could be made infinitely large, resulting in as large a difference as you want. That point was only to say that the accuracy of interpolation will be limited by how well the model you use for interpolation fits the underlying data.
Actually, I did a quick test using the training data for nnedi (primarily directional edges and complex textures) and linear interpolation (so the goal is to predict the pixel value at (x,y) given (x,y+1) and (x,y-1)). Linear interpolation on the linear values (undo gamma correction, interpolate, redo gamma) vs linear interpolation on the gamma corrected values did not result in any significant reduction in absolute error or squared error - both in gamma corrected and linear value space. The same for cubic interpolation.
pbristow
25th February 2011, 01:12
Hmmm... perhaps I was assuming too simple a model of interpolation (thinking of the case of bilinear interp as the simplest case, and over-generalising)?
Today I've been looking at various images of faces, many where light is reflected in small highlights off dark hair, and seeing what happens when I resize the image. Subjectively it does seem that the small highlights are getting dimmed (not just shrinking) during the resize, but then it's probably just a crude bilinear resize (whatever IE8 uses to resize images on webpages). I might do some proper testing tomorrow with various resizers, if I get time.