Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
|
|
#242 | Link | |
|
Avisynth language lover
Join Date: Dec 2007
Location: Spain
Posts: 3,412
|
Quote:
Will this fix also change the behaviour of nnedi2/3_rpow2? I notice that this produces opposite shifts for RGB and YUY2 when called without the cshift parameter. This isn't necessarily wrong, but seems anomalous. |
|
|
|
|
|
|
#243 | Link |
|
Registered User
Join Date: Dec 2003
Location: MO, US
Posts: 999
|
Yeah, it'll make the rpow2 functions produce the same shifts when called without cshift. Seems when I added rgb24 support I changed the cshift behavior in the rpow2 functions to account for rgb24 being stored upside down, but forgot that field=0/1 would be switched when using nnedi2/3 alone.
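
A rough way to see why a bottom-up layout swaps the fields: the sketch below is not nnedi code, only an illustration under the assumption that a filter keeps every other row counted in storage order. The function name and the 6-line frame are made up for the example.

```python
def display_rows_kept(height, keep_even_storage_rows, bottom_up):
    """Display-order rows (0 = top line) that survive when a filter keeps
    every other row counted in *storage* order (hypothetical helper)."""
    start = 0 if keep_even_storage_rows else 1
    kept = []
    for s in range(start, height, 2):
        # In a bottom-up buffer, storage row 0 is the bottom line of the picture.
        d = height - 1 - s if bottom_up else s
        kept.append(d)
    return sorted(kept)

# Same choice of storage rows, two memory layouts (even frame height assumed).
top_down = display_rows_kept(6, True, bottom_up=False)
flipped = display_rows_kept(6, True, bottom_up=True)
print(top_down)  # [0, 2, 4]
print(flipped)   # [1, 3, 5]
```

With an even height, the same storage rows land on the opposite field once the image is stored upside down, which matches the flipped field=0/1 behaviour described above.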
|
|
|
|
|
|
#244 | Link | |
|
Registered User
Join Date: Nov 2003
Posts: 324
|
Quote:
The Sourceforge version messed up my computer; the fonts are still not right, and I don't understand how that happened. Anyway, with the "SEt install version" by Atak_Snajpera I get normal errors such as: "Script error: there is no function named "eedi3"". If I open an avi without eedi3.dll then everything is normal. So on my system, with the SEt install by Atak_Snajpera, "eedi3" is seen as a function rather than a dll. This is so weird. Any suggestions to fix this are welcome. This is on a WinXP SP2 computer.
Last edited by hartford; 15th July 2010 at 03:10. |
|
|
|
|
|
|
#245 | Link |
|
Registered User
Join Date: Dec 2003
Location: MO, US
Posts: 999
|
Posted a new version of nnedi3. Changes:
Code:
+ add nsize=5/6 (16x4,32x4)
+ add nns=0 (16 neurons). nns 0/1/2/3 from v0.9 are now 1/2/3/4.
- new defaults. nnedi3: nsize=6,nns=1. nnedi3_rpow2: nns=3.
- fix field=0/1 flipped with rgb24 input

@hartford
So when you get "Script error: there is no function named "eedi3"", you are able to open the script if you comment out the eedi3() line? Or only if you remove eedi3.dll from the plugins directory? Which version of the 2005 redistributable do you have? I think there are multiple versions due to various service packs. eedi3 should require the first one (not sp1 or sp2). Hm, actually I need to check that.
Last edited by tritical; 15th July 2010 at 16:21. |
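
For anyone updating scripts written against v0.9, the nns renumbering in the changelog amounts to a shift by one. A minimal sketch (the helper name is hypothetical, not part of the plugin):

```python
def nns_v09_to_new(old_nns):
    """Translate an nns value from nnedi3 v0.9 to the renumbered scale.

    Per the changelog, old 0/1/2/3 become 1/2/3/4; the new nns=0
    (16 neurons) has no v0.9 equivalent.
    """
    if old_nns not in (0, 1, 2, 3):
        raise ValueError("v0.9 only accepted nns in 0..3")
    return old_nns + 1

print([nns_v09_to_new(n) for n in range(4)])  # [1, 2, 3, 4]
```

Note that the defaults changed as well, so a script that relied on the old default nns may need the value written out explicitly to keep the same behaviour.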
|
|
|
|
|
#247 | Link |
|
Registered User
Join Date: Apr 2005
Posts: 212
|
Regarding NNEDI3, for image enlargement, I tend to use nsize=4, because with nsize=0 I often got strange results (see example below).
Source, 400x300 >>
nnedi3_rpow2 (rfactor=4, nsize=0, nns=3, qual=2, pscrn=True, cshift="Spline36Resize"), 1600x1200 >>
nnedi3_rpow2 (rfactor=4, nsize=4, nns=3, qual=2, pscrn=True, cshift="Spline36Resize"), 1600x1200 >>
Look at the pillars in the windows. |
|
|
|
|
|
#248 | Link |
|
Registered User
Join Date: Dec 2003
Location: MO, US
Posts: 999
|
It's a result of the training data used. The first few nsize/nns combinations I generated used a smaller training set. Later I increased the training set size by 50%. I think the combinations were nsize=0,1,2 with nns=2,3,4. I'm re-generating those models, but it will probably take a week or so.
Also, eedi3 has the same bug with rgb24 input as nnedi2 has and nnedi3 had. I'll post a fix for that as well. One more thing, the new defaults for nnedi3 should be much closer to default nnedi2 in terms of speed. For reference, nnedi2 uses a 12x4 neighborhood, and nsize=2 in nnedi2 is somewhere between nns=1 and nns=2 in nnedi3. Last edited by tritical; 16th July 2010 at 18:44. |
|
|
|
|
|
#249 | Link |
|
Registered User
Join Date: Nov 2003
Posts: 324
|
@hartford
"Which version of the 2005 redistributable do you have?"
That was the problem; I thought that I had installed it but had not. Works fine after installing it. Sorry for all the hubbub. Oh, I should say that eedi3 is damn fast on anime and at fixing jaggies. Thanks much for your efforts. I can't thank you enough. And thanks to all on this thread who provided assistance; I'm grateful.
Last edited by hartford; 17th July 2010 at 03:03. |
|
|
|
|
|
#251 | Link |
|
Registered User
Join Date: Mar 2002
Location: Krautland
Posts: 903
|
Code:
Which version of the 2005 redistributable do you have? I think there are multiple versions due to various service packs. eedi3 should require the first one (not sp1 or sp2). Hm, actually I need to check that.

The 2005 redistributable is installed (verified). The Event Viewer shows the errors even when eedi3 is not called in any script. Delete eedi3.dll from the avisynth\plugins directory - no more errors... This is on WinXP Pro SP3 32bit. |
|
|
|
|
|
#252 | Link |
|
Registered User
Join Date: May 2006
Posts: 36
|
Hi Tritical,
I am also having the "Script error: there is no function named "eedi3"" problem. I installed the VS2005 redistributable from here: http://www.microsoft.com/downloads/d...displaylang=en Not sure what the problem is. The latest versions of NNEDI2 and NNEDI3 work fine. |
|
|
|
|
|
#253 | Link |
|
Registered User
Join Date: Mar 2002
Location: France
Posts: 85
|
For all those getting an error: the version of vcomp needed is not one that is commonly found; it's a dev version with a security fix.
Here's the right redist to get: http://www.microsoft.com/downloads/d...displaylang=en
@tritical: you might want to put that in the readme
|
|
|
|
|
|
#254 | Link |
|
Registered User
Join Date: Dec 2003
Location: MO, US
Posts: 999
|
Thanks for posting that link DeathWolf.
Part of the problem was that the first mt version of eedi3 I posted needed the first vs2005 redist, but between that release and this last one I finally got my desktop computer, which has all my avisynth projects, online - about a year after moving. I had just been moving things between it and my laptop with a usb stick as needed. In the process of running windows update it installed sp1 for vs2005 and some other security updates, so the new build needs this newer vcomp.dll. Sorry about that.
Last edited by tritical; 4th August 2010 at 15:38. |
|
|
|
|
|
#255 | Link |
|
Registered User
Join Date: Mar 2002
Location: Krautland
Posts: 903
|
@DeathWolf:
Thank you for pointing it out.
@tritical: Yeah, it's working now without errors. But I think there will be others who will come to complain. So the readme should be updated as soon as possible. |
|
|
|
|
|
#256 | Link | |
|
Registered User
Join Date: May 2006
Posts: 36
|
Quote:
|
|
|
|
|
|
|
#257 | Link |
|
Registered User
Join Date: Dec 2003
Location: MO, US
Posts: 999
|
@Archimedes
What do you think of this enlargement? nnedi3_rpow2 (rfactor=4, nsize=0, nns=4, qual=2, cshift="Spline36Resize"), 1600x1200 |
|
|
|
|
|
#258 | Link |
|
x264 developer
Join Date: Sep 2004
Posts: 2,393
|
http://akuvian.org/src/x264/nnedi.tar.bz2
Same algorithm as nnedi3-0.9.1, and about 4x faster (per cputime. I haven't implemented threads). This is just a C function, not an avisynth filter. I included only the 8x6 mode because I'm only interested in scaling, not deinterlacing. But I assume the wider modes would benefit from the same optimizations.
Changed:
* Convert much of the float math to fixed-point.
* The max part of softmax was useless. Adding a constant before exp is equivalent to multiplying after, and the weighted average removes any such multiplicative constant.
* Edge mirroring was weird: nnedi3 kept 1 copy of the edge row on 2 sides of the frame, and 2 copies on the other 2 sides. imho 2 looks better.
* And a bunch of assembly optimizations.

@tritical
I have some further optimization ideas, but they would require retraining. Could you post sourcecode for that? I tried to apply CMAES, and it sorta worked, but wasn't really satisfactory; I don't know if I'm doing something wrong or if I just haven't thrown enough cputime at it. |
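
The softmax point in the list above can be checked numerically. The following is only a sketch with made-up scores and values, not code from nnedi3 or from the tarball: it computes a softmax-weighted average with and without a constant added before exp.

```python
import math

def softmax_weighted_average(scores, values, offset=0.0):
    """Average of `values` weighted by exp(score + offset).

    exp(s + c) == exp(c) * exp(s), so the factor exp(c) appears in both
    the numerator and the denominator and cancels.
    """
    weights = [math.exp(s + offset) for s in scores]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Illustrative numbers only.
scores = [0.5, -1.2, 2.0, 0.1]
values = [10.0, 20.0, 30.0, 40.0]

plain = softmax_weighted_average(scores, values)
shifted = softmax_weighted_average(scores, values, offset=-max(scores))
print(math.isclose(plain, shifted, rel_tol=1e-12))
```

So for inference with moderate score magnitudes the max subtraction changes nothing in the result; whether it matters numerically depends on how large the scores can get.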
|
|
|
|
|
#259 | Link |
|
Registered User
Join Date: Dec 2003
Location: MO, US
Posts: 999
|
4x is impressive! Not to mention, I guess you figured out the structure from just the dll? I'll admit it isn't the most complicated thing in the world though. I had actually spent some time rolling together separate functions in the prescreener network into one big asm function (to eliminate unneeded function call overhead and memory stores) and a few other minor things and only got maybe 20%. Doing an int16 implementation never crossed my mind. I would think that would only get you 2x faster at best? Then you got another 2x on top of that? And here I thought I knew how to program.
Anyways, I haven't had time to look at your code in depth or compile it. Hopefully tomorrow or the day after.
On the training, I no longer use CMA-ES, just basic online gradient descent with a cooling schedule. Oh, and the mirroring thing is a bug... it should only be keeping 1 copy of the edge row on all sides.
On the softmax bit, softmax is invariant to a global offset as you point out. Subtracting out the max of all the net values before computing the exp() of them is simply to make exp() well behaved in training (avoid overflow and maintain accuracy of the softmax output). Definitely could be avoided after training, though I doubt it's taking up much cpu time.
Last edited by tritical; 21st August 2010 at 08:08. |
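
To illustrate the overflow concern with exp(): a small sketch, not the nnedi3 training code, with deliberately exaggerated inputs. Python's math.exp raises OverflowError for large arguments, while the max-subtracted form stays in range.

```python
import math

def softmax_naive(xs):
    """Softmax computed directly; math.exp overflows for large inputs."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_stable(xs):
    """Softmax with the maximum subtracted first, so every exponent is <= 0."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Exaggerated net outputs, chosen only to trigger the overflow.
big = [1000.0, 1001.0, 1002.0]

try:
    softmax_naive(big)
    naive_ok = True
except OverflowError:
    naive_ok = False

stable = softmax_stable(big)
print(naive_ok)     # False
print(sum(stable))  # close to 1.0
```

During training the net outputs are not yet bounded, which is where this guard earns its keep; with trained weights and known input ranges it can be dropped if the scores are known to stay moderate.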
|
|
|
|
|
#260 | Link | |||
|
x264 developer
Join Date: Sep 2004
Posts: 2,393
|
Quote:
* I can eliminate the horizontal sum from the end of each dotproduct by shuffling the inputs appropriately. If there were only 1 dotp that would be counterproductive, but with lots of them I can reuse the shuffled inputs and it's a net win.
* The prescreener (once you get past the first round of dotps) was latency bound, and out of order execution can't look far enough ahead to interleave multiple copies of the whole function. Manual interleave helped a bunch. (Just moving the first dotps into a separate loop from the tail of the prescreener also helped somewhat, because then out of order execution could do something. Still far from my end result though.)
* Some more scheduling improvements smaller than the above.
* There's a lot of spatial locality in the prescreener results. I subsampled it and then evaluated the in between samples only when the neighbors disagree. (This might be counterproductive on the really slow modes.)
* Your exp approximation was excessively precise. A 6th order polynomial iirc, whose error was less than floating-point precision. I got away with 2nd order (0.1% rmse).
* Mean removal can be factored into weights for free. (Noticed just now and updated the code.)
Quote:
Quote:
Last edited by akupenguin; 21st August 2010 at 17:07. |
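
The mean-removal item can be sketched as follows, assuming the mean is subtracted from a patch and the result goes straight into a linear dot product (any additional scaling step is not covered here). Names and numbers are illustrative, not taken from the nnedi code.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean(v):
    return sum(v) / len(v)

def dot_with_mean_removal(weights, pixels):
    """Reference form: subtract the patch mean, then take the dot product."""
    m = mean(pixels)
    return dot(weights, [p - m for p in pixels])

def fold_mean_into_weights(weights):
    """Precompute w' = w - mean(w). Then dot(w', x) == dot(w, x - mean(x)),
    because sum(w) * mean(x) == mean(w) * sum(x)."""
    wm = mean(weights)
    return [w - wm for w in weights]

# Illustrative values only.
weights = [0.25, -0.5, 1.0, 0.75]
pixels = [12.0, 200.0, 37.0, 90.0]

reference = dot_with_mean_removal(weights, pixels)
folded = dot(fold_mean_into_weights(weights), pixels)
print(abs(reference - folded) < 1e-9)
```

Since the adjusted weights can be computed once ahead of time, the per-pixel mean subtraction disappears from the inner loop for the linear part.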
|||
|
|
|
|
|
|