Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
24th August 2010, 10:17 | #263 | Link | |
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
|
Quote:
(BTW: Looking forward to any speed optimization in NNEDI3)
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊ |
|
24th August 2010, 17:45 | #264 | Link |
Registered User
Join Date: Dec 2003
Location: MO, US
Posts: 999
|
I did try SSIM, but it's hard to use effectively within the online gradient descent training scheme, because the derivative for a single pixel depends on all of the other pixels within the Gaussian-weighted window around it. That means I can't calculate a single pixel result and then do gradient descent on the model weights unless I assume a value for all of the missing pixels (those which need to be interpolated by the network) and pre-compute the necessary statistics beforehand (so that during training I only have to modify a few values based on the current result to compute the gradient). What I tried was assuming all of the missing values were perfect... that gave results very close to MSE training. The one idea I didn't try was starting by assuming all of the missing values were interpolated using cubic interpolation and, after each iteration (after presenting all of the training cases to the network), recalculating all of the SSIM statistics that depend on the missing pixels based on the current state of the network. Like a lot of things, I plan to try that in the future.
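To make the window dependence concrete, here is a minimal Python sketch (not the actual nnedi3 training code). It computes SSIM for a single Gaussian-weighted window and shows that changing only a neighbouring output pixel changes the score of the window centred on the current pixel. The 11x11 window, sigma of 1.5 and the 0.01/0.03 constants are the usual SSIM defaults and are assumed here; the test image, the helper names and the 40-level perturbation are made up for illustration.

```python
import math

def gaussian_window(size=11, sigma=1.5):
    """Normalized 2D Gaussian weights (assumed SSIM defaults)."""
    c = size // 2
    w = [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]

def window_ssim(ref, out, win, peak=255.0):
    """SSIM of one window: every pixel in the window feeds the statistics."""
    c1 = (0.01 * peak) ** 2
    c2 = (0.03 * peak) ** 2
    n = len(win)
    cells = [(win[y][x], ref[y][x], out[y][x])
             for y in range(n) for x in range(n)]
    mu_r = sum(w * r for w, r, _ in cells)
    mu_o = sum(w * o for w, _, o in cells)
    var_r = sum(w * (r - mu_r) ** 2 for w, r, _ in cells)
    var_o = sum(w * (o - mu_o) ** 2 for w, _, o in cells)
    cov = sum(w * (r - mu_r) * (o - mu_o) for w, r, o in cells)
    return (((2 * mu_r * mu_o + c1) * (2 * cov + c2))
            / ((mu_r ** 2 + mu_o ** 2 + c1) * (var_r + var_o + c2)))

win = gaussian_window()
# Hypothetical 11x11 reference patch (a simple ramp).
ref = [[float(7 * x + 13 * y) for x in range(11)] for y in range(11)]

# Case 1: every "missing" pixel is assumed perfect, so SSIM is ~1.
out = [row[:] for row in ref]
base = window_ssim(ref, out, win)

# Case 2: the centre pixel (5, 5) is untouched, but its right-hand
# neighbour is off by 40 levels. The centre window's score still drops.
out[5][6] += 40.0
changed = window_ssim(ref, out, win)

print(base, changed)
```

Because the score for the centre window moves when only a neighbour changes, the gradient for one pixel cannot be formed without some assumed value for every other pixel in the window.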
Now, if you're not using online gradient descent (say CMA-ES or another fitness-function-based optimization), you could just run through the training set, interpolate all the necessary pixels, and compute SSIM for everything at the end. I actually tried that as well, but it takes A LOT more CPU time. Using online gradient descent I can train the 6x48x256 network on 60 million pixels in only a few days on my quad-core desktop. For nnedi2 I was using separable CMA-ES for training (using the scheme I just described), and training on 1/5 the pixels was taking about 3x longer while using about 4x the CPU power. And nnedi2 is basically equivalent to 4x12x32 in nnedi3 terms. Those are rough numbers of course.
Last edited by tritical; 24th August 2010 at 17:56. |
24th August 2010, 19:15 | #265 | Link |
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
|
Time to start a nnedi@home project
|
25th August 2010, 11:04 | #266 | Link | ||
x264 developer
Join Date: Sep 2004
Posts: 2,392
|
Quote:
Quote:
Last edited by akupenguin; 25th August 2010 at 12:22. |
||
25th August 2010, 16:14 | #267 | Link | ||
Registered User
Join Date: Dec 2003
Location: MO, US
Posts: 999
|
Quote:
EDIT: The partial of SSIM with respect to the network output is only for the SSIM of the 11x11 Gaussian-weighted window centered on the current pixel. I'm not explicitly taking into account other SSIM windows that depend on the network output at this point.
Quote:
Last edited by tritical; 25th August 2010 at 16:59. |
||
26th August 2010, 12:17 | #268 | Link |
x264 developer
Join Date: Sep 2004
Posts: 2,392
|
BTW, modes with more weights than fit in L1d cache are bottlenecked by cache misses, not arithmetic throughput. I didn't rectify this (except insofar as int16 reduces cache footprint too), but you probably want to if you're tuning for nsize>8x6.
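As a rough illustration of the cache-footprint point, the arithmetic can be sketched in Python. The assumptions are loud ones: the weight count is taken as one weight per window pixel per neuron (ignoring biases and any other per-neuron data), the L1 data cache is assumed to be 32 KiB, and the 8x6 window with 256 neurons is picked only to make the numbers interesting. None of this is taken from the nnedi3 source.

```python
def weight_footprint_bytes(window_w, window_h, neurons, bytes_per_weight):
    """Rough weight-table size: one weight per window pixel per neuron."""
    return window_w * window_h * neurons * bytes_per_weight

L1D_BYTES = 32 * 1024  # assumed L1 data cache size; varies by CPU

for name, size in (("float32", 4), ("int16", 2)):
    total = weight_footprint_bytes(8, 6, 256, size)
    verdict = "fits in" if total <= L1D_BYTES else "exceeds"
    print(f"{name}: {total} bytes, {verdict} the assumed L1d")
```

Under these assumptions, halving the weight size is what brings the table back under the cache limit, which is the side benefit of int16 mentioned above.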
Last edited by akupenguin; 27th August 2010 at 03:02. |
28th August 2010, 13:20 | #269 | Link |
Herr
Join Date: Apr 2009
Location: North Europe
Posts: 556
|
@tritical: Could you please make a comparison of NNEDI3, with the same source as these:
http://forum.doom9.org/showthread.ph...68#post1343668
Thanks |
29th August 2010, 11:42 | #270 | Link | |
Registered User
Join Date: Apr 2005
Posts: 213
|
Quote:
|
|
5th September 2010, 15:03 | #272 | Link |
Registered User
Join Date: Mar 2010
Location: Sweden
Posts: 13
|
I can't get EEDI3 to work with this AA-function:
Code:
o=last
AssumeBFF().SeparateFields()
dbl = mt_Average(SelectEven().EEDI3(field=0),SelectOdd().EEDI3(field=1),U=3,V=3)
dblD = mt_MakeDiff(o,dbl,U=3,V=3)
shrpD = mt_MakeDiff(dbl,dbl.RemoveGrain(11),U=3,V=3)
DD = shrpD.Repair(dblD,13)
dbl.mt_AddDiff(DD,U=3,V=3)
Last edited by Skauneboy; 5th September 2010 at 15:15. |
5th September 2010, 16:33 | #273 | Link |
Registered User
Join Date: Apr 2002
Location: Germany
Posts: 5,391
|
Oops, I don't have eedi3.dll at hand ... could you try whether it works with the following change?
dbl = mt_Average(SelectEven().EEDI3(field=0),SelectOdd().EEDI3(field=1),U=3,V=3).AssumeFrameBased()
__________________
- We´re at the beginning of the end of mankind´s childhood - My little flickr gallery. (Yes indeed, I do have hobbies other than digital video!) |
5th September 2010, 19:43 | #275 | Link |
Registered User
Join Date: Apr 2002
Location: Germany
Posts: 5,391
|
Ah, reading documentation helps. EEDI3 works framebased like NNEDIx, not fieldbased like EEDI2. Try like so:
Code:
o=last
AssumeBFF() # should not be needed ...
EEDI3(field=-2)
dbl = merge( selecteven(), selectodd() )
dblD = mt_MakeDiff(o,dbl,U=3,V=3)
shrpD = mt_MakeDiff(dbl,dbl.RemoveGrain(11),U=3,V=3)
DD = shrpD.Repair(dblD,13)
dbl.mt_AddDiff(DD,U=3,V=3)
|
12th September 2010, 20:34 | #277 | Link |
Huh?
Join Date: Sep 2003
Location: Uruguay
Posts: 3,103
|
I did some more comparisons using Gabriel Knight's "Making Of" video:
eedi3_rpow2(rfactor=2,cshift="spline36resize",hp=false)
nnedi3_rpow2(rfactor=2,nsize=3,nns=4,qual=2,cshift="spline36resize")
eedi3_rpow2(alpha=0.3,beta=0,rfactor=2,cshift="spline36resize",hp=true,vcheck=3)
Original screen batch here.
In my opinion, with the particular filterchain used on this particular source, the first version of EEDI3 gives the most pleasing results.
__________________
Read Decomb's readmes and tutorials, the IVTC tutorial and the capture guide in order to learn about combing and how to deal with it. |
13th September 2010, 13:57 | #279 | Link |
lurkster
Join Date: Jul 2009
Location: D9|D10
Posts: 123
|
Thanks also, Chainmax. I couldn't choose between them though. Each is good in different areas (parts of the mic, side of the face) but struggles in others. They are pretty much equal imho, as far as weighing up the positive and negative effects goes.
|
21st September 2010, 03:10 | #280 | Link |
Registered User
Join Date: Dec 2003
Location: MO, US
Posts: 999
|
So I finally got back to working on this, and almost have the next nnedi3 release ready... but one question first. Does anyone using this not have an SSE2 capable processor? I'm considering dropping support for SSE/MMX since it's a pain to keep around and test.
|
|
|