Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
11th September 2008, 16:29 | #201 | Link | |
Registered User
Join Date: Dec 2002
Location: UK
Posts: 1,673
|
Quote:
I ask because, in the UK at least, full time Masters degrees usually run for one year. I wonder if Tritical has started, or finished, or if the next we'll see of his idea is if/when it's deployed commercially. Any chance of an update? Cheers, David. |
|
17th September 2008, 19:10 | #202 | Link |
Registered User
Join Date: Dec 2003
Location: MO, US
Posts: 999
|
If I were going full time it would take ~3 semesters (the degree requires 30 hours; 9 hours is full time). However, I've only been part time, and working some on top of that (plus I have no desire to give up the college lifestyle right now). I should only have one more semester after this one.
Anyway, I'm still working on NNEDI... trying new ideas, etc. It has turned out to be a rather difficult problem (in terms of achieving the kind of results I think are possible). If I ever get something significantly better than the current released version working, I will definitely post it. So far, though, it's just been small improvements. I'm hoping the idea I'm currently running with will show big ones. |
21st September 2008, 02:23 | #206 | Link |
Registered User
Join Date: Dec 2003
Location: MO, US
Posts: 999
|
@Terka
It's still spatial only for now. Once that is working well, incorporating temporal information (which will most likely have to involve separate motion compensation) is definitely the next step. |
26th October 2008, 07:55 | #210 | Link | |
Registered User
Join Date: Feb 2007
Posts: 25
|
Quote:
I would like to see it even if he says it's not very fast; it's still useful if it leaves CPU free for other filters. |
|
27th January 2009, 09:50 | #213 | Link | |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Quote:
Also, any news on publicizing the algorithm behind this? I have a few ideas, but I'd like to know for sure. |
|
27th January 2009, 20:30 | #215 | Link |
x264aholic
Join Date: Jul 2007
Location: New York
Posts: 1,752
|
@Dark Shikari: You mean like giving it the full-resolution input, then having the algorithm optimize towards making the result sharp like the source, instead of just making the edges sharp like the source?
Just a guess; I don't know exactly how NNEDI works.
__________________
You can't call your encoding speed slow until you start measuring in seconds per frame. |
29th January 2009, 14:19 | #216 | Link |
Registered User
Join Date: Dec 2003
Location: MO, US
Posts: 999
|
@Dark Shikari
Yes, training specifically for resizing would make it better for that, and training specifically for anime would make it better at anime. The idea is actually pretty simple: use cubic interpolation (or some other fast method) where it won't introduce much error, split the remaining pixels into similar groups based on their local neighborhood, and have one or more neural networks for each group that are trained to output the interpolated pixel value given the local neighborhood as input.
Of course there are lots of open questions there... How much data to use, and from how many sources? How to separate local neighborhoods into groups (clustering... what method? operate on raw pixel values? do dimensionality reduction? extract specific features?)? How many groups to have? What to feed to the neural networks (raw pixel values? extracted features?)? What structure should the neural networks have? How should they be trained? What should the objective function be? How should overfitting be avoided?
The version of nnedi out now used pretty much the simplest methods, and took no steps to avoid overfitting aside from using lots of training data:
- k-means clustering with 64 clusters, clustering on the raw pixel values of the local neighborhood (mean removed)
- local neighborhood was 4x25 (100 pixels)
- clustered on ~20-25 million local neighborhoods from progressive frames from ~35-40 sources
- one neural network per cluster; input was raw pixel values (scaled to [-1,1], with the mean of the local neighborhood removed)
- trained with CMA-ES to minimize squared error
- each network had 2 hidden layers with 8 neurons apiece; each neuron used the Elliott activation function
- the network had one output neuron with a linear activation function, connected to both hidden layers; one of the neurons in the first hidden layer used linear activation
- the starting point for training was set by solving for the linear least-squares weights for the cluster and sticking those into the first-layer linear-activation neuron (basically the networks started out predicting the linear best-fit solution)
I think that is about it, or what I remember at least.
@Terka
I have thought about how to include temporal information, but it isn't all that easy. It would require accurate motion compensation, and I think training would be much more complex than spatial only. |
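For the curious, the clustering-plus-per-cluster-model idea described above can be sketched in a few lines. This is my own toy code, not tritical's: k-means on mean-removed neighborhoods, plus the per-cluster linear least-squares fit that the post says each network starts from. The Elliott activation is shown for reference; the actual two-hidden-layer networks, the 4x25 windows, the 64 clusters, and CMA-ES training are all omitted, and every name below is hypothetical.

```python
import numpy as np

def elliott(x):
    # Elliott activation used by nnedi's hidden neurons: x / (1 + |x|)
    return x / (1.0 + np.abs(x))

def kmeans(data, k, iters=20, seed=0):
    # plain k-means on mean-removed neighborhoods (nnedi used 64 clusters
    # on 4x25 windows; tiny numbers here just keep the toy fast)
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = ((data[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = data[labels == j].mean(0)
    return centers, labels

# toy "neighborhoods": 12 surrounding pixels per sample; the target pixel
# is a linear function of them, so the least-squares fit is exact here
rng = np.random.default_rng(1)
X = rng.random((500, 12))
y = X @ np.linspace(0.0, 1.0, 12)
Xc = X - X.mean(axis=1, keepdims=True)   # mean-removed, as in the post

K = 4
centers, labels = kmeans(Xc, K)

# per-cluster linear least-squares fit -- the post's training starting
# point: each cluster's network begins by predicting the linear best fit
models = {}
for j in range(K):
    A = np.c_[X[labels == j], np.ones(np.sum(labels == j))]
    models[j], *_ = np.linalg.lstsq(A, y[labels == j], rcond=None)

def predict(x):
    # route the neighborhood to its nearest cluster, apply that model
    j = ((x - x.mean() - centers) ** 2).sum(1).argmin()
    w = models[j]
    return x @ w[:-1] + w[-1]

mse = np.mean([(predict(X[i]) - y[i]) ** 2 for i in range(len(X))])
```

In the real filter each cluster's linear solution would only be the initialization; CMA-ES would then train the full nonlinear network from there.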
29th January 2009, 14:29 | #217 | Link |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Would it be possible to make a release of NNEDI that could be trained specifically for whatever purposes I wanted? I have enough CPU power to go for it...
Also, how do you recommend training--downscale sample input, run NNEDI on it, and compare the result to the original input? Won't that to some extent lead to an NNEDI that's optimized for a specific downscaling resampler?
Also, a hunch: if you're basing the neural network on neighboring pixels, are you using the differences between the neighboring pixels as well (e.g. (T-L), (LT-T), (TT-T), (LL-L), etc., where T=Top, L=Left, TL=TopLeft, TT=TopTop [two above], etc.)? I suspect this might give even better results (testing with FFV1 shows that it gives the best correlation).
(By the way, here's a recent upscale I did with NNEDI and a few other filters: Left is Lanczos, Right is NNEDI)
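The two suggestions above (building training pairs by downscaling, and feeding the network neighbor differences) could be sketched like this. This is my own toy code, not anything from NNEDI; the function names, the box downscaler, and the feature choices are all assumptions for illustration.

```python
import numpy as np

def downscale2x(img):
    # naive 2x box downscale; a real setup would want to try several
    # resamplers, since the trained net can overfit the one it saw
    h, w = img.shape
    return img[: h // 2 * 2, : w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean((1, 3))

def diff_features(img, y, x):
    # neighbor differences around (y, x): T=top, L=left, LT=top-left,
    # TT=two above, LL=two left, as suggested in the post
    T, L = img[y - 1, x], img[y, x - 1]
    LT, TT, LL = img[y - 1, x - 1], img[y - 2, x], img[y, x - 2]
    return np.array([T - L, LT - T, TT - T, LL - L])

# training pairs: features come from the downscaled image, the target is
# the corresponding pixel of the full-resolution original
rng = np.random.default_rng(0)
orig = rng.random((16, 16))
small = downscale2x(orig)
pairs = [(diff_features(small, yy, xx), orig[2 * yy, 2 * xx])
         for yy in range(2, small.shape[0])
         for xx in range(2, small.shape[1])]
```

Whether difference features actually beat mean-removed raw pixels for this task is exactly the kind of open question tritical lists above; the FFV1 correlation result is suggestive but not a guarantee.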
__________________
Follow x264 development progress | akupenguin quotes | x264 git status | ffmpeg and x264-related consulting/coding contracts | Doom10
Last edited by Dark Shikari; 29th January 2009 at 14:35. |
29th January 2009, 15:27 | #219 | Link |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Linkage
Script:
image=ImageSource("test.png")
r=image.ShowRed("YV12").nnediresize_YV12().dfttest(sigma=1).fastlinedarken().limitedsharpenfaster()
g=image.ShowGreen("YV12").nnediresize_YV12().dfttest(sigma=1).fastlinedarken().limitedsharpenfaster()
b=image.ShowBlue("YV12").nnediresize_YV12().dfttest(sigma=1).fastlinedarken().limitedsharpenfaster()
MergeRGB(r,g,b)
ConvertToYV12()
AddGrain(1,0.1,0.1)
AddGrain(2,0.2,0.2)
AddGrain(3,0.4,0.4)
AddGrain is basically for dither/weak noise. DFTtest is to deal with the JPEG artifacts from the original (the PNG is converted from an original source JPEG). Separate upscaling for each color channel is because, IMO, it seems to work better. |
29th January 2009, 23:24 | #220 | Link | |||
Registered User
Join Date: Dec 2003
Location: MO, US
Posts: 999
|
Last edited by tritical; 29th January 2009 at 23:26. |
Tags |
deinterlace, nnedi |