Doom9's Forum > Capturing and Editing Video > Avisynth Development

Old 11th September 2008, 16:29   #201  |  Link
2Bdecided
Registered User
 
Join Date: Dec 2002
Location: UK
Posts: 1,673
Quote:
Originally Posted by tritical
Don't fear, it is still being worked on. I've just been busy working on some other projects the last week or two. A bit of good news is that I got permission to run the training program on my university's 512-CPU cluster, and I'm doing some test runs as I write this. A new version should be ready in a week or two (no promises though).
I wonder if it will make it in time for this post's first birthday?

I ask because, in the UK at least, full-time Master's degrees usually run for one year. I wonder if Tritical has started or finished, or if the next we'll see of his idea is if/when it's deployed commercially.

Any chance of an update?

Cheers,
David.
Old 17th September 2008, 19:10   #202  |  Link
tritical
Registered User
 
Join Date: Dec 2003
Location: MO, US
Posts: 999
If I were going full time, it would take ~3 semesters (the degree requires 30 hours; 9 hours is full time). However, I've only been part time, while working some (plus I have no desire to give up the college lifestyle right now). I should only have one more semester after this one.

Anyway, I'm still working on NNEDI, trying new ideas, etc. It has turned out to be a rather difficult problem (in terms of achieving the type of results I think are possible). If I ever get something significantly better than the currently released version working, I will definitely post it. So far, though, it's just been small improvements. I'm hoping that the idea I'm running with now will show big improvements.
Old 17th September 2008, 20:27   #203  |  Link
Adub
Fighting spam with a fish
 
 
Join Date: Sep 2005
Posts: 2,699
Good to hear from you! I'm glad you're enjoying college life (as I am myself), and I look forward to your future work with eagerness!
__________________
FAQs:Bond's AVC/H.264 FAQ
Site:Adubvideo
Old 18th September 2008, 09:57   #204  |  Link
Terka
Registered User
 
Join Date: Jan 2005
Location: cz
Posts: 704
Will the new NNEDI also use temporal information?
Old 18th September 2008, 19:02   #205  |  Link
2Bdecided
Registered User
 
Join Date: Dec 2002
Location: UK
Posts: 1,673
Thanks for the update. I hope your thesis makes it online one day. Sadly, many universities still don't encourage this.

Cheers,
David.
Old 21st September 2008, 02:23   #206  |  Link
tritical
Registered User
 
Join Date: Dec 2003
Location: MO, US
Posts: 999
@Terka
It's still spatial only for now. Once that is working well, incorporating temporal information (which will most likely have to involve separate motion compensation) is definitely the next step.
Old 23rd September 2008, 10:04   #207  |  Link
Terka
Registered User
 
Join Date: Jan 2005
Location: cz
Posts: 704
Fingers crossed!
Old 21st October 2008, 14:09   #208  |  Link
g_aleph_r
Registered User
 
Join Date: Feb 2007
Posts: 25
Any news about a CUDA version?
I'm currently getting 12 fps on a 50 fps video; it is slooow!
Old 21st October 2008, 17:49   #209  |  Link
Adub
Fighting spam with a fish
 
 
Join Date: Sep 2005
Posts: 2,699
I don't think anyone is actually working on a CUDA version.
__________________
FAQs:Bond's AVC/H.264 FAQ
Site:Adubvideo
Old 26th October 2008, 07:55   #210  |  Link
g_aleph_r
Registered User
 
Join Date: Feb 2007
Posts: 25
Quote:
Originally Posted by tritical
...Last fall I actually spent some time writing a CUDA implementation to offload part of the calculations used for training (which are pretty much the same ones used during normal operation).
...
Hopefully, the source code for NNEDI will be available by the summer. I actually have a new version ready, I just need a free day to update the code.
Sorry to insist, but I would like to see it even if he says it's not very fast. It would still be useful, since it leaves the CPU free for other filters.
Old 26th January 2009, 10:40   #211  |  Link
Terka
Registered User
 
Join Date: Jan 2005
Location: cz
Posts: 704
Hi tritical, any news regarding new version?
Old 27th January 2009, 09:49   #212  |  Link
tritical
Registered User
 
Join Date: Dec 2003
Location: MO, US
Posts: 999
Not really. I still work on it when I get new ideas, but as it turns out the original formulation of nnedi was pretty good and not that easy to beat.
Old 27th January 2009, 09:50   #213  |  Link
Dark Shikari
x264 developer
 
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by tritical
Not really. I still work on it when I get new ideas, but as it turns out the original formulation of nnedi was quite good and not that easy to beat.
Would it work better for resizing if it were trained specifically for resizing instead of for edge interpolation? And how about, for example, making a version explicitly for cartoons by training on such material?

Also, any news on publicizing the algorithm behind this? I have a few ideas, but I'd like to know for sure.
Old 27th January 2009, 12:55   #214  |  Link
Terka
Registered User
 
Join Date: Jan 2005
Location: cz
Posts: 704
IMHO, users would be grateful if a temporal component were added.
Old 27th January 2009, 20:30   #215  |  Link
Sagekilla
x264aholic
 
Join Date: Jul 2007
Location: New York
Posts: 1,752
@Dark Shikari: You mean like giving it the full-resolution input and having the algorithm optimize toward getting the whole result as sharp as the source, instead of just getting the edges sharp like the source?

Just a guess, I don't know exactly how NNEDI works.
__________________
You can't call your encoding speed slow until you start measuring in seconds per frame.
Old 29th January 2009, 14:19   #216  |  Link
tritical
Registered User
 
Join Date: Dec 2003
Location: MO, US
Posts: 999
@Dark Shikari
Yes, training specifically for resizing would make it better for that, and training specifically for anime would make it better at anime. The idea is actually pretty simple: use cubic interpolation (or some other fast method) where it won't introduce much error, split the remaining pixels into similar groups based on their local neighborhood, and have one or more neural networks for each group that are trained to output the interpolated pixel value given the local neighborhood as input. Of course, there are lots of open questions there... How much data to use, and from how many sources? How to separate local neighborhoods into groups (clustering... what method? operate on raw pixel values? do dimensionality reduction? extract specific features?)? How many groups to have? What to feed to the neural networks (raw pixel values? extracted features?)? What structure should the neural networks have? How should they be trained? What should the objective function be? How should overfitting be avoided?
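That recipe can be sketched in a few lines. Everything below (patch sizes, cluster count, the stand-in "networks" and the mean() stand-in for cubic) is an illustrative toy, not tritical's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)
centroids = rng.standard_normal((4, 9))  # 4 fake clusters of 3x3 neighborhoods
# One trivial linear "net" per cluster (stand-in for a trained neural network)
nets = [(lambda x, w=rng.standard_normal(9): w @ x) for _ in range(4)]

def interpolate(neigh, is_hard):
    """Toy dispatch: cheap interpolation where it's safe, the per-cluster
    predictor on the hard pixels."""
    if not is_hard:
        return neigh.mean()          # stand-in for cubic interpolation
    mean = neigh.mean()
    x = neigh - mean                 # mean-removed neighborhood
    k = ((centroids - x) ** 2).sum(axis=1).argmin()  # nearest cluster
    return float(nets[k](x) + mean)  # that cluster's net, mean restored

print(interpolate(rng.random(9), True))
```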

The version of nnedi out now used pretty much the simplest methods, and took no steps to avoid overfitting aside from using lots of training data:

Clustering: k-means with 64 clusters, operating on the raw pixel values of the local neighborhood (mean removed). The local neighborhood was 4x25 (100 pixels), and clustering was run on ~20-25 million local neighborhoods taken from progressive frames from ~35-40 sources.
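A toy version of that clustering step (plain Lloyd's k-means on mean-removed patches, with sizes scaled way down from the real 4x25 / ~20 million figures):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: 1000 "local neighborhoods" of 100 pixels each
patches = rng.random((1000, 100))
patches -= patches.mean(axis=1, keepdims=True)  # remove each patch's mean

# Plain k-means (Lloyd's algorithm) with 64 clusters
k = 64
centroids = patches[rng.choice(len(patches), k, replace=False)]
for _ in range(10):
    # assign each patch to its nearest centroid (squared Euclidean distance)
    d = ((patches[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)
    # move each centroid to the mean of its assigned patches
    for j in range(k):
        if (labels == j).any():
            centroids[j] = patches[labels == j].mean(axis=0)

print(labels.shape)  # one cluster index per neighborhood
```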

Networks: one neural network per cluster, with raw pixel values as input (scaled to [-1,1], with the mean of the local neighborhood removed), trained with CMA-ES to minimize squared error. Each network had 2 hidden layers with 8 neurons apiece, each neuron using the Elliott activation function, plus one output neuron with a linear activation function connected to both hidden layers; one of the neurons in the first hidden layer also used linear activation. The starting point for training was set by solving for the linear least-squares weights for the cluster and putting those into the first-layer linear-activation neuron (so the networks basically started out predicting the linear best-fit solution).

I think that is about it, or what I remember at least.
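Read literally, the forward pass of such a per-cluster network might look something like this; the weights here are random toy values and the exact wiring is my guess from the description above:

```python
import numpy as np

def elliott(x):
    """Elliott activation: a cheap sigmoid-like squashing, x / (1 + |x|)."""
    return x / (1.0 + np.abs(x))

def nnedi_like_forward(x, W1, b1, W2, b2, wo1, wo2, bo):
    """Guessed forward pass: two hidden layers of 8 neurons with Elliott
    activation, except neuron 0 of the first layer (linear), and a linear
    output neuron fed by BOTH hidden layers."""
    h1 = W1 @ x + b1
    h1 = np.concatenate(([h1[0]], elliott(h1[1:])))  # neuron 0 stays linear
    h2 = elliott(W2 @ h1 + b2)
    return wo1 @ h1 + wo2 @ h2 + bo                  # linear output neuron

# Toy weights for a 100-pixel (4x25) mean-removed neighborhood
rng = np.random.default_rng(1)
x = rng.standard_normal(100)
W1, b1 = rng.standard_normal((8, 100)) * 0.1, np.zeros(8)
W2, b2 = rng.standard_normal((8, 8)) * 0.1, np.zeros(8)
wo1, wo2, bo = rng.standard_normal(8) * 0.1, rng.standard_normal(8) * 0.1, 0.0
print(nnedi_like_forward(x, W1, b1, W2, b2, wo1, wo2, bo))
```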

@Terka
I have thought about how to include temporal information, but it isn't all that easy. It would require accurate motion compensation, and I think training would be much more complex than spatial only.
Old 29th January 2009, 14:29   #217  |  Link
Dark Shikari
x264 developer
 
 
Join Date: Sep 2005
Posts: 8,666
Would it be possible to make a release of NNEDI that could be trained specifically for whatever purposes I wanted? I have enough CPU power to go for it...

Also, how do you recommend training: downscaling sample input, running it through NNEDI, and comparing the result to the original input? Won't that to some extent lead to an NNEDI that's optimized for a specific downscaling resampler?

Also, a hunch: if you're basing the neural network on neighboring pixels, are you using the differences between the neighboring pixels as well (e.g. (T-L), (TL-T), (TT-T), (LL-L), etc., where T=Top, L=Left, TL=TopLeft, TT=TopTop [two above], etc.)? I suspect this might give even better results (testing with FFV1 shows that it gives the best correlation).

(By the way, here's a recent upscale I did with NNEDI and a few other filters: Left is Lanczos, Right is NNEDI)

Last edited by Dark Shikari; 29th January 2009 at 14:35.
Old 29th January 2009, 15:21   #218  |  Link
*.mp4 guy
Registered User
 
 
Join Date: Feb 2004
Posts: 1,348
Could you also post the source image?
Old 29th January 2009, 15:27   #219  |  Link
Dark Shikari
x264 developer
 
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by *.mp4 guy
Could you also post the source image?
Linkage

Script:

image=ImageSource("test.png")
# Upscale each RGB channel separately with nnedi, then denoise and sharpen
r=image.ShowRed("YV12").nnediresize_YV12().dfttest(sigma=1).fastlinedarken().limitedsharpenfaster()
g=image.ShowGreen("YV12").nnediresize_YV12().dfttest(sigma=1).fastlinedarken().limitedsharpenfaster()
b=image.ShowBlue("YV12").nnediresize_YV12().dfttest(sigma=1).fastlinedarken().limitedsharpenfaster()
MergeRGB(r,g,b)
ConvertToYV12()
# AddGrain passes: dither/weak noise
AddGrain(1,0.1,0.1)
AddGrain(2,0.2,0.2)
AddGrain(3,0.4,0.4)

AddGrain is basically for dither/weak noise. DFTTest is to deal with the JPEG artifacts in the original (the PNG was converted from a source JPEG). Upscaling each color channel separately is because, IMO, it seems to work better.
Old 29th January 2009, 23:24   #220  |  Link
tritical
Registered User
 
Join Date: Dec 2003
Location: MO, US
Posts: 999
Quote:
Would it be possible to make a release of NNEDI that could be trained specifically for whatever purposes I wanted? I have enough CPU power to go for it...
It's possible, and I have thought about it before (allowing users to give training data). It would take a little work as the training code is scattered among multiple programs.

Quote:
Also, how do you recommend training--downscaling sample input, NNEDI, and comparing it to original input? Won't that to some extent lead to an NNEDI that's optimized to a specific downscaling resampler?
It will be biased towards that resampler, but is there a better way to do it? In most of the papers I've read they test upsampling by downscaling large images (usually with basic averaging + some sharpening, trying to approximate how various imaging devices work).
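The downscale-then-compare setup under discussion could be sketched like this; the 2x2 box average below is just a stand-in for the "basic averaging" resampler, and the patch/target layout is illustrative, not NNEDI's actual geometry:

```python
import numpy as np

def downscale_2x(img):
    """Simple 2x2 box average, roughly the 'basic averaging' mentioned above."""
    h, w = img.shape
    return img[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def make_training_pairs(hires, patch=5):
    """Each sample: a low-res neighborhood as input, the true high-res
    pixel at the patch center's position as the training target."""
    lowres = downscale_2x(hires)
    xs, ys = [], []
    for i in range(lowres.shape[0] - patch):
        for j in range(lowres.shape[1] - patch):
            xs.append(lowres[i:i + patch, j:j + patch].ravel())
            # target comes from the original high-res frame
            ys.append(hires[(i + patch // 2) * 2, (j + patch // 2) * 2])
    return np.array(xs), np.array(ys)

rng = np.random.default_rng(2)
X, y = make_training_pairs(rng.random((64, 64)))
print(X.shape, y.shape)
```

The bias being discussed enters exactly at `downscale_2x`: swap in a different resampler and the targets (and thus the trained network) change.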

Quote:
Also, a hunch: if you're basing the neural network on neighboring pixels, are you using the differences between the neighboring pixels as well (e.g. (T-L), (LT-T), (TT-T), (LL-L), etc, where T=Top, L=Left, TL=TopLeft, TT=TopTop [two above], etc)? I suspect this might give even better results (testing with FFV1 shows that it gives the best correlation).
I don't do that. Theoretically, it is unnecessary/redundant, as those differences are simply linear combinations of the input variables... so the input-layer neurons could learn the same mappings given the original pixel values as input as they could given those differences. It might make the learning faster though; I'd have to try.
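That redundancy argument is easy to check numerically: any set of pixel differences is a fixed matrix D applied to the raw pixels, so first-layer weights W acting on the differences are equivalent to weights W·D acting on the pixels directly (a toy check, not code from NNEDI):

```python
import numpy as np

# Raw inputs: pixels [T, L, TL, TT] (top, left, top-left, top-top)
p = np.array([120.0, 100.0, 110.0, 130.0])

# The difference features (T-L), (TL-T), (TT-T) as one matrix D
D = np.array([
    [ 1, -1,  0,  0],   # T - L
    [-1,  0,  1,  0],   # TL - T
    [-1,  0,  0,  1],   # TT - T
])
diffs = D @ p

# Any first-layer weights W on the diff features are equivalent to
# weights (W @ D) acting on the raw pixels:
W = np.array([[0.5, -0.2, 0.3]])
print(W @ diffs, (W @ D) @ p)  # identical outputs
```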

Last edited by tritical; 29th January 2009 at 23:26.