Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
18th June 2005, 12:17 | #1 | Link |
Registered User
Join Date: May 2003
Posts: 27
|
filter: ExtendedBilateral
Well following up from this thread I decided to try and implement an extended version of the bilateral filter. Using the paper in that thread as well as another that suggests a couple of improvements to the bilateral filter, I came up with ExtendedBilateral.
The extension is that it adds an "initial estimation preprocess" before the regular bilateral filtering step. Because of this, weaker smoothing settings are used when running both the preprocessor and the main bilateral filtering step. ExtendedBilateral can also be used in conjunction with TBilateral via its "clip2" parameter. I haven't had much time to test it fully, but the quick tests I did do were slow until I added a new LUT (thanks to tritical for the suggestion). I still have one more thing to do (remove the upsampling, which I may not get around to for another couple of days), but even after that I suspect it will still be a bit slower than TBilateral, given my lack of experience in this kind of thing. Since I haven't tested all the features, I'd love comments or bug reports; knowing me, I made some dumb mistake somewhere that completely screws up some particular part of the process. I suspect this ought to work better on sources with large blocks of colour (cartoon/anime) and not quite as well on live-action sources (though I haven't tested that). I could also use some suggestions for speed-ups, if anyone actually looks at the code. Here's the latest version of the filter: gone now Last edited by insanedesio; 17th April 2011 at 10:43. |
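For context, the LUT tritical suggested amounts to precomputing the range weight for every possible absolute pixel difference once, so the per-pixel loop does a table lookup instead of calling exp() per pixel pair. A minimal sketch of the idea (the function name and types are illustrative, not the plugin's actual code):

```cpp
#include <cmath>
#include <vector>

// Precompute Gaussian range weights for every possible absolute
// difference between two 8-bit pixel values (0..255). The inner
// filtering loop then indexes lut[abs(a - b)] instead of calling
// exp() once per pixel pair.
std::vector<double> buildRangeLUT(double sigma) {
    std::vector<double> lut(256);
    for (int d = 0; d < 256; ++d)
        lut[d] = std::exp(-(static_cast<double>(d) * d) / (2.0 * sigma * sigma));
    return lut;
}
```

With typical sigma values this turns many thousands of exp() calls per frame into a single 256-entry table build per parameter set.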
21st June 2005, 21:37 | #2 | Link |
Registered User
Join Date: Nov 2004
Location: Spain
Posts: 408
|
Hi,
After giving ExtendedBilateral a try, I can report: it's sloooooooooow. It took about two minutes to start processing (the time to load and process the script in VirtualDub) on my Pentium III 800 PC, and roughly the same again to advance to the next frame. I used it to produce a ppclip for tritical's TBilateral filter. I can say there is a slight, but noticeable, improvement in edge preservation. The speed, alas, makes the filter unusable for me (perhaps on a super Cray computer?). Best regards |
22nd June 2005, 04:32 | #3 | Link |
Registered User
Join Date: May 2003
Posts: 27
|
Yeah, it's pretty damn slow... though not as slow as you're saying (the PIII 800 might have something to do with it). In my tests it ran at about 40-50% the speed of TBilateral (for the main bilateral process), and the preprocessor was a bit faster than that. Once I take out the upsampling it should be faster (though by how much, I'm not sure... a quick test showed it faster by an fps or two, but the actual difference ought to be more, since the actual pixel processing changes and becomes simpler...). If you're getting processing times that are really slow (like less than 30% of TBilateral), then I'd be confused. (If you think this version is slow, you should have seen 0.5.0.1 - eight times or so slower.)
What settings did you use, the defaults? Some settings will slow the filter down significantly at certain values. Also, when using it with TBilateral, you need to make sure that both filters use the same corresponding dev and sigma values (info in the readme). You might want to use smaller devs in TBilateral than you'd normally use, since the preprocessor adds a bit of smoothing on its own. As for the "edge preservation", the real strength of this process is that it is supposed to be able to tackle noise closer to edges than a regular bilateral process can. So yeah, in some sense the edge preservation is supposed to be better (more noise removal close to edges). EDIT: Another thing ExtendedBilateral offers is the multiple kernels (the flat one is supposed to be pretty fast, and some of the others should be a lot faster than the default Gaussian, too... check the PRASA paper mentioned in the readme for more info), as well as the median bilateral filters, which are also suggested as improvements in the same paper. Last edited by insanedesio; 22nd June 2005 at 08:56. |
22nd June 2005, 21:49 | #6 | Link |
Registered User
Join Date: Nov 2004
Location: Spain
Posts: 408
|
Scripting slowly
Hi:
I have repeated my tests. My script was:

setmemorymax(128)
loadplugin("C:\Archivos de programa\AviSynth 2.5\plugins2\extendedbilateral.dll")
t=avisource("bvi_v1.avi").assumetff().crop(256,0,256,0,align=true)
q=t.TDeint(order=-1,mode=1,field=1,type=0,sharp=true,mtnmode=0,mthreshL=1,mthreshC=1,cthresh=1).converttoyv12()
i=q.ExtendedBilateral(preprocess=0)
v=q.tbilateral(sdevl=5,sdevc=5,ppclip=i,gui=false,chroma=true)
stackhorizontal(q,v,subtract(q,v))

and the timing:
- Time to process frame 0: 2 minutes 30 seconds.
- Time to skip to frame 1: 2 minutes 25 seconds.

On my PC: Pentium III 800 EB (with more cache), 384 MB RAM, Windows XP Professional with no processes running (other than antivirus and housekeeping). Version 0.5.0.2 of the filter.

By the way: the more, the merrier. I appreciate all the filters that people share for free. All of them. Even the less applicable ones. |
23rd June 2005, 02:57 | #8 | Link |
Registered User
Join Date: May 2003
Posts: 27
|
@Dreassica
That's even weirder, because that's what I'm on... exactly. It ran at, I think, 3-4 fps for me (TBilateral at 7), can't remember (just the main bilateral filter). Of course, I was running it with just avisource() and ConvertToYUY2() and default settings. What settings did you use, the defaults? Are you using it in conjunction with TBilateral? Maybe ExtendedBilateral doesn't like working with other filters for some reason. Although the clip I tested on was only 40 frames long... but I don't think that should matter.

@AVIL That's even worse than I thought originally, since it's only the preprocess, which for me ran at ~4-6 fps. It's possible the SetMemoryMax() might have something to do with it, but there's not much avoiding that if you're on 384 MB RAM. My guess would be that your 800 MHz processor has something to do with it (~4 times slower than mine); however, 2 minutes doesn't make sense. I didn't test it in conjunction with TBilateral, but the preprocessor on its own took a maximum of about six seconds to load and then less than half a second (probably around a quarter or a fifth) to render. Try ExtendedBilateral on its own, without TBilateral (with preprocess=2, the default). I'm going to run some tests using both in conjunction. Last edited by insanedesio; 23rd June 2005 at 03:10. |
23rd June 2005, 03:47 | #9 | Link |
Registered User
Join Date: May 2003
Posts: 27
|
Added new version (0.5.0.3) in the first post. This should be faster (thanks to a tip from tritical; he pointed out something that just ate up time in the upsampling routine... which I'm still going to kill, eventually...), especially if you're using YV12. In my YUY2 tests it brought the load time down significantly and brought the processing speed up by 1 fps (a 20% increase; I explain in the readme how it could actually be up to 1.9 fps, which is 42% in that case... though that's probably pushing it... a lot). Hopefully that should work better for you guys.
Last edited by insanedesio; 23rd June 2005 at 03:52. |
23rd June 2005, 04:13 | #10 | Link |
Evil tweaker...
Join Date: Sep 2002
Posts: 33
|
@insanedesio
I'm sorry ^^ I don't take it as a competition, but I have plenty of work and almost no time to implement ^^ so you win because you did it before me, that's all ^^ (and it's really a good thing; I'm eager to try it). I will continue to give you my opinion (and/or advice) on this topic, but I wonder whether I'll find the time to throw away MATLAB and focus on a real AviSynth plug-in. Long life to the skillful developers ^^ (signed: the evil - but inexperienced - tweaker). By the way, be happy: in MATLAB it takes about 10 minutes for one frame. For the moment I have some ideas for new functions to replace the Gaussian one, but... even slower ^^ (I have to stop the maths and write some code ^^) Last edited by ambrotos; 23rd June 2005 at 04:45. |
23rd June 2005, 07:32 | #12 | Link |
Evil tweaker...
Join Date: Sep 2002
Posts: 33
|
In fact, I was thinking about something for the range filter (the estimator of the weight, given the "color-distance") in the second step of the bilateral filter.
For a given distance (in color-space): d = abs[ pixel_value(x) - pixel_value(e) ]

weight(d) = 1 / sqrt[ 1 + (d / d_c)^(2*n) ]

It is a kind of low-pass filter, where d_c is the cutoff distance and n is the order of the filter. The shape can be chosen very close to a Gaussian (low order, e.g. n = 1), but it is more flexible, because we can set a filter which gives high weights up to a certain color-distance and then cuts off sharply (with a high enough order).

Example for n = 5 (red), n = 50, and n = 500 (blue), with d_c = 500.

So, in the case of a high order (e.g. n > 500), for a uniformly colored area with noise (below the cutoff level) it acts as a true Gaussian filter, and if there is a detail (above the threshold) it uses the bilateral ability. But it is not just a binary threshold, as it can be softened by using a lower order. And by tweaking the d_c value you can delay the response to a change of color.

Any questions? Last edited by ambrotos; 23rd June 2005 at 12:14. |
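The weight function described above (a Butterworth-style low-pass response applied to the color distance) can be written down directly. A sketch of the formula as posted, with an illustrative function name:

```cpp
#include <cmath>

// Butterworth-style range weight from the post:
//   weight(d) = 1 / sqrt(1 + (d / d_c)^(2n))
// d   : absolute color difference |pixel_value(x) - pixel_value(e)|
// d_c : cutoff distance (the weight is exactly 1/sqrt(2) at d = d_c)
// n   : filter order; higher n makes the cutoff sharper
double butterworthWeight(double d, double d_c, int n) {
    return 1.0 / std::sqrt(1.0 + std::pow(d / d_c, 2.0 * n));
}
```

At high orders the weight stays near 1 for d below d_c and drops almost to 0 just above it, which is the near-binary threshold behaviour the post describes; lower orders soften the transition.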
23rd June 2005, 20:12 | #13 | Link |
Registered User
Join Date: May 2003
Posts: 27
|
Can't say I feel like adding that, since it takes two parameters and all the kernels I use take only one. It is also mathematically similar to the El-Fallah Ford kernel; the only difference is that the El-Fallah Ford kernel doesn't have a variable order... it's just (1 + (diff / sigma) ^ 2) ^ (-0.5).
Last edited by insanedesio; 23rd June 2005 at 20:14. |
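For comparison, the one-parameter El-Fallah Ford kernel quoted above is exactly the proposed weight with the order fixed at n = 1 (with sigma playing the role of d_c). A sketch with illustrative naming:

```cpp
#include <cmath>

// El-Fallah/Ford range kernel: (1 + (diff/sigma)^2)^(-0.5).
// This is the Butterworth-style weight from the previous post with
// the order fixed at n = 1, which is why it needs only one parameter.
double elFallahFordWeight(double diff, double sigma) {
    double r = diff / sigma;
    return 1.0 / std::sqrt(1.0 + r * r);
}
```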
24th June 2005, 00:11 | #15 | Link |
Registered User
Join Date: May 2003
Posts: 27
|
I'm pretty new to *actual* programming, so I don't really know what you mean by "profiling". I've looked at the code and tried to find places where I think it might be slow, but I have no idea how to narrow down the places that are actually slowing it down. I believe the most recent change I made might've taken care of one culprit when working in YV12, and I'm pretty sure it'll be faster still when I take out the upsampling, but beyond that I really have no clue.
|
24th June 2005, 01:15 | #16 | Link |
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
|
You can use the free tool CodeAnalyst from AMD (http://www.amd.com/us-en/Processors/...9_3604,00.html). It works even with a non-AMD processor, although only the timer part (but that is the part you need). Or you can use QueryPerformanceCounter, or another timer, to measure the time it takes to complete different parts of your code.
|
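As a sketch of the second suggestion: bracket the suspect routine with a high-resolution timer and compare the elapsed times. QueryPerformanceCounter itself is Win32-specific (it fills a LARGE_INTEGER through a pointer argument); the version below uses std::chrono as a portable stand-in, which assumes a C++11-or-later compiler rather than the tools of the era:

```cpp
#include <chrono>

// Time one section of code in microseconds. Call it around each
// candidate routine (upsampling, pprocess(), process(), ...) and
// compare the totals to see where the frame time actually goes.
template <typename F>
long long timeSectionMicros(F&& section) {
    auto t0 = std::chrono::steady_clock::now();
    section();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
}
```

A hypothetical call site would look like: long long us = timeSectionMicros([&]{ pprocess(frame); });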
24th June 2005, 06:08 | #18 | Link |
Registered User
Join Date: May 2003
Posts: 27
|
Well, I took a look at CodeAnalyst and couldn't quite figure out how to get it to work (especially with a .dll). Mind giving me a quick walkthrough? I must say I am rather interested in finding out whether the upsampling really is that slow, especially after the change in 0.5.0.3 that made it faster. Keeping it in there makes the coding simpler, the way I did it at least... if it's only a small culprit then it probably isn't worth taking out, but if it's bigger I'll have to take the time to get rid of it :-\.
Last edited by insanedesio; 24th June 2005 at 06:34. |
24th June 2005, 15:02 | #19 | Link |
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
|
Sure. First, create a release build with Debug Information Format set to Program Database (it hides in Project Properties->C/C++->General), Generate Debug Info set to Yes (under Project Properties->Linker->Debugging), and Enable Incremental Linking set to Yes (under Project Properties->Linker->General).
Next, start CodeAnalyst and create a new project (File->New). Set Session name to "Timer 1", Project directory to the place you want the generated files saved, and Project Name to "ExtendedBilateral". Ignore Working directory, and set Launch app to the player you use to open the .avs file. I use VirtualDub, so I set Launch app to: "c:\vd16\virtualdub.exe" "c:\test.avs" so that VirtualDub opens the .avs script that contains the filter you want to profile. Set the session type to Timer Trigger, enable Project information->Terminate app, and set the duration to 360 seconds and the start delay to 10 seconds (the time it takes for VirtualDub to open the .avs file and for you to click play once it's open). Now click on play, and when VirtualDub has opened the .avs file, hit play in VirtualDub and wait while CodeAnalyst collects the data. When CodeAnalyst is done, VirtualDub is closed; find your session under TBP sessions and double-click it. The System Data tab appears with all the loaded programs and DLLs. Find ExtendedBilateral and double-click on it. Now the list with the time per function appears; function names are shown in the Symbol + Offset column. If it only displays NO SYMBOL, it might be necessary to use the command-line tool, or to check whether the linker options are being disabled by some other setting (like enable global optimization). If you double-click a function name, the source code appears alongside the timer information. |
24th June 2005, 23:02 | #20 | Link |
Registered User
Join Date: May 2003
Posts: 27
|
Alright, the test results basically told me that the upsampling/downsampling takes about 1.5% of the whole filter's processing time; the rest goes to the pprocess() and process() functions, which are the main functions anyway. Small amounts (a couple of samples amidst thousands) went to the LUT-building functions (and thus the kernel functions). The only thing that can be done now, I suppose, is to somehow speed up the main functions themselves. Though the sampling didn't take up much time on its own, the main functions ought to be faster once I get rid of it, since they will have less to process, assuming I set it up right, which I still need to think about before I get started.
Other than that, there wasn't much else. _ftol() took up 6% or so, but I don't think much can be done about that. |