Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion.

Old 16th July 2004, 00:47   #1  |  Link
kassandro
Registered User
 
Join Date: May 2003
Location: Germany
Posts: 502
RemoveGrain

RemoveGrain is a spatial denoiser for progressive video. Because of SSE optimisation, it is fairly fast (above 100 fps on my wimpy 1.3 GHz Celeron). There are now significantly faster SSE2 and SSE3 versions. For more details go to www.RemoveGrain.de.tf.
16|07|04 version 0.4 (first public version) released.
17|08|04 version 0.5 released (for a change log see the posting below)
12|09|04 version 0.6 released (for a change log see the posting below)
13|02|05 version 0.7 released (for a change log see the posting below)
09|04|05 version 0.8 released (for a change log see the posting below)
01|05|05 version 0.9 released (for a change log see the posting below)

Last edited by kassandro; 1st May 2005 at 14:41.
Old 16th July 2004, 21:10   #2  |  Link
Boulder
Pig on the wing
 
Join Date: Mar 2002
Location: Finland
Posts: 5,836
Nearly missed this one. I'll test it tomorrow on my P4 system.
__________________
And if the band you're in starts playing different tunes
I'll see you on the dark side of the Moon...
Old 16th July 2004, 21:53   #3  |  Link
Boulder
I can't get the SSE2 version working. I first extracted both RemoveGrain.dll and RemoveGrainSSE2.dll into my plugins folder and then tried removing RemoveGrain.dll. In both cases I got the error message "there is no function named DRemoveGrain".
Old 16th July 2004, 22:51   #4  |  Link
ARDA
Registered User
 
Join Date: Nov 2001
Posts: 291
Same here; and would like to test this filter

Thanks ARDA
Old 17th July 2004, 13:08   #5  |  Link
kassandro
Sorry, I was a little bit too fast and too tired (after midnight), when I rushed out the plugin. Actually RemoveGrainSSE2.dll was the non-test version, which I wanted to release only after satisfactory testing. I hope that it's correct now. Thanks in advance for testing.
I also made some slight corrections in the documentation.
Old 17th July 2004, 15:17   #6  |  Link
Boulder
OK, I tested the SSE2 version, and there were no differences in the DebugView log. However, the only modes I got working were 5,6,7 and 8, all the others gave an access violation when the script was loaded in VirtualDub. I used a 720x576 MPEG-2 capture as the test material.

Other than that, the filter seems to work very well, mode 8 did an excellent job with my TV caps when followed by RemoveDirt(). Which reminds me, are you planning on adding SSE2 optimizations to RemoveDirt as well?
Old 18th July 2004, 01:43   #7  |  Link
kassandro
First, many thanks for testing.

Quote:
Originally posted by Boulder
However, the only modes I got working were 5,6,7 and 8, all the others gave an access violation when the script was loaded in VirtualDub. I used a 720x576 MPEG-2 capture as the test material.
Hmmm, was this a read access or a write access violation? Did the access violation also appear with the non-SSE2 version (the only one I could test)? Does the access violation persist if you crop before RemoveGrain with crop(4,4,-4,-4, align=false)? Finally, there may be a bug in the difference filter, though this is extremely unlikely, because it works with the other modes. Nevertheless, it is worth trying without it. It is enough to test this for the simplest mode=1, where surprisingly the access violation already happens. I expect that you did the tests with a YV12 clip, which is the natural one for MPEG-2. The other color spaces are not well tested. In fact, I am stunned by the very poor performance for the non-YV12 color spaces, but I have postponed the investigation of this problem for a while.

Quote:

Other than that, the filter seems to work very well, mode 8 did an excellent job with my TV caps when followed by RemoveDirt(). Which reminds me, are you planning on adding SSE2 optimizations to RemoveDirt as well?
I did some work in this direction, but the performance gain will be quite small as long as the block width is 8. If the block width is changed to 16, SSE2 will become very valuable, but then the frame width has to be a multiple of 16. Also, unlike RemoveGrain, RemoveDirt heavily uses the psadbw instruction, and this instruction behaves differently on the 128 bit SSE registers (used with SSE2) and the 64 bit MMX registers (which must be used by the non-SSE2 version). Thus, while I get the RemoveGrain SSE2 version for free using smart macros (it only requires testing), this is no longer the case for RemoveDirt, where I have to make modifications in the places where I use the psadbw instruction. But since Intel has adopted the Athlon64 platform, we will all move to this platform within the next 2-3 years, because with its 16 general purpose registers (14-15 registers will then be usable, currently only 6-7 registers can be used) C++ compilers can generate significantly more efficient code even for 32 bit apps on the Athlon64 platform. Also the video resolutions will increase (HDTV). These two reasons more than justify a block width of 16 (the block height will stay at 8), which also has some slight disadvantages.
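
To illustrate the psadbw difference: on a 64 bit MMX register the instruction yields one sum of absolute differences over 8 bytes, while on a 128 bit SSE2 register it yields two separate sums, one per 8 byte half, which still have to be added together. The following is only a rough Python model of that behaviour; the function names are made up for this sketch and it is not RemoveDirt's code.

```python
def psadbw_mmx(a, b):
    """Model of psadbw on a 64 bit MMX register: one SAD over 8 bytes."""
    assert len(a) == len(b) == 8
    return sum(abs(x - y) for x, y in zip(a, b))


def psadbw_sse2(a, b):
    """Model of psadbw on a 128 bit SSE2 register: two independent SADs,
    one for the low 8 bytes and one for the high 8 bytes."""
    assert len(a) == len(b) == 16
    return psadbw_mmx(a[:8], b[:8]), psadbw_mmx(a[8:], b[8:])


row_a = bytes(range(16))   # pixel values 0, 1, ..., 15
row_b = bytes([3] * 16)    # constant value 3
low, high = psadbw_sse2(row_a, row_b)
total = low + high         # the extra addition needed in the SSE2 case
```

So code written around a single 8 byte result cannot simply be switched to 16 byte registers by a macro; every place that consumes the result has to combine the two halves.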

Last edited by kassandro; 18th July 2004 at 01:47.
Old 19th July 2004, 15:21   #8  |  Link
Chainmax
Huh?
 
Join Date: Sep 2003
Location: Uruguay
Posts: 3,103
This is the filter that you mentioned works similarly to Undot, right?
Old 19th July 2004, 19:01   #9  |  Link
Boulder
@kassandro: I'll do the tests tomorrow and let you know what happens.
Old 19th July 2004, 23:30   #10  |  Link
kassandro
Quote:
Originally posted by Chainmax
This is the filter that you mentioned works similarly to Undot, right?
Yes, RemoveGrain(mode=1) = UnDot(). Even the speed of the SSE version (with mode=1, of course) is identical to that of Undot. Like Undot and many other spatial denoisers, the other eight modes are also based on the eight neighbours of a pixel. The details of the algorithms can be found in the documentation.
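
As a rough illustration of what "based on the eight neighbours" means for mode=1: each interior pixel is clamped to the range spanned by the minimum and maximum of its eight neighbours. The Python sketch below works on a plain list of rows and uses made-up function names; it only models the idea and says nothing about how the plugin's SSE code is organised.

```python
def clip_to_neighbours(plane, x, y):
    """Clamp pixel (x, y) to the [min, max] range of its 8 neighbours."""
    neighbours = [plane[y + dy][x + dx]
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                  if (dx, dy) != (0, 0)]
    return min(max(plane[y][x], min(neighbours)), max(neighbours))


def remove_grain_mode1(plane):
    """Apply the clamp to every interior pixel; border pixels are copied."""
    height, width = len(plane), len(plane[0])
    out = [row[:] for row in plane]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            out[y][x] = clip_to_neighbours(plane, x, y)
    return out


plane = [
    [10, 12, 11, 13],
    [11, 200, 12, 10],   # 200 is an isolated bright dot
    [12, 10, 0, 11],     # 0 is an isolated dark dot
    [13, 11, 12, 10],
]
result = remove_grain_mode1(plane)
```

In this toy plane the bright dot is pulled down to its brightest neighbour and the dark dot is pulled up to its darkest neighbour, while pixels that already lie within their neighbours' range are untouched.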

@Boulder: many thanks already. So far, I can't reproduce these ugly access errors. A thorough inspection of the source code also didn't help.
Old 20th July 2004, 09:30   #11  |  Link
Boulder
Quote:
Hmmm, was this a read access or a write access violation? Did the access violation also appear with the non-SSE2 version (the only one I could test)?
It gives a read access violation, the exact message is "Avisynth: caught an access violation at 0x012f151a, attempting to read from 0xffffffff". The violation doesn't appear on the non-SSE2 version, all modes are OK.

Quote:
Does the access violation persist if you crop before RemoveGrain with crop(4,4,-4,-4, align=false)?
Yes, modes 4-8 are the only ones that work.

Quote:
Finally, there may be a bug in the difference filter, though this is extremely unlikely, because it works with the other modes. Nevertheless, it is worth trying without it. It is enough to test this for the simplest mode=1, where surprisingly the access violation already happens.
No go, so the problem must be somewhere within RemoveGrain.

Quote:
I expect that you did the tests with a YV12 clip, which is the natural one for MPEG-2. The other color spaces are not well tested. In fact, I am stunned by the very poor performance for the non-YV12 color spaces, but I have postponed the investigation of this problem for a while.
Yep, YV12 it is, I don't deal with YUY2 anymore since I bought the PVR-250.

Hope this all helps
Old 21st July 2004, 17:19   #12  |  Link
kassandro
Thank you Boulder for all your efforts, but I'll have to withdraw the SSE2 version. I spent quite some time trying to find the bug(s) using your precise observations, but all I found was one wrong SSE2 macro for mode 1 and nothing in the other modes. Surprisingly my Celeron with Tualatin P3 design swallows the SSE2 version without any access error (also with 720x576 MPEG-2 input). Of course, the output is not quite correct. While with true SSE2 one can handle 16 pixels at a time, with the P3 SSE2 only the first 8 bytes of the 16 byte SSE register are processed. Thus if chroma processing is disabled (i.e. modeU=0), then after 8 correctly processed pixels there are 8 incorrectly processed pixels with the SSE2 version on a Tualatin P3 (I don't know what happens on a Katmai or Coppermine P3 or on the various Athlons). I simply have to get my hands on a P4 to finish the job. Programming for the P4 without having one simply does not work.
Old 21st July 2004, 17:28   #13  |  Link
Boulder
Well, fortunately the speedup isn't that big, I'd estimate somewhere along 8-12% on my system. I suppose that mode 8 (which I also prefer) works OK, at least according to the difference debug log, so the SSE2 version can be used and I probably will.

I don't know if there are any developers who have a P4 system and could help..maybe it would be worth asking around a bit.

Thanks for your efforts anyway
Old 21st July 2004, 19:56   #14  |  Link
dbzgundam
Hates all his encodes
 
Join Date: Sep 2003
Posts: 166
Why can't I access the page?

It's loading but I see a blank page.
__________________
http://thevideophile.blogspot.com/ Watch and be amazed by my frustration over the video world!
Old 22nd July 2004, 15:58   #15  |  Link
kassandro
Quote:
Originally posted by Boulder
Well, fortunately the speedup isn't that big, I'd estimate somewhere along 8-12% on my system. I suppose that mode 8 (which I also prefer) works OK, at least according to the difference debug log, so the SSE2 version can be used and I probably will.
8-12% is quite disappointing. The following formula should hold
Code:
 SSE processing time - SSE memory access time = 2 * (SSE2 processing time - SSE2 memory access time)
Now in SSE2 mode, RemoveGrain tries to read and write 16 bytes in one stroke, while in SSE mode only 8 bytes are read and written in one stroke. In 90-100% of all cases RemoveGrain reads from the L1 cache. However, reading and writing is usually unaligned. In SSE3 (Prescott P4) there is even a special instruction for unaligned reading from memory, which may be helpful.
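
To put numbers on the formula, here is a small worked example. It rests on an assumption not stated above, namely that the memory access time M is the same in both versions. Normalising the SSE processing time to 1 and writing the SSE2 time as 1 / speedup, the formula becomes 1 - M = 2 * (1 / speedup - M), which gives M = 2 / speedup - 1. The function name is made up for this sketch.

```python
def implied_memory_share(speedup):
    """Share of the SSE run time that memory access would have to take
    for the formula to hold, assuming (hypothetically) that the memory
    access time is identical in the SSE and SSE2 versions.

    With T_sse = 1 and T_sse2 = 1 / speedup:
        1 - M = 2 * (1 / speedup - M)   =>   M = 2 / speedup - 1
    """
    return 2.0 / speedup - 1.0


# Reported range of 8-12% plus the ideal case of a doubled speed.
for speedup in (1.08, 1.12, 2.0):
    share = implied_memory_share(speedup)
    print(f"speedup {speedup:.2f}x -> memory share {share:.0%}")
```

Under that assumption, a speedup of only about 10% would mean that roughly four fifths of the SSE run time goes into memory access rather than arithmetic, while a full doubling would require memory access to cost nothing.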

Quote:

I don't know if there are any developers who have a P4 system and could help..maybe it would be worth asking around a bit.
I think that by the end of the year I will buy a Prescott Celeron. Then I will certainly return to the SSE2 version.

Quote:
Originally posted by dbzgundam

Why can't I access the page?

It's loading but I see a blank page.
Why are you young guys always so impatient? If you click on the above link, the server from www.AlpenNIC.com maps the domain www.RemoveGrain.de.tf to a path on an Austrian server, where RemoveGrain and all my other plugins are hosted. As this server is advertisement free (however AlpenNIC generates some popups) and probably publicly funded, it is not the fastest (thousands of Austrian home pages are hosted there). So you have to be patient. If it still doesn't work even with patience, I will give you a direct link.
Old 22nd July 2004, 17:21   #16  |  Link
Boulder
Quote:
Originally posted by kassandro
8-12% is quite disappointing. The following formula should hold
Code:
 SSE processing time - SSE memory access time = 2 * (SSE2 processing time - SSE2 memory access time)
Now in SSE2 mode, RemoveGrain tries to read and write 16 bytes in one stroke, while in SSE mode only 8 bytes are read and written in one stroke. In 90-100% of all cases RemoveGrain reads from the L1 cache. However, reading and writing is usually unaligned. In SSE3 (Prescott P4) there is even a special instruction for unaligned reading from memory, which may be helpful.
That's probably theory and real life colliding. The speedup might have (and probably would have) been bigger if RemoveGrain was the only thing eating CPU cycles. There are always other processes, and Avisynth and CCE probably eat up some of the actual gain.
Old 17th August 2004, 08:21   #17  |  Link
kassandro
I just put up version 0.5 to the web site.
There have been substantial internal changes. The code for handling color spaces other than YV12 has changed, but the speed still remains surprisingly poor compared with the very satisfactory speed for YV12. The assembly code for mode=2,3,4 has been optimised. Here is a comparison of the old and new version on my machine (1.3 GHz Celeron):
Code:
mode    old version    new version
2       149.5 fps      150.5 fps
3       141 fps        145.5 fps
4       142 fps        149.5 fps
I also made a comparison of trbarry's Undot and RemoveGrain(mode=1). It turned out that Undot and RemoveGrain(mode=1) differ for the pixels on the right border. By the very nature of the algorithm, border pixels cannot be processed and should therefore be left unchanged. While Undot does just this for the left, the top and the bottom border, it makes a mistake on the right border: instead of copying the last pixel on the line, it copies the penultimate pixel to the last pixel, which simply doesn't make sense and should be considered a bug.
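
The difference on the right border can be modelled on a single row. This is only a Python sketch with made-up function names, not code from either plugin; the interior filter is passed in as a parameter, and it is an assumption of the sketch that the copied penultimate pixel is the unprocessed input value.

```python
def filter_row_correct(row, process):
    """Process interior pixels; copy both border pixels unchanged."""
    out = list(row)
    for x in range(1, len(row) - 1):
        out[x] = process(row, x)
    return out


def filter_row_right_border_bug(row, process):
    """Same as above, but the last pixel is overwritten with the
    penultimate input pixel, mimicking the described right-border bug."""
    out = filter_row_correct(row, process)
    out[-1] = row[-2]
    return out


def identity(row, x):
    """Stand-in interior filter that changes nothing, to isolate the border."""
    return row[x]


row = [10, 20, 30, 40]
good = filter_row_correct(row, identity)
bad = filter_row_right_border_bug(row, identity)
```

With the interior filter switched off, the correct variant returns the row unchanged, while the buggy variant returns a row whose last pixel has been replaced by its left neighbour.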

Last edited by kassandro; 17th August 2004 at 08:27.
Old 17th August 2004, 09:42   #18  |  Link
Boulder
Thanks, RemoveGrain's one of my standard filters for processing analog TV caps

EDIT: Has the SSE2 version changed in any way or am I better off using the SSE optimised one? I use mainly mode=8.

Last edited by Boulder; 17th August 2004 at 09:47.
Old 17th August 2004, 15:24   #19  |  Link
Audionut
Registered User
 
Join Date: Nov 2003
Posts: 1,283
In VirtualDubMod, I get an error:

Unable to load "C:\RemoveGrain.dll"
Same goes for the SSE2 version.

LoadPlugin("C:\PROGRA~1\ARCALC~1\AVS_Plugins\Mpeg2dec3.dll")
LoadPlugin("C:\RemoveGrain.dll")
Mpeg2Source("z:\*.D2V")
RemoveGrain(mode=4,modeU=2)
Old 17th August 2004, 15:27   #20  |  Link
Boulder
Works fine for me.. did you try extracting the dll to the Avisynth plugins directory?