Old 13th August 2014, 20:10   #1  |  Link
Mystery Keeper
Beyond Kawaii
 
Join Date: Feb 2008
Location: Russia
Posts: 724
[VapourSynth] DctFilter

DCT Filter for VapourSynth r3
Source code

Only C routines are implemented for now, so it is relatively slow.

Usage:

dct.Filter(clip clip, float factors[8])

Performs a DCT on 8x8 blocks of the source clip, modifies the coefficients, then performs the IDCT.
The modification is as follows: dct(x, y) = dct(x, y) * factors[x] * factors[y]
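
To make the formula concrete, here is a minimal pure-Python sketch of what happens to a single 8x8 block. It is not the plugin's code: the helper names (dct_matrix, matmul, transpose, filter_block) are made up for illustration, and it assumes an orthonormal DCT-II with x as the horizontal and y as the vertical frequency index.

```python
import math

N = 8  # block size used by the filter


def dct_matrix(n=N):
    """Orthonormal DCT-II basis: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain nested-loop matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def filter_block(block, factors):
    """DCT -> scale coefficient (x, y) by factors[x] * factors[y] -> IDCT."""
    c = dct_matrix()
    ct = transpose(c)
    coeffs = matmul(matmul(c, block), ct)      # forward 2D DCT: C * B * C^T
    scaled = [[coeffs[y][x] * factors[x] * factors[y] for x in range(N)]
              for y in range(N)]
    return matmul(matmul(ct, scaled), c)       # inverse 2D DCT: C^T * X * C
```

With all eight factors set to 1.0 the block comes back unchanged (up to floating-point error). With factors = [1, 0, 0, 0, 0, 0, 0, 0] only the DC coefficient survives, so every sample becomes the block average.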

This filter does essentially the same thing as Tom Barry's original DctFilter for AviSynth, but does it differently.
All calculations are done on floating-point values, and the factors are applied as given, not rounded.
Thus, the accuracy is higher.

Padding to mod8 is automatic for every plane, but cropping to non-mod16 values before applying this filter is impractical and shouldn't be done.
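
As a rough illustration of the padding rule, the sketch below computes the padded size of each plane. The function names are hypothetical, and it assumes that "mod16" matters because of 4:2:0 chroma, where each chroma plane is half the luma size in both directions.

```python
def pad_to_mod8(size):
    """Smallest multiple of 8 that is >= size."""
    return (size + 7) // 8 * 8


def plane_padding(width, height, subsampling_w=1, subsampling_h=1):
    """Return {plane: (plane_w, plane_h, padded_w, padded_h)} for a YUV frame.

    subsampling_w/h are log2 chroma subsampling shifts (1, 1 means 4:2:0).
    """
    planes = {"Y": (width, height),
              "U": (width >> subsampling_w, height >> subsampling_h),
              "V": (width >> subsampling_w, height >> subsampling_h)}
    return {name: (w, h, pad_to_mod8(w), pad_to_mod8(h))
            for name, (w, h) in planes.items()}
```

For a 712x472 clip (mod8 but not mod16) the luma plane needs no padding, but the 356x236 chroma planes are padded to 360x240, so the blocks at the right and bottom chroma edges are partly made of padding. A 720x480 clip needs no padding on any plane.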
__________________
...desu!

Last edited by Mystery Keeper; 20th September 2016 at 21:35.
Old 22nd August 2014, 14:51   #2  |  Link
Mystery Keeper
Beyond Kawaii
 
Join Date: Feb 2008
Location: Russia
Posts: 724
R2 is out, with the direct approach replaced by a row-column algorithm. It should be 4 times faster, but it is still pure C.
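
The "4 times" figure matches a simple operation count, sketched below in pure Python. This is an illustration, not the plugin's C code: the function names are made up and an orthonormal DCT-II is assumed. The direct form sums 64 terms for each of the 64 outputs (4096 terms), while the row-column form runs sixteen 1D transforms of 64 terms each (1024 terms).

```python
import math

N = 8


def basis(k, i):
    """Sample i of the k-th orthonormal DCT-II basis vector."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / N)
    return scale * math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct_direct(block):
    """Direct 2D DCT: each of the 64 outputs sums over all 64 inputs."""
    out = [[0.0] * N for _ in range(N)]
    terms = 0
    for v in range(N):
        for u in range(N):
            acc = 0.0
            for y in range(N):
                for x in range(N):
                    acc += block[y][x] * basis(v, y) * basis(u, x)
                    terms += 1
            out[v][u] = acc
    return out, terms


def dct_1d(vec):
    """1D DCT of 8 samples: 8 outputs, 8 terms each."""
    return [sum(vec[i] * basis(k, i) for i in range(N)) for k in range(N)]


def dct_row_column(block):
    """Separable 2D DCT: 1D DCT on every row, then on every column."""
    rows = [dct_1d(row) for row in block]              # 8 rows, 64 terms each
    cols = [dct_1d(list(col)) for col in zip(*rows)]   # 8 columns, 64 terms each
    out = [list(r) for r in zip(*cols)]                # transpose back
    return out, 2 * N * N * N
```

Both functions return the same coefficients. Only the term count differs, so the real speedup also depends on memory access and on how the cosine table is stored.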
__________________
...desu!
Old 23rd August 2014, 12:30   #3  |  Link
Gser
Registered User
 
Join Date: Apr 2008
Posts: 418
Could this be used with deblock QED?
Old 23rd August 2014, 13:09   #4  |  Link
Mystery Keeper
Beyond Kawaii
 
Join Date: Feb 2008
Location: Russia
Posts: 724
Quote:
Originally Posted by Gser
Could this be used with deblock QED?
It already is. Get the latest version of HAvsFunc.
__________________
...desu!
Old 31st May 2015, 16:15   #5  |  Link
MonoS
Registered User
 
Join Date: Aug 2012
Posts: 203
After seeing Handmade Hero I wanted to port this function to AVX to understand how optimization works [I don't think I'll do SSE].

I changed all the variables from double to float so that a whole row of 8 values fits into a 256-bit register. I hope this won't change the behaviour of the function too much.

Right now I've almost finished the cdct function.

Edit: did some tests on the fillFactors function: it went down from 134 cycles fully unrolled to 16. Too bad it's only called once XD
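
One way to get a feel for how much the switch from double to float could matter is to run a DCT/IDCT round trip with every intermediate value rounded to single precision. The sketch below is pure Python and is not the attached AVX code: f32, transform and round_trip_error are hypothetical helpers, and an orthonormal row-column DCT-II is assumed.

```python
import math
import struct

N = 8


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def basis(k, i):
    """Sample i of the k-th orthonormal DCT-II basis vector."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / N)
    return scale * math.cos((2 * i + 1) * k * math.pi / (2 * N))


def transform(block, inverse, rnd):
    """Row-column 2D DCT or IDCT; rnd is applied to every intermediate value."""
    def one_d(vec):
        out = []
        for k in range(N):
            acc = 0.0
            for i in range(N):
                b = basis(i, k) if inverse else basis(k, i)
                acc = rnd(acc + rnd(vec[i] * rnd(b)))
            out.append(acc)
        return out
    rows = [one_d(row) for row in block]
    cols = [one_d(list(col)) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def round_trip_error(block, rnd):
    """Largest absolute difference between block and IDCT(DCT(block))."""
    restored = transform(transform(block, False, rnd), True, rnd)
    return max(abs(restored[y][x] - block[y][x])
               for y in range(N) for x in range(N))
```

Calling round_trip_error(block, f32) and round_trip_error(block, lambda v: v) on the same 8-bit block gives the single-precision and double-precision errors. For 8-bit input the single-precision error should stay far below half a pixel level, so the rounded output would normally be identical, but that is an expectation from this sketch and not a measurement of the real filter.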

Last edited by MonoS; 31st May 2015 at 18:37.
Old 31st May 2015, 21:06   #6  |  Link
MonoS
Registered User
 
Join Date: Aug 2012
Posts: 203
Ooook, I think I properly converted all the functions inside croutines.

I had no chance to test it because the DLL produced by my copy of Code::Blocks is not recognized by VapourSynth.

Notable changes:
- All intermediate computation is done in float instead of double: less work for the CPU and less work for me
- Added a transposed LUT, so that during the DCT all 8 values can be loaded with a single instruction
- Reworked all the functions [except for fillLUT] to use AVX instructions [for example, IACA said that fillFactors needed 134 cycles to execute fully unrolled and now it needs only 16; the row loop is now only 384 cycles fully unrolled, and I can't even imagine how many cycles it required before].
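
The transposed LUT idea can be sketched without intrinsics. In the pure-Python illustration below, a list of 8 floats stands in for one 256-bit register. The layout and the names (LUT, LUT_T, dct_1d_scalar, dct_1d_lanes) are assumptions made for illustration, not taken from the attached code.

```python
import math

N = 8

# LUT[k][i]: sample i of cosine basis k (the natural layout for scalar code).
LUT = [[math.sqrt((1.0 if k == 0 else 2.0) / N)
        * math.cos((2 * i + 1) * k * math.pi / (2 * N))
        for i in range(N)] for k in range(N)]

# LUT_T[i][k]: the same numbers transposed, so that row i holds the weights
# of input i for all eight outputs in one contiguous run of 8 floats.
LUT_T = [list(col) for col in zip(*LUT)]


def dct_1d_scalar(vec):
    """One output at a time: 8 multiply-adds, then a horizontal sum."""
    return [sum(vec[i] * LUT[k][i] for i in range(N)) for k in range(N)]


def dct_1d_lanes(vec):
    """All 8 outputs at once: acc[0..7] += vec[i] * LUT_T[i][0..7]."""
    acc = [0.0] * N
    for i in range(N):
        weights = LUT_T[i]  # stands in for one contiguous 8-float load
        acc = [a + vec[i] * w for a, w in zip(acc, weights)]
    return acc
```

Both functions produce the same coefficients. The second form needs no horizontal add across a register, which is usually the awkward part of a SIMD dot product.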

There are probably other places to optimize [clamping perhaps??], but I didn't dig too deep into the code.

I hope someone can test this and/or let me know if I made any mistakes. I repeat, it's my first time doing SIMD optimization.

EDIT: As I expected, there is room for other optimizations [found without using a profiler, though I'll use one, BTW]. The old algorithm, from before the row/column split, may be faster now with SIMD and the transposed LUT.
If I have some more spare time I'll try to implement it, and if I manage to compile and test it I'll run some tests.
Attached Files
File Type: zip dctfilter avx.zip (4.1 KB, 172 views)

Last edited by MonoS; 31st May 2015 at 23:34.
Old 20th September 2016, 21:39   #7  |  Link
Mystery Keeper
Beyond Kawaii
 
Join Date: Feb 2008
Location: Russia
Posts: 724
DCTFilter r3 is here, fixing a stupid bug that caused a memory leak.
__________________
...desu!