Old 16th February 2010, 10:53   #1  |  Link
JoshyD
Registered User
 
Join Date: Feb 2010
Posts: 84
SEt's Avisynth 2.5.8 MT compiled for *X86_64*, Latest Build 4/16/2010

Edit: Quicklinks Updated 4/26/2010

Featured Release 4/16/2010
+ Resize artifacting fixed
+ Horizontal resize code re-written to use SSE registers
+ Worth noting: often-used functions (TemporalSoften, Merge, etc.) have been tweaked for a decent speed gain
+ Bug found and fixed in the memory copy routine, again
+ Universal binary; no longer need to distinguish between AMD and Intel builds
+ Optimized BitBlt memory copy routine
+ Started implementation of SSE3/4-specific instructions, used when a supporting processor is detected
+ Removed most code paths intended to support CPUs without MMX/iSSE
+ Resize functions reworked to take advantage of the extra registers available when the processor is in 64-bit mode
Avisynth64 binary and installer built on 4/16/2010

Many of the plugins have been optimized and recompiled, please get the latest and greatest versions with this release.

Service update on 3/19/2010
+ Minor fixes to allow better usage of MT modes
+ Tweaks to code for small performance increases all around
+ Fixed resize bug
Use this build for Intel processors
Use this build for AMD processors

Version from 3/15/2010:
64 bit Avisynth 2.5.8 w/multithreading

------------------------------------------------
Plugins (Alphabetically, for now)
------------------------------------------------

New on 3/21/2010
AddGrainC x64

Built on 3/14/2010
AutoCrop x64

New on 3/21/2010
aWarpSharp x64
This version is based on SEt's original rewrite found in this thread

Built on 3/20/2010
Color Matrix x64

New on 3/13/2010
DFTTest x64-->needs the included libfftw3f-3.dll to be in your system32 directory

Built on 3/19/2010
DgDecode 1.5.8 x64
Note: Is missing some IDCT modes, will get them back ASAP

Built on 4/10/2010
EEDI2 x64
+ Vectorized main loop
+ Further restructured main loop to minimize branching, processor dependent speed increase

FieldHint x64

Built on 3/12/2010
FFT3DFilter-->needs the included libfftw3f-3.dll to be in your system32 directory

Built on 3/15/2010
FFT3DGPU x64
Note: The HLSL (shader program) file has been edited from the original to adhere to pixel shader 3.0 syntax rules. Please make sure to place the correct file in the same directory as the 64-bit plugin.

New on 3/13/2010
kemuri-_9's FFMS2 (The Fabulous FM Source 2)
Big thanks to kemuri-_9 for the build

Built on 3/29/2010
GradFun2DB x64

Built on 4/08/2010
hqdn3d x64

Built on 3/14/2010
LeakKernelDeint x64

Built on 3/12/2010
MaskTools 2.0a41
+ Now straight from the source, Manao

Built on 3/31/2010
MVTools2 x64
+ Continued conversion and updating of assembly functions
+ Removal of some code intended to support processors without MMX/iSSE
+ Converted often-used assembly functions to SSE2 instead of MMX/iSSE
+ Updated to latest shared function library from x264
+ Healthy 20%+ speed increase over x86 version in most cases

New on 3/14/2010
TDeint x64
This is basically the same as squid80's build, the main differences being a newer avisynth.h and a newer compiler.

TelecideHints x64

Built on 3/13/2010:
TIVTC x64


New on 3/20/2010
TNLMeans_x64 v1.0.3

New on 3/20/2010
TTempSmooth x64 v0.9.4

RemoveGrain x64

Repair x64

VerticalCleaner x64

Visit Squid80's homepage for more x64 plugins

Benchmarking Suggestions

Here is a 64bit avs2avi for benchmarking.
You can run it against the original

To simply run the script through Avisynth, execute the following at a command prompt:
Code:
avs2avi64.exe <path:\script.avs> -o n
A dialog box will pop up asking what to do for compression. Choose no recompression (or whatever similar option your OS gives you) and the script will run without saving any output.

The same should be done with avs2avi.exe.

This takes two factors out of the speed equation: 64-bit vs. 32-bit compressors and hard disk write speed. The final fps reports from the two runs then allow a fairly apples-to-apples comparison of the two builds.
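
To turn the two final fps reports into a single figure, a minimal sketch could look like the following. It is written in Python, which is not used anywhere in this thread; the function name and the sample numbers are assumptions for illustration only.

```python
def speedup_percent(fps_x86: float, fps_x64: float) -> float:
    """Return how much faster the 64-bit run was, in percent.

    Both arguments are the final fps figures reported by avs2avi.exe and
    avs2avi64.exe for the same script. The function name is hypothetical.
    """
    if fps_x86 <= 0 or fps_x64 <= 0:
        raise ValueError("fps values must be positive")
    return (fps_x64 / fps_x86 - 1.0) * 100.0


if __name__ == "__main__":
    # Placeholder figures, not measurements from any real run.
    print(f"{speedup_percent(20.0, 30.0):.1f}% faster")
```

A negative result simply means the 64-bit run was slower for that script.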

Source Code
For those who are interested: the source is now hosted over at Google Code. I'll keep it as up to date as I can, but the source is constantly in flux.

This wiki page has all of the plugins linked as well.

The source to any of the plugins I have personally modified is available upon request. Please message me if interested.

Last edited by JoshyD; 27th April 2010 at 03:05.
Old 16th February 2010, 15:30   #2  |  Link
Chikuzen
typo lover
 
Join Date: May 2009
Posts: 597
Hi, JoshyD

I tried to test it with my Q9450 and 64-bit Windows 7, but some DLLs seem to be missing.
Where can I get libiomp5md.dll?
Old 16th February 2010, 20:06   #3  |  Link
JoshyD
Registered User
 
Join Date: Feb 2010
Posts: 84
Hmm, it appears that even when you link all your libs statically, Intel's compiler still links the OpenMP libs dynamically. I'll get a rebuild up ASAP. From the ICC forums:

"The 11.0 Windows compilers (both C++ and Fortran) have decoupled /MT from having any effect on which OpenMP runtime (static or dynamic) gets linked. In fact, all of /MT[d], /MD[d], and /ML[d] (latter VS2003 only) now only effect which MS C runtime is linked.
We made this change because we want dynamic to be the default when linking the OpenMP RTL. The use of static OpenMP libraries is not recommended, because they might cause multiple libraries to be linked in an application. The condition is not supported and could lead to unpredictable results. It can also cause Thread Checker false positives and other problems with the Intel Threading tools.
If you want to link against the static OpenMP RTL, you must add /Qopenmp-link:static, which is a new switch for 11.0. So to produce a purely static executable, compile/link with /MT /Qopenmp-link:static
Patrick Kennedy
Intel Compiler Lab"

I didn't build the project with OpenMP libs enabled, but I did allow the compiler to auto-parallelize the loops it could. Perhaps that is the issue.

Last edited by JoshyD; 16th February 2010 at 20:23.
Old 17th February 2010, 00:48   #4  |  Link
JoshyD
Registered User
 
Join Date: Feb 2010
Posts: 84
There is support for SetMTMode and its functions, but not MT("command") as of now. The whole MT function is contained in an entirely separate DLL, loaded from your plugins dir automatically or manually at the beginning of the script.

The RAR does contain a copy of DirectShowSource for use with Avisynth. Any program that can open avs files AND is already a 64-bit executable should do the trick. VDub64 comes to mind, as well as Media Player Classic Home Cinema 64. 64-bit builds of x264 that specifically have the ability to open avs files should work as well.

I have personally tested VirtualDub 64; it has played back all of the little test cases I've had a chance to run without any complaint.

For some 64bit plugins that MAY work with this DLL please check out Squid80's prior work. He was the guy who ported Avisynth 2.5.5 to x64 a while back, without the nicety of having a compiler that supports inline asm.

Other than that, the top two plugins that I want to see work with 64-bit are MVTools2 and FFT3DFilter. I get a lot of mileage out of those two projects.

If there's enough interest generated, I'm considering going back and optimizing all the ASM routines to take advantage of architectural changes that have occurred over the past 5 or so years. Some of these assembly routines are long in the tooth, and could be better tuned for newer processors (I think).

Anyhow thanks for the post, and keep checking back for updates.
Old 17th February 2010, 00:54   #5  |  Link
Stephen R. Savage
Registered User
 
Join Date: Nov 2009
Posts: 322
What I meant was, could you modify your avisynth64 release to use a separate directory for autoloading 64-bit plugins to avoid name-conflicts and similar issues? On an aside, having mvtools2 + MaskTools2 would allow a lot of script functions to work automatically.

Last edited by Stephen R. Savage; 17th February 2010 at 00:56.
Old 17th February 2010, 01:33   #6  |  Link
JoshyD
Registered User
 
Join Date: Feb 2010
Posts: 84
Quote:
Originally Posted by Stephen R. Savage
What I meant was, could you modify your avisynth64 release to use a separate directory for autoloading 64-bit plugins to avoid name-conflicts and similar issues? On an aside, having mvtools2 + MaskTools2 would allow a lot of script functions to work automatically.
Sure, that's pretty easy. I can throw that one on the to-do list. I'm just a bit more focused on making sure others can run it at the moment.
Old 17th February 2010, 03:50   #7  |  Link
Stephen R. Savage
Registered User
 
Join Date: Nov 2009
Posts: 322
Update: I copied avisynth.dll and devil.dll to system32 on a Windows 2008 R2 setup, after which I imported avisynth.reg. I created a test script with the code "BlankClip().ConvertToYV12()". It does not load in either VirtualDub64 or x264. Both crash upon exit. VirtualDub64 gives an error message "AVI Import Filter error: (Unknown) 80040154".

Quote:
Faulting application name: Veedub64.exe, version: 1.9.8.0, time stamp: 0x4b343875
Faulting module name: avisynth.dll_unloaded, version: 0.0.0.0, time stamp: 0x4b7af14e
Exception code: 0xc0000005
Fault offset: 0x000007feee3b5408
Faulting process id: 0x62c
Faulting application start time: 0x01caaf7bdcc81b9e
Faulting application path: Veedub64.exe
Faulting module path: avisynth.dll
Edit: Processor is Core 2 T7250 "Merom".

Last edited by Stephen R. Savage; 17th February 2010 at 04:02.
Old 17th February 2010, 05:02   #8  |  Link
JoshyD
Registered User
 
Join Date: Feb 2010
Posts: 84
Quote:
Originally Posted by Stephen R. Savage
Update: I copied avisynth.dll and devil.dll to system32 on a Windows 2008 R2 setup, after which I imported avisynth.reg. I created a test script with the code "BlankClip().ConvertToYV12()". It does not load in either VirtualDub64 or x264. Both crash upon exit. VirtualDub64 gives an error message "AVI Import Filter error: (Unknown) 80040154".



Edit: Processor is Core 2 T7250 "Merom".
This works for me. Did you add the registry key that's included in the RAR? When VDub throws that error, it usually means it can't find the right filter to decompress your stream.

I'm not sure which snapshot of the binary I compiled you may have grabbed, but along the way I realized I had the compile flags wrong: I was only generating code for a Core i3/i5/i7. This caused my Penryn laptop to die when trying to load a clip. With the latest file up there, my Penryn executes with no problem.

Just to be double safe, try this compilation of the avisynth binary:

It only has the SSE2 code path enabled

If that doesn't work, we'll come up with a solution. I'm a bit hazy on how Windows associates registry keys with filter types. If anyone else has some pointers, that'd be great too.
Old 17th February 2010, 05:16   #9  |  Link
Stephen R. Savage
Registered User
 
Join Date: Nov 2009
Posts: 322
The copy you just linked to depends on libiomp5md.dll, so I can't run it. Incidentally, my Merom CPU supports flags up to SSSE3, but not SSE4.1 like the higher-end Penryns. Perhaps you could use runtime CPU detection instead of requiring a specific instruction set compatibility?
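
The idea of runtime detection can be sketched as follows. This is only a Python model of the selection logic, not how the DLL or ICC actually does it: the feature names, their ordering, and the function `select_code_path` are assumptions for illustration, and a real build would query the CPU itself (for example via CPUID) instead of taking a set of strings.

```python
# Code paths ordered from most to least demanding. The list is illustrative.
CODE_PATHS = ["sse4.1", "ssse3", "sse3", "sse2"]


def select_code_path(supported: set) -> str:
    """Pick the most advanced code path that the CPU reports support for."""
    for path in CODE_PATHS:
        if path in supported:
            return path
    raise RuntimeError("CPU supports none of the available code paths")


if __name__ == "__main__":
    # Feature sets as described in this post: Merom stops at SSSE3,
    # while the higher-end Penryns add SSE4.1.
    merom = {"sse2", "sse3", "ssse3"}
    penryn = merom | {"sse4.1"}
    print(select_code_path(merom))
    print(select_code_path(penryn))
```

With a selection step like this, one binary can serve both CPUs instead of crashing on the one that lacks the newest instructions.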

Edit: Success! The build in the topic post of this now works. I guess it must have been silently updated.

Edit2: DirectShowSource and Spline36Resize work. However, scripts in my Avisynth32 plugins directory don't seem to be autoimported.

Edit3: TDeint64 built by Squid80 works fine. RemoveGrain64 by Kassandro does not (unknown exception).

Edit4: EEDI2_64 by Squid80 does not work. I believe this is because it is also statically linked to OpenMP, which according to a Google result on Intel's webpage, can cause conflicts.

Edit5: ConvertToRGB32 does not work. It always errors with the message "Rec.709 and PC Levels support require MMX and horizontal width a multiple of 4" regardless of settings/input.

Last edited by Stephen R. Savage; 17th February 2010 at 05:49.
Old 17th February 2010, 06:10   #10  |  Link
JoshyD
Registered User
 
Join Date: Feb 2010
Posts: 84
EDIT:
You were right about EEDI2 using OpenMP and being dynamically linked to the libraries. I just recompiled the source with the OpenMP libraries statically linked:
EEDI2 64bit Multithreaded

Also, where did you run across RemoveGrain64? I'd like a copy (and the source if possible) so I can figure out what exception it's throwing.

The only thing I can say for sure is that the RGB conversion error is hard-coded in there for now: there was a decent amount of assembly involved in converting the routine, and I just plain didn't feel like it. I'll get back around to it; put that on my to-do list as well.

I'll start checking into those other two plugins, I'm working on MVTools at the moment . . . that would be a huge win.

The other issue with the code paths was that I was basically telling ICC to target just my host CPU. When I tried to get it going on another computer I realized my mistake. The current DLL for download has code paths for all EM64T-supporting Intel processors.

Once again, don't know if ICC cripples AMD chips or not . . .

Last edited by JoshyD; 17th February 2010 at 06:58.
Old 17th February 2010, 20:51   #11  |  Link
Chikuzen
typo lover
 
Join Date: May 2009
Posts: 597
Hi, JoshyD
Thanks for the static build version. It seems to work at present without trouble.

I did some benchmarks. The results are here.

I think that a faster 64-bit decoder is necessary for me...

Last edited by Chikuzen; 2nd March 2010 at 13:39.
Old 17th February 2010, 21:22   #12  |  Link
Stephen R. Savage
Registered User
 
Join Date: Nov 2009
Posts: 322
Kassandro posted his RemoveGrain64 on his own forum:

http://videoprocessing.11.forumer.co...2cd20f1c9425d0

Edit: Your rebuilt EEDI2() causes artifacts in chroma.
Edit2: Seems to be a pitch error. Doing a TurnLeft/TurnRight works around it.
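
For readers unfamiliar with the term: in Avisynth a plane's pitch is the distance in bytes between the starts of consecutive rows, and it can be larger than the visible row size because rows may be padded for alignment. A pitch error means some code stepped through a buffer by the wrong one of the two. The sketch below is a Python model of a pitch-aware copy, not EEDI2's actual code; `copy_plane` and its parameters are assumptions for illustration.

```python
def copy_plane(dst: bytearray, dst_pitch: int,
               src: bytes, src_pitch: int,
               row_size: int, height: int) -> None:
    """Copy `height` rows of `row_size` bytes between two buffers whose
    rows start `src_pitch` and `dst_pitch` bytes apart, respectively."""
    if row_size > src_pitch or row_size > dst_pitch:
        raise ValueError("row_size must not exceed either pitch")
    for y in range(height):
        s = y * src_pitch  # start of row y in the source
        d = y * dst_pitch  # start of row y in the destination
        dst[d:d + row_size] = src[s:s + row_size]


if __name__ == "__main__":
    # 2 rows of 3 visible bytes; the source pads each row to 4 bytes.
    src = bytes([1, 2, 3, 0, 4, 5, 6, 0])
    dst = bytearray(6)
    copy_plane(dst, 3, src, 4, 3, 2)
    print(list(dst))  # [1, 2, 3, 4, 5, 6]
```

Stepping the source by `row_size` instead of `src_pitch` would start the second row one byte early and pull the padding byte into the picture, which is the kind of skew that can show up as artifacts in a single plane.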

Last edited by Stephen R. Savage; 17th February 2010 at 21:35.
Old 18th February 2010, 21:35   #13  |  Link
JoshyD
Registered User
 
Join Date: Feb 2010
Posts: 84
Thanks for the benchmarks! I don't think we're going to see any speed increases with 64-bit code unless the assembly is reworked to take advantage of more registers / less memory access.

Any resizers you run through there are essentially just using their old 32-bit counterparts. The size of the pointers changes, but the register usage at the CPU level is still the same, because it was specified explicitly. I'm going to continue working to see if I can't eke out some extra performance. I mainly want to see if I can't get some of the more demanding plugins a speed boost.


I'm not sure what's wrong with EEDI2; the source was simply recompiled as-is.
Old 19th February 2010, 01:05   #14  |  Link
Stephen R. Savage
Registered User
 
Join Date: Nov 2009
Posts: 322
Perhaps there was always the bug in the source code. Nevertheless, if you have experience in Avisynth plugin development, perhaps you could squash it for us? Please?

EEDI2 would normally not be anywhere near the top of my priorities, but without nnedi2, it's pretty much a necessary step to get TempGaussMC working on avs64 (the other steps being removegrain64, mvtools2_64, and masktools2_64).
Old 19th February 2010, 12:06   #15  |  Link
JoshyD
Registered User
 
Join Date: Feb 2010
Posts: 84
Quote:
Originally Posted by Stephen R. Savage
Perhaps there was always the bug in the source code. Nevertheless, if you have experience in Avisynth plugin development, perhaps you could squash it for us? Please?

EEDI2 would normally not be anywhere near the top of my priorities, but without nnedi2, it's pretty much a necessary step to get TempGaussMC working on avs64 (the other steps being removegrain64, mvtools2_64, and masktools2_64).
Edit: EEDI2 bug squashed, I think
Try this out and let me know if it produces consistent results


I DO have a somewhat working MVTools2. Insofar as I have tested it, the "important" functions are working. A TON of the ASM has been re-coded to adhere to the function calling conventions of x64 C++. I did it by hand, meaning there's probably a decent chance you'll crash it.

I also had to update the parts borrowed from other projects (xvid, x264, fftw), so those are a little "rough" at the moment.

My test cases mainly focused around motion vector generation and the degraining functions. Perhaps someone else will be able to fault it in other places, allowing me a chance to find and fix the bugs.

Here's the link to MVTools2 x64

Personally, I see a significant performance increase (from ~20fps x86 to ~30fps x64, when using multi threading in both cases) when just writing out a raw stream. Try it out, and let me know where the problems are.

This is a little sample of what I've been using to mess around with the parameters. You can get it to go through a surprisingly large number of code paths just by varying the inputs to different degrain functions.

Code:
#MVTools x64
SetMTMode(2,4) # could be more; my system has four logical threads, but in certain instances more threads increase my encoding fps
LoadPlugin("D:\Development\mvtools2.dll")
AviSource("D:\testfile.avi")
ConvertToYV12(interlaced=true)


function MDegrain2i(clip source, int "overlap", int "dct", int "blksize", int "pel", int "search", int "searchparam")
{
	overlap = default(overlap,0) # overlap value (0 to 4 for blksize=8)
	dct = default(dct,0) # use dct=1 for clip with light flicker
	blksize = default(blksize, 8)
	pel = default(pel, 2)
	search = default(search, 4)
	searchparam = default(searchparam, 3)
	
	fields = source.SeparateFields() # separate by fields
	super = fields.MSuper(chroma=true, pel=pel)
	
	backward_vec2 = super.MAnalyse(blksize=blksize, isb = true, delta = 2, overlap=overlap, dct=dct, truemotion=true, temporal=true, pelsearch=pel, search=search, searchparam=searchparam)
	forward_vec2 = super.MAnalyse(blksize=blksize, isb = false, delta = 2, overlap=overlap, dct=dct, truemotion=true, temporal=true, pelsearch=pel, search=search, searchparam=searchparam)
	
	backward_vec4 = super.MAnalyse(blksize=blksize, isb = true, delta = 4, overlap=overlap, dct=dct, truemotion=true, temporal=true, pelsearch=pel, search=search, searchparam=searchparam)
	forward_vec4 = super.MAnalyse(blksize=blksize, isb = false, delta = 4, overlap=overlap, dct=dct, truemotion=true, temporal=true, pelsearch=pel, search=search, searchparam=searchparam)
	
	fields.MDegrain2(super, backward_vec2,forward_vec2,backward_vec4,forward_vec4, thSAD=500, thSCD1=500, thSCD2=130, plane=4)
	Weave()
}

return MDegrain2i(last, overlap=4, blksize=8, pel=2)
Enjoy, and let me know any problems

Last edited by JoshyD; 20th February 2010 at 01:19.
Old 20th February 2010, 15:49   #16  |  Link
aegisofrime
Registered User
 
Join Date: Apr 2009
Posts: 452
I do hope we can see TempGaussMC ported to 64 bit. That's one plugin that's slow but used a lot in the community.

Well, with MVTools and RemoveGrain done that's half the work over?
Old 20th February 2010, 17:08   #17  |  Link
Stephen R. Savage
Registered User
 
Join Date: Nov 2009
Posts: 322
RemoveGrain is not very useful without its brother Repair, but the version you linked works. Also, what version did you compile? The EEDI2 build works and is consistent, though no longer multithreaded. Will all filters with internal multithreading be incompatible with this version of Avisynth?

Edit: Strange request, but could you build TelecideHints and FieldHint as well? They're pretty useful for anyone who uses Yatta, even if Yatta itself isn't 64-bit, and should be completely free of assembly code.

Edit2: Hmm, perhaps I should make a checklist of things that'd be cool in Avs64.

Edit3: VSFilter64 available here.

Last edited by Stephen R. Savage; 20th February 2010 at 18:18.
Old 21st February 2010, 19:17   #18  |  Link
Fizick
AviSynth plugger
 
Join Date: Nov 2003
Location: Russia
Posts: 2,183
JoshyD, you do big work!
I am interested to see your TON of the ASM mvtools_64 source code lines
__________________
My Avisynth plugins are now at http://avisynth.org.ru and mirror at http://avisynth.nl/users/fizick
I usually do not provide a technical support in private messages.
Old 22nd February 2010, 01:23   #19  |  Link
tedkunich
Potentate
 
Join Date: Mar 2003
Posts: 219
Quote:
Originally Posted by Fizick
JoshyD, you do big work!
I am interested to see your TON of the ASM mvtools_64 source code lines

Fizick, you do know that he is Jeremy Duncan in another persona, right?

From the user control panel, Jeremy's last activity was on Feb 5. JoshyD joins on Feb 5... Hmmmmm
Old 22nd February 2010, 02:35   #20  |  Link
Stephen R. Savage
Registered User
 
Join Date: Nov 2009
Posts: 322
I'm fairly certain that Jeremy Duncan would not be able to write any significant amount of code, much less port a large software project.