Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
22nd February 2010, 10:20 | #22 | Link | |
Registered User
Join Date: Feb 2010
Posts: 84
|
Quote:
Besides, Jeremy Duncan could barely compile the original 32bit source . . . |
|
22nd February 2010, 14:06 | #23 | Link |
Compiling Encoder
Join Date: Jan 2007
Posts: 1,348
|
My question throughout this thread has been:
why are you patching a patched version of Avisynth rather than the official version, such as the avs 2.6 CVS branch, which has already had the multithreading features committed to it? |
22nd February 2010, 20:06 | #25 | Link |
AviSynth plugger
Join Date: Nov 2003
Location: Russia
Posts: 2,183
|
Strange personality discussion.... I am simply waiting for the source code and project files of MVTools2_x64 (GPL).
Is the 32-bit and 64-bit asm code generalized?
__________________
My Avisynth plugins are now at http://avisynth.org.ru and mirror at http://avisynth.nl/users/fizick I usually do not provide a technical support in private messages. |
22nd February 2010, 20:20 | #26 | Link | |
Registered User
Join Date: Feb 2010
Posts: 84
|
Quote:
It was my understanding that the 2.6 branch was still in alpha status. There's a fair amount of new functionality for new colorspaces that I didn't want to go messing with if it wasn't done. The 2.6 branch also makes more use of Softwire to dynamically generate assembly code. In the end, I honestly didn't want to deal with working around Softwire more than was necessary. Softwire is a bit long in the tooth to begin with, and the original developer took that project and turned it into SwiftShader, which has a somewhat different focus and is closed source.
I took what seemed "stable" at the time, and just started working from there. There's no reason I couldn't merge these changes into the 2.6 alpha, I just wanted to be sure I was starting from a stable base.
That being said, anyone care to clue me in on the status of 2.6 as of now? I don't want to step on anyone's toes, I just wanted to poke around the source in my spare time. If there's enough new/stable code in there, I could definitely focus some time/effort on getting that code to target x64. It seems that the work being done going forward is focused on other areas more important than an x64 code base.
This whole thing is a bit of an experiment / sideshow. I thought it was interesting enough to share, hence its posting here. I know the eventual goal is to get Avisynth working cross-platform, as well. A targeted win64 release would also be useless in the long run, if that's where the project is going.
Side note/tangent: The inference about cross-platform support for Avisynth can really be seen by looking at the MaskTools code base. I'm pretty sure the documentation for it explicitly states that it was written with the intent of eventually being cross-compiled. The assembly functions for it are ALL contained in ASM files, and there are a lot of them. The layout is really quite elegant, I like it quite a bit. |
|
22nd February 2010, 21:09 | #27 | Link | |
Registered User
Join Date: Feb 2010
Posts: 84
|
Quote:
Unfortunately, they're not generalized as a whole. I did the inline assembler generally using just #ifdefs, and then grabbed the latest x264 asm functions that the project uses, which are generalized. The actual asm for functions contained in files like bilinear.asm just got overwritten. It's trivial to swap the file for the original and recompile. It's just not exactly elegant. The functions should have been generalized, but I was working fast, without putting a ton of thought into the process. I've been learning as I go, and it seems I always find an example of a sleeker solution after working through the first ugly one that popped into my head.
The main differences are in function calling and register usage. You can get by with a lot less pushing and popping in 64-bit land. Stack allocation between function calls changes as well. As a rule, all arguments are aligned at 8-byte boundaries. Arguments that are passed to the functions via registers still get shadow space on the stack, so your 5th integer argument will be at [rsp+40], if you didn't push any registers on the stack in the first place.
I wanted to ask some specific questions about the filtering and code copying functions. Was there a reason that they're often limited to the MMX registers? Things like the horizontal Wiener filter are bothering me, because depending on your byte window, you're going to get different results from it, or it would seem that way. Right now, it has a 4-byte "window" to filter around. My understanding of a Wiener filter is that it's adaptive, so changing its discrete window would change the filter's output altogether. It's possible to look at 8-byte chunks using the XMM registers (unpacking to words for arithmetic (128 bits total), then repacking), but I'm unsure of the effect on overall image quality. Thoughts?
Finally, a lot of the MMX functions don't take advantage of the fact that we have a ton of XMM registers floating around that can also be used for MMX-style arithmetic. XMM0-XMM5 are all volatile across calls, which could prevent some MMX registers from being shuffled around, etc.
When writing assembler, I'm not sure how the CPU's register files are architected to interact with each other; for instance, whether there's a penalty associated with transferring a qword from an XMM register to an MMX register, and vice versa. I'm actually a VLSI designer (very large scale integration) by education, so thinking on the machine level is interesting and thought-provoking. I don't know enough about the design of recent x86 cores to generalize the performance impact of various code paths. Is there any way other than running a battery of tests to analyze the clock cycles it takes for an instruction to retire?
I'm going to search around for the answers, but I thought I'd ask anyhow. Sometimes that's the fastest and most concise way to find the info you're after.
Last edited by JoshyD; 22nd February 2010 at 21:45. |
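The [rsp+40] figure in the post above follows from simple slot arithmetic under the Win64 calling convention: the return address sits at [rsp], the four register arguments get 8-byte home (shadow) slots above it, and stack arguments follow. A minimal sketch of that arithmetic, where the function name and the `pushed_regs` parameter are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>

// Size of one stack slot on x64: every argument occupies 8 bytes.
const std::size_t kSlotBytes = 8;

// Offset from rsp of the n-th (1-based) integer argument, as seen inside
// the callee. Slot 0 holds the return address, slots 1-4 are the shadow
// space for the register arguments, and slot 5 onward holds stack arguments.
// `pushed_regs` is the number of 8-byte pushes the callee has done so far
// (hypothetical parameter, only here to show how pushes shift the offset).
std::size_t win64_arg_offset(std::size_t n, std::size_t pushed_regs = 0) {
    assert(n >= 1);
    return kSlotBytes * (n + pushed_regs);
}
```

With no pushes, `win64_arg_offset(5)` gives 40, matching the post. Each register pushed in the prologue moves every argument 8 bytes further from rsp, which is why hand-written asm has to track its own pushes.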
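The "unpack to words, do the arithmetic, repack" idea from the post can be sketched with SSE2 intrinsics instead of raw asm. This only shows the mechanics on 8 pixels at a time; the 3:1 blend is an arbitrary stand-in chosen for the example, not the MVTools Wiener kernel, and both function names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics (x86/x64 only)

// Scalar reference: dst = (3*a + b + 2) >> 2 for 8 pixels.
void blend8_scalar(const std::uint8_t* a, const std::uint8_t* b,
                   std::uint8_t* dst) {
    for (int i = 0; i < 8; ++i)
        dst[i] = static_cast<std::uint8_t>((3 * a[i] + b[i] + 2) >> 2);
}

// Same computation on 8 pixels at once in XMM registers.
void blend8_sse2(const std::uint8_t* a, const std::uint8_t* b,
                 std::uint8_t* dst) {
    assert(a && b && dst);
    const __m128i zero = _mm_setzero_si128();
    // Load 8 bytes into the low half of an XMM register.
    __m128i va = _mm_loadl_epi64(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadl_epi64(reinterpret_cast<const __m128i*>(b));
    // Unpack bytes to 16-bit words: 8 words fill all 128 bits.
    va = _mm_unpacklo_epi8(va, zero);
    vb = _mm_unpacklo_epi8(vb, zero);
    // 3*a + b + 2 cannot overflow 16 bits (maximum is 1022).
    __m128i sum = _mm_add_epi16(_mm_add_epi16(va, va), _mm_add_epi16(va, vb));
    sum = _mm_add_epi16(sum, _mm_set1_epi16(2));
    sum = _mm_srli_epi16(sum, 2);
    // Repack words to bytes with unsigned saturation and store 8 bytes.
    _mm_storel_epi64(reinterpret_cast<__m128i*>(dst),
                     _mm_packus_epi16(sum, zero));
}
```

In this sketch each output pixel depends only on the input pixels at the same index, so processing 8 at a time gives the same bytes as the scalar loop. Whether that also holds for a given filter depends on how its window is defined, which is the open question in the post.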
|
22nd February 2010, 21:40 | #28 | Link | |
Registered User
Join Date: Feb 2010
Posts: 84
|
Quote:
Here is the repair function to try out.
I removed threading from EEDI2 because Avisynth can handle threads internally for its filters. When all these little binaries link to each other, there should be some arbiter of thread creation. At least that's what seems logical at the time of writing. If you tell Avisynth to work with 4 threads, and it then takes one of those threads and instantiates a filter that goes off and spawns 4 more threads, it would seem that the situation would get messy after a bit. There would possibly be a lot of context switching going on, which would degrade performance overall.
Also, any filter compiled to statically link to the OpenMP libraries cannot be instantiated alongside any other filter also statically linked to those libs, because this will cause openmp.lib to be initialized more than once. Apparently, this can degrade performance, at least according to the error that occurs upon crash. The plugins can be compiled to ignore this, but in the short term, I was trying to save people some troubleshooting headaches.
As for the other two plugins you requested, I'll see what I can do. If Avisynth 2.6 can withstand a 64-bit port somewhat seamlessly, I'm going to focus my attention there first.
Last edited by JoshyD; 23rd February 2010 at 03:24. |
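A back-of-the-envelope sketch of the nested-threading concern above. It assumes OpenMP-style teams in which the calling thread counts as one team member, and it assumes every host thread is inside such a filter at the same moment; the function names and the core count used in the note below are made up for illustration:

```cpp
#include <cassert>

// Runnable threads when each of `host_threads` Avisynth threads enters a
// filter that runs its own team of `team_size` threads (caller included).
unsigned runnable_threads(unsigned host_threads, unsigned team_size) {
    assert(host_threads > 0 && team_size > 0);
    return host_threads * team_size;
}

// Runnable threads per core. Anything above 1.0 means the OS has to
// time-slice, which is where the extra context switching comes from.
double oversubscription(unsigned host_threads, unsigned team_size,
                        unsigned cores) {
    assert(cores > 0);
    return static_cast<double>(runnable_threads(host_threads, team_size)) /
           cores;
}
```

With 4 Avisynth threads and a filter team of 4 on a hypothetical 4-core machine, that is 16 runnable threads, or 4 per core. With threading removed from the filter (team size 1) it drops back to 1 per core.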
|
22nd February 2010, 21:56 | #29 | Link |
Registered User
Join Date: Nov 2009
Posts: 327
|
I get the error "LoadPlugin: unable to load 'Repair.dll'". Thank you for clarifying on the OpenMP issue. Again, I would like to ask what version of RemoveGrain/Repair your versions are based off. I ask this because there are two versions in "common" use.
Edit: ConvertToYUY2(interlaced=true) crashes your Avisynth64 build with SetMTMode(2) and causes an exception without SetMTMode.
Edit2: http://img43.imageshack.us/img43/50/avisynth64.png I'm liking the direction this project is going in. With masktools2 built, I can upgrade from "crude mvbob" (no error checking) to proper mvbob. Hopefully we can get some other volunteers to build plugins, so JoshyD doesn't have to do all the work.
Last edited by Stephen R. Savage; 22nd February 2010 at 22:57. |
23rd February 2010, 03:27 | #30 | Link | |
Registered User
Join Date: Feb 2010
Posts: 84
|
Quote:
Try this one instead. If that's a no-go, maybe a sample clip is in order?
You are quite correct about the Repair I uploaded being completely incorrect. I'm not quite sure where that dll even came from; the final build is ~3x its size. As for the version, it says "1.0" at the top of the main source file. I started with the original attempted RemoveGrain64. There were some glaring oversights in it (not sign-extending 32-bit integers when using them to compute 64-bit addresses) that needed to be corrected, but otherwise, everything looks quite similar to the version available from the author's website. That being said, I think the post that RemoveGrain64 came from was made ~a week after the last versions of RemoveGrain and Repair were released.
I also built the two plugins you asked about; they're linked in the first post.
Last edited by JoshyD; 23rd February 2010 at 04:30. |
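The sign-extension oversight mentioned above is easy to reproduce on paper. A 32-bit offset such as a negative pitch must be widened with its sign (movsxd in asm) before it is added to a 64-bit pointer; if the upper 32 bits are simply zero, the address lands almost 4 GB too high. A small model of the two cases, using plain integers in place of real pointers (the function names and the example base address are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// Buggy variant: the 32-bit offset is zero-extended, so a negative
// offset turns into a large positive one.
std::uint64_t address_zero_extended(std::uint64_t base, std::int32_t offset) {
    return base + static_cast<std::uint64_t>(static_cast<std::uint32_t>(offset));
}

// Correct variant: the offset is sign-extended to 64 bits first, so the
// modular add steps backwards for negative offsets.
std::uint64_t address_sign_extended(std::uint64_t base, std::int32_t offset) {
    // Precondition for this model: stepping back must not wrap below zero.
    assert(offset >= 0 ||
           base >= static_cast<std::uint64_t>(-static_cast<std::int64_t>(offset)));
    return base + static_cast<std::uint64_t>(static_cast<std::int64_t>(offset));
}
```

For a base of 0x140001000 and an offset of -640, the sign-extended result is 0x140000D80 while the zero-extended one is 0x240000D80. Non-negative offsets give the same answer either way, which is why this kind of bug can go unnoticed until a filter walks a frame backwards.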
|
23rd February 2010, 05:43 | #31 | Link | |
Potentate
Join Date: Mar 2003
Posts: 219
|
Quote:
My apologies, sir... The subject matter and timing were just too coincidental... |
|
23rd February 2010, 05:43 | #32 | Link |
Registered User
Join Date: Nov 2009
Posts: 327
|
You forgot to statically link OpenMP again. I was converting to YUY2 from interlaced YV12 (generated by SeparateFields().SelectEvery(4,0,3).Weave()) on a 640x480 format frame.
Edit: I googled for a copy of libiomp5md.dll and copied it to system32. I hope that's legal. Anyway, YUY2 conversion no longer crashes with the updated version. The new Repair version also works. Thanks for clarifying the version number involved. TelecideHints and FieldHint work as expected.
Edit2: For the record, Decomb64 by Squid80 works as expected (even with MT). TIVTC would be nice though.
Edit3: My remaining (major) wishlist, in descending order of priority:
mt_masktools
TIVTC
nnedi2 (yeah right)
GradFun2DB
dfttest
I suspect most of these will be a pain to port, possibly with the exception of GradFun2DB and mt_masktools. Getting a full stable avisynth.dll is of course top priority though.
Edit4: I believe the RemoveGrain/Repair version is the "pre-release" then. Confusingly, there is a "1.0 pre-release" and a "pre-release".
Last edited by Stephen R. Savage; 23rd February 2010 at 06:13. |
23rd February 2010, 06:51 | #33 | Link |
Registered User
Join Date: Feb 2010
Posts: 84
|
OpenMP is open, so copying it to your system should be A-OK. It's actually better than statically linking to the library, as it allows multiple plugins that were compiled with parallel directives to be instantiated concurrently.
Is there a major difference between the RemoveGrain and Repair versions? Which version of mt_masktools are you looking for? I think 1.5.8 was the last build before the 2.0 alphas started rolling out. I can port 1.5.8 very quickly; 2.0 will take a bit longer. |
23rd February 2010, 07:33 | #34 | Link |
Registered User
Join Date: Nov 2009
Posts: 327
|
Latest v2.0 alpha (2.0a36). 1.5 stable isn't really used anymore. I think there was some rearrangement of the modes to Repair between the two versions, but I couldn't really tell you the difference to be honest. I think Didee mentioned something about it though.
Edit: You haven't answered this yet. How does the current build handle script/plugin autoloading? Is it simply disabled? Last edited by Stephen R. Savage; 23rd February 2010 at 18:08. |
26th February 2010, 03:45 | #36 | Link | |
Registered User
Join Date: Feb 2010
Posts: 84
|
Quote:
Last edited by JoshyD; 27th February 2010 at 03:26. |
|
26th February 2010, 04:06 | #37 | Link |
Registered User
Join Date: Nov 2009
Posts: 327
|
mt_edge crashes when using the built-in kernels. A custom kernel seems to work fine.
TempGaussMC(edimode="EEDI2") also raises an exception, but I can't seem to trace it to a specific line. Last edited by Stephen R. Savage; 26th February 2010 at 04:27. |
26th February 2010, 04:44 | #39 | Link |
Registered User
Join Date: Nov 2009
Posts: 327
|
Code:
DirectShowSource("file.avi")
mt_edge(mode="sobel") # same crash with mode="roberts" or mode="laplace"

I have traced the TGMC exception down to a line unrelated to masktools. It occurs due to TemporalSoften at the line "temporalsoften(2,255,255,28,2)".
The following mt_masktools-based scripts seem to work:
AntiAliasing (custom version)
DeHalo_alpha
FastLineDarken
LimitedSharpenFaster
SRestore (except double blend modes, depends on Average.dll)
This means that so far the following operations are already enabled in Avisynth64:
Motion-adaptive deinterlacing (TDeint, EEDI2)
Crude Inverse Telecine (TDeint + Decimate)
Deblending (mt_masktools)
Sharpening (mt_masktools)
Denoising (MVTools2)
Last edited by Stephen R. Savage; 26th February 2010 at 05:51. |
|
|