Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
16th May 2003, 12:28 | #41 | Link |
Moderator
Join Date: Oct 2001
Location: England
Posts: 3,285
|
I'd already tried the same thing; the difference is minimal, but it helps. I think if we can cut down the number of CopyAll calls, that would make quite a difference, maybe by creating spare buffers and changing pointers rather than doing whole memcpys? (I can't look at it until next week.)
Also, reading in a large file is always slow, and reading it in 2048-byte chunks can't be too quick... If I improve this, do you think it would make much of a difference? Cheers, -Nic |
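For anyone following along, the "spare buffers and changing pointers" idea could look roughly like the sketch below. FrameBuffer and FramePair are made-up names for illustration, not types from the MPEG2DEC3 source, and the real decoder keeps more than two reference frames, so treat this as the general shape only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for one decoder frame buffer (not an MPEG2DEC3 type).
struct FrameBuffer {
    std::vector<std::uint8_t> data;
    explicit FrameBuffer(std::size_t size) : data(size) {}
};

// Owns two buffers and rotates them by swapping pointers instead of copying.
class FramePair {
public:
    explicit FramePair(std::size_t size)
        : a_(size), b_(size), current_(&a_), spare_(&b_) {
        assert(size > 0);
    }

    FrameBuffer* current() { return current_; }
    FrameBuffer* spare() { return spare_; }

    // After decoding into spare(), promote it to current: O(1), no memcpy.
    void flip() { std::swap(current_, spare_); }

private:
    FrameBuffer a_, b_;
    FrameBuffer* current_;
    FrameBuffer* spare_;
};
```

The decoder would write the next picture into spare() and then call flip(), so the cost is one pointer swap per frame and the memory use doubles for that buffer.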
16th May 2003, 14:50 | #42 | Link |
Registered User
Join Date: Oct 2001
Location: Gainesville FL USA
Posts: 2,092
|
Haven't tried it yet, but I like the idea of using the optimized Avisynth copy. The inline macro currently used for CopyAll looks like it will often do unaligned copies, because it handles the spare change (the leftover bytes) before the MMX copies. So if the size (not the pitch) is not a multiple of 8, it will be loading from and storing to unaligned addresses even if the buffer is aligned.
Does anyone know yet who gets and frees the buffers that are passed? I wonder if that has any relationship to the vdub debug error messages. If Avisynth can properly manage buffers on close/reopen, then it seems Avisynth storage management should maybe be used for them. Is it? - Tom |
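To make the alignment point concrete, here is a rough sketch of a copy that keeps the 8-byte block loop on aligned destination addresses and leaves the spare change for the end. It is not the CopyAll macro; copy_aligned_dst is a hypothetical name, and plain memcpy of 8 bytes stands in for the MMX movq.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies n bytes so that the 8-byte block loop always writes to an
// 8-byte-aligned destination. Returns the number of bytes moved by the
// block loop (the part an MMX movq loop would handle).
std::size_t copy_aligned_dst(std::uint8_t* dst, const std::uint8_t* src, std::size_t n) {
    assert(dst != nullptr && src != nullptr);

    // Head: bytes needed to bring dst up to the next 8-byte boundary.
    std::size_t head = (8 - (reinterpret_cast<std::uintptr_t>(dst) & 7)) & 7;
    if (head > n) head = n;
    std::memcpy(dst, src, head);
    dst += head;
    src += head;
    n -= head;

    // Body: whole 8-byte blocks, destination now aligned.
    std::size_t blocks = n / 8;
    for (std::size_t i = 0; i < blocks; ++i) {
        std::memcpy(dst + 8 * i, src + 8 * i, 8);
    }

    // Tail: the spare change goes last, so it cannot shift the block loop.
    std::size_t done = blocks * 8;
    std::memcpy(dst + done, src + done, n - done);
    return done;
}
```

With an already aligned buffer the head is zero, so a size that is not a multiple of 8 only affects the final few bytes instead of misaligning every block.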
19th May 2003, 16:48 | #43 | Link |
Registered User
Join Date: Nov 2001
Posts: 9,770
|
Did you stop the development? |
19th May 2003, 18:20 | #44 | Link |
Registered User
Join Date: Oct 2001
Location: Gainesville FL USA
Posts: 2,092
|
Did anyone try my 1.0.5 version?
@Nic - Did you try compiling it with your Intel Compiler and apparently better compile options? My 1.0.5 was based upon your 1.0.4 source so nothing should be lost along the way. - Tom |
19th May 2003, 22:24 | #47 | Link |
Moderator
Join Date: Oct 2001
Location: England
Posts: 3,285
|
Went to see my gf, sorry for the lack of posts and work on it.
(Got a new gfx card too, so I've been playing games.) (I should be getting ICL 7.1, but I won't be compiling with it and releasing until I can be sure ICL won't harm the quality, which I'm pretty sure it won't.)
@trbarry: Sorry for being blind, I couldn't spot your 1.05 version in this thread. (Got a migraine right now, I'll look properly tomorrow.)
I'm going to try to rewrite parts of the decode function to stop all the memory copying; there must be a way of just using more memory and then shifting pointers about. I'll also try improving the file reading. I've never tried memory-mapping a file... I'll see if it helps any.
Another thing: try setting the libraries to single-threaded instead of multithreaded. It improved things quite a bit on my machine. It could just be a fluke, though, and there should be no need for the multithreaded libraries...(?)
@alx: I very much doubt the coloured squares are caused by our decoder, or that it's slower. (On mine it was faster and produced exactly the same output.) I'll look into it nonetheless. Cheers, -Nic |
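On the file-reading side, a quick way to see what the chunk size costs is to count the read calls for a given buffer size. The sketch below is not memory mapping and not the decoder's reader; count_reads is a made-up helper, and std::ifstream does its own buffering underneath, so the numbers only show the call count at this level, not the actual OS reads.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Reads a whole file through a buffer of chunk_size bytes and returns how many
// read calls returned data. The byte total is written to *total_bytes.
std::size_t count_reads(const std::string& path, std::size_t chunk_size,
                        std::size_t* total_bytes) {
    assert(chunk_size > 0 && total_bytes != nullptr);
    std::ifstream in(path, std::ios::binary);
    std::vector<char> buf(chunk_size);
    std::size_t reads = 0;
    *total_bytes = 0;
    while (in) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        std::streamsize got = in.gcount();  // may be short on the last chunk
        if (got <= 0) break;
        *total_bytes += static_cast<std::size_t>(got);
        ++reads;
    }
    return reads;
}
```

Calling it once with 2048 and once with something like 64 * 1024 on the same VOB gives a feel for how many calls a bigger buffer would save; whether that shows up in the encode time is something only profiling will tell.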
19th May 2003, 23:40 | #48 | Link | |
Registered User
Join Date: Oct 2001
Location: Gainesville FL USA
Posts: 2,092
|
Quote:
Sympathize with you on your migraine. See my post from 5/15 above. - Tom |
|
20th May 2003, 04:03 | #49 | Link |
Registered User
Join Date: Oct 2001
Posts: 101
|
Nic, your 1.04 build takes 6:10 min, trbarry's 1.05 takes 5:30, and MarcFD's build 1.00 takes 5:05... same script, same machine, same XviD build... so yours is SLOW, period! Haha, no offense please, just joking.
Alx |
20th May 2003, 09:52 | #50 | Link |
Moderator
Join Date: Oct 2001
Location: England
Posts: 3,285
|
@trbarry:
Your 1.05 causes blocks to appear on some video. I can't spot the error yet. (I must admit I don't quite understand the changes yet, but I'll read through.) (If you can't reproduce the blocks, I'll send you a bit of an SVCD music video that shows the problem.)
(edit: Oh, I take that back, I do understand the changes... haven't spotted the bug yet, though.)
(edit2: The bug's in the Add_Block code, but I haven't found it yet; the intra/non-intra code is fine.)
I tried improving the speed yesterday but didn't get very far. If you change the ::Decode function to just GetHdr(); DecodePicture(1, dst); you should have the decoder decoding in its quickest (progressive-only) state. But it still wasn't much faster. I'll look again tonight. -Nic
ps Edit3: Forgot Marc broke the crop support (using crop inside DVD2AVI when creating the d2v file crashes mpeg2dec... nice). I'll try to fix that, or at least make it ignore the crop params from the d2v file. I've added back aquaplaning's code for using the DLL without Avisynth, which makes sh0dan's changes difficult (I have to check that AVSEnv exists for the copies... BTW: does the *env change? Or can we just do AVSEnv = env in the constructor instead of in GetFrame?)
edit4: grr... env->SubFrame only gives access to the non-planar version of SubFrame.
Last edited by Nic; 20th May 2003 at 15:13. |
20th May 2003, 18:07 | #51 | Link | ||
Registered User
Join Date: Oct 2001
Location: Gainesville FL USA
Posts: 2,092
|
Quote:
I probably did something stupid there but I should be able to find it with a compare. Or I guess you can just not use the new Add_Block portion for now. Quote:
IIRC, the crop was also broken when I first started working on MPEG2DEC, but I fixed it. The problem originally was that it was implemented inside one of the color conversion functions of Store.cpp and wasn't adjusting for the 2:1 size difference for chroma planes. I don't know where it should be implemented now, for YV12, since those conversion functions are hopefully not even being used. Probably just as an adjustment to the CopyAll parms. Is it really true that this is slower than Marc FD's last version for some reason? - Tom |
||
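The 2:1 chroma adjustment mentioned above can be sketched as follows for YV12, where the U and V planes are half the width and half the height of the luma plane. CropRect and the two helper names are hypothetical, not taken from Store.cpp, and the sketch only covers the progressive case with even crop values.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical crop request, expressed in luma pixels.
struct CropRect {
    int left, top, width, height;
};

// Byte offset of the crop's top-left corner within the luma (Y) plane.
std::size_t luma_offset(const CropRect& c, int pitch_y) {
    assert(c.left >= 0 && c.top >= 0 && pitch_y > 0);
    return static_cast<std::size_t>(c.top) * pitch_y + c.left;
}

// Same corner within a YV12 chroma (U or V) plane, subsampled 2:1 both ways.
std::size_t chroma_offset(const CropRect& c, int pitch_uv) {
    // Even crop values are required so they land exactly on a chroma sample.
    assert(c.left >= 0 && c.top >= 0 && pitch_uv > 0);
    assert(c.left % 2 == 0 && c.top % 2 == 0);
    return static_cast<std::size_t>(c.top / 2) * pitch_uv + c.left / 2;
}
```

These offsets, plus width and height halved for the chroma planes, are the kind of adjustment that could be fed into the copy parameters instead of cropping inside a colour conversion.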
20th May 2003, 18:19 | #52 | Link |
Retired AviSynth Dev ;)
Join Date: Nov 2001
Location: Dark Side of the Moon
Posts: 3,480
|
Wow, that's a big percentage-wise change alx gets.
Nic's version might be because of ICL; trbarry's version could be because it has inline assembler within a rather CPU-intensive section. MSVC has a tendency to disable optimizations in a C block if it contains inline assembler. Calling other functions that contain the assembler doesn't have this impact. @Nic: Look for an off-by-one bug. (Sorry Tom, couldn't help myself.)
__________________
Regards, sh0dan // VoxPod |
20th May 2003, 20:59 | #53 | Link |
Moderator
Join Date: Oct 2001
Location: England
Posts: 3,285
|
The version I've been testing recently is definitely faster, but I've got a lot of machines to test on before the next "release".
Crop could just be implemented at the end of ::GetFrame, or anywhere really. I tried to use env->SubFrame, but it didn't work out. -Nic ps @sh0dan: Thanks for the hint, but feel free to post the fix. |
20th May 2003, 22:40 | #55 | Link |
Retired AviSynth Dev ;)
Join Date: Nov 2001
Location: Dark Side of the Moon
Posts: 3,480
|
Using SubFrame is a bit tricky here, since MPEG2DEC writes the full resolution to the AviSynth PVideoFrame.
You would have to create a separate full-resolution videoframe and return the cropped one to AviSynth in the constructor. The full-resolution VideoFrame should be used for the env->NewVideoFrame(vi) call that constructs the output frame for MPEG2DEC. Then the SubFrame function can be used to return the cropped version, which corresponds to the VideoInfo returned by the constructor. transform.cpp / the Crop function can of course be used as a reference here.
Oh, this is actually the hard way of doing this. The easiest way is probably to invoke the internal crop filter, in AVISynthAPI.cpp: Code:
AVSValue mpegS = new MPEG2Source(
    args[0].AsString(d2v),
    args[1].AsInt(cpu),
    args[2].AsInt(idct),
    args[3].AsBool(iPP),
    args[4].AsInt(moderate_h),
    args[5].AsInt(moderate_v),
    args[6].AsBool(showQ),
    args[7].AsBool(fastMC),
    args[8].AsString(cpu2),
    env);

// Crop arguments: clip, left, top, -right, -bottom (10 pixels off every edge).
AVSValue CropArgs[5] = { mpegS.AsClip(), 10, 10, -10, -10 };
return env->Invoke("crop", AVSValue(CropArgs, 5));
__________________
Regards, sh0dan // VoxPod |
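A note on the -10, -10 in the crop arguments: as far as I understand Crop, a width or height of zero or less is treated as an offset from the right or bottom edge rather than as a size. The sketch below shows that interpretation; resolve_crop and CropResult are hypothetical names, and the rule is my reading of the documented behaviour, not code lifted from transform.cpp.

```cpp
#include <cassert>

// Resulting frame size after a crop.
struct CropResult {
    int width, height;
};

// Resolves Crop-style arguments: w and h are sizes when positive, and
// offsets from the right and bottom edges when zero or negative.
CropResult resolve_crop(int src_w, int src_h, int left, int top, int w, int h) {
    assert(left >= 0 && top >= 0);
    if (w <= 0) w = src_w - left + w;  // e.g. 720 - 10 + (-10) = 700
    if (h <= 0) h = src_h - top + h;
    assert(w > 0 && h > 0);
    assert(left + w <= src_w && top + h <= src_h);
    return {w, h};
}
```

So for a 720x576 source, the 10, 10, -10, -10 above would give a 700x556 clip, and the values read from the d2v file would need converting into this form before being passed to Invoke.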
20th May 2003, 23:13 | #56 | Link |
Registered User
Join Date: Oct 2001
Location: Gainesville FL USA
Posts: 2,092
|
The reason I was first enthused about making crop work in MPEG2DEC2 was partially for convenience but mostly for speed. Some of my 1920x1080 HDTV upconverts (like Buffy, no longer an issue after tonight ) had to be cropped to 4:3 taking off 256 pixels on each side.
That was 512x1080 pixels on each frame that did not have to be converted to YUY2, something noticeable in performance. It is probably less important now when the data just has to be copied for YV12, but who knows. The other possibility I've discussed here before is to not crop in MPEG2DEC3 at all but just pass both the crop & resize parms back to Avisynth, as global variables that scripts could refer to by some known names, later on. Especially with resize this would allow the convenience of specifying the values interactively in DVD2AVI but still allow the resize to occur after deinterlacing, where it belongs. - Tom |
20th May 2003, 23:23 | #57 | Link |
Moderator
Join Date: Oct 2001
Location: England
Posts: 3,285
|
@sh0dan:
Damn, I didn't think of doing it that way, and it's very obvious too. Doh. Oh, and as for SubFrame, I realised that problem and thought I had compensated for it, but I must have made a mistake somewhere (trying to code two things at once while I was at work).
@trbarry: It's a nice idea, but I wouldn't want a special thing in Avisynth just to deal with the cropping and resizing. sh0dan's method seems a good solution... I've been meaning to look into "invoke" for a while, as DDogg will testify. Your intra/non-intra code does cause mpeg2dec3 to go slower too, Tom. I don't know why; it might be for the reasons sh0dan gave. Anyway, I'll put all this stuff together and test it out.
I honestly think that, unless one of us gets a brainwave, it's going to be hard to squeeze much more out of mpeg2dec3 speed-wise; its structure is more the problem than anything. I might go back and do what I should have done in the first place with mpegdecoder, and use libmpeg2 to make an mpeg2dec clone. -Nic |
21st May 2003, 00:35 | #58 | Link | |
Registered User
Join Date: Oct 2001
Location: Gainesville FL USA
Posts: 2,092
|
Quote:
But if the Add_Block stuff is busted and the other stuff is slower then I guess we might as well just delete my test release 1.05. - Tom |
|
21st May 2003, 13:48 | #59 | Link |
Moderator
Join Date: Oct 2001
Location: England
Posts: 3,285
|
@Sh0dan: That crop code worked great. Thanks for that; I should have been able to come up with that idea myself, though.
@Tom: Well, I'm using an AMD 1800 XP. I'll leave the code in and test more. Have you spotted the bug in Add_Block? Can you fix it? I've been playing around with another iDCT (yes, I know, another one), which seems very accurate and that little bit faster, but I'm still profiling it. May as well release a new version to update the 1.04 soon... -Nic |
21st May 2003, 15:36 | #60 | Link |
Retired AviSynth Dev ;)
Join Date: Nov 2001
Location: Dark Side of the Moon
Posts: 3,480
|
I'm also starting to enjoy filter invocation, and have used it in my latest filters (MipSmooth), and in a more advanced form in ConditionalFilter.
Regarding the bitblitting, you could simply set AvsEnv to 0 in the mpeg2decoder constructor, and then do: Code:
if (AvsEnv) { bitblt } else { use existing }
__________________
Regards, sh0dan // VoxPod Last edited by sh0dan; 21st May 2003 at 15:47. |
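Spelled out a little further, the null-check fallback might look like the sketch below. BlitEnv is a stand-in interface so the example compiles without avisynth.h; its BitBlt parameter order follows AviSynth's IScriptEnvironment::BitBlt, but the Decoder class, set_env and copy_plane are made-up names and not the MPEG2DEC3 code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Stand-in for the single environment method used here (assumption: same
// parameter order as AviSynth's BitBlt).
struct BlitEnv {
    virtual void BitBlt(std::uint8_t* dst, int dst_pitch, const std::uint8_t* src,
                        int src_pitch, int row_size, int height) = 0;
    virtual ~BlitEnv() = default;
};

// Hypothetical decoder fragment: env_ stays null when the DLL is used
// without AviSynth, so the plain copy path is taken.
class Decoder {
public:
    Decoder() : env_(nullptr) {}
    void set_env(BlitEnv* env) { env_ = env; }

    void copy_plane(std::uint8_t* dst, int dst_pitch, const std::uint8_t* src,
                    int src_pitch, int row_size, int height) {
        assert(row_size <= dst_pitch && row_size <= src_pitch);
        if (env_) {
            // AviSynth is present: use its optimized blit.
            env_->BitBlt(dst, dst_pitch, src, src_pitch, row_size, height);
        } else {
            // Standalone use: existing row-by-row copy.
            for (int y = 0; y < height; ++y) {
                std::memcpy(dst + y * dst_pitch, src + y * src_pitch, row_size);
            }
        }
    }

private:
    BlitEnv* env_;
};
```

Whether the pointer can be stored once or has to be refreshed on every GetFrame call depends on whether the env passed in can change, which is the open question from earlier in the thread.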
|
|