Doom9's Forum - View Single Post - SEt's Avisynth 2.5.8 MT compiled for *X86_64*, Latest Build 4/16/2010

Stephen R. Savage · 27th February 2010, 19:37

I think a key problem is the lack of a good way to benchmark AVS64. I use avs2avi for standard (32-bit) AVS, but there is no equivalent for AVS64. For example, if you benchmark by encoding with x264, the results are skewed by x264's own 64-bit advantage. If you normalize by feeding the 32-bit Avisynth through avs2yuv, you still have to factor in the speedup the 64-bit setup gets from cutting out the piping overhead.

That said, here are some benchmarks. All results were taken using 64-bit x264 at q=51, preset=ultrafast. The 32-bit AVS was fed through avs2yuv, which should have a negligible performance cost when single-threaded.

Edit: Performance benchmarks have been redone using avs2avi.

TempGaussMC/EEDI2
32-bit: 2.83 fps
64-bit: 3.01 fps

MDeGrain3
32-bit: 6.20 fps
64-bit: 6.90 fps

AAA (mt_masktools, EEDI2)
32-bit: 1.92 fps
64-bit: 0.63 fps (SetMTMode: 1.94 fps)

Didee's Edge Mask
32-bit: 80.16 fps
64-bit: 89.02 fps

EEDI2 Resize2x
32-bit: 5.59 fps
64-bit: 5.67 fps
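
To put the figures above on a common scale, here is a minimal Python sketch that divides each 64-bit result by its 32-bit counterpart. The fps values are copied from the list above; the names `RESULTS` and `speedups` are only illustrative, and the SetMTMode entry is compared against the single 32-bit AAA figure because no 32-bit SetMTMode number was reported.

```python
# fps figures copied from the benchmark list in the post: (32-bit, 64-bit).
RESULTS = {
    "TempGaussMC/EEDI2": (2.83, 3.01),
    "MDeGrain3": (6.20, 6.90),
    "AAA (mt_masktools, EEDI2)": (1.92, 0.63),
    # Assumption: compared against the plain 32-bit AAA result.
    "AAA with SetMTMode (64-bit only)": (1.92, 1.94),
    "Didee's Edge Mask": (80.16, 89.02),
    "EEDI2 Resize2x": (5.59, 5.67),
}


def speedups(results):
    """Return {name: fps_64 / fps_32}; values below 1.0 mean 64-bit is slower."""
    return {name: fps64 / fps32 for name, (fps32, fps64) in results.items()}


if __name__ == "__main__":
    for name, ratio in speedups(RESULTS).items():
        print(f"{name:34s} {ratio:5.2f}x ({(ratio - 1) * 100:+.1f}%)")
```

Read this way, every case gains roughly 1% to 11% in 64-bit except plain AAA, which runs at about a third of the 32-bit speed.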

None of these cases produced bit-exact output between the 32-bit and 64-bit builds. I am puzzled as to why AAA() comes out so much slower in 64-bit when all of its components are faster.

Edit: I have traced the AAA slowdown to the following code fragment:

Code:

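input = DirectShowSource("640x480p30.xvid.avi")

ox = width(input)
oy = height(input)

# Double the resolution in both directions with EEDI2
aa = TurnRight(input).EEDI2(field=1).TurnLeft().EEDI2(field=1)

# Edge mask: max of two Sobel-style kernels, scaled back to the source size
edge = mt_logic(mt_edge(aa, "5 10 5 0 0 0 -5 -10 -5 4", 0, 255, 0, 255),
	\ mt_edge(aa, "5 0 -5 10 0 -10 5 0 -5 4", 0, 255, 0, 255), "max").Greyscale().
	\ Levels(0, 0.8, 128, 0, 255, false).Spline36Resize(ox, oy, -0.5, -0.5, 2 * ox, 2 * oy)
# Downscale the interpolated clip and merge it into the source along the edges
ds = Spline36Resize(aa, ox, oy, -0.5, -0.5, 2 * ox, 2 * oy)
maskmerge = mt_merge(input, ds, edge, U=1, V=1)
# The fragment as posted called MergeChroma(ds), but no implicit "last" is set here;
# passing maskmerge explicitly is the assumed intent.
MergeChroma(maskmerge, ds)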

I think it is a cache-related bug, because none of the individual pieces is slower.

Last edited by Stephen R. Savage; 3rd March 2010 at 00:49. Reason: Update benchmarks