Speed up encodes whilst maintaining quality?


hydra3333
7th June 2014, 12:56
Looking to change to a 750ti video card soon, in an effort to speed up encodes whilst maintaining quality.
With this
REM input is .mpg (mpeg2 1440x1080i TFF) output is mpeg4 576i
"avs2yuv.exe" "Man About The House.avs" -o - | "x264-x64.exe" - --stdin y4m --thread-input --frames "128818" --profile high --level 4.1 --preset slow --interlaced --tff --no-cabac --crf 18 --sar 64:45 --colormatrix bt470bg -o "Man About The House-temp.MP4"
SetMTmode(mode=3,threads=8) # start with mode=5 for AVISource http://forum.doom9.org/showthread.php?p=1067216#post1067216
setmemorymax(1024)
DGSource("Man About The House.dgi",deinterlace=2,resize_w=720,resize_h=576) #deinterlace=2 means double rate deinterlacing
SetMTmode(mode=2) #
AssumeTFF()
Assumefps(25)
trim(1,-999999) # fix a double-framerate bug in NV per http://forum.doom9.org/showthread.php?p=1391556#post1391556
#LAST.HEIGHT==1080 ? AddBorders(int left, int top, int right, int bottom) : LAST
LAST.HEIGHT==1080 ? AddBorders(0,0,0,8) : LAST
AssumeTFF() #choose the SAME field order like before deinterlacing
SeparateFields().SelectEvery(4,0,3).Weave() #reinterlace.
AssumeTFF() #choose the SAME field order like before deinterlacing
AssumeFPS(25)
LAST.HEIGHT==1088 ? cropbottom(8) : LAST
ColorMatrix(interlaced=true,mode="Rec.709->Rec.601")
SetPlanarLegacyAlignment(True)
I get about 9-10fps with an 8800GT.
I tried feeding the .avs directly into the 32bit version of x264 ... approximately the same speed :)
Suggestions for "speedup" welcomed (whilst maintaining about the same quality).

PS it's a TV capture, no rules broken.

edit: fixed --reset to --preset

Guest
7th June 2014, 13:13
First, upgrade your Nvidia card as you mentioned. That will greatly improve the decode side. But the big determinant is what you are doing on the encode side.

All those field operations look very dodgy to me too. They shouldn't even be necessary at all.

Groucho2004
7th June 2014, 13:21
With this
REM input is .mpg (mpeg2 1440x1080i TFF) output is mpeg4 576i
"avs2yuv.exe" "Man About The House.avs" -o - | "x264-x64.exe" - --stdin y4m --thread-input --frames "128818" --profile high --level 4.1 --reset slow --interlaced --tff --no-cabac --crf 18 --sar 64:45 --colormatrix bt470bg -o "Man About The House-temp.MP4"
"--reset slow" - I guess that should be "--preset slow"
"--interlaced" - Not used any more. Just use "--tff"

"I get about 9-10fps with an 8800GT."
That seems very slow for SD resolution. What CPU do you use?
Also, run the script through AVSMeter to exclude the encoding from the measurements.

hydra3333
8th June 2014, 00:41
"That seems very slow for SD resolution. What CPU do you use?"
An i7-3820 (4-core with hyperthreading, LGA2011, quad-channel memory and apparently decent PCI Express bandwidth), Win7, 16GB, encoding from sata6 disk to an SSD (the command line is a "cut" example not showing it).
"Also, run the script through AVSMeter to exclude the encoding from the measurements."
OK, I'll look up how to do that this afternoon.

hydra3333
8th June 2014, 00:52
First, upgrade your Nvidia card as you mentioned. That will greatly improve the decode side. But the big determinant is what you are doing on the encode side.

All those field operations look very dodgy to me too. They shouldn't even be necessary at all.
OK upgrading soon.

Agreed, this script was a bad quick copy/chop from another where there are cleanup operations in the middle before the re-interlace. I'll remove them and see how it goes.
SetMTmode(mode=3,threads=8) # start with mode=5/3 for AVISource http://forum.doom9.org/showthread.php?p=1067216#post1067216
setmemorymax(1024)
DGSource("Man About The House.dgi",deinterlace=2,resize_w=720,resize_h=576) #deinterlace=2 means double rate deinterlacing
SetMTmode(mode=2) #
trim(1,-999999) # fix a double-framerate bug in NV per http://forum.doom9.org/showthread.php?p=1391556#post1391556
SeparateFields().SelectEvery(4,0,3).Weave() #reinterlace.
AssumeTFF() #choose the SAME field order like before deinterlacing
AssumeFPS(25)
ColorMatrix(interlaced=true,mode="Rec.709->Rec.601")

Groucho2004
8th June 2014, 09:37
An i3820 (4-core hyperthreading LGA2011 i7-3820 with quad-channel memory and apparently decent PCI Express bandwidth), win7, 16Gb, encoding from sata6 disk to an SSD
Hm, looks like you are limited by the script/decoding. Let's see what testing only the script reveals.

SetMTmode(mode=3,threads=8) # start with mode=5/3 forAVIsource http://forum.doom9.org/showthread.php?p=1067216#post1067216
setmemorymax(1024)
DGSource("Man About The House.dgi",deinterlace=2,resize_w=720,resize_h=576) #deinterlace=2 means double rate deinterlacing
SetMTmode(mode=2) #
trim(1,-999999) # fix a double-framerate bug in NV per http://forum.doom9.org/showthread.php?p=1391556#post1391556
SeparateFields().SelectEvery(4,0,3).Weave() #reinterlace.
AssumeTFF() #choose the SAME field order like before deinterlacing
AssumeFPS(25)
ColorMatrix(interlaced=true,mode="Rec.709->Rec.601")
I don't see anything in that script that would justify using MTMode with 8 threads. You're just wasting memory and introducing potential problems.

Edit: In fact, this line:
SeparateFields().SelectEvery(4,0,3).Weave()
slows the script to a crawl when using MT and that's your problem.
Without SetMTMode, the script runs just fine.

hydra3333
8th June 2014, 11:26
And so it does. Goodness me.

The MT mode is a hangover from the original script, which did filtering after deinterlacing and before re-interlacing. I had thought it could only improve things, even in this case, but it actually makes things worse by quite a fair factor.

Re-jigging the testing:
with MT and 8 threads: 6 fps encoding
without MT and 8 threads: 103 fps encoding

Tinkering with the script this also gives 103 fps:
SetMTmode(mode=5,threads=8) # start with mode=5 forAVIsource http://forum.doom9.org/showthread.php?p=1067216#post1067216
setmemorymax(768)
DGSource("C:\test\test.dgi",deinterlace=2,resize_w=720,resize_h=576) #deinterlace=2 means double rate deinterlacing
#
SetMTmode(mode=2) #
trim(1,-999999) # fix a double-framerate bug in NV per http://forum.doom9.org/showthread.php?p=1391556#post1391556
SeparateFields()
SelectEvery(4,0,3)
Weave() #reinterlace.
AssumeTFF() #choose the SAME field order like before deinterlacing
AssumeFPS(25)
SetMTmode(mode=5,threads=8) # start with mode=5 forAVIsource http://forum.doom9.org/showthread.php?p=1067216#post1067216
#ColorMatrix(interlaced=true,mode="Rec.709->Rec.601")
SetPlanarLegacyAlignment(True)
#Distributor()

And putting back the colormatrix for hd->sd gives 96 fps, a monstrous improvement over 6fps:
SetMTmode(mode=5,threads=8) # start with mode=5 forAVIsource http://forum.doom9.org/showthread.php?p=1067216#post1067216
setmemorymax(768)
DGSource("C:\test\test.dgi",deinterlace=2,resize_w=720,resize_h=576) #deinterlace=2 means double rate deinterlacing
#
SetMTmode(mode=2) #
trim(1,-999999) # fix a double-framerate bug in NV per http://forum.doom9.org/showthread.php?p=1391556#post1391556
SeparateFields()
SelectEvery(4,0,3)
Weave() #reinterlace.
AssumeTFF() #choose the SAME field order like before deinterlacing
AssumeFPS(25)
ColorMatrix(interlaced=true,mode="Rec.709->Rec.601")
SetMTmode(mode=5,threads=8) # start with mode=5 forAVIsource http://forum.doom9.org/showthread.php?p=1067216#post1067216
SetPlanarLegacyAlignment(True)
#Distributor()

Guest
8th June 2014, 12:00
Why do you do double-rate deinterlacing and then re-interlace?

hydra3333
8th June 2014, 12:06
In this script, for no purpose other than to prove that it can be done.

In other scripts based on this, I can run filters over the progressive frames before re-interlacing, e.g. some TV shows shown on 1440x1080i are "old" and in need of some loving to be watchable.

Various cases:
- use NV to just resize to 576 and return interlaced frames without processing, for encoding
- use NV to return 576 double framerate deinterlaced, process frames, reinterlace to 576i
- use NV to return 1080 double framerate deinterlaced, process frames and resize to 576i, reinterlace
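
For anyone wondering what SeparateFields().SelectEvery(4,0,3).Weave() actually picks out of the double-rate frames, here is a rough Python model of the field bookkeeping. It is not AviSynth code: make_frame, separate_fields, select_every, weave and reinterlace are made-up helper names, frames are just lists of (frame number, row) labels, and top-field-first order is assumed as in the scripts above.

```python
# Toy model of re-interlacing a double-rate (bobbed) clip, assuming TFF.
# Every row is labelled with the progressive frame it came from, so the
# result shows which source frame feeds each field of the output.

def make_frame(n, height=4):
    """Progressive frame n: each row is labelled (frame number, row index)."""
    return [(n, y) for y in range(height)]

def separate_fields(frames):
    """Split each frame into a top field (even rows) then a bottom field (odd rows)."""
    fields = []
    for frame in frames:
        fields.append(frame[0::2])  # top field
        fields.append(frame[1::2])  # bottom field
    return fields

def select_every(clip, step, *offsets):
    """Keep the items at `offsets` within each complete group of `step` items.
    Simplification: an incomplete trailing group is dropped."""
    return [clip[i + o] for i in range(0, len(clip) - step + 1, step) for o in offsets]

def weave(fields):
    """Interleave consecutive (top, bottom) field pairs back into full frames."""
    frames = []
    for top, bottom in zip(fields[0::2], fields[1::2]):
        frame = []
        for top_row, bottom_row in zip(top, bottom):
            frame.extend([top_row, bottom_row])
        frames.append(frame)
    return frames

def reinterlace(frames):
    """Model of SeparateFields().SelectEvery(4,0,3).Weave()."""
    return weave(select_every(separate_fields(frames), 4, 0, 3))

bobbed = [make_frame(n) for n in range(4)]  # 4 double-rate progressive frames
interlaced = reinterlace(bobbed)            # 2 interlaced frames

for frame in interlaced:
    print(frame)
```

In this model each output frame takes its top field from one double-rate frame and its bottom field from the next one, so 50 progressive frames per second go back to 25 interlaced frames carrying 50 distinct fields.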

nhakobian
8th June 2014, 17:46
In other scripts based on this, I can run filters over the progressive frames before re-interlacing.

I understand the whole de-interlacing part, but why would you ever want to or need to re-interlace for encoding?

The only time when this could even be remotely necessary is on blu-rays for certain combinations of resolutions and framerates. And for that you have the fake-interlaced and pulldown modes in x264 to encode them as progressive, but have flags placed in so a blu-ray player can re-create the interlaced frames for devices that require it.

Interlaced encoding is less efficient (since you are encoding by field), plus you have repeated temporal data that will cause a loss of efficiency, plus playback of files on computer devices will not look great.

hydra3333
9th June 2014, 01:38
Good question. Not sure what you mean about repeated temporal data, 1080i -> 576i, could you clarify ?
Source 1440x1080i -> cleaned 576i/p, in my case 576i because, in increasing order of importance:
- Comparing,
* 576i = 576i50 = 25 fps @ 50 fields/sec whereas 576p = 576p25 = 25 fps (refer "motion fluidity" below)
* using 576p50 instead would be double the data vs 576i, and hence double the final filesize with other network/playback consequences
- Target playback device which outputs to the telly is limited in what it can play; neither playback device nor viewing device are a computer
- Some of the TV captures are 1080i fast action sports on a large grassed arena with lots of panning and attempted detail (the TV stations limit bandwidth atrociously and the end result is "blocky")
* with fast sports action, there's a discussion somewhere about "motion fluidity" where 576i50 can appear visually "less jerky" than 576p25

Happy for you to point out silliness on my part, and improvement that could be made.

edit: http://www.avforums.com/threads/576i-vs-576p.723452/
Sport at 25fps would be horrible. Movies limit the speed of pans to avoid too large a jump between frames because of the relatively low framerate. You can't do that with sport, it has to follow the speed of the action, so in that case what you want and get is 50 progressive frames per second where the deinterlacing has filled in the blank lines in the field using whatever algorithm it has available.

Guest
9th June 2014, 01:41
The point is you made it double rate progressive. Leave it that way!

It's too big for you? Reduce the bitrate. In the modern era, interlacing is a terrible compression method.

nhakobian
9th June 2014, 02:29
Good question. Not sure what you mean about repeated temporal data, 1080i -> 576i, could you clarify ?

Many times (mostly in telecining 24p sources to 30i), you repeat individual fields to reach your goal rate. I think some 30fps to 25fps conversions (and the other way around) do the same thing as well. This is effectively repeated temporal data and wasted data in encoding.


* using 576p50 instead would be double the data vs 576i, and hence double the final filesize with other network/playback consequences

This is very untrue for (most) progressive sources. In a test I made on video game captures at 60fps vs. 30 fps a couple years back, the 60fps captures took 15-20% more space when encoded. This is primarily due to motion prediction. The only case I can think of where you would need tons more space is if you have an extremely grainy/noisy source and wanted to keep the grain. Since grain/noise is technically random from frame to frame, it takes more space to encode the random differences.

Of course this is going to vary significantly from source to source. Best thing to do is test it and see, and use a denoiser if you have really noisy sources.
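
To put rough numbers on the "double the data" point, here is a small Python sketch. The raw sample counts are plain arithmetic for 720x576; the size estimate simply reuses the 15-20% figure quoted above, which came from 60fps vs 30fps game captures, so applying it to 576p50 is an assumption and not a measurement. luma_samples_per_second and estimated_size_range are made-up helper names, and the 1000 MB base size is an arbitrary example value.

```python
# Raw (uncompressed) luma sample rates for the three formats under discussion.

def luma_samples_per_second(width, lines, pictures_per_second):
    """Luma samples delivered per second for pictures of width x lines."""
    return width * lines * pictures_per_second

p25 = luma_samples_per_second(720, 576, 25)  # 576p25: 25 full frames
i50 = luma_samples_per_second(720, 288, 50)  # 576i50: 50 fields of 288 lines
p50 = luma_samples_per_second(720, 576, 50)  # 576p50: 50 full frames

def estimated_size_range(base_size_mb, extra_low_pct=15, extra_high_pct=20):
    """Illustrative only: scale a base size by the quoted 15-20% overhead."""
    low = base_size_mb * (100 + extra_low_pct) / 100
    high = base_size_mb * (100 + extra_high_pct) / 100
    return low, high

print("576p25 raw:", p25)
print("576i50 raw:", i50)
print("576p50 raw:", p50, "ratio to 576i50:", p50 / i50)
print("assumed encoded size for a 1000 MB base:", estimated_size_range(1000))
```

So the raw data does double going from 576i50 to 576p50, but the encoded size depends on how well motion prediction copes with the extra frames, which is why a test encode on your own source is the only real answer.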


- Target playback device which outputs to the telly is limited in what it can play; neither playback device nor viewing device are a computer


This is the only time I could possibly think to keep it, if your device is really limited. If your source is something that was shot progressively, definitely store it as progressive. You'll be happier in the end.


- Some of the TV captures are 1080i fast action sports on a large grassed arena with lots of panning and attempted detail (the TV stations limit bandwidth atrociously and the end result is "blocky")
* with fast sports action, there's a discussion somewhere about "motion fluidity" where 576i50 can appear visually "less jerky" than 576p25


These are probably the only true natively interlaced sources (besides some home video cameras). I still think of this as a hack (as with most interlaced coding) and it trades image quality for motion fluidity. As you said, most TV providers bitrate-starve their channels, so it makes it worse. But I have to say, most LCD TVs have pretty good deinterlacers (for real time processing). That can't fix some really annoying issues (like the end of the Super Bowl when they pump tons of confetti into the air).

But if you have the double rate deinterlaced versions, you should really see if your device will play them. A few years from now you'll probably thank yourself for doing that.

hydra3333
9th June 2014, 04:30
Thank you neuron2 and nhakobian.
I will re-check my playback devices' specs and their latest firmware to see if 576p is playable and also compare encode times/sizes.
(I'm in a PAL place so telecining isn't a straightforward issue for me; the TV stations would have already mucked up 30fps to 25fps for me ;))

edit: the WDTV specs don't say whether it handles 576p50 or even outputs 576p50 (it's ambiguous). In any case it seems that it internally converts all files to the one pre-specified output spec, e.g. 1080i, which it then pushes over HDMI to the TV. My lowest common denominator TV is 1080i.

Mixer73
10th June 2014, 16:22
In my experience 576p encodes are just fine on these hardware devices. Set your media player to 720p and deinterlace before encoding, sit back and enjoy.

hydra3333
12th June 2014, 12:55
That looks like a plan !

Motenai Yoda
17th June 2014, 01:35
Add e.g. SelectRangeEvery(800,40) at the end of the script,
then encode it, saving the console output to a text file.
Check in the text file how many B-frames and reference frames it used.

e.g.
x264 [info]: consecutive B-frames: 3.9% 6.0% 9.4% 33.6% 16.4% 24.7% 6.1%
It used 6 B-frames in only 6.1% of cases, so you can reduce -b to 5.
Also, in
x264 [info]: ref P L0: 56.5% 5.8% 17.4% 5.5% 4.6% 4.0% 3.4% 2.2% 0.6% 0.0%
it used reference 9 in 0% of cases and reference 8 in 0.6%, so you can reduce -r to 7.

This will speed up encoding, especially with b-adapt 2 and mixed references.
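
If you'd rather not read the saved log by eye, something like this Python sketch can pull the percentages out of those two x264 lines. The helper names (parse_percentages, highest_column_at_or_above) and the cutoffs of 10% and 1% are my own assumptions for illustration, not anything x264 defines; pick whatever cutoffs you are comfortable with.

```python
import re

def parse_percentages(log_line):
    """Return the percentages listed after the last colon of an x264 stats line."""
    _, _, tail = log_line.rpartition(":")
    return [float(value) for value in re.findall(r"(\d+(?:\.\d+)?)%", tail)]

def highest_column_at_or_above(shares, min_share):
    """0-based index of the last column whose share is >= min_share, or None."""
    best = None
    for index, share in enumerate(shares):
        if share >= min_share:
            best = index
    return best

# The two example lines from the post above.
bframe_line = "x264 [info]: consecutive B-frames: 3.9% 6.0% 9.4% 33.6% 16.4% 24.7% 6.1%"
ref_line = "x264 [info]: ref P L0: 56.5% 5.8% 17.4% 5.5% 4.6% 4.0% 3.4% 2.2% 0.6% 0.0%"

bframe_shares = parse_percentages(bframe_line)
ref_shares = parse_percentages(ref_line)

# The first B-frame column is "0 consecutive B-frames", so the column index
# is the B-frame count. The 10% cutoff is an arbitrary example value.
suggested_bframes = highest_column_at_or_above(bframe_shares, 10.0)

# Last reference column used in at least 1% of cases (arbitrary cutoff).
last_useful_ref_column = highest_column_at_or_above(ref_shares, 1.0)

print("suggested -b:", suggested_bframes)
print("last ref column above cutoff:", last_useful_ref_column)
```

With these example cutoffs the B-frame result agrees with the -b 5 above, and the last reference column above the cutoff is column 7. Whether that column number or the count of columns kept is the right value for -r is worth checking against x264's own help text before relying on it.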