
Some functions for continuity editing


rling
19th June 2010, 03:58
Hi Avisynth users

I have a set of functions I have used many times for non-linear editing, especially continuity editing, of home video, weddings, travel videos etc. I'll post them here in case they are useful to anyone else.

Some advantages of scripts that use these functions:
- easy to read and edit -- it looks like an EDL (Edit Decision List)
- shorter than the equivalent script that uses Trim() and Splice()
- it's easy to add comments, so you can understand it later on...

Support for basic continuity editing is built in -- eg. "cutaways" or "reaction shots"

First, here is a part of a real edited video that uses these functions, so you can see what it looks like:

tape = AVISource("I:\tape1.AVI")

# black lead-in for fade-up
push(BlankClip(tape))
cut(0)
wait(30)
pop()

push(tape)
dis(36576,15) # Scene 284, golden buddha sitting outside temple
wait(36696) # max range: 36576 - 36716
dis(37844,15) # Scene 292, slow zoom on temple roof
wait(37935) # max range: 37844 - 37935
dis(37053,15) # Scene 287, carved door
wait(37192) # max range: 37053 - 37192
dis(37583,15) # Scene 290, row of stupas
wait(37711) # max range: 37583 - 37711
disa(38052,15)
disv(38188,15) # Scene 295, silhouetted bells
waitv(38188+75) # max range: 38188 - 38377

# slow-mo this scene as it is too short
push(changespeed(tape, 0.5))
disv(38377*2-75,15) # Scene 295, silhouetted bells
waitv(38377*2) # max range: 38188 - 38377
disv(38052*2,15) # Scene 294, slow pan of scene to horizon
disa(38052,15) # Scene 294, slow pan of scene to horizon
waitv(38122*2) # max range: 38032 - 38187
pop()

# cut to inside

cut(36717) # Scene 285, incense bowls & monks kneeling, from side
wait(36832) # max range: 36717 - 36832
cut(38955) # Scene 300, monks kneeling, from the rear
wait(39078) # max range: 38955 - 39078
#dis(36833,15) # Scene 286, crash zoom on silhouetted banners
#wait(37052) # max range: 36833 - 37052
cut(37487) # Scene 289, still of temple frieze
wait(37582) # max range: 37487 - 37582
dis(37193,15) # Scene 288, still of temple frieze
wait(37486) # max range: 37193 - 37486
cut(37712) # Scene 291, gold ornaments
wait(37843) # max range: 37712 - 37843

# slow-mo this scene as it is too short
disa(37936,15) # Scene 293, slow zoom on gold ornaments
push(changespeed(tape, 0.25))
disv(37936*4,15) # Scene 293, slow zoom on gold ornaments
waitv(37936*4+75) # max range: 37936 - 38031
pop()

#cut(38378) # Scene 296, gold ornaments, out of focus leave out
#wait(38589) # max range: 38378 - 38589
#cut(38590) # Scene 297, gold ornaments, out of focus leave out
#wait(38747) # max range: 38590 - 38747

# outside

cut(38748) # Scene 298, exterior mosaic
wait(38845) # max range: 38748 - 38845
dis(38846,15) # Scene 299, exterior mosaic
wait(38954) # max range: 38846 - 38954
cut(39079) # Scene 301, monks in courtyard
wait(39192) # max range: 39079 - 39192

[... more of the same ...]

pop()

# fade to black
push(BlankClip(tape))
disv(0, 15)
wait(20)
pop()

return finish()


SIMPLE EXAMPLE:

Now to explain some of the functions, here is a simple example... it would be easy to do this using Trim() and Splice(), so you might wonder why I need these functions at all, but trust me it gets more complicated later!

Import("continuity.avs")
global debug_sync = true
avi = AudioDub(BlankClip(length=1000,width=100,height=100),Tone(type="Silence"))
push(avi)
cut(100)
wait(150)
cut(200)
wait(250)
dis(300,10)
wait(350)
pop()
return finish()

Let's look line by line:

Import("continuity.avs")

This line imports my continuity editing functions.

global debug_sync = true

Turning this boolean on causes frame numbers to be displayed on input AVIs, and also displays the audio under the video in the output AVI, so that we can see what frames are being used in both video and audio. Normally, you would leave debug_sync turned off.

avi = AudioDub(BlankClip(length=1000,width=100,height=100),Tone(type="Silence"))
push(avi)

These lines create a dummy blank AVI and set it as the input AVI, that is, the clip we are currently taking video and audio from. For me the input is usually a complete DV tape with many scenes. (Actually you can probably guess that push() pushes the AVI onto a stack, but I will explain that later.)

cut(100)
wait(150)

This says: cut to frame 100 of the input clip, then wait until we reach frame 150 of the input clip. That is, frames 100-150 inclusive (51 frames) are the first piece of output. We can't wait() for a frame that is earlier than the one we cut() to. For example, wait(90) after cut(100) would be an error. But we could have used wait(100), which would output just one frame (frame 100).

cut(200)
wait(250)

This says: cut to frame 200 of the input clip, then wait until we reach frame 250 of the input clip. That is, frames 200-250 inclusive (51 frames) are appended to the first piece of output, so the total output is now 102 frames.

dis(300,10)
wait(350)

This says, dissolve to frame 300 of the input clip, with an overlap of 10. Then wait until we reach frame 350 of the input clip. Now we have output another 51 frames, but the first 10 of those frames are dissolved with the last 10 frames of the previous output. So the output is only 41 frames longer than before. The total output is 143 frames.
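
The frame counts in this walkthrough can be checked with a small Python sketch. This is only a toy model of the arithmetic, not the code in continuity.avs; the function name output_length and the (first, last, overlap) tuples are made up for illustration.

```python
def output_length(segments):
    """Count output frames for a list of (first, last, overlap) segments.

    first/last are inclusive input frame numbers (the cut/dis and wait
    arguments); overlap is 0 for a cut, or the dissolve overlap for dis().
    """
    total = 0
    for first, last, overlap in segments:
        if last < first:
            # mirrors the rule that wait(90) after cut(100) is an error
            raise ValueError(f"can't wait({last}) after cut({first})")
        # a dissolve shares `overlap` frames with the previous output
        total += (last - first + 1) - overlap
    return total


# cut(100)/wait(150), cut(200)/wait(250), dis(300,10)/wait(350)
simple = [(100, 150, 0), (200, 250, 0), (300, 350, 10)]
print(output_length(simple))  # 143
```

Each segment contributes last - first + 1 frames, and a dissolve gives back its overlap, which is where 51 + 51 + 41 = 143 comes from.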

pop()

This function pops off the input clip we pushed earlier with push(avi), because we are now finished with it. We don't actually need to do this in this case, but it is tidy!

return finish()

This function indicates that this is the end of the editing. finish() returns the edited output.

BASIC CONTINUITY EDITING:

Next let's look at unsynchronised video and audio (eg. for "cutaway shots" or "reaction shots"). This is a basic trick of continuity editing, I use it all the time. Say we replace all the lines between the push() and pop() with these lines:

cut(100)
wait(150)
cutv(500)
waitv(520)
cutv(600)
waitv(620)
cutv2a()
wait(200)

Line by line again:

cut(100)
wait(150)

The output starts at frame 100, and continues up to and including frame 150. So we have 51 output frames so far.

cutv(500)

This function cuts the video to frame 500, but the audio stays where it was, at frame 150. So, this begins a "cutaway shot", where we take video from another part of the tape without changing the soundtrack. The video and audio are now unsynchronised.

waitv(520)

This function waits until video frame 520. (We can't use wait() here, because the video and audio are unsynchronised, so video frame X and audio frame X will happen at different times. Using wait() would trigger an assert.) Anyway, this waitv() outputs a video-only clip of frames 500-520. Remember that the audio is still only up to frame 150. We haven't said anything about the audio yet, so no audio is output.

cutv(600)
waitv(620)

These two functions output another video-only clip of frames 600-620. Just to show that we can join together cutaway shots. We still haven't said anything about the audio so it stays at frame 150.

cutv2a()

This function ("cut video to audio") resynchronises the video to the current audio position. When we desynchronised the video, the audio was at frame 150, and we added 42 video frames since then. So, this function outputs the next 42 frames of audio from the current audio position (frames 151-192 inclusive) making the streams the same length again. The next audio frame will be frame 193, so it cuts the video to frame 193 as well. Now the video and audio are synchronised again.

wait(200)

This line waits until frame 200. So, frames 193-200 (of both video and audio) are appended to the output. The final output is a clip of 101 frames, where the audio is taken from frames 100-200 inclusive and the video is made up of 4 small clips.
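
The resynchronisation arithmetic for cutv2a() can be sketched in Python as follows. This is a toy model under the assumption that the audio simply keeps running from where it was; the helper name cutv2a_resync is hypothetical and is not taken from continuity.avs.

```python
def cutv2a_resync(last_audio_frame, cutaway_clips):
    """Model what cutv2a() has to output to make the streams equal length.

    last_audio_frame: last audio frame output before the cutaway began.
    cutaway_clips: list of (first, last) inclusive video-only clips.
    Returns (first_audio, last_audio, resync_frame).
    """
    video_frames = sum(last - first + 1 for first, last in cutaway_clips)
    first_audio = last_audio_frame + 1
    last_audio = last_audio_frame + video_frames
    # video is cut to the frame right after the last audio frame output
    return first_audio, last_audio, last_audio + 1


first_a, last_a, sync = cutv2a_resync(150, [(500, 520), (600, 620)])
print(first_a, last_a, sync)  # 151 192 193

# synced 100-150, then the 42 cutaway frames, then synced 193-200
total = (150 - 100 + 1) + (last_a - first_a + 1) + (200 - sync + 1)
print(total)  # 101
```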

Often, cutaways are used in recorded TV interviews to make an answer shorter. You can't just cut out a section of video and audio from an answer, because the person will move slightly so you get a "jump cut" in the video. So, first you cut to a brief shot of the interviewer. Then you cut out the audio you don't want, perhaps joining using a dissolve so you don't get a "pop" in the soundtrack. Then you can cut back from the interviewer to the person's answer. And you avoid a jump cut. Here is the script fragment for this, showing how we take an answer that lasts for frames 100-300, and cut out the audio from frames 150-200 without a jump cut:

cut(100) # cut to the person starting their answer
wait(140) # wait till shortly before the start of the audio we want to cut out
cutv(5000) # cut video to a shot of the interviewer (which was actually taken much later)
waita(149) # wait up to and including the last frame before the audio we want to cut out
cuta(201) # cut to the first frame after the audio to be cut out, the video is not affected
waitv(5050) # wait till the interviewer has been seen for about 50 frames (51 to be exact)
cutv2a() # cut video back to the person giving the answer (frame 243).
wait(300) # wait till the end of the answer

The length of this sequence is determined by the length of the 2 audio segments, the first from 100-149 (50 frames) and the next from 201-300 (100 frames) so the total is 150 frames. We can change the parameters to cutv() and waitv() without changing the length of the sequence. But if we make the distance between cutv() and waitv() too long (like say 500 frames), we get an error at wait(300), because after cutv2a() the synchronised position will be already past frame 300 and we can't wait for a frame that has already passed!
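
To see where the video lands after cutv2a() in the interview fragment, and why a very long cutaway makes wait(300) fail, here is a rough Python sketch. It is a simplified model with made-up names (resync_frame and its parameters), not the author's implementation, and it only covers the case where the cutaway outlasts the audio cut.

```python
def resync_frame(last_synced, last_audio_before_cut, first_audio_after_cut,
                 cutaway_first, cutaway_last):
    """Frame that cutv2a() cuts the video to, in this simplified model."""
    cutaway_frames = cutaway_last - cutaway_first + 1
    # audio frames already laid under the cutaway before the audio cut
    audio_before_cut = last_audio_before_cut - last_synced
    remaining = cutaway_frames - audio_before_cut
    if remaining < 0:
        raise ValueError("cutaway ends before the audio cut; not modelled here")
    # the rest of the cutaway is covered by audio from the new position
    return first_audio_after_cut + remaining


sync = resync_frame(140, 149, 201, 5000, 5050)
print(sync)          # 243
print(sync <= 300)   # True, so wait(300) is still reachable

# a 500-frame cutaway (5000-5499) pushes the resync point past frame 300
too_long = resync_frame(140, 149, 201, 5000, 5499)
print(too_long)      # 692

# total length depends only on the two audio segments
print((149 - 100 + 1) + (300 - 201 + 1))  # 150
```

In this model the 51-frame cutaway is covered by 9 audio frames (141-149) before the audio cut and 42 frames (201-242) after it, so the video resumes at frame 243.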

What if we wanted to use dissolves instead of cuts? Here are the functions for that. We still have to start with a cut, because there is no previous video to dissolve with.

cut(100) # cut to the person starting their answer
wait(140) # wait till shortly before the start of the audio we want to cut out
disv(5000,5) # dissolve video to a shot of the interviewer, with an overlap of 5
waita(149) # wait up to and including the last frame before the audio we want to cut out
disa(201,5) # dissolve to the first frame after the audio to be cut out, with an overlap of 5
waitv(5050) # wait till the interviewer has been seen for about 50 frames
disv2a(5) # dissolve video back to the person giving the answer
wait(300) # wait till the end of the answer

The length of this sequence is 5 less than the previous one, because there is a 5 frame overlap between the 2 audio segments, so the total is 145 frames.

You can see that cutv2a() and disv2a() save you having to work out the maths. So are there functions called cuta2v() and disa2v()?? Yes!! cuta2v() is the opposite of cutv2a(). It cuts the audio to match the current video position. And disa2v() is the opposite of disv2a(). Suppose we changed the second last line of the above script to:

disa2v(5)

This script fragment is the same as before, except instead of cutting the video to synchronise with the audio, it cuts the audio to synchronise with the video, which is frame 5051. Now the video and audio are synchronised again. But, the call to wait(300) is an error, because frame 5051 is already past frame 300. If we used wait(6000) instead, it would work (if the input was long enough).

Usually, cuta2v() and disa2v() are more useful for doing the opposite of a "cutaway shot" - where we want to replace the audio with audio from another time:

cut(100)
wait(150)
cuta(500) # cut audio to frame 500, video stays at frame 150
waita(520) # output 21 audio frames from 500-520 inclusive
cuta2v() # output 21 video frames (151-171) and cut audio to sync with video at frame 172
wait(200) # output synchronised frames 172-200 inclusive


That's it for now. See the functions in the next post.
I can post more explanation if anyone is interested

Cheers
R

rling
19th June 2010, 04:16
Functions are a bit too long to post as text, so please see attached AVS file.

Cheers
R

7ekno
19th June 2010, 05:29
Wow, that's pretty neat! Thanks for posting, certainly helps instead of a heap of Trims()!

Tek

b66pak
19th June 2010, 19:36
thanks a lot...

Gavino
19th June 2010, 20:13
A very interesting and useful example of the power and flexibility of Avisynth's scripting language.

Very well written too.
:thanks:

Gavino
20th June 2010, 11:12
Very well written too.
Having said that (and I still agree :)), I have a few comments on the code:

1. function 'image' uses an undefined variable 'template'.
Perhaps should be a function parameter, or use 'video_input' instead?

2. In function 'zoom', perhaps you want to Assert(scale <= 1.0)
(Actually, the meaning of scale as used here seems counter-intuitive to me)
Why round the last 4 resizer params to integer, when they accept float for sub-pixel accuracy?

3. In function __append_range,
range = Trim(input, first, ((first == 0 && last == 0) ? -1 : last))
could simply be
range = Trim(input, first, -count)
Also here, better to avoid using 'last' as a variable, given its special meaning in Avisynth.

rling
20th June 2010, 22:32
Thanks everyone for your comments!


1. function 'image' uses an undefined variable 'template'.
Perhaps should be a function parameter, or use 'video_input' instead?

2. In function 'zoom', perhaps you want to Assert(scale <= 1.0)
(Actually, the meaning of scale as used here seems counter-intuitive to me)
Why round the last 4 resizer params to integer, when they accept float for sub-pixel accuracy?


Actually the functions zoom(), addsubtitle() and image() should not really be here. Gavino is right, they're broken!! :eek:
You probably guessed that these functions are "input filters" which are used with push().
About 2 or 3 years ago when this file became very long, I moved all the "input filters" to separate files, to try to keep this file tidy. It looks like I forgot to delete some of the old ones! I have not used the ones here for a long time. Please ignore them or remove them! ;)

(But I do know that in my current zoom() I still don't assert, I just have to get the parameter right... :rolleyes: )

3. In function __append_range,
range = Trim(input, first, ((first == 0 && last == 0) ? -1 : last))
could simply be
range = Trim(input, first, -count)
Also here, better to avoid using 'last' as a variable, given its special meaning in Avisynth.

Thanks Gavino! I think when I was working on this, I didn't know if Trim(input, first, -count) did really what I wanted. So I just wrote exactly what I wanted, then I was 100% sure!
And it worked, so I never thought about it again (same problem as forgetting to remove the extra input filters ;) )
But your code is shorter and neater than mine, and you are welcome to change it! Thanks for that!
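
For anyone wondering why the (first == 0 && last == 0) special case was needed at all: in AviSynth's Trim(), a last_frame of 0 means "to the end of the clip", and a negative value means "this many frames". A small Python model of that documented behaviour (the function name trim_range is made up here) shows why Trim(input, first, -count) avoids the special case:

```python
def trim_range(first, last, frame_count):
    """Return the inclusive (first, last) range that Trim(first, last) selects.

    Models AviSynth's Trim: last == 0 means "to end of clip",
    last < 0 means "-last frames starting at first".
    """
    if last == 0:
        end = frame_count - 1
    elif last < 0:
        end = first + (-last) - 1
    else:
        end = last
    return first, min(end, frame_count - 1)


print(trim_range(0, 0, 1000))       # (0, 999): the whole clip, not just frame 0
print(trim_range(0, -1, 1000))      # (0, 0): exactly one frame
print(trim_range(100, 150, 1000))   # (100, 150)
print(trim_range(100, -51, 1000))   # (100, 150): same range via a frame count
```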

I hope that everything else works OK, so if there are other bugs please tell me. If I get time I'll post a bit about my workflow with these functions. You can probably guess from looking at the comments in the first script fragment I posted, that the initial script is software-generated... :cool:

Cheers
R

rling
24th June 2010, 08:44
SEPARATE VIDEO AND AUDIO:

We can build up the video and audio streams completely separately using these functions. But we have to be careful, because of the way cuts/dissolves and waits interact.

Basically, any cut or dissolve (other than the first one) happens right after reaching the frame given in the wait that precedes it in the script. (The very first cut has no preceding wait, of course.)

So for example, you already know what this does:

cut(20)
wait(30) <-- this wait here...
cut(10) <-- is the preceding wait for this cut.
wait(15)

The cut to frame 10 happens right after frame 30 is output.
We can also cut the video and audio separately:

cut(20)
wait(30) <-- this is the preceding wait for the following 2 cuts.
cutv(100) <-- this cut happens right after frame 30 is output.
cuta(200) <-- this cut happens at the same time.

So, suppose we wanted to build the video and audio streams separately, and have them play together:
We can start like this:

cut(20)
wait(30) # output 11 synchronised frames

cutv(100)
waitv(110) # output 11 video frames
cutv(120)
waitv(130) # output 11 video frames
cutv(140)
waitv(150) # output 11 video frames (3 x 11 = 33 frames)

Now the video is 11+33 frames long, but the audio is still only 11 frames long. We want to output some audio that will be matched with those 33 extra video frames.
Here is something that will not work:

cuta(200)
waita(208) # output 9 audio frames
cuta(210)
waita(218) # output 9 audio frames
cuta(220)
waita(228) # output 9 audio frames
cuta(230)
waita(238) # output 9 audio frames (4 x 9 = 36 frames)

The reason this will not work is that the preceding wait for cuta(200), is waitv(150), NOT wait(30)! There are NOT separate wait positions for video and audio! So we are actually asking the script to cut the audio to frame 200 just after outputting video frame 150. The audio has to be extended by another 33 frames to reach that time, so audio frames 31-63 are output automatically, so that the video and audio are the same length. THEN the audio cuts to frame 200, and while the audio is output, the video is extended by 36 frames starting from frame 151. That is not what we wanted at all!!

To do what we want, after outputting the video, we have to add a new "preceding wait" just above cuta(200), to remind the script that we want this audio to start straight after frame 30, like this:

waita(30) # preceding wait
cuta(200)
waita(208) # output 9 audio frames
cuta(210)
waita(218) # output 9 audio frames
cuta(220)
waita(228) # output 9 audio frames
cuta(230)
waita(238) # output 9 audio frames (4 x 9 = 36 frames)

We can use waita(30) here because the audio position still has not passed frame 30. We couldn't use wait() here, because the video and audio are not synchronised and of course the video is already past that frame anyway, so we would get an assert.

So now, we have output 11+33 frames of video and 11+36 frames of audio. We can resynchronise video and audio with a simple cut, like this:

cut(300)
wait(310)

This cut will happen after outputting the 36 audio frames, but the video is still only 33 frames long, so the video will be automatically extended by 3 frames (frames 151-153). In effect, the script inserts waitv(153) just before cut(300).

So here is the full script:

cut(20)
wait(30)

cutv(100)
waitv(110)
cutv(120)
waitv(130)
cutv(140)
waitv(150) <-- this wait is redundant

waita(30)
cuta(200)
waita(208)
cuta(210)
waita(218)
cuta(220)
waita(228)
cuta(230)
waita(238)

cut(300)
wait(310)

You can see that the waitv(150) doesn't actually do anything, because it is followed by another wait. This is a simple rule: any wait followed immediately by another wait doesn't do anything, so it can be removed from the script.

You might try to break the script here by making the unsynchronised audio shorter than the unsynchronised video. For example, you could comment out some lines like this:

waita(30)
# cuta(200)
# waita(208)
# cuta(210)
# waita(218)
# cuta(220)
# waita(228)
cuta(230)
waita(238)

This will reduce the unsynchronised audio from 36 frames to 9 frames. Now you will get an error: "can't waitv(126) after cutv(140)". There is no waitv(126), so what does it mean?

Remember that in the previous example, the script worked out that the cut(300) should happen after video frame 153, so it inserted a waitv(153) for us. That was because the video was 3 frames too short. Now the video is too long, so it needs to add a video clip with "negative length" to reach the cut point -- starting from video frame 140, it needs to cut after frame 126! Of course this is not possible, so it gives an error. Basically we need to remove some video, if we want to use cut() after adding only 9 frames of audio.
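
The implied waitv() in both cases (153 and 126) can be reproduced with a short Python sketch. This is a model of the arithmetic only; implied_waitv and its arguments are hypothetical names, and the real script may compute this differently.

```python
def implied_waitv(video_clips, audio_frames):
    """Frame X of the waitv(X) that a synchronising cut() implies.

    video_clips: (first, last) inclusive video-only clips since the last
                 synchronised frame; the `last` of the final clip is ignored
                 because the script recomputes it.
    audio_frames: number of unsynchronised audio frames output.
    """
    completed = sum(last - first + 1 for first, last in video_clips[:-1])
    last_cut = video_clips[-1][0]
    needed = audio_frames - completed      # frames the final clip must supply
    x = last_cut + needed - 1
    if x < last_cut:
        raise ValueError(f"can't waitv({x}) after cutv({last_cut})")
    return x


clips = [(100, 110), (120, 130), (140, 150)]
print(implied_waitv(clips, 36))  # 153: video extended by 3 frames

try:
    implied_waitv(clips, 9)      # only 9 audio frames: video is too long
except ValueError as err:
    print(err)                   # can't waitv(126) after cutv(140)
```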

HOW TO BREAK IT?

One way to break it, or at least to do something strange:

cut(0)
waita(10)
cutv(200)
waita(5)
cuta(15)
waita(50)

Here we start outputting synchronised video and audio from frame 0. We tell the script that when the audio reaches frame 10 (which is the same as video frame 10 of course) it should cut the video to frame 200. But then we cut out part of the audio. The audio will never reach frame 10! So will the video cut to frame 200??

Actually it still does. The script does not really care whether audio frame 10 is actually reached. It predicted that the audio will reach frame 10 after 10 frames, so it cuts the video after 10 frames.


SOME MORE SIMPLE RULES

I already said any wait followed immediately by another wait doesn't do anything, so it can be removed from the script.

Here are 2 more rules:

(1) any function that cuts or dissolves video (cut, dis, cutv, disv) that is followed immediately by another function that cuts or dissolves video, has no effect and can be removed from the script. Only the last one has any effect. For example:

cutv(0)
cutv(100)
disv(200,10)
cutv(50)

All this could be replaced with cutv(50).

(2) any function that cuts or dissolves audio (cut, dis, cuta, disa) that is followed immediately by another function that cuts or dissolves audio, has no effect and can be removed from the script. Only the last one has any effect. For example:

cuta(0)
disa(200,10)
cuta(50)
cut(123)

All this could be replaced with cut(123).

Calling cut() is actually the same as calling cutv() and cuta() together. So sometimes you might get an error talking about cuta() or cutv() when actually you used cut(). Similarly, calling dis() is actually the same as calling disv() and disa() together.

Calling wait() is the same as calling waitv(), except that wait() asserts that the video and audio are synchronised.
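
The three "redundant call" rules can be expressed as a small script simplifier. The Python below is only an illustration of the rules as stated in this post, not part of continuity.avs. It assumes a call is redundant only when the very next call repositions every stream the first one touched, so for example cut() followed by cutv() is kept, because the audio half of the cut still matters.

```python
VIDEO, AUDIO = "v", "a"

# streams that each cutting/dissolving function repositions
CUT_STREAMS = {
    "cut": {VIDEO, AUDIO}, "dis": {VIDEO, AUDIO},
    "cutv": {VIDEO}, "disv": {VIDEO},
    "cuta": {AUDIO}, "disa": {AUDIO},
}
WAITS = {"wait", "waitv", "waita"}


def is_redundant(name, next_name):
    """True if `name` has no effect because `next_name` follows immediately."""
    if name in WAITS:
        return next_name in WAITS
    if name in CUT_STREAMS and next_name in CUT_STREAMS:
        # subset test: the next call overrides every stream this one moved
        return CUT_STREAMS[name] <= CUT_STREAMS[next_name]
    return False


def simplify(script):
    """script is a list of (function_name, args) tuples."""
    kept = []
    for i, (name, args) in enumerate(script):
        next_name = script[i + 1][0] if i + 1 < len(script) else None
        if next_name is not None and is_redundant(name, next_name):
            continue
        kept.append((name, args))
    return kept


print(simplify([("cutv", (0,)), ("cutv", (100,)),
                ("disv", (200, 10)), ("cutv", (50,))]))
# [('cutv', (50,))]
print(simplify([("cuta", (0,)), ("disa", (200, 10)),
                ("cuta", (50,)), ("cut", (123,))]))
# [('cut', (123,))]
```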

Cheers
R