20th March 2014, 02:25 | #1 | Link |
Registered User
Join Date: Mar 2014
Posts: 1
|
Add a "Who's talking" display to video
I've been attempting this on my own but haven't gotten anything to work, and I haven't found a discussion that points me in the right direction. I'm looking for a way to show visually when each audio clip contains sound. I'd appreciate any advice anyone might have. If you've ever seen a Mumble or Skype HUD displaying this sort of info during a large call, that is essentially what I'm looking to achieve.
Background: I put together a lot of video game footage of multiple players in a grid format; the audio is their commentary. (example 4-person layout) The audio is all individual files until I mix them together, and I think having them as separate files is necessary for what I want to achieve here. So I've been trying to find a way with AviSynth to do a simple effect for when an individual is talking, such as temporarily changing the color of their name or displaying a speaker icon. This would be useful because I generally create groups of 4-8 and it would help to see who is speaking. I figure that if I can find a way to check an audio clip for dramatic changes in volume, I can apply these effects whenever they occur. Is there a way? I've tooled around with AviSynth for a while now, but I think what I'm looking to do is beyond me to work out. |
20th March 2014, 02:45 | #2 | Link |
Guest
Join Date: Jan 2002
Posts: 21,901
|
That sounds hard to do in Avisynth. If I had to accomplish this to save my life, I would first preprocess the separate audio files to create a "talking map". Then I would write an avisynth filter to use the talking map to overlay a sprite(s) on the video in the appropriate position(s).
|
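One possible shape for the "talking map" approach, as a rough sketch and not the custom filter described above: AviSynth's built-in ConditionalReader can read a per-frame text file, and ConditionalFilter can switch clips on the value it supplies. The file name talk1.txt, its frame ranges, the variable name talking1, the source v.avi and the Subtitle stand-in for a sprite are all made up for illustration. The map itself would still have to come from some external audio-analysis step that is not shown here.

```avisynth
# talk1.txt: hypothetical "talking map" for player 1, written by an
# external preprocessing step (not shown). ConditionalReader text format:
#   Type bool
#   Default false
#   R 120 310 true
#   R 450 700 true

base   = AviSource("v.avi")                               # assumed source clip
marked = base.Subtitle("VOICE ONE", x=24, y=32, size=56)  # stand-in for a sprite overlay

# Show "marked" on frames where talking1 is true, "base" otherwise.
ConditionalFilter(base, marked, base, "talking1", "==", "true")

# ConditionalReader goes after the filter that consumes the variable.
ConditionalReader("talk1.txt", "talking1")
```

With one map file and one variable per player, the same pair of calls could be repeated for each name or icon.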
20th March 2014, 03:24 | #3 | Link |
Registered User
Join Date: Sep 2007
Posts: 5,377
|
Is it noise/speech vs silence ?
One way to do it is with ConditionalFilter() and minmaxaudio.dll, using AudioRMS(). You need 4 individual sets of audio & video, and you can combine them afterwards with StackHorizontal/StackVertical. The replacement "layer" can be anything: you can overlay a logo, make a pointer, overlay a border like Skype, or change the color whenever audio is detected above a threshold value. In this example, I used the alternating audio on/off from ColorBars() channel 2 as the audio source, and the "replacement" whenever audio is detected is just a darkened version. Code:
colorbars(pixel_type="yv12")
trim(0,300)
getchannel(2)
orig=last
replace=orig.levels(0,0.2,255,0,255,false)
ConditionalFilter(orig,replace,orig,"AudioRMS(0)", ">", "-50", show=true)
|
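For the four-player case mentioned above, the same test could be wrapped in a small helper and applied to each player before stacking. This is only a sketch under assumptions: MarkTalking is a hypothetical name, the file names and plugin path are placeholders, and the -50 dB default is simply carried over from the example.

```avisynth
LoadPlugin("MinMaxAudio.dll")   # adjust to wherever the plugin lives

# Hypothetical helper: darken one player's video while that player's
# audio is above "threshold" dB RMS (same test as the example above).
function MarkTalking(clip v, clip a, float "threshold")
{
    threshold = Default(threshold, -50.0)
    orig    = AudioDub(v, a)                            # attach this player's own audio
    replace = orig.Levels(0, 0.2, 255, 0, 255, false)   # darkened copy
    return ConditionalFilter(orig, replace, orig,
    \   "AudioRMS(0)", ">", String(threshold))
}

# Placeholder file names; all four videos must share size, frame rate
# and colour format for the Stack filters to accept them.
p1 = MarkTalking(AviSource("p1.avi"), WavSource("p1.wav"))
p2 = MarkTalking(AviSource("p2.avi"), WavSource("p2.wav"))
p3 = MarkTalking(AviSource("p3.avi"), WavSource("p3.wav"))
p4 = MarkTalking(AviSource("p4.avi"), WavSource("p4.wav"))

StackVertical(StackHorizontal(p1, p2), StackHorizontal(p3, p4))
```

The stacked result is not a mix of the four commentary tracks, so the mixed audio would still need to be dubbed onto the final clip afterwards.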
20th March 2014, 08:43 | #4 | Link |
Retried Guesser
Join Date: Jun 2012
Posts: 1,373
|
@poisondeathray, good idea, but I have some refinements: variable transparency and a little decay time to cut down on flickering.
Code:
LoadPlugin("MinMaxAudio\Release\MinMaxAudio.dll")

A1=WavSource("a1.wav") ## uncompressed audio is a lot faster due to runtime analysis!
A2=WavSource("a2.wav")
#A3=...

AviSource("v.avi")

debug=true ## set to true for adjusting the mask windows

overlay_1 = Subtitle("VOICE ONE", x=24, y=32, size=56)
AudioLevelOverlay(A1, overlay_1,
\   16, 26, 512, 80, showmask=debug)

overlay_2 = Subtitle("VOICE TWO", x=Width-524, y=32, size=56)
AudioLevelOverlay(A2, overlay_2,
\   Width-532, 26, 512, 80, showmask=debug)

#overlay_3 = ...

#AudioDub(final_audio_mix)
return Last

##################################
### show overlay clip only when there is audio
### http://forum.doom9.org/showthread.php?p=1674312#post1674312
##
## @ C - base clip
## @ A - audio
## @ O - overlay
## @ x, y, wid, hgt - position & size of mask window
## @ boost - overall level boost (fudge factor) (default 18)
## @ gate - ignore audio under (-gate+boost) dB; (default 20)
##     NOTE "boost" and "gate" are shared among all instances of
##     this function (thanks Gavino).
##     * if the overlay does not get fully opaque, increase boost;
##     * if the overlay shows up when it shouldn't, increase gate.
##     For example, I used boost=24, gate=12 on a muddy source.
## @ showmask - for setting window size & position
##
function AudioLevelOverlay(clip C, clip A, clip O,
\           int x, int y, int wid, int hgt,
\           int "boost", int "gate", bool "showmask", string "mode")
{
    Assert(O.Width==C.Width && O.Height==C.Height,
    \   "AudioLevelOverlay: overlay must be same size as base clip")

    global boost = Min(Max( 0, Default(boost, 18)), 24)
    global gate = Min(Max(0, Default(gate, 20)), 60)
    showmask = Default(showmask, false)
    mode = Default(mode, "blend")

    AudioDub(C, A.AmplifyDB(-6).AudioEcho.Normalize(1))

    S = ScriptClip(Last.Crop(0, 0, wid, hgt), """
        x = Min(Max(0, Round(AudioRMS(0))+gate+boost), 255)*255/gate
        return Last.BlankClip(color=to_rgb(x))""")

    M = Overlay(C.BlankClip, S, x=x, y=y).ConvertToY8

    return (showmask)
    \ ? C.Overlay(M, opacity=0.5, mode="add")
    \ : C.Overlay(O, mask=M, mode=mode)
}

function AudioEcho(clip A, float "delay", float "mix")
{
    delay = Min(Max(0.01, Float(Default(delay, 0.33))), 5.0)
    mix = Min(Max(0.0, Float(Default(mix, 0.33))), 1.0)
    return A.MixAudio(A.AudioTrim(0, delay)+A, (1.0-mix), mix)
}

function to_rgb(int r, int "g", int "b")
{
    r = Min(Max(0, r), 255) ## thanks Gavino
    g = Min(Max(0, Default(g, r)), 255)
    b = Min(Max(0, Default(b, r)), 255)
    return (r*65536) + (g*256) + b
}
Last edited by raffriff42; 22nd March 2014 at 05:11. Reason: changes in blue |
20th March 2014, 11:37 | #5 | Link |
Avisynth language lover
Join Date: Dec 2007
Location: Spain
Posts: 3,431
|
Neat idea, raffriff42 (and poisondeathray for the basic method).
Note that the global variables will cause problems if you ever want to call AudioLevelOverlay() more than once in a script with different values of 'gate' and/or 'boost'. A better way to pass arguments into a run-time script is to use the 'args' parameter of the GRunT run-time filters. Also, the setting of the mask levels seems to be incorrect. If I understand, the intention is to make the overlay start to become visible for audio levels above -gate, and fully opaque when it reaches -boost. However, with this code it becomes visible at -(gate+boost) and the opacity overshoots 255 (and hence wraps around to zero) at -boost. (Perhaps function to_rgb() should limit its arguments to 255). |
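To make the arithmetic concrete, here is a worked example. It assumes the defaults gate = 20 and boost = 18, writes R for the rounded AudioRMS value in dB, and takes the mask expression without any upper clamp. That unclamped form is one reading of the behaviour described above, not necessarily the exact line as first posted.

```latex
\begin{align*}
x(R) &= \max\bigl(0,\; R + \mathrm{gate} + \mathrm{boost}\bigr)\cdot\frac{255}{\mathrm{gate}} \\[4pt]
x(R) > 0 \;&\Longleftrightarrow\; R > -(\mathrm{gate}+\mathrm{boost}) = -38\ \mathrm{dB} \\[4pt]
x(R) = 255 \;&\Longleftrightarrow\; R + \mathrm{gate} + \mathrm{boost} = \mathrm{gate}
      \;\Longleftrightarrow\; R = -\mathrm{boost} = -18\ \mathrm{dB} \\[4pt]
x(-10) &= (-10 + 20 + 18)\cdot\frac{255}{20} = 357 \;>\; 255
\end{align*}
```

So under these assumptions the overlay fades in over the range -38 dB to -18 dB, and anything louder than -18 dB exceeds 255 unless the value is clamped.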
20th March 2014, 12:44 | #6 | Link |
Registered User
Join Date: Jan 2006
Location: Finland
Posts: 134
|
An alternate suggestion - You could use a plugin such as AudioGraph() to add a visual representation of the audio on each of the clips. With a bit of other AviSynth scripting you could make it as visible or unobtrusive as you prefer.
|
20th March 2014, 23:43 | #7 | Link | |
Retried Guesser
Join Date: Jun 2012
Posts: 1,373
|
Thanks for the help, Gavino. I've fixed (in blue) some of the issues you mention, but not this one:
Quote:
For now, I have left the globals as they are, with a caveat. Re: gate & boost, I messed around trying to meet your specification, but then I realized I was originally thinking of a microphone "boost" switch on an audio mixer, active *before* the noise gate. So your second wording describes what the settings do, except that the wraparound problem is fixed. |
|
21st March 2014, 00:32 | #8 | Link | |
Avisynth language lover
Join Date: Dec 2007
Location: Spain
Posts: 3,431
|
Quote:
In your function, you could use: Code:
S = ScriptClip(Last.Crop(0, 0, wid, hgt), """
    x = Min(Max(0, Round(AudioRMS(0))+gate+boost)*255/gate, 255)
    return Last.BlankClip(color=to_rgb(x))""", args="gate,boost")
|
|
21st March 2014, 19:38 | #10 | Link |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
+1 on Waveform, much nicer.
|
22nd March 2014, 04:56 | #11 | Link |
Retried Guesser
Join Date: Jun 2012
Posts: 1,373
|
Waveform works nicely - for example you could size & move each audio waveform under the matching speaker's names.
Here's a demo video for the script above: I'm no animator, but I managed to prepare a still "base" image and 2 "glow" versions, one for each speaker. "Who's talking" effect - avisynth (youtu.be)
This task required Overlay mode="lighten" because the glow effects overlap one another and no mask rectangle is used (i.e., rectangle = full screen), so the script has a new "mode" option.
Another idea: modulate *position* instead of opacity, so that an overlay "sprite" jiggles when someone talks.
Last edited by raffriff42; 18th March 2017 at 00:54. Reason: (fixed image link) |