Audio Development [Archive] - Page 2

Myrsloik

27th June 2020, 11:07

Test 4 posted. Adds the functions AudioReverse and AudioLoop. Improves AudioSplice to be faster with many clips and fixes AudioMix and ShuffleChannels so they have sane argument lists. Plus all the things that changed during test3 that aren't very well documented.

Note that the crash bug still remains but doesn't ever happen when using vspipe.

The todo list:

Fix crash bug
Add a function for simple volume adjustment
Audio format conversion functions (format and/or samplerate)
Do a huge cleanup of the API and internal code before declaring audio support stable.

tebasuna51

28th June 2020, 12:22

DJATOM

28th June 2020, 13:06

Add 'import vapoursynth as vs' above core import

tebasuna51

28th June 2020, 13:31

Add 'import vapoursynth as vs' above core import

Perfect, thank you.

Now, how can I find the max float value in order to do a Normalize() after an
AudioMix(... matrix=[1.5]...), for instance?

Loading the float output in Audacity, I see values > 1 that can produce clipping when converted to int.

Myrsloik

28th June 2020, 14:43

Thanks for the new version.

I tried to run some tests to emulate a simple volume adjustment (Amplify in Avs), but it seems I have a syntax error here:
from vapoursynth import core
c = core.bas.Source(r"C:\tmp\tri16 - f32.wav")
# a simple mono wav
c = core.std.AudioMix(clips=c, matrix=[0.5], channels_out=[vs.FRONT_CENTER])
c.set_output()
and obtain:

Other tips:

std.SplitChannels(anode clip)
(In the docs, like the Trim/AudioTrim functions)
How do I obtain the separate clips? Maybe:
fl,fr,fc,lf,sl,sr = std.SplitChannels(source)

AudioTrimS(anode clip[, float first=0, float last, float length])
Values in seconds, converted with int(value * samplerate)
Accept negative values for first, to replace DelayAudio(value) by inserting silence at the beginning.

Do you want to make internal functions for changing the samplerate, to replace Avs ResampleAudio, SSRC, TimeStretch?
Or maybe support external plugins like TimeStretch in Avs+, or other plugins like Sox, for audio management?

Thanks for your work.

I'm going to add a function for simple volume adjustment in the next version. You can emulate it by passing a scaled identity matrix to AudioMix until that happens.

If you check the documentation you see that AudioMix always needs input channels*output channels arguments as the matrix. So for stereo it'd be [0.8, 0, 0, 0.8] as an example.
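
The scaled identity matrix mentioned above can be built for any channel count. This is a minimal sketch in plain Python; the helper name scaled_identity_matrix is hypothetical and not part of VapourSynth, and it only produces the flat list of input channels * output channels values that would then be passed as the matrix argument:

```python
def scaled_identity_matrix(num_channels, gain):
    """Flattened num_channels x num_channels matrix with `gain` on the diagonal.

    Every channel is scaled by `gain` and nothing is mixed between channels.
    """
    return [gain if row == col else 0.0
            for row in range(num_channels)
            for col in range(num_channels)]


stereo = scaled_identity_matrix(2, 0.8)
print(stereo)  # [0.8, 0.0, 0.0, 0.8]
```

For a 5.1 clip the same helper returns 36 values, six of which are non-zero.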

If you want to determine the max value you have to scan the whole clip yourself. You should be able to implement it as a pure python script or simply pass the vspipe output to something that can figure out the max values.

tebasuna51

28th June 2020, 20:35

...If you want to determine the max value you have to scan the whole clip yourself. You should be able to implement it as a pure python script...

Yes, but how can I obtain the sample values in the script?

For a clip, for instance c = core.bas.Source(r"C:\some.wav")
I can know:

NumSamples = len(c)
SampleRate = c.sample_rate
MaskChannels = c.format.channel_layout
NumChannels = c.format.num_channels
BitDepth = c.format.bits_per_sample
SampleFormat = c.format.sample_type
TimeLength = NumSamples/SampleRate

And I can do a for loop over samples and channels if I know the syntax to obtain c.VALUE(samples,channels):

max = 0.0
for samples in range(NumSamples):
    for channels in range(NumChannels):
        if c.VALUE(samples, channels) > max:
            max = c.VALUE(samples, channels)

Myrsloik

28th June 2020, 23:57

Yes, but how can I obtain the sample values in the script?

For a clip, for instance c = core.bas.Source(r"C:\some.wav")
I can know:

NumSamples = len(c)
SampleRate = 2 * c.format.samples_per_frame (for what a frame 0.5 sec?)
MaskChannels = c.format.channel_layout
NumChannels = c.format.num_channels
BitDepth = c.format.bits_per_sample
SampleFormat = c.format.sample_type
TimeLength = NumSamples/SampleRate

And I can do a for loop over samples and channels if I know the syntax to obtain c.VALUE(samples,channels):

max = 0.0
for samples in range(NumSamples):
    for channels in range(NumChannels):
        if c.VALUE(samples, channels) > max:
            max = c.VALUE(samples, channels)

Short answer: Not implemented yet, raw data access only works for video frames at the moment. Will probably be implemented for test5. That's how you know you're in a development thread...

Myrsloik

29th June 2020, 18:01

Test4 build sneakily updated. Changes:

Added AudioGain filter
Added the possibility to directly access audio data in the same way as video
Fixed crashes
Fixed broken wave64 headers

Myrsloik

29th June 2020, 18:10

How to get the max value of audio as a simple script:

maxval = 0
for frame in c.frames():
    for channel in range(c.format.num_channels):
        data = frame.get_read_array(channel)
        for val in data:
            if abs(val) > maxval:
                maxval = abs(val)

print(maxval)
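
Once the peak is known, the gain for a Normalize-style step follows directly. This is a minimal sketch in plain Python, assuming the samples have already been read into per-channel lists of floats; peak_and_gain and target_peak are hypothetical names, not part of VapourSynth:

```python
def peak_and_gain(channels, target_peak=1.0):
    """Return (peak, gain) for per-channel sequences of float samples.

    `gain` is the linear factor that brings the largest absolute sample
    to `target_peak`. Silent input gets a gain of 1.0.
    """
    peak = max((abs(sample) for channel in channels for sample in channel),
               default=0.0)
    gain = target_peak / peak if peak > 0 else 1.0
    return peak, gain


# Two short channels; the left one overshoots full scale after mixing.
peak, gain = peak_and_gain([[0.1, -1.5, 0.3], [0.75, 0.2, -0.4]])
print(peak, gain)
```

The resulting factor is what one would hand to a gain filter so the float data no longer exceeds 1.0 before conversion to int.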

tebasuna51

29th June 2020, 21:56

Thanks.

tebasuna51

30th June 2020, 18:26

The todo list:
...
Audio format conversion functions (format and/or samplerate)
...

- The format conversions in Avisynth are:

ConvertAudioTo8/16/24/32bit/Float

For me we only need ConvertAudioTo16/24/Float.

The 8 bit unsigned int format, still not supported as input:
Error: Unsupported audio format from decoder (probably 8-bit)
is not necessary at all.
The 32 bit signed int format, accepted as input, can be managed (Trim, Splice, Loop, Reverse, Shuffle), but it doesn't make sense to upconvert other formats to it.

The conversion between 24 bit signed int and 32 bit float can be considered lossless (precision is preserved because the float mantissa holds the same number of bits).
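
The lossless claim can be checked with a few lines of plain Python. This sketch assumes the common convention of scaling by 2^23 and uses struct to force values through IEEE-754 single precision; the helper names are hypothetical:

```python
import struct


def to_float32(value):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def int24_to_float(sample):
    """Map a signed 24 bit sample to a float32 in [-1.0, 1.0)."""
    return to_float32(sample / 2**23)


def float_to_int24(value):
    """Map a float in [-1.0, 1.0) back to a signed 24 bit sample."""
    return int(round(value * 2**23))


INT24_MIN, INT24_MAX = -2**23, 2**23 - 1
probes = [INT24_MIN, -1, 0, 1, 12345, 5592405, INT24_MAX]
round_trip_ok = all(float_to_int24(int24_to_float(s)) == s for s in probes)
print(round_trip_ok)

# One bit more than the float32 significand can hold no longer survives.
print(to_float32(float(2**24 + 1)) == 2**24 + 1)
```

The round trip holds because a float32 significand carries 24 bits, which is exactly what a 24 bit sample needs.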

Any arithmetic operation (Gain, Mix, Resample, Filter, ...) is a lossy operation, and it is recommended to do it in float format to avoid clipping (overflow when using an int format). Maybe each operation should store the max value reached, so the signal can be amplified or attenuated before reconverting the float to int (Normalize).
Arithmetic operations aren't allowed in 32 bit int format in AviSynth.

Many encoders support float input, but the user must select the final output format.

- I checked some wav input formats, and 16, 24 and 32 bit int, 32 bit float, u-law and a-law work fine.
8 bit int and 64 bit float show the same error:
Error: Unsupported audio format from decoder (probably 8-bit)
64 bit float is not supported in AviSynth either.
With adpcm_ms and adpcm_ima, a 20 second file with a 44100 samplerate is read as 18.375 seconds with a 48000 samplerate.

I tried to correct the error with:
c.format.samples_per_frame = 22050
and got:
AttributeError: attribute 'samples_per_frame' of 'vapoursynth.AudioFormat' objects is not writable

Avisynth has the AssumeSampleRate function, which is a way to make some duration changes, for instance:
"SpeedUp 23.976 -> 25":
AssumeSampleRate(last, (AudioRate()*1001+480)/960).SSRC(AudioRate(last))
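
The arithmetic in that expression can be checked in plain Python. The factor 1001/960 is 25 / (24000/1001), and the +480 is half of 960, so the division rounds to the nearest integer. This sketch assumes the Avisynth expression is evaluated with integer division; speedup_rate is a hypothetical helper name:

```python
from fractions import Fraction


def speedup_rate(sample_rate,
                 src_fps=Fraction(24000, 1001),
                 dst_fps=Fraction(25)):
    """Return (exact, rounded) sample rate to assume for a 23.976 -> 25 speed-up.

    `exact` is the rational value sample_rate * dst_fps / src_fps.
    `rounded` mirrors (rate * 1001 + 480) / 960 with integer division.
    """
    exact = sample_rate * dst_fps / src_fps
    rounded = (sample_rate * 1001 + 480) // 960
    return exact, rounded


for rate in (44100, 48000):
    exact, rounded = speedup_rate(rate)
    print(rate, float(exact), rounded)
```

For 48000 Hz the result is exactly 50050, so audio tagged with that rate and then resampled back to 48000 Hz plays 1001/960 times faster.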

tebasuna51

30th June 2020, 19:41

I was wrong, in this audio:
Audio Format Descriptor
Id: 11001000
Name: Audio16
Sample Type: Integer
Bits Per Sample: 16
Bytes Per Sample: 2
Samples Per Frame: 24000
Channels: FRONT_LEFT, FRONT_RIGHT
the Samplerate is not 2*SamplesPerFrame = 48000

How can we know the Samplerate of that audio?

The Audio Format is the same for samplerates from 8000 to 192000, so the Samples Per Frame data is useless.

The concept of a Frame in audio is not the same as in video.
The equivalent of a video frame is the audio Sample.

Also the output message:
"Output 160 frames in 0.07 seconds (2240.11 fps)"
doesn't make much sense for audio, maybe:
"Output 3 840 000 samples in 0.07 seconds (54 857 142 sps)"
or better, in time format like some encoders:
"Output 20 seconds in 0.07 seconds (285x)"

Myrsloik

30th June 2020, 20:13

Hint:
clip.sample_rate

tebasuna51

1st July 2020, 12:51

Thanks. The audio support is now:

Avisynth Audio functions and their VapourSynth equivalents
==========================================================
Lossless functions (must work over ALL sample formats, 8 bit int not necessary)
------------------
AssumeSampleRate         Pending, make attribute 'sample_rate' writable
AudioTrim                std.AudioTrim()
DelayAudio               Pending, maybe std.AudioTrim(-delay) inserts silence at the start
GetChannel               std.ShuffleChannels(), std.SplitChannels()
MergeChannels            std.ShuffleChannels()
+                        + or std.AudioSplice(), std.AudioLoop()
AudioDub/AudioDubEx      Pending, maybe never necessary
EnsureVBRMP3Sync         Not needed
KillAudio/KillVideo      Makes no sense now

Lossy functions (must work only in FLOAT format; in Avisynth those marked [16] also work with 16 bit int)
---------------
ConvertAudioTo8bit       Not needed, maybe the source plugin can supply 16int (also 32f for 64f)
ConvertAudioTo16bit      Pending
ConvertAudioTo24bit      Pending
ConvertAudioTo32bit      Not needed, it doesn't make sense upconvert other format to 32 int
ConvertAudioToFloat      Pending
Amplify/AmplifydB [16]   std.AudioGain()
ConvertToMono [16]       std.AudioMix()
MixAudio [16]            std.AudioMix()
MonoToStereo [16]        std.AudioMix()
Normalize [16]           Search MaxValue and std.AudioGain()

ResampleAudio [16]       Pending, maybe with external plugins
SSRC                     Pending, maybe with external plugins
TimeStretch              Pending, maybe with external plugins
SuperEQ                  Pending, maybe with external plugins

TimeStretch works as a plugin in Avs+ and can replace ResampleAudio and SSRC.
Sox can work as a plugin, replace SSRC and SuperEQ, and add many other audio functions.
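
For the Amplify/AmplifydB row, the only extra step compared with a linear gain is the decibel conversion. This is a minimal sketch in plain Python, assuming the gain filter takes linear factors (the thread does not confirm its argument format); db_to_gain and gain_to_db are hypothetical helper names:

```python
import math


def db_to_gain(decibels):
    """Convert an amplitude change in dB to a linear factor."""
    return 10 ** (decibels / 20)


def gain_to_db(gain):
    """Convert a positive linear amplitude factor to dB."""
    return 20 * math.log10(gain)


for db in (-6.0, 0.0, 3.0):
    print(db, round(db_to_gain(db), 4))
```

Halving the amplitude corresponds to roughly -6.02 dB, which is why a factor of 0.5 is often described as "minus six dB".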

Can we use avs.LoadPlugin() with Avs+ plugins?
I tried with
core.avs.LoadPlugin(r"C:\plugins\TimeStretch.dll")
c=c.avs.TimeStretch(tempo=100.0*25.0/(24000.0/1001.0))
and:
vapoursynth.Error: TimeStretch: argument c1 was passed an unsupported type (expected clip compatible type but got AudioNode)

MeteorRain

1st July 2020, 20:04

My personal opinion regarding the ConvertAudio* family:

I would rather get support for 32 bit instead of 24 bit. 24 bit (3 bytes) is not aligned to a SIMD boundary and processing it is difficult. 32 bit processing offers a huge improvement in performance because operations can be done using SSE or AVX.

DJATOM

1st July 2020, 21:02

And my personal opinion is that 24 bit is sometimes the maximum bit depth supported by certain audio codecs (for example, FLAC). Leaving it behind is bad.
Why not support all possible cases, if possible?

tebasuna51

1st July 2020, 21:44

I hope you are talking about 32 bit float, because 32 bit int is a format that I never see in the audio tracks of movies.

All decoders of lossy formats output 32 (even 64) bit float, and many lossy encoders accept 32 bit float as input, so most of the time we don't need any conversion.

Lossless formats (TrueHD, DTS-MA) use 16/24 bit int samples and the decoder must supply that. If we use only lossless functions (Trim, Add, Shuffle) we don't need to convert the audio and can output the same format.
If we need to use a lossy function we need to convert to float (16/24 bit int -> 32 bit float); after that, going back to the previous format is not recommended.

I don't know if a final conversion from 32 bit float to 24 bit int can save time; of course it saves space if we need a physical output file.

Myrsloik

1st July 2020, 22:37

Does anyone actually use double for audio processing? If so, where can I find it. Which applications support it?

feisty2

1st July 2020, 22:40

Adobe Audition has support for fp64 audio

MeteorRain

1st July 2020, 23:55

@DJATOM @tebasuna51 No, I'm talking about 32 bit integer. If I meant float I'd say float.

24 bit audio is stored in 3 bytes. An m128 register holds 5.33 samples, and an m256 register holds 10.67 samples.
If 16 bit is insufficient, I'd prefer using 32 bit as the internal depth.
Of course, float or double may be a better format. But in any case, no 24 bit for internal processing unless no arithmetic is involved.

If 24 bit output is needed, process in float or 32 bit and then convert back to 24 bit.
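
The register figures quoted above are simple division. A short plain-Python check, with samples_per_register as a hypothetical helper name:

```python
def samples_per_register(register_bits, sample_bits):
    """How many tightly packed samples fit in one SIMD register."""
    return register_bits / sample_bits


for sample_bits in (16, 24, 32):
    per_128 = samples_per_register(128, sample_bits)
    per_256 = samples_per_register(256, sample_bits)
    print(f"{sample_bits} bit: {per_128:.2f} per 128 bit, {per_256:.2f} per 256 bit")
```

Only the packed 24 bit case gives a fractional count, which means samples straddle register boundaries.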

tebasuna51

2nd July 2020, 01:03

Does anyone actually use double for audio processing? If so, where can I find it. Which applications support it?

eac3to with the parameter -full and using libav/ffmpeg can obtain 64 float:

eac3to v3.34
command line: eac3to 6a321.ac3 6a321.wav" -full
------------------------------------------------------------------------------
AC3, 5.1 channels, 0:00:20, 448kbps, 48kHz
Decoding with libav/ffmpeg...
Writing WAV...
Creating file "D:\Test\AudioD\Samples\ac3\6a321.ac3_.wav"...
The original audio track has a constant bit depth of 64 bits.
The processed audio track has a constant bit depth of 64 bits.

By default:
Decoding with libav/ffmpeg...
Reducing depth from 64 to 24 bits...

feisty2

2nd July 2020, 05:13

you guys confused the data structures of intermediate representations (data accessed via vsapi->getReadPtr) and the final output (uncompressed raw produced by vspipe), the two are not necessarily the same. it is definitely possible to pad uint24 to uint32 for vsapi->getReadPtr and retain the uint24 structure when the audio script is materialized thru vspipe.

this is already true for videos since each row of a frame returned by vsapi->getReadPtr is guaranteed to be 32-byte aligned, meaning two consecutive rows might not be consecutive in memory. however when vspipe materializes a frame, consecutive rows are definitely consecutive in the uncompressed raw, the data structures of two cases are already different as you can see.

feisty2

2nd July 2020, 05:29

simple rule: a 24-bit sample should always be padded to uint32 until it is materialized. all your codecs and stuff only deal with materialized output so that won't be a problem.

Myrsloik

2nd July 2020, 10:36

You know you could look at the actual API or ask BEFORE committing virtual hate crimes against 24 bit integers.

Storage is obviously always a power of two number of bytes just as for images. Packing only happens for output purposes. Just like video. Was that so hard to guess?
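
The difference between padded storage and packed output can be illustrated in plain Python. This is only a sketch of the general idea, not VapourSynth's actual implementation; the helper names are hypothetical and little-endian byte order is assumed:

```python
import struct


def unpack_int24_le(data):
    """Expand packed little-endian signed 24 bit samples into Python ints."""
    if len(data) % 3:
        raise ValueError("packed 24 bit data must be a multiple of 3 bytes")
    return [int.from_bytes(data[i:i + 3], "little", signed=True)
            for i in range(0, len(data), 3)]


def pack_int24_le(samples):
    """Pack signed ints back into 3 bytes per sample for output."""
    return b"".join(s.to_bytes(3, "little", signed=True) for s in samples)


# Three packed samples: most negative, most positive, and -1.
packed = bytes([0x00, 0x00, 0x80, 0xFF, 0xFF, 0x7F, 0xFF, 0xFF, 0xFF])
samples = unpack_int24_le(packed)

# In-memory form: one 32 bit slot per sample, so 12 bytes instead of 9.
padded = struct.pack(f"<{len(samples)}i", *samples)
print(samples, len(packed), len(padded))
```

Processing works on the padded form; only the final writer needs to call something like pack_int24_le.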

Those of you who know how lazy programmers can be would've also assumed I'm greatly inspired by the FFmpeg API and its solutions.

Btw, the next build will probably take a while longer due to the other code cleanups and improvements I want to do as well.

richardpl

2nd July 2020, 10:47

FFmpeg's solutions are better than anyone could code in several decades.

tebasuna51

2nd July 2020, 11:42

@feisty2, @Myrsloik, OK

I only want to say that lossless operations (Trim, Shuffle, Splice, Split, Loop) only rearrange data, and can't be a problem for any kind of data.

But lossy operations (Gain, Mix, Resample, Filter, ...) imply arithmetic operations and must always be done in float format. 32 bit float is enough for audio, with a precision equivalent to 24 bit int (the human ear can only distinguish a precision of up to 20 bits).

The final output must be the user's choice. If only lossless operations are done, the output can be the same as the input and be recoded with lossless formats (FLAC, DTS-MA, ...).
When lossy operations are used, it doesn't make sense to recode to lossless formats, and we don't need to recover the int format except in some cases: we want to burn an audio CD, the encoder doesn't support 32 bit float (strange), ...

MeteorRain

2nd July 2020, 22:02

24 bit audio was stored as 3 bytes stream in AviSynth+.

If you would store it as 32 bit, then how's it any better than actually using a full 32 bit sample?
You get less accuracy for the ... exact same computation cost. And you would think that's the "benefit" of using 24 bit instead of 32 bit data?
And you would pack the output by truncating 1 byte from it, which is, well, exactly the same as packing a full 32 bit audio to 24 bit output.

Sorry, maybe I'm too dumb to get the point.

_Al_

2nd July 2020, 23:22

This is amazing, thanks Myrsloik and all developing, testing.

last VapourSynth64-Portable-R51-audio4.7 and last BestAudioSource.dll,
this works:
VSPipe --wav audio_tests.py - | neroAacEnc -ignorelength -lc -cbr 96000 -if - -of audio.m4a
but, this does not:
VSPipe --wav audio_tests.py - | ffplay.exe -i -
G:\VapourSynth64-Portable-R51-audio4.7>VSPipe --wav __audio_tests.py - | neroAacEnc -ignorelength -lc -cbr 96000 -if - -of audio.m4a
*************************************************************
* *
* Nero AAC Encoder *
* Copyright 2009 Nero AG *
* All Rights Reserved Worldwide *
* *
* Package build date: Feb 18 2010 *
* Package version: 1.5.4.0 *
* *
* See -help for a complete list of available parameters. *
* *
*************************************************************

[mov,mp4,m4a,3gp,3g2,mj2 @ 00000000028F7F40] stream 0, timescale not set
Output 89 frames in 1.15 seconds (77.17 fps)

G:\VapourSynth64-Portable-R51-audio4.7>VSPipe --wav __audio_tests.py - | ffplay.exe -i -
[mov,mp4,m4a,3gp,3g2,mj2 @ 0000000002917F40] stream 0, timescale not set
ffplay version N-95015-gba24b24aab Copyright (c) 2003-2019 the FFmpeg developers
built with gcc 9.2.1 (GCC) 20190918
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
libavutil 56. 35.100 / 56. 35.100
libavcodec 58. 58.101 / 58. 58.101
libavformat 58. 33.100 / 58. 33.100
libavdevice 58. 9.100 / 58. 9.100
libavfilter 7. 58.102 / 7. 58.102
libswscale 5. 6.100 / 5. 6.100
libswresample 3. 6.100 / 3. 6.100
libpostproc 55. 6.100 / 55. 6.100
Input #0, wav, from 'pipe:':aq= 0KB vq= 0KB sq= 0B f=0/0
Duration: N/A, bitrate: 3072 kb/s
Stream #0:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, stereo, flt, 3072 kb/s
SDL_OpenAudio (2 channels, 48000 Hz): WASAPI can't initialize audio client: CoInitialize has not been called.

SDL_OpenAudio (1 channels, 48000 Hz): WASAPI can't initialize audio client: CoInitialize has not been called.

SDL_OpenAudio (2 channels, 44100 Hz): WASAPI can't initialize audio client: CoInitialize has not been called.

SDL_OpenAudio (1 channels, 44100 Hz): WASAPI can't initialize audio client: CoInitialize has not been called.

No more combinations to try, audio open failed
Failed to open file 'pipe:' or configure filtergraph
nan : 0.000 fd= 0 aq= 0KB vq= 0KB sq= 0B f=0/0
Error: fwrite() call failed when writing frame: 1, errno: 32
Output 6 frames in 20.29 seconds (0.30 fps)
script is simple:
import vapoursynth as vs
from vapoursynth import core
a = core.bas.Source('video.mp4')
a.set_output(0)

feisty2

2nd July 2020, 23:23

the same applies to videos you know, apparently the computation cost on 10-bit videos is the same as 16-bit videos, so why hasn't 16-bit become the mainstream video format?
you're still mixing up the intermediate representation and the materialized output.

MeteorRain

3rd July 2020, 00:52

ConvertAudioTo24bit Pending
ConvertAudioTo32bit Not needed, it doesn't make sense upconvert other format to 32 int

The ConvertAudio* family is not only used for materialized output, but is also frequently used for intermediate representation.

Assuming you are objecting to my reply, you'd be thinking that we don't need a 32 bit intermediate format, and that filters should work with 24 bit if they are required to use integers?

tebasuna51

3rd July 2020, 03:38

You get less accuracy for the ... exact same computation cost. And you would think that's the "benefit" of using 24 bit instead of 32 bit data?
And you would pack the output by truncating 1 byte from it, which is, well, exactly the same as packing a full 32 bit audio to 24 bit output.

Sorry, maybe I'm too dumb to get the point.

Sorry, maybe I don't know how to explain the point to you.

I will try:

- There isn't any standard input with better precision than 24 bit int.

- We can't output any standard format with 32 bits of precision.

- The human ear can distinguish a precision of up to 20 bits, so 24 bit int or 32 bit float is more than enough.

- You can use 32 bit int for internal operations (https://forum.doom9.org/showthread.php?p=1916468#post1916468) if you think it is better, but as an audio format it is useless.

- But I recommend using 32 bit float for any arithmetic operations if possible, so overflows can be recovered.
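
The point about recovering overflows can be shown with a boost followed by the inverse attenuation. This is a minimal sketch in plain Python with hypothetical helper names; the integer path saturates at the 16 bit limits, the float path does not:

```python
INT16_MIN, INT16_MAX = -32768, 32767


def gain_int16(samples, gain):
    """Apply gain and saturate to the 16 bit range, as an int pipeline must."""
    return [max(INT16_MIN, min(INT16_MAX, int(round(s * gain))))
            for s in samples]


def gain_float(samples, gain):
    """Apply gain without any range limit."""
    return [s * gain for s in samples]


int_in = [30000, -20000, 10000]
float_in = [s / 32768 for s in int_in]

# Boost by 1.5, then undo the boost.
int_out = gain_int16(gain_int16(int_in, 1.5), 1 / 1.5)
float_out = gain_float(gain_float(float_in, 1.5), 1 / 1.5)

print(int_in, int_out)
print(all(abs(a - b) < 1e-9 for a, b in zip(float_in, float_out)))
```

The first integer sample comes back smaller than it went in because it was clipped in the intermediate step, while the float values return to where they started.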

MeteorRain

3rd July 2020, 11:53

My point was that 32 bit audio is nowhere close to "Not needed, it doesn't make sense upconvert other format to 32 int".

Now, because I was redirected here from the AVS threads, and AVS never had padding on 24 bit audio, we always get 3 byte unaligned samples, hence my point to not use 24 bit at all.

The idea of padding 24 bit audio effectively makes it a 32 bit sample (with less accuracy than true 32 bit, but the rest is exactly the same).
Because if you pad and strip on the LSB instead of the MSB, you automatically get a 32 bit sample.
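
Padding on the LSB side amounts to a shift by 8 bits. A small plain-Python sketch of that idea, with hypothetical helper names:

```python
def int24_to_int32(sample):
    """Pad a signed 24 bit sample with 8 zero bits on the LSB side."""
    return sample << 8


def int32_to_int24(sample):
    """Strip the low 8 bits again (arithmetic shift keeps the sign)."""
    return sample >> 8


for s24 in (-2**23, -1, 0, 1, 2**23 - 1):
    s32 = int24_to_int32(s24)
    print(s24, s32, int32_to_int24(s32))
```

The most negative 24 bit value lands exactly on the most negative 32 bit value, so the padded data already spans the 32 bit range.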

feisty2

3rd July 2020, 12:37

AVS never had padding on 24 bit audio, we always get 3 bytes unaligned samples, hence my point to not use 24 bit at all.

now I'm actually curious how 24-bit audio is handled in avs, since int24 is not a valid integer type

using uint24_t = std::uint8_t[3];

maybe?

MeteorRain

3rd July 2020, 13:11

now I'm actually curious how 24-bit audio is handled in avs, since int24 is not a valid integer type

using uint24_t = std::uint8_t[3];

maybe?

No "using".

Just (uint8_t *).

tebasuna51

3rd July 2020, 14:26

As you can see, there aren't any lossy functions (with arithmetic operations) that support 24 bits; the lossless functions can handle the data as an array of bytes without problems.

Lossy functions can't use 24 bits. Use 32 bit int if you want, but I think with 32 bit float it is easier to control overflows and preserve the required precision.
Also, all decoders of lossy formats supply 32 bit float by default.
In the end we need the option to convert to 24 bit int because there are formats/encoders that need that precision.

tebasuna51

3rd July 2020, 14:49

last VapourSynth64-Portable-R51-audio4.7 and last BestAudioSource.dll,
this works:
...
but, this does not:
...
script is simple:
import vapoursynth as vs
from vapoursynth import core
a = core.bas.Source('video.mp4')
a.set_output(0)

The input is an mp4 with video and audio. At the moment VapourSynth-Portable-R51-audio4.7 only works with audio, so maybe some video-related data is sent to the output.

Try extracting only the audio data and repeat with
a = core.bas.Source('audio.m4a')

NeroAacDec/Enc accept mp4 files with video/audio as input (ignoring the video), so maybe that is why it works fine.

EDIT: I can't reproduce the problem, it works for me:

C:\Test\Vapour>VSPipe.exe --wav zzaud.vpy - | C:\Portable\0\ffplay.exe -i -
ffplay version git-2020-04-03-52523b6 Copyright (c) 2003-2020 the FFmpeg developers
...
Input #0, wav, from 'pipe:':aq= 0KB vq= 0KB sq= 0B f=0/0
Duration: N/A, bitrate: 3072 kb/s
Stream #0:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, stereo, flt, 3072 kb/s
Output 267 frames in 132.73 seconds (2.01 fps)0KB sq= 0B f=0/0
133.39... M-A: 0.000 fd= 0 aq= 0KB vq= 0KB sq= 0B f=0/0

I need to stop the process with Ctrl/C.
The output wav is correct.

_Al_

4th July 2020, 00:40

Try extract the audio data only and repeat with
a = core.bas.Source('audio.m4a')

That produces one more line (yellow warning): [aac @ 0000000002A11000] Could not update timestamps for skipped samples.
But again Nero encoder works and ffplay does not. I downloaded and updated to latest ffplay and still it does not work.
But ffmpeg build with vapoursynth enabled works. Two outputs v.set_output(0) and a.set_output(1) in script. I used Pat357 example:
VSPipe --wav -o 1 audio_tests.py - | ffmpeg -f vapoursynth -i audio_tests.py -i pipe: -map 0:0 -map 1:0 -f AVI -c:v utvideo -pix_fmt yuv420p -colorspace bt709 -c:a pcm_s16le -y ffmpeg.avi
So something about that ffplay.

qyot27

4th July 2020, 02:28

Judging by the SDL_OpenAudio/WASAPI errors, I'd say it's definitely a problem with the SDL2 lib in the build environment of whoever built ffmpeg/ffplay. Maybe it's actually some sort of problem in Windows itself (if it truly is coming out of WASAPI and not SDL2), but I'd see if it's fixed with an ffplay linked against a different build of SDL2.

It works here (build used (http://www.mediafire.com/file/f5xe8bhex020x2z/ffmpeg_r98029%252B8.7z/file)):
E:\Documents>vspipe --wav test.vpy - | ffplay -i -
ffplay version r98029+8 master-fb17ba86a8 HEAD-e0cb5ba49a
contains: avsrgb datetime new_pkgconfig silent_invoke vapoursynth_alt versioninfo
Copyright (c) 2003-2020 the FFmpeg developers
built on Jun 2 2020 01:27:59 with gcc 10.1.0 (GCC)
libavutil 56. 49.100 / 56. 49.100
libavcodec 58. 90.100 / 58. 90.100
libavformat 58. 44.100 / 58. 44.100
libavdevice 58. 9.103 / 58. 9.103
libavfilter 7. 84.100 / 7. 84.100
libswscale 5. 6.101 / 5. 6.101
libswresample 3. 6.100 / 3. 6.100
libpostproc 55. 6.100 / 55. 6.100
[opus @ 00000239FA49FBC0] Could not update timestamps for skipped samples.
Input #0, wav, from 'pipe:':aq= 0KB vq= 0KB sq= 0B f=0/0
Duration: N/A, bitrate: 3072 kb/s
Stream #0:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, stereo, flt, 3072 kb/s
1.90 M-A: 0.000 fd= 0 aq= 385KB vq= 0KB sq= 0B f=0/0
Error: fwrite() call failed when writing frame: 6, errno: 32
Output 11 frames in 2.39 seconds (4.61 fps)
As does setting -o 1 when I have both video and audio and a.set_output is directed to 1.

_Al_

4th July 2020, 04:46

OK, thanks. So, focusing on that error: I have Windows 7,

and googling that error (which I should have done before) led to this (https://stackoverflow.com/questions/46835811/ffplay-wasapi-cant-initialize-audio-client-ffmpeg-3-4-binaries), which helped: setting the SDL_AUDIODRIVER variable to directsound.

All ffplay builds now work.
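
When ffplay is launched from a script rather than a console, the same variable can be set for the child process only. This is a minimal sketch in plain Python; ffplay_environment is a hypothetical helper, and the commented subprocess call is an assumption about how one might start ffplay, not something tested in this thread:

```python
import os


def ffplay_environment(base_env=None, driver="directsound"):
    """Return a copy of the environment with SDL_AUDIODRIVER set."""
    env = dict(os.environ if base_env is None else base_env)
    env["SDL_AUDIODRIVER"] = driver
    return env


env = ffplay_environment()
print(env["SDL_AUDIODRIVER"])

# Hypothetical use, left commented out because it needs ffplay on PATH:
# import subprocess
# subprocess.run(["ffplay", "-i", "-"], stdin=wav_pipe, env=env)
```

In a Windows console the equivalent is running set SDL_AUDIODRIVER=directsound before the VSPipe command line.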

Myrsloik

9th July 2020, 14:05

A bit of an update in case you're not paying attention to the commits:

Development has continued but mostly on creating the next big API revision with several improvements and cleanups. The current state is a bit too unstable and so far adds nothing in terms of functionality to audio and that's why there have been no new builds.

DJATOM

9th July 2020, 21:40

In fact, it's even impossible to compile the python module at the moment: https://pastebin.com/HNjkQYh5

Myrsloik

2nd August 2020, 14:18

The test builds have now resumed after the big api cleanup. Test5 has fixed all known bugs and should be fairly stable. Adding headers in vspipe is now done with -c wav/w64/y4m. The only known bugs are:

vspipe's "--graph simple" option isn't implemented and the logging management functions in python are not working well. Apart from that report anything weird you find.

BestAudioSource needs to be updated again as usual.

DJATOM

2nd August 2020, 14:33

I've got some log in vsedit

2020-08-02 16:30:18.013
API MISUSE! Function 'eedi3' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'eedi3' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'Dilate' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'Dilate' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'Erode' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'Erode' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'Open' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'Open' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'Close' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'Close' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'TopHat' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'TopHat' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'BottomHat' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'BottomHat' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'RemoveGrain' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'RemoveGrain' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'Repair' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'Repair' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'Clense' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'Clense' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'ForwardClense' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'ForwardClense' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'BackwardClense' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'BackwardClense' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'VerticalCleaner' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'VerticalCleaner' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'Vinverse' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'Vinverse' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'VFM' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'VFM' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'VDecimate' failed to register with error: Argument 'clip' has invalid type 'clip'.
API MISUSE! Function 'VDecimate' failed to register with error: Argument 'clip' has invalid type 'clip'.
2020-08-02 16:30:18.446
Script was successfully evaluated. Output video info:
Frames: 555 | Time: 0:00:23.148 | Size: 1920x1080 | FPS: 24000/1001 = 23.976 | Format: YUV420P8
Probably due to https://github.com/vapoursynth/vapoursynth/blob/doodle1/src/filters/vivtc/vivtc.c#L1623, which must be a vnode type.

Myrsloik

2nd August 2020, 14:42

Doh, test5 now sneakily updated AGAIN!

Pat357

5th August 2020, 18:16

How can I create a list of all functions for the VS_audio_test5 ?

Here is what I get :
d:\programs\VapourSynth64-Portable-R51-audio5\VapourSynth64Portable\VapourSynth64>python
Python 3.8.2 (tags/v3.8.2:7b3ab59, Feb 25 2020, 23:03:10) [MSC v.1916 64 bit (AMD64)] on win32
>>> import vapoursynth as vs
>>> core = vs.core
>>> print(core.list_functions())
<stdin>:1: DeprecationWarning: list_functions() is deprecated. Use "plugins()" instead.
<stdin>:1: DeprecationWarning: get_plugins() is deprecated. Use "plugins()" instead.

After this, I never get the >>> prompt back and it seems like python is hanging. Also no ctrl+Z to exit.
Only way was killing python.exe in taskman.

Myrsloik

5th August 2020, 19:30

lansing

9th August 2020, 06:24

Test 5 is not releasing memory. I ran a script in vseditor, then closed the tab, but the memory didn't get released.

Myrsloik

9th August 2020, 19:28

Test5 is updated again. This time it fixes the missing mimalloc dlls which prevented it from working outside vspipe most of the time.

tebasuna51

10th August 2020, 12:45

Thanks for your updates.

I have time to test now and I want to explain the changes I see.

1) The command line I use now is:

VSPipe.exe -c wav zzaud.vpy zzD1sec.wav

Before, vspipe needed --wav

The zzaud.vpy I use is:
import vapoursynth as vs
from vapoursynth import core
c = core.bas.Source(r"C:\Test\Vapour\zzaud_.wav")
#
NumSamples = len(c)
SampleRate = c.sample_rate
MaskChannels = c.channel_layout # before was = c.format.channel_layout etc.
NumChannels = c.num_channels
BitDepth = c.bits_per_sample
SampleFormat = c.sample_type
TimeLength = NumSamples/SampleRate
print("SampleRate: %5d, BitDepth: %2d, %s" % (SampleRate, BitDepth, SampleFormat))
print("MaskChannels: %3X (%4d), TimeLength: %9.3f seconds" % (MaskChannels, MaskChannels, TimeLength))
#
delay = core.std.AudioTrim(c,1,48000)
delay = core.std.AudioGain(delay,0)
c = delay + c
NumSamples = len(c)
TimeLength = NumSamples/SampleRate
print("MaskChannels: %3X (%4d), TimeLength: %9.3f seconds" % (MaskChannels, MaskChannels, TimeLength))
#
c.set_output()

2) Properties are now accessed in a different way

And I obtain this output:
Warning: Use logging.info instead of print.
Information: SampleRate: 48000, BitDepth: 32, SampleType.FLOAT
Information: SampleRate: 48000, BitDepth: 32, SampleType.FLOAT
Information: MaskChannels: 3F ( 63), TimeLength: 12.000 seconds
Information: SampleRate: 48000, BitDepth: 32, SampleType.FLOAT
Information: MaskChannels: 3F ( 63), TimeLength: 12.000 seconds
Information: MaskChannels: 3F ( 63), TimeLength: 13.000 seconds
Output 624000 samples in 0.04 seconds (14269777.41 sps)

3) How do I use logging.info instead of print?
I get the messages, but some are duplicated and have the undesired prefix "Information: "

- As you can see, I try to emulate a delay of 48000 samples, or 1 second. Maybe that could be abbreviated with:
c = core.std.AudioTrim(-48000)

- In /doc I don't see any changes from the previous version.
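
The sample counts behind the one second delay in the script above can be checked without VapourSynth. This is a plain-Python sketch on ordinary lists, with hypothetical helper names; it mirrors the idea of splicing a silent segment in front of the clip:

```python
def delay_in_samples(seconds, sample_rate):
    """Convert a delay in seconds to a whole number of samples."""
    return int(seconds * sample_rate)


def prepend_silence(samples, num_silent):
    """Return a new list with `num_silent` zero samples in front."""
    return [0.0] * num_silent + list(samples)


sample_rate = 48000
clip = [0.5] * (12 * sample_rate)  # stands in for the 12 second source
delayed = prepend_silence(clip, delay_in_samples(1.0, sample_rate))

print(len(clip), len(delayed), len(delayed) / sample_rate)
```

The delayed length matches the 624000 samples and 13.000 seconds reported in the log above.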

ChaosKing

10th August 2020, 12:55

With a video clip you can also trim like this: clip = clip[:-30] # remove last 30 frames. Maybe it works with audio too.