Doom9's Forum > General > Audio encoding

1st December 2024, 20:58   #1
jay123210599
Registered User
 
Join Date: Apr 2024
Posts: 305
Pitch Changers

How do I change the pitch of a regular video to match that of an Italian video? For example, I want the pitch of this video to sound like the pitch of this video.

Source video:
https://www.mediafire.com/file/fu34q...ample.mp4/file

Last edited by jay123210599; 7th December 2024 at 07:35.
8th December 2024, 00:47   #2
FranceBB
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 3,156
I think you're better off asking either in the audio section or in the Avisynth section next time, but given that everyone overlooked this and you asked me for help privately, I'll try to reply.
The U.S. version has the original English audio, while the Italian version has the Italian dubbing applied to it (fun fact: I can understand them both as I speak both languages, but I've never actually seen Rocky: the first one came out in 1976, and even though this one is Rocky 4, which came out in 1985, I wasn't even born yet).
Anyway, the point I'm trying to make is that the voices are gonna sound different no matter what, but we can take the ambient sound (i.e. music and effects) as a reference.

Now, interestingly, the Italian version at 25 fps didn't go through the 4% speed-up from 23.976 fps to 25 fps; instead, it was converted by duplicating 1 frame every 24.
Indexing the source will show that you have repeated frames at:

44 ok
45 ok
46 dup
47 ok
48 ok
49 ok
50 ok
51 ok
52 ok
53 ok
54 ok
55 ok
56 ok
57 ok
58 ok
59 ok
60 ok
61 ok
62 ok
63 ok
64 ok
65 ok
66 ok
67 ok
68 ok
69 ok
70 ok
71 dup
72 ok

all the way through to the end.
Sure enough, 71 - 46 = 25: each run of 25 output frames contains 24 unique frames plus one repeat, which is the extra frame needed to go from 23.976 fps (i.e. 24) to 25 fps. So there you have it: one repeated frame every 24.
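As a quick sanity check on that cadence, here is a minimal Python sketch. It is not part of the workflow in this post, and the function name `dup_spacing` is made up for illustration. It takes the duplicate positions listed above, works out the spacing between them, and confirms that removing one repeat per cycle from a nominal 25 fps stream leaves 24 unique frames per second.

```python
from fractions import Fraction

def dup_spacing(dup_frames):
    """Return the distinct gaps between consecutive duplicated frame numbers."""
    return {b - a for a, b in zip(dup_frames, dup_frames[1:])}

# Duplicate positions reported above for the Italian (25 fps) clip.
it_dups = [46, 71]

cycle = dup_spacing(it_dups).pop()   # output frames per cycle
unique_per_cycle = cycle - 1         # one frame in each cycle is a repeat

# Assumption: the container rate is a nominal 25 fps.
nominal_output = Fraction(25)
unique_rate = nominal_output * Fraction(unique_per_cycle, cycle)

print(cycle, unique_per_cycle, unique_rate)  # 25 24 24
```

A single, constant gap is what you would expect from a plain frame-repeat conversion; a sped-up transfer would show no regular duplicates at all.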

Quote:
LWLibavVideoSource("D:\ROCKY IV - (LA MORTE DI APOLLO CREED CONTRO IVAN DRAGO) (240p_25fps_H264-128kbit_AAC).mp4")
Crop(0, 32, -0, -36)
Spline64Resize(320, 180)
ShowFrameNumber(scroll=true)
Here's the little sequence going from frame 44 to frame 47 where 46 is the duplicated one (i.e identical to frame 45):





and the same goes for the frame 69 to 72 sequence where 71 is the duplicated frame and is identical to frame 70:






On the other hand, the U.S. version is 29.970 fps with 3:2 pulldown applied, which means that you're gonna have dups (this is generally done in interlaced land by repeating fields, but here we're in progressive land, so whole frames are repeated). This is because it was originally 23.976 fps and has been telecined to 29.970: one frame in every four is repeated, so 4 source frames become 5 output frames, a standard technique still very much in use today.

4 ok
5 ok
6 ok
7 ok
8 dup
9 ok
10 ok
11 ok
12 ok
13 dup
14 ok
15 ok
16 ok
17 ok
18 dup
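The same arithmetic can be sketched for the US clip. This is only an illustration of the numbers, not of how TDecimate works internally; `decimated_rate` is a hypothetical helper, and it assumes that "29.970" and "23.976" stand for the exact rates 30000/1001 and 24000/1001.

```python
from fractions import Fraction

def decimated_rate(rate, cycle, drop=1):
    """Frame rate left after dropping `drop` frames out of every `cycle`."""
    return rate * Fraction(cycle - drop, cycle)

# Duplicate positions reported above for the US (29.970 fps) clip.
us_dups = [8, 13, 18]
cycle = us_dups[1] - us_dups[0]      # one repeat every 5 output frames

# Assumption: 29.970 fps is exactly 30000/1001.
ntsc = Fraction(30000, 1001)
film = decimated_rate(ntsc, cycle)

print(cycle, film)  # 5 24000/1001
```

Dropping one frame in five lands exactly on 24000/1001, the same rate MediaInfo reports for the final file further down.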

Here's the sequence where frames 6 and 7 are fine, frame 8 is the duplicate (identical to frame 7), and frame 9 is fine again:






We can then remove the duplicated frames from both versions to bring them back to 23.976 fps and align them:

Quote:
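# US version: 29.970 fps with one repeated frame in every five
LWLibavVideoSource("D:\Apollo Creed’s Death Scene (720p_30fps_H264-128kbit_AAC).mp4")
SinPowerResize(320, 180)
# Drop the repeated frames to get back to 23.976 fps
TDecimate(mode=2, rate=23.976)
us_version=last

# Italian version: 25 fps with one repeated frame in every 25
LWLibavVideoSource("D:\ROCKY IV - (LA MORTE DI APOLLO CREED CONTRO IVAN DRAGO) (240p_25fps_H264-128kbit_AAC).mp4")
# Crop 32 lines from the top and 36 from the bottom, then match the 320x180 size
Crop(0, 32, -0, -36)
Spline64Resize(320, 180)
TDecimate(mode=2, rate=23.976)
# Skip the first 3456 frames so that both clips start at the same point
trim(3456, 0)
it_version=last

# Show the two versions side by side
StackHorizontal(us_version, it_version)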
We can see that they both come from the original 23.976 fps version and that the PAL version wasn't subject to any speed-up, as the two stay in step at every scene change, for example:




So now we only need to put the Italian dubbed track on the higher resolution US remastered version and the job is done.
Also, to test this theory further, we can isolate the bell that is hit at the beginning of the match and we can hear that in both the Italian and the English dubbing the "ding" is the same.
Here you can find three files:

- Rocky_IV_EN_Bell.wav
- Rocky_IV_IT_Bell.wav
- Rocky_IV_IT_Bell_PitchBad.wav

https://we.tl/t-MOiSTWgAXy
(link expires in 3 days)

The first was obtained by trimming the original US source and applying Normalize(0.24) to get proper audio levels; the second by trimming the IT source, again with Normalize(0.24); and the third is what the Italian track would sound like if we were to erroneously apply a pitch adjustment to it despite it not having been sped up.
Once again, this is reflected in the spectrograms that we can generate with Spek:





As you can see, the EN Bell and the IT Bell are almost exactly the same, but if we were to erroneously apply a pitch correction using TimeStretch(pitch=96) to the Italian track as if it were subject to a 4% speed up, then we would get a completely wrong result. Sure, I understand that they don't sound the same, but to me that's more due to degradation in the Italian version which is not only older but it also went through multiple processing steps down the line. In other words, I don't think this is related to any framerate conversion, but rather to degradation and we can't do much here.
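To put numbers on the 4% figure, here is a rough Python sketch. It assumes that the pitch value passed to TimeStretch is a percentage where 100 means unchanged, which is how pitch=96 is used above; the function names are made up for illustration.

```python
import math

def speedup_semitones(src_fps, dst_fps):
    """Pitch shift, in semitones, caused by playing src_fps material at dst_fps."""
    return 12 * math.log2(dst_fps / src_fps)

def correction_percent(src_fps, dst_fps):
    """Pitch percentage that would undo that shift (100 = unchanged)."""
    return 100 * src_fps / dst_fps

shift = speedup_semitones(23.976, 25.0)
fix = correction_percent(23.976, 25.0)

# Roughly 0.72 semitones up, undone by a pitch of about 95.9 (rounded to 96).
print(round(shift, 2), round(fix, 1))
```

So a genuine PAL speed-up raises everything by a bit less than three quarters of a semitone. Applying the 96% correction to a track that was never sped up lowers it by the same amount, which is why the third bell sample looks wrong in the spectrogram.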

So, bottom line is: we don't need to change the pitch, just get rid of the dups, normalize the audio and produce the final output.

Quote:
x264.exe "D:\Rocky_IV_EN.avs" --crf 22 --preset medium --profile High --level 4.1 --ref 4 --deblock -1:-1 --overscan show --range tv --log-level info --thread-input --opencl --colormatrix bt709 --transfer bt709 --colorprim bt709 --videoformat component --aud --nal-hrd vbr --vbv-maxrate 25000 --vbv-bufsize 25000 --output "I:\temp\raw_video.h264"

Bepipe.exe --script "Import(^D:\Rocky_IV_EN.avs^)" | neroAacEnc.exe -lc -br 320000 -if - -of "I:\temp\audio1.m4a"

Bepipe.exe --script "Import(^D:\Rocky_IV_IT.avs^)" | neroAacEnc.exe -lc -br 320000 -if - -of "I:\temp\audio2.m4a"

MP4Box.exe -add "I:\temp\raw_video.h264" -add "I:\temp\audio1.m4a" -add "I:\temp\audio2.m4a" "I:\temp\final_output.mp4"


pause

and here is our final file:
https://we.tl/t-7x6RPYvWxI
(link valid for 3 days)

Quote:
General
Complete name : /home/FranceBB/Share Windows Linux/temp/Rocky IV - Apollo Creed vs Ivan Drago HD (Eng Dub & Ita Dub).mp4
Format : MPEG-4
Format profile : Base Media
Codec ID : isom (isom/avc1)
File size : 82.5 MiB
Duration : 5 min 31 s
Overall bit rate mode : Variable
Overall bit rate : 2 086 kb/s
Frame rate : 23.976 FPS
Encoded date : 2024-12-07 23:30:24 UTC
Tagged date : 2024-12-07 23:30:24 UTC
Writing application : MP4Box

Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L4.1
Format settings : CABAC / 4 Ref Frames
Format settings, CABAC : Yes
Format settings, Reference frames : 4 frames
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 5 min 31 s
Bit rate mode : Variable
Bit rate : 1 571 kb/s
Maximum bit rate : 25.0 Mb/s
Width : 1 280 pixels
Height : 720 pixels
Display aspect ratio : 16:9
Frame rate mode : Constant
Frame rate : 23.976 (24000/1001) FPS
Original frame rate : 23.976 (23976/1000) FPS
Standard : Component
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.071
Stream size : 62.0 MiB (75%)
Title : raw_video.h264
Writing library : x264 core 133 r2334 a3ac64b
Encoding settings : cabac=1 / ref=4 / deblock=1:-1:-1 / analyse=0x3:0x113 / me=hex / subme=7 / psy=1 / psy_rd=1.00:0.00 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=-2 / threads=6 / lookahead_threads=1 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=250 / keyint_min=23 / scenecut=40 / intra_refresh=0 / rc_lookahead=40 / rc=crf / mbtree=1 / crf=22.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / vbv_maxrate=25000 / vbv_bufsize=25000 / crf_max=0.0 / nal_hrd=vbr / ip_ratio=1.40 / aq=1:1.00
Encoded date : 2024-12-07 23:30:24 UTC
Tagged date : 2024-12-07 23:30:41 UTC
Color range : Limited
Color primaries : BT.709
Transfer characteristics : BT.709
Matrix coefficients : BT.709
Codec configuration box : avcC

Audio #1
ID : 2
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 5 min 31 s
Bit rate mode : Variable
Bit rate : 203 kb/s
Maximum bit rate : 217 kb/s
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 44.1 kHz
Frame rate : 43.066 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 8.02 MiB (10%)
Title : audio1.aac
Encoded date : 2024-12-07 23:30:34 UTC
Tagged date : 2024-12-07 23:30:41 UTC

Audio #2
ID : 3
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 5 min 31 s
Bit rate mode : Variable
Bit rate : 311 kb/s
Maximum bit rate : 370 kb/s
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 44.1 kHz
Frame rate : 43.066 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 12.3 MiB (15%)
Title : audio2.aac
Encoded date : 2024-12-07 23:30:38 UTC
Tagged date : 2024-12-07 23:30:41 UTC
8th December 2024, 11:20   #3
tebasuna51
Moderator
Join Date: Feb 2005
Location: Spain
Posts: 7,160
@FranceBB, I see you use the old Bepipe.exe, but it doesn't work for me.
With a rocen.avs containing only:
Quote:
LWLibavAudioSource("Apollo Creed's Death Scene.mp4")
The command line:
Quote:
Bepipe --script "Import(^rocen.avs^)" | neroAacEnc -lc -br 320000 -if - -of "rocen.m4a"
always crashes with:
Quote:
...
ERROR: Can't find audio stream!
...
The latest avs2pipemod works fine for me:
Quote:
avs2pipemod -wav rocen.avs | neroAacEnc.exe -lc -br 320000 -if - -of "rocen.m4a"
...
avs2pipemod [info]: A2PM_VERSION 1.1.2
avs2pipemod [info]: writing 331.186 seconds of 44100 Hz, 2 channels, 32 Bits [channelmask=3] audio stream.
avs2pipemod [info]: wrote 331.186 seconds [100%]rocessed 331 seconds...
avs2pipemod [info]: total elapsed time is 6.877 sec.
__________________
BeHappy, AviSynth audio transcoder.
8th December 2024, 16:01   #4
FranceBB
Yeah, it actually crashes on my Windows 10 Enterprise, but I was on a temporary Windows XP computer last night where I had old versions of everything (including my BAT files) and it worked (note the severely outdated x264 build too).
On Win10, however, I generally go through FFmpeg to generate a physical PCM .wav file and then pass it to the Nero AAC encoder, which is a bit of a bummer.
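For reference, that two-step route could be scripted roughly as below. This is a sketch under assumptions, not my actual BAT file: the file names are placeholders, 16-bit PCM (`pcm_s16le`) is just one possible choice, and the neroAacEnc options are simply the ones already used in the commands earlier in this thread. Nothing is executed unless `run_all` is called.

```python
import subprocess

def wav_then_nero_commands(source, wav, m4a, bitrate=320000):
    """Build the two command lines: FFmpeg to PCM WAV, then neroAacEnc."""
    # -vn drops the video stream, -c:a pcm_s16le writes 16-bit PCM audio.
    decode = ["ffmpeg", "-i", source, "-vn", "-c:a", "pcm_s16le", wav]
    # Same encoder options as the Bepipe/avs2pipemod examples in this thread.
    encode = ["neroAacEnc", "-lc", "-br", str(bitrate), "-if", wav, "-of", m4a]
    return decode, encode

def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)

# Placeholder file names, for illustration only.
decode, encode = wav_then_nero_commands("input.mp4", "temp.wav", "output.m4a")
print(" ".join(decode))
print(" ".join(encode))
```

The downside is the intermediate .wav on disk, which piping through avs2pipemod avoids.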
I actually missed the new avs2pipemod release, but I'll test it out first thing tomorrow, when I'm on my *real* PC!
8th December 2024, 16:27   #5
GeoffreyA
Registered User
Join Date: Jun 2024
Location: South Africa
Posts: 236
Quote:
Originally Posted by FranceBB
I generally go through FFMpeg to generate a physical PCM .wav file and then pass it to the NeroAACEncoder
At 320 kbps, they'll all be transparent, but generally speaking, Apple's CoreAudio, by way of qaac, tends to lead among all the AAC-LC encoders, beating even FDK. And qaac, as a frontend, works like a dream
8th December 2024, 17:10   #6
FranceBB
Quote:
Originally Posted by GeoffreyA
At 320 kbps, they'll all be transparent,
For that specific crappy source, yes, but for other, more complex material encoded from a real lossless 24-bit 48000 Hz PCM stream, not quite.

Quote:
Originally Posted by GeoffreyA
generally speaking, Apple's CoreAudio, by way of qaac, tends to lead
Yeah, my personal ranking, from best to worst in terms of audio quality per bit, is:

- Apple AAC Encoder
- Fraunhofer FDK AAC Encoder
- Nero AAC Encoder
- Libavcodec AAC Encoder

The only problem is that, realistically, at work I can only use libavcodec's AAC encoder, 'cause the Nero AAC one is proprietary and licensed, the Fraunhofer one is also non-free and can't be used commercially, and you definitely can't use the Apple one. So all the clips that are sent to the web are encoded with libavcodec's AAC encoder, which is FFmpeg's open-source encoder. That being said, it has actually got a bit better over the last few years, but it still has a very long way to go...
8th December 2024, 17:25   #7
GeoffreyA
I understand. In your case, at work, you're severely limited in what you can use, and the only option is FFmpeg's encoder, which isn't disastrous any more but far from ideal. It is sad, though, because I think most AAC in the world, in videos, has been encoded with this encoder, and arguably, even LAME beats it. If only FDK had had a different licence, perhaps the world would have better-encoded AAC on the whole.
8th December 2024, 23:18   #8
jay123210599
@FranceBB I downloaded the file you gave me and the result is not what I'm looking for. I want the pitch from the US video to be the one from the Italian version (e.g. make a video with English voices with the Italian pitch).
8th December 2024, 23:25   #9
FranceBB
Quote:
Originally Posted by jay123210599
I want the pitch from the US video to be the one from the Italian version (e.g. make a video with English voices with the Italian pitch).
There's no Italian pitch. The Italian version didn't go through the 4% speed-up process: they took the original 23.976 fps US version, repeated one frame in every 24 to get to 25 fps and put the Italian dubbing on it, hence we can't "correct" what isn't wrong in the first place. The fact that you hear things differently, in this case, isn't due to the speed-up but rather to the degradation in the Italian source. In other words, the sound isn't different because the pitch was raised by a speed-up, but because the analog source went through degradation before it was transferred to digital form.

Still, if you wanna play with the pitch, you can use TimeStretch(pitch=xx).
In the example, for instance, I used:

TimeStretch(pitch=96)

That was supposed to bring the Italian dubbing back to the original US pitch, assuming it had gone through the 4% speed-up process. It hadn't, so all it did was create a completely different output, 'cause, again, the Italian dubbing didn't go through the speed-up, which you can also see by checking the duplicated frames.

Last edited by FranceBB; 8th December 2024 at 23:28.
8th December 2024, 23:45   #10
jay123210599
How am I supposed to bring the sounds of the Italian dub to the English one, then?
8th December 2024, 23:58   #11
FranceBB
Quote:
Originally Posted by jay123210599
How am I supposed to bring the sounds of the Italian dub to the English one, then?
Given that it's caused by actual analog degradation, I have no idea; I'll let others more expert than me on audio reply. There's probably a way to simulate that digitally, but I wouldn't know how.
9th December 2024, 10:53   #12
tebasuna51
As FranceBB demonstrated, and as I also verified, both audio tracks have the same length, so the audio was not stretched to sync with the different video frame rates (both versions play for the same duration because they simply have fewer or more duplicated frames).
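That length check can be sketched in a few lines of Python. The 331.186 seconds is the length avs2pipemod reported for the US audio earlier in the thread; the 331.0 seconds used for the Italian track is only an illustrative value based on the "5 min 31 s" in the MediaInfo report, and the function names and the one-second tolerance are made up for this example.

```python
def sped_up_length(seconds, src_fps=23.976, dst_fps=25.0):
    """Length the track would have after a src_fps -> dst_fps speed-up."""
    return seconds * src_fps / dst_fps

def looks_sped_up(reference_s, candidate_s, tolerance_s=1.0):
    """True if candidate_s matches the sped-up length of the reference track."""
    return abs(candidate_s - sped_up_length(reference_s)) <= tolerance_s

us_seconds = 331.186   # from the avs2pipemod log above
it_seconds = 331.0     # illustrative value, see the note above

print(round(sped_up_length(us_seconds), 2))   # 317.62
print(looks_sped_up(us_seconds, it_seconds))  # False
```

A 4% speed-up would have shortened this clip by more than 13 seconds, so two tracks of the same length rule it out.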

I can't understand how you hear a different pitch between the original voices and the Italian dubbing actors; of course they can differ, because they are different people.
Nor can I understand why you would want to modify the good English audio with something from the noisier, worse Italian audio.
13th December 2024, 03:59   #13
jay123210599
Pitch Changers

In this video, the pitch in the scene where Apollo is fatally wounded changes twice. How do I make the pitch from this video match the ones from the previous video?
13th December 2024, 11:28   #14
tebasuna51
You already have your answers.

You are at risk of being banned if you continue to repeat questions.

Please read the forum rules.

Last edited by tebasuna51; 13th December 2024 at 11:34.
13th December 2024, 13:36   #15
jay123210599
I already tried the different scripts given here and none of them matched the two pitches I wanted.
14th December 2024, 10:49   #16
tebasuna51
Because both have the same pitch (music and noise), and the dialog pitch can't be the same because different people are talking.

BTW, many players have bass/treble controls. For permanent changes, try the SoX filters.

Last edited by tebasuna51; 14th December 2024 at 10:53.
20th December 2024, 18:05   #17
jay123210599
Alright, how do I change the sound then to match these two points?

Last edited by jay123210599; 20th December 2024 at 18:10.
20th December 2024, 20:12   #18
tebasuna51
Match points? I don't understand.
23rd December 2024, 15:08   #19
jay123210599
I mean the audio changes at those two points in this video.
24th December 2024, 09:39   #20
tebasuna51
Sorry, I can't hear the difference.