Doom9's Forum > General > Subtitles
Old 3rd December 2024, 00:21   #1  |  Link
Emulgator
Big Bit Savings Now !
Join Date: Feb 2007
Location: close to the wall
Posts: 1,915
Subtitle Edit Guides & Tips

Subtitle Edit proxy files and their handling, based on SE 4.0.8 behaviour (end of 2024)
A guide to manipulating proxy files by hand while pruning an SE subtitle project, here for the "Shotchanges" feature
https://github.com/Flitskikker/SubTi...ngs-Beautifier
SE's proxy file names appear to be a 16-digit hex value, possibly derived from a hash of the audio/video file.
Plus a checksum of the SE edition?
Guesswork for now; a study of the source code will tell.

An example: 6316127b6e514232.<extension> (from here on, * stands for 6316127b6e514232)
When an audio track (or a video track with an audio track) is loaded for the first time, a proxy for that audio track is derived:
a .wav downconverted to stereo 16bit @ fs=150Hz, some 128Hz, max. 90min
On my noinstall standalone (.zip unpacked) this file is saved as C:\_PROG\! Subtitle Tools\SE408\Waveforms\*-0.wav
On an installation it is saved as C:\Users\<username>\AppData\Roaming\Subtitle Edit\Waveforms\*-0.wav
The suffix -0 is the audio track index of the first audio track found.
(SE versions before ~January 2023 seemed to work only on the first audio stream found, so they neither append nor respect a track index.
Later SE versions, which can load more than one audio track, would need to append and respect that track index.)

If there is more than one audio track and another track is selected:
That track is parsed next, generating a new file with the same name plus the track's suffix:
-1 -> 6316127b6e514232-1.wav, -2 -> 6316127b6e514232-2.wav, ...
After parsing, the active .wav is rendered in the upper section of the timeline window.

When an audio track is loaded for the first time, FFT spectrograms are derived as well: individual .gif files 0.gif - xyz.gif (max. 512 ?), each 1024pix wide and 128pix high
On my noinstall standalone (.zip unpacked) these .gif files are saved into a track folder under C:\_PROG\! Subtitle Tools\SE408\Spectrograms\ as 0.gif - xyz.gif
On an installation they are saved into a track folder under C:\Users\<username>\AppData\Roaming\Subtitle Edit\Spectrograms\ as 0.gif - xyz.gif
The folder follows the same naming convention: 6316127b6e514232
If there is more than one audio track and another track is selected:
That track is parsed next, generating a new folder with the same name plus a suffix
matching the .wav name, like 6316127b6e514232-1, 6316127b6e514232-2...
After parsing, the .gifs of the active audio stream are rendered tile by tile in the lower section of the timeline window.

In each folder, an Info.xml describes the content:
<SpectrogramInfo>
<SampleDuration>0.00533333333333333</SampleDuration>
<NFFT>256</NFFT>
<ImageWidth>1024</ImageWidth>
<SecondsPerImage>5.46133333333333</SecondsPerImage>
</SpectrogramInfo>

(values may vary; in 2024 the chunk length doubled)

<SpectrogramInfo>
<SampleDuration>0.0106666666666667</SampleDuration>
<NFFT>256</NFFT>
<ImageWidth>1024</ImageWidth>
<SecondsPerImage>10.9226666666667</SecondsPerImage>
</SpectrogramInfo>
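
The numbers in both Info.xml variants are consistent with SecondsPerImage = SampleDuration x ImageWidth (0.0106666666666667 x 1024 = 10.9226666666667). Here is a minimal Python sketch that reads such a file and works out which tile and pixel column cover a given time. The function names are mine, and it assumes the tiles are numbered from 0.gif and laid end to end, which I have not checked against the SE source.

```python
import xml.etree.ElementTree as ET

# Sample content copied from the second Info.xml shown above.
INFO_XML = """<SpectrogramInfo>
<SampleDuration>0.0106666666666667</SampleDuration>
<NFFT>256</NFFT>
<ImageWidth>1024</ImageWidth>
<SecondsPerImage>10.9226666666667</SecondsPerImage>
</SpectrogramInfo>"""


def read_info(xml_text):
    """Parse the four SpectrogramInfo fields into a dict of numbers."""
    root = ET.fromstring(xml_text)
    return {
        "sample_duration": float(root.findtext("SampleDuration")),
        "nfft": int(root.findtext("NFFT")),
        "image_width": int(root.findtext("ImageWidth")),
        "seconds_per_image": float(root.findtext("SecondsPerImage")),
    }


def tile_for_time(info, seconds):
    """Return (tile index, pixel column) for a time in seconds.

    Assumption: tile N covers the span starting at N * SecondsPerImage.
    """
    index = int(seconds // info["seconds_per_image"])
    offset = seconds - index * info["seconds_per_image"]
    column = int(offset / info["sample_duration"])
    return index, min(column, info["image_width"] - 1)


info = read_info(INFO_XML)
print(tile_for_time(info, 11.0))
```

With the values above, a time of 11.0 s falls just past the end of the first tile, so it lands in 1.gif a few columns in.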

Frame timecodes are derived during "Beautify Timecodes -> Extract Timecodes", which parses the video stream for all frames using FFprobe or FFmpeg.
The resulting text file lists all video frames found, in the format ssss.mmm (extended to ssss.mmmµµµ from 2024 on)
On my noinstall standalone (.zip unpacked) this is saved as C:\_PROG\! Subtitle Tools\SE408\TimeCodes\*.timecodes
On an installation this is saved as C:\Users\<username>\AppData\Roaming\Subtitle Edit\TimeCodes\*.timecodes
The same naming convention is used: 6316127b6e514232.timecodes
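
For anyone who wants to check a .timecodes file outside SE, here is a small Python sketch. It assumes one timestamp per line, in seconds with a dot as decimal separator, which is how I read the ssss.mmm format; the function names are hypothetical and not taken from SE.

```python
from bisect import bisect_left


def parse_timecodes(text):
    """Parse one timestamp per line (seconds with a fraction) into floats."""
    return [float(line) for line in text.splitlines() if line.strip()]


def nearest_frame(timecodes, seconds):
    """Return the frame timestamp closest to the given time.

    The list must be sorted ascending and must not be empty.
    """
    pos = bisect_left(timecodes, seconds)
    candidates = timecodes[max(pos - 1, 0):pos + 1]
    return min(candidates, key=lambda t: abs(t - seconds))


# Made-up sample for a 25 fps clip, not real SE output.
sample = "0.000\n0.040\n0.080\n"
frames = parse_timecodes(sample)
print(nearest_frame(frames, 0.05))
```

The same parser handles the longer ssss.mmmµµµ form, since both are plain decimal seconds.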

Shotchanges are derived by "Beautify Timecodes -> Generate/Import Shotchanges", which parses the video stream again, using FFprobe or FFmpeg, for all frame differences above a certain threshold.
The resulting text file lists the Shotchange frame candidates in the format ssss.mmm
The same naming convention is used: 6316127b6e514232.shotchanges
On my noinstall standalone (.zip unpacked) this file is saved as C:\_PROG\! Subtitle Tools\SE408\ShotChanges\*.shotchanges
On an installation this file is saved as C:\Users\<username>\AppData\Roaming\Subtitle Edit\ShotChanges\*.shotchanges

If the automated Shotchange result leaves something to be desired, it is easy to manipulate:
Rename the first generated file to something like 6316127b6e514232.shotchanges.a
Lower the threshold to 0.10, parse again, and obtain a different 6316127b6e514232.shotchanges.
Rename this second generated file to something like 6316127b6e514232.shotchanges.b
Lower the threshold to 0.06, parse again, and obtain yet another 6316127b6e514232.shotchanges.
Choose the one you like best, and if it almost fits, edit it by hand.
Finally, swap in the desired result by simply renaming it to 6316127b6e514232.shotchanges
At the moment "Import" seems not to be implemented in SE and, because of the naming convention, cannot be.
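
The renaming steps above can be scripted. Below is a rough Python sketch under my own assumptions: the helper names are made up, and unlike the manual procedure it copies the chosen variant back instead of renaming it, so the .a/.b files stay available for later comparison. Close SE, or at least the video, before swapping files.

```python
import shutil
from pathlib import Path


def stash_variant(shot_dir, file_hash, label):
    """Rename <hash>.shotchanges to <hash>.shotchanges.<label>."""
    current = Path(shot_dir) / f"{file_hash}.shotchanges"
    stashed = current.with_name(f"{current.name}.{label}")
    current.rename(stashed)
    return stashed


def activate_variant(shot_dir, file_hash, label):
    """Copy a stashed variant back to the name SE looks for."""
    stashed = Path(shot_dir) / f"{file_hash}.shotchanges.{label}"
    active = Path(shot_dir) / f"{file_hash}.shotchanges"
    shutil.copyfile(stashed, active)
    return active
```

Typical use: call stash_variant(folder, "6316127b6e514232", "a") after the first parse, "b" after the second, then activate_variant(folder, "6316127b6e514232", "b") for the one you prefer.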

Whether the various SE versions generate the same unique filename from the same AV file, or additionally mix in their version fingerprint: further tests or a study of the source code will tell.
The filename theory (video file hash ? checksum ?) has been tested successfully as follows:
Prepare a virgin unpack of SE.
Fill the helper folders with the helper files from the SE version that made them.
Now load the video file alone, without any subtitle file.
The virgin SE will search for matching helper files in their expected folders.
Being a matching SE version, it will calculate the video checksum or hash the same way.
If it finds matching files, it will populate the timeline window automatically and immediately with the given waveforms, FFT spectrograms and shotchanges, without generating them anew.
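
Before running that test it helps to see which helper files already exist for a given name. A minimal Python sketch, assuming the four folder names described above (Waveforms, Spectrograms, TimeCodes, ShotChanges) sit under one SE data folder; the function name is hypothetical. It uses wildcards for the track suffix, so it catches both the suffixed and the legacy non-suffixed waveform names.

```python
from pathlib import Path


def find_helper_files(data_dir, file_hash):
    """List the helper files/folders present for one 16-digit hex name."""
    root = Path(data_dir)
    return {
        # *-0.wav, *-1.wav, ... and the legacy non-suffixed *.wav
        "waveforms": sorted(root.glob(f"Waveforms/{file_hash}*.wav")),
        # one folder of .gif tiles per audio track
        "spectrograms": sorted(
            p for p in root.glob(f"Spectrograms/{file_hash}*") if p.is_dir()
        ),
        "timecodes": sorted(root.glob(f"TimeCodes/{file_hash}.timecodes")),
        "shotchanges": sorted(root.glob(f"ShotChanges/{file_hash}.shotchanges")),
    }
```

Empty lists mean SE would have to generate that helper anew. Stashed variants like *.shotchanges.a are deliberately not matched, since SE would not pick them up either.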

So it is possible to manually introduce any .shotchanges file into a project, as long as the path and naming convention are met.

I opened this thread to establish a one-stop shop for Subtitle Edit solutions.
Other contributions welcome.
__________________
"To bypass shortcuts and find suffering...is called QUALity" (Die toten Augen von Friedrichshain)
"Data reduction ? Yep, Sir. We're that issue working on. Synce invntoin uf lingöage..."

Last edited by Emulgator; 28th December 2024 at 13:24. Reason: Extended for the legacy case of non-suffixed waveforms
Old 5th December 2024, 19:17   #2  |  Link
Nikse555
Registered User
 
Join Date: Feb 2004
Location: Mars
Posts: 434
The file name for shot changes is (yes) a file hash of the video/audio file - see https://github.com/SubtitleEdit/subt...eHasher.cs#L45
I think it's the same hash as OpenSubtitles uses (or used).
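
If it is indeed the OpenSubtitles hash, the name can be computed outside SE. Below is a Python sketch of that hash as it is commonly published: the file size plus the sum of the first and last 64 KiB read as little-endian 64-bit integers, truncated to 64 bits and printed as 16 hex digits. Whether SE's hasher matches this byte for byte is not verified here, and rejecting files under 128 KiB is my own choice.

```python
import os
import struct

CHUNK = 64 * 1024  # bytes read from the start and from the end of the file
MASK = 0xFFFFFFFFFFFFFFFF  # keep the running sum within 64 bits


def opensubtitles_hash(path):
    """Return the 16-digit hex hash of a file (OpenSubtitles scheme)."""
    size = os.path.getsize(path)
    if size < 2 * CHUNK:
        raise ValueError("file too small for this hash")
    total = size
    with open(path, "rb") as f:
        for offset in (0, size - CHUNK):
            f.seek(offset)
            block = f.read(CHUNK)
            # add up the block as little-endian unsigned 64-bit integers
            for (value,) in struct.iter_unpack("<Q", block):
                total = (total + value) & MASK
    return f"{total:016x}"
```

Comparing the result for a video against the names found in the ShotChanges folder would be a quick way to confirm or refute the match.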
Old 5th December 2024, 23:25   #3  |  Link
Emulgator
Many thanks for your confirmation, Nikse555, and even more thanks for developing this great tool !
Tags
handediting, pruning, scenecut, shotchange