19th November 2018, 22:14 | #1 | Link |
Registered User
Join Date: Dec 2005
Location: Germany
Posts: 1,795
|
Merge / Blend n frames together
I'm playing around with this OCR script here https://github.com/pocketsnizort/pythOCR
I extracted some text from an old movie, but the text is unstable. My idea is to blend all text frames together to see if this improves the text quality for OCR. Scroll to 14s: https://www.dropbox.com/s/59sicf9gqosa614/subs.mkv?dl=0
But I don't know how to approach this. I have a scenechanges.csv, and now I need a function to blend (or maybe merge, I don't know what's better) frames n to m together. Where should I start? Or maybe someone can give an example.
SceneChanges.csv Code:
[Video Informations]
fps=23.976024
frame_count=2878
[Scene Informations]
frame,is_start,is_end,subimage
0269,1,0,"0269.png"
0326,0,1,""
1071,0,1,""
1072,1,0,"1072.png"
1178,0,1,""
1681,1,0,"1681.png"
1757,0,1,""
1758,1,1,"1758.png"
1759,1,1,"1759.png"
1760,1,1,"1760.png"
1761,1,1,"1761.png"
1762,1,1,"1762.png"
1763,1,0,"1763.png"
1787,0,1,""
1895,1,0,"1895.png"
1943,0,1,""
1945,1,0,"1945.png"
2017,0,1,""
2018,1,0,"2018.png"
2297,1,0,"2297.png"
2377,0,1,""
2479,1,0,"2479.png"
2535,0,1,""
2613,1,0,"2613.png"
2670,0,1,""
__________________
AVSRepoGUI // VSRepoGUI - Package Manager for AviSynth // VapourSynth VapourSynth Portable FATPACK || VapourSynth Database |
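A minimal sketch of how the frame ranges could be pulled out of SceneChanges.csv with the Python standard library only. The interpretation of the flags is an assumption, not something the OCR script documents here: a row with is_start=1 opens a block, a row with is_end=1 closes it, a row with both set is a single-frame block, an end without an open start is ignored, and a second start replaces an unclosed one. The function name parse_scene_ranges and the shortened SAMPLE text are made up for illustration.

```python
import csv
import io


def parse_scene_ranges(text):
    """Return (start, end) frame pairs from the [Scene Informations] section."""
    lines = text.splitlines()
    # Skip everything up to and including the section header line.
    header_idx = lines.index("[Scene Informations]")
    reader = csv.DictReader(io.StringIO("\n".join(lines[header_idx + 1:])))

    ranges = []
    open_start = None
    for row in reader:
        frame = int(row["frame"])
        is_start = row["is_start"] == "1"
        is_end = row["is_end"] == "1"
        if is_start and is_end:
            # Assumed meaning: a block that is only one frame long.
            ranges.append((frame, frame))
            open_start = None
        elif is_start:
            # Assumed: a new start replaces a start that was never closed.
            open_start = frame
        elif is_end and open_start is not None:
            ranges.append((open_start, frame))
            open_start = None
        # Assumed: an end row with no open start is simply skipped.
    return ranges


# Shortened excerpt of the file from the post above.
SAMPLE = """[Video Informations]
fps=23.976024
frame_count=2878
[Scene Informations]
frame,is_start,is_end,subimage
0269,1,0,"0269.png"
0326,0,1,""
1071,0,1,""
1758,1,1,"1758.png"
1895,1,0,"1895.png"
1943,0,1,""
"""

ranges = parse_scene_ranges(SAMPLE)
print(ranges)
```

Each resulting pair could then be used as the first and last frame of one Trim() call per subtitle block.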
19th November 2018, 22:32 | #2 | Link |
Registered User
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 377
|
Try std.Expr(). The basic idea is to split the range into single frames and feed them to Expr. Your expression will be something like 'x y max z max ...' etc. Or, for example, count the frames in your range, divide every clip by that value and add all frames up afterwards:
rangeLen = len(range)
f'x {rangeLen} / y {rangeLen} / + z {rangeLen} / +'
Then just loop the resulting image to match the original length and add it to the clip you will OCR later. In my opinion the first method should be better, but you'll probably have a lot of unneeded white dots in the end. |
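To make the two expressions above easier to read: Expr uses reverse Polish notation, so operands come first and the operator follows. The sketch below is not VapourSynth code; it is a tiny stand-alone evaluator (the name eval_rpn is made up) that handles only numbers, clip letters, '+', '/' and 'max', and applies both expressions to a single pixel position across three frames. The pixel values are invented for illustration.

```python
def eval_rpn(expr, values):
    """Evaluate a small subset of Expr-style RPN for one pixel position.

    `values` maps clip letters (x, y, z, ...) to that pixel's value in
    each clip. Only numbers, clip letters, '+', '/' and 'max' are handled.
    """
    stack = []
    for token in expr.split():
        if token in values:
            stack.append(float(values[token]))
        elif token == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "/":
            b, a = stack.pop(), stack.pop()
            stack.append(a / b)
        elif token == "max":
            b, a = stack.pop(), stack.pop()
            stack.append(max(a, b))
        else:
            # Anything else is treated as a numeric constant.
            stack.append(float(token))
    return stack.pop()


# A noise speck: white in one of the three frames only.
noise = {"x": 0, "y": 255, "z": 0}
# A pixel that belongs to the subtitle text in all three frames.
text = {"x": 255, "y": 255, "z": 255}

max_expr = "x y max z max"
avg_expr = "x 3 / y 3 / + z 3 / +"

print("noise:", eval_rpn(max_expr, noise), eval_rpn(avg_expr, noise))
print("text: ", eval_rpn(max_expr, text), eval_rpn(avg_expr, text))
```

With max, the one-frame speck stays fully white, which matches the warning about unneeded white dots. With the average it drops to a third of full white while the stable text pixel keeps its value.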
20th November 2018, 11:04 | #4 | Link |
Registered User
Join Date: Dec 2005
Location: Germany
Posts: 1,795
|
@lansing yes ocr = img to text
@DJATOM Do you mean like this? Code:
rangeLen = 8
Clip = Clip.std.Trim(1895,1943) # one sub block
Clip = core.std.Expr(clips=[Clip,Clip,Clip], expr=[f'x {rangeLen} / y {rangeLen} / + z {rangeLen} / +'])
I have never used Expr and don't really understand it yet.
@HolyWu Yes, this is more or less what I wanted. I also tried tla.TempLinearApproximate(radius=7); it looks very similar to Averageframes() but needs its own Trim() for every sub. Your solution is more elegant.
The OCR result has definitely improved now, but it still contains some random garbage characters.
Last edited by ChaosKing; 20th November 2018 at 11:09. |
20th November 2018, 14:41 | #5 | Link |
Registered User
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 377
|
Not really. I'll describe it below.
Code:
Clip = Clip.std.Trim(1895,1943) # one sub block
rangeLen = len(Clip)
rangeList = []
exprString = ''
for i in range(rangeLen):
    rangeList.append(Clip[i])  # every frame becomes its own one-frame clip
    if len(exprString) == 0:
        exprString += f'x {rangeLen} / '
    else:
        exprString += f'y {rangeLen} / + '
        # I think you can make a string with "xyz..." symbols and then extract the proper variable with slicing: exprVar[1] ... exprVar[26]
Clip = core.std.Expr(clips=rangeList, expr=exprString)
Last edited by DJATOM; 20th November 2018 at 15:05. |
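The comment in the code above hints at picking a different variable for every input clip instead of reusing 'y'. A minimal sketch of that idea in plain Python follows. It only builds the expression string and does not call VapourSynth. The names EXPR_VARS and build_average_expr are made up, and the letter order (x, y, z, then a to w, 26 inputs in total) is how I understand Expr's clip naming, so treat it as an assumption to check against the std.Expr documentation.

```python
# Assumed Expr clip names in input order: x, y, z, then a..w (26 in total).
EXPR_VARS = "xyzabcdefghijklmnopqrstuvw"


def build_average_expr(n):
    """Build an RPN string that averages n input clips."""
    if not 1 <= n <= len(EXPR_VARS):
        raise ValueError(f"need between 1 and {len(EXPR_VARS)} clips, got {n}")
    # One 'clip n /' term per input clip.
    terms = [f"{EXPR_VARS[i]} {n} /" for i in range(n)]
    # The first term stands alone; every later term is followed by '+',
    # which adds it to the running sum already on the stack.
    parts = [terms[0]] + [term + " +" for term in terms[1:]]
    return " ".join(parts)


print(build_average_expr(3))
print(build_average_expr(5))
```

In a VapourSynth script the returned string would be passed as expr together with a list of n one-frame clips, as in the post above. A range longer than 26 frames raises ValueError here rather than producing an expression Expr could not accept.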
20th November 2018, 16:26 | #6 | Link | |
Registered User
Join Date: Sep 2006
Posts: 1,657
|
Quote:
|
|
21st November 2018, 13:03 | #7 | Link | |
Registered User
Join Date: Dec 2005
Location: Germany
Posts: 1,795
|
Quote:
I guess this is an Expr limitation? It works when I set the Trim to Trim(1895,1895+25). The result is one frame, which looks a bit better but is still very similar to the version without Expr. On the plus side, it has fewer artifacts. Here is a comparison.
EDIT: OK, it depends on the start frame. Since the artifacts in the averageframe() version are "less white", they can now be filtered out more easily. Big THX @ all.
With std.Binarize(threshold=80) the gray smear is gone in the averaged version.
Last edited by ChaosKing; 21st November 2018 at 13:12. |
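A rough illustration of why a threshold removes the gray smear after averaging, written as plain Python on a single row of 8-bit pixel values rather than as VapourSynth code. The binarize helper is a stand-in that mimics what I understand std.Binarize to do (values below the threshold become v0, all others v1). The frame count of 8 and the pixel values are invented for the example.

```python
def binarize(pixels, threshold, v0=0, v1=255):
    """Map each value below `threshold` to v0 and every other value to v1."""
    return [v1 if p >= threshold else v0 for p in pixels]


frames = 8                        # assumed number of averaged frames
text_pixel = 255                  # white in every frame, so the average stays 255
speck = round(255 / frames)       # white in one frame only, averaged down

# background, one-frame speck, text, two-frame speck, background
row = [0, speck, text_pixel, 2 * speck, 0]

print(row)
print(binarize(row, threshold=80))
```

Specks that appear in only one or two of the eight frames end up well below 80 and are cleared, while the text pixel survives. Noise that persists over more frames would average higher and could still pass the threshold.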
|
28th November 2018, 16:32 | #8 | Link |
Registered User
Join Date: Dec 2005
Location: Germany
Posts: 1,795
|
One additional question: is it possible to make the black background transparent (RGBA)? How would this look in VapourSynth code?
|
28th November 2018, 17:37 | #9 | Link | |
Registered User
Join Date: Sep 2007
Posts: 5,374
|
Quote:
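e.g. (starting from your RGB PNG) Code:
import vapoursynth as vs
core = vs.core
v = core.imwri.Read(r'PATH\7MQEv0x.png')  # RGB image with white text on black
a = core.resize.Point(v, format=vs.GRAY8, matrix_s="709", range_s="full")  # grayscale copy used as the alpha mask
v.set_output(alpha=a)  # black (0) in the mask means fully transparent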
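To see why the grayscale copy works as an alpha mask, here is a small plain-Python approximation of the per-pixel value it contains. It assumes the conversion reduces to the usual Rec. 709 luma weights at full range, which is my reading of matrix_s="709" with range_s="full" and not something taken from the resizer's source. The function name luma_709 is made up.

```python
def luma_709(r, g, b):
    """Approximate full-range Rec. 709 luma for one 8-bit RGB pixel."""
    return round(0.2126 * r + 0.7152 * g + 0.0722 * b)


black_background = luma_709(0, 0, 0)      # alpha for the background
white_text = luma_709(255, 255, 255)      # alpha for the subtitle text

print(black_background, white_text)
```

The black background gets alpha 0 (fully transparent) and white text gets alpha 255 (fully opaque). Gray edge pixels land in between, which gives softly blended edges.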
|