AviSubDetector v0.6.0.5


Shalcker
11th February 2005, 03:54
AVISubDetector is a tool for extraction and recognition of hard-burned subtitles from various video sources.

Latest version is available here:
http://animeburg.omake.ru/download/utils/AVISubDetector_english.zip
http://animeburg.omake.ru/download/utils/AVISubDetector_english.rar

Source Code (GPL):
http://animeburg.omake.ru/download/utils/AVISubDetector_0605src.zip

Older thread about this program is available here:
http://forum.doom9.org/showthread.php?s=&threadid=56062

gubi-gubi
14th February 2005, 19:55
What are the changes?

Shalcker
15th February 2005, 01:52
Originally posted by gubi-gubi
What are the changes?
Only a few minor changes from 0.6.0.2.

A bit of localization support (see "File\Language").
A few visualization fixes (the display used >= in several places where the actual program used >).
An option to capture subtitle images at full frame width, since the program sometimes fails to set the subtitle edges properly and part of the subtitle gets cut off.

..and now you can download sources too :)

seewen
15th February 2005, 01:56
This tool is EXCELLENT.
I'll try it at once. Thanks.

seewen
15th February 2005, 12:37
There's no readme file at all with this tool ?

Shalcker
15th February 2005, 13:45
Originally posted by seewen
There's no readme file at all with this tool ?
Yes, there is no readme. There was one, but it only pointed to the aforementioned older thread. There was an even older one covering the basic settings, but those are also explained in the older thread.

Most settings have "hints" that should appear fairly quickly as you hover over them. It's hard for me to explain them better than I did in that thread, but if you still have questions about the interface, ask them and hopefully I'll be able to make it clearer.

gubi-gubi
16th February 2005, 00:44
This is truly an awesome tool. It does deserve a proper tutorial though... Anyone fancy it? :D

johner23
16th February 2005, 01:00
I really enjoy your program. :p

Very useful for "ripping" subtitles from my old VHS collection (currently stored in AVI or MPEG-2 format in my DVD collection).

The only thing that worries me is that AVISubDetector is so full of features that it becomes a little bit "hard" to manage for beginners.

If someone wants to try creating a general guide for it, that would be very useful.

PS: do you code it using Assembly, C/C++ or which language? BTW, congratulations on this very handy and nice software.

Best Regards.

johner

Shalcker
16th February 2005, 02:26
PS: do you code it using Assembly, C/C++ or which language?
The program was created in Delphi (Pascal), since in the beginning it was more like a test of one simple idea, and the faster I could get something working, the better... and in the end it was too big for a major rewrite :D

I can probably manage to move some functions into DLL and write those in C/C++ or assembler for speed.

johner23
20th February 2005, 23:39
Hi, Shalcker.

I'm trying to extract a hardcoded subtitle from an avi.

But the problem is that these subtitles are very white and pale, and the movie is very bright too, with many "lights" around the scenes.

When I did the same with "yellow-golden" colored subtitles in other movies, no problem. :D

But now, with those white and pale ones, ASD can't recognize them. :(

What should I do ? Any setting trick ?

If you could help, I will be very grateful.

Thanks in advance.

johner

Shalcker
21st February 2005, 04:08
Originally posted by johner23
But now, with those white and pale ones, ASD can't recognize them. :(

What should I do ? Any setting trick ?

There are too many variations of video, so a description alone is not enough in most cases. Can you post an example image somewhere and give a link here?

Mouse
21st February 2005, 15:33
Shalcker, GREAT Software!

Question;

Could you possibly add support for Swedish chars?
(we use those stupid dots over a and o) :) - they look like this:

(hope this translates OK ?)

Uppercase = Å
Uppercase = Ä
Uppercase = Ö

I'll add an image, just in case.

Mouse
21st February 2005, 16:07
Image...

svcdprayer
27th February 2005, 13:40
Hello! Can anyone tell me whether someone has written a guide on how to use this great software properly?

Thanks

Congratulations to the author of the software!

MickRC
15th March 2005, 15:52
The OCR freezes up after some time. Could you please fix that?
Great program, anyway!

johner23
15th March 2005, 18:25
Hi, my friend Shalcker.

A very great program, no doubt. :)

Do you remember me? I sent you some pictures of pale white subtitles hard-burned into a movie.

ASD can't read them properly (it doesn't recognize them), so I sent you some pics to analyze.

If you get the time and patience to make progress on those problems, let me know, ok? ;)

I repeat: when I rip yellow/golden subs, ASD handles the task without any problem. But with the pale blue/white ones, it doesn't.

Thank you very much.

johner

Shalcker
15th March 2005, 19:50
Originally posted by johner23
If you get the time and patience to make progress on those problems, let me know, ok? ;)
Well, just lower Drop settings to 60 instead of default 100... since there is no outline around these subtitles AND they are from low-resolution video. Either that or changing "shift" from 1 to 3. Program should be able to detect them then (or at least it does for samples you provided to me).

MickRC
16th March 2005, 00:40
I'm having a problem with interlaced subtitles.
Sometimes, when a subtitle first appears, only its lower line is complete. The upper line is unrecognizable, shown with interlace lines. On the next frame, both lines are complete. So I had to reduce "Skip Change if Distance is Less than" to 0 to be able to get both lines. Now the program detects the change, and the result looks like this:


4
00:25:47,480 --> 00:25:47,520
starac, morat ces me slusati.

5
00:25:47,520 --> 00:25:52,320
Ako kanis postati smezurani
starac, morat ces me slusati.



As you can see, time difference between subs 4 and 5 is only one frame.
Later I have to manually delete subs like 4 since they are not required. I was just thinking, something like option "Start OCR X Frames After Change is Detected" would solve the problem.
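
Until such an option exists, the manual cleanup described above could be scripted on the saved SRT file. The following is only a rough sketch and not part of AVISubDetector: the function names and the 80 ms limit (two frames at an assumed 25 fps) are hypothetical. It drops an entry when it is very short, ends exactly where the next entry starts, and its text matches the last lines of that next entry.

```python
import re

TIME = re.compile(r"(\d+):(\d+):(\d+),(\d+)")

SAMPLE = """4
00:25:47,480 --> 00:25:47,520
starac, morat ces me slusati.

5
00:25:47,520 --> 00:25:52,320
Ako kanis postati smezurani
starac, morat ces me slusati.
"""


def to_ms(stamp):
    """Convert an SRT timestamp such as 00:25:47,480 to milliseconds."""
    h, m, s, ms = map(int, TIME.match(stamp).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def parse_srt(text):
    """Split SRT text into dicts with start, end and a list of text lines."""
    entries = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 2 or "-->" not in lines[1]:
            continue  # not a well-formed entry, skip it
        start, end = (part.strip() for part in lines[1].split("-->"))
        entries.append({"start": start, "end": end, "text": lines[2:]})
    return entries


def drop_partial_entries(entries, max_ms=80):
    """Drop short entries that are repeated in full by the following entry."""
    kept = []
    for i, cur in enumerate(entries):
        nxt = entries[i + 1] if i + 1 < len(entries) else None
        short = to_ms(cur["end"]) - to_ms(cur["start"]) <= max_ms
        if (nxt and short and cur["text"]
                and cur["end"] == nxt["start"]
                and nxt["text"][-len(cur["text"]):] == cur["text"]):
            continue  # partial duplicate of the next subtitle
        kept.append(cur)
    return kept
```

On the two entries shown above, only entry 5 would remain. The entries are not renumbered here, so that would still have to be done when writing the file back.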

Another problem is with local East European letters. We have S with \/ on top of it (Š). I couldn't find a way to configure S and \/ as one letter.

Shalcker
16th March 2005, 01:14
Originally posted by MickRC
As you can see, time difference between subs 4 and 5 is only one frame.
Later I have to manually delete subs like 4 since they are not required. I was just thinking, something like option "Start OCR X Frames After Change is Detected" would solve the problem.

Well, you can also deinterlace using AviSynth before feeding such a source to AVISubDetector, if I understand you correctly... or tweak the detection settings if by "interlaced" you mean something different...


Another problem is with local East European letters. We have S with \/ on top of it. Couldn't find a way to configure S and \/ as one letter.
"Areas" are united into one symbol when they are no farther than "OCR/Main/Symbol Rects/Max Diacritic Distance" apart vertically. Increase or decrease it to match your font size. The default of 32 is around 8 real pixels, so it should be sufficient unless really big fonts are used.

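To illustrate the idea of uniting nearby areas into one symbol, here is a minimal sketch. It is not the program's actual code (which is written in Delphi): the box layout, the function names, the horizontal-overlap check and the factor of 4 setting units per pixel (inferred only from "32 is around 8 real pixels") are all assumptions.

```python
def vertical_gap(a, b):
    """Vertical distance in pixels between two boxes (left, top, right, bottom)."""
    if a[3] <= b[1]:
        return b[1] - a[3]  # a lies above b
    if b[3] <= a[1]:
        return a[1] - b[3]  # b lies above a
    return 0  # the boxes overlap vertically


def overlaps_horizontally(a, b):
    """True when the two boxes share at least one pixel column."""
    return a[0] < b[2] and b[0] < a[2]


def merge_diacritics(boxes, max_distance=32, units_per_pixel=4):
    """Unite boxes that overlap horizontally and are close enough vertically."""
    limit = max_distance / units_per_pixel  # assumed conversion to pixels
    merged = []
    for box in sorted(boxes):
        for i, other in enumerate(merged):
            if overlaps_horizontally(box, other) and vertical_gap(box, other) <= limit:
                # Grow the existing box so it covers both areas
                merged[i] = (min(box[0], other[0]), min(box[1], other[1]),
                             max(box[2], other[2]), max(box[3], other[3]))
                break
        else:
            merged.append(box)
    return merged
```

With a small mark at (12, 0, 20, 4) sitting 2 pixels above a letter at (10, 6, 22, 30), the default setting unites them into (10, 0, 22, 30), while a setting of 4 (1 pixel under the assumed scale) keeps them apart.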
MickRC
16th March 2005, 01:15
I've been experimenting with your cool program lately, and would like to share my experience.
Source material is capture from digital Satellite TV, mpeg2, picture is sharp and clear, and so is the subtitle. Letters are white, on grey background. Input video resolution is 704x576.
First, I filter the picture in VirtualDub.
-resize to 4:3 aspect ratio
-mask unnecessary areas with grey, using the logoaway filter with the solid fill option. The chosen color is 0x373737, a color close to the background of the letters.
(First logoaway filter from top of the screen to top of first line of subtitles, second logoaway filter from bottom of 1st line of subs to top of 2nd line of subs, third from bottom of 2nd line of subs to bottom of the screen. Fourth and fifth mask left and right of where subs appear.)
Why? Well, I'm interested only in the subtitles, and that is all I want AVISubDetector to see. It gives the OCR an easier job with fewer artefacts, and the resulting ripped subtitle has less garbage in it. :)

Then, using frame server option in VirtualDub, I load filtered video in AVISubDetector.
Options that are different than default are:
-settings-crop
-Drop Values-Shift 4
-Blocks-Value 780
-DLC 17
-L/RMB 60
-OCR-Main-Clearing settings-Alternative base, unchecked all three of Remove open Area From Text Mask, and (020) Match Text Mask to Outline Mask. (I guess that in theory these are used for removing garbage from the picture, but in practice they remove normal letters as well. So, I don't use them, sorry.)
-OCR-Main-OCR Setting-Required Similarity 95%.


Once I had defined all the symbols, I started to use REMOVE Unknown-All and lowered Required Similarity to 90%.

And now it is pure pleasure to watch your great program in action!

MickRC
16th March 2005, 01:39
Originally posted by Shalcker

"Areas" are united into one symbol when they are no farther than "OCR/Main/Symbol Rects/Max Diacritic Distance" apart vertically. Increase or decrease it to match your font size. The default of 32 is around 8 real pixels, so it should be sufficient unless really big fonts are used.

Well, this works for small letters, like i. Connecting | and . to create i. Here I need to connect capital S with small \/ on top of it. Distance between those two areas is even smaller than distance on i. I've changed Max Diacritics Distance, even to the extremes like 2 and 100. Still can't define it properly.

Shalcker
16th March 2005, 01:55
Originally posted by MickRC
Well, this works for small letters, like i. Connecting | and . to create i. Here I need to connect capital S with small \/ on top of it. Distance between those two areas is even smaller than distance on i. I've changed Max Diacritics Distance, even to the extremes like 2 and 100. Still can't define it properly.
Oh... now that you mention it, I can see it too. I guess that since they are in different horizontal "stripes", they are never checked for proximity... Will fix it today. :)

MickRC
16th March 2005, 02:17
Originally posted by Shalcker
Well, you can also deinterlace using avisynth before feeding such source to AVISubDetector, if i get you right... or tweak detection settings if by "interlaced" you mean something different...



By interlaced I mean really in interlace.
The way subs appear is something like this:
Frame 1: No Subs
Frame 2: Subs partially appear. Every second horizontal line is presented.
Frame 3: Subs are OK, solid, looking the way they should look

Some appear like this:
Frame 1: No Subs
Frame 2: Subs are solid

But, when there are 2 lines of subtitles, they can appear like this
Frame 1: No Subs
Frame 2: Lower line of subtitle is solid. Upper is partial, horizontal lines
Frame 3: Both lines are solid


AVISubDetector works well with the 1st and 2nd examples, but not with the 3rd. If I don't set "Skip Change if Distance is Less" to 0, I get only the lower line of subs. And when I set it to 0, I get two subtitles, at frames 2 and 3. The subtitle on frame 2 is only partial and useless, and I have to delete it manually later. In this case, OCR is needed on frame 3, not 2.

Tried all sorts of deinterlace filters in VirtualDub, still couldn't get it right.

Shalcker
16th March 2005, 03:16
Originally posted by MickRC
By interlaced I mean really in interlace.
...
Tried all sorts of deinterlace filters in VirtualDub, still couldn't get it right.

Try AviSynth with this script (create text file with AVS extension after installing avisynth):
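-- cut --
# Open the video without audio
AVISource("path\to\your.avi", audio=false)
# Deinterlace; TDeint and BlendBob are external plugins
TDeint(mode=1)
BlendBob()
# Convert the output to RGB24
ConvertToRGB24()
-- cut --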
Then open it in AVISubDetector (or VirtualDub(Mod), if you prefer it that way).
That's what i use for most of my interlaced sources.

BlendBob and TDeint can be downloaded here:
http://www.avisynth.org/warpenterprises/
just put dlls into plugins folder of AviSynth.

MickRC
16th March 2005, 09:21
Now the upper line is very transparent, so I can't OCR it. I changed settings and experimented. But thx for the deinterlace tip!

konik
22nd March 2005, 04:30
Hi, I used AVISubDetector on a DivX movie and I get lines like this:
1
00:01:09,160 --> 00:01:17,240
! Frame=1729:DLC=19;MED=171;MBC=21;LBC=6881348;MBX=7;LMB=174;RMB=424;

2
00:01:18,440 --> 00:01:22,600
! Frame=1961:DLC=8;MED=246;MBC=29;LBC=6881348;MBX=1;LMB=114;RMB=480;

3
00:01:22,600 --> 00:01:22,720
* Frame=2065:DLC=8;MED=251;MBC=29;LBC=6881348;MBX=4;LMB=114;RMB=488;

What's that?
How can I get words from it?
And lastly, why are the end of line #2 and the start of #3 the same? Both are
00:01:22,600
I think it will not work properly!
Thanks for help!

Shalcker
22nd March 2005, 15:08
Originally posted by konik
Hi, I used AVISubDetector on a DivX movie and I get lines like this:
1
00:01:09,160 --> 00:01:17,240
! Frame=1729:DLC=19;MED=171;MBC=21;LBC=6881348;MBX=7;LMB=174;RMB=424;

What's that?
Frame decision data. Generally "Automatic Mode" is used to gather subtitle timing data to type words later in whatever subtitle editing program you like.

How can I get words from it?
If you need words, you either type them manually (in Manual Mode) or use OCR (in OCR Mode - you'll still have to type letters manually a few times).


And lastly, why are the end of line #2 and the start of #3 the same?
Both are
00:01:22,600

That's because the program perceives a subtitle change with no subtitle-free frames in between (you can see that from the * (subtitle change) mark instead of ! (subtitle start)).
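
For anyone who wants to process such lines in a script, a small parser could look like the sketch below. It only relies on the layout visible in the output quoted above (a ! or * mark, Frame=N, then NAME=value pairs separated by semicolons); the function name and the returned dictionary layout are hypothetical, and the meaning of the individual fields is not interpreted here.

```python
import re

# Mark (! or *), then Frame=<number>:, then the NAME=value;... part
LINE = re.compile(r"^([!*])\s*Frame=(\d+):(.*)$")


def parse_decision(line):
    """Parse one frame decision line; return None if it does not match."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    mark, frame, rest = m.groups()
    fields = {}
    for pair in rest.split(";"):
        if "=" in pair:  # skips the empty piece after the final semicolon
            key, value = pair.split("=", 1)
            fields[key] = int(value)
    return {
        "kind": "start" if mark == "!" else "change",
        "frame": int(frame),
        "fields": fields,
    }
```

For the first quoted line this gives kind "start" and frame 1729, and the line marked with * gives kind "change".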

konik
22nd March 2005, 22:44
Originally posted by Shalcker
If you need words, you either type them manually (in Manual Mode) or use OCR (in OCR Mode - you'll still have to type letters manually a few times).


Hi, thanks for the answer, but I wondered whether it can rip subtitles from a movie (I mean rip them with words and exact timing) so they are ready for translation in another program. Is it possible? I'm a noob at this!

konik

MickRC
22nd March 2005, 23:29
Originally posted by konik
Hi, thanks for the answer, but I wondered whether it can rip subtitles from a movie (I mean rip them with words and exact timing) so they are ready for translation in another program.

konik

Yes, it can rip words and exact timing. To rip words, you have to use OCR option. Then you'll have to define found symbols as letters. Some more than once.
Program has lots of options, and no manual. Take your time and experiment until you get it right. And one other thing, current version is known to freeze after some time. Then you have to save subtitle, load program again, set it up and continue ripping.

Krizzz989
17th April 2005, 05:33
I'm new to this, How can I ignore the inside of letters? Thanks in advance. http://img.photobucket.com/albums/v171/krizzzopolis/dot.png

Joe_smish
29th April 2005, 00:28
What does this program do exactly? can it remove subtitles from avi files??

Shalcker
29th April 2005, 03:29
Originally posted by Joe_smish
What does this program do exactly? can it remove subtitles from avi files??
No, removing hard-burned subs from avi is quite difficult and there is no program for that yet as far as I know.

This tool is used to produce softsubs (text files with timecodes) based on hard-burned subs.

Joe_smish
29th April 2005, 16:20
oohh ok.. Thanks. :) I hope there will be a program for removing hard-coded subs soon. They got there somehow and aren't a part of the original picture, so there must be a way to get them out as well; I would think so anyway, I don't know too much about this stuff.

johner23
29th April 2005, 17:00
And what about video filters? See the example below.

---> http://compression.ru/video/subtitles_removal/index_en.html

Aren't they capable of removing some kinds of permanent subtitles from a movie?

Of course, the results will not be 100% perfect, and the video will get some "defects" after the operation too.

But at least the subs will be removed, I hope.

best regards

johner

johner23
8th May 2005, 00:56
Hi, Shalcker.

I have one more question about ASD.

I was ripping the hardsubbed subtitles of one AVI movie in manual mode.

It took me more than 2 hours, because this movie was VERY big.

I would save the stats, stop for a few minutes to rest, and continue. I saved the stats 3 times. When I finished typing the letters and went to save the srt, ASD crashed and I lost everything. :eek:

What should I do to avoid those crashes?

I thought that saving only the stats was enough to save all my work. But when I try to open the stats in ASD, the scanning starts from the beginning, not from the point where I stopped before!!

Any help will be appreciated.

Thanks in advance.

Shalcker
8th May 2005, 06:13
Originally posted by johner23
I would save the stats, stop for a few minutes to rest, and continue. I saved the stats 3 times. When I finished typing the letters and went to save the srt, ASD crashed and I lost everything. :eek:

What should I do to avoid those crashes?
Press "Save everything marked for AutoSave" on Project tab at intermediate points (that also happens when you press "Stop", but doesn't happen when you "Pause")... and certainly mark subtitles to be autosaved - then partial work will be saved in project directory and they will also be saved when you close program.
Or just save subtitles from "Subtitles" tab.


I thought that saving only the stats was enough to save all my work. But when I try to open the stats in ASD, the scanning starts from the beginning, not from the point where I stopped before!!
Stats are only used to speed up frame decisions (no subtitle/subtitle/subtitle change) in case you need to redo it or change some detection/change thresholds without reprocessing each frame. Actual work is only saved as subtitles.

And last processed point is not saved anywhere but in subtitles - so you'll need to go to subtitles tab and set starting point from end of last subtitle there in case you'll need to continue (done by Right-clicking on required line and setting that frame as starting one).
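
If you only have the saved SRT file at hand, the frame to resume from can be estimated from the end time of its last subtitle. The sketch below assumes a constant frame rate that you supply yourself; the function names are hypothetical and nothing here is read from AVISubDetector.

```python
def srt_time_to_frame(stamp, fps):
    """Convert an SRT timestamp such as 00:01:09,160 to a frame number."""
    hours, minutes, rest = stamp.split(":")
    seconds, millis = rest.split(",")
    total_ms = ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)
    return round(total_ms * fps / 1000)


def last_end_time(srt_text):
    """Return the end timestamp of the last timing line, or None if there is none."""
    ends = [line.split("-->")[1].strip()
            for line in srt_text.splitlines() if "-->" in line]
    return ends[-1] if ends else None
```

As a sanity check against the output konik posted earlier in this thread: at 25 fps, 00:01:09,160 maps to frame 1729 and 00:01:22,600 to frame 2065, which matches the Frame= values in those lines.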

MickRC
2nd June 2005, 21:26
Shalcker, could you please fix that freezing issue?
Here is screenshot:
http://i4.photobucket.com/albums/y112/MickRC/test2.jpg

And description:
I've found optimal settings, defined characters, loaded the AVI. Now, what I want is to leave ASD to process all frames and give me ripped subtitles.
So I press REMOVE Unknown-ALL and start ripping with Start (Full). All I should need to do is leave ASD to work, come back after 3 hours and choose a name for my newly ripped subtitles. That is in theory.
In practice, however, this never happens. The more complicated the picture is for the OCR to analyze, the sooner ASD freezes up, as if someone had pressed pause during analysis. ASD stays blocked until I press OK. Then it moves on, but freezes again very soon when something complicated comes along. Then I have to shut down ASD, start it again, load the partially ripped subtitles and resume ripping. Later on, ASD freezes again...
For 45 minutes of material, I usually have to restart ASD about 4 times. If the movie is dark, ASD can rip the subtitles in one go. But the more analysis is needed, the more it blocks and the more restarts are needed.

On the screenshot you can see how ASD is frozen when it should be ripping.
Cheers!

Shalcker
3rd June 2005, 05:51
Yes, i know about this issue and i am working to fix it. Hopefully new version will be available somewhere next week.

BTW, judging from screenshot, if there is only one color of subtitles you should set these colors manually and turn off auto-color for faster OCR (color search is quite slow).

MickRC
3rd June 2005, 13:15
Thx, you mean releasing the AUTO-COLOR button once the first subtitle is found and its colors are analyzed?
One more thing: every time I rip subtitles, I change the values under the OCR submenu (Clearing Settings, Minimum Width, Height, Max Diacritics Distance, Required Similarity, Space Distance, Auto Color). Would it be possible to have them saved with Save Settings?

ai4spam
17th June 2005, 01:37
Shameless plug: have you tried the new SubRip? It's not as versatile as ASD, but is simpler to use and probably just as effective. See http://zuggy.wz.cz/redir.php?co=101 for a guide and a quick look at what it can do. Comments are welcome in the SubRip topic http://forum.doom9.org/showthread.php?s=&threadid=93680

Also, I'm trying to modify the DeLogo VirtualDub filter to work with the bitmaps I save in SubRip for subtitle removal. Unfortunately, I'm having trouble debugging it (as in, I have no idea how to do it, other than popping up error message dialogs), and although I had it somewhat working at some point (I managed to break it in the meantime), I couldn't get it to work in the main window, it only worked in preview mode from the filter properties. I think there's some problem with passing parameters to/from the filter, but I don't know enough to solve the problem. Can anyone help? Please send me a PM if you can.

Thanks.

ChiffaFox
9th October 2005, 11:50
What's happened with Shalcker? His page responds with a "Bandwidth Limit Exceeded" message... :( Guys, can anybody help me with any documentation on AVISubDetector?

Shalcker
9th October 2005, 14:54
What's happened with Shalcker? His page responds with "Bandwidth Limit Exceeded" message... :(
Well... omake.ru is a site we share with quite a number of people... it worked fine for a few years; now I guess either one of them used too much bandwidth (the limit was around 20 GBytes per month) or people just forgot to pay to extend the service... or maybe it's a temporary glitch... we'll see, hopefully it'll be fixed next week.

Either way, program is also available at http://web.etel.ru/~shalcker/ (but without sources, as there is only one megabyte available there).

As for documentation... ask, and i'll answer any question.

ChiffaFox
11th October 2005, 20:21
Thanks for your answer. :) But speaking of questions... Can you just tell me, step by step, what to do to extract and OCR subtitles?

There are so many settings and parameters in AVISubDetector - I'm completely lost in there... :(

ericf
1st November 2005, 09:47
Yeah, I've tried to understand some of what has been discussed in the earlier thread about the settings but I guess I failed in most respects.
I do have the ability to use the levers in the settings window but I never used the numerical values at the bottom before. I don't get the "number of lines/blocks" explanations either.
Sorry.
Documentation of manual/OCR modes would be very welcome.
ericf

Shalcker
1st November 2005, 17:10
Yeah, I've tried to understand some of what has been discussed in the earlier thread about the settings but I guess I failed in most respects.
I do have the ability to use the levers in the settings window but I never used the numerical values at the bottom before. I don't get the "number of lines/blocks" explanations either.
Let's see... How about this - each time there is a block/line exceeding the threshold, its counter is incremented by 1. Each time there is a break between blocks/lines (one or more below threshold), the counter is decremented by 1. Is this easier to get? :)

Blocks/lines above threshold are marked as yellow (in color modes), and once counter exceeds threshold for them ("Block Count" for blocks, "Line Count" for lines) they turn green.
Blocks/lines above threshold but right after "break" are marked as red to indicate possible "decrement".

Once you see green block you can be sure that line with this block is above "Block Count" (and thus counted in "Line Count"), and once you see green line you can be sure that this subtitle is "detected" (line counter is also saved as "DLC" for detecting subtitle changes later).
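
The counting rule described above can be sketched roughly as follows. This is an illustration only, not the program's code: the function names are hypothetical, and it assumes that the decrement happens once at the start of each gap that follows an above-threshold item, and that "exceeds" means strictly greater than.

```python
def running_counter(values, threshold):
    """Return the counter value after each block (or line) in order."""
    counter = 0
    in_gap = True  # nothing above the threshold has been seen yet
    history = []
    for value in values:
        if value > threshold:
            counter += 1  # one more block/line above the threshold
            in_gap = False
        elif not in_gap:
            counter -= 1  # a break starts: decrement once per gap
            in_gap = True
        history.append(counter)
    return history


def is_detected(values, threshold, count_limit):
    """True once the counter exceeds count_limit at any point."""
    return any(c > count_limit for c in running_counter(values, threshold))
```

For example, the values [5, 9, 9, 2, 9, 9, 9] with a threshold of 6 give the counter sequence [0, 1, 2, 1, 2, 3, 4], so a "Block Count" style limit of 3 would be exceeded at the last item.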

As for OCR... which questions do you think this documentation should cover? :)

ericf
4th November 2005, 23:31
For OCR: do I have to do anything at all to make it work? I've tried it before on Win 98 and I always got stuck on the same characters again and again. I have tried to read the information here but I'm not computer-language literate so I guess much of it goes above my head. The OCR read Everything as subtitles: Things around the subtitles, inside the subtitles, far outside the subtitles...

To tell the truth, I've only tried it a couple of times, but with so many checkboxes whose purpose I don't have the slightest idea of, I'm at a loss. I don't know what Mask means in this program, so I don't know what to do. I've only recently started to use Win XP, but in Win 98 the OCR kept crashing. Also, it seems I have to use the 1024x768 resolution, otherwise I can't see the text on the button to the right of the Clear-Ignore points button. It starts with Pa, maybe it's Pause?

When the subs are on an old TV-show AVI with fuzzy subs, what do I do when the program substitutes a with o and a with s, F with i and so on?
How do I delete symbols added to the program? How do I delete all symbols?

As you can read I'm all confused about this.
I imagine that many others are as well.

Using Manual mode is what I've done up until now. I have found that I can use the checkboxes at the bottom of the settings window to adjust subtitle checking. I always used the levers on the 3 video windows before. I've no idea what the preprocessing color checkboxes do, but I've found that unchecking the T (black) box and the one without a symbol does help the subtitle check in the program.

So, please, what do I do with the OCR when it doesn't recognize the letters for what they really are?
ericf

EDIT: With OCR auto settings it seems like there are only a few problems with very clear subtitles. I'd like a few hints as to how to set the controls for fuzzy/clear subs so I can change them as I need to, though.
Thanks.
ericf

ericf
5th November 2005, 09:49
OK. Problem: I have a video with fuzzy subs.
When I press start after pressing OCR (experimental) the video plays past at least 5 subtitles until it comes to a very clear subtitle with black outline. When it stops, it just stops. No asking for characters to write into a symbol matrix. (note: Must be because it has auto-ocr on at the beginning, right?)
But then, sometimes it does get the small fuzzy subs and does open the symbol window for me to write in the letters.

Any idea why it works sometimes and not always? Are there any particular settings that will enable me to always have the symbols window open upon the unclear subtitles?
Edit: Even on clear subtitles it seems that the symbol window doesn't open much of the time. Then we have the fact that much of the time only half the symbols are detected, some are blanked out. How do I make all letters seen in the symbol adding window? End edit.

Also, the Alternatives: Auto-color, auto-ocr and auto-search.
It seems like the program does auto-search even though I have unchecked it. I need to check that the text is correctly read by the program as it reads and that is problematic. I get 5 small graphics of the subtitle area at the bottom left but most of the time they are so small that I can't read the text. Can I make those graphics bigger?
Edit:
After changing the resolution to 1024x768, the images got bigger.
So how do I use those 5 graphics? When I click on them I get a reaction but I only have one that seems to get all the letters. How do I make sure the program reads that way? It's the one that is black and white. Is it possible?
End Edit.

AVS scripts: I do have some mpg files that I'd like to search for subtitles. I have seen some of the script ideas you have put up here and in the old thread but how about putting up a few that might work with most AVI/Xvid/MKV/MPG files?

I do realize I'm asking a lot, but it's a great program when used in Manual mode (I've always done that, since I check the subs visually to make sure none are missed), and I'd like to start using it with the OCR function as well. When the subs are very clear it seems to be OK (apart from some problems with the symbols window, which doesn't always open on unknown subs), but it seems that in some cases the program loses most functions except OCRing. I worked with the program for about 3 hrs last night, and sometimes I had to just shut it down because the only area I could view was the main part of the OCR area with the Clearing Settings. When I clicked on the other parts of the program, nothing happened.

Also, the symbol matrix seems to have a few problems. I tried to change the corresponding letter on a few of the symbols I had entered, because I had made a couple of mistakes, but when I clicked on the symbol I wanted to change, the whole set of symbols moved back to the top and I couldn't change them. If I saved the matrix, ordered it and then opened the saved matrix, I could make the changes.
Edit: Attempting the edit thing now, but it only works on some symbols, not on the whole matrix. I have 35 symbols in it right now, but I can't do anything about the symbols after symbol no. 33 (it's in the bottom part of the window right now). Do there need to be whole lines of symbols to be able to edit them? The up and down lever on the right keeps blinking dark/light during the editing as well. Is that an indication that something is wrong?

The window at the bottom right of the program in OCR mode, where the text is written, has an OK button on the left for confirming what the program has OCR-ed. But it also has a button that says OCR. Its text is bold black while the other buttons are thin black. Nothing happens when I push that button. If I get the wrong OCR, isn't that for changing the OCRed symbols or something? Could you explain that area a little bit? The Clear ignore-points and Build Masks parts as well?
How do I clear the symbols matrix without closing the program?
End Edit.

That's it for now.
Thanks.
ericf

ericf
7th November 2005, 22:10
How do I make the OCR recognize subs that are white without a border?
ericf

ericf
9th November 2005, 17:11
Another thing:
QUOTE
"Try AviSynth with this script (create text file with AVS extension after installing avisynth):
-- cut --
AVISource("path\to\your.avi",audio=false)
TDeint(mode=1)
BlendBob()
ConvertTORGB24()
-- cut --
Then open it in AVISubDetector (or VirtualDub(Mod), if you prefer it that way).
That's what i use for most of my interlaced sources.

BlendBob and TDeint can be downloaded here:
http://www.avisynth.org/warpenterprises/
just put dlls into plugins folder of AviSynth." UNQUOTE

I put the DLL files into the plugin folder and put the path to my avi in the script. All that happens is that AVISubDetector opens the script and starts OCRing the text file.
What am I doing wrong?
When I put the avi and avs right on C:\ I get a message saying Script Error: there is no function named "TDeint".
Also, in the brackets at the end after the avi information it says: , line 2).
Is that something that causes trouble?
Thanks.
ericf

ericf
7th December 2005, 21:52
I found that I can't use AVS scripts in AVISubDetector because all my AVIs get flipped vertically.

Using this script:

AVISource("path\to\your.avi",audio=false)
TDeint(mode=1)
BlendBob()
ConvertTORGB24()

with FlipVertical()

doesn't help.

Does anyone know what is wrong?
Can you help me?
Is it a problem with Xvid? If so, how do I get by it?
Thanks.
ericf

bshater
21st February 2006, 16:56
can someone show me how to use OCR?

ericf
4th June 2006, 11:48
Any development in the OCR?
I keep getting hang-ups with it and it stops reading and writing the characters. Even if I restart the program it's no good. How much memory does it need? I have 1000 MB but mabye that isn't enough? Or is it not memory related? I can see that the OCR function starts to write the letters but it stops before the sentence is finished. That's why it doesn't write any letters at all. Manually typing letters IS an option but not one I really feel is supposed to be necessary. Of course, letter types like Times New Roman are more difficult for the program to read than Arial but does that mean it won't agree at all with those types of letters?

The features of the program Subrip are truly not acceptable so that is not an option for me. They simply don't work at all! However, I'm having a lot of problems with AviSubDetector as well and I'd like to know if something is about to be changed for the better.

Thanks for making the program.
ericf

Shalcker
4th June 2006, 13:29
Any development in the OCR?
I keep getting hang-ups with it and it stops reading and writing the characters. Even if I restart the program it's no good. How much memory does it need? I have 1000 MB but maybe that isn't enough? Or is it not memory related?
It IS memory-related. But it's not about memory amount - it's about memory fragmentation (and the Delphi memory manager). The way OCR works now leaves a lot of small memory fragments, which aren't reclaimed by the default memory manager, resulting in an "out of memory" error even though the total amount of "free" memory is relatively large... and once that "out of memory" happens OCR stops (usually mid-sentence due to a masked access violation while it tries to add unrecognized letters), and the program can occasionally freeze too :)

Are you using latest version BTW? This error shouldn't pop there too often...

I can see that the OCR function starts to write the letters but it stops before the sentence is finished. That's why it doesn't write any letters at all. Manually typing letters IS an option but not one I really feel is supposed to be necessary. Of course, letter types like Times New Roman are more difficult for the program to read than Arial but does that mean it won't agree at all with those types of letters?
As long as letters aren't too disjointed, letter type shouldn't influence anything. Most problems can be resolved by proper settings for color masks and limiters - which is easier to explain with an example. If you have any image which gives you trouble with OCR, you can post it and i'll explain which settings are optimal for such subtitles (and how you can decide what should be changed from defaults).

ericf
8th June 2006, 19:07
"Shalcker:
It IS memory-related. But it's not about memory amount - it's about memory fragmentation (and Delphi memory manager)."

Oh... I don't understand but okay.

"Are you using latest version BTW? This error shouldn't pop there too often..."

I think so. AVISubDetector0.6.0.5. Oops, there's a 0.7. version out. I'll try that.

"As long as letters aren't too disjointed, letter type shouldn't influence anything. Most problems can be resolved by proper settings for color masks and limiters - which is easier to explain on example. If you have any image which gives you trouble with OCR, you can post it and i'll explain which settings are optimal for such subtitles (and how can you deside what should be changed from defaults)."

I have had problems with most types of subtitles. Unless they are white with large black borders. One problem is that while many subbers use the same letter types they keep changing the outline color. How do I make a setting that can deal with that?

Even if I use avisynth to make the files black and white there seems to be sufficient differences to cause problems.

I'll try to include an image with several problematic subs below. I don't know much about adding images to boards.

This avi is a DivX3 Low-Motion clip. I can't seem to use it with this avisynth script:

AviSource("c:\AmoviesDVD03\MegamiKouhoseiCFGep03.avi")
Lanczos4Resize(704,480)
GreyScale()

Can you tell me why (I only get a black image) and do you have some advice on how to be able to make all these OCR well with one setting and even when the background is bright or full of little details? The green subs only work when the background is rather dark.

ericf

PS: The new version seems to do a better job of OCRing this file. However, I have to OK every line before it moves on. Is that how it's meant to be?
Thanks.
ericf

maxleung
25th June 2006, 22:32
Some serious problems to report:

The frame offsets reported for each detected subtitle are incorrect. For example, if a subtitle is reported on frame 9333 (as shown in the current frame textbox), the script will incorrectly write the line as starting on frame 9318.

The result is that the subtitles go completely out of sync with the video!

The video is reported as 23.976 FPS. The video is in FourCC XVID format. The AVS script looks like this:


directshowsource("[Ani-Jiyuu]_Ergo_Proxy_-_10_[97428AD2].avi",audio=false)
ConvertTORGB24()


Is there a setting I missed?

I tried using Subtitle Workshop to change the FPS rate but that didn't work - still a lot of places where the subtitles are off by 2 seconds or more!
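To put the frame numbers above into time terms, here is a small Python sketch. It assumes the reported 23.976 FPS is really the NTSC film rate 24000/1001, and the function names are made up for illustration; it has nothing to do with how AVISubDetector computes its timings internally.

```python
from fractions import Fraction

# Assumption: "23.976 FPS" means the exact NTSC film rate 24000/1001.
FPS = Fraction(24000, 1001)

def frame_to_ms(frame, fps=FPS):
    """Start time of a frame in whole milliseconds (truncated)."""
    return int(Fraction(frame) * 1000 / fps)

def format_srt_time(ms):
    """Format milliseconds as an SRT-style hh:mm:ss,mmm timestamp."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"

shown, written = 9333, 9318  # frame in the textbox vs. frame in the script
print(format_srt_time(frame_to_ms(shown)))
print(format_srt_time(frame_to_ms(written)))
print(frame_to_ms(shown) - frame_to_ms(written), "ms apart")
```

A 15-frame difference like this one is a bit over half a second, so several such losses would have to pile up to reach the 2 seconds mentioned above.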

maxleung
25th June 2006, 22:52
Another bug:

If you save the script in SSA format, and there are line breaks in the text, this will result in a bad SSA script. Line breaks should be replaced by \n or \N automatically.

Also, if you read back an SSA script written by AviSubDetector, and if any subs span more than one line, the extra lines will be lost! This lost me 2 hours of work the other day - I was loading an SSA script created by ASD earlier, then saved it again, losing all the subs that spanned more than one line.

Workaround is to never add linebreaks to a subtitle (use \n instead), or perhaps save in SRT format <- but I haven't tested this.
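Until this is fixed in the program, the replacement can also be done on the subtitle text outside of ASD. A minimal Python sketch of the idea follows; the function name is hypothetical and it only handles the text of one subtitle, not the rest of the SSA file.

```python
def escape_ssa_linebreaks(text):
    """Replace real line breaks in subtitle text with the SSA \\N marker."""
    # Normalize Windows and old Mac line endings first, then escape.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", "\\N")

sub = "I never knew what hit me\nuntil he told me."
print(escape_ssa_linebreaks(sub))
```

This keeps each subtitle on a single line in the script, so nothing after the first line gets dropped when the file is read back.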

maxleung
26th June 2006, 01:13
The weird thing about the incorrect frame problem is that the durations seem to be correct most of the time. From my attempts to correct the sync using Subtitle Workshop, it seems that ASD occasionally loses 5 or more frames every now and then. The effect appears to be cumulative in most cases.

maxleung
26th June 2006, 07:43
I think I figured out why the subtitle detection goes out of sync - whenever a scene goes too dark or too bright, it loses frames. I'm guessing that it loses synch depending on how long the scene stays dark or bright. This could be a codec-related problem - the frame counter doesn't include these dark/bright frames!

maxleung
27th June 2006, 05:57
Heh, seems like I'm the only one here. :)

I fixed my problem - using the DirectShowSource function was the problem. Using AVISource and making sure that FFDShow's XVID decoder was enabled in its VFW configuration fixed it! No more frame synch problems! :D

(BTW, ASD is a really great tool!)

ericf
15th August 2006, 20:04
I'm getting quite frustrated here.
I've been trying to come up with a good setting for an anime file with white subs and black borders. Okay. The settings seem pretty perfect but the OCR just keeps marking everything around the letters. There's nothing to OCR outside the text.
What gives? Why does this happen?

And why is it impossible for me to use a setting I used one time in OCR color (not using auto color other than analyzing it first) the next time I open the program? Using those settings the text isn't read correctly. It disappears into oblivion. Either the OCR window is gray, black or pink.

Can anyone help me with this?

And what do I do when all the words stick together in every sentence? Theysticktogetherlikethis:
andsometimes.there's.alotofdots.inthetexteventhoughtherearenone.inthetexttheprogramissupposed.toocr.

Has anyone worked out how to explain the functions for a layman?
I simply cannot get the terms used here and in the program. Both because the language isn't my first language and because it's technical language that seems to define things I couldn't understand anyway.

This thread seems quite dead but I hope someone reads it.
Thank you.
ericf

Shalcker
19th August 2006, 17:20
I'm getting quite frustrated here.
I've been trying to come up with a good setting for an anime file with white subs and black borders. Okay. The settings seem pretty perfect but the OCR just keeps marking everything around the letters. There's nothing to OCR outside the text.
What gives? Why does this happen?
Can you put an image somewhere for checking?
Setting colors directly to match borders and subtitles should help (if you aren't using auto colors). Check that their types are right - text is marked as Tb on OCR/Main/Colors tab and outline is marked as Oc. Click on them there to change their type.

And why is it impossible for me to use a setting I used one time in OCR color (not using auto color other than analyzing it first) the next time I open the program? Using those settings the text isn't read correctly. It disappears into oblivion. Either the OCR window is gray, black or pink.
Well, these settings aren't saved, so they aren't restored either... adding that save is relatively easy though, do you need it? :) Can be done in a few hours...

And what do I do when all the words stick together in every sentence? Theysticktogetherlikethis:
andsometimes.there's.alotofdots.inthetexteventhoughtherearenone.inthetexttheprogramissupposed.toocr.
Well, pretty much any small area looks like a dot. So it often gets a good match rating, inserting dots from "noise" above or below letters.

As for missing spaces, that is controlled by OCR/Main/OCR Settings/Space Distance. Just reduce it and they should work more or less fine (with the exception of most italic fonts).
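To illustrate why reducing a distance threshold brings the spaces back, here is a rough Python sketch. It is only a guess at the general idea (a space is inserted when the gap between two letters is at least the threshold); the function name and the pixel values are invented and this is not the actual AVISubDetector code.

```python
def join_glyphs(glyphs, space_distance):
    """Join recognized letters, adding a space where the gap is wide enough.

    glyphs: list of (char, left_x, right_x) tuples, sorted left to right.
    space_distance: minimum gap in pixels that counts as a space.
    """
    out = []
    prev_right = None
    for char, left, right in glyphs:
        if prev_right is not None and left - prev_right >= space_distance:
            out.append(" ")
        out.append(char)
        prev_right = right
    return "".join(out)

# Made-up letter positions: 2 px between letters, 6 px between words.
glyphs = [("h", 0, 8), ("i", 10, 12), ("y", 18, 26), ("o", 28, 36), ("u", 38, 46)]
print(join_glyphs(glyphs, 10))  # threshold larger than the word gap
print(join_glyphs(glyphs, 5))   # threshold smaller than the word gap
```

With italic fonts the letters lean over each other, so the measured gaps get unreliable, which would fit the exception mentioned above.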

ericf
25th August 2006, 18:20
Thanks for answering.
I'll post two images here. It tends to take a while until they appear but they will be here (like the former image I posted with several different outline colors in one episode).

And if I have a particular color (perhaps green) as outline, do I take a still and go into Photoshop and find out the RGB of that color and set it in the OCR Colors window? 255 and 100 makes a white background. 0 and 30 makes a black background. What do the 100 and 30 mean?

Using outline as black and text as white doesn't make the OCR read any text at all in the below pictures. Why? 3rd image at bottom. Do I have to find out the exact color combination in Photoshop or some other program to make it work?

The same question goes for Color Domination / Use Color Processing in the settings window. If I decide to use Use Color Processing, what PreProcessing Colors boxes should be ticked? All? None? What does X mean? T should be Text and O outline. And I don't know what Shift+LMB means when you mention it at Separation by subtitle color. What does it do?

I usually use Color Difference / Inverted Blocks / Inverted lines in the setting window.

I'm sorry if I seem stupid. I've tried to make some changes to settings in order to make the OCR work properly but I don't seem to be making it very well.

I've fiddled around with the settings in Auto-Color sometimes as well. The original 6-100-64 settings. I've found that it's sometimes good to make the settings so that a blue outline appears around the characters. But sometimes that doesn't work.

Any thoughts on how I can improve that?

And making the settings you used last time stay when you open it the next time would be very nice.

And I understand the trouble with italic fonts. You can never make a j or i work properly since they can't be read separately.

Why does the box that I thought was there in the OCR window to tell me what the program is checking in the text so often end up outside the text area? I still get letters in the three boxes but isn't the black outline supposed to show the area it's OCRing?

One more thing.
I often get OCR sentences with the words of the last part of line 1 and the first part of line two something like this:

I never knew what hit me hu
e ntil told me I had been struck by lightning.

Can I avoid this in some way?

I always get Access violation at address ************ when I try to exit the program. I have XP Home on my computer. Any idea what might cause this? (Pressing stop and removing the avi makes the program happy. But could you perhaps add a button for exiting the program?)

Also, is there some way I can make the OCR not mix up apostrophes, commas and accents? It seems I get a lot of commas instead of apostrophes and accents instead of commas.

Sometimes I can't save the subtitles and symbols. The buttons work but I don't get a window for saving. It's when the whole file has been OCRed. The subs don't appear in the temp folder either. Is there any way to make sure the subs always get saved? Perhaps continually during OCRing? When I start the program and try another OCR the program always asks me to choose whether to continue on the last subtitles or not. That way I can save the subs. I can't ever save the symbols I wrote into the symbol file, though. Is it possible to save that continually as well?

Thanks
Ericf

renoturks
27th September 2007, 22:50
Sorry to bother, but I need help. I'm getting this error:

invalid floating point operation

What should I do?

Adub
28th September 2007, 01:03
I recommend using Subrip. It is more updated and easier to handle.

ericf
3rd December 2007, 20:56
Sure, it's more updated. Last time in 2006.

Anyway, the way it reads subtitles is in most ways a faulty process. AviSubDetector always finds the subs (except for some of the very short time subs that you have to make special settings for) with almost no change to default settings.

In Subrip, no two letters look alike. That means that every single one of the letters of a movie has to be entered manually. Not so with AviSubDetector. There may be a few duplicates of letters in the symbol sets but it reads characters very well.

No manual is a problem. I still don't know why things work when they do. But it is much better than Subrip.

No contest.

ericf

help
13th June 2008, 20:35
I can't even preview the .avi file:

Video: XVID 696x400 25.00fps 1148Kbps
Audio: MPEG Audio Layer 3 48000Hz stereo 128Kbps

ericf
12th July 2008, 22:28
Hello.
I have a problem with the symbols I saved.
Now the symbols are treated as garbage. Instead of the correct letters I get all kinds of strange symbols.
Can anyone tell me how to fix that?
I don't like to create a new symbol file every time I start OCRing subtitles.
Thank you.

ericf
7th August 2008, 22:17
I need to know how to open mkv/mp4 or any h264 encoded video in Avisubdetector.

I'm unable to open even a normal avi using this script:
AviSource("file.avi")

Thanks.
ericf

Adub
8th August 2008, 16:17
I'd try and help, but I can't seem to find a working link.

Edit: Ah, wait, videohelp has one. Give me a couple minutes.

Edit2: Huh, ok. I can load a simple avi source just fine with an Avisynth script.

Example script:

Avisource("myavi.avi")
ConverttoRGB24()

Just load that and you should be fine. It seems the program only accepts RGB24 colorspace, so you have to add that last line to all of your scripts at the end, but it seems to work fine.

ericf
9th August 2008, 17:05
It seems like I didn't have Avisynth (the common one) installed.


I thought that came with my codec pack.
Sorry for the trouble.
ericf

Adub
9th August 2008, 18:53
I would highly recommend against using codec packs.

FFDshow is all you really need. However, if you absolutely must use a codec pack, stick with CCCP.

simonrule
23rd October 2009, 17:36
Can anyone help me with how to create a new symbol?

videoslicker
3rd January 2010, 08:29
Great program.

I've gotten great results when adding a brightness/contrast filter to the AVS/VDR script. I'm ripping the subs off a jdrama fansub with clear outlined white subs. I turned the brightness all the way down, and the contrast a little up. It really takes away the video and accentuates the hard subs. I'm going to add a gamma filter to see if it helps any further.

DirectShowSource("D:\video.avi")
Tweak(bright=-255,cont=4)
ConvertToRGB24()



-- edit --

Actually this was terrible advice...

AVISubDetector needs to be able to recognize the colors in order to detect the subs. Just leaving the color/contrast alone worked great.

There were bugs in the program that made it almost unusable but there were ways around it once you get familiar w/ the program. One major bug was that the program always quit going to auto mode after recognizing italicized letters. To work around this, pause during any italicized sub; DO NOT OCR that line; go to manual mode and manually enter in the subs until all the italicized lines are transcribed and then revert back to OCR (experimental).

Some tips that helped w/ my sub rips...
- don't load symbols; start from scratch on new projects
- use manual mode for any scene that contains several artifacts that get detected as subs
- use manual mode for any subs with multi-colored outlines; the program only picks up 1 outline color at a time
- always have the preview window open so you can see the entire scene

quz
5th January 2010, 10:54
how to open vob file in asd?

videoslicker
10th January 2010, 19:03
how to open vob file in asd?

Create an AVS script to frameserve the vob.

Open VOB files in Avisynth
http://forum.doom9.org/archive/index.php/t-109454.html

Gfy
3rd August 2011, 11:43
If someone is looking for it, the source code can be downloaded here:
http://web.archive.org/web/20071031111130/http://animeburg.omake.ru/avisubdetector.htm

TheFrenchDumb
21st August 2011, 14:10
I love this tool, but I got some crashes. I don't use OCR.


The most important thing is the video size: if the image size is too big (like 1080) it will crash.

I always convert in 700x400 (dont remember exactly) w


If you've got a new-gen file (mkv/mp4), I suggest converting it to AVI (Xvid + MP3).
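For working out a smaller frame size that keeps the picture's proportions, a quick Python sketch may help. The target width of 704 and the rounding of the height to a multiple of 4 are assumptions picked for illustration (close to the roughly 700x400 mentioned above), and the function name is made up.

```python
def fit_width(src_w, src_h, target_w=704, multiple=4):
    """Scale a frame to target_w wide, keeping the aspect ratio.

    The height is rounded to the nearest multiple of `multiple`,
    since many codecs and filters prefer such sizes (assumption).
    """
    height = round(src_h * target_w / (src_w * multiple)) * multiple
    return target_w, max(multiple, height)

print(fit_width(1920, 1080))  # a 1080p source
print(fit_width(1280, 720))   # a 720p source
```

The resulting numbers can then be used in whatever encoder or resize filter is doing the conversion.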

djmasturbeat
20th January 2012, 00:41
I can't open any avi files without error:
"Error in FrameOpen - conversion to RGB24 not available. Check that you have required codec installed."

Can anyone tell me what I need to do to get this working?
I have ffdshow installed, is there a setting I need to change there?

thanks

oh, I am on Windows 7 x64, also
maybe this won't run on it properly?

Abyssal
7th March 2012, 02:47
I can't open any avi files without error:
"Error in FrameOpen - conversion to RGB24 not available. Check that you have required codec installed."

Can anyone tell me what I need to do to get this working?
I have ffdshow installed, is there a setting I need to change there?

thanks

oh, I am on Windows 7 x64, also
maybe this won't run on it properly?

Hey,

All you need to do is to create an AviSynth script that calls the video file and converts it to RGB24.

I'm gonna assume you know that the program AviSynth exists to begin with so let's move on.

Just create a new text document and add the following:


AVISource("xxxx.avi")
ConvertToRGB24()


You obviously have to change xxxx.avi to whatever your file is called. To make it easier save this document in the same folder as the video file so you don't have the longest file path ever.

So anyway, you save this document with just those two rows as xxx.avs - So if you wanna call it starwars then save it as starwars.avs

Then you just open up this .avs file instead of the video file in AviSubDetector.
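If there are a lot of files to do, the two-line script described above can be generated instead of typed by hand. This is just a Python sketch of that idea: the helper names are made up, and it assumes the .avs file is saved next to the video so that the bare file name is enough inside AVISource.

```python
from pathlib import Path

def avs_script_text(video_name):
    """Return the two-line AviSynth script for one video file."""
    return f'AVISource("{video_name}")\nConvertToRGB24()\n'

def write_avs(video_path):
    """Write <name>.avs next to the video and return the script's path."""
    video = Path(video_path)
    script = video.with_suffix(".avs")
    script.write_text(avs_script_text(video.name))
    return script

# Show what would be written for a file called starwars.avi.
print(avs_script_text("starwars.avi"))
```

Calling write_avs on the path of starwars.avi would create starwars.avs in the same folder, and that .avs file is what gets opened in AviSubDetector.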