Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
4th July 2010, 23:11 | #41 | Link |
Registered User
Join Date: Jun 2002
Location: On thin ice
Posts: 6,837
|
Cyrillic chars covered by ANSI are not my problem; the problem is Unicode Cyrillic chars, which I could possibly convert to ANSI. Maybe I'm just writing nonsense, I don't know.
__________________
https://github.com/stax76/software-list https://www.youtube.com/@stax76/playlists |
4th July 2010, 23:22 | #43 | Link |
Registered User
Join Date: Jun 2002
Location: On thin ice
Posts: 6,837
|
Script file names are my problem. StaxRip generates those based on the name of the source file. The source file name will of course also appear inside the scripts, but that doesn't seem to be a problem.
|
4th July 2010, 23:58 | #45 | Link |
Registered User
Join Date: Jun 2002
Location: On thin ice
Posts: 6,837
|
StaxRip rejects Unicode file names. It was requested that I remove this check, but I don't believe it can be removed, because it cannot work: it can work with Cyrillic ANSI but not with Cyrillic Unicode. Using alternative script names would be a big task, as scripts are generated in countless locations with complicated code. At least I can say it's not my fault.
|
5th July 2010, 00:29 | #46 | Link | |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
Quote:
I simply don't understand your problem. Since Avisynth doesn't support UTF-8 or UTF-16, you have to create your scripts with the appropriate MBCS encoding using the correct code page. There are Win32 API functions you can use to ensure the proper encoding. Even dummy programming languages like VB should support this. I'm doing this in plain C without problems; one would think that .NET (which I assume you're using) should make this very simple. I suggest you do some reading on the matter and re-think that statement. |
|
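A minimal sketch of the point above, in Python rather than C or .NET, so treat it as an illustration only. The helper name write_avs_script, the script name job.avs and the sample source name are made up for this example. It encodes the script text with one explicit code page and fails loudly when a character has no mapping in that code page.

```python
import os
import tempfile


def write_avs_script(path, script_text, code_page="cp1251"):
    """Write script_text to path, encoded in the given code page.

    Hypothetical helper for illustration. Encoding is strict, so a
    character that is missing from the code page raises
    UnicodeEncodeError instead of silently turning into '?'.
    """
    data = script_text.encode(code_page)
    with open(path, "wb") as handle:
        handle.write(data)
    return data


# Assumed example: a source file with a Cyrillic name, referenced in the script.
source_name = "фильм.avi"
script = 'AviSource("' + source_name + '")\n'

# The script file itself gets a plain ASCII name here.
script_path = os.path.join(tempfile.mkdtemp(), "job.avs")
written = write_avs_script(script_path, script, "cp1251")
print(len(script), "characters ->", len(written), "bytes")
```

With code page 1251 every character of this script maps to a single byte. Asking for cp1252 instead raises UnicodeEncodeError, which is the case a front end would have to detect and report.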
11th July 2010, 10:30 | #47 | Link |
Registered User
Join Date: Mar 2008
Location: Hong Kong, China
Posts: 5
|
Actually, IMHO, for today's programs on Windows platforms, native Unicode support could be the base requirement.
Such as using only the Unicode versions of the API, the functions with a "W" at the end of their names... It could be hugely difficult, but it could be the only complete way to avoid this kind of problem... For the AVS script parser, as foxyshadis said, a BOM could be the best way to determine whether the script is UTF-8, UTF-16 or MBCS encoded; then convert it to UTF-16 with MultiByteToWideChar() and go on from there... Please, throw away ...A() for future versions. |
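The BOM idea can be sketched roughly like this. It is a Python illustration, not AviSynth parser code; decode_script and the fallback_code_page parameter are names invented here, and on Windows the fallback would normally be the system ANSI code page rather than a hard-coded cp1252.

```python
import codecs


def decode_script(raw, fallback_code_page="cp1252"):
    """Return (text, encoding_name) for the raw bytes of a script.

    Hypothetical helper. Only the UTF-8 and UTF-16 BOMs named in the
    post are checked; UTF-32 is deliberately ignored. Without a BOM the
    bytes are assumed to be MBCS text in fallback_code_page.
    """
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8"
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le"), "utf-16-le"
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be"), "utf-16-be"
    return raw.decode(fallback_code_page), fallback_code_page


sample = 'AviSource("я.avi")'
print(decode_script(codecs.BOM_UTF8 + sample.encode("utf-8")))
print(decode_script(sample.encode("cp1251"), fallback_code_page="cp1251"))
```

Once the text is decoded it is Unicode internally, which plays the role that the UTF-16 string would play after MultiByteToWideChar() in a native implementation.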
11th July 2010, 14:42 | #48 | Link | |
Registered User
Join Date: Jun 2002
Location: On thin ice
Posts: 6,837
|
Quote:
|
|
11th July 2010, 15:11 | #49 | Link |
Compiling Encoder
Join Date: Jan 2007
Posts: 1,348
|
it's not as simple as just tossing the 'A' from the API functions and having it magically work.
avisynth is heavily designed around the use of char *, which would all have to be changed to wchar_t * if unicode were to be used and supported within scripts. this is by no means a trivial change. let's not forget that this core change would break every current filter there is as well, and it's no small undertaking to fix them all either. the avisynth project is even configured to use MBCS, which maps 'A'-less API calls back to their 'A' versions through the preprocessor. Last edited by kemuri-_9; 11th July 2010 at 15:15. |
11th July 2010, 15:40 | #50 | Link |
Unavailable
Join Date: Mar 2009
Location: offline
Posts: 1,480
|
Change the title of this thread ?
Just a nit-pick, but, IMHO,
"Foreign Language characters in filenames" should be replaced by "Non-ASCII characters in filenames" (or: "Non-ANSI characters in filenames") ((because "foreign" is just a point-of-view)). Last edited by Midzuki; 11th July 2010 at 15:45. |
11th July 2010, 22:50 | #51 | Link | |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
Quote:
The only "inconvenience" you have with Avisynth not supporting Unicode is that you have to choose the proper encoding of your script according to the system code page it's supposed to be used on. It's very simple. |
|
18th July 2010, 21:15 | #54 | Link | ||
Registered User
Join Date: Oct 2003
Location: Germany
Posts: 377
|
Quote:
Quote:
Last edited by krieger2005; 18th July 2010 at 21:18. |
||
18th July 2010, 22:26 | #55 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
@krieger2005
You obviously don't know anything about encoding standards. First of all, I recommend that you look up what ASCII stands for. Wikipedia is your friend. ASCII is only the 7 bit subset of an 8 bit encoding scheme. AVISynth however supports 8 bit encodings based on the common Windows code pages. If you are, for example, using code page 1252, it covers all Western languages like English, German, French, Norwegian, Italian, etc., including all accented characters these languages have. If you set your Windows to Russian (you don't need Russian Windows for that), it will support all Cyrillic languages, and so on. Sorry, but your argument is nonsense. There is only a problem when people don't know how to create a script properly. |
19th July 2010, 01:13 | #56 | Link |
Registered User
Join Date: Dec 2008
Posts: 589
|
And which code page contains the Euro character (€)? Don't answer: there are several code pages that actually exist and contain it, but the point is it's not in ASCII, and not all people getting a file with such a character may actually have that code page installed.
I won't even get into the new currency symbol for the Indian Rupee, for example, which is planned to be included in Unicode. The rupee is represented by the Unicode character 20A8 (₨) right now, but the new symbol is going to receive another position... the point is code pages are antique technology. The sensible way is to move all the way to UTF-8, which is perfectly fine and can represent, if not all, then almost all characters out there. There are free libraries like IBM's Unicode one which would make the job easier, but I'm not that good of a programmer to get involved. PS: some people work as freelancers and therefore receive files with various names from various parts of the world, sometimes included in WinRAR archives. Don't make the assumption that a video is created on one computer and ends its life on the same computer, on the same code page Windows is running. Last edited by mariush; 19th July 2010 at 01:16. |
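To make the Euro example concrete, here is a small Python sketch (the helper name euro_bytes is invented for this illustration). It shows that the same character gets a different byte in different code pages, has no byte at all in others, and that a byte written under one code page reads back as a different character under another.

```python
EURO = "\u20ac"  # the Euro sign


def euro_bytes(encodings):
    """Map each encoding name to the bytes for the Euro sign, or None
    when the encoding has no mapping for it. Hypothetical helper."""
    result = {}
    for name in encodings:
        try:
            result[name] = EURO.encode(name)
        except UnicodeEncodeError:
            result[name] = None
    return result


table = euro_bytes(["ascii", "latin-1", "cp1252", "cp1251", "utf-8"])
for name, value in table.items():
    print(name, value)

# The cp1252 Euro byte, read on a machine that assumes cp1251:
misread = table["cp1252"].decode("cp1251")
print(repr(misread))
```

The last line is the freelancer scenario: the bytes survive the transfer, but the receiving machine's code page turns them into a different character.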
19th July 2010, 03:37 | #57 | Link | |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
Hm, I tried but can't figure out why one would use the Euro symbol in an AVISynth script.
Quote:
|
|
19th July 2010, 20:07 | #59 | Link |
Registered User
Join Date: Oct 2003
Location: Germany
Posts: 377
|
Actually, I am working as a software developer and know exactly what I'm talking about, since I have used the libraries for converting characters from code pages to Unicode and back, and so must know the theory behind the usage of such libraries. Using code pages is a possible solution and has been used for decades. But when you start using characters which are in two different code pages, you can't use code pages, or you must search very long for a code page which supports such character combinations (for example German ä and Russian я).
Why should someone use such a mix? Just because it's possible!!!! Don't ask why people do something, because people just do it. |
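The ä plus я case can be checked with a short Python sketch. The helper code_pages_covering and the candidate list are assumptions made for this illustration; the list only covers the Windows 125x code pages as Python's codecs define them, not every code page that exists.

```python
def code_pages_covering(text, candidates):
    """Return the encodings from candidates that can represent every
    character in text. Hypothetical helper for illustration."""
    covering = []
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue
        covering.append(name)
    return covering


# Assumed candidate set: the Windows "ANSI" code pages 1250 to 1258.
WINDOWS_CODE_PAGES = ["cp125%d" % n for n in range(9)]

print("ä :", code_pages_covering("ä", WINDOWS_CODE_PAGES))
print("я :", code_pages_covering("я", WINDOWS_CODE_PAGES))
print("äя:", code_pages_covering("äя", WINDOWS_CODE_PAGES))
print("äя:", code_pages_covering("äя", ["utf-8", "utf-16"]))
```

Each character alone is covered by at least one of these code pages, but none of them covers both at once, while the Unicode encodings handle the mix without any special case.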
19th July 2010, 21:57 | #60 | Link |
Angel of Night
Join Date: Nov 2004
Location: Tangled in the silks
Posts: 9,559
|
To avoid recompiling plugins, and since AVISYNTH_INTERFACE_VERSION isn't passed into any constructors, you could either attempt to find the value within the plugin, or have two copies of AddFunction, a char * and a wchar_t * version, the plugin gets whichever environment it tries to use. Kind of nasty, but that's baked code for you. Actually, you could even modify the header to point AddFunction (ThrowError, etc) to an internal AddFunctionW when recompiled, since the filter only gets the functionality if recompiled, but never loses any; that way you don't need to hack in a char * check.
No matter what happens, there will have to be char->wchar conversion functions, because not all plugins are ever going to be recompiled. |
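The two-entry-point idea can be sketched as follows. This is Python, not the real AviSynth C++ interface, and every name in it (ScriptEnvironment, add_function, add_function_w, the code_page argument) is hypothetical. It only shows the shape: a wide registration function that recompiled plugins would call, and a legacy narrow one that converts code-page bytes and forwards to it.

```python
class ScriptEnvironment:
    """Hypothetical stand-in for the host environment. This is not
    AviSynth's actual API, just an illustration of the layering."""

    def __init__(self, code_page="cp1252"):
        # Assumed: the code page that old, narrow-string plugins use.
        self.code_page = code_page
        self.functions = {}

    def add_function_w(self, name, callback):
        """Wide entry point: name is already a Unicode string."""
        self.functions[name] = callback

    def add_function(self, name_bytes, callback):
        """Legacy entry point: name arrives as code-page bytes and is
        converted once, here, before reaching the Unicode core."""
        self.add_function_w(name_bytes.decode(self.code_page), callback)


env = ScriptEnvironment(code_page="cp1251")

# An old plugin registers with narrow bytes; a recompiled one uses Unicode.
env.add_function("Фильтр".encode("cp1251"), lambda clip: clip)
env.add_function_w("NewFilter", lambda clip: clip)

print(sorted(env.functions))
```

The conversion lives in exactly one place, so the core can store Unicode names only while plugins that are never recompiled keep working.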
|
|