Old 11th December 2011, 17:32   #221  |  Link
LoRd_MuldeR
Software Developer
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,196
As far as I can see, x264 uses the plain "char" type throughout. This supports Unicode just fine on Linux, because Linux uses UTF-8 everywhere, including in its API functions.

On Windows however, the local ANSI Codepage will be used for char-strings. This means: Everything that doesn't fit into the local Codepage is already "messed up" when it arrives at the "main" function.

Also, even if you use UTF-8 inside your application, functions like fopen() won't be able to deal with UTF-8 encoded strings - on Windows.


There are basically two workarounds:

(1) Use the wchar_t type with UTF-16 all the way, on the Windows platform. This means you will have to use wmain() instead of main(), _wfopen() instead of fopen(), wcslen() instead of strlen() and so on.

(2) Write short wrapper functions that convert between UTF-8 and UTF-16 for the Windows platform. This way you can keep using the char type with UTF-8 everywhere inside your own application.


Solution (1) requires the use of platform-specific macros all over the program code. At least if you want to support Windows (wchar_t/UTF-16) as well as Linux (char/UTF-8) with the same unmodified code.

At the same time with solution (2) 99% of your code can remain unchanged, but for every API call that goes outside your own code (and uses strings), you'll have to convert back to UTF-16 - on Windows.
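
To make workaround (2) a bit more concrete, here is a minimal sketch of such a wrapper. It is not taken from x264 or from any existing patch; the names fopen_utf8() and utf8_to_utf16() are made up for illustration. On Windows it converts the UTF-8 arguments to UTF-16 with MultiByteToWideChar() and calls _wfopen(); on every other platform it simply forwards to fopen().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>

/* Convert a NUL-terminated UTF-8 string into a newly allocated UTF-16
 * string. Returns NULL on failure; the caller must free() the result. */
static wchar_t *utf8_to_utf16(const char *utf8)
{
    /* First call only asks for the required length (including the NUL). */
    int len = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, NULL, 0);
    if (len <= 0)
        return NULL;
    wchar_t *wide = malloc((size_t)len * sizeof(wchar_t));
    if (wide && MultiByteToWideChar(CP_UTF8, 0, utf8, -1, wide, len) <= 0) {
        free(wide);
        wide = NULL;
    }
    return wide;
}
#endif

/* Hypothetical helper: open a file whose name and mode are UTF-8 strings. */
FILE *fopen_utf8(const char *path_utf8, const char *mode_utf8)
{
    assert(path_utf8 != NULL && mode_utf8 != NULL);
#ifdef _WIN32
    FILE *fp = NULL;
    wchar_t *wpath = utf8_to_utf16(path_utf8);
    wchar_t *wmode = utf8_to_utf16(mode_utf8);
    if (wpath && wmode)
        fp = _wfopen(wpath, wmode);   /* UTF-16 variant of fopen() */
    free(wpath);
    free(wmode);
    return fp;
#else
    /* Assumption: the file system API accepts UTF-8 directly, as on Linux. */
    return fopen(path_utf8, mode_utf8);
#endif
}
```

The rest of the program keeps passing char strings around and only calls fopen_utf8() where it used to call fopen(). On Windows this only helps if the strings in argv[] are already UTF-8, so main() has to be adapted as well.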


I have successfully applied method (2) to a bunch of Linux tools that had been ported to Windows, but without Unicode support. It usually is not a big deal and I guess the same could be done with x264...


That's my "Unicode Support" stuff, I usually hack into Linux-style C programs:
* http://pastie.org/private/hnevzpd1cu5zv04ovptjbq
* http://pastie.org/private/acu7qmfrcyvdswmwmcjaw

And this is how you would have to change the main method to make this stuff work:
* http://pastie.org/private/uwavpv4vosjckawh0whpwa


If there is any chance to get Win32 Unicode support committed into x264 (and if really nobody else has done this yet), I would be happy to submit a patch.
__________________
There was of course no way of knowing whether you were being watched at any given moment.
How often, or on what system, the Thought Police plugged in on any individual wire was guesswork.



Last edited by LoRd_MuldeR; 11th December 2011 at 17:55.
Old 11th December 2011, 18:50   #222  |  Link
kemuri-_9
Compiling Encoder
 
Join Date: Jan 2007
Posts: 1,348
What you really only care about is that filenames can be UTF-8, not the entirety of x264.

Updating the entirety of x264 to be UTF-8 on Windows is far more extensive than what LoRd_MuldeR's patches indicate....
His patch even opens up a can of worms by passing UTF-8 strings into functions that firmly expect ANSI strings, which is going to cause fun issues later.

I vaguely recall a patch from one of the x264 licensees on the subject, but I'm not sure what exactly happened to it. I'm guessing it was forgotten due to lack of interest.
__________________
custom x264 builds & patches | F@H | My Specs
Old 11th December 2011, 18:57   #223  |  Link
LoRd_MuldeR
Well, the good thing about UTF-8 is that all ASCII characters (which includes the "control characters", such as '\n' and friends) are identical between ANSI and UTF-8.

Moreover, all UTF-8 characters that cannot be represented in plain ASCII are encoded so that none of their bytes can be confused with an ASCII character: every byte of a multi-byte sequence has its high bit set. Thus the byte-oriented C string functions deal properly with UTF-8.
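
A small sketch of what that means in practice. Both function names are made up for this example, and get_extension() is deliberately naive (it just looks for the last '.'). The point is only that strrchr() can search for an ASCII character in a UTF-8 string without ever hitting a false match inside a multi-byte sequence, while strlen() keeps counting bytes, not characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the text after the last '.', or "" if
 * there is none. Safe for UTF-8 input, because the byte 0x2E ('.') can
 * never occur inside a multi-byte UTF-8 sequence. */
const char *get_extension(const char *path)
{
    assert(path != NULL);
    const char *dot = strrchr(path, '.');
    return dot ? dot + 1 : "";
}

/* Count code points instead of bytes: continuation bytes have the bit
 * pattern 10xxxxxx and are skipped. Assumes the input is valid UTF-8. */
size_t utf8_count_codepoints(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```

For the UTF-8 string "caf\xC3\xA9.mkv" (the file name "café.mkv"), get_extension() finds "mkv", strlen() reports 9 bytes and utf8_count_codepoints() reports 8 characters.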

Also keep in mind that x264 has actually been using UTF-8 encoded strings on the Linux platform since the beginning of time. AFAIK what you get as argv[] in your main() function is UTF-8 on Linux.

Printing out Unicode characters properly to the Windows console, whether they are encoded as UTF-8 or UTF-16 or whatever, is another funny story and probably not the most important thing to worry about.

BTW: The code snippets above are not patches for x264; they just give an idea of how a simple patch could be made. I have patched probably a dozen tools that way and it seems to work pretty well...

(This is actually how LAME handles Unicode on the Win32 platform. Looking at the LAME code gave me the initial idea of how to do it.)
Last edited by LoRd_MuldeR; 11th December 2011 at 19:05.
Old 11th December 2011, 19:06   #224  |  Link
kemuri-_9
Yes, the basic ASCII characters 0 - 0x7f are generally the same across all codepages.

However, the byte values 0x80 to 0xFF mostly mean something different in each local codepage, and they differ again from how UTF-8 uses them.

This is the 'can of worms' I was referring to.
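
To illustrate the mismatch: in ISO-8859-1 the character 'é' is the single byte 0xE9, while UTF-8 encodes it as the two bytes 0xC3 0xA9. The sketch below converts ISO-8859-1 to UTF-8. The function name latin1_to_utf8() is made up, and ISO-8859-1 was picked only because its byte values map directly to Unicode code points; other codepages would need a lookup table or the OS conversion routines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert a NUL-terminated ISO-8859-1 string to
 * UTF-8. dst must have room for 2 * strlen(src) + 1 bytes. Returns the
 * number of bytes written, not counting the terminating NUL. */
size_t latin1_to_utf8(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    size_t n = 0;
    for (; *src; src++) {
        unsigned char b = (unsigned char)*src;
        if (b < 0x80) {
            /* ASCII range: identical in both encodings. */
            dst[n++] = (char)b;
        } else {
            /* 0x80..0xFF: becomes a two-byte sequence 110xxxxx 10xxxxxx. */
            dst[n++] = (char)(0xC0 | (b >> 6));
            dst[n++] = (char)(0x80 | (b & 0x3F));
        }
    }
    dst[n] = '\0';
    return n;
}
```

A function that expects ANSI input and receives the UTF-8 form will see two unrelated characters where the user typed one, which is exactly the kind of issue meant above.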
Old 11th December 2011, 19:16   #225  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,688
The patch from Pegasys was for UTF-16 support.
Old 11th December 2011, 20:21   #226  |  Link
LoRd_MuldeR
If anybody cares, I have hacked together a quick UTF-8 patch for Win32.

Attached Files
File Type: diff x264_win32_utf8.V12.diff (24.3 KB, 11 views)
Last edited by LoRd_MuldeR; 13th December 2011 at 20:52.
Old 11th December 2011, 20:23   #227  |  Link
Chumbo
Registered User
 
Join Date: Feb 2005
Posts: 585
Quote:
Originally Posted by kemuri-_9 View Post
What you really only care about is that filenames can be UTF-8, not the entirety of x264.
...
That's the case for me: I just need it to process my files that have UTF-8 characters in their names.

Quote:
Originally Posted by LoRd_MuldeR View Post
If anybody cares, I have hacked together a quick UTF-8 patch for Win32.

Sweet! Any chance of getting this into the latest dev build? I'm on 64bit so can the 64bit be patched likewise please?
__________________
Chumbo

Last edited by Chumbo; 11th December 2011 at 20:27.
Old 11th December 2011, 23:15   #228  |  Link
LoRd_MuldeR
Also added a workaround for Avisynth (and FFMS2), which, as far as I know, has no way of properly passing Unicode strings.

(Updated patch in previous post)
Last edited by LoRd_MuldeR; 12th December 2011 at 01:18.
Old 12th December 2011, 12:29   #229  |  Link
LoRd_MuldeR
Added a workaround for MP4 output too. The GPAC library doesn't support Unicode either.

(Updated patch in previous post)
Old 12th December 2011, 19:45   #230  |  Link
LoRd_MuldeR
Okay, FFMS2 does support Unicode/UTF-8 paths, if initialized accordingly. Unfortunately its UTF-8 support is broken for the index file, so we still need the workaround for the index file.

(Updated patch in previous post)
Old 13th December 2011, 04:07   #231  |  Link
kemuri-_9
All this is proving is that dealing with anything non-ASCII is just not worth it on Windows.

It's easy enough to just rename your files...
Old 13th December 2011, 12:36   #232  |  Link
LoRd_MuldeR
Well, the lack of proper Unicode support may be a negligible problem for US users, but it certainly is a big annoyance for people from certain other countries.

Not supporting Unicode is behind the times (this is not the early Nineties anymore). And adding proper UTF-8 support for Win32 is possible - with very minor code changes*.

I see few reasons not to do it. The biggest problem, at the moment, is that some third-party libs, such as Avisynth and FFMS2, screw up.

While I don't think there is much hope for Avisynth, FFMS2 seems to be under active development and already does support UTF-8 - just not for the index file (yet)**.

And even for those third-party libs that we can't get to work with UTF-8, we can still convert to "short" path names internally, which works most of the time.

(I know that the generation of "short" path names can be disabled by a registry hack, but I have yet to encounter such a system. And in that very rare case, we can still rename.)
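
For reference, the "short" path fallback could look roughly like the sketch below. This is not the code from the patch; path_for_legacy_lib() is a made-up name and the fixed MAX_PATH buffers are a simplification. On Windows it asks GetShortPathNameW() for the 8.3 form of the path and converts the result to the ANSI codepage, giving up if any character still does not fit. On other platforms it just returns a copy of the input.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: return a newly allocated path that a library
 * without Unicode support can open, or NULL on failure. The caller
 * must free() the result. */
char *path_for_legacy_lib(const char *path_utf8)
{
    assert(path_utf8 != NULL);
#ifdef _WIN32
    wchar_t wlong[MAX_PATH], wshort[MAX_PATH];
    if (!MultiByteToWideChar(CP_UTF8, 0, path_utf8, -1, wlong, MAX_PATH))
        return NULL;

    /* Fails (returns 0) if the file does not exist yet. */
    DWORD n = GetShortPathNameW(wlong, wshort, MAX_PATH);
    if (n == 0 || n >= MAX_PATH)
        return NULL;

    /* Convert to the local ANSI codepage and detect lossy conversion. */
    BOOL lossy = FALSE;
    int len = WideCharToMultiByte(CP_ACP, 0, wshort, -1, NULL, 0, NULL, NULL);
    if (len <= 0)
        return NULL;
    char *ansi = malloc((size_t)len);
    if (ansi && (!WideCharToMultiByte(CP_ACP, 0, wshort, -1, ansi, len,
                                      NULL, &lossy) || lossy)) {
        free(ansi);   /* short names disabled or conversion failed */
        ansi = NULL;
    }
    return ansi;
#else
    /* Assumption: UTF-8 paths work as they are, so a copy is enough. */
    size_t len = strlen(path_utf8) + 1;
    char *copy = malloc(len);
    if (copy)
        memcpy(copy, path_utf8, len);
    return copy;
#endif
}
```

A NULL result covers the rare cases mentioned above, where the only option left is to rename the file.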

__________________
(*) Most code changes in the patch are actually workarounds for third-party libs, that we hopefully can get rid of (not the libs, the workarounds), as soon as those libs are fixed
(**) It's actually a MinGW-specific problem with the fstream class. The workaround is creating your own implementation of fstream that uses _wfopen() internally. I have done that for a project at work already.
Last edited by LoRd_MuldeR; 13th December 2011 at 12:45.
Old 17th January 2012, 13:27   #233  |  Link
Juce
Registered User
 
Join Date: Mar 2011
Location: Finland
Posts: 25
The video comes out bluish if the source is a 48-bit-per-pixel PNG.

x264 r2145 (x264.nl), Windows XP 32 bit

Code:
x264 "frame%03d.png" -o testi.mkv 2> log.txt

ffms [error]: could not create index
lavf [info]: 1280x720p 0:1 @ 25/1 fps (vfr)
resize [warning]: converting from rgb48be to rgb48le
resize [warning]: converting from rgb48le to yuv420p16le
x264 [info]: using cpu capabilities: MMX2 SSE2Slow SlowCTZ
x264 [info]: profile High, level 3.1
1 frames: 0.82 fps, 1240.40 kb/s  
4 frames: 2.64 fps, 567.35 kb/s  
7 frames: 3.73 fps, 464.89 kb/s  
10 frames: 4.51 fps, 403.86 kb/s  
                                                                               
x264 [info]: frame I:1     Avg QP:18.33  size:  5501
x264 [info]: frame P:6     Avg QP:19.39  size:  1946
x264 [info]: frame B:3     Avg QP:21.61  size:   771
x264 [info]: consecutive B-frames: 40.0% 60.0%  0.0%  0.0%
x264 [info]: mb I  I16..4: 72.6% 26.6%  0.8%
x264 [info]: mb P  I16..4: 10.2%  1.2%  0.0%  P16..4: 15.2%  0.7%  2.0%  0.0%  0.0%    skip:70.7%
x264 [info]: mb B  I16..4:  2.9%  0.4%  0.0%  B16..8: 12.9%  0.4%  0.0%  direct: 0.1%  skip:83.3%  L0:30.9% L1:68.9% BI: 0.3%
x264 [info]: 8x8 transform intra:19.8% inter:99.7%
x264 [info]: coded y,uvDC,uvAC intra: 8.0% 25.9% 1.5% inter: 2.4% 9.2% 0.0%
x264 [info]: i16 v,h,dc,p: 43% 49%  2%  6%
x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu:  2% 12% 84%  0%  0%  0%  0%  0%  0%
x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 13% 14% 72%  0%  0%  0%  0%  0%  0%
x264 [info]: i8c dc,h,v,p: 49% 43%  7%  1%
x264 [info]: Weighted P-Frames: Y:0.0% UV:0.0%
x264 [info]: ref P L0: 84.4%  0.6% 12.5%  2.6%
x264 [info]: ref B L0: 45.3% 54.7%
x264 [info]: kb/s:389.84

encoded 10 frames, 4.51 fps, 403.86 kb/s


PNG-images: http://www.mediafire.com/file/vc5nkb...7y9/frames.zip
Old 17th January 2012, 17:54   #234  |  Link
MasterNobody
Registered User
 
Join Date: Jul 2007
Posts: 534
Quote:
Originally Posted by Juce View Post
The video comes out bluish if the source is a 48-bit-per-pixel PNG.

x264 r2145 (x264.nl), Windows XP 32 bit

Code:
x264 "frame%03d.png" -o testi.mkv 2> log.txt
...


PNG-images: http://www.mediafire.com/file/vc5nkb...7y9/frames.zip
That is a bug in the lavf input (ffmpeg/libav), because the same artefacts appear with:
Code:
avconv -i "frame%03d.png" -vcodec huffyuv test.avi
Old 21st January 2012, 21:04   #235  |  Link
professor_desty_nova
Registered User
 
Join Date: Nov 2006
Posts: 55
I just noticed a difference when the input is 10-bit AVC with the 8-bit version of x264:

with x264 revision 2120 the resize warning is: "resize [warning]: converting from yuv420p10le to yuv420p"

with x264 revision 2146 the resize warning is: "resize [warning]: converting from yuv420p10le to yuv420p16le"

Is this a bug? From the warning it seems now it's converting from 10 bit to 16bit prior to encoding in the 8bit version of x264!

I don't know if it's related to this:
If the input is 8bit, both 2120 and 2146 use the same number of I,P and B-frames.
If the input is 10bit, the resulting encode of 2120 and 2146 has a different number of I, P and B-frames.

Last edited by professor_desty_nova; 21st January 2012 at 21:18. Reason: more info
Old 21st January 2012, 22:17   #236  |  Link
kemuri-_9
This is actually not a bug; the old behaviour was the bug:

Upon receiving input from ffmpeg/libav (whether via libavformat or ffms), the input needs to be 'normalized' into something the x264cli system can recognize.

For 8-bit and 16-bit per channel formats, x264cli supports these natively (for what it does support), so no conversion is actually required.
However, any intermediate bit depths, such as 10 bits per channel, need to be normalized.

Previously, most, if not all, such formats were normalized to 8-bit. This is actually a quality degradation, so it was corrected to normalize the formats to 16-bit, to allow for retention of quality when passing the data into the filtering system.

Using an input -> intermediate -> final (libx264) notation:
10 -> 8 -> 10: what is passed to libx264 is not the same quality that was originally read in, but
10 -> 16 -> 10: has the same quality, as there was no bit-depth drop below what the input has.
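
The difference between the two chains can be checked with a few lines of C. Note the assumption: plain bit shifts stand in for the real conversion here, which is not necessarily what libswscale or x264cli do internally, and all function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* 10 -> 8 -> 10: the two least significant bits are thrown away. */
uint16_t roundtrip_via_8bit(uint16_t v10)
{
    assert(v10 < 1024);
    uint8_t v8 = (uint8_t)(v10 >> 2);
    return (uint16_t)(v8 << 2);
}

/* 10 -> 16 -> 10: the value is only padded and unpadded again. */
uint16_t roundtrip_via_16bit(uint16_t v10)
{
    assert(v10 < 1024);
    uint16_t v16 = (uint16_t)(v10 << 6);
    return (uint16_t)(v16 >> 6);
}

/* Count how many of the 1024 possible 10-bit sample values come back
 * changed after the given round trip. */
int count_lossy_values(uint16_t (*roundtrip)(uint16_t))
{
    int changed = 0;
    for (uint16_t v = 0; v < 1024; v++)
        if (roundtrip(v) != v)
            changed++;
    return changed;
}
```

With these shifts, the round trip through 8 bits changes 768 of the 1024 possible sample values (every value whose two lowest bits are not zero), while the round trip through 16 bits changes none.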
Old 22nd January 2012, 00:27   #237  |  Link
professor_desty_nova
Quote:
Originally Posted by kemuri-_9 View Post
This is actually not a bug, the old functionality was instead a bug:
...
So, if you are using the 8bit x264 binary and start with 10bit source, it now does something like 10 -> 16 ->8.
Old 22nd January 2012, 00:40   #238  |  Link
LoRd_MuldeR
Quote:
Originally Posted by professor_desty_nova View Post
So, if you are using the 8bit x264 binary and start with 10bit source, it now does something like 10 -> 16 ->8.
Yes, I think so.
Old 22nd January 2012, 17:48   #239  |  Link
kemuri-_9
Quote:
Originally Posted by professor_desty_nova View Post
So, if you are using the 8bit x264 binary and start with 10bit source, it now does something like 10 -> 16 ->8.
Yes, this is correct.

One might expect a
resize [warning]: converting from yuv420p16le to yuv420p
line to indicate this, but the down-conversion is actually handled separately from libswscale, so there is no message.
Last edited by kemuri-_9; 22nd January 2012 at 17:52. Reason: typo
Old 23rd January 2012, 09:07   #240  |  Link
professor_desty_nova
So, I guess the difference I'm seeing is because the down-conversion is now handled by a different part of the x264 binary, so what is fed to the encoder part is not the same. (With 10-bit input, the encodes from 2120 and 2146 have a different number of I-, P- and B-frames, and at the same CRF the 2146 file is bigger; with 8-bit input, the result stays the same between versions.)