Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. Domains: forum.doom9.org / forum.doom9.net / forum.doom9.se |
|
|
#42 | Link | |
|
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,275
|
Quote:
But AFAIK the first Sandy Bridge generation will only support AVX with 256-bit registers, rather than the full 512-bit. Still, that's twice the size of the SSE registers. Also, you'll need Windows 7 with SP1 to be able to use AVX, or some recent Linux kernel.
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊ Last edited by LoRd_MuldeR; 25th December 2010 at 23:34. |
|
|
|
|
|
|
#43 | Link | |||
|
Registered User
Join Date: Sep 2007
Posts: 5,669
|
Quote:
Quote:
|
|||
|
|
|
|
|
#44 | Link |
|
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,275
|
Didn't know that AVX is FP-only. That's a pity...
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊ |
|
|
|
|
|
#45 | Link | |
|
Banned
Join Date: Oct 2010
Posts: 119
|
Quote:
he says avx is "Float-only, thus a useless pile of tripe", yet what he fails to mention is that he could convert the code to floating point; there's nothing that says it must be integer based.

as a very simple example, take the following code snippet:

for ( a = 1; a < 100001; a++ ) for ( b = 1; b < 100001; b++ ) { ab = a * b; }

if you use the variable declaration int a, b, ab; you cause the above to be executed on the alu (integer unit). if however you use float a, b, ab; it's executed using the floating point registers. depending on the compiler you can even do something like __m128i a, b, ab; and perform a scalar calculation using the sse registers (there's a bit more code required than just that, but you get the idea).

yes, it would be a lot of work to rewrite the code to take advantage of the new avx registers, and it's contingent on gcc supporting the required assembler instructions (he could always spend the dough and buy a copy of intel's compiler, though he would also need a copy of visual c++). but there's nothing inherently integer based about the code (other than that's the way he wants it), and there's nothing really standing in the way of changing it to take advantage of sandy bridge's capabilities. in all fairness, if he did do this he would need to maintain 2 versions of x264, one for cpus that support avx and one for those that don't, and he may not be willing to do that. |
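On the "2 versions of x264" point: one common way to avoid separate builds is to pick the implementation at run time through a function pointer. Below is a minimal sketch of that idea in C. All names here (sad_row_fn, sad_row_c, sad_row_fast, CPU_FLAG_FAST, select_sad_row) are made up for illustration and are not taken from x264; the "fast" routine is only a placeholder that forwards to the C code, and real CPU feature detection is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Every implementation of the kernel shares this signature. */
typedef int (*sad_row_fn)(const uint8_t *a, const uint8_t *b, size_t n);

/* Portable reference: sum of absolute differences over n bytes. */
static int sad_row_c(const uint8_t *a, const uint8_t *b, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += abs(a[i] - b[i]);
    return sum;
}

/* Stand-in for a hand-tuned SIMD routine (hypothetical).
 * It forwards to the C code so that this sketch stays portable. */
static int sad_row_fast(const uint8_t *a, const uint8_t *b, size_t n)
{
    return sad_row_c(a, b, n);
}

/* Hypothetical feature flag; a real program would derive it from CPUID. */
enum { CPU_FLAG_FAST = 1 };

/* Pick one implementation, once, based on the detected flags. */
static sad_row_fn select_sad_row(unsigned cpu_flags)
{
    return (cpu_flags & CPU_FLAG_FAST) ? sad_row_fast : sad_row_c;
}
```

A caller would do something like sad_row_fn sad = select_sad_row(flags); once at startup and then call sad(a, b, n) everywhere, so a single binary can serve CPUs with and without the newer instructions.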
|
|
|
|
|
|
#46 | Link | |
|
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Quote:
Anyways, since you're such a genius, write me a 16x16 SAD function that uses AVX and floating point input and runs in under 35 clock cycles (the speed of the SSE implementation on a Core i7). Here's the C code (with uint8_t converted to float):
Code:
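#include <math.h>

/* Sum of absolute differences over a 16x16 block of float pixels.
 * Strides are given in elements (floats), not bytes. */
static float sad_16x16( float *pix1, int stride_pix1, float *pix2, int stride_pix2 )
{
    float sum = 0;
    for( int y = 0; y < 16; y++ )
    {
        for( int x = 0; x < 16; x++ )
            sum += fabsf( pix1[x] - pix2[x] ); /* fabsf avoids a round trip through double */
        /* advance both pointers to the next row */
        pix1 += stride_pix1;
        pix2 += stride_pix2;
    }
    return sum;
}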
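For comparison, here is a sketch of the same function next to the 8-bit integer form it was converted from. The names sad_16x16_u8, sad_16x16_f32, ROW_BYTES_U8 and ROW_BYTES_F32 are made up for this illustration, and the byte counts assume a 4-byte float. The point it tries to show: a row of sixteen 8-bit pixels is 16 bytes and fits in one 128-bit SSE register, while the same row as floats is 64 bytes, which would take two 256-bit AVX registers.

```c
#include <math.h>
#include <stdint.h>
#include <stdlib.h>

/* Integer SAD over a 16x16 block of 8-bit pixels. */
static int sad_16x16_u8(const uint8_t *pix1, int stride_pix1,
                        const uint8_t *pix2, int stride_pix2)
{
    int sum = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += abs(pix1[x] - pix2[x]);
        pix1 += stride_pix1;
        pix2 += stride_pix2;
    }
    return sum;
}

/* The same computation after widening every pixel to float. */
static float sad_16x16_f32(const float *pix1, int stride_pix1,
                           const float *pix2, int stride_pix2)
{
    float sum = 0.0f;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += fabsf(pix1[x] - pix2[x]);
        pix1 += stride_pix1;
        pix2 += stride_pix2;
    }
    return sum;
}

/* Storage needed for one 16-pixel row in each representation. */
enum {
    ROW_BYTES_U8  = (int)(16 * sizeof(uint8_t)), /* 16: one 128-bit register */
    ROW_BYTES_F32 = (int)(16 * sizeof(float))    /* 64 with 4-byte floats */
};
```

Both functions return the same value for 8-bit input, since the largest possible 16x16 SAD (256 * 255 = 65280) is still exactly representable in a float; the float version just moves four times as much data to get there.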
__________________
Follow x264 development progress | akupenguin quotes | x264 git status ffmpeg and x264-related consulting/coding contracts | Doom10 Last edited by Dark Shikari; 26th December 2010 at 05:34. |
|
|
|
|
|
|
#47 | Link |
|
Mr. Sandman
Join Date: Sep 2003
Location: Haddonfield, IL
Posts: 11,768
|
pwned...
FP math is ALWAYS slower than INT math, unless you have a CPU with ridiculously big FP registers. Also, FP math leads to precision problems over time... unless you use a ridiculously high FP precision... That said, INT math is a way better solution.
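A small sketch of the precision point, assuming float is an IEEE 754 single (24-bit significand), which is the case on mainstream platforms. The helper names are made up for illustration. Once a float accumulator reaches 2^24 = 16777216, adding 1 no longer changes it, whereas a 32-bit integer keeps counting.

```c
#include <stdint.h>

/* Returns 1 if adding 1.0f to base leaves the value unchanged.
 * The volatile store forces the sum to be rounded to float. */
static int float_absorbs_increment(float base)
{
    volatile float next = base + 1.0f;
    return next == base;
}

/* Integer counterpart: an increment is never lost (it can only
 * wrap around at UINT32_MAX, which is a different failure mode). */
static int int_absorbs_increment(uint32_t base)
{
    return (uint32_t)(base + 1u) == base;
}
```

Under those assumptions float_absorbs_increment(16777216.0f) returns 1 while float_absorbs_increment(16777215.0f) returns 0. Sums that stay below 2^24, such as a single 16x16 SAD of 8-bit pixels, are still exact in float, so the problem only shows up with larger accumulations.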
__________________
MPEG-4 ASP Custom Matrices: EQM V1(old), EQM AutoGK Sharpmatrix (aka EQM V2), EQM V3HR (updated 01/10/2004), EQM V3LR, EQM V3ULR (updated 04/02/2005), EQM V3UHR (updated 17/12/2004) and EQM V3EHR (updated 05/10/2004) Info about my ASP matrices. MPEG-4 AVC Custom Matrices: EQM AVC-HR Info about my AVC matrices My x264 builds. Mooo!!! |
|
|
|
|
|
#48 | Link | |
|
Banned
Join Date: Oct 2010
Posts: 119
|
Quote:
as for the homework assignment, i will gladly admit that i can't do it, but i will throw you a bone and use an excuse that you are fond of: company x hasn't sufficiently documented technology y, so it's their fault not mine. simply replace x with intel and y with avx. and here's some more excuses: i don't own an avx enabled cpu and i don't have a compiler that supports that instruction set (i don't think gcc supports it yet). but i'm man enough to admit that like most people i don't know how to code with avx instructions...yet. |
|
|
|
|
|
|
#49 | Link | |
|
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Quote:
Pixels are only 8-bit, and transform intermediates (as well as DCT coefficients) are only 16-bit. x264 makes very minimal use of anything larger than 16-bit. Ironically, with integer SIMD, the variety of instructions available for 32-bit is actually rather lacking. For 16-bit, for example, you have pmulhw, pmullw, pmulhrsw, and pmaddwd for multiplication, providing a pretty good variety of instructions. For 32-bit, you basically only have pmulld and pmuldq -- the former of which is slow and SSE4-only, and the latter of which only does two multiplies, hardly justifying SIMD at all. |
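To make the four 16-bit multiply flavours concrete, here is a sketch of what each instruction computes for a single lane, written as plain scalar C. The model_* function names are made up for illustration. The code assumes an arithmetic right shift for negative values and two's complement narrowing, which C leaves implementation-defined but which holds on mainstream compilers.

```c
#include <stdint.h>

/* pmullw: keep the low 16 bits of the 32-bit product. */
static uint16_t model_pmullw(int16_t a, int16_t b)
{
    return (uint16_t)((int32_t)a * b);
}

/* pmulhw: keep the high 16 bits of the signed 32-bit product. */
static int16_t model_pmulhw(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * b) >> 16);
}

/* pmulhrsw: Q15 fixed-point multiply with rounding. */
static int16_t model_pmulhrsw(int16_t a, int16_t b)
{
    int32_t t = (((int32_t)a * b) >> 14) + 1;
    return (int16_t)(t >> 1);
}

/* pmaddwd: multiply two adjacent pairs and add them into one
 * 32-bit result (the building block of a dot product). */
static int32_t model_pmaddwd(int16_t a0, int16_t a1, int16_t b0, int16_t b1)
{
    uint32_t lo = (uint32_t)((int32_t)a0 * b0);
    uint32_t hi = (uint32_t)((int32_t)a1 * b1);
    return (int32_t)(lo + hi); /* unsigned add wraps instead of overflowing */
}
```

The real instructions apply these operations to every 16-bit lane of a register at once (eight lanes in a 128-bit SSE register), which is why having all four variants available matters for 16-bit transform code.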
|
|
|
|
|
|
#50 | Link | |
|
Banned
Join Date: Oct 2010
Posts: 119
|
Quote:
on the topic of gpu accelerated encoding: is one of the reasons you have claimed that, for any given number of threads, a cpu will be faster that with the cpu you can use 8 bit and 16 bit ints, while with cuda (and its brethren) you have to use 32 bit ints minimum?

is it also safe to assume that bulldozer, with its 2 128 bit alus per core, will be THE cpu to get for x264 encoding?

2 more quick questions: you recently signed a licensing agreement with pegasys, and reading some of the press releases it seems that you created a parallel "commercial friendly" license under which you license x264 llc (that is the name of the commercial variant, is it not?). does this not violate the spirit, if not the letter, of the gpl? i know many companies consider the gpl an "infectious" license, but doesn't the gpl explicitly forbid taking gpl'd code and making it closed source? does it not also require that any derivative work also be gpl'd? by creating a parallel licensing scheme haven't you a) created a derivative that's not gpl'd, b) opened the door for companies to create derivatives that are not gpl'd, c) opened the door for companies to close source the x264 code they license from you, and d) perhaps most importantly, opened the door for a company to make some simple changes and try to claim copyright to them, a claim that they could use to prevent you from making similar changes to the gpl'd version of x264?

lastly, i'm wondering what ide you use during the development of x264. i'm assuming you use gcc to build the executables, but do you use a front end like code blocks or dev-c++? also, what optimization options, if any, do you use? do you target any specific architecture, simply use -O3, or a combination? thanks. |
|
|
|
|
|
|
#51 | Link |
|
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,275
|
Commercial usage is perfectly fine for GPL'd software. The license clearly says that you are allowed to use the software for any purpose, explicitly including commercial purposes.
Moreover, commercial development/distribution and open source are not necessarily contradictory. Just think about commercial Linux distributions, like RHEL. Last but not least, the authors of x264 could decide to continue the development of their software under some closed-source license at any time, because they own the copyright. However, they certainly do not have to do this in order to be able to license their software commercially. And there's absolutely no indication of such a plan at this time. (I think the "commercial" license of x264 is more related to patent issues and/or support contracts, something that is important for companies who use x264 in their products.)
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊ Last edited by LoRd_MuldeR; 26th December 2010 at 22:27. |
|
|
|
|
|
#52 | Link |
|
Registered User
Join Date: May 2006
Posts: 957
|
The commercial license for x264 just frees companies from releasing all code they link with x264 as GPL. Each company that wishes to do this must obtain a license. It explicitly does not cover the AVC patent license. All useful code changes will be committed as GPL into x264 therefore letting everyone use them.
tl;dr LURK MOAR
__________________
x264 log explained || x264 deblocking how-to preset -> tune -> user set options -> fast first pass -> profile -> level Doom10 - Of course it's better, it's one more. |
|
|
|
|
|
#53 | Link | |
|
Banned
Join Date: Oct 2010
Posts: 119
|
Quote:
lastly, my view of gpl'd software has always been that once it's gpl'd it's the same as being put into the public domain. copyright laws do not allow one to take something out of the public domain, not even whoever put it there in the first place. http://www.gnu.org/licenses/gpl.html i interpret the gpl to mean that you are not permitted to take a gpl'd product and release it under an alternate licensing scheme, not even if you're the person who gpl'd it in the first place. i'm interested in hearing DS' take on this... |
|
|
|
|
|
|
#55 | Link | |
|
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,275
|
Quote:
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊ Last edited by LoRd_MuldeR; 27th December 2010 at 00:02. |
|
|
|
|
|
|
#56 | Link |
|
Registered User
Join Date: Aug 2008
Location: The Land Of Dracula (Romania - EU)
Posts: 934
|
when money was invented the platonic love was gone...so anything is possible...
_
__________________
if you ask a question and somebody give you the correct answer don't forget to leave a "thank you" note... Visit The Land Of Dracula (Romania - EU)! |
|
|
|
|
|
#57 | Link | |||||
|
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Quote:
Quote:
Note some of this has changed recently; ATI added an instruction to do a SAD of 4 8-bit integers, for example. Quote:
Quote:
2. Any copyright holder is free to release their work under any license. Releasing something as GPL does not mean you can't release it as something else too. Many popular software programs are available under multiple licenses: a popular example is Firefox, which I recall is triple-licensed. A popular example of commercially-licensed GPL software is MySQL.

3. Companies are required under our license to give all of their changes to x264 back to us (if we ask). Furthermore, they sign over their rights to those changes -- we get co-ownership of them, allowing us to do whatever we want with them -- including releasing them as GPL along with the rest of x264. This means there won't be proprietary forks. I would not have gotten agreement from the other developers without this promise -- nor would I have supported the plan myself. This is why I consider it in the spirit of the GPL: it still ensures that all improvements make it back to the community, which is what the GPL is really all about.

Note there may be patches we don't release, but only because we don't consider them useful. If someone asks, we'll probably still be happy to go get it anyways. An example is a patch that adds UTF-16 path support for statsfiles, something I consider utterly useless. Quote:
__________________
Follow x264 development progress | akupenguin quotes | x264 git status ffmpeg and x264-related consulting/coding contracts | Doom10 Last edited by Dark Shikari; 27th December 2010 at 03:07. |
|||||
|
|
|
|
|
#58 | Link | ||
|
Banned
Join Date: Oct 2010
Posts: 119
|
Quote:
i know that starting with the core 2, intel went to a single cycle sse engine and a 4 wide architecture, but i thought the biggest difference that the core i7 brought, other than the cache improvements, was that it extended the core 2's ability to fuse 32 bit instructions and treat them as one to 64 bit instructions. i never heard anything about it having 3 128 bit alus. to hear amd say it, bulldozer's 128 bit alus are something never before seen in a desktop cpu. Quote:
this is going to sound like an amateur question but it's been a while since i built a project like x264 without using make on a linux system, how would i go about building x264 on a vista system just with gcc? i want to run a couple of experiments with various optimization options, just to see what kind of speed up, if any, is possible. i'm also thinking of using c to pascal, c to fortran and c to basic translators to port the code over to the respective languages, so that i may see a) what it would look like in said languages and b) what the relative performance of a good pascal, fortran and basic compiler would be in relation to gcc. |
||
|
|
|
|
|
#59 | Link | ||
|
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Quote:
Quote:
|
||
|
|
|
|
|
#60 | Link | |
|
pencil artist
Join Date: Jan 2006
Posts: 202
|
Quote:
__________________
fevh264 - open-source baseline h.264 encoder |
|
|
|
|
| Tags |
| media engine, x.264 |
|
|