6th November 2019, 11:43 | #1 | Link |
Registered User
Join Date: Jan 2007
Posts: 45
|
How do I move inline assembly to external file?
I'm currently recompiling VirtualDub plugins and need to move the inline assembly code to an external asm file, but I don't know how to call it from the cpp.
As an example, here's a very simple bit of inline code: Code:
void LUT_iSSE (Pixel32 *dst,int *LUT,int psize)
{
    __asm
    {
        mov edi, [dst]
        mov esi, [LUT]
        mov ecx, [psize]
        align 16
    GLoop:
        mov eax, [edi]
        xor ebx, ebx
        mov edx, eax
        mov bl, ah
        and edx, 0xff0000
        and eax, 0xff
        shr edx, 16
        movd mm0, [esi + eax * 4 + (512 * 4)]
        prefetchnta [edi + 512]
        por mm0, [esi + ebx * 4 + (256 * 4)]
        por mm0, [esi + edx * 4]
        movd [edi], mm0
        add edi, 4
        dec ecx
        jnz GLoop
        emms
    }
}
Greatly appreciated! |
6th November 2019, 13:37 | #2 | Link |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
I am very rusty on this stuff and not an Intel assembler guy, but maybe something like this:
myheader.h Code:
#ifndef __MYHEADER_H__   // Avoid multiple inclusions
#define __MYHEADER_H__

#include <windows.h>     // and whatever else
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>        // etc

// Other stuff ...

extern "C" {
    void __stdcall LUT_iSSE (Pixel32 *dst,int *LUT,int psize);
}

#endif // __MYHEADER_H__
Code:
#include "myheader.h"    // quotes are required around the file name

void __stdcall LUT_iSSE (Pixel32 *dst,int *LUT,int psize)
{
    __asm
    {
        mov edi, [dst]
        mov esi, [LUT]
        mov ecx, [psize]
        align 16
    GLoop:
        mov eax, [edi]
        xor ebx, ebx
        mov edx, eax
        mov bl, ah
        and edx, 0xff0000
        and eax, 0xff
        shr edx, 16
        movd mm0, [esi + eax * 4 + (512 * 4)]
        prefetchnta [edi + 512]
        por mm0, [esi + ebx * 4 + (256 * 4)]
        por mm0, [esi + edx * 4]
        movd [edi], mm0
        add edi, 4
        dec ecx
        jnz GLoop
        emms
    }
}
Perhaps others will give better advice. EDIT: One of the headers should define what Pixel32 is.
__________________
I sometimes post sober. StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace "Some infinities are bigger than other infinities", but how many of them are infinitely bigger ??? Last edited by StainlessS; 6th November 2019 at 13:42. |
6th November 2019, 13:45 | #3 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
He wants to move it to an asm file. There is no inline asm in asm files. I recommend googling; there's tons of info on the subject. I think there's even a section on avisynth.nl.
Edit1: Here is the page on the wiki. Edit2: Also, look at the code of plugins with external ASM modules.
__________________
Groucho's Avisynth Stuff Last edited by Groucho2004; 6th November 2019 at 13:48. |
6th November 2019, 13:51 | #4 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
In my case, MASM 5.1 was the last Assembler I worked with, very early 90's.
|
6th November 2019, 13:51 | #5 | Link | |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
Thanks G2K4, here on Wiki, Separate assembly modules :- http://avisynth.nl/index.php/Filter_...ler_optimizing
EDIT: Quote:
but have little knowledge other than that. Perhaps I might one day get into the intrinsics thing, but am put off by the whole menagerie of different CPU instruction requirements. EDIT: Well, I did do a teensy-weensy bit of 8080 back in 1981 [Zilog Z80A was better].
Last edited by StainlessS; 7th November 2019 at 14:05. |
6th November 2019, 14:02 | #6 | Link | |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
G2K4, is the posted header part usable as is, though?
[of interest to both me and bassquake] EDIT: I've never had to use extern "C", being only a C programmer; I think that it's a CPP thing. Quote:
Last edited by StainlessS; 6th November 2019 at 14:06. |
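On the extern "C" question above: it is indeed a C++ feature. It asks the compiler to give a function C linkage, i.e. to emit its name without C++ name mangling, which is what lets the linker match a separately assembled routine by its plain name. A minimal sketch, assuming a hypothetical function add_offsets that is not part of the plugin; the C++ definition at the bottom is only a stand-in so the example links on its own, whereas in the setup discussed here the body would live in the .asm file:

```cpp
#include <cassert>

// Hypothetical header content, usable from both C and C++ translation units.
#ifdef __cplusplus
extern "C" {
#endif

// C linkage: the symbol is emitted without C++ name mangling, so an external
// assembler module (or a C file) can define it under this plain name.
int add_offsets(int base, int offset);

#ifdef __cplusplus
}
#endif

// Stand-in definition so this sketch links by itself (assumption: in a real
// project this body would be replaced by the routine in the .asm file).
extern "C" int add_offsets(int base, int offset) {
    return base + offset;
}
```

A plain C compiler never sees the extern "C" wrapper because of the __cplusplus guard, which is why a C-only programmer would not normally run into it.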
6th November 2019, 14:06 | #7 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
Intrinsics seem to be the way to go for time critical applications but for what I'm writing nowadays, plain C/C++ is sufficient.
|
6th November 2019, 14:10 | #8 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
Dunno.
|
6th November 2019, 14:14 | #9 | Link |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
Good, we can Dunno together then
Bassquake, post how you get on please.
|
6th November 2019, 14:15 | #10 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
|
6th November 2019, 16:01 | #11 | Link |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
VirtualDub2 on sourceforge, hqdn3d, has asm, no idea if of use:- https://sourceforge.net/projects/vdf...files/plugins/
EDIT: It uses YASM assembler (which I think most/all VD stuff uses).
Last edited by StainlessS; 6th November 2019 at 16:04. |
6th November 2019, 17:40 | #12 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,316
|
My plugins use external asm files, assembled with the assembler that ships with Visual Studio (MASM). There is no need to install anything else; VS alone is enough.
You can check my github to see examples of code, for both x86 and x64 versions.
__________________
My github. |
7th November 2019, 11:52 | #13 | Link | |
Registered User
Join Date: Jan 2007
Posts: 45
|
Quote:
The code in the cpp is: Code:
extern "C" void lut_isse(Pixel32 *dst,int *LUT,int psize); Code:
.586
.mmx
.xmm
.model flat, c
.code

lut_isse proc dst:dword,LUT:dword,psize:dword
    public lut_isse

    mov edi,[dst]
    mov esi,[LUT]
    mov ecx,[psize]
    align 16
GLoop:
    mov eax,[edi]
    xor ebx,ebx
    mov edx,eax
    mov bl,ah
    and edx,0xff0000                  ;A2206 missing operator in expression
    and eax,0xff                      ;A2206 missing operator in expression
    shr edx,16
    movd mm0,[esi+eax*4+(512*4)]      ;A2070 invalid instruction operands
    prefetchnta [edi+512]
    por mm0,[esi+ebx*4+(256*4)]
    por mm0,[esi+edx*4 ]
    movd [edi],mm0                    ;A2070 invalid instruction operands
    add edi,4
    dec ecx
    jnz GLoop
    emms
    ret
lut_isse endp
END
I don't know assembly and was hoping I wouldn't have to rewrite any of it. Last edited by bassquake; 7th November 2019 at 12:00. |
7th November 2019, 14:54 | #14 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,316
|
You're not in C...
Code:
    and edx,0ff0000h
    and eax,0ffh
Code:
    and eax,ffh
Code:
    and eax,0ffh
Try this:
Code:
    movd mm0,dword ptr[esi+eax*4+(512*4)]
    ....
    movd dword ptr[edi],mm0
Code:
    movd qword ptr[edi],mm0
Also, I would write the beginning like this (but maybe it will produce the same result).
Code:
    mov edi,dst
    mov esi,LUT
    mov ecx,psize
The only registers you can alter without saving them are eax, ecx, edx and all the mm* and xmm* registers. If you change esi, edi, ebx, ebp or esp, you'll need to back them up and restore them. Note that if you change ebp, you will no longer be able to do things like this afterwards, because the arguments are addressed relative to ebp:
Code:
    mov edi,dst
This is what the .model flat, c and lut_isse proc ... lines are for. So, the start of your function should be:
Code:
    public lut_isse
    push esi
    push edi
    push ebx
    mov edi,dst
    ...
Code:
    ....
    emms
    pop ebx
    pop edi
    pop esi
    ret
Code:
    dec ecx
    jnz GLoop
Code:
    loop GLoop
I always use types whose size I'm sure of: uint32_t for a fixed size in both 32-bit and 64-bit builds, size_t for an unsigned type that is 32 bits in 32-bit builds and 64 bits in 64-bit builds, and ptrdiff_t for pointer offsets, as it also adapts its size to 32/64 bits but is signed.
Last edited by jpsdr; 7th November 2019 at 15:06. |
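The fixed-size types mentioned above could be applied to the C++ declaration roughly as follows. This is a sketch under assumptions: the re-typed lut_isse prototype and the pixel_distance helper are hypothetical and do not come from the plugin source, and the static_asserts on size_t and ptrdiff_t reflect mainstream 32/64-bit Windows and Linux targets rather than a guarantee of the C++ standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical re-declaration of the asm routine using fixed-width types, so
// the sizes seen by the C++ side match the dword arguments of the 32-bit asm.
// It is only declared here, never called.
extern "C" void lut_isse(std::uint32_t *dst, const std::int32_t *lut,
                         std::int32_t psize);

// Same size in 32-bit and 64-bit builds.
static_assert(sizeof(std::uint32_t) == 4, "uint32_t must be 4 bytes");
static_assert(sizeof(std::int32_t) == 4, "int32_t must be 4 bytes");

// Pointer-sized types (assumption: true on mainstream desktop targets).
static_assert(sizeof(std::size_t) == sizeof(void *), "size_t is pointer-sized");
static_assert(sizeof(std::ptrdiff_t) == sizeof(void *), "ptrdiff_t is pointer-sized");

// Hypothetical helper: signed distance in pixels between two positions in the
// same buffer. ptrdiff_t is signed, so a negative result is representable.
inline std::ptrdiff_t pixel_distance(const std::uint32_t *from,
                                     const std::uint32_t *to) {
    return to - from;
}
```

If one of the static_asserts fails on a given toolchain, the build stops at compile time instead of the asm silently reading arguments of the wrong width.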
7th November 2019, 17:29 | #15 | Link |
Registered User
Join Date: Jan 2007
Posts: 45
|
Cool thanks. I got it working with the following:
Code:
.586
.mmx
.xmm
.model flat, c
.code

lut_isse proc dst:dword,LUT:dword,psize:dword
    public lut_isse

    push esi                ; esi, edi and ebx must be preserved for the caller
    push edi
    push ebx

    mov edi,dst
    mov esi,LUT
    mov ecx,psize
    align 16
GLoop:
    mov eax,[edi]
    xor ebx,ebx
    mov edx,eax
    mov bl,ah
    and edx,0ff0000h        ; hex literal with leading digit and h suffix (was A2206)
    and eax,0ffh            ; same fix (was A2206)
    shr edx,16
    movd mm0,dword ptr[esi+eax*4+(512*4)]   ; explicit operand size (was A2070)
    prefetchnta [edi+512]
    por mm0,[esi+ebx*4+(256*4)]
    por mm0,[esi+edx*4 ]
    movd dword ptr[edi],mm0                 ; explicit operand size (was A2070)
    add edi,4
    loop GLoop
    emms

    pop ebx
    pop edi
    pop esi
    ret
lut_isse endp
END
|
7th November 2019, 19:17 | #16 | Link |
Registered User
Join Date: Mar 2015
Posts: 775
|
It's probably better to convert this to plain C++ (by the way, you may already have a C++ version); this asm is not doing anything special.
What is the plugin?
__________________
VirtualDub2 |
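For reference, a plain C++ version of the loop might look like the sketch below. It assumes Pixel32 is a 32-bit unsigned integer and that the table holds at least 768 entries laid out the way the asm indexes them; lut_plain is a hypothetical name, not the plugin's actual fallback routine.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: Pixel32 is a 32-bit unsigned pixel value.
using Pixel32 = std::uint32_t;

// Plain C++ equivalent of the asm loop. `lut` must hold at least 768 entries:
//   lut[0..255]   is indexed by bits 16-23 of the pixel,
//   lut[256..511] is indexed by bits 8-15,
//   lut[512..767] is indexed by bits 0-7.
// The three looked-up values are ORed together and written back in place.
void lut_plain(Pixel32 *dst, const int *lut, int psize) {
    assert(dst != nullptr && lut != nullptr);
    for (int i = 0; i < psize; ++i) {
        const Pixel32 p = dst[i];
        const unsigned hi  = (p >> 16) & 0xff;
        const unsigned mid = (p >> 8) & 0xff;
        const unsigned lo  = p & 0xff;
        dst[i] = static_cast<Pixel32>(lut[hi]) |
                 static_cast<Pixel32>(lut[256 + mid]) |
                 static_cast<Pixel32>(lut[512 + lo]);
    }
}
```

One behavioural difference from the asm: this version does nothing when psize is zero or negative, whereas the asm loop decrements the counter before testing it and so expects psize to be at least 1.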
8th November 2019, 13:23 | #18 | Link |
Registered User
Join Date: Mar 2015
Posts: 775
|
You can get rid of the asm by removing the lines below, so that the code following the else always runs:
Code:
    if ((CPUF_SUPPORTS_INTEGER_SSE & ff->getCPUFlags()))
        LUT_iSSE (dst,mfd->Lut,psize);
    else
Notes for x64:
    SetWindowLong -> SetWindowLongPtr
    GetWindowLong -> GetWindowLongPtr
|