stdio.h, fcntl.h, io.h/unistd.h, signal.h, process.h, dir.h/dirent.h/direct.h

The heap memory functions malloc(), calloc(), realloc(), and free(), declared in stdlib.h, may also need to be changed.
In a well-written library, all of these changes will be confined to a relatively small number of files where the libc-to-OS interface occurs.
Language | Portable code? | Fast code? | Small functions can be inlined? | Preprocessor? |
---|---|---|---|---|
C | Yes | No | Yes, with GCC | Yes |
Inline assembler | No; same CPU and compiler only | Yes | Yes, with GCC | Same preprocessor as compiler |
Non-inline assembler | No; same CPU and compatible linker only | Yes | No | No preprocessor; or different preprocessor than compiler [*] |
[*] With care, the GNU C preprocessor can be used with almost any assembler. Your makefiles must be rewritten so that assembly is a two-step process, with separate filename extensions for the initial asm file and the preprocessed file. GNU software frequently uses the extension '.S' for asm that has not yet been preprocessed, and '.s' for preprocessed asm. This causes problems on DOS systems, where filenames are not case-sensitive and the two extensions cannot be told apart.
| | 32-bit code | 16-bit code, TINY, SMALL, or COMPACT memory models | 16-bit code, MEDIUM, LARGE, or HUGE memory models |
|---|---|---|---|
| Create standard stack frame, allocate 16 bytes for local variables, save registers | push ebp<br>mov ebp,esp<br>sub esp,16<br>push edi<br>push esi<br>... | push bp<br>mov bp,sp<br>sub sp,16<br>push di<br>push si<br>... | push bp<br>mov bp,sp<br>sub sp,16<br>push di<br>push si<br>... |
| Restore registers, destroy stack frame, and return | ...<br>pop esi<br>pop edi<br>mov esp,ebp<br>pop ebp<br>ret | ...<br>pop si<br>pop di<br>mov sp,bp<br>pop bp<br>ret | ...<br>pop si<br>pop di<br>mov sp,bp<br>pop bp<br>retf |
| Size of 'slots' in stack frame, i.e. stack width | 32 bits | 16 bits | 16 bits |
| Location of stack frame 'slots' | [ebp + 8]<br>[ebp + 12]<br>[ebp + 16]... | [bp + 4]<br>[bp + 6]<br>[bp + 8]... | [bp + 6]<br>[bp + 8]<br>[bp + 10]... |
If an argument passed to a function is wider than the stack, it will occupy more than one 'slot' in the stack frame. A 64-bit value passed to a function (long long or double) will occupy 2 stack slots in 32-bit code or 4 stack slots in 16-bit code.
Function arguments are accessed with positive offsets from the BP or EBP registers. Local variables are accessed with negative offsets. The previous value of BP or EBP is stored at [bp + 0] or [ebp + 0]. The return address (IP or EIP) is stored at [bp + 2] or [ebp + 4].
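The slot layout described above can be worked out mechanically. The following C sketch is an illustration only: the helper name `arg_offsets` and its parameters are hypothetical and do not come from any compiler or library. It assumes a stack-based C calling convention, rounds each argument up to a whole number of stack slots, and starts counting just past the saved BP/EBP and the return address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compute the [BP + n] or [EBP + n] offset of each
   argument of a function that uses the stack-based C calling convention.
   slot     = stack width in bytes (2 for 16-bit code, 4 for 32-bit code)
   ret_size = size of the return address in bytes
              (2 = near call in 16-bit code,
               4 = far call in 16-bit code, or near call in 32-bit code)
   sizes[]  = size of each argument in bytes, leftmost argument first
   out[]    = receives the offset of each argument */
static void arg_offsets(unsigned slot, unsigned ret_size,
                        const unsigned *sizes, size_t count, unsigned *out)
{
    /* the saved BP/EBP sits at offset 0; the return address follows it */
    unsigned offset = slot + ret_size;
    size_t i;

    assert(slot > 0);
    for (i = 0; i < count; i++) {
        /* an argument wider than the stack occupies several slots */
        unsigned slots = (sizes[i] + slot - 1) / slot;
        out[i] = offset;
        offset += slots * slot;
    }
}
```

For a function taking an 8-byte and a 4-byte argument in 32-bit code, this yields offsets 8 and 16. For a 4-byte and a 2-byte argument in 16-bit code it yields 4 and 8 with a near return address, or 6 and 10 with a far one.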
| | 32-bit code | 16-bit code, all memory models |
|---|---|---|
| 8-bit return value | AL | AL |
| 16-bit return value | AX | AX |
| 32-bit return value | EAX | DX:AX |
| 64-bit return value | EDX:EAX | Space for the return value is allocated on the stack of the calling function, and a 'hidden' pointer to this space is passed to the called function |
| 128-bit return value | hidden pointer | hidden pointer |
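The 'hidden pointer' convention can be pictured in plain C. In this sketch the type `u128` and both function names are made up for illustration. The second function shows roughly what a compiler arranges behind the scenes for the first under the conventions tabulated above; real compilers differ in the details.

```c
#include <assert.h>

/* Hypothetical 128-bit value, too wide to be returned in registers */
typedef struct {
    unsigned long long lo;
    unsigned long long hi;
} u128;

/* What the programmer writes: a function returning a wide value */
static u128 widen(unsigned long long x)
{
    u128 r;
    r.lo = x;
    r.hi = 0;
    return r;
}

/* Roughly what is generated: the caller allocates space for the result
   on its own stack and passes a pointer to it as an extra argument */
static void widen_hidden(u128 *ret, unsigned long long x)
{
    assert(ret != 0);
    ret->lo = x;
    ret->hi = 0;
}
```

An assembly routine that returns such a value would therefore expect the pointer among its stack arguments and store the result through it.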
EBX, EDI, ESI, EBP, DS, ES, SS

You need not save these registers:

EAX, ECX, EDX, FS, GS, EFLAGS, floating point registers

In some OSes, FS or GS may be used as a pointer to thread local storage (TLS), and must be saved if you modify it.
```nasm
    EXTERN _conv_mem_size       ; NASM syntax
    mov [_conv_mem_size],ax
```

Linux ELF does NOT use underscores. Watcom C uses trailing underscores for function names, and leading underscores for global variables.
If your GCC supports it, leading underscores can be turned off with the compiler option -fno-leading-underscore
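On the C side, GCC-compatible compilers predefine the macro `__USER_LABEL_PREFIX__`, which expands to whatever prefix the toolchain puts in front of C symbols. The sketch below uses it to build the assembly-level name of a C identifier. The helper names `symbol_prefix` and `asm_name` are hypothetical, and the fallback to an empty prefix for other compilers is an assumption.

```c
#include <assert.h>
#include <string.h>

#ifndef __USER_LABEL_PREFIX__
/* assumption: treat compilers that do not define the macro as 'no prefix' */
#define __USER_LABEL_PREFIX__
#endif

/* two-level stringizing so the macro is expanded before being quoted */
#define STR2(x) #x
#define STR(x)  STR2(x)

/* Typically "_" on targets that prepend an underscore, "" on ELF targets */
static const char *symbol_prefix(void)
{
    return STR(__USER_LABEL_PREFIX__);
}

/* Build the name an assembler file would use for a C identifier */
static void asm_name(const char *c_name, char *buf, size_t size)
{
    assert(strlen(symbol_prefix()) + strlen(c_name) < size);
    strcpy(buf, symbol_prefix());
    strcat(buf, c_name);
}
```

Calling `asm_name("conv_mem_size", buf, sizeof buf)` gives either `conv_mem_size` or `_conv_mem_size`, depending on the target.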
In C, the calling function must 'clean up the stack' (remove function arguments from the stack after the called function returns). In Pascal, the called function must do this, before returning.
Pascal identifiers are case-insensitive: MyKewlProc() will be stored in the object code file as MYKEWLPROC.
Watcom C uses a register-based calling convention. See sections 7.4, 7.5, 10.4, and 10.5 in cuserguide.pdf in the Watcom documentation. Individual functions can be declared to use the normal, stack-based calling convention.
GCC can be made to use a register calling convention by compiling with
gcc -mregparm=NNN ...
See the GCC documentation for details.
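GCC also lets a single function opt in to register passing with the `regparm` function attribute on 32-bit x86, instead of changing the convention for a whole compilation. The sketch below is a minimal illustration; the macro name `REGPARM3` and the function `muldiv` are made up, and on other targets the macro deliberately expands to nothing so the default convention applies.

```c
#include <assert.h>

/* regparm only applies to 32-bit x86 with a GCC-compatible compiler */
#if defined(__GNUC__) && defined(__i386__)
#define REGPARM3 __attribute__((regparm(3)))
#else
#define REGPARM3
#endif

/* With regparm(3) on 32-bit x86, the three integer arguments arrive in
   EAX, EDX, and ECX instead of on the stack */
static int REGPARM3 muldiv(int value, int mul, int div)
{
    assert(div != 0);
    return value * mul / div;
}
```

Callers need no special syntax: the declaration tells the compiler which convention to use, so `muldiv(6, 7, 2)` is written like any other call. An assembly routine called this way must read its arguments from those registers, not from [ebp + 8] and up.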
```nasm
; C prototype ('extern' and parameter names 'arg1' and 'arg2' are optional):
; extern unsigned long long shr64(unsigned long long arg1, int arg2);

BITS 32
SECTION .text

GLOBAL _shr64               ; omit the underscores for Linux ELF
_shr64:
    push ebp
    mov ebp,esp
;   push ecx                ; ECX is 'caller-save' for GCC
    mov ecx,[ebp + 16]      ; ECX=arg2, at slot #3
    mov eax,[ebp + 8]       ; EDX:EAX=arg1, at slot #1...
    mov edx,[ebp + 12]      ; ...and slot #2
    jecxz done              ; shift count of 0: nothing to shift
again:
    shr edx,1               ; EDX:EAX >>= 1, repeated ECX times
    rcr eax,1
    loop again
done:
;   pop ecx
    pop ebp
    ret                     ; 64-bit return value in EDX:EAX
```
```nasm
; C prototype:
; extern unsigned long shr32(unsigned long arg1, int arg2);

SEGMENT _TEXT PUBLIC CLASS=CODE

GLOBAL _shr32
_shr32:
    push bp
    mov bp,sp
    push cx
    mov cx,[bp + 8]         ; CX=arg2, at slot #3
    mov ax,[bp + 4]         ; DX:AX=arg1, at slot #1...
    mov dx,[bp + 6]         ; ...and slot #2
    jcxz done               ; shift count of 0: nothing to shift
again:
    shr dx,1                ; DX:AX >>= 1, repeated CX times
    rcr ax,1
    loop again
done:
    pop cx
    pop bp
    ret                     ; 32-bit return value in DX:AX
```
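When testing assembly routines like these, it can help to have a plain C reference to compare results against. The sketch below is not part of the original code, and the name `shr64_ref` is made up. It splits the value into two 32-bit halves and shifts one bit per pass, mirroring the SHR/RCR pair, on the assumption that the shift count is not negative.

```c
#include <assert.h>

/* Shift a 64-bit value right by 'count' bits, one bit per iteration,
   using two 32-bit halves the way the assembly routine uses EDX:EAX */
static unsigned long long shr64_ref(unsigned long long value, int count)
{
    unsigned long hi = (unsigned long)(value >> 32) & 0xFFFFFFFFUL; /* EDX */
    unsigned long lo = (unsigned long)value & 0xFFFFFFFFUL;         /* EAX */

    assert(count >= 0);
    while (count-- > 0) {
        unsigned long carry = hi & 1UL;   /* bit shifted out of the high half */
        hi >>= 1;                         /* like: shr edx,1 */
        lo = (lo >> 1) | (carry << 31);   /* like: rcr eax,1 */
    }
    return ((unsigned long long)hi << 32) | lo;
}
```

A test program can call both the assembly routine and this reference with the same arguments and compare the two results.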
as:

```asm
.ifdef UNDERBARS
.macro EXP sym
    .global \sym
\sym:
    .global _\sym
_\sym:
.endm
.macro IMP sym
    .extern _\sym
    .equ \sym,_\sym
.endm
.else
.macro EXP sym
    .global \sym
\sym:
.endm
.macro IMP sym
    .extern \sym
.endm
.endif
```

NASM:

```nasm
%ifdef UNDERBARS
%macro EXP 1
    GLOBAL _$%1
_$%1:
    GLOBAL $%1
$%1:
%endmacro
%macro IMP 1
    EXTERN _$%1
    %define %1 _$%1
%endmacro
%else
%macro EXP 1
    GLOBAL $%1
$%1:
%endmacro
%macro IMP 1
    EXTERN $%1
%endmacro
%endif
```

```
nasm -dUNDERBARS=1 ...
as --defsym UNDERBARS=1 ...
```

ELF systems (e.g. Linux) do not require leading underscores.
A good C (and C++) standard reference is at: http://www.dinkumware.com/htm_cl/index.html
The Better String library for C (bstrlib): http://bstring.sf.net/
- The unit of linkage is the module. For C, module == file. Put each function into its own file to prevent bloat (linking of unrelated and unnecessary functions).