OPERATING SYSTEMS DEVELOPMENT QUESTIONS AND ANSWERS ================================================================ Common errors ================================================================ - Bootsector (or entering pmode) doesn't work because the data segment is not where you think. (What's in DS? Did you load it from CS? What was the value in CS? What is your ORG value? Is the CS value correct for this ORG value? Are you _sure_?) - Loading boot code to address 0 (using INT 13h), which wipes out the real-mode interrupt vector table, which causes INT 13h to fail. - Can't link asm code and DJGPP/MinGW/CygWin code because, for DOS and Windows, asm identifiers must have leading underscores. - Only one interrupt from timer or keyboard because you didn't outportb(0x20, 0x20); Only one interrupt from keyboard because you didn't inportb(0x60); - Screwing around trying to make a binary kernel (because that's what your homebrew bootloader needs) or, with DJGPP/MinGW/CygWin, an ELF kernel (because you think that's what GRUB needs). ================================================================ Advice ================================================================ 1. Have realistic expectations. Writing an OS is HARD. Most people won't be able to create even something like DOS or Linux 0.01 in less than a year... probably a lot longer. To stay motivated (and, for a school project, to make sure you finish on time), it's a good idea to have intermediate milestones between "hello, world" and "Windows clone": kernels of steadily increasing complexity, each doing something useful or interesting or educational or novel. One possible roadmap is described below. 2. The PC is probably the best target for a homebrew OS. It's common, cheap, and has an open, "hackable" architecture. Unfortunately, it's based on the Intel x86 CPU, which has a bizarre and complex architecture compared to some other CPUs. Be prepared to learn a lot about this chip. Whatever CPU you target, you will definitely need to learn assembly language, even if you write an OS in a higher-level language such as C. 3. C is a good choice for writing an OS (the language was designed for this purpose). C++ might be a better choice if: - you're ambitious - you know C++ inside and out, and - you're not concerned about backward compatability. Interpreted languages (BASIC, Perl) are probably too slow and too inflexible for writing a general-purpose OS. "Safe" languages (Java, C#, Python) may not be suitable either, because of - the need for a large runtime (for e.g. garbage collection, exception handling, dynamic typing system, bytecode interpreter or JITter, etc.) - slow code because of the extra functionality in these languages, compared to C (garbage collection again, or array bounds-checking) - essential low-level features may be missing (interface to assembly-language code, pointers) Whatever language you chose, you must know it well. You can't write an OS in C if you don't know how pointers work; or how to write a strlen() function. 4. 32-bit x86 CPUs can run in 16-bit mode (real mode; used by DOS) or 32-bit mode (protected mode or pmode; used by most OSes other than DOS). As you may have guessed, 32-bit OSes are more interesting. This dichotomy causes problems, though: a 32-bit OS can't easily use 16-bit BIOS functions. 5. If you are writing a 32-bit x86 OS, use GRUB to boot it. Don't get side-tracked writing your own bootloader, because it's not as easy or fun or interesting or educational as you think. Note that GRUB can load 32-bit kernels in ANY file format. The kernel need not be binary or ELF. DJGPP COFF and Win32 PE COFF file formats will work fine. With the limited exception of chain-loaded bootsectors, GRUB won't load a 16-bit OS, or a 32-bit OS that starts in 16-bit mode. Consider building such an OS as a DOS .COM or .EXE file and 'booting' it from the MS-DOS or FreeDOS command line. (xxx - DOS hooks many BIOS interrupts, and there's no way to restore them -- to "unboot" DOS -- before loading another real-mode OS.) 6. Don't get side-tracked on these, either: - Creating your own programming language - Devising your own executable file format - Devising your own filesystem - Re-compiling DJGPP or CygWin so it supports ELF (see previous notes about GRUB) - Writing a GUI Things like scheduling and memory management may seem boring, but it's essential to have this infrastructure in place before you can move on to other things. You can use naive (simple but functional) algorithms and code for these, then replace them with something better at a later date. 7. Use GCC to build your C/C++ OS. Reasons: - Free downloads and free upgrades - Free support in Usenet groups, mailing lists, etc. - Fewer bugs than some commercial compilers - Many hosts: DOS (DJGPP), Windows (CygWin and MinGW), Mac OS X, various Unixes - Many targets: 32-bit x86, PowerPC, ARM, Hitachi SuperH, MIPS, 680x0, ColdFire, SPARC, and more - Compiler and linker are much more flexible than commercial tools OS code uses language features and paradigms not found in normal application code. One consequence of this is that OS code is more likely to expose bugs in compilers and other software tools: - GCC 2.95.2 version of MinGW has a bug (BSS size stored in wrong field of section header) that prevents it from working with other tools chains such as NASM - NASM 0.98.00 creates bad ELF files if user-defined sections (anything other than .text, .data, or .bss) are added to the file Try to use newer versions of GCC and other tools. 8. Writing an OS may or may not be a good project to learn new programming languages, new architectures, etc. If you're on a deadline, things will go most quickly if you already know: - hardware details (e.g. PC) - CPU details (e.g. x86) - assembly lanaguage for the CPU you're using - C or whatever high-level language you're using ================================================================ How/where do I start? ================================================================ Assumptions: - 32-bit OSes of increasing complexity, written in C with some assembly language code - Target: PC with 32-bit x86 CPU - You are familiar with PC hardware and architecture - You are familiar with x86 CPU architecture, including pmode - You know C and x86 assembly language - You know how to use GNU tools: gcc, ld (including linker scripts), as, ar, and the dread Make. ("Putting text on the screen") 1. Learn to put text on the screen by writing directly into video memory. If you are targetting an embedded system that does not have a screen, get the serial port working, and send a "hello, world!" message out the port to a terminal emulator. 2. Write a putch() function, which puts single characters on the screen and handles tabs, newlines, and scrolling ("C library") 3. Write a printf() function to display error messages. Begin writing a C library for the kernel (libk). ("Interrupts and CPU exceptions") 4. Handle CPU exceptions and software interrupts/traps. GCC has no built-in support for interrupt functions, so you need asm. ("PC keyboard") 5. Hardware interrupts (IRQs). Handle timer interrupts by blinking the character in the upper left corner of the screen (bottom of video memory). Handle keyboard interrupts by reading the scancode byte from I/O port 60h, converting to ASCII (or other character set), and putting the character on the screen with putch(). 6. Extend video code to use video memory for multiple (virtual) consoles, and handle ANSI/VT escapes for moving the cursor, changing text color, etc. ("Multitasking") 7. Cooperative multitasking, using setjmp() and longjmp(). Static tasks linked into the kernel (more like threads than tasks). Use a different virtual console for each task 8. Preemptive multitasking, driven by timer interrupt. Wait queues with sleep_on() and wake_up() functions 9. Begin defining and organizing "syscall" functions in the OS; functions meant to be called by applications, e.g. read(), write(), and select(). Write some simple apps e.g. Tetris ("Memory management") 10. Write a naive kernel memory allocator (BogoMalloc), with functions kmalloc(), krealloc(), and kfree(). Use kernel heap memory instead of video memory for virtual consoles 11. Task priorities. Break the keyboard handler into a top-half interrupt handler and bottom-half kernel thread. Have an idle task running at lower-than-normal priority and run the keyboard bottom-half at higher-than-normal priority 12. Add realtime clock driver and time() syscall. If you're not already doing so, try to allocate kernel data structures from the kernel heap, instead of using fixed-size arrays. BogoMalloc won't work with memory protection, so replace it with something a little more sophisticated, e.g. the example malloc() implementation in K&R's "The C Programming Language" ("Executable and relocatable file formats") 13. Dynamic tasks. Add code to the kernel to link and relocate ELF, DJGPP COFF, and/or Win32 PE COFF relocatable (.o) files. Build the tasks as .o files and use GRUB to load them as kernel modules. This code is tricky. Maybe skip this step and go directly to segment-based address translation, below. 14. Make the kernel handle syscalls via INT 30h instead of CALL. Create a second C library (libc), identical with libk except for the addition of syscall dispatchers, e.g.: #include <_syscall.h> int read(unsigned handle, void *buf, unsigned len) { int ret_val; __asm__ __volatile__( "int %1\n" : "=a"(ret_val) /* EAX EBX ECX EDX */ : "i"(SYSCALL_INT), "0"(SYS_READ), "b"(buf), "c"(len), "d"(handle)); return ret_val; } Link apps (still in relocatable format) with libc. Do not link them with the kernel, or with libk. ("Virtual addressing, address translation, and protection") 15. Segment-based address translation. Add code to the kernel to load ELF, DJGPP COFF, and/or Win32 PE COFF executable files. Link apps with libc as executable (non-relocatable) files, meant to run at a specific address. Continue using GRUB to load the tasks as kernel modules. 16. Protection. Make task code and data segment limits as small as possible. Run tasks at user privilege (ring 3). Program the I/O permission bitmap to prevent I/O by tasks. 17. Page-based protection, address translation, and kmalloc() 19. Devices: console character device, RAM disk block device. UNIX paradigm: same file descriptors and functions used for char devs, block devs, files, directories, pipes, and sockets. 20. Read-only VFS and FAT12 filesystem 21. Read-only command line shell; loads apps from RAM disk 22+ - IPC: signals, shared memory, message queues, anonymous pipes - Other filesystems: ext2 (Linux) and ISO-9660 (CD-ROM) - Read-write VFS, filesystems, and device arch - Paging 2: demand-loading, shared memory, discardable kernel code and data - More block devices: PIO mode ATA (IDE hard drive) and ATAPI (IDE CD-ROM) driver, PC floppy driver - Paging 3: memory-mapped files, virtual memory swapping, graphics-mode framebuffer in task data segment ================================================================ Web sites for OS development ================================================================ >However, I need some tutorials/information for absolute >newbies - everything from what protected mode is through to >what interrupts do what etc etc. If anyone knows of any good >URL's or general resources then I would appreciate it greatly. http://www.osdev.org/osfaq2/ http://www.osdev.org/cgi-bin/projects.cgi http://osdev.netrux.com/osring.htm#ring http://alexfru.chat.ru/elinks.html http://magic.hurrah.com/~sabre/os/ http://welcome.to/pmode http://dsos.cjb.net/ http://www.gaztek.co.uk http://www.thesky.demon.co.uk/os-development/index.html http://my.execpc.com/~geezer/johnfine/ http://my.execpc.com/~geezer/os/index.htm http://my.execpc.com/~geezer/osd/index.htm ================================================================ Books ================================================================ "Operating Systems design and implementation" by A.S.Tanenbaum (ISBN 0-13-637406-9) "Modern Operating Systems" by Tanenbaum (ISBN 0-13-588187-0) "Operating Systems Concepts", by Silberschatz and Galvin (ISBN 0-201-50480) "Operating Systems", by Deitel (ISBN 0-201-18038-3) "The Logical Design of Operating Systems", by Bic and Shaw (ISBN 0-13-540139-9) "Operating Systems", by Stallings (ISBN 0-02-415493-8) "The Design and Implementation of the 4.3BSD Unix Operating System", by Leffler, McKusick, Karels and Quarterman (ISBN 0-201-06196-1) "The Design of the Unix Operating System", by Bach (ISBN 0-13-201757-1) "The Indispensable PC Hardware Book" by Messmer "The Standard C Library" by P.J. Plauger "Understanding the linux kernel" from Oreilly (ISBN 0-596-0002-2) "Developing Your Own 32-bit Operating System" (MMURTL) by Richard Burgess "Protected Mode Software Architecture", by Tom Shanley (ISBN 0-201-55447-X) ================================================================ bootsector ================================================================ >Ok, i make my own boot program and then I copy it to one >disk using a 3 1.5 diskette (1.44mb) with the FDVOL program. > >I put on my code this at the end: > times ($$-$+510) db 0 > dw 0xAA55 >And when i boot my pc, my programs loads. Then i delete >these lines and boot my pc again and my programs loads. > >My question is: I have to put this: > times ($$-$+510) db 0 > dw 0xAA55 >or not? New BIOSes seem to ignore the 0xAA55, but you should leave it in so your code works OK on old PCs. Also, the Master Boot Record (MBR) will not load a bootsector if this value is missing. >Another question... my boot program can exced 512bytes? (I >know, I know... i need to make a better kernel not the >boot, but if is necessary it can be 512bytes+ ???) You must write the bootloader in "stages". The first stage must fit into 512 bytes, and it loads the larger second stage. >> A boot program starts at address 0000:07c00h, right? GRUB source code mentions BIOSes that jump to 07C0:0000 Maybe have a far JMP in your first-stage boot code to fix this. The offset (IP) of the JMP should agree with your ORG value. The segment of the JMP (CS) should be boot_cs = (7C00h - ORG_ADR) / 16 Or -- if your code doesn't use any far or absolute JMPs -- just make sure the value in DS agrees with your ORG value: boot_ds = (7C00h - ORG_ADR) / 16 >> To make that i need to put on my program [org 7c00h]. I >> compile my code and write it to a boot disk. The code >> works, but my problem is that when I view my program >> (boot.com) with Hexeditor or Debug it doesnt shows me the >> code starting at 07c00h... why? You must tell Hexeditor or Debug what the ORG value is. I don't know to do this with these programs, but with NDISASM, it's: ndisasm -o 0x7C00 myboot.bin >myboot.lst >So, how on Earth does the boot sector code translates the >partition-relative sector addresses to the global on-disk >sector addresses necessary for int13h accesses? FAT16 boot sector can read the hidden sectors value from the BIOS parameter block (BPB). In this case, hidden sectors is the number of sectors preceding the partition, i.e. the LBA value of the partition start. Just add this value to the partition- relative addresses to get the global sector addresses. >Will it read the MBR to do this? For this to work, there must be some standard way for the MBR and first-stage bootloader to communicate; some way for the MBR to say "I put the partition table at memory location NNNNN". The DOS MBR leaves DS:SI pointing to the 16-byte partition table record for the boot partition. >How the bootsector will know from what partition it is booted? I'm not sure. >Subject: Re: Does BIOS fills BPB when boot? No it doesn't. Nor does the BIOS use the BPB -- the name is very misleading. (Maybe it should be called Bootstrap Parameter Block instead of BIOS Parameter Block.) Also, note that the boot sectors of non-FAT filesystems, like ext2 (Linux) or FFS (BSD), do not have a BPB. > I'm trying to take a stab at writing an OS and I'm working > on a simple bootstrap. I can't find any information about > jumping from the initial bootstrap to a compiled C program. > How exactly is this done? Thanks. If booting from a hard disk, code in sector 0 of the hard disk (the master boot record, or MBR) finds an 'active' (bootable) partition on the hard disk, then loads and executes sector 0 of that partition (the boot sector). Floppies don't have partitions or an MBR; just a boot sector. The boot sector is 512 bytes. It gets loaded to address 7C00h. You can put whatever code in it you like. For compatability with DOS, however, you need a BIOS Parameter Block (BPB). Search the web for info on this, or download some bootloaders. If 512 bytes isn't enough for your OS, you have to write a loader. If the loader won't fit into 512 bytes, then the loader has to be written in 'stages'. I know from experience that 512 bytes is (barely) enough room for code that can find a binary file in the root directory of a FAT12 floppy disk and load it. Note BINARY file, above. You can compile your C code to .o files (relocatable object files) or to some kind of executable (like ELF, COFF, or .EXE), but in the end: 1. the kernel will have to be linked, objcopy-ed, or otherwise converted to binary format, or 2. a two-stage bootloader will be needed. The second stage contains the additional code needed for EXE-style relocations, or to decipher the COFF or ELF file sections, turn on pmode, load the file to extended memory, etc. ================================================================ installing a bootsector ================================================================ >My question is, after i write the assembly code for my >bootsector, how do i write the code to the floppy disks >bootsector? (Unix) dd if=my_boot skip=0 count=512 of=/dev/fd0 seek=0 -OR- cat my_boot >/dev/fd0 (DOS) partcopy my_boot 0 200 -f0 0 Get John Fine's PARTCOPY here: http://my.execpc.com/~geezer/johnfine/index.htm#zero Or, if you're desperate and you have DOS DEBUG available, you can use that: C:\>debug my_boot -w 0100 0 0 1 -q C:\> (The 'h' command in DEBUG will get you some help.) There is also a program called RAWRITE, but DOS RAWRITE will write anywhere from 3 sectors to an entire track at a time, instead of one sector. If you use it to install a FAT boot sector, it will trash the first FAT. If you want to install a FAT bootsector without changing the existing BIOS parameter block (BPB), you can do this: (Unix) dd if=my_boot skip=0 count=3 of=/dev/fd0 dd if=my_boot skip=36 count=476 of=/dev/fd0 seek=36 (DOS) partcopy my_boot 0 3 -f0 partcopy my_boot 24 1DC -f0 24 With all of these methods, be VERY careful. Type the wrong thing and you will overwrite the boot sector or MBR of your hard disk. "Bastian" wrote: >But the main problem is, I wanna copy my bootsector to the 2nd or 3th >partition. partcopy boot.bin 0 200 -aD (for D: partition) partcopy boot.bin 0 200 -aE (for E: partition) Alas, this does not work with non-FAT partitions (e.g. ext2). Partcopy also lets you copy partial sectors. These two commands copy 'fat12.bin' into the floppy bootsector but avoid the existing 33-byte BIOS parameter block: partcopy fat12.bin 0 3 -f0 partcopy fat12.bin 24 1DC -f0 24 ================================================================ getting memory sizes ================================================================ >is here any good way to count a system's memory? >something that might be easily portable to other architectures? To get PC extended memory size, start with BIOS interrupts, in this order: - INT 15h AX=E820h (look for type 1 memory) - INT 15h AX=E801h - INT 15h AH=88h Details: http://my.execpc.com/~geezer/osd/boot/biosmem.asm If none of those work, read extended memory size from CMOS ; get extended memory size from CMOS -- won't ; report more than 63.999 meg (65535/1024) mov al,18h out 70h,al in al,71h mov ah,al mov al,17h out 70h,al in al,71h mov [ext_mem_size],ax ; in K INT 15h AX=E801h and INT 15h AH=88h do not provide conventional memory size -- get this with INT 12h, or by reading CMOS, or by reading the 16-bit value at memory location 0040h:0013h Direct probing is fraught with problems: - Address "aliasing" caused by unconnected high-order address lines. I have a 486 system in which addresses 64 meg, 128 meg, 192 meg, etc. "wrap around" to address 0. - Memory mapped hardware that causes the computer to freeze when your memory probe steps on it. - PC might have a bizarre and unique method of turning on the A20 gate; a method not supported by your direct probing code. - C memory probing code that does not use 'volatile', or buggy compiler (Borland C++ 3.1) for which 'volatile' does not work - Bus float causes the memory test to succeed even if there's no memory ================================================================ compilers that support multiple segments/far pointers ================================================================ >*my OS uses segmentation (in protected mode, of course; it's >written for 386+). But I understand that pretty much all >compilers out there produce flat code. Is there one that >doesn't? Maybe Watcom C: http://www.openwatcom.org Turbo and Borland C, including the free Turbo C 2.0: http://community.borland.com/museum/ also support far pointers, but they produce only 16-bit code. ================================================================ Only One Interrupt (O1I) ================================================================ >I set the timer, exactly as you do in the source and it does >then give me one int 8, which is great. In my interrupt handler >i set the timer again, but I do not get any more int 8's ?? I >know my handler works, because in my primary task (the only one >running offcourse), i have some code to continously monitor the >keyboard and display a small message if a key is pressed. this >works all the time. I dont understand the code you have to tell >the PIC that the interrupt is done, is this necessary? YES! >; mov al, EOI ;Tell 8259 that interrupt is done >; out M_PIC, al > iret Un-comment that code. >In the below it has been commented out, but if I use it, I get >whole bunch of interrupt 6's ??? (_not_ exception 6, but >external int 6). So now I'm even more confused! :-) Maybe your interrupt table (IVT or IDT) is not properly set up. This bug happened to me with a real-mode kernel when I did not completely fill in the first 49 entries in the IVT. ================================================================ Where is free memory? ================================================================ >I just finished writing my boot sector/loader. I would like >to know if there are sensitive addresses in RAM I should not >touch while switching to protected mode(loading GDT,IDT,etc...). 000000-0003FF real-mode interrupt vector table (IVT) 000400-0004FF BIOS data segment 000500-09FBFF FREE CONVENTIONAL MEMORY 09FC00-09FFFF extended BIOS data area (EBDA) [1] 0A0000-0BFFFF video memory 0C0000-0FFFFF video and motherboard BIOS ROMs 100000-10FFFF high memory area (HMA) [2][3] 110000- FREE EXTENDED MEMORY [1] This EBDA is 1K. Some PCs have larger EBDAs, some have no EBDA. Use INT 12h to get the top of conventional memory and do not use memory above this limit. [2] DOS 7.0+ loads HIMEM.SYS silently and automatically, then loads itself into the HMA. If you load a pmode OS from DOS, avoid the HMA. (Actually, if you load a pmode OS from DOS, use XMS to allocate extended memory.) The HMA should be safe to use from a bootloader, though. [3] Odd-numbered megabytes of memory are not accessable unless you turn on the A20 gate. >And what's this memory hole I heard about? Some 16-bit SVGA boards (e.g. Cirrus 5422) support a linear framebuffer. The 16-bit ISA connector has only 24 address lines, so any memory-mapped device must be below the 16 Mbyte line. A hole in RAM between 15 Mbytes and 16 Mbytes can be left for a 1 Mbyte linear framebuffer. I don't know if any OSes today actually use or support this -- maybe OS/2 did. The "hole" could also be for certain types of scanners, I think. INT 15h AX=E820h will give you a "map" of extended memory, showing where these holes (if any) are. ================================================================ Minimal IDT ================================================================ >What is the minimal IDT one has to setup after jumping to >protected mode to be able to run the machine? I've read in >some tutorials that you don't need one but it seems to me you >would need to have some basic interrupts enabled... You don't need an IDT if: - Your code is sufficiently correct that it does not cause processor-detected exceptions (this is the hard part), AND - Your code does not use programmed exceptions (INT nn and related instructions), AND - Hardware interrupts are disabled (CLI) ================================================================ Using disk drives in pmode ================================================================ >Does switching to protected mode, therefore loosing the >interrupt vector functionality, mean that I can't access the >hard drive using int13h. Correct. >If so, how do I do it? You can switch back to real mode before issuing BIOS calls. This is how the GRUB bootloader works. Or, you can create a virtual 8086 mode machine monitor (VMM), which switches the CPU to virtual 8086 mode to issue BIOS calls. This is how DOS extenders work. Here are some examples: http://alexfru.chat.ru/epm.html#v86monitor http://members.tripod.com/~ladsoft/pmode/vm386.zip http://www.cs.utah.edu/flux/moss/ http://x86.ddj.com/ftp/dloads/v86mon.zip http://my.execpc.com/~geezer/johnefine/v86call3.zip http://my.execpc.com/~geezer/osd/pmode/v86mm.zip Or you can write your own pmode code to access the disks. >2) When my OS boots up, the hard drive and floppy drive LEDs >stay on and never shut off. Is this normal? The floppy drive is normally turned off by a timer in the BIOS. For a pmode OS or bootloader, you should shut off the floppy motor(s) yourself: mov dx,3F2h xor al,al out dx,al Hard drive light on could be caused by buggy IDE code. Make sure the sector count is not zero. Make sure the sector value starts with 1 instead of 0. Make sure interrupts are enabled (at the drive, at the 8259 chips, and at the CPU). Make sure you acknowledge the interrupt by reading the drive status register (I/O port 1F7h). >3) Has anybody got a module to use the hardrive in protected >mode under DMA? If so, I would like to see it as the >information is very scarce. There is info on this on Hale Landis' site: http://www.ata-atapi.com ISA (16-bit) hard drive interface boards are not normally wired to use ISA DMA for IDE access. You probably need a PCI (32-bit) IDE controller, e.g. Intel PIIX ("Triton"). The PCI DMA registers usually obey the SFF-8038 spec, so you can use the same code on different systems. If you are just starting out, I recommend you ignore DMA and use programmed I/O (PIO) instead. Much simpler code. This applies only to IDE hard disks -- with floppies, you're pretty much stuck with using ISA DMA. ================================================================ Keyboard driver ================================================================ >Tell me if I'm wrong, but each key I press on the keyboard >causes an interrupt right? Sometimes more than one interrupt. Each key produces a unique sequence of bytes (the "make code") when pressed, and another stream of bytes (the "break code") when released. Each byte produces an IRQ 1 interrupt. >Then the key is read from some area into a buffer.? Read the scancode, one byte at a time, from I/O port 60h >So now I have to implement it? Yes :) >3) Since you don't have int 16h in Pmode, what would be a good >way to get keyboard input from the user without going to V86 >mode? The keyboard causes an IRQ right? Yes, it causes IRQ 1. For a pmode OS, you should reprogram the 8259 interrupt controller chips so this IRQ goes to INT 21h or a higher INT. The handler for INT 21h should do something like this: push ax ; save other regs if you use them in al,60h ; get one byte of scancode ; ... either process the scancode here or put it ; into a queue for another task to process... mov al,20h ; reset interrupt at 8259 chip out 20h,al pop ax iret >4) Should the keyboard IRQ handler print caracters at the >screen or just leave them in a buffer for the operating system >(or both...)? The IRQ handler should not call non-reentrant functions, and the handler should be as fast as possible. So, it should put the characters into a buffer, then wake up another high-priority task (the "bottom half" interrupt handler) to process them. ================================================================ Video driver ================================================================ >5) When I get to the last line of the video display 80x25, is >there a way to tell the video card to lower its page (to start >the page at b80A0h instead of b8000h so I don't have to move >everything)? #include /* outportb() */ unsigned short crtc_adr = 0x3D4; /* 0x3B4 for monochrome */ unsigned short offset = 80; /* select CRTC register 12 (MSB of framebuffer offset) */ outportb(crtc_adr + 0, 12); /* the selected CRTC register appears at crtc_adr + 1 */ outportb(crtc_adr + 1, offset >> 8); /* select CRTC register 13 (LSB of framebuffer offset) */ outportb(crtc_adr + 0, 13); outportb(crtc_adr + 1, offset & 0xFF); This type of hardware scrolling can not continue indefinitely. Eventually the framebuffer will extend beyond the end of video memory. >How can I change from screen mode 0x3 (default boot) to say >0x12 (hi-res 640x480x16color) in pmode do I have to use ports >or what ?? 1. Switch to real or virtual 8086 mode to call the (S)VGA BIOS mode-set function. 2. Use someone else's pmode code to set VGA video modes, such as Tauron VGA utilities: http://alexfru.narod.ru/miscdocs/ega_vga/tauron30.zip 3. Write your own pmode code to set VGA video modes. 4. Write your own pmode code to set SVGA video modes. This depends on the SVGA chip used. ================================================================ Interrupts ================================================================ >How is an interrupt service routine written Partially or completely in assembly language. At the start of the handler, save all registers (or all registers that the interrupt handler code uses) and put known-good values into the segment registers. At the end of the handler, pop all the registers you saved, drop the error code from the stack (if there was one), and end with IRET instead of RET. If the interrupt came from hardware, you must acknowledge it at the 8259 interrupt controller chip: if(irq >= 8) outportb(0xA0, 0x20); outportb(0x20, 0x20); The handler should finish quickly. Lengthly operations should be deferred to a foreground task (Linux "bottom half"; Windows DPC). The handler must not perform non-reentrant functions like calling printf() or doing floating-point math. Here is example code: >> > I'd suggest to put there some extra stuff: >> ...and provision for switching stacks, to allow software >> task switching, would be handy, as well as pushing fake error >> codes for compatibility with CPU exceptions. > >Right, this is how I made my scheduler: > >_irq_00_wrapper: > push ds > push es > push fs > push gs ; saving segment registers and > pushad ; other regs because it's an ISR > > mov bx, 10h > mov ds, ebx > mov es, ebx ; load ds and es with valid selector > > mov ebx, esp > push ebx ; pass esp of a current process > call _scheduler ; call scheduler > pop ebx > mov esp, eax ; get esp of a new process > > popad ; restoring the regs > pop gs > pop fs > pop es > pop ds > iretd > >scheduler() simply returns a different value for ESP, >if a process switch is required. >1) I want to write my OS from scratch using protected mode. >Should I have to program the PIC. Yes, reprogram it if you use pmode. The BIOS programs the PIC in a way that makes it difficult to distinguish hardware interrupts from CPU-generated exceptions. >Is there any information [for the 8259 chip] that is easy to >understand for a beginner anywhere on the web. The spec sheets for the 8259 are confusing. Just copy the code from Linux, or from another OS, or from here: http://my.execpc.com/~geezer/osd/intr/8259.c The code is just a bunch of 'outb' instructions directed at I/O ports 20h, 21h, A0h, and A1h. There are four 'initialization control words' (ICWs). ICW2 is the one of interest: it sets the software interrupt corresponding to the IRQ. >2) I know i have to write an IDT for my operating system. But >I would really like to have an example of a well documented >simple ISR to get a feel a what it is I'm supposed to do. http://my.execpc.com/~geezer/osd/intr/x86-32.zip >For example, on the 1st exeption >(divide by zero), what should the interrupt routine do? If the exception was caused by a task, the task should be killed. If the exception happened in the kernel, you should panic: display an error message, display a hexadecimal dump of register values, and freeze. >3) What are the drawbacks of using V86 mode? Wouldn't it be >better to work in protected mode all along. By the way, I >don't intend to support 16bit anyway (don't you think we should >all drop the 16bit stuff?). You lose a lot of functionality if you abandon 16-bit code completely. Neither the motherboard BIOS nor the video BIOS are completely 32-bit, and they probably never will be. >4) I know that there are 32 reserved exceptions. Does that >mean that int13h would start at table entry 20h + 13h = 33h? No, INT 13h goes to table entry 13h. >2. The IDT has a capacity of 256 interrupt vectors. I assume >these interrupt vector numbers are fixed, so where can I find >a list of _ALL_ these vector numbers. Only the first 32 are reserved; the others are user-defined. http://my.execpc.com/~geezer/osd/intr >3. I've read you implement multitasking by programming the >task-re-scheduler on the clock-interrupt. Which interrupt >vector number has the clock interrupt (is it 0x8???)? It depends on how the 8259 interrupt controller chips (the PICs) are programmed. The BIOS programs these chips so IRQ 0 is INT 8. My own OS maps IRQ 0 to INT 20h. >4. Where in memory should the GDTR and IDTR point to, or is >this just the developers choice. It's up to you. ================================================================ PnP serial mouse ================================================================ >I wrote a mouse driver for MS serial mice. It worked with my >old mouse, but when I got a new 700 dpi mouse, the problems >started: this new mouse sends "M" as identifier, i.e. is is MS >compatible. But when I initialize COM port (and mouse) with the >procedure START it sends to me 30 (!?) bytes, and I DIDN't >touch the mouse. What's that? It's a PnP identifier. Go to Microsoft's site: http://www.microsoft.com/hwdev/respec/pnpspecs.htm and get document PNPCOM.RTF >The IRQ handler of course interpret these bytes and move the >cursor to the top right corner, but I want my cursor to be >showed in the center of screen ! I have also one other old >mouse, and it sends 2 byte = "MM". The 30 bytes are: >4D, 33, 8, 24, 25, 23, 2D, 10, 10, 10, 11, 3C, 3C,2D, >2F, 35, 33, 25, 3C, 30, 2E, 30, 10, 26, 10, 23, 18, 10, 9 4D 33 "M3" legacy 3-button mouse identifier 08 PnP begin ID 01 24 PnP version 25 23 2D 10 10 10 11 "ECM0001" PnP identifier 3C "\\" extend optional serial number (none) 3C "\\" extend 2D 2F 35 33 25 "MOUSE" optional PnP class identifier 3C "\\" extend 30 2E 30 10 26 10 23 "PNP0F0C" optional compatible device ID PNP0F0C = "Microsoft- compatible Serial Mouse" 18 10 checksum 09 PnP end ID >How to eliminate this problem ? Write a smarter mouse driver. ================================================================ Where to get the Bochs PC simulator? ================================================================ >I need BOCHS compiled for Win32 platform (an EXE file). I do >not have MSVC to compile it myself, so any help will be >welcome. I'm developing an OS and I'm using Windoze 2000, so >many reboots are time-consuming. Please help me. http://bochs.sourceforge.net ================================================================ GRUB ================================================================ >I am a little confused about GRUB. I am trying to make my >kernel load with it. It says invalid exec format. >Does grub understand pure binary files??? You _must_ have a valid MultiBoot header in your kernel. >If yes: > What's wrong with this [NASM] code then: >_MultiBoot: > Magic dd 0x1BADB002 > Flags dd 00000000000000010000000000000011b > CSum dd 0x0 > Header dd 0x100000 > Load dd 0x100000 > LoadEnd dd 0x100000 > BSSEnd dd 0x0 > Entry dd _start _MultiBoot must be aligned to a dword (4-byte) boundary. Put this: align 4 before the declaration of _MultiBoot Also, CSum must be set: F_PAGE_ALIGN equ 1<<0 F_MEM_FIELDS equ 1<<1 MAGIC equ 0x1BADB002 FLAGS equ F_PAGE_ALIGN | F_MEM_FIELDS CHECKSUM equ -(MAGIC + FLAGS) ; The Multiboot header align 4 dd MAGIC dd FLAGS dd CHECKSUM > Header dd 0x100000 > Load dd 0x100000 > LoadEnd dd 0x100000 > BSSEnd dd 0x0 For non-ELF kernels, these should also be set the same as "Entry", i.e. to the appropriate physical addresses: Header dd _MultiBoot Load dd _StartOfTextSection LoadEnd dd _StartOfBSSSection BSSEnd dd _EndOfBSSSection ================================================================ "Unreal mode", "Big Real mode" ================================================================ >> It is possible to access 4 GB of memory in real mode, too. >> How is it done? > >it's done by means of protected mode, you set up limits of >segments to 4GB-1B in PMode, load segment registers with >selectors of these segments and exit to real mode leaving limits >of 4GB-1B. I have an example of that on welcome.to/pmode site or try here: http://my.execpc.com/~geezer/osd/pmode/unreal.asm Unreal mode is interesting, but not as useful as you might think. ================================================================ Basic pmode ================================================================ >1) I tried switching to protected mode but failed. Welcome, brother! >I am attempting to jump into protected mode for the first >time. There are a few things unclear to me. > >1- Can the GDT be in a segment it describes? Certainly. It usually is. >What I mean is GDT[1] has >a base of 0 with a limit of fffffh with g=1, therefore the gdt >must be in the segment it describes. Right?!? Right. >3. I always thought that you use 16bit code before setting the >cr0 bit and 32bit after. Where exactly should I be concerned >about 32 bit coding before this? 32-bit pmode doesn't "kick in" until you put valid selectors in all the segment registers. Between setting the PE bit in register CR0 and reloading the segment registers, the CPU is in an indeterminate state. It acts like real mode UNTIL you attempt to load a segment register. A hardware interrupt might do this, which is why interrupts should be disabled before enabling pmode. >4. I've read that for using 32 bit instructions you have to set >a(some?) control bit in the segment descriptors. Is this >accurate? Is this what you guys talk about when you say 16bit/ >32bit (see #3)? Yes, the Default size (D) bit, in the descriptor. When this bit is set for a code segment, the segment is USE32 (TASM syntax) or BITS32 (NASM syntax). Addressing modes and instructions for such a segment are 32-bit by default. To get 16-bit instructions, you need one or both of the override prefix bytes. If D=0 for the code segment descriptor, it is USE16/BITS16, and you will be in 16-bit pmode when you jump to the segment. For a stack segment, the D bit is called the Big (B) bit, and controls the width of the stack (16 or 32 bits). You should probably use a 32-bit stack with 32-bit code -- I don't know if you can mix a 32-bit stack with 16-bit code...or vice versa... or why you'd want to. I also don't know what effect the D bit has on data segments. Better make everything either 16-bit or 32-bit as appropriate. >5. How can segments with different attributes overlap (read- >write and execute only for example)? What happens when there >is access to that area? Memory protection is governed by values in the descriptors, thus, by the selectors that point to the descriptors, thus, by the segment register used to access memory. DJGPP puts constant data and large literals into the code segment. To avoid use of the CS: prefix when accessing this data, DJGPP overlaps the code and data segment. This means you could have self-modifying code! It works because instructions used to modify the code use the DS register implicitly, and the descriptor for DS says it's OK to write to this memory -- even if the CS descriptor says otherwise. >So does that mean that when I put USE32 in Masm/Tasm, the >assembler generates the instructions to set the operand size >Bit? Bzzzt, wrong. It's up to YOU to set the D bit in the descriptor to the proper value, depending on whether the descriptor describes a USE16 or USE32 segment. >Can I use the lower memory used by the bios and the interrupt >vector once I'm in protected mode or even before (to put GDT >and IDT)? I think it would be nice to have everything packed >neatly in the bottom of memory... The real-mode interrupt vector table (IVT) is 1024 bytes, and the BIOS data segment is 256 bytes, for a total of only 1280 bytes. If you leave it alone, you can return to real mode, call BIOS interrupts in virtual 8086 mode, or use some of this data from pmode (e.g. COM port I/O addresses at address 00400h). Also, if you are using code compiled from C, it's better to rig your page tables or descriptor tables so that accessing memory location 0 triggers a page fault or protection fault. This helps find bugs (NULL pointer references). >1) In protected mode, what kind of segment descritor should >be used for my stack? Use the data segment for the stack. The "expand-down" feature of pmode data descriptors is not too useful, in my opinion. >4) In protected mode, do all CALL's and JUMP have to be FAR? No. One advantage of pmode is that you can use a 32-bit version of the SMALL or TINY memory model, where all pointers are near (but without the 64K limits of 16-bit code). >5) I don't fully understand the 16 bit vs. 32 bit addressing >scheme with the prefix opcodes. If I write my 'segments' in >NASM with [bits 32], and I use register ax (not eax) will >there be a prefix? Correct. You can use the 8- and 32-bit registers or addressing modes without a prefix in a [BITS 32] code segment. You can use the 8- and 16-bit registers or addressing modes without a prefix in a [BITS 16] code segment. NASM will insert the prefix byte(s) automatically, but watch out. The address is trickier than the register. Maybe you have code running in "unreal" mode. The code segment is [BITS 16] but 32-bit addresses are allowed. You want to write to text-mode video memory: BITS 16 xor ax,ax mov es,ax mov byte [es:0B8000h],'H' NASM will truncate the 32-bit address to 16 bits. You end up writing to memory location 0000:8000h. To fix this: mov byte [es:dword 0B8000h],'H' Now NASM will insert the 32-bit address and the necessary prefix byte. >2) What does it mean to "patch the GDT"? To set the segment base addresses in the GDT, probably to give everything the same address in both real mode and pmode. This is a good thing, because it makes it easier to switch to pmode without crashing. >So far my kernel is in Pmode but I don't have a single LDT or >TSS, and the parts I wrote seem to work, but Intel's manual >says that at least one TSS is nescessary to be in pmode... >What's up with that? You need a TSS to store the ring 0 (kernel privilege) stack pointer while ring 3 (user privilege) code is running. The TSS also has a pointer to the I/O permission bitmap, which you need to block I/O access by user privilege tasks. >Ok, I created a structure (btw, the whole thing is NASM/DJGPP >combination) in C for my TSS, created one TSS statically(for >the kernel task), and created a GDT descriptor for it.So far so >good. But now, I need to fill out the static fields in the TSS >and make a call to it. Now I'm lost. ; only SS0:ESP0 and the I/O permission bitmap base are used here: tss: dw 0, 0 ; back link tss_esp0: dd 0 ; ESP0 dw KERNEL_DATA_SEL, 0 ; SS0, reserved dd 0 ; ESP1 dw 0, 0 ; SS1, reserved dd 0 ; ESP2 dw 0, 0 ; SS2, reserved dd 0 ; CR3 dd 0, 0 ; EIP, EFLAGS dd 0, 0, 0, 0 ; EAX, ECX, EDX, EBX dd 0, 0, 0, 0 ; ESP, EBP, ESI, EDI dw 0, 0 ; ES, reserved dw 0, 0 ; CS, reserved dw 0, 0 ; SS, reserved dw 0, 0 ; DS, reserved dw 0, 0 ; FS, reserved dw 0, 0 ; GS, reserved dw 0, 0 ; LDT, reserved dw 0, tss_end - tss ; debug, IO permission bitmap base tss_end: >Ok, enough with my mamblings, on to the questions: > 1) If I have no tasks running and I jump to the kernel TSS >the first time, what task state does it save(and where), and >what does it backlink to???? The task register (TR) must point to a second TSS that will store the current CPU state when you jump to a new TSS. > 2) Do I absolutely nescesseraly have to have an LDT for the >kernel task??? No. > 3) How do I jump to a task from my timer IRQ routine >assuming I have a GDT descriptor for it???(just call with >offset zero in the descriptor???)) xxx >1. I know the maximum capacity of the GDT (and LDT) is 8192 >entries. But how big should I take it for an operating system? My OS needs only 6 descriptors: NULL, TSS, kernel code, kernel data, user code, user data >And why shouldn't I take the maximum capacity. To reduce memory usage. To prevent bugs. (It can't cause problems if it's not there.) >how can i load a gdt and idt in c code? You don't :) Seriously, it's much easier to set up these structures in asm code. >I've got this code > >struct x86_desc { > unsigned short limit_low; /* limit 0..15 */ > unsigned short base_low; /* base 0..15 */ > unsigned char base_med; /* base 16..23 */ > unsigned char access; /* access byte */ > unsigned int limit_high:4; /* limit 16..19 */ > unsigned int granularity:4; /* granularity */ > unsigned char base_high; /* base 24..31 */ >} __attribute__ ((packed)); That's awful code! For it to work depends on three things that are not governed by the C language specification: 1. Structures packed or not. __attribute__((packed)) works for GCC, but not for other compilers. 2. Width of types. In this case, we want sizeof(unsigned short) == 2. Will that always be the case? Who knows? 3. Bit fields. Is limit_high the high nybble or is granularity? Will the compiler truly use a single byte for both? There's no way to tell. >But I can't translate it to pascal, because I don't know what >the :4 means. Repent! Use asm for the low-level pmode code. >how can i load a gdt and idt in c code? i would do it in asm >but i don't know at&t syntax that my c compiler needs One way or another, you must use asm. Maybe load the GDT and IDT with NASM code (Intel syntax) instead of GNU inline asm (AT&T syntax). >My Bootloader sets up a GDT, switches to PMode and jump's to >the kernel... This means, I have to set a new GDT... How can >I do this? GLOBAL entry ; this is the entry point call where_am_i ; where did the loader put us? where_am_i: pop esi ; ESI=physical adr (where we are) sub esi,where_am_i ; - virtual adr (where we want to be) ; now ESI=virt-to-phys ; bootloader GDT is gone, or not what we want, so set up a new one. ; All data segment addressing is relative to ESI until we set ; up page- or segment-based address translation mov eax,esi shr eax,16 ; fix up gdt_ptr and idt_ptr add [esi + gdt_ptr + 2],esi add [esi + idt_ptr + 2],esi ; fix up kernel data and code segment descriptors ; (segment-based address translation; SBAT) mov [esi + gdt3 + 2],si mov [esi + gdt3 + 4],al mov [esi + gdt3 + 7],ah mov [esi + gdt2 + 2],si mov [esi + gdt2 + 4],al mov [esi + gdt2 + 7],ah ; turn on SBAT lgdt [esi + gdt_ptr] mov ax,SYS_DATA_SEL mov ds,eax mov ss,eax mov es,eax mov fs,eax mov gs,eax jmp SYS_CODE_SEL:pmode ; now we can use absolute addresses in the data and BSS segments pmode: ================================================================ A20 ================================================================ >What does "A20 address line" mean? '286 CPU lets real-mode programs access the first 64K of extended memory (the high memory area, or HMA). This feature/ bug caused some very old and broken DOS programs to fail, because they relied on addresses above 1 meg "wrapping around" to address 0, as they do on an 8088 CPU. The A20 gate was added to let these programs run properly. >If it is disabled, is the memory access limited to under 1M? If you don't enable the A20 gate, only even-numbered megabytes of RAM will be accessible (0...1M, 2M...3M, 4M...5M, 6M...7M, etc.) It should certainly be enabled if you want access to all RAM, but it can be left off for simple kernels (pmode or real mode) that run in conventional memory. >And how can I enabled it? If an XMS manager such as HIMEM.SYS is loaded, use that. The second best way is via BIOS functions INT 15h AH=87h (copy extended memory) and INT 15h AH=89h (enter pmode). Otherwise: - Test if A20 is already on. Some BIOSes turn it on at boot time, and future PCs may not even have an A20 gate (always on). - Try INT 15h AX=2401h - Use INT 15h AX=2403h to see if the "fast" (port 92h) method is supported. If so: - Read I/O port 92h - Test bit b1 - If b1==1 but the A20 gate is off, something's screwed. Move on to a different A20 control method. - Set bit b1 (OR with 02h) and write it to I/O port 92h - Try the "AT" method: - Read the 8042 keyboard controller "output port" - Test bit b1 - If b1==1 but the A20 gate is off, something's screwed. Move on to a different A20 control method. - Set bit b1 (OR with 02h) and write it to the "output port" - Try the "Vectra" method (sending 8042 command byte 0DFh) For each of these methods, you should verify A20 is on afterwards. >Is it necessary >to enable A20 address before I switch CPU into protected mode? Only if you use extended memory (1 meg or above). >3) What is a simple test to see if the A20 line was enabled >properly. push ax ; (this is real mode code) push ds push es xor ax,ax mov ds,ax ; DS=0 dec ax mov es,ax ; ES=FFFFh mov ax,[es:10h] ; read word at FFFF:0010 (1 meg) not ax ; 1's complement push word [0] ; save word at 0000:0000 (0) mov [0],ax ; word at 0 = ~(word at 1 meg) mov ax,[0] ; read it back cmp ax,[es:10h] ; fail if word at 0 == word at 1 meg pop word [0] pop es pop ds pop ax je a20_not_enabled >> I was recently testing my code to check for the A20 line being >> enabled. When I put the disk into my test computer (A Packard >> Bell 90mhz) it happily reported that the A20 line was enabled, >> even though I hadn't enabled it. I then tried to turn it off >> using the port 92h method and it still said that it was on! At >> first I thought something was wrong with my code, but then I >> tried the 8042 method of disabling it and it reported it >> disabled. So from the looks of things it would appear that A20 >> is enabled by default on my Packard Bell. Is that possible? > >Do you load your test code this from DOS or from your own start >up code? In case you test this from DOS, and you use win9x/dos >7.x, you do know that DOS will load himem.sys even if it isn't >in your config.sys file, unless you use DOS=noauto? I believe >that himem.sys vill turn on the A20 line by default. Just a remark: some bioses have the option : enable A20 on startup ================================================================ Interfacing C code to asm code ================================================================ >I'm trying to make a kernel using DJGPP and NASM. I want as much as possible >of the kernel to be written in C but I realise that the stub will have to be >in ASM. How do I link between C and DJGPP? I've tried linking coffs but I >always end up with non-executable code. Just a few things I can think of offhand: 1. ASM symbols and labels are static (private) by default. To be accessed by code outside the file, they have to be GLOBAL or PUBLIC, depending on assembler syntax. 2. For DOS or Windows systems, or compilers that produce COFF files, an asm symbol meant to be accessed from C code needs an underscore at the beginning of the name. The C prototype does not need the underscore: C prototype: unsigned getPageFaultAdr(void); NASM code: GLOBAL _getPageFaultAdr _getPageFaultAdr: mov eax,cr2 ret 3. NASM won't mix 16- and 32-bit code in the same ELF or COFF file. Use aout format instead (nasm -o foo.o -f aout foo.asm) The DJGPP linker does understand aout format. If you do this, no 16-bit object can be at an offset > 0xFFFF, or you will get this linker error: x.o(.text+0x13): relocation truncated to fit: 16 text ================================================================ Wait queues ================================================================ >I have previously asked wether or not IRQ1 handler should print >characters on the screen or not. I got a very clear response of NO. >But then if the handler puts keyboard scan codes in a buffer, does that >mean that the shell has to keep on scanning the buffer continuously for >new keys? Wouldn't that be a waste of processor time? A task waiting for data from the keyboard can added to a list of tasks waiting for keyboard data, and then the task is marked "waiting" (or "blocked") so it doesn't get any time slices from the scheduler. This is the sleep_on() function. When the keyboard has some data, it can wake_up() the task (or tasks) waiting for it, making them runnable again. ================================================================ ELF ================================================================ >What is a good linker for win32/dos that creates ELF? http://www.multimania.com/placr/binutils.html http://elfbinutils.narod.ru/ Since ELF is the native format of Linux, ld for Linux GCC can also output ELF files. *** NOTE *** NOTE *** NOTE *** The GRUB bootloader works with ANY file format; not just ELF. There isn't a compelling reason to use ELF vs. DJGPP COFF for simple OS development. >*I am not sure I understand how compilers work, so forgive me >if the following question sounds dumb: when you use, say, >"printf" in your program, what does the compiler do wih it? Is >it an external function part of a shared library, so the link >is resolved at load time; or is it statically included in the >final executable (thus making the executable platform specific)? >The first solution, I guess, but then how exactly do you >resolve the link at load time? DJGPP uses static linking exclusively (your 2nd solution above). GCC for Linux gives you the choice of static or dynamic linking (your 1st solution). If you opt for dynamic linking, all calls to printf() go through the ELF Procedure Linkage Table (PLT). Here is a disassembly of "hello world" for Linux, compiled with ELF GCC (dynamic linking): 08048378 <.plt+10> jmp *0x80494e0 08048474
pushl %ebp 08048475 movl %esp,%ebp 08048477 pushl $0x80484b8 --> "hello" 0804847c call 08048378 <_init+18> 08048481 xorl %eax,%eax 08048483 movl %ebp,%esp 08048485 popl %ebp 08048486 ret The PLT entry is just a JMP. I suspect the initial value after the JMP is some kind of tag, or hint to the Linux dynamic linker (ld-linux.so), telling it to "put the address of printf() here". The Win32 PE format also supports dynamic linking, but I don't know the details. >*I would also appreciate some advice on what kind of executable >format would be best suited for my project, given that I want >to keep everything as simple as possible. If you want to stay simple, avoid far pointers, multiple segments, and dynamic linking. >ELF was my first choice seeing it is pretty much a standard in >the Unix (Linux) world, but it seems overly complex. If you _do_ want dynamic linking, then ELF or Win32 PE are probably your best choices. >A simple "Hello, World!" compiled with cc gave >me 29 (!!!) sections in the binary file. Some are probably for debugging. Run 'strip' on the file, or compile and link with '-s' option. Some are for dynamic linking. Link with '-Bstatic' option. See which file sections disappear. The .ctors and .dtors sections are for C++, and there may be some others for C++ exception handling, templates, vtables, etc. Compile with -fno-exceptions. .init and .fini are probably for functions to be run before and after main(). With static linking, no debug, no C++, and some other simplifications, you could get down to 4 or 5 file sections of interest. HOWEVER: Your ELF loader should look at the program headers in the ELF file, not the sections. Multiple sections are combined into single program headers, so there are only a few of them. ================================================================ Rebooting ================================================================ >2. How can I reboot in protected mode? Do I need to re-enter real-mode to >achieve this, or should I deliberately tripple fault it? Deliberate triple fault doesn't seem to work on some PCs. (Triple fault doesn't really reset the CPU, it puts the CPU into "shutdown" mode. Most PC motherboards have hardware to detect shutdown mode and reset the CPU, but not all do.) Maybe reboot using the keyboard controller? static void reboot(void) { unsigned temp; disable(); /* flush the keyboard controller */ do { temp = inportb(0x64); if((temp & 0x01) != 0) { (void)inportb(0x60); continue; } } while((temp & 0x02) != 0); /* pulse the CPU reset line */ outportb(0x64, 0xFE); /* ...and if that didn't work, just halt */ halt(); } WARNING: the A20 gate must be enabled for this to work properly. ================================================================ CHS and LBA for hard disks ================================================================ Conversion from LBA to CHS: you must have the sectors-per-track (spt) value for the hard disk, and the number of heads. Then: sector = LBA % spt + 1 head = (LBA / spt) % heads cylinder= (LBA / spt) / heads % = modulus operator (remainder after division) >when i use the bios function 02 of the int 10h, i can read data only in the >first 1024 cylinders... this is a problem because, if i have a disk larger >than 8 GByte ( 1024 cylinders * 256 Heads * 64 sectors * 512 byte ) i cannot >map all the data area... > >can someone tell me something usefull about this ? Look up ralf browns list and find the int 13 extensions. Look for Int 13 Fn 48 Before using the BIOS LBA extension (INT 13h AH=4xh), check if they're supported by using INT 13h AH=41h Note: these functions are present in a Win95 DOS box, even if your BIOS doesn't support them. ================================================================ Getting hardware docs ================================================================ >2) I was wondering... How difficult is it to write drivers for specific >hardware for your OS. Would the graphics card,sound card,etc compagnies >give out the technical information for me to program my own drivers? >The guys from Linux did it, right? Is this possible? It depends on the company. Some make all documentation freely available. Some release partial documentation. Some require that you sign a non-disclosure agreement (NDA). Some don't release any documents at all. If you make an effort to buy only devices with freely-available programming specifications, you may encourage more companies to make their documents freely available. ================================================================ Bugs in the tools ================================================================ - MinGW32 ld crashes with "--oformat binary". You can create a linker script that outputs PE executables that are identical to binary format except for the 4096-byte header at the start of the file. - The BIOS supplied with Bochs doesn't handle INT 15h AH=89h (fixed in new version of Bochs?) - Bochs doesn't let you move the text-mode framebuffer to A000:0000 - The Bochs keyboard code panics when you try to change the scancode set or use the "Vectra" method to enable A20. - The ELF binutils for DJGPP http://www.multimania.com/placr/binutils.html might have some bugs ================================================================ Usenet archives ================================================================ >Where can I find the alt.os.development list archives? http://groups.google.com ================================================================ Setting video modes from protected mode code ================================================================ 1. VGA BIOS (INT 10h AH=0): real mode only (or virtual 8086 mode) 2. VESA VBE 1.x BIOS (INT 10h AH=4Fh): real or V86 mode only 3. VESA VBE 2.x BIOS (INT 10h AH=4Fh): a few functions that work in 32-bit protected mode, but the more useful functions (like changing video mode) still require real mode or V86 mode 4. VESA VBE 3.x BIOS (INT 10h AH=4Fh): most functions are carefully written so they work in either real mode or 16-bit (!) protected mode 5. Direct programming of VGA: pure 32-bit code, works well on many systems, but allows VGA graphics modes only 6. Direct programming of SVGA: pure 32-bit code, allows hi-res graphics modes, but you must write a separate driver for every SVGA chipset (!) VBE 3.x is uncommon. Code that uses it may work with your system but not with others. VBE 2.x is more common than VBE 3.x but less common than VBE 1.x Your video code should fall back to using VGA modes if no compatible VBE BIOS can be found. The VGA modes can be set by programming the hardware directly, instead of using the VGA BIOS. For code to do this, search the web for "Tauron VGA utilities" ================================================================ getting to ring 3 ================================================================ >Right now I've been reading a lot of the Intel documentation, and they >seem to have a lot of ways to jump up privledge levels, but there >seems to be limited documentation as to how to start programs in ring >3. There are two ways to get to ring 3: 1. Use the x86 task-switch mechanism to switch to a ring 3 task 2. Use the IRET instruction to pop a ring 3 CS selector. This will cause IRET to pop a total of 5 registers instead of 3: EIP, CS, EFLAGS, ring 3 ESP, ring 3 SS Method #2 is more popular. It needs a TSS to store the ring 0 SS and ESP values while the ring 3 code runs (the task-switch mechanism of the TSS is not used here). YOU must store the ring 0 ESP value in the TSS before performing IRET: ; now in ring 0 pmode ; stack must contain 19-dword register frame lea eax,[esp + 76] mov [tss_esp0],eax popa ; pop 8 general-purpose registers pop ds ; pop 4 segment registers pop es pop fs pop gs add esp,8 ; drop exception number ; and error code (2 dwords) ; go to ring 3: iret ; IRET pops 5 dwords ; total: 19 dwords == 76 bytes The ring 3 code uses its own ring 3 code and data segment selectors and descriptors: gdt: ; ...other descriptors here ; ring 3 user code segment descriptor USER_CODE_SEL equ ($-gdt) | 3 dw 0FFFFh dw 0 db 0 db 0FAh ; present,ring 3,code,non-conforming,readable db 0CFh db 0 ; ring 3 user data segment descriptor USER_DATA_SEL equ ($-gdt) | 3 dw 0FFFFh dw 0 db 0 db 0F2h ; present,ring 3,data,expand-up,writable db 0CFh db 0 ================================================================ CPU does not store ring 0 ESP automatically when switching to ring 3 ================================================================ ByrdKernel wrote: | cr88192 wrote: || luckily the cpu keeps the tss updated, so all you really need to do || is set up the tss and use ltr to load the tss segment... | | Cool! So I set up the one TSS, and the SS and ESP fields keep track | of the PL3 stack, and the SS0 and ESP0 fields keep track of the PL0 | stack? And (I'm guessing) everything else can be pretty much | ignored? (Assuming I don't use PL1 or 2, that is.) Unless you use hardware task switching, the CPU will not write to the TSS. When you switch stacks (i.e. when you switch tasks), you must write the new ESP0 value to the one and only TSS, so that when the next interrupt comes along, the CPU will choose the correct kernel stack. -- Tim Robinson http://www.themoebius.org.uk/ for a discussion of this, see the alt.os.development thread titled "Task switching without TSS", which starts with Message-ID: <3D177125.BD7709B1@yahoo.com> ================================================================ IRQ 8-15 and the 8259 chips ================================================================ >I was just reading up on it, so they are just the same 0 - 7 IRQs except >they are generated by the CPU, not the hardware? This is my understanding No, they _are_ generated by hardware. >of it. Am I wrong? Do you handle 8-15? If so, how? The same as 0 - 7, >since they are the same, right? Can somebody clear this up for me? Thanks. Right; software should handle IRQ 8-15 the same as IRQ 0-7. The 'master' and 'slave' relationships are a hardware issue, to prioritize interrupts that happen simultaneously. All 256 CPU interrupts can be caused by the INT instruction. Note, however, that the INT instruction is a trap, not a fault. Interrupt 0Dh caused by a CPU-generated General Protection Fault pushes an error code onto the stack, but interrupt 0Dh caused by the INT 0Dh instruction does not. Interrupts 0-31 can be caused by CPU exceptions: divide error, illegal instruction, page fault, GPF, etc. IRQs 0-7 and IRQs 8-15 can each be assigned to any eight consecutive INTs. The BIOS sets up the 8259 chips like this: IRQ 0-7 = INT 8-15 <- conflict with pmode exceptions IRQ 8-15 = INT 70h-77h With these mappings, IRQ 0 = INT 8 = Double Fault, and IRQ 5 = INT 0Dh = General Protection Fault. To avoid these conflicts, a protected-mode OS should reprogram the 8259 chips to route IRQs to INT 20h and greater. ================================================================ IOPL? ================================================================ IOPL-sensitive instructions cause an exception if the CPL > IOPL CPL=3 always in V86 mode IOPB-sensitive instructions are allowed if the bit in the I/O permission bitmap for the corresponding port=0 Privileged instructions are allowed only if CPL=0 CPL=0 always in real mode REAL MODE illegal instructions: LTR/STR LLDT/SLDT VERR/VERW LAR/LSL ARPL PMODE IOPL-sensitive instructions: IN/INS OUT/OUTS CLI/STI PMODE IOPB-sensitive instructions: IN/INS OUT/OUTS PMODE or V86 MODE privileged instructions: (not permitted in V86 mode; not permitted in pmode if CPL > 0) LTR LLDT LIDT LGDT LMSW MOV nn,CRn MOV nn,DRn MOV nn,TRn HALT MOV CRn,nn MOV DRn,nn MOV TRn,nn CLTS V86 MODE IOPL-sensitive instructions: CLI/STI PUSHF/POPF INT nn/IRET LOCK V86 MODE IOPB-sensitive instructions: (unlike pmode, these are not sensitive to IOPL) IN/INS OUT/OUTS From 386INTEL.TXT: "An attempt by a less privileged procedure to alter IOPL [or IF; by e.g. POPF or IRET] does not result in an exception; IOPL [or IF] simply remains unaltered." ================================================================ double fault ================================================================ >Subject: immediately getting a double fault after enabling interrupts. Oooh, I know the answer already! You didn't reprogram the 8259 chips. The BIOS set them up so IRQ 0 (timer) = INT 8. Now that you're in protected mode, INT 8 = Double Fault, so... ================================================================ ports ================================================================ >what exactly is the definition of a "port"? Is it just in memory? Where? >What part of memory? It's a badly overloaded word. 'Port' as a verb comes from the same root as 'portable'. Here, it means to modify software to work in a new environment (new OS or CPU or compiler). "Wolf3D has been ported to GNU C." "Linux has been ported to the PS/2." "OpenBSD has been ported to run on a TI-35 calculator." 'Port' as a noun can mean a piece of software that has been ported. "A QNX port of DOOM can be found at...." 'Port' is also a term used in network programming. It's just a number associated with a network connection. Connections are made to 'well-known' port numbers to obtain services: port 21 FTP port 23 Telnet port 25 SMTP (mail) port 80 HTTP port 110 POP (mail) port 113 auth/identd port 6667 IRC port 26000 multiplayer Quake :) Similar numbered 'network' connections can be made to the local machine. This is a form of inter-process communication (IPC). A microkernel might have a 'namer' port which other parts of the kernel can query to see what services are available. Port also refers to the I/O feature of the x86 CPUs. This is a separate address space, with 65536 addresses, and special instructions (IN, INS, OUT, OUTS) to access it. In this sense, 'port' can also refer to registers in the hardware. "To clear the IDE interrupt, read the byte from port 1F7h" CPUs can also use the regular memory space for I/O devices. On these CPUs, 'port' can refer to a memory location that is, in fact, an I/O device, rather than ROM or RAM. On top of all this, 'port' can refer to a connection on the back of your PC: serial port, PS/2 keyboard port, USB port, etc. >How would you go about allowing a app to write data to an other >program as though it were a hardware port? > > App 1 does : outportb(0x109, 50) > App 2 automatically recieves the 50 outportb() is for the x86 I/O ports Access to x86 I/O ports can be controlled on a port-by-port basis using the I/O permission bitmap (IOPB). You could "virtualize" I/O ports so that App 1 writing to port 0x109 causes an IPC message to be sent to App 2. This seems like a strange and contorted thing to do. >PS I'd like to use this for making a NFS client act like a standard HD >device driver A microkernel may contain an IDE "server" which controls the IDE hard disk. You communicate with this server by passing messages over message queues, pipes, or possibly local network connections [ fd = socket(AF_UNIX, ....); ]. In this case, just write a new IDE server that translates disk requests into NFS requests. ================================================================ "PE operations on non PE file" ================================================================ >I'm building this with DJGPP on WinXP. > >I got GNU ld version 2.13.90 20021005 but I still get link error, here >is what I do, (the source if from the page above). Does this mean you're using DJGPP with a linker you got somewhere else...? >C:\Kernel>nasmw -o start.c -f elf start.asm > >C:\Kernel>gcc -o kernel.o -c kernel.c -elf DJGPP, you say? It doesn't support ELF. What compiler is this, really? I've never seen a GCC with an "-elf" option. >C:\Kernel>ld -T kernel.ls --oformat=elf32-i386 start.o kernel.o -okernel >ld: PE operations on non PE file. PE is Win32 PE COFF, the file format used by CygWin and MinGW (GCC for Win32). MinGW displays this message when you try to link a non-PE file. The linker claims to support ELF, but it doesn't. ELF support is badly broken -- I don't know why the MinGW people bother including it. You might be able to link in the compiler's native format (Win32 PE COFF), then use objcopy to convert to ELF. Then again, objcopy is probably broken, too. Is this for GRUB? Your kernel doesn't have to be in ELF format to use GRUB -- you can use the "aout kludge". ================================================================ paging ================================================================ Advantages of paging: - address translation: turns fragmented physical addresses into contiguous virtual addresses (a HUGE win) - address translation: each task has the same virtual address - memory protection (buggy or malicious tasks can't harm each other or the kernel) - shared memory between tasks (a fast type of IPC, also conserves memory when used for DLLs) - demand-loading (prevents big load on CPU when a task first starts running, conserves memory) - memory-mapped files - virtual memory swapping (lets system degrade gracefully when memory required exceeds RAM size) >Hi, does somebody know a good document about how to implement Paging. I >know the basic stuff, how it works, but I cannot find any good document >for the implementation. BonaFide's document isn't good because it doesn't >explain anything and is not really complete. Tim Robinson's Memory >Management Docs are good, but no implementation... Try the source code of Linux 0.01 and 0.11 In Linux 0.01, mm/page.s contains the page fault handler, which calls two functions in mm/memory.c: do_no_page() and do_wp_page() do_wp_page() is for copy-on-write memory. If your kernel does not fork(), then you don't need this function, or any functions that it calls. do_no_page() demand-allocates heap memory. In Linux 0.11, demand-loading was added. In do_no_page(), you can see new calls to share_page(), bmap(), and bread_page() Both kernels track pages using a simple array of bytes (mem_map[]). Each entry in the array contains the sharing count for the associated page (0 if the page is free). >Hm? I will need to translate Virtual Addresses to Physical Addresses. I This is what the page tables (and their cache; the TLB) do. 0x100000 (1 MB) is a good physical address for your kernel. 0xC0000000 (3 GB) is a good virtual address for your kernel. Linux uses this, and so does Windows NT with the "/3GB" switch. This leaves nearly 3 GB of virtual memory for your task. The virtual address of each task should be low, but not too low. Don't use address 0 -- leave page 0 unmapped to catch NULL pointer errors. environment app virtual address ----------- ------------------- DJGPP 0x1000 (4K) - probably too low Windows CE 0x10000? (64K) - probably too low Windows 95 0x400000 (4M) Linux 0x8048000 (128M) (Identity-)map the bottom 1 MB of memory so your kernel can get at the BIOS data segment, video memory, and the BIOS ROMs. >also need to know how to enable Paging. Set up the initial page tables, then set the PG bit in register CR0. The page that contains the "MOV CR0,..." instruction must be identity-mapped. You can re-map this page once paging is enabled. >I need to handle Page Faults. If you are just starting, build static page tables to map everything before you set the PG bit. In this case, a page fault is a fatal error. You could test that page-based protection works by accessing an unmapped virtual address, or writing a read-only page. Discardable kernel code and data is probably the easiest paging feature to implement. Next: demand-loading. Use GRUB to load your tasks as kernel modules. If you get a page-missing fault in the code or data segment of a task, allocate and map a page, then copy the code or data from the module. If you get a page-missing fault in the BSS segment, allocate/map/zero. If you get a page-missing fault in the heap or stack, allocate/map. (Zero the page for security, don't zero it for speed.) ================================================================ segments ================================================================ >Now that I'm incorporating C into my kernel, I began to think of whether or >not my segmentation plan is the best one for using C in my kernel. I heard >that having a flat kernel is the best method with just one 4-gig segment for >code and data. But, where would the stack be located then? How would that >work? Put the stack in the data segment. The x86 "expand-down" segments are not too useful, IMHO. You don't need that many segments. The GDT in my OS contains only 6 descriptors: NULL, TSS, kernel (ring 0) code, kernel data, user (ring 3) code, user data >I don't know how productive rewriting would be. But, if I _do_ change my >segmentation, then I would have to go through EVERY thing and change segment >selectors to the proper ones. Did you use named constants for these? e.g. #define USER_DATA_SEL 0x2B If you use the "flat" memory model, you shouldn't have to mess around too much with segment register or selectors -- just when entering and leaving the kernel. ================================================================ TSS; again ================================================================ >Id like to implement software (stack) task switching but I need some >way to protect my IO space. Until now I now about just one method - io >bitmap in TSS. But I dont want to use TSS. How works the memory mapped You need not use the TSS for task-switching. However: - You MUST have a TSS if you have code running at different privilege levels. The TSS stores the ring 0 stack pointer while ring 3 code is running. - You MUST have a TSS if you want to use the I/O permission bitmap (IOPB). For I/O, you could do this: - No IOPB - Code runs at ring 3, IOPL=0, so all I/O instructions cause GPF - Any I/O is a fatal error -- kill the task Or this: - No IOPB - Code runs at ring 3, IOPL=0, so all I/O instructions cause GPF - GPF handler "virtualizes" I/O instructions: - Check opcode that caused GPF, see if it's IN/OUT/INS/OUTS - Get port number - Check if OS allows access to this port - If access is permitted, perform the IN or OUT instruction - If access is NOT permitted, then kill the task - Increment the stacked EIP value to skip over the faulting instruction - Return to ring 3 task Or this: - No IOPB - Code runs at ring 3, IOPL=0, so all I/O instructions cause GPF - GPF handler "virtualizes" I/O instructions: - Check opcode that caused GPF, see if it's IN/OUT/INS/OUTS - Get port number - Check if OS allows access to this port - If access is NOT permitted, simulate failed IN or OUT (IN should return FFh; the value of a floating bus) - Increment the stacked EIP value to skip over the faulting instruction - Return to ring 3 task ================================================================ mapping the page directory into...errr...the page directory ================================================================ >In some tutorials on OS development, I came acrossed a few that >discussed ever-so-lightly about using the last entry of a Page >Directory to map back to the physical address of the Directory itself. > >Granted, many aspects of OS-dev can easily spin one's head, but this >one has taken me far beyond the spinning point. Me too! Damn head came unscrewed and fell off. >May I inquire of you to clear my head by explaining 3 points of this: > >1) Why would one do this.. what will it accomplish? > >2) What other benifits might one reap? - Page directory has the same virtual address in each address space (e.g. 4KB at 0xFFFFF000) - All 1024 page tables can be accessed (if present) at the same virtual address in each address space (e.g. 4MB at 0xFFC00000) - Entries in page tables and in the page directory can be accessed with virtual addresses ================================================================ why rewrite printf()? ================================================================ >I've been reading around, and a lot of places on the web recommend that you >don't use certain commands / libraries - one thing that came up was printf - >why? My understanding of C was that everything you used / every library you >included, was compiled, and that after that it didn't matter. It's not as >if printf relies on anything strange, displaying text should be the same >across every pc / graphics card / monitor / whatever, shouldn't it? No, displaying text is NOT the same on all systems. If the C spec guaranteed that printf() would call putchar(), cprintf() would call putch(), etc, then it would be simpler -- just rewrite those lesser functions. Also, stock printf() drags in floating-point code, for printf("%f", ...). What if your OS doesn't support floating-point? ================================================================ triple fault does/does not reset the PC ================================================================ > * This routine reboots the machine by asking the keyboard > * controller to pulse the reset-line low. We try that for a while, > * and if it doesn't work, we do some other stupid things. > */ {snip} >static long no_idt[2]; >/* That didn't work - force a triple fault.. */ >__asm__ __volatile__("lidt %0": :"m" (no_idt)); >__asm__ __volatile__("int3"); INT 3 make the CPU fetch the INT 3 vector from the IDT. The IDT is too small, so double fault occurs instead (INT 8). This vector is also missing, so triple fault occurs. Triple fault actually puts the CPU into the "shutdown" state. Circuitry on many motherboards detects this state and resets the system. On these systems, triple faults causes reset. Other motherboards lack the necessary circuitry. Triple fault here merely causes the computer to freeze. ================================================================ what happens with segmentation when paging is turned on? ================================================================ >- What happens to the GDT? From the moment I set the PG bit, > have I to turn all segment register contents from GDT offsets > to pagetabeldirectory-indexes? > >- Whats about the TSS (I want to implement multitasking, later)? > Arent the Descriptors for them stored in the GDT? > How does the CPU decide, which selector is a GDT selector > and which is an index of the pagetabledirectory? > Or is the GDT useless, when paging is turned on? xxx ================================================================ various questions ================================================================ >1) Do you know QNX? It is a realtime operating system. How is it possilbe to >write such a rtos? Has that someting to do with mircokernel? "RTOS" and "microkernel" are two different concepts. QNX just happens to be both. RTOS features: - Pre-emptive scheduling - Many (~256) priority levels - Priority inheritance - Bottom-half interrupt handlers - Kernel does not run for long periods of time with interrupts disabled - The ability to turn off features (like swapping, demand- loading, memory-mapped files... most virtual memory features, actually) that lead to long, unpredictable delays - Maybe variable time-slice - Inter-process communication (IPC) features that let you write an application as a collection of tasks Features of "desktop" or "server" operating system that might be missing from an RTOS: - Disks - Filesystems - Memory protection - GUI or command-line shell. The RTOS probably runs a single application which has its own UI. >2) What are the positiv thinks of a "normal" and a microkenel? Monolithic ("normal") kernel is easier to write than a micro- kernel because the monolithic kernel has less need for IPC. Monolithic kernel is also faster (neither Linux nor Windows are microkernels), though new microkernels like L4 claim to be nearly as fast. Microkernels can be full of race conditions and potential deadlocks unless they are carefully written. Advantages of microkernels: - Enforces modular kernel design. You can load only the kernel components you need, to conserve memory. - Kernel components are tasks. They run at user privilege, with protection enabled. These tasks can be killed and re-loaded if they crash. They use the same C library functions as normal, unprivileged tasks, so they are potentially easier to write. - Message-passing kernel is easier to adapt to multiprocessor environments, either tightly-coupled (SMP) or loosely-coupled (distributed or clustered system) See also Andrew S. Tanenbaum's books and papers, and the infamous "Linux is obsolete" flamewar between Linus Torvalds and Dr. Tanenbaum. >3) How can I write an OS which can use software from other operating systems >like Unix. Are there tutorials somewehe about this? Look at the library functions the other software uses (e.g. fopen(), malloc(), setjmp()) and make sure you have those functions in your library. Look at the OS functions the other software uses (e.g. fork(), mmap(), read()) and implement those in your OS; either directly or through a compatability layer. >5) When I have finished my work on osdev, how can I make it faster? Have I >to keep the code small, or how can I do this? To make it faster: - Find a way to profile (i.e. measure the speed of) the code Optimize only the slow and frequently-called code; not the code that is already fast, or code that is infrequently called. - Use better algorithms (e.g. binary search instead of linear) - Use better data structures (e.g. hash table instead of array) - Use compiler optimization (gcc -O3 ...) and compiler and assembler switches that align objects in memory for maximum performance - Some library functions written in C could be converted to asm for speed - Structure the code so the most likely path of execution has few JMPs in it. Look at the assembly-language output of the compiler to check this.