| Summary: | Proposed FAQ on assembly programming | ||
|---|---|---|---|
| Product: | Documentation | Reporter: | tms2 <tms2> |
| Component: | Books & Articles | Assignee: | Murray Stokely <murray> |
| Status: | Closed FIXED | ||
| Severity: | Affects Only Me | ||
| Priority: | Normal | ||
| Version: | Latest | ||
| Hardware: | Any | ||
| OS: | Any | ||
State Changed From-To: open->feedback

Where would you like this entry to appear? In the FAQ? However, please resubmit it as DocBook SGML, the HTML files are generated automatically. Please see the other .sgml-files for the FAQ to learn how. It's easy :) Thanks!

Here is an SGML version. It requires the entities in docs/19425.

<qandaentry>
  <question>
    <para>How do I write "Hello, world" in FreeBSD assembler?</para>
  </question>

  <answer>
    <para>This program prints "Hello, world." on the standard output, and then exits with an exit status of 0. It is written for Intel machines, to be assembled by the GNU assembler, <command>as</command>. The syntax used by <command>as</command> is different from Intel's, but is common in the Unix world. See &man.as.1; or <command>info as</command> for details. This syntax is known as AT&T syntax. The most important difference for present purposes is that the order of operands is reversed: the source operand comes first, then the destination. In addition, a size suffix is appended to the opcodes.</para>

    <para>The program works by first calling &man.write.2; to write the message, and then calling &man.exit.2; to exit.</para>

    <programlisting> 1:         .data                         # Data section
 2:
 3: msg:    .asciz "Hello, world.\n"      # The string to print.
 4:         len = . - msg - 1             # The length of the string.
 5:
 5:         .text                         # Code section.
 6:         .global _start
 7:
 8: _start:                               # Entry point.
10:         pushl $len                    # Arg 3 to write: length of string.
11:         pushl $msg                    # Arg 2: pointer to string.
12:         pushl $1                      # Arg 1: file descriptor.
13:         movl $4, %eax                 # Write.
14:         call do_syscall
15:         addl $12, %esp                # Clean stack.
16:
17:         pushl $0                      # Exit status.
18:         movl $1, %eax                 # Exit.
19:         call do_syscall
20:
21: do_syscall:
22:         int $0x80                     # Call kernel.
23:         ret</programlisting>

    <para><literal>_start</literal> (line 8) is the default name for an ELF program's entry point.</para>

    <para>Arguments to system calls are placed on the stack from right to left, just as in C.
Lines 10 through 12 push the arguments for &man.write.2; on the stack, and line 17 pushes the argument for &man.exit.2;. The caller is responsible for cleaning up the stack after control has returned from the <literal>call</literal>.</para>

    <para>System calls are made by putting the call's index in <literal>%eax</literal> (lines 13 and 18), and invoking <literal>int $0x80</literal> (line 22). The kernel expects to find the first argument 4 bytes below the top of the stack (at <literal>4(%esp)</literal>), with a return address in the top slot, as it would be if the system call were made using the C library. Therefore, the invocation of <literal>int $0x80</literal> is placed in its own function, so that the <literal>call</literal> pushes that return address.</para>

    <para>The kernel puts the system call's return value in <literal>%eax</literal>. If there is an error, the carry flag is set, and <literal>%eax</literal> contains the error code. This program ignores the value returned by &man.write.2;.</para>

    <para>Assuming you saved the program as <filename>hello.s</filename>, assemble and link it with:</para>

    <screen>&prompt.user; <userinput>as -o hello.o hello.s</userinput>
&prompt.user; <userinput>ld -o hello hello.o</userinput></screen>

    <para>It is also possible to invoke system calls using the C library instead of using <literal>int $0x80</literal>.</para>

    <programlisting> 1:         .data
 2:
 3: msg:    .string "Hello, world.\n"
 4:         len = . - msg - 1
 5:
 6:         .text
 7:         .extern write
 8:         .extern exit
 9:         .global main
10:
11: main:
12:         pushl $len
13:         pushl $msg
14:         pushl $1
15:         call write
16:         addl $12, %esp
17:
18:         pushl $0
19:         call exit</programlisting>

    <para>Since we are linking with the C library, we must also use the C startup code, which means that the entry point to our program is now <literal>main</literal> (line 11) instead of <literal>_start</literal>.
(The <literal>_start</literal> label is in the C startup code, which does some initialization and then calls <literal>main</literal>.)</para>

    <para>The easiest way to assemble and link this program is through <command>cc</command>, which will take care of linking in the proper startup modules in the correct order:</para>

    <screen>&prompt.user; <userinput>cc -o hello hello.s</userinput></screen>

    <para>There is a lot of information available about assembly programming on Intel machines, but little if any of it applies to FreeBSD specifically. Most or all Intel assembly books and Web sites are about programming in an MS-DOS environment. These books can be useful for a FreeBSD programmer to the extent that they discuss general principles or the Intel instruction set, but of course nothing specific to MS-DOS or the PC BIOS will work under FreeBSD. There is some material on the Web concerning assembly programming under Linux, but even this does not always apply to FreeBSD, because Linux uses a different protocol for making system calls.</para>

    <para>Here are some Web links that you might find useful:</para>

    <segmentedlist>
      <seglistitem>
        <seg><ulink url="http://developer.intel.com/design/litcentr/index.htm"> Intel Manuals</ulink></seg>
        <seg>Reference manuals for Intel processors can be found here.</seg>
      </seglistitem>

      <seglistitem>
        <seg><ulink url="http://webster.cs.ucr.edu/">Art of Assembly Language</ulink></seg>
        <seg>A well-regarded and very long (~1500 page) online textbook for assembly programming in MS-DOS.</seg>
      </seglistitem>

      <seglistitem>
        <seg><ulink url="http://linuxassembly.org/">Linux Assembly</ulink></seg>
        <seg>Assembly programming under Linux. Some useful information for FreeBSD programmers, but be wary of the differences between FreeBSD and Linux.</seg>
      </seglistitem>

      <seglistitem>
        <seg><ulink url="http://www.web-sites.co.uk/nasm/">NASM</ulink></seg>
        <seg>If you prefer Intel syntax in your assembler, try NASM.
It is in the FreeBSD ports system.</seg>
      </seglistitem>

      <seglistitem>
        <seg><ulink url="news:comp.lang.asm.x86">comp.lang.asm.x86</ulink> <ulink url="http://www.geocities.com/SiliconValley/Peaks/8600/">Host page</ulink></seg>
        <seg>Contains links to other Intel assembly resources on the Web.</seg>
      </seglistitem>

      <seglistitem>
        <seg><ulink url="http://www.unix.digital.com/faqs/publications/base_doc/DOCUMENTATION/HTML/AA-PS31D-TET1_html/TITLE.html"> Alpha Assembly Language Programmer's Guide</ulink></seg>
        <seg>Probably useful if you are running FreeBSD on an Alpha.</seg>
      </seglistitem>
    </segmentedlist>
  </answer>
</qandaentry>

Responsible Changed From-To: freebsd-doc->murray

We now have an assembly language chapter of the Developer's Handbook. I'm investigating whether any of this can be added to our existing chapter.

State Changed From-To: feedback->closed

This information is now in the Developer's Handbook x86 assembly language chapter. |
Information on assembly programming is not readily available. In particular, the correct way to make syscalls is not at all obvious. The proposed FAQ shows how to write a "Hello, world." program in assembly.

Fix:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
<head>
<title>Hello, World in FreeBSD Assembler</title>
</head>

<body bgcolor=white>
<h1>How do I write "Hello, world" in FreeBSD assembler?</h1>

<p>This program prints "Hello, world." on the standard output, and then exits with an exit status of 0. It is written for Intel machines, to be assembled by the GNU assembler, <code>as</code>. The syntax used by <code>as</code> is different from Intel's, but is common in the Unix world. See <code>man as</code> or <code>info as</code> for details. This syntax is known as AT&T syntax. The most important difference for present purposes is that the order of operands is reversed: the source operand comes first, then the destination.

<p>The program works by first calling write(2) to write the message, and then calling exit(2) to exit.

<pre>
 1:         .data                         # Data section
 2:
 3: msg:    .string "Hello, world.\n"     # The string to print.
 4:         len = . - msg - 1             # The length of the string.
 5:
 5:         .text                         # Code section.
 6:         .global _start
 7:
 8: _start:                               # Entry point.
10:         pushl $len                    # Arg 3 to write: length of string.
11:         pushl $msg                    # Arg 2: pointer to string.
12:         pushl $1                      # Arg 1: file descriptor.
13:         movl $4, %eax                 # Write.
14:         call do_syscall
15:         addl $12, %esp                # Clean stack.
16:
17:         pushl $0                      # Exit status.
18:         movl $1, %eax                 # Exit.
19:         call do_syscall
20:
21: do_syscall:
22:         int $0x80                     # Call kernel.
23:         ret
</pre>

<p><code>_start</code> (line 8) is the default name for a program's entry point.

<p>Arguments to system calls are placed on the stack from right to left, just as in C. Lines 10 through 12 push the arguments for write(2) on the stack, and line 17 pushes the argument for exit(2).
Note that the caller is responsible for cleaning up the stack after control has returned from the <code>call</code>.

<p>System calls are made by putting the call's index in <code>EAX</code> (lines 13 and 18), and then invoking <code>int $0x80</code> (line 22). <strong>Important:</strong> A <code>call</code> must be made after the arguments are placed on the stack and before the <code>int $0x80</code>. If you replace <code>call do_syscall</code> (lines 14 and 19) with <code>int $0x80</code>, the program will not work.

<p>The kernel puts the system call's return value in <code>EAX</code>. This program ignores the value returned by write(2).

<p>Assemble the program with (assuming you saved it as <code>hello.s</code>)
<pre> as -o hello.o hello.s</pre>
and link it with
<pre> ld -o hello hello.o</pre>

<p>It is also possible to invoke system calls using libc instead of doing it directly through <code>int $0x80</code>.

<pre>
 1:         .data
 2:
 3: msg:    .string "Hello, world.\n"
 4:         len = . - msg - 1
 5:
 6:         .text
 7:         .extern write
 8:         .extern exit
 9:         .global main
10:
11: main:
12:         pushl $len
13:         pushl $msg
14:         pushl $1
15:         call write
16:         addl $12, %esp
17:
18:         pushl $0
19:         call exit
</pre>

<p>Since we are linking with libc, we must also use the C startup code, which means that the entry point to our program is now <code>main</code> (line 11) instead of <code>_start</code>.

<p>The easiest way to assemble and link this program is through <code>cc</code>, which will take care of linking in the proper startup modules in the correct order:
<pre> cc -o hello hello.s</pre>

<h3>Resources</h3>

<p>There is a lot of information available about assembly programming on Intel machines, but little if any of it applies to FreeBSD specifically (hence this FAQ). All of the books I have seen, and most of the Web sites, are about programming in an MS-DOS environment.
These books can be useful for a FreeBSD programmer to the extent that they discuss general principles or the Intel instruction set, but of course nothing specific to MS-DOS or the PC BIOS will work under FreeBSD. There is also some material on the Web concerning assembly programming under Linux, but the same caveat applies.

<p>Here are some Web links that you might find useful:

<dl>
<dt><a href="http://developer.intel.com/design/litcentr/index.htm"> Intel Manuals</a>
<dd>Reference manuals for Intel processors can be found here.

<dt><a href="http://webster.cs.ucr.edu/">Art of Assembly Language</a>
<dd>A well-regarded and very long (~1500 page) online textbook for assembly programming in MS-DOS.

<dt><a href="http://linuxassembly.org/">Linux Assembly</a>
<dd>Assembly programming under Linux. Some useful information for FreeBSD programmers, but be wary of the differences between FreeBSD and Linux.

<dt><a href="http://www.web-sites.co.uk/nasm/">NASM</a>
<dd>If you prefer Intel syntax in your assembler, try NASM. It is in the FreeBSD ports system.

<dt><a href="news:comp.lang.asm.x86">comp.lang.asm.x86</a> <a href="http://www.geocities.com/SiliconValley/Peaks/8600/">Host page</a>
<dd>Contains links to other Intel assembly resources on the Web.

<dt><a href="http://www.unix.digital.com/faqs/publications/base_doc/DOCUMENTATION/HTML/AA-PS31D-TET1_html/TITLE.html"> Alpha Assembly Language Programmer's Guide</a>
<dd>Probably useful if you are running FreeBSD on an Alpha.
</dl>
</body>
</html>
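
For comparison, here is a minimal C sketch of the same write(2) call made through libc, this time checking the return value that the assembly listings ignore. It assumes the usual libc convention that a failed system call returns -1 and sets errno (the C-level counterpart of the kernel's error reporting described in the proposed FAQ). The name say_hello is a hypothetical helper for illustration; it does not come from the PR or the Developer's Handbook.

```c
#include <errno.h>
#include <unistd.h>

/* The same message as in the assembly listings. */
static const char msg[] = "Hello, world.\n";

/*
 * Hypothetical helper (an assumption for illustration, not from the PR):
 * writes the message to fd and returns the number of bytes written,
 * or -1 with errno set if write(2) fails.
 */
ssize_t say_hello(int fd)
{
    /*
     * sizeof msg counts the terminating NUL, just as ". - msg" does in
     * the listings; subtract 1 so the NUL is not written.
     * The C arguments (fd, msg, length) are in the reverse of the order
     * in which the assembly pushes them.
     */
    return write(fd, msg, sizeof msg - 1);
}
```

A caller would typically test the result, for example by exiting with a non-zero status when say_hello(1) returns -1. A short write (fewer bytes than requested) is also possible in principle and is not handled in this sketch.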