Characters In
Reading characters from the Linux keyboard is as easy as sending characters to the screen display. In fact, the C library calls for reading data from the keyboard (which is the default data source assigned to standard input) are almost the inverse of those that display data to standard output. This was deliberate, even though there are times when the symmetry gets in the way, as I'll explain.
String Input with fgets
If you poke around in a C library reference (and you should—there are a multitude of interesting routines there that you can call from assembly programs), you may discover the gets routine. You may have wondered (if I didn't choose to tell you here) why I didn't cover it. The gets routine is simplicity itself: You pass it the name of a string array in which to place characters, and then the user types characters at the keyboard, which are placed in the array. When the user presses Enter, gets appends a null at the end of the entered text and returns. What's not to love?
Well, how big is the array? And how dumb is your user?
Here's the catch: There's no way to tell gets when to stop accepting characters. If the user types in more characters than your array has room for, gets will gleefully keep accepting characters and overwrite whatever data is sitting next to your array in memory. If that data is something important, your program will crash hard.
That's why, if you try to use gets, gcc will warn you that gets is dangerous. It's old, and much better machinery has been created since. The designated successor to gets is fgets, which has some safety equipment built in, and some complications, too.
The complications stem from the fact that you must pass a file handle to fgets. In general, standard C library routines whose names begin with f act on files. (I explain how to work with disk files later in this chapter.) You can use fgets to read text from a disk file—but remember, in Unix terms, your keyboard is connected to a file, the file called standard input. If we can connect fgets to standard input, we can read text from the keyboard, which is what the old and hazardous gets does automatically.
The bonus in using fgets is that it allows us to specify a maximum number of characters for the routine to accept from the keyboard. The limit includes the terminating null, so fgets stores at most one character fewer than the value you specify. Anything else the user types is not stored in your buffer; it stays queued in standard input until something reads it. If this maximum value is no larger than the string buffer you define to hold characters entered by the user, fgets cannot write past the end of that buffer.
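To make the limit concrete, here is a minimal C sketch of the same kind of call. It is an illustration only, not code from this chapter: the helper name read_limited is hypothetical, and it reads from a temporary file created with tmpfile rather than from the keyboard, purely so that the result is repeatable.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes `text` to a temporary file, rewinds it, and
   reads it back with a single fgets call that uses `limit` as its size
   argument. Returns the number of characters stored in buf (not counting
   the terminating null), or -1 on failure. */
int read_limited(const char *text, char *buf, int limit)
{
    assert(text != NULL && buf != NULL && limit > 0);

    FILE *f = tmpfile();            /* stands in for the keyboard */
    if (f == NULL)
        return -1;

    fputs(text, f);
    rewind(f);

    char *result = fgets(buf, limit, f);
    fclose(f);

    return (result == NULL) ? -1 : (int)strlen(buf);
}
```

With an 8-byte buffer and a limit of 8, a 12-character line comes back as 7 characters plus the null. A short line such as "Hi" followed by Enter comes back whole, with the newline character stored in the buffer as well.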
Connecting fgets to the standard input file is easy. The C library predefines three standard file handles, and these handles are linked into your program automatically. The three are stdin (standard input), stdout (standard output), and stderr (standard error). For accepting input from the keyboard through fgets, we want to use stdin. It's there; you simply have to declare it as extern.
So here's how to use the fgets routine:
1. Make sure you have declared extern fgets and extern stdin along with your other external declarations at the top of the [.text] section.
2. Declare a buffer variable large enough to hold the string data you want the user to enter. Use the RESB directive in the [.bss] section of your program.
3. To call fgets, first push the file handle. You must push the handle itself, not the handle's address! So use the form push dword [stdin].
4. Next, push the value indicating the maximum number of characters you want fgets to accept. This count includes the terminating null that fgets appends, so fgets stores at most one character fewer than the value you push. Make sure the value is no larger than the buffer variable you declare in [.bss]! The stack must contain the actual value; don't just push the address of a variable holding the value. Pushing an immediate value or the contents of a memory variable will work.
5. Next, push the address of the buffer variable where fgets is to store the characters entered by the user.
6. Finally, call fgets itself. (And as with all library function calls, don't forget to clean up the stack!)
In terms of actual code, it should look something like this:
        push dword [stdin]      ; Push predefined file handle for standard input
        push dword 72           ; Accept no more than 72 characters from keyboard
        push dword instring     ; Push address of buffer for entered characters
        call fgets              ; Call fgets
        add esp,12              ; 3 args X 4 bytes = 12 for stack cleanup

Here, the identifier instring is a memory variable defined like this:

[SECTION .bss]                  ; Section containing uninitialized data
instring resb 96                ; Reserve 96 bytes for string entry buffer
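Recall that the RESB directive just sets aside space for your variable; NASM does not preload that space with any particular value, with spaces, or nulls, or anything. (Under Linux, the loader fills the .bss section with zeros when the program starts, but that is the operating system's doing, not RESB's.) Until the user enters data through fgets, the string storage you allocate using RESB holds nothing meaningful, and your code should not depend on what it contains.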
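From the user side of the screen, fgets simply accepts characters until the user presses Enter. It doesn't automatically return after the user types the maximum permitted number of characters. (That would prevent the user from backing over input and correcting it.) However, fgets stores only as many characters as the limit allows. Anything the user types beyond that is not placed in your buffer; it remains in standard input, where the next input call will read it. When the whole line fits, the newline generated by the Enter key is stored in the buffer too, just ahead of the terminating null.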
The CHARSIN.ASM file shown later in this chapter contains the preceding code.
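What fgets does with a line longer than its limit is easy to check from C. The sketch below is an illustration under stated assumptions, not part of CHARSIN.ASM: the helper name read_twice is hypothetical, and a temporary file from tmpfile stands in for the keyboard so the outcome is repeatable. It makes two fgets calls in a row on the same input, each with the same limit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: stores `text` in a temporary file, then makes two
   successive fgets calls on it, each using `limit` as the size argument.
   Returns 0 if both calls stored something, -1 otherwise. */
int read_twice(const char *text, int limit, char *first, char *second)
{
    assert(text != NULL && first != NULL && second != NULL && limit > 1);

    FILE *f = tmpfile();            /* stands in for standard input */
    if (f == NULL)
        return -1;

    fputs(text, f);
    rewind(f);

    int ok = fgets(first, limit, f) != NULL
          && fgets(second, limit, f) != NULL;

    fclose(f);
    return ok ? 0 : -1;
}
```

Given the ten letters A through J followed by a newline and a limit of 8, the first call stores the seven letters A through G, and the second call picks up H, I, J, and the newline. The part of the line that did not fit was waiting in the stream all along, which is worth remembering whenever another input call follows fgets.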
Using scanf for Entry of Numeric Values
In a peculiar sort of way, the C library function scanf is printf running backward: Instead of outputting formatted data in a character stream, scanf takes a stream of character data from the keyboard and converts it to numeric data stored in a numeric variable. Scanf works very well, and it understands a great many formats that I won't be able to explain in this book, especially for the entry of floating-point numbers. (Floating-point values are a special problem in assembly work, and I won't be taking them up in this edition of this book.)
For most simple programs you may write while you're getting your bearings in Linux assembly, you'll be entering simple integers, and scanf is very good at that. You pass scanf the name of a numeric variable in which to store the entered value and a formatting code indicating what form that value will take on data entry. The scanf function will take the characters typed by the user and convert them to the integer value that the characters represent. That is, scanf will take the two ASCII characters "4" and "2" entered successively and convert them to the integer value 42 after the user presses Enter.
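The conversion itself can be tried out in a few lines of C. This is only a sketch: it uses sscanf, the C library sibling of scanf that takes its characters from a string instead of from standard input, so that the example gives the same result every time. The helper name text_to_int is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: converts decimal text to an int using the same
   "%d" conversion that scanf applies to keyboard input.
   Returns 1 if a value was converted and stored in *value, 0 if not. */
int text_to_int(const char *text, int *value)
{
    assert(text != NULL && value != NULL);

    /* sscanf returns the number of conversions it completed. */
    return sscanf(text, "%d", value) == 1;
}
```

Handing this helper the two characters "4" and "2" stores the integer 42. Leading spaces are skipped and a leading minus sign is honored; text that does not begin with a number converts nothing and leaves the variable untouched.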
What about a prompt string, instructing the user what to type? Well, many newcomers get the idea that you can combine the prompt with the format code in a single string handed to scanf—but that won't work. It seems like it should—hey, after all, you can combine formatting codes with the base string to be displayed using printf. And in scanf, you can theoretically use a base string containing formatting codes . . . but the user would then have to type the prompt as well as the numeric data!
So, in actual use, the only string used by scanf is a string containing the formatting codes. If you want a prompt, you must display the prompt before calling scanf, using printf. To keep the prompt and the data entry on the same line, make sure you don't have a newline called out at the end of your prompt string!
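Here is a small C sketch of why a prompt inside the format string backfires. It again uses sscanf on a string so the result is repeatable; the helper names and the prompt text "Age: " are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format string holds only the formatting code.
   Returns the number of values converted. */
int parse_plain(const char *typed, int *value)
{
    assert(typed != NULL && value != NULL);
    return sscanf(typed, "%d", value);
}

/* Hypothetical helper: format string holds a "prompt" ahead of the code.
   Ordinary characters in a scanf-family format string must be matched by
   the incoming characters, so the prompt becomes required input. */
int parse_with_prompt(const char *typed, int *value)
{
    assert(typed != NULL && value != NULL);
    return sscanf(typed, "Age: %d", value);
}
```

Given the typed text 42, parse_plain converts one value, but parse_with_prompt converts none, because the input does not begin with the characters of the prompt. It succeeds only when the typed text is Age: 42, which is exactly the peculiar situation of making the user type the prompt.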
The scanf function automatically takes character input from standard input. You don't have to pass it the file handle stdin, as with fgets. (There is a C library routine fscanf to which you do have to pass a file handle, but for integer data entry, there's no hazard in using scanf.)
Here's how to use the scanf routine:
1. Make sure you have declared extern scanf along with your other external declarations at the top of the [.text] section.
2. Declare a memory variable of the proper type to hold the numeric data read and converted by scanf. My examples here will be for integer data, so you would create such a variable with either the DD directive or the RESD directive. Obviously, if you're going to keep several separate values, you'll need to declare one variable per value entered.
3. To call scanf for entry of a single value, first push the address of the memory variable that will hold the value. (See the following discussion about entry of multiple values in one call.)
4. Next, push the address of the format string that specifies what format that data will arrive in. For integer values, this is typically the string "%d".
5. Call scanf.
6. Clean up the stack.
The code for a typical call would look like this:
        push dword intval       ; Push the address of the integer buffer
        push dword iformat      ; Push the address of the integer format string
        call scanf              ; Call scanf to enter numeric data
        add esp,8               ; Clean up the stack
It's possible to present scanf with a string containing multiple formatting codes, so that the user could enter multiple numeric values with only one call to scanf. I've tried this, and it makes for a very peculiar user interface. The feature is better used if you're writing a program to read a text file containing rows of integer values expressed as text, and convert them to actual integer variables in memory. For simply obtaining numeric values from the user through the keyboard, it's best to accept only one value per call to scanf.
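For the text-file case, a format string with several codes might be used along these lines. This is a sketch only: the helper name parse_row and the choice of three values per row are assumptions made for illustration, and sscanf is applied to one line of text that has already been read (by fgets, for instance).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: converts one row of text holding up to three
   integers separated by whitespace. Returns how many of the three
   values were actually converted. */
int parse_row(const char *line, int *a, int *b, int *c)
{
    assert(line != NULL && a != NULL && b != NULL && c != NULL);

    /* One formatting code per value; the return value is the number of
       conversions that succeeded before the text ran out or stopped
       matching. */
    return sscanf(line, "%d %d %d", a, b, c);
}
```

A row such as 10 20 30 yields a count of 3. A short row such as 10 20 yields a count of 2, with the third variable left as it was, so checking the returned count tells you whether the whole row arrived.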
The following program shows how you would set up prompts alongside a data entry field for accepting both string data and numeric data from the user through the keyboard. After accepting the data, the program displays what was entered, using printf.
;  Source name     : CHARSIN.ASM
;  Executable name : CHARSIN
;  Version         : 1.0
;  Created date    : 11/21/1999
;  Last update     : 11/30/1999
;  Author          : Jeff Duntemann
;  Description     : A data input demo for Linux, using NASM 0.98
;
;Build using these commands:
;nasm -f elf charsin.asm
;gcc charsin.o -o charsin
[SECTION .text]                 ; Section containing code

extern stdin                    ; Standard file variable for input
extern fgets
extern printf
extern scanf
global main                     ; Required so linker can find entry point

main:
        push ebp                ; Set up stack frame for debugger
        mov ebp,esp
        push ebx                ; Program must preserve ebp, ebx, esi, & edi
        push esi
        push edi
;;; Everything before this is boilerplate; use it for all ordinary apps!
;;First, an example of safely limited string input using fgets. Unlike
;;gets, which does not allow limiting the number of chars entered, fgets
;;lets you specify a maximum number. However, you must also specify a
;; file (hence the 'f' in 'fgets') so we must push the stdin handle.
        push dword sprompt      ; Push address of the string input prompt string
        call printf             ; Display it
        add esp,4               ; Clean up stack for 1 arg

        push dword [stdin]      ; Push predefined file handle for standard input
        push dword 72           ; Accept no more than 72 characters from keybd.
        push dword instring     ; Push address of buffer for entered characters
        call fgets              ; Call fgets
        add esp,12              ; 3 args X 4 bytes = 12 for stack cleanup

        push dword instring     ; Push address of entered string data buffer
        push dword sshow        ; Push address of the string display prompt
        call printf             ; Call printf
        add esp,8               ; Clean up the stack
;;Next, we'll use scanf to enter numeric data. This is easier, because
;;unlike strings, integers can only be so big and hence are self-
;;limiting.
        push dword iprompt      ; Push address of the integer input prompt
        call printf             ; Display it
        add esp,4               ; Clean up the stack

        push dword intval       ; Push the address of the integer buffer
        push dword iformat      ; Push the address of the integer format string
        call scanf              ; Call scanf to enter numeric data
        add esp,8               ; Clean up the stack

        push dword [intval]     ; Push integer value to display
        push dword ishow        ; Push base string
        call printf             ; Call printf to convert & display the integer
        add esp,8               ; Clean up the stack
;;; Everything after this is boilerplate; use it for all ordinary apps!
        pop edi                 ; Restore saved registers
        pop esi
        pop ebx
        mov esp,ebp             ; Destroy stack frame before returning
        pop ebp
        ret                     ; Return control to Linux
[SECTION .data]                 ; Section containing initialized data

sprompt db 'Enter string data, followed by Enter: ',0
iprompt db 'Enter an integer value, followed by Enter: ',0
iformat db '%d',0
sshow   db 'The string you entered was: %s',10,0
ishow   db 'The integer value you entered was: %5d',10,0
[SECTION .bss]                  ; Section containing uninitialized data

intval   resd 1                 ; Reserve one uninitialized double word
instring resb 128               ; Reserve 128 bytes for string entry buffer