Assembly Language Step by Step Programming with DOS and Linux 2nd Ed 2000.pdf

Boxes within Boxes

This sounds like Eastern mysticism, but it's just an observation from life: Within any action is a host of smaller actions. Look inside your common activities. When you brush your teeth you do the following:

Pick up your toothpaste tube.

Unscrew the cap.

Place the cap on the sink counter.

Pick up your toothbrush.

Squeeze toothpaste onto the brush from the middle of the tube.

Put your toothbrush into your mouth.

Work it back and forth vigorously.

And so on. The original list went the entire page. When you brush your teeth, you perform every one of those actions. However, when you think about the sequence, you don't run through the whole list. You bring to mind the simple concept "brushing teeth."

Furthermore, when you think about what's behind the action we call "getting up in the morning," you might assemble a list of activities like this:

Shut off the clock radio.

Climb out of bed.

Put on your robe.

Let the dogs out.

Make breakfast.

Brush your teeth.

Shave.

Shower.

Get dressed.

Brushing your teeth is on the list, but within the activity you call "brushing your teeth" is a whole list of smaller actions, as listed previously. The same can be said for most of the activities shown in the preceding list. How many individual actions, for example, does it take to put a reasonable breakfast together? And yet in one small, if sweeping, phrase, "getting up in the morning," you embrace that whole host of small and even smaller actions without having to laboriously trace through each one.

What I'm describing is the "Chinese boxes" method of fighting complexity. Getting up in the morning involves hundreds of little actions, so we divide the mass up into coherent chunks and set the chunks into little conceptual boxes. "Making breakfast" is in one box, "brushing teeth" is in another, and so on. Closer inspection of any box shows that its contents can also be divided into numerous boxes, and those smaller boxes into even smaller boxes.

This process doesn't (and can't) go on forever, but it should go on as long as it needs to in order to satisfy this criterion: The contents of any one box should be understandable with only a little scrutiny. No single box should contain anything so subtle or large and involved that it takes hours of hair-pulling to figure it out.

Procedures as Boxes for Code

The mistake I made in writing my APL text formatter is that I threw the whole collection of 600 lines of APL code into one huge box marked "text formatter."

While I was writing it, I should have been keeping my eyes open for sequences of code statements that worked together at some identifiable task. When I spotted such sequences, I should have set them off as procedures. Each sequence would then have a name that would provide a memory tag for the sequence's function. If it took 10 statements to justify a line of text, those 10 statements should have been named JustifyLine, and so on.

Xerox's legendary APL programmer Jim Dunn later told me that I shouldn't ever write an APL procedure that wouldn't fit on a single 25-line terminal screen. "More than 25 lines and you're doing too much in one procedure. Split it up," he said. Whenever I worked in APL after that, I adhered to that rather sage rule of thumb. The Martians still struck from time to time, but when they did, it was no longer a total loss.

All computer languages have procedures of one sort or another, and assembly language is no exception. Your assembly language program may have numerous procedures. There's no limit to the number of procedures, as long as the total number of bytes of code contained by all the procedures together does not exceed 65,536 (one segment). Other complications arise at that point, but there are mechanisms in assembly language to deal sensibly with those complications.

But that's a lot of code. You needn't worry for a while, and certainly not while you're just learning assembly language. (I won't be treating the creation of multiple code segments in this book.) In the meantime, let's take a look at the "Eat at Joe's" program, expanded a little to include a couple of procedures:

;  Source name     : EAT2.ASM
;  Executable name : EAT2.COM
;  Code model      : Real Mode Flat Model
;  Version         : 1.0
;  Created date    : 7/31/1999
;  Last update     : 9/11/1999
;  Author          : Jeff Duntemann
;  Description     : A simple example of a DOS .COM file programmed using
;                    NASM-IDE 1.1 and NASM 0.98 and incorporating procedures.

[BITS 16]               ; Set 16 bit code generation
[ORG 0x0100]            ; Set code start address to 100h (COM file)

[SECTION .text]         ; Section containing code

Start:
        mov DX,EatMsg1  ; Load offset of Eat1 string into DX
        call Writeln    ;  and display it
        mov DX,EatMsg2  ; Load offset of Eat2 string into DX
        call Writeln    ;  and display it

        mov ax,04C00H   ; This function exits the program
        int 21H         ;  and returns control to DOS.

;-----------------------------|
;      PROCEDURE SECTION      |
;-----------------------------|

Write:
        mov AH,09H      ; Select DOS service 9: Print String
        int 21H         ; Call DOS
        ret             ; Return to the caller

Writeln:
        call Write      ; Display the string proper through Write
        mov DX,CRLF     ; Load offset of newline string to DX
        call Write      ; Display the newline string through Write
        ret             ; Return to the caller

;-----------------------------|
;        DATA SECTION         |
;-----------------------------|

[SECTION .data]         ; Section containing initialized data

EatMsg1 DB "Eat at Joe's . . . ",'$'
EatMsg2 DB "...ten million flies can't ALL be wrong!",'$'
CRLF    DB 0DH,0AH,'$'

Calling and Returning

EAT2.ASM does about the same thing as EAT.ASM. It prints a second line as part of the advertising slogan, and that's all in the line of functional innovation. The way the two lines of the slogan are displayed, however, bears examination:

        mov DX,EatMsg1  ; Load offset of Eat1 string into DX
        call Writeln    ;  and display it

Here's a new machine instruction: CALL. The label Writeln refers to a procedure. As you might have gathered (especially if you've programmed in an older language such as Basic or FORTRAN), CALL Writeln simply tells the CPU to go off and execute a procedure named Writeln.

The means by which CALL operates may sound familiar: CALL first pushes the address of the next instruction after itself onto the stack. Then CALL transfers execution to the address represented by the name of the procedure. The instructions contained in the procedure execute. Finally, the procedure is terminated by CALL's alter ego: RET (for RETurn). The RET instruction pops the address off the top of the stack and transfers execution to that address. Since the address pushed was the address of the first instruction after the CALL instruction, execution continues as though CALL had not changed the flow of instruction execution at all. See Figure 9.1.

Figure 9.1: Calling a procedure and returning.
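To make the mechanism concrete, here is a minimal sketch (not part of EAT2.ASM) that imitates a CALL by hand: it pushes a return address and then jumps, so that RET has something to pop. The labels Back and Show and the string Msg are hypothetical names chosen for this illustration, and the sketch assumes the same .COM setup as EAT2.ASM. In real code you would simply write call Show.

```nasm
[BITS 16]               ; Set 16 bit code generation
[ORG 0x0100]            ; COM file: code starts at 100h

[SECTION .text]

Start:
        mov DX,Msg      ; Load offset of the message into DX
        mov AX,Back     ; Step 1 of a CALL: take the address of the
        push AX         ;  instruction after the "call" and push it
        jmp Show        ; Step 2 of a CALL: transfer execution to the procedure
Back:                   ; RET in Show lands here, just as after a real CALL
        mov AX,04C00H   ; Exit the program
        int 21H         ;  and return control to DOS

Show:
        mov AH,09H      ; Select DOS service 9: Print String
        int 21H         ; Call DOS
        ret             ; Pop the pushed address (Back) and continue there

[SECTION .data]

Msg     DB "Hand-rolled call",0DH,0AH,'$'
```

The MOV/PUSH/JMP trio does the same bookkeeping as a single CALL instruction, which is why RET neither knows nor cares how the return address got onto the stack.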

This should remind you strongly of how software interrupts work. The main difference is that the caller does know the exact address of the routine it wishes to call. Apart from that, it's very close to being the same process. (Also note that RET and IRET are not interchangeable. CALL works with RET just as INT works with IRET. Don't get those return instructions confused!)

The structure of a procedure is simple and easy to understand. Look at the Write procedure from EAT2.ASM:

Write:
        mov AH,09H      ; Select DOS service 9: Print String
        int 21H         ; Call DOS
        ret             ; Return to the caller

The important points are these: A procedure must begin with a label, which is (as you should recall) an identifier followed by a colon. Also, somewhere within the procedure, and certainly as the last instruction in the procedure, there must be at least one RET instruction. Execution has to come back from a procedure by way of a RET instruction, but there can be more than one exit door from a procedure. Using more than one RET instruction requires the use of conditional jump instructions, which I won't take up until the next chapter.

Calls within Calls

Within a procedure you can do anything that you can do within the main program. This includes calling other procedures from within a procedure. Even something as simple as EAT2.ASM does that. Look at the Writeln procedure:

Writeln:
        call Write      ; Display the string proper through Write
        mov DX,CRLF     ; Load offset of newline string to DX
        call Write      ; Display the newline string through Write
        ret             ; Return to the caller

The Writeln procedure displays a string to your screen, and then returns the cursor to the left margin of the following screen line. This action is actually two distinct activities, and Writeln very economically uses a mechanism that already exists: the Write procedure. The first thing that Writeln does is call Write to display the string itself to the screen. Remember that the caller loaded the address of the string to be displayed into DX before calling Writeln. Nothing has disturbed DX, so Writeln can immediately call Write, which will fetch the address from DX and display the string to the screen.

Returning the cursor is done by displaying the newline sequence, which is stored in a string named CRLF. (If you recall, the carriage return and line feed character pair was built right into our message string in the EAT.ASM program that we dissected in Chapter 8.) Writeln again uses Write to display CRLF. Once that is done, the work is finished, and Writeln executes a RET instruction to return execution to the caller.

Calling procedures from within procedures requires you to pay attention to one thing: stack space. Remember that each procedure call pushes a return address onto the stack. This return address is not removed from the stack until the RET instruction for that procedure executes. If you execute another CALL instruction before returning from a procedure, the second CALL instruction pushes another return address onto the stack. If you keep calling procedures from within procedures, one return address will pile up on the stack for each CALL until you start returning from all those nested procedures.
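The pile-up is easy to trace in a stripped-down sketch. The procedure names Outer and Inner are hypothetical and do nothing except nest; the comments track how far the stack has grown relative to where it stood at Start, assuming near calls in a .COM file, where each return address takes 2 bytes.

```nasm
[BITS 16]               ; Set 16 bit code generation
[ORG 0x0100]            ; COM file: code starts at 100h

[SECTION .text]

Start:
        call Outer      ; Pushes return address 1: stack is 2 bytes deeper
        mov AX,04C00H   ; Outer has returned: stack is back where it began
        int 21H         ; Exit to DOS

Outer:
        call Inner      ; Pushes return address 2: stack is 4 bytes deeper
        ret             ; Pops return address 1, resuming in Start

Inner:
        ret             ; Pops return address 2, resuming in Outer
```

Each additional level of nesting costs one more 2-byte return address, and nothing is given back until the matching RET executes.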

If you run out of stack space, your program will crash and return to DOS, possibly taking DOS with it. This is why you should take care not to use more stack space than you have. Ironically, in small programs written in real mode flat model, this usually isn't a problem. Stack space isn't allocated in real mode flat model; instead the stack pointer points to the high end of the program's single segment, and the stack uses as much of the segment as it needs. For small programs with only a little data (such as the toy programs we're building and dissecting in this book), 95 percent of the space in the segment has nothing much to do and can be used by the stack if the stack needs it. (Which it doesn't—not in this kind of programming!)

Things are different when you move to real mode segmented model. In that model, you have to explicitly allocate a stack segment of some specific size, and that is all the space that the stack has to work with. So, ironically, in a program that can potentially make use of the full megabyte of real mode memory, it's much easier to foment a stack crash in segmented model than flat model. So, when you allocate space for the stack in real mode segmented model, it makes abundant sense to allocate considerably more stack space than you think you might ever conceivably need. EAT2.ASM at most uses 4 bytes of stack space, because it nests procedure calls two deep. (Writeln within itself calls Write.) In a program like this, stack allocation isn't an issue, even if you migrated it to the segmented model.

Nonetheless, I recommend allocating 512 bytes of stack to get you in the habit of not being stingy with stack space. Obviously, you won't always be able to keep a 128-to-1 ratio of need-to-have, but consider 512 bytes a minimum for stack space allocation in any reasonable program that uses the stack at all. (We allocated only 64 bytes of stack in EATSEG.ASM simply to show you what stack allocation was. The program does not, in fact, make any use of the stack at all.) If you need more, allocate it. Don't forget that there is only one stack in the system, and while your program is running, DOS and the BIOS and any active memory resident programs may well be using the same stack. If they fill it, you'll go down with the system—so leave room!
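As a rough sketch of what a 512-byte allocation might look like, the fragment below follows the segment layout that the NASM manual documents for its obj output format. This is an assumption on my part: it is not the EATSEG.ASM listing, and the segment names code, data, and stack, the label stacktop, and the string Msg are all illustrative. It would be assembled with nasm -f obj and then linked into an .EXE with a 16-bit linker.

```nasm
segment code

..start:                ; Special NASM label marking the .EXE entry point
        mov AX,data     ; Point DS at the data segment
        mov DS,AX
        mov AX,stack    ; Point SS at the stack segment
        mov SS,AX
        mov SP,stacktop ; Stack grows downward from the top of its 512 bytes

        mov DX,Msg      ; Load offset of the message into DX
        call Write      ;  and display it (uses 2 bytes of stack)

        mov AX,04C00H   ; Exit the program
        int 21H         ;  and return control to DOS

Write:
        mov AH,09H      ; Select DOS service 9: Print String
        int 21H         ; Call DOS
        ret             ; Return to the caller

segment data

Msg     DB "Plenty of stack",0DH,0AH,'$'

segment stack stack     ; Segment named stack, with the stack attribute
        resb 512        ; Reserve 512 bytes, the suggested minimum
stacktop:               ; Label marking the high end of the stack
```

Raising the RESB count is all it takes to give the stack more room, which is the habit the 512-byte minimum is meant to build.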

When to Make Something a Procedure

The single most important purpose of procedures is to manage complexity in your programs by replacing a sequence of machine instructions with a descriptive name. This might hardly seem to the point in the case of the Write procedure, which contains only two instructions apart from the structurally necessary RET instruction.

True. But—the Writeln procedure hides two separate calls to Write behind itself: one to display the string, and another to return the cursor to the left margin of the next line. The name Writeln is more readable and descriptive of what the underlying sequence of instructions does than the sequence of instructions itself.

Extremely simple procedures such as Write don't themselves hide a great deal of complexity. They do give certain actions descriptive names, which is valuable in itself. They also provide basic building blocks for the creation of larger and more powerful procedures, as we'll see later on. And those larger procedures will hide considerable complexity, as you'll soon see.

In general, when looking for some action to turn into a procedure, see what actions tend to happen a lot in a program. Most programs spend a lot of time displaying things to the screen. Such procedures as Write and Writeln become general-purpose tools that may be used all over your programs. Furthermore, once you've written and tested them, they may be reused in future programs as well without adding to the burden of code that you must test for bugs.

Try to look ahead to your future programming tasks and create procedures of general usefulness. I'll show you more of those by way of examples as we continue, and tool building is a very good way to hone your assembly language skills.

On the other hand, a short sequence (5 to 10 instructions) that is only called once or perhaps twice within a middling program (that is, over hundreds of machine instructions) is a poor candidate for a procedure.

You may find it useful to define large procedures that are called only once when your program becomes big enough to require breaking it down into functional chunks. A thousand-line assembly language program might split well into a sequence of 9 or 10 largish procedures. Each is only called once from the main program, but this allows your main program to be very indicative of what the program is doing:

Start:  call Initialize
        call OpenFile
Input:  call GetRec
        call VerifyRec
        call WriteRec
        loop Input
        call CloseFile
        call CleanUp
        call ReturnToDOS

This is clean and readable and provides a necessary view from a height when you begin to approach a thousand-line assembly language program. Remember that the Martians are always hiding somewhere close by, anxious to turn your program into unreadable hieroglyphics.

There's no weapon against them with half the power of procedures.