
Assembly Language Step by Step 1992


the Carry flag with a branching instruction, as I'll explain in Section 9.3.

Keep in mind, however, that the shift instructions are not the only ones that use the Carry flag. A lot of different instructions, including the bitwise logical instructions and the arithmetic instructions, use it too. If you bump a bit into the Carry flag with the intent of testing that bit to see what it is, test it before you execute another instruction that affects the Carry flag.

If you shift a bit into the Carry flag and then immediately execute another shift instruction, the first bit will be bumped off the end of the world and into nothingness.
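A minimal sketch of that ordering rule, in MASM-style syntax. This is a fragment rather than a complete program; the labels BitWasOne and Done are hypothetical, and JC (Jump if Carry) is one of the branching instructions of the kind Section 9.3 covers.

```asm
; Fragment: test the Carry flag immediately after the shift that set it.
; BitWasOne and Done are hypothetical labels used only for illustration.
        shr  AL,1          ; Bit 0 of AL drops into the Carry flag
        jc   BitWasOne     ; Test CF right away, before anything disturbs it
        ; ... code for the case where the bit was 0 ...
        jmp  Done
BitWasOne:
        ; ... code for the case where the bit was 1 ...
Done:

; What NOT to do: either of these destroys the bit before it is tested.
;       shr  AL,1          ; Bit 0 goes into CF ...
;       shr  AL,1          ; ... and is replaced by what used to be bit 1
;
;       shr  AL,1          ; Bit 0 goes into CF ...
;       and  BL,0FH        ; ... and AND clears CF to 0
```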

The Byte2Str Procedure: Converting Numbers to Displayable Strings

As we've seen, DOS has a fairly convenient method for displaying text on your screen. The problem is that it only displays text—if you want to display a numeric value from a register as a pair of digits, DOS won't help. You first have to convert the numeric value into its string representation, and then display the string representation through DOS.

Converting hexadecimal numbers to hexadecimal digits isn't difficult, and the routine to do the job demonstrates several of the new concepts we're exploring in this chapter. Read the Byte2Str procedure carefully:
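The listing itself is missing from this copy. What follows is a reconstruction assembled from the instruction-by-instruction walkthrough later in this section, in MASM-style syntax. The comments are added, and details such as using OFFSET to load BX are assumptions rather than the original listing.

```asm
; Byte2Str, reconstructed from the discussion in this section.
; On entry: AL = value to convert, DS:SI = address of a 2-character string.
; Assumes a 16-byte table named Digits exists in the data segment.
; Uses AH, BX, and DI as scratch registers.
Byte2Str PROC
        mov  DI,AX                 ; Copy the byte (in AL) into DI; AH rides along
        and  DI,000FH              ; Mask out the high 12 bits, keeping the low nybble
        mov  BX,OFFSET Digits      ; BX = offset of the lookup table
        mov  AH,BYTE PTR [BX+DI]   ; Look up the digit for the low nybble
        mov  [SI+1],AH             ; Store it as the right-hand character
        xor  AH,AH                 ; Clear AH to 0
        mov  DI,AX                 ; DI = the original byte with a zero high byte
        shr  DI,1                  ; Shift the high nybble down into
        shr  DI,1                  ;  the low nybble position,
        shr  DI,1                  ;  one bit at a time
        shr  DI,1
        mov  AH,BYTE PTR [BX+DI]   ; Look up the digit for the high nybble
        mov  [SI],AH               ; Store it as the left-hand character
        ret
Byte2Str ENDP
```

The four separate SHR instructions are there because the 8086/8088 can shift by an immediate count of 1 only; larger counts have to go through CL.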

To call Byte2Str you must pass the value to be converted to a string in AL, and the address of the string into which the string representation is to be stored as DS:SI. Typically, DS will already contain the segment address of your data segment, so you most likely will only need to pass the offset of the start of the string in SI.
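As a rough illustration of that calling convention, the fragment below converts the value 7BH and displays it. It assumes Byte2Str and its Digits table are present elsewhere in the program and that DS already points to the data segment. The variable HexStr is a hypothetical name, and display is done with DOS function 09H (print a '$'-terminated string at DS:DX), which is one option among several.

```asm
; Data segment (HexStr is a hypothetical target string):
HexStr  DB   '00',13,10,'$'       ; Two placeholder digits, CR, LF, DOS terminator

; Code segment fragment:
        mov  AL,7BH               ; The value to be converted goes in AL
        mov  SI,OFFSET HexStr     ; DS:SI = where the two digits should land
        call Byte2Str             ; HexStr now begins with '7B'
        mov  DX,OFFSET HexStr     ; DS:DX = string to display
        mov  AH,09H               ; DOS function 09H: print string up to '$'
        int  21H
```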

In addition to the code shown here, Byte2Str requires the presence of a second string in the data segment. This string, whose name must be Digits, contains all 16 of the digits used to express hexadecimal numbers. The definition of Digits looks like this:

Digits DB '0123456789ABCDEF'

The important thing to note about Digits is that each digit occupies a position in the string whose offset from the start of the string is the value it represents. In other words, '0' is at the start of the string, zero bytes offset from the string's start. The character '7' lies seven bytes from the start of the string, and so on. Digits is what we call a lookup table, and it represents (as I'll explain below) an extremely useful mechanism in assembly language.

Splitting a Byte into Two Nybbles

Displaying the value stored in a byte requires two hexadecimal digits. The bottom four bits in a byte are represented by one digit (the least significant, or rightmost, digit) and the top four bits in the byte are represented by another digit (the most significant, or leftmost, digit). The two digits must be converted one at a time, which means that we have to separate the single byte into two four-bit quantities, which are often called nybbles.

To split a byte in two, we need to mask out the unwanted half. This is done with an AND instruction. Note in the Byte2Str procedure that the first instruction, MOV DI,AX, copies the value to be converted (which is in AL) into DI. You don't need to move AH into DI here, but there is no instruction to move an 8-bit register-half like AL into a 16-bit register like DI. AH comes along for the ride, but we really don't need it. The second instruction masks out the high twelve bits of DI using AND. This eliminates what had earlier been in free-rider AH, as well as the high four bits of AL. What's left in DI is all we want: the lower four bits of what was originally passed to the routine in AL.
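To make the masking step concrete, here is a short trace with register contents in the comments. The starting value 3C7BH is an arbitrary choice for illustration: 7BH plays the byte to be converted, and 3CH stands in for whatever junk happens to be in AH.

```asm
; Fragment: register contents after each instruction are shown in the comments.
        mov  AX,3C7BH      ; AH = 3CH (leftover junk), AL = 7BH (byte to convert)
        mov  DI,AX         ; DI = 3C7BH: AH came along for the ride
        and  DI,000FH      ; DI = 000BH: only the low nybble of AL survives
```

In binary, 0011 1100 0111 1011 AND 0000 0000 0000 1111 leaves 0000 0000 0000 1011, which is 0BH.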

Using a Lookup Table

The low nybble of the value to be converted is now in DI. The address of Digits is loaded into BX. Then the appropriate digit character is copied from Digits into AH. The whole trick of using a lookup table lies in the way the character in the table is addressed:

mov AH,BYTE PTR [BX+DI]

DS:BX points to the start of Digits, so [BX] would address the first character in Digits. To get at the desired digit, we must index into the lookup table by adding the offset into the table to BX. There is an 8086/8088 addressing mode intended precisely for use with lookup tables, called base indexed addressing. That sounds more arcane than it is; what it means is that instead of specifying a memory location at [BX], we add an index to BX, and address a memory location at [BX+DI].

If you recall, we masked out all of DI except the four lowest bits of the byte we are converting. These bits will contain some value from 0 through 0FH. Digits contains the hexadecimal digit characters from 0 through F. By using DI as the index, the value in DI will select its corresponding digit character in Digits. We are using the value in DI to look up its equivalent hexadecimal digit character in the lookup table (Digits). See Figure 9.4.

So far, we've read a character from the lookup table into AH. Now, we use yet another addressing mode to move the character from AH back into the second character of the destination string, whose address was passed to Byte2Str in DS:SI. This addressing mode is called indirect displacement addressing, though I question the wisdom of memorizing that term. The mode is nothing more than indirect addressing (addressing the contents of memory at [SI]) with the addition of a literal displacement:

mov [SI+1],AH

This looks a lot like base indexed addressing (which is why the jargon may not be all that useful) with the sole exception that what is added to SI is not a register but a literal constant.
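Side by side, the three addressing forms discussed so far look like this. The fragment is illustrative only; each line stands alone, and the register contents are assumed to have been set up as in Byte2Str.

```asm
; Fragment: three ways of forming a memory address (default segment is DS).
        mov  AH,[BX]       ; Indirect: the offset is whatever is in BX
        mov  AH,[BX+DI]    ; Base indexed: offset = BX plus index register DI
        mov  [SI+1],AH     ; Indirect plus displacement: offset = SI plus the constant 1
```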

Once this move is done, the first of the two nybbles passed to Byte2Str in AL has been converted to its character equivalent and stored in the destination string variable at DS:SI.

Now we have to do it again, this time for the high nybble.

Shifting the High Nybble into the Low Nybble

The high nybble of the value to be converted has been waiting patiently all this time in AL. We didn't mask out the high nybble until we moved AX into DI, and did our masking on DI instead of AX. So AL is still just as it was when Byte2Str began.

The first thing to do is clear AH to 0. Byte2Str uses the XOR AH,AH trick I described in the last section. Then we move AX into DI.

All that remains to be done is to somehow move the high nybble of the low byte of DI into the position occupied by the low nybble. The fastest way to do this is simply to shift DI to the right—four times in a row. This is what the four SHR instructions in Byte2Str do. The low nybble is simply shifted off the edge of DI, into the Carry flag, and then out into nothingness. After the four shifts, what was the high nybble is now the low nybble, and once again, DI can be used as an index into the Digits lookup table to move the appropriate digit into AH.
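A trace of the four shifts, using 7BH as an arbitrary example value, shows the low nybble leaving one bit at a time through the Carry flag:

```asm
; Fragment: register and Carry flag contents are shown in the comments.
        mov  AL,7BH        ; The byte being converted
        xor  AH,AH         ; AX = 007BH
        mov  DI,AX         ; DI = 007BH = 0000 0000 0111 1011B
        shr  DI,1          ; DI = 003DH, CF = 1
        shr  DI,1          ; DI = 001EH, CF = 1
        shr  DI,1          ; DI = 000FH, CF = 0
        shr  DI,1          ; DI = 0007H, CF = 1
```

After the fourth shift DI holds 7, the former high nybble, ready to index into Digits.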

Finally, there is the matter of storing the digit into the target string at DS:SI. Notice that this time, there is no +1 in the MOV instruction:

mov [SI],AH

Why not? The high nybble is the digit on the left, so it must be moved into the first byte in the target string. Earlier, we moved the low nybble into the byte on the right. String indexing begins at the left and works toward the right, so if the left digit is at index 0 of the string, the right digit must be at index 0+1.

Byte2Str does a fair amount of data fiddling in only a few lines. Read it over a few times while following the above discussion through its course until the whole thing makes sense to you.

FIGURE 9.4

Converting Words to Their String Form

Having converted a byte-sized value to a string, it's a snap to convert 16-bit words to their string forms. In fact, it's not much more difficult than calling Byte2Str twice:
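The Word2Str listing is missing from this copy. The version below is a reconstruction based on the description that follows (the copy into CX, the XCHG, the ADD SI,2, and the two calls to Byte2Str); the comments and the exact instruction order are assumptions rather than the original listing.

```asm
; Word2Str, reconstructed from the discussion in this section.
; On entry: AX = word to convert, DS:SI = address of a 4-character string.
; Relies on Byte2Str leaving CX and SI untouched.
Word2Str PROC
        mov  CX,AX         ; Save an unmodified copy of the word in CX
        xchg AH,AL         ; Swap the bytes so the high byte is in AL
        call Byte2Str      ; Convert the high byte into characters 1 and 2
        add  SI,2          ; Point SI at the third character of the string
        mov  AX,CX         ; Reload the original word; AL is the low byte again
        call Byte2Str      ; Convert the low byte into characters 3 and 4
        ret
Word2Str ENDP
```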

The logic here is fairly simple—if you understand how Byte2Str works. Moving AX into CX simply saves an unmodified copy of the word to be converted in CX. Something to watch out for here: if Byte2Str were to use CX for something, this saved copy would be mangled, and you might be caught wondering why things weren't working correctly. This is a common enough bug for the following reason: you create Byte2Str, and then create Word2Str to call Byte2Str. The first version of Byte2Str does not make use of CX, so it's safe to use CX as a storage bucket.

However, later on you beef up Byte2Str somehow, and in the process add some instructions that use CX. You plumb fergot that Word2Str stored a value in CX while Word2Str was calling Byte2Str. It's pointless arguing whether the bug is that Byte2Str uses CX, or that Word2Str assumes that no one else is using CX. To make things work again, you would have to stash the value somewhere other than in CX. Pushing it onto the stack is your best bet if you run out of registers. (You might hit on the idea of stashing it in an unused segment register like ES, but I warn against it! Later on, if you try to use these utility routines in a program that makes use of ES, you'll be in a position to mess over your memory addressing royally. Let segment registers hold segments. Use the stack instead.)

Virtually everything that Word2Str does involves getting the converted digits into the proper positions in the target string. A word requires four hexadecimal digits altogether. In a string representation, the high byte occupies the left two digits, and the low byte occupies the right two digits. Since strings are indexed from the left to the right, it makes a certain sense to convert the left end of the string first.

This is the reason for the XCHG instruction. It swaps the high and low bytes of AX, so that the first time Byte2Str is called, the high byte is actually in AL instead of AH. (Remember that Byte2Str converts the value passed in AL.) Byte2Str does the conversion and stores the two converted digits in the first two bytes of the string at DS:SI.

For the second call to Byte2Str, AH and AL are not exchanged. Therefore the low byte will be the one converted. Notice the following instruction:

add SI,2

This is not heavy-duty math, but it's a good example of how to add a literal constant to a register in assembly language. The idea is to pass the address of the second two bytes of the string to Byte2Str as though they were actually the start of the string. This means that when Byte2Str converts the low byte of AX, it stores the two equivalent digits into the second two bytes of the string.

For example, if the high byte was 0C7H, the digits C and 7 would be stored in the first two bytes of the string, counting from the left. Then, if the low byte were 042H, the digits 4 and 2 would be stored at the third and fourth bytes of the string, respectively. The whole string would read C742 when the conversion was complete.

As I've said numerous times before: understand memory addressing and you've got the greater part of assembly language in your hip pocket. Most of the trick of Byte2Str and Word2Str lies in the different ways they address memory. As you study them, focus on the machinery behind the lookup table and target string addressing. The logic and shift instructions are pretty obvious and easy to figure out by comparison.
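Returning to the earlier advice about stashing the saved word on the stack rather than in CX, one possible variant of Word2Str is sketched below. It is an illustration of that advice, not a listing from the book, and it assumes Byte2Str returns with the stack balanced and SI unchanged.

```asm
; Word2Str variant that saves the word on the stack instead of in CX.
; On entry: AX = word to convert, DS:SI = address of a 4-character string.
Word2Str PROC
        push AX            ; Save the unmodified word on the stack
        xchg AH,AL         ; High byte into AL so it is converted first
        call Byte2Str      ; Convert the high byte into characters 1 and 2
        add  SI,2          ; Point SI at the third character of the string
        pop  AX            ; Recover the original word; AL is the low byte again
        call Byte2Str      ; Convert the low byte into characters 3 and 4
        ret
Word2Str ENDP
```

With this version, later changes to Byte2Str are free to use CX without breaking Word2Str. The price is one PUSH and one POP, which must always stay paired.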

9.3 Flags, Tests, and Branches

Those assembler-knowledgeable folk who have stuck with me this long may be wondering why I haven't covered conditional jumps until this late in the book. I mean, we've explained procedures already, and haven't even gotten to jumps yet.

Indeed. That's the whole point. I explained procedures before jumps because when people learn those two concepts the other way around, they have a tendency to use jumps for everything, even when procedures are called for. Unlike some high-level languages like Pascal and Modula-2, there is no way around jumps—(what they so derisively call "GOTOs")—in assembly language. Sadly, some people then assume that jumps are "it," and don't bother imposing any structure at all on their assembly-language programs. By teaching procedures first, I feel that I've at least made possible a more balanced approach on the part of the learner.

Besides, I felt it wise to teach how to manage complexity before teaching the number one means of creating complexity.

Unconditional Jumps

A jump is just that: an abrupt change in the flow of instruction execution. Ordinarily, instructions are executed one after the other, in order, moving from low memory toward high memory. Jump instructions alter the address of the next instruction to be executed. Execute a jump instruction, and zap! All of a sudden you're somewhere else in the code segment. A jump instruction can move execution forward in memory, or backward. It can bend execution back into a loop. (And it can tie your program logic in knots....)

There are two kinds of jumps: conditional and unconditional. An unconditional jump is a jump that always happens. It takes this form:

jmp <label>

When this instruction executes, the sequence of execution moves to the instruction located at the label specified by the <label> operand. It's just that simple.

The unconditional JMP instruction is of limited use by itself. It almost always works in conjunction with the conditional jump instructions that test the state of the various 8086/8088 flags. You'll see how this works in just a little while, once we've gone through conditional jumps too.

Conditional Jumps

A conditional jump instruction is one of those fabled tests I introduced in Chapter 0. When executed, a conditional jump tests something, usually one of the flags in the Flags register. If the flag being tested happens to be in a particular state, execution jumps to a label somewhere else in the code segment; otherwise it simply "falls through" to the next instruction in sequence.

This either/or nature is important. A conditional jump instruction either jumps, or it falls through. Jump, or no jump. It can't jump to one of two places, or three. Whether it jumps or not depends on the current value of one single bit within the CPU.

For example, the Zero flag (ZF) is set to 1 by certain instructions when the result of that instruction is 0. The decrement (DEC) instruction is one of these instructions. DEC subtracts 1 from its operand. If by that subtraction the operand becomes 0, ZF is set to 1. One of the conditional jump instructions, Jump if Zero (JZ), tests ZF. If ZF is found set to 1, a jump occurs, and execution transfers to a label. If ZF is found to be 0, execution falls through to the next instruction in line.

Here's a simple (and non optimal) example, using instructions you should already understand:

          mov  Counter,17   ; We're going to do this 17 times
WorkLoop: call DoWork       ; Process the data
          dec  Counter      ; Subtract 1 from the counter
          jz   AllDone      ; If the Counter is 0, we're done!
          jmp  WorkLoop     ; Otherwise, go back and execute the loop again

The label AllDone isn't shown in the example because it's somewhere else in the program, maybe a long way off. The important thing is that the JZ instruction is a two-way switch. If ZF is equal to 1, execution moves to the location marked by the label AllDone. If ZF is equal to 0, execution falls through to the next instruction in sequence. Here, that would be the unconditional jump instruction JMP WorkLoop.

This simple loop is one way to perform a call to a procedure some set number of times. A count value is stored in a variable named Counter. The procedure is called. After control returns from the procedure, Counter is decremented by one. If that drops the counter to 0, the procedure has been called the full number of times, and the loop sends execution elsewhere. If the counter still has some count in it, execution loops back to the procedure call and begins the loop again.

Note the use of an unconditional jump instruction to "close the loop."

Beware Endless Loops!

This is a good place to warn you of a common sort of bug that produces the dreaded endless loop, which locks up your machine and forces you to reboot to get out. Suppose the code snippet shown above were instead done the following way:

WorkLoop: mov  Counter,17   ; We're going to do this 17 times
          call DoWork       ; Process the data
          dec  Counter      ; Subtract 1 from the counter
          jz   AllDone      ; If the counter is 0, we're done!
          jmp  WorkLoop     ; Otherwise, go back and execute the loop again

This becomes a pretty obvious endless loop. (However, you'll be appalled at how often such an obvious bug will dance in your face for hours without being recognized as such....) The key point is that the instruction that loads the initial value to the counter is inside the loop! Every time the loop happens, the counter is decremented by one ... and then immediately reloaded with the original count value. The count value thus never gets smaller than the original value minus 1, and the loop (which is waiting for the counter to become 0) never ends.

You're unlikely to do something like this deliberately, of course. But it's very easy to type a label at the wrong place, or (easier still) to type the name of the wrong label, a label that might be at or before the point where a counter is loaded with its initial value. Assembly-language programming requires concentration and endless attention to detail. If you pay attention to what you're doing, you'll make fewer "stupid" errors like the one above.

But I can promise you that you'll still make a few.

Jumping on the Absence of a Condition

There are a fair number of conditional jump instructions, of which I'll discuss only the most common in this book. Their number is increased by the fact that every conditional jump instruction has an alter ego: a jump when the specified condition is not set to 1.

The JZ instruction provides a good example. JZ jumps to a new location in the code segment if ZF is set to 1. JZ's alter ego is the Jump if Not Zero (JNZ). JNZ jumps to a label if ZF is 0, and falls through if ZF is 1.

This may be confusing at first, because JNZ jumps when ZF is equal to 0. Keep in mind that the name of the instruction applies to the condition being tested, and not necessarily the binary bit value of the flag. In the previous code example, JZ jumped when the DEC instruction decremented the Counter to 0. The condition being tested is something connected with an earlier instruction, not simply the state of ZF.

Think of it this way: a condition raises a flag. "Raising a flag" means setting the flag to 1. When one of numerous instructions forces an operand to a value of 0, (which is the condition) the Zero flag is raised. The logic of the instruction refers to the condition, not to the flag.

As an example, let's improve our little loop. I should caution you that its first implementation, while correct and workable in the strictest sense, is awkward and not the best way to code that kind of thing. It can be improved in several ways. Here's one:

          mov  Counter,17   ; We're going to do this 17 times