Assembly Language Step by Step 1992


CS:IP. If you enter the G command and press Enter, the CPU will jump to the address built into the JMP instruction and begin executing machine instructions. What happens then?

Your machine will go into a cold boot, just as it would if you powered down and powered up again. (So make sure you're ready for a reboot before you try it!)

This may seem odd. But consider this: the CPU chip has to begin execution somewhere. When the CPU wakes up after being off all night with the power removed, it must get a first machine instruction from somewhere and start executing. Built into the silicon of the 8086/8088 CPU chips is the assumption that a legal machine instruction will exist at address 0FFFF:0. When power is applied to the CPU chip, the first thing it does is place 0FFFFH in CS, and 0 in IP. Then it starts fetching instructions from the address in CS:IP and executing them, one at a time, in the manner that CPUs must.

This is why all PCs have a JMP instruction at 0FFFF:0, and why this JMP instruction always jumps to the routines that bring the PC up from stone cold dead to fully operational.
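To see where 0FFFF:0 actually lands in memory, here is a minimal Python sketch of the segment:offset arithmetic from Chapter 5. The function name `physical_address` is made up for illustration; the rule itself (shift the segment left four bits, add the offset, keep 20 bits) is how the 8086/8088 forms an address.

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    # Segment is multiplied by 16 (shifted left 4 bits), then the offset is added.
    # The mask models the 20 address lines of the 8086/8088.
    return ((segment << 4) + offset) & 0xFFFFF


reset_vector = physical_address(0xFFFF, 0x0000)
print(f"{reset_vector:05X}")  # FFFF0
```

In other words, the first instruction fetched after power-up sits just 16 bytes below the top of the one-megabyte address space, which leaves room for little more than a single JMP.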

So go ahead: load 0FFFFH into CS and 0 into IP, and press G. Feel good? It's what we call the feeling of power.

Following Your Instructions

Meeting Machine Instructions Up Close and Personal

6.1 Assembling and Executing Machine Instructions with DEBUG

6.2 Machine Instructions and Their Operands

6.3 Assembly-Language References

6.4 An Assembly-Language Reference for Beginners

6.5 Rally 'Round the Flags, Boys!

6.6 Using Type Overrides

Machine instructions, those atoms of action that are the steps a program must take to get its work done, are the most visible part of any assembly-language program. The collection of instructions supported by a given CPU is that CPU's instruction set. The 8086 and 8088 CPUs share the same instruction set, which is why most people consider them the same CPU.

This cannot be said for the 80286 and 80386, both of which offer additional instructions not found in the 8086/8088. By and large, I'll only be introducing instructions in this book that the 8086/8088 understand. (I'll show you a few more from the more advanced CPUs in Chapter 11, but there are fewer truly useful new instructions than you might have hoped for.) Furthermore, I can't cover all machine instructions in this book, even limiting myself to the 8086/8088. Those that I will describe are the most common and most useful.

Nor will I abandon my discussion of memory addressing begun in Chapter 5. As I've said before, understanding how the CPU and its instructions address memory is more difficult but probably more important than understanding the instructions themselves. In and around the descriptions of the machine instructions I'll present from this point on there will be discussions and elaborations on memory addressing. Pay attention! If you don't learn the concepts of memory addressing, memorizing the entire instruction set will do you no good at all.

6.1 Assembling and Executing Machine Instructions with DEBUG

The most obvious way to experiment with machine instructions is to build a short program out of them and watch it go. This can easily be done (and we'll be doing it a lot in later chapters) but it's far from the fastest way to do things. Editing, assembling, and linking all take time, and when you only want to look at one machine instruction in action (rather than a crew of them working together) the full development cycle is overkill.

Once more, we turn to DEBUG.

At the close of the last chapter we got a taste of a DEBUG feature called unassembly, which is a peculiar way of saying what most of us call disassembly. This is the reverse of the assembly process we looked at in detail in Chapter 3. Disassembly is the process of taking a binary machine instruction like 42H and converting it into its more readable assembly-language equivalent, INC DX.

In addition to all its other tools, DEBUG also contains a simple assembler, suitable for taking assembly-language mnemonics like INC DX and converting them to their binary machine code form. Later on we'll use a standalone assembler like TASM or MASM to assemble complete assembly-language programs. For the time being, we can use DEBUG to do things one or two instructions at a time.
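The round trip between 42H and INC DX can be sketched in a few lines of Python. This is only an illustration, not how DEBUG is written: the function names are invented, and the sketch handles nothing but the one-byte INC instructions for 16-bit registers, whose opcodes run from 40H through 47H in the register order shown.

```python
# Register order used by the 8086 one-byte INC opcodes 40H..47H.
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def assemble_inc(reg: str) -> int:
    """Assembly: turn a 16-bit register name into the opcode for INC <reg>."""
    return 0x40 + REG16.index(reg.upper())


def disassemble_inc(opcode: int) -> str:
    """Disassembly: turn an opcode in 40H..47H back into its mnemonic."""
    if not 0x40 <= opcode <= 0x47:
        raise ValueError(f"{opcode:02X}H is not a one-byte INC opcode")
    return f"INC {REG16[opcode - 0x40]}"


print(f"{assemble_inc('DX'):02X}")  # 42
print(disassemble_inc(0x42))        # INC DX
```

Assembly and disassembly are the same table read in opposite directions; a real assembler simply has a much larger table.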

Assembling a MOV Instruction

The single most common activity in assembly-language work is getting data from here to there. There are several specialized ways to do this, but only one truly general way: the MOV instruction. MOV can move a byte or word of data from one register to another, from a register into memory, or from memory into a register. What MOV cannot do is move data directly from one address in memory to a different address in memory.

The name MOV is a bit of a misnomer, since what is actually happening is that data is copied from a source to a destination. Once copied to the destination, however, the data does not vanish from the source, but continues to exist in both places. This process conflicts a little with our intuitive notion of moving, which usually means that something disappears from a source and reappears at a destination.

Because MOV is so general and obvious in its action, it's a good place to start in working with DEBUG's assembler.

Invoke DEBUG and use the R command to display the current state of the registers. You should see something like this:

-r
AX=0000 BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=1980 ES=1980 SS=1980 CS=1980 IP=0100 NV UP EI PL NZ NA PO NC
1980:0100 701D          JO      011F

We ignored the third line of the register display before. Now let's think a little bit more about what it means.

When DEBUG is loaded without a specific file to debug, it simply takes the empty region of memory where a file would have been loaded (had a file been loaded when DEBUG was invoked) and treats it as though a program file were really there. The registers all get default values, most of which are zero. IP, however, starts out with a value of 0100H, and the code segment register CS gets the segment address of DEBUG's workspace, which is theoretically empty.

Memory is never really "empty." A byte of memory always contains some value, whether true garbage that happened to reside in memory at power-up time, or a leftover value remaining from the last time that byte of memory was used. In the above register dump, memory at CS:IP contains a JO (jump on overflow) instruction. This rather obscure instruction was not placed there deliberately, but is simply DEBUG's interpretation of the two bytes 701DH that happen to reside at CS:IP. Most likely, the 701D value was part of some data belonging to the last program to use that area of memory. It could have been part of a word-processor file, a spreadsheet, or anything else. Just don't assume that some program necessarily put a JO instruction in memory. Machine instructions are just numbers, after all, and what numbers do in memory depends completely on how you interpret them, and on what utility program you feed them to.

DEBUG's internal assembler assembles directly into memory, and places instructions one at a time, as you enter them at the keyboard, into memory at CS:IP. Each time you enter an instruction, IP is incremented to the next free location in memory. So by continuing to enter instructions, you can actually type an assembly-language program directly into memory.
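To make the point about 701DH concrete, here is a small Python sketch of how a disassembler arrives at JO 011F from those two leftover bytes. The function name is hypothetical and the sketch knows only this one opcode: 70H is the short JO instruction, and the byte after it is a signed displacement counted from the address of the next instruction.

```python
def decode_short_jo(code: bytes, ip: int) -> str:
    """Interpret two bytes at offset `ip` as a short JO instruction."""
    if len(code) < 2 or code[0] != 0x70:
        raise ValueError("not a short JO instruction")
    # The second byte is a signed 8-bit displacement.
    disp = code[1] - 0x100 if code[1] >= 0x80 else code[1]
    # The jump target is relative to the instruction that follows (ip + 2).
    target = (ip + 2 + disp) & 0xFFFF
    return f"JO {target:04X}"


print(decode_short_jo(bytes.fromhex("701D"), 0x0100))  # JO 011F
```

The same two bytes fed to a word processor would be the letter "p" followed by a control character. Nothing about the bytes themselves says which reading is the right one.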

Try it. Type the A (assemble) command and press Enter. DEBUG responds by displaying the current value of CS:IP, and then waits for you to enter an assembly-language instruction. Type MOV AX,1 and press Enter. DEBUG again displays CS:IP and waits for a second instruction. It will continue waiting for instructions until you press Enter without typing anything. Then you'll see DEBUG's dash prompt again.

Now, use the R command again to display the registers. You should see something like this:

-r
AX=0000 BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=1980 ES=1980 SS=1980 CS=1980 IP=0100 NV UP EI PL NZ NA PO NC
1980:0100 B80100        MOV     AX,0001

 

The registers haven't changed—but now the third line shows that the JO instruction is gone, and that the MOV instruction you entered has taken its place. Notice once again that CS contains 1980H, and IP contains 0100H. The address of the MOV instruction is shown as 1980:0100; in other words, at CS:IP.
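The three bytes B80100 in that third line are worth a closer look. As a rough sketch in Python (the function name is invented, and only the "move a 16-bit immediate into a 16-bit register" form is covered), the encoding is the opcode B8H plus the register's number, followed by the immediate value with its low byte first:

```python
# Register numbering used by the MOV reg16,imm16 opcodes B8H..BFH.
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def encode_mov_r16_imm16(reg: str, value: int) -> bytes:
    """Encode MOV <reg16>,<imm16> as three bytes of 8086 machine code."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("immediate value does not fit in 16 bits")
    opcode = 0xB8 + REG16.index(reg.upper())
    # The 16-bit immediate is stored low byte first, then high byte.
    return bytes([opcode, value & 0xFF, value >> 8])


print(encode_mov_r16_imm16("AX", 0x0001).hex().upper())  # B80100
```

Because the instruction occupies three bytes, the next free location after an instruction assembled at 0100H is 0103H.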

Executing a MOV Instruction with the Trace Command

Note that you haven't executed anything. You've simply used DEBUG's A (assemble) command to write a machine instruction into a location in memory.

There are two ways to execute machine instructions from within DEBUG. One way is to execute a program in memory, starting at CS:IP. This means that DEBUG will simply start the CPU executing whatever sequence of instructions begins at CS:IP. We looked at the G command very briefly at the end of the last chapter, when we found the JMP instruction that reboots your PC on power up, and used G to execute that instruction. The command is quite evocative: Go. But don't type G just yet....

You haven't entered a program. You've entered one instruction, and one instruction does not a program make. The instruction after your MOV instruction could be anything at all, recalling that DEBUG is simply interpreting garbage values in memory as random machine instructions. A series of random machine instructions could easily go berserk, locking your system into an endless loop or writing zeroes over an entire segment of memory that might contain part of DOS or of DEBUG itself. We'll use DEBUG's G command a little later, once we've constructed a complete program in memory.

For now, consider the mechanism DEBUG has for executing one machine instruction at a time. It's called Trace, and you invoke it by typing T. The T command will execute the machine instruction at CS:IP, then give control of the machine back to DEBUG. Trace is generally used to "single-step" a machine-code program one instruction at a time, in order to watch what it's up to every step of the way. For now, it's a fine way to execute a single instruction and examine that instruction's effects.

DEBUG's G command executes programs in memory starting at CS:IP; DEBUG's T command executes the single instruction at CS:IP.

So type T. DEBUG will execute the MOV instruction you entered at CS:IP, and then immediately display the registers before returning to the dash prompt. You'll see this:

-t

AX=0001 BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=1980 ES=1980 SS=1980 CS=1980 IP=0103 NV UP EI PL NZ NA PO NC
1980:0103 6E            DB      6E

 

 

 

Look at the first line. DEBUG says AX is now equal to 0001. It held the default value 0000 before; obviously, your MOV instruction worked.

And there's something else to look at here: the third line shows an instruction called DB at CS:IP. Not quite true—DB is not a machine instruction, but an assembly-language directive that means define byte. (We'll return to DB later on, in Chapter 7.) It's DEBUG's way of saying that the number 6EH does not correspond to any machine instruction. It is truly a garbage byte sitting in memory, doing nothing. Executing a 6EH byte as though it were an instruction, however, could cause your machine to do unpredictably peculiar things, up to and including locking up hard.

6.2 Machine Instructions and Their Operands

As we said earlier, MOV copies data from a source to a destination. MOV is an extremely versatile instruction, and understanding its versatility demands a little study of this notion of a source and a destination.

Source and Destination Operands

Many machine instructions, MOV included, have one or more operands. In the machine instruction MOV AX,1 there are two operands. The first is AX, and the second is "1." By convention in assembly language, the first operand belonging to a machine instruction is the destination operand. The second operand is the source operand.

With the MOV instruction the sense of the two operands is pretty literal: The source operand is copied to the destination operand. In MOV AX,1, the source operand 1 is copied into the destination operand AX. The sense of source and destination is not nearly so literal in other instructions, but a rule of thumb is this: whenever a machine instruction causes a new value to be generated, that new value is placed in the destination operand.

There are three different flavors of data that may be used as operands: memory data, register data, and immediate data. I've blown some example MOV instructions up to larger-than-life size in Figure 6.1, to give you a flavor for how the different types of data are specified as operands to the MOV instruction.

Immediate data is the easiest to understand. We'll look at it first.

Immediate Data

The MOV AX,1 machine instruction that you entered into DEBUG was a good example of what we call immediate data, which is accessed through an addressing mode called immediate addressing. Immediate addressing gets its name from the fact that the item being addressed is immediate data built right into the machine instruction. The CPU does not have to go anywhere to find immediate data. It's not in a register, or stored in a data segment somewhere out in memory. Immediate data is always right inside the instruction being fetched and executed; in this case, it is the source operand, 1.

Immediate data must be of an appropriate size for the operand. In other words, you can't move a 16-bit immediate value into an 8-bit register half like AH or DL. Neither DEBUG nor the standalone assemblers will allow you to assemble an instruction like this:

MOV CL,67EFH
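The size check an assembler performs here can be sketched in Python. This is a simplified illustration with invented names, not DEBUG's actual logic: it treats immediates as unsigned hexadecimal values, the way you type them at DEBUG's prompt, and asks only whether the value fits in the destination register's width.

```python
# Width in bits of each 8-bit register half and each 16-bit general register.
REG_BITS = {name: 8 for name in ("AL", "AH", "BL", "BH", "CL", "CH", "DL", "DH")}
REG_BITS.update({name: 16 for name in ("AX", "BX", "CX", "DX", "SP", "BP", "SI", "DI")})


def immediate_fits(reg: str, value: int) -> bool:
    """Return True if the unsigned immediate fits in the named register."""
    return 0 <= value < (1 << REG_BITS[reg.upper()])


print(immediate_fits("CL", 0x67EF))  # False
print(immediate_fits("CX", 0x67EF))  # True
```

CL holds eight bits, so the largest immediate it can accept is 0FFH. The value 67EFH needs a full 16-bit register such as CX.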

Because it's built right into a machine instruction, you might think immediate data would be quick to access. This is true only to a point: fetching anything from memory takes more time than fetching anything from a register, and instructions are, after all, stored in memory.

So, while addressing immediate data is somewhat quicker than addressing ordinary data stored in memory, neither is anywhere near as quick as simply pulling a value from a CPU register.

Also keep in mind that only the source operand may be immediate data. The destination operand is the place where data goes, not where it comes from. Since immediate data consists of literal constants (numbers like 1, 0, or 7F2BH) trying to copy something into immediate data rather than from immediate data simply has no meaning.

Register Data

Data stored inside a CPU register is known as register data, and is accessed directly through an addressing mode called register addressing. Register addressing is done by simply naming the register you want to work with. Here are some examples of register data and register addressing:

MOV AX,BX
MOV BP,SP
MOV BL,CH
MOV ES,DX
ADD DI,AX
AND DX,SI

The last two examples point up the fact that we're not speaking only of the MOV instruction here. Register addressing happens any time data in a register is acted on directly.

The assembler keeps track of certain things that don't make sense, and one such situation is having a 16-bit register and an 8-bit register half within the same instruction. Such operations are not legal—after all, what would it mean to move a two-byte source into a one-byte destination? And while moving a one-byte source into a two-byte destination might seem more reasonable, the CPU does not support it and it cannot be done.

Playing with register addressing is easy using DEBUG. Bring up DEBUG and assemble the following series of instructions:

MOV AX,67FE
MOV BX,AX
MOV CL,BH
MOV CH,BL

Now, reset the value of IP to 0100 using the R command. Then execute each of the machine instructions, one by one, using the T command. The session under DEBUG should look like this:

-A
333F:0100 MOV AX,67FE
333F:0103 MOV BX,AX
333F:0105 MOV CL,BH
333F:0107 MOV CH,BL
333F:0109
-R IP
IP 0100
:0100
-R
AX=0000 BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=333F ES=333F SS=333F CS=333F IP=0100 NV UP EI PL NZ NA PO NC
333F:0100 B8FE67        MOV     AX,67FE
-T

AX=67FE BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=333F ES=333F SS=333F CS=333F IP=0103 NV UP EI PL NZ NA PO NC
333F:0103 89C3          MOV     BX,AX
-T

AX=67FE BX=67FE CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=333F ES=333F SS=333F CS=333F IP=0105 NV UP EI PL NZ NA PO NC
333F:0105 88F9          MOV     CL,BH
-T

AX=67FE BX=67FE CX=0067 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=333F ES=333F SS=333F CS=333F IP=0107 NV UP EI PL NZ NA PO NC
333F:0107 88DD          MOV     CH,BL
-T

AX=67FE BX=67FE CX=FE67 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=333F ES=333F SS=333F CS=333F IP=0109 NV UP EI PL NZ NA PO NC
333F:0109 1401          ADC     AL,01

 

 

Keep in mind that the T command executes the instruction displayed in the third line of the most recent R command display. The ADC instruction in the last register display is yet another garbage instruction, and although executing it would not cause any harm, I recommend against executing random instructions just to see what happens. Executing certain jump or interrupt instructions could wipe out sectors on your hard disk or, worse, cause internal damage to DOS that would not show up until later on.

Let's recap what these four instructions accomplished. The first instruction is an example of immediate addressing—the hexadecimal value 067FEH was moved into the AX register. The second instruction used register addressing to move register data from AX into BX. (Keep in mind that the way the operands are arranged is slightly contrary to the common-sense view of things. The destination operand comes first. Moving something from AX to BX is done by executing MOV BX,AX. Assembly language is just like that sometimes.)
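If you'd like to check the arithmetic of that session away from DEBUG, the four instructions can be mimicked with a toy register model in Python. Everything here is an assumption made for illustration: the class and method names are invented, only AX through DX and their halves are modeled, and flags, memory, and IP are ignored.

```python
class Regs:
    """Toy model of AX, BX, CX, DX and their 8-bit halves."""

    def __init__(self):
        self.word = {"AX": 0, "BX": 0, "CX": 0, "DX": 0}

    def width(self, name: str) -> int:
        return 16 if name.upper() in self.word else 8

    def get(self, name: str) -> int:
        name = name.upper()
        if name in self.word:
            return self.word[name]
        full = self.word[name[0] + "X"]          # e.g. BH lives inside BX
        return full >> 8 if name[1] == "H" else full & 0xFF

    def set(self, name: str, value: int) -> None:
        name = name.upper()
        if name in self.word:
            self.word[name] = value & 0xFFFF
            return
        key = name[0] + "X"
        old = self.word[key]
        if name[1] == "H":                       # replace the high byte only
            self.word[key] = (old & 0x00FF) | ((value & 0xFF) << 8)
        else:                                    # replace the low byte only
            self.word[key] = (old & 0xFF00) | (value & 0xFF)

    def mov(self, dest: str, src) -> None:
        """Copy src (a register name or an immediate integer) into dest."""
        if isinstance(src, int):
            value = src
        else:
            if self.width(src) != self.width(dest):
                raise ValueError("operand sizes do not match")
            value = self.get(src)
        self.set(dest, value)


regs = Regs()
regs.mov("AX", 0x67FE)   # MOV AX,67FE
regs.mov("BX", "AX")     # MOV BX,AX
regs.mov("CL", "BH")     # MOV CL,BH
regs.mov("CH", "BL")     # MOV CH,BL
print(f"CX={regs.get('CX'):04X}")  # CX=FE67
```

Note that the destination always comes first in each call, just as it does in the instructions themselves, and that the model refuses a mismatched pair such as a 16-bit register and an 8-bit half.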

The third instruction and fourth instruction both move data between register halves rather than full, 16-bit registers. These two instructions accomplish something