
Assembly Language Step by Step 1992


interesting. Look at the last register display, and compare the values of BX and CX. By moving the value from BX into CX a byte at a time, it was possible to reverse the order of the two bytes making up BX. The high half of BX (what we sometimes call the most significant byte, or MSB, of BX) was moved into the low half of CX. Then the low half of BX (what we sometimes call the least significant byte, or LSB, of BX) was moved into the high half of CX. This is just a sample of the sorts of tricks you can play with the general-purpose registers.

Just to disabuse you of the notion that the MOV instruction should be used to exchange the two halves of a 16-bit register, let me suggest that you do the following: before you exit DEBUG from your previous session, assemble this instruction and execute it using the T command:

XCHG CL,CH

The XCHG instruction exchanges the values contained in its two operands. What was interchanged before is interchanged again, and the value in CX will match the values already in AX and BX. A good idea while writing your first assembly-language programs is to double check the instruction set periodically to see that what you have cobbled together with four or five instructions is not possible using a single instruction. The 8086/8088 instruction set is very good at fooling you in that regard!
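To make the byte shuffling concrete, here is a minimal Python sketch of the two approaches. It is only a model of the register arithmetic, not 8086 code; the function names and the sample value 1234H are assumptions chosen for illustration.

```python
def swap_with_moves(bx):
    """Model moving BX into CX one byte at a time, crossing the halves."""
    bh = (bx >> 8) & 0xFF   # high half of BX
    bl = bx & 0xFF          # low half of BX
    cl = bh                 # high half of BX goes into the low half of CX
    ch = bl                 # low half of BX goes into the high half of CX
    return (ch << 8) | cl


def xchg_halves(cx):
    """Model XCHG CL,CH: exchange the two halves of CX in one step."""
    cl = cx & 0xFF
    ch = (cx >> 8) & 0xFF
    cl, ch = ch, cl
    return (ch << 8) | cl


bx = 0x1234                      # hypothetical starting value
cx = swap_with_moves(bx)
print(f"CX after the two moves: {cx:04X}")            # 3412
print(f"CX after XCHG CL,CH:    {xchg_halves(cx):04X}")  # 1234
```

The second line of output shows why the XCHG leaves CX matching BX again: swapping the halves twice restores the original order.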

Memory Data

Immediate data is built right into its own machine instruction, and register data is stored in one of the CPU's limited collection of internal registers. In contrast, memory data is stored somewhere in the megabyte vastness of external memory. Specifying that address is much more complicated than simply reaching into a machine instruction or naming a register.

You should recall that a memory location must be specified in two parts: a segment address, which is one of 65,536 locations spaced every 16 bytes in memory; and an offset address, which is the number of bytes by which the specified byte is offset from the start of the segment. Within the CPU, the segment address is kept in one of the four segment registers, while the offset address (generally just called the offset) may be in one of a select group of general-purpose registers. To pin down a single byte within the 8086/8088's megabyte of memory, you need both the segment and offset components. We generally write them together, specified with a colon to separate them, as either literal constants or register names: 0B00:0167, DS:SI, or CS:IP.
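Because segments are spaced every 16 bytes, the CPU combines the two parts by multiplying the segment by 16 and adding the offset. The following Python sketch models that arithmetic; the function name is an assumption, and the sample address 0B00:0167 is used purely as an illustration.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address.

    The segment is shifted left four bits (multiplied by 16) and the offset
    is added. The result is kept to 20 bits, the width of the 8086/8088
    address bus.
    """
    return ((segment << 4) + offset) & 0xFFFFF


print(f"{physical_address(0x0B00, 0x0167):05X}")  # 0B167
# Different segment:offset pairs can name the same byte:
print(f"{physical_address(0x0B16, 0x0007):05X}")  # 0B167
```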

BX's Hidden Agenda

One of the easiest mistakes to make early on is to assume that you can use any of the general-purpose registers to specify an offset for memory data. Not so! If you try to specify an offset in AX, CX, or DX, the assembler will flag an error. (Register SP is a special case, and addresses data located on the stack, as I'll explain in Chapter 7.)

Only BP, BX, SI, and DI may hold an offset for memory data.

So, in fact, general-purpose registers AX, CX, and DX aren't quite so general after all. Why was general-purpose register BX singled out for special treatment? Think of it as the difference between dreams and reality for Intel. In the best of all worlds, every register could be used for all purposes. Unfortunately, when CPU designers get together and argue about what their nascent CPU is supposed to do, they are forced to face the fact that there are only so many transistors on the chip to do the job.

Each chip function is given a "budget" of transistors (sometimes numbering in the tens or even hundreds of thousands), and if the desired logic cannot be implemented using that number of transistors, the expectations of the designers have to be brought down a notch, and some CPU features shaved from the specification.

The 8086 and 8088 are full of such compromises. There were not enough transistors available at design time to allow all general-purpose registers to do everything, so in addition to the truly general-purpose ability to hold data, each 8086/8088 register has what I call a "hidden agenda." Each register has some ability that none of the others share. I'll describe each register's hidden agenda at some appropriate time in this book, and I'll call it out as such.

Register BX is the X register chosen to address memory data. None of the other X registers can be used in this fashion. By convention, and because there simply isn't enough horsepower in the CPU to allow all registers to do it, addressing memory data is one element of BX's hidden agenda.

Using Memory Data

With one or two important exceptions (the string instructions, which I cover to a degree, but not exhaustively, in Chapter 10), only one of an instruction's two operands may specify a memory location. In other words, you can move an immediate value to memory, or a memory value to a register, or some other similar combination, but you can't move a memory value directly to another memory value. This is just an inherent limitation of the CPU, and we have to live with it, inconvenient as it gets at times.

Specifying a memory address as one of an instruction's operands is a little complicated. The offset address must be resident in one of the registers that can hold an offset (BX, BP, SI, or DI). To specify that we want the data at the memory location contained in the register rather than the data in the register itself, we use square brackets around the name of the register. In other words, to move the word at address DS:BX into register AX, we would use the following instruction:

MOV AX,[BX]

Similarly, to move a value residing in register DX into the word at address DS:DI, you would use this instruction:

MOV [DI],DX

Segment Register Assumptions

The only problem with these examples is: where does it say to use DS as the segment register?

It doesn't. To keep addressing notation simple, the 8086/8088 makes certain assumptions about certain instructions in combination with certain registers. There is no particular system to these assumptions, and like dates in history or Spanish irregular verbs, you'll just have to memorize them, or at least know where to look them up. (The where is in Appendix C in this book.)

One of these assumptions is that the MOV instruction uses the segment address stored in segment register DS unless you explicitly tell it otherwise. In the case above, we did not tell the MOV instruction to use some segment register other than DS, so it fell back on its assumptions and used DS. However, had you specified the offset as residing in register BP, the MOV instruction would have assumed the use of segment register SS instead. This assumption involves a memory mechanism known as the stack, which we won't really address until the next chapter.

Overriding Segment Assumptions for Memory Data

But what if you want to use CS as a segment register with the MOV instruction? It's not difficult. The instruction set includes what are called segment override prefixes. These are not precisely instructions, but are more like the filters that may be snapped in front of a camera lens: the filter is not itself a lens, but it alters the way the lens operates.

There is one segment override prefix for each of the four segment registers (CS, DS, SS, and ES). In assembly language these prefixes are written as the name of the segment register followed by a colon:

Override Prefix    Usage

CS:                Forces usage of the code segment register CS
DS:                Forces usage of the data segment register DS
SS:                Forces usage of the stack segment register SS
ES:                Forces usage of the extra segment register ES

In use, the segment override prefix is placed immediately in front of the memory data reference whose segment register assumption is to be overridden. For example, to force a MOV instruction to copy a value from the AX register into a location at an offset (contained in SI) into the code segment, whose address is in the CS register, you would use this instruction:

MOV CS:[SI],AX

Without the "CS:", this instruction would move the value of AX into the data segment, at the address specified as DS:SI.

Prefixes in use are very reminiscent of how an address is written; in fact, understanding how prefixes work will help you keep in mind that in every reference to memory data within an instruction, there is a ghostly segment register assumption floating in the air. You may not see the ghostly "DS:" assumption in your MOV instruction, but if you forget that it is there the whole concept of memory data will begin to seem arbitrary and magical.

Every reference to memory data includes either an assumed segment register or a segment override prefix to specify a segment register other than the assumed segment register.
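That rule can be sketched as a small lookup in Python. This is only a model for illustration, and the names are assumptions. The table of defaults covers the four registers that may appear inside the brackets; on the 8086/8088, offsets in BX, SI, and DI assume DS, while an offset in BP assumes SS.

```python
# Default segment register assumed for each register that may hold an offset.
DEFAULT_SEGMENT = {"BX": "DS", "SI": "DS", "DI": "DS", "BP": "SS"}
SEGMENT_REGISTERS = ("CS", "DS", "SS", "ES")


def effective_segment(offset_reg, override=None):
    """Return the segment register used for a memory reference like [SI].

    offset_reg is the register inside the square brackets; override is the
    name given in a segment override prefix, or None if there is no prefix.
    """
    if offset_reg not in DEFAULT_SEGMENT:
        raise ValueError(f"{offset_reg} cannot hold an offset for memory data")
    if override is None:
        return DEFAULT_SEGMENT[offset_reg]   # the "ghostly" assumption
    if override not in SEGMENT_REGISTERS:
        raise ValueError(f"{override} is not a segment register")
    return override


print(effective_segment("SI"))         # DS  (MOV [SI],AX)
print(effective_segment("SI", "CS"))   # CS  (MOV CS:[SI],AX)
```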

At the machine-code level, a segment override prefix is a single binary byte. The prefix byte is placed in front of rather than within a machine instruction. In other words, if the binary bytes comprising a MOV AX,[BX] instruction (which we call that instruction's opcode) are 8BH 07H, adding the ES segment override prefix to the instruction (MOV AX,ES:[BX]) places a single 26H in front of the opcode bytes, giving us 26H 8BH 07H as the full binary equivalent.
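The byte arithmetic is simple enough to model in a few lines of Python. Only the ES prefix value (26H) and the MOV AX,[BX] bytes come from the text above; the other three prefix values are the standard 8086 encodings, and the function name is an assumption.

```python
# One-byte segment override prefixes in 8086 machine code.
SEGMENT_PREFIX = {"ES": 0x26, "CS": 0x2E, "SS": 0x36, "DS": 0x3E}


def with_override(instruction, segment):
    """Place the override prefix byte in front of an instruction's bytes."""
    return bytes([SEGMENT_PREFIX[segment]]) + bytes(instruction)


mov_ax_bx = bytes([0x8B, 0x07])                 # MOV AX,[BX]
encoded = with_override(mov_ax_bx, "ES")        # MOV AX,ES:[BX]
print(encoded.hex(" ").upper())                 # 26 8B 07
```

Note that the original two bytes are untouched; the prefix simply rides in front of them.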

Memory Data Summary

Memory data consists of a single byte or word in memory, addressed by way of a segment value and an offset value. The register containing the offset address is enclosed in square brackets to indicate that the contents of memory, rather than the contents of the register, are being addressed. The segment register used to address memory data is usually assumed according to a complex set of rules. Optionally, a segment override prefix may be placed in the instruction to specify some segment register other than the default segment register.

Figure 6.2 shows what happens during a MOV AX,ES:[BX] instruction. The segment address component of the full 20-bit memory address is contained inside the CPU in segment register ES. Ordinarily, the segment address would be in register DS, but the MOV instruction contains the ES: segment override prefix. The offset address component is specified to reside in the BX register.

The CPU sends out the values in ES and BX to the memory system side by side. Together, the two values pin down one memory location where MyWord begins. MyWord is actually two bytes, but that's fine—the 8086 CPU can bring both bytes into the CPU at once, while the 8088 brings both bytes in separately, one after the other. The CPU handles details like that and you needn't worry about it. Because AX is a 16-bit register, two 8-bit bytes can fit into it quite nicely.
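As a rough model of what the figure describes, the Python sketch below treats memory as one megabyte of bytes and fetches a word the way MOV AX,ES:[BX] would. The register values and the stored word are hypothetical. One fact not stated above is assumed here: the 8086/8088 stores the low byte of a word at the lower address.

```python
def physical_address(segment, offset):
    """Form the 20-bit address from a segment and an offset."""
    return ((segment << 4) + offset) & 0xFFFFF


def read_word(memory, segment, offset):
    """Fetch two bytes from memory and combine them into a 16-bit word."""
    low = memory[physical_address(segment, offset)]
    high = memory[physical_address(segment, (offset + 1) & 0xFFFF)]
    return (high << 8) | low


memory = bytearray(1 << 20)        # the full megabyte, all zeros

es, bx = 0x2000, 0x0010            # hypothetical register contents
start = physical_address(es, bx)   # where "MyWord" begins
memory[start] = 0x34               # low byte of MyWord
memory[start + 1] = 0x12           # high byte of MyWord

ax = read_word(memory, es, bx)     # models MOV AX,ES:[BX]
print(f"AX = {ax:04X}")            # AX = 1234
```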

The segment address may reside in any of the four segment registers: CS, DS, SS, or ES. However, the offset address may reside only in registers BX, BP, SI, or DI. (SP also holds an offset, but only for the stack instructions.)

AX, CX, and DX may not be used to contain an offset address during memory addressing.

Limitations of the MOV Instruction

The MOV instruction can move nearly any register to any other register. For reasons probably having to do with the limited budget of transistors on the 8086 and 8088 chips, MOV can't quite do any move you can think of—here is a list of MOV's limitations:

• MOV cannot move memory data to memory data. In other words, an instruction like MOV [SI],[BX] is illegal. Either of MOV's two operands may be memory data, but both cannot be at once.

• MOV cannot move one segment register into another. Instructions like MOV CS,SS are illegal. This usage might have come in handy, but it simply can't be done.

• MOV cannot move immediate data into a segment register. You can't write MOV CS,0B800H. Again, it would be handy but you just can't do it.

• MOV cannot move one of the 8-bit register halves into a 16-bit register, nor vice versa. There are easy ways around any possible difficulties here, and preventing moves between operands of different sizes can keep you out of numerous kinds of trouble.

These limitations are, of course, over and above those situations that simply don't make sense: moving a register or memory into immediate data, moving immediate data into immediate data, specifying a general-purpose register as a segment register to contain a segment, or specifying a segment register to contain an offset address. Figure 6.3 shows numerous illegal MOV instructions that illustrate these various limitations and nonsense situations.
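The listed limitations can be captured in a small checker. The Python sketch below is an illustration only: it tests just the rules listed above, it is not a complete validator for MOV, and every name in it is an assumption.

```python
R8 = {"AL", "AH", "BL", "BH", "CL", "CH", "DL", "DH"}
R16 = {"AX", "BX", "CX", "DX", "BP", "SP", "SI", "DI"}
SREG = {"CS", "DS", "SS", "ES"}


def classify(operand):
    """Return (kind, size in bits or None) for an operand written as text."""
    op = operand.upper().replace(" ", "")
    if op in R8:
        return ("reg", 8)
    if op in R16:
        return ("reg", 16)
    if op in SREG:
        return ("sreg", 16)
    if op.startswith("[") and op.endswith("]"):
        return ("mem", None)
    return ("imm", None)   # anything else is treated as immediate data


def mov_problem(dest, src):
    """Return the rule that MOV dest,src breaks, or None if none applies."""
    dkind, dsize = classify(dest)
    skind, ssize = classify(src)
    if dkind == "imm":
        return "destination cannot be immediate data"
    if dkind == "mem" and skind == "mem":
        return "memory to memory"
    if dkind == "sreg" and skind == "sreg":
        return "segment register to segment register"
    if dkind == "sreg" and skind == "imm":
        return "immediate data to segment register"
    if dsize and ssize and dsize != ssize:
        return "operand sizes differ"
    return None


for dest, src in [("[SI]", "[BX]"), ("CS", "SS"), ("CS", "0B800H"),
                  ("AX", "BL"), ("AX", "[BX]")]:
    print(f"MOV {dest},{src}: {mov_problem(dest, src) or 'no listed rule broken'}")
```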

6.3 Assembly-Language References

MOV is a good start. Like a medium-sized screwdriver, you'll end up using it for normal tasks and maybe some abnormal ones, just as I use screwdrivers to pry nails out of boards, club Black Widow spiders in the garage bathroom, discharge large electrolytic capacitors, and other intriguing things over and above workaday screw-turning. The 8086/8088 instruction set contains dozens of instructions, however, and over the course of the rest of this book I'll be mixing in descriptions of various other instructions with further discussions of memory addressing and program logic and design.

Remembering a host of tiny, tangled details involving dozens of different instructions is brutal and unnecessary. Even the "Big Guys" don't try to keep it all between their ears at all times. Most keep a blue card or some other sort of reference document handy to jog their memories about machine instruction details.

Blue Cards

A blue card is a reference summary printed on a piece of colored card stock. It folds up like a road map and fits in your pocket. The original blue card may actually have been blue, but knowing the perversity of programmers in general, it was probably bright orange. Most assemblers come with a blue card. Guard it with your life.

Blue cards aren't always cards anymore. One of the best is a full sheet of very stiff shiny plastic, sold by Micro Logic Corp. of Hackensack, NJ*. The blue card sold with Microsoft's MASM is actually published by Intel, and has grown to a pocket-sized booklet stapled on the spine.

Blue cards contain very terse summaries of what an instruction does, what operands are legal, what flags it affects, and how many machine cycles it takes to execute. This information, while helpful in the extreme, is often so brief that newcomers might not quite fathom which edge of the card is up.

6.4 An Assembly-Language Reference for Beginners

In deference to people just starting out in assembly language, I have put together a beginner's reference to the most common 8086/8088 instructions and called it Appendix A. It contains at least a page on every instruction I'll be covering in this book, plus a few additional instructions that everyone ought to know. It does not include descriptions of every instruction, but only the most common and most useful. Once you've gotten skillful enough to use the more arcane instructions, you should be able to pick up the blue card provided with your assembler and run with it.

On the next page is a sample entry from Appendix A. Refer to it during the following discussion.

The instruction's mnemonic is at the top of the page, highlighted in a box to make it easy to spot while flipping quickly through the appendix. To the mnemonic's right is the name of the instruction, which is a little more descriptive than the naked mnemonic.

*Micro Chart, Micro Logic Corp. P.O. Box 174, Hackensack, NJ 07602

NEG    Negate (two's complement; multiply by -1)

Flags affected:

O D I T S Z A P C    OF: Overflow flag     TF: Trap flag    AF: Aux carry
F F F F F F F F F    DF: Direction flag    SF: Sign flag    PF: Parity flag
*       * * * * *    IF: Interrupt flag    ZF: Zero flag    CF: Carry flag

Legal forms:

NEG r8

NEG m8

NEG r16

NEG m16

Examples:

NEG AL
NEG CX
NEG BYTE PTR [BX]    ; Negates byte quantity at DS:BX
NEG WORD PTR [DI]    ; Negates word quantity at DS:DI

Notes:

This is the assembly-language equivalent of multiplying a value by -1. Keep in mind that negation is not the same as simply inverting each bit in the operand. (Another instruction, NOT, does that.) The process is also known as generating the two's complement of a value. The two's complement of a value added to that value yields zero. -1 = $FF; -2 = $FE; -3 = $FD; etc.

If the operand is 0, CF is cleared and ZF is set; otherwise CF is set and ZF is cleared. If the operand contains the maximum negative value (-128 for 8-bit or -32768 for 16-bit) the operand does not change, but OF and CF are set. SF is set if the result is negative, or cleared if not. PF is set if the low-order 8 bits of the result contain an even number of set (1) bits; otherwise PF is cleared.

NOTE: You must use a type override specifier (BYTE PTR or WORD PTR) with memory data!

r8 = AL AH BL BH CL CH DL DH
r16 = AX BX CX DX BP SP SI DI
sr = CS DS SS ES
m8 = 8-bit memory data                m16 = 16-bit memory data
i8 = 8-bit immediate data             i16 = 16-bit immediate data
d8 = 8-bit signed displacement        d16 = 16-bit signed displacement

Flags

Immediately beneath the mnemonic is a minichart of machine flags in the Flags register. I haven't spoken in detail of flags yet, but the Flags register is a collection of one-bit values that retain certain essential information about the state of the machine for short periods of time. Many (but by no means all) 8086/8088 instructions change the values of one or more flags. The flags may then be individually tested by one of the conditional jump instructions, which then changes the course of the program depending on the state of the flags.

We'll get into this business of tests and jumps in Chapter 9. For now, simply understand that each of the flags has a name, and that for each flag there is a symbol in the flags minichart. You'll come to know the flags by their 2-character symbols in time, but until then the full names of the flags are shown to the right of the minichart. Most of the flags are not used frequently in beginning assembly-language work. Most of what you'll be paying attention to, flags-wise, is the Carry flag (CF). It's used, as you might imagine, for keeping track of binary arithmetic when an arithmetic operation carries out of a single byte or word. There will be an asterisk (*) beneath the symbol of any flag affected by the instruction. How the flag is affected depends on what the instruction does; you'll have to divine that from the Notes section of the reference sheet. When an instruction affects no flags at all, the word <none> will appear in the minichart.

In the example page, the minichart indicates that the NEG instruction affects the Overflow flag, the Sign flag, the Zero flag, the Auxiliary carry flag, the Parity flag, and the Carry flag. The ways that the flags are affected depend on the results of the negation operation on the operand specified. These ways are summarized in the second paragraph of the Notes section.
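To see those flag rules in action, here is a minimal Python model of NEG built from the Notes section of the sample entry. It is a sketch, not a full emulation: the function name is an assumption, and the Auxiliary carry flag is left out because the Notes do not describe how it is set.

```python
def neg(value, bits=8):
    """Model NEG on an 8-bit or 16-bit operand.

    Returns the two's complement result and a dict of the flags that the
    Notes section describes (AF is deliberately omitted).
    """
    mask = (1 << bits) - 1          # 0xFF or 0xFFFF
    sign = 1 << (bits - 1)          # 0x80 or 0x8000
    operand = value & mask
    result = (-operand) & mask      # two's complement, kept to operand size
    flags = {
        "CF": operand != 0,                              # clear only for 0
        "ZF": result == 0,
        "SF": bool(result & sign),                       # result is negative
        "OF": operand == sign,                           # -128 or -32768
        "PF": bin(result & 0xFF).count("1") % 2 == 0,    # even parity, low byte
    }
    return result, flags


for operand in (0x01, 0x00, 0x80):
    result, flags = neg(operand)
    set_flags = [name for name, is_set in flags.items() if is_set]
    print(f"NEG {operand:02X} -> {result:02X}, flags set: {set_flags}")
```

Negating 80H is the interesting case: the result is still 80H, which is how the maximum negative value "does not change" while OF and CF are set.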

Legal Forms