Assembly Language Step by Step Programming with DOS and Linux 2nd Ed 2000.pdf

Using DOS Services through INT

I think of EAT.ASM as something of a Tom Sawyer program. It doesn't do much, and it does what it does in time-honored Tom Sawyer fashion—by getting somebody else to do all the work. All that EAT does is display a character string on your screen. The visible part of that string is the advertising slogan itself: Eat at Joe's! The other part is the pair of invisible characters we call newline or EOL: carriage return (0DH) followed by line feed (0AH). (For more on EOL markers and how they interact with text, see Chapter 4.) The EOL marker does nothing more than return the display cursor to the left margin of the next screen line, so that any subsequent text displayed will begin at the left margin and not nipping at the heels of Eat at Joe's!

Both parts of our advertising slogan are sent to the display at once, and via the same mechanism: through a DOS service.

As I explain in Chapter 4, DOS is both a god and a troll. It controls all the most important elements of the machine in godlike fashion: the disk drives, the printer, and (to some extent) the display. At the same time, DOS is like a troll living under a bridge to all those parts of your machine: You tell the troll what you want done, and the troll will go out and do it for you.

There is another troll, called the BIOS, guarding the bridges to other components of your machine; we'll return to it in a little while. DOS and BIOS both offer services, which are simple tasks that your programs would have to do themselves if the services were not provided. Quite apart from saving you, the programmer, a lot of work, having DOS and BIOS services helps guarantee that certain things will be done in identical fashion on all machines, which (especially in terms of disk storage) is a major reason software written for DOS runs on so many different machines: All the machine-dependent stuff is done the same way.

One of the services DOS provides is simple (far too simple, actually) access to your machine's display. For the purposes of EAT.ASM (which is just a lesson in getting your first assembly language program written and operating), simple services are enough.

So—how do we use DOS and BIOS services? The way is as easy to use as it is tricky to understand: through software interrupts.

An Interrupt That Doesn't Interrupt Anything

When I was new to the x86 family of processors back in 1981, the notion of a software interrupt drove me nuts. I kept looking and looking for the interrupter and interruptee. Nothing was getting interrupted.

The name is unfortunate, even though I admit that there was some reason for calling software interrupts what they are. They are, in fact, courteous interrupts—if you can still call an interrupt an interrupt when it is so courteous that it does no interrupting at all.

The nature of software interrupts and DOS services is best explained by a real example illustrated twice in EAT.ASM. As I hinted previously, DOS keeps little sequences of machine instructions tucked away within itself. Each sequence does something useful—read something from a disk file, display something to the screen, send something to the printer. DOS uses them to do its own work, and it also makes them available (with its troll hat on) to you the programmer to access from your programs.

Well, there is the critical question: How do you find something tucked away inside of DOS? All code sequences, of course, have addresses, and Microsoft or IBM could publish a booklet of addresses indicating where all the code is hidden. There are numerous good reasons not to pass out the addresses of the code itself, however. DOS is evolving and (we should hope) being repaired on an ongoing basis. Repairing and improving code involves adding, changing, and removing machine instructions, which changes the size of those hidden code sequences—and also, in consequence, changes their location. Add a dozen instructions to one sequence, and all the other sequences upmemory from that one sequence will have to shove over, to make room. Once they shove over, they'll be at different addresses, so instantly the booklets are obsolete. Even one byte added to or removed from a code sequence in DOS could change everything. What if the first code sequence has a bug that must be fixed?

The solution is ingenious. At the very start of real mode memory, down at segment 0, offset 0, is a special table with 256 entries. Each entry is a complete address including segment and offset portions, for a total of 4 bytes per entry. The first 1,024 bytes of memory in any x86 machine are reserved for this table, and no code or data may be placed there.

Each of the addresses in the table is called an interrupt vector. The table as a whole is called the interrupt vector table. Each vector has a number, from 0 to 255. The vector occupying bytes 0 through 3 in the table is vector 0. The vector occupying bytes 4 through 7 is vector 1, and so on, as shown in Figure 8.3.

Figure 8.3: The interrupt vector table.
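The layout of the table reduces to simple arithmetic, and a small model may make it easier to picture. What follows is a Python sketch, not DOS code. The names `vector_offset` and `read_vector` are hypothetical, the address 1234:5678 is made up, and the sketch assumes the real-mode convention that each 4-byte entry stores its offset word first and its segment word second, both little-endian.

```python
import struct

ENTRY_SIZE = 4                           # 2-byte offset + 2-byte segment
NUM_VECTORS = 256
TABLE_SIZE = ENTRY_SIZE * NUM_VECTORS    # 1,024 bytes, starting at 0000:0000

def vector_offset(n):
    """Return the byte offset of vector n within the table."""
    if not 0 <= n < NUM_VECTORS:
        raise ValueError("vector number must be in the range 0..255")
    return n * ENTRY_SIZE

def read_vector(memory, n):
    """Decode entry n from an image of low memory as (segment, offset)."""
    offset, segment = struct.unpack_from("<HH", memory, vector_offset(n))
    return segment, offset

# Hypothetical memory image in which vector 21H points at 1234:5678.
memory = bytearray(TABLE_SIZE)
struct.pack_into("<HH", memory, vector_offset(0x21), 0x5678, 0x1234)

print(hex(vector_offset(0x21)))          # where slot 21H begins in the table
print([hex(part) for part in read_vector(memory, 0x21)])
```

Vector 21H is entry number 33 decimal, so in this model its four bytes begin 132 bytes (84H) into the table.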

None of the addresses is burned into permanent memory the way BIOS routines are. When your machine starts up, DOS and BIOS fill many of the slots in the interrupt vector table with addresses of certain service routines within themselves. Each version of DOS knows the location of its innermost parts, and when you upgrade to a new version of DOS, that new version will fill the appropriate slots in the vector table with upgraded and accurate addresses.

What doesn't change from DOS version to DOS version is the number of the interrupt that holds a particular address. In other words, since the PC first began, interrupt 21H has pointed the way into darkest DOS to DOS's services dispatcher, a sort of multiple-railway switch with spurs heading out to the many (over 50) individual DOS service routines. The address of the dispatcher has changed with every DOS version, but regardless of version, programs can find the address of the dispatcher in slot 21H of the interrupt vector table.

Furthermore, programs don't have to go snooping the table for the address themselves. The x86 CPUs include a machine instruction that makes use of the interrupt vector table. The INT (INTerrupt) instruction is used by EAT.ASM to request the services of DOS in displaying two strings on the screen. At two places, EAT.ASM has an INT 21H instruction. When an INT 21H instruction is executed, the CPU goes down to the interrupt vector table, fetches the address from slot 21H, and then jumps execution to the address stored in slot 21H. Since the DOS services dispatcher lies at the address in slot 21H, the dispatcher gets control of the machine and does the work that it knows how to do.

The process is shown in Figure 8.4. When DOS loads itself at boot time, one of the many things it does to prepare the machine for use is put correct addresses in several of the vectors in the interrupt vector table. One of these addresses is the address of the dispatcher, which goes into slot 21H.

Figure 8.4: Riding the interrupt vector into DOS.

Later on, when you type the name of your program MYPROG on the DOS command line, DOS loads MYPROG.EXE into memory and gives it control of the machine. MYPROG.EXE does not know the address of the DOS dispatcher. MYPROG does know that the dispatcher's address will always be in slot 21H of the vector table, so it executes an INT 21H instruction. The correct address lies in vector 21H, and MYPROG is content to remain ignorant and simply let the INT 21H instruction and vector 21H take it where it needs to go.
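The point of the indirection can be shown with a toy model. This is a Python sketch under loud assumptions: the table is modeled as a list of callables rather than 4-byte addresses, and `install`, `software_int`, `dispatcher_v1`, `dispatcher_v2`, and `myprog` are all hypothetical names invented for the illustration.

```python
# Toy interrupt vector table: slot number -> routine currently installed there.
ivt = [None] * 256

def install(vector, handler):
    """What DOS does at boot: put a routine's 'address' in a slot."""
    ivt[vector] = handler

def software_int(vector):
    """What INT does: look in the slot and transfer control to whatever is there."""
    handler = ivt[vector]
    if handler is None:
        raise RuntimeError("nothing installed in vector %02XH" % vector)
    return handler()

def dispatcher_v1():
    return "old dispatcher"

def dispatcher_v2():
    return "new dispatcher"

def myprog():
    # MYPROG knows only the slot number, never the dispatcher's location.
    return software_int(0x21)

install(0x21, dispatcher_v1)     # one DOS version boots
first = myprog()

install(0x21, dispatcher_v2)     # an upgraded DOS boots; MYPROG is unchanged
second = myprog()

print(first, "/", second)
```

The caller is identical in both runs. Only the contents of slot 21H changed, which is the whole trick.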

Back on the Northwest Side of Chicago, where I grew up, there was a bus that ran along Milwaukee Avenue. All Chicago bus routes had numbers, and the Milwaukee Avenue route was number 56. It started somewhere in the tangled streets just north of Downtown, and ended up in a forest preserve just inside the city limits. The Forest Preserve District ran a swimming pool called Whelan Pool in that forest preserve. Kids all along Milwaukee Avenue could not necessarily have told you the address of Whelan Pool. But come summer, they'd tell you in a second how to get there: Just hop on bus number 56 and take it to the end of the line. It's like that with software interrupts. Find the number of the vector that reliably points to your destination, and ride that vector to the end of the line, without worrying about the winding route or the address of your destination.

Note that the INT 21H instruction does something else: It pushes the address of the next instruction (that is, the instruction immediately following the INT 21H instruction) on the stack before it follows vector 21H into the depths of DOS. Like Hansel and Gretel, INT 21H pushes some breadcrumbs onto the stack as a way of helping execution find its way back to MYPROG.EXE after the excursion down into DOS, but more on that later.

Now, the DOS dispatcher controls access to dozens of individual service routines. How does it know which one to execute? You have to tell the dispatcher which service you need, and you do so by placing the service's number in 8-bit register AH. The dispatcher may require other information as well and will expect you to provide that information in the correct place before executing INT 21H.

Look at the following three lines of code from EAT.ASM:

    mov dx,eatmsg    ; Mem data ref without [] loads the ADDRESS!
    mov ah,09H       ; Function 9 displays text to standard output.
    int 21H          ; INT 21H makes the call into DOS.

This sequence of instructions requests that DOS display a string on the screen. The first line sets up a vital piece of information: the offset address of the string to be displayed on the screen. Without that, DOS will not have any way to know what it is that we want to display. The dispatcher expects the offset address to be in DX and assumes that the segment address will be in DS.
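To see how the three register values fit together, here is a rough Python model of the dispatcher's side of the bargain. It is a sketch, not how DOS is actually written: the registers are a plain dictionary, the screen is a list, and `real_mode_address`, `print_string`, `SERVICES`, and `int_21h` are hypothetical names. The segment and offset values are made up. The one fact it leans on is that a real mode address is the segment times 16 plus the offset.

```python
def real_mode_address(segment, offset):
    """Combine a segment and an offset into a linear real-mode address."""
    return segment * 16 + offset

def print_string(regs, memory, screen):
    """Model of service 09H: show the '$'-terminated string found at DS:DX."""
    start = real_mode_address(regs["DS"], regs["DX"])
    end = memory.index(ord("$"), start)          # the '$' itself is not shown
    screen.append(bytes(memory[start:end]).decode("ascii"))

# The dispatcher picks a routine by the number the caller left in AH.
SERVICES = {0x09: print_string}

def int_21h(regs, memory, screen):
    service = SERVICES.get(regs["AH"])
    if service is None:
        raise NotImplementedError("service %02XH is not modeled" % regs["AH"])
    service(regs, memory, screen)

# Made-up placement of the message in a made-up data segment.
memory = bytearray(0x20000)
data_segment, eatmsg = 0x1000, 0x0010
message = b"Eat at Joe's!\r\n$"
start = real_mode_address(data_segment, eatmsg)
memory[start:start + len(message)] = message

regs = {"DS": data_segment, "DX": eatmsg, "AH": 0x09}
screen = []
int_21h(regs, memory, screen)
print(repr(screen[0]))
```

If DX held the wrong offset, or AH the wrong number, the model would fetch the wrong bytes or call the wrong routine, which is why the values must be in place before the INT 21H executes.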

In flat model, DS is initialized by DOS at execution time. In segmented model, the address of the data segment was loaded into DS earlier in the program by these two instructions:

    mov ax,data      ; Move segment address of data segment into AX
    mov ds,ax        ; Copy address from AX into DS

Once loaded, DS is not disturbed during the full run of the program, so the DOS dispatcher's assumption is valid even though DS is loaded at the start of program execution and not each time we want to display a string.

In moving 09H into register AH, we tell the dispatcher which service we want performed. Service 09H is DOS's Print String service. This is not the fastest nor in other ways the best way to display a string on the PC's screen, but it is most certainly the easiest.

DOS service 09H has a slightly odd requirement: that the end of the string be marked with a dollar sign ($). This is the reason for the dollar sign hung incongruously on the end of EAT.ASM's advertising slogan string. Given that DOS does not ask us to pass it a value indicating how long the string is, the end of the string has to be marked somehow, and the dollar sign is DOS's chosen way. It's a lousy way, unfortunately, because with the dollar sign acting as a marker, there is no way to display a dollar sign. If you intend to talk about money on the PC's screen, don't use DOS service 09H! As I said, this is the easiest, but certainly not the best way to display text on the screen.
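The dollar sign problem is easy to demonstrate with a small model. This Python sketch uses a hypothetical function, `dos_print_string`, that returns the bytes service 09H would send to the screen. One simplification is labeled in the code: when no terminator is present, the model raises an error instead of scanning on through memory.

```python
def dos_print_string(buffer):
    """Return the bytes that would be displayed: everything before the first '$'."""
    end = buffer.find(b"$")
    if end == -1:
        # Simplification: the model refuses rather than reading past the buffer.
        raise ValueError("no '$' terminator in buffer")
    return buffer[:end]

slogan = dos_print_string(b"Eat at Joe's!\r\n$")
price = dos_print_string(b"Price: $5.00$")

print(slogan)
print(price)
```

The second call stops at the first dollar sign it meets, so the amount never reaches the screen.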

With the address of the string in DS:DX and service number 09H in AH, we take a trip to the dispatcher by executing INT 21H. The INT instruction is all it takes—boom!, and DOS has control, reading the string at DS:DX and sending it to the screen through mechanisms it keeps more or less to itself.

Getting Home Again

So much for getting into DOS. How do we get home again? The address in vector 21H took control into DOS, but how does DOS know where to go to pass execution back into EAT.EXE? Half of the cleverness of software interrupts is knowing how to get there, and the other half—just as clever—is knowing how to get back.

To get into DOS, a program looks in a completely reliable place for the address of where it wants to go: the address stored in vector 21H. This address takes execution deep into DOS, leaving the program sitting above DOS. To continue execution where it left off prior to the INT 21H instruction, DOS has to look in a completely reliable place for the return address, and that completely reliable place is none other than the top of the stack.

I mentioned earlier (without much emphasis) that the INT 21H instruction pushes an address to the top of the stack before it launches off into the unknown. This address is the address of the next instruction in line for execution: the instruction immediately following the INT 21H instruction. This location is completely reliable because, just as there is only one interrupt vector table in the machine, there is only one stack in operation at any one time. This means that there is only one top of the stack (that is, SS:SP), and DOS can always send execution back to the program that called it by popping the address off the top of the stack and jumping to that address.

The process is shown in Figure 8.5, which is the continuation of Figure 8.4. Just as the INT instruction pushes a return address onto the stack and then jumps to the address stored in a particular vector, there is a "combination" instruction that pops the return address off the stack and then jumps to the address. The instruction is IRET (for Interrupt RETurn), and it completes this complex but reliable system of jumping to an address when you really don't know the address. The trick, once again, is knowing where the address can reliably be found. (There's actually a little more to what the software interrupt mechanism pushes onto and pops from the stack, but it happens transparently enough that I don't want to complicate the explanation at this point, and you're unlikely to be writing your own software interrupt routines for a while.)

Figure 8.5: Returning home from an interrupt.

This should make it clear by now what happens when you execute an INT 21H instruction. EAT.ASM uses DOS services to save it the trouble of writing its string data to the screen a byte at a time. The address into DOS is at a known location in the interrupt vector table, and the return address is at a known location on the stack. Whereas I've described the software interrupt system in terms of the DOS service dispatcher interrupt 21H, the system is precisely the same for all other software interrupts, and there are many. In the next chapter we use a few more and explore some of the many services available through the BIOS interrupts that control your video display and printer.
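For readers who want to watch the round trip happen, here is a simplified Python model of the INT and IRET pair. It is a sketch with hypothetical names (`ToyCPU`, `int_`, `iret`) and made-up addresses. It does show the "little more" that goes onto the stack: in real mode the CPU pushes the flags register and CS along with the return offset, and IRET pops them in the reverse order. The model assumes the 2-byte form of the INT instruction and leaves out details such as the CPU clearing the interrupt flag on the way in.

```python
class ToyCPU:
    def __init__(self, ivt):
        self.ivt = ivt            # vector number -> (segment, offset)
        self.stack = []           # end of the list is the top of the stack
        self.cs = self.ip = self.flags = 0

    def int_(self, vector):
        """Push FLAGS, CS, and the return IP, then jump through the vector."""
        return_ip = self.ip + 2   # an INT with an 8-bit vector number is 2 bytes long
        self.stack.append(self.flags)
        self.stack.append(self.cs)
        self.stack.append(return_ip)
        self.cs, self.ip = self.ivt[vector]

    def iret(self):
        """Pop IP, CS, and FLAGS, resuming right after the INT instruction."""
        self.ip = self.stack.pop()
        self.cs = self.stack.pop()
        self.flags = self.stack.pop()

# Made-up addresses: the dispatcher lives at 1234:5678, the program runs at 2000:0100.
cpu = ToyCPU({0x21: (0x1234, 0x5678)})
cpu.cs, cpu.ip, cpu.flags = 0x2000, 0x0100, 0x0202

cpu.int_(0x21)
inside_dos = (cpu.cs, cpu.ip)
stack_inside_dos = list(cpu.stack)

cpu.iret()
back_home = (cpu.cs, cpu.ip)

print([hex(word) for word in stack_inside_dos])
print([hex(part) for part in back_home])
```

Neither side ever needs to know the other's address in advance. The way in comes from the vector table, and the way out comes from the top of the stack.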

Software Interrupts versus Hardware Interrupts

Software interrupts evolved from an older mechanism that did involve some genuine interrupting: hardware interrupts. A hardware interrupt is your CPU's mechanism for paying attention to the world outside itself.

There is a fairly complex electrical system built into your PC that allows circuit boards to send signals to the CPU. An actual metal pin on the CPU chip is moved from one voltage level to another by a circuit board device like a disk drive controller or a serial port board. Through this pin, the CPU is tapped on the shoulder by the external device. The CPU recognizes this tap as a hardware interrupt. Like software interrupts, hardware interrupts are numbered, and for each interrupt number there is a slot reserved in the interrupt vector table. In this slot is the address of an interrupt service routine (ISR) that performs some action relevant to the device that tapped the CPU on the shoulder. For example, if the interrupt signal came from a serial port board, the CPU would then allow the serial port board to transfer a character byte from itself into the CPU.

Most properly, any routine that lies at the end of a vector address in the interrupt vector table is an ISR, but the term is usually reserved for hardware interrupt service routines.

The only difference between hardware and software interrupts is in the event that triggers the trip through the interrupt vector table. With a software interrupt the triggering event is part of the software; that is, an INT instruction. With a hardware interrupt, the triggering event is an electrical signal applied to the CPU chip itself without any INT instruction taking a hand in the process. The CPU itself pushes the return address on the stack when it recognizes the electrical pulse that triggers the interrupt. When the ISR is done, an IRET instruction sends execution home, just as it does for a software interrupt.

Hardware ISRs can be (and usually are) written in assembly language. It's a difficult business, because the negotiations between the hardware and software must be done just so, or the machine may lock up or go berserk. This is no place for beginners, and I would advise you to develop some skill and obtain some considerable knowledge of your hardware setup before attempting to write a hardware ISR.

Chapter 9: Dividing and Conquering

Using Procedures and Macros to Battle Complexity

Programming in Martian

There is a computer language called APL (an acronym for A Programming Language, how clever) that has more than a little Martian in it. APL was the first computer language I learned (on a major IBM mainframe), and when I learned it, I learned a little more than just APL.

APL uses a very compact notation, with dozens of odd little symbols, each of which is capable of some astonishing power such as matrix inversion. You can do more in one line of APL than you can in one line of anything else I have learned since. The combination of the strange symbol set and the compact notation makes it very hard to read and remember what a line of code in APL actually does.

So it was in 1977. Having mastered (or so I thought) the whole library of symbols, I set out to write a text formatter program. The program would justify right and left, center headers, and do a few other things of a sort that we take for granted today but which were very exotic in the seventies.

The program grew over a period of a week to about 600 lines of squirmy little APL symbols. I got it to work, and it worked fine, as long as I didn't try to format a column that was more than 64 characters wide. Then everything came out scrambled.

Whoops. I printed the whole thing out and sat down to do some serious debugging. Then I realized with a feeling of sinking horror that, having finished the last part of the program, I had no idea how the first part worked.

The APL symbol set was only part of the problem. I soon came to realize that the most important mistake I had made was writing the whole thing as one 600-line monolithic block of code lines. There were no functional divisions, nothing to indicate what any 10-line portion of the code was trying to accomplish.

The Martians had won. I did the only thing possible: I scrapped it. And I settled for ragged margins in my text.