Assembly Language Step by Step Programming with DOS and Linux 2nd Ed 2000.pdf

Chapter 12: The Programmer's View of Linux

Tools and Skills to Help You Write Assembly Code under a True 32-Bit OS

Where to Now?

Where indeed? If you've followed me this far, you've been exposed to nearly every concept commonly used in assembly language work. As a working environment we've been using MS-DOS, which made a lot of things easier—made most of it possible, in fact. DOS is simple, forgiving, and present in nearly all Windows machines either as a lurker-beneath-the-windows (for Windows 9x) or a very high quality emulation (Windows NT). Either way, it was likely that you had access to DOS if you had a PC anywhere in your life.

The trouble is, DOS is the past. At best, it's a training ground for understanding the environments where all the real action is now taking place. And that's basically one of two places these days: Windows and Unix. Most other environments have withered severely and exist primarily as "legacy support"—that is, for people who can't afford the money or effort required to move from where they are to Windows or Unix.

On the x86 family of processors (which is what we've been discussing), the undisputed king of Unix implementations is Linux. And where we're going is Linux. It's a true 32-bit protected mode operating system, and it offers the chance to create real 32-bit flat model programs in assembly without a prohibitive amount of head banging. So, what remains of this book will serve to get you started on learning assembly coding for Linux.

Why Not Windows?

The first edition of this book was published in 1992. In the last few years, I've received many letters from readers of the first edition, requesting a second edition that explained how to write Microsoft Windows programs in assembly code. I looked into it. I paled. And I shook my head. Don't go there—you may never come back.

The problem is this: A Windows application isn't so much a stand-alone program as a custom-built extension of Windows itself. A DOS assembly program begins at the top, runs down from there, may do some looping back, but eventually it ends. It may touch the operating system from time to time by making system calls, but the nature of those calls is simple: You set up some parameters in registers or on the stack, and you make an INT 21H call into DOS. When DOS does what it must, it returns control to your program. That's about all there is to it.

The relationship between Windows and its applications is much closer and far more complex. When a Windows program is running and the user presses a mouse button, Windows intercepts the mouse signal and (in effect) taps your program on the shoulder and whispers: "The user just clicked the right mouse button. What are you going to do about it?" A tremendously complex system of events and responses, of messages passed and messages intercepted, runs through Windows and all of its applications like the threads of water flowing over a rocky streambed. From a distance, it's gorgeous. Up close, it borders on chaotic. And in assembly language, you're up as close as it gets.

Just understanding how Windows and Windows applications work at the assembly level could take you months of study. Coding a sizeable app could take a year. Balance against this the fact that a lot of the work in dealing with Windows is always done in precisely the same ways, and you have a tailor-made excuse for drop-in software components and boilerplate code. This is what you get with programming environments like Visual C++, Visual Basic, and Delphi, which basically hand you a generic Windows program with all the infrastructure in place—windows, scroll bars, mouse support, the works—but nothing in the line of specifics. Nonetheless, getting that massive a head start pretty much eliminates any advantage you might have in working in assembly.

But what about speed and size? Nothing beats assembly at speed and size, right? Well, nothing beats good assembly at the speed and size game. However . . . you need to keep in mind that when a Windows application is running, much or even most of the time code execution is actually somewhere down in Windows, executing DLLs or other Windows machinery that you have no control over. The parts that you actually write will not likely be what dominate the user's perception of the application's speed.

Besides, today's C and Pascal compilers have gotten mighty damned good at generating near-optimal machine code for a specified sequence of high-level language statements. Ace assembly hacks can do better, but it's a little discouraging to ponder just how close to your heels the wolves are snapping.

In truth, coding in assembly for Windows is good for one thing and one thing only: to gain a bit-level, way-down-deep under-the-skin understanding of how Windows works. This can be a very good and valuable thing, and if you want to pursue it, I salute you. I also suspect that once you gain that hard-won understanding of Windows internals, you'll run screaming to the most efficient Windows RAD (Rapid Application Development) environment you can find. (For me, that was Delphi.)

Only one book to my knowledge has ever been written about coding in assembly for Windows: Windows Assembly Language and Systems Programming, by Barry Kauler (R & D Books, 1997). And for all that it's 400 pages long, it's only a start. Most of what you need to know will have to be found elsewhere, in Microsoft's massive technical documentation.

Good luck. Heh-heh. You'll need it.

And Why Linux?

The decision to cover Linux was not automatic. There were actually two other contenders—or maybe a contender and a half. The half-of-a-contender was DOS protected mode, using a 32-bit DOS extender and the DOS Protected Mode Interface, or DPMI. This would have been reasonably simple, and I almost went that way. I turned back because DOS and DPMI just aren't used anymore by anything that isn't legacy. Why make brand-new antiques? No, strike that—the metaphor is inapt; antiques are by definition valuable. Why make brand-new kitsch?

Besides, DPMI, for all that it works, is really a crutch under a small and very unpowerful OS. For all the effort you will eventually put into learning assembly technology, you deserve to work with more horsepower than that.

The true alternate contender was something called a Windows console application. These are special programs written to be run under Windows NT, in a console—basically, a true 32-bit text-mode window rather than a 16-bit text-mode DOS emulation window. NT console applications are genuine 32-bit programs and are relatively simple to write. They can even do cool Windows-ish things such as display graphical message boxes without a prohibitive amount of fuss. One problem: You must run them under Windows NT, which isn't cheap and currently isn't all that common. On DOS and Windows 9x systems, Windows console applications won't run at all.

Ultimately, I chose Linux because it was every bit as powerful as Windows NT (especially in the realm we're discussing in this book) as well as free. Furthermore, there is an immense amount of free code out there on the Internet written for use with Linux. You can install a Linux partition on the same hard disk as a Windows partition, so you don't have to give up your "real work" in Windows to play around with Linux coding.

Finally, Linux (as the reigning x86 king of the Unix world) is one of the last places where x86 text-mode programming is still done in a big way. Windows console applications are little-used exceptions to the GUI rule in the Microsoft world. In Linux, text mode is still mainstream.

That's where we're going. Let's see what it'll take to get there.

Prerequisites: Yukkh!

Yes, I know, patience isn't one of your virtues. It's not one of mine either. But before you write your first line of assembly code under Linux, there are a number of things that you had better do, or you'll end up thrashing a lot and wasting a lot of time. That's the only way some people learn, but it's hard on the hair and sucks up valuable hours out of your life that you will never have again. (This seems not to matter much when you're 18, but when you're 47, as I am at this writing, it matters a lot.) The list is short, but plan to spend some time on it:

1. Learn Linux.

2. Learn EMACS.

3. Learn C programming.

These three things (surprise!) are way too much for me to attempt to explain in this book. I recommend you buy or borrow a full book (or more) on each of them, work through tutorials, and do your best to become a journeyman practitioner in all three areas. Allow me to explain why.

Linux Is Not DOS!

The single most important thing to remember if you're coming to Linux for the first time is that although Linux bears some functional resemblance to a grown-up DOS, it's radically different in a great many ways. Some of these ways are so fundamental that people who use Linux (and other versions of Unix) on a total lifestyle basis no longer think of them as notable, and thus even beginner books will not fully prepare you for the sense of alienness that you'll encounter in your first few days in front of the beast.

The best example I can give you is this: In the first few days that I began working with Linux, I wrote a short C program that generated a date display. The program was trivial, and it compiled without difficulty. But when I named the compiled binary program in order to run it, bash (a user shell and roughly equivalent to DOS's COMMAND.COM) told me the file wasn't there! This drove me nuts for some time. The executable file I had generated was right there in the current directory, as I could verify with the ls command. However, when I typed the name of the file followed by Enter, bash pleaded ignorance of its existence. What I hadn't learned yet is that to run a Unix (and hence a Linux) executable, you have to enter the full path name, put the directory in which the executable file exists on the path, or prepend the explicit current directory specifier "./". Absent one of those location specifiers, bash doesn't search the current directory for a named executable file!

Yes, to me this is stupid, but I came up through DOS. People who started out with Linux or some other flavor of Unix don't think of this as remarkable at all, and there are some technical reasons why it may be better to do things this way. But the lesson here is that you need to be very attentive as you learn Linux, and try very hard not to make assumptions based on your DOS or Windows experience.

If you've never touched a Unix system before, trust me, it's a lot to swallow in a hurry. See if there's a local community college course you can take on it, or corral a couple of your Unix friends, buy them beer and pizza, and encourage them to talk while you take furious notes. At minimum, buy several books on Linux and read them through, following along at your keyboard and typing the commands as they're presented. At the simple user level, Linux is Unix, so any good beginner book on Unix will be useful, and there are currently a multitude of new Linux-specific beginner books on the stands. (Books that are specific to a particular distribution of Linux, such as Red Hat, Debian, or Caldera, are now beginning to appear, and these may be even more helpful. Haunt the local Borders regularly and keep your eyes open. If you install Red Hat Linux, I recommend Learning Red Hat Linux by Bill McCarty, from O'Reilly.)

In going forward, I am going to assume that you know how to log in and out, navigate around within Unix directories, and all that elementary user-level stuff. If I use a term or cite a Unix command that you're not familiar with, look it up in one of those other books that you ought to have close at hand.

The distribution I used in preparing this book in the late summer and fall of 1999 was Red Hat 6. It's by far the Linux distribution in widest use, and if you adopt it, you will have plenty of company, which in the computer business is always a plus.

EMACS: More than an Editor

I didn't bother looking for a Linux programming editor/environment to put on this book's CD-ROM, because if you have Linux you've already got one, or several. In fact, if you've been using Linux as a programmer for more than half an hour, you've probably already glommed onto an editor and would be unwilling to switch to anything I would likely be able to hand you. Although there are dozens or (perhaps) hundreds of text editors available for Unix, most Unix people use one of either vi or EMACS. And in the Linux world, as best I can tell, EMACS is the editor of choice.

EMACS is way more than just an editor. It's much closer to the integrated text-mode environments used in the last days of DOS for such products as Borland C++ and Borland Pascal. It understands C syntax, C++ syntax, and assembly syntax (though, alas, not the assembly syntax we'll be using; more on this sad little disconnect later). EMACS can build an executable from inside the editor and do an awful lot of other things I've never had occasion to fool with. Whole books have been written on EMACS (O'Reilly has one) and it would be worthwhile to grab such a book and digest it. If you intend to stick with Linux and do any significant programming for it, EMACS is indispensable. Learn as much of it as you can.

It's a C World

I'm a notorious Pascal bigot, and it pains me to say this, but Linux (as a genuine implementation of Unix) is inescapably a C world. Most of Linux is written in C, and what little isn't in C is in assembly. Virtually all the programming examples you'll see for Linux that don't involve interpreted languages such as Perl or Tcl will be in C. Most significantly (as I explain in greater detail later), the runtime library your assembly programs will use to communicate with the operating system is written in C and requires that you use the C protocols for function calling, rather than the more sensible Pascal ones.

So, before you attempt your first assembly program, buy a book and get down and hack some C. You don't need to do a lot of it, but make sure you understand all the basic C concepts, especially as they apply to function calls. I'll try to fill in the lower-level gaps in this book, but I can't teach the language itself nor all the baggage that comes with it. You may find it distasteful (as I did and do) or you may love it, but what you must understand is that you can't escape it, even if your main interest in Linux is assembly language.

There are some excellent Pascal implementations for Linux, most of them free, so if you don't stick with assembly you have some alternatives to C. My choice is FreePascal 32. Go to the following Web site for more details and for the software itself: http://gd.tuwien.ac.at/languages/pascal/fpc/www/.