Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Beginning Perl Web Development - From Novice To Professional (2006)

.pdf
Скачиваний:
56
Добавлен:
17.08.2013
Размер:
2.9 Mб
Скачать

80 C H A P T E R 4 S YS T E M I N T E R A C T I O N

Accessing Uploading File Header Information

The CGI module also includes a function called uploadInfo(), which gives you access to header information which may (or may not) be sent from the web browser along with the uploaded file. The headers sent by the browser are actually sent as a reference to a hash or associative array. Using the uploadInfo() function along with a header like Content-Type, it’s possible to determine the type of document being uploaded in order to allow only certain types to be uploaded. Be forewarned though, browsers can lie. Don’t ever rely on user input or on any data coming from a user’s browser. As I’ve emphasized in previous chapters, no input should be used within your program until it has been validated.

For example, it’s possible to incorporate a CGI program into the form shown in Figure 4-1 in order to print the Content-Type of the file being uploaded. Listing 4-1 contains a basic CGI script for accomplishing this task.

Listing 4-1. Printing the Content-Type of an Uploaded File

#!/usr/bin/perl

use strict;

use CGI qw/:standard/;

my $q = new CGI;

my $filename = $q->param('uploaded_file');

my $contenttype = $q->uploadInfo($filename)->{'Content-Type'};

print header; print start_html;

print "Type is $contenttype<P>"; print end_html;

The Content-Type is placed into the variable $contenttype, and then printed to the output stream of an HTML page as an example. Figure 4-2 shows an example of choosing to upload an HTML file, and Figure 4-3 shows the output produced from Listing 4-1.

In practice, you would likely check the content type in order to make sure that it’s one of the acceptable types of files that your program expects as input. Consider the example in Listing 4-2.

C H A P T E R 4 S YS T E M I N T E R A C T I O N

81

Figure 4-2. Uploading an HTML file

Figure 4-3. Output showing the Content-Type

Listing 4-2. Checking for Acceptable File Types

#!/usr/bin/perl

use strict;

use CGI qw/:standard/;

82C H A P T E R 4 S YS T E M I N T E R A C T I O N

my $q = new CGI;

my $filename = $q->param('uploaded_file');

my $contenttype = $q->uploadInfo($filename)->{'Content-Type'};

print header; print start_html;

if ($contenttype !~ /^text\/html$/) { print "Only HTML is allowed<P>"; print end_html;

exit; } else {

print "Type is $contenttype<P>";

}

print end_html;

When a file with a Content-Type that isn’t text/html is uploaded to this CGI script, its output indicates that only HTML types are allowed, as shown in Figures 4-4 and 4-5, where an executable file is uploaded. Figure 4-4 shows an example of choosing to upload an executable file, and Figure 4-5 shows the output produced from Listing 4-2.

Figure 4-4. Uploading an executable file

C H A P T E R 4 S YS T E M I N T E R A C T I O N

83

Figure 4-5. Output produced when a file with a Content-Type other than text/html is uploaded

Protecting Temporary Files

Whenever a file is being uploaded from the web client to the server, the content of that file is being stored in a temporary location on the server. This creates a number of problems, not the least of which is that the file, while being stored in that temporary location, might be exposed to other users local to the server itself. To lessen (but not completely remove) this risk, you can use the -private_tempfiles pragma when you invoke the CGI namespace—in other words, with the use CGI lines that usually appear at the beginning of the program. Where before you might have something like this:

use CGI qw/:standard/;

you now have this:

use CGI qw/:standard -private_tempfiles/;

Working with System Processes

By the phrase “working with system processes,” I’m specifically referring to spawning programs external to your Perl program—such as cat, ls, and others—in order to obtain information and interact with the operating system itself.

When working with processes, or really anything outside your Perl program, there is an inherent danger of introducing unknown and possibly unsafe elements into the program. Refer to the “Security Considerations with System Interaction” section later in this chapter for more information about this aspect of programming with Perl.

Since this book does assume at least some familiarity with Perl, it’s logical to assume that you have some experience with spawning external processes from a Perl program. This

84 C H A P T E R 4 S YS T E M I N T E R A C T I O N

section won’t be a rehash of every bit of information about such an undertaking. Rather, this section will provide some refresher material to ensure we’re all talking the same language.

Executing System Processes from a Perl Program

When Perl runs a system process, it inherits the traits of its parent. Recall from the earlier section on filehandles that a Perl program inherits the three standard filehandles: STDIN, STDOUT, and STDERR. The Perl program also inherits other things, like the uid of the parent, the umask, the current directory, and other such environmental variables. Perl provides the %ENV hash as means to access and change the environment variables that are inherited by your Perl program. You can iterate through this hash in the same way that you would any other hash to see the environment variables.

foreach $key (keys %ENV) {

print "Environment key $key is $ENV{$key}\n";

}

The fork() and exec() methods of firing a system command are the most flexible method of working with system commands available in Perl. Unfortunately, they’re also the most complex and arguably the least used, especially when it comes to CGI programming. I’m not going to clutter these pages with a discussion of fork() and exec(), but rather refer you to the perlfunc document pages for more information about fork(), exec(), kill(), wait(), and waitpid().

Here, we’ll look at using the system() function, run quotes, and system processes as filehandles.

The system Function

The system() function is a common way to fire off a new process from a Perl program. When you use the system() function, a new process is created or handed off to /bin/sh, and the Perl program waits until the process is finished running.

The exit status from the system() function is passed back to the Perl program. It’s important to note that this exit status is not the exit status of the command that the system() function actually runs, but the exit status of the shell in which the command is run. This is an important distinction because it means that you can’t rely on the exit status of the system() function as an indication of whether or not the actual command run by the function completed successfully.

Here’s an example of the system() function in action:

system("uptime");

Run Quotes

The next method to execute a system process from within a Perl program goes by a few names—backquotes, backticks, or run quotes. I’ll be using the name run quotes for no reason other than that’s the name that I’ve heard used the most often.

Run quotes are different from the system() function in that they return the output of the command. If you want to capture the output of the command, using run quotes provides an easy way to accomplish the task, as this example shows:

$uptime = `uptime`;

C H A P T E R 4 S YS T E M I N T E R A C T I O N

85

When executed, the variable $uptime would contain the output from the uptime command:

21:20:07 up 202 days, 5:52, 3 users, load average: 0.01, 0.02, 0.00

Obviously, the output from your uptime command probably will be different.

System Processes As Filehandles

You learned about filehandles earlier in this chapter. You can also use system processes as filehandles by using the same open() function as you use to create a filehandle. Like file-flavored filehandles, you can open process filehandles for reading and writing.

The syntax of the open() function is largely the same for processes as it is for files, except that the pipe character (|) is used instead of the greater-than/less-than characters. The location of the pipe character determines whether the process handle is opened for reading or writing.

Creating a process handle for reading looks like this:

open(UPTIME, "uptime|");

Creating a handle for writing looks like this:

open(PRINTER, "|lpr ");

Then you could print to that handle, similar to how you print to a filehandle:

print PRINTER "Printing something from this handle\n";

Using System Processes Within a CGI Program

Using a system call from within a Perl-based CGI script is no different from using the same system call from within another Perl program. The one exception is that you must pay particular attention to the environment within which the script will be run.

Many times, the environment will be that of the web server user. This means that the Perl program might not have access to the same commands, files, and other environmental bits as your user on the same system. A common symptom is that the script will appear to run fine when you execute it from your shell, but when you attempt to access the script through the web browser, it won’t work, possibly giving the infamous Internal Server Error message that Perl programmers everywhere have come to love.

Some web servers use Apache’s suEXEC feature for the execution of CGI scripts. With suEXEC, CGI scripts are executed as the user who owns the program, which is frequently the same as your userid for many installations.

Security Considerations with System Interaction

When working with the system—files or processes—from within a Perl program, you must pay additional attention to your surroundings. It’s quite possible and unfortunately easy to overwrite existing and essential files by opening a file for writing instead of appending.

Whenever you work with anything outside your Perl program, there is always a risk of introducing unknown data into your Perl program or doing something unintentional to the system itself. The former is an important concern for developers; the latter is primarily a concern for the system administrator. For those of us who frequently wear both hats, it’s important to perform

86C H A P T E R 4 S YS T E M I N T E R A C T I O N

rigorous sanity checks against data and against all external system interactions. This means, among other things, liberal use of the print() function to ensure that the script is doing what you think it should be doing, and use of the die() function or, at least, the warn() function to report on unexpected conditions. Otherwise, the program may continue as normal, even though the file-open operation failed. The die() function is helpful to prevent the script from getting itself into an unknown state. Also, you’ll want to enable taint mode and strict checking, as explained in Chapter 1.

Summary

This chapter covered system interactions that involve files and system processes. You work with files from a Perl script through filehandles. You can open files for reading, writing, or appending. When creating a filehandle, the file is opened in read mode by default, except when using the three-argument version of the open() function, and unless you use a single or double greaterthan sign, to indicate either writing or appending, respectively. Whenever you’re working with filehandles, it’s always a good idea to use the die() function to ensure that the filehandle was created successfully.

System processes can also be created or spawned from within a Perl program. There are four methods for working with external system processes from within a Perl program: the system() function, run quotes (or backquotes), the fork() and exec() functions, and the open() function. With the system() function or the run quotes method, the Perl program waits until the process is finished running. The system() function returns the status from the shell to the Perl program. Run quotes return the resulting value from any command(s) run outside the Perl program. You can use the open() function to spawn external processes in much the same way as you create a filehandle, except that you use the pipe character (|) to denote that you want to run a process, rather than a filehandle. Putting the pipe character after the command indicates the process handle is for reading; putting the pipe character before the command indicates that the process handle is for writing.

When working with system processes, you must pay attention to the environment variables that the program will inherit when it runs. Since many CGI scripts are executed as the web server user, the script might not run the same when it is executed by the web server.

When your CGI program is interacting with the system, you need to take great care to ensure that the program is written with security in mind. This is especially the case for programs that will be accessed through the Web. The use of taint and strict modes is essential, along with use of the die() function.

This chapter marks the end of the first part of the book. The next part of the book will look at some additional interaction that Perl programs can take with the Internet. So far, you’ve seen how to create web pages with Perl. The next chapter describes how to consume and work with web pages from within Perl.

P A R T 2

■ ■ ■

Internet Interaction with LWP and Net:: Tools

C H A P T E R 5

■ ■ ■

LWP Modules

LWP is an abbreviation for library of WWW modules in Perl. LWP modules enable you to incorporate common web tasks into your Perl program through a set of functions that can be imported into your namespace.

Using the LWP modules (the LWP, for short), you can perform tasks such as retrieving web pages, submitting web forms, and mirroring a web site. As with other tasks in Perl, you could accomplish these same things without the help of the LWP. However, this chapter will concentrate on using the LWP modules (thus the title). The LWP contains a number of protocol methods, including ones to work with HTTP, HTTPS, FTP, NNTP, and others. This chapter looks at how to use the LWP with HTTP and HTTPS.

Getting Started with the LWP

To use the LWP modules, you need to first obtain and install them. Distributions such as Debian have the LWP prepackaged, which makes installation rather trivial (apt-get install libwww-perl). If your distribution doesn’t contain a prepackaged version of the LWP, you can download and install the software manually. The LWP modules are available from your favorite CPAN mirror.

It’s always a good idea to check whether the modules are already installed prior to going through the job of installing them. An easy method for testing this is with the following command, executed from the shell:

perl -MLWP -e 'print "$LWP::VERSION\n"'

You have the LWP installed if you see a version number such as this output:

5.803

If you don’t have the LWP installed, you’ll need to perform the installation in order to accomplish most of the tasks in this chapter.

The LWP primarily works with the HTTP request and response model. This means that the module is an excellent choice for retrieving and parsing web pages on the Internet. Here’s a quick example to get your feet wet. The code (Getua.pl) retrieves a web page and prints it all to STDOUT.

89