Visual CSharp .NET Programming (2002) [eng]
To match a URL, use the following regular expression:
\w+:\/\/[\w.]+\/?\S*
This will match any standard URL that includes a protocol, e.g., http:// or ftp://. Path information following the domain is allowed, but optional. So both http://www.bearhome.com/cub/ and http://www.sybex.com match this pattern.
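As a quick check of this pattern, here is a minimal console sketch. The class name UrlMatchSketch and the method LooksLikeUrl are hypothetical names introduced only for illustration; Regex.IsMatch is the static .NET method that tests a string against a pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlMatchSketch
{
    // The URL pattern from the text, written as a verbatim string.
    public const string UrlPattern = @"\w+:\/\/[\w.]+\/?\S*";

    // Hypothetical helper: true if the input contains something the pattern accepts.
    public static bool LooksLikeUrl(string input)
    {
        return Regex.IsMatch(input, UrlPattern);
    }

    public static void Main()
    {
        Console.WriteLine(LooksLikeUrl("http://www.bearhome.com/cub/")); // True
        Console.WriteLine(LooksLikeUrl("http://www.sybex.com"));         // True
        Console.WriteLine(LooksLikeUrl("www.sybex.com"));                // False: no protocol
    }
}
```

Note that the last input fails only because it lacks the protocol and the :// that the pattern requires.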
Matching Substrings in an Expression
The preceding pattern that matches a URL is all very well and good, but what if you need only part of the URL? For instance, you might need to know the protocol used, or the domain. In this example, I'll show you how to use regular expressions to 'decompose' a URL into its constituent parts. Obviously, you could also do this with string methods (there's no real need for regular expressions), but following this example should help you to understand how the .NET Framework regular expression classes are related.
For a user interface, I've used a TextBox for entering the URL (which must, once again, include a protocol followed by ://), a ListBox to display the URL's decomposed parts, and a Button whose click event will process the decomposition (Figure 9.14).
Figure 9.14: Regular expressions can be used to decompose a URL into constituent parts.
The first step in the click event is to rewrite the regular expression pattern a bit. The original URL-matching pattern that I showed you is
\w+:\/\/[\w.]+\/?\S*
To do substring matches on this, we need to break it up into groups using parentheses:
(\w+):\/\/([\w.]+)(\/?\S*)
In order to assign the pattern to a string variable, we need to use the verbatim symbol (otherwise the compiler thinks the string includes an invalid escape character):
string pattern = @"(\w+):\/\/([\w.]+)(\/?\S*)";
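To see what the verbatim symbol buys you, the following sketch compares the verbatim string with the same pattern written as a regular string literal, where every backslash has to be doubled. The class and field names are hypothetical, chosen only for this illustration.

```csharp
using System;

public static class VerbatimSketch
{
    // Verbatim literal: backslashes are taken as they stand.
    public const string Verbatim = @"(\w+):\/\/([\w.]+)(\/?\S*)";

    // Regular literal: each backslash must be escaped with a second backslash.
    public const string Escaped = "(\\w+):\\/\\/([\\w.]+)(\\/?\\S*)";

    public static void Main()
    {
        // Both literals produce exactly the same characters.
        Console.WriteLine(Verbatim == Escaped); // True
    }
}
```

Either form hands the same pattern to the Regex class; the verbatim form is simply easier to read and to compare with the pattern as written in the text.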
Now that we have our regular expression, we have to use it to retrieve the substring values. First, we start a counter and a string array to hold the results:
int k = 0;
string [] results = new string[4];
Next, create a Regex instance using the "ignore case" option:
Regex regex = new Regex (pattern, RegexOptions.IgnoreCase);
Use the instance's Match method to match the pattern against the URL entered by the user, and assign the result to a Match object:
Match match = regex.Match(txtURL.Text);
Use the Match instance's Groups property to populate a GroupCollection object:
GroupCollection gc = match.Groups;
Cycle through the GroupCollection instance. For each item in it (each Group), use the Group instance's Captures property to populate a CaptureCollection instance. Cycle through each CaptureCollection instance, and use each item's Value property to add a substring match to the results array and increase the array counter:
for (int i = 0; i < gc.Count; i++) {
   CaptureCollection cc = gc [i].Captures;
   for (int j = 0; j < cc.Count ; j++) {
      results[k] = cc[j].Value;
      k++;
   }
}
Finally, add the text in the results array to the ListBox Items collection for display. The complete code for the URL decomposition is shown in Listing 9.14.
Listing 9.14: Decomposing a URL into Groups
using System.Text.RegularExpressions;
...
private void btnDecomp_Click_1(object sender, System.EventArgs e) {
   lstURL.Items.Clear();
   string pattern = @"(\w+):\/\/([\w.]+)(\/?\S*)";
   int k = 0;
   string [] results = new string[4];
   Regex regex = new Regex (pattern, RegexOptions.IgnoreCase);
   Match match = regex.Match(txtURL.Text);
   GroupCollection gc = match.Groups;
   for (int i = 0; i < gc.Count; i++) {
      CaptureCollection cc = gc [i].Captures;
      for (int j = 0; j < cc.Count ; j++) {
         results[k] = cc[j].Value;
         k++;
      }
   }
   lstURL.Items.Add ("Full URL: " + results[0]);
   lstURL.Items.Add ("Protocol: " + results[1]);
   lstURL.Items.Add ("Domain: " + results[2]);
   lstURL.Items.Add ("Path: " + results[3]);
}
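Because each parenthesized group in this particular pattern captures exactly once, the same four values can also be read directly from the Groups collection, without walking each CaptureCollection. The following console sketch illustrates that shortcut; it is not the listing's approach, and the names UrlParts and Decompose are hypothetical. It also assumes you want an empty array when the input does not match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlParts
{
    // Returns { full URL, protocol, domain, path }, or an empty array when nothing matches.
    public static string[] Decompose(string url)
    {
        string pattern = @"(\w+):\/\/([\w.]+)(\/?\S*)";
        Match match = Regex.Match(url, pattern, RegexOptions.IgnoreCase);
        if (!match.Success)
            return new string[0];

        // Group 0 is the whole match; groups 1 to 3 are the parenthesized parts.
        string[] results = new string[match.Groups.Count];
        for (int i = 0; i < match.Groups.Count; i++)
            results[i] = match.Groups[i].Value;
        return results;
    }

    public static void Main()
    {
        string[] parts = Decompose("http://www.bearhome.com/cub/");
        Console.WriteLine("Full URL: " + parts[0]); // http://www.bearhome.com/cub/
        Console.WriteLine("Protocol: " + parts[1]); // http
        Console.WriteLine("Domain: " + parts[2]);   // www.bearhome.com
        Console.WriteLine("Path: " + parts[3]);     // /cub/
    }
}
```

Checking match.Success first matters: when the input is not a URL, the Groups collection still exists, but its values are empty.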
Conclusion
This chapter has covered a lot of ground. If string manipulation is, indeed, 'everything,' it is fair to say that you've been, at the least, introduced to everything you need to know to work with strings.
We started with the various string-related types: string, char, and the StringBuilder class. I explained how to create instances of the types, and how to use the instance and static methods related to these classes. I also showed you how to implement ToString methods in your own classes.
Moving on, we touched on the fact that the string type implements IComparable. I showed you how to implement IComparable in your own classes, and how to create a custom interface for use with Dinosaur and its derived classes, dubbed ICarnivore.
The final part of this chapter explained working with regular expressions. While regular expressions are not unique to C# or .NET, successfully using them can add a great deal of ease and power to your programs. You do need to understand how the .NET regular expression classes and collections interact; I concluded the chapter with an example showing this.
It's time to move on. Chapter 10 will show you how to work with files, input, and output.
Part IV: Gigue: Leaping to Success
Chapter 10: Working with Streams and Files
Chapter 11: Messaging
Chapter 12: Working with XML and ADO.NET
Chapter 13: Web Services as Architecture
Chapter 10: Working with Streams and Files
Overview
• Files and directories
• Using the Environment and Path classes
• Finding files
• Working with the system Registry
• Using isolated storage
• Understanding streams
• Reading and writing files
• Web streams
• Using a FileStream asynchronously
It's an unusual program of any size that doesn't save and retrieve values for initialization purposes. Most programs also need to work with files. In other words, to accomplish many tasks, along the way you'll need to obtain information about the file system a program is operating in, read (and write) configuration data, and read from and write to files.
This chapter explains how to work with files and directories. Next, I'll show you how to save initialization information, using both the system Registry and isolated storage. I'll explain streams and how the stream classes interrelate, and show you how to read and write both text and binary files. We'll also take a look at web streams and at invoking FileStreams asynchronously.
Files and Directories
You'll find the Environment, Path, Directory, DirectoryInfo, File, and FileInfo classes essential for obtaining information about local systems, paths, directories, and files, and for manipulating them.
Environment Class
The Environment class, part of the System namespace, contains properties and methods that let you get information about the system on which a program is running, the user currently logged on to that system, and the environment strings (the variables that are used to maintain information about the system environment). Besides information about environment variables, the Environment class lets you retrieve command-line arguments, exit codes, the contents of the call stack, the time since the last system boot, the version of the CLR that is running, and more. For a full list of Environment members, look up "Environment Members" in online help.
Table 10.1 shows some of the static Environment methods and properties that are related to the file system.
Table 10.1: Environment Class Members Related to Files and Directories
Member | What It Does
CurrentDirectory | Property gets (or sets) the fully qualified path for the current directory.
GetFolderPath | Method gets the fully qualified path to the special folder identified in the Environment.SpecialFolder enumeration (see Table 10.2).
GetLogicalDrives | Method returns an array of strings containing the names of the logical drives on the system.
SystemDirectory | Property gets the fully qualified path of the system directory.
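The members in Table 10.1 can be tried out in a few lines. This is a minimal console sketch rather than part of the chapter's sample project; the class name EnvironmentSketch and the method Summary are hypothetical, and the values printed depend entirely on the machine the code runs on.

```csharp
using System;
using System.Text;

public static class EnvironmentSketch
{
    // Collects the file-system-related Environment values into one report string.
    public static string Summary()
    {
        StringBuilder report = new StringBuilder();
        report.AppendLine("Current directory: " + Environment.CurrentDirectory);
        report.AppendLine("System directory: " + Environment.SystemDirectory);
        foreach (string drive in Environment.GetLogicalDrives())
            report.AppendLine("Logical drive: " + drive);
        return report.ToString();
    }

    public static void Main()
    {
        try
        {
            Console.Write(Summary());
        }
        catch (Exception excep)
        {
            Console.WriteLine(excep.Message);
        }
    }
}
```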
Table 10.2 shows the possible values of the Environment.SpecialFolder enumeration. It is unusual that this enumeration is defined within the Environment class, as you can see in the Object Browser in Figure 10.1 (it is more common to define the enumeration directly within the namespace, e.g., System).
Figure 10.1: The SpecialFolder enumeration is defined within the Environment class.
Table 10.2: Values of the Environment.SpecialFolder Enumeration
Constant | Special Folder Description
ApplicationData | Directory that serves as a common repository for application-specific data for the current roaming user
CommonApplicationData | Directory that serves as a common repository for application-specific data that is used by all users
CommonProgramFiles | Directory for components that are shared across applications
Cookies | Directory that serves as a common repository for Internet cookies
DesktopDirectory | Directory used to physically store file objects shown on the Desktop
Favorites | Directory that serves as a common repository for the user's favorite items
History | Directory that serves as a common repository for Internet history items
InternetCache | Directory that serves as a common repository for temporary Internet files
LocalApplicationData | Directory that serves as a common repository for application-specific data that is used by the current, non-roaming user
Personal | Directory that serves as a common repository for documents (My Documents)
ProgramFiles | Program files directory
Programs | Directory that contains the user's program groups
Recent | Directory that contains the user's most recently used documents
SendTo | Directory that contains Send To menu items
StartMenu | Directory that contains Start menu items
Startup | Directory that corresponds to the user's Startup program group
System | System directory
Templates | Directory that serves as a common repository for document templates
Because the enumeration is defined within the Environment class, it must be referenced with the class as a qualifier. Hence
String str = Environment.GetFolderPath(SpecialFolder.ProgramFiles);
produces a compile-time error, because the compiler cannot resolve the unqualified name SpecialFolder. The correct formulation is
String str = Environment.GetFolderPath(Environment.SpecialFolder.ProgramFiles);
The fully qualified path that these constants refer to depends, of course, on the operating system and configuration. Let's look at a brief example, shown in Listing 10.1: getting the current Personal directory, which on my Windows XP system happens to be C:\Documents and Settings\harold\My Documents.
Listing 10.1: Getting the Personal Directory
private void btnMyDocs_Click(object sender, System.EventArgs e) {
   try {
      txtStartDir.Text = Environment.GetFolderPath (Environment.SpecialFolder.Personal);
   }
   catch (Exception excep) {
      MessageBox.Show (excep.Message);
   }
}
If you run this code, the current Personal directory will be displayed in a TextBox. (I'll be using the contents of this TextBox as the starting place for a recursive scan of a computer's directories, shown in an example a little later in this chapter.)
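If you are curious where all of the special folders live on a given machine, a console variation on the same idea is to loop through every constant in the enumeration. This is only a sketch: the class name SpecialFolderSketch and the method Resolve are hypothetical, and I am assuming that an empty string is an acceptable result when a folder cannot be resolved.

```csharp
using System;

public static class SpecialFolderSketch
{
    // Resolves one special folder; returns an empty string if it cannot be resolved.
    public static string Resolve(Environment.SpecialFolder folder)
    {
        try
        {
            return Environment.GetFolderPath(folder);
        }
        catch (Exception excep)
        {
            Console.WriteLine(excep.Message);
            return "";
        }
    }

    public static void Main()
    {
        // Enum.GetValues lists every constant defined in the enumeration.
        // Note the class qualifier on the enumeration type.
        foreach (Environment.SpecialFolder folder in
                 Enum.GetValues(typeof(Environment.SpecialFolder)))
        {
            Console.WriteLine(folder + ": " + Resolve(folder));
        }
    }
}
```

The output differs from system to system, and some entries may be blank where a folder has no counterpart on the machine.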
Note In the examples in this chapter, I have been rigorous about always using some variant of the try...catch, try...catch...finally, or try...finally exception handling syntax explained in Chapter 6, "Zen and Now: The C# Language." The reason is that when it comes to files and file systems, one just never knows. Files can be moved or deleted. Drives can be unavailable. Permission levels may be required to access certain information, and so on. This is an arena in which, realistically, you can never assert perfect control, and you should therefore embed program statements within exception handling blocks.
Path Class
The Path class, defined in the System.IO namespace, provides static methods that perform string manipulations on file and path information stored in string instances. Note that these methods do not interact with the file system and do not verify the existence of specified files or paths, with the interesting implication that a path string does not have to represent an existing path in order to be used with the members of this class. So you could use Path class methods to construct a string representing a path and file, and then check to see whether the file actually exists before using the string. As another example, the Path.GetExtension method returns a file extension, including the leading period. So the following code stores the value ".sybex" in the variable extension:
string fileName = @"C:\theDir\myfile.sybex";
string extension;
extension = Path.GetExtension(fileName);
But nothing about this code snippet guarantees that 'sybex' is a valid file extension, or, for that matter, that the file name and path in the fileName variable represent an existing file.
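The "construct first, check afterward" approach might look something like the following sketch. The class name PathThenCheckSketch, the method BuildPath, and the file name myfile.sybex are assumptions made for illustration; Path.Combine, Path.GetExtension, Path.GetTempPath, and File.Exists are the actual framework methods.

```csharp
using System;
using System.IO;

public static class PathThenCheckSketch
{
    // Joins a directory and a file name purely by string manipulation.
    public static string BuildPath(string directory, string file)
    {
        return Path.Combine(directory, file);
    }

    public static void Main()
    {
        string candidate = BuildPath(Path.GetTempPath(), "myfile.sybex");

        // Works whether or not the file exists: prints .sybex
        Console.WriteLine(Path.GetExtension(candidate));

        // Only File.Exists actually looks for the file on disk.
        if (File.Exists(candidate))
            Console.WriteLine("Found " + candidate);
        else
            Console.WriteLine(candidate + " does not exist");
    }
}
```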
The methods of the Path class, all of which are static, are shown in Table 10.3.
Table 10.3: Path Class Methods
Method | What It Does
ChangeExtension | Changes the extension of a path string.
Combine | Joins a path name (on the left) with a path and/or file name (on the right). This is like string concatenation, except you do not have to worry about whether a backslash is the end of the left part or the beginning of the right part.
GetDirectoryName | Returns the directory information for the specified path string.
GetExtension | Returns the extension of the specified path string.
GetFileName | Returns the file name and extension of the specified path string.
GetFileNameWithoutExtension | Returns the file name of the specified path string without the extension.
GetFullPath | Returns the absolute path for the specified path string.
GetPathRoot | Gets the root directory information of the specified path.
GetTempFileName | Returns a unique temporary file name and creates an empty file by that name on disk.
GetTempPath | Returns the path of the current system's temporary folder.
HasExtension | Determines whether a path includes a file name extension.
IsPathRooted | Gets a value indicating whether the specified path string contains absolute or relative path information.
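Here is a short console sketch that exercises several of the methods in Table 10.3 on a made-up relative path. The class name PathMethodsSketch, the method Build, and the directory and file names are all hypothetical. Where the result depends on the operating system or the current directory, the comment describes it rather than quoting it.

```csharp
using System;
using System.IO;

public static class PathMethodsSketch
{
    // Combine supplies the separator between the parts when it is missing.
    public static string Build()
    {
        return Path.Combine(Path.Combine("theDir", "subDir"), "myfile.sybex");
    }

    public static void Main()
    {
        string full = Build();

        Console.WriteLine(Path.GetDirectoryName(full));            // theDir, separator, subDir
        Console.WriteLine(Path.GetFileName(full));                 // myfile.sybex
        Console.WriteLine(Path.GetFileNameWithoutExtension(full)); // myfile
        Console.WriteLine(Path.HasExtension(full));                // True
        Console.WriteLine(Path.IsPathRooted(full));                // False: the path is relative
        Console.WriteLine(Path.ChangeExtension(full, ".bak"));     // same path ending in myfile.bak
        Console.WriteLine(Path.GetFullPath(full));                 // resolved against the current directory
    }
}
```

None of these calls requires theDir or myfile.sybex to exist.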
Directory and File Classes
Four parallel classes are designed to let you work with (and perform discovery on) directories and files. The classes, all members of the System.IO namespace and all sealed so they can't be inherited, are Directory, DirectoryInfo, File, and FileInfo. Here's some more information about these classes:
• Directory contains only static members for manipulating directories, which require an argument that identifies a directory, such as a path string (see Table 10.4 for class methods).
• DirectoryInfo contains no static members, so you must instantiate a DirectoryInfo object to use it. DirectoryInfo inherits from the abstract class FileSystemInfo (as does FileInfo). The GetFiles method of DirectoryInfo returns an array of FileInfo objects that are the files in a given directory. Selected members of the DirectoryInfo class are shown in Table 10.5.
• File, like Directory, provides static methods used for manipulating files, which require a string argument representing a file name. Selected File members are shown in Table 10.6.
• FileInfo contains no static members, so you must obtain an instance to use it. Its Directory property returns a DirectoryInfo object that represents the parent directory of the file. Selected members of FileInfo are shown in Table 10.7.
It is more or less the case that one can achieve comparable functionality using either parallel pair: Directory and File, or DirectoryInfo and FileInfo. If you need only a few items of information or to perform only a few operations, it's probably easiest to use the static classes, Directory and File. However, if you expect to use members multiple times, it probably makes more sense to instantiate DirectoryInfo and FileInfo objects, which is my stylistic preference in any case.
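To make the trade-off concrete, the following sketch asks a single question with the static Directory class and then several questions of one DirectoryInfo instance. The class and method names are hypothetical, and I've used the system's temporary folder only because it is a directory that is likely to exist.

```csharp
using System;
using System.IO;

public static class StaticVersusInstanceSketch
{
    // A one-off question: the static Directory class is the shortest route.
    public static bool ExistsStatic(string path)
    {
        return Directory.Exists(path);
    }

    // Several questions about the same directory: create one DirectoryInfo and reuse it.
    public static string Describe(string path)
    {
        DirectoryInfo dir = new DirectoryInfo(path);
        if (!dir.Exists)
            return path + " does not exist";
        return dir.FullName + " contains " + dir.GetFiles().Length + " file(s) and "
            + dir.GetDirectories().Length + " subdirectories; created " + dir.CreationTime;
    }

    public static void Main()
    {
        string path = Path.GetTempPath();
        try
        {
            Console.WriteLine(ExistsStatic(path));
            Console.WriteLine(Describe(path));
        }
        catch (Exception excep)
        {
            Console.WriteLine(excep.Message);
        }
    }
}
```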
Table 10.4: Directory Class Static Methods
Method | What It Does
CreateDirectory | Creates a directory or subdirectory
Delete | Deletes a directory and its contents
Exists | Returns a Boolean value indicating whether the specified path corresponds to an actual existing directory on disk
GetCreationTime | Gets the creation date and time of a directory
GetCurrentDirectory | Gets the current working directory of the application
GetDirectories | Gets the names of subdirectories in the specified directory
GetDirectoryRoot | Returns the volume information, root information, or both for the specified path
GetFiles | Returns the names of files in the specified directory
GetFileSystemEntries | Returns the names of all files and subdirectories in the specified directory
GetLastAccessTime | Returns the date and time that the specified file or directory was last accessed
GetLastWriteTime | Returns the date and time that the specified file or directory was last written to
GetLogicalDrives | Retrieves the names of the logical drives on the current computer (in the form "<drive letter>:\")
GetParent | Retrieves the parent directory of the specified path, including both absolute and relative paths
Move | Moves a file or a directory and its contents to a specified location
SetCreationTime | Sets the creation date and time for the specified file or directory
SetCurrentDirectory | Sets the application's current working directory to the specified directory
SetLastAccessTime | Sets the date and time that the specified file or directory was last accessed
SetLastWriteTime | Sets the date and time a directory was last written to
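As a brief illustration of the static methods in Table 10.4, the sketch below creates a scratch directory, inspects it, and removes it again. The class name DirectoryMethodsSketch and the method CreateCountAndDelete are hypothetical; the scratch directory gets a GUID-based name under the temporary folder so that the cleanup cannot touch anything that was already there.

```csharp
using System;
using System.IO;

public static class DirectoryMethodsSketch
{
    // Creates scratch\child, counts the entries in scratch, then removes everything.
    public static int CreateCountAndDelete(string scratch)
    {
        try
        {
            // CreateDirectory creates any missing parent directories as well.
            Directory.CreateDirectory(Path.Combine(scratch, "child"));
            int entries = Directory.GetFileSystemEntries(scratch).Length;
            Console.WriteLine("Created: " + Directory.GetCreationTime(scratch));
            Console.WriteLine("Parent: " + Directory.GetParent(scratch).FullName);
            return entries;
        }
        finally
        {
            // The second argument asks Delete to remove the contents too.
            if (Directory.Exists(scratch))
                Directory.Delete(scratch, true);
        }
    }

    public static void Main()
    {
        string scratch = Path.Combine(Path.GetTempPath(),
            "sketch_" + Guid.NewGuid().ToString("N"));
        Console.WriteLine(CreateCountAndDelete(scratch)); // 1 (just the child directory)
    }
}
```

The try...finally block follows the chapter's practice of wrapping file system work in exception handling, so the scratch directory is removed even if one of the calls fails.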
Table 10.5: Selected DirectoryInfo Class Instance Members
Member | What It Does
Attributes | Property gets or sets the FileAttributes of the current FileSystemInfo
CreationTime | Property gets or sets the creation time of the current FileSystemInfo object
Create | Method creates a directory
CreateSubdirectory | Method creates a subdirectory or subdirectories
Delete | Method deletes a DirectoryInfo and its contents from a path
Exists | Property returns a Boolean value indicating whether a DirectoryInfo instance corresponds to an actual existing directory on disk
Extension | Property gets the string representing the extension part of the file
FullName | Property gets the full path of the directory or file
GetDirectories | Method returns the subdirectories of the current directory
GetFiles | Method returns an array of FileInfo objects representing the files in the current directory
GetFileSystemInfos | Method retrieves an array of FileSystemInfo objects
LastAccessTime | Property gets or sets the time that the current file or directory was last accessed
LastWriteTime | Property gets or sets the time when the current file or directory was last written to
MoveTo | Method moves a DirectoryInfo instance and its contents to a new path
Parent | Property gets the parent directory of a specified subdirectory
Root | Property gets the root portion of a path
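The instance members in Table 10.5 work along the same lines, except that you start from a DirectoryInfo object. Here is a minimal sketch, again using a throwaway directory under the temporary folder; the names DirectoryInfoSketch, CreateAndInspect, and logs are assumptions made for illustration.

```csharp
using System;
using System.IO;

public static class DirectoryInfoSketch
{
    // Creates path\logs through a DirectoryInfo, returns the name of the
    // subdirectory's parent, and removes the whole tree afterward.
    public static string CreateAndInspect(string path)
    {
        DirectoryInfo dir = new DirectoryInfo(path);
        try
        {
            dir.Create();
            DirectoryInfo logs = dir.CreateSubdirectory("logs");
            Console.WriteLine("Full name: " + logs.FullName);
            Console.WriteLine("Root: " + logs.Root.FullName);
            Console.WriteLine("Subdirectories: " + dir.GetDirectories().Length);
            return logs.Parent.Name;
        }
        finally
        {
            // Refresh re-reads the directory's state before Exists is tested.
            dir.Refresh();
            if (dir.Exists)
                dir.Delete(true);
        }
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(),
            "sketch_" + Guid.NewGuid().ToString("N"));
        Console.WriteLine("Parent of logs: " + CreateAndInspect(path));
    }
}
```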
Table 10.6: File Class Static Methods
Method | What It Does
AppendText | Creates a StreamWriter that appends text to an existing file (see "Streams" later in this chapter for more information about StreamWriters)
Copy | Copies a file
Create | Creates a file in the specified fully qualified path
CreateText | Creates or opens a new file for writing text
Delete | Deletes the file specified by the fully qualified path (an exception is not thrown if the specified file does not exist)
Exists | Determines whether the specified file exists
GetAttributes | Gets the FileAttributes of the file on the fully qualified path
GetCreationTime | Returns the creation date and time of the specified file or directory
GetLastAccessTime | Returns the date and time that the specified file or directory was last accessed
GetLastWriteTime | Returns the date and time that the specified file or directory was last written to
Move | Moves a specified file to a new location, providing the option to specify a new file name
Open | Opens a FileStream on the specified path (see "Streams" later in this chapter for more information about FileStreams)
OpenRead | Opens an existing file for reading
OpenText | Opens an existing text file for reading
OpenWrite | Opens an existing file for writing
SetAttributes | Sets the specified FileAttributes of the file on the specified path
SetCreationTime | Sets the date and time that the file was created
SetLastAccessTime | Sets the date and time that the specified file was last accessed
SetLastWriteTime | Sets the date and time that the specified file was last written to
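Several of the File methods in Table 10.6 can be strung together in a short round trip: create a text file, append to it, copy it, and read the copy back. This is a sketch under stated assumptions rather than a recommended design; the class name FileMethodsSketch and the method WriteCopyRead are hypothetical, and the files live in the temporary folder.

```csharp
using System;
using System.IO;

public static class FileMethodsSketch
{
    // Writes a line, appends a second one, copies the file, and returns the copy's text.
    public static string WriteCopyRead(string original, string copy)
    {
        try
        {
            using (StreamWriter writer = File.CreateText(original))
                writer.WriteLine("first line");

            using (StreamWriter writer = File.AppendText(original))
                writer.WriteLine("second line");

            File.Copy(original, copy, true); // true: overwrite an existing copy

            using (StreamReader reader = File.OpenText(copy))
                return reader.ReadToEnd();
        }
        finally
        {
            // Delete does not throw if the file is already gone.
            File.Delete(original);
            File.Delete(copy);
        }
    }

    public static void Main()
    {
        string original = Path.GetTempFileName(); // creates an empty file on disk
        string copy = Path.ChangeExtension(original, ".bak");
        Console.Write(WriteCopyRead(original, copy));
    }
}
```

The using statements close each StreamWriter and StreamReader even if an exception is thrown, which matters here because File.Copy needs the original to be closed before it runs.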
Table 10.7: Selected FileInfo Class Instance Members
Member | What It Does
AppendText | Method creates a StreamWriter that appends text to the file
Attributes | Property gets or sets a FileAttributes object that represents the file's attributes
CopyTo | Method copies an existing file to a new file
Create | Method creates a file
CreateText | Method creates a StreamWriter that writes a new text file
CreationTime | Property gets or sets the creation time of the current object
Delete | Method permanently deletes a file
Directory | Property gets an instance of the parent directory
DirectoryName | Property gets a string representing the directory's full path
Exists | Property gets a Boolean value indicating whether a file exists
Extension | Property gets the string representing the extension part of the file
FullName | Property gets the full path of the file
LastAccessTime | Property gets or sets the time that the file was last accessed