- •Using Your Sybex Electronic Book
- •Acknowledgments
- •Contents at a Glance
- •Introduction
- •Who Should Read This Book?
- •How About the Advanced Topics?
- •The Structure of the Book
- •How to Reach the Author
- •The Integrated Development Environment
- •The Start Page
- •Project Types
- •Your First VB Application
- •Making the Application More Robust
- •Making the Application More User-Friendly
- •The IDE Components
- •The IDE Menu
- •The Toolbox Window
- •The Solution Explorer
- •The Properties Window
- •The Output Window
- •The Command Window
- •The Task List Window
- •Environment Options
- •A Few Common Properties
- •A Few Common Events
- •A Few Common Methods
- •Building a Console Application
- •Summary
- •Building a Loan Calculator
- •How the Loan Application Works
- •Designing the User Interface
- •Programming the Loan Application
- •Validating the Data
- •Building a Math Calculator
- •Designing the User Interface
- •Programming the MathCalculator App
- •Adding More Features
- •Exception Handling
- •Taking the LoanCalculator to the Web
- •Working with Multiple Forms
- •Working with Multiple Projects
- •Executable Files
- •Distributing an Application
- •VB.NET at Work: Creating a Windows Installer
- •Finishing the Windows Installer
- •Running the Windows Installer
- •Verifying the Installation
- •Summary
- •Variables
- •Declaring Variables
- •Types of Variables
- •Converting Variable Types
- •User-Defined Data Types
- •Examining Variable Types
- •Why Declare Variables?
- •A Variable’s Scope
- •The Lifetime of a Variable
- •Constants
- •Arrays
- •Declaring Arrays
- •Initializing Arrays
- •Array Limits
- •Multidimensional Arrays
- •Dynamic Arrays
- •Arrays of Arrays
- •Variables as Objects
- •So, What’s an Object?
- •Formatting Numbers
- •Formatting Dates
- •Flow-Control Statements
- •Test Structures
- •Loop Structures
- •Nested Control Structures
- •The Exit Statement
- •Summary
- •Modular Coding
- •Subroutines
- •Functions
- •Arguments
- •Argument-Passing Mechanisms
- •Event-Handler Arguments
- •Passing an Unknown Number of Arguments
- •Named Arguments
- •More Types of Function Return Values
- •Overloading Functions
- •Summary
- •The Appearance of Forms
- •Properties of the Form Control
- •Placing Controls on Forms
- •Setting the TabOrder
- •VB.NET at Work: The Contacts Project
- •Anchoring and Docking
- •Loading and Showing Forms
- •The Startup Form
- •Controlling One Form from within Another
- •Forms vs. Dialog Boxes
- •VB.NET at Work: The MultipleForms Project
- •Designing Menus
- •The Menu Editor
- •Manipulating Menus at Runtime
- •Building Dynamic Forms at Runtime
- •The Form.Controls Collection
- •VB.NET at Work: The DynamicForm Project
- •Creating Event Handlers at Runtime
- •Summary
- •The TextBox Control
- •Basic Properties
- •Text-Manipulation Properties
- •Text-Selection Properties
- •Text-Selection Methods
- •Undoing Edits
- •VB.NET at Work: The TextPad Project
- •Capturing Keystrokes
- •The ListBox, CheckedListBox, and ComboBox Controls
- •Basic Properties
- •The Items Collection
- •VB.NET at Work: The ListDemo Project
- •Searching
- •The ComboBox Control
- •The ScrollBar and TrackBar Controls
- •The ScrollBar Control
- •The TrackBar Control
- •Summary
- •The Common Dialog Controls
- •Using the Common Dialog Controls
- •The Color Dialog Box
- •The Font Dialog Box
- •The Open and Save As Dialog Boxes
- •The Print Dialog Box
- •The RichTextBox Control
- •The RTF Language
- •Methods
- •Advanced Editing Features
- •Cutting and Pasting
- •Searching in a RichTextBox Control
- •Formatting URLs
- •VB.NET at Work: The RTFPad Project
- •Summary
- •What Is a Class?
- •Building the Minimal Class
- •Adding Code to the Minimal Class
- •Property Procedures
- •Customizing Default Members
- •Custom Enumerations
- •Using the SimpleClass in Other Projects
- •Firing Events
- •Shared Properties
- •Parsing a Filename String
- •Reusing the StringTools Class
- •Encapsulation and Abstraction
- •Inheritance
- •Inheriting Existing Classes
- •Polymorphism
- •The Shape Class
- •Object Constructors and Destructors
- •Instance and Shared Methods
- •Who Can Inherit What?
- •Parent Class Keywords
- •Derived Class Keyword
- •Parent Class Member Keywords
- •Derived Class Member Keyword
- •MyBase and MyClass
- •Summary
- •On Designing Windows Controls
- •Enhancing Existing Controls
- •Building the FocusedTextBox Control
- •Building Compound Controls
- •VB.NET at Work: The ColorEdit Control
- •VB.NET at Work: The Label3D Control
- •Raising Events
- •Using the Custom Control in Other Projects
- •VB.NET at Work: The Alarm Control
- •Designing Irregularly Shaped Controls
- •Designing Owner-Drawn Menus
- •Designing Owner-Drawn ListBox Controls
- •Using ActiveX Controls
- •Summary
- •Programming Word
- •Objects That Represent Text
- •The Documents Collection and the Document Object
- •Spell-Checking Documents
- •Programming Excel
- •The Worksheets Collection and the Worksheet Object
- •The Range Object
- •Using Excel as a Math Parser
- •Programming Outlook
- •Retrieving Information
- •Recursive Scanning of the Contacts Folder
- •Summary
- •Advanced Array Topics
- •Sorting Arrays
- •Searching Arrays
- •Other Array Operations
- •Array Limitations
- •The ArrayList Collection
- •Creating an ArrayList
- •Adding and Removing Items
- •The HashTable Collection
- •VB.NET at Work: The WordFrequencies Project
- •The SortedList Class
- •The IEnumerator and IComparer Interfaces
- •Enumerating Collections
- •Custom Sorting
- •Custom Sorting of a SortedList
- •The Serialization Class
- •Serializing Individual Objects
- •Serializing a Collection
- •Deserializing Objects
- •Summary
- •Handling Strings and Characters
- •The Char Class
- •The String Class
- •The StringBuilder Class
- •VB.NET at Work: The StringReversal Project
- •VB.NET at Work: The CountWords Project
- •Handling Dates
- •The DateTime Class
- •The TimeSpan Class
- •VB.NET at Work: Timing Operations
- •Summary
- •Accessing Folders and Files
- •The Directory Class
- •The File Class
- •The DirectoryInfo Class
- •The FileInfo Class
- •The Path Class
- •VB.NET at Work: The CustomExplorer Project
- •Accessing Files
- •The FileStream Object
- •The StreamWriter Object
- •The StreamReader Object
- •Sending Data to a File
- •The BinaryWriter Object
- •The BinaryReader Object
- •VB.NET at Work: The RecordSave Project
- •The FileSystemWatcher Component
- •Properties
- •Events
- •VB.NET at Work: The FileSystemWatcher Project
- •Summary
- •Displaying Images
- •The Image Object
- •Exchanging Images through the Clipboard
- •Drawing with GDI+
- •The Basic Drawing Objects
- •Drawing Shapes
- •Drawing Methods
- •Gradients
- •Coordinate Transformations
- •Specifying Transformations
- •VB.NET at Work: Plotting Functions
- •Bitmaps
- •Specifying Colors
- •Defining Colors
- •Processing Bitmaps
- •Summary
- •The Printing Objects
- •PrintDocument
- •PrintDialog
- •PageSetupDialog
- •PrintPreviewDialog
- •PrintPreviewControl
- •Printer and Page Properties
- •Page Geometry
- •Printing Examples
- •Printing Tabular Data
- •Printing Plain Text
- •Printing Bitmaps
- •Using the PrintPreviewControl
- •Summary
- •Examining the Advanced Controls
- •How Tree Structures Work
- •The ImageList Control
- •The TreeView Control
- •Adding New Items at Design Time
- •Adding New Items at Runtime
- •Assigning Images to Nodes
- •Scanning the TreeView Control
- •The ListView Control
- •The Columns Collection
- •The ListItem Object
- •The Items Collection
- •The SubItems Collection
- •Summary
- •Types of Errors
- •Design-Time Errors
- •Runtime Errors
- •Logic Errors
- •Exceptions and Structured Exception Handling
- •Studying an Exception
- •Getting a Handle on this Exception
- •Finally (!)
- •Customizing Exception Handling
- •Throwing Your Own Exceptions
- •Debugging
- •Breakpoints
- •Stepping Through
- •The Local and Watch Windows
- •Summary
- •Basic Concepts
- •Recursion in Real Life
- •A Simple Example
- •Recursion by Mistake
- •Scanning Folders Recursively
- •Describing a Recursive Procedure
- •Translating the Description to Code
- •The Stack Mechanism
- •Stack Defined
- •Recursive Programming and the Stack
- •Passing Arguments through the Stack
- •Special Issues in Recursive Programming
- •Knowing When to Use Recursive Programming
- •Summary
- •MDI Applications: The Basics
- •Building an MDI Application
- •Built-In Capabilities of MDI Applications
- •Accessing Child Forms
- •Ending an MDI Application
- •A Scrollable PictureBox
- •Summary
- •What Is a Database?
- •Relational Databases
- •Exploring the Northwind Database
- •Exploring the Pubs Database
- •Understanding Relations
- •The Server Explorer
- •Working with Tables
- •Relationships, Indices, and Constraints
- •Structured Query Language
- •Executing SQL Statements
- •Selection Queries
- •Calculated Fields
- •SQL Joins
- •Action Queries
- •The Query Builder
- •The Query Builder Interface
- •SQL at Work: Calculating Sums
- •SQL at Work: Counting Rows
- •Limiting the Selection
- •Parameterized Queries
- •Calculated Columns
- •Specifying Left, Right, and Inner Joins
- •Stored Procedures
- •Summary
- •How About XML?
- •Creating a DataSet
- •The DataGrid Control
- •Data Binding
- •VB.NET at Work: The ViewEditCustomers Project
- •Binding Complex Controls
- •Programming the DataAdapter Object
- •The Command Objects
- •The Command and DataReader Objects
- •VB.NET at Work: The DataReader Project
- •VB.NET at Work: The StoredProcedure Project
- •Summary
- •The Structure of a DataSet
- •Navigating the Tables of a DataSet
- •Updating DataSets
- •The DataForm Wizard
- •Handling Identity Fields
- •Transactions
- •Performing Update Operations
- •Updating Tables Manually
- •Building and Using Custom DataSets
- •Summary
- •An HTML Primer
- •HTML Code Elements
- •Server-Client Interaction
- •The Structure of HTML Documents
- •URLs and Hyperlinks
- •The Basic HTML Tags
- •Inserting Graphics
- •Tables
- •Forms and Controls
- •Processing Requests on the Server
- •Building a Web Application
- •Interacting with a Web Application
- •Maintaining State
- •The Web Controls
- •The ASP.NET Objects
- •The Page Object
- •The Response Object
- •The Request Object
- •The Server Object
- •Using Cookies
- •Handling Multiple Forms in Web Applications
- •Summary
- •The Data-Bound Web Controls
- •Simple Data Binding
- •Binding to DataSets
- •Is It a Grid, or a Table?
- •Getting Orders on the Web
- •The Forms of the ProductSearch Application
- •Paging Large DataSets
- •Customizing the Appearance of the DataGrid Control
- •Programming the Select Button
- •Summary
- •How to Serve the Web
- •Building a Web Service
- •Consuming the Web Service
- •Maintaining State in Web Services
- •A Data-Driven Web Service
- •Consuming the Products Web Service in VB
- •Summary
THE HASHTABLE COLLECTION 497
Iterating an ArrayList
To iterate through the elements of an ArrayList collection, you can set up a For…Next loop like the following one:
For i = 0 To ArrayList.Count - 1
    { process item ArrayList(i) }
Next
This is a trivial operation, but the processing itself can get as complicated as the type of objects stored in the collection requires. The current item at each iteration is the ArrayList(i). If you don’t know its exact type, assign it to an Object variable and then process it.
You could also use the For Each…Next loop with an Object variable, as shown next:
Dim itm As Object
For Each itm In ArrayList
    { process item itm }
Next
If all the items in the ArrayList are of the same type, you can use a variable of the same type to iterate through the collection, instead of a generic Object variable. If all the elements are Decimals, for example, you can declare the itm variable as Decimal.
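The two loops can be sketched as a small console program. The module name, function names, and sample prices are made up for illustration, and the typed loop variable works only on the assumption that every element really is a Decimal.

```vbnet
Imports System
Imports System.Collections

Module ArrayListIterationDemo
    ' Sums the items of an ArrayList with an indexed For...Next loop.
    ' Assumes every item is a Decimal.
    Function SumWithIndex(ByVal list As ArrayList) As Decimal
        Dim total As Decimal = 0D
        Dim i As Integer
        For i = 0 To list.Count - 1
            total += CType(list(i), Decimal)
        Next
        Return total
    End Function

    ' Same calculation with For Each and a typed loop variable.
    Function SumWithForEach(ByVal list As ArrayList) As Decimal
        Dim total As Decimal = 0D
        Dim itm As Decimal
        For Each itm In list
            total += itm
        Next
        Return total
    End Function

    Sub Main()
        Dim prices As New ArrayList()
        prices.Add(9.95D)
        prices.Add(24.5D)
        prices.Add(3D)
        Console.WriteLine(SumWithIndex(prices))
        Console.WriteLine(SumWithForEach(prices))
    End Sub
End Module
```

If the collection might hold mixed types, the Object loop variable is the safer choice, because the typed version raises an exception on the first item that can't be converted.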
An even better method is to create an enumerator for the collection and use it to iterate through its items. This technique applies to all collections and is discussed in the section “Enumerating Collections,” later in this chapter.
The ArrayList class addresses most of the problems associated with the Array class, but one last problem remains—that of accessing the items in the collection through a meaningful key. This is the problem addressed by the HashTable collection.
The HashTable Collection
The ArrayList is a more convenient form of an array. It’s dynamic, it allows you to insert items anywhere and remove items from the collection with a single method call, and it supports all the convenient features of an array, like sorting and searching.
Yet, both collections have a drawback: namely, you must access their elements by an index. Another collection, the HashTable collection, is similar to the ArrayList, but it allows you to access the items by a key. Each item has a value and a key. The value is the same value you store in an array, but the key is a meaningful entity for accessing the items in the collection.
The HashTable exposes most of the properties and methods of the ArrayList, with a few notable exceptions. The Count property returns the number of items in the collection as usual, but the HashTable collection doesn’t expose a Capacity property. The HashTable collection uses fairly complicated logic to maintain the list of items, and it adjusts its capacity automatically. Fortunately, you need not know how the items are stored in the collection. In short, it automatically computes a hash code from each item’s key and uses it to locate the item. It’s possible that two different keys will produce the same hash code (not very likely, but the possibility is not zero). The HashTable class uses a complicated algorithm to handle all possible cases, but you need not be concerned with these details. The Framework provides all these classes so that you won’t have to write low-level code.
Copyright ©2002 SYBEX, Inc., Alameda, CA |
www.sybex.com |
498 Chapter 11 STORING DATA IN COLLECTIONS
To use a HashTable in your code, you need not import any class. Just declare a HashTable variable with the following statement:
Dim hTable As New HashTable
To add an item to the HashTable, use the Add method, whose syntax is
hTable.Add(key, value)
value is the item you want to add (it can be any object), and key is a value you supply, which represents the item. This is the value you’ll use later to retrieve the item. If you’re setting up a structure for storing temperatures in various cities, use the city names as keys:
Dim Temperatures As New HashTable
Temperatures.Add("Houston", 81)
Temperatures.Add("Los Angeles", 78)
Notice that you can have duplicate values, but the keys must be unique. If you attempt to use an existing key, an argument exception will be raised. To find out whether a specific value or key is already in the collection, use the ContainsKey and ContainsValue methods. The syntax of the two methods is quite similar:
hTable.ContainsKey(object)
hTable.ContainsValue(object)
The HashTable collection exposes the Contains method too, which is identical to the ContainsKey method.
To find out whether a specific key is in use already, use the ContainsKey method, as shown in the following statements, which add a new item to the HashTable only if its key doesn’t exist already:
Dim value As New Rectangle(100, 100, 50, 50)
Dim key As String = "object1"
If Not hTable.ContainsKey(key) Then
    hTable.Add(key, value)
End If
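The behavior of Add and ContainsKey can be tried out with a short console sketch. The AddIfMissing() helper and the extra city are hypothetical names introduced here for illustration; the Add, ContainsKey, and Item members are the ones described in the text.

```vbnet
Imports System
Imports System.Collections

Module HashtableKeysDemo
    ' Hypothetical helper: adds the pair only if the key is not in use.
    ' Returns True when the item was added.
    Function AddIfMissing(ByVal table As Hashtable, ByVal key As Object, _
                          ByVal value As Object) As Boolean
        If table.ContainsKey(key) Then Return False
        table.Add(key, value)
        Return True
    End Function

    Sub Main()
        Dim Temperatures As New Hashtable()
        Temperatures.Add("Houston", 81)
        Temperatures.Add("Los Angeles", 78)
        ' The key "Houston" is already in use, so nothing is added.
        Console.WriteLine(AddIfMissing(Temperatures, "Houston", 90))
        ' A duplicate value under a new key is fine.
        Console.WriteLine(AddIfMissing(Temperatures, "Phoenix", 81))
        ' Calling Add directly with an existing key raises an ArgumentException.
        Try
            Temperatures.Add("Houston", 90)
        Catch ex As ArgumentException
            Console.WriteLine("Duplicate key rejected: " & ex.Message)
        End Try
        Console.WriteLine(Temperatures("Houston"))
    End Sub
End Module
```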
The Values and Keys properties allow you to retrieve all the values and the keys in the HashTable. Both properties are collections and expose the usual members of a collection. To iterate through the values stored in the HashTable hTable, use the following loop:
Dim itm As Object
For Each itm In hTable.Values
Console.WriteLine(itm)
Next
There is only one method to remove an individual item from a HashTable: the Remove method, which accepts as an argument the key of the item to be removed:
hTable.Remove(key)
To extract items from a HashTable, use the CopyTo method. This method copies the items to a one-dimensional array, starting at the specified index of the array, and its syntax is
hTable.CopyTo(arrayName, index)
You must set up the array that will accept the items beforehand, because this method can throw several different exceptions for various error conditions. The array that accepts the values must be one-dimensional, and there should be enough space in the array for the HashTable’s values. Moreover, the array’s type must be Object, because this is the type of the items you can store in a HashTable.
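A minimal sketch of extracting items into arrays follows, assuming the two-argument form of CopyTo (target array and starting index). When CopyTo is called on the HashTable itself, each copied item is a DictionaryEntry holding a key and its value; to copy the values alone, the sketch calls CopyTo on the Values collection instead. The ValuesToArray() helper is a hypothetical name.

```vbnet
Imports System
Imports System.Collections

Module HashtableCopyDemo
    ' Hypothetical helper: copies only the values into a new Object array.
    Function ValuesToArray(ByVal table As Hashtable) As Object()
        Dim result(table.Count - 1) As Object
        table.Values.CopyTo(result, 0)
        Return result
    End Function

    Sub Main()
        Dim Temperatures As New Hashtable()
        Temperatures.Add("Houston", 81)
        Temperatures.Add("Los Angeles", 78)

        ' The target array is sized beforehand to hold all the items.
        Dim items(Temperatures.Count - 1) As Object
        Temperatures.CopyTo(items, 0)
        Dim itm As Object
        For Each itm In items
            ' Each copied item is a key/value pair.
            Dim entry As DictionaryEntry = CType(itm, DictionaryEntry)
            Console.WriteLine(entry.Key.ToString & " = " & entry.Value.ToString)
        Next

        Console.WriteLine(ValuesToArray(Temperatures).Length)
    End Sub
End Module
```

The order in which the items land in the array is decided by the HashTable, so the code shouldn't rely on it.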
Listing 11.7 demonstrates how to scan the keys of a HashTable through the Keys property and then use these keys to access the items through the Item property (and passing the key as argument).
Listing 11.7: Iterating a HashTable
Private Function ShowHashTableContents(ByVal table As Hashtable) As String
    Dim msg As String
    Dim element, key As Object
    msg = "The HashTable contains " & table.Count.ToString & " elements:" & vbCrLf
    For Each key In table.Keys
        ' retrieve the item that corresponds to the current key
        element = table.Item(key)
        msg = msg & vbCrLf
        msg = msg & " Element Type = " & element.GetType.ToString & vbCrLf
        msg = msg & " Element Key= " & key.ToString
        msg = msg & " Element Value= " & element.ToString & vbCrLf
    Next
    Return (msg)
End Function
To print the contents of a HashTable variable on the Output window, call the ShowHashTableContents() function, passing the name of the HashTable as argument, and then print the string returned by the function:
Dim HT As New HashTable
{ statements to populate HashTable }
Console.WriteLine(ShowHashTableContents(HT))
VB.NET at Work: The WordFrequencies Project
In this section, you’ll develop an application that counts word frequencies in a text. The WordFrequencies application scans text files and counts the occurrences of each word in the text. As you will see, the HashTable is the natural choice for storing this information, because you want to access a word’s frequency by the word. To retrieve (or update) the frequency of the word elaborate, for example, you will use the expression:
Words("ELABORATE")
Arrays and ArrayLists are out of the question, because they can’t be accessed by a key. You could also use the SortedList collection, which is described later in this chapter, but this collection maintains its items sorted at all times. If you need this functionality as well, you can modify the application accordingly. The items in a SortedList are also accessed by keys, so you won’t have to introduce substantial changes in the code.
Let me start with a few remarks. First, all words we locate in the various text files will be converted to uppercase. Because the keys of the HashTable are case-sensitive, converting the words to uppercase ensures that each word is stored under a single key. This way, we don’t risk counting the same word in different cases as two or more different words.
The frequencies of the words can’t be calculated instantly, because we need to know the total number of words in the text. Instead, each value in the HashTable is the number of occurrences of a specific word. To calculate the actual frequency of a word, you must divide this value by the total number of occurrences of all words, but this can happen only after we have scanned the entire text file and counted the occurrences of each word. Since this operation will introduce delays in the application, I’ve decided to keep track of the number of occurrences only and calculate the word frequencies when requested.
When the code runs into another instance of the word elaborate, it simply increases the matching item of the HashTable by one:
Words("ELABORATE") = CType(Words("ELABORATE"), Integer) + 1
The application’s interface is shown in Figure 11.3. To scan another text file and process its words, click the Read Text File button. You’ll be prompted to select the name of the file to be processed with an Open dialog box. Then, you can click the Show Word Count button to count the number of occurrences of each word in the text. The last button on the form sorts the words according to their count.
Figure 11.3
The WordFrequencies project demonstrates how to use the HashTable collection.
The application maintains a single HashTable collection, the Words collection, and it updates this collection rather than counting word occurrences from scratch. The Frequency Table menu contains the commands to save the collection’s items to a disk file and read the same data from the file. Use one of the Save commands to save the HashTable to a disk file, and use the equivalent Load command to read the data from the disk file into the HashTable. The commands in this menu can store the data either to a text file (Save SOAP/Load SOAP commands) or to a binary file (Save Binary/Load Binary). Use
these commands to store the data generated in a single session, load the data in a later session, and process more files. These commands will be discussed in detail at the end of the chapter, where we’ll explore the Serialization class. For now, you can use the commands to continue processing text files in multiple sessions.
The WordFrequencies application uses techniques and classes we haven’t discussed yet. The topic of reading from (or writing to) files is discussed in the following chapter. You don’t really have to understand the code that opens a text file and reads its lines; just focus on the segments that manipulate the text file. To test the project, I used some very large files I downloaded from the Project Gutenberg Web site (http://promo.net/pg/). This site contains entire books in electronic format (plain text files), and you can borrow some files to test any program that manipulates text (in addition to reading them, of course).
The code reads the text into a string variable, the txtLine variable. Then, it calls the Split method of the String class to split the text into individual words. The Split method uses the space, comma, period, quote, exclamation mark, colon, semicolon, and newline characters as delimiters. The individual words are stored in the Words array. The program goes through each word in the array and determines whether it’s a valid word by calling the IsValidWord() function. This function returns False if one of the characters in the word is not a letter; strings like “B2B” or “U2” are not considered proper words. IsValidWord() is a custom function, and you can edit it as you wish.
Any valid word becomes a key to the WordFrequencies HashTable. The corresponding value is the number of occurrences of the specific word in the text. If a key (a new word) is not in the table yet, it’s added with a value of 1. If the key exists already, its value is increased by 1, with the following If statement:
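The project's own IsValidWord() function isn't listed in this section, so the following is only one possible version, written under the assumption that a valid word consists of letters only. It also shows the Split call with the delimiters named above; the sample sentence is made up.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module WordSplitDemo
    ' Hypothetical version of IsValidWord(): True only if the string is
    ' not empty and every character is a letter.
    Function IsValidWord(ByVal word As String) As Boolean
        If word.Length = 0 Then Return False
        Dim ch As Char
        For Each ch In word
            If Not Char.IsLetter(ch) Then Return False
        Next
        Return True
    End Function

    Sub Main()
        Dim Delimiters() As Char = {" "c, "."c, ","c, "'"c, "!"c, ";"c, ":"c, _
                                    Chr(10), Chr(13)}
        Dim text As String = "U2 played. The crowd, the band: loud!"
        Dim word As String
        ' Consecutive delimiters produce empty strings, which IsValidWord rejects.
        For Each word In text.Split(Delimiters)
            If IsValidWord(word.ToUpper) Then Console.WriteLine(word.ToUpper)
        Next
    End Sub
End Module
```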
If Not WordFrequencies.ContainsKey(word) Then
WordFrequencies.Add(word, 1)
Else
WordFrequencies(word) = CType(WordFrequencies(word), Integer) + 1
End If
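The counting logic can be isolated from the user interface in a short sketch. CountWords() and Frequency() are hypothetical helper names, not part of the project; the second one applies the definition given earlier, a word's count divided by the total count of all words.

```vbnet
Imports System
Imports System.Collections

Module WordCountDemo
    ' Builds a Hashtable whose keys are uppercase words and whose values
    ' are the number of times each word occurs.
    Function CountWords(ByVal words() As String) As Hashtable
        Dim counts As New Hashtable()
        Dim w As String
        For Each w In words
            Dim key As String = w.ToUpper
            If Not counts.ContainsKey(key) Then
                counts.Add(key, 1)
            Else
                counts(key) = CType(counts(key), Integer) + 1
            End If
        Next
        Return counts
    End Function

    ' Frequency of a word = its count divided by the sum of all counts.
    Function Frequency(ByVal counts As Hashtable, ByVal word As String) As Double
        Dim total As Integer = 0
        Dim c As Object
        For Each c In counts.Values
            total += CType(c, Integer)
        Next
        If total = 0 OrElse Not counts.ContainsKey(word) Then Return 0
        Return CType(counts(word), Integer) / total
    End Function

    Sub Main()
        Dim sample() As String = {"the", "cat", "The", "hat"}
        Dim counts As Hashtable = CountWords(sample)
        Console.WriteLine(counts("THE"))
        Console.WriteLine(Frequency(counts, "THE"))
    End Sub
End Module
```

Because only the counts are stored, more files can be processed later without recalculating anything; the division happens when a frequency is requested.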
The code that reads the text file and splits it into individual words is shown in Listing 11.8. The code prompts the user to select a text file with the Open dialog box and then reads the entire text into a string variable, the txtLine variable, and the individual words are isolated with the Split method of the String class.
Listing 11.8: Splitting a Text File into Words
Private Sub Button1_Click(ByVal sender As System.Object, _
ByVal e As System.EventArgs) Handles Button1.Click
    OpenFileDialog1.DefaultExt = "TXT"
    OpenFileDialog1.Filter = "Text|*.TXT|All Files|*.*"
    OpenFileDialog1.ShowDialog()
    If OpenFileDialog1.FileName = "" Then Exit Sub
    ' StreamReader and File require Imports System.IO at the top of the file
    Dim str As StreamReader
    Dim txtLine As String
    Dim Words() As String
    Dim Delimiters() As Char = {CType(" ", Char), CType(".", Char), _
                                CType(",", Char), CType("'", Char), _
                                CType("!", Char), CType(";", Char), _
                                CType(":", Char), Chr(10), Chr(13)}
    str = File.OpenText(OpenFileDialog1.FileName)
    ' read the entire file into a single string
    txtLine = str.ReadToEnd
    str.Close()
    Words = txtLine.Split(Delimiters)
    Dim iword As Integer, word As String
    For iword = 0 To Words.GetUpperBound(0)
        word = Words(iword).ToUpper
        If IsValidWord(word) Then
            ' add a new word with a count of 1, or increase the existing count
            If Not WordFrequencies.ContainsKey(word) Then
                WordFrequencies.Add(word, 1)
            Else
                WordFrequencies(word) = CType(WordFrequencies(word), Integer) + 1
            End If
        End If
    Next
End Sub
This event handler calculates the count of the unique words and displays them on a TextBox control. In a document with 130,000 words, it didn’t take more than a couple of seconds to perform all the calculations. The process of displaying the list of unique words on a TextBox control was very fast too, thanks to the StringBuilder class. The code behind the Show Word Count button (Listing 11.9) displays the list of words along with the number of occurrences of each word in the text.
Listing 11.9: Displaying the Count of Each Word in the Text
Private Sub Button2_Click(ByVal sender As System.Object, _
            ByVal e As System.EventArgs) Handles Button2.Click
    Dim wEnum As IDictionaryEnumerator
    Dim occurrences As Integer
    Dim allWords As New System.Text.StringBuilder()
    wEnum = WordFrequencies.GetEnumerator
    While wEnum.MoveNext
        allWords.Append(wEnum.Key.ToString & vbTab & "-->" & vbTab & _
                        wEnum.Value.ToString & vbCrLf)
    End While
    TextBox1.Text = allWords.ToString
End Sub
The last button on the form calculates the frequency of each word in the HashTable, sorts them according to their frequencies, and displays the list; its code is detailed in Listing 11.10.
Listing 11.10: Sorting the Words According to Frequency
Private Sub Button3_Click(ByVal sender As System.Object, _
            ByVal e As System.EventArgs) Handles Button3.Click
    Dim wEnum As IDictionaryEnumerator
    ' size the arrays to hold exactly Count items (indices 0 to Count - 1)
    Dim Words(WordFrequencies.Count - 1) As String
    Dim Frequencies(WordFrequencies.Count - 1) As Double
    Dim allWords As New System.Text.StringBuilder()
    Dim i, totCount As Integer
    wEnum = WordFrequencies.GetEnumerator
    While wEnum.MoveNext
        Words(i) = CType(wEnum.Key, String)
        Frequencies(i) = CType(wEnum.Value, Integer)
        totCount = totCount + CType(wEnum.Value, Integer)
        i = i + 1
    End While
    ' convert each count to a fraction of the total word count
    For i = 0 To Words.GetUpperBound(0)
        Frequencies(i) = Frequencies(i) / totCount
    Next
    ' sort both arrays in ascending order of frequency
    Array.Sort(Frequencies, Words)
    TextBox1.Clear()
    For i = Words.GetUpperBound(0) To 0 Step -1
        allWords.Append(Words(i) & vbTab & "-->" & vbTab & _
                        Format(100 * Frequencies(i), "#.000") & vbCrLf)
    Next
    TextBox1.Text = allWords.ToString
End Sub
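The sorting step relies on the form of the Array.Sort method that accepts two arrays: it sorts the first array (the keys) and rearranges the second one (the items) in the same way. Here is a minimal sketch of the technique on its own; the SortByFrequency() helper and the sample data are made up for illustration.

```vbnet
Imports System

Module ParallelSortDemo
    ' Hypothetical helper: returns the words ordered from the most to the
    ' least frequent, leaving the original arrays untouched.
    Function SortByFrequency(ByVal words() As String, _
                             ByVal freqs() As Double) As String()
        Dim keys() As Double = CType(freqs.Clone(), Double())
        Dim items() As String = CType(words.Clone(), String())
        ' Sorts keys in ascending order and reorders items to match.
        Array.Sort(keys, items)
        ' Reverse to get the most frequent word first.
        Array.Reverse(items)
        Return items
    End Function

    Sub Main()
        Dim words() As String = {"CAT", "THE", "HAT"}
        Dim freqs() As Double = {0.2, 0.5, 0.3}
        Dim w As String
        For Each w In SortByFrequency(words, freqs)
            Console.WriteLine(w)
        Next
    End Sub
End Module
```

The listing scans the sorted arrays backward instead of reversing them, which gives the same order without the extra call.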
Handling Large Sets of Data
Incidentally, my first attempt was to display the list of unique words on a ListBox control. The process was incredibly slow. The first 10,000 words were added in a few seconds, but as the number of items increased, the time it took to add them to the control increased exponentially (or so it seemed).
Adding thousands of items to a ListBox control is a very slow process. It’s likely that you will run into situations where a seemingly simple task will turn out to be detrimental to your application’s performance. You should try different approaches, but also consider a total overhaul of your user interface. Ask yourself, who needs to see a list with 10,000 words? You can use the application to do the calculations and then retrieve the count of selected words, or display the 100 most common ones, or even display 100 words at a time. I’m displaying the list of words because this is a demonstration, but a real application shouldn’t display such a long list. The core of the application counts unique words in a text file, and it does it very efficiently.
Appending each word to a TextBox control was slow too, so I build the text in a variable first and then assign it to the control. This variable is the allWords variable, which was declared with the StringBuilder type. As you will learn in the following chapter, the StringBuilder class manipulates strings like the String class, but it’s much faster when a string is modified repeatedly.
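The approach can be sketched outside the form as follows. The BuildList() helper is a hypothetical name; in the project, the resulting string would be assigned to the Text property of the TextBox control in a single statement.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module StringBuilderDemo
    ' Hypothetical helper: builds one line per word with a StringBuilder
    ' instead of concatenating strings (or updating a control) in the loop.
    Function BuildList(ByVal words() As String) As String
        Dim allWords As New StringBuilder()
        Dim w As String
        For Each w In words
            allWords.Append(w & vbCrLf)
        Next
        Return allWords.ToString
    End Function

    Sub Main()
        Dim words() As String = {"ELABORATE", "SIMPLE", "WORD"}
        ' The control (here, the console) is updated once, after the loop.
        Console.Write(BuildList(words))
    End Sub
End Module
```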