Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:
(ebook) Visual Studio .NET Mastering Visual Basic.pdf
Скачиваний:
120
Добавлен:
17.08.2013
Размер:
15.38 Mб
Скачать

870 Chapter 20 DATABASES: ARCHITECTURE AND BASIC CONCEPTS

this chapter isn’t just of theoretical interest. Everything you will learn in this chapter is as practical as it gets.

Another important motivation for this chapter is the fact that databases, the most important aspect of computer science today, are among the most complicated objects in programming, yet they’re based on common-sense principles. Once you’ve understood these principles, you’ll find that database programming isn’t as complicated as you may have thought.

What Is a Database?

A database is an object for storing complex, structured information. The same is true for a file, or even for the file system on your hard disk. What makes a database unique is the fact that databases are designed to make data easily retrievable. The purpose of a database is not so much the storage of information as its quick retrieval. In other words, you must structure your database so that it can be queried quickly and efficiently.

Databases are maintained by special programs, such as Access and SQL Server. These programs are called database management systems (DBMS), and they’re among the most complicated applications. A fundamental characteristic of a DBMS is that it isolates much of the complexity of the database from the developer. Regardless of how each DBMS stores data on disk, you see your data organized in tables with relationships between tables. To access the data stored in the database and to update the database, you use a special language, Structured Query Language (SQL). Unlike other areas of programming, SQL is a truly universal language and all major DBMSs support this language.

Note The recommended DBMS for Visual Studio .NET is SQL Server 2000. You can use Access, or even nonMicrosoft databases like Oracle. Although this chapter was written with SQL Server 2000, most of the examples will work with Access 2000 as well.

Data are stored in tables, and each table contains entities of the same type. In a database that stores information about books, there will be a table with titles, another table with authors, and a table with publishers. The table with the titles contains information like the title of the book, number of pages, and the book’s description. Author names are stored in a different table, because each author may appear in multiple titles. If author information were stored along with each title, we’d be repeating author names. This means that every time we wanted to change an author’s name, we’d have to modify multiple entries in the titles table. Even retrieving a list of unique author names would be a challenge, because you’d have to scan the entire database, retrieve all the authors, and then get rid of the duplicates entries. The same is true for publishers. Publishers are stored in a separate table, and each title contains a pointer to the appropriate row in the publishers table. If publisher information was stored along with each title, then deleting all the books of a specific publisher would also remove the information about the specific publisher from the database.

The reason for breaking the information we want to store in a database into separate tables is to avoid duplication of information. This is a key point in database design. Duplication of information will sooner or later lead to inconsistencies in the database. The process of breaking the data into related tables that eliminate all possible forms of information duplication is called normalization, and there rules for normalizing databases. The topic of database normalization is not discussed in this book. However, all it really takes to design a functional database is common sense. Once you learn

Copyright ©2002 SYBEX, Inc., Alameda, CA

www.sybex.com

WHAT IS A DATABASE? 871

how to extract data from your database’s tables with SQL statements, you’ll develop a better understanding of how databases should be structured.

Breaking the information into separate tables is a very convenient approach, but we must figure out a way to reconstruct the information. For each title, we must retrieve the title’s author(s) and publisher and put them together to display all the information about the book. To be able to reconstruct the original information, we establish links between the various tables. These links are called relationships, and they’re at the heart of a modern DBMS.

Relational Databases

The databases we’re interested in are relational, because they are based on relationships among the data they contain. The data is stored in tables, and tables contain related data, or entities, such as persons, products, orders, and so on. The idea is to keep the tables small and manageable; thus, separate entities are kept in their own tables. If you start mixing customers and invoices, products and their suppliers, or books, publishers, and authors in the same table, you’ll end up repeating information—a highly undesirable situation. If there’s one rule to live by as a database designer and programmer, this is it: Do not duplicate information.

Of course, entities are not independent of each other. For example, orders are placed by specific customers, so the rows of the Customers table must be linked to the rows of the Orders table that store the orders of the customers. Figure 20.1 shows a segment of a table with customers (top left) and the rows of a table with orders that correspond to one of the customers (bottom right). The lines that connect the rows of the two tables represent relationships.

Figure 20.1

Linking customers and orders with relationships

Copyright ©2002 SYBEX, Inc., Alameda, CA

www.sybex.com

872 Chapter 20 DATABASES: ARCHITECTURE AND BASIC CONCEPTS

As you can see in Figure 20.1, relationships are implemented by inserting columns with matching values in the two related tables; the CustomerID column is repeated in both tables. The rows with a common value in their CustomerID field are related. In other words, the lines that connect the two tables simply indicate that there are two fields, one on each side of the relationship, with a common value. The customer with the ID value BERGS has placed the orders 10278 and 10280. The customer BSBEV has placed the order 10289. To find all the orders placed by a customer, we can scan the Orders table and retrieve the rows in which the CustomerID field has the same value as the ID of the specific customer in the Customers table. Likewise, you can locate customer information for each order by looking up the row of the Customers table that has the same ID as the one in the CustomerID field of the Orders table.

These two fields used in a relationship are called key fields. The CustomerID field of the Customers table is the primary key, because it identifies a single customer. The CustomerID field of the Orders table is the foreign key of the relationship. A CustomerID value appears in a single row of the Customers table; it’s the table’s primary key. However, it may appear in multiple rows of the Orders table, because in this table the CustomerID field is the foreign key. In fact, it will appear in as many rows of the Orders table as there are orders for the specific customer.

The operation of matching rows in two tables based on their primary and foreign keys is called a join. Joins are very basic operations in manipulating tables, and they are discussed in detail in the section “Structured Query Language,” later in this chapter.

Exploring the Northwind Database

In this section, we’ll explore the structure of a sample database that comes with both SQL Server 2000 and Access 2000. The Northwind database stores products, customers, and sales data, and many of you are already familiar with the structure of the database. We’ll discuss the basic objects that make up a database shortly, but it’s easier to explain these objects through examples. Besides, you need a good understanding of the structure of this database, so that you can follow the examples of the following sections and chapters. Unless you understand how data is stored in the tables of the database and how the tables relate to one another, you won’t be able to retrieve information from the database or insert new data into it.

The Northwind database is made up of tables, each storing a collection of unique entities (customers, products, and so on). A table that stores products has a column for the product’s name, another column for the product’s price, and so on. Each product is stored in a different row. As products are added or removed from the table, the number of rows changes, but the number of columns remains the same; they determine the information we store about each product.

Products Table

The Products table stores information about the products sold by the Northwind Corporation. This information includes the product’s name, packaging information, price, and other relevant fields. Each product (or row) in the table is identified by a unique numeric ID. Since the rows of the Products table are referenced by invoices (the Order Details table, which is discussed later), the product IDs appear in the Order Details table as well.

Copyright ©2002 SYBEX, Inc., Alameda, CA

www.sybex.com

WHAT IS A DATABASE? 873

Suppliers Table

Each product has a supplier too. Because the same supplier may offer more than one product, the supplier information is stored in a different table, and a common field, the SupplierID field, is used to link each product to its supplier as shown in Figure 20.2. For example, the products Chai, Chang, and Aniseed Syrup are purchased from the same supplier, Exotic Liquids. Their SupplierID fields all point to the same row in the Suppliers table.

Figure 20.2

Linking products to their suppliers and their categories

Categories Table

In addition to having a supplier, each product belongs to a category. Categories are not stored along with product names, but in a separate table, the Categories table. Again, each category is identified by a numeric value (field CategoryID) and has a name (field CategoryName). In addition, the Categories table has two more columns: Description, which contains text, and Picture, which stores a bitmap. The CategoryID field in the Categories table is the primary key, and the field by the same name in the Products table is the corresponding foreign key.

Customers Table

The Customers table stores information about the company’s customers. Each customer is stored in a separate row of this table, and customers are referenced by the Orders table. Unlike the product IDs, the customer IDs are five-character strings.

Orders Table

The Orders table stores information about the orders placed by Northwind’s customers. The OrderID field, which is an integer value, identifies each order. Orders are numbered sequentially, so this field is also the order’s number. Each time you append a new row to the Orders table, the value of the new OrderID field is generated automatically by the database.

Copyright ©2002 SYBEX, Inc., Alameda, CA

www.sybex.com

874 Chapter 20 DATABASES: ARCHITECTURE AND BASIC CONCEPTS

The Orders table is linked to the Customers table through the CustomerID field. By matching rows with identical values in their CustomerID fields in the two tables, we can recombine customers with their orders. Figure 20.1 shows how customers are linked to their orders.

Order Details Table

You probably have noticed that the Northwind database’s Orders table doesn’t store any details about the items ordered. This information is stored in the Order Details table (see Figure 20.3). Each order is made up of one or more items, and each item has a price, a quantity, and a discount. In addition to these fields, the Order Details table contains an OrderID column, which holds the ID of the order to which the detail line belongs.

Figure 20.3

The Customers, Orders, and Order Details tables, and their relations

The reason details aren’t stored along with the order’s header is that the Orders and Order Details tables store different entities. The order’s header, which contains information about the customer who placed the order, the date of the order, and so on, is quite different from the information you must store for each item ordered.

Employees Table

This table holds employee information. Each employee is identified by a numeric ID, which appears in the each order. When a sale is made, the ID of the employee who made the sale is recorded in the Orders table.

Shippers Table

Each order is shipped with one of the three shippers stored in the Shippers table. The appropriate shipper’s ID is stored in the Orders table.

Copyright ©2002 SYBEX, Inc., Alameda, CA

www.sybex.com