Добавил:
Upload Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Practical Database Programming With Java

.pdf
Скачиваний:
778
Добавлен:
10.06.2015
Размер:
31.58 Mб
Скачать

2.4 Identifying Keys 17

2.4 IDENTIFYING KEYS

2.4.1 Primary Key and Entity Integrity

An attribute that uniquely identifies one and only one instance of an entity is called a primary key. Sometimes, a primary key consists of a combination of attributes. It is referred to as a composite key. Entity integrity rule states that no attribute that is a member of the primary (composite) key may accept a null value.

A FacultyID may serve as a primary key for the Faculty entity, assuming that all faculty members have been assigned a unique FaultyID. However, caution must be exercised when picking an attribute as a primary key. Last Name may not make a good primary key because a department is likely to have more than one person with the same last name. Primary keys for the CSE_DEPT database are shown in Table 2.6.

Primary keys provide a tuple-level addressing mechanism in the relational databases. Once you define an attribute as a primary key for an entity, the DBMS will enforce the uniqueness of the primary key. Inserting a duplicate value for primary key field will fail.

2.4.2 Candidate Key

There can be more than one attribute which uniquely identifies an instance of an entity. These are referred to as candidate keys. Any one of them can serve as a primary key. For example, ID Number as well as Social Security Number may make a suitable primary key. Candidate keys that are not used as primary key are called alternate keys.

2.4.3 Foreign Keys and Referential Integrity

Foreign keys are used to create relationships between tables. It is an attribute in one table whose values are required to match those of primary key in another table. Foreign keys are created to enforce referential integrity, which states that you may not add a record to a table containing a foreign key unless there is a corresponding record in the related table to which it is logically linked. Furthermore, the referential integrity rule also implies that every value of a foreign key in a table must match the primary key of a related table or

Table 2.6.

Faculty table

 

 

 

 

 

 

 

 

 

 

 

 

faculty_id

 

faculty_name

office

phone

college

title

email

A52990

 

Black Anderson

MTC-218

750-378-9987

Virginia Tech

Professor

banderson@college.edu

A77587

 

Debby Angles

MTC-320

750-330-2276

University of Chicago

Associate Professor

dangles@college.edu

B66750

 

Alice Brown

MTC-257

750-330-6650

University of Florida

Assistant Professor

abrown@college.edu

B78880

 

Ying Bai

MTC-211

750-378-1148

Florida Atlantic University

Associate Professor

ybai@college.edu

B86590

 

Satish Bhalla

MTC-214

750-378-1061

University of Notre Dame

Associate Professor

sbhalla@college.edu

H99118

 

Jeff Henry

MTC-336

750-330-8650

Ohio State University

Associate Professor

jhenry@college.edu

J33486

 

Steve Johnson

MTC-118

750-330-1116

Harvard University

Distinguished Professor

sjohnson@college.edu

K69880

 

Jenney King

MTC-324

750-378-1230

East Florida University

Professor

jking@college.edu

18 Chapter 2 Introduction to Databases

be null. MS Access also makes provision for cascade update and cascade delete, which imply that changes made in one of the related tables will be reflected in the other of the two related tables.

Consider two tables Course and Faculty in the sample database, CSE_DEPT. The Course table has a foreign key entitled faculty_id, which is a primary key in the Faculty table. The two tables are logically related through the faculty_id link. Referential integrity rules imply that we may not add a record to the Course table with a faculty_id, which is not listed in the Faculty table. In other words, there must be a logical link between the two related tables. Second, if we change or delete a faculty_id in the Faculty table, it must reflect in the Course table, meaning that all records in the Course table must be modified using a cascade update or cascade delete (Table 2.7).

2.5 DEFINE RELATIONSHIPS

2.5.1 Connectivity

Connectivity refers to the types of relationships that entities can have. Basically it can be one-to-one, one-to-many, and many-to-many. In ER diagrams, these are indicated by placing 1, M, or N at one of the two ends of the relationship diagram. Figure illustrates the use of this notation.

A one-to-one (1 : 1) relationship occurs when one instance of entity A is related to only one instance of entity B. For example, user_name in the LogIn table and user_name in the Student table (Fig. 2.2).

Table 2.7. The Faculty and the Course Partial Data

course_id

course

faculty_id

CSC-132A

Introduction to Programming

J33486

CSC-132B

Introduction to Programming

B78880

CSC-230

Algorithms & Structures

A77587

CSC-232A

Programming I

B66750

CSC-232B

Programming I

A77587

CSC-233A

Introduction to Algorithms

H99118

CSC-233B

Introduction to Algorithms

K69880

CSC-234A

Data Structure & Algorithms

B78880

 

 

 

faculty_id

faculty_name

office

A52990

Black Anderson

MTC-218

A77587

Debby Angles

MTC-320

B66750

Alice Brown

MTC-257

B78880

Ying Bai

MTC-211

B86590

Satish Bhalla

MTC-214

H99118

Jeff Henry

MTC-336

J33486

Steve Johnson

MTC-118

K69880

Jenney King

MTC-324

 

 

 

LogIn

 

 

Student

 

 

 

user_name

pass_word

 

user_name

gpa

credits

student_id

ajade

tryagain

 

ajade

3.26

108

A97850

awoods

smart

 

awoods

3.57

116

A78835

bvalley

see

 

bvalley

3.52

102

B92996

hsmith

try

 

hsmith

3.87

78

H10210

jerica

excellent

 

jerica

3.95

127

J77896

 

 

 

 

 

 

 

Figure 2.2. One to one relationship in the LogIn and the Student tables.

Faculty

faculty_id faculty_name office

A52990

Black Anderson

MTC-218

A77587

Debby Angles

MTC-320

B66750

Alice Brown

MTC-257

B78880

Ying Bai

MTC-211

B86590

Satish Bhalla

MTC-214

H99118

Jeff Henry

MTC-336

J33486

Steve Johnson

MTC-118

K69880

Jenney King

MTC-324

2.5 Define Relationships 19

Course

 

 

 

course_id

course

faculty_id

CSC-132A

Introduction to Programming

J33486

CSC-132B

Introduction to Programming

B78880

CSC-230

Algorithms & Structures

A77587

CSC-232A

Programming I

B66750

CSC-232B

Programming I

A77587

CSC-233A

Introduction to Algorithms

H99118

CSC-233B

Introduction to Algorithms

K69880

CSC-234A

Data Structure & Algorithms

B78880

 

 

 

Figure 2.3. One-to-many relationship between Faculty and Course tables.

A one-to-many (1 : M) relationship occurs when one instance of entity A is associated with zero, one, or many instances of entity B. However, entity B is associated with only one instance of entity A. For example, one department can have many faculty members; each faculty member is assigned to only one department. In CSE_DEPT database, One-to-many relationship is represented by faculty_id in the Faculty table and faculty_id in the Course table, student_id in the Student table and student_id in the StudentCourse table, course_id in the Course table and course_id in the StudentCourse table (Fig. 2.3).

A many-to-many (M : N) relationship occurs when one instance of entity A is associated with zero, one, or many instances of entity B. And one instance of entity B is associated with zero, one, or many instance of entity A. For example, a student may take many courses, and one course may be taken by more than one student.

In CSE_DEPT database, a many-to-many relationship can be realized by using the third table. For example, in this case, the StudentCourse that works as the third table, set a many-to-many relationship between the Student and the Course tables (Fig. 2.4).

This database design assumes that the course table only contains courses taught by all faculty members in this department for one semester. Therefore, each course can only be taught by a unique faculty. If one wants to develop a Course table that contains courses taught by all faculty in more than one semester, the third table, say FacultyCourse table, should be created to set up a many-to-many relationship between the Faculty and the Course table, since one course may be taught by the different faculty for the different semester.

The relationships in CSE_DEPT database are summarized in Figure 2.5. Database name: CSE_DEPT

Five entities are:

LogIn

Faculty

Course

Student

StudentCourse

The relationships between these entities are shown below. P.K. and F.K represent the primary key and the foreign key, respectively.

20 Chapter 2 Introduction to Databases

Course

Student

student_id

student_name

gpa

credits

A78835

Andrew Woods

3.26

108

A97850

Ashly Jade

3.57

116

B92996

Blue Valley

3.52

102

H10210

Holes Smith

3.87

78

J77896

Erica Johnson

3.95

127

 

 

 

 

course_id

course

faculty_id

CSC-132A

Introduction to Programming

J33486

CSC-132B

Introduction to Programming

B78880

CSC-230

Algorithms & Structures

A77587

CSC-232A

Programming I

B66750

CSC-232B

Programming I

A77587

CSC-233A

Introduction to Algorithms

H99118

 

 

 

StudentCourse

s_course_id

student_id

course_id

credit

major

1000

H10210

CSC-131D

3

CE

1001

B92996

CSC-132A

3

CS/IS

1002

J77896

CSC-335

3

CS/IS

1003

A78835

CSC-331

3

CE

1004

H10210

CSC-234B

3

CE

1005

J77896

CSC-234A

3

CS/IS

1006

B92996

CSC-233A

3

CS/IS

Figure 2.4. Many-to-many relationship between Student and Course tables.

 

 

LogIn Table

 

 

one-to-many

 

 

 

 

Faculty Table

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

P.K.

 

 

F.K.

F.K.

 

 

P.K.

 

 

 

 

 

 

 

 

user_name

pass_word

faculty_id

student_id

 

 

 

faculty_id

 

name

office

college

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

one-to-many

 

 

 

 

 

 

 

 

 

 

one-to-many

 

Course Table

 

 

 

 

 

 

Student Table

 

 

 

 

 

 

 

 

 

 

 

 

 

P.K.

 

 

 

 

 

 

P.K.

 

 

 

F.K.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

student_id

name

 

 

major

 

gpa

 

 

 

 

 

 

course_id

 

 

course

 

faculty_id

 

credits

 

 

 

 

 

 

 

 

 

 

 

 

 

many-to-many

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

one-to-many

 

StudentCourse Table

 

one-to-many

 

 

 

 

 

 

 

 

 

 

 

 

 

P.K.

 

 

F.K.

F.K.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

s_course_id

 

student_id

course_id

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Figure 2.5. Relationships in CSE_DEPT database.

Figure 2.6 displays the Microsoft Access relationships diagram among various tables in the CSE_Dept database. One-to-many relationships are indicated by placing 1 at one end of the link and at the other. The many-to-many relationship between the Student and the Course table was broken down to two 1-to-many relationships by creating a new StudentCourse table.

2.7 Data Normalization 21

Figure 2.6. Relationships are illustrated using MS Access in the CSE_DEPT database.

2.6 ER NOTATION

There are a number of ER notations available, including Chen’s, Bachman, Crow’s foot, and a few others. There is no consensus on the symbols and the styles used to draw ERDs. A number of drawing tools are available to draw ERDs. These include ER Assistant, Microsoft Visio, and Smart Draw among others. Commonly used notations are shown in

Figure 2.7.

2.7 DATA NORMALIZATION

After identifying tables, attributes, and relationships, the next logical step in database design is to make sure that the database structure is optimum. Optimum structure is achieved by eliminating redundancies, various inefficiencies, and update and deletion anomalies that usually occur in the unnormalized or partially normalized databases. Data normalization is a progressive process. The steps in the normalization process are called normal forms. Each normal form progressively improves the database and makes it more efficient. In other words, a database that is in second normal form is better than the one in the first normal form, and the one in third normal form is better than the one in second normal form. To be in the third normal form, a database has to be in the first and second normal form. There are fourth and fifth normal forms, but for most practical purposes, a database meeting the criteria of third normal form is considered to be of good design.

22 Chapter 2 Introduction to Databases

Entity is represented by a rectangular box.

A weak entity is represented by a double rectangular box.

An attribute is represented by an oval

Relationship is represented by a diamond with lines connecting the entities involved

Cardinality is indicated by

 

1

M

 

placing 1, N, or M near the

 

 

 

 

entity it is associated with

 

 

 

 

 

 

 

 

A line links entities to attributes and relationships

Figure 2.7. Commonly used symbols for ER notation.

2.7.1 First Normal Form (1NF)

A table is in first normal form if values in each column are atomic, that is, there are no repeating groups of data.

The following Faculty table (Table 2.8) is not normalized. Some faculty members have more than one telephone number listed in the phone column. These are called repeating groups.

In order to convert this table to the First Normal Form (INF), the data must be atomic. In other words, the repeating rows must be broken into two or more atomic rows. Table 2.9 illustrates the Faculty table in 1NF, where repeating groups have been removed. Now, it is in INF.

2.7 Data Normalization 23

Table 2.8. Unnormalized Faculty table with repeating groups

faculty_id

faculty_name

office

phone

 

 

A52990

Black Anderson

MTC-218, SHB-205

750-378-9987, 555-255-8897

 

A77587

Debby Angles

MTC-320

750-330-2276

 

B66750

Alice Brown

MTC-257

750-330-6650

 

B78880

Ying Bai

MTC-211, SHB-105

750-378-1148, 555-246-4582

 

B86590

Satish Bhalla

MTC-214

750-378-1061

 

H99118

Jeff Henry

MTC-336

750-330-8650

 

J33486

Steve Johnson

MTC-118

750-330-1116

 

K69880

Jenney King

MTC-324

750-378-1230

 

 

 

 

 

 

Table 2.9. Normalized Faculty table

faculty_id

faculty_name

office

phone

A52990

Black Anderson

MTC-218

750-378-9987

A52990

Black Anderson

SHB-205

555-255-8897

A77587

Debby Angles

MTC-320

750-330-2276

B66750

Alice Brown

MTC-257

750-330-6650

B78880

Ying Bai

MTC-211

750-378-1148

B78880

Ying Bai

SHB-105

555-246-4582

B86590

Satish Bhalla

MTC-214

750-378-1061

H99118

Jeff Henry

MTC-336

750-330-8650

J33486

Steve Johnson

MTC-118

750-330-1116

K69880

Jenney King

MTC-324

750-378-1230

 

 

 

 

2.7.2 Second Normal Form (2NF)

A table is in second normal form if it is already in 1NF, and every nonkey column is fully dependent upon the primary key.

This implies that if the primary key consists of a single column, then the table in 1NF is automatically in 2NF. The second part of the definition implies that if the key is composite, then none of the nonkey columns will depend upon just one of the columns that participate in the composite key.

The Faculty table in Table 2.9 is in first normal form. However, it has a composite primary key, made up of faculty_id and office. The phone number depends on a part of the primary key, the office, and not on the whole primary key. This can lead to update and deletion anomalies mentioned above.

By splitting the old Faculty table (Fig. 2.8) into two new tables, Faculty and Office, we can remove the dependencies mentioned earlier. Now the faculty table has a primary key, faculty_id, and the Office table has a primary key, office. The nonkey columns in both tables now depend only on the primary keys only.

24 Chapter 2 Introduction to Databases

Old Faculty table in 1NF

faculty_id

faculty_name

office

phone

A52990

Black Anderson

MTC-218

750-378-9987

A52990

Black Anderson

SHB-205

555-255-8897

A77587

Debby Angles

MTC-320

750-330-2276

B66750

Alice Brown

MTC-257

750-330-6650

B78880

Ying Bai

MTC-211

750-378-1148

 

 

 

 

B78880

Ying Bai

SHB-105

555-246-4582

B86590

Satish Bhalla

MTC-214

750-378-1061

H99118

Jeff Henry

MTC-336

750-330-8650

J33486

Steve Johnson

MTC-118

750-330-1116

K69880

Jenney King

MTC-324

750-378-1230

 

 

 

 

New Faculty table New Office table

faculty_id faculty_name

A52990

Black Anderson

A52990

Black Anderson

A77587

Debby Angles

B66750

Alice Brown

B78880

Ying Bai

B78880

Ying Bai

B86590

Satish Bhalla

H99118

Jeff Henry

J33486

Steve Johnson

K69880

Jenney King

 

 

office

 

 

phone

 

faculty_id

 

 

 

 

 

 

 

 

 

 

 

MTC-218 750-378-9987 A52990

SHB-205

555-255-8897

A52990

MTC-320

750-330-2276

A77587

MTC-257

750-330-6650

B66750

MTC-211

750-378-1148

B78880

SHB-105

555-246-4582

B78880

MTC-214

750-378-1061

B86590

MTC-336

750-330-8650

H99118

MTC-118

750-330-1116

J33486

MTC-324

750-378-1230

K69880

Figure 2.8. Converting Faulty table into 2NF by decomposing the old table in two, Faculty and Office.

2.7.3 Third Normal Form (3NF)

A table is in third normal form if it is already in 2NF, and every nonkey column is nontransitively dependent upon the primary key. In other words, all nonkey columns are mutually independent, but at the same time, they are fully dependent upon the primary key only.

Another way of stating this is that in order to achieve 3NF, no column should depend upon any nonkey column. If column B depends on column A, then A is said to functionally determine column B; hence, the term determinant. Another definition of 3NF says that the table should be in 2NF, and only determinants it contains are candidate keys.

For the Course table in Table 2.10, all nonkey columns depend on the primary key— course_id. In addition, name and phone columns also depend on faculty_id. This table is

2.7 Data Normalization 25

Table 2.10. The old Course table

course_id

course

classroom

faculty_id

faculty_name

phone

CSC-131A

Computers in Society

TC-109

A52990

Black Anderson

750-378-9987

CSC-131B

Computers in Society

TC-114

B66750

Alice Brown

750-330-6650

CSC-131C

Computers in Society

TC-109

A52990

Black Anderson

750-378-9987

CSC-131D

Computers in Society

TC-109

B86590

Satish Bhalla

750-378-1061

CSC-131E

Computers in Society

TC-301

B66750

Alice Brown

750-330-6650

CSC-131I

Computers in Society

TC-109

A52990

Black Anderson

750-378-9987

CSC-132A

Introduction to Programming

TC-303

J33486

Steve Johnson

750-330-1116

CSC-132B

Introduction to Programming

TC-302

B78880

Ying Bai

750-378-1148

 

 

 

 

 

 

Table 2.11. The new Course table

course_id

course

classroom

 

CSC-131A

Computers in Society

TC-109

 

CSC-131B

Computers in Society

TC-114

 

CSC-131C

Computers in Society

TC-109

 

 

CSC-131D

Computers in Society

TC-109

 

CSC-131E

Computers in Society

TC-301

 

CSC-131I

Computers in Society

TC-109

 

CSC-132A

Introduction to Programming

TC-303

 

CSC-132B

Introduction to Programming

TC-302

 

Table 2.12. The new instructor table

faculty_id

faculty_name

phone

A52990

Black Anderson

750-378-9987

B66750

Alice Brown

750-330-6650

A52990

Black Anderson

750-378-9987

B86590

Satish Bhalla

750-378-1061

B66750

Alice Brown

750-330-6650

A52990

Black Anderson

750-378-9987

J33486

Steve Johnson

750-330-1116

B78880

Ying Bai

750-378-1148

A77587

Debby Angles

750-330-2276

 

 

 

in second normal form but it suffers from update, addition, and deletion anomalies because of transitive dependencies. In order to conform to third normal form, we can split this table into two tables, Course and Instructor (Tables 2.11 and 2.12). Now we have eliminated the transitive dependencies that are apparent in the Course table in Table 2.10.

26 Chapter 2 Introduction to Databases

2.8 DATABASE COMPONENTS IN SOME POPULAR DATABASES

All databases allow for storage, retrieval, and management of the data. Simple databases provide basic services to accomplish these tasks. Many database providers, like Microsoft SQL Server and Oracle, provide additional services that necessitate storing many components in the database other than data. These components, such as views, stored procedures, and so on, are collectively called database objects. In this section, we will discuss various objects that make up MS Access, SQL Server, and Oracle databases.

There are two major types of databases, File Server and Client Server:

In a File Server database, data is stored in a file, and each user of the database retrieves the data, displays the data, or modifies the data directly from or to the file. In a Client Server database, the data is also stored in a file, however, all these operations are mediated through a master program, called a server. MS Access is a File Server database, whereas Microsoft SQL Server and Oracle are Client Server databases. The Client Server databases have several advantages over the File Server databases.These include, minimizing chances of crashes, provision of features for recovery, enforcement of security, better performance, and more efficient use of the network compared with the file server databases.

2.8.1 Microsoft Access Databases

Microsoft Access Database Engine is a collection of information stored in a systematic way that forms the underlying component of a database. Also called a Jet (Joint Engine Technology), it allows the manipulation of relational database. It offers a single interface that other software may use to access Microsoft databases. The supporting software is developed to provide security, integrity, indexing, record locking, and so on. By executing MS Access program, MSACCESS.EXE, you can see the database engine at work and the user interface it provides. Figure 2.9 shows how a Java application accesses the MS Access database via ACE OLE database provider.

Access

Access

Database

Database

JDBC

JDBC

Driver

Driver

Java

Java

Applications

Applications

Figure 2.9. Microsoft Access database illustration.

Соседние файлы в предмете [НЕСОРТИРОВАННОЕ]