
1.5.4. Interpretation of the population standard deviation

Often in statistical studies we are interested in specifying the percentage of items in a data set that lie within some specified interval when only the mean and standard deviation for the data set are known. Two rules are commonly used for forming such estimates.

The first is true for any data set.

Chebyshev’s theorem:

For any set of data and any $k > 1$, at least $100\left(1 - \frac{1}{k^2}\right)\%$ of the values in the data set must be within plus or minus $k$ standard deviations of the mean.

Remark:

In applying Chebyshev’s theorem we treat every data set as if it were a population, and the formula for a population standard deviation is used.
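The remark can be illustrated numerically. The following Python sketch computes the population standard deviation of the wage data from exercise 2 of this section and compares the observed share of values within $k$ standard deviations of the mean with the Chebyshev lower bound $1 - 1/k^2$. The function name `share_within` is our own choice for illustration.

```python
import statistics

def share_within(data, k):
    """Fraction of values within k population standard deviations of the mean."""
    mu = statistics.mean(data)
    sigma = statistics.pstdev(data)  # population formula, as the remark requires
    inside = [x for x in data if abs(x - mu) <= k * sigma]
    return len(inside) / len(data)

# Hourly wage rates of all 12 employees (exercise 2 of this section).
wages = [21, 22, 27, 36, 22, 29, 22, 23, 22, 28, 36, 33]

for k in (1.5, 2, 2.5):
    bound = 1 - 1 / k**2
    print(f"k = {k}: observed {share_within(wages, k):.3f}, bound {bound:.3f}")
```

In each case the observed share is at least as large as the bound, as the theorem requires; the bound is a guarantee, not an estimate, so the observed share is often much larger.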

$k$                       1.5      2      2.5
$100(1 - 1/k^2)\%$        55.6%    75%    84%

According to Chebyshev’s theorem, at least 55.6% of the population data lie within 1.5 standard deviations of the mean, at least 75% lie within 2 standard deviations of the mean, and so on.
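The percentages above follow directly from the bound $1 - 1/k^2$. As a minimal sketch (the function name `chebyshev_bound` is ours), they can be reproduced as follows:

```python
def chebyshev_bound(k):
    """Minimum fraction of any data set within k standard deviations of the mean."""
    if k <= 1:
        # For k <= 1 the bound 1 - 1/k^2 is zero or negative and says nothing.
        raise ValueError("Chebyshev's theorem is informative only for k > 1")
    return 1 - 1 / k**2

for k in (1.5, 2, 2.5, 3):
    print(f"k = {k}: at least {100 * chebyshev_bound(k):.1f}% of the values")
```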

Example:

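Let $\mu = 70$, $\sigma = 1.5$.

If we let $k = 3$, from $1 - \frac{1}{k^2}$ we obtain that $1 - \frac{1}{9} = \frac{8}{9} \approx 0.8889$.

The theorem states that at least 88.89% of data values will fall within 3 standard deviations of the mean, that is, within $\mu \pm 3\sigma$, or between

$\mu - 3\sigma = 70 - 3(1.5) = 65.5$ and $\mu + 3\sigma = 70 + 3(1.5) = 74.5$.

For $\mu = 70$, $\sigma = 1.5$, at least 88.89% of the data values fall between 65.5 and 74.5.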
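The same calculation can be sketched in Python. The values $\mu = 70$ and $\sigma = 1.5$ are assumed here because they reproduce the interval from 65.5 to 74.5 for $k = 3$; the function names are ours. The second function inverts the bound, which is what one needs when a required percentage is given and $k$ has to be found.

```python
import math

def chebyshev_interval(mean, sd, k):
    """Interval mean +/- k*sd that holds at least 1 - 1/k^2 of the data."""
    return mean - k * sd, mean + k * sd

def k_for_share(share):
    """Value of k for which the Chebyshev bound 1 - 1/k^2 equals `share`."""
    if not 0 < share < 1:
        raise ValueError("share must be strictly between 0 and 1")
    return 1 / math.sqrt(1 - share)

low, high = chebyshev_interval(70, 1.5, 3)
print(low, high)                   # interval for k = 3
print(round(k_for_share(0.75), 4)) # k needed to guarantee 75%
```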

Rule of Thumb.

When a distribution is bell-shaped, the following statements, known as the rule of thumb, hold:

Approximately 68% of the population members lie within one standard deviation of the mean.

Approximately 95% of the population members lie within two standard deviations of the mean.

Approximately 99.7% of the population members lie within three standard deviations of the mean.

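For example, suppose that scores on an entrance exam have a mean of 480 and a standard deviation of 90. If these scores are normally distributed, then approximately 68% will fall between 390 and 570:

$\mu - \sigma = 480 - 90 = 390$ and $\mu + \sigma = 480 + 90 = 570$.

Approximately 95% of the scores will fall between 300 and 660:

$\mu - 2\sigma = 480 - 2(90) = 300$ and $\mu + 2\sigma = 480 + 2(90) = 660$.

Approximately 99.7% of the scores will fall between 210 and 750:

$\mu - 3\sigma = 480 - 3(90) = 210$ and $\mu + 3\sigma = 480 + 3(90) = 750$.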
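The three intervals for the entrance exam scores can be reproduced with a short Python sketch. The function names are ours; the second function uses `statistics.NormalDist` from the standard library to show where the approximate percentages 68%, 95% and 99.7% come from when the distribution is exactly normal.

```python
from statistics import NormalDist

def rule_of_thumb_intervals(mean, sd):
    """Map each approximate percentage to the interval mean +/- k*sd."""
    coverage = {1: 68.0, 2: 95.0, 3: 99.7}
    return {pct: (mean - k * sd, mean + k * sd) for k, pct in coverage.items()}

def normal_share(k):
    """Exact share of a normal distribution within k standard deviations."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for pct, (low, high) in rule_of_thumb_intervals(480, 90).items():
    print(f"approximately {pct}% of scores between {low} and {high}")

for k in (1, 2, 3):
    print(f"k = {k}: exact normal share {normal_share(k):.4f}")
```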

1.5.5. The interquartile range

Quartiles are the summary measures that divide a ranked data set into four equal parts. Three measures will divide any data set into four equal parts. These three measures are the first quartile (denoted by $Q_1$), the second quartile (denoted by $Q_2$), which is the median, and the third quartile (denoted by $Q_3$). The data should be ranked in increasing order before the quartiles are determined. The quartiles are defined as follows:

$Q_1$ = the $0.25(n+1)$th ordered observation,

$Q_3$ = the $0.75(n+1)$th ordered observation,

where $n$ is the number of observations.

The difference between the third and the first quartiles gives the interquartile range. That is,

$\text{IQR} = Q_3 - Q_1$.
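The definition can be sketched in Python as follows. The sketch assumes the position rule $0.25(n+1)$ and $0.75(n+1)$ for the first and third quartiles, which is the rule the worked examples in this section follow, and resolves a fractional position by linear interpolation between neighbouring ordered observations. All function names and the data are ours, chosen only for illustration.

```python
import math

def ordered_observation(sorted_data, position):
    """Value at a 1-based, possibly fractional, position in sorted data."""
    lower = math.floor(position)
    fraction = position - lower
    value = sorted_data[lower - 1]
    if fraction > 0:
        # Move the given fraction of the way towards the next observation.
        value += fraction * (sorted_data[lower] - sorted_data[lower - 1])
    return value

def quartiles(data):
    """First and third quartiles under the assumed (n + 1) position rule."""
    xs = sorted(data)
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three observations")
    return (ordered_observation(xs, 0.25 * (n + 1)),
            ordered_observation(xs, 0.75 * (n + 1)))

def interquartile_range(data):
    q1, q3 = quartiles(data)
    return q3 - q1

# Hypothetical data: n = 7, so the quartiles are the 2nd and 6th observations.
sample = [7, 1, 9, 4, 12, 3, 8]
print(quartiles(sample), interquartile_range(sample))
```

Other textbooks and software packages use slightly different position rules, so quartiles computed elsewhere may differ a little from the values obtained with this rule.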

Example:

A teacher gives a 20-point test to 10 students. The scores are shown below

18, 15, 12, 6, 8, 2, 3, 5, 20, 10

Find the interquartile range.

Solution:

First, we rank the given data in increasing order:

2, 3, 5, 6, 8, 10, 12, 15, 18, 20

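$Q_1$ = the $0.25(n+1) = 0.25(10+1) = 2.75$th ordered observation.

Hence, the first quartile is three-quarters of the way from the second observation (3) to the third (5). Therefore,

First quartile $= 3 + 0.75(5 - 3) = 4.5$.

Similarly, since $Q_3$ = the $0.75(n+1) = 0.75(10+1) = 8.25$th ordered observation, the third quartile is one-quarter of the way from the eighth observation (15) to the ninth observation (18). Thus we have

Third quartile $= 15 + 0.25(18 - 15) = 15.75$.

Finally, the interquartile range is the difference between the third and first quartiles:

Interquartile range $= 15.75 - 4.5 = 11.25$.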

Example:

The following are the ages of nine employees of an insurance company

47, 28, 39, 51, 33, 37, 59, 24, 33

Find the interquartile range.

Solution:

Let us arrange the data in order from smallest to largest

24, 28, 33, 33, 37, 39, 47, 51, 59

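Since $n = 9$, $Q_1$ is the $0.25(9+1) = 2.5$th ordered observation and $Q_3$ is the $0.75(9+1) = 7.5$th ordered observation, so

$Q_1 = 28 + 0.5(33 - 28) = 30.5$ and $Q_3 = 47 + 0.5(51 - 47) = 49$.

The interquartile range is

$\text{IQR} = Q_3 - Q_1 = 49 - 30.5 = 18.5$.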
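Both examples can be cross-checked with Python's standard library. This is a sketch, not part of the original solution: `statistics.quantiles` with `method="exclusive"` places the quartiles at positions $0.25(n+1)$, $0.5(n+1)$ and $0.75(n+1)$ and interpolates linearly, which matches the rule used in the first example. The helper name `iqr` is ours.

```python
import statistics

def iqr(data):
    """Interquartile range using the (n + 1) position rule."""
    q1, _median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return q3 - q1

scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]    # test scores of 10 students
ages = [47, 28, 39, 51, 33, 37, 59, 24, 33]     # ages of nine employees

print(statistics.quantiles(scores, n=4, method="exclusive"), iqr(scores))
print(statistics.quantiles(ages, n=4, method="exclusive"), iqr(ages))
```

Under this rule the interquartile range is 11.25 for the test scores and 18.5 for the ages.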

Exercises

1. Fifteen students were selected randomly and asked how many hours each studied for the final exam in statistics. Their answers are recorded here

8, 6, 3, 0, 0, 5, 9, 2, 1, 3, 7, 10, 0, 3, 6

a) Find the range

b) Find the mean absolute deviation

c) Find the sample variance and sample standard deviation

d) Find the interquartile range.

2. The following data give the hourly wage rate of all 12 employees of a small company

21, 22, 27, 36, 22, 29, 22, 23, 22, 28, 36, 33

a) Find the population variance and standard deviation

b) Find the mean absolute deviation

c) Find the range

d) Find the interquartile range.

3. The number of words printed in each of 12 randomly selected storybooks for children is listed below

502, 213, 335, 197, 414, 469, 497, 367, 409, 297, 309, 414

a) Find the sample variance and sample standard deviation

b) Find the range

c) Find the mean absolute deviation

d) Find the interquartile range.

4. The weights of a sample of nine football players are recorded as follows:

78, 72, 68, 73, 75, 69, 74, 73, 72

a) Find the range

b) Find the variance

c) Find the standard deviation

5. The following data give the number of cars that stopped at a service station during each of the 10 hours observed

29, 35, 42, 31, 24, 18, 16, 27, 39, 34

Find the range, variance, and standard deviation.

6. The following data give the number of new cars sold at a dealership during a 12-day period

13, 5, 9, 6, 8, 11, 9, 15, 4, 11, 7, 5

Find the range, variance, standard deviation, and interquartile range.

7. Consider the following two data sets:

Data set I : 12, 25, 37, 8, 41

Data set II: 19, 32, 44, 15, 48

Notice that each value of the second data set is obtained by adding 7 to the corresponding value of the first data set. Calculate the standard deviation for each of these two data sets using the formula for sample data. Comment on the relationship between the standard deviations.

8. Consider the following two data sets:

Data set I : 4, 8, 15, 9, 11

Data set II: 8, 16, 30, 18, 22

Notice that each value of the second data set is obtained by multiplying the corresponding value of the first data set by 2. Calculate the standard deviation for each of these data sets using the formula for the sample data. Comment on the relationship between the standard deviations.

9. The numbers of patients treated at a hospital per day are shown below. The data are from a random sample of 12 days:

45, 50, 36, 59, 28, 42, 55, 67, 33, 35, 40, 50

Compute the mean, median, mode, range, variance, and standard deviation for these data.

10. Light bulbs manufactured by a well-known electrical equipment firm are known to have a mean life of 800 hours with a standard deviation of 100 hours.

a) Find a range in which it can be guaranteed that at least 84% of the lifetimes of these light bulbs lie.

b) Using the rule of thumb, find a range in which it can be estimated that approximately 68% of the lifetimes of these light bulbs lie.

11. Tires of a particular brand have lifetimes with a mean of 29,000 km and a standard deviation of 3,000 km.

a) Find a range in which it can be guaranteed that at least 75% of the lifetimes of tires of this brand lie.

b) Using the rule of thumb, find a range in which it can be estimated that approximately 95% of the lifetimes of tires of this brand lie.

12. The mean of a distribution is 20 and the standard deviation is 2.

Use Chebyshev’s theorem to answer:

a) At least what percentage of the values will fall between 10 and 30 ?

b) At least what percentage of the values will fall between 12 and 28 ?

13. A sample of hourly wages of employees who work in restaurants in a large city has a mean of $5.02 and a standard deviation of $0.09. Using Chebyshev’s theorem, find the range in which at least 75% of the data will lie.

14. The scores on a special test of knowledge have a mean of 95 and a standard deviation of 2. Using Chebyshev’s theorem, find the range in which at least 88.89% of the data will fall.

15. During a recent football season, it was reported that the average attendance for games was 45,000. The standard deviation of the attendance figure was 4,000. Use Chebyshev’s theorem to answer the following:

a) Develop an interval that contains the attendance figure for at least 75% of the games.

b) The commissioner claims that at least 90% of the games had attendances between 29,000 and 61,000. Is this statement warranted given the information we have?
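Answers to exercises of this kind can be checked numerically. The following Python sketch is one way to do so; the function name `dispersion_summary` and its `population` flag are ours. It assumes that the mean absolute deviation is the average of the absolute deviations from the mean, and it uses the sample formulas (divisor $n - 1$) unless the data are declared to be a population. It is shown here on the data of exercise 4.

```python
import math
import statistics

def dispersion_summary(data, population=False):
    """Range, mean absolute deviation, variance and standard deviation."""
    mean = statistics.mean(data)
    if population:
        variance = statistics.pvariance(data)   # divisor n
    else:
        variance = statistics.variance(data)    # divisor n - 1
    return {
        "range": max(data) - min(data),
        "mean_absolute_deviation": sum(abs(x - mean) for x in data) / len(data),
        "variance": float(variance),
        "standard_deviation": math.sqrt(variance),
    }

# Weights of a sample of nine football players (exercise 4).
weights = [78, 72, 68, 73, 75, 69, 74, 73, 72]
for name, value in dispersion_summary(weights).items():
    print(f"{name}: {value:.2f}")
```

For data that form a whole population, as in exercise 2, the call would be `dispersion_summary(data, population=True)`.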

Answers

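1. a) 10; b) 2.88; c) 11.3; 3.4; d) 6

2. a) 29.52; 5.43; b) 4.75; c) 15; d) 10

3. a) 10325.9; 101.6; b) 305; c) 82.25; d) 155.25

4. a) 10; b) 9; c) 3

5. range 26; sample variance 72.28; sample standard deviation 8.50

6. range 11; sample variance 11.72; sample standard deviation 3.42; interquartile range 5.75

7. $s = 14.64$ for both data sets

8. $s_1 = 4.04$ and $s_2 = 8.07$

9. 45; 43.5; 50; 39; 134.36; 11.59

10. a) 550-1050; b) 700-900

11. a) 23,000-35,000; b) 23,000-35,000

12. a) 96%; b) 93.75%

13. $4.84-$5.20

14. 89-101

15. a) 37,000-53,000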
