Descriptive statistics | Lecture 4, Chapter 3 of the Applied Statistics course | International University, Vietnam National University Ho Chi Minh City


APPLIED STATISTICS
COURSE CODE: ENEE1006IU
Lecture 4:
Chapter 3: Descriptive statistics
(3 credits: 2 for lectures, 1 for lab work)
Instructor: TRAN THANH TU
Email: tttu@hcmiu.edu.vn
REVIEW PREVIOUS LECTURES
Examples of interval and ratio scales:
Ratio scale: a weight of 4 grams is twice as heavy as a weight of 2 grams.
Interval scale: a temperature of 10 degrees C should not be considered twice as hot as 5 degrees C. If it were, a conflict would be created because 10 degrees C is 50 degrees F and 5 degrees C is 41 degrees F. Clearly, 50 degrees is not twice 41 degrees.
A pH of 3 is not twice as acidic as a pH of 6, because pH is not a ratio variable.
A ratio scale doesn't have negative numbers, because of its zero-point feature.
Division between two values has meaning (besides subtraction, as in the interval scale).
A ratio scale allows unit conversion (e.g. kg/g, calories).
TODAY’S CONTENT
3.1. Measures of location
3.2. Measures of variability
3.1. MEASURES OF LOCATION
•Mean
•Weighted Mean
•Median
•Geometric Mean
•Mode
•Percentiles
•Quartiles
•Mean (average value): provides a measure of central location for the data
- If the data are for a sample, the mean is denoted by $\bar{x}$:
$\bar{x} = \frac{\sum x_i}{n}$
n: number of observations in the sample
- If the data are for a population, the mean is denoted by the Greek letter µ:
$\mu = \frac{\sum x_i}{N}$
N: total number of observations in the population
Mean: Example
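A minimal sketch of the sample mean computation. The data values and the helper name `sample_mean` are made up for illustration and are not taken from the lecture.

```python
# Hypothetical sample of n = 5 observations (illustrative values only).
sample = [12, 15, 9, 20, 14]


def sample_mean(values):
    """Return the sum of the observations divided by their count."""
    if not values:
        raise ValueError("the mean needs at least one observation")
    return sum(values) / len(values)


print(sample_mean(sample))  # sum is 70, n is 5
```

The same function gives the population mean when `values` holds every observation in the population; only the notation (µ and N) changes.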
•Weighted Mean: among n observations, each observation $x_i$ carries its own weight $w_i$:
$\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$
(Mean: each observation shares the same weight w = 1/n)
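A short sketch of a weighted mean, assuming the usual definition (sum of weight times value, divided by the sum of the weights). The cost and quantity figures are hypothetical.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical example: cost per unit, weighted by the quantity purchased.
costs = [3.0, 3.5, 2.5]
quantities = [100, 200, 700]

print(weighted_mean(costs, quantities))
# With equal weights the result is the ordinary mean.
print(weighted_mean(costs, [1, 1, 1]))
```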
•Median: Arrange the data in ascending order (smallest value to largest value):
(a) For an odd number of observations, the median is the middle value.
(b) For an even number of observations, the median is the average of the two middle values.
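The two cases can be sketched as follows. The function name `median` and the sample lists are illustrative choices.

```python
def median(values):
    """Median of a list: middle value (odd n) or mean of the two middle values (even n)."""
    ordered = sorted(values)  # arrange in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("the median needs at least one observation")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # case (a): odd number of observations
    return (ordered[mid - 1] + ordered[mid]) / 2  # case (b): even number


print(median([7, 1, 5]))     # odd n: middle of 1, 5, 7
print(median([8, 2, 4, 6]))  # even n: average of 4 and 6
```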
•Geometric Mean: a measure of location calculated by finding the nth root of the product of n values:
$\bar{x}_g = \sqrt[n]{(x_1)(x_2)\cdots(x_n)}$
It is often used in analyzing growth rates.
- Other common applications: changes in populations of species, crop yields, pollution levels, birth and death rates, etc.
- Also note that the geometric mean can be applied to changes that occur over any number of successive periods of any length.
- In addition to annual changes, the geometric mean is often applied to find the mean rate of change over quarters, months, weeks, and even days.
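A minimal sketch of the geometric mean applied to growth rates. The growth factors are hypothetical (a factor of 1.10 means +10% in that period), and the function name is an illustrative choice.

```python
import math


def geometric_mean(values):
    """Return the nth root of the product of n positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("the geometric mean needs positive values")
    return math.prod(values) ** (1 / len(values))


# Hypothetical growth factors for three successive periods: +10%, -5%, +20%.
growth_factors = [1.10, 0.95, 1.20]

mean_factor = geometric_mean(growth_factors)
mean_rate = mean_factor - 1  # mean rate of change per period
print(f"mean growth factor: {mean_factor:.4f}, mean rate: {mean_rate:.2%}")
```

Applying the mean factor for three periods reproduces the overall change, which is why the geometric mean rather than the arithmetic mean is used for rates of change.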
•Mode: the value that occurs with the greatest frequency
Mean = ? Median = ? Mode = ?
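A small sketch of finding the mode by counting frequencies. Because a data set can have more than one most frequent value, this hypothetical helper returns all of them. The data are illustrative.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the greatest frequency, in sorted order."""
    counts = Counter(values)
    if not counts:
        raise ValueError("the mode needs at least one observation")
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


print(modes([1, 2, 2, 3]))     # a single mode
print(modes([1, 1, 2, 2, 3]))  # two values tie for the greatest frequency
```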
•Percentiles: provide information about how the data are spread over the interval from the smallest value to the largest value.
•Location of the pth percentile: $L_p$
Calculate the value of the pth percentile based on $L_p$
50th percentile = ?
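A minimal sketch of turning a location $L_p$ into a percentile value. It assumes the (n + 1) convention, $L_p = \frac{p}{100}(n + 1)$, with linear interpolation between neighbouring ordered values. That convention is an assumption here, and other conventions give slightly different answers. The data are made up.

```python
import math


def percentile(values, p):
    """pth percentile using the assumed location L_p = (p / 100) * (n + 1)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0 or not 0 < p < 100:
        raise ValueError("need data and 0 < p < 100")
    location = p / 100 * (n + 1)  # 1-based position in the ordered data
    lower = math.floor(location)
    fraction = location - lower
    if lower < 1:       # location falls before the first observation
        return ordered[0]
    if lower >= n:      # location falls at or after the last observation
        return ordered[-1]
    # Interpolate between positions `lower` and `lower + 1` (1-based).
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


data = [10, 20, 30, 40, 50, 60, 70]  # hypothetical observations, n = 7
print(percentile(data, 50))  # L_50 = 0.5 * 8 = 4, the 4th ordered value
```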
•Quartiles: it is often desirable to divide a data set into four parts, with each part containing approximately one-fourth, or 25%, of the observations. These division points are referred to as the quartiles and are defined as follows:
$Q_1$ = first quartile, or 25th percentile (Location: $L_{25}$)
$Q_2$ = second quartile, or 50th percentile (also the median) (Location: $L_{50}$)
$Q_3$ = third quartile, or 75th percentile (Location: $L_{75}$)
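As an illustration, Python's standard library can compute the three quartiles directly. The sketch uses `statistics.quantiles` with `method="exclusive"`, which places cut points using n + 1 positions. Whether that matches the location rule intended in the lecture is an assumption, and the data are hypothetical.

```python
import statistics

data = [10, 20, 30, 40, 50, 60, 70]  # hypothetical observations

# n=4 splits the data into four parts and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

print(q1, q2, q3)
# The second quartile is the median.
print(q2 == statistics.median(data))
```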
Mean = ? Median = ? Mode = ?
25th percentile = ? 50th percentile = ? 75th percentile = ?
End of file 1.
Any questions?
3.2. MEASURES OF VARIABILITY
•Range: the simplest measure of variability
Range = Largest value − Smallest value
•It is seldom used as the only measure, because the range is based on only two of the observations and thus is highly influenced by extreme values.
•Interquartile Range (IQR): a measure of variability that overcomes the dependency on extreme values. It is the difference between the third quartile, $Q_3$, and the first quartile, $Q_1$:
$\text{IQR} = Q_3 - Q_1$
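A short sketch contrasting the two measures on made-up data. Replacing the largest value with an extreme one changes the range a lot and leaves the interquartile range untouched. Quartiles come from `statistics.quantiles` with `method="exclusive"`, which is an assumed convention, and the helper names are illustrative.

```python
import statistics


def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


def interquartile_range(values):
    """Q3 minus Q1, with quartiles from the assumed 'exclusive' method."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return q3 - q1


data = [10, 20, 30, 40, 50, 60, 70]           # hypothetical observations
with_outlier = [10, 20, 30, 40, 50, 60, 700]  # same data, one extreme value

print(value_range(data), interquartile_range(data))
print(value_range(with_outlier), interquartile_range(with_outlier))
```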
•Variance: a measure of variability that utilizes all the data. It is based on the difference between the value of each observation ($x_i$) and the mean (the deviation about the mean):
- For a sample, a deviation about the mean is written $(x_i - \bar{x})$
- For a population, it is written $(x_i - \mu)$
•Population variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
•Sample variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
(If the sum of the squared deviations about the sample mean is divided by n − 1, and not n, the resulting sample variance provides an unbiased estimate of the population variance.)
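A minimal sketch of both variance formulas, written out by hand and then checked against Python's `statistics` module. The data values and function names are illustrative, not from the lecture.

```python
import statistics


def population_variance(values):
    """Sum of squared deviations about the mean, divided by N."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations about the sample mean, divided by n - 1."""
    if len(values) < 2:
        raise ValueError("the sample variance needs at least two observations")
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations, mean = 5

print(population_variance(data), statistics.pvariance(data))
print(sample_variance(data), statistics.variance(data))
```

The sample variance is always the larger of the two for the same data, since the same sum of squared deviations is divided by n − 1 instead of n.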