Biostatistics - Probability and Statistics | Duy Tan University

Learning Objectives
Recognize the advantages and disadvantages of nonparametric statistics.
Understand how to use the runs test to test for randomness.
Know when and how to use the Mann-Whitney U test, the Wilcoxon matched-pairs signed rank test, the Kruskal-Wallis test, and the Friedman test.

Parametric vs. Nonparametric Statistics
Parametric statistics are statistical techniques based on assumptions about the population from which the sample data are collected.
They assume that the data being analyzed are randomly selected from a normally distributed population.
They require quantitative measurement that yields interval- or ratio-level data.
Nonparametric statistics are based on fewer assumptions about the population and the parameters.
They are sometimes called "distribution-free" statistics.
A variety of nonparametric statistics are available for use with nominal or ordinal data.

Advantages of Nonparametric Techniques
Sometimes there is no parametric alternative to the use of nonparametric statistics.
Certain nonparametric tests can be used to analyze nominal data.
Certain nonparametric tests can be used to analyze ordinal data.
The computations for nonparametric statistics are usually less complicated than those for parametric statistics, particularly for small samples.
Probability statements obtained from most nonparametric tests are exact probabilities.

Disadvantages of Nonparametric Statistics
Nonparametric tests can be wasteful of data if parametric tests are available for use with the data.
Nonparametric tests are usually not as widely available and well known as parametric tests.
For large samples, the calculations for many nonparametric statistics can be tedious.

Mann-Whitney U Test
Mann-Whitney U test: a nonparametric counterpart of the t test used to compare the means of two independent populations.
Nonparametric counterpart of the t test for independent samples
Does not require normally distributed populations
May be applied to ordinal data
Assumptions: independent samples; at least ordinal data

Mann-Whitney U Test: Sample Size Consideration
Size of sample one: n1
Size of sample two: n2
If both n1 and n2 are ≤ 10, the small sample procedure is appropriate.
If either n1 or n2 is greater than 10, the large sample procedure is appropriate.

Mann-Whitney U Test: Small Sample Example
H0: µ1 = µ2
Ha: µ1 ≠ µ2
α = .05
If the final p-value < .05, reject H0.

Drug A    Drug B
20.10     26.19
19.80     23.88
22.36     25.50
18.75     21.64
21.90     24.85
22.96     25.30
20.75     24.12
          23.45

Compensation   Rank   Group
18.75            1    H
19.80            2    H
20.10            3    H
20.75            4    H
21.64            5    E
21.90            6    H
22.36            7    H
22.96            8    H
23.45            9    E
23.88           10    E
24.12           11    E
24.85           12    E
25.30           13    E
25.50           14    E
26.19           15    E
Group H marks values from the Drug A column and group E marks values from the Drug B column.

W1 = 1 + 2 + 3 + 4 + 6 + 7 + 8 = 31
W2 = 5 + 9 + 10 + 11 + 12 + 13 + 14 + 15 = 89
U1 = n1 n2 + n1(n1 + 1)/2 - W1 = (7)(8) + (7)(8)/2 - 31 = 53
U2 = n1 n2 + n2(n2 + 1)/2 - W2 = (7)(8) + (8)(9)/2 - 89 = 3
Since U2 < U1, U = 3.
p-value = .0011 × 2 (for a two-tailed test) = .0022 < .05, reject H0.
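
The rank sums and U values in this example can be checked with a short script. This is a minimal sketch in plain Python; the helper names `average_ranks` and `mann_whitney_u` are illustrative and do not come from the slides, and the p-value lookup (a table lookup in the slides) is not reproduced.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample1, sample2):
    """Return (W1, W2, U1, U2, U) where U is the smaller of U1 and U2."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    w1 = sum(ranks[:n1])
    w2 = sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - w1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - w2
    return w1, w2, u1, u2, min(u1, u2)


drug_a = [20.10, 19.80, 22.36, 18.75, 21.90, 22.96, 20.75]
drug_b = [26.19, 23.88, 25.50, 21.64, 24.85, 25.30, 24.12, 23.45]

w1, w2, u1, u2, u = mann_whitney_u(drug_a, drug_b)
print(w1, w2, u1, u2, u)  # 31.0 89.0 53.0 3.0 3.0
```

The printed values match W1 = 31, W2 = 89, and U = 3 from the example.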

Mann-Whitney U Test: Formulas for Large Sample Case
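
The formulas themselves are not reproduced in the text of this slide. As a sketch, the usual textbook normal approximation for the large sample case is given below; it is an assumption that the slide showed this form, with W_1 denoting the sum of the ranks of sample one.

```latex
\begin{aligned}
U &= n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - W_1 \\
\mu_U &= \frac{n_1 n_2}{2} \\
\sigma_U &= \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} \\
z &= \frac{U - \mu_U}{\sigma_U}
\end{aligned}
```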

Example: Mann-Whitney U for Large Samples
PBS       Non-PBS
24,500    41,000
39,400    32,500
36,800    33,000
44,300    21,000
57,960    40,500
32,000    32,400
61,000    16,000
34,000    21,500
43,500    39,500
55,000    27,600
39,000    43,500
62,500    51,900
61,400    27,800
53,000
n1 = 14
n2 = 13

Ranks of Income from Combined Groups of PBS and Non-PBS Viewers
Data value   Rank   Group       Data value   Rank   Group
16,000        1     Non-PBS     39,500       15     Non-PBS
21,000        2     Non-PBS     40,500       16     Non-PBS
21,500        3     Non-PBS     41,000       17     Non-PBS
24,500        4     PBS         43,000       18     PBS
27,600        5     Non-PBS     43,500       19.5   PBS
27,800        6     Non-PBS     43,500       19.5   Non-PBS
32,000        7     PBS         51,900       21     Non-PBS
32,400        8     Non-PBS     53,000       22     PBS
32,500        9     Non-PBS     55,000       23     PBS
33,000       10     Non-PBS     57,960       24     PBS
34,000       11     PBS         61,000       25     PBS
36,800       12     PBS         61,400       26     PBS
39,000       13     PBS         62,500       27     PBS
39,400       14     PBS

PBS and Non-PBS: Calculation of U and Conclusion
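
The calculation for this example is not reproduced in the text, so the following is a hedged sketch rather than the slide's own working. It applies the standard large-sample normal approximation to the incomes as they appear in the ranking table (which lists 43,000 for the PBS value shown as 44,300 in the raw data listing). The helper name `average_ranks` and the significance level α = .05 are assumptions.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


# Incomes taken from the ranking table (assumption: 43,000, not 44,300).
pbs = [24500, 39400, 36800, 43000, 57960, 32000, 61000,
       34000, 43500, 55000, 39000, 62500, 61400, 53000]
non_pbs = [41000, 32500, 33000, 21000, 40500, 32400, 16000,
           21500, 39500, 27600, 43500, 51900, 27800]

n1, n2 = len(pbs), len(non_pbs)
ranks = average_ranks(pbs + non_pbs)
w1 = sum(ranks[:n1])                      # rank sum of the PBS group

u = n1 * n2 + n1 * (n1 + 1) / 2 - w1      # U computed from sample one
mu_u = n1 * n2 / 2
sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (u - mu_u) / sigma_u

print(f"W1 = {w1}, U = {u}, z = {z:.2f}")
# Assumed two-tailed test at alpha = .05: reject H0 when |z| > 1.96.
print("reject H0" if abs(z) > 1.96 else "do not reject H0")
```

Under these assumptions W1 = 245.5 and U = 41.5, and z comes out near -2.40, which would lead to rejecting H0 in a two-tailed test at α = .05.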

Wilcoxon Matched-Pairs Signed Rank Test
The differences of the scores of the two matched samples are computed.
The differences are ranked, ignoring the sign.
The ranks are given the sign of the difference.
The positive ranks are summed.
The negative ranks are summed.
T is the smaller sum of ranks.

Wilcoxon Matched-Pairs Signed Rank Test: Sample Size Consideration
n is the number of matched pairs.
If n > 15, T is approximately normally distributed, and a Z test is used.
If n ≤ 15, a special "small sample" procedure is followed.
The paired data are randomly selected.
The underlying distributions are symmetric.

Wilcoxon Matched-Pairs Signed Rank Test: Small Sample Example
Consider the survey by American Demographics that estimated the average annual household spending on health care. The U.S. metropolitan average was $1,800. Suppose six families in Pittsburgh, Pennsylvania, are matched demographically with six families in Oakland, California, and their amounts of household spending on health care for last year are obtained.

H0: Md = 0
Ha: Md ≠ 0
n = 6
α = 0.05
If T observed ≤ 1, reject H0.

Family Pair   Sample A   Sample B      d    Rank
1             1,950      1,760       190     +4
2             1,840      1,870       -30     -1
3             2,015      1,810       205     +5
4             1,580      1,660       -80     -2
5             1,790      1,340       450     +6
6             1,925      1,765       160     +3
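
The signed-rank procedure for this example can be sketched in plain Python as follows. The helper names are illustrative and not from the slides; dropping zero differences is a common convention assumed here (none occur in this data), and the critical value 1 is the one stated in the example.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def signed_rank_sums(sample_a, sample_b):
    """Return (T+, T-): sums of ranks of |d| for positive and negative d."""
    # Assumption: pairs with zero difference are dropped before ranking.
    diffs = [a - b for a, b in zip(sample_a, sample_b) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return t_plus, t_minus


sample_a = [1950, 1840, 2015, 1580, 1790, 1925]
sample_b = [1760, 1870, 1810, 1660, 1340, 1765]

t_plus, t_minus = signed_rank_sums(sample_a, sample_b)
t_observed = min(t_plus, t_minus)
reject_h0 = t_observed <= 1          # critical value from the example

print(t_plus, t_minus, t_observed, reject_h0)  # 18.0 3.0 3.0 False
```

With T observed = 3 greater than the critical value 1, the decision rule in the example does not reject H0.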

Wilcoxon Matched-Pairs Signed Rank Test: Large Sample Formulas
For large samples, the T statistic is approximately normally distributed and a z score can be used as the test statistic.
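
The formulas are not reproduced in the text of these slides. As a sketch, the usual textbook form of the normal approximation is shown below; it is an assumption that the slides used this form, with n the number of matched pairs and T the smaller sum of ranks.

```latex
\begin{aligned}
\mu_T &= \frac{n(n + 1)}{4} \\
\sigma_T &= \sqrt{\frac{n(n + 1)(2n + 1)}{24}} \\
z &= \frac{T - \mu_T}{\sigma_T}
\end{aligned}
```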

Wilcoxon Matched-Pairs Signed Rank Test: Example
City   1979   2011      d    Rank      City   1979   2011      d    Rank
1      20.3   22.8   -2.5    -8        10     20.3   20.9   -0.6    -1
2      19.5   12.7    6.8    17        11     19.2   22.6   -3.4   -11.5
3      18.6   14.1    4.5    13        12     19.5   16.9    2.6     9
4      20.9   16.1    4.8    15        13     18.7   20.6   -1.9    -6.5
5      19.9   25.2   -5.3   -16        14     17.7   18.5   -0.8    -2
6      18.6   20.2   -1.6    -4        15     21.6   23.4   -1.8    -5
7      19.6   14.9    4.7    14        16     22.4   21.3    1.1     3
8      23.2   21.3    1.9     6.5      17     20.8   17.4    3.4    11.5
9      21.8   18.7    3.1    10

T Calculation and Conclusion
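
The T calculation for the 17 cities is not reproduced in the text, so the following is a hedged sketch using the standard large-sample z approximation. The helper name `average_ranks`, the rounding of differences to one decimal (to keep tied differences exactly equal in floating point), and the significance level α = .05 are assumptions.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


y1979 = [20.3, 19.5, 18.6, 20.9, 19.9, 18.6, 19.6, 23.2, 21.8,
         20.3, 19.2, 19.5, 18.7, 17.7, 21.6, 22.4, 20.8]
y2011 = [22.8, 12.7, 14.1, 16.1, 25.2, 20.2, 14.9, 21.3, 18.7,
         20.9, 22.6, 16.9, 20.6, 18.5, 23.4, 21.3, 17.4]

# Round to one decimal so that ties such as 1.9 and -1.9 compare equal.
diffs = [round(a - b, 1) for a, b in zip(y1979, y2011)]
ranks = average_ranks([abs(d) for d in diffs])
t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
t = min(t_plus, t_minus)

n = len(diffs)
mu_t = n * (n + 1) / 4
sigma_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (t - mu_t) / sigma_t

print(f"T+ = {t_plus}, T- = {t_minus}, T = {t}, z = {z:.2f}")
# Assumed two-tailed test at alpha = .05: reject H0 when |z| > 1.96.
print("reject H0" if abs(z) > 1.96 else "do not reject H0")
```

Under these assumptions T = 54 and z is roughly -1.07, so a two-tailed test at α = .05 would not reject H0.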

Kruskal-Wallis Test
Kruskal-Wallis test: a nonparametric alternative to one-way analysis of variance.
May be used to analyze ordinal data
No assumed population shape
Assumes that the C groups are independent
Assumes random selection of individual items

Kruskal-Wallis K Statistic
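
The statistic itself is not reproduced in the text of this slide. As a sketch, the usual textbook definition is given below; it is an assumption that the slide used this form, with C the number of groups, n the total number of items, n_j the size of group j, and T_j the sum of the ranks in group j. K is compared with a chi-square distribution with C - 1 degrees of freedom.

```latex
K = \frac{12}{n(n + 1)} \left( \sum_{j=1}^{C} \frac{T_j^2}{n_j} \right) - 3(n + 1)
```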

Number of Patients per Day per Physician in Three Organizational Categories
Suppose a researcher wants to determine whether the number of physicians in an office produces significant differences in the number of office patients seen by each physician per day. She takes a random sample of physicians from practices in which (1) there are only two partners, (2) there are three or more partners, or (3) the office is a health maintenance organization (HMO).

H0: The three populations are identical
Ha: At least one of the three populations is different

Two Partners   Three or More Partners   HMO
13             24                       26
15             16                       22
20             19                       31
18             22                       27
23             25                       28
               14                       33
               17

Patients per Day Data: Kruskal-Wallis Preliminary Calculations
Two Partners         Three or More Partners   HMO
Patients   Rank      Patients   Rank          Patients   Rank
13          1        24         12            26         14
15          3        16          4            22          9.5
20          8        19          7            31         17
18          6        22          9.5          27         15
23         11        25         13            28         16
                     14          2            33         18
                     17          5
T1 = 29              T2 = 52.5                T3 = 89.5
n1 = 5               n2 = 7                   n3 = 6
n = n1 + n2 + n3 = 5 + 7 + 6 = 18

Patients per Day Data: Kruskal-Wallis Calculations and Conclusion
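
The final calculation is not reproduced in the text, so the following is a hedged sketch that applies the standard K statistic to the patient counts. The helper name `average_ranks`, the significance level α = .05, and the chi-square critical value 5.9915 (2 degrees of freedom) are assumptions not stated in the surviving slide text.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


groups = [
    [13, 15, 20, 18, 23],            # two partners
    [24, 16, 19, 22, 25, 14, 17],    # three or more partners
    [26, 22, 31, 27, 28, 33],        # HMO
]

pooled = [x for g in groups for x in g]
ranks = average_ranks(pooled)
n = len(pooled)

# Sum the pooled ranks group by group to get T1, T2, T3.
rank_sums = []
start = 0
for g in groups:
    rank_sums.append(sum(ranks[start:start + len(g)]))
    start += len(g)

k_stat = (12 / (n * (n + 1))
          * sum(t * t / len(g) for t, g in zip(rank_sums, groups))
          - 3 * (n + 1))

CRITICAL_CHI2 = 5.9915  # assumed: chi-square, alpha = .05, df = C - 1 = 2
print(rank_sums, f"K = {k_stat:.2f}")
print("reject H0" if k_stat > CRITICAL_CHI2 else "do not reject H0")
```

The rank sums reproduce T1 = 29, T2 = 52.5, and T3 = 89.5 from the preliminary calculations, and K comes out near 9.56, which exceeds the assumed critical value and would lead to rejecting H0.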

Friedman Test
Friedman test: a nonparametric alternative to the randomized block design.
Assumptions
The blocks are independent.
There is no interaction between blocks and treatments.
Observations within each block can be ranked.
Hypotheses
H0: The treatment populations are equal
Ha: At least one treatment population yields larger values than at least one other treatment population

Friedman Test: Example
H0: The supplier populations are equal
Ha: At least one supplier population yields larger values than at least one other supplier population

            Supplier 1   Supplier 2   Supplier 3   Supplier 4
Monday      62           63           57           61
Tuesday     63           61           59           65
Wednesday   61           62           56           63
Thursday    62           60           57           64
Friday      64           63           58           66

Ranks within each day:
            Supplier 1   Supplier 2   Supplier 3   Supplier 4
Monday      3            4            1            2
Tuesday     3            2            1            4
Wednesday   2            3            1            4
Thursday    3            2            1            4
Friday      3            2            1            4
Rj          14           13           5            18
Rj²         196          169          25           324
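
The final calculation is not reproduced in the text, so the following is a hedged sketch. It ranks the suppliers within each day and applies the standard Friedman statistic, chi-square_r = 12 / (b C (C + 1)) × (sum of Rj²) - 3 b (C + 1), with b blocks and C treatments. The helper name `average_ranks`, the significance level α = .05, and the critical value 7.8147 (3 degrees of freedom) are assumptions.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


# Rows are blocks (Monday..Friday); columns are Suppliers 1..4.
data = [
    [62, 63, 57, 61],
    [63, 61, 59, 65],
    [61, 62, 56, 63],
    [62, 60, 57, 64],
    [64, 63, 58, 66],
]

b = len(data)        # number of blocks
c = len(data[0])     # number of treatments

rank_rows = [average_ranks(row) for row in data]
col_sums = [sum(row[j] for row in rank_rows) for j in range(c)]  # Rj

chi_r2 = (12 / (b * c * (c + 1))
          * sum(r * r for r in col_sums)
          - 3 * b * (c + 1))

CRITICAL_CHI2 = 7.8147  # assumed: chi-square, alpha = .05, df = C - 1 = 3
print(col_sums, f"chi-square_r = {chi_r2:.2f}")
print("reject H0" if chi_r2 > CRITICAL_CHI2 else "do not reject H0")
```

The column sums reproduce Rj = 14, 13, 5, 18 from the rank table, and the statistic comes out near 10.68, which exceeds the assumed critical value and would lead to rejecting H0.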