
Statistics in Hydrology: Goodness of Fit of Data to Probability Distributions


Learning Outcomes

• Students understand the principles of testing the goodness of fit of data to a probability distribution.
• Students can test the goodness of fit of data to a probability distribution both graphically (visually) and analytically (chi-square and Kolmogorov-Smirnov tests).
STATISTICAL PARAMETERS
Summary statistics
• Also called descriptive statistics
– If x_1, x_2, …, x_n is a sample, then for continuous data:

Mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} x_i$

Variance: $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{X}\right)^2$

Standard deviation: $S = \sqrt{S^2}$

Coefficient of variation: $CV = \frac{S}{\bar{X}}$

Summary statistics also include the median, skewness, and correlation coefficient, among others.
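The summary statistics above can be sketched in a few lines of Python. This is only an illustration: the sample is the 12-value rainfall series (mm) used in the worked tables later in this deck, and the variable names are chosen here for readability.

```python
import statistics

# Rainfall sample (mm) taken from the worked goodness-of-fit tables in this deck.
rain = [5.9, 9.6, 13.8, 18.3, 24.8, 30.1, 39.2, 62.7, 84.7, 97.3, 127.2, 170.0]

mean = statistics.mean(rain)          # X-bar = (1/n) * sum(x_i)
variance = statistics.variance(rain)  # S^2 with the (n - 1) divisor
std = statistics.stdev(rain)          # S = sqrt(S^2)
cv = std / mean                       # coefficient of variation

print(round(mean, 3), round(variance, 3), round(std, 3), round(cv, 3))
```

The values agree with the parameters listed in the deck (Xrata = 56.96667, Var = 2765.462, std = 52.58766).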
Graphical display
• Time Series plots
• Histograms/Frequency distribution
• Cumulative distribution functions
• Flow duration curve

Determining the Number of Classes

Number of classes (k) = 1 + 3.322 log n
= 1 + 3.322 log (14)
= 4.807
Use 5 classes
MINIMUM DISCHARGE OF THE KOMERING RIVER
2000 - 2010

NO  Year  Overall minimum discharge (m3/s)

1 2000 17.22
2 2001 22.25
3 2002 50.20
4 2003 58.60
5 2004 33.36
6 2005 33.36
7 2006 45.08
8 2007 43.50
9 2008 38.39
10 2009 8.50
11 2010 21.45
Time Series Plot


Data Grouping
n = 11
maximum value = 58.60
minimum value = 8.50

range r = 50.10

number of classes m = 1 + 3.3 log n
= 1 + 3.3 log 11
= 4.4
≈ 4
SIDLACOM
• SURVEY
• INVESTIGATION
• DESIGN
• LAND ACQUISITION (La)
• CONSTRUCTION
• OPERATION
• MAINTENANCE
Frequency Histogram Calculation

Class  Discharge interval (m3/s)  Class value (xi)  Frequency (fi)  Probability (p)
1      8.50 - 21.03               14.77             2               0.181818
2      21.04 - 33.55              27.30             4               0.363636
3      33.56 - 46.08              39.82             3               0.272727
4      46.09 - 58.60              52.35             2               0.181818
Total                                               11              1
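The grouping of the Komering discharges into classes can be reproduced with a short script. This is a minimal sketch, assuming equal-width classes spanning the minimum to the maximum value and the class-count rule m = 1 + 3.3 log n rounded to the nearest integer; the function name is hypothetical.

```python
import math

# Komering River minimum discharges (m3/s), 2000-2010, from the table above.
discharge = [17.22, 22.25, 50.20, 58.60, 33.36, 33.36,
             45.08, 43.50, 38.39, 8.50, 21.45]

def group_into_classes(values):
    """Return class count, class width, frequencies and relative frequencies."""
    n = len(values)
    m = round(1 + 3.3 * math.log10(n))      # number of classes
    lo, hi = min(values), max(values)
    width = (hi - lo) / m                   # equal class width
    counts = [0] * m
    for x in values:
        # The maximum value is placed in the last class.
        idx = min(int((x - lo) / width), m - 1)
        counts[idx] += 1
    probs = [c / n for c in counts]
    return m, width, counts, probs

m, width, counts, probs = group_into_classes(discharge)
print(m, round(width, 3), counts, [round(p, 6) for p in probs])
```

The frequencies and probabilities match the histogram table (2, 4, 3, 2 out of 11).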


Probability distributions
• Normal family
– Normal, lognormal, lognormal-III
• Generalized extreme value family
– EV1 (Gumbel), GEV, and EVIII (Weibull)
• Exponential/Pearson type family
– Exponential, Pearson type III, Log-Pearson type III
Normal distribution
• pdf for normal distribution

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

μ is the mean and σ is the standard deviation

Hydrologic variables such as annual precipitation, annual average streamflow, or annual average pollutant loadings follow a normal distribution
Standard Normal distribution
• A standard normal distribution is a normal distribution with mean (μ) = 0 and standard deviation (σ) = 1
• A normal distribution is transformed to the standard normal distribution by using the following formula:

$$z = \frac{X - \mu}{\sigma}$$

z is called the standard normal variable
Normal Distribution
• Normal distribution

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

$$K_T = \frac{x_T - \bar{x}}{s} = z_T$$

• So the frequency factor for the normal distribution is the standard normal variate

$$x_T = \bar{x} + K_T s = \bar{x} + z_T s$$

• Example: 50-year return period

$$T = 50; \quad p = \frac{1}{50} = 0.02; \quad K_{50} = z_{50} = 2.054$$

Look in Table 11.2.1, use -NORMSINV(.) in Excel, or see page 390 in the textbook
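As a cross-check of the 50-year example, the frequency factor for the normal distribution can be computed with Python's standard library instead of a table or Excel. This is a sketch; the function names are hypothetical, and `NormalDist().inv_cdf(1 - p)` plays the role of `-NORMSINV(p)`.

```python
from statistics import NormalDist

def normal_frequency_factor(T):
    """K_T = z_T for a return period T (exceedance probability p = 1/T)."""
    p_exceed = 1.0 / T
    return NormalDist().inv_cdf(1.0 - p_exceed)

def normal_quantile(mean, std, T):
    """x_T = mean + K_T * std."""
    return mean + normal_frequency_factor(T) * std

k50 = normal_frequency_factor(50)
print(round(k50, 3))  # 2.054
```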
Lognormal distribution
• If the pdf of X is skewed, it's not normally distributed
• If the pdf of Y = log (X) is normally distributed, then X is said to be lognormally distributed.

$$f(x) = \frac{1}{x\,\sigma_y\sqrt{2\pi}} \exp\left[-\frac{(y-\mu_y)^2}{2\sigma_y^2}\right], \quad x > 0, \ \text{and } y = \log x$$
EV type I distribution (Gumbel)

$$f(x) = \frac{1}{\alpha}\exp\left[-\frac{x-u}{\alpha} - \exp\left(-\frac{x-u}{\alpha}\right)\right]$$

$$\alpha = \frac{\sqrt{6}\,s_x}{\pi}, \qquad u = \bar{x} - 0.5772\,\alpha$$
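Frequency analysis for extreme events

EV1 pdf and cdf:

$$f(x) = \frac{1}{\alpha}\exp\left[-\frac{x-u}{\alpha} - \exp\left(-\frac{x-u}{\alpha}\right)\right]$$

$$\alpha = \frac{\sqrt{6}\,s_x}{\pi}, \qquad u = \bar{x} - 0.5772\,\alpha$$

$$F(x) = \exp\left[-\exp\left(-\frac{x-u}{\alpha}\right)\right]$$

$$y = \frac{x-u}{\alpha}$$

$$F(x) = \exp\left[-\exp(-y)\right]$$

$$y = -\ln\left[-\ln F(x)\right] = -\ln\left[-\ln(1-p)\right], \quad \text{where } p = P(x \ge x_T)$$

$$y_T = -\ln\left[-\ln\left(1 - \frac{1}{T}\right)\right]$$

If you know T, you can find y_T, and once y_T is known, x_T can be computed by

$$x_T = u + \alpha\, y_T$$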
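EV-I (Gumbel) Distribution

$$F(x) = \exp\left[-\exp\left(-\frac{x-u}{\alpha}\right)\right], \qquad \alpha = \frac{\sqrt{6}\,s}{\pi}, \qquad y_T = -\ln\left[\ln\left(\frac{T}{T-1}\right)\right]$$

$$u = \bar{x} - 0.5772\,\alpha$$

$$x_T = u + \alpha\, y_T$$

$$x_T = \bar{x} - 0.5772\,\frac{\sqrt{6}}{\pi}\,s - \frac{\sqrt{6}}{\pi}\,s\,\ln\left[\ln\left(\frac{T}{T-1}\right)\right]$$

$$x_T = \bar{x} - \frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{T}{T-1}\right)\right]\right\} s$$

$$x_T = \bar{x} + K_T s$$

$$K_T = -\frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{T}{T-1}\right)\right]\right\}$$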
Example
• Given annual maximum rainfall, calculate the 5-year storm using the frequency factor

$$K_T = -\frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{T}{T-1}\right)\right]\right\}$$

$$K_5 = -\frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{5}{5-1}\right)\right]\right\} = 0.719$$

$$x_T = \bar{x} + K_T s = 0.649 + 0.719 \times 0.177 = 0.78 \text{ in}$$
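The frequency-factor calculation for the 5-year storm can be checked numerically. The sketch below assumes the sample mean 0.649 in and standard deviation 0.177 in used in this example; the function name is hypothetical.

```python
import math

def gumbel_frequency_factor(T):
    """K_T = -(sqrt(6)/pi) * {0.5772 + ln[ln(T / (T - 1))]} for the EV1 distribution."""
    return -(math.sqrt(6) / math.pi) * (0.5772 + math.log(math.log(T / (T - 1))))

mean_in, std_in = 0.649, 0.177      # sample statistics from the example (inches)

k5 = gumbel_frequency_factor(5)
x5 = mean_in + k5 * std_in          # x_T = mean + K_T * s

print(round(k5, 3), round(x5, 2))
```

Because K_T depends only on T, the same function can be reused for any return period once the sample mean and standard deviation are known.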
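Example
• Given annual maxima for 10-minute storms
• Find the 5- and 50-year return period 10-minute storms

$$\bar{x} = 0.649 \text{ in}, \qquad s = 0.177 \text{ in}$$

$$\alpha = \frac{\sqrt{6}\,s}{\pi} = \frac{\sqrt{6} \times 0.177}{\pi} = 0.138$$

$$u = \bar{x} - 0.5772\,\alpha = 0.649 - 0.5772 \times 0.138 = 0.569$$

$$y_5 = -\ln\left[\ln\left(\frac{T}{T-1}\right)\right] = -\ln\left[\ln\left(\frac{5}{5-1}\right)\right] = 1.5$$

$$x_5 = u + \alpha\, y_5 = 0.569 + 0.138 \times 1.5 = 0.78 \text{ in}$$

$$x_{50} = 1.11 \text{ in}$$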
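The same storms can be computed through the location and scale parameters u and α rather than the frequency factor. This is a minimal sketch with hypothetical function names; it uses only the sample mean and standard deviation quoted in the example.

```python
import math

def gumbel_parameters(mean, std):
    """Method-of-moments EV1 parameters: alpha = sqrt(6)*s/pi, u = mean - 0.5772*alpha."""
    alpha = math.sqrt(6) * std / math.pi
    u = mean - 0.5772 * alpha
    return u, alpha

def gumbel_quantile(mean, std, T):
    """x_T = u + alpha * y_T, with reduced variate y_T = -ln[ln(T / (T - 1))]."""
    u, alpha = gumbel_parameters(mean, std)
    y_T = -math.log(math.log(T / (T - 1)))
    return u + alpha * y_T

x5 = gumbel_quantile(0.649, 0.177, 5)
x50 = gumbel_quantile(0.649, 0.177, 50)
print(round(x5, 2), round(x50, 2))
```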
Log-Pearson Type III
• If log X follows a Pearson Type III distribution, then X is said to have a log-Pearson Type III distribution

$$f(x) = \frac{\lambda^{\beta}\,(y-\varepsilon)^{\beta-1}\, e^{-\lambda (y-\varepsilon)}}{x\,\Gamma(\beta)}, \qquad y = \log x \ge \varepsilon$$
Probability plots
• Probability plot is a graphical tool to assess
whether or not the data fits a particular
distribution.
• The data are plotted against a theoretical distribution in such a way that the points should form approximately a straight line (the distribution function is linearized)
• Departures from a straight line indicate departure from the theoretical distribution
Probability Plotting
• Plot of the magnitude of a random variable X (a hydrologic variable) against probability
• Plotting position = assigning a probability to each data point.
• The data may be ranked from largest to smallest or from smallest to largest.
• If ranked from smallest to largest, m/(n+1) gives the cumulative probability P(x).
• If ranked from largest to smallest, m/(n+1) gives the exceedance probability, so the cumulative probability must be recomputed as 1 - m/(n+1).
PLOTTING POSITION
Example of Lognormal Probability Paper
• C.T. HAAN, STATISTICAL METHODS IN HYDROLOGY
• V.T. CHOW, APPLIED HYDROLOGY
Normal Probability Plotting
Normal probability plot
• Steps
1. Rank the data from largest (m = 1) to smallest (m = n)
2. Assign a plotting position to the data
   1. Plotting position: an estimate of the exceedance probability
   2. Use p = m/(n+1)
3. Find the standard normal variable z corresponding to the plotting position (use -NORMSINV (.) in Excel)
4. Plot the data against z
• If the data fall on a straight line, the data come from a normal distribution
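The plotting steps above can be sketched as follows. The sample is the 12-value rainfall series (mm) used in the worked tables of this deck; the function name is hypothetical, and `-NormalDist().inv_cdf(p)` stands in for `-NORMSINV(p)`.

```python
from statistics import NormalDist

rain = [5.9, 9.6, 13.8, 18.3, 24.8, 30.1, 39.2, 62.7, 84.7, 97.3, 127.2, 170.0]

def normal_plot_points(data):
    """Return (z, x) pairs for a normal probability plot."""
    ranked = sorted(data, reverse=True)      # step 1: largest first (m = 1)
    n = len(ranked)
    points = []
    for m, x in enumerate(ranked, start=1):
        p = m / (n + 1)                      # step 2: exceedance probability
        z = -NormalDist().inv_cdf(p)         # step 3: standard normal variable
        points.append((z, x))                # step 4: plot x against z
    return points

for z, x in normal_plot_points(rain):
    print(f"{z:7.3f}  {x:6.1f}")
```

If the printed pairs fall close to a straight line when plotted, the normal distribution is a plausible model for the sample.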
Observed (empirical) and expected (theory) values

NO   X (mm)  (x-Xrata)^2  m    p = m/(n+1) (%)  ln(X)  z = (xi-Xrata)/std  Cum Prob  Cum Prob (%)
(1)  (2)     (3)          (4)  (5)              (6)    (7)                 (8)       (9)
1    5.9     2607.8044    1    7.69             1.775  -0.971              0.166     16.58
2    9.6     2243.6011    2    15.38            2.262  -0.901              0.184     18.39
3    13.8    1863.3611    3    23.08            2.625  -0.821              0.206     20.59
4    18.3    1495.1111    4    30.77            2.907  -0.735              0.231     23.11
5    24.8    1034.6944    5    38.46            3.211  -0.612              0.270     27.04
6    30.1    721.81778    6    46.15            3.405  -0.511              0.305     30.47
7    39.2    315.65444    7    53.85            3.669  -0.338              0.368     36.77
8    62.7    32.871111    8    61.54            4.138  0.109               0.543     54.34
9    84.7    769.13778    9    69.23            4.439  0.527               0.701     70.10

Parameters (computed from the statistical-parameter calculation):
n = 12
Xrata = 56.966667
Var = 2765.4624
std = 52.587664

NOTES:

Col (8): F(x), the cumulative frequency distribution (cfd), can be read from a standard normal distribution table as a function of z, or computed with the Excel statistical function NORM.DIST(x, mean, std, cum).
See the calculation of the cumulative frequency distribution using Excel functions.
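For readers without Excel, column (8) can be reproduced with Python's standard library. This sketch assumes the mean and standard deviation listed in the parameters above; `NormalDist(mu, sigma).cdf(x)` corresponds to the cumulative form of NORM.DIST.

```python
from statistics import NormalDist

mean, std = 56.96667, 52.58766          # Xrata and std from the parameter list
dist = NormalDist(mu=mean, sigma=std)

rain = [5.9, 9.6, 13.8, 18.3, 24.8, 30.1, 39.2, 62.7, 84.7]

for x in rain:
    z = (x - mean) / std                # column (7)
    cum_prob = dist.cdf(x)              # column (8), same as NORM.DIST(x, mean, std, TRUE)
    print(f"{x:6.1f}  {z:7.3f}  {cum_prob:6.3f}  {100 * cum_prob:6.2f}")
```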
EV1 (Gumbel) probability plot
• Steps
1. Sort the data from smallest to largest and assign plotting positions using the Weibull formula p_i = m/(n+1)
2. Calculate the reduced variate y_i = -ln(-ln(p_i)), where p_i is the non-exceedance probability (equivalently, y_i = -ln(-ln(1 - p_i)) when p_i denotes the exceedance probability)
3. Plot the sorted data against y_i
• If the data fall on a straight line, the data come from an EV1 distribution
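The Gumbel plotting steps can be sketched as below. One assumption is made explicit: with the data ranked from smallest to largest, m/(n+1) is treated as the non-exceedance probability F, so the reduced variate is y = -ln(-ln F). The sample is the 12-value rainfall series from this deck, and the function name is hypothetical.

```python
import math

rain = [5.9, 9.6, 13.8, 18.3, 24.8, 30.1, 39.2, 62.7, 84.7, 97.3, 127.2, 170.0]

def gumbel_plot_points(data):
    """Return (y, x) pairs for an EV1 probability plot."""
    ranked = sorted(data)                         # smallest first (m = 1)
    n = len(ranked)
    points = []
    for m, x in enumerate(ranked, start=1):
        f = m / (n + 1)                           # Weibull plotting position (non-exceedance)
        y = -math.log(-math.log(f))               # reduced variate
        points.append((y, x))
    return points

for y, x in gumbel_plot_points(rain):
    print(f"{y:7.3f}  {x:6.1f}")
```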
Observed (empirical): columns (2)-(7); Expected (theory): columns (8)-(11)

NO   X (mm)  (x-Xrata)^2  m    p = m/(n+1)  T = 1/(1-p)  ln(T)  z = (x-mu)/alpha  p = exp(-exp(-z))  T = 1/(1-p)  ln(T)
(1)  (2)     (3)          (4)  (5)          (6)          (7)    (8)               (9)                (10)         (11)
1    5.9     2607.804     1    0.08         1.083        0.080  -0.668            0.142              1.17         0.15
2    9.6     2243.601     2    0.15         1.182        0.167  -0.577            0.168              1.20         0.18
3    13.8    1863.361     3    0.23         1.300        0.262  -0.475            0.200              1.25         0.22
4    18.3    1495.111     4    0.31         1.444        0.368  -0.365            0.237              1.31         0.27
5    24.8    1034.694     5    0.38         1.625        0.486  -0.207            0.292              1.41         0.35
6    30.1    721.818      6    0.46         1.857        0.619  -0.078            0.339              1.51         0.41
7    39.2    315.654      7    0.54         2.167        0.773  0.144             0.421              1.73         0.55
8    62.7    32.871       8    0.62         2.600        0.956  0.717             0.614              2.59         0.95
9    84.7    769.138      9    0.69         3.250        1.179  1.253             0.752              4.03         1.39
10   97.3    1626.778     10   0.77         4.333        1.466  1.560             0.811              5.28         1.66
11   127.2   4932.721     11   0.85         6.500        1.872  2.289             0.904              10.38        2.34
12   170     12776.534    12   0.92         13.000       2.565  3.333             0.965              28.51        3.35
Total  683.6  30420.087

Statistical parameters

n = 12
Xrata = 56.96667
Var = 2765.462
std = 52.58766
alpha = 41.02323
mu = 33.28806
ANALYTICAL GOODNESS-OF-FIT TESTS
• 1. CHI-SQUARE TEST
• 2. KOLMOGOROV-SMIRNOV TEST
CHI-SQUARE TEST
• Compares the actual number of observations with the expected number of observations (expected according to the distribution under test) that fall in each class interval (Haan, C.T., 1988)
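Written as a formula, the comparison described above is the usual chi-square statistic. The symbols below are notation chosen here for illustration: OF_i is the observed frequency and EF_i the expected frequency in class i, k is the number of class intervals, and n is the sample size.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{\left(OF_i - EF_i\right)^2}{EF_i},
\qquad
EF_i = n \left[ F(x_{i,\mathrm{upper}}) - F(x_{i,\mathrm{lower}}) \right]
```

Here F is the cumulative distribution function of the distribution under test, evaluated at the class boundaries.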
Value of Xc (critical X, used as the acceptance criterion)

• Determined by DK and α (level of significance)
• DK = k - (p + 1)
• DK = degrees of freedom
• k = number of class intervals
• p = number of distribution parameters
Critical Values of X for the Chi-Square Test

Critical value Xc, by degrees of freedom (DK) and level of significance (α)

DK   α = 0.950  0.050   0.025   0.010   0.005
1    0.004      3.841   5.024   6.635   7.879
2    0.103      5.991   7.378   9.210   10.597
3    0.352      7.815   9.348   11.345  12.838
4    0.711      9.488   11.143  13.277  14.860
5    1.145      11.070  12.833  15.086  16.750
6    1.635      12.592  14.449  16.812  18.548
7    2.167      14.067  16.013  18.475  20.278
8    2.733      15.507  17.535  20.090  21.955
9    3.325      16.919  19.023  21.666  23.589
10   3.940      18.307  20.483  23.209  25.188
11   4.575      19.675  21.920  24.725  26.757
12   5.226      21.026  23.337  26.217  28.300
13   5.892      22.362  24.736  27.688  29.819
14   6.571      23.685  26.119  29.141  31.319
Example Calculation

Observed (empirical) and expected (theory) values

NO   X (mm)  (x-Xrata)^2  m    p = m/(n+1)  ln(X)  z = (xi-Xrata)/std  Cum Prob  Probability freq.
1    5.9     2607.804     1    0.08         1.775  -0.971077           0.166     0.17
2    9.6     2243.601     2    0.15         2.262  -0.900718           0.184     0.02
3    13.8    1863.361     3    0.23         2.625  -0.820852           0.206     0.02
4    18.3    1495.111     4    0.31         2.907  -0.73528            0.231     0.03
5    24.8    1034.694     5    0.38         3.211  -0.611677           0.270     0.04
6    30.1    721.818      6    0.46         3.405  -0.510893           0.305     0.03
7    39.2    315.654      7    0.54         3.669  -0.337849           0.368     0.06
8    62.7    32.871       8    0.62         4.138  0.1090243           0.543     0.18
9    84.7    769.138      9    0.69         4.439  0.5273734           0.701     0.16
10   97.3    1626.778     10   0.77         4.578  0.7669733           0.778     0.08
11   127.2   4932.721     11   0.85         4.846  1.3355477           0.909     0.13
12   170     12776.534    12   0.92         5.136  2.1494268           0.984     0.08
Total  683.6  30420.087                                                          0.98

n = 12
Xrata = 56.967
Var = 2765.462
Class division: K = 1 + 3.322 log n

K = 4.5850401, use 5 classes
Maximum value = 170.00
Minimum value = 5.9

Class width = (170.00 - 5.9) / 5 = 32.82, rounded to 35
Chi-Square Calculation

Observed data: columns (2)-(4); Expected: columns (5)-(8); Chi-square: columns (9)-(11)

NO   Interval   Upper bound R(x)  OF   z      P(z)   f(x)   EF    (OF-EF)  (OF-EF)^2  (OF-EF)^2/EF
(1)  (2)        (3)               (4)  (5)    (6)    (7)    (8)   (9)      (10)       (11)
1    0 - 35     35                6    -0.42  0.338  0.338  4.06  1.94     3.7755     0.93
2    36 - 70    70                2    0.25   0.598  0.260  3.12  -1.12    1.2488     0.40
3    71 - 105   105               2    0.91   0.819  0.222  2.66  -0.66    0.4347     0.16
4    106 - 140  140               1    1.58   0.943  0.123  1.48  -0.48    0.2305     0.16
5    141 - 175  175               1    2.24   0.988  0.045  0.54  0.46     0.2141     0.40
     TOTAL                        12                                       5.9037     2.05

• DK = k - (p + 1)
• k = 5
• p = 2 (the normal distribution has 2 parameters: mean and standard deviation)
• DK = 5 - (2 + 1)
• DK = 2
• α (level of significance) = 0.05, because the desired confidence level = 0.95
• From the table of Xc, Xc = 5.991
• Computed X = 2.05 < tabulated Xc, so the distribution can be accepted
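The chi-square calculation can be sketched in Python as a check. Several assumptions are made here and are not stated in the slides: each squared difference is divided by the expected frequency EF (the standard form of the statistic), the first class collects all probability below its upper bound (as the f(x) column does), and the mean and standard deviation are the sample values from the parameter list. The function name is hypothetical.

```python
from statistics import NormalDist

rain = [5.9, 9.6, 13.8, 18.3, 24.8, 30.1, 39.2, 62.7, 84.7, 97.3, 127.2, 170.0]
mean, std = 56.96667, 52.58766
upper_bounds = [35, 70, 105, 140, 175]       # class upper bounds R(x)

def chi_square_normal(data, mean, std, upper_bounds):
    """Sum of (OF - EF)^2 / EF over the classes, for a fitted normal distribution."""
    dist = NormalDist(mean, std)
    n = len(data)
    chi2 = 0.0
    lower, prev_cdf = float("-inf"), 0.0
    for upper in upper_bounds:
        observed = sum(1 for x in data if lower < x <= upper)   # OF
        cdf = dist.cdf(upper)                                   # P(z)
        expected = n * (cdf - prev_cdf)                         # EF = n * f(x)
        chi2 += (observed - expected) ** 2 / expected
        lower, prev_cdf = upper, cdf
    return chi2

chi2 = chi_square_normal(rain, mean, std, upper_bounds)
critical = 5.991                              # Xc for DK = 2, alpha = 0.05
print(round(chi2, 2), chi2 < critical)
```

Under these assumptions the statistic is about 2.05, below the critical value 5.991 for DK = 2, so the normal distribution is not rejected.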
Kolmogorov-Smirnov Test
• Compares the maximum deviation between the observed and the theoretical probabilities.
• The data are not grouped into classes.
• The comparison can be made on a plot of the cumulative frequency distribution (graphically)
• Or it can be computed with the help of a table.
Kolmogorov-Smirnov Test
• Let Sn(x) be the cumulative distribution function based on the observations: Sn(x) = k/n, where k = the number of observations <= x
• P(x) = theoretical cumulative distribution function
• Compute the maximum deviation (D), where D = max |P(x) - Sn(x)|
• If D >= the critical value in the table, the assumed distribution cannot be accepted.
Critical Values of Δ for the Kolmogorov-Smirnov Test

Critical value by sample size (n) and level of significance (α)

n       α = 0.200  0.150    0.100    0.050    0.010
2       0.684      0.726    0.776    0.842    0.929
4       0.494      0.525    0.564    0.642    0.734
6       0.410      0.436    0.470    0.521    0.618
8       0.358      0.381    0.411    0.457    0.543
10      0.322      0.342    0.368    0.409    0.486
12      0.295      0.313    0.338    0.375    0.450
14      0.274      0.292    0.314    0.349    0.418
16      0.258      0.274    0.295    0.328    0.391
18      0.244      0.259    0.278    0.309    0.370
20      0.231      0.246    0.264    0.294    0.352
30      0.190      0.200    0.220    0.242    0.290
n > 50  1.07/√n    1.14/√n  1.22/√n  1.36/√n  1.63/√n
Observed (empirical), expected (theory), and Kolmogorov-Smirnov deviation (the last column is cut off in the source)

NO   X (mm)  (x-Xrata)^2  m    p = m/(n+1)  log(X)  z = (xi-Xrata)/std  Cum Prob  Probability  Frequency  D = P(obs) - P(theo)
1    5.9     2607.804     1    0.08         0.771   -0.971077           0.166     0.17         1.99       (0.0…
2    9.6     2243.601     2    0.15         0.982   -0.900718           0.184     0.02         0.22       (0.0…
3    13.8    1863.361     3    0.23         1.140   -0.820852           0.206     0.02         0.26       0.0…
4    18.3    1495.111     4    0.31         1.262   -0.73528            0.231     0.03         0.30       0.0…
5    24.8    1034.694     5    0.38         1.394   -0.611677           0.270     0.04         0.47       0.1…
6    30.1    721.818      6    0.46         1.479   -0.510893           0.305     0.03         0.41       0.1…
7    39.2    315.654      7    0.54         1.593   -0.337849           0.368     0.06         0.76       0.1…
8    62.7    32.871       8    0.62         1.797   0.1090243           0.543     0.18         2.11       0.0…
9    84.7    769.138      9    0.69         1.928   0.5273734           0.701     0.16         1.89       (0.0…
10   97.3    1626.778     10   0.77         1.988   0.7669733           0.778     0.08         0.93       (0.0…
11   127.2   4932.721     11   0.85         2.104   1.3355477           0.909     0.13         1.57       (0.0…
12   170     12776.534    12   0.92         2.230   2.1494268           0.984     0.08         0.90       (0.0…
Total  683.6  30420.087                                                                                   Dmax = 0.17
• From the goodness-of-fit calculation table for the normal distribution, the maximum deviation (Dmax) = 0.17.
• From the Kolmogorov-Smirnov table for α = 0.05 and number of data n = 12, Dcritical = 0.375
• D < Dcritical, so the assumed distribution under test can be accepted.
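The maximum deviation for the normal distribution can be recomputed with a short script. This sketch follows the worked table rather than the Sn(x) = k/n definition: the observed probability is taken as the plotting position m/(n+1), which is an assumption about how the table was built. The function name is hypothetical.

```python
from statistics import NormalDist

rain = [5.9, 9.6, 13.8, 18.3, 24.8, 30.1, 39.2, 62.7, 84.7, 97.3, 127.2, 170.0]

def max_deviation(data, cdf):
    """D = max |P(obs) - P(theo)|, with P(obs) = m/(n+1) for ascending ranks m."""
    ranked = sorted(data)
    n = len(ranked)
    return max(abs(m / (n + 1) - cdf(x)) for m, x in enumerate(ranked, start=1))

normal = NormalDist(mu=56.96667, sigma=52.58766)   # sample mean and std from the deck
d_max = max_deviation(rain, normal.cdf)
d_critical = 0.375                                  # table value for n = 12, alpha = 0.05

print(round(d_max, 2), d_max < d_critical)
```

The function takes any cumulative distribution function as its second argument, so the same check can be run against other candidate distributions.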
Example Goodness-of-Fit Test for the Gumbel Distribution

Observed (empirical), expected (theory), and Kolmogorov-Smirnov deviation

X (mm)  (x-Xrata)^2  m    p = m/(n+1)  T = 1/(1-p)  log(T)  z = (x-mu)/alpha  p = exp(-exp(-z))  Probability  D = P(obs) - P(theo)
5.9     2607.80444   1    0.08         1.083        0.080   -0.66762314       0.142              0.142        (0.07)
9.6     2243.60111   2    0.15         1.182        0.167   -0.57743034       0.168              0.026        (0.01)
13.8    1863.36111   3    0.23         1.300        0.262   -0.47504932       0.200              0.032        0.03
18.3    1495.11111   4    0.31         1.444        0.368   -0.36535538       0.237              0.036        0.07
24.8    1034.69444   5    0.38         1.625        0.486   -0.20690857       0.292              0.056        0.09
30.1    721.817778   6    0.46         1.857        0.619   -0.07771348       0.339              0.047        0.12
39.2    315.654444   7    0.54         2.167        0.773   0.144112054       0.421              0.081        0.12
62.7    32.8711111   8    0.62         2.600        0.956   0.716958211       0.614              0.193        0.00
84.7    769.137778   9    0.69         3.250        1.179   1.25323972        0.752              0.138        (0.06)
97.3    1626.77778   10   0.77         4.333        1.466   1.560382767       0.811              0.059        (0.04)
127.2   4932.72111   11   0.85         6.500        1.872   2.28923809        0.904              0.093        (0.06)
170     12776.5344   12   0.92         13.000       2.565   3.33254939        0.965              0.061        (0.04)
Total: 683.6  30420.0867                                                                         0.965        Dmax = 0.12
• From the goodness-of-fit calculation table for the Gumbel distribution, the maximum deviation (Dmax) = 0.12.
• From the Kolmogorov-Smirnov table for α = 0.05 and number of data n = 12, Dcritical = 0.375
• D < Dcritical, so the assumed distribution under test can be accepted.
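The Gumbel check can be reproduced in the same way. This sketch assumes the parameters mu = 33.28806 and alpha = 41.02323 listed in the deck, and takes the observed probability as the plotting position m/(n+1), as in the table; the function names are hypothetical.

```python
import math

rain = [5.9, 9.6, 13.8, 18.3, 24.8, 30.1, 39.2, 62.7, 84.7, 97.3, 127.2, 170.0]
mu, alpha = 33.28806, 41.02323        # EV1 parameters from the deck

def gumbel_cdf(x):
    """F(x) = exp(-exp(-(x - mu) / alpha))."""
    return math.exp(-math.exp(-(x - mu) / alpha))

def max_deviation(data, cdf):
    """D = max |P(obs) - P(theo)|, with P(obs) = m/(n+1) for ascending ranks m."""
    ranked = sorted(data)
    n = len(ranked)
    return max(abs(m / (n + 1) - cdf(x)) for m, x in enumerate(ranked, start=1))

d_max = max_deviation(rain, gumbel_cdf)
d_critical = 0.375                     # table value for n = 12, alpha = 0.05

print(round(d_max, 2), d_max < d_critical)
```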
