

1. Data

Data given:

X1 X2 X3 X4 X5 X6 X7
478 184 40 74 11 31 20
494 213 32 72 11 43 18
643 347 57 70 18 16 16
341 565 31 71 11 25 19
773 327 67 72 9 29 24
603 260 25 68 8 32 15
484 325 34 68 12 24 14
546 102 33 62 13 28 11
424 38 36 69 7 25 12
548 226 31 66 9 58 15
506 137 35 60 13 21 9
819 369 30 81 4 77 36
541 109 44 66 9 37 12
491 809 32 67 11 37 16
514 29 30 65 12 35 11
371 245 16 64 10 42 14
457 118 29 64 12 21 10
437 148 36 62 7 81 27
570 387 30 59 15 31 16
432 98 23 56 15 50 15
619 608 33 46 22 24 8
357 218 35 54 14 27 13
623 254 38 54 20 22 11
547 697 44 45 26 18 8
792 827 28 57 12 23 11
799 693 35 57 9 60 18
439 448 31 61 19 14 12
867 942 39 52 17 31 10
912 1017 27 44 21 24 9
462 216 36 43 18 23 8
859 673 38 48 19 22 10
805 989 46 57 14 25 12
652 630 29 47 19 25 9
776 404 32 50 19 21 9
919 692 39 48 16 32 11
732 1517 44 49 13 31 14
657 879 33 72 13 13 22
1419 631 43 59 14 21 13
989 1375 22 49 9 46 13
821 1139 30 54 13 27 12
1740 3545 86 62 22 18 15
815 706 30 47 17 39 11
760 451 32 45 34 15 10
936 433 43 48 26 23 12
863 601 20 69 23 7 12
783 1024 55 42 23 23 11
715 457 44 49 18 30 12
1504 1441 37 57 15 35 13
1324 1022 82 72 22 15 16
940 1244 66 67 26 18 16
Overview of the variables (summary statistics):

Variable Obs Mean Std. Dev. Min Max

x1 50 717.96 293.9388 341 1740


x2 50 616.18 573.7392 29 3545
x3 50 37.76 13.82036 16 86
x4 50 58.8 9.965246 42 81
x5 50 15.4 6.023762 4 34

x6 50 29.9 14.80106 7 81
x7 50 13.82 5.157479 8 36

- X1 ranges from 341 to 1740
- X2 ranges from 29 to 3545
- X3 ranges from 16 to 86
- X4 ranges from 42 to 81
- X5 ranges from 4 to 34
- X6 ranges from 7 to 81
- X7 ranges from 8 to 36
Suppose we have the regression model:
X1 = β1 + β2X2 + β3X3 + β4X4 + β5X5 + β6X6 + β7X7
To reduce the impact of large data points on the model, in this study we apply
the natural-logarithm transformation:
Ln(X1) = β1 + β2ln(X2) + β3ln(X3) + β4ln(X4) + β5ln(X5) + β6ln(X6) + β7ln(X7) (1)
Estimating regression model (1) in Stata, we obtain the following result:
Table 2: regression model Ln(X1) = β1 + β2ln(X2) + β3ln(X3) + β4ln(X4)
+ β5ln(X5) + β6ln(X6) + β7ln(X7)

Dependent variable: Ln(X1)

Method: Least Squares

Date: 19/11/2024

Time: 11:54 AM

Included observations: 50
Source SS df MS Number of obs = 50
F( 6, 43) = 8.84
Model 3.69591933 6 .615986555 Prob > F = 0.0000
Residual 2.99679948 43 .069693011 R-squared = 0.5522
Adj R-squared = 0.4898
Total 6.69271881 49 .136586098 Root MSE = .26399

ln_x1 Coef. Std. Err. t P>|t| [95% Conf. Interval]

ln_x2 .2603337 .0510629 5.10 0.000 .1573555 .3633118


ln_x3 .3287415 .1357469 2.42 0.020 .0549817 .6025012
ln_x4 .3108223 .4737788 0.66 0.515 -.6446437 1.266288
ln_x5 .0152426 .1959793 0.08 0.938 -.3799872 .4104725
ln_x6 .0955764 .1566919 0.61 0.545 -.2204229 .4115758
ln_x7 -.2479215 .2394942 -1.04 0.306 -.7309076 .2350645
_cons 2.776674 2.370395 1.17 0.248 -2.003684 7.557032

From the above estimation results we obtain:

(PRF): E(ln(X1)|X2,X3,X4,X5,X6,X7) = β1 + β2ln(X2) + β3ln(X3) + β4ln(X4) +
β5ln(X5) + β6ln(X6) + β7ln(X7)
(SRF): Ln(X1) = 2.77 + 0.26ln(X2) + 0.32ln(X3) + 0.31ln(X4) + 0.01ln(X5) +
0.09ln(X6) – 0.24ln(X7)
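As a cross-check, model (1) can be estimated by ordinary least squares outside Stata. The sketch below uses plain NumPy and, for brevity, only the first 10 rows of the data table, so the coefficients will not match Table 2 (which uses all 50 observations); it only illustrates the mechanics of the log-log fit.

```python
import numpy as np

# First 10 rows of the data table (X1..X7); an illustrative subset,
# not the full 50-observation sample used in the report.
data = np.array([
    [478, 184, 40, 74, 11, 31, 20],
    [494, 213, 32, 72, 11, 43, 18],
    [643, 347, 57, 70, 18, 16, 16],
    [341, 565, 31, 71, 11, 25, 19],
    [773, 327, 67, 72,  9, 29, 24],
    [603, 260, 25, 68,  8, 32, 15],
    [484, 325, 34, 68, 12, 24, 14],
    [546, 102, 33, 62, 13, 28, 11],
    [424,  38, 36, 69,  7, 25, 12],
    [548, 226, 31, 66,  9, 58, 15],
], dtype=float)

logs = np.log(data)                                   # log-transform every variable
y = logs[:, 0]                                        # ln(X1), the dependent variable
X = np.column_stack([np.ones(len(y)), logs[:, 1:]])   # intercept + ln(X2)..ln(X7)

# OLS: beta_hat minimizes ||y - X b||^2
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat
```

With the full 50-row array in place of this subset, beta_hat reproduces the Table 2 coefficients, since OLS on the logged variables is exactly what Stata computes here.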

2. Analyze regression results


2.1. Economic significance of regression coefficients
- B1 = 2.77 > 0
The intercept: when all the ln terms equal zero, ln(X1) is 2.77.
- B2 = 0.26 > 0
A direct relationship: each 1-unit increase in ln(X2) increases ln(X1) by
0.26 units.
- B3 = 0.32 > 0
A direct relationship: each 1-unit increase in ln(X3) increases ln(X1) by
0.32 units.
- B4 = 0.31 > 0
A direct relationship: each 1-unit increase in ln(X4) increases ln(X1) by
0.31 units.
- B5 = 0.01 > 0
A direct relationship: each 1-unit increase in ln(X5) increases ln(X1) by
0.01 units.
- B6 = 0.09 > 0
A direct relationship: each 1-unit increase in ln(X6) increases ln(X1) by
0.09 units.
- B7 = -0.24 < 0
An inverse relationship: each 1-unit increase in ln(X7) decreases ln(X1) by
0.24 units.
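Because both sides of the model are in logs, each slope coefficient is an elasticity, so the interpretations above can be restated in percentage terms. A quick arithmetic check using the rounded B2 = 0.26 from the SRF:

```python
# In a log-log model, d ln(X1) = b2 * d ln(X2): the slope is an elasticity.
b2 = 0.26                          # estimated coefficient on ln(X2)
x1_pct_change = b2 * 1.0           # approx % change in X1 for a 1% rise in X2
exact = (1.01 ** b2 - 1) * 100     # exact multiplicative effect of a 1% rise
print(f"approx {x1_pct_change:.2f}%, exact {exact:.3f}%")
```

So a 1% increase in X2 raises X1 by roughly 0.26%; the approximation and the exact multiplicative effect agree to three decimals for changes this small.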

2.2. Statistical significance of regression coefficients


Test the pair of hypotheses:

H0: βj = 0
H1: βj ≠ 0
(j = 2, ..., 7)

Test statistic: T = (β̂j − βj) / Se(β̂j) ~ T(n − k)

Rejection region: Wα = {T : |T| > t0.025 = 1.96} (normal approximation)

Tqs2 = 5.10 > 1.96 → Reject H0, accept H1 → B2 is statistically significant
Tqs3 = 2.42 > 1.96 → Reject H0, accept H1 → B3 is statistically significant
Tqs4 = 0.66 < 1.96 → Fail to reject H0 → B4 is not statistically significant
Tqs5 = 0.08 < 1.96 → Fail to reject H0 → B5 is not statistically significant
Tqs6 = 0.61 < 1.96 → Fail to reject H0 → B6 is not statistically significant
Tqs7 = 1.04 < 1.96 → Fail to reject H0 → B7 is not statistically significant
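Each t statistic is simply the coefficient divided by its standard error. The sketch below recomputes them from the Table 2 values and compares them against the exact Student-t cutoff (the text uses the normal approximation 1.96; with 43 residual degrees of freedom the exact cutoff is about 2.02, which does not change any conclusion):

```python
from scipy import stats

# Slope estimates and standard errors copied from Table 2.
coefs = {
    "ln_x2": (0.2603337, 0.0510629),
    "ln_x3": (0.3287415, 0.1357469),
    "ln_x4": (0.3108223, 0.4737788),
    "ln_x5": (0.0152426, 0.1959793),
    "ln_x6": (0.0955764, 0.1566919),
    "ln_x7": (-0.2479215, 0.2394942),
}

df = 50 - 7                          # n - k = 43 residual degrees of freedom
t_crit = stats.t.ppf(0.975, df)      # exact two-sided 5% cutoff, ~2.02

for name, (b, se) in coefs.items():
    t = b / se                       # T statistic under H0: beta_j = 0
    verdict = "reject H0" if abs(t) > t_crit else "fail to reject H0"
    print(f"{name}: t = {t:6.2f} -> {verdict}")
```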
Fix:
Eliminate X4, X5, X6 and X7; the regression is then re-estimated on the
remaining variables X2 and X3:
Ln(X1) = β1 + β2ln(X2) + β3ln(X3)
Table 3: regression model Ln(X1) = β1 + β2ln(X2) + β3ln(X3)

Dependent variable: Ln(X1)

Method: Least Squares

Date: 19/11/2024

Time: 1:00 PM

Included observations: 50
Source SS df MS Number of obs = 50
F( 2, 47) = 27.61
Model 3.61561629 2 1.80780814 Prob > F = 0.0000
Residual 3.07710252 47 .065470266 R-squared = 0.5402
Adj R-squared = 0.5207
Total 6.69271881 49 .136586098 Root MSE = .25587

ln_x1 Coef. Std. Err. t P>|t| [95% Conf. Interval]

ln_x2 .2383047 .0400974 5.94 0.000 .1576392 .3189702


ln_x3 .2994823 .1194239 2.51 0.016 .0592324 .5397322
_cons 3.99405 .4259659 9.38 0.000 3.137117 4.850983

(PRF): E(ln(X1)|X2,X3) = β1 + β2ln(X2) + β3ln(X3)

(SRF): Ln(X1) = 3.99 + 0.238ln(X2) + 0.299ln(X3)
Test the pair of hypotheses:

H0: βj = 0
H1: βj ≠ 0
(j = 2, 3)

Test statistic: T = (β̂j − βj) / Se(β̂j) ~ T(n − k)

Rejection region: Wα = {T : |T| > t0.025 = 1.96} (normal approximation)

Tqs2 = 5.94 > 1.96 → Reject H0, accept H1
→ B2 is statistically significant
Tqs3 = 2.51 > 1.96 → Reject H0, accept H1
→ B3 is statistically significant

3. Confidence interval for regression coefficients


The confidence interval for the regression coefficients is given by the
following formula:

β̂i − t(n−k)α/2 · Se(β̂i) < βi < β̂i + t(n−k)α/2 · Se(β̂i)

The confidence interval for the intercept is calculated as:

β̂1 − t(47)0.025 · Se(β̂1) < β1 < β̂1 + t(47)0.025 · Se(β̂1)

3.99 − 1.96 × 0.426 < β1 < 3.99 + 1.96 × 0.426

3.16 < β1 < 4.83

Holding reported violence and police funding constant, the intercept of the
crime-rate equation lies in the range (3.16; 4.83).

Similarly we have:

β̂2 − t(47)0.025 · Se(β̂2) < β2 < β̂2 + t(47)0.025 · Se(β̂2)

0.16 < β2 < 0.32

With annual police funding held constant, the elasticity of the crime rate
with respect to reported violence lies within the range (0.16; 0.32).

0.07 < β3 < 0.53

With the reported violent crime rate held constant, the elasticity of the
crime rate with respect to annual police funding lies in the range (0.07; 0.53).
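These intervals can be recomputed directly from the Table 3 estimates. The sketch below uses SciPy's Student-t quantile with 47 degrees of freedom rather than the 1.96 normal approximation, which reproduces the 95% bounds Stata prints in Table 3:

```python
from scipy import stats

n, k = 50, 3                          # 50 observations, 3 estimated coefficients
t_crit = stats.t.ppf(0.975, n - k)    # ~2.01 for 47 degrees of freedom

# Point estimates and standard errors from Table 3.
params = {
    "_cons": (3.99405, 0.4259659),
    "ln_x2": (0.2383047, 0.0400974),
    "ln_x3": (0.2994823, 0.1194239),
}

for name, (b, se) in params.items():
    lo, hi = b - t_crit * se, b + t_crit * se
    print(f"{name}: 95% CI = ({lo:.3f}; {hi:.3f})")
```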
4. Test the appropriateness of the model
Test the pair of hypotheses:

H0: R2 = 0

H1: R2 ≠ 0

F = (R2/(k−1)) / ((1−R2)/(n−k)) ~ F(2; 47)

Rejection region: Wα = {F : F > F0.05(2; 47) = 3.15}

We have: F = 27.61 ∈ Wα

→ Reject H0, accept H1

→ The model is appropriate

R2 = 0.5402 shows that the independent variables explain 54.02% of the
variation in the dependent variable.
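The overall F statistic follows directly from R² and the degrees of freedom; a minimal recomputation matching the F(2, 47) = 27.61 reported in Table 3:

```python
from scipy import stats

R2, n, k = 0.5402, 50, 3                   # values from Table 3
F = (R2 / (k - 1)) / ((1 - R2) / (n - k))  # overall-significance F statistic
F_crit = stats.f.ppf(0.95, k - 1, n - k)   # F0.05(2; 47), ~3.20
print(f"F = {F:.2f}, critical value = {F_crit:.2f}, reject H0: {F > F_crit}")
```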

5. Check the model for defects


5.1 Multicollinearity
5.1.1 Testing for multicollinearity
Assuming that X2 and X3 are linearly related, we run the auxiliary
regression:
Ln(X2) = β1 + β2ln(X3)

Dependent variable: Ln(X2)

Method: Least Squares


Date: 19/11/2024

Time: 1:20 PM

Included observations: 50

Source SS df MS Number of obs = 50


F( 1, 48) = 4.58
Model 3.88906193 1 3.88906193 Prob > F = 0.0374
Residual 40.720402 48 .848341709 R-squared = 0.0872
Adj R-squared = 0.0682
Total 44.6094639 49 .910397223 Root MSE = .92105

ln_x2 Coef. Std. Err. t P>|t| [95% Conf. Interval]

ln_x3 .8793954 .4107213 2.14 0.037 .053585 1.705206


_cons 2.899484 1.475121 1.97 0.055 -.0664458 5.865414

(SRF): Ln(X2) = 2.89 + 0.88ln(X3)


Test the pair of hypotheses:

H0: R2 = 0

H1: R2 ≠ 0

F = (R2/1) / ((1−R2)/48) ~ F(1; 48)

Rejection region: Wα = {F : F > F0.05(1; 48) = 4.08}

We have: F = 4.58 ∈ Wα

→ Multicollinearity occurs
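An equivalent way to gauge severity is the variance inflation factor implied by the auxiliary R². A quick check (note that while the auxiliary F test is significant at the 5% level, the implied VIF is far below the usual rule-of-thumb cutoff of 10, so the detected collinearity is mild):

```python
# Variance inflation factor from the auxiliary regression ln(X2) on ln(X3).
# R^2_aux = 0.0872 is taken from the table above.
r2_aux = 0.0872
vif = 1.0 / (1.0 - r2_aux)     # VIF = 1 / (1 - R^2_aux)
print(f"VIF = {vif:.2f}")      # a common rule of thumb flags VIF > 10
```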

Fix

Remove X3 and re-estimate the regression model Ln(X1) = β1 + β2ln(X2)


Dependent variable: Ln(X1)

Method: Least Squares

Date: 19/11/2024

Time: 1:45 PM

Included observations: 50

Source SS df MS Number of obs = 50


F( 1, 48) = 44.08
Model 3.20389451 1 3.20389451 Prob > F = 0.0000
Residual 3.4888243 48 .07268384 R-squared = 0.4787
Adj R-squared = 0.4679
Total 6.69271881 49 .136586098 Root MSE = .2696

ln_x1 Coef. Std. Err. t P>|t| [95% Conf. Interval]

ln_x2 .2679943 .040365 6.64 0.000 .186835 .3491537


_cons 4.885961 .2469886 19.78 0.000 4.389357 5.382565

(SRF): Ln(X1) = 4.88+ 0.26ln(X2)


Test the pair of hypotheses:

H0: R2 = 0

H1: R2 ≠ 0

F = (R2/1) / ((1−R2)/48) ~ F(1; 48)

Rejection region: Wα = {F : F > F0.05(1; 48) = 4.08}

We have: F = 44.08 ∈ Wα

→ Reject H0, accept H1 → The model is appropriate

R2 = 0.4787 shows that the independent variable explains 47.87% of the
variation in the dependent variable.
