RESEARCH METHODS AND DATA ANALYSIS

ASSIGNMENT FOR RESEARCH METHODS AND DATA ANALYSIS

Exercise 1a

Case Processing Summary
                                Valid            Missing          Total
                                N    Percent     N    Percent     N    Percent
cholesterol values in mmol/L    44   100.0%      0    0.0%        44   100.0%

All 44 cases are valid and none are missing (0.0%), so the assessment of normality for the distribution of cholesterol values uses the full sample.

Descriptives
cholesterol values in mmol/L
                                                   Statistic   Std. Error
Mean                                               5.9932      .23706
95% Confidence Interval for Mean   Lower Bound     5.5151
                                   Upper Bound     6.4713
5% Trimmed Mean                                    6.0162
Median                                             5.7500
Variance                                           2.473
Std. Deviation                                     1.57250
Minimum                                            3.00
Maximum                                            8.50
Range                                              5.50
Interquartile Range                                2.50
Skewness                                           -.184       .357
Kurtosis                                           -.980       .702

 

From the descriptive values, the mean cholesterol value is 5.99 with a standard error of 0.24, and the 5% trimmed mean is 6.02. The median, variance and standard deviation of the cholesterol values are 5.75, 2.473 and 1.572 respectively.

For a normal distribution, the values of skewness and kurtosis need to be approximately zero. Here the skewness is -0.184 (standard error 0.357) and the kurtosis is -0.980 (standard error 0.702). Both are smaller in absolute value than twice their standard errors, which is consistent with the pattern of a normal distribution.
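The skewness and kurtosis check can be sketched as a ratio of each statistic to its standard error. The values are copied from the Descriptives table; the 1.96 cut-off is a common rule of thumb and an assumption of this sketch, not something stated in the SPSS output.

```python
# Values copied from the SPSS Descriptives table.
skewness, se_skewness = -0.184, 0.357
kurtosis, se_kurtosis = -0.980, 0.702

# Assumed rule of thumb: |statistic / standard error| below 1.96
# is treated as compatible with a normal distribution.
THRESHOLD = 1.96


def z_ratio(statistic, std_error):
    """Return the statistic divided by its standard error."""
    return statistic / std_error


z_skew = z_ratio(skewness, se_skewness)
z_kurt = z_ratio(kurtosis, se_kurtosis)

print(f"skewness ratio = {z_skew:.2f}, within threshold: {abs(z_skew) < THRESHOLD}")
print(f"kurtosis ratio = {z_kurt:.2f}, within threshold: {abs(z_kurt) < THRESHOLD}")
```

Both ratios fall inside the assumed threshold, which agrees with the reading of the table given above.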

Tests of Normality
                                Kolmogorov-Smirnov(a)          Shapiro-Wilk
                                Statistic   df   Sig.          Statistic   df   Sig.
cholesterol values in mmol/L    .105        44   .200*         .959        44   .121
*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction

 

 

The Shapiro-Wilk test assesses whether a random sample comes from a normal distribution (Hanusz, et al., 2016). The Kolmogorov-Smirnov test (KS test) helps to determine whether distributions differ significantly (Hassani & Silva, 2015); here it compares the sample distribution with a normal distribution. The normality of the cholesterol values has been tested using both the Shapiro-Wilk test and the KS test.

The null hypothesis of both tests is that the data follow a normal distribution. This null hypothesis is rejected if the significance value is less than 0.05, which would indicate a non-normal distribution. For the KS test, the significance value for cholesterol is 0.200, which is greater than 0.05, so the null hypothesis of normality is not rejected.

From the Shapiro-Wilk test, the significance value for cholesterol is 0.121, which is greater than 0.05. So, the cholesterol values can be treated as normally distributed. The Shapiro-Wilk test is also considered particularly useful when the number of observations is small.

 


From the histogram, the values lie between 3 and 9, which reflects the continuous measurement of the variable. The standard deviation is 1.57 for the sample size of 44, and the histogram is consistent with the median of 5.75.


The normal Q-Q plot of cholesterol values shows that the observed values lie close to the reference line, which indicates a normal distribution of the variable. Taking each of these checks together, the cholesterol values are approximately normally distributed.

The detrended normal Q-Q plot shows how far each observed value deviates from the value expected under a normal distribution. The pattern of the Q-Q plots supports the approximately normal distribution of the cholesterol values.

From the boxplot, the line across the box indicates the median of 5.75, and the observed values span the range from the minimum of 3.00 to the maximum of 8.50.

Exercise 1b

Statistics
cholesterol values in mmol/L
N    Valid          44
     Missing        0
Mean                5.9932
Median              5.7500
Mode                5.50
Std. Deviation      1.57250
Variance            2.473
Range               5.50
Minimum             3.00
Maximum             8.50

 

From the statistics, there are no missing values in the cholesterol data, and the mean, mode and median of the dataset are 5.99, 5.50 and 5.75 respectively.

The sample variance is 2.47, and the maximum and minimum observed values are 8.50 and 3.00 respectively. Moreover, the 95% confidence interval for the mean cholesterol value runs from 5.52 to 6.47 mmol/L, which is also reflected in the figure below.

From the histogram, the values lie between 3 and 9, which reflects the continuous measurement of the variable. The standard deviation is 1.57 for the sample size of 44, and the histogram is consistent with the median of 5.75.

The true population mean is expected to lie between the lower and upper limits of the 95% confidence interval (5.52 to 6.47 mmol/L), an interval that relies on the approximate normality of the data.
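As a rough check on the confidence interval, the standard error can be recomputed from the standard deviation and sample size, and the multiplier implied by the reported bounds can be backed out. This is a minimal sketch using only values reported in the tables above; the variable names are illustrative.

```python
import math

# Values copied from the SPSS output for cholesterol (mmol/L).
n = 44
mean = 5.9932
std_dev = 1.57250
lower, upper = 5.5151, 6.4713

# Standard error of the mean: s / sqrt(n).
std_error = std_dev / math.sqrt(n)

# Half-width of the reported interval and the multiplier it implies.
half_width = (upper - lower) / 2
implied_multiplier = half_width / std_error

print(f"standard error     = {std_error:.5f}")
print(f"interval midpoint  = {(lower + upper) / 2:.4f}")
print(f"implied multiplier = {implied_multiplier:.3f}")
```

The implied multiplier is slightly above 1.96, as expected when the interval is based on the t distribution with 43 degrees of freedom rather than the normal distribution.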

Exercise 2a

Paired Samples Statistics
                                                 Mean       N    Std. Deviation   Std. Error Mean
Pair 1   aerobic capacity for group 1            46.3462    13   8.13794          2.25706
         aerobic capacity for group 2            34.3692    13   6.86590          1.90426
Pair 2   systolic blood pressure for group 1     122.7692   13   3.72276          1.03251
         systolic blood pressure for group 2     136.5385   13   5.79677          1.60774
Pair 3   body fat for group 1                    21.5000    13   5.13566          1.42438
         body fat for group 2                    32.1154    13   3.68077          1.02086

 

To test the hypothesis, the two groups of participants are compared on three pairs of variables: aerobic capacity, systolic blood pressure (SBP) and body fat. With a constant sample size of 13, the mean and standard deviation of aerobic capacity for the healthy weight group are 46.35 and 8.14 respectively. The mean and standard deviation of SBP for the overweight group are 136.54 and 5.80 respectively.

Paired Samples Correlations
                                                                                       N    Correlation   Sig.
Pair 1   aerobic capacity for group 1 & aerobic capacity for group 2                   13   .087          .778
Pair 2   systolic blood pressure for group 1 & systolic blood pressure for group 2     13   .134          .663
Pair 3   body fat for group 1 & body fat for group 2                                   13   -.179         .559

 

From the paired samples correlations, the correlation between aerobic capacity for group 1 and group 2 is 0.087 (significance 0.778), and the correlation between systolic blood pressure for group 1 and group 2 is 0.134 (significance 0.663). The correlation between body fat for group 1 and group 2 is -0.179 (significance 0.559). All three significance values are above 0.05, so none of the correlations is statistically significant.

Paired Samples Test
                                            Paired Differences
                                                                                         95% Confidence Interval
                                                                                         of the Difference
                                            Mean        Std. Deviation  Std. Error Mean  Lower       Upper       t        df   Sig. (2-tailed)
Pair 1   aerobic capacity for group 1 -
         aerobic capacity for group 2       11.97692    10.18194        2.82396          5.82404     18.12981    4.241    12   .001
Pair 2   systolic blood pressure for group 1 -
         systolic blood pressure for group 2  -13.76923   6.45696         1.79084          -17.67113   -9.86733    -7.689   12   .000
Pair 3   body fat for group 1 -
         body fat for group 2               -10.61538   6.83177         1.89479          -14.74378   -6.48699    -5.602   12   .000

 

Paired sample tests are normally carried out to evaluate whether the mean difference between two data sets is zero or not (Skyttberg, et al., 2018). From the paired samples test, the mean difference is significantly different from zero for all three pairs: aerobic capacity (t = 4.241, p = 0.001), systolic blood pressure (t = -7.689, p < 0.001) and body fat (t = -5.602, p < 0.001).

The standard error of the mean difference for the three pairs is 2.82, 1.79 and 1.89 respectively. For each pair, the true mean difference is expected to lie between the lower and upper limits of its 95% confidence interval, and none of the three intervals includes zero.
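The t values in the Paired Samples Test table follow from the mean difference divided by its standard error. A minimal sketch that recomputes them from the reported summary statistics (the dictionary keys and function name are illustrative, not part of the SPSS output):

```python
import math

n = 13  # number of pairs

# (mean difference, standard deviation of the differences) from the table.
pairs = {
    "aerobic capacity": (11.97692, 10.18194),
    "systolic blood pressure": (-13.76923, 6.45696),
    "body fat": (-10.61538, 6.83177),
}


def paired_t(mean_diff, sd_diff, n_pairs):
    """Return (standard error, t statistic) for a paired-samples t test."""
    std_error = sd_diff / math.sqrt(n_pairs)
    return std_error, mean_diff / std_error


results = {name: paired_t(m, s, n) for name, (m, s) in pairs.items()}

for name, (std_error, t_value) in results.items():
    print(f"{name}: SE = {std_error:.5f}, t = {t_value:.3f}, df = {n - 1}")
```

The recomputed standard errors and t values agree with the table to the reported precision.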

Exercise 2b

Paired Samples Statistics
                                          Mean      N    Std. Deviation   Std. Error Mean
Pair 1   aerobic capacity for group 2     34.3692   13   6.86590          1.90426
         group 2 retest for AC            36.4077   13   5.14546          1.42709

 

For the overweight group, the mean aerobic capacity rose from 34.37 to 36.41 and the standard deviation changed from 6.87 to 5.15, which demonstrates a slight change after training.

Paired Samples Correlations
                                                                  N    Correlation   Sig.
Pair 1   aerobic capacity for group 2 & group 2 retest for AC     13   .912          .000

 

For the sample size of 13, the correlation between the aerobic capacity of group 2 and its retest value is 0.912, which is statistically significant (p < 0.001).

Paired Samples Test
                                            Paired Differences
                                                                                         95% Confidence Interval
                                                                                         of the Difference
                                            Mean       Std. Deviation  Std. Error Mean   Lower      Upper      t        df   Sig. (2-tailed)
Pair 1   aerobic capacity for group 2 -
         group 2 retest for AC              -2.03846   3.02587         .83923            -3.86698   -.20995    -2.429   12   .032

 

The true mean difference is expected to lie between -3.87 and -0.21 (95% confidence interval), a range that does not include zero. The mean difference is -2.04 with a standard deviation of 3.03, and the difference is statistically significant (t = -2.429, p = 0.032).
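The small standard deviation of the differences is a consequence of the high test-retest correlation. A sketch of that relationship, using the identity Var(X - Y) = Var(X) + Var(Y) - 2 r SD(X) SD(Y) with the rounded values from the tables above, so the result only approximates the SPSS figures:

```python
import math

n = 13
mean_before, sd_before = 34.3692, 6.86590   # aerobic capacity for group 2
mean_after, sd_after = 36.4077, 5.14546     # group 2 retest for AC
r = 0.912                                   # paired correlation (rounded)

# Variance of the paired differences from the two SDs and their correlation.
var_diff = sd_before**2 + sd_after**2 - 2 * r * sd_before * sd_after
sd_diff = math.sqrt(var_diff)

mean_diff = mean_before - mean_after
t_value = mean_diff / (sd_diff / math.sqrt(n))

print(f"SD of differences = {sd_diff:.3f}")
print(f"mean difference   = {mean_diff:.4f}")
print(f"t                 = {t_value:.3f}")
```

Because the correlation is only reported to three decimals, the recomputed values differ from the table in the third decimal place.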

Exercise 3a

 

Correlations
                                       systolic bp   diastolic bp   age
systolic bp     Pearson Correlation    1             .725**         .223
                Sig. (2-tailed)                      .000           .124
                N                      49            49             49
diastolic bp    Pearson Correlation    .725**        1              .365**
                Sig. (2-tailed)        .000                         .010
                N                      49            49             49
age             Pearson Correlation    .223          .365**         1
                Sig. (2-tailed)        .124          .010
                N                      49            49             49
**. Correlation is significant at the 0.01 level (2-tailed).

Exercise 3b

 

Based on the bivariate correlation table above, diastolic blood pressure and systolic blood pressure have the highest correlation at 0.725 (p < 0.001), followed by diastolic blood pressure and age at 0.365 (p = 0.010). The correlation between systolic blood pressure and age is 0.223, which is not statistically significant (p = 0.124).
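The significance values in the correlation table can be approximated from r and N alone, using t = r * sqrt((N - 2) / (1 - r^2)) with N - 2 degrees of freedom. The sketch below uses only the standard library, so the two-tailed p-value is obtained by numerically integrating the t density with Simpson's rule; that integration approach and the helper names are choices of this sketch, not what SPSS does internally.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a zero correlation, with n - 2 df."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value of Student's t via Simpson's rule on the density."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_area)


n = 49
correlations = {
    "systolic bp vs diastolic bp": 0.725,
    "diastolic bp vs age": 0.365,
    "systolic bp vs age": 0.223,
}

p_values = {pair: two_tailed_p(t_from_r(r, n), n - 2)
            for pair, r in correlations.items()}

for pair, p in p_values.items():
    print(f"{pair}: r = {correlations[pair]:.3f}, p = {p:.3f}")
```

Since the correlations are rounded to three decimals, the recomputed p-values match the table only approximately.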

Exercise 4a

 

Descriptives
                                                                 95% Confidence Interval for Mean
                      N    Mean    Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
Atkins Diet   25      2    54.50   36.062           25.500       -269.51       378.51        29        80
              32      1    37.00   .                .            .             .             37        37
              36      1    40.00   .                .            .             .             40        40
              39      1    48.00   .                .            .             .             48        48
              40      1    65.00   .                .            .             .             65        65
              41      1    66.00   .                .            .             .             66        66
              49      2    57.00   4.243            3.000        18.88         95.12         54        60
              50      1    58.00   .                .            .             .             58        58
              51      1    63.00   .                .            .             .             63        63
              52      1    57.00   .                .            .             .             57        57
              58      2    55.00   7.071            5.000        -8.53         118.53        50        60
              60      1    70.00   .                .            .             .             70        70
              61      1    70.00   .                .            .             .             70        70
              72      1    73.00   .                .            .             .             73        73
              Total   17   57.65   13.527           3.281        50.69         64.60         29        80
5:2 Diet      25      2    39.00   29.698           21.000       -227.83       305.83        18        60
              32      1    25.00   .                .            .             .             25        25
              36      1    28.00   .                .            .             .             28        28
              39      1    44.00   .                .            .             .             44        44
              40      1    45.00   .                .            .             .             45        45
              41      1    59.00   .                .            .             .             59        59
              49      2    46.00   5.657            4.000        -4.82         96.82         42        50
              50      1    54.00   .                .            .             .             54        54
              51      1    68.00   .                .            .             .             68        68
              52      1    47.00   .                .            .             .             47        47
              58      2    55.50   .707             .500         49.15         61.85         55        56
              60      1    63.00   .                .            .             .             63        63
              61      1    55.00   .                .            .             .             55        55
              72      1    75.00   .                .            .             .             75        75
              Total   17   49.65   15.137           3.671        41.86         57.43         18        75

 

ANOVA
                               Sum of Squares   df   Mean Square   F      Sig.
Atkins Diet   Between Groups   1559.382         13   119.952       .263   .963
              Within Groups    1368.500         3    456.167
              Total            2927.882         16
5:2 Diet      Between Groups   2751.382         13   211.645       .694   .724
              Within Groups    914.500          3    304.833
              Total            3665.882         16

Exercise 4b

Null hypothesis: There is no difference in the preference of a particular diet.

Alternate hypothesis: There is a significant difference in the preference of a particular diet.

A one-way ANOVA tests whether the means of independent groups differ (Kim, 2017). Based on the one-way ANOVA results, the significance values for the Atkins diet (0.963) and the 5:2 diet (0.724) in relation to the low-calorie diet are both higher than the significance level of 0.05.

Therefore, the null hypothesis is not rejected. Since the significance values from the ANOVA are above 0.05 for both variables, there is no need for post hoc tests.
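The F values in the ANOVA table are the ratio of the between-groups mean square to the within-groups mean square, where each mean square is a sum of squares divided by its degrees of freedom. A minimal sketch that recomputes them from the reported sums of squares (the function name is illustrative):

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) for a one-way ANOVA table."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares and degrees of freedom copied from the ANOVA table.
atkins = f_ratio(1559.382, 13, 1368.500, 3)
five_two = f_ratio(2751.382, 13, 914.500, 3)

for label, (ms_b, ms_w, f_value) in (("Atkins Diet", atkins), ("5:2 Diet", five_two)):
    print(f"{label}: MS between = {ms_b:.3f}, MS within = {ms_w:.3f}, F = {f_value:.3f}")
```

Both F values are below 1, meaning the variation between groups is smaller than the variation within groups, which is in line with the non-significant results.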

Exercise 5a

Descriptives
systolic BP
                                                            95% Confidence Interval for Mean
           N    Mean       Std. Deviation   Std. Error      Lower Bound   Upper Bound   Minimum   Maximum
6 weeks    10   187.4000   12.94604         4.09390         178.1390      196.6610      170.00    210.00
12 weeks   7    179.0000   14.23610         5.38074         165.8338      192.1662      159.00    200.00
24 weeks   6    163.5000   4.96991          2.02896         158.2844      168.7156      155.00    169.00
Total      23   178.6087   15.06271         3.14079         172.0951      185.1223      155.00    210.00

 

ANOVA
systolic BP
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   2143.578         2    1071.789      7.527   .004
Within Groups    2847.900         20   142.395
Total            4991.478         22

 

Post Hoc Tests

Multiple Comparisons
Dependent Variable:   systolic BP
Tukey HSD
                                                                      95% Confidence Interval
(I) Group   (J) Group   Mean Difference (I-J)   Std. Error   Sig.     Lower Bound   Upper Bound
6 weeks     12 weeks    8.40000                 5.88062      .346     -6.4779       23.2779
            24 weeks    23.90000*               6.16214      .003     8.3099        39.4901
12 weeks    6 weeks     -8.40000                5.88062      .346     -23.2779      6.4779
            24 weeks    15.50000                6.63887      .074     -1.2962       32.2962
24 weeks    6 weeks     -23.90000*              6.16214      .003     -39.4901      -8.3099
            12 weeks    -15.50000               6.63887      .074     -32.2962      1.2962
*. The mean difference is significant at the 0.05 level.

 

Exercise 5b

From the results of the ANOVA, there is a statistically significant difference between the levels of the independent variable (F = 7.527, p = 0.004), i.e. systolic blood pressure differs between the exercise durations of 6 weeks, 12 weeks and 24 weeks.

The Tukey HSD post hoc test shows where the significant difference lies. Only the comparison between 24 weeks and 6 weeks is significant (p = 0.003). The comparisons between 6 weeks and 12 weeks (p = 0.346) and between 12 weeks and 24 weeks (p = 0.074) have significance values above 0.05, so these differences are not statistically significant.
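The mean differences and standard errors in the Multiple Comparisons table can be recomputed from the group sizes, group means and the within-groups mean square, using SE = sqrt(MS within * (1/n_i + 1/n_j)). This is a sketch of those two columns only; the Sig. column depends on the studentized range distribution and is not reproduced here.

```python
import math
from itertools import combinations

# (n, mean systolic BP) per group, copied from the Descriptives table.
groups = {
    "6 weeks": (10, 187.4),
    "12 weeks": (7, 179.0),
    "24 weeks": (6, 163.5),
}
ms_within = 142.395  # within-groups mean square from the ANOVA table


def pairwise_comparisons(groups, ms_within):
    """Return {(group_i, group_j): (mean difference, standard error)}."""
    rows = {}
    for (name_i, (n_i, mean_i)), (name_j, (n_j, mean_j)) in combinations(groups.items(), 2):
        diff = mean_i - mean_j
        std_error = math.sqrt(ms_within * (1 / n_i + 1 / n_j))
        rows[(name_i, name_j)] = (diff, std_error)
    return rows


comparisons = pairwise_comparisons(groups, ms_within)

for (name_i, name_j), (diff, std_error) in comparisons.items():
    print(f"{name_i} vs {name_j}: difference = {diff:.2f}, SE = {std_error:.5f}")
```

The largest difference, between 6 weeks and 24 weeks, is almost four standard errors wide, which matches it being the only significant comparison.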

Exercise 6a

Between-Subjects Factors
                                    Value Label   N
Main Sport participated in    1     Runner        8
                              2     Rower         8

 

Multivariate Tests(a)
Effect                                     Value    F            Hypothesis df   Error df   Sig.
heartrate           Pillai’s Trace         .971     37.635(b)    7.000           8.000      .000
                    Wilks’ Lambda          .029     37.635(b)    7.000           8.000      .000
                    Hotelling’s Trace      32.931   37.635(b)    7.000           8.000      .000
                    Roy’s Largest Root     32.931   37.635(b)    7.000           8.000      .000
heartrate * sport   Pillai’s Trace         .641     2.038(b)     7.000           8.000      .170
                    Wilks’ Lambda          .359     2.038(b)     7.000           8.000      .170
                    Hotelling’s Trace      1.783    2.038(b)     7.000           8.000      .170
                    Roy’s Largest Root     1.783    2.038(b)     7.000           8.000      .170
a. Design: Intercept + sport
   Within Subjects Design: heartrate
b. Exact statistic

 

Mauchly’s Test of Sphericity(a)
Measure:   MEASURE_1
                                                                           Epsilon(b)
Within Subjects Effect   Mauchly’s W   Approx. Chi-Square   df   Sig.      Greenhouse-Geisser   Huynh-Feldt   Lower-bound
heartrate                .001          81.311               27   .000      .411                 .565          .143
Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.
a. Design: Intercept + sport
   Within Subjects Design: heartrate
b. May be used to adjust the degrees of freedom for the averaged tests of significance. Corrected tests are displayed in the Tests of Within-Subjects Effects table.

 

Tests of Within-Subjects Effects
Measure:   MEASURE_1
Source                                     Type III Sum of Squares   df       Mean Square   F        Sig.
heartrate           Sphericity Assumed     15918.094                 7        2274.013      87.152   .000
                    Greenhouse-Geisser     15918.094                 2.876    5535.544      87.152   .000
                    Huynh-Feldt            15918.094                 3.956    4023.623      87.152   .000
                    Lower-bound            15918.094                 1.000    15918.094     87.152   .000
heartrate * sport   Sphericity Assumed     190.844                   7        27.263        1.045    .405
                    Greenhouse-Geisser     190.844                   2.876    66.366        1.045    .381
                    Huynh-Feldt            190.844                   3.956    48.240        1.045    .392
                    Lower-bound            190.844                   1.000    190.844       1.045    .324
Error(heartrate)    Sphericity Assumed     2557.063                  98       26.092
                    Greenhouse-Geisser     2557.063                  40.259   63.516
                    Huynh-Feldt            2557.063                  55.386   46.168
                    Lower-bound            2557.063                  14.000   182.647

 

Tests of Within-Subjects Contrasts
Measure:   MEASURE_1
Source              heartrate   Type III Sum of Squares   df   Mean Square   F         Sig.
heartrate           Linear      3043.006                  1    3043.006      42.968    .000
                    Quadratic   30.006                    1    30.006        2.137     .166
                    Cubic       6436.031                  1    6436.031      148.967   .000
                    Order 4     .274                      1    .274          .017      .899
                    Order 5     2849.144                  1    2849.144      133.314   .000
                    Order 6     16.751                    1    16.751        4.524     .052
                    Order 7     3542.881                  1    3542.881      271.684   .000
heartrate * sport   Linear      17.357                    1    17.357        .245      .628
                    Quadratic   44.024                    1    44.024        3.135     .098
                    Cubic       62.546                    1    62.546        1.448     .249
                    Order 4     24.961                    1    24.961        1.516     .238
                    Order 5     6.322                     1    6.322         .296      .595
                    Order 6     5.046                     1    5.046         1.363     .263
                    Order 7     30.587                    1    30.587        2.346     .148
Error(heartrate)    Linear      991.494                   14   70.821
                    Quadratic   196.613                   14   14.044
                    Cubic       604.862                   14   43.204
                    Order 4     230.485                   14   16.463
                    Order 5     299.203                   14   21.372
                    Order 6     51.839                    14   3.703
                    Order 7     182.566                   14   13.040

 

Tests of Between-Subjects Effects
Measure:   MEASURE_1
Transformed Variable:   Average
Source      Type III Sum of Squares   df   Mean Square   F          Sig.
Intercept   3292819.531               1    3292819.531   5790.345   .000
sport       3549.031                  1    3549.031      6.241      .026
Error       7961.437                  14   568.674

 

Exercise 6b

Based on the significance values in the Tests of Within-Subjects Contrasts table, the linear, cubic, order 5 and order 7 contrasts for heart rate across the eight measurements are significant (p < 0.05), whereas the quadratic, order 4 and order 6 contrasts are not. For the heartrate * sport interaction, all significance values are higher than 0.05. The test of between-subjects effects also showed that the significance value of sport is less than 0.05 (p = 0.026). From the above graph of estimated marginal means of heart rate, it is observed that the rowers showed higher heart rates in comparison to runners.
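Mauchly's test is significant (p < 0.001) in the output above, so the corrected rows of the Tests of Within-Subjects Effects table are the relevant ones. As a sketch of how the Greenhouse-Geisser row is obtained, the degrees of freedom are multiplied by epsilon while the F ratio stays the same. The epsilon of .411 is the rounded value from the table, so the adjusted degrees of freedom only approximate the SPSS figures.

```python
# Values copied from the SPSS output for the heartrate effect.
epsilon_gg = 0.411            # Greenhouse-Geisser epsilon (rounded)
df_effect, df_error = 7, 98   # sphericity-assumed degrees of freedom
ss_heartrate = 15918.094      # Type III sum of squares, heartrate
ss_error = 2557.063           # Type III sum of squares, Error(heartrate)


def greenhouse_geisser(ss_effect, df_effect, ss_error, df_error, epsilon):
    """Return (adjusted effect df, adjusted error df, F ratio)."""
    adj_df_effect = epsilon * df_effect
    adj_df_error = epsilon * df_error
    # Epsilon cancels in the ratio, so F equals the uncorrected F.
    f_value = (ss_effect / adj_df_effect) / (ss_error / adj_df_error)
    return adj_df_effect, adj_df_error, f_value


adj_effect, adj_error, f_value = greenhouse_geisser(
    ss_heartrate, df_effect, ss_error, df_error, epsilon_gg
)

print(f"adjusted df = ({adj_effect:.3f}, {adj_error:.3f}), F = {f_value:.3f}")
```

Only the degrees of freedom used to look up the significance change under the correction, which is why all four rows of the table report the same F.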

References

Hanusz, Z., Tarasinska, J. & Zielinski, W., 2016. Shapiro-Wilk test with known mean. REVSTAT-Statistical Journal, 14(1), pp. 89-100.

Hassani, H. & Silva, E., 2015. A Kolmogorov-Smirnov based test for comparing the predictive accuracy of two sets of forecasts. Econometrics, 3(3), pp. 590-609.

Kim, T., 2017. Understanding one-way ANOVA using conceptual figures. Korean journal of anesthesiology, 70(1), p. 22.

Skyttberg, N., Chen, R. & Koch, S., 2018. Man vs machine in emergency medicine–a study on the effects of manual and automatic vital sign documentation on data quality and perceived workload, using observational paired sample data and questionnaires. BMC, 18(1), p. 54.

 

 

 
