II. Data Sets
a) Student’s year and the amount of Coffee they take
b) Socializing time and the amount of Coffee they take
c) Gender and the amount of Coffee they take
III. Conclusion
To test this guess, we run a two-sample t-test for independent samples.
The null hypothesis is that there is no difference between lower-class and upper-class students. The alternative hypothesis is that there is a significant difference between lower-class and upper-class students.
H0 : (the mean amount of coffee for lower-class students) - (the mean amount of coffee for upper-class students) = 0
Ha : (the mean amount of coffee for lower-class students) - (the mean amount of coffee for upper-class students) ≠ 0
The following is the result of the t-test from SPSS. The test statistic can be read from these outputs.
From the descriptive statistics output on the previous page, we can also calculate the t-statistic by hand.
t = (0.79 - 1.23) / √(2.14/61 + 1.8/22) = -1.294
d.f. = 61 + 22 - 2 = 81
The t-statistic from our calculation (-1.294) and the one in the SPSS output (-1.270) are almost the same. (The slight difference between the value we computed and the value in the table may come from rounding to different numbers of decimal places.)
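The hand calculation can be reproduced with a short Python sketch. It assumes, as the formula above does, that 2.14 and 1.8 are the sample variances of the groups with 61 and 22 students; the function name is ours and is not part of SPSS.

```python
import math

def two_sample_t(mean1, var1, n1, mean2, var2, n2):
    """t-statistic for two independent samples, using unpooled variances."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error

# Rounded summary values as printed in the report (assumed to be variances).
t = two_sample_t(0.79, 2.14, 61, 1.23, 1.8, 22)
df = 61 + 22 - 2
print(f"t = {t:.3f}, d.f. = {df}")
```

With these rounded inputs the result lands near -1.29, close to the values reported above.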
With 81 degrees of freedom, the t-distribution is close to the normal distribution, so we can use the z-distribution instead of the t-distribution.
At α = 0.05, the null hypothesis cannot be rejected because the p-value, 0.211, is larger than α. The data therefore do not show a significant difference in the amount of coffee between upper-class and lower-class students.
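As a rough check of this decision, the sketch below computes a two-sided p-value under the normal (z) approximation, using the SPSS statistic of -1.270. The helper name is hypothetical, and the approximation is not expected to match the SPSS p-value exactly.

```python
from statistics import NormalDist

def two_sided_p_normal(statistic):
    """Two-sided p-value under the standard normal approximation."""
    return 2 * NormalDist().cdf(-abs(statistic))

alpha = 0.05
p = two_sided_p_normal(-1.270)  # t-statistic quoted from the SPSS output
reject_null = p < alpha
print(f"p = {p:.3f}, reject H0: {reject_null}")
```

The approximate p-value is about 0.20, slightly different from the 0.211 reported by SPSS, and it leads to the same decision not to reject the null hypothesis.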
In addition, because the number of upper-class students (22) is much smaller than the number of lower-class students (61), the result may be biased. To test these data more reliably, we suggest sampling more upper-class students and removing extreme outliers that can distort the mean and variance.
Socializing time and the amount of Coffee they take
As the scatter plot shows, it is hard to identify any relationship between socializing time and the amount of coffee taken from the raw data. No linear relationship is apparent in the plot. There are several outliers, and most observations have socializing times between 0 and 20.
However, even though it does not seem that there is a relationship between the
