###QUESTION 05

###Topic: Sheet 9 Exercise 1
###Questions:
##1) Why do you add "var.equal = FALSE" to the command?
#     I did not use this argument in my solution and got exactly the same result
##2) I also wanted to know if the command "paired = FALSE" is necessary in this case
#     or can be skipped because R assumes that an unpaired t-test is calculated?
##3) Finally, I wanted to ask why a comma should be added to the square bracket
#     "data.ccrt[data.ccrt$population == "BKK",]" after "BKK" before closing it?

###Answer:
##1)You should not get the exact same solution, look at the degrees of freedom (df) and the p-value
data.ccrt <- read.table("C:/Users/Ingo/Documents/A-Uni/EES/3rd_semester/Rcourse2020_Tutoring/data/ccrt.txt", header = TRUE)
var_noneq <- t.test(data.ccrt[data.ccrt$population == "KATH",]$ccrt,
                    data.ccrt[data.ccrt$population == "BKK",]$ccrt,
                    paired = FALSE, var.equal = FALSE)
var_eq <- t.test(data.ccrt[data.ccrt$population == "KATH",]$ccrt,
                 data.ccrt[data.ccrt$population == "BKK",]$ccrt,
                 paired = FALSE, var.equal = TRUE)
var_noneq$parameter
##       df
## 175.9615
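#A minimal sketch of where this Welch df comes from, using only summary statistics
#The means and variances are the values R prints elsewhere in this answer; the sample sizes
#are an assumption, inferred from the equal-variance df (n1 + n2 - 2 = 223) and the 103 BKK
#values printed in the answer to question 3
n_kath <- 122                                #assumed sample size (inferred, see above)
n_bkk <- 103                                 #assumed sample size (inferred, see above)
var_kath <- 66.85144                         #variance of ccrt in KATH
var_bkk <- 141.57834                         #variance of ccrt in BKK
mean_kath <- 36.77869                        #mean of ccrt in KATH
mean_bkk <- 45.99029                         #mean of ccrt in BKK
se2_kath <- var_kath / n_kath                #squared standard error of each group mean
se2_bkk <- var_bkk / n_bkk
#Welch-Satterthwaite approximation, used when the variances are not assumed equal
df_welch <- (se2_kath + se2_bkk)^2 /
  (se2_kath^2 / (n_kath - 1) + se2_bkk^2 / (n_bkk - 1))
t_welch <- (mean_kath - mean_bkk) / sqrt(se2_kath + se2_bkk)
#the classical two-sample t-test (equal variances) simply uses n1 + n2 - 2
df_pooled <- n_kath + n_bkk - 2
#df_welch and t_welch should come out close to what t.test reports, up to rounding of the inputs
c(df_welch = df_welch, t_welch = t_welch, df_pooled = df_pooled)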
var_eq$parameter
##  df
## 223
var_noneq$p.value
## [1] 3.686528e-10
var_eq$p.value
## [1] 7.150701e-11
#The solution says that we don't know the variances, but actually we do (you may remember var = sd^2)
tapply(data.ccrt$ccrt, data.ccrt$population, var)
##       BKK      KATH
## 141.57834  66.85144
#Still, even if we didn't know the variances we should assume they are not equal, because Bangkok and Kathmandu flies are different populations
#Different populations will generally have different variances.
#Depending on whether the variances are equal or not, you perform either a Two Sample t-test (equal var) or a Welch Two Sample t-test (different var)
#You can also see which t-test has been run in the output
var_noneq$method
## [1] "Welch Two Sample t-test"
var_eq$method
## [1] " Two Sample t-test"
#The two tests calculate the t-statistic and the df differently (more on that in the stats lecture next semester)

##2)You are right: if you check the help file (?t.test) you can see that paired = FALSE is the default, so if you know this you can of course leave it out
#If you are just starting to use a command, however, I would still recommend writing it out to help you remember what the default setting is
#As you can see below, you could also leave out var.equal = FALSE, as this is also the default
t.test(data.ccrt[data.ccrt$population == "KATH",]$ccrt, data.ccrt[data.ccrt$population == "BKK",]$ccrt)
##
##  Welch Two Sample t-test
##
## data:  data.ccrt[data.ccrt$population == "KATH", ]$ccrt and data.ccrt[data.ccrt$population == "BKK", ]$ccrt
## t = -6.6436, df = 175.96, p-value = 3.687e-10
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.948003  -6.475203
## sample estimates:
## mean of x mean of y
##  36.77869  45.99029

##3)The comma before the end of the bracket in data.ccrt[data.ccrt$population == "BKK",] signifies that you want to take all columns from this data frame
#leaving the comma out would result in an "undefined columns selected" error, because R then tries to use the TRUE/FALSE vector to select columns instead of rows
#alternatively, you can select a single column or only some of the columns, for example with
data.ccrt[data.ccrt$population == "BKK",2]
##   [1] 24 25 26 27 27 27 27 27 27 28 29 30 30 31 31 32 32 33 34 34 35 35 36 36 37
##  [26] 37 37 37 38 38 38 39 39 39 40 41 42 42 43 43 43 43 43 44 44 45 45 45 46 46
##  [51] 46 46 46 47 47 47 47 48 48 48 48 49 49 50 51 51 51 51 51 52 53 53 53 54 54
##  [76] 54 54 54 55 55 55 56 58 58 58 58 59 59 59 60 61 61 61 62 62 63 65 68 68 69
## [101] 69 69 70
#which only gives you the ccrt values without the population name
#when leaving the space after the comma empty you are saying that you want to take all of the columns
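
#A minimal sketch of the indexing forms discussed above, on a small made-up data frame
#("toy", "is_bkk" and the values are hypothetical and only stand in for data.ccrt)
toy <- data.frame(population = c("BKK", "KATH", "BKK"), ccrt = c(40, 35, 50))
is_bkk <- toy$population == "BKK"   #logical vector with one TRUE/FALSE per row
toy[is_bkk, ]                       #rows where population is "BKK", all columns
toy[is_bkk, 2]                      #same rows, second column only (returned as a plain vector)
toy[is_bkk, "ccrt"]                 #same rows, column selected by name instead of position
#without the comma, the logical vector is used to pick columns instead of rows
#it has three entries but toy has only two columns, so R stops with an error
#try() is used here only so that the script keeps running after the error
res <- try(toy[is_bkk], silent = TRUE)
inherits(res, "try-error")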