### QUESTION 2
#
## Topic: Sheet 8 Exercise 1
## Question:
# 1. Why is there a 2 in the square brackets in: for (i in 1:dim(data)[2])
# 2. Why is there no comma in front of the i in: y<-c(y, names(data)[i]), unlike in the line above, when outputting the name of the column?
# Solution
myfact <- function(data) {
  y <-c()
  for (i in 1:dim(data)[2]){
    if (is.factor(data[,i]) == TRUE){
      y<-c(y, names(data)[i])
    }
  }
  return(y)
}
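# stringsAsFactors = TRUE matters from R 4.0.0 on, where text columns are otherwise read as character instead of factor
heartbeats <- read.table("http://evol.bio.lmu.de/_statgen/Rcourse/ws1920/data/heartbeats.txt", header = TRUE, stringsAsFactors = TRUE)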
## Answer:
# 1. Load the heartbeats data and run this to see what happens:
dim(heartbeats)
## [1] 210 3
# Explanation: The output of the function dim is a vector with 2 values:
# the first value is the number of rows,
# the second value is the number of columns.
# To extract just the number of rows from this vector, use dim(heartbeats)[1];
# analogously, to get just the number of columns, use dim(heartbeats)[2]. Try it:
dim(heartbeats)[1]
## [1] 210
dim(heartbeats)[2]
## [1] 3
# Since the function has to check each column, the for loop runs from 1 to dim(data)[2], the number of columns.
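# A minimal sketch of the same idea on a small made-up data frame
# (toy and its columns are hypothetical, not part of the exercise data):
toy <- data.frame(group = factor(c("a", "b", "a")), value = c(1.5, 2.0, 3.5))
dim(toy)
## [1] 3 2
dim(toy)[2]
## [1] 2
# ncol is a built-in shortcut that gives the same number as dim(toy)[2]:
ncol(toy)
## [1] 2
# so this loop visits the column positions 1 and 2:
for (i in 1:dim(toy)[2]) {
  print(i)
}
## [1] 1
## [1] 2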
# 2. Run this to see what happens:
names(heartbeats)
## [1] "wghtcls" "treatment" "wghtincr"
# Explanation: The output of the names function is a vector that contains the names of
# all columns. Vectors are ONE-dimensional, so to get a specific value from
# the vector you need only one index, e.g. try:
names(heartbeats)[1]
## [1] "wghtcls"
# This outputs the value at position 1, i.e. the name of the
# first column.
# With names(data)[i] you extract the name of the i-th column, and the function keeps it only if is.factor(data[,i]) == TRUE.
# Notice: in data[,i] you need the comma because data is a TWO-dimensional data frame;
# data[,i] selects ALL ROWS but only column i.
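# A small sketch of the difference between the two kinds of indexing, using a
# made-up data frame (demo and its columns are hypothetical, not from the exercise):
demo <- data.frame(group = factor(c("a", "b", "a")), value = c(1.5, 2.0, 3.5))
# one index, because names(demo) is a one-dimensional vector:
names(demo)[1]
## [1] "group"
# two index positions, with the row position left empty: all rows of column 1
demo[, 1]
## [1] a b a
## Levels: a b
is.factor(demo[, 1])
## [1] TRUE
is.factor(demo[, 2])
## [1] FALSE
# One possible shorter way to get the same names as myfact, assuming the input
# is a data frame: sapply applies is.factor to every column, and the resulting
# TRUE/FALSE vector selects the matching column names.
names(demo)[sapply(demo, is.factor)]
## [1] "group"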