### QUESTION 2
#
## Topic: Sheet 8 Exercise 1

## Question: 
# 1. Why is there a 2 in the square brackets in: for (i in 1:dim(data)[2])
# 2. Why is there no comma in front of the i in: y<-c(y, names(data)[i]), unlike in the line above it, when outputting the name of the column?

# Solution 
myfact <- function(data) {
  y <-c()
  for (i in 1:dim(data)[2]){
    if (is.factor(data[,i]) == TRUE){
      y<-c(y, names(data)[i])
    }
  }
  return(y)
}
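
# A possible alternative, shown only as a sketch: the same loop written with
# ncol() and seq_len(). The name myfact2 is hypothetical and not part of the
# exercise sheet. ncol(data) returns the same number as dim(data)[2], and
# seq_len() also copes with a data frame that has zero columns, where
# 1:dim(data)[2] would count 1, 0.
myfact2 <- function(data) {
  y <- c()
  for (i in seq_len(ncol(data))) {
    # data[[i]] extracts column i as a vector, like data[,i] does here
    if (is.factor(data[[i]])) {
      y <- c(y, names(data)[i])
    }
  }
  return(y)
}
# Quick check on the built-in iris data, whose only factor column is Species:
myfact2(iris)
## [1] "Species"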

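# Note: since R 4.0.0, read.table no longer converts text columns to factors by default,
# so stringsAsFactors = TRUE is set explicitly here.
heartbeats <- read.table("http://evol.bio.lmu.de/_statgen/Rcourse/ws1920/data/heartbeats.txt", header = TRUE, stringsAsFactors = TRUE)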

## Answer: 

# 1. Load the heartbeats data and try running this to see what happens:
dim(heartbeats)
## [1] 210   3
# Explanation: The output of the function dim is a vector with 2 values: 
# the first value is the number of rows, 
# the second value is the number of columns. 
# If you would like to extract just the number of rows from the vector, 
# you would use dim(heartbeats)[1]; analogously, if you need just the number 
# of columns, you would use dim(heartbeats)[2]. Try it: 
dim(heartbeats)[1]
## [1] 210
dim(heartbeats)[2]
## [1] 3
# Since you want to check each column in the function, the for loop runs over 1:dim(data)[2],
# i.e. over the column indices from 1 up to the number of columns.
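# To see the loop indices in isolation, here is a small sketch on a made-up
# data frame (toy and its columns are invented for illustration, they are not
# part of the exercise):
toy <- data.frame(a = 1:4, b = factor(c("x", "y", "x", "y")))
dim(toy)
## [1] 4 2
# The loop variable i takes the values 1 and 2, one per column:
for (i in 1:dim(toy)[2]) {
  print(i)
}
## [1] 1
## [1] 2
# ncol(toy) is a shortcut that gives the same number as dim(toy)[2]:
ncol(toy) == dim(toy)[2]
## [1] TRUE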

# 2. Try running this to see what happens:
names(heartbeats)
## [1] "wghtcls"   "treatment" "wghtincr"
# Explanation: The output of the names function is a vector that contains the names of 
# all columns. Vectors are ONE dimensional, so if you want to get a specific value from 
# the vector, you only need one index, e.g. try:
names(heartbeats)[1]
## [1] "wghtcls"
# This outputs the value at position 1, and thus you get the name of the 
# first column. 
# With names(data)[i] you extract the name of the i-th column, and the function does this 
# only for columns where is.factor(data[,i]) == TRUE.
# Notice: in data[,i] you need the comma because data is a TWO dimensional data frame.
# By writing data[,i], you select ALL ROWS but only column i.
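# The difference between one index and two indices can be tried out on a small
# made-up data frame (the name small and its columns are invented for
# illustration only):
small <- data.frame(id = 1:3, grp = factor(c("a", "b", "a")))
# names(small) is a vector, so ONE index is enough:
names(small)[2]
## [1] "grp"
# small is a data frame, so you give [rows, columns]; leaving the row part
# empty selects all rows of column 2:
is.factor(small[, 2])
## [1] TRUE
# With the index in front of the comma you would select row 2 instead:
small[2, ]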