In this lab we will use PCA to analyze the characteristics of football players in the French premier league (according to the video game FIFA). Before doing so, we will try to reproduce the results we obtained during the lectures on our socio-economic dataset.
To retrieve the data, you can go here.
install.packages(c("FactoMineR", "factoextra"))
You are now ready to load them (to be done each time you restart R):
library(FactoMineR)
library(factoextra)
mypca <- function(data){
n.obs <- nrow(data)
n.var <- ncol(data)
data <- scale(data)## centering and scaling the data
decomp <- svd(data)
U <- decomp$u
V <- decomp$v
D <- diag(decomp$d)
## Compute the percentage of explained variance
explained.variance.prop <- 1## Fill in
## Coordinates of individuals and variables on the factorial axes
ind.coord <- 1## Fill in
var.coord <- 1## Fill in
## Some graphics
par(mfrow = c(1, 3))
## Evolution of the explained variance
barplot(100 * explained.variance.prop)
## Plot individuals onto the 1st factorial plane
xlab <- paste("1st axis (", 100 * round(explained.variance.prop[1], 3), "%)", sep = "")
ylab <- paste("2nd axis (", 100 * round(explained.variance.prop[2], 3), "%)", sep = "")
plot(ind.coord[,1:2], xlab = xlab, ylab = ylab, main = "Individuals")
abline(h = 0, lty = 2, col = "grey")
abline(v = 0, lty = 2, col = "grey")
## Plot the variables onto the 1st factorial plane
plot(0, xlim = c(-1, 1), ylim = c(-1, 1), xlab = xlab, ylab = ylab, main = "Variables",
type = "n")
abline(h = 0, lty = 2, col = "grey")
abline(v = 0, lty = 2, col = "grey")
## draw the unit circle
angles <- seq(0, 2 * pi, length = 500)
lines(cos(angles), sin(angles))
arrows(rep(0, ncol(data)), rep(0, ncol(data)), var.coord[,1], var.coord[,2])
text(var.coord[,1], var.coord[,2], colnames(data))
return(list(ind.coord = ind.coord, var.coord = var.coord, explained.variance.prop = explained.variance.prop))
}
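One possible way to complete the three "Fill in" placeholders, written outside the function so that you can compare it with your own answer. This is only a sketch: the built-in `USArrests` data is an assumption standing in for the lab's dataset, and the cross-check against base R's `prcomp` is an extra step, not part of the template.

```r
## Sketch only: USArrests (shipped with base R) stands in for the lab's dataset.
X <- scale(USArrests)             # centre and scale (standard deviation uses n - 1)
n.obs <- nrow(X)
decomp <- svd(X)                  # X = U D V'
U <- decomp$u
V <- decomp$v
D <- diag(decomp$d)

## Share of variance carried by each axis: squared singular values, normalised
explained.variance.prop <- decomp$d^2 / sum(decomp$d^2)

## Individuals: principal component scores U D (equivalently X V)
ind.coord <- U %*% D

## Variables: correlations between each variable and each component
var.coord <- V %*% D / sqrt(n.obs - 1)
rownames(var.coord) <- colnames(X)

## Cross-check against prcomp (each axis is only defined up to its sign)
ref <- prcomp(USArrests, scale. = TRUE)
all.equal(abs(ind.coord), abs(unname(ref$x)))
all.equal(explained.variance.prop, ref$sdev^2 / sum(ref$sdev^2))
```

Dividing by `sqrt(n.obs - 1)` matches the convention of `scale()`, which standardises with the `n - 1` standard deviation. To our knowledge FactoMineR standardises with `n` instead, so its individual coordinates may differ from these by a factor `sqrt(n / (n - 1))`, while the variable coordinates and the percentages of variance should agree up to sign.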
Data can be retrieved from here. Perform a complete statistical analysis.
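As a starting point, the usual FactoMineR / factoextra workflow can be sketched as follows. The players data is not reproduced here, so `USArrests` is again used as a stand-in (an assumption); replace it with the numeric columns of your own data frame. The `requireNamespace` guard is only there so that the snippet is skipped when the two packages are not installed.

```r
## Sketch only: USArrests stands in for the players data.
if (requireNamespace("FactoMineR", quietly = TRUE) &&
    requireNamespace("factoextra", quietly = TRUE)) {
  library(FactoMineR)
  library(factoextra)

  ## Standardised PCA, without the default plots
  res <- PCA(USArrests, scale.unit = TRUE, graph = FALSE)

  print(res$eig)              # eigenvalues and percentages of variance
  print(head(res$ind$coord))  # coordinates of the first individuals
  print(res$var$coord)        # coordinates of the variables

  ## The plots are stored as ggplot objects; print() one of them to display it
  p.scree <- fviz_eig(res)      # scree plot
  p.ind   <- fviz_pca_ind(res)  # individuals on the 1st factorial plane
  p.var   <- fviz_pca_var(res)  # correlation circle
}
```

From there, a complete analysis would typically discuss how many axes to keep, which variables drive each axis, and which individuals stand out on the first factorial plane.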
Good luck!