
Q1: In assessing the predictive power of categorical predictors of a binary outcome, should logistic regression be used?

Q2: Objective: Using Logistic Regression to handle a binary outcome.

Given the prostate cancer dataset, in which biopsy results are given for 97 men:

• You are to predict tumor spread in this dataset of 97 men who had undergone a biopsy.

• The measures to be used for prediction are age, lbph, lcp, gleason, and lpsa. This implies that the binary dependent variable, lcavol, is the outcome variable.
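The model described in the bullets above can be sketched in R's formula notation. This is only an illustration, not the assignment's solution: it assumes lcavol is coded 0/1, and it uses a simulated data frame (`sim`, a hypothetical stand-in whose values are random and not taken from the prostate dataset) so that it runs in base R without the csv file.

```r
# Simulated stand-in data: all values are random, NOT from prostate.csv.
set.seed(1)
n <- 97
sim <- data.frame(
  age     = round(rnorm(n, mean = 64, sd = 7)),  # made-up distribution
  lbph    = rnorm(n),
  lcp     = rnorm(n),
  gleason = sample(6:9, n, replace = TRUE),
  lpsa    = rnorm(n, mean = 2.5, sd = 1)
)

# Hypothetical binary outcome, generated so that it depends on lpsa.
sim$lcavol <- rbinom(n, size = 1, prob = plogis(-2.5 + 1.0 * sim$lpsa))

# Logistic regression: binomial family with its default logit link.
fit <- glm(lcavol ~ age + lbph + lcp + gleason + lpsa,
           data = sim, family = binomial)
summary(fit)

# Predicted probabilities of the outcome for each row.
p_hat <- predict(fit, type = "response")
head(p_hat)
```

With the real file, the same `glm()` call would presumably take the loaded data frame as its `data` argument in place of `sim`.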

We start by installing and loading the required R packages (ROCR, ggplot2, and aod) as follows:

> install.packages("ROCR")

> install.packages("ggplot2")

> install.packages("aod")

> library(ROCR)

> library(ggplot2)

> library(aod)

Next, we load the CSV file and inspect its structure as follows:

> setwd("C:/RData") # your working directory

> tumor <- read.csv("prostate.csv") # loading the file

> str(tumor) # check the properties of the file
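Because logistic regression needs a two-level outcome, one optional check after loading the file is to confirm how the outcome column is coded. The helper below, `check_binary`, is a hypothetical name introduced here for illustration, and the toy vector is made up. It is a minimal sketch assuming the outcome should be coded 0/1.

```r
# Hypothetical helper: stop early if a variable is not coded as 0/1.
check_binary <- function(x) {
  vals <- sort(unique(x[!is.na(x)]))
  if (length(vals) != 2 || !all(vals %in% c(0, 1))) {
    stop("outcome is not a two-level 0/1 variable")
  }
  # Return the counts per level (including NA, if any).
  table(x, useNA = "ifany")
}

# Usage on a toy vector (made-up values, not from prostate.csv):
check_binary(c(0, 1, 1, 0, 1))

# With the file loaded as above, the call would be:
# check_binary(tumor$lcavol)
```

If the check fails on the real column, the variable would need to be recoded into two levels before it can serve as the outcome.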

 

. . . continue from here!

 

Reference

R Documentation (2016). Prostate cancer data. Retrieved from http://rafalab.github.io/pages/649/prostate.html