Assignment: Handle Binary Outcome
Objective: Use logistic regression to handle a binary outcome.

Given the prostate cancer dataset, in which biopsy results are given for 97 men:
• You are to predict tumor spread in this dataset of 97 men who had undergone a biopsy.
• The measures to be used for prediction are: age, lbph, lcp, gleason, and lpsa. This implies that the binary dependent variable of lcavol will be the outcome variable.

We start by installing and loading the appropriate R packages (ROCR, ggplot2, and aod) as follows:

> install.packages("ROCR")
> install.packages("ggplot2")
> install.packages("aod")
> library(ROCR)
> library(ggplot2)
> library(aod)

Next, we load the CSV file and check its properties as follows:

> setwd("C:/RData")                  # your working directory
> tumor <- read.csv("prostate.csv")  # loading the file
> str(tumor)                         # check the properties of the file

. . . continue from here!

Reference
R Documentation (2016). Prostate cancer data. Retrieved from http://rafalab.github.io/pages/649/prostate.html
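One way the assignment's analysis might continue is sketched below. It is only an illustration under stated assumptions, not the expected solution. So that the block runs on its own, it builds a simulated stand-in for `tumor` with the same column names; in the assignment, `tumor` would come from `read.csv("prostate.csv")` instead. The median split used to turn lcavol into a binary outcome, and the variable name `high_lcavol`, are assumptions introduced here; the assignment does not say how the outcome is to be coded.

```r
# Simulated stand-in for the prostate data (NOT the real dataset).
# In the assignment, `tumor` is read from "prostate.csv" instead.
set.seed(1)
n <- 97
tumor <- data.frame(
  age     = round(rnorm(n, mean = 64, sd = 7)),
  lbph    = rnorm(n),
  lcp     = rnorm(n),
  gleason = sample(6:9, n, replace = TRUE),
  lpsa    = rnorm(n, mean = 2.5, sd = 1)
)
tumor$lcavol <- 0.6 * tumor$lpsa + 0.3 * tumor$lcp + rnorm(n, sd = 0.7)

# ASSUMPTION: code the binary outcome as lcavol above its median.
tumor$high_lcavol <- as.integer(tumor$lcavol > median(tumor$lcavol))

# Logistic regression on the five predictors named in the assignment.
fit <- glm(high_lcavol ~ age + lbph + lcp + gleason + lpsa,
           data = tumor, family = binomial(link = "logit"))
print(summary(fit))

# Odds ratios are the exponentiated coefficients.
odds_ratios <- exp(coef(fit))
print(odds_ratios)

# Fitted probabilities and a confusion table at a 0.5 cut-off.
pred_prob  <- predict(fit, type = "response")
pred_class <- as.integer(pred_prob > 0.5)
confusion  <- table(observed = tumor$high_lcavol, predicted = pred_class)
print(confusion)
```

The fitted probabilities in `pred_prob` could then be passed to the ROCR package loaded earlier to draw an ROC curve. The 0.5 cut-off is a convention rather than a requirement.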
Aims: The analysis of randomized controlled trials with incomplete binary outcome data is challenging. We develop a general method for exploring the impact of missing data in such trials, with a focus on abstinence outcomes.

Design: We propose a sensitivity analysis where standard analyses, which could include ‘missing = smoking’ and ‘last observation carried forward’, are embedded in a wider class of models.

Setting: We apply our general method to data from two smoking cessation trials.

Participants: A total of 489 and 1758 participants from two smoking cessation trials.

Measurements: The abstinence outcomes were obtained using telephone interviews.

Findings: The estimated intervention effects from both trials depend on the sensitivity parameters used. The findings differ considerably in magnitude and statistical significance under quite extreme assumptions about the missing data, but are reasonably consistent under more moderate assumptions.

Conclusions: A new method for undertaking sensitivity analyses when handling missing data in trials with binary outcomes allows a wide range of assumptions about the missing data to be assessed. In two smoking cessation trials the results were insensitive to all but extreme assumptions.

Keywords: Last observation carried forward, missing data, missing not at random, Russell Standard, sensitivity analysis, smoking cessation trials.
Correspondence to: Dan Jackson, MRC Biostatistics Unit, Institute of Public Health, University Forvie Site, Robinson Way, Cambridge CB2 0SR, UK.
Submitted 29 April 2013; initial review completed 11 November 2013; final version accepted 14 August 2014
Missing outcome data are a common problem in
randomized controlled trials. In this paper we focus on
trials where the end-point of interest is a single binary
outcome. Binary outcome measures are widely used in
trials for smoking, alcohol and drug misuse where the
treatment goal is abstinence [1–4].
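To illustrate why the handling of missing binary outcomes matters, the following minimal R sketch compares a complete-case abstinence rate with the rate obtained when every missing outcome is treated as smoking (‘missing = smoking’). The outcome vector is hypothetical and is not taken from either trial, and the variable names are introduced here purely for illustration.

```r
# Hypothetical follow-up outcomes for ten participants:
# 1 = abstinent, 0 = smoking, NA = smoking status not reported.
abstinent <- c(1, 0, NA, 1, NA, 0, 0, 1, NA, 0)

# Complete-case analysis: participants with missing outcomes are dropped.
rate_complete_case <- mean(abstinent, na.rm = TRUE)

# 'Missing = smoking': every missing outcome is recoded as smoking (0).
abstinent_ms <- ifelse(is.na(abstinent), 0, abstinent)
rate_missing_smoking <- mean(abstinent_ms)

print(c(complete_case   = rate_complete_case,
        missing_smoking = rate_missing_smoking))
```

With three of seven observed participants abstinent, the complete-case rate is 3/7, whereas treating the three non-responders as smokers lowers it to 3/10. The two analyses rest on different assumptions about the same missing data.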
In smoking cessation trials, participants who do not
report their smoking status at follow-up are often
assumed to be smoking [5–8], and the Russell Standard
[9,10] requires this. Because smoking cessation trials
have this standard approach for handling missing
outcome data, we use smoking as our example and incorporate the Russell Standard into our methods. However,
our method is applicable to all trial areas where binary
outcome data are collected; for example, Maisel et al. 
found that most studies in their meta-analysis of treatments for alcohol-use disorders considered dropouts to
have relapsed. Based on an informal review, w