Homework - ggplot2

Published

October 18, 2022

Homework Submission Information

Deadline: November 3rd, 2022 (Tuesday)

Format: HTML

Submit to:

Instructions

There are three parts to this homework. Part 1 involves constructing four different plots and then merging them into a single figure, as shown in class. This is a common task throughout an academic career when preparing manuscripts for publication. Part 2 involves comparing the distributions of diastolic blood pressure and systolic blood pressure by infection status. Part 3 involves constructing a ‘profile plot’ that shows the trend in mean tumor volume across time.

As an additional tip, if you want to change the dimensions of a figure in the R Markdown output, include fig.width and fig.height in the header of the chunk. For example, {r, fig.width = 6, fig.height = 4} will output the figure from that chunk at 6 inches (width) by 4 inches (height).
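As a small sketch of an alternative, the same two options can be set once as defaults for every later chunk by calling knitr::opts_chunk$set() in a setup chunk. A chunk header that names its own fig.width or fig.height still overrides the default. This assumes the knitr package is installed, which it is whenever an R Markdown file can be rendered.

```r
# Set default figure dimensions (in inches) for all later chunks.
knitr::opts_chunk$set(fig.width = 6, fig.height = 4)

# Check which defaults are now in effect.
knitr::opts_chunk$get("fig.width")
knitr::opts_chunk$get("fig.height")
```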

Important

Parts 1 and 2 use the emergency dataset, and Part 3 uses the vaccine dataset.

Part 1

Part 1 involves constructing the four primary plots that we went over in class: the classic dot plot, box plot, bar plot, and histogram. Afterward, please merge them into a single figure.

Dotplot

For the classic dot plot, please construct a vertical dot plot showing the distribution of heart rate by gender (Female vs Male). Please also overlay lines showing the mean and the median heart rate for females and males.
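One possible way to approach this is sketched below on simulated data. The data frame emergency_demo and its columns gender and heart_rate are hypothetical stand-ins, so the names will need to be swapped for those in the emergency dataset, and the bin width and colours are arbitrary choices rather than requirements.

```r
library(ggplot2)

# Simulated stand-in for the emergency dataset; column names are hypothetical.
set.seed(1)
emergency_demo <- data.frame(
  gender = rep(c("Female", "Male"), each = 40),
  heart_rate = round(rnorm(80, mean = 85, sd = 12))
)

p_dot <- ggplot(emergency_demo, aes(x = gender, y = heart_rate)) +
  # binaxis = "y" bins along the y axis, which gives a vertical dot plot.
  geom_dotplot(binaxis = "y", stackdir = "center", binwidth = 2) +
  # A zero-height error bar draws a horizontal line at the group mean ...
  stat_summary(fun = mean, fun.min = mean, fun.max = mean,
               geom = "errorbar", width = 0.5, colour = "red") +
  # ... and a dashed one at the group median.
  stat_summary(fun = median, fun.min = median, fun.max = median,
               geom = "errorbar", width = 0.5, colour = "blue",
               linetype = "dashed") +
  labs(x = "Gender", y = "Heart rate")

p_dot
```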

Boxplot

For the classic box plot, please construct a box plot comparing the age of females with and without infection and males with and without infection. Please also overlay jittered points on the box plot. Please make this plot on a single panel, without using facets, patchwork, etc. For example: 2 box plots side by side for Female (Infection Yes vs Infection No) and 2 box plots side by side for Male (Infection Yes vs Infection No) within a single panel.
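A minimal sketch of a grouped box plot on one panel follows, again on simulated data with hypothetical column names (gender, infection, age). It relies on mapping infection to fill so that ggplot2 dodges the boxes within each gender, and on position_jitterdodge() so the jittered points line up with their boxes.

```r
library(ggplot2)

# Simulated stand-in for the emergency dataset; column names are hypothetical.
set.seed(2)
emergency_demo <- data.frame(
  gender = rep(c("Female", "Male"), each = 50),
  infection = rep(c("Yes", "No"), times = 50),
  age = round(runif(100, min = 20, max = 90))
)

p_box <- ggplot(emergency_demo, aes(x = gender, y = age, fill = infection)) +
  # Hide the box plot's own outlier marks so points are not drawn twice.
  geom_boxplot(outlier.shape = NA, position = position_dodge(width = 0.8)) +
  # Jitter the raw points and dodge them by the same width as the boxes.
  geom_point(position = position_jitterdodge(jitter.width = 0.15,
                                             dodge.width = 0.8),
             alpha = 0.6) +
  labs(x = "Gender", y = "Age", fill = "Infection")

p_box
```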

Barplot

For the classic bar plot, please compare the percent of each CHILD score by gender. There should be separate bars for each CHILD score (A, B, and C) for Female and Male. The height of each bar should depict the percent for each gender. In addition to the bar plot, please also overlay the actual count and proportion on each bar, as shown in the notes and in class. Please construct this plot on a single panel without the use of facets. For example: 3 bars for Female (CHILD A, B, and C) and 3 bars for Male (CHILD A, B, and C).
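The sketch below shows one way to compute the counts and within-gender percents before plotting, using base R and simulated data. The columns gender and child are hypothetical names, and the label format is only one option, so the version shown in the notes may differ.

```r
library(ggplot2)

# Simulated stand-in for the emergency dataset; column names are hypothetical.
set.seed(3)
emergency_demo <- data.frame(
  gender = sample(c("Female", "Male"), 120, replace = TRUE),
  child = sample(c("A", "B", "C"), 120, replace = TRUE)
)

# Count each gender-by-CHILD cell.
counts <- as.data.frame(
  table(gender = emergency_demo$gender, child = emergency_demo$child),
  responseName = "n"
)
# Percent of each CHILD score within a gender (sums to 100 per gender).
counts$percent <- 100 * counts$n / ave(counts$n, counts$gender, FUN = sum)
counts$label <- sprintf("%d (%.1f%%)", counts$n, counts$percent)

p_bar <- ggplot(counts, aes(x = gender, y = percent, fill = child)) +
  geom_col(position = position_dodge(width = 0.9)) +
  # Use the same dodge width so each label sits above its own bar.
  geom_text(aes(label = label), position = position_dodge(width = 0.9),
            vjust = -0.4, size = 3) +
  labs(x = "Gender", y = "Percent", fill = "CHILD score")

p_bar
```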

Histogram

For the classic histogram, please construct a histogram of length of stay for Female and Male in separate panels.
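As a rough sketch, separate panels can be produced with facet_wrap(). The data are simulated, the column names gender and los are hypothetical, and the bin width is an arbitrary choice.

```r
library(ggplot2)

# Simulated stand-in for the emergency dataset; column names are hypothetical.
set.seed(4)
emergency_demo <- data.frame(
  gender = rep(c("Female", "Male"), each = 60),
  los = round(rexp(120, rate = 1 / 5)) + 1
)

p_hist <- ggplot(emergency_demo, aes(x = los)) +
  geom_histogram(binwidth = 2, colour = "white") +
  # One panel per gender.
  facet_wrap(~ gender) +
  labs(x = "Length of stay", y = "Count")

p_hist
```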

Merged Plot

Please merge the 4 plots that we constructed into a single labeled plot as was shown in class and in the notes.
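The sketch below assumes the patchwork package is used for merging; the method shown in class may be a different one. The four plots here are placeholders built from the built-in mtcars data, standing in for the dot plot, box plot, bar plot, and histogram.

```r
library(ggplot2)
library(patchwork)

# Placeholder plots; replace these with the four plots from Part 1.
p1 <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) + geom_boxplot()
p3 <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()
p4 <- ggplot(mtcars, aes(x = mpg)) + geom_histogram(bins = 10)

# Arrange in a 2 x 2 grid and label the panels A, B, C, D.
merged <- (p1 + p2) / (p3 + p4) + plot_annotation(tag_levels = "A")

merged
```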

Part 2

Part 2A

Part 2A involves constructing a box plot comparing diastolic blood pressure and systolic blood pressure by infection status. You should construct this plot in a single panel without the use of facets. There should be 2 box plots (Infection Yes vs Infection No) for diastolic blood pressure and 2 box plots (Infection Yes vs Infection No) for systolic blood pressure. Please also overlay jittered points on top of the box plots.
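Putting two measurements on one axis usually means reshaping the data from wide to long first. The sketch below does this with tidyr::pivot_longer() on simulated data; the column names infection, dbp, and sbp are hypothetical, and tidyr is an assumed dependency.

```r
library(ggplot2)
library(tidyr)

# Simulated stand-in for the emergency dataset; column names are hypothetical.
set.seed(5)
emergency_demo <- data.frame(
  infection = rep(c("Yes", "No"), times = 40),
  dbp = round(rnorm(80, mean = 75, sd = 10)),
  sbp = round(rnorm(80, mean = 125, sd = 15))
)

# One row per patient per measurement.
bp_long <- pivot_longer(emergency_demo, cols = c(dbp, sbp),
                        names_to = "measure", values_to = "pressure")

p_bp_box <- ggplot(bp_long, aes(x = measure, y = pressure, fill = infection)) +
  geom_boxplot(outlier.shape = NA, position = position_dodge(width = 0.8)) +
  geom_point(position = position_jitterdodge(jitter.width = 0.15,
                                             dodge.width = 0.8),
             alpha = 0.6) +
  labs(x = "Measurement", y = "Blood pressure", fill = "Infection")

p_bp_box
```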

Part 2B

Part 2B involves constructing a violin plot comparing diastolic blood pressure and systolic blood pressure by infection status. You should construct this plot in a single panel without the use of facets. There should be 2 violin plots (Infection Yes vs Infection No) for diastolic blood pressure and 2 violin plots (Infection Yes vs Infection No) for systolic blood pressure.

Please see below for an example of the classical violin plot. The violin plot is composed of a rotated and centered density plot as well as a classical box plot embedded within the region.
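The description above (a density shape with a box plot embedded inside it) can be sketched by layering geom_violin() and a narrow geom_boxplot() with matching dodge widths. The data are simulated and already in long format, and the column names measure, infection, and pressure are hypothetical.

```r
library(ggplot2)

# Simulated long-format data; column names are hypothetical.
set.seed(6)
bp_long <- data.frame(
  measure = rep(c("Diastolic", "Systolic"), each = 60),
  infection = rep(c("Yes", "No"), times = 60),
  pressure = c(rnorm(60, mean = 75, sd = 10), rnorm(60, mean = 125, sd = 15))
)

dodge <- position_dodge(width = 0.9)

p_violin <- ggplot(bp_long, aes(x = measure, y = pressure, fill = infection)) +
  # The violin is a mirrored density estimate for each group.
  geom_violin(position = dodge) +
  # A narrow box plot drawn inside each violin, dodged by the same width.
  geom_boxplot(width = 0.15, position = dodge, outlier.shape = NA) +
  labs(x = "Measurement", y = "Blood pressure", fill = "Infection")

p_violin
```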

Part 2C

Similar to Parts 2A and 2B, Part 2C involves constructing a bar plot comparing diastolic blood pressure and systolic blood pressure by infection status. The height of each bar should depict the mean, and please overlay error bars for the standard error. I have provided a function below to calculate the standard error. Please copy this function into your RMD and run it; you will then be able to call se(). You should construct this plot in a single panel without the use of facets. There should be 2 bars (Infection Yes vs Infection No) for diastolic blood pressure and 2 bars (Infection Yes vs Infection No) for systolic blood pressure.

# Standard error of the mean; divides by the number of non-missing values
se <- function(x, na.rm = TRUE) sqrt(var(x, na.rm = na.rm) / sum(!is.na(x)))
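A sketch of how the means and standard errors might be computed and plotted follows. It uses simulated long-format data with hypothetical column names and base R's aggregate() for the summaries. A version of se() that divides by the number of non-missing values is repeated inside the block only so that it runs on its own.

```r
library(ggplot2)

# Repeated here so the block is self-contained.
se <- function(x, na.rm = TRUE) sqrt(var(x, na.rm = na.rm) / sum(!is.na(x)))

# Simulated long-format data; column names are hypothetical.
set.seed(7)
bp_long <- data.frame(
  measure = rep(c("Diastolic", "Systolic"), each = 60),
  infection = rep(c("Yes", "No"), times = 60),
  pressure = c(rnorm(60, mean = 75, sd = 10), rnorm(60, mean = 125, sd = 15))
)

# Mean and standard error for each measure-by-infection group.
bp_summary <- aggregate(pressure ~ measure + infection, data = bp_long,
                        FUN = mean)
names(bp_summary)[3] <- "mean"
bp_summary$se <- aggregate(pressure ~ measure + infection, data = bp_long,
                           FUN = se)$pressure

dodge <- position_dodge(width = 0.9)

p_bp_bar <- ggplot(bp_summary, aes(x = measure, y = mean, fill = infection)) +
  geom_col(position = dodge) +
  # Error bars span one standard error either side of the mean.
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se),
                position = dodge, width = 0.2) +
  labs(x = "Measurement", y = "Mean blood pressure", fill = "Infection")

p_bp_bar
```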

Part 2D

Now that you have 2A, 2B, and 2C completed, it’s time to submit your figure to the journal for publication. Please write a short statement on which figure you will choose to submit. Please compare and contrast the three figures, and provide evidence to support your decision.

Part 3

Part 3 of the assignment involves constructing a ‘profile plot’. This part uses a different dataset from the emergency dataset that can be found in the box folder. You will construct this plot to observe the change in mean tumor volume across time. Your plot should have a separate panel for Ad-GFP and for Ad-HER3. Within each panel, you will plot the mean tumor volume for IgG, anti-PD1, and anti-PD-L1 at each time point. In addition to the mean tumor volume, you will also plot the standard error at each time point. Since R does not include a built-in function to calculate the standard error, I have provided a function below that you can use.

Important
  • Please note that this is not exactly how your plot should look, but it can be used as a reference.

Tip

Copy this line of code into your R environment, run it, and you should be able to calculate the standard error using se()

se <- function(x, na.rm = TRUE) sqrt(var(x, na.rm = na.rm) / sum(!is.na(x)))
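The profile plot described in Part 3 could be sketched along the following lines. Everything about the data here is an assumption: tumor_demo is simulated, and the column names mouse, day, treatment, vector, and volume are hypothetical, as are the time points. A version of se() that divides by the number of non-missing values is repeated so the block runs on its own.

```r
library(ggplot2)

# Repeated here so the block is self-contained.
se <- function(x, na.rm = TRUE) sqrt(var(x, na.rm = na.rm) / sum(!is.na(x)))

# Simulated stand-in for the Part 3 dataset; all names are hypothetical.
set.seed(8)
tumor_demo <- expand.grid(
  mouse = 1:5,
  day = c(0, 7, 14, 21),
  treatment = c("IgG", "anti-PD1", "anti-PD-L1"),
  vector = c("Ad-GFP", "Ad-HER3"),
  stringsAsFactors = FALSE
)
tumor_demo$volume <- 50 + 20 * tumor_demo$day +
  rnorm(nrow(tumor_demo), sd = 40)

# Mean and standard error at each time point, per treatment and vector.
tumor_summary <- aggregate(volume ~ day + treatment + vector,
                           data = tumor_demo, FUN = mean)
names(tumor_summary)[4] <- "mean"
tumor_summary$se <- aggregate(volume ~ day + treatment + vector,
                              data = tumor_demo, FUN = se)$volume

p_profile <- ggplot(tumor_summary,
                    aes(x = day, y = mean, colour = treatment,
                        group = treatment)) +
  geom_line() +
  geom_point() +
  # Error bars span one standard error either side of the mean.
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = 1) +
  # One panel per vector (Ad-GFP and Ad-HER3).
  facet_wrap(~ vector) +
  labs(x = "Time", y = "Mean tumor volume", colour = "Treatment")

p_profile
```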