Homework - Datasets

Author

Marcio A Diniz

Published

September 1, 2022

Homework 01

Deadline: September 27th

Format: HTML + Qmd + Excel Spreadsheets

Submit to: michael.luu@cshs.org

Using the files example 01.xlsx, example 02.xlsx and example 03.xlsx:

Task 1

Organize the datasets following the "Basic rules to organize a dataset" document.

Task 2

In a Quarto report, read all three datasets into R, then:

(a) Check whether the format (DBL, FCT) of each variable is adequate according to the dictionary of variables provided with the dataset. If a format is not adequate, change it to an adequate one.

(b) Show a summary of each formatted dataset in which it is possible to check the first rows of all variables, the number of columns (variables), the number of rows (observations), and the format of each column.
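As a starting point for parts (a) and (b), here is a minimal sketch on made-up data. The column names `patient_id`, `age` and `sex` are hypothetical and only stand in for whatever the dictionary of variables lists; the real spreadsheets would typically be read with something like `readxl::read_excel("example 01.xlsx")` instead of being typed in.

```r
library(dplyr)

# Hypothetical toy data standing in for one spreadsheet.
# The real column names and types come from the dictionary of variables.
toy <- tibble(
  patient_id = c("P01", "P02", "P03"),
  age        = c("12", "45", "67"),  # numbers that were read in as text
  sex        = c("F", "M", "F")      # categories that were read in as text
)

# (a) Convert each variable to the format the dictionary asks for.
toy_formatted <- toy %>%
  mutate(
    age = as.numeric(age),  # displayed as <dbl>
    sex = factor(sex)       # displayed as <fct>
  )

# (b) One call that shows the number of rows and columns,
# the format of each column, and its first values.
glimpse(toy_formatted)
```

`glimpse()` is one option among several; printing the tibble itself or calling `str()` gives similar information.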

Task 3

For the example 01 dataset, create a new dataset with only pediatric patients.
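Row filtering of this kind can be sketched as follows. The data are made up, the column name `age` is hypothetical, and the cutoff of 18 years is an assumption: the assignment does not define "pediatric", so check the dictionary of variables or the course material for the intended definition.

```r
library(dplyr)

# Hypothetical toy data; the real dataset comes from example 01.xlsx.
patients <- tibble(
  patient_id = c("P01", "P02", "P03", "P04"),
  age        = c(7, 34, 17, 18)
)

# Assumed cutoff: patients younger than 18 years count as pediatric.
pediatric_cutoff <- 18

# Keep only the rows that satisfy the condition.
pediatric <- patients %>%
  filter(age < pediatric_cutoff)

pediatric
```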

Task 4

For the example 02 dataset, create a new dataset containing only the variables Patient ID, Sex, Treatment, and Clinical Outcome.
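Keeping a subset of columns might look like the sketch below. The data and the snake_case column names are hypothetical; the names in your organized version of example 02 may differ, and the extra `age` column is there only to show that it gets dropped.

```r
library(dplyr)

# Hypothetical toy data; the real dataset comes from example 02.xlsx.
trial <- tibble(
  patient_id       = c("P01", "P02"),
  sex              = factor(c("F", "M")),
  age              = c(54, 61),
  treatment        = factor(c("A", "B")),
  clinical_outcome = factor(c("Response", "No response"))
)

# Keep only the four requested variables, in the requested order.
trial_subset <- trial %>%
  select(patient_id, sex, treatment, clinical_outcome)

trial_subset
```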

Task 5

For the example 03 dataset:

If you organized the dataset in the long format, transform it into the wide format, and then transform the wide dataset back to the long format.

If you organized the dataset in the wide format, transform it into the long format, and then transform the long dataset back to the wide format.

Both the long and the wide datasets should contain the variables DCX, p21 and DCX_p21.
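The round trip between formats can be sketched with `tidyr`, here starting from the wide format. The identifier `sample_id`, the helper column names `marker` and `count`, and all numbers are made up for illustration; only DCX, p21 and DCX_p21 come from the assignment. If your organized dataset is long, the same two functions apply in the opposite order.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide dataset: one row per sample, one column per marker.
wide <- tibble(
  sample_id = c("S1", "S2"),
  DCX       = c(10, 14),
  p21       = c(3, 5),
  DCX_p21   = c(2, 4)
)

# Wide to long: one row per sample and marker.
long <- wide %>%
  pivot_longer(
    cols      = c(DCX, p21, DCX_p21),
    names_to  = "marker",
    values_to = "count"
  )

# Long back to wide: one column per marker again.
wide_again <- long %>%
  pivot_wider(names_from = marker, values_from = count)

long
wide_again
```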