Homework - Datasets
Homework 01
Deadline: September 27th
Format: HTML + Qmd + Excel Spreadsheets
Submit to: michael.luu@cshs.org
Using the files example 01.xlsx, example 02.xlsx and example 03.xlsx:
Task 1
Organize the datasets following the "Basic rules to organize a dataset" document.
Task 2
In a Quarto report, read all three datasets into R, then:
(a) Check whether the format (DBL, FCT) of each variable is appropriate according to the dictionary of variables provided with the dataset. If a format is not correct, change it to an appropriate one.
(b) Show a summary of each formatted dataset in which it is possible to check the first rows of all variables, the number of columns (variables), the number of rows (observations), and the format of each column.
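As a rough illustration of parts (a) and (b), the sketch below converts column formats and prints a compact summary. It uses a small made-up table rather than the real spreadsheets, so the column names (patient_id, sex, age) and their values are assumptions; with the actual files, a call such as readxl::read_excel("example 01.xlsx") would presumably take the place of the toy data, and the dictionary of variables decides which formats are right.

```r
library(dplyr)

# Toy stand-in for one spreadsheet; names and values are hypothetical.
raw <- tibble(
  patient_id = c(1, 2, 3),
  sex        = c("F", "M", "F"),
  age        = c("12", "45", "67")  # numbers that were stored as text
)

# (a) Convert each column to the format the dictionary would call for.
formatted <- raw %>%
  mutate(
    patient_id = as.character(patient_id),  # identifier, not a quantity
    sex        = factor(sex),               # categorical -> <fct>
    age        = as.numeric(age)            # numeric -> <dbl>
  )

# (b) One line per column: its format and first values, plus row/column counts.
glimpse(formatted)
```

glimpse() is one option for part (b) because it reports the number of rows and columns, each column's format, and its first values in a single printout; str() from base R gives similar information.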
Task 3
For the example 01 dataset, create a new dataset with only pediatric patients.
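One way to subset rows is dplyr's filter(). The sketch below is only illustrative: the column name age, the toy values, and the cutoff of under 18 years for "pediatric" are all assumptions, and the definition used in the course or in the dictionary of variables should take precedence.

```r
library(dplyr)

# Toy stand-in for example 01; names and values are hypothetical.
example01 <- tibble(
  patient_id = c("P01", "P02", "P03", "P04"),
  age        = c(4, 17, 18, 52)
)

# Assumed definition of "pediatric": younger than 18 years.
pediatric_cutoff <- 18

# Keep only the rows that satisfy the condition.
example01_pediatric <- example01 %>%
  filter(age < pediatric_cutoff)

example01_pediatric
```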
Task 4
For the example 02 dataset, create a new dataset containing only the variables Patient ID, Sex, Treatment, and Clinical Outcome.
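Keeping a subset of columns can be sketched with dplyr's select(). The table below is made up, and the column names (patient_id, sex, age, treatment, clinical_outcome) are assumed spellings of the variables named in the task; the names in the organized spreadsheet may differ.

```r
library(dplyr)

# Toy stand-in for example 02; names and values are hypothetical.
example02 <- tibble(
  patient_id       = c("P01", "P02", "P03"),
  sex              = factor(c("F", "M", "F")),
  age              = c(34, 58, 41),
  treatment        = factor(c("A", "B", "A")),
  clinical_outcome = factor(c("Improved", "Stable", "Improved"))
)

# Keep only the four requested variables, in the order listed.
example02_subset <- example02 %>%
  select(patient_id, sex, treatment, clinical_outcome)

example02_subset
```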
Task 5
For the example 03 dataset:
If you organized the dataset in the long format, transform it into the wide format, and then transform the wide dataset back to the long format.
If you organized the dataset in the wide format, transform it into the long format, and then transform the long dataset back to the wide format.
Both the long and the wide datasets should contain the variables DCX, p21 and DCX_p21.
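The round trip between formats can be sketched with tidyr's pivot_longer() and pivot_wider(). The example starts from a wide table; the identifier column sample_id, the long-format column names marker and value, and all numbers are assumptions for illustration only. Only DCX, p21 and DCX_p21 come from the task itself.

```r
library(dplyr)
library(tidyr)

# Toy wide table: one row per sample, one column per variable (values made up).
wide <- tibble(
  sample_id = c("S1", "S2"),
  DCX       = c(10, 12),
  p21       = c(3, 5),
  DCX_p21   = c(1, 2)
)

# Wide -> long: one row per sample and variable.
long <- wide %>%
  pivot_longer(
    cols      = c(DCX, p21, DCX_p21),
    names_to  = "marker",
    values_to = "value"
  )

# Long -> wide: spread the variables back into their own columns.
wide_again <- long %>%
  pivot_wider(names_from = marker, values_from = value)

long
wide_again
```

Starting from a long table instead, the same two calls would simply be applied in the opposite order.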