This code was prepared by Tessa Lloyd in R. It supports the reproducibility of the research outputs reported in the manuscript "Multidimensional analysis of immune response identified biomarkers of recent Mycobacterium tuberculosis infection".
It consists of four R scripts covering the key analysis steps in the manuscript:
1. Decision Trees.R - code used to build the simple classification tree, tune the random forest hyperparameters via cross-validation, and build the final random forest model with the tuned hyperparameters.
2. Internal Validation.R - code used to internally validate the final logistic regression model using cross-validation.
3. MTP-EN.R - code used to build a multiple tuning parameter elastic net model. The function returns the average area under the receiver operating characteristic curve (AUC) for each weight parameter.
4. Standard EN.R - code used to tune a standard elastic net model via cross-validation, fit the final model to the data, and identify the model's non-zero coefficients.
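
As a rough illustration of the kind of procedure described for Internal Validation.R, the sketch below estimates a cross-validated AUC for a logistic regression model using base R only. It is not the code from the scripts: the function names (auc, cv_auc_logistic), the simulated data, the variable names (marker1, marker2, recent), the number of folds, and the unstratified fold assignment are all assumptions made for the example.

```r
# Rank-based (Mann-Whitney) estimate of the area under the ROC curve.
# `labels` must be coded 0/1; `scores` are predicted probabilities.
auc <- function(labels, scores) {
  r <- rank(scores)                      # average ranks handle ties
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# k-fold cross-validated AUC for a logistic regression model.
# Folds are assigned at random and are NOT stratified by outcome, so every
# fold must happen to contain both classes for the AUC to be defined.
cv_auc_logistic <- function(formula, data, outcome, k = 5, seed = 1) {
  set.seed(seed)
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  fold_auc <- numeric(k)
  for (i in seq_len(k)) {
    train <- data[folds != i, , drop = FALSE]
    test  <- data[folds == i, , drop = FALSE]
    fit   <- glm(formula, data = train, family = binomial())
    pred  <- predict(fit, newdata = test, type = "response")
    fold_auc[i] <- auc(test[[outcome]], pred)
  }
  mean(fold_auc)                         # average AUC across the k folds
}

# Simulated data standing in for the study dataset (purely illustrative).
set.seed(42)
n <- 200
sim <- data.frame(marker1 = rnorm(n), marker2 = rnorm(n))
sim$recent <- rbinom(n, 1, plogis(1.5 * sim$marker1 - 1.0 * sim$marker2))

mean_auc <- cv_auc_logistic(recent ~ marker1 + marker2, sim,
                            outcome = "recent", k = 5)
print(mean_auc)
```

The same fold loop could in principle wrap any model that produces predicted probabilities, which is why the AUC calculation is kept in a separate helper.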
Each script lists the libraries it uses, all of which are readily available on CRAN.
The goal of the manuscript was to identify a biomarker of recent Mycobacterium tuberculosis infection from a high-dimensional, data-integrated dataset.
See the full manuscript for more details.
Funding
Immunological significance of QuantiFERON TB Gold In-tube reversions in settings with high burden of tuberculosis
National Institute of Allergy and Infectious Diseases