Machine learning (ML) already accelerates discoveries in many scientific fields and is the driver behind several new products. Recently, growing sample sizes have enabled the use of ML approaches in larger omics studies. This work provides a guide through a typical analysis of an omics dataset using ML. As an example, this chapter demonstrates how to build a model that predicts Drug-Induced Liver Injury from transcriptomics data contained in the LINCS L1000 dataset. Each section covers best practices and pitfalls, from data exploration and model training (including hyperparameter search) to validation and analysis of the final model. The code to reproduce the results is available at https://github.com/Evotec-Bioinformatics/ml-from-omics.
               