Cheminformatics

Internal training, University of Medicine and Pharmacy at Ho Chi Minh City, Department of Organic Chemistry, 2022

This is a set of eight (8) tutorials covering the basics of cheminformatics, run in the free Google Colab cloud-computing environment in Summer 2022.

Introduction

These tutorials were created between January and June 2022 as part of the MedAI Training Session. They are designed to run entirely on Google Colab and to be accessible remotely through a web browser.

Each tutorial includes a brief introduction to the activities to be performed, installation instructions for the open-source software used in the session, and several programming, visualization, and data analysis activities to complete during the tutorial.

After this training session, you will be familiar with:

  • RDKit, a package used in cheminformatics
  • SMILES and SMARTS notation
  • Visualization of molecules and databases
  • ADMET and virtual screening filters
  • Molecular descriptors and fingerprints
  • Molecular similarity
  • Clustering techniques
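To give a flavor of the molecular similarity topic, here is a minimal sketch of the Tanimoto coefficient, a common way to compare two fingerprints. It is plain Python rather than the tutorials' own code: fingerprints are assumed to be sets of on-bit indices, and the names `tanimoto`, `fp_x`, `fp_y`, and `fp_z` are hypothetical. In the labs, RDKit would normally supply both the fingerprints and the similarity function (for example `DataStructs.TanimotoSimilarity`).

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        # Convention chosen for this sketch: two empty fingerprints score 0.0.
        return 0.0
    shared = len(fp_a & fp_b)
    # |A ∪ B| = |A| + |B| - |A ∩ B|
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical toy fingerprints (not derived from real molecules).
fp_x = {1, 4, 7, 9}
fp_y = {1, 4, 8, 9, 12}
fp_z = {2, 3}

print(tanimoto(fp_x, fp_y))  # 3 shared bits out of 6 distinct bits -> 0.5
print(tanimoto(fp_x, fp_z))  # no shared bits -> 0.0
```

A value of 1.0 means the two fingerprints are identical and 0.0 means they share no bits; note that identical fingerprints do not guarantee identical molecules.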

Description of the Tutorials

The following is a brief description of each tutorial, along with the open-source software used for each task:

| Tutorial | Description | Software |
|---|---|---|
| Lab.01 | Warm-up on Colab and brief review of RDKit functions | RDKit, mols2grid, tqdm, sklearn, pingouin |
| Lab.02 | Understanding SMILES structure and different types of data files | RDKit, mols2grid, tqdm |
| Lab.03 | Understanding SMARTS structure | RDKit, mols2grid, tqdm |
| Lab.04 | Visualization techniques for molecules and databases | RDKit, mols2grid, tqdm, sklearn, umap-learn |
| Lab.05, Homework | Drug-likeness filtration (ADMET, PAINS, …) | RDKit, mols2grid, tqdm, seaborn, sklearn, pingouin, PyTDC |
| Lab.06 | Definition of molecular descriptors and fingerprints | RDKit, mols2grid, tqdm, mordred, padelpy, CATS2D, map4, tmap, mhfp, mol2vec |
| Lab.07 | Molecular similarity with RDKit | RDKit, mols2grid, tqdm |
| Lab.08 | Molecular clustering with the Butina algorithm | RDKit, mols2grid, tqdm |
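As a rough preview of the clustering tutorial, the sketch below shows the idea behind Butina clustering in plain Python. It is an illustration under stated assumptions, not the code used in Lab.08: fingerprints are toy sets of on-bit indices, the distance is 1 minus the Tanimoto coefficient, and the names `tanimoto_distance`, `butina_clusters`, and `toy_fps` are hypothetical. In practice RDKit ships its own implementation (`rdkit.ML.Cluster.Butina.ClusterData`).

```python
def tanimoto_distance(fp_a, fp_b):
    """1 - Tanimoto coefficient for fingerprints given as sets of on-bit indices."""
    shared = len(fp_a & fp_b)
    union = len(fp_a) + len(fp_b) - shared
    return 1.0 - (shared / union if union else 0.0)


def butina_clusters(fingerprints, cutoff):
    """Simplified Butina clustering; returns tuples of indices, centroid first."""
    n = len(fingerprints)

    # Step 1: for every item, list the other items within the distance cutoff.
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i):
            if tanimoto_distance(fingerprints[i], fingerprints[j]) <= cutoff:
                neighbors[i].append(j)
                neighbors[j].append(i)

    # Step 2: visit candidates from most to fewest neighbors (ties by index).
    order = sorted(range(n), key=lambda i: (-len(neighbors[i]), i))

    # Step 3: each unassigned candidate becomes a centroid and claims
    # all of its neighbors that are still unassigned.
    assigned = [False] * n
    clusters = []
    for centroid in order:
        if assigned[centroid]:
            continue
        members = [centroid] + [m for m in neighbors[centroid] if not assigned[m]]
        for m in members:
            assigned[m] = True
        clusters.append(tuple(members))
    return clusters


# Hypothetical toy fingerprints (not derived from real molecules).
toy_fps = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {1, 2, 3, 4, 5},
    {10, 11, 12, 13},
    {10, 11, 12, 14},
    {20, 21},
]

print(butina_clusters(toy_fps, cutoff=0.5))
# -> [(0, 1, 2), (3, 4), (5,)]
```

The cutoff is a distance, so a smaller value gives tighter and more numerous clusters. Items with no neighbors inside the cutoff end up as singletons, as with the last fingerprint here.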