

Annals of Chemical Science Research

Fine-Tuning ChemBERTa-2 for Aqueous Solubility Prediction

Submission: May 08, 2023; Published: May 19, 2023

ISSN: 2688-8394
Volume 4, Issue 1

DOI: 10.31031/ACSR.2023.04.000578


Traditional machine-learning techniques for predicting physical-chemical properties often require the calculation and selection of molecular descriptors. Calculating descriptors can be time-consuming and computationally expensive, and there is no guarantee that the chosen descriptors will capture every relevant feature, especially when predicting novel endpoints. In this study, we demonstrate the effectiveness of transformer models in predicting physical-chemical endpoints by fine-tuning the open ChemBERTa-2 model to predict aqueous solubility directly from molecular structure. The fine-tuned model achieves accuracy comparable to that of traditional machine-learning techniques without any descriptor calculation or selection. Our findings suggest that transformer models could provide an efficient and streamlined method for predicting physical-chemical properties directly from molecular structure.

Keywords: Transformer models; ChemBERTa-2; SMILES; Cheminformatics; Physical-chemical property prediction
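
To make "directly from structure" concrete, the following minimal sketch shows how a SMILES string can be turned into a sequence of token ids, the kind of input a transformer consumes, without computing any molecular descriptors. The regular expression, the function names `tokenize_smiles` and `encode`, and the on-the-fly vocabulary are all assumptions introduced for illustration; this is a simplified stand-in, not the tokenizer used by ChemBERTa-2 or the authors' pipeline.

```python
import re

# Hypothetical, simplified SMILES tokenizer for illustration only.
# It is NOT the tokenizer shipped with ChemBERTa-2.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"            # bracket atoms, e.g. [Na+] or [C@@H]
    r"|Br|Cl"                # two-letter atoms of the organic subset
    r"|%\d{2}"               # two-digit ring-closure labels
    r"|[A-Za-z]"             # one-letter atoms (aromatic atoms are lowercase)
    r"|\d"                   # single-digit ring-closure labels
    r"|[()=#\-+\\/.:~@*$]"   # branches, bonds, and other symbols
)


def tokenize_smiles(smiles):
    """Split a SMILES string into tokens; fail if any character is unmatched."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize SMILES: {smiles!r}")
    return tokens


def encode(smiles, vocab):
    """Map tokens to integer ids, growing the (assumed) vocabulary as needed."""
    return [vocab.setdefault(tok, len(vocab)) for tok in tokenize_smiles(smiles)]


vocab = {}
ethanol_ids = encode("CCO", vocab)              # ethanol
acetic_acid_ids = encode("CC(=O)O", vocab)      # acetic acid
```

In a fine-tuning setup, id sequences like these would be fed to the pretrained model with a single regression output for solubility. A real workflow would use the tokenizer and fixed vocabulary distributed with the pretrained checkpoint rather than one built on the fly.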
