person profile

Dietrich Klakow

Dietrich Klakow is a researcher or builder tracked in the Angestrom contributor network.

3 Connections

1 Papers

0 Models

0 Repos

0 News

Papers · 1

Probing Chemical Language Models: Effects of Pre-training and Fine-tuning

Chemical language models (CLMs) are trained with linearized representations such as SMILES, yet it remains unclear which chemically meaningful substructures they encode. To foster a better understanding of CLMs, we conduct a systematic study and probe for 78 molecular substructures across eight pre-trained and six randomly initialized models. We furthermore study how fine-tuning on chemical downstream tasks affects the learned representations of molecular substructures. Our results show that pre-training generally improves molecular structure awareness of CLMs, particularly in the upper layers