Paper · arXiv · Trust 82 · Primary · Published yesterday · Live · 19h ago

Probing Chemical Language Models: Effects of Pre-training and Fine-tuning

Chemical language models (CLMs) are trained with linearized representations such as SMILES, yet it remains unclear which chemically meaningful substructures they encode. To foster a better understanding of CLMs, we conduct a systematic study and probe for 78 molecular substructures across eight pre-trained and six randomly initialized models. We furthermore study how fine-tuning on chemical downstream tasks affects the learned representations of molecular substructures. Our results show that pre-training generally improves molecular structure awareness of CLMs, particularly in the upper layers

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

  • Linked via arXiv author: Anna Karnysheva

    Probing Chemical Language Models: Effects of Pre-training and Fine-tuning

  • Linked via arXiv author: Dietrich Klakow

    Probing Chemical Language Models: Effects of Pre-training and Fine-tuning

  • Linked via arXiv author: Ji-Ung Lee

    Probing Chemical Language Models: Effects of Pre-training and Fine-tuning
