paper · arXiv · Trust 82 · Primary · Published yesterday · Live · 19h ago

Probing Chemical Language Models: Effects of Pre-training and Fine-tuning

Chemical language models (CLMs) are trained with linearized representations such as SMILES, yet it remains unclear which chemically meaningful substructures they encode. To foster a better understanding of CLMs, we conduct a systematic study and probe for 78 molecular substructures across eight pre-trained and six randomly initialized models. We furthermore study how fine-tuning on chemical downstream tasks affects the learned representations of molecular substructures. Our results show that pre-training generally improves molecular structure awareness of CLMs, particularly in the upper layers

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

Linked via arXiv author: Anna Karnysheva → Probing Chemical Language Models: Effects of Pre-training and Fine-tuning
Linked via arXiv author: Dietrich Klakow → Probing Chemical Language Models: Effects of Pre-training and Fine-tuning
Linked via arXiv author: Ji-Ung Lee → Probing Chemical Language Models: Effects of Pre-training and Fine-tuning

Covers

news: Bridging three-dimensional molecular structures and artificial intelligence with a conformation description language
news: Reshaping biomolecular structure prediction through strategic conformational exploration with HelixFold-S1

Implements

repo: chrisliu298/awesome-llm-unlearning

authored (incoming)

person: Anna Karnysheva
person: Dietrich Klakow
person: Ji-Ung Lee

Covers (incoming)

news: Efficient and valid large molecule generation via self-supervised generative models - Nature

Related across the graph

repo: chrisliu298/awesome-llm-unlearning
news: Efficient and valid large molecule generation via self-supervised generative models - Nature
news: Reshaping biomolecular structure prediction through strategic conformational exploration with HelixFold-S1
person: Anna Karnysheva
news: Bridging three-dimensional molecular structures and artificial intelligence with a conformation description language
person: Ji-Ung Lee
person: Dietrich Klakow

Topics

cs.LG