FANTASIA is an advanced pipeline for the automatic functional annotation of protein sequences using state-of-the-art protein language models. It integrates deep learning embeddings and in-memory ...
We benchmark on the community-standard Dalke NN dataset (1,000 high-similarity ChEMBL pairs), the same dataset widely used by RDKit, CDK, and the academic MCS literature. Identical SMILES input, same ...