Deep contrastive learning enables genome-wide virtual screening.
Jia, Y., Gao, B., Tan, J., Zheng, J., Hong, X., Zhu, W., Tan, H., Xiao, Y., Tan, L., Cai, H., Huang, Y., Deng, Z., Wu, X., Jin, Y., Yuan, Y., Tian, J., He, W., Ma, W., Zhang, Y., Liu, L., Yan, C., Zhang, W., Lan, Y. (2026) Science 391: eads9530-eads9530
- PubMed: 41505557 
- DOI: https://doi.org/10.1126/science.ads9530
- Primary Citation of Related Structures: 9JEL, 9JF3
- PubMed Abstract:
Recent breakthroughs in protein structure prediction have opened new avenues for genome-wide drug discovery, yet existing virtual screening methods remain computationally prohibitive. We present DrugCLIP, a contrastive learning framework that achieves ultrafast and accurate virtual screening, up to 10 million times faster than docking, while consistently outperforming various baselines on in silico benchmarks. In wet-lab validations, DrugCLIP achieved a 15% hit rate for norepinephrine transporter, and structures of two identified inhibitors were determined in complex with the target protein. For thyroid hormone receptor interactor 12, a target that lacks holo structures and small-molecule binders, DrugCLIP achieved a 17.5% hit rate using only AlphaFold2-predicted structures. Finally, we released GenomeScreenDB, an open-access database providing precomputed results for ~10,000 human proteins screened against 500 million compounds, pioneering a drug discovery paradigm in the post-AlphaFold era.
- Organizational Affiliation: Institute for AI Industry Research (AIR), Tsinghua University, Beijing, China.