AutoScreen

Automating Knowledge-based Target Selection for Functional Screening

Authors

Yuanhao Qu¹,²,³; Steven Hui⁴; Oswaldo Martinez²; Ming Yin⁵; Di Yin¹,²,³; Xiaotong Wang¹,²,³; Kexin Huang⁶; Hanchen Wang⁶,⁷; Mengdi Wang⁵,*; Le Cong¹,²,³,*

¹Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA

²Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA

³Cancer Biology Program, Stanford University School of Medicine, Stanford, CA 94305, USA

⁴Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA

⁵Center for Statistics and Machine Learning, Department of Electrical and Computer Engineering, Princeton University, Princeton, NJ 08544, USA

⁶Department of Computer Science, Stanford University, Stanford, CA 94305, USA

⁷Genentech, South San Francisco, CA 94080, USA

*Corresponding authors: mengdiw@princeton.edu (M.W.), congle@stanford.edu (L.C.)

Abstract

Functional screening is a cornerstone of modern biology and drug discovery, enabling systematic interrogation of gene function and cellular processes. However, the growing complexity of library design and data interpretation poses significant challenges, especially as screening applications expand beyond essentiality in standard cell line models. Here we introduce AutoScreen, a large language model (LLM)-driven multi-agent system for automating candidate selection and analysis in functional screens. AutoScreen integrates diverse biological knowledge—from scientific literature and KEGG pathways to ClinVar variants, gene/protein expression, and interaction networks—to generate contextually relevant screening candidates. It features LLM-Select, a module that prioritizes information sources based on experimental context, and employs reinforcement fine-tuning to train a task-specialized model optimized for biological reasoning. Unlike black-box systems, AutoScreen supports transparent, interactive decision-making, offering users full visibility into the rationale behind each prediction. We evaluated AutoScreen on a benchmark of 400 genome-scale CRISPRko, CRISPRi, and CRISPRa screens, and found it significantly outperforms general-purpose LLMs in both accuracy and biological relevance under zero-shot conditions. In experimental validation, AutoScreen identified novel regulators of melanoma resistance to natural killer (NK) cells, several of which were confirmed through wet-lab assays. Our results demonstrate that AutoScreen is a powerful, interpretable AI agent system capable of accelerating functional screening and biological discovery.

How AutoScreen Works

AutoScreen is a multi-agent system that integrates diverse biological knowledge sources to automate target selection for functional screening.

Literature Review

Comprehensive knowledge extraction

AutoScreen analyzes scientific literature to extract relevant information about genes, proteins, and pathways related to the experimental context.

Knowledge Integration

Multi-source data synthesis

The system integrates data from multiple sources including KEGG pathways, ClinVar variants, gene expression databases, and protein interaction networks.
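As a rough illustration of what multi-source synthesis can look like, the sketch below merges per-source annotations into one record per gene. It is not AutoScreen's implementation: the function name `merge_evidence`, the source keys, and the gene names and notes are all hypothetical placeholders chosen for this example.

```python
from collections import defaultdict


def merge_evidence(sources):
    """Combine per-source gene annotations into one record per gene.

    `sources` maps a source name (e.g. "kegg") to a dict of gene -> note.
    Returns a dict of gene -> {source name: note}.
    """
    merged = defaultdict(dict)
    for source_name, gene_notes in sources.items():
        for gene, note in gene_notes.items():
            merged[gene][source_name] = note
    return dict(merged)


# Toy, made-up records for illustration only (not real annotations).
example_sources = {
    "kegg": {"GENE_A": "pathway X", "GENE_B": "pathway Y"},
    "clinvar": {"GENE_A": "variant record"},
    "expression": {"GENE_B": "high in tissue Z", "GENE_C": "low in tissue Z"},
}

merged = merge_evidence(example_sources)
for gene in sorted(merged):
    print(gene, merged[gene])
```

Keeping the source name attached to every note means a later step can still show which database each piece of evidence came from.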

Contextual Prioritization

Experiment-specific ranking

LLM-Select prioritizes information sources based on experimental context, ensuring that the most relevant data is used for target selection.
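One simple way to picture context-dependent prioritization is a weighted sum in which the weights on each information source change with the experiment. The sketch below assumes that framing; how LLM-Select actually chooses and uses sources is not described here, so the hard-coded weights (standing in for a model's output), the function `rank_candidates`, and the gene scores are all hypothetical.

```python
def rank_candidates(evidence, source_weights):
    """Rank genes by a weighted sum of their per-source scores.

    `evidence` maps gene -> {source name: score in [0, 1]}.
    `source_weights` maps source name -> weight; unknown sources get 0.
    Ties are broken alphabetically so the ordering is deterministic.
    """
    totals = {
        gene: sum(source_weights.get(src, 0.0) * score
                  for src, score in per_source.items())
        for gene, per_source in evidence.items()
    }
    return sorted(totals, key=lambda gene: (-totals[gene], gene))


# Toy scores for illustration only.
evidence = {
    "GENE_A": {"literature": 1.0, "pathway": 0.0},
    "GENE_B": {"literature": 0.0, "pathway": 1.0},
}

# Hypothetical weights for two different experimental contexts.
literature_heavy = {"literature": 0.8, "pathway": 0.2}
pathway_heavy = {"literature": 0.2, "pathway": 0.8}

print(rank_candidates(evidence, literature_heavy))
print(rank_candidates(evidence, pathway_heavy))
```

The same evidence yields a different ordering under each set of weights, which is the effect that experiment-specific ranking is meant to capture.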

Transparent Decision-Making

Explainable target selection

Unlike black-box systems, AutoScreen provides full visibility into the rationale behind each prediction, supporting interactive decision-making.

Key Features

AutoScreen combines LLM-driven agents with comprehensive biological knowledge to automate candidate selection and analysis in functional screens.

Multi-Agent Architecture

Specialized agents for comprehensive analysis

AutoScreen employs a team of specialized agents that work together to analyze different aspects of biological data, from literature review to pathway analysis.
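A minimal sketch of the general pattern follows, assuming each agent is a function that returns findings with a short rationale and a coordinator groups them by gene. The agent names, the `Finding` record, and the returned data are hypothetical stand-ins, not AutoScreen's actual agents or outputs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    gene: str
    agent: str
    rationale: str  # kept so every suggestion can be traced and explained


Agent = Callable[[str], List[Finding]]


def literature_agent(context: str) -> List[Finding]:
    # Placeholder: a real agent would search and summarize papers.
    return [Finding("GENE_A", "literature", f"discussed with '{context}' (toy data)")]


def pathway_agent(context: str) -> List[Finding]:
    # Placeholder: a real agent would query a pathway database.
    return [
        Finding("GENE_A", "pathway", "member of a toy pathway"),
        Finding("GENE_B", "pathway", "member of a toy pathway"),
    ]


def run_agents(context: str, agents: List[Agent]) -> Dict[str, List[Finding]]:
    """Run each agent on the same context and group findings by gene."""
    by_gene: Dict[str, List[Finding]] = {}
    for agent in agents:
        for finding in agent(context):
            by_gene.setdefault(finding.gene, []).append(finding)
    return by_gene


report = run_agents("NK cell resistance", [literature_agent, pathway_agent])
for gene, findings in report.items():
    for f in findings:
        print(f"{gene}: [{f.agent}] {f.rationale}")
```

Because each finding carries its own rationale, the grouped report doubles as an explanation of why a gene was suggested and by which agent.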

Reinforcement Fine-Tuning

Optimized biological reasoning

AutoScreen uses reinforcement fine-tuning to train a task-specialized model optimized for biological reasoning, with the aim of improving the accuracy and relevance of its predictions.

Experimental Validation

Proven in real-world applications

In experimental validation, AutoScreen identified novel regulators of melanoma resistance to natural killer (NK) cells, several of which were confirmed through wet-lab assays.

Case Study: Melanoma Resistance

How AutoScreen identified novel regulators of melanoma resistance to natural killer (NK) cells

Challenge

Researchers needed to identify genes that regulate melanoma resistance to natural killer (NK) cells, a complex immunological process with multiple potential pathways and mechanisms.

Traditional approaches would require extensive literature review, pathway analysis, and expert knowledge to design an effective screening library.

Solution

AutoScreen analyzed scientific literature, pathway databases, and gene expression data to identify potential regulators of NK cell resistance in melanoma.

The system prioritized candidates based on their relevance to the experimental context and provided detailed explanations for each selection.

Results

AutoScreen identified several regulators of melanoma resistance to NK cells that had not previously been associated with this process.

Several of these candidates were confirmed through wet-lab assays, supporting AutoScreen's ability to accelerate functional screening and biological discovery.