AutoScreen

Automating Knowledge-based Target Selection for Functional Screening

Stanford University
UC San Diego
Princeton University
Genentech

Authors

Yuanhao Qu1,2,3, Steven Hui4, Oswaldo Martinez2, Ming Yin5, Di Yin1,2,3, Xiaotong Wang1,2,3, Kexin Huang6, Hanchen Wang6,7, Mengdi Wang5,*, Le Cong1,2,3,*

1Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA

2Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA

3Cancer Biology Program, Stanford University School of Medicine, Stanford, CA 94305, USA

4Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA

5Center for Statistics and Machine Learning, Department of Electrical and Computer Engineering, Princeton University, Princeton, NJ 08544, USA

6Department of Computer Science, Stanford University, Stanford, CA 94305, USA

7Genentech, South San Francisco, CA 94080, USA

*Corresponding authors: mengdiw@princeton.edu (M.W.), congle@stanford.edu (L.C.)

Abstract

Functional screening is a cornerstone of modern biology and drug discovery, enabling systematic interrogation of gene function and cellular processes. However, the growing complexity of library design and data interpretation poses significant challenges, especially as screening applications expand beyond essentiality in standard cell line models. Here we introduce AutoScreen, a large language model (LLM)-driven multi-agent system for automating candidate selection and analysis in functional screens. AutoScreen integrates diverse biological knowledge—from scientific literature and KEGG pathways to ClinVar variants, gene/protein expression, and interaction networks—to generate contextually relevant screening candidates. It features LLM-Select, a module that prioritizes information sources based on experimental context, and employs reinforcement fine-tuning to train a task-specialized model optimized for biological reasoning. Unlike black-box systems, AutoScreen supports transparent, interactive decision-making, offering users full visibility into the rationale behind each prediction. We evaluated AutoScreen on a benchmark of 400 genome-scale CRISPRko, CRISPRi, and CRISPRa screens, and found it significantly outperforms general-purpose LLMs in both accuracy and biological relevance under zero-shot conditions. In experimental validation, AutoScreen identified novel regulators of melanoma resistance to natural killer (NK) cells, several of which were confirmed through wet-lab assays. Our results demonstrate that AutoScreen is a powerful, interpretable AI agent system capable of accelerating functional screening and biological discovery.

How AutoScreen Works

AutoScreen is a multi-agent system that integrates diverse biological knowledge sources to automate target selection for functional screening

AutoScreen Framework
Literature Review
Comprehensive knowledge extraction

AutoScreen analyzes scientific literature to extract relevant information about genes, proteins, and pathways related to the experimental context.

Knowledge Integration
Multi-source data synthesis

The system integrates data from multiple sources including KEGG pathways, ClinVar variants, gene expression databases, and protein interaction networks.
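As a rough illustration of multi-source synthesis, the sketch below groups evidence about each gene while keeping track of which source it came from. The gene names, notes, and the `integrate` helper are hypothetical placeholders and do not reflect AutoScreen's actual data model or database clients.

```python
from collections import defaultdict

# Hypothetical evidence records: (gene symbol, source name, short note).
# These are illustrative placeholders, not AutoScreen output.
EVIDENCE = [
    ("GENE_A", "kegg", "member of a pathway of interest"),
    ("GENE_A", "clinvar", "has annotated variants"),
    ("GENE_B", "expression", "expressed in the cell model"),
    ("GENE_A", "kegg", "member of a pathway of interest"),  # exact duplicate
]


def integrate(evidence):
    """Group evidence by gene, dropping exact duplicates and keeping each note's source."""
    merged = defaultdict(list)
    seen = set()
    for gene, source, note in evidence:
        key = (gene, source, note)
        if key in seen:
            continue
        seen.add(key)
        merged[gene].append({"source": source, "note": note})
    return dict(merged)


profiles = integrate(EVIDENCE)
for gene, items in profiles.items():
    sources = sorted({item["source"] for item in items})
    print(gene, sources)
```

Keeping the source next to every note is what later allows a candidate's rationale to be traced back to where the evidence came from.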

Contextual Prioritization
Experiment-specific ranking

LLM-Select prioritizes information sources based on experimental context, ensuring that the most relevant data is used for target selection.
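One way to picture context-dependent prioritization is as a set of per-source weights that change with the experiment. In the minimal sketch below, a fixed lookup table stands in for the choice that LLM-Select makes; the context names, weights, gene scores, and the `rank_genes` function are all assumptions made for illustration, not the published method.

```python
# Hypothetical context-dependent source weights. In AutoScreen the LLM-Select
# module makes this choice; the fixed table here only stands in for it.
SOURCE_WEIGHTS = {
    "immune_evasion": {"literature": 0.5, "interaction": 0.3, "expression": 0.2},
    "essentiality": {"expression": 0.6, "literature": 0.2, "interaction": 0.2},
}

# Illustrative per-gene evidence scores in [0, 1] for each source.
GENE_SCORES = {
    "GENE_A": {"literature": 0.9, "interaction": 0.4, "expression": 0.1},
    "GENE_B": {"literature": 0.2, "interaction": 0.5, "expression": 0.9},
}


def rank_genes(context, gene_scores, source_weights):
    """Rank genes by a weighted sum of their per-source scores for one context."""
    weights = source_weights[context]
    totals = {
        gene: sum(weights.get(src, 0.0) * score for src, score in scores.items())
        for gene, scores in gene_scores.items()
    }
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


for context in SOURCE_WEIGHTS:
    ranking = rank_genes(context, GENE_SCORES, SOURCE_WEIGHTS)
    print(context, [gene for gene, _ in ranking])
```

With these placeholder numbers the same two genes swap rank between the two contexts, which is the behavior the prioritization step is meant to provide.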

Transparent Decision-Making
Explainable target selection

Unlike black-box systems, AutoScreen provides full visibility into the rationale behind each prediction, supporting interactive decision-making.

Key Features

AutoScreen combines an LLM-driven multi-agent system with diverse biological knowledge sources to automate candidate selection and analysis in functional screens

Multi-Agent Architecture
Specialized agents for comprehensive analysis

AutoScreen employs a team of specialized agents that work together to analyze different aspects of biological data, from literature review to pathway analysis.
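The coordination pattern can be sketched as agents that each write candidates and a rationale into a shared workspace. The two agents below are stubs that return fixed placeholder findings; a real agent would call an LLM or query a database. The agent names, the `Workspace` class, and the `run_agents` coordinator are hypothetical and not taken from the AutoScreen implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    """Shared state passed between agents: the question and gene -> rationale notes."""
    question: str
    candidates: dict = field(default_factory=dict)


def literature_agent(ws):
    # Stub: a real agent would reason over retrieved papers with an LLM.
    ws.candidates.setdefault("GENE_A", []).append(
        "literature: reported in a related context"
    )


def pathway_agent(ws):
    # Stub: a real agent would look up pathway membership.
    for gene in ("GENE_A", "GENE_B"):
        ws.candidates.setdefault(gene, []).append(
            "pathway: member of a relevant pathway"
        )


def run_agents(question, agents):
    """Run each agent in order over one shared workspace and return it."""
    ws = Workspace(question)
    for agent in agents:
        agent(ws)
    return ws


ws = run_agents(
    "regulators of melanoma resistance to NK cells",
    [literature_agent, pathway_agent],
)
for gene, reasons in ws.candidates.items():
    print(gene, reasons)
```

Because every agent appends a note instead of overwriting a score, the final candidate list carries the reasons contributed by each agent.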

Reinforcement Fine-Tuning
Optimized biological reasoning

The system uses reinforcement fine-tuning to train a task-specialized model optimized for biological reasoning, with the goal of improving the accuracy and biological relevance of its predictions.

Experimental Validation
Proven in real-world applications

In experimental validation, AutoScreen identified novel regulators of melanoma resistance to natural killer (NK) cells, several of which were confirmed through wet-lab assays.

Case Study: Melanoma Resistance

How AutoScreen identified novel regulators of melanoma resistance to natural killer (NK) cells

Challenge

Researchers needed to identify genes that regulate melanoma resistance to natural killer (NK) cells, a complex immunological process with multiple potential pathways and mechanisms.

Traditional approaches would require extensive literature review, pathway analysis, and expert knowledge to design an effective screening library.

Solution

AutoScreen analyzed scientific literature, pathway databases, and gene expression data to identify potential regulators of NK cell resistance in melanoma.

The system prioritized candidates based on their relevance to the experimental context and provided detailed explanations for each selection.

Results

AutoScreen identified novel candidate regulators of melanoma resistance to NK cells.

Several of these candidates were confirmed through wet-lab assays, supporting AutoScreen's ability to accelerate functional screening and biological discovery.

The system's transparent decision-making process allowed researchers to understand the rationale behind each prediction and refine their experimental approach.

Ready to Accelerate Your Functional Screening?

Contact us to learn how AutoScreen can help you identify optimal targets for your functional screening experiments.