Open source // MIT License

Research a disease
like a scientist would.

Biotech Research Studio puts the entire published research landscape for a disease at your fingertips. Papers, clinical trials, drug targets, protein structures. AI reads it all. You ask the questions. Configure it for any disease.

The research exists. Finding it is the hard part.

For a widely studied disease, thousands of papers are published every year. Clinical trials open and close. New drug targets are identified. But this information is scattered across databases, written in specialist language, and hard to navigate without years of training.

Too much to read

PubMed has millions of papers. For any given disease, hundreds are relevant. Nobody has time to read them all, let alone connect the dots across them.

Hard to understand

Research papers are written for specialists. The terminology, methodology, and implications are opaque to anyone outside the field.

No single place

Papers are on PubMed. Trials are on ClinicalTrials.gov. Protein structures are in the AlphaFold database. Drug data is elsewhere. Nothing is connected.

What this tool does

Biotech Research Studio connects the dots for you. It pulls papers, reads them with AI, extracts the key findings, and puts everything in one place. You get the knowledge without the years of training.

75
papers

Full text ingested and analyzed. Not abstracts. See results →

42
drug candidates

Extracted by Gemma with target, mechanism, and evidence level. See candidates →

5
protein targets

Mapped to AlphaFold 3D structures you can inspect. View targets →

JSON
output

Structured data you can query, filter, and build on. View raw data →

The Research Pipeline

01
Discover

Papers from PubMed. Trials from ClinicalTrials.gov. Daily.

02
Read

Gemma 4 reads full papers. Methods, results, discussion.

03
Extract

Drug candidates, targets, mechanisms. Structured JSON.

04
Map

AlphaFold 3D protein structures for each target.

05
Compound

Knowledge base grows overnight. Wiki compiles itself.

06
Chat

Ask questions, get cited answers. 22 languages.
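
As a rough illustration of the Discover step, the sketch below builds a search URL for NCBI's public E-utilities ESearch endpoint. The function name and the query shape (disease name AND target symbol) are assumptions made for this example, not the studio's actual code. Nothing is sent over the network.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (public, documented by NCBI).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(disease: str, target: str, max_results: int = 20) -> str:
    """Return an ESearch URL for papers mentioning both a disease and a target.

    Hypothetical helper: the query shape is an illustrative assumption.
    """
    term = f'"{disease}" AND {target}'
    params = {
        "db": "pubmed",        # search the PubMed database
        "term": term,          # the search expression
        "retmode": "json",     # ask for a JSON response
        "retmax": max_results, # cap the number of returned IDs
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_pubmed_search_url("Huntington's disease", "MSH3", max_results=5)
print(url)
```

Fetching that URL returns a list of PubMed IDs, which a later step could use to retrieve the papers themselves.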

Gemma at Work

Real output from Experiment #4. Gemma 4 26B read 63,979 characters of a paper and returned this in under 5 minutes.

PROMPT SENT TO GEMMA 4
22 sections, 63,979 chars
system:
"You are a senior HD researcher and medicinal chemist. Focus on actionable drug repurposing opportunities."
user:
"You are screening this HD research paper for drug candidates that target somatic CAG repeat expansion.
The 5 validated targets: MSH3, FAN1, PMS1, MLH1, LIG1
FULL PAPER TEXT: [63,979 characters]
Analyze this paper. Return JSON."
GEMMA 4 RESPONSE
confidence: high
{
  "main_finding": "Reducing MSH3 expression in a dose-dependent manner effectively prevents somatic CAG repeat expansions in the CNS of HD mouse models",
  "relevance_score": 10,
  "drug_candidates": [
    {
      "name": "MSH3 di-siRNA",
      "target": "MSH3",
      "key_result": "10 nmol dose reduced striatal SEI by 78.1%",
      "evidence_level": "animal_model"
    }
  ]
}
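
A minimal sketch of how an exchange like this could be wired up, assuming the model is served through Ollama's `/api/generate` endpoint. The helper names and the validation rules are hypothetical. The field names are taken from the response shown above. The request body is only built here, not sent.

```python
import json

# Fields observed in the sample response above.
REQUIRED_KEYS = {"main_finding", "relevance_score", "drug_candidates"}
CANDIDATE_KEYS = {"name", "target", "key_result", "evidence_level"}


def build_ollama_request(system: str, user: str, model: str = "gemma4:26b") -> dict:
    """Build a JSON body for Ollama's POST /api/generate (hypothetical helper)."""
    return {
        "model": model,
        "system": system,
        "prompt": user,
        "format": "json",  # ask Ollama to constrain the output to JSON
        "stream": False,   # return one complete response, not chunks
    }


def parse_screening_response(raw: str) -> dict:
    """Parse the model's JSON text and check that the expected fields exist."""
    data = json.loads(raw)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    for candidate in data["drug_candidates"]:
        absent = CANDIDATE_KEYS - candidate.keys()
        if absent:
            raise ValueError(f"candidate missing keys: {sorted(absent)}")
    return data


# Sample text copied from the response shown above (main_finding shortened).
RAW = """{
  "main_finding": "Reducing MSH3 expression prevents somatic CAG repeat expansions",
  "relevance_score": 10,
  "drug_candidates": [{
    "name": "MSH3 di-siRNA",
    "target": "MSH3",
    "key_result": "10 nmol dose reduced striatal SEI by 78.1%",
    "evidence_level": "animal_model"
  }]
}"""

request = build_ollama_request("You are a senior HD researcher.", "Analyze this paper.")
result = parse_screening_response(RAW)
print(result["drug_candidates"][0]["name"])
```

Checking the structure before storing a result keeps one malformed response from silently corrupting the aggregated data.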

Gemma 4 did this for each of 75 papers. 4,976,515 characters total. Found 42 drug candidates across 5 validated protein targets. On a Mac M2. Cost: $0.

See full results →
CROSS-PAPER SYNTHESIS // 75 papers aggregated
gemma4:26b via ollama
Target  Confidence  Druggability  Top drug candidate                  Evidence
MSH3    90          HIGH          di-siRNA (78% expansion reduction)  animal model
PMS1    85          HIGH          branaplam (splice modulator)        cell model
FAN1    80          MEDIUM        miR-124-3p antagomir                cell model
MLH1    70          MEDIUM        cyclic peptide inhibitors           preclinical
LIG1    60          EMERGING      ligase fidelity enhancers           concept
AI-generated hypotheses. Not validated findings. Not medical advice. Full experiment report → Source code →
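
Because the synthesis is structured data, it can be filtered and sorted directly. The sketch below copies the five rows above into a small Python structure and shortlists targets by confidence. The class, the function and the threshold of 80 are illustrative assumptions. How the pipeline computes the confidence score is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetSummary:
    target: str
    confidence: int
    druggability: str
    top_candidate: str
    evidence: str


# Rows copied from the cross-paper synthesis table above.
SYNTHESIS = [
    TargetSummary("MSH3", 90, "HIGH", "di-siRNA", "animal model"),
    TargetSummary("PMS1", 85, "HIGH", "branaplam", "cell model"),
    TargetSummary("FAN1", 80, "MEDIUM", "miR-124-3p antagomir", "cell model"),
    TargetSummary("MLH1", 70, "MEDIUM", "cyclic peptide inhibitors", "preclinical"),
    TargetSummary("LIG1", 60, "EMERGING", "ligase fidelity enhancers", "concept"),
]


def shortlist(rows, min_confidence=80):
    """Keep rows at or above the threshold, highest confidence first."""
    kept = [row for row in rows if row.confidence >= min_confidence]
    return sorted(kept, key=lambda row: row.confidence, reverse=True)


for row in shortlist(SYNTHESIS):
    print(f"{row.target}: {row.top_candidate} ({row.evidence})")
```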
Run on Kaggle

Try it yourself. Free GPU.

The full pipeline runs on Kaggle's free T4 GPU. Pick a disease, point it at PubMed, watch Gemma 4 extract drug candidates from real papers.

Open Kaggle Notebook →
Inference Stats
Model: gemma4:26b
Context window: 65,536 tokens
Runtime: Ollama (local)
Hardware: Mac M2, 24 GB
Per-paper time: ~4 min
Total cost: $0.00
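
A quick back-of-the-envelope check of what these figures imply for a full run, using only numbers quoted on this page. The per-paper time is approximate, so the total is an estimate, not a measured result.

```python
papers = 75
total_chars = 4_976_515
minutes_per_paper = 4  # approximate figure quoted above

# Average paper length in characters across the 75-paper run.
avg_chars = total_chars / papers

# Estimated wall-clock time if papers are processed one after another.
total_hours = papers * minutes_per_paper / 60

print(f"average paper length: {avg_chars:,.0f} characters")
print(f"estimated total run time: {total_hours:.1f} hours")
```

At roughly five hours for 75 papers, a run of this size fits in a single overnight batch on the quoted hardware.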

Declarative Biology

Define your disease, targets, and search queries in one file. The studio builds everything else: pipelines, agents, website, chatbot.

disease:
  name: "Parkinson's Disease"
  short_name: "PD"
targets:
  - symbol: "LRRK2"
    name: "Leucine Rich Repeat Kinase 2"
  - symbol: "GBA"
  - symbol: "SNCA"
branding:
  app_name: "PD Research Studio"
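
To show how a config like this might be consumed, here is a hedged sketch that validates the minimum fields and lists the target symbols. It uses a plain Python dict standing in for the parsed file, so no YAML library is needed. The nesting (with `targets` as a top-level key) and the function name are assumptions made for this example.

```python
# A dict standing in for the parsed config file. The nesting is an assumption.
CONFIG = {
    "disease": {"name": "Parkinson's Disease", "short_name": "PD"},
    "targets": [
        {"symbol": "LRRK2", "name": "Leucine Rich Repeat Kinase 2"},
        {"symbol": "GBA"},
        {"symbol": "SNCA"},
    ],
    "branding": {"app_name": "PD Research Studio"},
}


def target_symbols(config: dict) -> list:
    """Check the minimum required fields and return target symbols in order."""
    disease = config.get("disease", {})
    if not disease.get("name"):
        raise ValueError("disease.name is required")
    targets = config.get("targets", [])
    if not targets:
        raise ValueError("at least one target is required")
    symbols = []
    for entry in targets:
        symbol = entry.get("symbol")
        if not symbol:
            raise ValueError(f"target without symbol: {entry!r}")
        symbols.append(symbol)
    return symbols


print(target_symbols(CONFIG))
```

Only `symbol` is treated as required per target here, since two of the three targets in the sample omit `name`.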
Active Node

HD Research

Huntington's Disease Studio

Enter Workspace →
Starter config

Parkinson's

LRRK2, GBA, SNCA

Fork to deploy
Starter config

ALS

SOD1, TDP-43, FUS

Fork to deploy

Your disease

Compute Agnostic Deployment
Local Laptop: Ollama, Mac M2, RTX 5090
Kaggle Node: Free GPU notebooks
Cloud Native: OpenAI, Anthropic, NIM
On-device Edge: Gemma 2B, offline

Pick a disease.
Configure the workspace.
Start contributing.

Engineers

You know Python and ML. This gives you the biomedical data layer. Fork, configure, run experiments.

Researchers

Automated literature review. The platform reads papers overnight. You focus on the science.

Patient communities

Research in plain language. 22 languages. A chatbot that cites its sources instead of hallucinating.

Foundations

Deploy a research workspace for your disease community. Open source. No vendor dependency.