Open source // MIT License

Research a disease
like a scientist would.

Biotech Research Studio puts the entire published research landscape for a disease at your fingertips. Papers, clinical trials, drug targets, protein structures. AI reads it all. You ask the questions. Configure it for any disease.

See a live workspace View Source

The research exists. Finding it is the hard part.

Thousands of papers are published every year on every disease. Clinical trials open and close. New drug targets are identified. But this information is scattered across databases, written in specialist language, and hard to navigate without years of training.

Too much to read

PubMed has millions of papers. For any given disease, hundreds are relevant. Nobody has time to read them all, let alone connect the dots across them.

Hard to understand

Research papers are written for specialists. The terminology, methodology, and implications are opaque to anyone outside the field.

No single place

Papers are on PubMed. Trials are on ClinicalTrials.gov. Protein structures are on AlphaFold. Drug data is elsewhere. Nothing is connected.

What this tool does

Biotech Research Studio connects the dots for you. It pulls papers, reads them with AI, extracts the key findings, and puts everything in one place. You get the knowledge without the years of training.

75

papers

Full text ingested and analyzed. Not abstracts. See results →

42

drug candidates

Extracted by Gemma with target, mechanism, and evidence level. See candidates →

5

protein targets

Mapped to AlphaFold 3D structures you can inspect. View targets →

JSON

output

Structured data you can query, filter, and build on. View raw data →

The Research Pipeline

Runs overnight, every night

01

Discover

Papers from PubMed. Trials from ClinicalTrials.gov. Daily.

02

Read

Gemma 4 reads full papers. Methods, results, discussion.

03

Extract

Drug candidates, targets, mechanisms. Structured JSON.

04

Map

AlphaFold 3D protein structures for each target.

05

Compound

Knowledge base grows overnight. Wiki compiles itself.

06

Chat

Ask questions, get cited answers. 22 languages.

Gemma at Work

Real output from Experiment #4. Gemma 4 26B read 63,979 characters of a paper and returned this in under 5 minutes.

Actual model output

PROMPT SENT TO GEMMA 4

22 sections, 63,979 chars

system:

"You are a senior HD researcher and medicinal chemist. Focus on actionable drug repurposing opportunities."

user:

"You are screening this HD research paper for drug candidates that target somatic CAG repeat expansion.

The 5 validated targets: MSH3, FAN1, PMS1, MLH1, LIG1

FULL PAPER TEXT: [63,979 characters]

Analyze this paper. Return JSON."

GEMMA 4 RESPONSE

confidence: high

{

"main_finding": "Reducing MSH3 expression in a dose-dependent manner effectively prevents somatic CAG repeat expansions in the CNS of HD mouse models",

"relevance_score": 10,

"drug_candidates": [{

"name": "MSH3 di-siRNA",

"target": "MSH3",

"key_result": "10 nmol dose reduced striatal SEI by 78.1%",

"evidence_level": "animal_model"

}]

}

Gemma 4 did this 75 times across 75 papers. 4,976,515 characters total. Found 42 drug candidates across 5 validated protein targets. On a Mac M2. Cost: $0.

See full results arrow_forward

CROSS-PAPER SYNTHESIS // 75 papers aggregated

gemma4:26b via ollama

Target	Confidence	Druggability	Top drug candidate	Evidence
MSH3	90	HIGH	di-siRNA (78% expansion reduction)	animal model
PMS1	85	HIGH	branaplam (splice modulator)	cell model
FAN1	80	MEDIUM	miR-124-3p antagomir	cell model
MLH1	70	MEDIUM	cyclic peptide inhibitors	preclinical
LIG1	60	EMERGING	ligase fidelity enhancers	concept

AI-generated hypotheses. Not validated findings. Not medical advice. Full experiment report → Source code →

Run on Kaggle

Try it yourself. Free GPU.

The full pipeline runs on Kaggle's free T4 GPU. Pick a disease, point it at PubMed, watch Gemma 4 extract drug candidates from real papers.

Open Kaggle Notebook open_in_new

speed Inference Stats

Model

gemma4:26b

Context window

65,536 tokens

Runtime

Ollama (local)

Hardware

Mac M2, 24GB

Per-paper time

~4 min

Total cost

$0.00

Declarative Biology

Define your disease, targets, and search queries in one file. The studio builds everything else: pipelines, agents, website, chatbot.

settings layers hub

1

disease:

2

name: "Parkinson's Disease"

3

short_name: "PD"

4

targets:

5

- symbol: "LRRK2"

6

name: "Leucine Rich Repeat Kinase 2"

7

- symbol: "GBA"

8

- symbol: "SNCA"

9

branding:

10

app_name: "PD Research Studio"

Active Node

HD Research

Huntington's Disease Studio

Enter Workspace arrow_forward

Starter config

Parkinson's

LRRK2, GBA, SNCA

Fork to deploy

Starter config

ALS

SOD1, TDP-43, FUS

Fork to deploy

add

Your disease

Compute Agnostic Deployment

laptop_mac Local Laptop Ollama, Mac M2, RTX 5090

terminal Kaggle Node Free GPU notebooks

cloud Cloud Native OpenAI, Anthropic, NIM

smartphone On-device Edge Gemma 2B, offline

Pick a disease.
Configure the workspace.
Start contributing.

terminal

Engineers

You know Python and ML. This gives you the biomedical data layer. Fork, configure, run experiments.

diversity_3

Researchers

Automated literature review. The platform reads papers overnight. You focus on the science.

science

Patient communities

Research in plain language. 22 languages. A chatbot that cites its sources, not hallucinate.

volunteer_activism

Foundations

Deploy a research workspace for your disease community. Open source. No vendor dependency.

Explore HD Studio Fork on GitHub

Research a diseaselike a scientist would.