UKB Research Skill

ukb-research is a standalone AI agent skill for early-stage UK Biobank topic development. It is designed to help users move from a broad research idea to a concise, literature-informed, and feasible study plan.

What It Does

The skill can:

  • search and summarize related PubMed-indexed studies;
  • describe existing work and its limitations;
  • identify actionable research gaps;
  • propose a UK Biobank-compatible study design;
  • map the workflow to UKBAnalytica functions, R packages, Python libraries, and external tools;
  • return a structured Markdown research scoping report.
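The literature-search step can be pictured with a minimal sketch. It assumes the public NCBI E-utilities ESearch endpoint; the helper names build_pubmed_query and build_esearch_url are hypothetical and are not part of the skill. It only builds the request URL and does not send it.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint (assumed here for illustration;
# the skill's own search mechanism is not documented in this README).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(exposure: str, outcome: str, cohort: str = "UK Biobank") -> str:
    """Combine exposure, outcome, and cohort into one PubMed search term.

    Hypothetical helper: each phrase is restricted to title/abstract and
    the phrases are joined with AND.
    """
    parts = [exposure, outcome, cohort]
    return " AND ".join(f'"{p}"[Title/Abstract]' for p in parts)


def build_esearch_url(term: str, retmax: int = 20) -> str:
    """Return an ESearch URL asking for up to `retmax` PMIDs as JSON."""
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    term = build_pubmed_query("air pollution", "atrial fibrillation")
    print(term)
    print(build_esearch_url(term, retmax=5))
```

Building the URL separately from sending it keeps the query easy to inspect and log before any network call is made.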

It is especially useful when the user has a broad idea such as an exposure-outcome pair, a disease area, an omics profile, or a prediction-model concept, but has not yet decided whether the project is novel and feasible.

Typical Use

A typical prompt looks like this:

Use $ukb-research to assess whether the association between air pollution and atrial fibrillation is a feasible UK Biobank topic, summarize existing studies and their limitations, identify the research gap, and propose an analysis plan.

The output report usually includes:

  • research question and assumptions;
  • literature background and evidence table;
  • existing-work and gap analysis;
  • feasibility assessment in UK Biobank;
  • recommended study population, exposure, outcome, covariates, and models;
  • expected results and figures;
  • software implementation plan;
  • reference verification summary.

Reference Checking

The skill includes a lightweight citation-audit script:

python3 inst/skills/ukb-research/scripts/verify_references.py report.md \
  --output reference_audit.json \
  --bibtex references.bib \
  --ris references.ris

This utility extracts PMID, DOI, and URL entries from the Markdown report, checks PMID/DOI records when network access is available, and can export BibTeX or RIS files for reference managers.
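As a rough illustration of the extraction step only, the sketch below pulls PMID, DOI, and URL strings out of Markdown text with regular expressions. It is not the code in verify_references.py: the patterns and the extract_identifiers function are assumptions made for this example, and no network check or export is performed.

```python
import re

# Illustrative patterns (assumptions, not the script's actual rules).
PMID_RE = re.compile(r"\bPMID:?\s*(\d{1,8})\b", re.IGNORECASE)
DOI_RE = re.compile(r'\b(10\.\d{4,9}/[^\s"<>)\]]+)', re.IGNORECASE)
URL_RE = re.compile(r"https?://[^\s<>)\]]+")


def extract_identifiers(markdown: str) -> dict:
    """Return de-duplicated PMIDs, DOIs, and URLs found in a Markdown string."""
    # PMIDs are kept as strings but sorted numerically.
    pmids = sorted(set(PMID_RE.findall(markdown)), key=int)
    # Trailing sentence punctuation is stripped; DOIs are lower-cased
    # because they are case-insensitive.
    dois = sorted({d.rstrip(".,;").lower() for d in DOI_RE.findall(markdown)})
    urls = sorted({u.rstrip(".,;") for u in URL_RE.findall(markdown)})
    return {"pmid": pmids, "doi": dois, "url": urls}


if __name__ == "__main__":
    sample = (
        "See PMID: 12345678 and doi:10.1000/xyz123. "
        "Also https://example.org/paper, and PMID 12345678 again."
    )
    print(extract_identifiers(sample))
```

The identifiers in the sample string are placeholders, not real references.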

Privacy Boundary

ukb-research is for literature scoping, study planning, script drafting, and interpretation of aggregate results. It must not be used to inspect or process real UK Biobank participant-level data, including row-level tables from the Research Analysis Platform (RAP), exact dates, eid values, or per-row model outputs. Any executable analysis scripts generated by the skill should be run inside the official UK Biobank RAP environment.