Conventional image QA
- Pre-selected diagnostically relevant 2D views.
- Black-box final answers with limited process visibility.
- Little pressure to preserve viewer state, coordinates, or evidence provenance.
Every viewer action, evidence artifact, and final answer can be replayed and audited across complete radiology studies and whole-slide pathology images.
Many benchmarks start from pre-selected slices, crops, or patches. Real imaging workflows require searching full studies, navigating software state, comparing views or timepoints, and documenting evidence that can be audited.
MedOpenClaw sits between a backbone VLM and standard medical imaging viewers. Agents use predefined viewer, evidence, and analysis actions, while each step is recorded as a replayable trace.
Drive 3D Slicer and QuPath through bounded actions: series selection, scrolling, windowing, panning, zooming, fusion, and viewport capture.
Export key slices, RAS coordinates, whole-slide coordinates, masks, measurements, bookmarks, and state snapshots for deterministic scoring.
Expose vetted segmentation, registration, resampling, quantitative analysis, and MONAI/VISTA3D workflows as auditable software operations.
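One way to picture the bounded viewer actions listed above is a closed vocabulary that rejects anything outside it. The sketch below is illustrative only: the action names, the parameter names, and the `parse_action` helper are assumptions made for this example, not MedOpenClaw's actual interface.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class ViewerAction(Enum):
    """Hypothetical action names mirroring the viewer actions named in the text."""
    SELECT_SERIES = "select_series"
    SCROLL = "scroll"
    WINDOW = "window"
    PAN = "pan"
    ZOOM = "zoom"
    FUSE = "fuse"
    CAPTURE_VIEWPORT = "capture_viewport"


# Assumed schema: the exact set of parameter names each action accepts.
REQUIRED_PARAMS = {
    ViewerAction.SELECT_SERIES: {"series_id"},
    ViewerAction.SCROLL: {"delta_slices"},
    ViewerAction.WINDOW: {"level", "width"},
    ViewerAction.PAN: {"dx", "dy"},
    ViewerAction.ZOOM: {"factor"},
    ViewerAction.FUSE: {"overlay_series_id", "opacity"},
    ViewerAction.CAPTURE_VIEWPORT: set(),
}


@dataclass(frozen=True)
class ActionCall:
    action: ViewerAction
    params: dict[str, Any] = field(default_factory=dict)


def parse_action(name: str, params: dict[str, Any]) -> ActionCall:
    """Accept only actions in the vocabulary, with exactly the expected parameters."""
    try:
        action = ViewerAction(name)
    except ValueError:
        raise ValueError(f"unknown action: {name!r}") from None
    expected = REQUIRED_PARAMS[action]
    if set(params) != expected:
        raise ValueError(
            f"{name} expects parameters {sorted(expected)}, got {sorted(params)}"
        )
    return ActionCall(action, dict(params))
```

Under these assumptions, `parse_action("window", {"level": 40, "width": 400})` returns a validated call, while an unlisted name such as `"delete_study"` raises `ValueError` before anything reaches the viewer.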
A MedOpenClaw episode records what the agent saw, which operations it called, which artifacts were produced, and which evidence was available when the final answer was submitted.
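An episode record of this kind might be modelled as an append-only list of steps plus a snapshot of the evidence available at submission. The following is a minimal sketch under that assumption; the `Step` and `Episode` classes and all field names are hypothetical and do not reflect the project's real trace format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class Step:
    index: int
    operation: str                 # name of the operation the agent called
    params: dict[str, Any]         # arguments passed to that operation
    observation: str               # what the agent saw, e.g. a viewport capture ID
    artifacts: list[str] = field(default_factory=list)  # files or IDs produced


@dataclass
class Episode:
    study_id: str
    steps: list[Step] = field(default_factory=list)
    final_answer: Optional[str] = None
    evidence_at_submission: list[str] = field(default_factory=list)

    def record(self, operation, params, observation, artifacts=()):
        """Append one step; refuse to modify an episode that was already submitted."""
        if self.final_answer is not None:
            raise RuntimeError("episode already closed")
        step = Step(len(self.steps), operation, dict(params), observation, list(artifacts))
        self.steps.append(step)

    def submit(self, answer):
        """Snapshot the artifacts produced so far, then store the final answer."""
        self.evidence_at_submission = [a for s in self.steps for a in s.artifacts]
        self.final_answer = answer

    def to_json(self):
        """Serialize with sorted keys so identical episodes give identical text."""
        return json.dumps(asdict(self), sort_keys=True)
```

Closing the episode at submission is a design choice in this sketch: it keeps the stored evidence list aligned with what the agent had actually produced when it answered.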
Load a complete radiology volume or whole-slide image.
Navigate slices, magnifications, fusion state, or analysis tools.
Record slices, coordinates, masks, ROIs, and viewer snapshots.
Check answers and evidence against hidden references.
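The final checking step can be illustrated with a small deterministic scorer. The benchmark's actual scoring rules are not described here, so this sketch assumes two simple criteria chosen for illustration: a case-insensitive exact match on the answer, and a Euclidean distance tolerance between the submitted RAS coordinate and a hidden reference coordinate. The `Reference` class and `score_submission` function are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """Hidden reference for one task (assumed structure)."""
    answer: str
    target_ras_mm: tuple[float, float, float]
    tolerance_mm: float


def score_submission(
    answer: str,
    evidence_ras_mm: tuple[float, float, float],
    ref: Reference,
) -> dict:
    """Score the answer and the evidence separately, with no randomness involved."""
    answer_ok = answer.strip().lower() == ref.answer.strip().lower()
    # Euclidean distance between submitted and reference coordinates, in mm.
    distance = math.dist(evidence_ras_mm, ref.target_ras_mm)
    evidence_ok = distance <= ref.tolerance_mm
    return {
        "answer_correct": answer_ok,
        "evidence_correct": evidence_ok,
        "distance_mm": distance,
    }
```

Reporting the two checks separately lets a scorer distinguish a correct answer backed by the right location from a correct answer with evidence pointing elsewhere.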
MedCopilot is an example application built on top of MedOpenClaw, where generated actions, viewer states, and evidence artifacts remain visible for clinician inspection. We present it as a demonstration of traceable interaction, not as evidence of clinical deployment or workflow-efficiency gains.
The website treats the paper as technical backing for the platform. The landing experience foregrounds the runtime, benchmark, demos, and leaderboard.
@misc{shen2026medopenclaw,
title={MedOpenClaw and MedFlow-Bench: Auditing Medical Agents in Full-Study Workflows},
author={Shen, Weixiang and others},
year={2026},
url={https://jakobshen.github.io/MedOpenClaw/}
}