---
name: biosimulant-scientific-publishing
title: Biosimulant Scientific Publishing
description: "Apply the publish-worthy quality bar for scientific Biosimulant labs: source-faithful, runnable, traceable, documented, visual, and conservative about claims."
version: 2026.05.23
tags: [biosimulant, publishing, scientific-accuracy, provenance, hub]
audience: Claude Code, Cursor, Continue, Aider
recommended: true
---

# Biosimulant Scientific Publishing

Use this skill when preparing a lab for public publication on the Biosimulant Hub.

## Biosimulant Publishing Context

Biosimulant publishes runnable scientific labs, not just model files. A publish-ready lab needs source artifacts,
traceable provenance, a `BioModule` runtime wrapper, user-facing public ports, real runtime output, visualisations,
README screenshots, and Hub run evidence. Treat source fidelity and scientific caveats as part of the package contract.

## Definition Of Scientific Correctness

For Biosimulant publishing, scientific correctness means:

- faithful execution of the bundled source model or source-derived artifact
- traceable upstream provenance
- conservative user-facing labels
- verified Biosimulant wiring
- finite, non-empty runtime outputs
- useful visualisations based on actual model outputs

Do not claim full reproduction of source papers, figures, clinical performance, or biological truth unless
reference-trajectory or source-sample parity evidence supports the claim.
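
The "finite, non-empty runtime outputs" criterion is easy to check mechanically. The sketch below is one possible check, not part of Biosimulant: `check_runtime_outputs` is a hypothetical helper, and it assumes outputs have already been collected into a plain mapping from output name to a sequence of floats.

```python
import math
from typing import Mapping, Sequence


def check_runtime_outputs(outputs: Mapping[str, Sequence[float]]) -> list[str]:
    """Return a list of problems; an empty list means the outputs pass.

    Hypothetical helper: assumes outputs are already a name -> values mapping.
    """
    problems = []
    if not outputs:
        problems.append("no outputs were produced")
    for name, values in outputs.items():
        if len(values) == 0:
            problems.append(f"{name}: empty output")
        elif not all(math.isfinite(v) for v in values):
            problems.append(f"{name}: contains NaN or infinite values")
    return problems


if __name__ == "__main__":
    # Illustrative data only; real values come from an actual lab run.
    sample = {"model_state_x": [0.0, 0.5, 1.0], "abstract_signal_level": [float("nan")]}
    for problem in check_runtime_outputs(sample):
        print(problem)
```

Passing this check says only that the numbers are usable. It does not show that they match the source paper.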

## Source-Faithful Rule

Preserve the source scientific artifact:

- SBML XML remains the source of truth for SBML labs. Use open-source `TelluriumSBMLBioModule` / Tellurium-backed
  execution instead of hand-coded equations.
- CellML files and imports remain the source of truth for CellML labs. Use open-source `LibCellMLBioModule`, libCellML,
  NumPy, and SciPy integration instead of hand-coded equations.
- ONNX files remain executable source-derived artifacts for ONNX labs. Use ONNX Runtime to validate graph load and
  inference contracts.
- Custom code must document equations, assumptions, parameters, and references.

Do not manually rewrite equations, parameters, units, or initial values unless there is a clearly documented source repair.

## Planning Table

Before bulk edits, create:

```text
<TARGET_MODEL_REPO>/tmp/publish_cleanup_plan.csv
```

Include every lab with:

```text
current_lab_folder
proposed_lab_slug
source_type
upstream_source
scientific_context
keep_fix_orphan
reason
source_artifact
runtime_wrapper
proposed_public_inputs
proposed_public_outputs
visualisation_scope
scientific_question
answer_style
caveats
validation_required
```

Every public input must state the exact source symbol, runtime parameter, tensor, initial condition, boundary condition,
stimulus, protocol control, or preprocessing contract it maps to. If none is safe, write
`no_public_input_deliberate` and explain why.
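
As a rough sketch, the plan file header and the public-input rule above could be checked with the standard `csv` module. The helper names are hypothetical, and the choice to look for the explanation in the `caveats` column is an assumption; adjust it if the explanation lives elsewhere.

```python
import csv
import io

# Column names copied from the planning table above.
PLAN_COLUMNS = [
    "current_lab_folder", "proposed_lab_slug", "source_type", "upstream_source",
    "scientific_context", "keep_fix_orphan", "reason", "source_artifact",
    "runtime_wrapper", "proposed_public_inputs", "proposed_public_outputs",
    "visualisation_scope", "scientific_question", "answer_style", "caveats",
    "validation_required",
]


def plan_header() -> str:
    """Return the CSV header line for publish_cleanup_plan.csv."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(PLAN_COLUMNS)
    return buffer.getvalue()


def validate_plan_row(row: dict[str, str]) -> list[str]:
    """Return problems for one plan row (hypothetical helper)."""
    problems = [f"missing column: {c}" for c in PLAN_COLUMNS if c not in row]
    inputs = row.get("proposed_public_inputs", "").strip()
    if not inputs:
        problems.append("proposed_public_inputs is empty")
    elif inputs == "no_public_input_deliberate" and not row.get("caveats", "").strip():
        # Assumption: the explanation is recorded in the caveats column.
        problems.append("no_public_input_deliberate needs an explanation in caveats")
    return problems
```

Rows read with `csv.DictReader` can be passed straight to `validate_plan_row`.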

## Public Naming

Make names user-friendly without weakening traceability:

- Put exact raw symbols in `maps_to`, descriptions, `variable_labels`, `species_labels`, and README mappings.
- Use clear names when source labels, components, units, or model context support them.
- Use conservative names for ambiguous symbols: `model_state_x`, `model_parameter_k`, `abstract_signal_level`.
- Never invent labels such as `infected_cells`, `tumor_burden`, `disease_score`, or `immune_response` without source
  evidence.
- Avoid raw-ish public names such as `y_y`, `v_u`, `source_v`, `A A`, or `dimensionless_state_v`.
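
The naming rules above can be approximated with a small lint pass. This is a heuristic sketch only: `audit_public_name` is hypothetical, the lower snake_case requirement is an assumption, and the single-letter-token rule is one possible reading of "raw-ish". A human still has to judge whether a name is supported by the source.

```python
import re

SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
# Prefixes taken from the conservative examples above.
CONSERVATIVE_PREFIXES = ("model_state_", "model_parameter_")
# Interpretive labels listed above that require source evidence.
NEEDS_SOURCE_EVIDENCE = {"infected_cells", "tumor_burden", "disease_score", "immune_response"}


def audit_public_name(name: str, has_source_evidence: bool = False) -> list[str]:
    """Return naming problems for one public port name (heuristic sketch)."""
    if not SNAKE_CASE.match(name):
        return ["not lower snake_case"]  # assumption: ports use snake_case
    problems = []
    if name in NEEDS_SOURCE_EVIDENCE and not has_source_evidence:
        problems.append("interpretive label without source evidence")
    # A single-letter token usually means a raw symbol leaked into the name.
    has_single_letter_token = any(len(token) == 1 for token in name.split("_"))
    if has_single_letter_token and not name.startswith(CONSERVATIVE_PREFIXES):
        problems.append("raw symbol exposed without a conservative prefix")
    return problems


if __name__ == "__main__":
    for candidate in ["model_state_x", "y_y", "A A", "tumor_burden"]:
        print(candidate, audit_public_name(candidate))
```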

## Visualisation Quality

Every kept lab should have meaningful, non-empty visuals:

- Q/A table with `Scientific question`, `Observed answer`, `Evidence`, `Dominant module`, and `Caveat`
- timeseries when dynamics exist
- bar/ranking summaries when scalar output comparisons exist
- table evidence for steady-state or flat outputs
- provenance/runtime evidence table

Do not show only file structure, graph metadata, or long numeric tensors when a better visual summary is possible.

Examples:

- physiology CellML: selected state trajectories, largest-change bar, variable/source mapping table, Q/A caveat
- BioModels SBML: species trajectories, final species ranking, parameter/species evidence table
- ONNX segmentation: input/output graph schema, channel statistics, mask/logit summary, synthetic-input caveat
- custom Python: scenario response timeseries, sensitivity bars, equation/assumption evidence table

## README Quality

Every kept lab README should explain:

- upstream source and bundled artifact
- what public inputs map to
- what public outputs map to
- runtime scenario/defaults
- what each screenshot shows
- scientific caveats and validation scope

Screenshots must be actual captured runtime visuals, not placeholders.
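
One way to catch placeholder or missing screenshots early is to confirm that every local image referenced in a README exists and is non-empty. The sketch below assumes Markdown image syntax (`![alt](path)`) and relative paths. `missing_readme_assets` is a hypothetical helper, and it cannot tell whether an image is a genuine runtime capture.

```python
import re
from pathlib import Path

# Matches the target of a Markdown image link: ![alt](target ...)
IMAGE_LINK = re.compile(r"!\[[^\]]*\]\(([^)\s]+)")


def missing_readme_assets(readme_path: Path) -> list[str]:
    """Return referenced local image paths that are missing or empty."""
    text = readme_path.read_text(encoding="utf-8")
    missing = []
    for target in IMAGE_LINK.findall(text):
        if target.startswith(("http://", "https://")):
            continue  # remote images are outside the scope of this sketch
        asset = readme_path.parent / target
        if not asset.is_file() or asset.stat().st_size == 0:
            missing.append(target)
    return missing
```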

## Orphan Policy

Move a lab out of the publish set when it is broken, scientifically misleading, out of scope, or not fixable without
changing the source science.

Before moving, add or update a README or issue note with:

- why it was moved
- issue type: scientific, runtime, source-data, dependency, or scope
- whether it is fixable
- evidence for the decision
- recommended next action

## Validation

Run:

- manifest validation
- entrypoint validation
- Python `compileall`
- full pytest when reasonable
- runtime smoke for every kept core model
- visual smoke for every kept visualisation model
- public port audit
- README asset audit
- path audit showing zero `labs/*/model` directories in curated repositories
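
Two of these checks need only the standard library. The following sketch covers the `compileall` step and the `labs/*/model` path audit. The function names are hypothetical, and it assumes the repository root is the directory that contains `labs/`.

```python
import compileall
from pathlib import Path


def find_nested_model_dirs(repo_root: Path) -> list[Path]:
    """Return every labs/*/model directory; the audit passes when this is empty."""
    return sorted(p for p in repo_root.glob("labs/*/model") if p.is_dir())


def compile_sources(repo_root: Path) -> bool:
    """Byte-compile all Python files under repo_root; True means no compile errors.

    Note: compileall writes __pycache__ directories next to the sources.
    """
    return bool(compileall.compile_dir(str(repo_root), quiet=1))


if __name__ == "__main__":
    root = Path(".")
    offenders = find_nested_model_dirs(root)
    print("nested model directories:", [str(p) for p in offenders])
    print("compileall ok:", compile_sources(root))
```

The remaining checks (manifest, entrypoint, runtime and visual smoke, port audit) depend on Biosimulant tooling and are not sketched here.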

## Hub Publish Gate

Run the lab locally before publishing it publicly. After publishing, verify the result through the Hub detail endpoint:

```bash
biosimulant labs run /absolute/path/to/lab --json --no-open
biosimulant labs publish /absolute/path/to/lab --visibility public --json --no-open

hub_id=$(jq -r '.hub_id' /absolute/path/to/lab/.biosimulant-project.json)
biosimulant hub labs get "$hub_id" --json \
  | jq '.data | {is_public, completed_runs, run_count, package_name}'
```

Acceptance condition:

```text
is_public == true
completed_runs >= 1
```

`hub labs list --json` is not enough for run verification because it may return `completed_runs: null`.
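
For scripted gates, the acceptance condition can be evaluated on the saved detail response. The sketch below assumes the JSON has the shape implied by the `jq` filter above, with `is_public` and `completed_runs` nested under a top-level `data` object. `passes_publish_gate` is a hypothetical helper, not part of the Biosimulant CLI.

```python
import json


def passes_publish_gate(detail_json: str) -> bool:
    """Return True when is_public is true and completed_runs >= 1.

    A null or missing completed_runs fails the gate rather than passing silently.
    """
    data = json.loads(detail_json).get("data") or {}
    completed_runs = data.get("completed_runs")
    has_run_count = isinstance(completed_runs, int) and not isinstance(completed_runs, bool)
    return data.get("is_public") is True and has_run_count and completed_runs >= 1


if __name__ == "__main__":
    # Illustrative payload only; real values come from the Hub detail response.
    example = '{"data": {"is_public": true, "completed_runs": null}}'
    print(passes_publish_gate(example))
```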
