Interpretation and reporting workflow
Songqi Duan
duan@songqi.org
Source: vignettes/interpretation-workflows.Rmd
This article shows how the new interpretation layer fits on top of the current Shennong analysis workflow. The key idea is that analysis stays deterministic, while interpretation reuses stored DE and enrichment results to build evidence bundles, prompts, and optional LLM-style summaries.
The workflow below demonstrates:
- marker discovery with sn_find_de()
- enrichment storage with sn_enrich() / sn_store_enrichment()
- result discovery with sn_list_results()
- direct retrieval with sn_get_de_result() and sn_get_enrichment_result()
- evidence preparation for annotation, DE, and writing tasks
- prompt generation with sn_build_prompt()
- provider-based interpretation with sn_interpret_annotation()
Marker discovery remains the analysis layer
Interpretation starts from stored DE results rather than recomputing
statistics. Here, sn_find_de() has already saved cluster
markers into the Seurat object, and we retrieve them through
sn_get_de_result().
| cluster | gene | avg_log2FC | p_val_adj |
|---|---|---|---|
| 0 | BPI | 5.812083 | 0 |
| 0 | ENSG00000289381 | 5.667320 | 0 |
| 0 | MTARC1 | 5.624992 | 0 |
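As a rough illustration of how a stored marker table can be condensed for interpretation, the base R sketch below ranks genes within each cluster by avg_log2FC. The three rows are copied from the table above; the helper top_markers() and its n and alpha arguments are hypothetical and are not part of Shennong.

```r
# Toy DE table: the three cluster-0 rows are copied from the table above.
de <- data.frame(
  cluster    = c("0", "0", "0"),
  gene       = c("BPI", "ENSG00000289381", "MTARC1"),
  avg_log2FC = c(5.812083, 5.667320, 5.624992),
  p_val_adj  = c(0, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a Shennong function): keep significant,
# up-regulated genes and return the n strongest per cluster.
top_markers <- function(de, n = 2, alpha = 0.05) {
  keep <- de[de$p_val_adj < alpha & de$avg_log2FC > 0, , drop = FALSE]
  # Sort by cluster, then by decreasing effect size within each cluster.
  keep <- keep[order(keep$cluster, -keep$avg_log2FC), , drop = FALSE]
  by_cluster <- split(keep$gene, keep$cluster)
  vapply(
    by_cluster,
    function(genes) paste(head(genes, n), collapse = ", "),
    character(1)
  )
}

markers <- top_markers(de, n = 2)
print(markers)
```

The result is a named character vector with one comma-separated marker string per cluster, which is the same shape as a top_markers column in a cluster summary.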
Store enrichment for later interpretation
The interpretation layer expects enrichment results to be stored in
the Seurat object. sn_enrich() can now store them directly
when you pass a Seurat object through x or
object =. Its gene_clusters formula now
covers both grouped ORA inputs such as gene ~ cluster and
ranked GSEA inputs such as gene ~ avg_log2FC.
knitr::kable(top_pathways)
| ID | Description | NES | p.adjust |
|---|---|---|---|
| GO:0042776 | proton motive force-driven mitochondrial ATP synthesis | -2.515028 | 0.0002216 |
| GO:0006120 | mitochondrial electron transport, NADH to ubiquinone | -2.099828 | 0.0427655 |
| GO:0032543 | mitochondrial translation | -1.960416 | 0.0651602 |
| GO:0098581 | detection of external biotic stimulus | 1.921567 | 0.0015675 |
| GO:0009595 | detection of biotic stimulus | 1.912715 | 0.0031257 |
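To make the two formula shapes mentioned in this section concrete, the sketch below shows one way a two-sided formula could be mapped onto a marker table: a grouping column yields one gene set per group (ORA style), while a numeric column yields a single ranked, named vector (GSEA style). This is only an illustration in base R. The helper formula_columns() is hypothetical, and how sn_enrich() actually parses gene_clusters is not shown in this article.

```r
# Hypothetical helper (not Shennong's implementation): split a two-sided
# formula such as gene ~ cluster into its left and right column names.
formula_columns <- function(f) {
  vars <- all.vars(f)
  stopifnot(inherits(f, "formula"), length(vars) == 2L)
  list(lhs = vars[1], rhs = vars[2])
}

# Rows copied from the marker table shown earlier in this article.
de <- data.frame(
  gene       = c("BPI", "ENSG00000289381", "MTARC1"),
  cluster    = c("0", "0", "0"),
  avg_log2FC = c(5.812083, 5.667320, 5.624992),
  stringsAsFactors = FALSE
)

# Grouped ORA-style input: one character vector of genes per cluster.
ora_cols  <- formula_columns(gene ~ cluster)
ora_input <- split(de[[ora_cols$lhs]], de[[ora_cols$rhs]])

# Ranked GSEA-style input: a named numeric vector in decreasing order.
gsea_cols  <- formula_columns(gene ~ avg_log2FC)
gsea_input <- sort(
  setNames(de[[gsea_cols$rhs]], de[[gsea_cols$lhs]]),
  decreasing = TRUE
)

str(ora_input)
print(gsea_input)
```

The same left-hand side (gene) serves both cases; only the type of the right-hand column decides whether the input is grouped or ranked.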
Discover what is currently stored on the object
Use sn_list_results() to see which DE, enrichment, and
interpretation artifacts are available for reuse.
knitr::kable(result_index)
| collection | type | name | analysis | method | created_at | n_rows | source |
|---|---|---|---|---|---|---|---|
| de_results | de | cluster_markers | markers | wilcox | 2026-03-26 19:09:31 UTC | 17203 | NA |
| enrichment_results | enrichment | cluster0_gsea | gsea | NA | 2026-03-26 19:10:06 UTC | 2646 | cluster_markers |
| interpretation_results | interpretation | cluster_annotation_note | NA | NA | 2026-03-26 19:10:06 UTC | 0 | NA |
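The source column in the index above records provenance: the GSEA result points back to the DE result it was computed from. The sketch below rebuilds a reduced copy of that index as a plain data frame and follows the link. Treating the printed NA entries as missing values, and the index as an ordinary data frame, are assumptions made for illustration.

```r
# Reduced copy of the index printed above (four of its eight columns).
result_index <- data.frame(
  collection = c("de_results", "enrichment_results", "interpretation_results"),
  type       = c("de", "enrichment", "interpretation"),
  name       = c("cluster_markers", "cluster0_gsea", "cluster_annotation_note"),
  source     = c(NA, "cluster_markers", NA),
  stringsAsFactors = FALSE
)

# Stored results that were derived from another stored result.
derived <- result_index[!is.na(result_index$source), c("name", "type", "source")]

# Provenance check: every recorded source should itself be in the index.
sources_present <- all(derived$source %in% result_index$name)

print(derived)
print(sources_present)
```

A check like this can be useful before building evidence, since an enrichment result whose source DE table has been removed is harder to interpret.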
Prepare structured evidence
sn_prepare_*_evidence() turns stored outputs into
compact, reusable evidence bundles. These are the objects that can later
be turned into prompts or passed into writing helpers.
knitr::kable(annotation_evidence$cluster_summary)
| cluster | n_cells | fraction | top_markers |
|---|---|---|---|
| 0 | 229 | 0.1884774 | BPI, ENSG00000289381, MTARC1, MCEMP1, VNN3P |
| 1 | 172 | 0.1415638 | IATPR, TTC39C-AS1, PI16, TNFRSF4, CRIP2 |
| 2 | 149 | 0.1226337 | EDAR, ANKRD55, ADTRP, TSHZ2, ENSG00000271774 |
| 3 | 144 | 0.1185185 | TCL1A, ENSG00000257275, SCN3A, ENSG00000224610, KCNG1 |
| 4 | 133 | 0.1094650 | LINC02446, CD8B, ENSG00000310107, CRTAM, GZMH |
| 5 | 125 | 0.1028807 | CLEC10A, FCER1A, CYP2S1, DTNA, ZBTB46 |
| 6 | 75 | 0.0617284 | SLC4A10, ENSG00000228033, IL23R, ADAM12, LINC01644 |
| 7 | 61 | 0.0502058 | IGHA1, IGHG2, IGHG3, IGHG1, SSPN |
| 8 | 54 | 0.0444444 | ENSG00000288970, ENSG00000291157, MYOM2, ENSG00000288782, ENSG00000294329 |
| 9 | 32 | 0.0263374 | MT-CYB, MT-CO3, MT-ATP6, MT-CO2, MT-ND3 |
| 10 | 26 | 0.0213992 | LYPD2, ENSG00000301038, UICLM, PPP1R17, LINC02345 |
| 11 | 15 | 0.0123457 | CLDN5, PF4V1, CMTM5, PDZK1IP1, PDGFA-DT |
knitr::kable(enrichment_evidence$top_terms)
| ID | Description | NES | p.adjust |
|---|---|---|---|
| GO:0042776 | proton motive force-driven mitochondrial ATP synthesis | -2.515028 | 0.0002216 |
| GO:0006120 | mitochondrial electron transport, NADH to ubiquinone | -2.099828 | 0.0427655 |
| GO:0032543 | mitochondrial translation | -1.960416 | 0.0651602 |
| GO:0098581 | detection of external biotic stimulus | 1.921567 | 0.0015675 |
| GO:0009595 | detection of biotic stimulus | 1.912715 | 0.0031257 |
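The fraction column in the cluster summary above is consistent with each cluster's share of all cells. The short check below recomputes it from the n_cells values copied from that table; it is a worked example in base R, not Shennong code.

```r
# n_cells per cluster, copied from the cluster summary table above.
n_cells <- setNames(
  c(229, 172, 149, 144, 133, 125, 75, 61, 54, 32, 26, 15),
  as.character(0:11)
)

total    <- sum(n_cells)      # 1215 cells in total
fraction <- n_cells / total   # share of cells per cluster

print(total)
print(round(fraction[["0"]], 7))  # 0.1884774, as in the table
print(round(fraction[["1"]], 7))  # 0.1415638, as in the table
```

The total of 1215 also matches the n_cells value reported in the dataset block of the results prompt later in this article.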
Build prompts without binding to a specific provider
The first-class package object for the LLM layer is a prompt bundle, not a network call. This keeps analysis and interpretation cleanly separated.
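A minimal sketch of what "a prompt bundle, not a network call" can mean in practice: a plain list of message strings assembled from evidence, with no provider involved. The constructor build_prompt_bundle(), the system field, and the exact text layout are hypothetical; only the user field and the evidence keys shown are taken from this article.

```r
# Hypothetical constructor (not Shennong's sn_build_prompt()): the bundle is
# modeled as a plain list with a system message and a user message.
build_prompt_bundle <- function(task, evidence,
                                audience = "scientist", language = "en") {
  stopifnot(is.list(evidence), !is.null(names(evidence)))
  # Render each evidence entry as "name:" followed by its value.
  values <- vapply(
    evidence,
    function(x) paste(format(x), collapse = ", "),
    character(1)
  )
  evidence_text <- paste0(names(evidence), ":\n", values, collapse = "\n\n")
  user <- paste0(
    "Task: ", task, ".\n\n",
    "Audience: ", audience, ".\n\n",
    "Language: ", language, ".\n\n",
    "Evidence:\n\n", evidence_text
  )
  list(system = "Use only the supplied evidence.", user = user)
}

bundle <- build_prompt_bundle(
  task = "annotation",
  evidence = list(
    cluster_col    = "seurat_clusters",
    source_de_name = "cluster_markers",
    species        = "human"
  )
)
cat(bundle$user)
```

Because the bundle is ordinary data, it can be inspected, stored, or diffed before any model is called, which is the separation this section describes.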
cat(substr(annotation_prompt$user, 1, 900))
#> Task: annotation.
#>
#> Audience: scientist.
#>
#> Language: en.
#>
#> Target style: concise annotation note.
#>
#> Task instructions: Interpret cluster-level marker evidence to support cell type annotation. For each cluster, identify the most plausible cell type or state. Explicitly cite the top markers that support the label and mention conflicting markers when relevant. Flag ambiguous clusters and suggest additional markers or orthogonal checks that would improve confidence. Return one concise cluster-by-cluster annotation table followed by a short narrative summary.
#>
#>
#>
#>
#>
#> Evidence:
#>
#> task:
#> annotation
#>
#> cluster_col:
#> seurat_clusters
#>
#> source_de_name:
#> cluster_markers
#>
#> analysis_method:
#> wilcox
#>
#> species:
#> human
#>
#> cluster_summary:
#> # A tibble: 8 × 4
#> cluster n_cells fraction top_markers
#> <fct> <int>
Generate manuscript-style prompts
High-level helpers such as sn_write_results() can return
prompts directly.
Initialize a Codex-ready analysis project
For longer-running projects, it helps to initialize a governed
analysis repository rather than improvising the folder structure. The
packaged project template used by
sn_initialize_codex_project() creates
AGENTS.md, memory/,
docs/standards/, skills/,
config/, data/, scripts/,
notebooks/, runs/, and
results/.
codex_project <- file.path(tempdir(), "shennong-codex-project")
created <- sn_initialize_codex_project(
  path = codex_project,
  project_name = "PBMC interpretation pilot",
  objective = "Build a reproducible PBMC interpretation workflow with Shennong.",
  overwrite = TRUE
)
knitr::kable(
  tibble::tibble(
    file = names(created)[-1],
    path = unname(unlist(created[-1]))
  )
)
| file | path |
|---|---|
| readme | /tmp/RtmpD1xvn2/shennong-codex-project/README.md |
| config | /tmp/RtmpD1xvn2/shennong-codex-project/config |
| config_default | /tmp/RtmpD1xvn2/shennong-codex-project/config/default.yaml |
| data | /tmp/RtmpD1xvn2/shennong-codex-project/data |
| data_raw | /tmp/RtmpD1xvn2/shennong-codex-project/data/raw |
| data_processed | /tmp/RtmpD1xvn2/shennong-codex-project/data/processed |
| data_metadata | /tmp/RtmpD1xvn2/shennong-codex-project/data/metadata |
| scripts | /tmp/RtmpD1xvn2/shennong-codex-project/scripts |
| notebooks | /tmp/RtmpD1xvn2/shennong-codex-project/notebooks |
| runs | /tmp/RtmpD1xvn2/shennong-codex-project/runs |
| results | /tmp/RtmpD1xvn2/shennong-codex-project/results |
| results_figures | /tmp/RtmpD1xvn2/shennong-codex-project/results/figures |
| results_tables | /tmp/RtmpD1xvn2/shennong-codex-project/results/tables |
| results_reports | /tmp/RtmpD1xvn2/shennong-codex-project/results/reports |
| agents_md | /tmp/RtmpD1xvn2/shennong-codex-project/AGENTS.md |
| agents | /tmp/RtmpD1xvn2/shennong-codex-project/AGENTS.md |
| memory | /tmp/RtmpD1xvn2/shennong-codex-project/memory |
| memory_decisions | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Decisions.md |
| decisions | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Decisions.md |
| memory_plan | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Plan.md |
| plan | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Plan.md |
| memory_prompt | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Prompt.md |
| prompt | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Prompt.md |
| memory_status | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Status.md |
| status | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Status.md |
| standards | /tmp/RtmpD1xvn2/shennong-codex-project/docs/standards |
| conventions | /tmp/RtmpD1xvn2/shennong-codex-project/docs/standards/BioinformaticsAnalysisConventions.md |
| skills | /tmp/RtmpD1xvn2/shennong-codex-project/skills |
The initialized project keeps durable operating rules in
AGENTS.md, project state in memory/,
enforceable directory and naming rules in
docs/standards/BioinformaticsAnalysisConventions.md, and
reusable governance procedures in skills/.
cat(substr(results_prompt$user, 1, 900))
#> Task: results.
#>
#> Audience: scientist.
#>
#> Language: en.
#>
#> Target style: manuscript-style Results section.
#>
#> Task instructions: Write a manuscript-style Results subsection using only the supplied evidence. Keep the tone formal, precise, and evidence-based. Integrate cluster, DE, and enrichment findings into a coherent paragraph sequence instead of bullet fragments. Do not claim validation beyond the provided evidence.
#>
#>
#>
#>
#>
#> Evidence:
#>
#> task:
#> results
#>
#> dataset:
#> n_cells:
#> 1215
#>
#> n_features:
#> 54872
#>
#> cluster_col:
#> seurat_clusters
#>
#> clusters:
#> 12
#>
#> cluster_summary:
#> # A tibble: 8 × 3
#> cluster n_cells fraction
#> <fct> <int> <dbl>
#> 1 0 229 0.188
#> 2 1 172 0.142
#> 3 2 149 0.123
#> 4 3 144 0.119
#> 5
Plug in a provider when needed
If you provide a function that accepts messages and
returns text, the same high-level helpers can store the generated
interpretation back into the Seurat object.
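For testing without a network connection, a provider can be as small as the mock below. It is a sketch under assumptions: the messages argument is taken to be a list with one element per chat message, each carrying role and content fields, and the model argument is hypothetical. The provider actually used to produce the output in this section is not shown in the article.

```r
# Hypothetical mock provider: accepts a list of messages, returns one string.
mock_provider <- function(messages, model = "demo-model", ...) {
  stopifnot(is.list(messages), length(messages) > 0L)
  sprintf(
    "Mock interpretation generated from %d messages using %s",
    length(messages), model
  )
}

# Assumed message shape: one system message and one user message.
messages <- list(
  list(role = "system", content = "Use only the supplied evidence."),
  list(role = "user", content = "Task: annotation.")
)

response <- mock_provider(messages)
print(response)
```

Keeping the provider a plain function means a real client can be swapped in later without changing how prompts are built or how results are stored.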
annotation_response
#> [1] "Mock interpretation generated from 2 messages using demo-model"
Interpretation results are stored alongside analysis results
names(pbmc_interpreted@misc$interpretation_results)
#> [1] "cluster_annotation_note"