This article shows how the new interpretation layer fits on top of the current Shennong analysis workflow. The key idea is that analysis stays deterministic, while interpretation reuses stored differential expression (DE) and enrichment results to build evidence bundles, prompts, and optional LLM-style summaries.

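The workflow below demonstrates how to retrieve stored markers, store enrichment results, list what is stored on the object, prepare structured evidence, build provider-agnostic prompts, initialize a Codex-ready project, and optionally plug in a provider.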

Marker discovery remains the analysis layer

Interpretation starts from stored DE results rather than recomputing statistics. Here, sn_find_de() has already saved cluster markers into the Seurat object, and we retrieve them through sn_get_de_result().

knitr::kable(top_cluster_markers |> dplyr::select(cluster, gene, avg_log2FC, p_val_adj))
| cluster | gene            | avg_log2FC | p_val_adj |
|:--------|:----------------|-----------:|----------:|
| 0       | BPI             |   5.812083 |         0 |
| 0       | ENSG00000289381 |   5.667320 |         0 |
| 0       | MTARC1          |   5.624992 |         0 |
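
A top-marker table like this one can be derived from any stored DE table with ordinary dplyr verbs. The sketch below is illustrative only: `de_table` is made-up toy data standing in for whatever sn_get_de_result() returns, and the significance cutoff and the number of markers per cluster are assumptions, not Shennong defaults.

```r
library(dplyr)

# Toy DE table (hypothetical values); in the article this would be the
# stored marker table retrieved from the Seurat object.
de_table <- tibble::tibble(
  cluster    = c("0", "0", "0", "1", "1", "1"),
  gene       = c("geneA", "geneB", "geneC", "geneD", "geneE", "geneF"),
  avg_log2FC = c(5.8, 1.2, 5.6, 0.4, 3.1, 2.7),
  p_val_adj  = c(0, 0.20, 0, 0.5, 0.001, 0.01)
)

# Keep significant genes, then take the two largest effect sizes per cluster.
top_markers_toy <- de_table |>
  filter(p_val_adj < 0.05) |>
  group_by(cluster) |>
  slice_max(avg_log2FC, n = 2, with_ties = FALSE) |>
  ungroup()

top_markers_toy |> select(cluster, gene, avg_log2FC, p_val_adj)
```

Because the selection only reads a stored table, rerunning it does not touch the underlying statistics.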

Store enrichment for later interpretation

The interpretation layer expects enrichment results to be stored in the Seurat object. sn_enrich() can store them directly when you pass a Seurat object through x or object =. Its gene_clusters formula covers both grouped over-representation analysis (ORA) inputs such as gene ~ cluster and ranked gene set enrichment analysis (GSEA) inputs such as gene ~ avg_log2FC.

knitr::kable(top_pathways)
| ID         | Description                                            |       NES |  p.adjust |
|:-----------|:-------------------------------------------------------|----------:|----------:|
| GO:0042776 | proton motive force-driven mitochondrial ATP synthesis | -2.515028 | 0.0002216 |
| GO:0006120 | mitochondrial electron transport, NADH to ubiquinone   | -2.099828 | 0.0427655 |
| GO:0032543 | mitochondrial translation                              | -1.960416 | 0.0651602 |
| GO:0098581 | detection of external biotic stimulus                  |  1.921567 | 0.0015675 |
| GO:0009595 | detection of biotic stimulus                           |  1.912715 | 0.0031257 |
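
To make the two formula shapes concrete, here is a minimal base R sketch of how a formula such as gene ~ cluster or gene ~ avg_log2FC could be turned into enrichment input. The helper `build_enrichment_input()` is hypothetical and is not Shennong's implementation. The rule it uses, a numeric right-hand side means a ranked list and anything else means grouped gene sets, is an assumption made for illustration.

```r
# Hypothetical helper: turn a two-sided formula into ORA- or GSEA-style input.
build_enrichment_input <- function(data, formula) {
  vars <- all.vars(formula)            # e.g. c("gene", "cluster")
  stopifnot(length(vars) == 2)
  genes <- data[[vars[1]]]
  by    <- data[[vars[2]]]
  if (is.numeric(by)) {
    # Ranked input: named numeric vector sorted from highest to lowest score.
    sort(stats::setNames(by, genes), decreasing = TRUE)
  } else {
    # Grouped input: one gene vector per group label.
    split(genes, by)
  }
}

# Toy marker table (made-up values).
markers <- data.frame(
  gene       = c("geneA", "geneB", "geneC", "geneD"),
  cluster    = c("0", "0", "1", "1"),
  avg_log2FC = c(1.5, -0.7, 2.3, 0.1),
  stringsAsFactors = FALSE
)

ora_input  <- build_enrichment_input(markers, gene ~ cluster)
gsea_input <- build_enrichment_input(markers, gene ~ avg_log2FC)
```

The grouped result is a named list of gene vectors, while the ranked result is a single named vector. This mirrors the difference between ORA and GSEA inputs.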

Discover what is currently stored on the object

Use sn_list_results() to see which DE, enrichment, and interpretation artifacts are available for reuse.

knitr::kable(result_index)
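| collection             | type           | name                    | analysis | method | created_at              | n_rows | source          |
|:-----------------------|:---------------|:------------------------|:---------|:-------|:------------------------|-------:|:----------------|
| de_results             | de             | cluster_markers         | markers  | wilcox | 2026-03-26 19:09:31 UTC |  17203 | NA              |
| enrichment_results     | enrichment     | cluster0_gsea           | gsea     | NA     | 2026-03-26 19:10:06 UTC |   2646 | cluster_markers |
| interpretation_results | interpretation | cluster_annotation_note | NA       | NA     | 2026-03-26 19:10:06 UTC |      0 | NA              |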
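
Conceptually, an index like this is a flattened view of nested result collections. The sketch below builds a similar index from a plain nested list. The list layout (a `method` and a `table` per entry) and the helper `index_results()` are hypothetical, and the assumption that all three collections live next to each other is inferred only from the collection names shown above.

```r
# Hypothetical stand-in for stored results: collections of named entries.
stored <- list(
  de_results = list(
    cluster_markers = list(
      method = "wilcox",
      table  = data.frame(gene = c("geneA", "geneB"))
    )
  ),
  enrichment_results = list(
    cluster0_gsea = list(
      method = NA_character_,
      table  = data.frame(ID = "term1")
    )
  ),
  interpretation_results = list(
    cluster_annotation_note = list(
      method = NA_character_,
      table  = data.frame()
    )
  )
)

# Flatten every collection into one row per stored entry.
index_results <- function(stored) {
  rows <- lapply(names(stored), function(collection) {
    entries <- stored[[collection]]
    data.frame(
      collection = collection,
      name       = names(entries),
      method     = vapply(entries, function(e) e$method, character(1)),
      n_rows     = vapply(entries, function(e) nrow(e$table), integer(1)),
      stringsAsFactors = FALSE,
      row.names  = NULL
    )
  })
  do.call(rbind, rows)
}

result_index_toy <- index_results(stored)
result_index_toy
```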

Prepare structured evidence

sn_prepare_*_evidence() turns stored outputs into compact, reusable evidence bundles. These are the objects that can later be turned into prompts or passed into writing helpers.

knitr::kable(annotation_evidence$cluster_summary)
| cluster | n_cells |  fraction | top_markers                                                                |
|:--------|--------:|----------:|:---------------------------------------------------------------------------|
| 0       |     229 | 0.1884774 | BPI, ENSG00000289381, MTARC1, MCEMP1, VNN3P                                |
| 1       |     172 | 0.1415638 | IATPR, TTC39C-AS1, PI16, TNFRSF4, CRIP2                                    |
| 2       |     149 | 0.1226337 | EDAR, ANKRD55, ADTRP, TSHZ2, ENSG00000271774                               |
| 3       |     144 | 0.1185185 | TCL1A, ENSG00000257275, SCN3A, ENSG00000224610, KCNG1                      |
| 4       |     133 | 0.1094650 | LINC02446, CD8B, ENSG00000310107, CRTAM, GZMH                              |
| 5       |     125 | 0.1028807 | CLEC10A, FCER1A, CYP2S1, DTNA, ZBTB46                                      |
| 6       |      75 | 0.0617284 | SLC4A10, ENSG00000228033, IL23R, ADAM12, LINC01644                         |
| 7       |      61 | 0.0502058 | IGHA1, IGHG2, IGHG3, IGHG1, SSPN                                           |
| 8       |      54 | 0.0444444 | ENSG00000288970, ENSG00000291157, MYOM2, ENSG00000288782, ENSG00000294329  |
| 9       |      32 | 0.0263374 | MT-CYB, MT-CO3, MT-ATP6, MT-CO2, MT-ND3                                    |
| 10      |      26 | 0.0213992 | LYPD2, ENSG00000301038, UICLM, PPP1R17, LINC02345                          |
| 11      |      15 | 0.0123457 | CLDN5, PF4V1, CMTM5, PDZK1IP1, PDGFA-DT                                    |
knitr::kable(enrichment_evidence$top_terms)
| ID         | Description                                            |       NES |  p.adjust |
|:-----------|:-------------------------------------------------------|----------:|----------:|
| GO:0042776 | proton motive force-driven mitochondrial ATP synthesis | -2.515028 | 0.0002216 |
| GO:0006120 | mitochondrial electron transport, NADH to ubiquinone   | -2.099828 | 0.0427655 |
| GO:0032543 | mitochondrial translation                              | -1.960416 | 0.0651602 |
| GO:0098581 | detection of external biotic stimulus                  |  1.921567 | 0.0015675 |
| GO:0009595 | detection of biotic stimulus                           |  1.912715 | 0.0031257 |
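
The cluster summary has a simple shape: cell counts, fractions, and a comma-separated marker string per cluster. As a rough illustration, the base R sketch below assembles the same shape from toy inputs. `summarise_clusters()` is a hypothetical helper, not the package's evidence builder, and all gene names and values are made up.

```r
# Toy inputs: one cluster label per cell, plus a marker table.
cell_clusters <- factor(c("0", "0", "0", "1", "1", "2"))
markers <- data.frame(
  cluster    = c("0", "0", "1", "1", "2"),
  gene       = c("geneB", "geneA", "geneC", "geneD", "geneE"),
  avg_log2FC = c(5.6, 5.8, 3.1, 2.7, 4.0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: per-cluster size, fraction, and top markers by effect size.
summarise_clusters <- function(cell_clusters, markers, n_markers = 2) {
  counts <- table(cell_clusters)
  top <- vapply(names(counts), function(cl) {
    hits <- markers[markers$cluster == cl, ]
    hits <- hits[order(hits$avg_log2FC, decreasing = TRUE), ]
    paste(head(hits$gene, n_markers), collapse = ", ")
  }, character(1))
  data.frame(
    cluster     = names(counts),
    n_cells     = as.integer(counts),
    fraction    = as.numeric(counts) / length(cell_clusters),
    top_markers = unname(top),
    stringsAsFactors = FALSE
  )
}

# A compact bundle: task label plus the summary table.
evidence_toy <- list(
  task            = "annotation",
  cluster_summary = summarise_clusters(cell_clusters, markers)
)
evidence_toy$cluster_summary
```

Keeping the bundle as a plain list makes it easy to serialize into a prompt later without recomputing anything.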

Build prompts without binding to a specific provider

The first-class package object for the LLM layer is a prompt bundle, not a network call. This keeps analysis and interpretation cleanly separated.

cat(substr(annotation_prompt$user, 1, 900))
#> Task: annotation.
#> 
#> Audience: scientist.
#> 
#> Language: en.
#> 
#> Target style: concise annotation note.
#> 
#> Task instructions: Interpret cluster-level marker evidence to support cell type annotation. For each cluster, identify the most plausible cell type or state. Explicitly cite the top markers that support the label and mention conflicting markers when relevant. Flag ambiguous clusters and suggest additional markers or orthogonal checks that would improve confidence. Return one concise cluster-by-cluster annotation table followed by a short narrative summary.
#> 
#> 
#> 
#> 
#> 
#> Evidence:
#> 
#> task:
#> annotation
#> 
#> cluster_col:
#> seurat_clusters
#> 
#> source_de_name:
#> cluster_markers
#> 
#> analysis_method:
#> wilcox
#> 
#> species:
#> human
#> 
#> cluster_summary:
#> # A tibble: 8 × 4
#>   cluster n_cells fraction top_markers                                          
#>   <fct>     <int>    
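
The printed prompt suggests a bundle made of plain strings. The following sketch shows one way such a bundle could be assembled without any network call. `build_prompt_bundle()` is a hypothetical helper, and the `system` element and its wording are assumptions, since the article only displays the `user` element.

```r
# Hypothetical helper: assemble a provider-agnostic prompt bundle from strings.
build_prompt_bundle <- function(task, audience, style, instructions, evidence_lines) {
  user <- paste(
    c(
      paste0("Task: ", task, "."),
      paste0("Audience: ", audience, "."),
      paste0("Target style: ", style, "."),
      paste0("Task instructions: ", instructions),
      "Evidence:",
      evidence_lines
    ),
    collapse = "\n\n"
  )
  list(
    # Assumed system message; the real wording is not shown in the article.
    system = "Interpret single-cell analysis evidence using only the supplied evidence.",
    user   = user
  )
}

prompt_toy <- build_prompt_bundle(
  task           = "annotation",
  audience       = "scientist",
  style          = "concise annotation note",
  instructions   = "Interpret cluster-level marker evidence to support cell type annotation.",
  evidence_lines = c("cluster 0: geneA, geneB", "cluster 1: geneC, geneD")
)

cat(substr(prompt_toy$user, 1, 200))
```

Because the bundle is just data, it can be inspected, versioned, or edited before any provider is involved.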

Generate manuscript-style prompts

High-level helpers such as sn_write_results() can return prompts directly.

Initialize a Codex-ready analysis project

For longer-running projects, it helps to initialize a governed analysis repository rather than improvising the folder structure. The packaged project template used by sn_initialize_codex_project() creates AGENTS.md, memory/, docs/standards/, skills/, config/, data/, scripts/, notebooks/, runs/, and results/.

codex_project <- file.path(tempdir(), "shennong-codex-project")

created <- sn_initialize_codex_project(
  path = codex_project,
  project_name = "PBMC interpretation pilot",
  objective = "Build a reproducible PBMC interpretation workflow with Shennong.",
  overwrite = TRUE
)

knitr::kable(
  tibble::tibble(
    file = names(created)[-1],
    path = unname(unlist(created[-1]))
  )
)
| file             | path                                                                                        |
|:-----------------|:--------------------------------------------------------------------------------------------|
| readme           | /tmp/RtmpD1xvn2/shennong-codex-project/README.md                                            |
| config           | /tmp/RtmpD1xvn2/shennong-codex-project/config                                               |
| config_default   | /tmp/RtmpD1xvn2/shennong-codex-project/config/default.yaml                                  |
| data             | /tmp/RtmpD1xvn2/shennong-codex-project/data                                                 |
| data_raw         | /tmp/RtmpD1xvn2/shennong-codex-project/data/raw                                             |
| data_processed   | /tmp/RtmpD1xvn2/shennong-codex-project/data/processed                                       |
| data_metadata    | /tmp/RtmpD1xvn2/shennong-codex-project/data/metadata                                        |
| scripts          | /tmp/RtmpD1xvn2/shennong-codex-project/scripts                                              |
| notebooks        | /tmp/RtmpD1xvn2/shennong-codex-project/notebooks                                            |
| runs             | /tmp/RtmpD1xvn2/shennong-codex-project/runs                                                 |
| results          | /tmp/RtmpD1xvn2/shennong-codex-project/results                                              |
| results_figures  | /tmp/RtmpD1xvn2/shennong-codex-project/results/figures                                      |
| results_tables   | /tmp/RtmpD1xvn2/shennong-codex-project/results/tables                                       |
| results_reports  | /tmp/RtmpD1xvn2/shennong-codex-project/results/reports                                      |
| agents_md        | /tmp/RtmpD1xvn2/shennong-codex-project/AGENTS.md                                            |
| agents           | /tmp/RtmpD1xvn2/shennong-codex-project/AGENTS.md                                            |
| memory           | /tmp/RtmpD1xvn2/shennong-codex-project/memory                                               |
| memory_decisions | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Decisions.md                                  |
| decisions        | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Decisions.md                                  |
| memory_plan      | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Plan.md                                       |
| plan             | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Plan.md                                       |
| memory_prompt    | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Prompt.md                                     |
| prompt           | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Prompt.md                                     |
| memory_status    | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Status.md                                     |
| status           | /tmp/RtmpD1xvn2/shennong-codex-project/memory/Status.md                                     |
| standards        | /tmp/RtmpD1xvn2/shennong-codex-project/docs/standards                                       |
| conventions      | /tmp/RtmpD1xvn2/shennong-codex-project/docs/standards/BioinformaticsAnalysisConventions.md |
| skills           | /tmp/RtmpD1xvn2/shennong-codex-project/skills                                               |

The initialized project keeps durable operating rules in AGENTS.md, project state in memory/, enforceable directory and naming rules in docs/standards/BioinformaticsAnalysisConventions.md, and reusable governance procedures in skills/.
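
Since the layout is fixed, it can be checked mechanically. The sketch below is a small base R check that reports which of the top-level entries named above are missing from a project root. `missing_project_entries()` is a hypothetical helper and not part of Shennong. The demo root is a deliberately incomplete toy directory.

```r
# Top-level entries the template is described as creating.
expected_entries <- c(
  "AGENTS.md", "memory", "docs/standards", "skills",
  "config", "data", "scripts", "notebooks", "runs", "results"
)

# Hypothetical helper: return the expected entries that do not exist under root.
missing_project_entries <- function(root, expected = expected_entries) {
  present <- file.exists(file.path(root, expected))
  expected[!present]
}

# Toy root containing only part of the layout, to exercise the check.
root <- file.path(tempdir(), "layout-check-demo")
dir.create(file.path(root, "memory"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(root, "AGENTS.md"))

missing_project_entries(root)
```

Run against a freshly initialized project, a check like this would be expected to return an empty character vector.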

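Returning to the manuscript-style prompt, the results prompt bundle can be inspected like the annotation prompt:

cat(substr(results_prompt$user, 1, 900))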
#> Task: results.
#> 
#> Audience: scientist.
#> 
#> Language: en.
#> 
#> Target style: manuscript-style Results section.
#> 
#> Task instructions: Write a manuscript-style Results subsection using only the supplied evidence. Keep the tone formal, precise, and evidence-based. Integrate cluster, DE, and enrichment findings into a coherent paragraph sequence instead of bullet fragments. Do not claim validation beyond the provided evidence.
#> 
#> 
#> 
#> 
#> 
#> Evidence:
#> 
#> task:
#> results
#> 
#> dataset:
#> n_cells:
#> 1215
#> 
#> n_features:
#> 54872
#> 
#> cluster_col:
#> seurat_clusters
#> 
#> clusters:
#> 12
#> 
#> cluster_summary:
#> # A tibble: 8 × 3
#>   cluster n_cells fraction
#>   <fct>     <int>    <dbl>
#> 1 0           229   0.188 
#> 2 1           172   0.142 
#> 3 2           149   0.123 
#> 4 3           144   0.119 
#> 5

Plug in a provider when needed

If you provide a function that accepts messages and returns text, the same high-level helpers can store the generated interpretation back into the Seurat object.

annotation_response
#> [1] "Mock interpretation generated from 2 messages using demo-model"
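
A provider in this sense can be as small as one function. The sketch below defines a mock provider that accepts a list of messages and returns a single string, similar in spirit to the demo response above. The message shape (lists with `role` and `content`) and the `model` argument are assumptions. How the provider is handed to the high-level helpers is not shown here.

```r
# Hypothetical provider: takes a list of messages and returns one string.
mock_provider <- function(messages, model = "demo-model") {
  stopifnot(is.list(messages), length(messages) > 0)
  paste(
    "Mock interpretation generated from", length(messages),
    "messages using", model
  )
}

# Assumed message shape: one system message and one user message.
messages <- list(
  list(role = "system", content = "Use only the supplied evidence."),
  list(role = "user",   content = "Task: annotation.")
)

response <- mock_provider(messages)
response
```

Swapping the mock for a real client only requires keeping the same contract: messages in, text out.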

Interpretation results are stored alongside analysis results

names(pbmc_interpreted@misc$interpretation_results)
#> [1] "cluster_annotation_note"