Chapter 9 AI Module
TCMDATA includes an AI module built on top of aisdk, providing three layers of AI capability: interpretation (one-shot analysis of R objects), agent (multi-turn interactive analysis with tool use), and skill (domain-specific workflow knowledge). This chapter demonstrates the key features of each layer.
9.1 Prerequisites
The AI module requires both TCMDATA and aisdk:
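A minimal sketch of that loading step, assuming both packages are already installed under these names. The availability check is added here for illustration only and is not something the package itself requires:

```r
# Packages named in this chapter; check availability before attaching.
needed <- c("TCMDATA", "aisdk")
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

if (all(installed)) {
  library(TCMDATA)
  library(aisdk)
} else {
  # Report what is missing instead of failing with an error
  message("Not installed: ", paste(needed[!installed], collapse = ", "))
}
```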
Model configuration is handled through tcm_setup(), which only needs to be called once per session. Users must supply their own API key and provider to use the AI module.
9.2 Interpretation layer
The interpretation layer provides one-shot AI analysis of R objects and free-text queries, with no tool use or multi-turn conversation. The primary interface is tcm_interpret().
9.2.2 Interpreting enrichment objects
tcm_interpret() accepts a clusterProfiler enrichment object directly. The package automatically compresses the relevant terms and genes into a compact representation before sending it to the model:
library(clusterProfiler)
library(org.Hs.eg.db)
lz_targets <- search_herb("lingzhi", type = "Herb_pinyin_name")$target
lz_targets <- sample(unique(na.omit(lz_targets)), 100)
bp <- enrichGO(
  gene = lz_targets,
  ont = "BP",
  OrgDb = org.Hs.eg.db,
  keyType = "SYMBOL")
tcm_interpret(
  bp,
  prompt = "Summarise the main biological implications of this Lingzhi GO enrichment.",
  language = "en")
=== TCM AI Analysis: enrichment ===
Model: openai:gpt-4o | Language: en | Audience: researcher
-- Summary --
The enrichment profile indicates that Lingzhi's putative targets cluster in innate
immune and inflammatory signaling, particularly TLR/IL-1/TNF-NF-κB pathways,
leukocyte adhesion, and matrix remodeling...
-- Key Findings --
* Strong enrichment for cellular response to stress and external stimuli
* Activation of innate immune and inflammatory pathways (IRAK1, NFKBIA, TNF, ADAM17)
* Leukocyte adhesion and endothelial activation (VCAM1, MMP9)
...
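The compression step mentioned above can be pictured roughly as follows. This is only a sketch of the general idea, not TCMDATA's actual implementation: `compress_terms()` is a hypothetical helper, and the table is mock data that imitates a few columns (`Description`, `p.adjust`, `geneID`) of a clusterProfiler result table.

```r
# Mock table imitating a few columns of an enrichment result table.
# All values are invented for illustration.
enrich_tbl <- data.frame(
  ID = c("T1", "T2", "T3"),
  Description = c("term A", "term B", "term C"),
  p.adjust = c(0.001, 0.20, 0.01),
  geneID = c("TNF/IRAK1/NFKBIA", "VCAM1", "MMP9/TNF"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the top-n significant terms and a few genes each,
# returning one short text line per term.
compress_terms <- function(tbl, n = 2, cutoff = 0.05, max_genes = 2) {
  keep <- tbl[tbl$p.adjust < cutoff, , drop = FALSE]
  keep <- keep[order(keep$p.adjust), , drop = FALSE]
  keep <- head(keep, n)
  genes <- vapply(
    strsplit(keep$geneID, "/", fixed = TRUE),
    function(g) paste(head(g, max_genes), collapse = ","),
    character(1)
  )
  sprintf("%s (padj=%.3g): %s", keep$Description, keep$p.adjust, genes)
}

compact <- compress_terms(enrich_tbl)
cat(compact, sep = "\n")
```

Trimming to a handful of terms and genes keeps the prompt short, at the cost of hiding weaker signals from the model.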
9.2.3 Interpreting PPI objects
PPI network objects with topological metrics computed by compute_nodeinfo() can also be directly interpreted:
9.2.4 Drafting result paragraphs
draft_result_paragraph() transforms an interpretation object into a publication-ready paragraph:
9.2.5 Custom structured output
tcm_interpret_schema() allows user-defined output schemas for integration into downstream pipelines:
my_schema <- tcm_schema(
  summary = tcm_field_string("A concise 2-3 sentence summary"),
  mechanism = tcm_field_string("Core mechanistic interpretation"),
  key_targets = tcm_field_array("Most important targets"),
  confidence = tcm_field_enum(c("high", "medium", "low"),
                              "Confidence level")
)
custom_res <- tcm_interpret_schema(
  bp,
  schema = my_schema,
  type = "enrichment",
  prompt = "Focus on inflammation-related processes.")
print(custom_res)

Available field types: tcm_field_string(), tcm_field_number(), tcm_field_boolean(), tcm_field_array(), tcm_field_enum().
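To illustrate what downstream integration might look like, the sketch below uses a mock object instead of a real model response. It assumes the structured result can be treated as a named list with the same field names as my_schema. That shape is an assumption, and the actual return type of tcm_interpret_schema() may differ.

```r
# Mock result with the field names defined in my_schema.
# The real object returned by tcm_interpret_schema() may be shaped differently.
mock_res <- list(
  summary = "Placeholder summary.",
  mechanism = "Placeholder mechanism.",
  key_targets = c("TNF", "MMP9"),
  confidence = "medium"
)

# Check the enum field against the allowed levels before using it
allowed <- c("high", "medium", "low")
stopifnot(mock_res$confidence %in% allowed)

# Flatten the fields into one row for a results table
row <- data.frame(
  confidence = mock_res$confidence,
  n_targets = length(mock_res$key_targets),
  key_targets = paste(mock_res$key_targets, collapse = ";"),
  stringsAsFactors = FALSE
)
row
```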
9.3 Agent layer
The agent layer adds tool use and multi-turn conversation on top of the interpretation engine. TCMDATA provides 30+ analysis tools (target search, enrichment, PPI, ML screening, compound lookup, visualization, etc.) that the agent can call autonomously.
9.3.1 One-shot task: tcm_agent()
For a single analysis request, tcm_agent() automatically routes the query to the appropriate tools and returns the result:
# Simple herb target lookup — agent selects the right tool automatically
result <- tcm_agent("Search the targets of Huangqi (Astragalus)")
cat(result$text)
# The result also contains any generated artifacts
result$artifacts

The built-in router matches user queries to relevant tools using keyword patterns. For example, “enrichment” routes to GO/KEGG tools, “ppi” routes to network tools, and “machine learning” routes to ML screening tools. When multiple patterns match, tools are merged.
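Keyword routing with merging can be sketched in a few lines of base R. This is an illustration of the idea only, not the package's router: `routes` and `route_query()` are hypothetical, and `ppi_tool` and `ml_screening_tool` are placeholder names (the two enrichment tool names are taken from the session transcript later in this chapter).

```r
# Hypothetical routing table: keyword pattern -> tool names
routes <- list(
  "enrichment"       = c("run_go_enrichment", "run_kegg_enrichment"),
  "ppi"              = c("ppi_tool"),
  "machine learning" = c("ml_screening_tool")
)

# Return the merged, de-duplicated tool set for every pattern found in the query
route_query <- function(query, routes) {
  hit <- vapply(
    names(routes),
    function(p) grepl(p, query, ignore.case = TRUE),
    logical(1)
  )
  as.character(unique(unlist(routes[hit], use.names = FALSE)))
}

route_query("Run enrichment and build a PPI network", routes)
```

A query that matches no pattern yields an empty character vector, which a caller could treat as "fall back to the full tool set" or "ask the user to clarify".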
9.3.2 Interactive session: tcm_chat()
tcm_chat() opens an interactive REPL for multi-turn exploratory analysis:
╔══════════════════════════════════════════════════╗
║ TCM-Pharmacist · Interactive Session ║
║ Type /help for commands · /quit to exit ║
╚══════════════════════════════════════════════════╝
[1] You > Search targets of Huangqi and sepsis, then compute intersection
✓ Route: target_lookup + disease_lookup (high)
✓ Tools called: search_herb_records → search_disease_targets → compute_target_intersection
✓ New artifacts: intersect_001
Agent > Found 121 intersection targets between Huangqi and sepsis...
[2] You > Run GO and KEGG enrichment on the intersection
✓ Route: enrichment (high)
✓ Tools called: run_go_enrichment → run_kegg_enrichment
✓ New artifacts: enrich_001, enrich_002
Agent > GO enrichment identified 245 significant BP terms...
[3] You > /artifacts
intersect_001 gene_list character[121] 2026-04-10 14:32:01
enrich_001 enrichment enrichResult 2026-04-10 14:32:15
enrich_002 enrichment enrichResult 2026-04-10 14:32:16
[4] You > /save 10x8
✓ Exported 3 artifacts to tcm_output/
[5] You > /quit
Key commands: /help, /artifacts, /save [WxH], /history, /stats, /quit.
The session returns a list with history and artifacts for programmatic access:
9.3.3 Programmatic agent: create_tcm_task_agent()
For scripted workflows, create a reusable agent and execute tasks programmatically:
agent <- create_tcm_task_agent()
r1 <- run_tcm_task(agent, "Search Huangqi targets and sepsis targets, compute intersection")
r2 <- run_tcm_task(agent, "Run GO enrichment on the intersection genes")
r3 <- run_tcm_task(agent, "Build PPI network and rank hub genes")
# Each result contains: $text, $artifacts, $tool_calls
r3$text
9.4 Skill layer
Skills are domain-knowledge packages that guide the agent through complex multi-step workflows. Unlike tools (which perform specific operations), skills provide strategic context — what to do, in what order, and what to watch out for.
9.4.1 Built-in skills
TCMDATA ships with two package skills, and also loads aisdk’s skill-creator by default:
| Skill | Purpose |
|---|---|
| tcm-network-pharmacology | Guides the full network pharmacology workflow (target retrieval → intersection → PPI → enrichment → validation → report). Activates only when the user explicitly requests a systematic analysis. |
| analysis-preferences | Background constraint layer. Sets default parameters (e.g., p-value cutoffs, PPI score thresholds, visualization defaults) and quality standards. Active on every turn. |
| skill-creator | Meta-skill from aisdk for creating new custom skills. |
9.4.2 How skills work
When the agent detects a request that matches a skill’s trigger condition, the skill’s instructions are injected into the conversation context. For example, asking “Help me do a network pharmacology analysis of Huangqi treating sepsis” activates the tcm-network-pharmacology skill, which guides the agent through:
- Phase 1: Target collection (herb targets + disease targets → intersection)
- Phase 2: Network & enrichment (PPI construction, GO/KEGG analysis, hub gene ranking)
- Phase 3: Expression validation (WGCNA, ML screening, DEG integration — if data available)
- Phase 4: Single-cell validation (if scRNA-seq data available)
- Phase 5: Literature validation, cross-database verification, and report generation
Importantly, the skill is scope-aware: asking “Just search the targets of Huangqi” will only run the relevant step, not the full pipeline.
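The conditional phases listed above can be sketched as a small lookup. This is a hypothetical illustration of how data availability gates the workflow, not the skill's actual logic: `phases`, `plan_phases()`, and the `"expression"` and `"scrna"` labels are all invented for this example.

```r
# The five phases, with the data type each one depends on (NA = always runs)
phases <- data.frame(
  phase = 1:5,
  label = c("Target collection",
            "Network & enrichment",
            "Expression validation",
            "Single-cell validation",
            "Literature validation and report"),
  needs = c(NA, NA, "expression", "scrna", NA),
  stringsAsFactors = FALSE
)

# Keep only the phases whose required data are available
plan_phases <- function(available = character(0)) {
  ok <- is.na(phases$needs) | phases$needs %in% available
  phases$label[ok]
}

plan_phases()              # no omics data supplied
plan_phases("expression")  # bulk expression data supplied
```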
9.4.3 Creating custom skills
Use the skill-creator skill to define new skills. A minimal skill only needs a SKILL.md file with YAML frontmatter:
# Example: ask the agent to create a new skill
tcm_chat()
# > Help me create a skill for molecular docking analysis

A skill directory can optionally contain references/ (detailed documentation), scripts/ (executable scripts), and assets/ (templates).
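For orientation, the layout described above can also be created by hand. The sketch below writes a skill skeleton into a temporary directory using base R only. The frontmatter fields `name` and `description`, and the body text, are assumptions for illustration; check the skill-creator output for the fields aisdk actually expects.

```r
# Build a skill skeleton in a temporary location (nothing is written to the project)
skill_dir <- file.path(tempdir(), "tcm_skills", "molecular-docking")
dir.create(skill_dir, recursive = TRUE, showWarnings = FALSE)

# SKILL.md: YAML frontmatter followed by free-form instructions.
# Field names here are assumed, not confirmed by the package documentation.
skill_md <- c(
  "---",
  "name: molecular-docking",
  "description: Guide docking analysis of candidate compounds against hub targets.",
  "---",
  "",
  "# Molecular docking",
  "",
  "1. Confirm the target list and compound list with the user.",
  "2. Describe the docking steps to run and what to report."
)
writeLines(skill_md, file.path(skill_dir, "SKILL.md"))

# Optional subdirectories mentioned in the text
for (sub in c("references", "scripts", "assets")) {
  dir.create(file.path(skill_dir, sub), showWarnings = FALSE)
}

list.files(skill_dir)
```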
9.4.4 Managing skills
By default, the agent uses the TCMDATA skill directory plus aisdk’s skill-creator. To customize preferences or add new skills, initialize a local skills directory:
This creates a local tcm_skills/ directory containing the TCMDATA package skills. You can then:
# Edit analysis preferences (e.g., change default p-value cutoff)
file.edit("tcm_skills/analysis-preferences/SKILL.md")
# Add a new skill created by skill-creator
# Just place the skill directory under tcm_skills/:
# tcm_skills/my-new-skill/SKILL.md
# Check which skills directory is currently active
tcm_skill_dir()
# Switch to a different skills directory
tcm_use_skills("path/to/other/skills")
# Reset to package defaults
tcm_reset_skills()

Once a local skills directory is active, all agent functions (tcm_agent(), tcm_chat(), create_tcm_task_agent()) will automatically use it.
9.5 Session information
sessionInfo()
#> R version 4.5.3 (2026-03-11)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] caret_7.0-1 lattice_0.22-9 org.Hs.eg.db_3.22.0
#> [4] AnnotationDbi_1.72.0 IRanges_2.44.0 S4Vectors_0.48.1
#> [7] Biobase_2.70.0 BiocGenerics_0.56.0 generics_0.1.4
#> [10] clusterProfiler_4.18.4 aplot_0.2.9 ggrepel_0.9.8
#> [13] ggtangle_0.1.1 igraph_2.2.3 ggplot2_4.0.2
#> [16] dplyr_1.2.1 ivolcano_0.0.5 enrichplot_1.30.5
#> [19] TCMDATA_0.1.0
#>
#> loaded via a namespace (and not attached):
#> [1] splines_4.5.3 ggplotify_0.1.3 tibble_3.3.1
#> [4] R.oo_1.27.1 polyclip_1.10-7 hardhat_1.4.3
#> [7] pROC_1.19.0.1 rpart_4.1.24 lifecycle_1.0.5
#> [10] doParallel_1.0.17 globals_0.19.1 MASS_7.3-65
#> [13] magrittr_2.0.5 sass_0.4.10 rmarkdown_2.31
#> [16] jquerylib_0.1.4 yaml_2.3.12 ggvenn_0.1.19
#> [19] cowplot_1.2.0 DBI_1.3.0 RColorBrewer_1.1-3
#> [22] lubridate_1.9.5 purrr_1.2.2 R.utils_2.13.0
#> [25] yulab.utils_0.2.4 nnet_7.3-20 tweenr_2.0.3
#> [28] rappdirs_0.3.4 ipred_0.9-15 gdtools_0.5.0
#> [31] circlize_0.4.18 lava_1.9.0 listenv_0.10.1
#> [34] tidytree_0.4.7 parallelly_1.46.1 codetools_0.2-20
#> [37] DOSE_4.4.0 ggforce_0.5.0 tidyselect_1.2.1
#> [40] shape_1.4.6.1 farver_2.1.2 matrixStats_1.5.0
#> [43] Seqinfo_1.0.0 jsonlite_2.0.0 GetoptLong_1.1.1
#> [46] e1071_1.7-17 ggridges_0.5.7 ggalluvial_0.12.6
#> [49] survival_3.8-6 iterators_1.0.14 systemfonts_1.3.2
#> [52] foreach_1.5.2 tools_4.5.3 ggnewscale_0.5.2
#> [55] treeio_1.34.0 Rcpp_1.1.1 glue_1.8.0
#> [58] prodlim_2026.03.11 gridExtra_2.3 xfun_0.57
#> [61] ranger_0.18.0 qvalue_2.42.0 withr_3.0.2
#> [64] fastmap_1.2.0 digest_0.6.39 timechange_0.4.0
#> [67] R6_2.6.1 gridGraphics_0.5-1 colorspace_2.1-2
#> [70] GO.db_3.22.0 RSQLite_2.4.6 R.methodsS3_1.8.2
#> [73] tidyr_1.3.2 fontLiberation_0.1.0 data.table_1.18.2.1
#> [76] recipes_1.3.2 class_7.3-23 httr_1.4.8
#> [79] htmlwidgets_1.6.4 scatterpie_0.2.6 ModelMetrics_1.2.2.2
#> [82] pkgconfig_2.0.3 gtable_0.3.6 timeDate_4052.112
#> [85] blob_1.3.0 ComplexHeatmap_2.26.1 S7_0.2.1
#> [88] XVector_0.50.0 htmltools_0.5.9 fontBitstreamVera_0.1.1
#> [91] bookdown_0.46 fgsea_1.36.2 clue_0.3-68
#> [94] scales_1.4.0 png_0.1-9 gower_1.0.2
#> [97] Boruta_9.0.0 ggfun_0.2.0 knitr_1.51
#> [100] rstudioapi_0.18.0 reshape2_1.4.5 rjson_0.2.23
#> [103] nlme_3.1-168 proxy_0.4-29 cachem_1.1.0
#> [106] GlobalOptions_0.1.4 stringr_1.6.0 parallel_4.5.3
#> [109] pillar_1.11.1 grid_4.5.3 vctrs_0.7.3
#> [112] randomForest_4.7-1.2 tidydr_0.0.6 cluster_2.1.8.2
#> [115] evaluate_1.0.5 cli_3.6.6 compiler_4.5.3
#> [118] rlang_1.2.0 crayon_1.5.3 future.apply_1.20.2
#> [121] labeling_0.4.3 plyr_1.8.9 fs_2.0.1
#> [124] ggiraph_0.9.6 stringi_1.8.7 viridisLite_0.4.3
#> [127] BiocParallel_1.44.0 Biostrings_2.78.0 lazyeval_0.2.3
#> [130] glmnet_4.1-10 GOSemSim_2.36.0 fontquiver_0.2.1
#> [133] Matrix_1.7-5 patchwork_1.3.2 bit64_4.6.0-1
#> [136] future_1.70.0 KEGGREST_1.50.0 kernlab_0.9-33
#> [139] memoise_2.0.1 bslib_0.10.0 ggtree_4.0.5
#> [142] fastmatch_1.1-8 bit_4.6.0 xgboost_3.2.1.1
#> [145] ape_5.8-1 gson_0.1.0