Chapter 8 Other resources
In this chapter, we provide a curated list of additional resources that complement the TCMDATA package for TCM network pharmacology research. These include databases, tools, and R packages that can be integrated into your analysis workflow for enhanced data retrieval, processing, and visualization.
8.1 Additional databases
8.1.1 TF–gene interactions (DoRothEA)
Transcription factors (TFs) are master regulators that control gene expression by binding to specific DNA sequences. In network pharmacology, understanding TF–target relationships is crucial for identifying upstream regulators of disease-associated genes and constructing comprehensive regulatory networks.
TCMDATA provides a curated dataset of high-confidence TF–target interactions derived from the DoRothEA database[1]. The dataset includes interactions at confidence levels A, B, and C for both Human and Mouse:
| Confidence | Evidence Source | Description |
|---|---|---|
| A | Literature-curated | High confidence; manually curated from publications |
| B | ChIP-seq evidence | Moderate confidence; supported by chromatin immunoprecipitation data |
| C | TFBS + ChIP-seq | Medium confidence; predicted binding sites validated by ChIP-seq |
Users can access this dataset via data("tf_targets"):
#> # A tibble: 10 × 5
#>    Species TF    Target   Confidence Mode_of_Regulation
#>    <chr>   <chr> <chr>    <chr>                   <dbl>
#>  1 Human   AHR   CYP1A1   C                           1
#>  2 Human   AHR   CYP1A2   C                           1
#>  3 Human   AHR   CYP1B1   C                           1
#>  4 Human   AHR   FOS      C                           1
#>  5 Human   AHR   MYC      C                           1
#>  6 Human   AHR   UGT1A6   C                           1
#>  7 Human   AHR   ASAP1    C                           1
#>  8 Human   AHR   ERG      C                           1
#>  9 Human   AHR   VGLL4    C                           1
#> 10 Human   AHR   ARHGAP15 C                           1
The Mode_of_Regulation column indicates whether the TF activates (1) or inhibits (-1) the target gene, enabling construction of signed regulatory networks.
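As a minimal sketch of how the sign information might be used, the snippet below filters a toy table to higher-confidence human interactions and turns Mode_of_Regulation into a readable edge label. The toy rows, the gene and TF names, and the Effect column are assumptions made for illustration; only the column names follow the preview above. With the real data, data("tf_targets") would take the place of tf_toy.

```r
# Toy stand-in with the same columns as tf_targets.
# All rows are placeholders, not real DoRothEA records.
tf_toy <- data.frame(
  Species            = c("Human", "Human", "Human", "Mouse"),
  TF                 = c("TF1", "TF1", "TF2", "TF1"),
  Target             = c("GENE_A", "GENE_B", "GENE_A", "GENE_A"),
  Confidence         = c("A", "C", "B", "A"),
  Mode_of_Regulation = c(1, -1, -1, 1),
  stringsAsFactors   = FALSE
)

# Keep human interactions at confidence levels A and B only
human_hc <- subset(tf_toy, Species == "Human" & Confidence %in% c("A", "B"))

# Translate the numeric sign into a label (hypothetical column name)
human_hc$Effect <- ifelse(human_hc$Mode_of_Regulation > 0,
                          "activation", "inhibition")

# Signed edge list: one row per TF -> target edge
signed_edges <- human_hc[, c("TF", "Target", "Effect")]
print(signed_edges)
```

The resulting edge list can be passed to a graph constructor such as igraph::graph_from_data_frame(), with Effect kept as an edge attribute.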
8.1.1.1 Application example
This dataset is particularly useful for:
- TF enrichment analysis — Identify which TFs are significantly associated with your disease targets.
- Regulatory network construction — Build TF → target → disease networks to reveal upstream regulatory mechanisms.
- Integration with TCM networks — Link TCM targets to their upstream TFs, extending the herb → molecule → target → TF cascade.
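The first use case can be sketched with a simple over-representation test. The example below is one common approach (a hypergeometric test per TF), not necessarily the method used elsewhere in TCMDATA. The toy TF–target table, the disease_genes vector, and the choice of background (all genes with at least one annotated TF) are assumptions for illustration.

```r
# Toy TF-target table (placeholder names, not real records)
tf_toy <- data.frame(
  TF     = c("TF1", "TF1", "TF1", "TF2", "TF2", "TF3"),
  Target = c("G1", "G2", "G3", "G3", "G4", "G5"),
  stringsAsFactors = FALSE
)

# Hypothetical disease-associated genes
disease_genes <- c("G1", "G2", "G4")

# Background: every gene that appears as a target (an assumption)
universe <- unique(tf_toy$Target)
query    <- intersect(disease_genes, universe)

# One hypergeometric test per TF: P(overlap >= observed)
tf_enrich <- do.call(rbind, lapply(split(tf_toy$Target, tf_toy$TF), function(tg) {
  tg      <- unique(tg)
  overlap <- length(intersect(tg, query))
  p <- phyper(q = overlap - 1,
              m = length(tg),                      # targets of this TF
              n = length(universe) - length(tg),   # all other genes
              k = length(query),                   # size of the query set
              lower.tail = FALSE)
  data.frame(n_targets = length(tg), overlap = overlap, p_value = p)
}))
tf_enrich$TF    <- rownames(tf_enrich)
tf_enrich$p_adj <- p.adjust(tf_enrich$p_value, method = "BH")
tf_enrich       <- tf_enrich[order(tf_enrich$p_value), ]
print(tf_enrich)
```

With a real gene list the background set matters a great deal; restricting it to genes covered by the TF–target resource, as done here, is only one reasonable choice.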
8.1.2 Gut microbiota–host interactions (GutMGene)
The gut microbiome plays a critical role in human health by producing metabolites that modulate host physiology. Like TCM active compounds, these microbial metabolites are small molecules that interact with host target proteins. This “Gut Microbiota → Metabolite → Host Target” axis provides a natural extension to the classical TCM network pharmacology paradigm, enabling researchers to explore how gut microbiota may mediate or modulate the therapeutic effects of herbal medicines.
TCMDATA integrates comprehensive gut microbiota–metabolite–target interaction data from the GutMGene v2.0 database[2]. The dataset covers both Human and Mouse and includes:
- Bacteria–Metabolite relationships: Which gut bacteria produce specific metabolites
- Metabolite–Target relationships: Which host genes are regulated by microbial metabolites
- Interaction type: Whether the metabolite activates or inhibits the target
#>                  Bacteria Bacteria_ID Metabolite Metabolite_ID Target Target_ID
#> 1  Christensenella minuta      626937    Acetate           175  FFAR3      2865
#> 2  Christensenella minuta      626937    Acetate           175  FFAR2      2867
#> 3  Christensenella minuta      626937    Acetate           175   CCL2      6347
#> 4  Christensenella minuta      626937    Acetate           175   LGR5      8549
#> 5  Christensenella minuta      626937    Acetate           175   ALPI       248
#> 6  Christensenella minuta      626937    Acetate           175   MUC2      4583
#> 7  Christensenella minuta      626937    Acetate           175  DCLK1      9201
#> 8  Christensenella minuta      626937    Acetate           175   OCLN 100506658
#> 9  Christensenella minuta      626937    Acetate           175  CLDN3      1365
#> 10 Christensenella minuta      626937    Acetate           175   TJP1      7082
#>    Interaction PMID_Bac_Met PMID_Met_Target
#> 1   activation     21357455        28322790
#> 2   activation     21357455        28322790
#> 3   inhibition     21357455        28322790
#> 4   activation     21357455        32240190
#> 5   activation     21357455        32240190
#> 6   activation     21357455        32240190
#> 7   activation     21357455        32240190
#> 8   activation     21357455        32240190
#> 9   activation     21357455        32240190
#> 10  activation     21357455        32240190
data("gutMGene") loads a list with two data frames: gut_axis_human and gut_axis_mouse.
8.1.2.1 Application example: Gut–TCM integration
Since microbial metabolites and TCM active compounds are both small molecules targeting host proteins, you can integrate GutMGene with TCM network analysis to explore:
- Shared targets — Do TCM compounds and gut metabolites regulate the same host genes?
- Synergistic effects — Can TCM herbs modulate gut microbiota composition, thereby influencing metabolite production?
- Multi-layer networks — Construct Bacteria → Metabolite → Target → Pathway networks analogous to Herb → Molecule → Target → Pathway networks.
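The shared-targets question can be sketched as a join on the target gene. The example below uses two toy tables with placeholder names; the tcm_toy table and its Molecule column are hypothetical, while the gut table borrows the Metabolite, Target, and Interaction column names from the preview above.

```r
# Toy gut axis table (placeholder rows; column names follow gut_axis_human)
gut_toy <- data.frame(
  Metabolite  = c("Met_1", "Met_1", "Met_2"),
  Target      = c("GENE_A", "GENE_B", "GENE_C"),
  Interaction = c("activation", "inhibition", "activation"),
  stringsAsFactors = FALSE
)

# Hypothetical TCM compound-target table
tcm_toy <- data.frame(
  Molecule = c("Mol_1", "Mol_2", "Mol_2"),
  Target   = c("GENE_A", "GENE_C", "GENE_D"),
  stringsAsFactors = FALSE
)

# Inner join on Target: one row per (molecule, metabolite) pair sharing a gene
shared <- merge(tcm_toy, gut_toy, by = "Target")
print(shared)

# Genes hit by both a TCM compound and a gut metabolite
shared_genes <- sort(unique(shared$Target))
print(shared_genes)
```

Each row of shared pairs a TCM molecule with a gut metabolite through a common host gene, and the Interaction column shows the direction of the metabolite's effect on that gene.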
8.1.2.2 Network visualization with ggtangle
The tripartite structure of gut microbiota data (Bacteria → Metabolite → Target) mirrors the TCM network paradigm (Herb → Molecule → Target). Below we demonstrate how to visualize this relationship using ggtangle:
library(TCMDATA)
library(dplyr)
library(igraph)
library(ggtangle)
library(ggplot2)
library(ggrepel)

data("gutMGene")

# Select one bacterium with multiple metabolites and targets
bac_name <- "Bifidobacterium longum"
gut_sub <- gutMGene$gut_axis_human |>
  filter(Bacteria == bac_name, !is.na(Metabolite), !is.na(Target))

# Sample targets to keep the network readable
set.seed(2026)
gut_sub <- gut_sub |>
  group_by(Metabolite) |>
  slice_sample(n = 8) |>
  ungroup()

# Build edges: Bacteria → Metabolite → Target
edges <- bind_rows(
  gut_sub |> distinct(from = Bacteria, to = Metabolite),
  gut_sub |> distinct(from = Metabolite, to = Target)
)

# Node attributes
nodes <- tibble(name = unique(c(edges$from, edges$to))) |>
  mutate(type = case_when(
    name == bac_name ~ "Bacteria",
    name %in% gut_sub$Metabolite ~ "Metabolite",
    TRUE ~ "Target"
  ))

# Build and plot network
g <- graph_from_data_frame(edges, directed = FALSE, vertices = nodes)

set.seed(2026)
ggplot(g, layout = "fr") +
  geom_edge(alpha = 0.15, color = "grey55") +
  geom_point(aes(color = type, shape = type, size = type), alpha = 0.9) +
  geom_text_repel(
    aes(label = name),
    size = 3, max.overlaps = 30, segment.alpha = 0.3
  ) +
  scale_color_manual(
    values = c("Bacteria" = "#E64B35", "Metabolite" = "#4DBBD5", "Target" = "#00A087"),
    name = "Node Type"
  ) +
  scale_shape_manual(
    values = c("Bacteria" = 18, "Metabolite" = 15, "Target" = 16),
    name = "Node Type"
  ) +
  scale_size_manual(
    values = c("Bacteria" = 6, "Metabolite" = 5, "Target" = 3.5),
    name = "Node Type"
  ) +
  theme_void() +
  theme(
    legend.position = "right",
    legend.title = element_text(face = "bold", size = 11),
    legend.text = element_text(size = 10)
  )
The network reveals the multi-target nature of gut microbial metabolites—analogous to TCM active compounds—where a single metabolite (e.g., Acetate, Butyrate) can modulate multiple host genes through distinct signaling pathways.
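The multi-target pattern can also be checked numerically by counting distinct targets per metabolite. The sketch below does this on a toy table with placeholder names; on the real data, the same steps would apply to the Metabolite and Target columns of gutMGene$gut_axis_human.

```r
# Toy metabolite-target records (placeholders), including one duplicate row
gut_toy <- data.frame(
  Metabolite = c("Met_1", "Met_1", "Met_1", "Met_2", "Met_2", "Met_1"),
  Target     = c("GENE_A", "GENE_B", "GENE_C", "GENE_A", "GENE_D", "GENE_A"),
  stringsAsFactors = FALSE
)

# Drop duplicate metabolite-target pairs before counting
edges <- unique(gut_toy[, c("Metabolite", "Target")])

# Number of distinct targets per metabolite, largest first
targets_per_met <- sort(table(edges$Metabolite), decreasing = TRUE)
print(targets_per_met)
```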
8.2 Additional visualization
8.2.1 Heatmap for molecular docking results
Molecular docking is a key computational method in network pharmacology for validating interactions between TCM active compounds and target proteins. After performing docking simulations (e.g., with AutoDock Vina), users typically obtain a matrix of binding affinities (kcal/mol) where more negative values indicate stronger binding.
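To make the sign convention concrete, the following sketch builds a small affinity matrix and extracts the strongest binder for each target, that is, the most negative value in each row. The matrix values and the target and molecule names are made up for illustration, and the targets-as-rows layout is an assumption chosen to match the demo data used later in this section.

```r
# Toy affinity matrix in kcal/mol: rows are targets, columns are molecules
aff <- matrix(
  c(-8.2, -6.1, -7.4,
    -5.9, -9.0, -6.6),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("TARGET_1", "TARGET_2"), c("Mol_A", "Mol_B", "Mol_C"))
)

# Strongest binder per target = column holding the row minimum
best_ligand <- colnames(aff)[apply(aff, 1, which.min)]
best_score  <- apply(aff, 1, min)
best <- data.frame(target = rownames(aff), best_ligand, best_score,
                   row.names = NULL)
print(best)

# Rank molecules by median affinity across targets (most negative first)
mol_rank <- sort(apply(aff, 2, median))
print(mol_rank)
```

A numeric summary like this is a useful cross-check on a heatmap, where small differences between neighbouring cells are hard to read from colour alone.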
TCMDATA provides ggdock() for visualizing docking results as heatmaps. The function supports both dot plots and tile plots, with flexible color palettes and optional affinity labels.
library(TCMDATA)

# Generate demo docking data
set.seed(2026)
molecules <- c("Quercetin", "Kaempferol", "Luteolin", "Apigenin", "Baicalein")
targets <- c("AKT1", "EGFR", "TNF", "IL6", "VEGFA", "CASP3")
dock_matrix <- matrix(
  runif(length(molecules) * length(targets), min = -9.5, max = -4.5),
  nrow = length(targets),
  dimnames = list(targets, molecules)
)

# Dot plot
ggdock(dock_matrix, order = "median", type = "dot", point_size = 10)
ggdock() can also draw a tile plot with affinity labels.

8.3 References
[1] Garcia-Alonso L, Holland CH, Ibrahim MM, Turei D, Saez-Rodriguez J. Benchmark and integration of resources for the estimation of human transcription factor activities. Genome Research (2019), 29(8), 1363–1375. doi: 10.1101/gr.240663.118.
[2] Qi C, He G, Qian K, et al. GutMGene v2.0: an updated comprehensive database for target genes of gut microbes and microbial metabolites. Nucleic Acids Research (2025), 53(D1), D783–D788. doi: 10.1093/nar/gkae1002.
8.4 Session information
#> R version 4.5.2 (2025-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] org.Hs.eg.db_3.22.0 AnnotationDbi_1.72.0 IRanges_2.44.0
#> [4] S4Vectors_0.48.0 Biobase_2.70.0 BiocGenerics_0.56.0
#> [7] generics_0.1.4 clusterProfiler_4.18.4 aplot_0.2.9
#> [10] ggrepel_0.9.7 ggtangle_0.1.1 igraph_2.2.2
#> [13] ggplot2_4.0.2 dplyr_1.2.0 aplotExtra_0.0.4
#> [16] ivolcano_0.0.5 enrichplot_1.30.5 TCMDATA_0.0.0.9000
#>
#> loaded via a namespace (and not attached):
#> [1] RColorBrewer_1.1-3 shape_1.4.6.1 rstudioapi_0.18.0
#> [4] jsonlite_2.0.0 tidydr_0.0.6 magrittr_2.0.4
#> [7] farver_2.1.2 rmarkdown_2.30 GlobalOptions_0.1.3
#> [10] fs_1.6.7 vctrs_0.7.1 memoise_2.0.1
#> [13] paletteer_1.7.0 ggtree_4.0.4 htmltools_0.5.9
#> [16] forcats_1.0.1 gridGraphics_0.5-1 sass_0.4.10
#> [19] bslib_0.10.0 htmlwidgets_1.6.4 plyr_1.8.9
#> [22] cachem_1.1.0 iterators_1.0.14 lifecycle_1.0.5
#> [25] pkgconfig_2.0.3 Matrix_1.7-4 R6_2.6.1
#> [28] fastmap_1.2.0 gson_0.1.0 clue_0.3-67
#> [31] digest_0.6.39 colorspace_2.1-2 ggnewscale_0.5.2
#> [34] rematch2_2.1.2 patchwork_1.3.2 maftools_2.26.0
#> [37] prismatic_1.1.2 RSQLite_2.4.6 labeling_0.4.3
#> [40] httr_1.4.8 polyclip_1.10-7 compiler_4.5.2
#> [43] bit64_4.6.0-1 fontquiver_0.2.1 withr_3.0.2
#> [46] doParallel_1.0.17 S7_0.2.1 BiocParallel_1.44.0
#> [49] DBI_1.3.0 ggforce_0.5.0 R.utils_2.13.0
#> [52] MASS_7.3-65 rappdirs_0.3.4 rjson_0.2.23
#> [55] DNAcopy_1.84.0 tools_4.5.2 ape_5.8-1
#> [58] scatterpie_0.2.6 R.oo_1.27.1 glue_1.8.0
#> [61] nlme_3.1-168 GOSemSim_2.36.0 grid_4.5.2
#> [64] ggvenn_0.1.19 cluster_2.1.8.1 reshape2_1.4.5
#> [67] fgsea_1.36.2 gtable_0.3.6 R.methodsS3_1.8.2
#> [70] tidyr_1.3.2 data.table_1.18.2.1 utf8_1.2.6
#> [73] XVector_0.50.0 foreach_1.5.2 pillar_1.11.1
#> [76] stringr_1.6.0 yulab.utils_0.2.4 circlize_0.4.17
#> [79] splines_4.5.2 tweenr_2.0.3 treeio_1.34.0
#> [82] lattice_0.22-7 survival_3.8-3 bit_4.6.0
#> [85] tidyselect_1.2.1 fontLiberation_0.1.0 GO.db_3.22.0
#> [88] ComplexHeatmap_2.26.1 Biostrings_2.78.0 knitr_1.51
#> [91] fontBitstreamVera_0.1.1 gridExtra_2.3 bookdown_0.46
#> [94] Seqinfo_1.0.0 xfun_0.56 matrixStats_1.5.0
#> [97] stringi_1.8.7 lazyeval_0.2.2 ggfun_0.2.0
#> [100] yaml_2.3.12 evaluate_1.0.5 codetools_0.2-20
#> [103] gdtools_0.5.0 tibble_3.3.1 qvalue_2.42.0
#> [106] ggplotify_0.1.3 cli_3.6.5 systemfonts_1.3.2
#> [109] jquerylib_0.1.4 Rcpp_1.1.1 png_0.1-8
#> [112] parallel_4.5.2 blob_1.3.0 ggalluvial_0.12.6
#> [115] DOSE_4.4.0 ggstar_1.0.6 ggthemes_5.2.0
#> [118] viridisLite_0.4.3 tidytree_0.4.7 ggiraph_0.9.6
#> [121] ggridges_0.5.7 scales_1.4.0 purrr_1.2.1
#> [124] crayon_1.5.3 GetoptLong_1.1.0 rlang_1.1.7
#> [127] cowplot_1.2.0 fastmatch_1.1-8 KEGGREST_1.50.0