Chapter 1 Load TCM Data
TCMDATA provides two primary functions for retrieving Traditional Chinese Medicine data: search_herb() and search_target().
Both functions return a data.frame containing three key columns:
- herb: Herb name
- molecule: Molecule/compound name
- target: Target gene symbol
These columns represent the core components of TCM network pharmacology analysis.
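Because each row links one herb, one molecule, and one target, the table can be split into the two edge lists that a herb-molecule-target network is usually built from. The following is a minimal sketch in base R on a small hand-built data.frame that only mimics the three-column layout (the rows are copied from the example outputs in this chapter). The names `tcm`, `herb_molecule`, `molecule_target`, and `n_nodes` are hypothetical and not part of TCMDATA.

```r
# Toy table mimicking the herb / molecule / target layout.
# Assumption: rows are copied from this chapter's printed examples, not queried live.
tcm <- data.frame(
  herb     = c("lingzhi", "lingzhi", "lingzhi", "fuling"),
  molecule = c("3,4-Dihydroxybenzoic acid", "3,4-Dihydroxybenzoic acid",
               "Adenine", "Adenine"),
  target   = c("GAA", "POLB", "EGFR", "EGFR"),
  stringsAsFactors = FALSE
)

# Herb -> molecule edges: drop rows that repeat only because targets differ
herb_molecule <- unique(tcm[, c("herb", "molecule")])

# Molecule -> target edges: drop rows that repeat only because herbs differ
molecule_target <- unique(tcm[, c("molecule", "target")])

# Number of distinct nodes in each layer
n_nodes <- vapply(tcm, function(col) length(unique(col)), integer(1))
print(n_nodes)
```

The same three lines should apply unchanged to a real result, assuming it carries exactly the `herb`, `molecule`, and `target` columns described above.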
1.1 Search by herb name
Users can query TCM data by providing an herb name and specifying the name format:
library(TCMDATA)
# Query using Chinese name
herbs <- c("灵芝")
lz <- search_herb(herb = herbs, type = "Herb_cn_name")
head(lz)
#> herb molecule target
#> 1 lingzhi 3,4-Dihydroxybenzoic acid GAA
#> 2 lingzhi 3,4-Dihydroxybenzoic acid POLB
#> 3 lingzhi 3,4-Dihydroxybenzoic acid APP
#> 4 lingzhi 3,4-Dihydroxybenzoic acid CA1
#> 5 lingzhi 3,4-Dihydroxybenzoic acid CA12
#> 6 lingzhi 3,4-Dihydroxybenzoic acid CA14
The type parameter tells search_herb() which of three name formats the herb argument uses:
| Type | Description | Example |
|---|---|---|
| Herb_cn_name | Chinese characters (recommended) | "灵芝" |
| Herb_pinyin_name | Pinyin romanization (recommended) | "lingzhi" |
| Herb_en_name | English translation | "Ganoderma" or "GANODERMA" |
💡 Tip: Chinese and Pinyin formats are preferred for best search accuracy.
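When herb names arrive from mixed sources, the `type` value can be chosen programmatically. The sketch below is one possible approach, not a TCMDATA feature: `guess_herb_type()` is a hypothetical helper that returns `"Herb_cn_name"` when the name contains a CJK Unified Ideograph (U+4E00 to U+9FFF) and otherwise assumes pinyin. It cannot tell pinyin from English, so English names would still need to be labeled by hand.

```r
# Hypothetical helper (not part of TCMDATA): pick a `type` value for one herb name.
# Assumption: any character in U+4E00..U+9FFF marks the name as Chinese;
# everything else is treated as pinyin.
guess_herb_type <- function(name) {
  code_points <- utf8ToInt(enc2utf8(name))
  if (any(code_points >= 0x4E00 & code_points <= 0x9FFF)) {
    "Herb_cn_name"
  } else {
    "Herb_pinyin_name"
  }
}

guess_herb_type("\u7075\u829d")  # the Chinese name from the example, as Unicode escapes
guess_herb_type("lingzhi")
```

The returned string could then be passed as the `type` argument of search_herb().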
1.1.1 Using Pinyin format
For Pinyin queries, concatenate all syllables without spaces or hyphens:
# Correct: continuous pinyin
lz_py <- search_herb(herb = "lingzhi", type = "Herb_pinyin_name")
head(lz_py)
#> herb molecule target
#> 1 lingzhi 3,4-Dihydroxybenzoic acid GAA
#> 2 lingzhi 3,4-Dihydroxybenzoic acid POLB
#> 3 lingzhi 3,4-Dihydroxybenzoic acid APP
#> 4 lingzhi 3,4-Dihydroxybenzoic acid CA1
#> 5 lingzhi 3,4-Dihydroxybenzoic acid CA12
#> 6 lingzhi 3,4-Dihydroxybenzoic acid CA14
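If pinyin names come from user input or spreadsheets, they can be cleaned into the continuous form before querying. This is a minimal sketch: `normalize_pinyin()` is a hypothetical helper, not a TCMDATA function. Lowercasing is an assumption based on the lowercase names shown in the outputs; the source does not say whether matching is case-sensitive.

```r
# Hypothetical helper (not part of TCMDATA): collapse a pinyin name into
# continuous form by removing whitespace and hyphens, then lowercasing.
normalize_pinyin <- function(name) {
  tolower(gsub("[[:space:]-]+", "", name))
}

cleaned <- normalize_pinyin(c("ling zhi", "ling-zhi", "Lingzhi"))
print(cleaned)
```

Each cleaned value can then be passed to search_herb() with `type = "Herb_pinyin_name"`.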
Common mistakes to avoid:
# ✗ contains space
search_herb(herb = "ling zhi", type = "Herb_pinyin_name")
# ✗ contains hyphen
search_herb(herb = "ling-zhi", type = "Herb_pinyin_name")
1.2 Search by target gene
All targets in TCMDATA are stored as gene symbols (e.g., HGNC nomenclature). You can retrieve associated herbs and molecules by querying target genes:
# Query multiple target genes
genes <- c("TP53", "EGFR", "BRCA1")
results <- search_target(genes)
head(results)
#> herb molecule target
#> 1 dilong Adenine EGFR
#> 2 dongchongxiacao Adenine EGFR
#> 3 fuling Adenine EGFR
#> 4 fulingpi Adenine EGFR
#> 5 jiuxiangchong Adenine EGFR
#> 6 lingzhi Adenine EGFR
The output shows which herbs and molecules interact with the specified targets, enabling reverse pharmacology analysis.
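A common next step is to count distinct herbs per queried gene and to check which genes returned no rows. The sketch below uses a toy data.frame built from a few of the rows printed above, so it runs without the package. `results_toy`, `herbs_per_target`, and `missing_genes` are hypothetical names, and the zero counts reflect only the truncated toy rows, not the actual database contents.

```r
# Toy rows copied from the printed head(results) output.
# Assumption: a real search_target() result has the same three columns and many more rows.
results_toy <- data.frame(
  herb     = c("dilong", "dongchongxiacao", "fuling", "lingzhi"),
  molecule = rep("Adenine", 4),
  target   = rep("EGFR", 4),
  stringsAsFactors = FALSE
)
genes <- c("TP53", "EGFR", "BRCA1")

# Distinct herbs linked to each queried gene; genes absent from the rows get 0
herbs_per_target <- vapply(
  genes,
  function(g) length(unique(results_toy$herb[results_toy$target == g])),
  integer(1)
)
print(herbs_per_target)

# Queried genes that have no rows in this (toy) result
missing_genes <- setdiff(genes, results_toy$target)
print(missing_genes)
```

Replacing `results_toy` with the data.frame returned by search_target(genes) should give the real per-gene counts, under the column-layout assumption stated above.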
1.3 Session information
#> R version 4.5.2 (2025-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] TCMDATA_0.0.0.9000
#>
#> loaded via a namespace (and not attached):
#> [1] DBI_1.3.0 gson_0.1.0 rematch2_2.1.2
#> [4] rlang_1.1.7 magrittr_2.0.4 DOSE_4.4.0
#> [7] compiler_4.5.2 RSQLite_2.4.6 png_0.1-8
#> [10] systemfonts_1.3.2 vctrs_0.7.1 reshape2_1.4.5
#> [13] stringr_1.6.0 pkgconfig_2.0.3 crayon_1.5.3
#> [16] fastmap_1.2.0 XVector_0.50.0 rmarkdown_2.30
#> [19] enrichplot_1.30.5 purrr_1.2.1 bit_4.6.0
#> [22] xfun_0.56 cachem_1.1.0 aplot_0.2.9
#> [25] jsonlite_2.0.0 blob_1.3.0 tidydr_0.0.6
#> [28] tweenr_2.0.3 BiocParallel_1.44.0 cluster_2.1.8.1
#> [31] parallel_4.5.2 R6_2.6.1 bslib_0.10.0
#> [34] stringi_1.8.7 RColorBrewer_1.1-3 jquerylib_0.1.4
#> [37] GOSemSim_2.36.0 Rcpp_1.1.1 Seqinfo_1.0.0
#> [40] bookdown_0.46 knitr_1.51 ggtangle_0.1.1
#> [43] R.utils_2.13.0 IRanges_2.44.0 Matrix_1.7-4
#> [46] splines_4.5.2 igraph_2.2.2 tidyselect_1.2.1
#> [49] qvalue_2.42.0 rstudioapi_0.18.0 yaml_2.3.12
#> [52] codetools_0.2-20 lattice_0.22-7 tibble_3.3.1
#> [55] plyr_1.8.9 withr_3.0.2 Biobase_2.70.0
#> [58] treeio_1.34.0 KEGGREST_1.50.0 S7_0.2.1
#> [61] evaluate_1.0.5 gridGraphics_0.5-1 polyclip_1.10-7
#> [64] scatterpie_0.2.6 Biostrings_2.78.0 pillar_1.11.1
#> [67] ggtree_4.0.4 stats4_4.5.2 clusterProfiler_4.18.4
#> [70] ggfun_0.2.0 generics_0.1.4 paletteer_1.7.0
#> [73] S4Vectors_0.48.0 ggplot2_4.0.2 scales_1.4.0
#> [76] tidytree_0.4.7 glue_1.8.0 gdtools_0.5.0
#> [79] lazyeval_0.2.2 tools_4.5.2 ggnewscale_0.5.2
#> [82] data.table_1.18.2.1 fgsea_1.36.2 forcats_1.0.1
#> [85] ggiraph_0.9.6 fs_1.6.7 fastmatch_1.1-8
#> [88] cowplot_1.2.0 grid_4.5.2 tidyr_1.3.2
#> [91] ape_5.8-1 AnnotationDbi_1.72.0 nlme_3.1-168
#> [94] patchwork_1.3.2 ggforce_0.5.0 cli_3.6.5
#> [97] rappdirs_0.3.4 fontBitstreamVera_0.1.1 dplyr_1.2.0
#> [100] gtable_0.3.6 R.methodsS3_1.8.2 yulab.utils_0.2.4
#> [103] sass_0.4.10 digest_0.6.39 fontquiver_0.2.1
#> [106] BiocGenerics_0.56.0 ggrepel_0.9.7 ggplotify_0.1.3
#> [109] htmlwidgets_1.6.4 farver_2.1.2 memoise_2.0.1
#> [112] htmltools_0.5.9 R.oo_1.27.1 lifecycle_1.0.5
#> [115] httr_1.4.8 GO.db_3.22.0 fontLiberation_0.1.0
#> [118] bit64_4.6.0-1 MASS_7.3-65