The 'Corpus Workbench' ('CWB', <https://cwb.sourceforge.io/>) offers a classic and mature approach for working with large, linguistically and structurally annotated corpora. The 'CWB' is memory-efficient, and its design makes running queries fast (Evert and Hardie 2011, <http://www.stefan-evert.de/PUB/EvertHardie2011.pdf>). The 'cwbtools' package offers pure R tools to create indexed corpus files as well as high-level wrappers for the original C implementation of 'CWB' as exposed by the 'RcppCWB' package (<https://CRAN.R-project.org/package=RcppCWB>). Additional functionality to add and modify corpus annotations from within R makes working with CWB-indexed corpora much more flexible and convenient. In combination with the R packages 'RcppCWB' and 'polmineR' (<https://CRAN.R-project.org/package=polmineR>), the 'cwbtools' package offers a lightweight infrastructure that supports combining quantitative and qualitative approaches to working with textual data.
|Imports:||data.table, R6, xml2, stringi, curl, RcppCWB (≥ 0.5.2), pbapply, methods, tools, cli, jsonlite, httr, rstudioapi, zen4R, lifecycle, fs|
|Suggests:||tm (≥ 0.7.3), knitr, markdown, tokenizers (≥ 0.2.1), tidytext, SnowballC, janeaustenr, NLP, testthat, rmarkdown, openNLP, aws.s3|
|Author:||Andreas Blaette [aut, cre], Christoph Leonhardt [aut]|
|Maintainer:||Andreas Blaette <andreas.blaette at uni-due.de>|
|Citation:||cwbtools citation info|
|CRAN checks:||cwbtools results|
|Vignettes:||CWB corpora and openNLP, How to add a sentence annotation to an indexed corpus|
|Windows binaries:||r-devel: cwbtools_0.3.5.zip, r-release: cwbtools_0.3.5.zip, r-oldrel: cwbtools_0.3.5.zip|
|macOS binaries:||r-release (arm64): cwbtools_0.3.5.tgz, r-oldrel (arm64): cwbtools_0.3.5.tgz, r-release (x86_64): cwbtools_0.3.5.tgz, r-oldrel (x86_64): cwbtools_0.3.5.tgz|
|Old sources:||cwbtools archive|
Please use the canonical form https://CRAN.R-project.org/package=cwbtools to link to this page.