Introduction to tabxplor

tabxplor tries to make it easy to work with multiple cross-tables: to create and manipulate them, but also to read them, using color helpers that highlight important information and simplify data exploration. All functions are tidyverse-based and pipe-friendly, and return tibble data frames that can easily be manipulated with dplyr. At the same time, time-consuming operations are done with data.table to run faster on big data frames. Tables can be exported to Excel and to html with formats and colors.

Base usage: cross-tables with color helpers for data exploration

The main functions are made to be user-friendly and time-saving in data analysis workflows.

tab makes a simple cross-table:

tab(forcats::gss_cat, marital, race)

#> # A tabxplor tab: 7 × 5
#>   marital       Other Black  White  Total
#>   <fct>           <n>   <n>    <n>    <n>
#> 1 No answer         2     2     13     17
#> 2 Never married   633 1 305  3 478  5 416
#> 3 Separated       110   196    437    743
#> 4 Divorced        212   495  2 676  3 383
#> 5 Widowed          70   262  1 475  1 807
#> 6 Married         932   869  8 316 10 117
#> 7 Total         1 959 3 129 16 395 21 483

When one of the row or column variables is numeric, tab calculates means by category of the other variable.

tab comes with options to weight the table, print percentages, manage totals, digits and missing values, add legends, and gather rare categories in an “Others” level.

tab(forcats::gss_cat, marital, race, pct = "row", na = "drop", 
other_if_less_than = 1000, other_level = "Custom_other_level_name")

When a third variable is provided, tab makes a table with as many subtables as it has levels. With several tab_vars, it makes a subtable for each combination of their levels. The result is grouped: in dplyr, operations like sum() or all() are done within each subtable, and not for the whole dataframe.

Colors may be added to highlight over-represented and under-represented cells, and therefore help the user read the table. By default, with color = "diff", colors are based on the difference between a cell and its related total (which only works with means and row or col pct). When a percentage is higher than the average percentage of the row or column, it appears in shades of green (or blue). When it is lower, it appears in shades of red/orange. A color legend is added below the table. In RStudio, colors are adapted to the theme, light or dark.

data <- forcats::gss_cat %>% 
  dplyr::filter(year %in% c(2000, 2006, 2012), !marital %in% c("No answer", "Widowed"))
gss  <- "Source: General social survey 2000-2014"
gss2 <- "Source: General social survey 2000, 2006 and 2012"
tab(data, race, marital, year, subtext = gss2, pct = "row", color = "diff")

race	Never married	Separated	Divorced	Married	Total	n
Other	36%	5%	12%	47%	100%	166
Black	41%	11%	16%	32%	100%	381
White	25%	3%	18%	54%	100%	1 996
Total 2000	28%	4%	17%	50%	100%	2 543
Other	29%	7%	11%	53%	100%	572
Black	47%	6%	18%	30%	100%	580
White	22%	3%	19%	57%	100%	2 986
Total 2006	26%	4%	18%	52%	100%	4 138
Other	37%	6%	7%	51%	100%	190
Black	43%	5%	21%	31%	100%	277
White	25%	3%	18%	53%	100%	1 344
Total 2012	29%	4%	18%	50%	100%	1 811
Total Ensemble	27%	4%	18%	51%	100%	8 492
marital: x > tot +5%; +10%; +20%; ×2; +30%; -5%; -10%; -20%; -30%
Source: General social survey 2000, 2006 and 2012
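The idea behind color = "diff" can be sketched in base R, without tabxplor. The snippet below is only an illustration under simple assumptions: it reuses the counts of the first cross-table above (transposed so that race is in rows), computes row percentages, and subtracts the percentages of the total row. Positive differences correspond to cells that would be shaded green (or blue), negative ones to cells shaded red/orange; the exact thresholds are those given in the legend, not computed here.

```r
# Counts copied from the first cross-table of this vignette (race in rows).
counts <- matrix(
  c( 2,  633, 110,  212,   70,  932,
     2, 1305, 196,  495,  262,  869,
    13, 3478, 437, 2676, 1475, 8316),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    race    = c("Other", "Black", "White"),
    marital = c("No answer", "Never married", "Separated",
                "Divorced", "Widowed", "Married")
  )
)

# Row percentages for each race, and percentages of the total row.
row_pct   <- counts / rowSums(counts)
total_pct <- colSums(counts) / sum(counts)

# Difference between each cell and the total row (the "diff" idea).
diff_from_total <- sweep(row_pct, 2, total_pct, FUN = "-")

# Differences in percentage points, rounded for reading.
round(100 * diff_from_total)
```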

The sup_cols argument adds supplementary column variables to the table. With numeric variables, it calculates the mean for each category of the row variable. With text variables, only the first level is kept (you can choose which one to use by placing it first with forcats::fct_relevel). Use tab_many to keep all levels.

tab(dplyr::storms, category, status, sup_cols = c("pressure", "wind"))

category	disturbance	extratropical	hurricane	other low	subtropical depression	subtropical storm	tropical depression	tropical storm	tropical wave	Total	pressure	wind
1	0	0	2 548	0	0	0	0	0	0	2 548	981 (σ9 )	71 (σ6 )
2	0	0	993	0	0	0	0	0	0	993	967 (σ9 )	89 (σ4 )
3	0	0	593	0	0	0	0	0	0	593	955 (σ9 )	104 (σ4 )
4	0	0	553	0	0	0	0	0	0	553	940 (σ9 )	122 (σ6 )
5	0	0	116	0	0	0	0	0	0	116	918 (σ12)	146 (σ6 )
NA	171	2 151	0	1 453	151	298	3 569	6 830	111	14 734	1 002 (σ9 )	38 (σ12)
Total	171	2 151	4 803	1 453	151	298	3 569	6 830	111	19 537	993 (σ19)	50 (σ25)
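As a rough illustration of what a numeric supplementary column summarizes, the base R sketch below computes a mean and a standard deviation per category. It is not tabxplor code: mtcars (shipped with R) stands in for a real dataset, with cyl playing the role of the row variable and mpg the numeric column, and sd() is the sample standard deviation, which may differ from the package's own choice.

```r
# mtcars stands in for a real dataset (assumption for illustration only):
# cyl plays the role of the row variable, mpg of the numeric column.
means <- tapply(mtcars$mpg, mtcars$cyl, mean)
sds   <- tapply(mtcars$mpg, mtcars$cyl, sd)

# One value per category, formatted roughly like the "mean (sigma sd)" cells
# printed above ("\u03c3" is the Greek letter sigma).
summary_col <- sprintf("%.0f (\u03c3%.0f)", means, sds)
names(summary_col) <- names(means)
summary_col
```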

References and comparison levels for colors

By default, to calculate colors, each cell is compared to the subtable’s related total.

When a third variable (or more) is provided, it’s possible to compare with the general total row instead, by setting comp = "all". Here, only the last total row is highlighted (TOTAL ENSEMBLE appears in white but the other total rows in grey).

tab(data, race, marital, year, subtext = gss2, pct = "row", color = "diff", comp = "all")

race	Never married	Separated	Divorced	Married	Total	n
Other	36%	5%	12%	47%	100%	166
Black	41%	11%	16%	32%	100%	381
White	25%	3%	18%	54%	100%	1 996
Total 2000	28%	4%	17%	50%	100%	2 543
Other	29%	7%	11%	53%	100%	572
Black	47%	6%	18%	30%	100%	580
White	22%	3%	19%	57%	100%	2 986
Total 2006	26%	4%	18%	52%	100%	4 138
Other	37%	6%	7%	51%	100%	190
Black	43%	5%	21%	31%	100%	277
White	25%	3%	18%	53%	100%	1 344
Total 2012	29%	4%	18%	50%	100%	1 811
Total Ensemble	27%	4%	18%	51%	100%	8 492
marital: x > tot +5%; +10%; +20%; ×2; +30%; -5%; -10%; -20%; -30%
Source: General social survey 2000, 2006 and 2012

With ref = "first", each row (or column) is compared to the first row (or column), which is particularly helpful to highlight historical evolutions. The first row then appears in white (while total rows are themselves colored like normal rows).

data <- data %>% dplyr::mutate(year = as.factor(year))
tab(data, year, marital, race, pct = "row", color = "diff", ref = "first", tot = "col",
    totaltab = "table")

year	Never married	Separated	Divorced	Married	Total	n
2000	36%	5%	12%	47%	100%	166
2006	29%	7%	11%	53%	100%	572
2012	37%	6%	7%	51%	100%	190
2000	41%	11%	16%	32%	100%	381
2006	47%	6%	18%	30%	100%	580
2012	43%	5%	21%	31%	100%	277
2000	25%	3%	18%	54%	100%	1 996
2006	22%	3%	19%	57%	100%	2 986
2012	25%	3%	18%	53%	100%	1 344
2000	28%	4%	17%	50%	100%	2 543
2006	26%	4%	18%	52%	100%	4 138
2012	29%	4%	18%	50%	100%	1 811
marital: x > x1 +5%; +10%; +20%; ×2; +30%; -5%; -10%; -20%; -30%

When `ref` is a number, the nth row (or column) is used for comparison.

tab(data, year, marital, race, pct = "row", color = "diff", ref = 3)

year	Never married	Separated	Divorced	Married	Total	n
2000	36%	5%	12%	47%	100%	166
2006	29%	7%	11%	53%	100%	572
2012	37%	6%	7%	51%	100%	190
Total Other	32%	6%	11%	51%	100%	928
2000	41%	11%	16%	32%	100%	381
2006	47%	6%	18%	30%	100%	580
2012	43%	5%	21%	31%	100%	277
Total Black	44%	7%	18%	31%	100%	1 238
2000	25%	3%	18%	54%	100%	1 996
2006	22%	3%	19%	57%	100%	2 986
2012	25%	3%	18%	53%	100%	1 344
Total White	23%	3%	19%	55%	100%	6 326
Total Ensemble	27%	4%	18%	51%	100%	8 492
marital: x > x3 +5%; +10%; +20%; ×2; +30%; -5%; -10%; -20%; -30%

Finally, when `ref` is a string, it is used as a regular expression to match the names of the rows (or columns).

tab(data, year, marital, race, pct = "col", tot = "row", color = "diff", ref = "Married")

year	Never married	Separated	Divorced	Married
2000	20%	14%	20%	16%
2006	56%	68%	66%	63%
2012	24%	19%	13%	20%
Total Other	100%	100%	100%	100%
2000	29%	47%	27%	32%
2006	49%	38%	46%	46%
2012	22%	14%	27%	22%
Total Black	100%	100%	100%	100%
2000	34%	33%	31%	31%
2006	44%	44%	48%	49%
2012	23%	24%	21%	21%
Total White	100%	100%	100%	100%
Total Ensemble	100%	100%	100%	100%
marital: x > Married +5%; +10%; +20%; ×2; +30%; -5%; -10%; -20%; -30%
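The way a string reference selects a comparison column can be sketched in base R. This is an illustration, not the package's implementation: the percentages are the rounded values of the "Other" subtable printed above, and keeping only the first match is an assumption made here for simplicity.

```r
# Column percentages (rounded) of the "Other" subtable printed above.
col_pct <- matrix(
  c(20, 14, 20, 16,
    56, 68, 66, 63,
    24, 19, 13, 20),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    year    = c("2000", "2006", "2012"),
    marital = c("Never married", "Separated", "Divorced", "Married")
  )
)

# Treat the reference as a regular expression matched against column names.
# grepl() is case-sensitive by default, so "Married" does not match
# "Never married".
ref     <- "Married"
ref_col <- which(grepl(ref, colnames(col_pct)))[1]  # assumption: first match

# Difference between each column and the reference column, row by row.
diff_from_ref <- col_pct - col_pct[, ref_col]
diff_from_ref
```

Because matching is done with a regular expression, a pattern such as "arried" would match two columns in this table, so an anchored pattern like "^Married$" is the safer choice when names overlap.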

Confidence intervals

It is possible to print confidence intervals for each cell:

tab(forcats::gss_cat, race, marital, pct = "row", ci = "cell")

race	No answer	Never married	Separated	Divorced	Widowed	Married	Total	n
Other	0%	[30;34]%	[5;7]%	[9;12]%	[3;4]%	[45;50]%	100%	1 959
Black	0%	[40;43]%	[5;7]%	[14;17]%	[7;9]%	[26;29]%	100%	3 129
White	0%	[21;22]%	[2;3]%	[16;17]%	9%	[50;51]%	100%	16 395
Total	0%	25%	3%	16%	8%	47%	100%	21 483
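To give an idea of how such a cell interval can be obtained, here is a minimal base R sketch using the normal-approximation (Wald) interval for a proportion at the 95% level. Both the formula and the level are assumptions made for illustration; the text above does not say which method tabxplor uses.

```r
# Wald (normal-approximation) interval for one proportion.
# Assumption: 95% level; the package's exact formula is not stated here.
prop_ci <- function(x, n, level = 0.95) {
  p    <- x / n
  z    <- qnorm(1 - (1 - level) / 2)
  half <- z * sqrt(p * (1 - p) / n)
  c(lower = p - half, upper = p + half)
}

# "Other" x "Never married": 633 respondents out of 1 959.
ci <- prop_ci(633, 1959)
round(100 * ci)
```

With these counts the bounds round to 30 and 34, in line with the [30;34]% cell printed above.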

It is also possible to use confidence intervals to enhance the color helpers. With color = "diff_ci", a cell is only colored if the difference between it and its reference cell (in the total or first row/col) is larger than the confidence interval of that difference. Otherwise, the cell is not significantly different from its reference in the total (or first) row: it turns grey, and the reader is no longer tempted to over-interpret the difference.

tab(forcats::gss_cat, race, marital, pct = "row", color = "diff_ci")

race	No answer	Never married	Separated	Divorced	Widowed	Married	Total	n
Other	0%	32%	6%	11%	4%	48%	100%	1 959
Black	0%	42%	6%	16%	8%	28%	100%	3 129
White	0%	21%	3%	16%	9%	51%	100%	16 395
Total	0%	25%	3%	16%	8%	47%	100%	21 483
marital: |x-tot| > ci & x > tot +5%; +10%; +20%; ×2; +30%; -5%; -10%; -20%; -30%

Finally, another calculation proves helpful: the difference between the cell and the total, minus the confidence interval of this difference (in other words, what remains of that difference after subtracting the confidence interval). color = "after_ci" highlights all the cells whose value is significantly different from the related total (or first cell). This is particularly useful when working on small populations: we can see at a glance which numbers can legitimately be read and interpreted.

tab(forcats::gss_cat, race, marital, subtext = gss, pct = "row", color = "after_ci")

race	No answer	Never married	Separated	Divorced	Widowed	Married	Total	n
Other	0%	32%	6%	11%	4%	48%	100%	1 959
Black	0%	42%	6%	16%	8%	28%	100%	3 129
White	0%	21%	3%	16%	9%	51%	100%	16 395
Total	0%	25%	3%	16%	8%	47%	100%	21 483
marital: |x-tot| > ci +0%; +5%; +15%; ×2; +25%; -0%; -5%; -15%; -25%
Source: General social survey 2000-2014
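The two difference-based rules described above ("diff_ci" and "after_ci") can be sketched in base R as follows. This is a simplified illustration, not the package's code: it assumes a 95% normal-approximation interval and treats the row and the total row as independent samples, although the total contains the row itself.

```r
# Half-width of a normal-approximation interval for the difference between
# two proportions. Assumptions: 95% level, groups treated as independent.
diff_ci <- function(x, n, x_ref, n_ref, level = 0.95) {
  p     <- x / n
  p_ref <- x_ref / n_ref
  z     <- qnorm(1 - (1 - level) / 2)
  z * sqrt(p * (1 - p) / n + p_ref * (1 - p_ref) / n_ref)
}

# "Black" x "Never married" (1 305 of 3 129) against the total row
# (5 416 of 21 483).
diff <- 1305 / 3129 - 5416 / 21483
ci   <- diff_ci(1305, 3129, 5416, 21483)

significant <- abs(diff) > ci   # rule behind color = "diff_ci"
remaining   <- abs(diff) - ci   # quantity behind color = "after_ci"

# "Black" x "Divorced" (495 of 3 129) is very close to the total
# (3 383 of 21 483): the difference is smaller than its interval.
diff_div <- 495 / 3129 - 3383 / 21483
ci_div   <- diff_ci(495, 3129, 3383, 21483)
significant_div <- abs(diff_div) > ci_div
```

Under these assumptions the first cell stays colored and the second one would turn grey.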

Chi2 stats and contributions of cells to variance

chi2 = TRUE adds summary statistics computed in the chi2 metric: degrees of freedom (df), unweighted count, pvalue and the (sub)table’s variance. The chi2 pvalue is colored in green when below 5%, and in red when greater than or equal to 5%, meaning that the table is not significantly different from the independence hypothesis (the two variables may be independent).

tab(forcats::gss_cat, race, marital, chi2 = TRUE)

race	No answer	Never married	Separated	Divorced	Widowed	Married	Total
Other	2	633	110	212	70	932	1 959
Black	2	1 305	196	495	262	869	3 129
White	13	3 478	437	2 676	1 475	8 316	16 395
Total	17	5 416	743	3 383	1 807	10 117	21 483
pvalue	<0.01%
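For comparison, the same statistics can be approximated with base R's chisq.test() on the counts printed above. This is a sketch under assumptions: it ignores weights, and the variance is taken to be the chi2 statistic divided by the number of observations (phi-squared), which the text above does not confirm.

```r
# Counts copied from the race x marital table printed above.
counts <- matrix(
  c( 2,  633, 110,  212,   70,  932,
     2, 1305, 196,  495,  262,  869,
    13, 3478, 437, 2676, 1475, 8316),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    race    = c("Other", "Black", "White"),
    marital = c("No answer", "Never married", "Separated",
                "Divorced", "Widowed", "Married")
  )
)

# The small expected counts of the "No answer" column trigger an
# approximation warning, silenced here.
test <- suppressWarnings(chisq.test(counts))

test$parameter   # degrees of freedom: (3 - 1) * (6 - 1) = 10
test$p.value     # far below 0.01%, as in the printed pvalue row

# Assumption: the table's variance is chi2 / n (phi-squared).
variance <- unname(test$statistic) / sum(counts)
```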

Chi2 stats can also be used to color cells based on their contributions to the variance of the (sub)table, with color = "contrib". By default, only the cells whose contribution is greater than the mean contribution are colored. This highlights the cells that would stand out in a correspondence analysis (the two related categories would be located at the edges of the first axes; here, being black is associated with being never married and being separated).

tab(forcats::gss_cat, race, marital, color = "contrib")

race	No answer	Never married	Separated	Divorced	Widowed	Married	Total
Other	2	633	110	212	70	932	1 959
Black	2	1 305	196	495	262	869	3 129
White	13	3 478	437	2 676	1 475	8 316	16 395
Total	17	5 416	743	3 383	1 807	10 117	21 483
pvalue	<0.01%
marital: contrib > mean_ctr ×1; ×2; ×5; ×10; contrib > mean_ctr ×1; ×2; ×5; ×10
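The contribution rule can be sketched in base R. The snippet assumes that a cell's contribution is its share of the chi2 statistic, (observed - expected)^2 / expected divided by the total, which is a common definition but is not spelled out in the text above.

```r
# Counts copied from the race x marital table printed above.
counts <- matrix(
  c( 2,  633, 110,  212,   70,  932,
     2, 1305, 196,  495,  262,  869,
    13, 3478, 437, 2676, 1475, 8316),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    race    = c("Other", "Black", "White"),
    marital = c("No answer", "Never married", "Separated",
                "Divorced", "Widowed", "Married")
  )
)

# Expected counts under independence of rows and columns.
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Assumption: contribution = share of each cell in the chi2 statistic.
cell_chi2 <- (counts - expected)^2 / expected
contrib   <- cell_chi2 / sum(cell_chi2)

# Cells that would be colored: contribution above the mean contribution.
above_mean <- contrib > mean(contrib)
round(100 * contrib, 1)
```

With this definition, the "Black" row stands out for "Never married" and "Separated", which agrees with the reading given above.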

Combine tabxplor and dplyr

The result of tab is a tibble::tibble data frame with class tab. It has its own printing methods but, at the same time, can be transformed using most dplyr verbs, like a normal tibble.

library(dplyr)
tab(storms, category, status, sup_cols = c("pressure", "wind")) %>%
  filter(category != "-1") %>%
  select(-`tropical depression`) %>%
  arrange(is_totrow(.), desc(category)) # use is_totrow to keep total rows

With dplyr::arrange, don’t forget to keep the order of tab variables and total rows:

tab(data, race, marital, year, pct = "row") %>%
  arrange(year, is_totrow(.), desc(Married))

Draw more complex tables with `tab_many`

tab is a wrapper around the more powerful function tab_many, which can be used to customize your tables.

It’s possible, for example, to make a summary table of as many column variables as you want (showing all levels, or showing only one specific level like here):

library(dplyr)
first_lvs <- c("Married", "$25000 or more", "Strong republican", "Protestant")
data <- forcats::gss_cat %>% mutate(across(
  where(is.factor),
  ~ forcats::fct_relevel(., first_lvs[first_lvs %in% levels(.)])
))
tabs <- tab_many(data, race, c(marital, rincome, partyid, relig, age, tvhours),
         levels = "first", pct = "row", chi2 = TRUE, color = "auto")
tabs

#> # A tabxplor tab: 5 × 8
#>   race   Married `$25000 or more` `Strong republican` Protestant      n      age
#>   <char>  <row%>           <row%>              <row%>     <row%>    <n>   <mean>
#> 1 Other      48%              32%                  4%        20%  1 959 39 (σ14)
#> 2 Black      28%              28%                  2%        73%  3 129 44 (σ16)
#> 3 White      51%              36%                 13%        50% 16 395 49 (σ17)
#> 4 Total      47%              34%                 11%        50% 21 483 47 (σ17)
#> 5 pvalue  <0.01%           <0.01%              <0.01%     <0.01%                
#> # ℹ 1 more variable: tvhours <mean>
#> # marital, rincome, partyid: x > tot +5%; +10%; +20%; ×2; +30%; -5%; -10%; -20%; -30%
#> # age, tvhours             : x > tot ×1.15; ×1.5; ×2; ×4; x < tot /1.15; /1.5; /2; /4

Using tab or tab_many with purrr::pmap and tibble::tribble, you can program several tables with different parameters all at once, in a readable way:

tabs <-
  purrr::pmap(
    tibble::tribble(
      ~row_var, ~col_vars       , ~pct , ~filter              , ~subtext               ,
      "race"  , "marital"       , "no" , NULL                 , "Source: GSS 2000-2014",
      "race"  , "marital"       , "row", NULL                 , "Source: GSS 2000-2014",
      "race"  , "marital"       , "col", NULL                 , "Source: GSS 2000-2014",
      "relig" , c("race", "age"), "row", "year %in% 2000:2010", "Source: GSS 2000-2010",
      "relig" , c("race", "age"), "row", "year %in% 2010:2014", "Source: GSS 2010-2014",
      NA_character_, "race"     , "no" , NULL                 , "Source: GSS 2000-2014",
    ),
    .f = tab_many,
    data = forcats::gss_cat, color = "auto", chi2 = TRUE)

Export to html or Excel

To export a table to html with colors, like most of the tables in this vignette, tabxplor uses knitr::kable and kableExtra. In this format, differences from totals, confidence intervals, contributions to variance and unweighted counts are available in a tooltip when hovering over a cell.

tabs %>% tab_kable()

race	Married	$25000 or more	Strong republican	Protestant	n	age	tvhours
Other	48%	32%	4%	20%	1 959	39 (σ14)	2.8 (σ2.4)
Black	28%	28%	2%	73%	3 129	44 (σ16)	4.2 (σ3.5)
White	51%	36%	13%	50%	16 395	49 (σ17)	2.8 (σ2.3)
Total	47%	34%	11%	50%	21 483	47 (σ17)	3.0 (σ2.6)
pvalue	<0.01%	<0.01%	<0.01%	<0.01%
marital, rincome, partyid: x > tot +5%; +10%; +20%; ×2; +30%; -5%; -10%; -20%; -30%
age, tvhours : x > tot ×1.15; ×1.5; ×2; ×4; x < tot /1.15; /1.5; /2; /4

To print an html table by default (for example, in RStudio viewer), use tabxplor options:

options(tabxplor.print = "kable") # default to options(tabxplor.print = "console")

tab_xl exports any table or list of tables to Excel, with all colors, chi2 stats and formatting. In Excel, it is still possible to do calculations on the raw numbers.

tabs %>% tab_xl(replace = TRUE, sheets = "unique")

Programming with `tabxplor`

When not doing data analysis but writing functions, you can use the sub-functions of tab_many step by step to attain more flexibility or speed. That way, it’s possible to write new functions to customize your tables even more.

# Stored under its own name so that `data` (the gss_cat data used in the
# examples below) is not overwritten.
sw <- dplyr::starwars %>%
  tab_prepare(sex, hair_color, gender, other_if_less_than = 5,
              na_drop_all = sex)

sw %>%
  tab_plain(sex, hair_color, gender, tot = c("row", "col"), pct = "row", comp = "all") %>%
  tab_ci("diff", color = "after_ci") %>%
  tab_chi2(calc = "p")

The whole architecture of tabxplor is powered by a special vector class for formatted numbers, named tabxplor_fmt. As a vctrs record vector, it stores behind the scenes all the data necessary to calculate printed results, formats and colors. A set of functions is available to access or transform this data. See ?fmt for more information.

The simplest way to recover the underlying numbers as numeric vectors is get_num, which extracts the currently displayed field, whatever it is:

tabs <- tab(data, race, marital, year, pct = "row")
tabs %>% mutate(across(where(is_fmt), get_num))

#> # A tabxplor tab: 33 × 10
#> # Groups:         year [9]
#>    year    race           Married `No answer` `Never married` Separated Divorced
#>    <fct>   <fct>            <dbl>       <dbl>           <dbl>     <dbl>    <dbl>
#>  1 2000    Other            0.446    0.00571            0.343    0.0457   0.114 
#>  2 2000    Black            0.282    0                  0.366    0.100    0.140 
#>  3 2000    White            0.488    0                  0.224    0.0276   0.163 
#>  4 2000    Total 2000       0.454    0.000355           0.253    0.0398   0.157 
#> 
#>  5 2002    Other            0.497    0                  0.281    0.0539   0.156 
#>  6 2002    Black            0.283    0                  0.398    0.0634   0.161 
#>  7 2002    White            0.489    0                  0.228    0.0279   0.161 
#>  8 2002    Total 2002       0.459    0                  0.256    0.0347   0.161 
#> 
#>  9 2004    Other            0.557    0                  0.289    0.0398   0.0846
#> 10 2004    Black            0.332    0                  0.403    0.0716   0.133 
#> 11 2004    White            0.556    0                  0.183    0.0269   0.156 
#> 12 2004    Total 2004       0.526    0                  0.220    0.0338   0.148 
#> 
#> 13 2006    Other            0.510    0.00169            0.279    0.0676   0.110 
#> 14 2006    Black            0.273    0                  0.426    0.0552   0.161 
#> 15 2006    White            0.516    0.00152            0.196    0.0247   0.172 
#> 16 2006    Total 2006       0.481    0.00133            0.239    0.0346   0.162 
#> 
#> 17 2008    Other            0.437    0                  0.372    0.0601   0.0765
#> 18 2008    Black            0.302    0                  0.416    0.0890   0.128 
#> 19 2008    White            0.518    0.00321            0.222    0.0218   0.148 
#> 20 2008    Total 2008       0.480    0.00247            0.262    0.0346   0.139 
#> 
#> 21 2010    Other            0.377    0                  0.404    0.0601   0.109 
#> 22 2010    Black            0.215    0                  0.511    0.0257   0.174 
#> 23 2010    White            0.487    0.000645           0.214    0.0297   0.172 
#> 24 2010    Total 2010       0.436    0.000489           0.276    0.0318   0.167 
#> 
#> 25 2012    Other            0.490    0                  0.357    0.0561   0.0663
#> 26 2012    Black            0.282    0                  0.399    0.0432   0.196 
#> 27 2012    White            0.487    0                  0.227    0.0298   0.166 
#> 28 2012    Total 2012       0.456    0                  0.266    0.0344   0.161 
#> 
#> 29 2014    Other            0.427    0                  0.347    0.0458   0.141 
#> 30 2014    Black            0.251    0.00518            0.433    0.0492   0.176 
#> 31 2014    White            0.502    0.00106            0.221    0.0265   0.162 
#> 32 2014    Total 2014       0.456    0.00158            0.266    0.0319   0.162 
#> 
#> 33 Ensemb… Total Ensemble   0.471    0.000791           0.252    0.0346   0.157 
#> # ℹ 3 more variables: Widowed <dbl>, Total <dbl>, n <dbl>

To render character vectors (without colors), use format:

tabs %>% mutate(across(where(is_fmt), format))

The following fields compose any fmt column (though many can be NA if not calculated):

display : name of the field to display, customisable for each cell (character)
n : raw count (integer)
wn : weighted count
pct : percentages
diff : differences from totals or reference cells
digits : digits to display, customisable for each cell (integer)
ctr : contributions of cells to variance (with color = "contrib")
mean : means (for numeric column variables)
var : variance (for numeric column variables ; Chi2 variance with pct)
ci : confidence intervals
in_totrow : TRUE if the cell is part of a total row, FALSE otherwise (logical)
in_tottab : TRUE if the cell is part of a total table, FALSE otherwise (logical)
in_refrow : TRUE if the cell is part of a reference row, FALSE otherwise (logical)

vctrs::vec_data(tabs$Married)

#>        n display digits wn       pct      mean         diff ctr var ci rr or
#> 1     78     pct      0 NA 0.4457143 0.9824547 -0.007959836  NA  NA NA NA NA
#> 2    121     pct      0 NA 0.2820513 0.6217046 -0.171622839  NA  NA NA NA NA
#> 3   1079     pct      0 NA 0.4875734 1.0747217  0.033899308  NA  NA NA NA NA
#> 4   1278     pct      0 NA 0.4536741 1.0000000  0.000000000  NA  NA NA NA NA
#> 5     83     pct      0 NA 0.4970060 1.0829169  0.038054813  NA  NA NA NA NA
#> 6    116     pct      0 NA 0.2829268 0.6164639 -0.176024346  NA  NA NA NA NA
#> 7   1070     pct      0 NA 0.4890311 1.0655405  0.030079903  NA  NA NA NA NA
#> 8   1269     pct      0 NA 0.4589512 1.0000000  0.000000000  NA  NA NA NA NA
#> 9    112     pct      0 NA 0.5572139 1.0594223  0.031253760  NA  NA NA NA NA
#> 10   125     pct      0 NA 0.3315650 0.6303994 -0.194395184  NA  NA NA NA NA
#> 11  1242     pct      0 NA 0.5559534 1.0570258  0.029993276  NA  NA NA NA NA
#> 12  1479     pct      0 NA 0.5259602 1.0000000  0.000000000  NA  NA NA NA NA
#> 13   302     pct      0 NA 0.5101351 1.0602348  0.028982142  NA  NA NA NA NA
#> 14   173     pct      0 NA 0.2728707 0.5671183 -0.208282331  NA  NA NA NA NA
#> 15  1695     pct      0 NA 0.5161389 1.0727126  0.034985862  NA  NA NA NA NA
#> 16  2170     pct      0 NA 0.4811530 1.0000000  0.000000000  NA  NA NA NA NA
#> 17    80     pct      0 NA 0.4371585 0.9098473 -0.043316073  NA  NA NA NA NA
#> 18    85     pct      0 NA 0.3024911 0.6295674 -0.177983440  NA  NA NA NA NA
#> 19   807     pct      0 NA 0.5176395 1.0773505  0.037164970  NA  NA NA NA NA
#> 20   972     pct      0 NA 0.4804745 1.0000000  0.000000000  NA  NA NA NA NA
#> 21    69     pct      0 NA 0.3770492 0.8649703 -0.058860800  NA  NA NA NA NA
#> 22    67     pct      0 NA 0.2154341 0.4942169 -0.220475897  NA  NA NA NA NA
#> 23   755     pct      0 NA 0.4870968 1.1174251  0.051186794  NA  NA NA NA NA
#> 24   891     pct      0 NA 0.4359100 1.0000000  0.000000000  NA  NA NA NA NA
#> 25    96     pct      0 NA 0.4897959 1.0742857  0.033868867  NA  NA NA NA NA
#> 26    85     pct      0 NA 0.2823920 0.6193798 -0.173535025  NA  NA NA NA NA
#> 27   719     pct      0 NA 0.4867976 1.0677093  0.030870511  NA  NA NA NA NA
#> 28   900     pct      0 NA 0.4559271 1.0000000  0.000000000  NA  NA NA NA NA
#> 29   112     pct      0 NA 0.4274809 0.9369141 -0.028783859  NA  NA NA NA NA
#> 30    97     pct      0 NA 0.2512953 0.5507665 -0.204969439  NA  NA NA NA NA
#> 31   949     pct      0 NA 0.5021164 1.1004935  0.045851627  NA  NA NA NA NA
#> 32  1158     pct      0 NA 0.4562648 1.0000000  0.000000000  NA  NA NA NA NA
#> 33 10117     pct      0 NA 0.4709305 1.0000000  0.000000000  NA  NA NA NA NA
#>    in_totrow in_tottab in_refrow
#> 1      FALSE     FALSE     FALSE
#> 2      FALSE     FALSE     FALSE
#> 3      FALSE     FALSE     FALSE
#> 4       TRUE     FALSE     FALSE
#> 5      FALSE     FALSE     FALSE
#> 6      FALSE     FALSE     FALSE
#> 7      FALSE     FALSE     FALSE
#> 8       TRUE     FALSE     FALSE
#> 9      FALSE     FALSE     FALSE
#> 10     FALSE     FALSE     FALSE
#> 11     FALSE     FALSE     FALSE
#> 12      TRUE     FALSE     FALSE
#> 13     FALSE     FALSE     FALSE
#> 14     FALSE     FALSE     FALSE
#> 15     FALSE     FALSE     FALSE
#> 16      TRUE     FALSE     FALSE
#> 17     FALSE     FALSE     FALSE
#> 18     FALSE     FALSE     FALSE
#> 19     FALSE     FALSE     FALSE
#> 20      TRUE     FALSE     FALSE
#> 21     FALSE     FALSE     FALSE
#> 22     FALSE     FALSE     FALSE
#> 23     FALSE     FALSE     FALSE
#> 24      TRUE     FALSE     FALSE
#> 25     FALSE     FALSE     FALSE
#> 26     FALSE     FALSE     FALSE
#> 27     FALSE     FALSE     FALSE
#> 28      TRUE     FALSE     FALSE
#> 29     FALSE     FALSE     FALSE
#> 30     FALSE     FALSE     FALSE
#> 31     FALSE     FALSE     FALSE
#> 32      TRUE     FALSE     FALSE
#> 33      TRUE      TRUE     FALSE
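The role of the display field can be illustrated with a plain data frame standing in for the record's fields. This is a hypothetical base R sketch, not the tabxplor_fmt class itself: displayed_value() is a made-up helper that mimics the idea of reading, for each cell, the field named in display. The first three n and pct values come from the output above; the third display value is changed to "n" purely for illustration.

```r
# A plain data frame standing in for the fields of a record vector.
fields <- data.frame(
  n       = c(78L, 121L, 1079L),
  pct     = c(0.4457143, 0.2820513, 0.4875734),
  display = c("pct", "pct", "n"),   # third value changed for illustration
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each cell, read the field named in `display`.
displayed_value <- function(fields) {
  vapply(
    seq_len(nrow(fields)),
    function(i) as.numeric(fields[[fields$display[i]]][i]),
    numeric(1)
  )
}

displayed_value(fields)
```

The first two cells return their percentage and the third its raw count, which is the per-cell behavior the display field is described as controlling.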

To get those underlying fields you can either use vctrs::field or, more simply, $:

tabs %>% mutate(across(where(is_fmt), ~ vctrs::field(., "pct") ))

tabs$Married$pct
tabs$Married$n
tabs %>% mutate(across(where(is_fmt), ~ .$n))

To change the field currently displayed, for the whole table or a single vector, you can use set_display():

tabs |> set_display("diff")
tabs |> mutate(across(where(is_fmt), ~ set_display(., "diff")))

To modify a field, you can use vctrs::`field<-`. For example, to change the number of digits:

tab(forcats::gss_cat, race, marital, pct = "row") |> 
    mutate(across(where(is_fmt), ~ vctrs::`field<-`(., "digits", rep(2L, length(.)))))

Faster to write and easier to read, you can also use dplyr::mutate() on an fmt vector. For example, to create a new column with standard deviations and display it with decimals:

tab_num(data, race, c(age, tvhours), marital, digits = 1L, comp = "all") |>
  dplyr::mutate(dplyr::across( #Mutate over the whole table.
    c(age, tvhours),
    ~ dplyr::mutate(., #Mutate over each fmt vector's underlying data.frame.
                    var     = sqrt(var), 
                    display = "var", 
                    digits  = 2L) |> 
      set_color("no"),
    .names = "{.col}_sd"
  ))

#> # A tabxplor tab: 25 × 6
#> # Groups:         marital [7]
#>    marital       race                         age    tvhours   age_sd tvhours_sd
#>    <fct>         <fct>                     <mean>     <mean> <mean-v> <mean-var>
#>  1 Married       Other               42.2 (σ13.0) 2.5 (σ1.9)    13.01       1.88
#>  2 Married       Black               46.4 (σ13.4) 3.8 (σ3.1)    13.40       3.06
#>  3 Married       White               49.7 (σ15.2) 2.6 (σ2.0)    15.24       1.98
#>  4 Married       Total Married       48.7 (σ15.1) 2.7 (σ2.1)    15.06       2.11
#> 
#>  5 No answer     Other               34.0 (σ8.5 )        2.0     8.49           
#>  6 No answer     Black                       64.0                               
#>  7 No answer     White               56.0 (σ15.7) 2.6 (σ1.2)    15.71       1.19
#>  8 No answer     Total No answer     52.4 (σ16.5) 2.6 (σ1.1)    16.51       1.13
#> 
#>  9 Never married Other               30.2 (σ10.6) 2.8 (σ2.7)    10.60       2.67
#> 10 Never married Black               34.5 (σ12.1) 4.2 (σ3.4)    12.14       3.39
#> 11 Never married White               34.4 (σ14.3) 2.8 (σ2.6)    14.29       2.56
#> 12 Never married Total Never married 33.9 (σ13.5) 3.1 (σ2.9)    13.47       2.86
#> 
#> 13 Separated     Other               42.5 (σ13.0) 3.3 (σ3.3)    12.97       3.26
#> 14 Separated     Black               46.2 (σ13.4) 5.1 (σ4.7)    13.36       4.73
#> 15 Separated     White               45.6 (σ13.5) 2.9 (σ2.8)    13.52       2.77
#> 16 Separated     Total Separated     45.3 (σ13.4) 3.5 (σ3.6)    13.43       3.60
#> 
#> 17 Divorced      Other               45.5 (σ11.8) 3.0 (σ2.7)    11.82       2.71
#> 18 Divorced      Black               51.0 (σ12.7) 4.3 (σ3.7)    12.67       3.74
#> 19 Divorced      White               51.6 (σ13.2) 2.9 (σ2.4)    13.22       2.43
#> 20 Divorced      Total Divorced      51.1 (σ13.1) 3.1 (σ2.7)    13.14       2.73
#> 
#> 21 Widowed       Other               64.5 (σ14.8) 4.2 (σ2.8)    14.84       2.79
#> 22 Widowed       Black               67.5 (σ13.9) 4.7 (σ3.7)    13.89       3.70
#> 23 Widowed       White               72.8 (σ12.5) 3.7 (σ2.7)    12.48       2.70
#> 24 Widowed       Total Widowed       71.7 (σ13.0) 3.9 (σ2.9)    13.00       2.90
#> 
#> 25 Ensemble      Total Ensemble      47.2 (σ17.3) 3.0 (σ2.6)    17.29       2.59
#> # age, tvhours: x > tot ×1.15; ×1.5; ×2; ×4; x < tot /1.15; /1.5; /2; /4

Some helper functions exist for total rows, total tables and reference rows (is_totrow() / as_totrow(), is_tottab() / as_tottab(), is_refrow() / as_refrow()):

tab(data, race, marital, year, pct = "row") %>%
  dplyr::mutate(across( 
    where(is_fmt),
    ~ dplyr::if_else(is_totrow(.), 
                true  = mutate(., digits = 1L), 
                false = mutate(., digits = 2L))
  ))

#> # A tabxplor tab: 33 × 10
#> # Groups:         year [9]
#>    year    race           Married `No answer` `Never married` Separated Divorced
#>    <fct>   <fct>           <row%>      <row%>          <row%>    <row%>   <row%>
#>  1 2000    Other           44.57%       0.57%          34.29%     4.57%   11.43%
#>  2 2000    Black           28.21%          0%          36.60%    10.02%   13.99%
#>  3 2000    White           48.76%          0%          22.37%     2.76%   16.31%
#>  4 2000    Total 2000       45.4%          0%           25.3%      4.0%    15.7%
#> 
#>  5 2002    Other           49.70%          0%          28.14%     5.39%   15.57%
#>  6 2002    Black           28.29%          0%          39.76%     6.34%   16.10%
#>  7 2002    White           48.90%          0%          22.76%     2.79%   16.13%
#>  8 2002    Total 2002       45.9%          0%           25.6%      3.5%    16.1%
#> 
#>  9 2004    Other           55.72%          0%          28.86%     3.98%    8.46%
#> 10 2004    Black           33.16%          0%          40.32%     7.16%   13.26%
#> 11 2004    White           55.60%          0%          18.31%     2.69%   15.58%
#> 12 2004    Total 2004       52.6%          0%           22.0%      3.4%    14.8%
#> 
#> 13 2006    Other           51.01%       0.17%          27.87%     6.76%   10.98%
#> 14 2006    Black           27.29%          0%          42.59%     5.52%   16.09%
#> 15 2006    White           51.61%       0.15%          19.64%     2.47%   17.20%
#> 16 2006    Total 2006       48.1%        0.1%           23.9%      3.5%    16.2%
#> 
#> 17 2008    Other           43.72%          0%          37.16%     6.01%    7.65%
#> 18 2008    Black           30.25%          0%          41.64%     8.90%   12.81%
#> 19 2008    White           51.76%       0.32%          22.19%     2.18%   14.82%
#> 20 2008    Total 2008       48.0%        0.2%           26.2%      3.5%    13.9%
#> 
#> 21 2010    Other           37.70%          0%          40.44%     6.01%   10.93%
#> 22 2010    Black           21.54%          0%          51.13%     2.57%   17.36%
#> 23 2010    White           48.71%       0.06%          21.42%     2.97%   17.23%
#> 24 2010    Total 2010       43.6%          0%           27.6%      3.2%    16.7%
#> 
#> 25 2012    Other           48.98%          0%          35.71%     5.61%    6.63%
#> 26 2012    Black           28.24%          0%          39.87%     4.32%   19.60%
#> 27 2012    White           48.68%          0%          22.75%     2.98%   16.59%
#> 28 2012    Total 2012       45.6%          0%           26.6%      3.4%    16.1%
#> 
#> 29 2014    Other           42.75%          0%          34.73%     4.58%   14.12%
#> 30 2014    Black           25.13%       0.52%          43.26%     4.92%   17.62%
#> 31 2014    White           50.21%       0.11%          22.06%     2.65%   16.19%
#> 32 2014    Total 2014       45.6%        0.2%           26.6%      3.2%    16.2%
#> 
#> 33 Ensemb… Total Ensemble   47.1%        0.1%           25.2%      3.5%    15.7%
#> # ℹ 3 more variables: Widowed <row%>, Total <row%>, n <n>

Each fmt column has attributes, which you can access or modify with get_ and set_ functions:

type / get_type() / set_type(): the type of the fmt vector, among c("n", "mean", "row", "col", "all", "all_tabs"); it determines which calculations are done within the tab_ functions.
totcol / is_totcol() / as_totcol(): TRUE if the column is a total column, FALSE otherwise (logical).
refcol / is_refcol() / as_refcol(): TRUE if the column is a reference column for comparison, FALSE otherwise (logical).
color / get_color() / set_color(): the calculation used to print colors, among c("", "no", "diff", "diff_ci", "after_ci", "contrib").
col_var / get_col_var() / set_col_var(): the name of the column variable (there can be several in a single table).
comp_all / get_comp_all() / set_comp_all(): when there are tab_vars, whether the reference for comparison is the subtable (FALSE) or the total table (TRUE).
ref / get_ref_type() / set_diff_type(): the type of difference calculated, either "no", "tot" for totals, an index, or a regular expression.
ci_type / get_ci_type() / set_ci_type(): the type of confidence interval, either "cell" or "diff".
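As a minimal sketch of how these accessors can be used, the lines below read a few attributes from a small row-percentage table. It assumes that forcats::gss_cat is available, that the table contains columns named Married and Total (as in the outputs above), and that the accessors accept a single fmt column extracted with `$`.

```r
library(tabxplor)

# Small two-variable table with row percentages (no tab_vars, so no grouping)
tabs <- tab(forcats::gss_cat, race, marital, pct = "row")

# Read attributes of individual fmt columns
get_type(tabs$Married)   # type of the fmt vector for one percentage column
is_totcol(tabs$Married)  # is this column a total column?
is_totcol(tabs$Total)    # same question for the Total column
get_color(tabs$Married)  # color calculation attached to the column
```

The same accessors are what makes selections such as where(is_totcol) possible inside dplyr::across().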

For example, to print the number of observations in the total column:

# For each total column, switch the displayed field to the counts ("n")
tab(data, race, marital, year, pct = "row") %>%
  mutate(across(where(is_totcol), ~ mutate(., display = "n")))

#> # A tabxplor tab: 33 × 10
#> # Groups:         year [9]
#>    year    race           Married `No answer` `Never married` Separated Divorced
#>    <fct>   <fct>           <row%>      <row%>          <row%>    <row%>   <row%>
#>  1 2000    Other              45%          1%             34%        5%      11%
#>  2 2000    Black              28%          0%             37%       10%      14%
#>  3 2000    White              49%          0%             22%        3%      16%
#>  4 2000    Total 2000         45%          0%             25%        4%      16%
#> 
#>  5 2002    Other              50%          0%             28%        5%      16%
#>  6 2002    Black              28%          0%             40%        6%      16%
#>  7 2002    White              49%          0%             23%        3%      16%
#>  8 2002    Total 2002         46%          0%             26%        3%      16%
#> 
#>  9 2004    Other              56%          0%             29%        4%       8%
#> 10 2004    Black              33%          0%             40%        7%      13%
#> 11 2004    White              56%          0%             18%        3%      16%
#> 12 2004    Total 2004         53%          0%             22%        3%      15%
#> 
#> 13 2006    Other              51%          0%             28%        7%      11%
#> 14 2006    Black              27%          0%             43%        6%      16%
#> 15 2006    White              52%          0%             20%        2%      17%
#> 16 2006    Total 2006         48%          0%             24%        3%      16%
#> 
#> 17 2008    Other              44%          0%             37%        6%       8%
#> 18 2008    Black              30%          0%             42%        9%      13%
#> 19 2008    White              52%          0%             22%        2%      15%
#> 20 2008    Total 2008         48%          0%             26%        3%      14%
#> 
#> 21 2010    Other              38%          0%             40%        6%      11%
#> 22 2010    Black              22%          0%             51%        3%      17%
#> 23 2010    White              49%          0%             21%        3%      17%
#> 24 2010    Total 2010         44%          0%             28%        3%      17%
#> 
#> 25 2012    Other              49%          0%             36%        6%       7%
#> 26 2012    Black              28%          0%             40%        4%      20%
#> 27 2012    White              49%          0%             23%        3%      17%
#> 28 2012    Total 2012         46%          0%             27%        3%      16%
#> 
#> 29 2014    Other              43%          0%             35%        5%      14%
#> 30 2014    Black              25%          1%             43%        5%      18%
#> 31 2014    White              50%          0%             22%        3%      16%
#> 32 2014    Total 2014         46%          0%             27%        3%      16%
#> 
#> 33 Ensemb… Total Ensemble     47%          0%             25%        3%      16%
#> # ℹ 3 more variables: Widowed <row%>, Total <row%-n>, n <n>

Note that, if tab_vars are provided, the table is grouped and all operations are performed within groups. To remove the grouping (for example, when it causes errors), use dplyr::ungroup().
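The grouping behaviour described above can be checked and removed as in the following sketch. It assumes forcats::gss_cat as input, with year converted to a factor beforehand (a precaution of this example, not a documented requirement); the object names gss, tabs and flat are illustrative only.

```r
library(dplyr)
library(tabxplor)

# Illustrative data: year as a factor, used as the tab_vars (third) variable
gss <- forcats::gss_cat %>% mutate(year = as.factor(year))

tabs <- tab(gss, race, marital, year, pct = "row")

group_vars(tabs)      # names of the grouping variables of the table
is_grouped_df(tabs)   # the table with tab_vars is a grouped data frame

# Drop the grouping so that later dplyr verbs work on the whole table
flat <- ungroup(tabs)
is_grouped_df(flat)
```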

If you only need the simplest table, with plain numeric counts (no fmt), or even a base data.frame (not a tibble), use tab_plain():

tab_plain(data, race, marital, num = TRUE) # counts as a numeric vector
tab_plain(data, race, marital, df = TRUE)  # same, with unique class = "data.frame"