Bivariate statistics between a categorical variable and a set of variables

Computes bivariate statistics for a set of variables according to the subgroups of observations defined by a categorical variable.

cattab(x, y, weights = NULL, percent = "column",
       robust = TRUE, show.n = TRUE, show.asso = TRUE,
       digits = c(1,1), na.rm = TRUE, na.value = "NAs")

Arguments

x: data frame. The variables which are described in rows. They can be numerical or factors.
y: factor. The categorical variable which defines subgroups of observations described in columns.
weights: numeric vector of weights. If NULL (default), uniform weights (i.e. all equal to 1) are used.
percent: character. Whether to compute row percentages ("row") or column percentages ("column", default).
robust: logical. Whether to use medians instead of means. Default is TRUE.
show.n: logical. Whether to display frequencies (between brackets) in addition to the percentages. Default is TRUE.
show.asso: logical. Whether to add a column with measures of global association (Cramer's V and eta-squared). Default is TRUE.
digits: vector of 2 integers. The first value sets the number of digits for percentages, the second one sets the number of digits for medians and means. Default is c(1,1). If NULL, the results are not rounded.
na.rm: logical, indicating whether NA values should be silently removed before the computation proceeds. Default is TRUE. If FALSE, an additional level is added to the variables (see na.value argument).
na.value: character. Name of the level for NA category. Default is "NAs". Only used if na.rm = FALSE.
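
As a rough illustration of what the weights, percent, na.rm and na.value arguments describe, the base R sketch below builds a weighted cross-tabulation by hand. The toy vectors x, y and w are hypothetical, and this is not cattab's internal code, only a minimal sketch of the same ideas.

```r
# Toy data (hypothetical; not taken from the package)
x <- factor(c("a", "b", NA, "a", "b", "a"))
y <- factor(c("g1", "g1", "g1", "g2", "g2", "g2"))
w <- c(1, 2, 1, 1, 1, 2)  # hypothetical observation weights

# na.rm = FALSE: keep missing values as an explicit level named by na.value
na_value <- "NAs"
x_na <- addNA(x, ifany = TRUE)
levels(x_na)[is.na(levels(x_na))] <- na_value

# Weighted cross-tabulation: each cell is a sum of weights
tab <- xtabs(w ~ x_na + y)

# percent = "column": each column sums to 100
col_pct <- 100 * prop.table(tab, margin = 2)
# percent = "row": each row sums to 100
row_pct <- 100 * prop.table(tab, margin = 1)

print(round(col_pct, 1))
```

With uniform weights (w all equal to 1), the cells reduce to plain counts, which matches the documented default for weights = NULL.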

Details

The function uses the gtsummary package to build the table of statistics, and then the gt package to finalize the layout. Weights are handled internally with the survey package.

In addition, the function is compatible with the label attributes assigned with the labelled package: these labels are displayed automatically.
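
A minimal sketch of the label mechanism, assuming the GDAtools, labelled, gtsummary, gt and survey packages are all installed. The label texts are hypothetical and chosen only for illustration.

```r
# Run only if every package mentioned in the Details section is available
pkgs <- c("GDAtools", "labelled", "gtsummary", "gt", "survey")
if (all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  data(Movies, package = "GDAtools")

  # Attach variable labels (hypothetical wording) to two columns
  labelled::var_label(Movies$Critics) <- "Critics' rating"
  labelled::var_label(Movies$BoxOffice) <- "Box-office figures"

  # The labels should then replace the column names in the row headers
  tab_labelled <- GDAtools::cattab(x = Movies[, c("Critics", "BoxOffice")],
                                   y = Movies$Country)
  tab_labelled
}
```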

Note

This function is quite similar to profiles, but displays the results in a fancier way.

Value

An object of class gt_tbl.
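
Because the result is a gt_tbl, it can in principle be passed to other gt functions. The sketch below adds a title and saves the table as HTML. It assumes GDAtools, gtsummary, gt and survey are installed; the title text and file name are hypothetical.

```r
# Run only if the required packages are available
pkgs <- c("GDAtools", "gtsummary", "gt", "survey")
if (all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  data(Movies, package = "GDAtools")
  tab <- GDAtools::cattab(x = Movies[, c("Genre", "ArtHouse")],
                          y = Movies$Country)

  # Add a title with gt (hypothetical wording)
  tab <- gt::tab_header(tab, title = "Film characteristics by country")

  # Write the table to a temporary HTML file
  gt::gtsave(tab, filename = "cattab_movies.html", path = tempdir())
}
```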

Author

Nicolas Robette

Examples

# \dontrun{
data(Movies)
cattab(x = Movies[, c("Genre", "ArtHouse", "Critics", "BoxOffice")],
       y = Movies$Country)


  
      
# }

	Movies_Country
	Europe (n=72)	France (n=605)	Other (n=26)	USA (n=297)	Total (n=1000)	Association¹
Genre						0.275
Action	19.4% (14)	9.8% (59)	23.1% (6)	29.0% (86)	16.5% (165)
Animation	2.8% (2)	3.0% (18)	0.0% (0)	8.8% (26)	4.6% (46)
Other	5.6% (4)	2.1% (13)	0.0% (0)	3.0% (9)	2.6% (26)
ComDram	13.9% (10)	18.2% (110)	11.5% (3)	8.8% (26)	14.9% (149)
Comedy	22.2% (16)	23.8% (144)	15.4% (4)	19.5% (58)	22.2% (222)
Documentary	1.4% (1)	12.1% (73)	3.8% (1)	0.7% (2)	7.7% (77)
Drama	25.0% (18)	29.6% (179)	34.6% (9)	11.8% (35)	24.1% (241)
Horror	4.2% (3)	0.2% (1)	3.8% (1)	6.7% (20)	2.5% (25)
SciFi	5.6% (4)	1.3% (8)	7.7% (2)	11.8% (35)	4.9% (49)
ArtHouse	45.8% (33)	65.0% (393)	76.9% (20)	13.5% (40)	48.6% (486)	0.469
Critics						0.017
median	3.0	3.0	3.2	2.7	3.0
(Q1 - Q3)	(2.3 - 3.4)	(2.3 - 3.5)	(2.4 - 3.7)	(2.0 - 3.2)	(2.3 - 3.5)
BoxOffice						0.048
median	106,747.0	57,140.0	49,098.0	328,559.0	106,747.0
(Q1 - Q3)	(10,291.0 - 493,595.0)	(9,988.0 - 310,777.0)	(21,341.0 - 164,270.0)	(94,925.0 - 926,106.0)	(20,252.0 - 468,809.0)
¹ Cramer’s V (categorical var.) or eta-squared (continuous var.)
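
The two association measures named in the footnote can be sketched in base R as follows. This is an unweighted illustration using the standard textbook definitions on built-in datasets (mtcars and iris, not Movies); the helper names cramers_v and eta_squared are hypothetical, and cattab may compute these values differently, in particular when weights are supplied.

```r
# Cramer's V for two categorical variables (unweighted)
cramers_v <- function(a, b) {
  obs <- table(a, b)
  n <- sum(obs)
  # Expected counts under independence
  expected <- outer(rowSums(obs), colSums(obs)) / n
  chi2 <- sum((obs - expected)^2 / expected)
  sqrt(chi2 / (n * (min(dim(obs)) - 1)))
}

# Eta-squared: share of the variance of a numeric variable
# accounted for by the groups of a categorical variable
eta_squared <- function(v, g) {
  grand_mean <- mean(v)
  group_means <- tapply(v, g, mean)
  group_sizes <- tapply(v, g, length)
  ss_between <- sum(group_sizes * (group_means - grand_mean)^2)
  ss_total <- sum((v - grand_mean)^2)
  ss_between / ss_total
}

v_cyl_gear <- cramers_v(factor(mtcars$cyl), factor(mtcars$gear))
eta2_sepal <- eta_squared(iris$Sepal.Length, iris$Species)
print(c(cramers_v = v_cyl_gear, eta_squared = eta2_sepal))
```

Both measures range from 0 (no association) to 1, which is how the Association column in the table above can be read.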