Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


dplyr - Creating a table for frequency analysis results in R

I need to create a table of a certain type and based on a certain template.

This is my data:

df = structure(list(group = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 1L),
                    degree = structure(c(1L, 1L, 1L, 1L, 1L, 3L, 2L, 1L, 1L, 1L),
                                       .Label = c("Mild severity", "Moderate severity", "Severe severity"),
                                       class = "factor")), 
               .Names = c("group", "degree"), 
               class = "data.frame", 
               row.names = c(NA, -10L))

I conducted a crosstab:

table(df$degree,df$group)
                   
                    1 2 3
  Mild severity     3 3 2
  Moderate severity 0 0 1
  Severe severity   0 0 1

but I need the results to be formatted as in this template: [image: enter image description here]

How can I create a table with this structure?

Very important edit:

Full dput() (42 obs.):

df = structure(list(Study.Subject.ID = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 1L, 2L, 3L, 5L, 7L, 8L, 9L, 1L, 2L, 3L, 5L, 8L, 2L, 3L, 5L, 8L, 2L, 3L, 5L, 8L, 2L, 3L, 5L, 8L, 3L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L),
                                                 .Label = c("01-06-104", "01-09-108", "01-15-201", "01-16-202", "01-18-204", "01-27-301", "01-28-302", "01-33-305", "01-42-310"),
                                                 class = "factor"),
                    group = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
                    Degree.of.severity = structure(c(2L, 2L, 2L, 2L, 2L, 4L, 3L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 3L, 3L, 3L, 3L),
                                                   .Label = c("Life-threatening or disabling", "Mild severity", "Moderate severity", "Severe severity"),
                                                   class = "factor")),
              .Names = c("Study.Subject.ID", "group", "Degree.of.severity"),
              class = "data.frame",
              row.names = c(NA, -42L))

There are subjects, and there are side effects. One person can have several side effects. A side effect can be:

Mild
Moderate
Severe

For each group, I have to count how many people have a side effect of a given severity (X), and how many side effects of that severity occur in the group (Y).

For example, in the first group we have 9 observations, but only two unique people:

01-06-104
01-09-108

The total count of Mild severity in this group is 7. So only two people have side effects of Mild severity (X), and the total count of Mild severity is 7 (Y). The total count is 42 observations, so to calculate the percentage we divide by 42: 2/42 = 4,7%.
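The group 1 arithmetic can be checked with a short base R sketch. The nine rows below are copied from the group 1 entries of the full data; the names g1, mild, X, Y and pct are only illustrative, and the denominator of 42 follows the convention stated in the question.

```r
# Group 1 rows taken from the full 42-row data (9 observations)
g1 <- data.frame(
  Study.Subject.ID = c("01-06-104", "01-09-108", "01-06-104", "01-09-108",
                       "01-06-104", "01-09-108", "01-09-108", "01-09-108",
                       "01-09-108"),
  Degree.of.severity = c("Mild severity", "Mild severity", "Mild severity",
                         "Moderate severity", "Mild severity",
                         "Life-threatening or disabling", "Mild severity",
                         "Mild severity", "Mild severity")
)

# Keep only the mild side effects of group 1
mild <- g1[g1$Degree.of.severity == "Mild severity", ]

X <- length(unique(mild$Study.Subject.ID))  # distinct people with a mild side effect
Y <- nrow(mild)                             # number of mild side effects
pct <- 100 * X / 42                         # share of the 42 observations

cat(sprintf("%d (%.2f%%) %d\n", X, pct, Y))
```

This gives X = 2 and Y = 7; the percentage is 4.76 before any rounding or truncation.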

That's why I expected the output to be:

    degree             group1         group2         group3
                       X (%) Y        X (%) Y        X (%) Y

    Mild severity      2 (4,7%) 7     3 (7,1%) 13    3 (7,1%) 12
    Moderate severity  1 (2,3%) 1     0 (0,0%) 0     2 (4,7%) 6
    Severe severity    0 (0,0%) 0     0 (0,0%) 0     1 (2,3%) 1

1 Answer


I have to admit that I'm not clear on what you're trying to do. Unfortunately your expected output image does not help.

I assume you are asking how to calculate a 2-way contingency table and show both counts and percentages (of total). Here is a tidyverse possibility, using the 10-row df from the first part of the question:

library(tidyverse)
df %>%
    group_by(group, degree) %>%
    # count per cell and its share of all rows in df
    summarise(n = n(), perc = n() / nrow(.)) %>%
    mutate(entry = sprintf("%i (%3.2f%%)", n, perc * 100)) %>%
    select(-n, -perc) %>%
    # fill is used literally (it does not go through sprintf), so a single % is needed
    spread(group, entry, fill = "0 (0.00%)")
## A tibble: 3 x 4
#  degree            `1`        `2`        `3`
#  <fct>             <chr>      <chr>      <chr>
#1 Mild severity     3 (30.00%) 3 (30.00%) 2 (20.00%)
#2 Moderate severity 0 (0.00%)  0 (0.00%)  1 (10.00%)
#3 Severe severity   0 (0.00%)  0 (0.00%)  1 (10.00%)
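For the edited question with the full 42-row data, the same idea might be extended as sketched below. This is only one reading of the request: it assumes X is the number of distinct subjects per group and severity, Y is the number of rows (side effects), and the percentage is X divided by the total number of rows (42), as in the question's own example. The data frame is rebuilt from the dput() values in the question; the names ids, sev, total and res are illustrative. It also assumes reasonably recent packages (dplyr 1.0 or later for `.groups`, tidyr 1.0 or later for `pivot_wider`).

```r
library(dplyr)
library(tidyr)

# Rebuild the 42-row data from the question's dput() output
ids <- c("01-06-104", "01-09-108", "01-15-201", "01-16-202", "01-18-204",
         "01-27-301", "01-28-302", "01-33-305", "01-42-310")
sev <- c("Life-threatening or disabling", "Mild severity",
         "Moderate severity", "Severe severity")

df <- data.frame(
  Study.Subject.ID = ids[c(1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 5, 7, 8, 9,
                           1, 2, 3, 5, 8, 2, 3, 5, 8, 2, 3, 5, 8, 2, 3, 5,
                           8, 3, 8, 8, 8, 8, 8, 8, 8, 8)],
  group = c(1, 1, 2, 2, 2, 3, 3, 3, 3, 1, 1, 2, 2, 3, 3, 3, 1, 1, 2, 2, 3,
            1, 2, 2, 3, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3),
  Degree.of.severity = factor(
    sev[c(2, 2, 2, 2, 2, 4, 3, 2, 2, 2, 3, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2,
          2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 3, 3, 3, 3)],
    levels = sev)
)

total <- nrow(df)  # denominator used in the question (42 rows)

res <- df %>%
  group_by(group, Degree.of.severity) %>%
  summarise(X = n_distinct(Study.Subject.ID),  # distinct subjects
            Y = n(),                           # number of side effects
            .groups = "drop") %>%
  mutate(entry = sprintf("%d (%.1f%%) %d", X, 100 * X / total, Y)) %>%
  select(group, Degree.of.severity, entry) %>%
  pivot_wider(names_from = group, values_from = entry,
              names_prefix = "group",
              values_fill = list(entry = "0 (0.0%) 0"))

print(res)
```

With this data the Mild severity row comes out as 2 (4.8%) 7, 3 (7.1%) 13 and 3 (7.1%) 12 for groups 1 to 3. Note that sprintf rounds, so 2/42 prints as 4.8% rather than the 4,7 shown in the question, and the "Life-threatening or disabling" level gets its own row because it is present in the data.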
