You don't need a custom function for this. Let dplyr handle it. Assuming your data is in a data frame named df, it could look like this:
library(dplyr)

df %>%                                      # Set up the pipe
  subset(complete.cases(df)) %>%            # Remove rows with NA values
  group_by(Name) %>%                        # Group by the Name column
  count(Color) %>%                          # Count each Color by Name; creates a new column n
  mutate(max = max(n)) %>%                  # Add a column holding max(n) for each Name
  subset(n == max) %>%                      # Keep only rows where n equals that group maximum
  mutate(Keep = case_when(                  # Create a logical helper column named 'Keep'
    n > 1 ~ TRUE,                           # TRUE for n > 1, so ties are kept
    n == 1 & Color == head(Color, 1) ~ TRUE, # TRUE for the first row of a group when n == 1
    TRUE ~ FALSE)) %>%                      # FALSE for all other cases
  subset(Keep) %>%                          # Keep only rows where Keep is TRUE
  select(Name, Mode = Color, Count = n)     # Keep Name, Color, and n, renaming
                                            # Color to Mode and n to Count
Here is the output:
# A tibble: 4 x 3
# Groups:   Name [3]
  Name  Mode  Count
  <fct> <fct> <int>
1 Bob   Green     3
2 Drew  Blue      1
3 Jim   Blue      2
4 Jim   Red       2
If you do want a function, wrap the pipeline in a function definition:
my_mode_func <- function(df) {
  df %>%
    subset(complete.cases(df)) %>%
    group_by(Name) %>%
    count(Color) %>%
    mutate(max = max(n)) %>%
    subset(n == max) %>%
    mutate(Keep = case_when(
      n > 1 ~ TRUE,
      n == 1 & Color == head(Color, 1) ~ TRUE,
      TRUE ~ FALSE)) %>%
    subset(Keep) %>%
    select(Name, Mode = Color, Count = n)
}
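To try the function, here is a minimal sketch on a made-up data frame. The sample values (and the names sample_df and result) are hypothetical, chosen only to exercise the three interesting cases: a row with NA, a tie at n > 1 (Jim), and a tie at n == 1 (Drew). The function body is repeated from the definition above so the snippet runs on its own.

```r
library(dplyr)

# Hypothetical sample data, not taken from the original question
sample_df <- data.frame(
  Name  = c("Bob", "Bob", "Bob", "Bob", "Drew", "Drew",
            "Jim", "Jim", "Jim", "Jim", "Jim"),
  Color = c("Green", "Green", "Green", "Blue", "Blue", "Red",
            "Blue", "Blue", "Red", "Red", NA)
)

# Same definition as above, repeated so this block is self-contained
my_mode_func <- function(df) {
  df %>%
    subset(complete.cases(df)) %>%
    group_by(Name) %>%
    count(Color) %>%
    mutate(max = max(n)) %>%
    subset(n == max) %>%
    mutate(Keep = case_when(
      n > 1 ~ TRUE,
      n == 1 & Color == head(Color, 1) ~ TRUE,
      TRUE ~ FALSE)) %>%
    subset(Keep) %>%
    select(Name, Mode = Color, Count = n)
}

result <- my_mode_func(sample_df)
print(result)
```

With this input, Bob should get a single mode (Green), Jim should keep both tied colors (Blue and Red), and Drew should keep only the first of his two single-count colors (Blue), because the case_when() rule keeps ties only when n > 1.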