Underrated Tidyverse Functions

The Assignment

I’m teaching an R Programming course next term. Jessica Minnier and I are developing the Ready for R Materials into a longer and more involved course.

I think one of the most important things is to teach people how to self-learn. Learning to program is a lifelong activity, so it’s critically important to give students these meta-learning skills. That’s the motivation behind the Tidyverse Function of the Week assignment.

I asked on Twitter:

Hi Everyone. I'm teaching an #rstats course next quarter.

One assignment is to have each student write about a #tidyverse function. What it's for and an example.

What are some less known #tidyverse functions that do a job you find useful?
— Ted Laderas, PhD 🏳️‍🌈 (@tladeras) November 30, 2020

Some of my favorite suggestions

Here are some of the highlights from the thread.

I loved all of these. Danielle Quinn wins the MVP award for naming so many useful functions:

dplyr::uncount()
tidyr::complete()
tidyr::fill() / replace_na()
stringr::str_detect() / str_which()
lubridate::ymd_hms() and related functions
ggplot2::labs() - so simple, yet under appreciated!
— Danielle Quinn (she/her) (@daniellequinn88) December 1, 2020

fill() was highly suggested:

tidyr::fill() - extremely useful when creating a usable dataset out of a spreadsheet originally built for data entry, in which redundant informations are only reported once at the beginning of the group they refer to, rather than in every row as needed for the analysis.
— Luca Foppoli (@foppoli_luca) December 1, 2020
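Here is a minimal sketch of that situation, using a made-up data-entry table (the `entry` data and its column names are hypothetical, not from the thread). The site is only written on the first row of each group, and `fill()` carries it down:

```r
library(tidyr)
library(tibble)

# Hypothetical data-entry sheet: site is recorded only on the
# first row of each group, as in the tweet above
entry <- tribble(
  ~site, ~visit, ~count,
  "A",   1,      10,
  NA,    2,      12,
  NA,    3,       9,
  "B",   1,       4,
  NA,    2,       7
)

# fill() replaces each NA with the last non-missing value above it
# (the default direction is "down")
filled <- fill(entry, site)
filled$site
```

After this, every row carries its own site label, which is the shape most analyses need.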

Many people suggested the window functions, including lead() and lag() and the cumulative functions:

Check out the dplyr window functions, cummin, cummax, cumany and cumall. They don't seen useful at first but they can solve really tricky aggregation problems. https://t.co/aDpXqSB2Vx
— Robert Kubinec (@rmkubinec) December 1, 2020
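As a rough illustration of why these are handy, here is a small sketch on invented data (the `visits` table is hypothetical). It uses `cumany()` to keep everything from the first event onward, `cumall()` to keep everything before it, and `lag()` to look at the previous row within each group:

```r
library(dplyr)

# Hypothetical repeated-measures data: one row per person per day
visits <- tibble(
  id    = c(1, 1, 1, 1, 2, 2, 2),
  day   = c(1, 2, 3, 4, 1, 2, 3),
  fever = c(FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE)
)

# cumany(x) becomes TRUE at the first TRUE in x and stays TRUE,
# so this keeps each person's rows from the first fever onward
after_onset <- visits %>%
  group_by(id) %>%
  filter(cumany(fever)) %>%
  ungroup()

# cumall(x) stays TRUE only until the first FALSE in x,
# so this keeps each person's rows before the first fever
before_onset <- visits %>%
  group_by(id) %>%
  filter(cumall(!fever)) %>%
  ungroup()

# lag() shifts values down one row within each group;
# the first row of every group gets NA
with_prev <- visits %>%
  group_by(id) %>%
  mutate(prev_day = lag(day)) %>%
  ungroup()
```

The NA that `lag()` puts on the first row of each group is expected behaviour, but it is worth planning for in any later calculation.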

Alison Hill suggested problems(), which helps you diagnose why your data isn’t loading:

Ooh problems is a good function for importing rx https://t.co/P4ZR57PgOG
— Alison Presmanes Hill (@apreshill) December 1, 2020

I think that deframe() and enframe() are really exciting, since I do this operation all the time:

tibble::deframe(), tibble::deframe()
coercing a two-column df to named vector, which I prefer immensely to names(df) <- vec_of_names
— E. David Aja (@PeeltothePithy) December 1, 2020
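A quick sketch of the round trip, with a made-up lookup table (`lookup` and its columns are hypothetical):

```r
library(tibble)

# Hypothetical two-column lookup table
lookup <- tibble(
  code  = c("OR", "WA", "CA"),
  state = c("Oregon", "Washington", "California")
)

# deframe(): two-column tibble -> named vector
# (first column becomes the names, second the values)
state_names <- deframe(lookup)
state_names[["WA"]]

# enframe(): named vector -> two-column tibble
round_trip <- enframe(state_names, name = "code", value = "state")
```

The named vector is convenient for quick lookups or for recoding, and `enframe()` gets you back to a tibble when you need to join or plot.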

unite(), separate() and separate_rows() also had their own contingent:

I find myself using tidyr::unite() a lot to clean messy data - particularly useful for making unique and informative ID's for each row. coalesce() and fill() are also little known gems! :)
— Guy Sutton🐝🌾🇿🇦🇿🇼 (@Guy_F_Sutton) December 1, 2020
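To make that concrete, here is a small sketch on invented data (the `samples` table and its column names are hypothetical). `unite()` builds a row ID from two columns, and `coalesce()` picks the first non-missing value across several columns:

```r
library(dplyr)
library(tidyr)

# Hypothetical messy data: no single ID column, and the phone
# number may sit in either of two columns
samples <- tibble(
  site       = c("A", "A", "B"),
  year       = c(2019, 2020, 2020),
  phone_home = c("555-0100", NA, NA),
  phone_work = c("555-0199", "555-0142", NA)
)

cleaned <- samples %>%
  # unite() pastes columns together; remove = FALSE keeps the originals
  unite("row_id", site, year, sep = "_", remove = FALSE) %>%
  # coalesce() returns the first non-missing value, element by element
  mutate(phone = coalesce(phone_home, phone_work, "unknown"))

cleaned$row_id
cleaned$phone
```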

Wow! Let’s Grab All the Tweets and Replies

I was bowled over by all of the replies. This turned out to be an unexpectedly fun thread, with lots of recommendations from others.

I thought I would try to summarize everyone’s suggestions and compile a list of recommended functions. I used this script with some modifications to pull all the replies to my tweet. In particular, I had to request extended tweet mode, and I extracted a few more fields from the returned JSON.

This wrote the tweet information into a CSV file.

Then I started parsing the data. I wrote two functions: remove_users_from_text(), which removes user mentions from a tweet (by looking for words that begin with @), and get_funcs(), which uses a relatively simple regular expression to pull out function names (it looks for a name followed by paired parentheses () or a word containing an underscore “_”). It works pretty well and grabs most of the functions.

Then I use separate_rows() to split the multiple functions into their separate rows. This makes it easier to tally all the functions.
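To get a feel for what those two patterns catch and what they miss, here is a toy run on a single made-up tweet (the `tweet` string is hypothetical, not one of the real replies):

```r
library(stringr)

# Hypothetical reply: one function written with (), one with only an
# underscore, and one with neither
tweet <- "@tladeras tidyr::fill() and dplyr::n_distinct are my picks, plus near"

# Step 1: drop @mentions
no_users <- str_replace_all(tweet, "\\@\\w*", "")

# Step 2: keep names followed by "()" or words containing "_"
found <- str_extract_all(no_users, "\\w*\\(\\)|\\w*_\\w*")[[1]]
found
```

The pattern picks up fill() and n_distinct, but a bare name like near has neither parentheses nor an underscore, so it slips through. Note that the package prefix (tidyr::, dplyr::) is dropped as well, because `\w` does not match the colons.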

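library(tidyverse)

# Strip @mentions (an "@" followed by word characters) from the tweet text
remove_users_from_text <- function(col){
  str_replace_all(col, "\\@\\w*", "")
}

# Pull out anything that looks like a function: a name followed by "()",
# or any word containing an underscore. Expects one tweet at a time
# (hence rowwise() below), since only the first list element is used.
get_funcs <- function(col){
  out <- str_extract_all(col, "\\w*\\(\\)|\\w*_\\w*")
  paste(out[[1]], collapse=", ")
}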

parsed_tweets <- tweets %>%
  rowwise() %>%
  mutate(text = remove_users_from_text(text)) %>%
  mutate(funcs = get_funcs(text)) %>%
  ungroup() %>%
  separate_rows(funcs, sep=", ") %>%
  select(date, user, funcs, text, reply, parent_thread) %>%
  distinct()

write_csv(parsed_tweets, file = "cleaned_tweets_incomplete.csv")

knitr::kable(parsed_tweets[1:10,-c(5:6)])

date	user	funcs	text
02/12/2020 16:12:48	NathanKhadaroo	expand_grid()	tidyr::expand_grid() is really useful for creating new datasets to see how fitted models perform on new data!
02/12/2020 06:43:45	sleepydatum	anti_join()	dplyr::anti_join() is my personal favorite.
02/12/2020 01:19:24	dragonflystats		out of curiosity - who are the students? CS? Health Science?
02/12/2020 01:22:25	tladeras		Biostatistics students.
01/12/2020 19:15:14	eulerdiditfirst		Writing your own tidy verse functions from chaining tidy verse functions using {{}} . Seriously feels like a super power sometimes
01/12/2020 18:34:13	pedro_tfonseca		dplyr::near is one of my favorite
01/12/2020 18:28:52	daniellequinn88	uncount()	dplyr::uncount(); tidyr::complete(); tidyr::fill() / replace_na(); stringr::str_detect() / str_which(); lubridate::ymd_hms() and related functions; ggplot2::labs() - so simple, yet under appreciated!
01/12/2020 18:28:52	daniellequinn88	complete()	dplyr::uncount(); tidyr::complete(); tidyr::fill() / replace_na(); stringr::str_detect() / str_which(); lubridate::ymd_hms() and related functions; ggplot2::labs() - so simple, yet under appreciated!
01/12/2020 18:28:52	daniellequinn88	fill()	dplyr::uncount(); tidyr::complete(); tidyr::fill() / replace_na(); stringr::str_detect() / str_which(); lubridate::ymd_hms() and related functions; ggplot2::labs() - so simple, yet under appreciated!
01/12/2020 18:28:52	daniellequinn88	replace_na()	dplyr::uncount(); tidyr::complete(); tidyr::fill() / replace_na(); stringr::str_detect() / str_which(); lubridate::ymd_hms() and related functions; ggplot2::labs() - so simple, yet under appreciated!

At this point, I realized that I just needed to hand-annotate the rest of the tweets, rather than wasting my time trying to parse the remaining cases. So I pulled everything into Excel and annotated the tweets that the regular expression had missed.

Functions by frequency

Here are the function suggestions by frequency. Unsurprisingly, case_when() (which I cover in the main course) has the most suggestions, because it’s so useful. tidyr::pivot_wider() and tidyr::pivot_longer() are also covered in the course.

There are some others which were new to me, and a bit of a surprise, such as coalesce() and fill().

cleaned_tweets <- read_csv("cleaned_tweets.csv") %>% select(-parent_thread) %>%
  mutate(user = paste0("[",user,"](",reply,")")) %>%
  select(-reply)

## 
## -- Column specification --------------------------------------------------------
## cols(
##   date = col_character(),
##   user = col_character(),
##   funcs = col_character(),
##   text = col_character(),
##   reply = col_character(),
##   parent_thread = col_character()
## )

functions_by_freq <- cleaned_tweets %>%
  janitor::tabyl(funcs) %>%
  filter(!is.na(funcs)) %>%
  arrange(desc(n)) 

write_csv(functions_by_freq, "functions_by_frequency.csv")

functions_by_freq %>%
  knitr::kable()

funcs	n	percent	valid_percent
case_when()	16	0.0601504	0.0720721
pivot_longer()	7	0.0263158	0.0315315
pivot_wider()	6	0.0225564	0.0270270
coalesce()	5	0.0187970	0.0225225
fill()	5	0.0187970	0.0225225
across()	4	0.0150376	0.0180180
lag()	4	0.0150376	0.0180180
separate()	4	0.0150376	0.0180180
separate_rows()	4	0.0150376	0.0180180
str_detect()	4	0.0150376	0.0180180
uncount()	4	0.0150376	0.0180180
anti_join()	3	0.0112782	0.0135135
complete()	3	0.0112782	0.0135135
fct_reorder()	3	0.0112782	0.0135135
lead()	3	0.0112782	0.0135135
map()	3	0.0112782	0.0135135
recode()	3	0.0112782	0.0135135
replace_na()	3	0.0112782	0.0135135
slice()	3	0.0112782	0.0135135
str_wrap()	3	0.0112782	0.0135135
{forcats}	2	0.0075188	0.0090090
{tidyeval}	2	0.0075188	0.0090090
add_count()	2	0.0075188	0.0090090
between()	2	0.0075188	0.0090090
breaks_pretty()	2	0.0075188	0.0090090
distinct()	2	0.0075188	0.0090090
enframe()	2	0.0075188	0.0090090
fct_infreq()	2	0.0075188	0.0090090
floor_date()	2	0.0075188	0.0090090
gather()	2	0.0075188	0.0090090
group_indices()	2	0.0075188	0.0090090
group_map()	2	0.0075188	0.0090090
left_join()	2	0.0075188	0.0090090
mutate()	2	0.0075188	0.0090090
n_distinct()	2	0.0075188	0.0090090
nest()	2	0.0075188	0.0090090
partial()	2	0.0075188	0.0090090
pluck()	2	0.0075188	0.0090090
pull()	2	0.0075188	0.0090090
safely()	2	0.0075188	0.0090090
tabyl()	2	0.0075188	0.0090090
unite()	2	0.0075188	0.0090090
unnest()	2	0.0075188	0.0090090
walk()	2	0.0075188	0.0090090
*_join()	1	0.0037594	0.0045045
{janitor}	1	0.0037594	0.0045045
{readr}	1	0.0037594	0.0045045
{tsibble}	1	0.0037594	0.0045045
add_predictions()	1	0.0037594	0.0045045
arrange()	1	0.0037594	0.0045045
as_mapper()	1	0.0037594	0.0045045
ceiling_date()	1	0.0037594	0.0045045
count()	1	0.0037594	0.0045045
crossing()	1	0.0037594	0.0045045
cut_interval()	1	0.0037594	0.0045045
cut_number ()	1	0.0037594	0.0045045
cut_width()	1	0.0037594	0.0045045
deframe()	1	0.0037594	0.0045045
dense_rank()	1	0.0037594	0.0045045
dplyr::first()	1	0.0037594	0.0045045
dplyr::last()	1	0.0037594	0.0045045
drop_na()	1	0.0037594	0.0045045
every()	1	0.0037594	0.0045045
expand_grid()	1	0.0037594	0.0045045
fct_explicit_na()	1	0.0037594	0.0045045
fct_inorder()	1	0.0037594	0.0045045
fct_relevel()	1	0.0037594	0.0045045
filter()	1	0.0037594	0.0045045
first()	1	0.0037594	0.0045045
force_tz()	1	0.0037594	0.0045045
geom_count()	1	0.0037594	0.0045045
glimpse()	1	0.0037594	0.0045045
grepl()	1	0.0037594	0.0045045
group_*()	1	0.0037594	0.0045045
group_by()	1	0.0037594	0.0045045
group_walk()	1	0.0037594	0.0045045
hoist()	1	0.0037594	0.0045045
if()	1	0.0037594	0.0045045
if_else()	1	0.0037594	0.0045045
janitor::clean_names()	1	0.0037594	0.0045045
keep_all()	1	0.0037594	0.0045045
labs()	1	0.0037594	0.0045045
last()	1	0.0037594	0.0045045
left_join	1	0.0037594	0.0045045
make_valid()	1	0.0037594	0.0045045
map_*()	1	0.0037594	0.0045045
map_dfr()	1	0.0037594	0.0045045
mutate_at()	1	0.0037594	0.0045045
mutate_if()	1	0.0037594	0.0045045
n()	1	0.0037594	0.0045045
n_tile()	1	0.0037594	0.0045045
na_if()	1	0.0037594	0.0045045
near()	1	0.0037594	0.0045045
nest_by()	1	0.0037594	0.0045045
none()	1	0.0037594	0.0045045
nth()	1	0.0037594	0.0045045
ntile()	1	0.0037594	0.0045045
parse_*()	1	0.0037594	0.0045045
parse_date_time()	1	0.0037594	0.0045045
paste()	1	0.0037594	0.0045045
possibly()	1	0.0037594	0.0045045
problems()	1	0.0037594	0.0045045
read_csv()	1	0.0037594	0.0045045
read_delim()	1	0.0037594	0.0045045
reduce()	1	0.0037594	0.0045045
relocate()	1	0.0037594	0.0045045
select()	1	0.0037594	0.0045045
skim()	1	0.0037594	0.0045045
slice_max()	1	0.0037594	0.0045045
slice_min()	1	0.0037594	0.0045045
some()	1	0.0037594	0.0045045
spread()	1	0.0037594	0.0045045
stat_summary	1	0.0037594	0.0045045
str_glue()	1	0.0037594	0.0045045
str_match()	1	0.0037594	0.0045045
str_remove()	1	0.0037594	0.0045045
str_trim()	1	0.0037594	0.0045045
str_which()	1	0.0037594	0.0045045
string_extract()	1	0.0037594	0.0045045
summarise()	1	0.0037594	0.0045045
tidy()	1	0.0037594	0.0045045
View()	1	0.0037594	0.0045045
with_groups()	1	0.0037594	0.0045045
with_tz()	1	0.0037594	0.0045045
write_csv()	1	0.0037594	0.0045045
ymd*()	1	0.0037594	0.0045045
ymd_hms()	1	0.0037594	0.0045045
zap_label()	1	0.0037594	0.0045045

Cleaned Tweets and Threads

Here are all of the tweets from this thread (naysayers included). They are roughly in order (longer threads are grouped together).

Here’s a link to the cleaned CSV file

knitr::kable(cleaned_tweets)

date	user	funcs	text
2/12/2020 16:12	NathanKhadaroo	expand_grid()	tidyr::expand_grid() is really useful for creating new datasets to see how fitted models perform on new data!
2/12/2020 6:43	sleepydatum	anti_join()	dplyr::anti_join() is my personal favorite.
2/12/2020 1:19	dragonflystats	NA	out of curiosity - who are the students? CS? Health Science?
2/12/2020 1:22	tladeras	NA	Biostatistics students.
1/12/2020 19:15	eulerdiditfirst	NA	Writing your own tidy verse functions from chaining tidy verse functions using {{}} . Seriously feels like a super power sometimes
1/12/2020 18:34	pedro_tfonseca	near()	dplyr::near is one of my favorite
1/12/2020 18:28	daniellequinn88	uncount()	dplyr::uncount(); tidyr::complete(); tidyr::fill() / replace_na(); stringr::str_detect() / str_which(); lubridate::ymd_hms() and related functions; ggplot2::labs() - so simple, yet under appreciated!
1/12/2020 18:28	daniellequinn88	complete()	dplyr::uncount(); tidyr::complete(); tidyr::fill() / replace_na(); stringr::str_detect() / str_which(); lubridate::ymd_hms() and related functions; ggplot2::labs() - so simple, yet under appreciated!
1/12/2020 18:28	daniellequinn88	fill()	dplyr::uncount(); tidyr::complete(); tidyr::fill() / replace_na(); stringr::str_detect() / str_which(); lubridate::ymd_hms() and related functions; ggplot2::labs() - so simple, yet under appreciated!
1/12/2020 18:28	daniellequinn88	replace_na()	dplyr::uncount(); tidyr::complete(); tidyr::fill() / replace_na(); stringr::str_detect() / str_which(); lubridate::ymd_hms() and related functions; ggplot2::labs() - so simple, yet under appreciated!
1/12/2020 18:28	daniellequinn88	str_detect()	dplyr::uncount(); tidyr::complete(); tidyr::fill() / replace_na(); stringr::str_detect() / str_which(); lubridate::ymd_hms() and related functions; ggplot2::labs() - so simple, yet under appreciated!
1/12/2020 18:28	daniellequinn88	str_which()	dplyr::uncount(); tidyr::complete(); tidyr::fill() / replace_na(); stringr::str_detect() / str_which(); lubridate::ymd_hms() and related functions; ggplot2::labs() - so simple, yet under appreciated!
1/12/2020 18:28	daniellequinn88	ymd_hms()	dplyr::uncount(); tidyr::complete(); tidyr::fill() / replace_na(); stringr::str_detect() / str_which(); lubridate::ymd_hms() and related functions; ggplot2::labs() - so simple, yet under appreciated!
1/12/2020 18:28	daniellequinn88	labs()	dplyr::uncount(); tidyr::complete(); tidyr::fill() / replace_na(); stringr::str_detect() / str_which(); lubridate::ymd_hms() and related functions; ggplot2::labs() - so simple, yet under appreciated!
1/12/2020 17:52	AmeliaMN	separate()	I don’t know if it’s less known or not, by tidyr::separate() is very useful
1/12/2020 18:04	tladeras	pivot_wider()	Yes! Very useful. I do think that {tidyr} in general is less known outside of pivot_wider() and pivot_longer().
1/12/2020 18:04	tladeras	pivot_longer()	Yes! Very useful. I do think that {tidyr} in general is less known outside of pivot_wider() and pivot_longer().
1/12/2020 18:32	ElinWaring	fct_infreq()	Forcats is also not as well known but has tons of handy functions like fct_infreq().
1/12/2020 17:43	rdh_CLE	dplyr::first()	Dplyr:: first and last
1/12/2020 17:43	rdh_CLE	dplyr::last()	Dplyr:: first and last
1/12/2020 16:38	IamBugsPotter	tabyl()	I know it’s not official but several times when starting out I would start collapsing things using dplyr before remembering I could just use janitor::tabyl()
1/12/2020 16:42	tladeras	tabyl()	tabyl() is the best, along with clean_names().
1/12/2020 16:42	tladeras	janitor::clean_names()	tabyl() is the best, along with clean_names().
2/12/2020 1:18	dragonflystats	{janitor}	i love Love LOVE Janitor
1/12/2020 16:29	aosmith16	n_distinct()	I try to squeeze in dplyr::n_distinct() to my basic intro. In my experience mostly useful for data checking/qaqc (i.e., becoming one with your dataset).
1/12/2020 16:43	tladeras	skim()	Yes! Super helpful. I currently use skimr to give students an overview, but that’s super helpful in giving single variable summaries.
1/12/2020 15:47	harlananelson	NA	Why would you want students who are just learning r to write about some obscure function?
1/12/2020 19:31	tladeras	NA	The point is for them to learn on their own and teach others. These aren’t obscure functions, they’re just lesser known ones. ; ; I can’t teach them everything, so the more I can teach the meta-learning, the better they will be off in the future.
1/12/2020 15:37	WireMonkey	{forcats}	I use forcats all the time. It’s especially helpful in ggplot when reordering an axis.
1/12/2020 16:44	tladeras	{forcats}	Definitely! {forcats} has so many useful functions.
1/12/2020 15:29	foppoli_luca	fill()	tidyr::fill() - extremely useful when creating a usable dataset out of a spreadsheet originally built for data entry, in which redundant informations are only reported once at the beginning of the group they refer to, rather than in every row as needed for the analysis.
1/12/2020 15:17	SorensenOystein	uncount()	tidyr::uncount(); tidyr::unnest(); dplyr::ntile()
1/12/2020 15:17	SorensenOystein	unnest()	tidyr::uncount(); tidyr::unnest(); dplyr::ntile()
1/12/2020 15:17	SorensenOystein	ntile()	tidyr::uncount(); tidyr::unnest(); dplyr::ntile()
1/12/2020 14:51	randyboyes	coalesce()	dplyr::coalesce() is so handy when you need it
1/12/2020 14:39	Airrock_TheRed	case_when()	I personally find case_when(), select(), slice(), and separate_rows() very helpful.
1/12/2020 14:39	Airrock_TheRed	select()	I personally find case_when(), select(), slice(), and separate_rows() very helpful.
1/12/2020 14:39	Airrock_TheRed	slice()	I personally find case_when(), select(), slice(), and separate_rows() very helpful.
1/12/2020 14:39	Airrock_TheRed	separate_rows()	I personally find case_when(), select(), slice(), and separate_rows() very helpful.
1/12/2020 14:28	TooSweetGeek	NA	unroll
1/12/2020 14:28	threadreaderapp	NA	Saluti, you can read it here: : Hi Everyone. I’m teaching an #rstats course next quarter. One assignment is to have each student write about… https://t.co/PJH3wqv7aO Share this if you think it’s interesting. 🤖
1/12/2020 14:23	rmkubinec	NA	Check out the dplyr window functions, cummin, cummax, cumany and cumall. They don’t seen useful at first but they can solve really tricky aggregation problems. https://t.co/aDpXqSB2Vx
1/12/2020 14:12	InflationSquare	paste()	%$% makes using paste() easier (among other things). ; I use %T>% View() at the end of %>% chains a lot as well.; dplyr::dense_rank() is another good one that I wouldn’t have come across if I didn’t know the SQL equivalent.; dplyr::group_[keys, rows, indices] are neat as well
1/12/2020 14:12	InflationSquare	View()	%$% makes using paste() easier (among other things). ; I use %T>% View() at the end of %>% chains a lot as well.; dplyr::dense_rank() is another good one that I wouldn’t have come across if I didn’t know the SQL equivalent.; dplyr::group_[keys, rows, indices] are neat as well
1/12/2020 14:12	InflationSquare	dense_rank()	%$% makes using paste() easier (among other things). ; I use %T>% View() at the end of %>% chains a lot as well.; dplyr::dense_rank() is another good one that I wouldn’t have come across if I didn’t know the SQL equivalent.; dplyr::group_[keys, rows, indices] are neat as well
1/12/2020 14:12	InflationSquare	group_*()	%$% makes using paste() easier (among other things). ; I use %T>% View() at the end of %>% chains a lot as well.; dplyr::dense_rank() is another good one that I wouldn’t have come across if I didn’t know the SQL equivalent.; dplyr::group_[keys, rows, indices] are neat as well
1/12/2020 14:10	PRLPoliSci	drop_na()	A lot of great ones in the thread so far! I’d also toss in `drop_na`
1/12/2020 14:09	stateofstats	pivot_wider()	pivot_wider() and pivot_longer(), formerly spread() and gather(). Incredibly useful in converting messy data into something useable
1/12/2020 14:09	stateofstats	pivot_longer()	pivot_wider() and pivot_longer(), formerly spread() and gather(). Incredibly useful in converting messy data into something useable
1/12/2020 14:09	stateofstats	spread()	pivot_wider() and pivot_longer(), formerly spread() and gather(). Incredibly useful in converting messy data into something useable
1/12/2020 14:09	stateofstats	gather()	pivot_wider() and pivot_longer(), formerly spread() and gather(). Incredibly useful in converting messy data into something useable
1/12/2020 14:04	Smith80D	fill()	tidyr::fill() is the one that I find especially useful, for all those imported Excel files with row headings that are merged. Always assumed it existed, but didn’t know its name until a colleague introduced us.
1/12/2020 14:03	wzzerd	case_when()	When you don’t teach case_when, students will go years nesting ifelse like absolute chumps! Alternatively, to relabel discrete data I like to left_join with a crosswalk table so the associations are not hardcoded in the script.
1/12/2020 14:03	wzzerd	left_join	When you don’t teach case_when, students will go years nesting ifelse like absolute chumps! Alternatively, to relabel discrete data I like to left_join with a crosswalk table so the associations are not hardcoded in the script.
1/12/2020 14:02	Dorialexander	lead()	dplyr::lead and dplyr::lag Very practical especially within groups and yet they can be a bit tricky since it obviously raise NAs on first/last rows.
1/12/2020 14:02	Dorialexander	lag()	dplyr::lead and dplyr::lag Very practical especially within groups and yet they can be a bit tricky since it obviously raise NAs on first/last rows.
1/12/2020 14:01	LuisDVerde	enframe()	tibble::enframe()
1/12/2020 13:58	sebvanliempd	pivot_longer()	pivot_longer(); pivot_wider()
1/12/2020 13:58	sebvanliempd	pivot_wider()	pivot_longer(); pivot_wider()
1/12/2020 13:57	kjhealy	case_when()	case_when()
1/12/2020 13:51	Laserhedvig	distinct()	Oh man I use distinct() so much, especially with arrange() before and .keep_all = T
1/12/2020 13:51	Laserhedvig	arrange()	Oh man I use distinct() so much, especially with arrange() before and .keep_all = T
1/12/2020 13:51	Laserhedvig	keep_all()	Oh man I use distinct() so much, especially with arrange() before and .keep_all = T
1/12/2020 10:28	ChloeFouilloux	gather()	tidyr:: gather () has saved me many a time when wrangling unruly data
1/12/2020 10:19	VizMonkey	add_count()	I haven’t seen add_count but that’s a good one. Also keep and discard. And string_extract
1/12/2020 10:19	VizMonkey	string_extract()	I haven’t seen add_count but that’s a good one. Also keep and discard. And string_extract
1/12/2020 9:59	MattAlhonte	separate()	separate is pretty awesome and something I covet from Python, so much so that I made a blog post about writing a hacky Pandas approximation! https://t.co/se5O4nR1sa
1/12/2020 9:35	Stephenpedj	NA	- You should add this founder to your candid interview list!!
1/12/2020 8:04	Guy_F_Sutton	unite()	I find myself using tidyr::unite() a lot to clean messy data - particularly useful for making unique and informative ID’s for each row. coalesce() and fill() are also little known gems! :)
1/12/2020 8:04	Guy_F_Sutton	coalesce()	I find myself using tidyr::unite() a lot to clean messy data - particularly useful for making unique and informative ID’s for each row. coalesce() and fill() are also little known gems! :)
1/12/2020 8:04	Guy_F_Sutton	fill()	I find myself using tidyr::unite() a lot to clean messy data - particularly useful for making unique and informative ID’s for each row. coalesce() and fill() are also little known gems! :)
1/12/2020 7:25	ephorie	NA	Neither of them: https://t.co/Fbw9RHE3YF
1/12/2020 7:22	Amit_Levinson	group_indices()	Found myself using group_indices() several times in the past weeks. Great for giving groups sequential ids.
1/12/2020 7:10	ReillyInnes	pivot_longer()	tidyr::pivot_longer/wider; dplyr::n_distinct; tibble::glimpse; Are some of my most used (as well as %>% )
1/12/2020 7:10	ReillyInnes	n_distinct()	tidyr::pivot_longer/wider; dplyr::n_distinct; tibble::glimpse; Are some of my most used (as well as %>% )
1/12/2020 7:00	bmwiernik	nest()	nest() / unnest()
1/12/2020 7:00	bmwiernik	unnest()	nest() / unnest()
1/12/2020 6:51	BenInquiring	map()	The map() family from {purrr} was a game changer for me. as_mapper() is a nifty little function, but might be a bit advanced.
1/12/2020 6:51	BenInquiring	as_mapper()	The map() family from {purrr} was a game changer for me. as_mapper() is a nifty little function, but might be a bit advanced.
1/12/2020 6:18	brodriguesco	NA	anything from {purrr}
1/12/2020 5:24	vishal_katti	case_when()	case_when() is one of my favourite #rstats dplyr functions. The formula-like syntax needs more explaining usually. This would be a good candidate for your assignment.
1/12/2020 5:37	tladeras	NA	We cover it in class. It’s way too useful to not cover it.
1/12/2020 5:20	EOTWorld28	{tidyeval}	One more thing that is beneficial to users would be Non Standard Evaluation(NSE); How to send columns name/ column names are strings to user functions.; ; I am yet to get my head around sym/syms ! :)
1/12/2020 5:22	tladeras	{tidyeval}	We may get to curly-curly {{ }}, but it will probably be after we work with {purrr}.
1/12/2020 4:46	Breza	partial()	partial() is so useful!
1/12/2020 5:18	tladeras	partial()	pryr::partial()?
1/12/2020 4:40	dh_slone	tidy()	One more and then I’ll shut up. sf is not part of the tidyverse, but it might as well be. Spatial file processing that is completely seamless with dplyr, ggplot, etc. I make all my maps with it these days.; And finally, the tidy() function from broom.
1/12/2020 4:42	tladeras	NA	Yes, {sf} is fantastic! Makes complicated spatial queries and joins much easier.
1/12/2020 6:06	dh_slone	make_valid()	make_valid() is the sf magic wand that solves random polygon slivers that often exist in data.
1/12/2020 4:20	dh_slone	NA	Not tidyverse per se, but lots of these cover the ’verse:; https://t.co/lPLTvRO02z; I keep a binder of these on my desk.
1/12/2020 4:17	dh_slone	between()	I have not seen between() mentioned yet. Are you covering magrittr? %<>% ?
1/12/2020 4:26	tladeras	between()	We will cover {magrittr} - and yes between() can be very useful. ; ; I am a little leery of the assignment pipe, because it can cause mistakes due to overwriting the data frame.
1/12/2020 4:29	dh_slone	NA	I’ve never done that and had to start over from the beginning.
1/12/2020 4:06	EOTWorld28	group_walk()	I just learnt about the function “group_walk”; My requirement was to store my groups into separate csv files and group_walk() helps in just that in just single line of code!!; ; Still face palming myself to learn this so late !!😀
1/12/2020 4:27	tladeras	NA	That’s a nice one! Very cool.
1/12/2020 3:58	cote_energy	slice_max()	slice_max, slice-min
1/12/2020 3:58	cote_energy	slice_min()	slice_max, slice-min
1/12/2020 18:07	cote_energy	n_tile()	Or n_tile!
1/12/2020 3:55	ellis_hughes	parse_*()	The readr parse_* functions. One of the listeners of #TidyX brought it up, and I’ve now used it so many places!!
1/12/2020 3:43	KellyBodwin	add_predictions()	I think broom::add_predictions() is criminally underrated.
1/12/2020 3:09	lisalendway	complete()	complete()
1/12/2020 3:14	tladeras	complete()	Good one! Always forget about complete()
1/12/2020 3:12	alexcookson	NA	Just used this today! So handy!
1/12/2020 2:34	jeremy_data	first()	These were less known to me for a long time, but that may just be my own fault :) so, first() last() and nth() on grouped data that is arranged.
1/12/2020 2:34	jeremy_data	last()	These were less known to me for a long time, but that may just be my own fault :) so, first() last() and nth() on grouped data that is arranged.
1/12/2020 2:34	jeremy_data	nth()	These were less known to me for a long time, but that may just be my own fault :) so, first() last() and nth() on grouped data that is arranged.
1/12/2020 2:21	usansky	anti_join()	dplyr::anti_join(); dplyr::coalesce()
1/12/2020 2:21	usansky	coalesce()	dplyr::anti_join(); dplyr::coalesce()
1/12/2020 2:07	lopierra	mutate()	Not a function, but I recently discovered you can use .before and .after with mutate() to put the new column where you want it, rather than the default all the way at the end.
1/12/2020 1:47	wouldeye125	nest()	Honestly? nest() makes a lot of higher level stuff super easy
1/12/2020 2:07	tladeras	nest_by()	For sure. nest_by()/map() is probably one of the most powerful combos in the tidyverse.
1/12/2020 2:07	tladeras	map()	For sure. nest_by()/map() is probably one of the most powerful combos in the tidyverse.
1/12/2020 1:46	iamericfletcher	every()	every(), some(), and none() from {purrr}.
1/12/2020 1:46	iamericfletcher	some()	every(), some(), and none() from {purrr}.
1/12/2020 1:46	iamericfletcher	none()	every(), some(), and none() from {purrr}.
1/12/2020 1:06	PeeltothePithy	deframe()	tibble::deframe(), tibble::deframe(); coercing a two-column df to named vector, which I prefer immensely to names(df) <- vec_of_names
1/12/2020 1:27	tladeras	NA	This one is super helpful. I didn’t know about this one.
1/12/2020 1:32	grrrck	reduce()	Oh that’s cool! I often use purrr::reduce() for this and feel both clever and sorry for whoever reads my code next
1/12/2020 1:35	PeeltothePithy	left_join()	There are some truly horrific reduce(left_join) statements hanging around in some old code of mine, and I apologize to my erstwhile colleagues.
1/12/2020 1:09	PeeltothePithy	enframe()	also enframe(); ; DAMN YOU LACK OF EDIT
1/12/2020 0:42	CPumarFrohberg	fct_reorder()	forcats::fct_reorder()! Probably quite well-known, but its contribution to ordering levels in a visually intuitive way is not to be underestimated!
1/12/2020 0:36	Bouzoulay	map_dfr()	If it hasn’t been mentioned already, purrr::map_dfr() or dplyr::case_when()
1/12/2020 0:36	Bouzoulay	case_when()	If it hasn’t been mentioned already, purrr::map_dfr() or dplyr::case_when()
1/12/2020 0:15	tw0handt0uch1	crossing()	crossing() is pretty handy and str_glue() can be quite powerful
1/12/2020 0:15	tw0handt0uch1	str_glue()	crossing() is pretty handy and str_glue() can be quite powerful
30/11/2020 23:43:53	Luisfreii	str_trim()	stringr::str_trim() is pretty good
30/11/2020 23:15:32	ludictech	*_join()	The dplyr *_join()s and, well, all of stringr! str_wrap() can be pretty useful for wrapping eg plot titles to a certain length, str_match() or str_detect() are so useful…
30/11/2020 23:15:32	ludictech	str_wrap()	The dplyr *_join()s and, well, all of stringr! str_wrap() can be pretty useful for wrapping eg plot titles to a certain length, str_match() or str_detect() are so useful…
30/11/2020 23:15:32	ludictech	str_match()	The dplyr *_join()s and, well, all of stringr! str_wrap() can be pretty useful for wrapping eg plot titles to a certain length, str_match() or str_detect() are so useful…
30/11/2020 23:15:32	ludictech	str_detect()	The dplyr *_join()s and, well, all of stringr! str_wrap() can be pretty useful for wrapping eg plot titles to a certain length, str_match() or str_detect() are so useful…
30/11/2020 23:17:52	tladeras	str_wrap()	Oh yeah, str_wrap()! I had to use this for tooltips on a plotly plot recently.
30/11/2020 23:11:26	ludictech	read_csv()	readr::read_csv() & write_csv() … (or read_delim() more generally) ?
30/11/2020 23:11:26	ludictech	write_csv()	readr::read_csv() & write_csv() … (or read_delim() more generally) ?
30/11/2020 23:11:26	ludictech	read_delim()	readr::read_csv() & write_csv() … (or read_delim() more generally) ?
30/11/2020 23:13:14	tladeras	{readr}	Certainly. We spend time with both {readr} and {readxl} because I think that loading data is the biggest point of frustration for students.
1/12/2020 3:42	apreshill	problems()	Ooh problems is a good function for importing rx https://t.co/P4ZR57PgOG
1/12/2020 3:48	tladeras	NA	Ooooh. That looks great. Learning so much from this thread!
30/11/2020 23:05:28	ArthurGailes	across()	Don’t know how well known or is because it’s new, but I never go a day without using across() anymore
30/11/2020 23:11:46	tladeras	across()	across() is super useful!
30/11/2020 22:48:18	Trabendo_daze	case_when()	case_when() but that’s pretty well known
30/11/2020 23:23:31	tladeras	NA	There’s a reason it’s well known! Super Useful.
30/11/2020 22:32:24	JKubale	str_detect()	I don’t think str_detect(), case_when(), and zap_label() have been mentioned yet. Highly recommend.
30/11/2020 22:32:24	JKubale	case_when()	I don’t think str_detect(), case_when(), and zap_label() have been mentioned yet. Highly recommend.
30/11/2020 22:32:24	JKubale	zap_label()	I don’t think str_detect(), case_when(), and zap_label() have been mentioned yet. Highly recommend.
30/11/2020 22:43:02	tladeras	NA	Nice! I am a little {haven} illiterate, so happy to include this.
30/11/2020 21:51:49	trentlikesstats	slice()	slice()
30/11/2020 21:45:07	cmdline_tips	unite()	like unite() and separate(). have a post based on ’s talk https://t.co/Qre4ACTRd6 #rstats
30/11/2020 21:45:07	cmdline_tips	separate()	like unite() and separate(). have a post based on ’s talk https://t.co/Qre4ACTRd6 #rstats
30/11/2020 22:08:15	tladeras	NA	Nice! Thanks for putting this together.
30/11/2020 22:13:18	cmdline_tips	NA	the post was written immediately after ’s talk. I believe video of the talk is available now.
30/11/2020 21:38:32	robinson_es	NA	has a good talk https://t.co/s3LBiZ95tR
30/11/2020 23:04:46	jaredlander	NA	And herself has the lesser known stars talk https://t.co/80zdiWhIn4
30/11/2020 21:41:40	ameresv	NA	One of the favorite. Also his screencast are the best. So much things to learn from it
30/11/2020 21:40:34	tladeras	NA	Noice! Thanks, Emily.
30/11/2020 21:34:40	pj_ballantyne	mutate_at()	mutate_at() and mutate_if()
30/11/2020 21:34:40	pj_ballantyne	mutate_if()	mutate_at() and mutate_if()
30/11/2020 21:30:23	nathaneastwood_	with_groups()	There are plenty of lesser known experimental functions in dplyr 1.0.0 like with_groups(). Also some experimental features like .keep in mutate()
30/11/2020 21:30:23	nathaneastwood_	mutate()	There are plenty of lesser known experimental functions in dplyr 1.0.0 like with_groups(). Also some experimental features like .keep in mutate()
30/11/2020 21:28:13	MikeMahoney218	map_*()	purrr (and furrr) in general imo! I don’t know that map_* is more complicated than loops, but I think they’re underutilized. Also tidyr::nest and forcats::fct_reorder
30/11/2020 21:28:13	MikeMahoney218	fct_reorder()	purrr (and furrr) in general imo! I don’t know that map_* is more complicated than loops, but I think they’re underutilized. Also tidyr::nest and forcats::fct_reorder
30/11/2020 21:35:57	tladeras	NA	We will get to {purrr} eventually. I’ve been trying to slowly disentangle the use case so one concept is learned at a time. It’s been tricky. ; ; https://t.co/A6r9jWtsCV
30/11/2020 21:41:48	MikeMahoney218	safely()	TIL about `safely` 😂 I’ve mostly been writing package code recently & am reluctant to include tidyverse dependencies, but boy oh boy do I have some horrifying `tryCatch` calls that could probably stand to be replaced…
30/11/2020 21:47:36	tladeras	safely()	Ha. safely()/possibly() can be super useful and I just learned about it by putting this section together…
30/11/2020 21:47:36	tladeras	possibly()	Ha. safely()/possibly() can be super useful and I just learned about it by putting this section together…
30/11/2020 21:28:03	allawayr	pluck()	I went for way too long not knowing about purrr::pluck()
30/11/2020 21:29:25	allawayr	case_when()	Oh oh and case_when() lets me be super lazy.
30/11/2020 21:19:36	ijeamaka_a	fct_relevel()	Forcats::fct_relevel() and forcats::fct_reorder()
30/11/2020 21:19:36	ijeamaka_a	fct_reorder()	Forcats::fct_relevel() and forcats::fct_reorder()
30/11/2020 21:19:32	piquergaming	hoist()	Hoist() - when you’re dealing with JSON (or dynamodb in my case) it’s a lifesaver.
30/11/2020 21:19:30	chrishanretty	if_else()	if_else (and an example of where you need to use it/where baseR ifelse breaks down)
30/11/2020 21:23:39	tladeras	NA	Super useful!
30/11/2020 21:18:48	maggiedalena123	anti_join()	anti_join()
30/11/2020 21:17:37	JJVenky	across()	mutate(across()) as in; ; data.frame(a=c("q","w","e"), b=c(1,2,-1)) %>% mutate(across(c(b), na_if, -1)); ; or; ; data.frame(a=c("q","w","e"), b=c(1,2,-1)) %>% mutate(across(c(b), ~replace(., .<0,NA)))
30/11/2020 21:17:37	JJVenky	na_if()	mutate(across()) as in; ; data.frame(a=c("q","w","e"), b=c(1,2,-1)) %>% mutate(across(c(b), na_if, -1)); ; or; ; data.frame(a=c("q","w","e"), b=c(1,2,-1)) %>% mutate(across(c(b), ~replace(., .<0,NA)))
30/11/2020 21:28:10	tladeras	across()	Yup, mutate(across()) is great. I do cover {tidyselect} in my {tidyowl} tutorials: https://t.co/pRvC9YJZQG
30/11/2020 21:15:55	aecoppock	coalesce()	dplyr::coalesce()
30/11/2020 21:15:42	_echong	pull()	dplyr::pull(), to emphasize the difference between a vector and a one-column dataframe.
30/11/2020 21:20:04	tladeras	pull()	This is really one of the hardest concepts to teach, but agreed, pull() makes it much more clear.
30/11/2020 21:11:01	apreshill	breaks_pretty()	I think all of the scales package is helpful; ; https://t.co/s5WMZcWwYR; ; I especially like breaks_pretty and the label functions: https://t.co/PtrVT2R7dM
30/11/2020 21:15:05	tladeras	breaks_pretty()	For sure. I usually don’t get to scales when I teach {ggplot2}, but I think it might be worth highlighting the useful cases like breaks_pretty().
30/11/2020 21:20:12	apreshill	group_indices()	Oh and one more! Sometimes dplyr::group_indices is helpful. The actual reference page is less helpful, but this discussion on the implementation is quite good: https://t.co/sD3iauuN9B
30/11/2020 20:53:23	gvwilson	lag()	I am frequently surprised by how few people know about lag()
2/12/2020 5:54	EvenKeely	lead()	lead() and lag() are awesome for working with transect point data.
2/12/2020 5:54	EvenKeely	lag()	lead() and lag() are awesome for working with transect point data.
30/11/2020 20:57:51	tladeras	lead()	Agreed. The documentation/examples are a little terse for lead()/lag(), which may be why few people use them.
30/11/2020 20:57:51	tladeras	lag()	Agreed. The documentation/examples are a little terse for lead()/lag(), which may be why few people use them.
30/11/2020 21:07:38	apreshill	NA	I think in general the window functions could use some love https://t.co/8z9DdvFQgt
30/11/2020 21:10:02	tladeras	NA	Agreed! I think the window functions are really useful.
30/11/2020 20:49:45	kaiz_p	left_join()	left_join() and other joins, separate(), recode(), pivot_longer(), pivot_wider(), filter()
30/11/2020 20:49:45	kaiz_p	separate()	left_join() and other joins, separate(), recode(), pivot_longer(), pivot_wider(), filter()
30/11/2020 20:49:45	kaiz_p	recode()	left_join() and other joins, separate(), recode(), pivot_longer(), pivot_wider(), filter()
30/11/2020 20:49:45	kaiz_p	pivot_longer()	left_join() and other joins, separate(), recode(), pivot_longer(), pivot_wider(), filter()
30/11/2020 20:49:45	kaiz_p	pivot_wider()	left_join() and other joins, separate(), recode(), pivot_longer(), pivot_wider(), filter()
30/11/2020 20:49:45	kaiz_p	filter()	left_join() and other joins, separate(), recode(), pivot_longer(), pivot_wider(), filter()
30/11/2020 20:59:28	tladeras	recode()	All very useful! I sometimes do get confused over whether to teach recode() vs case_when() - they’re both useful, but the use cases are different.
30/11/2020 20:59:28	tladeras	case_when()	All very useful! I sometimes do get confused over whether to teach recode() vs case_when() - they’re both useful, but the use cases are different.
30/11/2020 21:08:24	kaiz_p	case_when()	Good point! ; I previously used case_when() for all my recoding needs, but then I discovered recode() and it’s so much easier. Less code = fewer mistakes!
30/11/2020 21:08:24	kaiz_p	recode()	Good point! ; I previously used case_when() for all my recoding needs, but then I discovered recode() and it’s so much easier. Less code = fewer mistakes!
30/11/2020 21:09:25	kaiz_p	case_when()	I now use case_when() mostly when I’m looking for a string – grepl() or when working across multiple columns – case_when(A == "a" & B == "b" ~ "ab")
30/11/2020 21:09:25	kaiz_p	grepl()	I now use case_when() mostly when I’m looking for a string – grepl() or when working across multiple columns – case_when(A == "a" & B == "b" ~ "ab")
30/11/2020 21:09:25	kaiz_p	case_when()	I now use case_when() mostly when I’m looking for a string – grepl() or when working across multiple columns – case_when(A == "a" & B == "b" ~ "ab")
2/12/2020 0:38	EvenKeely	NA	(TRUE ~ You missed one)
2/12/2020 0:41	tladeras	NA	I feel seen.
30/11/2020 21:11:16	tladeras	case_when()	Very true. It can be hard to see which cases you missed when you write a case_when() statement, much like writing nested if() statements.
30/11/2020 21:11:16	tladeras	if()	Very true. It can be hard to see which cases you missed when you write a case_when() statement, much like writing nested if() statements.
30/11/2020 20:43:48	Corey_Yanofsky	replace_na()	dplyr::arrange & helper dplyr::desc; dplyr::coalesce; tidyr::replace_na; ; https://t.co/h9ew04PYcU
30/11/2020 20:46:12	tladeras	coalesce()	Ha! ; ; And coalesce()/replace_na() are great. Adding them to the list.
30/11/2020 20:46:12	tladeras	replace_na()	Ha! ; ; And coalesce()/replace_na() are great. Adding them to the list.
30/11/2020 20:32:01	Recon1974	NA	do you teach tidyverse directly or base r first?
30/11/2020 20:34:58	tladeras	NA	I teach just enough base R for them to understand vectors, functions, and data.frames. ; ; You can see the previous class here: https://t.co/4WweuqMSl8; ; The rest is mostly tidyverse, except for the cases when they will encounter base-R a lot.
30/11/2020 20:40:42	tladeras	NA	I know this is probably controversial, but my goal is to get them up and working usefully as quickly as possible, rather than teach a standard programming course, which you have to learn a lot of things before you do something useful.
30/11/2020 20:30:29	BeltzEcology	floor_date()	lubridate::floor_date; ; I have recently become a HUGE fan!
1/12/2020 19:17	eulerdiditfirst	{tsibble}	Shout out to tsibble; if you’re using times series data you should def check it out
30/11/2020 21:18:36	JenRichmondPhD	parse_date_time()	also lubridate::parse_date_time() is kinda magic
30/11/2020 21:49:37	tladeras	NA	It is definitely magic.
1/12/2020 4:12	dh_slone	ymd*()	most of lubridate is magical. I use the heck out of ymd…() and similar, with_tz() and force_tz() take care of my biggest headaches, floor_date(), and ceiling_date(), etc.
1/12/2020 4:12	dh_slone	with_tz()	most of lubridate is magical. I use the heck out of ymd…() and similar, with_tz() and force_tz() take care of my biggest headaches, floor_date(), and ceiling_date(), etc.
1/12/2020 4:12	dh_slone	force_tz()	most of lubridate is magical. I use the heck out of ymd…() and similar, with_tz() and force_tz() take care of my biggest headaches, floor_date(), and ceiling_date(), etc.
1/12/2020 4:12	dh_slone	floor_date()	most of lubridate is magical. I use the heck out of ymd…() and similar, with_tz() and force_tz() take care of my biggest headaches, floor_date(), and ceiling_date(), etc.
1/12/2020 4:12	dh_slone	ceiling_date()	most of lubridate is magical. I use the heck out of ymd…() and similar, with_tz() and force_tz() take care of my biggest headaches, floor_date(), and ceiling_date(), etc.
30/11/2020 20:35:21	tladeras	NA	Ooh, this is great! Thanks!
30/11/2020 20:37:36	BeltzEcology	NA	Welcome!
30/11/2020 20:15:42	emilmalta	uncount()	I use these all the time:; tidyr::uncount(); tidyr::separate and tidyr::separate_rows(); forcats::fct_inorder(); forcats::fct_infreq()
30/11/2020 20:15:42	emilmalta	separate_rows()	I use these all the time:; tidyr::uncount(); tidyr::separate and tidyr::separate_rows(); forcats::fct_inorder(); forcats::fct_infreq()
30/11/2020 20:15:42	emilmalta	fct_inorder()	I use these all the time:; tidyr::uncount(); tidyr::separate and tidyr::separate_rows(); forcats::fct_inorder(); forcats::fct_infreq()
30/11/2020 20:15:42	emilmalta	fct_infreq()	I use these all the time:; tidyr::uncount(); tidyr::separate and tidyr::separate_rows(); forcats::fct_inorder(); forcats::fct_infreq()
30/11/2020 23:49:22	samclifford	fct_explicit_na()	Been using forcats::fct_explicit_na() of late.
30/11/2020 20:17:34	tladeras	separate_rows()	Yes, we cover tidyr a little bit. These are great suggestions, especially separate_rows()
30/11/2020 20:37:29	GenomeGal	separate_rows()	Yes! Separate_rows is my life - so useful!!
30/11/2020 20:21:21	emilmalta	uncount()	One thing that really made everything click for me, when learning tidy data, was that uncount() is in tidyr, and not dplyr.; ; It’s kinda subtle, but it was the thing that made me realize that tidying!=transforming.
30/11/2020 20:31:05	tladeras	NA	Yes, this distinction escaped me at first. I guess it’s like the form versus content distinction.
30/11/2020 20:15:12	kaija_bean	pivot_longer()	pivot_longer() and pivot_wider() are great!
30/11/2020 20:15:12	kaija_bean	pivot_wider()	pivot_longer() and pivot_wider() are great!
30/11/2020 20:16:33	tladeras	NA	They are great!
30/11/2020 20:15:41	kaija_bean	NA	And although they’re basically inverses of each other, each one had different arguments and different things to pay attention to, so I could easily see one student doing each of them without too much overlap.
30/11/2020 20:11:38	JayUlfelder	case_when()	A few that come to mind: dplyr::case_when(), purrr::map(), dplyr::group_map(), purrr::walk(), and purrr::pluck().
30/11/2020 20:11:38	JayUlfelder	map()	A few that come to mind: dplyr::case_when(), purrr::map(), dplyr::group_map(), purrr::walk(), and purrr::pluck().
30/11/2020 20:11:38	JayUlfelder	group_map()	A few that come to mind: dplyr::case_when(), purrr::map(), dplyr::group_map(), purrr::walk(), and purrr::pluck().
30/11/2020 20:11:38	JayUlfelder	walk()	A few that come to mind: dplyr::case_when(), purrr::map(), dplyr::group_map(), purrr::walk(), and purrr::pluck().
30/11/2020 20:11:38	JayUlfelder	pluck()	A few that come to mind: dplyr::case_when(), purrr::map(), dplyr::group_map(), purrr::walk(), and purrr::pluck().
30/11/2020 20:13:52	tladeras	case_when()	Yup, these are really useful! I do cover case_when() because it’s so universally useful. ; ; And I’m going to cover {purrr} a little bit. group_map() and walk() are great suggestions.
30/11/2020 20:13:52	tladeras	group_map()	Yup, these are really useful! I do cover case_when() because it’s so universally useful. ; ; And I’m going to cover {purrr} a little bit. group_map() and walk() are great suggestions.
30/11/2020 20:13:52	tladeras	walk()	Yup, these are really useful! I do cover case_when() because it’s so universally useful. ; ; And I’m going to cover {purrr} a little bit. group_map() and walk() are great suggestions.
30/11/2020 20:06:04	ivelasq3	fill()	Is tidyr::fill() lesser known? [regardless I love it]; ; Sharing just in case you haven’t seen this! https://t.co/cpJfD56rxB
30/11/2020 20:06:44	tladeras	NA	Ooh! This is perfect. Thanks!
30/11/2020 19:42:31	francisco_yira	cut_width()	ggplot2::cut_width, cut_number and cut_interval to transform continuous variables into discrete bins
30/11/2020 19:42:31	francisco_yira	cut_number()	ggplot2::cut_width, cut_number and cut_interval to transform continuous variables into discrete bins
30/11/2020 19:42:31	francisco_yira	cut_interval()	ggplot2::cut_width, cut_number and cut_interval to transform continuous variables into discrete bins
30/11/2020 19:43:33	tladeras	NA	Ah, very interesting!
30/11/2020 19:40:55	tladeras	relocate()	Here’s the list so far:; ; - dplyr::relocate(); - dplyr::count() / n(); - dplyr::distinct(); - dplyr::glimpse(); - dplyr::slice(); - ggplot2::geom_count()
30/11/2020 19:40:55	tladeras	count()	Here’s the list so far:; ; - dplyr::relocate(); - dplyr::count() / n(); - dplyr::distinct(); - dplyr::glimpse(); - dplyr::slice(); - ggplot2::geom_count()
30/11/2020 19:40:55	tladeras	n()	Here’s the list so far:; ; - dplyr::relocate(); - dplyr::count() / n(); - dplyr::distinct(); - dplyr::glimpse(); - dplyr::slice(); - ggplot2::geom_count()
30/11/2020 19:40:55	tladeras	distinct()	Here’s the list so far:; ; - dplyr::relocate(); - dplyr::count() / n(); - dplyr::distinct(); - dplyr::glimpse(); - dplyr::slice(); - ggplot2::geom_count()
30/11/2020 19:40:55	tladeras	glimpse()	Here’s the list so far:; ; - dplyr::relocate(); - dplyr::count() / n(); - dplyr::distinct(); - dplyr::glimpse(); - dplyr::slice(); - ggplot2::geom_count()
30/11/2020 19:40:55	tladeras	slice()	Here’s the list so far:; ; - dplyr::relocate(); - dplyr::count() / n(); - dplyr::distinct(); - dplyr::glimpse(); - dplyr::slice(); - ggplot2::geom_count()
30/11/2020 19:40:55	tladeras	geom_count()	Here’s the list so far:; ; - dplyr::relocate(); - dplyr::count() / n(); - dplyr::distinct(); - dplyr::glimpse(); - dplyr::slice(); - ggplot2::geom_count()
1/12/2020 17:06	robbins_ave	add_count()	dplyr::add_count() is often useful
1/12/2020 14:23	delaBJL	stat_summary()	ggplot2::stat_summary is fun; ; I use a lot of the stringr functions, str_remove(), str_detect(); ; pivot_longer() and pivot_wider() are fairly simple to grok and are incredibly useful
1/12/2020 14:23	delaBJL	str_remove()	ggplot2::stat_summary is fun; ; I use a lot of the stringr functions, str_remove(), str_detect(); ; pivot_longer() and pivot_wider() are fairly simple to grok and are incredibly useful
1/12/2020 14:23	delaBJL	str_detect()	ggplot2::stat_summary is fun; ; I use a lot of the stringr functions, str_remove(), str_detect(); ; pivot_longer() and pivot_wider() are fairly simple to grok and are incredibly useful
1/12/2020 14:23	delaBJL	pivot_longer()	ggplot2::stat_summary is fun; ; I use a lot of the stringr functions, str_remove(), str_detect(); ; pivot_longer() and pivot_wider() are fairly simple to grok and are incredibly useful
1/12/2020 14:23	delaBJL	pivot_wider()	ggplot2::stat_summary is fun; ; I use a lot of the stringr functions, str_remove(), str_detect(); ; pivot_longer() and pivot_wider() are fairly simple to grok and are incredibly useful
1/12/2020 15:37	toeb18	str_wrap()	str_wrap is also a fantastic one
1/12/2020 13:59	GMFranceschini	case_when()	case_when(), group_by()/summarise()
1/12/2020 13:59	GMFranceschini	group_by()	case_when(), group_by()/summarise()
1/12/2020 13:59	GMFranceschini	summarise()	case_when(), group_by()/summarise()
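
Several of the suggestions in the table fit together naturally, so here is a minimal sketch that combines tidyr::fill(), dplyr::na_if(), dplyr::lag(), and dplyr::case_when(). The `transect` data, its column names, and the use of -1 as a "missing" sentinel are all made up for illustration; none of it comes from the thread.

```r
library(dplyr)
library(tidyr)

# Hypothetical data-entry style sheet: the site is only written on the
# first row of each group, and -1 stands in for a missing depth.
transect <- tibble(
  site  = c("A", NA, NA, "B", NA),
  point = c(1, 2, 3, 1, 2),
  depth = c(0.5, 1.2, 2.0, 0.3, -1)
)

tidy_transect <- transect %>%
  fill(site) %>%                         # carry the last non-missing site downward
  mutate(depth = na_if(depth, -1)) %>%   # turn the -1 sentinel into NA
  group_by(site) %>%
  mutate(
    # window function: difference from the previous point within the same site
    depth_change = depth - lag(depth),
    zone = case_when(
      depth < 1  ~ "shallow",
      depth >= 1 ~ "deep",
      TRUE       ~ "unknown"             # catch-all for rows no condition matched (NA depth)
    )
  ) %>%
  ungroup()

tidy_transect
```

The `TRUE ~ "unknown"` line is the catch-all joked about in the thread: without it, the row with a missing depth would quietly get an NA zone.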

Source Code and Data

Feel free to use and modify.

RMarkdown file used to generate this post
Python Twitter Scraper (by Giovanni Mellini). I used this because there wasn’t a ready-made recipe in rtweet for extracting replies: you have to use recursion to collect all of the thread replies that belong to a tweet, and this script was easy to modify.
Cleaned Tweets File (CSV)
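
To show what the recursion mentioned above looks like, here is a rough base R sketch. It is not the code from the Python scraper and it does not call rtweet or the Twitter API; the `tweets` data frame, its `id` and `reply_to` columns, and the `collect_replies()` helper are all hypothetical.

```r
# Hypothetical reply table: each row is a tweet, and reply_to holds the id
# of the tweet it answers (NA for the original tweet).
tweets <- data.frame(
  id       = c("1", "2", "3", "4", "5"),
  reply_to = c(NA, "1", "2", "1", "9"),
  stringsAsFactors = FALSE
)

# Recursively collect the ids of every reply underneath root_id.
collect_replies <- function(root_id, tweets) {
  # direct replies to this tweet
  children <- tweets$id[!is.na(tweets$reply_to) & tweets$reply_to == root_id]
  # recurse into each direct reply; with no children, lapply() does nothing
  # and the function returns an empty character vector (the base case)
  descendants <- unlist(lapply(children, collect_replies, tweets = tweets))
  c(children, descendants)
}

thread_ids <- collect_replies("1", tweets)
thread_ids
```

Tweet "5" replies to a tweet outside the thread, so it is never reached. A real scraper also has to fetch each level of replies from the API, which is the part the Python script handles.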

Thank You

This post is my thank-you to everyone who contributed to this thread. Thank you!