Data Preparation and Management with R
October 7, 2024
arrange()
arrange() sorts observations (rows) by the values of one or more variables.
desc()
# Arrange observations by `dep_delay` in descending order.
flights |>
  arrange(desc(dep_delay))

# Equivalent for a numeric variable: negate it with `-`.
flights |>
  arrange(-dep_delay)
Use desc(VARIABLE) to re-order by a VARIABLE in descending order.
A minus sign (-) before a numeric variable (-NUMERIC_VARIABLE) also works.
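To see the two descending-order spellings side by side, here is a minimal sketch. It assumes the dplyr package is installed and uses a small made-up data frame, toy_flights, as a hypothetical stand-in for flights; the values are invented for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for `flights` (values are made up).
toy_flights <- tibble(
  carrier   = c("AA", "UA", "DL", "UA"),
  dep_delay = c(5, 120, -3, 42)
)

# Ascending order is the default.
ascending <- toy_flights |>
  arrange(dep_delay)

# Descending order, written two ways.
by_desc <- toy_flights |>
  arrange(desc(dep_delay))
by_minus <- toy_flights |>
  arrange(-dep_delay)

# desc() also works on a character variable, where a unary minus would fail.
by_carrier <- toy_flights |>
  arrange(desc(carrier))
```

For numeric variables the two forms should give the same ordering, so desc() is mainly the safer habit because it does not depend on the variable's type.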
distinct()
distinct() finds all the unique observations (rows) in a data.frame.
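A small sketch of distinct() in action, assuming dplyr is installed. The data frame toy_routes is hypothetical and its rows are invented; the second call, which passes a variable name, is included as an assumption about what is useful here rather than something the slides spell out.

```r
library(dplyr)

# Hypothetical toy data: rows 1 and 3 are exact duplicates.
toy_routes <- tibble(
  origin = c("JFK", "LGA", "JFK", "JFK"),
  dest   = c("LAX", "ORD", "LAX", "SFO")
)

# Keep one copy of each unique row.
unique_routes <- toy_routes |>
  distinct()

# Passing a variable name compares rows on that variable only
# and returns just that variable.
unique_origins <- toy_routes |>
  distinct(origin)
```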
select()
It’s not uncommon to get datasets with hundreds or thousands of variables.
select() allows us to narrow in on the variables we’re actually interested in.
We can select variables by their names.
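As an illustration of selecting by name, the sketch below assumes dplyr is installed and uses a hypothetical four-variable data frame, toy_flights, with made-up values.

```r
library(dplyr)

# Hypothetical toy data (values are made up).
toy_flights <- tibble(
  year      = c(2013, 2013),
  carrier   = c("AA", "UA"),
  dep_delay = c(5, 120),
  arr_delay = c(10, 95)
)

# Keep only the named variables, in the order they are listed.
delays <- toy_flights |>
  select(carrier, dep_delay, arr_delay)
```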
With select(-VARIABLES), we can remove variables.
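Removing variables can be sketched the same way. This assumes dplyr is installed; toy_flights is a hypothetical data frame with invented values, and wrapping several names in c() is one common way to drop more than one variable at once.

```r
library(dplyr)

# Hypothetical toy data (values are made up).
toy_flights <- tibble(
  year      = c(2013, 2013),
  carrier   = c("AA", "UA"),
  dep_delay = c(5, 120),
  arr_delay = c(10, 95)
)

# Drop a single variable.
no_year <- toy_flights |>
  select(-year)

# Drop several variables at once.
no_delays <- toy_flights |>
  select(-c(dep_delay, arr_delay))
```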
rename()
rename() can be used to rename variables:
DATA_FRAME |> rename(NEW_VARIABLE = EXISTING_VARIABLE)
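A concrete instance of the template above, as a minimal sketch that assumes dplyr is installed. The data frame toy_flights and the new name departure_delay are hypothetical.

```r
library(dplyr)

# Hypothetical toy data (values are made up).
toy_flights <- tibble(
  carrier   = c("AA", "UA"),
  dep_delay = c(5, 120)
)

# The new name goes on the left, the existing name on the right.
renamed <- toy_flights |>
  rename(departure_delay = dep_delay)
```

The renamed variable keeps its position and its values; only the name changes.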