Data Transformation

Author

YOUR NAME

Published

January 25, 2026

1 Getting Started

This document introduces the dplyr workflow for cleaning, transforming, and summarizing data in R.

You will learn how to:

  • select columns with select()
  • filter rows with filter()
  • sort data with arrange()
  • create new variables with mutate()
  • summarize data with summarise()
  • group operations with group_by()
  • combine datasets with joins


2 Setup

2.1 Install (one-time)

# install.packages("tidyverse")

2.2 Load packages

library(tidyverse)


3 A Quick Look at a Dataset

We will use the mpg dataset, which ships with ggplot2 and is available once the tidyverse is loaded.

mpg |> glimpse()
Rows: 234
Columns: 11
$ manufacturer <chr> "audi", "audi", "audi", "audi", "audi", "audi", "audi", "…
$ model        <chr> "a4", "a4", "a4", "a4", "a4", "a4", "a4", "a4 quattro", "…
$ displ        <dbl> 1.8, 1.8, 2.0, 2.0, 2.8, 2.8, 3.1, 1.8, 1.8, 2.0, 2.0, 2.…
$ year         <int> 1999, 1999, 2008, 2008, 1999, 1999, 2008, 1999, 1999, 200…
$ cyl          <int> 4, 4, 4, 4, 6, 6, 6, 4, 4, 4, 4, 6, 6, 6, 6, 6, 6, 8, 8, …
$ trans        <chr> "auto(l5)", "manual(m5)", "manual(m6)", "auto(av)", "auto…
$ drv          <chr> "f", "f", "f", "f", "f", "f", "f", "4", "4", "4", "4", "4…
$ cty          <int> 18, 21, 20, 21, 16, 18, 18, 18, 16, 20, 19, 15, 17, 17, 1…
$ hwy          <int> 29, 29, 31, 30, 26, 26, 27, 26, 25, 28, 27, 25, 25, 25, 2…
$ fl           <chr> "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p…
$ class        <chr> "compact", "compact", "compact", "compact", "compact", "c…


4 The Pipe Operator |>

The pipe passes the result of the expression on its left as the first argument of the function on its right.

mpg |>
  select(manufacturer, model, year)
# A tibble: 234 × 3
   manufacturer model       year
   <chr>        <chr>      <int>
 1 audi         a4          1999
 2 audi         a4          1999
 3 audi         a4          2008
 4 audi         a4          2008
 5 audi         a4          1999
 6 audi         a4          1999
 7 audi         a4          2008
 8 audi         a4 quattro  1999
 9 audi         a4 quattro  1999
10 audi         a4 quattro  2008
# ℹ 224 more rows

This is equivalent to:

select(mpg, manufacturer, model, year)
# A tibble: 234 × 3
   manufacturer model       year
   <chr>        <chr>      <int>
 1 audi         a4          1999
 2 audi         a4          1999
 3 audi         a4          2008
 4 audi         a4          2008
 5 audi         a4          1999
 6 audi         a4          1999
 7 audi         a4          2008
 8 audi         a4 quattro  1999
 9 audi         a4 quattro  1999
10 audi         a4 quattro  2008
# ℹ 224 more rows

✅ Use pipes to write code in a clear, step-by-step style.
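
To see why the step-by-step style helps, here is a minimal sketch comparing a nested call with the same steps written as a pipeline. It uses filter() and arrange(), which are covered in later sections; the hwy > 30 cutoff and the names nested and piped are chosen only for illustration.

```r
library(tidyverse)

# Nested calls: must be read from the inside out.
nested <- arrange(
  filter(select(mpg, manufacturer, model, hwy), hwy > 30),
  desc(hwy)
)

# Piped calls: the same three steps, read top to bottom.
piped <- mpg |>
  select(manufacturer, model, hwy) |>
  filter(hwy > 30) |>
  arrange(desc(hwy))

# Compare the two results.
all.equal(nested, piped)
```

Both forms run the same functions in the same order; the pipeline simply lists them in the order they happen.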


5 select() (Choose Columns)

5.1 Select a few columns

mpg |>
  select(manufacturer, model, displ, hwy)
# A tibble: 234 × 4
   manufacturer model      displ   hwy
   <chr>        <chr>      <dbl> <int>
 1 audi         a4           1.8    29
 2 audi         a4           1.8    29
 3 audi         a4           2      31
 4 audi         a4           2      30
 5 audi         a4           2.8    26
 6 audi         a4           2.8    26
 7 audi         a4           3.1    27
 8 audi         a4 quattro   1.8    26
 9 audi         a4 quattro   1.8    25
10 audi         a4 quattro   2      28
# ℹ 224 more rows

5.2 Select a range of columns

mpg |>
  select(manufacturer:year)
# A tibble: 234 × 4
   manufacturer model      displ  year
   <chr>        <chr>      <dbl> <int>
 1 audi         a4           1.8  1999
 2 audi         a4           1.8  1999
 3 audi         a4           2    2008
 4 audi         a4           2    2008
 5 audi         a4           2.8  1999
 6 audi         a4           2.8  1999
 7 audi         a4           3.1  2008
 8 audi         a4 quattro   1.8  1999
 9 audi         a4 quattro   1.8  1999
10 audi         a4 quattro   2    2008
# ℹ 224 more rows

5.3 Remove columns

mpg |>
  select(-cty, -hwy)
# A tibble: 234 × 9
   manufacturer model      displ  year   cyl trans      drv   fl    class  
   <chr>        <chr>      <dbl> <int> <int> <chr>      <chr> <chr> <chr>  
 1 audi         a4           1.8  1999     4 auto(l5)   f     p     compact
 2 audi         a4           1.8  1999     4 manual(m5) f     p     compact
 3 audi         a4           2    2008     4 manual(m6) f     p     compact
 4 audi         a4           2    2008     4 auto(av)   f     p     compact
 5 audi         a4           2.8  1999     6 auto(l5)   f     p     compact
 6 audi         a4           2.8  1999     6 manual(m5) f     p     compact
 7 audi         a4           3.1  2008     6 auto(av)   f     p     compact
 8 audi         a4 quattro   1.8  1999     4 manual(m5) 4     p     compact
 9 audi         a4 quattro   1.8  1999     4 auto(l5)   4     p     compact
10 audi         a4 quattro   2    2008     4 manual(m6) 4     p     compact
# ℹ 224 more rows


6 filter() (Choose Rows)

6.1 Filter by a condition

mpg |>
  filter(class == "compact")
# A tibble: 47 × 11
   manufacturer model      displ  year   cyl trans drv     cty   hwy fl    class
   <chr>        <chr>      <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
 1 audi         a4           1.8  1999     4 auto… f        18    29 p     comp…
 2 audi         a4           1.8  1999     4 manu… f        21    29 p     comp…
 3 audi         a4           2    2008     4 manu… f        20    31 p     comp…
 4 audi         a4           2    2008     4 auto… f        21    30 p     comp…
 5 audi         a4           2.8  1999     6 auto… f        16    26 p     comp…
 6 audi         a4           2.8  1999     6 manu… f        18    26 p     comp…
 7 audi         a4           3.1  2008     6 auto… f        18    27 p     comp…
 8 audi         a4 quattro   1.8  1999     4 manu… 4        18    26 p     comp…
 9 audi         a4 quattro   1.8  1999     4 auto… 4        16    25 p     comp…
10 audi         a4 quattro   2    2008     4 manu… 4        20    28 p     comp…
# ℹ 37 more rows

6.2 Multiple conditions with AND

mpg |>
  filter(class == "compact", year == 2008)
# A tibble: 22 × 11
   manufacturer model      displ  year   cyl trans drv     cty   hwy fl    class
   <chr>        <chr>      <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
 1 audi         a4           2    2008     4 manu… f        20    31 p     comp…
 2 audi         a4           2    2008     4 auto… f        21    30 p     comp…
 3 audi         a4           3.1  2008     6 auto… f        18    27 p     comp…
 4 audi         a4 quattro   2    2008     4 manu… 4        20    28 p     comp…
 5 audi         a4 quattro   2    2008     4 auto… 4        19    27 p     comp…
 6 audi         a4 quattro   3.1  2008     6 auto… 4        17    25 p     comp…
 7 audi         a4 quattro   3.1  2008     6 manu… 4        15    25 p     comp…
 8 subaru       impreza a…   2.5  2008     4 auto… 4        20    25 p     comp…
 9 subaru       impreza a…   2.5  2008     4 auto… 4        20    27 r     comp…
10 subaru       impreza a…   2.5  2008     4 manu… 4        19    25 p     comp…
# ℹ 12 more rows

6.3 OR conditions using %in%

mpg |>
  filter(class %in% c("compact", "suv"))
# A tibble: 109 × 11
   manufacturer model      displ  year   cyl trans drv     cty   hwy fl    class
   <chr>        <chr>      <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
 1 audi         a4           1.8  1999     4 auto… f        18    29 p     comp…
 2 audi         a4           1.8  1999     4 manu… f        21    29 p     comp…
 3 audi         a4           2    2008     4 manu… f        20    31 p     comp…
 4 audi         a4           2    2008     4 auto… f        21    30 p     comp…
 5 audi         a4           2.8  1999     6 auto… f        16    26 p     comp…
 6 audi         a4           2.8  1999     6 manu… f        18    26 p     comp…
 7 audi         a4           3.1  2008     6 auto… f        18    27 p     comp…
 8 audi         a4 quattro   1.8  1999     4 manu… 4        18    26 p     comp…
 9 audi         a4 quattro   1.8  1999     4 auto… 4        16    25 p     comp…
10 audi         a4 quattro   2    2008     4 manu… 4        20    28 p     comp…
# ℹ 99 more rows
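
The %in% filter above is shorthand for chaining equality tests with the OR operator |. The sketch below writes both versions so they can be compared; the names with_in and with_or are illustrative only.

```r
library(tidyverse)

# OR written with %in%: class matches any value in the vector.
with_in <- mpg |>
  filter(class %in% c("compact", "suv"))

# The same OR written out with the | operator.
with_or <- mpg |>
  filter(class == "compact" | class == "suv")

# Both filters should keep the same number of rows.
nrow(with_in)
nrow(with_or)
```

The %in% form tends to stay readable as the list of accepted values grows.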


7 arrange() (Sort Rows)

7.1 Sort ascending

mpg |>
  arrange(hwy)
# A tibble: 234 × 11
   manufacturer model      displ  year   cyl trans drv     cty   hwy fl    class
   <chr>        <chr>      <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
 1 dodge        dakota pi…   4.7  2008     8 auto… 4         9    12 e     pick…
 2 dodge        durango 4…   4.7  2008     8 auto… 4         9    12 e     suv  
 3 dodge        ram 1500 …   4.7  2008     8 auto… 4         9    12 e     pick…
 4 dodge        ram 1500 …   4.7  2008     8 manu… 4         9    12 e     pick…
 5 jeep         grand che…   4.7  2008     8 auto… 4         9    12 e     suv  
 6 chevrolet    k1500 tah…   5.3  2008     8 auto… 4        11    14 e     suv  
 7 jeep         grand che…   6.1  2008     8 auto… 4        11    14 p     suv  
 8 chevrolet    c1500 sub…   5.3  2008     8 auto… r        11    15 e     suv  
 9 chevrolet    k1500 tah…   5.7  1999     8 auto… 4        11    15 r     suv  
10 dodge        dakota pi…   5.2  1999     8 auto… 4        11    15 r     pick…
# ℹ 224 more rows

7.2 Sort descending

mpg |>
  arrange(desc(hwy))
# A tibble: 234 × 11
   manufacturer model      displ  year   cyl trans drv     cty   hwy fl    class
   <chr>        <chr>      <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
 1 volkswagen   jetta        1.9  1999     4 manu… f        33    44 d     comp…
 2 volkswagen   new beetle   1.9  1999     4 manu… f        35    44 d     subc…
 3 volkswagen   new beetle   1.9  1999     4 auto… f        29    41 d     subc…
 4 toyota       corolla      1.8  2008     4 manu… f        28    37 r     comp…
 5 honda        civic        1.8  2008     4 auto… f        25    36 r     subc…
 6 honda        civic        1.8  2008     4 auto… f        24    36 c     subc…
 7 toyota       corolla      1.8  1999     4 manu… f        26    35 r     comp…
 8 toyota       corolla      1.8  2008     4 auto… f        26    35 r     comp…
 9 honda        civic        1.8  2008     4 manu… f        26    34 r     subc…
10 honda        civic        1.6  1999     4 manu… f        28    33 r     subc…
# ℹ 224 more rows

7.3 Sort by multiple variables

mpg |>
  arrange(class, desc(hwy))
# A tibble: 234 × 11
   manufacturer model    displ  year   cyl trans   drv     cty   hwy fl    class
   <chr>        <chr>    <dbl> <int> <int> <chr>   <chr> <int> <int> <chr> <chr>
 1 chevrolet    corvette   5.7  1999     8 manual… r        16    26 p     2sea…
 2 chevrolet    corvette   6.2  2008     8 manual… r        16    26 p     2sea…
 3 chevrolet    corvette   6.2  2008     8 auto(s… r        15    25 p     2sea…
 4 chevrolet    corvette   7    2008     8 manual… r        15    24 p     2sea…
 5 chevrolet    corvette   5.7  1999     8 auto(l… r        15    23 p     2sea…
 6 volkswagen   jetta      1.9  1999     4 manual… f        33    44 d     comp…
 7 toyota       corolla    1.8  2008     4 manual… f        28    37 r     comp…
 8 toyota       corolla    1.8  1999     4 manual… f        26    35 r     comp…
 9 toyota       corolla    1.8  2008     4 auto(l… f        26    35 r     comp…
10 toyota       corolla    1.8  1999     4 auto(l… f        24    33 r     comp…
# ℹ 224 more rows


8 mutate() (Create New Variables)

8.1 Create a new variable

mpg |>
  mutate(hwy_per_liter = hwy / displ)
# A tibble: 234 × 12
   manufacturer model      displ  year   cyl trans drv     cty   hwy fl    class
   <chr>        <chr>      <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
 1 audi         a4           1.8  1999     4 auto… f        18    29 p     comp…
 2 audi         a4           1.8  1999     4 manu… f        21    29 p     comp…
 3 audi         a4           2    2008     4 manu… f        20    31 p     comp…
 4 audi         a4           2    2008     4 auto… f        21    30 p     comp…
 5 audi         a4           2.8  1999     6 auto… f        16    26 p     comp…
 6 audi         a4           2.8  1999     6 manu… f        18    26 p     comp…
 7 audi         a4           3.1  2008     6 auto… f        18    27 p     comp…
 8 audi         a4 quattro   1.8  1999     4 manu… 4        18    26 p     comp…
 9 audi         a4 quattro   1.8  1999     4 auto… 4        16    25 p     comp…
10 audi         a4 quattro   2    2008     4 manu… 4        20    28 p     comp…
# ℹ 224 more rows
# ℹ 1 more variable: hwy_per_liter <dbl>

8.2 Create multiple new variables

mpg |>
  mutate(
    hwy_per_liter = hwy / displ,
    cty_per_liter = cty / displ
  )
# A tibble: 234 × 13
   manufacturer model      displ  year   cyl trans drv     cty   hwy fl    class
   <chr>        <chr>      <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
 1 audi         a4           1.8  1999     4 auto… f        18    29 p     comp…
 2 audi         a4           1.8  1999     4 manu… f        21    29 p     comp…
 3 audi         a4           2    2008     4 manu… f        20    31 p     comp…
 4 audi         a4           2    2008     4 auto… f        21    30 p     comp…
 5 audi         a4           2.8  1999     6 auto… f        16    26 p     comp…
 6 audi         a4           2.8  1999     6 manu… f        18    26 p     comp…
 7 audi         a4           3.1  2008     6 auto… f        18    27 p     comp…
 8 audi         a4 quattro   1.8  1999     4 manu… 4        18    26 p     comp…
 9 audi         a4 quattro   1.8  1999     4 auto… 4        16    25 p     comp…
10 audi         a4 quattro   2    2008     4 manu… 4        20    28 p     comp…
# ℹ 224 more rows
# ℹ 2 more variables: hwy_per_liter <dbl>, cty_per_liter <dbl>

8.3 Use if_else() for conditional logic

mpg |>
  mutate(
    big_engine = if_else(displ >= 4, "Yes", "No")
  ) |>
  count(big_engine)
# A tibble: 2 × 2
  big_engine     n
  <chr>      <int>
1 No           148
2 Yes           86
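
if_else() handles a two-way split. When more than two categories are needed, one option is dplyr's case_when(), sketched below. The cutoffs (2.5 and 4) and the labels are assumptions made up for this example, not part of the original analysis.

```r
library(tidyverse)

engine_sizes <- mpg |>
  mutate(
    engine_size = case_when(
      displ < 2.5 ~ "small",   # illustrative cutoff
      displ < 4   ~ "medium",  # reached only when displ >= 2.5
      TRUE        ~ "large"    # everything else, i.e. displ >= 4
    )
  ) |>
  count(engine_size)

engine_sizes
```

Conditions are checked in order and the first match wins, so the "large" group here is the same set of cars that if_else() labeled "Yes" above.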


9 summarise() (Compute Summary Statistics)

9.1 One summary value

mpg |>
  summarise(mean_hwy = mean(hwy))
# A tibble: 1 × 1
  mean_hwy
     <dbl>
1     23.4

9.2 Multiple summaries

mpg |>
  summarise(
    mean_hwy = mean(hwy),
    sd_hwy   = sd(hwy),
    max_hwy  = max(hwy),
    min_hwy  = min(hwy)
  )
# A tibble: 1 × 4
  mean_hwy sd_hwy max_hwy min_hwy
     <dbl>  <dbl>   <int>   <int>
1     23.4   5.95      44      12

✅ Tip: mean(), sd(), max(), and min() return NA when a column contains missing values. Pass na.rm = TRUE to ignore them.
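
The mpg columns used above have no missing values, so here is a minimal sketch on a made-up table (toy is hypothetical data, not part of mpg) showing what na.rm = TRUE changes.

```r
library(tidyverse)

# Hypothetical toy data with one missing value.
toy <- tibble(hwy = c(29, NA, 31))

na_summary <- toy |>
  summarise(
    mean_default = mean(hwy),               # NA, because one value is missing
    mean_na_rm   = mean(hwy, na.rm = TRUE)  # mean of 29 and 31 only
  )

na_summary
```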


10 group_by() + summarise() (Grouped Summaries)

10.1 Mean highway MPG by class

mpg |>
  group_by(class) |>
  summarise(mean_hwy = mean(hwy), .groups = "drop") |>
  arrange(desc(mean_hwy))
# A tibble: 7 × 2
  class      mean_hwy
  <chr>         <dbl>
1 compact        28.3
2 subcompact     28.1
3 midsize        27.3
4 2seater        24.8
5 minivan        22.4
6 suv            18.1
7 pickup         16.9

10.2 Count rows by group

mpg |>
  group_by(drv) |>
  summarise(n = n(), .groups = "drop")
# A tibble: 3 × 2
  drv       n
  <chr> <int>
1 4       103
2 f       106
3 r        25


11 count() (Fast Frequency Tables)

mpg |>
  count(class, sort = TRUE)
# A tibble: 7 × 2
  class          n
  <chr>      <int>
1 suv           62
2 compact       47
3 midsize       41
4 subcompact    35
5 pickup        33
6 minivan       11
7 2seater        5

With two variables:

mpg |>
  count(class, drv, sort = TRUE)
# A tibble: 12 × 3
   class      drv       n
   <chr>      <chr> <int>
 1 suv        4        51
 2 midsize    f        38
 3 compact    f        35
 4 pickup     4        33
 5 subcompact f        22
 6 compact    4        12
 7 minivan    f        11
 8 suv        r        11
 9 subcompact r         9
10 2seater    r         5
11 subcompact 4         4
12 midsize    4         3
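
count() can be read as a shortcut for the group_by() + summarise(n = n()) pattern from the previous section. The sketch below runs both on drv so the results can be compared; by_count and by_group are illustrative names.

```r
library(tidyverse)

# Shortcut form.
by_count <- mpg |>
  count(drv)

# Long form: group, then count rows per group.
by_group <- mpg |>
  group_by(drv) |>
  summarise(n = n(), .groups = "drop")

# Compare the two results.
all.equal(by_count, by_group)
```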


12 distinct() (Unique Values)

mpg |>
  distinct(manufacturer)
# A tibble: 15 × 1
   manufacturer
   <chr>       
 1 audi        
 2 chevrolet   
 3 dodge       
 4 ford        
 5 honda       
 6 hyundai     
 7 jeep        
 8 land rover  
 9 lincoln     
10 mercury     
11 nissan      
12 pontiac     
13 subaru      
14 toyota      
15 volkswagen  

Unique combinations:

mpg |>
  distinct(manufacturer, year)
# A tibble: 30 × 2
   manufacturer  year
   <chr>        <int>
 1 audi          1999
 2 audi          2008
 3 chevrolet     2008
 4 chevrolet     1999
 5 dodge         1999
 6 dodge         2008
 7 ford          1999
 8 ford          2008
 9 honda         1999
10 honda         2008
# ℹ 20 more rows


13 slice_*() (Pick Specific Rows)

13.1 Top rows

mpg |>
  slice_head(n = 5)
# A tibble: 5 × 11
  manufacturer model displ  year   cyl trans      drv     cty   hwy fl    class 
  <chr>        <chr> <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr> 
1 audi         a4      1.8  1999     4 auto(l5)   f        18    29 p     compa…
2 audi         a4      1.8  1999     4 manual(m5) f        21    29 p     compa…
3 audi         a4      2    2008     4 manual(m6) f        20    31 p     compa…
4 audi         a4      2    2008     4 auto(av)   f        21    30 p     compa…
5 audi         a4      2.8  1999     6 auto(l5)   f        16    26 p     compa…

13.2 Bottom rows

mpg |>
  slice_tail(n = 5)
# A tibble: 5 × 11
  manufacturer model  displ  year   cyl trans      drv     cty   hwy fl    class
  <chr>        <chr>  <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr>
1 volkswagen   passat   2    2008     4 auto(s6)   f        19    28 p     mids…
2 volkswagen   passat   2    2008     4 manual(m6) f        21    29 p     mids…
3 volkswagen   passat   2.8  1999     6 auto(l5)   f        16    26 p     mids…
4 volkswagen   passat   2.8  1999     6 manual(m5) f        18    26 p     mids…
5 volkswagen   passat   3.6  2008     6 auto(s6)   f        17    26 p     mids…

13.3 Top rows by a variable

mpg |>
  arrange(desc(hwy)) |>
  slice_head(n = 10)
# A tibble: 10 × 11
   manufacturer model      displ  year   cyl trans drv     cty   hwy fl    class
   <chr>        <chr>      <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
 1 volkswagen   jetta        1.9  1999     4 manu… f        33    44 d     comp…
 2 volkswagen   new beetle   1.9  1999     4 manu… f        35    44 d     subc…
 3 volkswagen   new beetle   1.9  1999     4 auto… f        29    41 d     subc…
 4 toyota       corolla      1.8  2008     4 manu… f        28    37 r     comp…
 5 honda        civic        1.8  2008     4 auto… f        25    36 r     subc…
 6 honda        civic        1.8  2008     4 auto… f        24    36 c     subc…
 7 toyota       corolla      1.8  1999     4 manu… f        26    35 r     comp…
 8 toyota       corolla      1.8  2008     4 auto… f        26    35 r     comp…
 9 honda        civic        1.8  2008     4 manu… f        26    34 r     subc…
10 honda        civic        1.6  1999     4 manu… f        28    33 r     subc…
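
The arrange() + slice_head() pattern above can also be written with slice_max(), another member of the slice_*() family. This is a sketch of one alternative, not a replacement for the approach above. By default slice_max() keeps ties, so it may return more than n rows; with_ties = FALSE is set here to get exactly 10.

```r
library(tidyverse)

# Ten rows with the highest hwy; ties at the cutoff are dropped.
top10 <- mpg |>
  slice_max(hwy, n = 10, with_ties = FALSE)

# Same request, but keeping every row tied with the 10th value.
top10_ties <- mpg |>
  slice_max(hwy, n = 10)

nrow(top10)
nrow(top10_ties)
```

slice_min() works the same way for the lowest values.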


14 across() (Apply Functions to Many Columns)

Example: compute means for multiple numeric columns.

mpg |>
  summarise(
    across(c(cty, hwy, displ), mean)
  )
# A tibble: 1 × 3
    cty   hwy displ
  <dbl> <dbl> <dbl>
1  16.9  23.4  3.47

You can combine with group_by():

mpg |>
  group_by(class) |>
  summarise(
    across(c(cty, hwy), mean),
    .groups = "drop"
  )
# A tibble: 7 × 3
  class        cty   hwy
  <chr>      <dbl> <dbl>
1 2seater     15.4  24.8
2 compact     20.1  28.3
3 midsize     18.8  27.3
4 minivan     15.8  22.4
5 pickup      13    16.9
6 subcompact  20.4  28.1
7 suv         13.5  18.1
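
across() also accepts a named list of functions, which is handy when several statistics are needed per column. In this sketch the list names (mean, sd) are appended to each column name in the output; stats_by_class is an illustrative name.

```r
library(tidyverse)

stats_by_class <- mpg |>
  group_by(class) |>
  summarise(
    # Apply both functions to both columns; output columns are
    # named <column>_<function name>, e.g. cty_mean.
    across(c(cty, hwy), list(mean = mean, sd = sd)),
    .groups = "drop"
  )

names(stats_by_class)
```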


15 Joining Data (Combining Tables)

Joins combine tables using a matching key column.

We will create two small example tables.

students <- tibble(
  id = c(101, 102, 103),
  name = c("Alex", "Bella", "Chris")
)

grades <- tibble(
  id = c(101, 103),
  score = c(95, 88)
)

15.1 left_join() (most common)

students |>
  left_join(grades, by = "id")
# A tibble: 3 × 3
     id name  score
  <dbl> <chr> <dbl>
1   101 Alex     95
2   102 Bella    NA
3   103 Chris    88

15.2 inner_join()

students |>
  inner_join(grades, by = "id")
# A tibble: 2 × 3
     id name  score
  <dbl> <chr> <dbl>
1   101 Alex     95
2   103 Chris    88

15.3 anti_join() (find non-matches)

students |>
  anti_join(grades, by = "id")
# A tibble: 1 × 2
     id name 
  <dbl> <chr>
1   102 Bella
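
In the examples above the key column is named id in both tables. When the key has a different name in each table, a named vector can be passed to by. The sketch below assumes a hypothetical table grades_alt whose key column is called student_id; that table is invented here purely for illustration.

```r
library(tidyverse)

students <- tibble(
  id = c(101, 102, 103),
  name = c("Alex", "Bella", "Chris")
)

# Hypothetical variant of grades with a differently named key column.
grades_alt <- tibble(
  student_id = c(101, 103),
  score = c(95, 88)
)

# Match students$id against grades_alt$student_id.
joined <- students |>
  left_join(grades_alt, by = c("id" = "student_id"))

joined
```

The result keeps the key name from the left table (id), and Bella's score is NA because she has no matching row.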


16 A Typical dplyr Workflow Example

Goal: rank car classes by average highway MPG and report the sample size for each class.

mpg |>
  group_by(class) |>
  summarise(
    mean_hwy = mean(hwy),
    n = n(),
    .groups = "drop"
  ) |>
  arrange(desc(mean_hwy))
# A tibble: 7 × 3
  class      mean_hwy     n
  <chr>         <dbl> <int>
1 compact        28.3    47
2 subcompact     28.1    35
3 midsize        27.3    41
4 2seater        24.8     5
5 minivan        22.4    11
6 suv            18.1    62
7 pickup         16.9    33
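
Reporting n alongside the mean makes it easy to drop groups that are too small to trust. As a sketch, the workflow can be extended with one more filter() step; the threshold of 10 rows is an arbitrary assumption chosen for this example.

```r
library(tidyverse)

ranked <- mpg |>
  group_by(class) |>
  summarise(
    mean_hwy = mean(hwy),
    n = n(),
    .groups = "drop"
  ) |>
  filter(n >= 10) |>          # illustrative cutoff: drop very small classes
  arrange(desc(mean_hwy))

ranked
```

With this cutoff the 2seater class (5 rows) is excluded and the other six classes remain.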


17 Practice Problems ✅

  1. Use select() to keep only manufacturer, model, year, hwy.
  2. Filter to only cars with drv == "f" and year == 2008.
  3. Sort the filtered data by hwy (descending).
  4. Create a new variable efficiency = hwy / displ.
  5. Compute the mean of efficiency.
  6. Compute mean hwy by class and sort from highest to lowest.
  7. Find the top 5 classes by mean hwy.


18 Challenge 💡 (Top Manufacturers)

  1. Count how many rows each manufacturer has.
  2. Keep only manufacturers with at least 20 cars in the dataset.
  3. Compute mean highway MPG by manufacturer.
  4. Sort the results from highest to lowest.

Starter code:

mpg |>
  count(manufacturer, sort = TRUE)
# A tibble: 15 × 2
   manufacturer     n
   <chr>        <int>
 1 dodge           37
 2 toyota          34
 3 volkswagen      27
 4 ford            25
 5 chevrolet       19
 6 audi            18
 7 hyundai         14
 8 subaru          14
 9 nissan          13
10 honda            9
11 jeep             8
12 pontiac          5
13 land rover       4
14 mercury          4
15 lincoln          3