Chapter 4 Reshape2
4.1 melt
Moves data from rows to columns (dcast
) and from columns to rows (melt
). These functions require two parameters: id.vars
and measures.vars
.
- id.vars
are those columns that would stay unchanged (on the left), whereas
- measures.vars
are those columns that would disappear into rows when melting.
When melt
ing, columns move into rows. The way Reshape2 does this is by:
1. Inserts two new columns - called variable and value.
2. Removes all measures.vars
columns.
3. For each column removed, for every single row, the name of the column is inserted in the variable field, and the value in the value field.
4. This has the effect of increasing the number of rows in the dataframe by the number of columns that are not id.vars
.
When casting using dcast
, the reverse happens as follows:
library(reshape2)
data(mtcars)
n = mtcars[,1:4]
n$car = rownames(n)
rownames(n)=NULL
n # Just to get to a regular data frame to melt
## mpg cyl disp hp car
## 1 21.0 6 160.0 110 Mazda RX4
## 2 21.0 6 160.0 110 Mazda RX4 Wag
## 3 22.8 4 108.0 93 Datsun 710
## 4 21.4 6 258.0 110 Hornet 4 Drive
## 5 18.7 8 360.0 175 Hornet Sportabout
## 6 18.1 6 225.0 105 Valiant
## 7 14.3 8 360.0 245 Duster 360
## 8 24.4 4 146.7 62 Merc 240D
## 9 22.8 4 140.8 95 Merc 230
## 10 19.2 6 167.6 123 Merc 280
## 11 17.8 6 167.6 123 Merc 280C
## 12 16.4 8 275.8 180 Merc 450SE
## 13 17.3 8 275.8 180 Merc 450SL
## 14 15.2 8 275.8 180 Merc 450SLC
## 15 10.4 8 472.0 205 Cadillac Fleetwood
## 16 10.4 8 460.0 215 Lincoln Continental
## 17 14.7 8 440.0 230 Chrysler Imperial
## 18 32.4 4 78.7 66 Fiat 128
## 19 30.4 4 75.7 52 Honda Civic
## 20 33.9 4 71.1 65 Toyota Corolla
## 21 21.5 4 120.1 97 Toyota Corona
## 22 15.5 8 318.0 150 Dodge Challenger
## 23 15.2 8 304.0 150 AMC Javelin
## 24 13.3 8 350.0 245 Camaro Z28
## 25 19.2 8 400.0 175 Pontiac Firebird
## 26 27.3 4 79.0 66 Fiat X1-9
## 27 26.0 4 120.3 91 Porsche 914-2
## 28 30.4 4 95.1 113 Lotus Europa
## 29 15.8 8 351.0 264 Ford Pantera L
## 30 19.7 6 145.0 175 Ferrari Dino
## 31 15.0 8 301.0 335 Maserati Bora
## 32 21.4 4 121.0 109 Volvo 142E
melt(n, id.vars = c("car", "mpg"))
## car mpg variable value
## 1 Mazda RX4 21.0 cyl 6.0
## 2 Mazda RX4 Wag 21.0 cyl 6.0
## 3 Datsun 710 22.8 cyl 4.0
## 4 Hornet 4 Drive 21.4 cyl 6.0
## 5 Hornet Sportabout 18.7 cyl 8.0
## 6 Valiant 18.1 cyl 6.0
## 7 Duster 360 14.3 cyl 8.0
## 8 Merc 240D 24.4 cyl 4.0
## 9 Merc 230 22.8 cyl 4.0
## 10 Merc 280 19.2 cyl 6.0
## 11 Merc 280C 17.8 cyl 6.0
## 12 Merc 450SE 16.4 cyl 8.0
## 13 Merc 450SL 17.3 cyl 8.0
## 14 Merc 450SLC 15.2 cyl 8.0
## 15 Cadillac Fleetwood 10.4 cyl 8.0
## 16 Lincoln Continental 10.4 cyl 8.0
## 17 Chrysler Imperial 14.7 cyl 8.0
## 18 Fiat 128 32.4 cyl 4.0
## 19 Honda Civic 30.4 cyl 4.0
## 20 Toyota Corolla 33.9 cyl 4.0
## 21 Toyota Corona 21.5 cyl 4.0
## 22 Dodge Challenger 15.5 cyl 8.0
## 23 AMC Javelin 15.2 cyl 8.0
## 24 Camaro Z28 13.3 cyl 8.0
## 25 Pontiac Firebird 19.2 cyl 8.0
## 26 Fiat X1-9 27.3 cyl 4.0
## 27 Porsche 914-2 26.0 cyl 4.0
## 28 Lotus Europa 30.4 cyl 4.0
## 29 Ford Pantera L 15.8 cyl 8.0
## 30 Ferrari Dino 19.7 cyl 6.0
## 31 Maserati Bora 15.0 cyl 8.0
## 32 Volvo 142E 21.4 cyl 4.0
## 33 Mazda RX4 21.0 disp 160.0
## 34 Mazda RX4 Wag 21.0 disp 160.0
## 35 Datsun 710 22.8 disp 108.0
## 36 Hornet 4 Drive 21.4 disp 258.0
## 37 Hornet Sportabout 18.7 disp 360.0
## 38 Valiant 18.1 disp 225.0
## 39 Duster 360 14.3 disp 360.0
## 40 Merc 240D 24.4 disp 146.7
## 41 Merc 230 22.8 disp 140.8
## 42 Merc 280 19.2 disp 167.6
## 43 Merc 280C 17.8 disp 167.6
## 44 Merc 450SE 16.4 disp 275.8
## 45 Merc 450SL 17.3 disp 275.8
## 46 Merc 450SLC 15.2 disp 275.8
## 47 Cadillac Fleetwood 10.4 disp 472.0
## 48 Lincoln Continental 10.4 disp 460.0
## 49 Chrysler Imperial 14.7 disp 440.0
## 50 Fiat 128 32.4 disp 78.7
## 51 Honda Civic 30.4 disp 75.7
## 52 Toyota Corolla 33.9 disp 71.1
## 53 Toyota Corona 21.5 disp 120.1
## 54 Dodge Challenger 15.5 disp 318.0
## 55 AMC Javelin 15.2 disp 304.0
## 56 Camaro Z28 13.3 disp 350.0
## 57 Pontiac Firebird 19.2 disp 400.0
## 58 Fiat X1-9 27.3 disp 79.0
## 59 Porsche 914-2 26.0 disp 120.3
## 60 Lotus Europa 30.4 disp 95.1
## 61 Ford Pantera L 15.8 disp 351.0
## 62 Ferrari Dino 19.7 disp 145.0
## 63 Maserati Bora 15.0 disp 301.0
## 64 Volvo 142E 21.4 disp 121.0
## 65 Mazda RX4 21.0 hp 110.0
## 66 Mazda RX4 Wag 21.0 hp 110.0
## 67 Datsun 710 22.8 hp 93.0
## 68 Hornet 4 Drive 21.4 hp 110.0
## 69 Hornet Sportabout 18.7 hp 175.0
## 70 Valiant 18.1 hp 105.0
## 71 Duster 360 14.3 hp 245.0
## 72 Merc 240D 24.4 hp 62.0
## 73 Merc 230 22.8 hp 95.0
## 74 Merc 280 19.2 hp 123.0
## 75 Merc 280C 17.8 hp 123.0
## 76 Merc 450SE 16.4 hp 180.0
## 77 Merc 450SL 17.3 hp 180.0
## 78 Merc 450SLC 15.2 hp 180.0
## 79 Cadillac Fleetwood 10.4 hp 205.0
## 80 Lincoln Continental 10.4 hp 215.0
## 81 Chrysler Imperial 14.7 hp 230.0
## 82 Fiat 128 32.4 hp 66.0
## 83 Honda Civic 30.4 hp 52.0
## 84 Toyota Corolla 33.9 hp 65.0
## 85 Toyota Corona 21.5 hp 97.0
## 86 Dodge Challenger 15.5 hp 150.0
## 87 AMC Javelin 15.2 hp 150.0
## 88 Camaro Z28 13.3 hp 245.0
## 89 Pontiac Firebird 19.2 hp 175.0
## 90 Fiat X1-9 27.3 hp 66.0
## 91 Porsche 914-2 26.0 hp 91.0
## 92 Lotus Europa 30.4 hp 113.0
## 93 Ford Pantera L 15.8 hp 264.0
## 94 Ferrari Dino 19.7 hp 175.0
## 95 Maserati Bora 15.0 hp 335.0
## 96 Volvo 142E 21.4 hp 109.0
4.2 dcast
Melt is pretty straightforward, but dcast
, which is the reverse, can be a bit tricky. You have to tell it the id.vars columns (those that would stay), plus the column that contains the variable field, plus the column containing value).
The syntax is dcast(dataframe_name, ID.Var1 + ID.Var2...+ID.Var_n ~ variable_column, value.var = value_column)
.
Continuing the example from before:
m = melt(n, id.vars = c("car", "mpg")) # Get the melted dataframe in m.
colnames(m)[3]="x_variable" #change column names for clarity
colnames(m)[4]="x_value"
head(m)
## car mpg x_variable x_value
## 1 Mazda RX4 21.0 cyl 6
## 2 Mazda RX4 Wag 21.0 cyl 6
## 3 Datsun 710 22.8 cyl 4
## 4 Hornet 4 Drive 21.4 cyl 6
## 5 Hornet Sportabout 18.7 cyl 8
## 6 Valiant 18.1 cyl 6
dcast(m, car + mpg ~ x_variable, value.var = "x_value")
## car mpg cyl disp hp
## 1 AMC Javelin 15.2 8 304.0 150
## 2 Cadillac Fleetwood 10.4 8 472.0 205
## 3 Camaro Z28 13.3 8 350.0 245
## 4 Chrysler Imperial 14.7 8 440.0 230
## 5 Datsun 710 22.8 4 108.0 93
## 6 Dodge Challenger 15.5 8 318.0 150
## 7 Duster 360 14.3 8 360.0 245
## 8 Ferrari Dino 19.7 6 145.0 175
## 9 Fiat 128 32.4 4 78.7 66
## 10 Fiat X1-9 27.3 4 79.0 66
## 11 Ford Pantera L 15.8 8 351.0 264
## 12 Honda Civic 30.4 4 75.7 52
## 13 Hornet 4 Drive 21.4 6 258.0 110
## 14 Hornet Sportabout 18.7 8 360.0 175
## 15 Lincoln Continental 10.4 8 460.0 215
## 16 Lotus Europa 30.4 4 95.1 113
## 17 Maserati Bora 15.0 8 301.0 335
## 18 Mazda RX4 21.0 6 160.0 110
## 19 Mazda RX4 Wag 21.0 6 160.0 110
## 20 Merc 230 22.8 4 140.8 95
## 21 Merc 240D 24.4 4 146.7 62
## 22 Merc 280 19.2 6 167.6 123
## 23 Merc 280C 17.8 6 167.6 123
## 24 Merc 450SE 16.4 8 275.8 180
## 25 Merc 450SL 17.3 8 275.8 180
## 26 Merc 450SLC 15.2 8 275.8 180
## 27 Pontiac Firebird 19.2 8 400.0 175
## 28 Porsche 914-2 26.0 4 120.3 91
## 29 Toyota Corolla 33.9 4 71.1 65
## 30 Toyota Corona 21.5 4 120.1 97
## 31 Valiant 18.1 6 225.0 105
## 32 Volvo 142E 21.4 4 121.0 109