6.7 Merging Data
merge() joins two data frames.
authors <- data.frame(
  ## I(*) : use character columns of names to get sensible sort order
  surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
  nationality = c("US", "Australia", "US", "UK", "Australia"),
  deceased = c("yes", rep("no", 4))
)
authorN <- within(authors, {
  name <- surname
  rm(surname)
})
books <- data.frame(
  name = I(c(
    "Tukey", "Venables", "Tierney",
    "Ripley", "Ripley", "McNeil", "R Core"
  )),
  title = c(
    "Exploratory Data Analysis",
    "Modern Applied Statistics ...",
    "LISP-STAT",
    "Spatial Statistics", "Stochastic Simulation",
    "Interactive Data Analysis",
    "An Introduction to R"
  ),
  other.author = c(
    NA, "Ripley", NA, NA, NA, NA,
    "Venables & Smith"
  )
)
authors
##    surname nationality deceased
## 1    Tukey          US      yes
## 2 Venables   Australia       no
## 3  Tierney          US       no
## 4   Ripley          UK       no
## 5   McNeil   Australia       no
authorN
##   nationality deceased     name
## 1          US      yes    Tukey
## 2   Australia       no Venables
## 3          US       no  Tierney
## 4          UK       no   Ripley
## 5   Australia       no   McNeil
books
##       name                         title     other.author
## 1    Tukey     Exploratory Data Analysis             <NA>
## 2 Venables Modern Applied Statistics ...           Ripley
## 3  Tierney                     LISP-STAT             <NA>
## 4   Ripley            Spatial Statistics             <NA>
## 5   Ripley         Stochastic Simulation             <NA>
## 6   McNeil     Interactive Data Analysis             <NA>
## 7   R Core          An Introduction to R Venables & Smith
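By default, merge() joins on the columns whose names appear in both data frames, pairs up the rows whose key values match, and drops any row that has no match in the other data frame (an inner join). Here the "R Core" book is dropped because authorN has no such name.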
merge(authorN, books)
##       name nationality deceased                         title other.author
## 1   McNeil   Australia       no     Interactive Data Analysis         <NA>
## 2   Ripley          UK       no            Spatial Statistics         <NA>
## 3   Ripley          UK       no         Stochastic Simulation         <NA>
## 4  Tierney          US       no                     LISP-STAT         <NA>
## 5    Tukey          US      yes     Exploratory Data Analysis         <NA>
## 6 Venables   Australia       no Modern Applied Statistics ...       Ripley
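You can also specify the key columns explicitly with by.x and by.y. Here authors comes first and is joined on surname, so the key column in the result is named surname.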
merge(authors, books, by.x = "surname", by.y = "name")
##    surname nationality deceased                         title other.author
## 1   McNeil   Australia       no     Interactive Data Analysis         <NA>
## 2   Ripley          UK       no            Spatial Statistics         <NA>
## 3   Ripley          UK       no         Stochastic Simulation         <NA>
## 4  Tierney          US       no                     LISP-STAT         <NA>
## 5    Tukey          US      yes     Exploratory Data Analysis         <NA>
## 6 Venables   Australia       no Modern Applied Statistics ...       Ripley
With the arguments swapped, books comes first, so the key column in the result is named name.
merge(books, authors, by.x = "name", by.y = "surname")
##       name                         title other.author nationality deceased
## 1   McNeil     Interactive Data Analysis         <NA>   Australia       no
## 2   Ripley            Spatial Statistics         <NA>          UK       no
## 3   Ripley         Stochastic Simulation         <NA>          UK       no
## 4  Tierney                     LISP-STAT         <NA>          US       no
## 5    Tukey     Exploratory Data Analysis         <NA>          US      yes
## 6 Venables Modern Applied Statistics ...       Ripley   Australia       no
To see more clearly how the different kinds of join differ, see the animations at https://github.com/gadenbuie/tidyexplain
There are five kinds of join (inner, outer, left, right, cross); for details see https://stackoverflow.com/questions/1299871
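As a rough sketch of how the five join types map onto arguments of merge(), the example below uses two small data frames, x and y, that are made up for illustration and are not part of the authors/books data. It relies on the all, all.x and all.y arguments and on by = NULL for a Cartesian product.

```r
# Hypothetical toy data: ids 2 and 3 appear in both data frames
x <- data.frame(id = c(1, 2, 3), a = c("x1", "x2", "x3"))
y <- data.frame(id = c(2, 3, 4), b = c("y2", "y3", "y4"))

inner <- merge(x, y, by = "id")                # only ids present in both
outer <- merge(x, y, by = "id", all = TRUE)    # every id from either side
left  <- merge(x, y, by = "id", all.x = TRUE)  # every id from x
right <- merge(x, y, by = "id", all.y = TRUE)  # every id from y
cross <- merge(x, y, by = NULL)                # every row of x paired with every row of y
```

Rows without a partner are filled with NA in the columns that come from the other data frame, so outer has NA in b for id 1 and NA in a for id 4.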
cbind and rbind combine data frames by column and by row, respectively.
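A minimal sketch of both functions follows. The data frames top, bottom and extra are invented for illustration. Unlike merge(), neither function matches rows on a key: rbind() stacks rows and matches columns by name, while cbind() places columns side by side and pairs rows purely by position.

```r
# Hypothetical data frames with the same column names
top    <- data.frame(name = c("Tukey", "Ripley"), nationality = c("US", "UK"))
bottom <- data.frame(name = "McNeil", nationality = "Australia")

# rbind: append the rows of bottom below the rows of top
stacked <- rbind(top, bottom)

# cbind: add the columns of extra; row i of extra goes with row i of stacked
extra <- data.frame(deceased = c("yes", "no", "no"))
wide <- cbind(stacked, extra)

dim(stacked)  # 3 rows, 2 columns
dim(wide)     # 3 rows, 3 columns
```

Because cbind() pairs rows by position, it is only safe when both inputs are already in the same row order; when they are not, merge() on a key column is the better choice.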