4.13 Advanced extraction

Extract from text the first substring that matches the given pattern:

str_extract <- function(text, pattern, ...) regmatches(text, regexpr(pattern, text, ...))

For example, to extract a digit:

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")
stringr::str_extract(shopping_list, "\\d")
## [1] "4" NA  NA  "2"
# Note the difference: the base R version drops elements with no match instead of returning NA
str_extract(shopping_list, "\\d")
## [1] "4" "2"
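
The base R version returns a shorter vector because regmatches() keeps only the elements that matched. If the result has to stay aligned with the input, as it does with stringr, one possible approach is to fill a vector of NA values and overwrite the matched positions. The following is a minimal sketch under that assumption; the name str_extract_na is hypothetical and does not come from the original text or from any package.

```r
# Hypothetical helper (not part of the original text): behaves like the
# base R str_extract() above, but returns NA for elements without a match
# so that the output has the same length as the input.
str_extract_na <- function(text, pattern, ...) {
  m <- regexpr(pattern, text, ...)
  out <- rep(NA_character_, length(text))
  # regexpr() returns -1 for "no match" and NA for NA input
  hit <- !is.na(m) & m != -1
  # regmatches() returns only the matched pieces, in input order
  out[hit] <- regmatches(text, m)
  out
}

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")
str_extract_na(shopping_list, "\\d")
## [1] "4" NA  NA  "2"
```

With this variant the result can be stored next to the input, for example as a column of a data frame, without the positions shifting.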

Extract every substring that matches the pattern:

str_extract_all <- function(text, pattern, ...) regmatches(text, gregexpr(pattern, text, ...))

For example, to extract the runs of lowercase English letters:

str_extract_all(shopping_list, "[a-z]+")
## [[1]]
## [1] "apples" "x"     
## 
## [[2]]
## [1] "bag"   "of"    "flour"
## 
## [[3]]
## [1] "bag"   "of"    "sugar"
## 
## [[4]]
## [1] "milk" "x"
stringr::str_extract_all(shopping_list, "[a-z]+")
## [[1]]
## [1] "apples" "x"     
## 
## [[2]]
## [1] "bag"   "of"    "flour"
## 
## [[3]]
## [1] "bag"   "of"    "sugar"
## 
## [[4]]
## [1] "milk" "x"
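
In this example the two results agree. Because the gregexpr()-based version returns a list with one element per input string, an input without any match yields an empty character vector rather than being dropped. The short sketch below illustrates this with the digit pattern; it repeats the definitions from this section so that it runs on its own, and the variable name digits is only illustrative.

```r
# Definitions repeated from the section so the example is self-contained
str_extract_all <- function(text, pattern, ...) regmatches(text, gregexpr(pattern, text, ...))
shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")

# One list element per input string; character(0) where nothing matched
digits <- str_extract_all(shopping_list, "\\d")
lengths(digits)
## [1] 1 0 0 1
```

The list therefore always has the same length as the input, which makes it safe to iterate over it in parallel with the original strings.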