3.2 Other data formats
Data that arrives in other formats, such as JSON, XML, and YAML, has to be converted and cleaned into an R data frame (data.frame).
- Data Rectangling with jq
- Mongolite User Manual: an introduction to using MongoDB with the mongolite client in R
The jsonlite package reads *.json files. jsonlite::write_json saves an R object to a JSON file, jsonlite::fromJSON converts a JSON string or file into an R object, and jsonlite::toJSON does the reverse.
library(jsonlite)
# Import from a JSON file
# jsonlite::read_json(path = "path/to/filename.json")
# A JSON array of primitives
json <- '["Mario", "Peach", null, "Bowser"]'
# Simplified to an atomic vector (the default behaviour)
fromJSON(json)
## [1] "Mario" "Peach" NA "Bowser"
# With simplifyVector = FALSE a list is returned instead
fromJSON(json, simplifyVector = FALSE)
## [[1]]
## [1] "Mario"
##
## [[2]]
## [1] "Peach"
##
## [[3]]
## NULL
##
## [[4]]
## [1] "Bowser"
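The round trip between JSON and a data frame can be sketched as follows. This is a minimal illustration rather than part of the original example: the records and the names json_records, people, and json_path are made up, and the file goes to a temporary path.

```r
library(jsonlite)

# Hypothetical records: a JSON array of objects that share the same keys
json_records <- '[{"name": "Mario", "age": 32}, {"name": "Peach", "age": 21}]'

# An array of uniform objects is simplified to a data.frame by default
people <- fromJSON(json_records)
str(people)

# toJSON goes the other way: data.frame -> JSON string
json_back <- toJSON(people)

# write_json / read_json do the same through a file (temporary path here)
json_path <- tempfile(fileext = ".json")
write_json(people, json_path)
people_from_file <- read_json(json_path, simplifyVector = TRUE)
```

Note that read_json does not simplify by default, so simplifyVector = TRUE is passed explicitly to get a data frame back instead of a nested list.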
The yaml package reads *.yml files with yaml::read_yaml, which returns a list; yaml::write_yaml writes an R object out in YAML format.
library(yaml)
yaml::read_yaml(file = '_bookdown.yml')
## $book_filename
## [1] "notesdown"
##
## $delete_merged_file
## [1] TRUE
##
## $language
## $language$label
## $language$label$fig
## [1] "图 "
##
## $language$label$tab
## [1] "表 "
##
##
## $language$ui
## $language$ui$edit
## [1] "编辑"
##
## $language$ui$chapter_name
## [1] "第 " " 章"
##
## $language$ui$appendix_name
## [1] "附录 "
##
##
##
## $new_session
## [1] TRUE
##
## $before_chapter_script
## [1] "_common.R"
##
## $rmd_files
## [1] "index.Rmd" "preface.Rmd"
## [3] "data-wrangling.Rmd" "data-structure.Rmd"
## [5] "data-transportation.Rmd" "string-operations.Rmd"
## [7] "regular-expressions.Rmd" "data-manipulation.Rmd"
## [9] "advanced-manipulation.Rmd" "parallel-manipulation.Rmd"
## [11] "other-manipulation.Rmd" "statistical-graphics.Rmd"
## [13] "graphics-foundations.Rmd" "visualization-colors.Rmd"
## [15] "visualization-gallery.Rmd" "interactive-web-graphics.Rmd"
## [17] "statistical-computation.Rmd" "numerical-optimization.Rmd"
## [19] "differential-equations.Rmd" "appendix.Rmd"
## [21] "references.Rmd"
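Writing works the same way in reverse. The sketch below assumes a small made-up configuration list (config, loosely modelled on the file above) and a temporary file, so it does not touch any real _bookdown.yml.

```r
library(yaml)

# Hypothetical configuration, loosely modelled on a bookdown config file
config <- list(
  book_filename = "demo-book",
  new_session = TRUE,
  rmd_files = c("index.Rmd", "preface.Rmd")
)

# Write the list to a temporary YAML file and inspect the text
yaml_path <- tempfile(fileext = ".yml")
write_yaml(config, yaml_path)
cat(readLines(yaml_path), sep = "\n")

# Read it back: the result is again a named list
config_back <- read_yaml(yaml_path)
config_back$rmd_files

# yaml.load parses YAML from a string instead of a file
inline <- yaml.load("title: demo\nchapters: 3")
```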
Statistical software | R function | R package |
---|---|---|
ESRI ArcGIS | read.shapefile | shapefiles |
Matlab | readMat | R.matlab |
Minitab | read.mtp | foreign |
SAS (permanent data) | read.ssd | foreign |
SAS (XPORT format) | read.xport | foreign |
SPSS | read.spss | foreign |
Stata | read.dta | foreign |
Systat | read.systat | foreign |
Octave | read.octave | foreign |
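As an illustration of one row of the table, the following sketch writes a small data frame to a Stata file with foreign::write.dta and reads it back with read.dta. The data frame scores and the temporary path are invented for the example; no real Stata export is assumed.

```r
library(foreign)

# Hypothetical data standing in for a real Stata export
scores <- data.frame(id = 1:3, score = c(1.5, 2.5, 3.5))

# Write a .dta file to a temporary location, then read it back
dta_path <- tempfile(fileext = ".dta")
write.dta(scores, dta_path)
scores_back <- read.dta(dta_path)
str(scores_back)
```

read.dta only understands the Stata 5 to 12 binary formats, so files saved by newer Stata versions may need a different reader, for example the haven package.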
File format | R function | R package |
---|---|---|
Contingency table data | read.ftable | stats |
Binary data | readBin | base |
Character string data | readChar | base |
Clipboard data | readClipboard | utils |
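The two base functions in the table can be tried without any external data. The sketch below writes a few integers and a short string to temporary files (both paths are made up for the example) and reads them back with readBin and readChar.

```r
# Binary data: write five integers, then read them back
bin_path <- tempfile(fileext = ".bin")
writeBin(1:5, bin_path)
# n is the maximum number of values to read; the default is 1
ints <- readBin(bin_path, what = "integer", n = 5L)

# Character data: write a string without a terminator, then read 5 characters
txt_path <- tempfile(fileext = ".txt")
writeChar("hello", txt_path, eos = NULL)
greeting <- readChar(txt_path, nchars = 5L)
```

With readBin the caller has to know the type and the number of values in advance, because a binary file carries no description of its own contents.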
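The read.dcf function reads files in Debian Control Format (DCF), which stores data in a human-readable form. Examples are the DESCRIPTION file of an R package and the file describing all R packages on CRAN, https://cran.r-project.org/src/contrib/PACKAGES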
x <- read.dcf(file = system.file("DESCRIPTION", package = "splines"),
              fields = c("Package", "Version", "Title"))
x
## Package Version Title
## [1,] "splines" "4.2.3" "Regression Spline Functions and Classes"
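A DCF file may hold several records separated by blank lines, as the CRAN PACKAGES file does. The following sketch builds such a file by hand; the package names demoA and demoB and their fields are hypothetical.

```r
# Two hypothetical records separated by a blank line
dcf_path <- tempfile(fileext = ".dcf")
writeLines(c(
  "Package: demoA",
  "Version: 0.1.0",
  "",
  "Package: demoB",
  "Version: 1.2.3",
  "Title: A Second Demo"
), dcf_path)

# One row per record; fields missing from a record come back as NA
pkgs <- read.dcf(dcf_path, fields = c("Package", "Version", "Title"))

# The result is a character matrix, which converts easily to a data.frame
pkgs_df <- as.data.frame(pkgs, stringsAsFactors = FALSE)
```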
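Finally, the rio package deserves a mention. Often called a Swiss Army knife for data import and export, it brings together, behind one interface, readers for the data files exported by the statistical software that R can read.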
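A minimal sketch of that single interface, assuming rio is installed: export and import pick the file format from the extension. The data frame scores and the temporary CSV path are invented for the example.

```r
library(rio)

# Hypothetical data frame to round-trip
scores <- data.frame(id = 1:3, group = c("a", "b", "c"))

# The format is inferred from the file extension (.csv here)
csv_path <- tempfile(fileext = ".csv")
export(scores, csv_path)
scores_back <- import(csv_path)
str(scores_back)
```

Other extensions may require additional packages to be installed before rio can read or write them.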