5.2 Software Environment
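R's built-in regular expression support relies on third-party libraries such as PCRE, ICU, TRE, and iconv, so it is important to know which versions your installation uses. The interpretation of some character classes, such as [:alnum:] and [:alpha:], depends on the locale, so it is equally important to check the current locale settings.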
# find a suitable coding for the current locale
localeToCharset(locale = Sys.getlocale("LC_CTYPE"))
## [1] "UTF-8" "ISO8859-1"
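To see how character classes behave in a particular session, a minimal sketch like the following may help. The test vector `x` and the name `res` are illustrative and not part of the original text. Whether the accented letter matches can vary with locale, platform, and regex engine, so no result is asserted for it.

```r
# Probe how POSIX character classes classify a few characters in this session.
# "\u00e9" is the letter e with an acute accent (non-ASCII).
x <- c("a", "1", "\u00e9", "_")

res <- data.frame(
  char       = x,
  alpha_tre  = grepl("^[[:alpha:]]$", x),               # default TRE engine
  alpha_pcre = grepl("^[[:alpha:]]$", x, perl = TRUE),  # PCRE engine
  alnum_tre  = grepl("^[[:alnum:]]$", x)
)

print(Sys.getlocale("LC_CTYPE"))
print(res)
```

Running the same lines under a different LC_CTYPE setting and comparing the row for the accented letter is one way to observe the dependence directly.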
# Version information of third-party software
extSoftVersion()
## zlib
## "1.2.11"
## bzlib
## "1.0.8, 13-Jul-2019"
## xz
## "5.2.5"
## PCRE
## "10.39 2021-10-29"
## ICU
## "70.1"
## TRE
## "TRE 0.8.0 R_fixes (BSD)"
## iconv
## "glibc 2.35"
## readline
## "8.1"
## BLAS
## "/usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3"
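Since `extSoftVersion()` returns a named character vector, the entries relevant to text processing can be picked out by name. The sketch below is one possible way to do that; the variable names `regex_libs`, `pcre_major`, and `uses_pcre2` are assumptions made for illustration.

```r
v <- extSoftVersion()

# Keep only the libraries involved in regular expressions and encodings.
regex_libs <- v[c("PCRE", "TRE", "ICU", "iconv")]
print(regex_libs)

# The PCRE entry starts with "major.minor"; extract the major number.
# PCRE2 releases are numbered 10.x, the older PCRE1 releases 8.x.
pcre_major <- suppressWarnings(
  as.integer(sub("^([0-9]+)\\..*$", "\\1", v[["PCRE"]]))
)
uses_pcre2 <- !is.na(pcre_major) && pcre_major >= 10
print(uses_pcre2)
```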
# Locale and encoding information
l10n_info()
## $MBCS
## [1] TRUE
##
## $`UTF-8`
## [1] TRUE
##
## $`Latin-1`
## [1] FALSE
##
## $codeset
## [1] "UTF-8"
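The list returned by `l10n_info()` can be used to branch on the session encoding before working with non-ASCII text. This is only a sketch; the variable `msg` and the wording of the messages are illustrative.

```r
info <- l10n_info()

# Describe the session encoding based on the flags reported by l10n_info().
msg <- if (isTRUE(info[["UTF-8"]])) {
  "UTF-8 locale: non-ASCII patterns can be written directly"
} else if (isTRUE(info[["MBCS"]])) {
  "multi-byte, non-UTF-8 locale: consider converting input with enc2utf8()"
} else {
  "single-byte locale: consider converting input with enc2utf8()"
}
print(msg)
```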
# Details of number and currency formatting
Sys.localeconv()
##     decimal_point     thousands_sep          grouping   int_curr_symbol
##               "."                ""                ""            "USD "
##   currency_symbol mon_decimal_point mon_thousands_sep      mon_grouping
##               "$"               "."               ","        "\003\003"
##     positive_sign     negative_sign   int_frac_digits       frac_digits
##                ""               "-"               "2"               "2"
##     p_cs_precedes    p_sep_by_space     n_cs_precedes    n_sep_by_space
##               "1"               "0"               "1"               "0"
##       p_sign_posn       n_sign_posn
##               "1"               "1"
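As a rough illustration of how these fields might be used, the sketch below formats an amount with the monetary separators reported by `Sys.localeconv()`. The helper `format_money` is hypothetical and deliberately simplified: it always puts the currency symbol first and ignores the sign-position fields. It also falls back to defaults when a locale such as "C" reports empty or unusable values.

```r
# Hypothetical helper: format a positive amount using locale conventions.
format_money <- function(x, lc) {
  digits <- suppressWarnings(as.integer(lc[["frac_digits"]]))
  # Some locales report an empty or placeholder value here; fall back to 2.
  if (is.na(digits) || digits < 0L || digits > 10L) digits <- 2L

  dec <- lc[["mon_decimal_point"]]
  if (!nzchar(dec)) dec <- "."

  big <- lc[["mon_thousands_sep"]]
  if (identical(big, dec)) big <- ""   # avoid ambiguous output

  num <- formatC(x, format = "f", digits = digits,
                 big.mark = big, decimal.mark = dec)
  paste0(lc[["currency_symbol"]], num)
}

lc <- Sys.localeconv()
print(format_money(1234567.891, lc))
```

With the en_US values shown above, this would print "$1,234,567.89"; other locales give different separators and symbols.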
# Configuration options PCRE was built with
pcre_config()
##              UTF-8 Unicode properties                JIT              stack
##               TRUE               TRUE               TRUE              FALSE
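These flags matter in practice: Unicode property classes such as `\p{Han}` need a PCRE build with UTF-8 and Unicode property support. The sketch below checks the flags before using such a pattern; the names `has_ucp`, `x`, and `han` are illustrative.

```r
cfg <- pcre_config()
has_ucp <- isTRUE(cfg[["UTF-8"]]) && isTRUE(cfg[["Unicode properties"]])

# The second element consists of two Han characters, written as escapes
# so that the script itself stays ASCII.
x <- c("regex", "\u6b63\u5219")

# Use the Unicode property class only when PCRE supports it.
han <- if (has_ucp) grepl("\\p{Han}", x, perl = TRUE) else rep(NA, length(x))
print(han)
```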
# More complete character set information
stringi::stri_info()
## $Unicode.version
## [1] "14.0"
##
## $ICU.version
## [1] "70.1"
##
## $Locale
## $Locale$Language
## [1] "en"
##
## $Locale$Country
## [1] "US"
##
## $Locale$Variant
## [1] ""
##
## $Locale$Name
## [1] "en_US"
##
##
## $Charset.internal
## [1] "UTF-8" "UTF-16"
##
## $Charset.native
## $Charset.native$Name.friendly
## [1] "UTF-8"
##
## $Charset.native$Name.ICU
## [1] "UTF-8"
##
## $Charset.native$Name.UTR22
## [1] NA
##
## $Charset.native$Name.IBM
## [1] "ibm-1208"
##
## $Charset.native$Name.WINDOWS
## [1] "windows-65001"
##
## $Charset.native$Name.JAVA
## [1] "UTF-8"
##
## $Charset.native$Name.IANA
## [1] "UTF-8"
##
## $Charset.native$Name.MIME
## [1] "UTF-8"
##
## $Charset.native$ASCII.subset
## [1] TRUE
##
## $Charset.native$Unicode.1to1
## [1] NA
##
## $Charset.native$CharSize.8bit
## [1] FALSE
##
## $Charset.native$CharSize.min
## [1] 1
##
## $Charset.native$CharSize.max
## [1] 3
##
##
## $ICU.system
## [1] TRUE
##
## $ICU.UTF8
## [1] TRUE
Sometimes the locale has to be changed temporarily to meet special plotting or text output requirements.
# Get the current locale settings
Sys.getlocale()
# Save them so they can be restored later
foo <- Sys.getlocale()
# Restore the saved locale settings
Sys.setlocale("LC_ALL", locale = foo)
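When the change is only needed for a single computation, the save-and-restore pattern can be wrapped in a function so that the original setting comes back even if an error occurs. The helper `with_collate` below is hypothetical, and it changes only the LC_COLLATE category rather than LC_ALL to keep the side effects small.

```r
# Hypothetical helper: evaluate `expr` under a temporary LC_COLLATE setting.
with_collate <- function(locale, expr) {
  old <- Sys.getlocale("LC_COLLATE")
  # Restore the previous setting when the function exits, even on error.
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
  Sys.setlocale("LC_COLLATE", locale)
  force(expr)   # `expr` is lazy, so it is evaluated under the new locale
}

before <- Sys.getlocale("LC_COLLATE")
sorted <- with_collate("C", sort(c("b", "A", "a", "B")))
print(sorted)   # the "C" locale sorts upper-case letters before lower-case
after <- Sys.getlocale("LC_COLLATE")
```

The "C" locale is used here because it is available on every platform; names of other locales differ between operating systems.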