[R] Removing Leading and Trailing Spaces from a String

  When processing data, we sometimes need to remove leading or trailing spaces (or other characters, such as punctuation) from a string. In R, we can use gsub to replace them with an empty string.
  In the pattern, ^ matches the beginning of the string, $ matches the end of the string, and \\s matches a whitespace character.
  Written as ^\\s|\\s$, the pattern removes only one space from each end. If there may be more than one leading or trailing space, add +, which means "one or more times": the pattern ^\\s+|\\s+$ removes all of them.

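When processing data, we sometimes need to strip the leading and trailing spaces from text.
There are two ways to do it:
one is to replace them with gsub,
the other is to use the stringr package.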

1.  gsub


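In the pattern, \\s matches a whitespace character (a space, a tab, and so on),
^ matches the start of the text,
and $ matches the end of the text.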
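
A minimal sketch of this pattern with gsub follows. The sample string x and the variable name trimmed_once are made up for illustration.

```r
# Hypothetical sample input: two spaces on each side of the text
x <- "  hello world  "

# ^\\s matches one whitespace character at the start of the string,
# \\s$ matches one at the end, and | means "either of the two"
trimmed_once <- gsub("^\\s|\\s$", "", x)

print(trimmed_once)
print(nchar(x))             # 15
print(nchar(trimmed_once))  # 13: one character removed from each end
```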

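However, the pattern ^\\s|\\s$ removes only one space from each end.
To remove them completely, add + after each \\s, giving ^\\s+|\\s+$.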


+ means "one or more times",
so when there are several spaces in a row,
all of them are replaced.
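
As a sketch of the version with +, again using a hypothetical sample string x:

```r
# Hypothetical sample input: two spaces on each side of the text
x <- "  hello world  "

# \\s+ matches a run of one or more whitespace characters,
# so the whole leading run and the whole trailing run are removed
trimmed <- gsub("^\\s+|\\s+$", "", x)

print(trimmed)         # "hello world"
print(nchar(trimmed))  # 11
```

The space inside the text is kept, because neither ^\\s+ nor \\s+$ can match in the middle of the string.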

Reference: R gsub Function

2. The stringr package

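The stringr package provides str_trim,
which removes spaces and tabs from both ends of a string.
stringr also has many other functions for working with text;
see this post,
which is quite detailed.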
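
A short sketch of str_trim, assuming the stringr package is already installed. The sample string x is made up for illustration; the side argument picks which end to trim and defaults to "both".

```r
library(stringr)

# Hypothetical sample input with spaces and tabs on both sides
x <- " \t hello world \t "

both  <- str_trim(x)                  # trim both ends (the default)
left  <- str_trim(x, side = "left")   # trim the start only
right <- str_trim(x, side = "right")  # trim the end only

print(both)   # "hello world"
print(left)   # "hello world \t "
print(right)  # " \t hello world"
```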
