The paste function in the R language is an essential tool for concatenating strings; it joins string vectors with user-specified separators. What confused me about paste was that it has two different parameters for specifying separators: sep and collapse. This post explains how they differ and when to use each.
The sep Parameter
So what does the sep parameter do? It turns out that sep is only relevant when you are dealing with two or more input vectors. To demonstrate this, let us first make an example vector:
a <- c(1, 2, 3)
Now let us see what happens when we call paste directly on a with sep set to ", ":
paste(a, sep = ", ")
## [1] "1" "2" "3"
We see that the function simply converted the elements of the vector into strings without using sep: with only one input vector, there is nothing to separate.
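As a quick check of this claim, the sketch below compares the result against as.character() and against a call with a different separator. It assumes only base R; the "-" separator is an arbitrary value picked for illustration.

```r
a <- c(1, 2, 3)

# With a single input vector, paste() behaves like a plain conversion to strings
identical(paste(a, sep = ", "), as.character(a))
## [1] TRUE

# Changing the separator makes no difference, since there is nothing to separate
identical(paste(a, sep = ", "), paste(a, sep = "-"))
## [1] TRUE
```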
Let us make another vector:
b <- c("a", "b")
and call paste with both a and b supplied to it:
paste(a, b, sep = ", ")
## [1] "1, a" "2, b" "3, a"
We see that when supplied with more than one vector, the paste function zips the vectors together element by element, recycling the shorter ones to the length of the longest, and joins each group of elements using the separator specified by sep. The result is a vector of strings.
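To make the recycling step visible, here is a small sketch that recycles b by hand with rep_len() before pasting. The name b_recycled is my own and only serves the illustration.

```r
a <- c(1, 2, 3)
b <- c("a", "b")

# Recycle b by hand to the length of a
b_recycled <- rep_len(b, length(a))
b_recycled
## [1] "a" "b" "a"

# Pasting the hand-recycled vector matches what paste() does on its own
identical(paste(a, b, sep = ", "), paste(a, b_recycled, sep = ", "))
## [1] TRUE
```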
paste0 is a shorthand for paste(..., sep = ""):
paste0(a, b)
## [1] "1a" "2b" "3a"
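A typical use of paste0 is attaching a fixed prefix to a sequence of values. In the sketch below, the "x" prefix is a made-up example; the length-one vector is recycled just like b was above.

```r
# A length-one vector is recycled against every element of the other input
paste0("x", 1:3)
## [1] "x1" "x2" "x3"

# paste0() and paste() with an empty separator give the same result
identical(paste0("x", 1:3), paste("x", 1:3, sep = ""))
## [1] TRUE
```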
The collapse Parameter
The collapse separator is used to join the vector of strings produced with the sep parameter into a single string:
paste(a, b, sep = ", ", collapse = "; ")
## [1] "1, a; 2, b; 3, a"
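One way to see the order of operations is to split the call into two. This is only a sketch of the observable behavior, with pieces as an illustrative variable name; it says nothing about how paste is implemented internally.

```r
a <- c(1, 2, 3)
b <- c("a", "b")

# First call: sep joins the vectors element by element
pieces <- paste(a, b, sep = ", ")
pieces
## [1] "1, a" "2, b" "3, a"

# Second call: collapse joins the resulting strings into one
paste(pieces, collapse = "; ")
## [1] "1, a; 2, b; 3, a"
```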
One of the most common use cases of the paste function is to concatenate the elements of a vector with a separator. To do this, we only need to specify the collapse parameter, since sep is irrelevant when dealing with a single input vector:
paste(a, collapse = "; ")
## [1] "1; 2; 3"
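For the specific case of a comma-and-space separator, base R also offers toString(), which should give the same result as collapsing with ", ". A minimal sketch:

```r
a <- c(1, 2, 3)

# toString() joins the elements with ", "
toString(a)
## [1] "1, 2, 3"

# Same result as paste() with collapse = ", "
identical(toString(a), paste(a, collapse = ", "))
## [1] TRUE
```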
Summary
A call to the paste function with both sep and collapse specified performs the following tasks:
- It first zips together the input vectors, recycling shorter vectors as needed; one can think of this as applying cbind() or rbind() to the input vectors.
- The function then joins each zipped tuple using the separator specified by sep; this step produces a vector of strings, one for each zipped tuple.
- Lastly, the function concatenates the vector of strings produced in the previous step using the separator specified by collapse; this step produces a single string.
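The three steps above can be sketched as a small function. Note that manual_paste is a hypothetical helper written for this post, not part of base R, and it only aims to mirror paste for non-empty inputs; the real function handles more edge cases.

```r
# Hypothetical helper (not part of base R): mimics paste() for non-empty inputs
manual_paste <- function(..., sep = " ", collapse = NULL) {
  vectors <- lapply(list(...), as.character)

  # Step 1: recycle every input to the length of the longest one
  n <- max(lengths(vectors))
  recycled <- lapply(vectors, rep_len, length.out = n)

  # Step 2: join each zipped tuple with sep, giving one string per position
  joined <- Reduce(function(x, y) paste0(x, sep, y), recycled)

  # Step 3: optionally join those strings into a single string with collapse
  if (!is.null(collapse)) {
    joined <- Reduce(function(x, y) paste0(x, collapse, y), joined)
  }
  joined
}

a <- c(1, 2, 3)
b <- c("a", "b")

manual_paste(a, b, sep = ", ")
## [1] "1, a" "2, b" "3, a"

manual_paste(a, b, sep = ", ", collapse = "; ")
## [1] "1, a; 2, b; 3, a"
```

Leaving collapse as NULL skips the last step, which is why the first call returns a vector and the second a single string.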