R's paste (and paste0) Function Demystified

The paste function in the R language is an essential tool for concatenating strings; it joins string vectors together with user-specified separators. What confused me about paste at first is that it has two different parameters for specifying “separators”: sep and collapse. This post explains how they differ and when to use each.

The sep Parameter

So what does the sep parameter do? It turns out that the sep parameter is only relevant if you are dealing with two or more input vectors; to demonstrate this, let us first make an example vector:

a <- c(1, 2, 3)

Now let us see what happens when we call paste directly on a with sep set to ", ":

paste(a, sep = ", ")
## [1] "1" "2" "3"

We see that the function simply converted each element of the vector into a string and never used sep: with a single input vector, there is nothing to put a separator between.

Let us make another vector:

b <- c("a", "b")

and call paste with both a and b supplied to it:

paste(a, b, sep = ", ")
## [1] "1, a" "2, b" "3, a"

We see that when supplied with more than one vector, the paste function zips the vectors together element by element, recycling the shorter ones to the length of the longest (which is why "a" reappears in the third element), and joins each group of elements using the separator specified by sep; this produces a vector of strings.
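
The recycling described above is easiest to see when one input has length one. The following is a small sketch; the names prefix and idx are made up for illustration and do not appear elsewhere in this post:

```r
# Hypothetical inputs: a length-1 prefix and a length-3 index vector.
prefix <- "x"
idx <- 1:3

# The length-1 vector is recycled to match the longest input.
labels <- paste(prefix, idx, sep = "_")
labels
## [1] "x_1" "x_2" "x_3"

# With three inputs, sep goes between every adjacent pair of elements.
triples <- paste(prefix, idx, c("a", "b", "c"), sep = "-")
triples
## [1] "x-1-a" "x-2-b" "x-3-c"
```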

paste0 is a shorthand for paste(..., sep = ""):

paste0(a, b)
## [1] "1a" "2b" "3a"
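
As a quick check of that equivalence, the sketch below compares the two calls directly. It also shows what happens when sep is left out of paste altogether: the default separator is a single space.

```r
a <- c(1, 2, 3)
b <- c("a", "b")

# paste0 and paste with an empty sep give the same result.
same <- identical(paste0(a, b), paste(a, b, sep = ""))
same
## [1] TRUE

# Without an explicit sep, paste falls back to its default, a single space.
spaced <- paste(a, b)
spaced
## [1] "1 a" "2 b" "3 a"
```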

The collapse Parameter

The collapse separator is applied after sep: it joins the vector of strings produced with the sep parameter into a single string:

paste(a, b, sep = ", ", collapse = "; ")
## [1] "1, a; 2, b; 3, a"

One of the most common use cases of the paste function is to concatenate the elements in a vector with a separator. To do this, we only need to specify the collapse parameter since sep is irrelevant when dealing with a single input vector:

paste(a, collapse = "; ")
## [1] "1; 2; 3"
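
To confirm that sep really is ignored in the single-vector case, the sketch below passes both parameters. The second half is a made-up example (the fruits vector is hypothetical) of building a message from a vector, which is where collapse tends to show up in practice:

```r
a <- c(1, 2, 3)

# sep has no effect with one input vector; only collapse is used.
ignored_sep <- paste(a, sep = ", ", collapse = "; ")
ignored_sep
## [1] "1; 2; 3"

# Hypothetical data: collapse a vector into one string, then prepend a label.
fruits <- c("apple", "banana", "cherry")
msg <- paste0("Fruits: ", paste(fruits, collapse = ", "))
msg
## [1] "Fruits: apple, banana, cherry"
```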

Summary

A call to the paste function with both sep and collapse specified performs the following tasks:

  1. It first zips together the input vectors, recycling the shorter ones to the length of the longest; one can think of this as applying cbind() or rbind() to the input vectors.
  2. The function then joins together each zipped tuple using the separator specified by sep; this step produces a vector of strings, one for each zipped tuple.
  3. Lastly, the function concatenates the vector of strings produced in the previous step using the separator specified by collapse; this step produces a single string.
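
The three steps above can be sketched by hand, without calling paste for the joining itself. This is only an illustration of the mental model for two inputs, not how paste is implemented internally; rep_len, sprintf and Reduce are stand-ins chosen for this sketch.

```r
a <- c(1, 2, 3)
b <- c("a", "b")

# Step 1: convert to character and recycle both inputs to a common length.
n <- max(length(a), length(b))
a_rec <- rep_len(as.character(a), n)
b_rec <- rep_len(b, n)

# Step 2: join each zipped pair with the sep string ", ".
pairs <- sprintf("%s, %s", a_rec, b_rec)
pairs
## [1] "1, a" "2, b" "3, a"

# Step 3: fold the vector into one string using the collapse string "; ".
manual <- Reduce(function(acc, s) sprintf("%s; %s", acc, s), pairs)
manual
## [1] "1, a; 2, b; 3, a"

# The hand-rolled result matches the real call.
matches <- identical(manual, paste(a, b, sep = ", ", collapse = "; "))
matches
## [1] TRUE
```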