The sub() and gsub() Functions in R With Examples

You can replace a string or characters in a vector or a data frame column using the sub() and gsub() functions in R.

Hello folks, we are going to focus on two of the most useful functions in R: sub() and gsub().

The sub() and gsub() functions in R replace a matched string with a string you specify. Both functions also accept regular expressions as the pattern. Cool, right?

Let’s move forward and explore these functions using relevant illustrations.


Syntax of sub() and gsub()

sub() and gsub(): These are the functions used for string substitution in R. They replace text that matches a pattern in a character vector, such as a data frame column, with a string you specify.
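The basic call shape of both functions can be sketched as follows. The variable names and the sample string are illustrative assumptions, not taken from the original post:

```r
# Call shape: sub(pattern, replacement, x) and gsub(pattern, replacement, x)
text <- "a-b-c"                       # illustrative input string

first_only  <- sub("-", "+", text)    # replaces the first match only
all_matches <- gsub("-", "+", text)   # replaces every match

print(first_only)   # "a+b-c"
print(all_matches)  # "a+b+c"
```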

Where,

  • pattern = The pattern or the string that you want to be replaced.
  • replacement = An input string to replace the matched pattern.
  • x = A character vector (for example, a data frame column) in which to replace the strings.

The sub() function in R

The sub() function in R replaces a matched string in a character vector, such as a data frame column, with the string you specify.

When you are dealing with large data sets, it is impractical to look at each line to find and replace the target words or strings.

In this case, the sub() function does the replacement for you.

But the limitation of the sub() function is that it replaces only the first match in each string and leaves all later matches unchanged.

Complicated? Don’t worry. Let’s illustrate this using a simple example.


1. A simple implementation of sub() function

In this example, we are going to replace the string with our input string in a vector. Let’s see how it goes.
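A minimal sketch of such a call is shown here. The sample sentence and variable names are made up for illustration:

```r
# Illustrative sample data (an assumption, not the original post's text)
sentence <- c("R is an open-source language for statistical computing.")

# Replace the first 'R' with 'R language'
result <- sub("R", "R language", sentence)
print(result)
# "R language is an open-source language for statistical computing."
```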

In the above example, you can see that the sub() function replaces the string ‘R’ in the vector with the ‘R language’ string which is specified in the code as a replacement.

Let’s go for another sample to understand it even better.
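Here is a small sketch with a string that contains ‘Earth’ twice. The sentence itself is a made-up sample:

```r
# Illustrative sample: 'Earth' appears twice in one string
topics <- "Earth science is fun. Earth science is also useful."

# sub() touches only the first match in the string
replaced <- sub("Earth", "Planetary", topics)
print(replaced)
# "Planetary science is fun. Earth science is also useful."
```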

In this example, you can observe that the sub() function replaced the first occurrence of the string ‘Earth’ with ‘Planetary’. But in its next occurrence, the string remains the same.

As discussed above, the sub() function does not replace all the matches; it replaces only the first occurrence of the string.

I hope, by now it is clear to you.


2. sub() function with a data frame

The sub() function behaves the same way with data frames.

It changes only the first occurrence and leaves the others as they are.

Let’s see how it works!

For this, we have to create a data frame first. Then we can use the sub() function to get the results.
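A sketch of this step, assuming a hypothetical data frame whose column names and values are invented for illustration:

```r
# Hypothetical data frame (names and values are assumptions)
df <- data.frame(
  item  = c("Gold", "Grapes", "Glue"),
  place = c("Goa", "Ghana", "Greece"),
  stringsAsFactors = FALSE
)

# Passing the whole data frame: sub() coerces x with as.character(),
# which turns each column into one string before matching
whole <- sub("G", "A", df)
print(whole)  # a character vector with one element per column
```

Because each column becomes a single string, only the first ‘G’ in each column changes.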

You can see that the function changes the first occurrence and leaves the others. Note that we passed the entire data frame here, so sub() coerces each column to a single string and replaces only the first match in it.

But you can select a particular column instead, so that the first ‘G’ in each of its values is replaced by ‘A’.
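Applied to a single column, the call could look like this. The data frame is again a hypothetical sample:

```r
# Hypothetical data frame with one character column
df <- data.frame(item = c("Gold", "Grapes", "Glue"), stringsAsFactors = FALSE)

# Replace the first 'G' in every value of the column
df$item <- sub("G", "A", df$item)
print(df$item)  # "Aold" "Arapes" "Alue"
```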

Like this, you can easily substitute values in a data frame.

In the next section, we are going to see how the gsub() function can be used in R.


The gsub() function in R

The gsub() function in R is also used for replacement operations. The function takes the input and replaces every match of the pattern with the specified value.

Like sub(), the gsub() function treats the pattern as a regular expression by default; set fixed = TRUE to match it as a literal string.

A regular expression is just a series of characters that represents a search pattern in the data.
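To see what treating the pattern as a search pattern means in practice, here is a small sketch that compares a regular-expression match with a literal match. The sample string is an assumption; fixed = TRUE is a standard argument of gsub():

```r
version <- "v1.2.3"   # illustrative input

# As a regular expression, "." matches any single character
as_regex <- gsub(".", "-", version)

# With fixed = TRUE, "." matches only a literal dot
as_literal <- gsub(".", "-", version, fixed = TRUE)

print(as_regex)    # "------"
print(as_literal)  # "v1-2-3"
```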

In the sections below, you can see the applications and usage of the gsub() function in R.


1. A simple implementation of gsub() function

The gsub() function in R replaces every matching string with the input string or value. Note that you can also use regular expressions with the gsub() function to deal with numbers.

Consider data in which ‘R’ is written multiple times. We are going to replace each ‘R’ with ‘R programming’ in both sentences using the gsub() function.
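A minimal sketch, using two made-up sentences as the data:

```r
# Illustrative sample data: 'R' appears in both sentences
sentences <- c("R is popular, and R is free.",
               "Many analysts learn R first.")

# gsub() replaces every match in every element
updated <- gsub("R", "R programming", sentences)
print(updated)
# "R programming is popular, and R programming is free."
# "Many analysts learn R programming first."
```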

Fantastic!

See how quickly the word ‘R’ in both sentences gets replaced by ‘R programming’.

The gsub() function finds every match of the pattern and replaces it with our input word or value.


2. gsub() function with regular expression

As the heading suggests, you can use regular expressions with the gsub() function without any hassle.

You can remove the numbers from the data using regular expressions.

Regular expressions (regex): Also called rational expressions, they are a sequence of characters that defines a search pattern. They are most commonly used by searching algorithms and originated in formal language theory in computer science.

Let’s see how it works.
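One way to sketch this is with the character class [0-9], which matches any digit. The sample values are invented for illustration:

```r
# Illustrative sample data with digits mixed into the text
codes <- c("room101", "4you", "a1b2c3")

# Replace every run of digits with an empty string
no_digits <- gsub("[0-9]+", "", codes)
print(no_digits)  # "room" "you" "abc"
```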

So, basically, the gsub() function searches for the numbers in the data and substitutes them with an empty string; in other words, it eliminates the numbers.


3. The gsub() function with data frames

Like the sub() function, gsub() is used to substitute values with the input values. One interesting application is shown below, which illustrates the usefulness of the gsub() function in R.

Let’s roll!!!

Suppose our input data is a list of speakers and their ages.

Now, we are going to use a regular expression with gsub() to add a ‘Mr/Mrs.’ prefix at the start of each speaker name. Let’s do it together.
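A sketch of this idea, assuming a hypothetical data frame of speakers (the names and ages are invented). The regular expression "^" matches the start of each string, so the replacement is inserted in front of the name:

```r
# Hypothetical input data: speakers and their ages
speakers <- data.frame(
  name = c("Alex", "Jordan", "Sam"),
  age  = c(34, 41, 29),
  stringsAsFactors = FALSE
)

# "^" anchors the match at the start of each name
speakers$name <- gsub("^", "Mr/Mrs. ", speakers$name)
print(speakers$name)  # "Mr/Mrs. Alex" "Mr/Mrs. Jordan" "Mr/Mrs. Sam"
```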

Awesome. You did it.

See how easily you have added the expression in front of the speaker names. Cool, right?


Wrapping Up

The sub() and gsub() functions in R are used for substitution and replacement operations.

The sub() function replaces the first occurrence and leaves the others as they are. On the other hand, the gsub() function replaces every matching string or value with the input string.

Although there are not many differences between them, you can choose whichever fits your task.

I hope you now have a better understanding of the sub() and gsub() functions in R. That’s all for now. Happy substituting!!!

More reading: R documentation

By admin
