The substring() function in R - Things to know With Examples

The substring() function in R is widely used to extract characters from a string or to replace them. You can easily pull out the characters you need and also overwrite values inside a string.

Hello folks, hope you are doing good. Today let’s focus on the substring() function in R.

The substring() Function Syntax

With substring operations we can do several things, such as extracting values and replacing values. For this we use the functions substr(x, start, stop) and substring(text, first, last).

Where:

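  • x / text = the input character vector (a single string or several strings).
  • start / first = the index of the first character of the substring (indexing starts at 1).
  • stop / last = the index of the last character of the substring (inclusive).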
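To make the arguments concrete, here is a minimal sketch using a hypothetical sample string (it is not from the original examples). It also shows that the last argument of substring() can be left out, in which case the extraction runs to the end of the string.

```r
# Hypothetical sample string, used only for illustration
x <- "statistics"

# substr() needs both a start and a stop index
a <- substr(x, 1, 4)
print(a)   # "stat"

# substring() takes the same positions as first and last
b <- substring(x, 1, 4)
print(b)   # "stat"

# last defaults to a very large number, so this runs to the end
rest <- substring(x, 5)
print(rest)   # "istics"
```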

Well, I hope that you are pretty much clear about the syntax. Now, let’s extract some characters from the string using our substring() function in R.

Output = “Journal_dev”

Output = “Journal”

Congratulations, you just extracted data from the given string. As you can observe, the substring() function in R takes the start/first and stop/last positions as arguments and returns the characters between them, both ends included.
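The input string for these outputs is not shown here, so the following is only a minimal sketch. It assumes a sample string, "Journal_dev_private_limited", chosen so that the results match the two outputs above.

```r
# Assumed sample input; the original string is not shown in the text
df <- "Journal_dev_private_limited"

# Characters 1 through 11
first_eleven <- substring(df, 1, 11)
print(first_eleven)   # "Journal_dev"

# Characters 1 through 7
first_seven <- substring(df, 1, 7)
print(first_seven)    # "Journal"
```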


Replace using substring() function in R

With the help of the substring() function, you can also replace characters in a string with your desired values. Sounds interesting, right? Let’s see how it works.

Output = “We are developers”

Output = “R is a language made for statistical analysis”

Great, you did it! In this way, you can replace characters in a string with your desired value.

In the above case, the ‘_’ (underscore) in the first string and the “=” (equal sign) in the second string were each replaced with a ” ” (space).
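As a minimal sketch of the replacement above: the input strings below are assumptions, picked so that replacing one character in each gives the two outputs shown. The replacement form is written as an assignment to substring().

```r
# Assumed inputs; picked so the results match the outputs above
s1 <- "We are_developers"
s2 <- "R=is a language made for statistical analysis"

# Assigning to substring() overwrites characters in place
substring(s1, 7, 7) <- " "   # position 7 holds the underscore
substring(s2, 2, 2) <- " "   # position 2 holds the equal sign

print(s1)   # "We are developers"
print(s2)   # "R is a language made for statistical analysis"
```

Note that this kind of replacement overwrites character for character, so the string keeps its original length.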


String replacement using substring() function

So far, so good! But what if you need to make the same replacement in every string of a vector?

Don’t worry! substring() is vectorized, so a single replacement is applied to all the strings at once.

Let’s see how it works!

Output = “Alo$” “Jos$ph” “Hay$to” “Kel$y” “Pal$ma” “Moc$”

Oh, what happened? The 4th character of every string has been replaced by the ‘$’ sign!

Well, that is substring() for you. It can replace the marked positions with our given value.

In the above case, the 4th character of each input string was replaced by the ‘$’ sign by the substring() function. It’s incredible, right? I say yes. What about you?
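Here is a minimal sketch of that vectorized replacement. The names in the input vector are assumptions (the original 4th letters cannot be recovered from the output), chosen so that the result matches the output above.

```r
# Assumed input vector; the original names are not shown in the text
df <- c("Alok", "Joseph", "Hayato", "Kelly", "Paloma", "Moca")

# Replace the 4th character of every element with "$"
substring(df, 4, 4) <- "$"

print(df)
# "Alo$"   "Jos$ph" "Hay$to" "Kel$y"  "Pal$ma" "Moc$"
```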


The use of substr() and str_sub() functions in R

So far we have worked with standalone strings and character vectors. Now we will look at extracting characters from a column of a data frame as well.

Let’s see how it works!

We can create a data frame with sample data in 2 columns, namely Technologies and Popularity. Let’s extract some specific characters out of this data. It will be fun.

Once the data frame is created, let’s extract some text. To do so, use the substr() function in R to extract characters 8 to 10 from every string in the Technologies column and store them in a new column.

The result is the same data frame with a new column that holds the extracted characters. Like this, you can extract any range of characters by specifying the index values.
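The data frame itself is not shown here, so the following is a minimal sketch with assumed sample values. The column names Technologies and Popularity follow the text; the row values and the new column name Extracted_Technologies are hypothetical.

```r
# Assumed sample data; only the column names come from the text
df <- data.frame(
  Technologies = c("Datascience", "machinelearning",
                   "Deeplearning", "Artificialintelligence"),
  Popularity   = c("70%", "85%", "90%", "95%"),
  stringsAsFactors = FALSE
)

# Extract characters 8 to 10 of every Technologies value
# into a new (hypothetically named) column
df$Extracted_Technologies <- substr(df$Technologies, 8, 10)

print(df$Extracted_Technologies)
# "enc" "lea" "rni" "ial"
```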


The use of str_sub() function in R

We saw the substr() function in action. Now we will look at the str_sub() function and how it extracts characters.

Let’s roll!

Again, we are going to create the same data frame with the Technologies and their popularity.

Now let’s make use of the str_sub() function from the stringr package, which returns the characters at the given positions. A substring in R can be taken in many ways, and this is one of them.

As you can see, the str_sub() function extracts the indexed characters and returns them just like substr() does.
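A minimal sketch of str_sub(), assuming the stringr package is installed. The sample values and the index ranges are assumptions, since the original ones are not shown here. One handy difference from substr() is that str_sub() accepts negative positions, which count from the end of the string.

```r
library(stringr)   # str_sub() lives in the stringr package

# Assumed sample data; only the column names come from the text
df <- data.frame(
  Technologies = c("Datascience", "machinelearning",
                   "Deeplearning", "Artificialintelligence"),
  Popularity   = c("70%", "85%", "90%", "95%"),
  stringsAsFactors = FALSE
)

# Characters 1 to 4 (the index range is an assumption)
first_four <- str_sub(df$Technologies, 1, 4)
print(first_four)   # "Data" "mach" "Deep" "Arti"

# Negative start: the last three characters of each string
last_three <- str_sub(df$Technologies, -3)
print(last_three)   # "nce" "ing" "ing" "nce"
```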


Wrapping Up

Yes, taking or generating a substring of a given string is quite an easy task, thanks to functions like substr(), substring(), and str_sub(), which make substringing interesting and exciting.

That’s all for now. Don’t forget to make use of this amazing function in your computation. Happy sub-stringing!!!

More study: R documentation

By admin
