In this article, we’ll take a look at using the strtok() and strtok_r() functions in C.
These functions are useful when you want to tokenize a string, that is, split it into pieces (tokens) separated by delimiter characters.
Let’s see how each one works, with examples.
Using the strtok() function
First, let’s look at the strtok() function.
This function is declared in the <string.h> header file, so you must include it in your program.

```c
#include <string.h>

char* strtok(char* str, const char* delim);
```
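This takes in an input string str and a delimiter string delim. Every character in delim is treated as a delimiter.
strtok() splits str into tokens at those characters. It does this in place, overwriting each delimiter it finds with '\0', so str must be a modifiable array and not a string literal.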
We might expect strtok() to return a list of strings, but it returns only a single string. Why is this?
The reason is how the function handles the tokenization. The first call, strtok(input, delim), returns only the first token and remembers where it stopped.
To get the remaining tokens, we call strtok(NULL, delim) repeatedly. Passing NULL tells the function to continue with the same string, and each call returns the next token.
When no tokens are left, strtok() returns NULL, so we keep calling strtok(NULL, delim) until it returns NULL.
Seems confusing? Let’s look at an example to clear it up!

```c
#include <stdio.h>
#include <string.h>

int main() {
    // Our input string (a writable array, since strtok() modifies it)
    char input_string[] = "Hello from JournalDev!";

    // Our output token list (room for 20 tokens of up to 19 characters each)
    char token_list[20][20];

    // We call strtok(input, delim) to get our first token
    // Notice the double quotes on delim! It is a string, even with a single character
    char* token = strtok(input_string, " ");

    int num_tokens = 0; // Index to token list. We will append to the list

    while (token != NULL) {
        // Keep getting tokens until we receive NULL from strtok()
        strcpy(token_list[num_tokens], token); // Copy to token list
        num_tokens++;
        token = strtok(NULL, " "); // Get the next token. Notice that input=NULL now!
    }

    // Print the list of tokens
    printf("Token List:\n");
    for (int i = 0; i < num_tokens; i++) {
        printf("%s\n", token_list[i]);
    }

    return 0;
}
```
So, we have our input string "Hello from JournalDev!", and we’re trying to tokenize it by spaces.
We get the first token using strtok(input_string, " "). Notice the double quotes: the delimiter argument is a string, even when it holds a single character.
Afterwards, we keep getting tokens using strtok(NULL, " "), looping until strtok() returns NULL.
Let’s look at the output now.
Output
```
Token List:
Hello
from
JournalDev!
```
Indeed, we seem to have got the correct tokens!
Similarly, let’s now look at using strtok_r().
Using the strtok_r() function
This function is very similar to strtok(). The key difference is the _r suffix, which means that this is the re-entrant version.
strtok() remembers its position in the string in hidden state inside the library, typically a static variable. Because that state is shared, you cannot tokenize two strings at the same time, and calling strtok() from several threads at once is unsafe.
A re-entrant function keeps no such hidden state, so a call can be interrupted, or another call can start, without corrupting the one in progress.
strtok_r() achieves this with an extra parameter, called the context, in which the caller stores the position. Each tokenization uses its own context, so the function can always resume from the right place. This also makes it safe to use from multiple threads, as long as they don’t share the same string and context.
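NOTE: strtok_r() is specified by POSIX, so it is available on Linux and macOS. Microsoft’s C runtime on Windows does not provide it; the equivalent function there is strtok_s(), which takes the same three arguments.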
```c
#include <string.h>

char *strtok_r(char *str, const char *delim, char **context);
```
The context parameter is a pointer to a char* variable, which strtok_r() uses to save its position between calls.
Usually, we just declare a char* variable ourselves and pass its address. We should not modify it between calls.
Let’s look at the same example as for strtok(), now using strtok_r() (or strtok_s() on Windows).

```c
#include <stdio.h>
#include <string.h>

int main() {
    // Our input string (a writable array, since strtok_r() modifies it)
    char input_string[] = "Hello from JournalDev!";

    // Our output token list (room for 20 tokens of up to 19 characters each)
    char token_list[20][20];

    // A pointer, which will be used as the context variable
    // Initially, we will set it to NULL
    char* context = NULL;

    // We pass the address of the context variable, so that strtok_r()
    // can store its position there and refer to it in later calls
    char* token = strtok_r(input_string, " ", &context);

    int num_tokens = 0; // Index to token list. We will append to the list

    while (token != NULL) {
        // Keep getting tokens until we receive NULL from strtok_r()
        strcpy(token_list[num_tokens], token); // Copy to token list
        num_tokens++;
        token = strtok_r(NULL, " ", &context); // We pass the same context variable again
    }

    // Print the list of tokens
    printf("Token List:\n");
    for (int i = 0; i < num_tokens; i++) {
        printf("%s\n", token_list[i]);
    }

    return 0;
}
```
Output
```
Token List:
Hello
from
JournalDev!
```
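While we get the same output, this version is safer: its state lives in our own context variable, so it keeps working when several tokenizations are in progress at once or when multiple threads are involved.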
Conclusion
In this article, we learned how to use the strtok() and strtok_r() functions in C to tokenize strings.
For similar content, do go through our tutorial section on C programming!