Before we delve deeper into R programming, it is important to understand the various data types used in R. R has the following basic data types:
- Character
- Numeric
- Integer
- Logical
- Complex
Apart from these basic types, R also supports several data structures that are constructed from the basic types. These are:
- Vectors
- Matrices
- Arrays
- Dataframes
- Lists
- Factors
This tutorial will help you understand the basic data types. The working of each complex data structure will be covered in more elaborate tutorials.
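As a quick preview of the five basic types listed above, the following sketch assigns one literal of each type and asks class() for its name. The variable names are made up for this illustration and are not used elsewhere in the tutorial.

```r
# One literal of each basic type (variable names are illustrative only).
chr_value <- "hello"
num_value <- 2.5
int_value <- 7L        # the L suffix requests an integer
lgl_value <- TRUE
cpx_value <- 1 + 2i

# class() reports the data type of each value.
class(chr_value)   # "character"
class(num_value)   # "numeric"
class(int_value)   # "integer"
class(lgl_value)   # "logical"
class(cpx_value)   # "complex"
```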
Numeric Data Types in R
Since R is predominantly designed for statistical purposes, the numeric and integer data types are widely used. We will begin with the most frequently used of the two, numeric. As the name indicates, numeric corresponds to any numerical value, with or without a fractional part. R stores these values as double-precision floating-point numbers.
Before we assign our first variable, please note that the assignment operator in R is <-, a less-than symbol followed by a hyphen. The = sign also works for simple assignments and appears in some of the later examples.
Begin by assigning arbitrary numeric values to the variables a, b, and c. The typeof() function reports how a variable is stored internally, whereas the class() function reports which class the variable belongs to. R also offers checking functions whose names start with "is". For example, is.numeric() returns a Boolean value indicating whether its argument is of the numeric data type. There are corresponding is functions for the other classes as well.
Run the following code in your script editor. Place the cursor on the first line and execute each line by pressing Ctrl+Enter. Observe how the variables appear in your environment window as you run each line.
a <- 3
b <- 5.4
c <- -9.5
typeof(a)
typeof(b)
typeof(c)
is.integer(a)
is.numeric(b)
is.numeric(c)
class(a)
Output:
> a <- 3
> b <- 5.4
> c <- -9.5
> typeof(a)
[1] "double"
> typeof(b)
[1] "double"
> typeof(c)
[1] "double"
> is.integer(a)
[1] FALSE
> is.numeric(b)
[1] TRUE
> is.numeric(c)
[1] TRUE
> class(a)
[1] "numeric"
Note that typeof() reports all three variables as double, whether or not they were written with a decimal point. The variable a with value 3 is stored as the double 3.0, which is why is.integer(a) returns FALSE. The class(a) call reports the variable's class as numeric.
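The is functions for the other classes work the same way as is.numeric(). The short sketch below tries a few of them on literal values. The variable names are chosen only for this example.

```r
# Each is.* function returns TRUE or FALSE for the value it is given.
check_character <- is.character("text")   # TRUE
check_logical   <- is.logical(FALSE)      # TRUE
check_complex   <- is.complex(2 + 3i)     # TRUE
check_double    <- is.double(3)           # TRUE: 3 is stored as a double

# A quoted value is a character, so it is not numeric.
check_numeric   <- is.numeric("text")     # FALSE

print(c(check_character, check_logical, check_complex,
        check_double, check_numeric))
```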
Integer
Without any specification, R treats a numerical value as numeric. Therefore, an integer variable has to be requested explicitly. One way to do this is the as.integer() function.
a<-as.integer(a)
typeof(a)
is.integer(a)
is.numeric(a)
As soon as you run the first line of the code, the value of a shown in the environment window becomes 3L instead of 3. The suffix L marks an integer value. R stores integers in 32 bits, so they range from about -2*10^9 to +2*10^9. The variable a is now both an integer and a numeric, since every integer is also numeric, but not vice versa.
Output:
> a<-as.integer(a)
> typeof(a)
[1] "integer"
> is.integer(a)
[1] TRUE
> is.numeric(a)
[1] TRUE
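An integer can also be written directly with the L suffix instead of being converted. The sketch below shows this, along with what as.integer() does to a fractional value and the largest integer R can hold. The variable names are illustrative only.

```r
# Writing the L suffix creates an integer without a conversion step.
count <- 3L
typeof(count)                   # "integer"

# as.integer() drops the fractional part rather than rounding.
truncated <- as.integer(5.9)
truncated                       # 5

# The largest value an R integer can store (32-bit).
largest <- .Machine$integer.max
largest                         # 2147483647
```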
Character
The character data type in R is used for both strings and single characters. A character value is specified by enclosing it in a pair of single or double quotes, as below.
name="journaldev"
typeof(name)
name2='journaldev2'
typeof(name2)
Output:
> name="journaldev"
> typeof(name)
[1] "character"
> name2='journaldev2'
> typeof(name2)
[1] "character"
An integer or numeric value can be converted into a character using the as.character() function.
average=0.558
typeof(average)
average=as.character(average)
typeof(average)
Output:
> average=0.558
> typeof(average)
[1] "double"
> average=as.character(average)
> typeof(average)
[1] "character"
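The conversion also works in the other direction with as.numeric(). The sketch below is a minimal illustration with made-up variable names. It assumes the text actually looks like a number; otherwise R returns NA with a warning, which is suppressed here only to keep the output clean.

```r
# Number to text.
score_text <- as.character(42L)
score_text                      # "42"

# Text back to a number.
score_num <- as.numeric("0.558")
typeof(score_num)               # "double"

# Text that is not a number converts to NA (R also issues a warning).
not_a_number <- suppressWarnings(as.numeric("abc"))
is.na(not_a_number)             # TRUE
```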
Logical
Logical is the data type that stores the Boolean values TRUE and FALSE, which arise as the result of logical operations. We have already encountered TRUE and FALSE as the outputs of the is.numeric() and is.integer() calls above. Let us verify their type using the following statement.
typeof(is.integer(average))
Output:
> typeof(is.integer(average))
[1] "logical"
Similarly, variables of logical type can be combined using the logical operators AND, OR, and NOT.
x=TRUE; y=FALSE
x&y #Logical AND
x|y #Logical OR
!x #Logical NOT
!y #Logical NOT
Output:
> x=TRUE; y=FALSE
> x&y
[1] FALSE
> x|y
[1] TRUE
> !x
[1] FALSE
> !y
[1] TRUE
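Logical values most often come from comparisons rather than from typing TRUE or FALSE directly. The following sketch, with illustrative variable names, stores the results of two comparisons and combines them with the same operators.

```r
# Comparisons return logical values.
is_greater <- 5 > 3             # TRUE
is_equal   <- 2 == 4            # FALSE
typeof(is_greater)              # "logical"

# The stored results can be combined like any other logical variables.
both_true   <- is_greater & is_equal   # FALSE
either_true <- is_greater | is_equal   # TRUE
print(c(both_true, either_true))
```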
Complex
The final basic data type we discuss here is the complex type. This is used to represent complex numbers in mathematics. Complex numbers are of the form a+bi, where a and b are real numbers and i is the imaginary unit.
complex=4+2i
typeof(complex)
Output:
> complex=4+2i
> typeof(complex)
[1] "complex"
If we wish to calculate the square root of -1, known mathematically as the imaginary unit i, we cannot do so by calling the normal sqrt (square root) function on the real number -1.
typeof(sqrt(-1)) #Throws a warning message.
Output:
Warning message:
In sqrt(-1) : NaNs produced
The calculation becomes possible if we define -1 as a complex number first.
sqrt(-1+0i)
typeof(sqrt(-1+0i))
Output:
> sqrt(-1+0i)
[1] 0+1i
> typeof(sqrt(-1+0i))
[1] "complex"
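Base R also provides helper functions for building and inspecting complex numbers. The sketch below, using the illustrative names z and root, builds the same 4+2i value with complex(), extracts its parts with Re() and Im(), and converts -1 with as.complex() before taking the square root.

```r
# Build a complex number from its real and imaginary parts.
z <- complex(real = 4, imaginary = 2)
z                               # 4+2i

# Extract the two parts again.
Re(z)                           # 4
Im(z)                           # 2

# as.complex() is another way to turn -1 into a complex number.
root <- sqrt(as.complex(-1))
typeof(root)                    # "complex"
```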
This ends our discussion of the basic data types in R.