R is an open source programming language and software environment for statistical computing and graphics that is supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and for data analysis.
In this blog, we will go through getting started with R and the basic commands in R language.
We also discuss the Vector data type in detail, with a screenshot of each command and its output as proof of execution.
Relatively high-profile users of R include:
Facebook: Used by some within the company for tasks such as analyzing user behavior.
Google: There are more than 500 R users at Google, according to David Smith at Revolution Analytics, doing tasks such as making online advertising more effective.
National Weather Service: Flood forecasts.
Orbitz: Statistical analysis to suggest best hotels for promotion to its users.
Trulia: Statistical modeling.
Beginners who do not have R installed, please go to the link below:
This will download an .exe file, which you run to install R on your Windows system. Also, download and install RStudio on your desktop. This gives you an environment in which to work with the R language.
After installing RStudio, you will get a welcome page, as shown in the screenshot below. In this post, we use RStudio to execute all the basic R commands.
Create a new R script to open the script editor in RStudio. Follow the screenshot below:
This will give you the screen below:
You can now start working in RStudio. We have attached screenshots with every command for you to practice with and refer to for better understanding.
Note: Check each command after copying it into RStudio. Web pages often turn straight quotation marks into curly ones, and R reports an error when it meets them in code, so retype the quotes if a pasted command fails.
Change your working directory with the setwd() function. Note that the slashes have to be either forward or double backward slashes. For Windows, the command might look something like:
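A minimal sketch of such a call is shown here. The Windows paths are hypothetical placeholders, so they are left commented out; the runnable part switches to R's temporary directory, which exists on every machine, and then switches back.

```r
# Hypothetical Windows paths; replace with a folder that exists on your machine.
# setwd("C:/Users/yourname/Documents")      # forward slashes
# setwd("C:\\Users\\yourname\\Documents")   # doubled backslashes

old_dir <- getwd()    # remember the current working directory
setwd(tempdir())      # tempdir() always exists, so this call is safe to try
new_dir <- getwd()    # confirm where we are now
setwd(old_dir)        # switch back to where we started
```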
2: Install Packages
Syntax: install.packages("package name")
You can do pretty much anything in R using the 10,000+ packages available on CRAN, the Comprehensive R Archive Network.
The command for installing a package is: install.packages("thepackagename"); e.g., install.packages("sqldf")
If you don’t want to type the command yourself, RStudio has a “Packages” tab on the lower right of the window. Click it and you’ll see an “Install Packages” button.
3: Update Packages
To update packages, you can run the following to get the latest versions of all your installed packages:
To do it all at once:
Another dialogue box will appear. Press “Update.”
To remove a package on your system, type in the following:
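The package commands from the last few steps might look like the sketch below. The install, update, and remove calls need network access and change your library, so they are shown commented out; the helper `is_installed` is a hypothetical name introduced only for this illustration.

```r
# Run these one at a time when you are ready (they change your R library):
# install.packages("sqldf")      # install a package from CRAN
# update.packages()              # check for newer versions, asking before each update
# update.packages(ask = FALSE)   # update everything without prompting
# remove.packages("sqldf")       # uninstall a package

# Safe to run anywhere: check whether a package is already installed.
is_installed <- function(pkg) pkg %in% rownames(installed.packages())
is_installed("stats")            # "stats" ships with every R installation
```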
5: Function Help
If you want to find out more about a function, type: ?functionName
This is a shortcut to the help function, which uses parentheses. E.g.: help(sqldf)
As you see, both give exactly the same details.
Objects obtain values in R by assignment (‘x gets a value’).
This is done with either <- or =
Thus, to create a scalar constant x with value 6, we type:
x <- 6 or x = 6
Similarly, to create a character constant y with value "a", we type:
y <- "a" or y = "a"
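As a quick check, both assignment forms can be tried in the console. This is only a small sketch using the same names x and y as above.

```r
x <- 6       # assignment with the arrow operator
y = "a"      # assignment with the equals sign
x            # typing a name prints its value
y
class(x)     # "numeric"
class(y)     # "character"
```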
The operators available in R are listed below. We hope you already understand how each of these operators is used.
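For reference, here is a short sketch of the basic arithmetic operators. The selection is ours and is not meant to be a complete list.

```r
7 + 2    # addition        -> 9
7 - 2    # subtraction     -> 5
7 * 2    # multiplication  -> 14
7 / 2    # division        -> 3.5
7 ^ 2    # exponentiation  -> 49
```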
8: Modulo and Integer Quotients
Syntax: x %/% y
To know the integer part of a division, say, how many 2s are there in 50, type in the following:
Syntax: x %% y
To know the remainder (what is left over when 50 is divided by 2):
In math, this is known as modulo.
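The two operators can be tried together. The first pair uses the numbers from the text; the second pair (17 and 5) is an extra example chosen so that the remainder is not zero.

```r
quotient  <- 50 %/% 2    # integer quotient: how many 2s fit into 50 -> 25
remainder <- 50 %% 2     # remainder (modulo)                        -> 0

q2 <- 17 %/% 5           # -> 3
r2 <- 17 %% 5            # -> 2, because 17 = 3 * 5 + 2
```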
The screen prompt > is an invitation to put R to work.
A command line entered at the console can hold up to about 4095 bytes.
Two or more expressions can be on a single line, as long as they are separated by semi-colons:
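A small sketch of several expressions on one line; the variable names are arbitrary.

```r
a <- 2 + 3; b <- 5 * 7; d <- 3 - 7   # three assignments on a single line
a; b; d                              # prints 5, then 35, then -4
```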
10: In-Built Functions
The log function gives logs to the base e (e = 2.718282), for which the antilog function is exp.
log(10) returns 2.302585
exp(1) returns 2.718282
For very big or small numbers, R uses the following scheme:
1.9e3 means 1900, i.e. 1.9 multiplied by 1000
1.9e-2 means 0.019, i.e. 1.9 multiplied by 1/100
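These built-in functions and the exponent notation can be checked directly in the console, for example:

```r
log_ten <- log(10)    # natural logarithm of 10
e_value <- exp(1)     # the constant e
log(exp(1))           # exp undoes log, so this is 1

big   <- 1.9e3        # 1.9 * 1000
small <- 1.9e-2       # 1.9 / 100
big == 1900           # TRUE
```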
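These functions are used to convert decimal numbers into integers.
There are several types of rounding, such as rounding down and rounding up.
These can be done in R using:
Syntax: floor(Decimal Number)
The floor function returns the greatest integer less than or equal to its argument, and the ‘next integer’ function is ceiling.
Syntax: ceiling(Decimal Number)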
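A brief sketch of both functions, including a negative number, where the direction of rounding is easy to get wrong. The round call is included for comparison only.

```r
floor(5.7)      # 5   (largest integer not above 5.7)
ceiling(5.7)    # 6   (smallest integer not below 5.7)
floor(-5.7)     # -6  (floor moves toward minus infinity)
ceiling(-5.7)   # -5
round(5.7)      # 6   (nearest integer)
```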
These are the very basics of R. Any beginner can start with these commands to get familiar with RStudio.
To know more about R, we move forward to see Data Structures in R.
We have the following types of Data Structures:
Classified under 2 branches:
Out of all, we will cover only “Vector” in detail. Other data types are equally interesting, but as beginners, let us keep it short and simple.
A vector is a collection of values that all have the same data type. The elements of a vector are all numbers, giving a numeric vector, or all character values, giving a character vector. There is also a third type, the logical vector, whose elements are all TRUE or FALSE.
13: Creating a Vector
Vectors are variables with one or more values of the same type: logical, integer, real, complex, or string. A vector can also have a length of 0.
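The sketch below creates one vector of each type with c() and one empty vector. The variable names are illustrative only.

```r
flags  <- c(TRUE, FALSE, TRUE)    # logical
counts <- c(1L, 2L, 3L)           # integer (the L suffix marks integers)
reals  <- c(4, 3, 6.3)            # real (double)
cplx   <- c(1+2i, 3-1i)           # complex
words  <- c("nine", "two")        # string (character)
empty  <- numeric(0)              # a vector of length 0

class(counts)    # "integer"
class(reals)     # "numeric"
length(empty)    # 0
```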
14: Length of a Vector
Syntax: length(Vector name)
This will show the length of a vector specified.
A vector created by a calculation takes the length of the longest vector involved, and the values of the shorter vector are recycled to fill it. Here A is of length 5 and B is of length 2:
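A minimal sketch of this behaviour follows. The values of A and B are made up for illustration; only their lengths (5 and 2) come from the text.

```r
A <- c(1, 2, 3, 4, 5)    # length 5
B <- c(10, 20)           # length 2

# B is recycled as 10, 20, 10, 20, 10 to match the length of A.
# R also warns that 5 is not a multiple of 2; suppressWarnings() hides that here.
C <- suppressWarnings(A + B)

length(A); length(B); length(C)    # 5, 2, 5
C                                  # 11 22 13 24 15
```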
15: Types of Vectors
a <- c(4,3,6.3,6,-8,9)
b <- c("nine","two","eight")
c <- c(TRUE,TRUE,TRUE,FALSE,TRUE,FALSE)
Next, we learn how to refer to these elements.
E.g. a[c(1,3)] # refers to the 1st and 3rd element of a vector.
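A few more ways of referring to elements, using the numeric vector a from above:

```r
a <- c(4, 3, 6.3, 6, -8, 9)

a[2]            # the 2nd element: 3
a[c(1, 3)]      # the 1st and 3rd elements: 4 6.3
a[2:4]          # elements 2 to 4: 3 6.3 6
a[-1]           # everything except the 1st element
```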
16: How to Work with Vectors & Logical Subscripts?
Take the example of a vector containing the 8 numbers, from 0 to 7:
To add up all the values:
To know how many of the values were less than 3:
To find the sum of the values of x that are less than 3, we write:
To see whether the logical condition x<3 is true or false for each element:
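The four steps above can be sketched as follows, with x holding the numbers 0 to 7:

```r
x <- 0:7

total       <- sum(x)           # add up all the values          -> 28
n_below_3   <- sum(x < 3)       # how many values are below 3    -> 3
sum_below_3 <- sum(x[x < 3])    # sum of the values below 3      -> 0 + 1 + 2 = 3
is_below_3  <- x < 3            # TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE
```

Here the count and the sum happen to be equal, but they are different calculations: sum(x < 3) counts TRUE values, while sum(x[x < 3]) adds the selected elements.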
17: Vector Functions in R
Important vector functions are listed in the table “Vector functions used in R.”
These functions work the same way as the corresponding functions in mathematics. We hope readers already understand them.
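A few commonly used vector functions are sketched below. The selection is ours and may not match the original table exactly.

```r
a <- c(4, 3, 6.3, 6, -8, 9)

max(a)       # largest value: 9
min(a)       # smallest value: -8
sum(a)       # total: 20.3
mean(a)      # arithmetic mean
median(a)    # middle value
range(a)     # smallest and largest: -8 9
length(a)    # number of elements: 6
cumsum(a)    # running total
```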
18: How to Work with Vectors and Logical Subscripts
Suppose we want to find the sum of the two largest values in a vector.
First, sort the vector in ascending order, then add up the values of the last two elements of the sorted vector.
Let’s do this in stages. First, the values of a:
Now if you apply “sort” to this, the numbers will be sorted in ascending order:
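The stages can be sketched like this, reusing the numeric vector a from the section on types of vectors:

```r
a <- c(4, 3, 6.3, 6, -8, 9)

sorted_a <- sort(a)                 # ascending: -8 3 4 6 6.3 9
top_two  <- tail(sorted_a, 2)       # the last two elements: 6.3 9
sum_top_two <- sum(top_two)         # 15.3

# Equivalent: reverse the sorted vector and take the first two elements.
sum(rev(sort(a))[1:2])
```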
19: Logical Arithmetic
Arithmetic involving TRUE or FALSE can be done in R .
R can coerce TRUE or FALSE into numerical values: 1 for TRUE and 0 for FALSE.
Is a less than 4?
Any value greater than 0?
Any value greater than -5?
Any value less than 2?
Number of values less than 2 in vector a: sum(a<2)
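The questions above translate into R roughly as follows. This is a sketch that reuses the vector a from earlier sections.

```r
a <- c(4, 3, 6.3, 6, -8, 9)

a < 4             # element by element: FALSE TRUE FALSE FALSE TRUE FALSE
any(a > 0)        # TRUE
any(a > -5)       # TRUE
any(a < 2)        # TRUE
sum(a < 2)        # TRUE counts as 1, so this counts the values below 2: 1
sum(a[a < 2])     # the sum of those values themselves: -8
```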
We have many other operators which are pretty similar to mathematical operators listed below.
List of operators:
21: Generating Regular Sequences of Numbers
For regularly spaced sequences, involving integers, it is simplest to use the colon operator. This can produce ascending or descending sequences:
For example, 5:15 produces 5 6 7 8 9 10 11 12 13 14 15
Use the seq function to go from 0 up to 5 in steps of 0.5:
Sequencing downwards from 5 down to 0 in steps of 0.5:
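Both the colon operator and seq can be sketched in a few lines:

```r
up   <- 5:15                  # ascending integers 5 to 15
down <- 15:5                  # descending integers 15 to 5

half_up   <- seq(0, 5, 0.5)   # 0.0 0.5 1.0 ... 5.0
half_down <- seq(5, 0, -0.5)  # 5.0 4.5 4.0 ... 0.0 (note the negative step)

length(half_up)               # 11 values
```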
22: Generating Repeated Sequence
The rep() (short for replicate) function puts the same constant into long vectors. The call form is rep(x, times).
x <- rep(6,4)
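A short sketch of rep, including the times and each arguments for repeating a whole vector:

```r
x <- rep(6, 4)                     # 6 6 6 6
y <- rep(c(1, 2), times = 3)       # 1 2 1 2 1 2
z <- rep(c(1, 2), each = 3)        # 1 1 1 2 2 2
```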
23: Identifying Missing Values
Syntax: mean(vector name, na.rm=TRUE)
Missing values are a cause for concern and must be dealt with accordingly.
Suppose we have a vector:
To handle the missing values, use the na.rm=TRUE argument:
To check for the location of missing values within a vector, use the function is.na(x)
To convert the NA to 0, use the ifelse function:
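The steps above can be sketched with a small vector. The values are made up for illustration.

```r
x <- c(1, 2, NA, 4, NA, 6)

mean(x)                       # NA, because missing values propagate
mean(x, na.rm = TRUE)         # 3.25, the mean of 1, 2, 4 and 6

is.na(x)                      # FALSE FALSE TRUE FALSE TRUE FALSE
which(is.na(x))               # positions of the missing values: 3 5

x0 <- ifelse(is.na(x), 0, x)  # replace NA with 0: 1 2 0 4 0 6
```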
24: Sorting, Ranking & Ordering
Now apply the three different functions to the vector called sales
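A sketch of the three functions follows. The sales figures are hypothetical, since the original values are not shown.

```r
sales <- c(100, 50, 75, 150, 200, 25)   # hypothetical values

sort(sales)     # the values in ascending order: 25 50 75 100 150 200
rank(sales)     # the rank of each value in place: 4 2 3 5 6 1
order(sales)    # the positions that would sort the vector: 6 2 3 1 4 5

sales[order(sales)]   # same result as sort(sales)
```

sort returns values, rank says where each element would land, and order returns indices, which makes it the one to use when sorting other vectors or data frame rows by sales.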
Syntax: data frame name <- data.frame(vector1, vector2, vector3, ...)
Make a dataframe using the four vectors:
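A minimal sketch, assuming the four vectors are the sales vector and the results of sort, rank, and order applied to it. The sales figures and column names are hypothetical.

```r
sales <- c(100, 50, 75, 150, 200, 25)   # hypothetical values

sales_df <- data.frame(
  sales   = sales,
  sorted  = sort(sales),
  ranked  = rank(sales),
  ordered = order(sales)
)

sales_df          # prints the table, one row per element
dim(sales_df)     # 6 rows, 4 columns
```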
26: Using sprintf
Syntax: sprintf("text with %d placeholders", value1, value2, ...)
This function assembles a string from the parts in a formatted manner.
i <- 2
s <- sprintf("the cube of %d is %d",i,i^3)
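Running the example gives the assembled string. The second call is an extra illustration of the %.2f placeholder for decimal numbers.

```r
i <- 2
s <- sprintf("the cube of %d is %d", i, i^3)
s                                  # "the cube of 2 is 8"

p <- sprintf("pi is about %.2f", pi)
p                                  # "pi is about 3.14"
```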
27: Character Strings
In R, character strings are defined with double quotation marks:
Numbers can be characters (as in b, above), but characters cannot be numbers.
To amalgamate strings into vectors of character information:
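A sketch of these points follows. The values of a and b and the name pets are assumptions made for illustration, since the original examples are not shown.

```r
a <- "abc"
b <- "123"                  # a number stored as a character string

as.numeric(b)               # 123: this string can be read as a number
suppressWarnings(as.numeric(a))   # NA: "abc" cannot (R normally warns here)

pets <- c(a, b)             # combine strings into a character vector
pets                        # "abc" "123"
```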
28: length VS nchar
One of the confusing things about character strings is the distinction between the length of a character object (a vector) and the number of characters in the strings comprising that object.
Let’s see an example to make the distinction clear:
Here, sports is a vector comprising 4 character strings:
The individual character strings have 9,11,7 and 9 characters, respectively:
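The distinction can be sketched as follows. The four sport names are hypothetical, chosen only so that they have 9, 11, 7, and 9 characters as stated above.

```r
sports <- c("badminton", "racquetball", "cricket", "wrestling")  # hypothetical values

length(sports)   # number of strings in the vector: 4
nchar(sports)    # number of characters in each string: 9 11 7 9
```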
Syntax: regexpr("pattern", vector name)
The regexpr() function reports the character position in the provided string(s) where the start of the match with the pattern occurs. The function also returns the length of the match.
Syntax: gregexpr("pattern", vector name)
gregexpr(pattern, text) is the same as regexpr(), but it finds all the instances of the pattern. Here’s an example:
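A small sketch of both functions, using the same example vector x that appears later in the grep section:

```r
x <- c("apple", "potato", "grape", "10", "blue.flower")

first_a <- regexpr("a", x)
as.integer(first_a)               # start of the first "a": 1 4 3 -1 -1 (-1 means no match)
attr(first_a, "match.length")     # length of each match:   1 1 1 -1 -1

all_p <- gregexpr("p", "apple")   # returns a list, one entry per input string
as.integer(all_p[[1]])            # every "p" in "apple": positions 2 3
```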
Syntax: regmatches(vector name, match object from regexpr)
The regmatches() function retrieves the matching components of a string vector for a match object produced by regexpr().
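For illustration, the sketch below extracts the first vowel from each string. The pattern and variable names are our own choices.

```r
x <- c("apple", "potato", "grape", "10", "blue.flower")

m <- regexpr("[aeiou]", x)        # locate the first vowel in each string
first_vowels <- regmatches(x, m)  # pull out the matched text
first_vowels                      # "a" "o" "a" "u" ("10" has no vowel and is dropped)

# With gregexpr, regmatches returns a list holding every match per string.
regmatches("apple", gregexpr("p", "apple"))[[1]]   # "p" "p"
```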
32: Using sub and gsub
Syntax: sub("pattern to find", "replacement for the first match only", vector name)
Syntax: gsub("pattern to find", "replacement for every match", vector name)
The sub() function finds patterns within strings in a manner similar to that of grep(), but then substitutes the first instance of a match with a specified string.
The gsub() function works in exactly the same manner as sub(), but replaces all matches of the pattern rather than only the first match.
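The difference between the two is easiest to see side by side. The input string is made up for illustration.

```r
txt <- "apple pie"

first_only <- sub("p", "P", txt)    # replaces only the first "p": "aPple pie"
all_of_them <- gsub("p", "P", txt)  # replaces every "p":          "aPPle Pie"
```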
Syntax: vector name = strsplit(vector1, split="split by character", fixed=TRUE)
This function splits each string at a defined delimiter and returns the pieces in a list.
The delimiter can be defined as a fixed character string or as a regular expression.
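Both kinds of delimiter are sketched below. The input strings are made up for illustration.

```r
# Fixed delimiter: split a date string at each hyphen.
parts <- strsplit("2024-01-15", split = "-", fixed = TRUE)
parts[[1]]                  # "2024" "01" "15" (strsplit returns a list)

# Regular expression delimiter: split at any run of digits.
letters_only <- strsplit("a1b22c", split = "[0-9]+")
letters_only[[1]]           # "a" "b" "c"
```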
34: grep Function (Pattern matching)
Syntax: grep("pattern", vector name)
This function searches for matches of a pattern within each element of a character vector and returns an integer vector of the indices of the elements that match (if value is set to FALSE, which is the default).
If value is set to TRUE, the contents of the matching elements of the character vector are returned.
x = c("apple","potato","grape","10","blue.flower")
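A few searches on this vector are sketched below. The patterns are our own examples.

```r
x <- c("apple", "potato", "grape", "10", "blue.flower")

grep("a", x)                   # indices of elements containing "a": 1 2 3
grep("a", x, value = TRUE)     # the elements themselves: "apple" "potato" "grape"
grep("[0-9]", x)               # elements containing a digit: 4
grep(".", x, fixed = TRUE)     # a literal dot, not the regex wildcard: 5
grep("e$", x, value = TRUE)    # elements ending in "e": "apple" "grape"
```

Without fixed = TRUE, the pattern "." is a regular expression that matches any character, so it would match every non-empty element.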
We hope you practice these commands and use them in your everyday data analysis.
You will get to see the other data types present in R in our next blog.
For more BigData trending blogs subscribe to us at ACADGILD.