Basics of R Software

In this article we go through some basic applications of R with the help of examples, and then look at some commonly used R functions.

Contents of the article-

  1. Identifying, extracting and removing duplicates
  2. Sorting
  3. Basic functions such as sum, range, date, time in R-software

Identifying and removing duplicates in R software

A duplicate is an element that exactly repeats an earlier element of a data set. We can easily identify the duplicates in R and remove them, leaving only the unique elements of our data set.

Example 1:

Suppose we have a vector of names and we want to identify and remove the duplicates in it.

Code-

names<-c("a","b","c","d","a","a","b") ## creating a vector of names

duplicated(names)

## duplicated() returns a logical (Boolean) vector for the vector names. It returns TRUE for each element that repeats an earlier element and FALSE otherwise.

Output

[1] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE

## the first four elements are first occurrences, so the code returns FALSE for them. The 5th, 6th and 7th elements repeat earlier values in the vector names, so the code returns TRUE for them, indicating that they are duplicates.

This code will also work on a vector of numeric type.
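As a minimal sketch of the numeric case, assuming a hypothetical vector marks (not part of the original example), duplicated() behaves the same way as it does for names:

```r
# Hypothetical numeric vector, used only for illustration
marks <- c(10, 20, 10, 30, 20, 10)

# TRUE marks every element that repeats an earlier element
dup_flags <- duplicated(marks)
print(dup_flags)  # [1] FALSE FALSE  TRUE FALSE  TRUE  TRUE
```

The first 10 and the first 20 are first occurrences, so only their later repeats are flagged.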

Suppose we want to extract the duplicate elements.

Code & Output

names[duplicated(names)]

[1] "a" "a" "b"

## we can see that every repeated occurrence has been extracted; the first occurrence of each value is not included.

Suppose we want to extract all the unique names from our vector.

Code & Output

names[!duplicated(names)]

[1] "a" "b" "c" "d"

## we can see that the code has extracted all the unique values from our vector names.
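Base R also provides the function unique(), which the example above does not use. As a small sketch on the same vector, it should give the same result as names[!duplicated(names)]:

```r
names <- c("a", "b", "c", "d", "a", "a", "b")

# unique() keeps the first occurrence of each value
unique_names <- unique(names)
print(unique_names)  # [1] "a" "b" "c" "d"

# Check that both approaches agree on this vector
same <- identical(unique_names, names[!duplicated(names)])
print(same)  # [1] TRUE
```

The duplicated() form is still useful when the logical vector itself is needed, for example to filter the rows of a data frame.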

Similarly, we can remove duplicates from a data frame with multiple columns on the basis of one of its columns.

Example 2:

Suppose you have a data set of names, ages and genders and you want to identify, extract and remove all the duplicates in the column gender of the data frame.

Code-

name<-c("manvir","yash","sandra","rohit","raj")

age<-c(22,21,22,61,18)

gender<-c("m","f","f","m","m")

b<-data.frame(name,age,gender)

## these lines create a data frame with three columns and store it in the variable b.

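Identifying the duplicates in the column gender of the data frame

Note that the function duplicated() compares whole rows when it is given a data frame, so to check a single column we pass that column (b$gender) to it.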

Code and Output-

duplicated(b$gender)

[1] FALSE FALSE  TRUE  TRUE  TRUE

Extracting the duplicates from the data frame on the basis of gender

Code & Output-

b[duplicated(b$gender),]

    name age gender

3 sandra  22      f

4  rohit  61      m

5    raj  18      m

## we can see that all the rows with duplicates on the basis of gender have been extracted

Removing the duplicates from the data frame on the basis of gender

Code & Output-

b[!duplicated(b$gender),]

    name age gender

1 manvir  22      m

2   yash  21      f

## we can see that all the duplicates have been removed and unique values have been extracted

Using similar techniques, one can easily identify, extract and remove duplicate values from their data set.
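The same idea extends to whole rows or to a combination of columns. The sketch below assumes two extra rows that are not in the original data (stored in the hypothetical data frame b2) so that there is something to detect:

```r
name   <- c("manvir", "yash", "sandra", "rohit", "raj")
age    <- c(22, 21, 22, 61, 18)
gender <- c("m", "f", "f", "m", "m")
b <- data.frame(name, age, gender)

# Hypothetical extra rows: an exact copy of row 1, and a new name
# that shares age and gender with the row for sandra
extra <- data.frame(name = c("manvir", "amit"), age = c(22, 22), gender = c("m", "f"))
b2 <- rbind(b, extra)

# Given a data frame, duplicated() compares whole rows
whole_row_dups <- duplicated(b2)
print(whole_row_dups)  # only row 6 is TRUE

# Compare on age and gender together by passing just those columns
pair_dups <- duplicated(b2[, c("age", "gender")])
print(pair_dups)       # rows 6 and 7 are TRUE

# Keep the first row for each age/gender combination
b_unique <- b2[!pair_dups, ]
print(b_unique)
```

Which columns to compare on depends on what counts as a duplicate in your data; age and gender are chosen here only for illustration.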

Sorting in R-software

Sorting means arranging the data in either ascending or descending order. In R it is done with the function sort(), which works on both numeric and character vectors. Below are a few examples-

Suppose we want to sort a vector in ascending order

Code & Output

age<-c(22,21,32,61,18) ## creating a numeric vector which we are going to sort in ascending order.

sort(age, decreasing = FALSE)

## since we want to sort in ascending order, we have set the argument decreasing to FALSE (this is also its default value).

[1] 18 21 22 32 61

## we can see that the vector has been arranged in ascending order

Suppose we want to sort a vector in descending order

Code & Output

name<-c("a","y","s","r","j")  ## creating a vector of character elements

sort(name,decreasing = TRUE) ## sorting the data in descending order, that is, from z to a

[1] "y" "s" "r" "j" "a"

## we can see that the output is in descending alphabetical order
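As a minimal sketch, the code below shows that sort() returns a new vector and leaves the original unchanged. It also uses order(), a base R function not covered above, to sort the rows of a data frame by one column. The data frame people is a hypothetical example built from vectors like those used earlier.

```r
age <- c(22, 21, 32, 61, 18)

# sort() returns a sorted copy; age itself is not modified
sorted_age <- sort(age)  # decreasing = FALSE is the default
print(sorted_age)        # [1] 18 21 22 32 61
print(age)               # [1] 22 21 32 61 18

# order() returns the positions that would put the vector in sorted order
positions <- order(age)
print(positions)         # [1] 5 2 1 3 4

# Use those positions to reorder the rows of a (hypothetical) data frame
name <- c("manvir", "yash", "sandra", "rohit", "raj")
people <- data.frame(name, age)
by_age <- people[order(people$age), ]
by_age_desc <- people[order(people$age, decreasing = TRUE), ]
print(by_age)
```

sort() is enough for a single vector; order() is the usual choice when other columns have to move together with the sorted one.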

Some other basic functions in R

age<-c(22,21,32,61,18) ## creating a vector

sum(age) ## this code will give us the sum of all the elements of the vector age

max(age) ## this code will give us the maximum element of the vector age

min(age) ## this code will give us the minimum element of the vector age

mean(age) ## this code will give us the mean of the vector age

median(age) ## this code will give us the median of the vector age

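mode(age) ## this code will give us the storage mode of the vector age ("numeric"), not the statistical mode; base R has no built-in function for the most frequent value

cumsum(age) ## this code will give us the cumulative (running) sum of the elements of age

var(age) ## this code will give us the sample variance (divisor n - 1).

sd(age) ## this code will give us the sample standard deviation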

Sys.Date() ## this code will return today's date

Sys.time() ## this code will return today's date and the current time

names(dataset) ## this code will return the names (headers) of all the columns of a data frame called dataset

range(dataset) ## this code will give us the minimum and maximum values of dataset, which must contain only numeric values
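As a worked sketch, here are the values these functions return for the vector age defined above. The helper stat_mode() and the data frame dataset with its column height_cm are hypothetical, introduced only for illustration; stat_mode() is not part of base R.

```r
age <- c(22, 21, 32, 61, 18)

sum(age)     # 154
range(age)   # 18 61
cumsum(age)  # 22 43 75 136 154
mean(age)    # 30.8
median(age)  # 22
var(age)     # 312.7 (sample variance, divisor n - 1)

# mode() reports how the vector is stored, not its most frequent value
storage <- mode(age)
print(storage)  # [1] "numeric"

# Hypothetical helper for the statistical mode:
# returns every value that occurs the maximum number of times
stat_mode <- function(x) {
  values <- unique(x)
  counts <- tabulate(match(x, values))
  values[counts == max(counts)]
}
most_common <- stat_mode(c(22, 21, 22, 61, 18))
print(most_common)  # [1] 22

# Hypothetical data frame to illustrate names() and range()
dataset <- data.frame(age = age, height_cm = c(170, 165, 180, 175, 160))
col_names <- names(dataset)
print(col_names)           # [1] "age"       "height_cm"
print(range(dataset$age))  # [1] 18 61
```

When every value occurs equally often, as in age itself, stat_mode() returns all of the values, so it is worth checking the length of its result.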

Mathematica-City

Mathematica-city is an online Education forum for Science students run by Kounteyo, Shreyansh and Souvik. We aim to provide articles related to Actuarial Science, Data Science, Statistics, Mathematics and their applications using different Statistical Software. We also provide freelancing services on the aforementioned topics. Feel free to reach out to us for any kind of discussion on any of the related topics.
