How to use %in% in R: 8 Example Uses of the Operator (2024)

Are you an R enthusiast struggling to understand %in% in R and how it can be used in your data analysis and manipulation tasks? Look no further! This post will explore the ins and outs of the %in% operator. We’ll start with the basics, including what %in% means in R and how it differs from the == operator. Then, we will dive into the practical applications of %in%, including seven ways to use this operator in your R code. You will learn how to use %in% to compare sequences of numbers and vectors containing letters or factors and test whether a value is in a column.


Additionally, you will discover how to add a new column to a dataframe, subset data, remove columns, and select columns using the %in% operator. And if that’s not enough, we will look at how to create a not in operator in R! By the end of this post, you’ll have a comprehensive understanding of the %in% operator and how to utilize it in your R programming tasks. So let’s get started and unlock the full potential of the %in% operator in R!

Table of Contents

  • Outline
  • 8 Ways to Use the %in% Operator in R
    • 1: Using %in% to Compare two Sequences of Numbers (vectors)
    • 2: Utilizing %in% in R to Compare two Vectors Containing Letters or Factors
    • 3: How to use the %in% Operator in R to Test if Value is in Column
    • 4: Using %in% in R to Add a New Column to a Dataframe
    • 5: Utilizing the %in% in R to Subset Data
    • 6: Add a Column using %in% in R
    • 7: Using %in% in R to Remove Columns from Dataframe
    • 8: Make use of the %in% Operator to Select Columns
  • Bonus: Creating a not in operator in R
  • Conclusion: R in operator
  • R Tutorials

Outline

This post’s outline is described in more detail than the table of contents. First, we start with simple examples of using the %in% operator. Specifically, we will look at how to use the operator when testing whether two vectors contain sequences of numbers and letters. As you may already have expected, the operator can be used in other, maybe more advanced cases. The following sections will examine how to work with this operator and dataframes. For example, you will see that you can use the operator to create new variables, remove columns, and select columns.

Save

What does %in% Mean in R

The %in% operator in R can be used to identify if an element (e.g., a number) belongs to a vector or dataframe. For example, it can be used to see if the number 1 is in the sequence of numbers 1 to 10.

What is the Difference Between the == and %in% Operators in R

The %in% operator is used for matching values. “returns a vector of the positions of (first) matches of its first argument in its second”. On the other hand, the == operator is a logical operator used to compare if two elements are equal. Using the %in% operator, you can compare vectors of different lengths to see if elements of one vector match at least one element in another. The output length will equal the length of the compared vector (the first one). This is not possible when utilizing the == operator.

What is the use of %in% in R?

The %in% operator matches values in, e.g., two different vectors, as already answered in the two previous questions. You can also use the operator to select certain columns in the dataframe or to subset the dataframe.

If you need to know what is $ in r, another operator, check the linked post. Now that you know that %in% is in R, and what is the difference between this operator and == is we can go on and have a look at the example usages.

8 Ways to Use the %in% Operator in R

In this section, we are going through eight examples of using %in% in R. As you already know, we will start by working with vectors. After that, we will look at how to use the operator when working with dataframes.

1: Using %in% to Compare two Sequences of Numbers (vectors)

In this example, we will use %in% to check if two vectors contain overlapping numbers. Specifically, we will look at how we can get a logical value for more specific elements, whether they are also present in a longer vector. Here’s the first example of an excellent usage of the operator:

# sequence of numbers 1:a <- seq(1, 5)# sequence of numbers 2:b <- seq(3, 12)# using the %in% operator to check matching values in the vectorsa %in% bCode language: R (r)

In the code chunk above, we created two numeric vectors, a and b, containing sequences of numbers. The seq() function generates a sequence of numbers in R from the starting value to the ending value in increments of 1 by default.

Importantly, the R %in% operator is then used to check for matching values between a and b. Specifically, a %in% b will return a logical vector indicating which elements in the vector a are also present in the vector b. The output shows us which elements that are the in the shorter vector:

How to use %in% in R: 8 Example Uses of the Operator (2)

  • Save

In a real-world example, our vectors might not contain sequences but just random numbers. If we, on the other hand, want to test which elements of a longer vector are in a short vector, we do as follows:

# shorter vector:a <- seq(12, 19)# longer vector:b <- seq(1, 16)# test if elements in longer vector is in shorter:b %in aCode language: R (r)

How to use %in% in R: 8 Example Uses of the Operator (3)

  • Save

As you can see, both above methods will result in a boolean. Additionally, if we use the which() function, we can use the indexes of where the overlapping elements:

# Using the operator together with the which() functionwhich(seq(1, 10) %in% seq(4, 12))Code language: PHP (php)

Might also interest you:

In the following example, we will see that we can apply the same methods for letters or factors in R. That is, we will test if two vectors containing letters are overlapping.

Save

2: Utilizing %in% in R to Compare two Vectors Containing Letters or Factors

In this example, we will use %in% to check if two vectors contain overlapping letters. Note, this can also be done for words (e.g., factors). First, we will compare letters in shorter and longer vectors. Here’s how to compare two vectors containing letters:

# Sequences of Letters:a <- LETTERS[1:10]# Second seq of lttersb <- LETTERS[4:10]# longer in shortera %in% bCode language: PHP (php)

As you can see, and probably already figured out, we used the %in% operator exactly in the same way as for vectors containing sequences of numbers. Again we can test which letters in a long vector are in a short vector:

b %in% a

Naturally, as with the examples where we used sequences of numbers in R, the result is a boolean vector when working with letters, words, or factors. Furthermore, as in the first example, we can use the which() function to get indexes:

g <- c("C", "D", "E")h <- c("A", "E", "B", "C", "D", "E", "A", "B", "C", "D", "E")which(h %in% g)Code language: JavaScript (javascript)

How to use %in% in R: 8 Example Uses of the Operator (5)

  • Save

Finally, here’s an example of why using the %in% operator is better than the ==. If we use which(), together with ==, we will get only the two-three elements:

# %in% vs == the equal operator wrong!which(g == h)Code language: R (r)

In the following example, we will work with a dataframe, instead of vectors. First, however, we are going to read the readxl package to read a .xlsx file in R. Here’s how we get our dataframe to play around with:

library(readxl)library(httr)#URL to Excel File:xlsx_URL <- 'https://mathcs.org/statistics/datasets/titanic.xlsx'# Get the .xlsx file as an temporary fileGET(xlsx_URL, write_disk(tf <- tempfile(fileext = ".xlsx")))# Reading the temporary .xlsx file in R:dataf <- read_excel(tf)# Checkiing dataframe:head(dataf)Code language: R (r)

How to use %in% in R: 8 Example Uses of the Operator (6)

  • Save

A quick note before going on to the third example is that readxl and dplyr, a package we will use later, are part of the Tidyverse package. If you install Tidyverse you will get some powerful tools to extract year from date in R, carry out descriptive statistics, visualize data (e.g., scatter plots with ggplot2), to name a few.

3: How to use the %in% Operator in R to Test if Value is in Column

In this example, we will look at a straightforward example of how to use this operator. Namely, we are going to use %in% to check if a value is in one of the columns in a dataframe:

# %in% column 2 %in% dataf$boatCode language: R (r)

If you have read through the first two examples, you already know we get a boolean vector. In this vector, the value TRUE means that the cell contained the value we sought.Notice also how we used the $ operator to select one of the columns.

  • How to Standardize Data in R

4: Using %in% in R to Add a New Column to a Dataframe

Here’s how to use the %in% operator to create a new variable:

# Creating a dataframe:dataf2 <- data.frame(Type = c("Fruit","Fruit","Fruit","Fruit","Fruit", "Vegetable", "Vegetable", "Vegetable", "Vegetable", "Fruit"), Name = c("Red Apple","Strawberries","Orange","Watermelon","Papaya", "Carrot","Tomato","Chili","Cucumber", "Green Apple"), Color = c(NA, "Red", "Orange", "Red", "Green", "Orange", "Red", "Red", "Green", "Green"))# Adding a New Column:dataf2 <- within(dataf2, { Red_Fruit = "No" Red_Fruit[Type %in% c("Fruit")] = "No" Red_Fruit[Type %in% "Vegetable"] = "No" Red_Fruit[Name %in% c("Red Apple", "Strawberries", "Watermelon", "Chili", "Tomato")] = "Yes"})Code language: PHP (php)

Notice how we make use of the operator. Here’s the dataframe, with the added column “Red_Fruit”:

How to use %in% in R: 8 Example Uses of the Operator (7)

  • Save

In another post, you will learn how to use R to add a column to a dataframe based on conditions and/or values in other columns.

5: Utilizing the %in% in R to Subset Data

In this example, we are going to use the %in% operator to subset the data:

library(dplyr)home.dests <- c("St Louis, MO", "New York, NY", "Hudson, NY")# Subsetting using %in% in R:dataf %>% filter(home.dest %in% home.dests)Code language: R (r)

Notice how we created a vector of the elements we want to include in our new, subsetted dataframe. Furthermore, we also used the dplyr package and the filter() function together with the %in% operator. Finally, we get the resulting subsetted dataframe:

How to use %in% in R: 8 Example Uses of the Operator (8)

  • Save

Note dplyr comes with many other handy functions, such as the select-family. For example, you can use dplyr to select columns in R or to take the absolute value in R, using the function only on numerical columns. In the next section, we will look at another way to use the %in% operator: to drop columns from a dataframe.

6: Add a Column using %in% in R

Here we make use of the mutate() function together with the %in% operator in R:

dataf2 %>% mutate(Red_fruit = if_else( Color %in% "Red" & Type == "Fruit", "Yes","No"))Code language: JavaScript (javascript)

In the code chunk above, we are using the %>% operator to perform a sequence of operations on the dataf2 data frame. Specifically, we are using the mutate() function to add a new column called Red_fruit to dataf2.

Next, within the mutate() function, we are using the if_else() function to determine whether each row of dataf2 corresponds to a red fruit.

In this case, the logical condition is given by Color %in% "Red" & Type == "Fruit", which checks whether the Color column contains the value “Red” and the Type column contains the value “Fruit”. If the condition is TRUE, the value “Yes” is returned, indicating that the row corresponds to a red fruit. If the condition is FALSE, the value “No” is returned instead.

The %in% operator is used to check whether each column element is equal to “Red”. It returns a logical vector indicating whether each element is a member of the set { “Red” }.

7: Using %in% in R to Remove Columns from Dataframe

In this example, we are going to use %in% to drop columns from the datafarme:

# Drop columns using %in% operator in Rdataf[, !(colnames(dataf) %in% c("pclass", "embarked", "boat"))]Code language: R (r)

In the code chunk above, we used the I to tell R that we do not want to select these columns. Running the code, above will result in a new dataframe with the columns removed:

How to use %in% in R: 8 Example Uses of the Operator (9)

  • Save

Note it is also possible to use dplyr to remove columns in R. For example, using the select() function with the pipe operator may result in a slightly more readable code.

In the following example, we are going to have a look at how we can use the %in% operator to do the opposite of dropping columns. That is, we are going to select columns instead.

8: Make use of the %in% Operator to Select Columns

Let us use the %in% operator to select several variables from the dataframe:

# Select columns using %in%:dataf[, (colnames(dataf) %in% c("pclass", "embarked", "boat"))]Code language: CSS (css)

Note that we removed the ! before the parentheses, which will tell R to select these columns (see example 6 above for the opposite).

Selecting columns instead of deleting them might be a more efficient way to go if we have a lot of variables in our dataset and we want to create a new dataframe with only some of them. Notice how we used another nice function: select_if(). This function is also from the dplyr package, and when we wanted to select columns if they had certain names.

In the final bonus section, we will see how to negate the %in% operator. Now, we will do this because there is not built-in “not in” operator in R.

Bonus: Creating a not in operator in R

Here’s how we can create our own not in operator in R:

# Creating a not in operator:`%notin%` <- Negate(`%in%`)Code language: R (r)

Pretty simple. It is now possible to use this new R not in operator to check if e.g., a number is not in a vector:

# Generating a sequence of numbers:numbs <- rep(seq(3), 4)# Using the not in operator:4 %notin% numbs# Output: [1] TRUECode language: R (r)

As you can see in the example above, we can use the %notin% operator similarly to the %in% operator. Note that it is also possible to use both operators on lists. Finally, it is worth noting that some R packages contain “not in” functions. For example, package mefa4 has the %notin% function. See the recent post about selecting and filtering values not in R’s dataframe.

Conclusion: R in operator

In summary, %in% is an operator in R that compares two sequences of values and returns a logical vector indicating which elements of the first sequence are also present in the second sequence. The %in% operator differs from the == operator, which compares individual values. Moreover, the %in% operator has several use cases, such as testing if a value is in a column, adding a new column to a dataframe, subsetting data, and selecting columns. Additionally, a not in operator can be created by negating the %in% operator with the ! operator. Overall, the %in% operator is a powerful tool in R for data manipulation and analysis.

R Tutorials

Here are some other useful tutorials:

  • How to Rename Column (or Columns) in R with dplyr
  • How to Extract Year from Date in R with Examples
  • Learn How to Calculate Descriptive Statistics in R the Easy Way
  • How to Create a Word Cloud in R
  • How to Extract Day from Datetime in R with Examples
  • Learn How to Create Dummy Variables in R (with Examples)

  • Save

How to use %in% in R: 8 Example Uses of the Operator (2024)

FAQs

How to use the in operator in R? ›

Use %in% for Vector Membership

Instead of using multiple OR conditions, you can use the %in% operator to check if an element belongs to a vector or list. For example, x %in% c(1, 2, 3) is more concise than x == 1 | x == 2 | x == 3.

What is %*% in R? ›

R uses the following miscellaneous operators: The : operator creates a sequence of numbers from the left argument to the right one. The %in% operator returns TRUE if the left argument is in the vector to the right. The %*% operator performs matrix multiplication on two matrices.

What is the use of in R? ›

What does "$" in R programming do? In R programming, the "$" symbol is used to access components of a data structure, typically within data frames or lists. It is known as the "dollar sign operator" or "list subsetting operator."

What is the meaning of %>% in R? ›

R pipes are a way to chain multiple operations together in a concise and expressive way. They are represented by the %>% operator, which takes the output of the expression on its left and passes it as the first argument to the function on its right. Using pipes in R allows us to link a sequence of analysis steps.

What is the use of in operator with example? ›

The IN operator ignores duplicate values in the list. For example, -- IN ignores the duplicate 'USA' value SELECT first_name, country FROM Customers WHERE country IN ('USA', 'UK', 'USA'); The code above is equivalent to the one below.

What is the logical operator for or in R? ›

The OR operator | works similarly to the AND operator & , but the difference is that only one of the logical values needs to be equal to TRUE for the entire OR operation to evaluate to TRUE . This means that TRUE | TRUE equals TRUE , but also, TRUE | FALSE and FALSE | TRUE evaluates to TRUE .

What does $$ mean in R? ›

In R programming, the dollar sign serves as a versatile and powerful operator for extracting, manipulating, and creating variables within data frames, lists, and environments. Its concise syntax enhances the readability of code and contributes to the efficiency of data analysis workflows.

What does \\ mean in R? ›

To match metacharacters in R you need to escape them with a double backslash “\\”. The following displays the general escape syntax for the most common metacharacters: The following provides examples to show how to use the escape syntax to find and replace metacharacters.

What is the difference between * and %*%? ›

The * operator is a simple arithmetic operator. It is called the multiplication operator. The %*% operator is a special kind of multiplication operator. It is used in the multiplication of matrices in R.

What is the use of %% in R? ›

What is the double percent ( %% ) used for in R? From using it, it looks as if it divides the number in front by the number in back of it as many times as it can and returns the left over value.

What does '|' mean in R? ›

& and && indicate logical AND and | and || indicate logical OR. The shorter form performs elementwise comparisons in much the same way as arithmetic operators.

What does '~' mean in R? ›

'~(tilde)' is an operator that splits the left side and right side in a formula. It is normally used to separate the response variable from the set of feature variables. Example – lm(wages ~ yearsEd, data = df) In the above example, df is the dataframe that contains the columns wages and yearsEd.

What does %% mean? ›

Because these uses give the percent sign special meaning, the sequence %% (two percent signs) is used to represent a literal percent sign, so that: set PATH=c:\;%%PATH%% would set PATH to the literal value " c:\;%PATH% ". In the C Shell, % is part of the default command prompt.

What is <- in R? ›

<- The assignment operator <- is used for assigning a value to a variable. For example, name <- "James Bond" , assigns the string "James Bond" to the variable called name .

What is == in R? ›

The Equality Operator ==

Relational operators, or comparators, are operators which help us see how one R object relates to another. For example, you can check whether two objects are equal (equality) by using a double equals sign == .

What is the '~' operator in R? ›

'~(tilde)' is an operator that splits the left side and right side in a formula. It is normally used to separate the response variable from the set of feature variables. In the above example, df is the dataframe that contains the columns wages and yearsEd.

What is INF in R? ›

Inf and -Inf are positive and negative infinity whereas NaN means 'Not a Number'. (These apply to numeric values and real and imaginary parts of complex values but not to values of integer vectors.) Inf and NaN are reserved words in the R language.

What is the purpose of the <- operator in R? ›

The assignment operator <- is used for assigning a value to a variable. For example, name <- "James Bond" , assigns the string "James Bond" to the variable called name .

How to access each character in a string R? ›

There are two R's inbuilt functions in order to access both the single character as well as the substrings of the string. substr() or substring() function in R extracts substrings out of a string beginning with the start index and ending with the end index.

Top Articles
Latest Posts
Article information

Author: Aron Pacocha

Last Updated:

Views: 6121

Rating: 4.8 / 5 (48 voted)

Reviews: 95% of readers found this page helpful

Author information

Name: Aron Pacocha

Birthday: 1999-08-12

Address: 3808 Moen Corner, Gorczanyport, FL 67364-2074

Phone: +393457723392

Job: Retail Consultant

Hobby: Jewelry making, Cooking, Gaming, Reading, Juggling, Cabaret, Origami

Introduction: My name is Aron Pacocha, I am a happy, tasty, innocent, proud, talented, courageous, magnificent person who loves writing and wants to share my knowledge and understanding with you.