Lookup with unique identifiers

INDEX function

The INDEX function returns the value from a range of cells based on row and column specified in the function arguments.

We are simplifying here, the INDEX function can actually return values from one or more arrays – i.e. broader term for lists of data which includes cell ranges.

But the most common use is with cell ranges, as that is the typical way of structuring data in Excel.

 

Syntax of the INDEX function is as follows:

= INDEX ( array ; row_num ; column_num )

Array will in our case always be defined as a range of cells, such as C6:G10 – think of it as a new “virtual” table that is created inside of the INDEX function based on values from a range of cells such as C6:G10.

Row number is always an integer number such as 0, 1, 2, 3…

Column number is always an integer number such as 0, 1, 2, 3…

 

Here is how basic application of the INDEX function looks like:

excel index function array

Here, we created our new “virtual” table from the data in cell range C6:G10.

In INDEX function, number 50000 is not in the cell G10, but rather inside of a “virtual” table existing only in our INDEX function, with an address row 5, column 5:

excel index function row column

Indeed, if we set row number as 5, and column as 5, our function will return number 50.000:

excel index function array row column

It doesn’t matter how we set or calculate those row and column integer numbers (and if we input decimal numbers, they will be rounded down to integer numbers anyway). We can do it in the most nonsense kind of ways, such as referring to the cells which we included in our “virtual” table:

excel index function strange row column

 

If we want to return the whole row, we can do that by defining the desired row, and then defining the column as zero.

If we want to return the whole column, we can do that by defining the desired column, and before that defining the row as zero.

If we want to return the whole table, we can do that by defining both column and row as zero.

Returning the whole array of numbers into a single cell is not really what you can realistically do in Excel, but you can, for example, sum that array of numbers:

excel index row column zero

As you can see, we are returning row number 2, column number 2, or the whole table with the help of the INDEX function. This results in errors, but with the help of the SUM function, we can return the sum of the desired row, column and table, respectively.

 

It is often the case that your array will consist of only one column, such as E6:E10 in the example below.

In the column number part of the argument, valid statements are then either zero or number 1, as shown in the cell H19.

You can also avoid writing that part or the argument altogether, as shown in the cell H20 – Excel will then assume you’ve entered number 1:

excel index single column array

 

It is also often the case that your array will consist of only one row, such as C9:G9 in the example below.

In the row number part of the argument, valid statements are then also either zero or number 1, as shown in the cell H20.

You can also avoid writing that part or the argument altogether, as shown in the cell H21 – Excel will then assume you’ve entered number 1.

However, while this function in the cell H21 will still return your result as Excel will try (and succeed) to interpret what you meant by your formula, it is not in line with the INDEX function syntax specification, and it is perhaps better to write those formulas as shown in the cell H20:

excel index single row array

 

MATCH function

The MATCH function returns the relative position of the item in the range, either a row or a column.

Relative position is returned in the form of an integer number, such as 1, 2, 3…

 

Syntax of the MATCH function is as follows:

= MATCH ( lookup_value; lookup_array; match_type)

Lookup value can be either a text or a number, inputted directly into the function or as a cell reference.

Lookup array will in our case always be defined as a range of cells, such as C3:C7 – again, think of it as a new “virtual” table that is created inside of the MATCH function based on values from a range of cells such as C3:C7.

The difference here is that we are dealing with either a row, or a column. An integer number returned is than always a relative position of our lookup value in the range.

Match type should always be zero.

 

Here is how basic application of the MATCH function looks like:

excel match function column

In cell F9 we are retrieving relative position of number 40 inside of an array C3:C7, and that relative position is represented by an integer number 4.

In cell F10 we are doing the same for the text “40”, but as our array C3:C7 doesn’t contain such text, error is returned.

Be careful when designating your arrays, blank cells are counted:

excel match function whole column

Next, this is an example of the MATCH function applied to a data in a row:

excel match function row

In cell F13 we are retrieving relative position the text “bb” from an array C3:C7, and that relative position is represented by an integer number 2.
In cells F12 and F13 we are doing the same for other text strings, but we are also referencing them from cells.

Concerning text, the MATCH function is not case-sensitive, and can perform matches with text strings up to 255 characters long.

Partial matches with the help of an asterisk [*] or a question mark [?] are also supported, as long as the match type is defined as zero. Formula such as =MATCH(“cc*”;G3:K3;0) will return number 3 in our example. However, while there may be uses for this kind of lookup values in the correct context, in principle it is to be avoided. As we will see, you will want to use unique identifiers.

Match type, other than 0 – Exact match, can be:

  • 1 – Less than, finds the largest value that is less than or equal to lookup value, while lookup array must be placed in ascending order
  • -1 – Greater than, finds the smallest value that is greater than or equal to lookup value, while lookup array must be placed in descending order.

However, while there may be uses for these match types in the correct context, in principle it is to be avoided. As we will see, not only that sorting your lookup arrays in a sense defeats the purpose, you will want to use unique identifiers.

 

How to combine INDEX with MATCH

The INDEX function and the MATCH function aren’t that useful such as they are, but their combination is. We can use the INDEX function to designate a cell range from which we want to retrieve data from, and at the same time use the MATCH function to specify the address of a cell in that range, i.e. row and column.

This is how that would look:

excel index match match

Here, we are retrieving data from the cell range E5:I9.

The row from which data will be retrieved will be row 1 in that selection, as “red” matches row 1 of selection D5:D9.

The column from which data will be retrieved will be column 4 in that selection, as 30.11.2020 matches column 4 of selection E4:I4.

We can see that process clearer here:

excel index match match illustrated

Notice how range E5:I9 rows are mirrored with the rows in the array D5:D9, and E5:I9 columns are mirrored with the columns in the range E4:I4.

This has to be true – otherwise, INDEX MATCH combination will not return the desired results, rather, errors or, worse, incorrect results will be returned.

We can see that in the following example:

excel index match match error one

As columns from the range D4:H4 are “off” with columns from the range E5:I9, and 30.11.2020 is fifth figure in the range D4:H4 – incorrect figure from row 1, column 5 of the E5:I9 range will be returned.

Not much will be different even if we extend our match array to cell I4:

excel index match match error two

If, on the other hand, our column lookup range is to “short”, but starts in the right cell, depending on the query, correct results can still be returned:

excel index match match partial

Obviously, not if we try to lookup 31.12.2020:

excel index match match partial and error

Always make sure that the cell range in the INDEX function array has the same number of rows and columns as you lookup arrays, and that they are correctly positioned!

 

If we just want to retrieve data from a single row, horizontal lookup is quite simple with INDEX MATCH:

excel index match horizontal lookup

Here, we’ve retrieved the cell F6 value from the row blue, for the date 30.9.2020. Note that we should set our row condition as number 1.

 

Retrieving data from a single column (vertical lookup), it’s even simpler:

excel index match vertical lookup

Here, we’ve retrieved the cell F7 value from the row column F, for the match green. Note that we didn’t have to bother to set column number.

 

You’ve probably already noticed that we have a double entry in our table, team white shows up two times. This will happen when we try to lookup “white”:

excel index match vertical lookup double match

Here, we’ve retrieved the cell F8 value from the row column F, for the match white. Cell F8 value was retrieved because it’s match is in the fourth row or our range, i.e. it shows up earlier than the match for cell F9.

We’ve returned the first match – not necessarily the value we were hoping to return.

We don’t have an issue with our formula here – we either have an issue with our data quality, or are we trying to use INDEX MATCH inappropriately.

Make sure that you are dealing with unique values when using MATCH for lookup!

 

Retrieving data with unique identifiers

A unique identifier is usually a table column which contains data in “all” of the rows, and that data is non-repeating.

Possible examples of unique identifiers are product codes (one code for every product), user IDs (one code for every user), user names (one username per every user), email addresses (no double registrations with the same email address), and so on.

There could be several such columns with non-repeating data in all of the rows in any given table, i.e. table containing user ID will possibly also contain email address, and they are both unique.

However, if we want to efficiently manage data in multiple large tables, from multiple sources, we will usually select one unique identifier such as user ID (one code per every user), ensure that that code is non-repeating in our primary table (let’s call that table Users), and then ensure that this unique identifier is available in all other tables.

Combination of INDEX and MATCH functions is the preferred way of managing data structured in the described way in Excel.

 

Let’s now show this with a practical example.

First, we have our table Users with each user having its own unique ID:

excel index match unique identifier

We also have our table Incentive, where we’ve also ensured that the unique ID’s are available:

excel index match table

Let’s say we want to retrieve user email addresses. With INDEX MATCH, this is easy:

excel index match retrieve email

Here, we are matching users’ IDs with user IDs in column A, table Users. When match is found, value from the corresponding row in column B, table Users is returned, i.e. email address.

We can also use INDEX MATCH to generate messages users will receive:

excel index match generate message

Notice how we are retrieving parts of the message by matching both user ID and, earlier retrieved, email address, with the table Users.

We can do this without errors as they are both unique in the Users table.

 

One thought on “Lookup with unique identifiers”

Leave a Reply

Your email address will not be published. Required fields are marked *