Excel may have a reputation for crunching numbers, but text formatted data such as names, addresses and descriptions is often encountered. Excel is also often used for processing data from wide variety of unrelated databases, applications and web sites. With those, issues such as numbers formatted as text, text formatted as numbers, multiple text fields combined into a single cell and broken tables are not uncommon. Further complicating things, Excel functions designed to manipulate text formatted data are many, have inconsistent syntax and are often performing similar but slightly different tasks.
Still, you can perform almost everything with small number of functions, and many things can be done by using Excel features.
The most powerful feature at your disposal is common Find and Replace pop-up which can be activated by pressing Ctrl + F key combination:
Above you can see a single table with several issues. Not only is the data not distributed properly into columns, but there are also several multiple spaces between the fields instead of one. Rather than using the TRIM function, you will probably resolve the issue with multiple unnecessary spaces faster if you just repeatedly replace two spaces [ ] with one space [ ], i.e., by using Find and Replace.
We can also use Text to columns feature in order to properly format the table (note that only column A is selected):
If we pick Delimited option in the first step, we can than set space and, for example, at sign [@] as delimiters:
Once we convert text strings to columns you will notice that our table is still broken (as is often the case), and we will have to make some manual adjustments. We will insert some cells:
We will also delete some cells:
You can also generate new text cells via formulas. That is useful if we want to combine multiple text inputs (cells, cell ranges and/or text strings) into a single cell.
If you want to combine more text inputs (cells, cell ranges and/or text strings) inside a single formula, use operator and [&].
If you want to input new text directly inside of a formula, nest it inside of quotation marks [“”].
Here is one example:
Note that every time we wanted to add comma and space, we had to yet again use operator and [&] and yet again insert that text inside quotation marks [“”].
There are also several functions you can use for joining text, such as TEXTJOIN and CONCAT:
As demonstrated in these examples, those functions can be very useful in the right circumstances. They can also complicate things fast, and for most purposes you can always resort back to using and [&] operator.
Basic text functions
The EXACT function compares two text strings. If they are the same, function will return TRUE, and if they are different FALSE.
In our example table, we can check did our user registration process perform as expected – did it generate usernames which are the same as users e-mail addresses:
We’ve received two FALSE returns. In the row 5, user entered its e-mail address in uppercase, while the user registration process generated lowercase username – this is not an issue. In the row 6, user registration process generated username is blank – that is an issue.
The LEN function returns the number of characters (including spaces) in a text string.
In our example table, we can check what number of characters do some of our cells contain:
Numbers returned here will actually be treated as numbers by Excel, and can further be used in mathematical operations. If the LEN function encounters empty field, it will return 0.
The SEARCH function locates one text string within a second text string, and returns the number of the starting position of the that text string. Or, to be more practical, if you want to locate the first occurrence of a single character inside a text string, you can search that character. The SEARCH function is not case-sensitive.
The FIND function works much the same way, but is case-sensitive.
We can look for a character “a” in the first column:
You can see that we will here also receive numbers as results. However, if we can’t find our characters, or if we reference an empty cell, we will receive an error.
Both the SEARCH and the FIND functions can be further conditioned with the starting position. For example, formula =SEARCH(“a”;A7;2) would still look for the first occurrence of the character “a” in the cell A7, but it would start looking with the second character and would then return 4 as a result.
LEFT, RIGHT and MID functions will return defined number of characters from the left, right or defined position in text string. By using them, you will create new text string with your defined number of characters:
Syntax of the formulas is almost identical, we first define referenced text string, follow with starting position (if any), and finish with the number of characters we want to return.
Now that we know all the basics, we can do something useful. We can take column D Domain text and separate second-level and top-level domains via formulas. That will look like this:
Given that the period [.] character separates second-level and top-level domain names, we are able to use the SEARCH function to locate its position in the text string. Given that everything to left of that character is our second-level domain, we can then use the LEFT function to return us one less than position of period [.] character left number of characters. Similarly, we can use the RIGHT function to return length of the string minus position of the period [.] character right number of characters, in order to retrieve top-level domain.
PROPER, UPPER and LOWER are fairly straightforward functions that can be used to change the text case. In this example, we will adjust our columns A, B and C respectively with them:
How to convert number to text?
The ISTEXT function enables us to check whether are we dealing with text or not, and returns TRUE or FALSE depending on the outcome of the check.
Other way around, the ISNUMBER function enables us to check whether are we dealing with number or not, and returns TRUE or FALSE depending on the outcome of the check.
This is how that looks in practice:
We will deal with IS formulas in more detail later, but for now note that the ISTEXT function returns TRUE for text cells, and FALSE for number and empty cells. The ISNUMBER function returns TRUE for number cells, and FALSE for text and blank cells.
In theory, we could try formatting these cells as text. If we copy our values from column F to column G, right-click that selection, and then pick Format Cells, we will open Format Cells pop-up where we can select Text format instead of General:
However, this will only make those numbers look like text, i.e. they will be aligned left. For the purpose of calculations those will still be treated as numbers, as we can also see in ISTEXT and ISNUMBER returns:
To actually convert those numbers, our first options is adding something called leading apostrophe.
Apostrophe [‘] is a very special sign in Excel when properly placed at the start of the value:
- if will not be displayed in the worksheet
- it will not be visible to Find and Replace
- it will be displayed in formula bar and inside of the cell after double-clicking
- everything inside of a cell positioned right of the leading apostrophe will be treated as text, weather originally a number or a formula.
We can add leading apostrophe by manually entering it left of the number as shown below:
Leading apostrophe should, in principle, be added manually into each cell, and as such represents a quick solution with only specific uses. While it can be added via formulas and functions, that can become quite messy really fast, and kind of defeats the purpose. It should also not be added to files you expect will be uploaded to or opened with applications other than Excel.
More general-purpose tool for conversions of numbers to text is the TEXT function. TEXT function will perform two tasks:
- it will convert number to text
- it will format that number converted to text, in multiple different ways if asked – which can be useful in certain contexts.
We will show three examples here:
In the column I, we are converting numbers from column F and formatting those numbers as plain text.
In the column J, we are converting numbers from column F and additionally formatting those numbers. Using this principle, we could, for example, format international telephone number such as 00999991234567 into something more readable like +999 99 123-4567. For that, we would use the formula =TEXT(“00999991234567″;”+?? ?? ???-????”).
In the column K, we are converting numbers from column F and adding leading zeroes. This is useful when dealing with various identification numbers which often have fixed number of digits, and zero can be the starting one.
The TEXT function is the most versatile options you have for formatting text via functions, and you will probably use it often.
It is also worth noting that many functions shown in Part 2 lessons will convert your numbers to text. For example, you can convert number 237 to text with both =LEFT(237;3) and =UPPER(237), in spite of lack of rationale behind such usage of those functions.
But you should keep in mind that data returned by text functions is (minus few inconsistencies) treated as text, as it should be. If you use the SUM function on 2 and 2 formatted as text, the SUM function will return you 2.