In the “old days”, correcting messy data came down to writing complex formulas to transform data.  Imagine writing the following formula to extract a number from within a set of text.

=IFERROR(VALUE(TEXTJOIN( "", TRUE, FILTER(MID(D4, SEQUENCE(,LEN(D4)), 1),
ISNUMBER(VALUE(MID(D4, SEQUENCE(,LEN(D4)), 1)))))), "")

For many users, this is an impossible task.  The user would type the needed data by manually examining the source material.  If this constituted hundreds or thousands of records, the time required could be substantial.

On the other hand, if we use the Get and Transform tools (also known as “Power Query”) the task is reduced to just a few clicks… and NO formulas.

The best part of this occurs when we add new data.  A single click to tell Power Query to refresh the data will apply all the transformations to the newly added data.

Let’s look at some examples of common data transformation issues.

Start Using Excel in Power Mode

If you have a corporate job and work with Excel, then knowing Power Query is must.

This course is perfect for anyone that works with data from different files and struggles to create meaningful reports.

Featured Course

Master Excel Power Query – Beginner to Pro

Power Query is essential for Excel users who work with lots of data. This course teaches you how to use Excel in Power Mode and create meaningful reports with far less effort.
Learn More
Power Query Course cover

The Data and the Mission

Below is a snapshot of our data which is formatted as an official Excel Table named “TEmployees”.

The transformations we require are:

  • Split the Department and Position into 2 separate columns
  • Extract the Employee Number from the middle of the Unique Code and format it as E-## (ex: E-70)
  • Split the First and Last names from the Full Name into separate columns and correct casing issues

If we were to perform these tasks with the traditional formula-based approach, our formulas would get quite complicated and we would need to be well versed in dozens of different functions and tools.

Transformation #1: Split Text by Delimiter

Our first transformation is to separate the Department and the Position into 2 columns.

Because the Department and Position are separated by “space – forward slash – space” characters, we can leverage these characters as a delimiter to assist in the separation process.

  1. Click in the table and select Data (tab) -> Get & Transform (group) -> From Table/Range.
  2. Rename the transformation to “ProperData”.
  3. Select the Department/Position column and click Home (tab) -> Transform (group) -> Split Column -> By Delimiter.
  4. In the Split Column by Delimiter dialog box, select Custom and use a forward slash (/) as the delimiter.
  5. Click OK.

NOTES:

  • Once the split has been performed, another “Changed Type” step is added. This is to reassess the newly split data for possible new data types.  We could delete this step as it is not creating anything we didn’t already possess.
  • Technically, we could have used a “ / “ (space, forward slash, space) as the custom delimiter. This would have allowed us to eliminate the upcoming “Trim” step.

Removing the Leading & Trailing Spaces

If you select to the right of one of the Department names, you will be presented with a preview window at the bottom on the screen.

If you then click to the right of the data in the preview window you will notice that the cursor does not rest directly next to the last letter; there is a trailing space.

We can also discover that there is a leading space before the Position in the second column.

Trimming the Data

To remove the trailing and leading spaces, select the Department column, hold CTRL and select the Position column.  Select Transform (tab) -> Text Column (group) -> Format -> Trim.

I prefer using the “space – forward slash – space” approach as it reduces the query by a step.

NOTE: We will deal with the column headings at the end.  For now, we can leave them a bit messy.

Transformation #2: Extract Employee ID from Unique Code

Our next step involves the extraction of the Employee ID from within the Unique Code column.  Notice that the format is “last name – employee ID – first name”.

Select the Unique Code column and click Transform (tab) -> Text Column (group) -> Split Column -> By Non-Digit to Digit.

This separates the Last Name from the Employee ID/First Name.

Next, select the column with Employee ID and Last Name and click Transform (tab) -> Text Column (group) -> Split Column -> By Digit to Non-Digit.

This separates the Employee ID from the First Name.

Getting rid of the unneeded bits

As we only need the Employee ID, we will remove the newly separated First Name and Last Name columns.

Click the column of Last Names, then press CTRL and click the column of First Names.

Press the Delete key to remove the selected columns.

Formatting the Employee ID

Remember, one of the requirements was to format the Employee IDs with an “E-“ prefix.

Select the column that contains Employee IDs and click Transform (tab) -> Text Column (group) -> Format -> Add Prefix.

In the Prefix dialog box, enter a Value of “E-“ and click OK.

We now have our properly formatted Employee IDs.

Transformation #3: Separate Names and Format Casing

Our final set of transformations is to separate the contents of the Full Name column into a First Name column and Last Name column whilst ignoring any middle name information.

There is a wealth of options sitting withing the Extract feature in Power Query.

We don’t want to transform what we have into a new set of data; we want to leave the original Full Name column while adding additional columns for First Name and Last Name.

For this operation, we will use the version of Extract located on the Add Column ribbon.

Extracting First Names

Select the Full Name column and click Add Column (tab) -> From Text (group) -> Extract -> Text Before Delimiter.

In the Text Before Delimiter dialog box, enter a space in the Delimiter field and click OK.

We are presented with a new column that contains all text up to the first space in the Full Name text.

Extracting Last Names

Select the Full Name column and click Add Column (tab) -> From Text (group) -> Extract -> Text After Delimiter.

In the Text After Delimiter dialog box, enter a space in the Delimiter field.  Expand the Advanced Options and set the Scan for the Delimiter option to “From the end of the input” and click OK.

Starting from the end and “looking” backward is necessary because of the existence of middle names in some of our records.

If we began our search for a space from left-to-right, we would stop before a middle name and extract the middle and last names.

We are presented with a new column that contains all text after the last space in the Full Name text.

Format the Names with Proper Casing

The next step is to format the newly added First Name and Last Name columns so that the first letter is upper-case while the remaining letters are lower-case.

Select the First Name and Last Name columns then click Transform (tab) -> Text Column (group) -> Format -> Capitalize Each Word.

We now have our properly formatted names.

Featured Course

Black Belt Excel Package

What would your life look like if you became an Excel Black Belt? You’ll get the best deal with this (cost savings) package. It will take you from Excel Newbie to Grand Master.
Learn More
Excel Black Belt course cover

Finishing Touches

  • We no longer require the original column of Full Names, so we can select the Full Name column and press Delete.
  • Rename the column headings with more meaningful names.

Sending the Results to Excel

To send the transformation results back to Excel as a finished table, select Home (tab) -> Close (group) -> Close & Load (lower part of button) -> Close & Load to…

You can load the results to a table on a new sheet, or an existing sheet.

The results are as follows.

All three requirements have been satisfied and we didn’t have to write a single formula to get the job done.

Updating the Results with New Data

Because Power Query “remembers” the steps you performed on the original data, you can easily repeat those steps when you receive new data.

If we add a few new records to the original data set…

We can click Data (tab) -> Queries & Connections (group) -> Refresh All to absorb the newly added records into our query results table.

Now sit back and bask in the glory that is a fully-automated query.

NOTE: Text manipulation with formulas has since gotten easier, with the new functions TEXTSPLIT, TEXTBEFORE & TEXTAFTER available in Microsoft 365. They are like the Split Column and Extract features in Power Query, and they’re also dynamic. However, if your data updates frequently or comes from external sources, like SharePoint or SQL database, Power Query is still your best bet.

Practice Workbook

Feel free to Download the Workbook HERE.

Excel Download Practice file

Leila Gharani

I'm a 6x Microsoft MVP with over 15 years of experience implementing and professionals on Management Information Systems of different sizes and nature.

My background is Masters in Economics, Economist, Consultant, Oracle HFM Accounting Systems Expert, SAP BW Project Manager. My passion is teaching, experimenting and sharing. I am also addicted to learning and enjoy taking online courses on a variety of topics.