The Power Series Way, Part 1: Power Query


Challenge level: 2 out of 5

Estimated reading time: 14 min

Power Query functions covered: Change Data Type, Remove Duplicates, Auto-Generate Calendar Table (using M language)

Excel attachment reference sheets: template_Power Series


Once you have both Power Query and Power Pivot up and running, we can start using these powerful tools to calculate all of the above figures a lot quicker than it originally took with “normal” Excel functions.

We have already built a template that allows you to just copy and paste your own data into it and get monthly, YOY, L12M etc. figures. All you need to do is  hit the “Refresh All” button under the Data tab and see the magic happen!

It’s pretty simple to build this from scratch too. No macros. No VBA. It is, amazingly possible. Here’s how:  

Power Query in action

1. Set-up your first query: Get your primary data table in the right format

1.1. After you have copied and pasted your data into a normal Excel worksheet, go to the "Data" tab. Under the "Get & Transform Data" section, select the "From Table/Range" button.

1.2. You will be prompted to put your data into table form. Make sure that you tick the little box that says “My table has headers” so that your original headers do not get overwritten.  

power query step 1 (further crop).png

1.3. Once you click on the “OK” button, Power Query Editor window will pop up immediately.

If you do not see the formula bar, click on the “View” tab. Under the “Layout” section, tick the box for "Formula Bar".

power query window (further crop).png
day 4_ 2.1. power query window formula bar (further crop).png

1.4. Make sure you set your data to the correct types by clicking on the small icons next to the header words. In this case, we want the dates to be in date format, category to be Text, and value to be Whole Numbers.

1.5. Before closing and loading this query, make sure to also name (choose a relevant name) the query (aka the table created). Here, we’ve named it “Data”. The importance of this will be become apparent as we proceed further into the template set-up.

day 4_mailchimp edit.png

1.6. Finally, select the “Close & Load” button on the top left-hand corner on the “Home” tab. Click on the words portion of the button and a dropdown list will appear. You want to choose the “Close & Load To…” option. A separate window will pop up when you choose this option.

1.7. Make sure you select the correct options. First, choose to “Only Create Connection” as the option on how to view the data in Excel. Secondly, check the box at the bottom that says “Add this data to the Data Model”. We will explain in awhile what this “Data Model” is about.

day 4_4. power query step loading data (edit).png

1.8. Once the above is done, hit the “OK” button. You now have your primary data table set-up!

2. Set-up your second query: Create a sub-Category table

Creating a separate sub-Category table has its benefits, especially if you are intending to use slicers later to filter your data via these categories. The filtering will be done off this Category table which has significantly lesser rows (i.e. 21 rows) than the primary data table (i.e. 752 rows in this case but imagine if it has over 1000 or 2000 rows etc.). It's a much faster process.  

2.1. With the first query created, a Query list should appear on the right-hand side of the worksheet. (If you don’t see it, click on “Show Queries” under the Data tab’s Get & Transform.) Right-click on the “Data” query and select “Reference”. A new query will be set up based off this first query.

power query new query set up (further crop).png

2.2. Once the Power Query Editor window has been loaded, hold on to the ctrl key while clicking on the date, value and YYYYMM columns.

2.3. When all three columns have been selected, release the ctrl key and right-click on any of the three headers. Select “Remove Columns” from the dropdown list.

2.4. Right-click on the remaining “category” column header and choose “Remove Duplicates” from the dropdown list.

power query removing duplicates (edit).png

2.5. If you would like or need to add on further categorisation to the cleaned-up category list, you can do so by adding additional columns through the Add Column tab. In our template, we inserted a larger umbrella grouping of Cellphone, Camera, Computer, Home system and Others. To do this, click on the “Column from Examples” button and  select “From All Columns” option. Start keying in your additional categorisation in the column that appears on the right. Once you are done, click on the “OK” button.

power query adding category (further crop).png

2.6. To change the name of the header, double-click on it directly and make the edit. Finally, remember to assign a relevant name to the query. Here, we’ve named it “Category”. Follow steps 1.6 and 1.7 to close and load the Category table.

power query load as connection (edited) (further crop).png

3. Set-up your third query: Build a sub-calendar table

In essence, the separate calendar table here exists for the same purpose as the reference date table does for our Day 1 to 3 Excel template. It enables Power Pivot formulas to carry out dynamic calculations that factor in time period sensitivity. As with the separate Category table, it also allows the calculations to run at a much faster speed with less rows to filter through.  

3.1. Repeat step 2.1. to create a new query based off the first query (titled “Data”).  

3.2. With the Power Query Editor window opened, click on the “Advanced Editor” button that sits under the “Home” tab.

power query calendar table set up (further crop edit).png

3.3. Once the “Advanced Editor” window is loaded, copy and paste the below M-language code in:


let

Source = Data,

MaxDate = Record.Field( Table.Max( Source, "date"),"date"),

MinDate = Record.Field( Table.Min( Source, "date"),"date"),

DaysElapsed = Number.From( MaxDate-MinDate),

DatesList = List.Dates( MinDate, DaysElapsed + 1, Duration.From(1)),

RawDatesTable = Table.FromList( DatesList, Splitter.SplitByNothing(), {"date"}, null, ExtraValues.Error),

ChangedType2 = Table.TransformColumnTypes( RawDatesTable,{{"date", type date}}),

InsertedDay = Table.AddColumn( ChangedType2, "Day", each Date.Day([date]), type number),

InsertedMonth = Table.AddColumn( InsertedDay, "Month", each Date.Month([date]), type number),

InsertedYear = Table.AddColumn( InsertedMonth, "Year", each Date.Year([date]), type number),

#"Changed Type" = Table.TransformColumnTypes(InsertedYear,{{"Day", Int64.Type}, {"Month", Int64.Type}, {"Year", Int64.Type}})

in

#"Changed Type"


Click on the “Done” button to load the code.

In short, what the M language code does is to create a list of dates based on the earliest and latest date you have in your data.

If (for whatever reason) you want to reference a different original source data to build the calendar table from, you need to change the line “Source = Data”. Replace the word “Data” with the name of the reference table you want to base the calendar table off.

3.4. Remember to give the query a relevant name (we’ve named it “Calendar”) before repeating steps 1.5 and 1.6 to close and load the query.

Remember, once these queries are written, they will clean and set up your data the exact same way each time your raw data set gets updated. All you need to do is just hit the “Refresh All” button (under the Data tab)!