Transform Data in Power Query using the Unpivot and Pivot Functions

As you know, we can get data in all forms and shapes and sometime it’s simply NOT possible to use the data directly as we receive it. In these cases, Power Query comes to our rescue to help us shape and transform data so that we can use it in our reports.

I recently came across a question on the Power BI community forum where data was available in the following shape with Product, Kg and Bag columns for each Customer and each Date.

Before Transformation

Raw Data

Given the above data, it’s not possible to create reports where you can slice the data by product, compare customer sales by product, or perform any other type of analysis. The ideal shape and form of the above data should be as depicted below – do you guys agree? This form of data allows us to have a 360-degree view with the ability to slice by products and compare various data elements.

After Transformation

Final Transformed Data

It was really easy to transform the data using the Unpivot and Pivot transformation functions in Power Query. See the Power Query script and steps below that transformed the data into its required shape.

Power Query Script

    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMtQ31DcyMLRU0lFyBOKgzORUIGVoACKAOCC/JLEkHyxiABGP1QFpMkLWBFdkBFJjhDDGCKzJCKbJGKbJCYhD8nMTi8C6jEGKjEHCqYl5xWABA4gwRJsJTJszkhoTkBITFAEDiHBsLAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, Customer = _t, #"Prod 1" = _t, #"Kg 1" = _t, #"Bag 1" = _t, #"Prod 2" = _t, #"Kg 2" = _t, #"Bag 2" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Customer", type text}, {"Prod 1", type text}, {"Kg 1", Int64.Type}, {"Bag 1", Int64.Type}, {"Prod 2", type text}, {"Kg 2", Int64.Type}, {"Bag 2", Int64.Type}}),
    #"Unpivoted except Date and Customer" = Table.UnpivotOtherColumns(#"Changed Type", {"Date", "Customer"}, "Attribute", "Value"),
    #"Extracted Product" = Table.AddColumn(#"Unpivoted except Date and Customer", "Product", each if Text.Start([Attribute], 4)= "Prod" then [Value] else null, type text),
    #"Filled Product" = Table.FillDown(#"Extracted Product",{"Product"}),
    #"Added Unit Type Column" = Table.AddColumn(#"Filled Product", "Type", each Text.BeforeDelimiter([Attribute], " ")),
    #"Filtered Value Rows Contains Product" = Table.SelectRows(#"Added Unit Type Column", each ([Product] <> [Value])),
    #"Removed Attribute Columns" = Table.RemoveColumns(#"Filtered Value Rows Contains Product",{"Attribute"}),
    #"Pivoted Column" = Table.Pivot(#"Removed Attribute Columns", List.Distinct(#"Removed Attribute Columns"[Type]), "Type", "Value", List.Sum),
    #"Changed Data Type" = Table.TransformColumnTypes(#"Pivoted Column",{{"Kg", type number}, {"Bag", type number}})
    #"Changed Data Type"

Key steps in above scripts are

  1. Unpivoted data for each date and customer (step #3)
  2. Added new Product column from Value column (step #4)
  3. Filled Product column (step #5)
  4. Added Unit Type column (step #6)
  5. Removed Product from Value column (Step #7)
  6. Pivoted on Unit Type column (step #9)

Caution: Pivot is an expensive transformation and it may not perform very well on a large dataset.

Recommended Posts

No comment yet, add your voice below!

Add a Comment