Heat map technique with data smoothing

More than a year ago, a question was asked on the Power BI community in regards to showing a heat map in Power BI. Not just a simple heat map in the matrix visual, but a heat map based on data smoothing. I cannot find the link to that post (sorry for that!), but I want to share what I learned from it and how I solved it.

The business logic was to smooth data by taking the average of the surrounding two rows and two columns in both directions and to take the average of the value based on the bin size on the columns. In our case, we are assuming that the bin size value for the column is set to 5.

To explain this further, assume we have the following data in Excel and we want to calculate the average of Cell C3; we need to look at the adjacent two rows and two columns.

Smoothing Avg Data Example

In this case, the total values will be 243 (SUM A1:E5). For C3, which is in row 3, we are adding the values from row 1 to row 5, and similarly for column C we are adding the values from column A to column E; that’s what translates to A1 to E5 (SUM A1:E5). The average value for this cell (C3) will be 24 – 243 divided by 10. You get 10 by multiplying the bin size (5) by 2, and then 243 divided by 10 rounds to 24.

I hope this gives you an idea on how we are calculating the average of each cell to smooth our data. Let’s see how it all fits into Power BI and how the heat map looks after data smoothing.

Here is some sample data with three columns:

  • Length
  • Weight
  • Qty
Raw Data

In this example, Length will be on rows with bin size of 150 and Weight will be on columns with bin size of 5.

Let’s first add a bins column to our table in order to add the bin for the Length column.

  • Right click on the Length column
  • Select New Group
How to add bin?
  • In the New Groups window, add the bin name – in our case it is called Length (bins) and give it a Bin Size, i.e. 150
Bin Settings

Similarly, we are going to add Weight (bin) with the Bin Size value set to 5. Once the bins are added, we will have two new bin columns in our model: Weight (bin) and Length (bin).

Learn more about Grouping and Binning here.

Now it’s time to add a measure to calculate the average for each cell as explained in the beginning of this post, using the Excel sheet as an example.

Smoothing Avg =
VAR __weightBinSize = 5   --bin size of our columns
VAR __totalWeightBin = __weightBinSize * 2
VAR __myWeightBin = SELECTEDVALUE( FlatData[Weight (bins)] )  --get current weight bin
//establish range for weight bin. Go back two columns (bins) and go forward two columns (bins)
VAR __minWeightBin = __myWeightBin - __totalWeightBin   --go back 2 columns
VAR __maxWeightBin = __myWeightBin + __totalWeightBin   --go forward 2 columns
VAR __lengthBinSize = 150   --bin size of length columns
VAR __totalLengthBin = __lengthBinSize * 2
VAR __myconcSpan = SELECTEDVALUE( FlatData[Length (bins)] )  --get current length bin
//establish range for length bin, go back two rows and go forward two rows
VAR __minLengthBin = __myconcSpan - __totalLengthBin    --go back 2 rows
VAR __maxLengthBin = __myconcSpan + __totalLengthBin    --go forward 2 rows
//sum the quantity for the range, it is similar to A1..E5 in our excel example above
VAR __totalQty =
    CALCULATE(
        SUM ( FlatData[Qty] ),
        FlatData[Weight (bins)] >= __minWeightBin ,
        FlatData[Weight (bins)] <= __maxWeightBin ,
        FlatData[Length (bins)] >= __minLengthBin,
        FlatData[Length (bins)] <= __maxLengthBin
    )
RETURN
    DIVIDE( __totalQty, __totalWeightBin )  --calculate the average

Alright, now that our measure is in place we can create a matrix with a heat map. To do so, select the Matrix Visual and add Length (bins) on rows, Weight (bins) on columns, and our average measure (Smoothing Avg) in the Value section. This is what it will look like:

Matrix

Once all of the above is done, this is what the matrix will look like. I love numbers, but this looks super boring:

Raw Data

So, now it’s time to create a heat map, and for this we will use conditional formatting. Before that, however, let’s see what a heat map without data smoothing would look like:

Raw Data Heat Map

Looks pretty good, but let’s check out how it looks after we use the data smoothing average measure.

Heat Map with Smoothing Avg Measure

Looks better – but let’s take it even further. We really don’t need to see those numbers; in the conditional formatting window, let’s change the foreground and background to the same colour. Here is the output:

Heat Map with Smoothing Avg no number

I find the above heat map to be much more useful than a heat map without data smoothing.

And this is what we are doing in the conditional formatting window for both the foreground and background colour:

Conditional Formatting for Heat Map

Read more about conditional formatting here.

We can do a lot of amazing things in Power BI to visualize our data and this is just one example. How will you use it? As always, I’m looking forward to your feedback and learning from you all!

Useful Links

Learn more about Grouping and Binning here.
Read more about conditional formatting here.

Self-serve data prep at lightening fast speed with Power BI dataflows and Azure Data Lake

Power BI is an amazing tool as it fulfills all level of BI needs, whether you are using it for data exploration, as self-service BI to analyze the data, as an Enterprise BI solution, or embedding into your internal or external portal.

I have been working with Power BI since it was made available back in 2015 and have worked with many kinds of organizations and users. Power BI has come a long way since then and I am wholeheartedly enjoying this journey. Although I have seen different businesses and users using Power BI in different ways, I have noticed that there is one thing they all have in common – that users need to wait on the IT/technical team to prepare the data and store it somewhere before they can connect to the data and start working on analyzing/visualizing it. As we all know, data is the fuel in any BI tool and on time data availability with the correct shape/structure makes data analysis much easier. Without proper data, a BI tool is like having a Ferrari but not having the proper fuel to drive it. Sometime data prep takes longer than expected and business miss the opportunity to make smart and timely decisions. This is where Power BI dataflows come into the picture.

I have been working with Power BI dataflows for the last 9 months or so and worked on two major projects where dataflows helped in data preparation – including complex data transformation which used to take hours and can now be completed within minutes. It saved a lot of time for the business to try and get value out of their data and the business was now able to deliver the solution on time – leading to happy clients 😊. Of course, there were a few challenges on the way since this is a comparatively new technology, but at the end of the day everything worked out great.

In an upcoming meetup here in Vancouver, Canada, I will be speaking about my experience and journey using dataflows:

  • how has it helped organizations to prepare data at lightning speed?
  • what challenges have I faced?
  • what are some things to avoid?
  • what are the best practices?
  • and what is in the future?

The presentation I have prepared for this meetup is available to download at the link below. If you have any further questions, don’t hesitate to reach out by email or leave your comments below and I will get back to you ASAP!

Implement Row Level Security based on business rules in Power BI

Here is another interesting use case of Row Level Security (RLS). It’s funny that my first post was also about RLS – when you use RLS, lot of different challenges show up (which are not relevant and do not have any use cases without RLS).

In this post we are going to look at how we can implement RLS based on a certain business logic/condition.

Basics of Row Level Security (aka RLS)

Just a quick note on Row Level Security: it is essentially a technique to filter rows based on the user who is viewing the report. Here is a typical example: a Sales table contains data for all company sales with salesperson information. A single Power BI report is viewed by all the salespeople but we want each salesperson to see only their own sales. This is where RLS comes into the picture.

Based on the salesperson who is currently viewing the report, the data for other salespeople gets filtered out even though the dataset contains data for all the salespeople.

Parts of RLS

Who is the current user: Identify which user is currently logged in. Luckily, we have a DAX function called USERNAME() which returns the current user login name. Details of this function are provided in the link at the end of this post.

RLS role: It is basically a DAX expression that uses business logic to filter the rows from the user; it returns either TRUE() or FALSE(). The rows that return FALSE() value are hidden from the user.

Problem Statement

So now we know the basics of RLS, how it works, and what the key components are to successfully implement RLS.

We have the following table that contains Name and Room Size for each user. The problem we are trying to solve here is that based on the current user logged in, we want to take the room size of that user and show all the rows that are between -20% to +20% of the current user’s room size.

Here is the sample data:

Sample Data

Based on the above table, if user “A” is logged in we want to show all the rows where the room size is +/- 20% of the current user’s room size. User “A’s” room size is 20 and the range is going to be 16 to 24 (the lower range is 20 x 80% and the upper range is 20 x 120%). This means that we want to show all the rows where the room size is between 16 and 24, and as per our sample table we expect to see following rows when user “A” is logged in.

Data based on user “A”

In the case where user “C” is looking at the report, we will see the following rows (range 80 to 120)

Data based on user “C”

Solution

To implement this, we need the following measures:

  • Current user’s room size
  • Lower Range
  • Upper Range

In RLS role, return TRUE() if the room size falls between the lower and upper range – otherwise return FALSE() (hide rows from the user).

Quick list of steps to create a new role:

  1. Go to “Modelling” tab
  2. Click “Manage roles”
  3. Click “Create”
  4. Enter role name
  5. Select the table on which the business rule is going to be applied
  6. Enter role formula
  7. Click “Save”

Here is the DAX expression which will do the magic for us:

--get current user name
VAR __user = USERNAME()
--get current user's room size
VAR __roomSize =
CALCULATE (
    [Room Size],
    ALL ( 'Table' ),
    'Table'[Name] = __user
)
--get range
VAR __upperRange = __roomSize * 1.20
VAR __lowerRange = __roomSize * 0.80
--check room size in the range, return true or false
VAR __rls =
[Room Size] >= __lowerRange &&
[Room Size] <= __upperRange
--final return
RETURN __rls 


Let’s see how it works when we are viewing as user “A” or “C.”

Role Testing

I hope this post is helpful and provides some insight on how you can implement your business logic in RLS.

Your valuable and constructive feedback is always welcome and I am always open to learning from your interaction with my posts!

Credit

I would like to thank my daughter for helping me with this post. She is currently a third-year BCom student at UBC Sauder.

How to set default value of single select slicer using RLS?

I have been wanting to start writing my own blog posts for around a year now in order to share my knowledge and experience – and I have finally gotten around to writing my first blog post! In this blog post, I will be discussing a question that was posted on the Power BI Community forum. If you are not aware of what the Power BI Community forum is, I highly recommend that you check it out at community.powerbi.com. It is a great place to learn from others, ask your own questions, learn about best practices, and hone your skills in Power BI.

Now let’s get back to the question at hand: “How do you reset the Single select slicer when using Row Level Security (RLS)?” Before we get into the solutions, let’s first look at the Single select slicer and what it does.

Single Select Slicer

Single select is an option in the slicer settings that allows the user to select only one value in a slicer. Once “Single select” is turned on, the user must always have one default value selected in the slicer – you cannot have no values selected in the slicer.

To turn a slicer into “Single select,” go to the “Format” tab and toggle “Single select” to “on” (as depicted in the image below).

Image 1

So, what are the issues when using a Single select slicer with RLS? Here is the problem that a user posted on the Power BI Community forum:

I have a report based on RLS with two slicers one Department  and second Jobs. Users can see only jobs which they belong to theirs department. Everything works just fine (for example user A can see jobs belong to Dept 10 and user B can see jobs belong to dept 11) The problem starts when I modify the Job slicer to single select for example if I publish the report I need to select at least one job which belong to department that only users with RLS can see. For example, I published a report with Department = ALL and Job 10 When user B try to browse the report he / she see an empty report because theirs department is not on theirs RLS list.

My question – how do I reset the single select slicer so on the first time user can see theirs department and jobs? 

Here is the link to the post on community forum.

One important thing that I want to mention is that I will not be explaining Row Level Security in this blog post. If you want more information regarding RLS, I will share some resources at the end of this post that explain what it is and how you can set it up.

Now let’s discuss the solution that I came up with. First, let’s look at the very simple data model. There is one table called Investment, containing a “Team” column and some other number columns, and one table called Users, containing a “User” column along with which “Team” data they can see.

The image below depicts the data model:

Image 2

Note: We don’t want to set any relationship between these two tables.

Below is what our Users table looks like – we have a special row in this table where both the “Team” and the “User” value is ALL. This row will later be used for our solutions, but for now it is important to know that there is a special row in our Users table.

Image 3

First, we are going to add a simple measure to identify our special row (ALL) in the Users table:

ALL Team = IF ( MAX ( Users[Team] ) = "ALL", 1, 0  )

The next step is to add a role for RLS that identifies which users can see which teams. This utilizes the logic that every user can see the ALL row. There are seven steps to add this simple RLS filter:

  1. Go to “Modelling” tab
  2. Click “Manage roles”
  3. Click “Create”
  4. Enter role name
  5. Select Users table
  6. Enter role formula
  7. Click “Save”
Image 4

Once we have created our role for RLS, we can quickly test it by dropping a slicer visual onto the canvas and adding the “Team” column from the Users table onto the slicer. To test a role, click “View as” on the “Modelling” tab and enter the user name. It will list only those teams which the selected user can see.

An important point to note here is that you will always see an “ALL” option in the slicer – it doesn’t matter which user is currently looking at the data.

Image 5

Now in the next step we are going to filter our Investment table based on the current user. To achieve this, we will create a measure that will show only those teams which the currently selected user can see.

Filter Team =
VAR __isAllTeamSelected = [ALL Team]
RETURN
CALCULATE (
    COUNTROWS( Investments ),
    KEEPFILTERS(
        TREATAS (
            SUMMARIZE (
                FILTER(
                    ALL ( Users[Team] ),
                    IF ( __isAllTeamSelected = 1, TRUE(), Users[Team] = MAX ( Users[Team] ) )
                ),
                Users[Team]
            ),
            Investments[Team]
        )
    )
) + 0

Let’s look what above measure is going to return. It is going to give us count of rows from Investment table for only those teams which a user can see based on Users table, and if a user cannot see a team, above measure will return a Zero value and in case “ALL” value is selected we will get row count for each team a user can see.

If ALL value selected in the slicer

Image 6

If single team selected in a slicer, only that team shows row count, other teams has Zero value.

Image 7

We will use this measure as visual level filter – see screen shot below to exclude teams which we don’t want to show, or in other words only show selected team, and case of “ALL”  we will see all the teams of the user.

Image 8

So how does this fit in our original problem.  Didn’t I mention that special row with team “ALL” is important as this row will be always available regardless what user is viewing the data. In our “single select” slicer we will select “ALL” as default selection and regardless what user is looking at the data, they will always see their teams with “ALL” as default selected team.

Power BI file is attached with above solution. As I’m always learning, looking forward to your valuable feedback or share how you will solve it.

About the author

Parv Chana is a BI Architect and Consultant with over 20 years of industry experience. He runs a small consulting company, PeryTUS IT Solutions, where he implements Power BI solutions for his clients, which range from small to large enterprises. He has experience working within a wide variety of industries and has deployed many unique solutions to solve his clients’ BI needs. He is a Microsoft Data Platform MVP and has been working with Power BI ever since it was made available in 2015.

His areas of focus include implementing end to end Power BI solutions (from data modelling to visualization), analyzing existing Power BI solutions, helping organizations follow best practices for scalable BI models, and providing training to help his clients achieve their BI goals.

Useful Links

Learn more about RLS here.
Link to USERNAME() and TREATAS function