Anonymised Data & GDPR Compliance using Many-to-Many Relationships in Power BI

You are here: >

Anonymised Data & GDPR Compliance using Many-to-Many Relationships in Power BI

A customer recently asked me if it was possible to view anonymised data in Power BI.  I will set some context:

  • Power BI reports viewed by different Brands.
  • Brand managers can see competitor brands sales values, but not allowed to know who the brand is. E.g. ‘Contoso’ brand would view ‘Datazen’ brand as ‘Brand 1’.
  • Sales Associates are ONLY allowed to see data for their brand.
  • The underlying data for the brand names are not exposed to a given brand.

Immediately I began solutionising.  Could I leverage Row Level Security (RLS) or perhaps need to investigate Dynamic Data Masking (DDM) in the SQL Database?  The latter has an impact on performance and would also mask the data at REST, which was not a requirement for the client.  Time to take step back.
To protect the customers identity, this blog post utilises fictional data in Excel. The data from the real scenario was sourced from SQL Server.

Pseudonymisation or Anonymisation?

Technically, what the customer requires is a form of Pseudonymisation.  This helps to enhance privacy by replacing identifying fields (within a data record) with artificial identifiers (pseudonyms). Olenka Van Schendel’s blog post (here) explains how Pseudonymisation complies with Article 3 of the GDPR and in turn, enabling “data processors” to meet data compliance.
The difference between anonymised and pseudonymised data is the way it is categorised as personal data. Pseudonymous can still be indirectly re-identified, while anonymous data cannot.

Solution

After studying the sales data, it became apparent that another data set would be required to map the brands to the appropriate anonymised brand.  I could then utilize the user’s email address to verify what brand they belong to, meaning Power BI will dynamically display the correct data.
Step 1 – Setup data structures
I ended up with two data sets – see below.
Sales Data

Permissions Data

  1. Email used to determine Power BI user.
  2. Brand Joined back to Sales data to determine sales for a given brand.
  3. Brand Name Anonymise is the display attribute users see in the report.

Step 2 – Create Many-to-Many Relationship
The next task was to import the data and model it using the ‘Relationships’ tab in Power BI.  Prior to Microsoft’s Power BI Desktop release in July 2018 (found here), the two tables couldn’t be mapped as they contain many brand values in both data sets.  Bridging tables and extra modeling was often needed in order to support Many-to-Many relationships.  The good news is they are now natively supported – making it far easier to build data models.  For a step-by-step guide on how to activate and use this feature, click here.

Step 3 – Manage Roles
I used Jane Doe as my test subject.  At this point, the report contained all possible anonymised Brands. Navigate to the ‘Modeling’ tab to locate the Manage Roles button.

  1. Click ‘Manage Roles’.
  2. Add Role.
  3. Assign permission to ‘Permissions’ table.
  4. Add filter context for Jane Doe email address.

There is a cool feature (View as Roles) within Power BI Desktop, in which you can impersonate a given user.  This enabled me to test Jane Doe and confirm the permissions have worked.  The only brand name visible is ‘AdventureWorks’ – the rest are anonymised.

The process was repeated for the other brand users.  This is when I noticed a problem with Sally Jones, who is a Sales Operative.  They are not allowed to see competitor numbers at all, meaning further RLS was needed.  I added the following filter expression to the ‘Sales’ table:
[Brand] = “AdventureWorks”
Once the additional filter was applied, Sally Jones could only see the data for her own brand.

Step 4 – Test Underlying Data

This led me to one last test.  Can the underlying data be accessed and more importantly, can brand managers see other brands?  As you can see in the below Export to Excel output, the data is gully anonymised.

The customer now has a fully working proof of concept, which meets all 4 of their requirements listed at the beginning of this article.

Contact Me

If you have any questions or have come up with your anonymisation techniques in Power BI, feel free to leave a comment below.  I am happy to provide the pbix file and data upon request.
Twitter:   @DataVizWhizz

Author: 
Callum Green

Join our list

eBECS will invite you to webinars, events and keep you up to date with relevant news. You can unsubscribe at any time.

UK: +44 (0) 8455 441 441
Ireland: +353 (0)1 893 4831
USA: +1 (678) 701 5856
Saudi Arabia: +966 (11)920 007299

© 2018 eBECS Limited. All rights reserved.
Registered office: Royal Pavilion, Wellesley Road, Aldershot, Hampshire, England, GU11 1PZ