I read something once that said good is bad and bad is good. Often in life, I’ve learned it’s more valuable of a lesson when something doesn’t work out rather than when everything is going in the right direction. For this reason, I felt it was appropriate to create a post on bad practices rather than best practices when it comes to Power BI for an end-user to reflect upon and see what not to do when using Power BI.
The goal of this post is to teach what to do by learning what not to do.
It’s important to note that items I'm going to talk about may not hold relevancy for every project. Every project is different for various reasons including audience, project length, and size.
What NOT to do with Power BI:
1) Keeping a Power BI File on Your Local Machine
During the COVID-19 pandemic, one thing I've learned is that anything can happen at any time. Using the cloud and fully utilizing a collaborative platform is important in this new world. When a user keeps the file locally, it introduces a set of risks and undermines the team's ability to cross communicate and collaborate. It also raises the risk of losing the Power BI.
It might be better to use a collaboration platform, such as OneDrive, GitHub, etc. One thing that I've seen work best is to start with a file on a Team's site and having users sync with their OneDrive. Using Teams avoids having users download and review only to reuploaded files again. Using the sync features makes the whole user experience seamless. Using teams is also more critical when you're working with larger files.
2) Not Keeping a Power BI File Backup
If we rewound time, it wasn't always the case where users could download a Power BI file from the data set. Moving toward the cloud for everything is the future. However, using the service as your master copy without having any other backup is a recipe for disaster. In some cases, the download feature could be disabled when working with more advanced features. In this scenario, if you were to rely on the service as your only backup, you would be left starting from scratch with no way to edit or modify the data sets or reports.
Knowing your data is where governance and CI/CD come into play. It's important to start thinking about where to store your files, who has access to them, and what security should be around access to the data. Data is a living and evolving process.
3) Use and Consume Source Data Files On Your Local File System for Your Power BI Model
There have been scenarios I've seen where a Power BI user was using local files to publish to the service. The Power BI user left the company after transitioning the Dashboard. A new user went to refresh the Dataset, and the file failed because it was looking for a local file on someone else's computer.
There are options to avoid this pitfall:
- Setup a shared OneDrive or TeamSite
2) Work of files in a blob storage account in Azure3) Have all the files a company file share
All of these options are great, it just depends on the scenario. Option 1
tends to be the best option from a change control perspective if the organization has a Team site handy. Organizations frequently already have a Team's site set up and are using Office 365. So, transitioning a Power BI file to the Team site tends to be a natural transition.
Option 2 tends to be a great fit if the organization doesn't have a Team site and doesn't have a file share. It's relatively straightforward to set up blob storage. However, from a support perspective, a typical Power User is not familiar with working with Azure, and having the file in blob storage tends to be more challenging for most users to manage.
Option 3 Is an excellent option for organizations that still rely heavily on an on-premises network. When working with on-premises systems, it adds the complexity of setting up the Power BI gateway. Power BI needs the gateway to establish the connection with any on-premises data and systems.
4) When Working with Larger Data Sets Not Using Power Query
As a long-time SQL user, my natural inclination is to levitate to SQL to solve common data problems. However, using Power BI and SQL to pull data into a table is not a best practice, it slows down the data processing.
Power BI has its method of working with data from a database and manipulating the data. It's referred to as query folding. The users can either go to the advanced editor and code the query or click through the transformation step by step in Power Query.
When using query folding, keep in mind it's not straightforward to do the advanced transformation from multiple tables at once. When these scenarios are presented, it makes sense to use a view in the source database and use Power Query to connect to that table.
5)Trying to Keep It "Real" Time By Using a Direct Query for Real-Time Analytics
When using Power BI’s Direct Query, it doesn't use an in-memory analytics engine. As a result, it doesn't have an analytics in-memory analytics engine to help process the data using the direct query. Power BI in-memory analytics solution is a core to the heart of the Power BI Product.
When you use direct query, the tool reads the data live and process the data live, which could mean visuals take longer to process.
6) One Table to Rule Them All
I often see users create one large table with all the critical data in one place. On the surface, this may seem like an easy win-win solution. The model is just one giant table with all the essential fields in one place.
However, the reality the analytics engine behind Power BI uses a Vertipaq in-memory solution designed around concepts around a "star schema". When you move to one size fits all, it loses its ability to optimize the solution, and the reports will run slow. The role of the data architect is crucial in building out any Power BI solution. It's important to consider how the data will be modeled and how end users will use the data.
Power BI is an extremely Powerful tool that is user friendly, but it’s easy to fall into potential pitfalls. Remember to keep these ideas in the mind when designing a project.