I am quite excited to share some of the possibilities in the analytics world with integration between R and Microsoft Power BI. Data wrangling is the biggest benefit as far as I’m concerned. I would like to discuss a few scenarios and R packages that can be used with Power BI.
When I heard about the integration of Microsoft Power BI integration with R, the first thing I wanted to try was the famous Hadley Wicham’s package DPLYR. The reason I wanted to do that was to check if the integration of R in Microsoft Power BI supports the custom R packages or does it work only with inbuilt R commands. To my surprise, it supports custom packages like DPLYR. The SELECT query from DPLYR worked perfectly. I am not sure though if the command INSTALL.PACKAGES would work. The best way to make custom packages like DPLYR to work is to install the package in the system and just use the LIBRARY function in the EXECUTE R COMMAND in Power BI. I will going through detailed steps on how to execute R command in Power BI using the new “EXECUTE R COMMAND” tab in the OTHERS in data source section of Power BI.
So, how far does the Power BI integration with R go? Can we run the graphics package codes in Power BI? Well, obviously the answer is NO as Power BI is a visualization tool itself. Microsoft obviously doesn't want the graphics capability of R to cannibalize their own product. So, the R visualization packages like GGPLOT2 will not work with Power BI. The script will not execute if there is any R code which produces graphic output. The ultimate goal to use the “EXECUTE R COMMAND” should be to create cleansed datasets from raw datasets that we need to use for our analysis. We can import all the raw data from different sources into R very easily. For an example, we can import multiple EXCEL files, CSV files, Text files, JSON documents etc. Then, we can cleanse the raw data using R packages like DATA.TABLE, SQLDF and DPLYR. Ultimately, we can have all the cleansed data with no missing values, unwanted rows, incorrect data, duplicates etc. from our raw datasets.
We all know that 80% of the time in analysis is actually spent on importing and cleansing the data. R packages like DPLYR, SQLDF and DATA.TABLE are making the data cleansing and data wrangling tasks simpler than ever. Microsoft, after the acquisition of Revolution analytics is increasingly bringing these powerful R capabilities to different platforms like Microsoft Power BI and SQL SERVER 2016. This is an awesome move by Microsoft as Power BI has some amazing interactive visualizations capability. The interactive visualizations of Power BI when combined with the powerful data wrangling capacity of R, Power Bi adoption will be easier by data scientists in large enterprises.
Power BI has already gained the attention of Digital Marketers and Financiers with the integrations like QuickBooks and Google Analytics. Now, with the integration of R into Power BI, Microsoft will surely get the interest of data scientists in Power BI.
In my next blog, I will share how I used DPLYR package to cleanse our Eventbrite attendee data and analyze the type of audience that’s going to visit our conference. This analysis will help our speakers to know the type of audience attending the conference prior so that they can tailor their presentations to the audience.
Please CLICK on the link to check out the Implementation of r code in Microsoft Power BI.