Azure Synapse Analytics is a limitless analytics service that brings together data integration, enterprise data warehousing and big data analytics. Among all the new Microsoft offerings, Synapse generates the most excitement and promises to be a gamechanger for the data side of things. However, there are new features in Synapse that regular users may need to get acclimatized to, and certain others that they will be bound to miss. One such feature is the missing Synapse Pipeline connector in Azure Logic Apps as well as in Microsoft Power Automate Flow.
This Blog aims to resolve the issue of triggering Synapse Analytics Pipeline from outside the Synapse Workspace based on an incoming new email trigger. This seems a fairly easy problem for which an easy solution does not exist. To achieve this trigger automation, we make use of Azure functions code to call the pipeline we want to automate, which in turn is called from within a Logic App based on an incoming email trigger.
For our use case today, we are making use of a copy data pipeline build into the Azure Synapse Analytics which copies data from On-Premises Oracle DW to Synapse local Azure Data Lake Storage Gen2. Every day, new data is loaded into the on-premises Oracle DW which in turn needs to be loaded into the Data Lake tables in the Synapse Analytics environment. This process needs to be executed as soon as the data load finishes in the Oracle DW. We receive an Outlook email once the Oracle DW loads are completed on the on-premises side. This email should then trigger the Synapse pipeline to run to copy data into the Data Lake Storage Gen2 tables.
Solution Process Workflow Diagram
Solution Process Flow - Automating Synapse Analytics Pipelines using Azure Logic Apps and Azure Functions
The solution resides inside of an Azure Logic App comprising of Azure Functions App, Azure Synapse Analytics and Power BI Datasets. The Logic App trigger is set up to execute the entire flow when a new email is received in the Outlook inbox. This in turn triggers the Azure Function App containing the C# code to call the master pipeline from within the Azure Synapse Analytics workspace. Current Synapse Analytics architecture has one Master_All pipeline, which triggers the Stage load and another pipeline that triggers the DW load. Once the load finishes, we copy a new Blob file into the Blob Storage folder. The Blob Storage confirms that a new blob is copied into the container folder, specifying the Stage and DW load is finished in Synapse Analytics to the Logic App flow. We then trigger the Power BI Datasets to refresh in the Power BI service to load the visuals with latest data. Any subscriptions, dashboards also get sent out, refreshed at the same moment.
Let us look at each component individually –
Synapse Analytics Pipeline
- Synapse Analytics – contains pipelines to copy data into Sales and DW tables
- Master_All – Pipeline thereby executes Stage and DW pipelines by calling Execute Pipeline Activity
- All pipelines work on successful completion of previous activity
- Move Parquet File –
- Copy Activity works on copying a blank parquet file from one folder to LogicAppLoad/Finished folder
- Informs the Logic App that all tables have finished loading
Master_ALL Pipeline - Executes Stage and DW pipelines
Pipeline Source Dataset Settings
Pipeline Sink Dataset Settings
Azure Functions App
- Create an Azure Function choosing HTTP trigger option with Anonymous Authorization Level
- Add references to libraries that will trigger Synapse pipeline
- Create an instance of pipeline client providing Synapse dev endpoint and consequently add a form of authentication
- Create pipeline run instance to run the pipeline client
- Publish and host it on the web using Function app in Azure Portal
Azure Logic App
Azure Logic App
- Email Action Trigger – As soon as an email arrives in the Outlook inbox folder with the Subject – “Power BI Tables are loaded” the workflow will get triggered
- Azure Function – OutlookPipelineTrigger – calls Master_All Synapse pipeline which executes and loads data into Stage and DW tables
- Blob is added/modified trigger - Once all data is loaded into tables, a new parquet file is added in the blob to signify all data load is complete
- Power BI datasets are then refreshed
And this is how we accomplish the task of automating Synapse Pipeline trigger and Power BI datasets refresh based on a new incoming email using Azure Logic Apps and Azure Functions.
I hope this blog tutorial will be helpful for future tasks and to highlight the flexibility in using the Azure resources to achieve automation!