Skip to main content

Running the Data Migration Assistant - Part Three - Jupyter Notebook

Author by Frank Gill

Previously, I have introduced the Data Migration Assistant (DMA) and shown how to run it using PowerShell modules.  In the final post in this series on the DMA, I will show how you can use a Jupyter Notebook to consolidate the steps from the previous posts.

Jupyter Notebook is an open-source web application that allows combination of formatted text and executable code in the same document. Azure Data Studio supports Jupyter Notebooks containing T-SQL, PowerShell, Python, and Spark.  This example uses PowerShell code.  The completed notebook file is available for download here.

A Jupyter Notebook is made up of cells.  There are two types of cells, text and code.  In Azure Data Studio, the code type is determined by the kernel chosen in the notebook.  Currently, only one kernel can be chosen, so each notebook will support one type of code.  I used PowerShell for this notebook because it has the flexibility to execute T-SQL code.

Azure Data Studio is available for download from this link.  Once downloaded and installed, you create a new notebook by selecting File -> New Notebook.  You add a cell to the new notebook by clicking the +Cell button in the upper left or the +Code and +Text links in the center of the page.

Creating a new text cell will open the cell in edit mode.  Text cells are written using the Markdown language.  Markdown is a text-to-HTML language that is easy to use.  Azure Data Studio contains a formatting bar for each text cell which will insert the necessary markdown tags.

Samples of available formatting options are shown below.  In the latest version of Azure Data Studio, the preview pane for the text cell defaults to the right of the edit pane.

To insert a code cell, select code cell from the drop down or use the +Code button. The code cell uses formatting and highlighting for the kernel selected.  You can enter code in the code cell by typing or copying and pasting.  To execute the code, click the play button on the left of the code cell.

Once the code executes, the results will be displayed below the code.  One feature of the Juptyer Notebook is that results are saved with the notebook.

The notebook attached below contains code the perform the following steps:

  • Download and install the Data Migration Assistant – The code checks if the DMA is already installed.  If it is, the current version is checked against the installed version.  If it is not installed or the versions do not match, it will be downloaded and installed.
  • Download the PowerShell modules – The .zip file containing the DMA PowerShell modules is download, extracted, and copied to the newly created C:\ProgramFiles\WindowsPowerShell\Modules\DataMigrationAssistant folder.  The DBATools PowerShell module is installed for
  • Create the Inventory Database – Create a database to contain the DatabaseInventory table.  If the database exists, it will be dropped.
  • Create the DatabaseInventory table – Creates the DatabaseInventory with the columns required by the DMACollector module.  If you want to add additional columns, modify the table creation code.
  • Populate the DatabaseInventory Table – The code reads a list of SQL Server instances from file C:\temp\instances.txt.  The databases for each instance are inserted into the DatabaseInventory table.  The AssessmentFlag column is set to 0 for each database.
  • Update the AssessmentFlag – Update the AssessmentFlag column in the DatabaseInventory table to 1 for the databases to be assessed.  The query is set to update all rows in the table.  Add a WHERE clause to the query to update a subset of the inventory.
  • Execute the Data Migration Assistant – Execute the DMA against all databases with the AssessmentFlag set to 1.  The code defaults to write the results to path C:\temp\results.
  • Import the DMA Results – Imports the result file into SQL Server.  The code defaults to create both database and data warehouse.

Because each code cell runs independently, you can create the database once and run the dmaDataCollector and dmaProcessor steps multiple times. The Jupyter Notebook code can be found here.

I hope this series of posts is useful and simplifies the execution of the Data Migration Assistant.