How to call Databricks notebook without Github mail from Data Factory

You can call a Databricks notebook that lives in your GitHub repository from Data Factory. It is an easy way to keep your whole workflow pipeline inside Data Factory, with all triggers in one place instead of split between Data Factory and Databricks. However, when you call a Databricks notebook from a repo cloned under your own user, the notebook path contains your GitHub username or email address, which is not ideal for production.

How do you keep your GitHub username or email address out of production, so that the pipeline does not fail when that user is later removed? The more ‘correct’ way is simply not to clone the repo under your own username. Instead, use the folder structure in Databricks Repos.

Go to Databricks and select Repos from the left side menu. Select the dropdown -> Create -> Folder. Give the folder a name, for example Production.

Now select the dropdown on the newly created Production folder and select “Add Repo”.

Use the same setup as for a regular clone: add the Git repo URL and select Create.
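
If you would rather script this step than click through the UI, the same clone can probably be created through the Databricks Repos REST API (`POST /api/2.0/repos`). The following is a minimal sketch, not a tested deployment script: the workspace URL, token, repository URL and repo name are all hypothetical placeholders, and the Production folder is assumed to exist already.

```python
import json
import urllib.request


def build_repo_request(host, token, git_url, repo_path, provider="gitHub"):
    """Build (but do not send) a request that clones a Git repo into Databricks Repos."""
    payload = {"url": git_url, "provider": provider, "path": repo_path}
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/repos",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All values below are hypothetical placeholders; replace them with your own.
request = build_repo_request(
    host="https://adb-1234567890123456.7.azuredatabricks.net",
    token="dapi-REDACTED",
    git_url="https://github.com/my-org/my-repo.git",
    repo_path="/Repos/Production/my-repo",  # shared folder, not a personal one
)

# Uncomment to actually send the request to your workspace:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The request is built separately from sending it, so you can inspect the payload before anything touches the workspace.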

As a final touch, you can set up permissions so that the people who need access get it. Select the dropdown on the Production folder and select Permissions.

Add the necessary users with the right permissions.
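
With the repo cloned under the Production folder, the notebook path that Data Factory uses no longer contains a personal username. The sketch below illustrates the difference and shows roughly where the path goes in a Databricks Notebook activity. The email, repo name, notebook name and activity name are hypothetical, and the activity dictionary is only a fragment of a real pipeline definition (it omits the linked service, for example).

```python
def repos_notebook_path(folder: str, repo: str, notebook: str) -> str:
    """Build the workspace path of a notebook inside Databricks Repos."""
    parts = ["Repos", folder.strip("/"), repo.strip("/"), notebook.strip("/")]
    return "/" + "/".join(parts)


# Hypothetical values for illustration only.
USER_EMAIL = "jane.doe@example.com"
REPO_NAME = "my-repo"
NOTEBOOK = "notebooks/etl"

# Path when the repo is cloned under a personal folder: breaks if the user is removed.
personal_path = repos_notebook_path(USER_EMAIL, REPO_NAME, NOTEBOOK)

# Path when the repo is cloned under the shared Production folder.
shared_path = repos_notebook_path("Production", REPO_NAME, NOTEBOOK)

# Fragment of a Data Factory Databricks Notebook activity that points at the shared path.
notebook_activity = {
    "name": "RunEtlNotebook",  # hypothetical activity name
    "type": "DatabricksNotebook",
    "typeProperties": {"notebookPath": shared_path},
}

print(personal_path)
print(shared_path)
```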

Documentation inspiration: https://github.com/alexott/databricks-nutter-repos-demo#setup-on-databricks-side