The integrated web editor in Databricks is okay, but there is another way to edit your Spark jobs: connecting to your Databricks workspace from Visual Studio Code.
1. Install the Databricks extension for VS Code: https://marketplace.visualstudio.com/items?itemName=paiqo.databricks-vscode
2. Open the Command Palette and run Preferences: Open User Settings
3. Select the Databricks entry under Extensions
4. Fill out the mandatory fields:
   - API root URL (https://westeurope.azuredatabricks.net)
     - The region in the URL is where your Databricks workspace is hosted in Azure/AWS
   - Display Name (MyDataBricks)
     - Can be ‘almost’ anything
   - Local Sync Folder (C:\workspace\databricks)
     - If you’re on Linux or macOS, use a path in the matching format, like /home/myUser/mySyncFolder or myUser/mySyncFolder
   - Personal Access Token Secure
     - You can create your own token in Databricks by going to Settings -> User Settings and clicking Generate new token
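Before pasting the URL and token into the extension, it can help to check that they work together. The following is a minimal sketch, not part of the extension: it assumes the Databricks REST API 2.0 `clusters/list` endpoint with Bearer token authentication, and the function names `build_clusters_request` and `list_cluster_names` are made up for illustration.

```python
import json
import urllib.request


def build_clusters_request(api_root: str, token: str) -> urllib.request.Request:
    """Build a GET request for the clusters/list endpoint (assumed: REST API 2.0)."""
    # Strip a trailing slash so the joined URL never contains "//api".
    url = api_root.rstrip("/") + "/api/2.0/clusters/list"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def list_cluster_names(api_root: str, token: str) -> list:
    """Return the names of the clusters this token can see (hypothetical helper)."""
    request = build_clusters_request(api_root, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # The "clusters" key may be missing when the workspace has no clusters.
    return [cluster.get("cluster_name", "") for cluster in payload.get("clusters", [])]
```

Calling `list_cluster_names("https://westeurope.azuredatabricks.net", "<your token>")` should return a list of cluster names if both values are valid; an invalid token typically surfaces as an `urllib.error.HTTPError`.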
Now you are all set. Click the Databricks icon in VS Code, and it’ll connect to the Databricks cluster.
Note that to execute your Python script you might need to install or update ipykernel. You can do this by running:
"c:/Program Files (x86)/Python37-32/python.exe" -m pip install ipykernel -U --user --force-reinstall
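If you are not sure which interpreter VS Code is using, the same check can be done from Python itself. This is a rough sketch under the assumption that you run it with the interpreter selected in VS Code; the helper names are hypothetical, and the pip flags are the ones from the command above. Inside a virtual environment pip rejects `--user`, so drop that flag there.

```python
import importlib.util
import subprocess
import sys


def ipykernel_installed() -> bool:
    """Report whether ipykernel can be imported by the current interpreter."""
    return importlib.util.find_spec("ipykernel") is not None


def pip_upgrade_command(package: str = "ipykernel") -> list:
    """Build the pip command for the interpreter that is running this script."""
    return [sys.executable, "-m", "pip", "install", package,
            "-U", "--user", "--force-reinstall"]


def ensure_ipykernel() -> None:
    """Install ipykernel only when it is missing (hypothetical helper)."""
    if not ipykernel_installed():
        # check=True raises CalledProcessError if pip exits with an error.
        subprocess.run(pip_upgrade_command(), check=True)
```

Using `sys.executable` avoids hard-coding the Python path, so the package lands in the same installation that runs the script.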