Databricks Job Creation & Permission Management Using the Databricks API

Sushil Deshpande
3 min read · Apr 16, 2022

In an enterprise organization, the Databricks environment is shared across multiple teams. When someone creates a new job, the user who created it becomes its owner by default. The owner can grant permissions on the job, such as Can Manage, Can Manage Run, or Can View, to Active Directory groups or individual team members; however, the owner cannot transfer ownership. Job ownership can be changed only by a Databricks admin. (For more details, refer to the job permissions section of the Databricks documentation.)

Job Permission Matrix

When the job runs, it executes with the permissions of the user who created it. If that user leaves the organization or no longer has access to Databricks, the job is at risk of failure.

To avoid such situations, we can adopt the routine of creating jobs with an application ID. This application ID can be made a member of the desired AD group so that it has all the required permissions.

The challenge is that we would have to log in with the application ID every time we create a job, and then add the required permissions afterwards. This becomes cumbersome when SSO-enabled logins are in place or when jobs are created frequently.

Additionally, we would have to share the application ID's password with all team members, which can lead to security compliance issues.

By following the steps below, we can solve this issue and, on top of that, automate job creation. This removes almost 95% of human error and manual intervention, and helps enforce the standard compliance required for jobs, such as resource tagging and cluster settings.

Step 1: Log in once with the application ID and password to the Databricks environment and create an access token for the API. You can create a token that never expires, or set its lifetime as per your organization's key-rotation policy.
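As an alternative to the UI, the token can also be created programmatically through the Token API (`POST /api/2.0/token/create`). The sketch below only builds the request body; the function name and comment string are illustrative, not from the original notebook. Omitting `lifetime_seconds` requests a token that never expires.

```python
# Hedged sketch: request body for POST /api/2.0/token/create.
# Function name and default comment are illustrative assumptions.
def build_token_request(lifetime_seconds=None, comment="job-automation token"):
    """Build the JSON body for the Databricks Token API.
    Omitting lifetime_seconds asks for a token that never expires."""
    body = {"comment": comment}
    if lifetime_seconds is not None:
        body["lifetime_seconds"] = lifetime_seconds
    return body

# Never-expiring token (no lifetime_seconds in the body):
never_expires = build_token_request()

# 90-day rotation, matching a typical key-rotation policy:
rotated = build_token_request(lifetime_seconds=90 * 24 * 3600)
```

Send the body with a `Bearer` authorization header obtained from the application ID's session; the response contains the token value, which is shown only once.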

Databricks UI

Step 2: Store this token in the Databricks secrets manager.

Snapshot of creating the scope and secret
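The same scope and secret can be created from the Databricks CLI; the scope and key names below are assumptions, so adjust them to your naming convention.

```shell
# Requires the Databricks CLI configured against the workspace.
# Scope and key names here are illustrative placeholders.
databricks secrets create-scope --scope app-id-tokens
databricks secrets put --scope app-id-tokens --key dbx-api-token
# The notebook can then read the token without exposing it:
#   token = dbutils.secrets.get(scope="app-id-tokens", key="dbx-api-token")
```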

Step 3: Import the Python notebook below into a Databricks folder. To view it on Git, please follow the link.

Step 4: Adjust the parameters such as the Databricks instance, user group, app ID, and job_json to match your requirements and settings. If you have multiple Databricks instances, adjust the env list accordingly.
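A minimal sketch of what these parameters might look like, assuming the Jobs API 2.1 create-job payload shape; every name, path, tag value, and workspace URL below is an illustrative placeholder, not taken from the original notebook.

```python
# Hedged sketch of a job_json payload for POST /api/2.1/jobs/create.
# All names, paths, tags, and node types are illustrative assumptions.
job_json = {
    "name": "nightly-etl",
    # Resource tagging for standard compliance:
    "tags": {"team": "data-eng", "cost_center": "1234"},
    "tasks": [
        {
            "task_key": "main",
            "notebook_task": {"notebook_path": "/Repos/etl/nightly"},
            # Enforced cluster settings:
            "new_cluster": {
                "spark_version": "10.4.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 2,
            },
        }
    ],
}

# Per-environment workspace URLs the notebook can iterate over:
env = {
    "dev": "https://<dev-workspace>.azuredatabricks.net",
    "prod": "https://<prod-workspace>.azuredatabricks.net",
}
```

Keeping the payload in one dictionary makes the compliance settings (tags, cluster shape) reviewable in a single place rather than scattered across manually created jobs.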

Step 5: Attach a cluster to the notebook and run all commands. This will create the widgets.
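The widgets might be created along these lines; the widget names and choices are assumptions, not from the article. Inside Databricks, `dbutils` is provided by the runtime, so the minimal stub here only exists to let the sketch run outside a notebook.

```python
# Illustrative sketch of the notebook's widgets. Widget names and
# dropdown choices are assumptions. The stubs below stand in for the
# runtime-provided dbutils so the sketch runs outside Databricks.
class _WidgetStub:
    def __init__(self):
        self.values = {}
    def text(self, name, default, label=""):
        self.values[name] = default
    def dropdown(self, name, default, choices, label=""):
        assert default in choices
        self.values[name] = default
    def get(self, name):
        return self.values[name]

class _DbutilsStub:
    def __init__(self):
        self.widgets = _WidgetStub()

dbutils = _DbutilsStub()  # delete this line inside a real notebook

dbutils.widgets.text("principal", "", "User email or AD group")
dbutils.widgets.dropdown("instance", "dev", ["dev", "qa", "prod"], "Workspace")
dbutils.widgets.dropdown("create_job", "No", ["No", "Yes"], "Create job?")
```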

Step 6: Add the user group or email to grant the permission to, select the required Databricks instance from the second dropdown, and finally select Yes. The job will be created with Can Manage permission for the provided user or AD group.
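The permission grant in this final step maps onto the Permissions API (`PATCH /api/2.0/permissions/jobs/{job_id}`). The sketch below only builds the request body; the helper name and the email-based heuristic for distinguishing a user from a group are assumptions.

```python
# Hedged sketch: body for PATCH /api/2.0/permissions/jobs/{job_id}.
# Helper name and the "@"-based user/group heuristic are assumptions.
def build_acl(principal: str) -> dict:
    """Grant Can Manage: use 'user_name' for an email address,
    'group_name' for an AD group."""
    key = "user_name" if "@" in principal else "group_name"
    return {
        "access_control_list": [
            {key: principal, "permission_level": "CAN_MANAGE"}
        ]
    }
```

Because the request is sent with the application ID's token from the secret scope, the grant works regardless of which team member triggered the notebook.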

Note: References and some images are taken from publicly available Databricks documentation. Databricks is the Data + AI company; it combines the best of data warehouses and data lakes to offer an open and unified platform for data and AI.


Sushil Deshpande

Solution Architect Cloud Migration | Data & Digital Transformation