Unlocking Modular and Reusable Data Workflows

Databricks is a powerful platform built for big data analytics and machine learning.

As projects grow in size, keeping your code modular, clean, and reusable becomes important. One best practice is calling one notebook from another, enabling you to create reusable functions, ETL steps, or shared logic.

In this blog post, we will explore six different ways to call or run one notebook from another in Databricks, highlighting syntax, use cases, and caveats. These methods apply across the languages Databricks supports, including Python, SQL, and Scala.

1. %run Magic Command (Most Common Way)

The %run command is the most widely used method for calling another notebook in Databricks.

%run /Users/srini@example.com/Utilities/Helpers

✅ Use case:

  • Reusing shared functions or variables.
  • Loading configuration or helper notebooks.
  • Works with notebooks in the same workspace.

⚠️ Notes:

  • %run executes all the code in the target notebook and loads variables/functions into your current namespace.
  • %run must be in a cell by itself; you cannot combine it with other code in the same cell.
  • Cannot pass parameters directly.
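
For example, if the helper notebook defines a function, it is available in the caller immediately after the %run cell. A minimal sketch, where the helper notebook path and function name are illustrative:

# Cell in /Users/srini@example.com/Utilities/Helpers (hypothetical helper notebook)
def clean_column_names(df):
    """Lower-case column names and replace spaces with underscores."""
    return df.toDF(*[c.lower().replace(" ", "_") for c in df.columns])

# Calling notebook -- %run must be in a cell by itself:
#   %run /Users/srini@example.com/Utilities/Helpers
# In the next cell, the helper is part of the current namespace:
df_clean = clean_column_names(df_raw)  # df_raw is assumed to already exist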

2. dbutils.notebook.run() (Used in Workflows and Jobs)

This is the recommended way to call one notebook from another programmatically, with support for passing parameters and returning a result.

result = dbutils.notebook.run("/Path/To/Notebook", 60, {"input": "value"})

✅ Use case:

  • Executing notebooks as tasks in workflows.
  • Passing and receiving parameters.
  • Works in job contexts.

⚠️ Notes:

  • Executes the notebook as an ephemeral job in a separate execution context (similar to a subprocess); variables are not shared with the parent.
  • The child notebook can return a value (always a string) to the caller with dbutils.notebook.exit("result").
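
A minimal sketch of that handshake, assuming a child notebook at the hypothetical path /Shared/child_notebook that reads its parameter through a widget:

# Child notebook (/Shared/child_notebook) -- hypothetical path
dbutils.widgets.text("input", "")                 # receives the "input" argument
value = dbutils.widgets.get("input")
# ... processing ...
dbutils.notebook.exit("processed:" + value)       # the return value is always a string

# Parent notebook
result = dbutils.notebook.run("/Shared/child_notebook", 60, {"input": "value"})
print(result)  # -> "processed:value"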

3. Chaining Tasks Using Databricks Workflows (No Code Required)

Databricks Workflows (formerly Jobs) lets you chain multiple notebooks visually or via JSON.

Steps:

  • Go to Workflows > Create Job.
  • Add multiple notebooks as tasks.
  • Define dependencies between them.

✅ Use case:

  • Production pipelines.
  • Low-code orchestration.
  • Schedule multiple notebook runs sequentially or in parallel.

⚠️ Notes:

  • Good for automation, but not useful for ad hoc reuse or conditional logic within notebooks.
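
For reference, the job definition behind such a workflow looks roughly like the following Jobs API 2.1 JSON, where depends_on expresses the chaining (cluster settings omitted; job name and notebook paths are placeholders):

{
  "name": "nightly_etl",
  "tasks": [
    {
      "task_key": "ingest",
      "notebook_task": { "notebook_path": "/Pipelines/ingest" }
    },
    {
      "task_key": "transform",
      "depends_on": [ { "task_key": "ingest" } ],
      "notebook_task": { "notebook_path": "/Pipelines/transform" }
    }
  ]
}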

4. import Shared Logic via Databricks Repos (Git Integration)

If your notebooks are version-controlled via Databricks Repos, you can import Python files or notebooks just like standard Python modules.

from my_utils.helpers import some_function

✅ Use case:

  • Modular and testable Python code.
  • Reusing logic across notebooks in Repos.
  • Ideal for teams following software engineering best practices.

⚠️ Notes:

  • Standard imports work with .py files (modules); notebooks cannot be imported this way directly.
  • Requires Git integration and code organization.
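
A minimal sketch, assuming the repo contains a my_utils package alongside the notebook (file and function names are illustrative; for notebooks in Repos the repo root is added to sys.path automatically):

# Repo layout (hypothetical):
#   my_repo/
#     my_utils/
#       __init__.py
#       helpers.py        <- defines some_function
#     analysis_notebook

# my_utils/helpers.py
def some_function(x):
    """Toy example: double the input."""
    return x * 2

# In the notebook:
from my_utils.helpers import some_function
print(some_function(21))  # 42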

5. REST API to Trigger Notebooks (External Triggering)

You can trigger a notebook remotely using Databricks REST API.

Example:

Call this endpoint:

POST /api/2.0/jobs/run-now

With payload:

{
  "job_id": 12345,
  "notebook_params": {
    "param1": "value1"
  }
}

✅ Use case:

  • Triggering notebooks from external systems like Airflow, Jenkins, or Lambda.
  • Remote orchestration.

⚠️ Notes:

  • Requires authentication (a personal access token or Azure AD token).
  • You don’t get access to the notebook’s variables; you can only poll run status and retrieve logs or output.
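
A minimal Python sketch using the requests library; the workspace URL, token, and job_id are placeholders you would replace with your own values:

import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                                   # placeholder

response = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/jobs/run-now",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"job_id": 12345, "notebook_params": {"param1": "value1"}},
)
response.raise_for_status()
print(response.json())  # contains the run_id of the triggered run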

6. Import as Python Modules via %pip install -e (Advanced Use)

You can organize reusable code as a Python package inside a Databricks Repo and use %pip install -e . to install it in editable (development) mode.

Steps:

  1. Place your reusable logic in a folder (e.g., mylib) and add __init__.py to make it a package.
  2. Add a minimal setup.py or pyproject.toml so pip can build it (see the layout sketch below).
  3. Install in editable mode with %pip install -e . from the folder containing setup.py.
  4. Import and use:
from mylib.module import my_function
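
A minimal layout sketch, assuming a setup.py-based package named mylib (all names are illustrative):

# Repo layout (hypothetical):
#   my_project/
#     setup.py
#     mylib/
#       __init__.py
#       module.py          <- defines my_function

# setup.py
from setuptools import setup, find_packages

setup(name="mylib", version="0.1", packages=find_packages())

# In a notebook located in my_project (so "." resolves to the package root):
#   %pip install -e .
#   from mylib.module import my_function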

✅ Use case:

  • Enterprise-grade modularity.
  • Unit testing with pytest.
  • Works across multiple notebooks and projects.

⚠️ Notes:

  • Higher learning curve.
  • Requires Repos and code structure discipline.

Summary Table

Method                    | Parameter Support      | Namespace Shared | Works in Jobs | Notes
%run                      | No                     | Yes              | Yes           | Good for dev
dbutils.notebook.run()    | Yes                    | No               | Yes           | Best for production
Databricks Workflows      | Yes (task parameters)  | No               | Yes           | Visual chaining
Git Repos + import        | Via function arguments | Yes              | Yes           | Python only
REST API                  | Yes (notebook_params)  | No               | Yes           | External control
%pip install -e + import  | Via function arguments | Yes              | Yes           | Advanced packaging

Best Practices

  • Use %run during development or quick experimentation.
  • Use dbutils.notebook.run() for parameterized logic in production pipelines.
  • Use Databricks Workflows for orchestration without writing orchestration code.
  • Leverage Git and Python modules for enterprise-grade reusability.

Final Thoughts

Databricks provides a variety of flexible ways to reuse notebooks, build modular pipelines, and automate workflows. By choosing the right method depending on your use case—development, production, or external orchestration—you can keep your codebase clean, DRY (Don’t Repeat Yourself), and maintainable.

If you’re working in a team or scaling your data platform, investing in reusable patterns like dbutils.notebook.run() or Git-based module imports can save countless hours and reduce bugs.