In Databricks, you can create both managed and external tables in a database. This post walks through the differences between the two.
Difference between managed and external tables in Databricks

Managed Tables
- Also known as internal tables.
- Managed tables are stored in a Databricks-managed storage location, such as the DBFS root or the metastore's default storage in cloud object storage.
- Databricks manages the lifecycle of managed tables, including data and metadata.
- When you drop a managed table, Databricks removes both the metadata and the underlying data from the managed location.
- Managed tables are well-suited for temporary or transient data, or when you want Databricks to manage both the data and metadata for you.
- To create a managed table in Databricks, use the CREATE TABLE command without a LOCATION clause; Databricks then decides where to store the data.
Example:
CREATE TABLE managed_table (
  column1 INT,
  column2 STRING
)
USING parquet;
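Because Databricks owns both the data and the metadata of a managed table, dropping it also deletes the underlying files from managed storage. For example, using the table defined above:

-- Removes the table definition and deletes the data files from managed storage
DROP TABLE managed_table;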
External Tables
- External tables are pointers to data files that live outside Databricks-managed storage, typically in a cloud storage service like AWS S3, Azure Blob Storage, or Google Cloud Storage.
- Databricks manages only the metadata of external tables, not the actual data files.
- When you drop an external table, only the metadata is removed, while the data files remain intact in the external storage location.
- External tables are useful when you want to query or analyze data without moving or copying it into Databricks-managed storage, or when the data is shared among multiple systems.
- To create an external table in Databricks, you also use CREATE TABLE, this time with a LOCATION clause pointing at the path in external storage.
Example:
CREATE TABLE external_table (
  column1 INT,
  column2 STRING
)
USING parquet
LOCATION 's3://bucket/path/to/external_table';
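If the workspace uses Unity Catalog, the S3 path generally needs to be covered by an external location backed by a storage credential before the CREATE TABLE above will succeed (see the references below). A minimal sketch, assuming a storage credential named my_s3_credential already exists; both names here are placeholders:

-- Register the S3 path as an external location (placeholder names)
CREATE EXTERNAL LOCATION IF NOT EXISTS my_external_location
URL 's3://bucket/path/to/'
WITH (STORAGE CREDENTIAL my_s3_credential);

Dropping the external table later removes only its metadata; the Parquet files in S3 are left untouched:

-- Only the table definition is removed; the files in S3 remain
DROP TABLE external_table;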
Conclusion
In short, the key difference lies in where the data resides and who manages it. Managed tables keep their data in Databricks-managed storage, and Databricks manages both the data and the metadata; external tables keep their data in external storage, and Databricks manages only the metadata.
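A quick way to check which kind of table you are looking at is DESCRIBE TABLE EXTENDED: its output includes a Type row (MANAGED or EXTERNAL) and a Location row showing where the data files live. For example, with the external table from earlier:

DESCRIBE TABLE EXTENDED external_table;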
References
- Databricks documentation on creating managed or external tables: https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-create-table-using.html
- Creating external location for AWS S3: https://docs.databricks.com/en/sql/language-manual/sql-ref-external-locations.html
- Creating external location for Azure Data Lake (ADLS Gen2): https://learn.microsoft.com/en-us/azure/databricks/connect/unity-catalog/external-locations
- Creating storage credentials in Databricks for AWS S3 IAM: https://docs.databricks.com/en/connect/unity-catalog/storage-credentials.html
- Creating storage credentials in Databricks for ADLS (Gen2): https://learn.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/azure-managed-identities#use-a-managed-identity-to-access-storage-managed-by-a-unity-catalog-metastore