Databricks

About

Databricks operates a unified Data Intelligence Platform that processes and analyzes data for over 15,000 organizations globally, including more than 60% of the Fortune 500. The platform combines data engineering, ETL pipelines, ML model training, and generative AI infrastructure, and runs on AWS, Azure, and GCP. Founded in 2013 by the engineers who created Apache Spark at UC Berkeley, the company built its architecture around the lakehouse design, which merges data warehouse and data lake capabilities into a single system that handles both structured analytics and unstructured ML workloads.

The security surface spans distributed data processing at enterprise scale, multi-cloud deployments, and data governance across heterogeneous environments. Unity Catalog provides centralized access control and audit logging for lakehouse assets, while the platform must secure data pipelines that move between cloud storage, compute clusters, and external integrations. Databricks maintains three major open-source projects (Delta Lake for transactional storage, MLflow for ML lifecycle management, and Unity Catalog for governance), each introducing distinct attack vectors and compliance requirements that security teams need to monitor across production deployments.

The threat model includes data exfiltration risks during ETL operations, privilege escalation in shared compute environments, and supply chain vulnerabilities in open-source dependencies. Teams work with Python, Scala, and Java codebases, securing Spark clusters that process sensitive data as well as ML models that require both training data protection and inference endpoint hardening. Security engineering at this scale means defending distributed systems where data moves constantly between storage layers, compute resources, and API endpoints across multiple cloud providers.

Similar companies

Bayer

Bayer is a global life science company with core competencies in healthcare and agriculture, committed to "Health for all, Hunger for none" through scientific innovation and breakthrough solutions.

6 jobs

DataRobot, Inc.

Enterprise platform for building, deploying, and governing predictive and generative AI systems integrated into core business processes.

1 job

Fivetran

Fivetran provides automated data connectors and pipelines that continuously move data from source systems to warehouses and data lakes for analytics and AI applications.

1 job

AvePoint

AvePoint is the global leader in data security, governance, and resilience, providing a unified platform to protect and manage critical data across Microsoft 365, Google Workspace, Salesforce, and other cloud environments for over 25,000 customers worldwide.

1 job

Airtable

Airtable is an AI-native app platform that combines the familiarity of spreadsheets with the power of databases, enabling 500,000+ organizations to build custom apps, automations, and AI agents without code.

Datavant

Datavant is the data collaboration platform trusted for healthcare, connecting and securing the world's health data to enable better decisions and outcomes across the healthcare ecosystem.