DataChain Studio
DataChain Studio is a web application that enables machine learning and data teams to:
- Run and track jobs
- Track experiments and manage models (via DVC integration)
- Collaborate on data projects
DataChain Studio supports multiple workflows:
- DataChain workflows: For unstructured data processing and transformation
- DVC + Git workflows: For ML experiment tracking and model registry, maintaining Git as the single source of truth
Sign in to DataChain Studio using your GitHub.com, GitLab.com, or Bitbucket.org account, or with your email address. Explore the demo projects and datasets, and let us know if you need any help getting started.
Why DataChain Studio?
- Simplify data processing job tracking, visualization, and collaboration.
- Support both modern DataChain workflows and traditional DVC experiment tracking.
- Keep your code, data and processing connected at all times.
- Apply your existing software engineering stack to data and ML work.
- Build a comprehensive data processing and ML platform for transparency and discovery across all your projects.
- For DVC projects, maintain Git as the single source of truth and use GitOps for deployment and automation.
Getting Started
New to DataChain Studio? Start with these guides:
- User Guide - Learn how to use DataChain Studio features
- API Reference - Integrate with Studio programmatically
- Webhooks - Set up event notifications
- Self-hosting - Deploy your own Studio instance
Key Features
Dataset Management
- Track and version your datasets
- Visualize data processing pipelines
- Share datasets across teams
Job Processing
- Run data processing jobs in the cloud
- Monitor job progress and logs
- Schedule recurring data processing tasks
ML Experiment Tracking (DVC Integration)
- Track and compare ML experiments
- Manage model lifecycle and registry
- Visualize metrics and plots
- Git-based experiment versioning
Team Collaboration
- Share projects with team members
- Control access with role-based permissions
- Integrate with development workflows
API Integration
- RESTful API for programmatic access
- Webhook notifications for automation
- Command-line tools for developers
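As a rough illustration of how webhook notifications could drive automation, the sketch below parses an incoming webhook body and decides what to do next. It uses only the Python standard library. The field names `event_type` and `job_id`, and the event values `job_completed` and `job_failed`, are assumptions made up for this example and are not taken from the Studio webhook schema; consult the Webhooks guide for the actual payload format.

```python
import json


def handle_webhook(body: bytes) -> str:
    """Parse a webhook request body and return a short description of the action to take."""
    try:
        event = json.loads(body)
    except ValueError:
        # Covers malformed JSON as well as bodies that are not valid UTF-8.
        return "ignored: invalid JSON"

    if not isinstance(event, dict):
        return "ignored: unexpected payload shape"

    # Hypothetical field names, chosen only for illustration.
    event_type = event.get("event_type", "unknown")
    job_id = event.get("job_id", "n/a")

    if event_type == "job_completed":
        return f"job {job_id} completed: trigger downstream step"
    if event_type == "job_failed":
        return f"job {job_id} failed: notify the team"
    return f"ignored: unhandled event type {event_type!r}"


if __name__ == "__main__":
    sample = b'{"event_type": "job_completed", "job_id": "42"}'
    print(handle_webhook(sample))
```

In a real deployment, a function like this would sit behind an HTTP endpoint registered as the webhook target, and the returned description would be replaced by the actual automation step.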
Visit studio.datachain.ai to get started, or learn about self-hosting for enterprise deployments.