Bitbucket Configuration
This guide covers how to configure DataChain Studio to integrate with Bitbucket Cloud or Bitbucket Server (Data Center).
Overview
DataChain Studio integrates with Bitbucket using OAuth consumers, providing:
- Secure Authentication: OAuth 1.0a/2.0 user authentication
- Repository Access: Access to Bitbucket repositories
- Webhook Integration: Automatic job triggering on Git events
- Team Synchronization: Bitbucket workspace and team mapping
Prerequisites
- Bitbucket workspace with admin access (for Bitbucket Cloud)
- Bitbucket Server with admin access (for self-hosted)
- DataChain Studio deployment ready for configuration
- Valid domain name for DataChain Studio instance
Bitbucket Cloud Setup
1. Create OAuth Consumer
- Go to Bitbucket Cloud
- Navigate to your workspace settings
- Go to Settings → OAuth consumers
- Click Add consumer
2. Configure OAuth Consumer
Fill in the consumer details:
- Name:
DataChain Studio - Description:
DataChain Studio integration for data processing workflows - Callback URL:
https://studio.yourcompany.com/auth/bitbucket/callback - URL:
https://studio.yourcompany.com - Permissions: Select the following:
- Account: Read
- Team membership: Read
- Repositories: Read, Write (if needed)
- Pull requests: Read, Write (if needed)
- Issues: Read, Write (optional)
- Webhooks: Read, Write
3. Save Consumer Credentials
After creating the consumer, save: - Key (Client ID) - Secret (Client Secret)
Bitbucket Server Setup
1. Create Application Link
- Log in to Bitbucket Server as an administrator
- Go to Administration → Application links
- Click Create link
2. Configure Application Link
- Application URL:
https://studio.yourcompany.com - Application Name:
DataChain Studio - Application Type: Generic Application
- Service Provider Name:
DataChain Studio - Consumer Key: Generate a unique key
- Shared Secret: Generate a secure secret
- Request Token URL:
https://studio.yourcompany.com/auth/bitbucket/request_token - Access Token URL:
https://studio.yourcompany.com/auth/bitbucket/access_token - Authorize URL:
https://studio.yourcompany.com/auth/bitbucket/authorize
DataChain Studio Configuration
Bitbucket Cloud Configuration
Add the following to your values.yaml file:
global:
git:
bitbucket:
enabled: true
type: "cloud" # or "server" for Bitbucket Server
clientId: "your-bitbucket-client-id"
clientSecret: "your-bitbucket-client-secret"
webhookSecret: "your-webhook-secret"
Bitbucket Server Configuration
For Bitbucket Server deployments:
global:
git:
bitbucket:
enabled: true
type: "server"
url: "https://bitbucket.yourcompany.com"
consumerKey: "your-consumer-key"
consumerSecret: "your-consumer-secret"
webhookSecret: "your-webhook-secret"
# SSL configuration for Bitbucket Server
ssl:
verify: true
caCertificate: |
-----BEGIN CERTIFICATE-----
... (your Bitbucket Server CA certificate) ...
-----END CERTIFICATE-----
Advanced Configuration
For more complex setups:
global:
git:
bitbucket:
enabled: true
type: "cloud" # or "server"
url: "https://bitbucket.yourcompany.com" # Only for server
clientId: "your-client-id"
clientSecret: "your-client-secret"
webhookSecret: "your-webhook-secret"
# OAuth configuration
oauth:
version: "2.0" # "1.0a" for older integrations
scopes:
- account
- team
- repository
- pullrequest
# Additional OAuth parameters
redirectUri: "https://studio.yourcompany.com/auth/bitbucket/callback"
# Webhook configuration
webhooks:
events:
- repo:push
- pullrequest:created
- pullrequest:updated
- pullrequest:approved
- pullrequest:merged
# Webhook delivery settings
active: true
# Rate limiting
rateLimit:
requestsPerHour: 1000
burstSize: 50
# Connection settings
timeout:
connect: 30s
read: 60s
write: 30s
# Repository access control
repositories:
# Allow specific repositories
allowList:
- "workspace/important-repo"
- "workspace/data-*"
# Block specific repositories
blockList:
- "workspace/sensitive-repo"
# Workspace filtering
workspaces:
allowList:
- "your-workspace"
- "partner-workspace"
blockList:
- "external-workspace"
Secret Management
For Kubernetes deployments, store sensitive data in secrets:
# Create secret for Bitbucket OAuth credentials
kubectl create secret generic bitbucket-oauth \
--namespace datachain-studio \
--from-literal=client-id=your-client-id \
--from-literal=client-secret=your-client-secret
# Create secret for webhook secret
kubectl create secret generic bitbucket-webhook \
--namespace datachain-studio \
--from-literal=secret=your-webhook-secret
Reference secrets in configuration:
global:
git:
bitbucket:
enabled: true
type: "cloud"
clientIdSecret:
name: bitbucket-oauth
key: client-id
clientSecretSecret:
name: bitbucket-oauth
key: client-secret
webhookSecretSecret:
name: bitbucket-webhook
key: secret
Webhook Configuration
Automatic Webhook Setup
DataChain Studio can automatically configure webhooks:
global:
git:
bitbucket:
webhooks:
autoSetup: true
events:
- repo:push
- pullrequest:created
- pullrequest:updated
- pullrequest:merged
# Additional webhook settings
active: true
skipCertVerification: false # Only for testing
Manual Webhook Setup
If automatic setup doesn't work, configure webhooks manually:
Bitbucket Cloud:
- Go to repository Settings → Webhooks
- Click Add webhook
- Configure:
- Title:
DataChain Studio - URL:
https://studio.yourcompany.com/api/webhooks/bitbucket - Status: Active
- Triggers: Select relevant events:
- Repository push
- Pull request created
- Pull request updated
- Pull request merged
- Skip certificate verification: Unchecked (unless testing)
Bitbucket Server:
- Go to repository Settings → Hooks
- Enable Web Post Hooks
- Configure:
- URL:
https://studio.yourcompany.com/api/webhooks/bitbucket - Secret: Your webhook secret
User Authentication
Configure Bitbucket OAuth for user authentication:
global:
auth:
bitbucket:
enabled: true
type: "cloud" # or "server"
clientId: "your-oauth-client-id"
clientSecret: "your-oauth-client-secret"
# OAuth scopes (for Cloud)
scopes:
- account
- team
- repository
# Team synchronization
teamSync:
enabled: true
workspaceWhitelist:
- "your-workspace"
Permissions and Access Control
Repository-Level Permissions
Configure fine-grained repository access:
global:
git:
bitbucket:
permissions:
# Default repository permissions
default:
repository: read
pullrequest: read
# Custom permissions for specific repositories
repositories:
"workspace/critical-repo":
repository: read
pullrequest: write
issues: read
Team Mapping
Map Bitbucket teams to DataChain Studio roles:
global:
teams:
bitbucket:
mapping:
# Bitbucket team → Studio role
"developers": "member"
"data-engineers": "member"
"administrators": "admin"
"viewers": "viewer"
# Workspace-wide settings
defaultRole: "viewer"
syncInterval: "1h"
Bitbucket Pipelines Integration
Pipeline Triggers
Configure pipeline triggers from DataChain Studio:
global:
git:
bitbucket:
pipelines:
enabled: true
# Pipeline trigger settings
triggers:
# Trigger on data changes
dataChange:
enabled: true
branch: "main"
variables:
DATACHAIN_TRIGGER: "data_change"
# Custom pipeline variables
customVariables:
DATACHAIN_STUDIO_URL: "https://studio.yourcompany.com"
DATACHAIN_WEBHOOK_SECRET: "webhook-secret"
Build Status Updates
Update Bitbucket commit status from DataChain Studio jobs:
global:
git:
bitbucket:
buildStatus:
enabled: true
# Status contexts
contexts:
dataProcessing: "datachain/processing"
dataValidation: "datachain/validation"
dataQuality: "datachain/quality"
# Status details
url: "https://studio.yourcompany.com/jobs/{job_id}"
description: "DataChain data processing job"
Monitoring and Debugging
Health Checks
Monitor Bitbucket integration health:
monitoring:
bitbucket:
enabled: true
healthChecks:
api: true
webhooks: true
oauth: true
metrics:
- apiCalls
- responseTime
- errorRate
- webhookDelivery
alerts:
- name: "Bitbucket API Errors"
condition: "bitbucket_api_error_rate > 5%"
duration: "5m"
severity: "warning"
- name: "Bitbucket Webhook Failures"
condition: "bitbucket_webhook_failure_rate > 10%"
duration: "5m"
severity: "critical"
Debug Configuration
Enable debug logging for Bitbucket integration:
Testing the Integration
Test Bitbucket API Access
Bitbucket Cloud:
# Test API connectivity
curl -H "Authorization: Bearer $BITBUCKET_TOKEN" \
https://api.bitbucket.org/2.0/user
# Test repository access
curl -H "Authorization: Bearer $BITBUCKET_TOKEN" \
https://api.bitbucket.org/2.0/repositories/workspace
Bitbucket Server:
# Test API connectivity
curl -H "Authorization: Bearer $BITBUCKET_TOKEN" \
https://bitbucket.yourcompany.com/rest/api/1.0/projects
# Test user information
curl -H "Authorization: Bearer $BITBUCKET_TOKEN" \
https://bitbucket.yourcompany.com/rest/api/1.0/users/username
Test OAuth Flow
# Test OAuth authorization URL (Cloud)
curl "https://bitbucket.org/site/oauth2/authorize?client_id=YOUR_CLIENT_ID&redirect_uri=https://studio.yourcompany.com/auth/bitbucket/callback&response_type=code"
Test Webhook Delivery
# Test webhook endpoint
curl -X POST https://studio.yourcompany.com/api/webhooks/bitbucket \
-H "Content-Type: application/json" \
-H "X-Event-Key: repo:push" \
-H "X-Hook-UUID: webhook-uuid" \
-d '{
"push": {
"changes": [{
"new": {
"name": "main",
"target": {
"hash": "abcdef123456"
}
}
}]
},
"repository": {
"name": "test-repo",
"full_name": "workspace/test-repo"
}
}'
Troubleshooting
Common Issues
OAuth authentication failures: - Verify client ID and secret are correct - Check callback URL matches exactly - Ensure required permissions are granted - Verify OAuth version (1.0a vs 2.0)
API connectivity issues: - Test Bitbucket API endpoint accessibility - Check SSL certificate validity - Verify network connectivity - Review API rate limits
Webhook delivery failures: - Confirm webhook URL is accessible - Verify webhook secret matches - Check SSL certificate validity - Review webhook event configuration
Debug Commands
# Check Bitbucket configuration
kubectl get configmap datachain-studio-config -n datachain-studio -o yaml | grep -A 20 bitbucket
# View Bitbucket-related logs
kubectl logs -f deployment/datachain-studio-backend -n datachain-studio | grep -i bitbucket
# Test Bitbucket API from container
kubectl exec -it deployment/datachain-studio-backend -n datachain-studio -- \
curl -v https://api.bitbucket.org/2.0/user
# Test OAuth endpoint
kubectl exec -it deployment/datachain-studio-backend -n datachain-studio -- \
curl -v https://bitbucket.org/site/oauth2/access_token
Security Considerations
OAuth Security
- Use confidential OAuth consumers
- Regularly rotate client secrets
- Limit OAuth scopes to minimum required
- Monitor OAuth token usage
Webhook Security
- Always use webhook secrets
- Validate webhook signatures
- Use HTTPS for webhook URLs
- Monitor webhook delivery patterns
Access Control
- Use principle of least privilege
- Regularly audit repository access
- Monitor API usage patterns
- Review team/workspace permissions
Migration from Other Git Forges
When migrating from other Git forges to Bitbucket:
- Export existing configuration
- Set up Bitbucket OAuth consumer
- Configure DataChain Studio for Bitbucket
- Migrate repository connections
- Update webhook configurations
- Test integration thoroughly
- Update user authentication
Next Steps
- Configure GitHub integration for additional Git forges
- Set up GitLab integration if needed
- Review SSL/TLS configuration for secure communications
- Check troubleshooting guide for common issues
- Configure monitoring and alerting for the integration