GitHub App
Learn how to install and configure the DataChain Studio GitHub App to connect DataChain Studio to your GitHub repositories.
Overview
The DataChain Studio GitHub App provides secure, fine-grained access to your GitHub repositories, enabling:
- Repository Access: Connect public and private repositories
- Webhook Integration: Automatic job triggering on code changes
- Security: OAuth-based authentication with granular permissions
- Team Collaboration: Shared access across team members
Installation
Install for Personal Account
- Navigate to DataChain Studio GitHub App
- Click "Install" (or "Configure" if the app is already installed)
- Under repository access, choose "All repositories" or "Only select repositories"
- If you chose "Only select repositories", pick the repositories you want to connect
- Review and approve permissions
- Complete installation
Install for Organization
- Go to your organization's settings on GitHub
- Navigate to "Third-party access" → "GitHub Apps"
- Search for "DataChain Studio" or use the installation link
- Configure repository access and permissions
- Complete installation for the organization
Configuration
Repository Selection
Choose which repositories to connect:
- All repositories: Grants access to all current and future repositories
- Only select repositories: Grants access only to the repositories you choose
- Recommended: Start with "Only select repositories" so the app can reach only the repositories that need DataChain integration
Permissions
The DataChain Studio GitHub App requests these permissions:
Repository Permissions
- Contents: Read repository files and commit history
- Metadata: Read repository information and settings
- Pull requests: Read PR information for job triggering
- Commit statuses: Update commit statuses with job results (this requires write access, unlike the read-only permissions above)
Organization Permissions
- Members: Read organization membership (for team features)
- Plan: Read organization plan information
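To check whether an installation was granted everything in the list above, you can compare the permissions GitHub reports for it against the set you expect. The following is a minimal sketch, not DataChain Studio's own logic. It assumes the permission keys GitHub's REST API uses in an installation's `permissions` object (`contents`, `metadata`, `pull_requests`, `statuses`, `members`, `organization_plan`); the required levels and the `missing_permissions` helper are illustrative assumptions.

```python
# Hypothetical required set, derived from the permission list above.
# The exact levels DataChain Studio requests are an assumption here.
REQUIRED = {
    "contents": "read",
    "metadata": "read",
    "pull_requests": "read",
    "statuses": "write",  # updating commit statuses needs write access
    "members": "read",
    "organization_plan": "read",
}

# Access levels ordered from weakest to strongest.
LEVELS = {"none": 0, "read": 1, "write": 2, "admin": 3}


def missing_permissions(granted, required=REQUIRED):
    """Return {name: (required_level, granted_level)} for every shortfall."""
    missing = {}
    for name, needed in required.items():
        have = granted.get(name, "none")
        if LEVELS.get(have, 0) < LEVELS[needed]:
            missing[name] = (needed, have)
    return missing


# Example: a permissions object as it might appear in an installation record.
granted = {"contents": "read", "metadata": "read", "statuses": "read"}
for name, (needed, have) in missing_permissions(granted).items():
    print(f"{name}: needs {needed}, has {have}")
```

An empty result means every expected permission is present at the required level or higher.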
Usage
Creating Datasets
Once installed, you can create datasets from GitHub repositories:
- Go to DataChain Studio
- Click "Create Dataset"
- Select your GitHub organization
- Choose the repository
- Configure dataset settings
- Create the dataset
Webhook Integration
The GitHub App automatically configures webhooks for:
- Push events: Trigger jobs on new commits
- Pull requests: Run validation jobs on PRs
- Releases: Deploy or process data on releases
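The mapping from events to jobs can be pictured as a small routing function. This is only a sketch of the idea, not how DataChain Studio implements it. The event names (`push`, `pull_request`, `release`) are the values GitHub sends in the `X-GitHub-Event` header; the job names, the action filters, and the `select_job` helper are hypothetical.

```python
# Hypothetical job names keyed by GitHub webhook event name.
EVENT_TO_JOB = {
    "push": "run-on-commit",
    "pull_request": "validate-pr",
    "release": "process-release",
}

# Pull request actions that change the code under review (assumed filter).
PR_ACTIONS = {"opened", "synchronize", "reopened"}


def select_job(event, payload):
    """Return the job name for a webhook delivery, or None to ignore it."""
    job = EVENT_TO_JOB.get(event)
    if job is None:
        return None  # event type we do not handle
    if event == "pull_request" and payload.get("action") not in PR_ACTIONS:
        return None  # e.g. "closed" or "labeled" needs no validation run
    if event == "release" and payload.get("action") != "published":
        return None  # ignore drafts and edits
    return job


print(select_job("push", {"ref": "refs/heads/main"}))
print(select_job("pull_request", {"action": "closed"}))
```

Filtering on the payload's `action` field keeps jobs from running on events such as a pull request being labeled or closed.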
Troubleshooting
App Not Visible
If you don't see the GitHub App or repositories:
- Check Installation: Verify the app is installed on the correct account/organization
- Repository Access: Ensure the app has access to the desired repositories
- Permissions: Verify you have admin access to the organization
- Stale session: Log out of DataChain Studio and log back in to refresh the list of installations and repositories
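The first two checks can be expressed as a short diagnostic over data you would fetch from GitHub yourself. The sketch below assumes an installation record with the `account.login`, `repository_selection`, and `suspended_at` fields that GitHub's REST API returns for app installations, plus the list of repositories visible to that installation. The `app_can_see` helper and the `example-org` names are hypothetical.

```python
def app_can_see(installation, repositories, full_name):
    """Report whether an installation covers the repository `full_name`.

    Returns a (bool, reason) pair. `full_name` is "owner/repo".
    """
    owner = full_name.split("/", 1)[0].lower()
    account = installation.get("account", {}).get("login", "").lower()
    if owner != account:
        return False, f"app is installed on '{account}', not '{owner}'"
    if installation.get("suspended_at"):
        return False, "installation is suspended"
    if installation.get("repository_selection") == "all":
        return True, "installation covers all repositories"
    # Otherwise only the explicitly selected repositories are visible.
    names = {repo["full_name"].lower() for repo in repositories}
    if full_name.lower() in names:
        return True, "repository is in the selected list"
    return False, "repository is not in the selected list"


# Example data with placeholder names.
installation = {
    "account": {"login": "example-org"},
    "repository_selection": "selected",
    "suspended_at": None,
}
repositories = [{"full_name": "example-org/datasets"}]

print(app_can_see(installation, repositories, "example-org/datasets"))
print(app_can_see(installation, repositories, "example-org/models"))
```

The reason string indicates which of the checks above failed, which helps decide whether to reinstall the app or adjust its repository selection.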
Permission Issues
If you encounter permission errors:
- Review Permissions: Check that all required permissions are granted
- Reinstall: Try uninstalling and reinstalling the app
- Organization Approval: Some organizations require admin approval for new apps
Webhook Issues
If webhooks aren't triggering jobs:
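- Check Webhook Settings: Verify the webhook configuration. Webhooks delivered through a GitHub App are managed by the app itself and generally do not appear under a repository's own webhook settings, so start by confirming the app installation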
- Event Types: Ensure the correct event types are enabled
- Repository Access: Confirm the app has access to the repository
- Network: Check that GitHub can reach DataChain Studio servers
Security
Best Practices
- Least Privilege: Only grant access to repositories that need DataChain integration
- Regular Reviews: Periodically review and audit app permissions
- Organization Policies: Follow your organization's security policies
- Access Monitoring: Monitor app access logs and usage
Permissions Audit
Regularly audit GitHub App permissions:
- Go to your GitHub settings
- Navigate to "Applications", then open "Installed GitHub Apps" (repository access and permissions) or "Authorized GitHub Apps" (account authorization)
- Review DataChain Studio permissions
- Update or revoke access as needed
Next Steps
- Learn about custom GitLab server integration
- Explore team collaboration features
- Set up automated workflows
- Configure webhooks for notifications