Studio REST API v1.0.0
Scroll down for code samples, example requests and responses.
DataChain Studio provides a REST API for programmatically managing datasets, jobs, and storage operations. All API endpoints require authentication and are scoped to specific teams.
Authorization:
All API endpoints require authentication via a Studio token. The token must be included in the `Authorization` header.
To get a token, run `datachain auth token` after logging in with `datachain auth login`, or copy one from the Tokens page in the Studio UI settings.
Once you have a token, attach it to the `Authorization` header in the following format:
- Base URL: https://studio.datachain.ai/api
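As a rough sketch of how the base URL and the token fit together, the helper below assembles a request URL and headers without sending anything. The name `build_request` is hypothetical. The `Authorization` value is passed through verbatim because its exact format is not reproduced in this section, and stripping the leading `/api` from endpoint paths is an assumption about how they join the base URL.

```python
from urllib.parse import urlencode

BASE_URL = "https://studio.datachain.ai/api"  # base URL stated above


def build_request(endpoint, authorization, params=None):
    """Return (url, headers) for a Studio API call; nothing is sent."""
    # Endpoint paths in this reference start with /api, which the base URL
    # already ends with, so drop the prefix to avoid doubling it (assumption).
    path = endpoint[len("/api"):] if endpoint.startswith("/api/") else endpoint
    url = BASE_URL + path
    if params:
        url += "?" + urlencode(params)
    headers = {
        "Accept": "application/json",
        # Passed through unchanged; use the header format Studio expects.
        "Authorization": authorization,
    }
    return url, headers
```

The returned pair can be handed to any HTTP client, for example `http.client` as in the code samples in this reference.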
Default
Get Jobs
Code samples
import http.client
conn = http.client.HTTPSConnection("example.com")
headers = {
'Accept': "application/json",
'Authorization': "API_KEY"
}
conn.request("GET", "/api/datachain/jobs/?team_name=string", headers=headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))
curl --request GET \
--url 'https://example.com/api/datachain/jobs/?team_name=string' \
--header 'Accept: application/json' \
--header 'Authorization: API_KEY'
GET /api/datachain/jobs/
Retrieve a list of jobs with optional status filtering.
Requires a token with read access to JOB scope.
Example responses
200 Response
[
{
"id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
"url": "https://studio.datachain.ai/team/team_name/datasets/jobs/0502eef6-a32e-45fa-8e3b-d20ecpabbcf0",
"status": "CREATED",
"created_at": "2021-01-01T00:00:00Z",
"created_by": "username",
"finished_at": "2021-01-01T00:00:00Z",
"query": "print('Hello, World!')",
"query_type": "PYTHON",
"team": "TeamName",
"name": "QueryName",
"workers": 1,
"python_version": "3.12",
"requirements": "numpy==1.24.0",
"repository": "https://github.com/user/repo",
"environment": {
"ENV_NAME": "ENV_VALUE"
},
"exit_code": 0,
"error_message": "Error message"
}
]
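The response is a JSON array of job objects, so it can also be grouped on the client side. The following is a minimal sketch, assuming the body has the shape shown in the example response; `group_jobs_by_status` is a hypothetical helper, not part of the API.

```python
import json
from collections import defaultdict


def group_jobs_by_status(response_body):
    """Map each status string to the list of job ids that report it."""
    jobs = json.loads(response_body)
    grouped = defaultdict(list)
    for job in jobs:
        grouped[job["status"]].append(job["id"])
    return dict(grouped)
```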
Responses
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | OK | Inline |
Response Schema
Status Code 200
Create Job
Code samples
import http.client
conn = http.client.HTTPSConnection("example.com")
payload = "{\"query\":\"print('Hello, World!')\",\"query_type\":\"PYTHON\",\"team_name\":\"TeamName\",\"environment\":\"ENV_NAME=ENV_VALUE\",\"workers\":1,\"query_name\":\"QueryName\",\"files\":[\"2\",\"3\"],\"python_version\":\"3.12\",\"requirements\":\"numpy==1.24.0\",\"repository\":\"https://github.com/user/repo\",\"priority\":1,\"compute_cluster_name\":\"ComputeClusterName\",\"compute_cluster_id\":1,\"start_after\":\"2021-01-01T00:00:00Z\",\"cron_expression\":\"0 0 * * *\",\"credentials_name\":\"CredentialsName\"}"
headers = {
'Content-Type': "application/json",
'Accept': "application/json",
'Authorization': "API_KEY"
}
conn.request("POST", "/api/datachain/jobs/", payload, headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))
curl --request POST \
--url https://example.com/api/datachain/jobs/ \
--header 'Accept: application/json' \
--header 'Authorization: API_KEY' \
--header 'Content-Type: application/json' \
--data '{"query":"print('\''Hello, World!'\'')","query_type":"PYTHON","team_name":"TeamName","environment":"ENV_NAME=ENV_VALUE","workers":1,"query_name":"QueryName","files":["2","3"],"python_version":"3.12","requirements":"numpy==1.24.0","repository":"https://github.com/user/repo","priority":1,"compute_cluster_name":"ComputeClusterName","compute_cluster_id":1,"start_after":"2021-01-01T00:00:00Z","cron_expression":"0 0 * * *","credentials_name":"CredentialsName"}'
POST /api/datachain/jobs/
Creates a job and returns the job metadata.
Note that compute_cluster_name and compute_cluster_id are mutually exclusive: specify at most one of them in a request.
Requires a token with write access to JOB scope.
Body parameter
{
"query": "print('Hello, World!')",
"query_type": "PYTHON",
"team_name": "TeamName",
"environment": "ENV_NAME=ENV_VALUE",
"workers": 1,
"query_name": "QueryName",
"files": [
"2",
"3"
],
"python_version": "3.12",
"requirements": "numpy==1.24.0",
"repository": "https://github.com/user/repo",
"priority": 1,
"compute_cluster_name": "ComputeClusterName",
"compute_cluster_id": 1,
"start_after": "2021-01-01T00:00:00Z",
"cron_expression": "0 0 * * *",
"credentials_name": "CredentialsName"
}
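Because compute_cluster_name and compute_cluster_id are mutually exclusive, it can help to validate the request body before sending it. The sketch below is one way to do that; `build_job_payload` is a hypothetical helper. Treating `query` and `team_name` as the required fields and defaulting `query_type` to `"PYTHON"` are assumptions, since this section does not say which fields are mandatory.

```python
import json


def build_job_payload(query, team_name, *, query_type="PYTHON",
                      compute_cluster_name=None, compute_cluster_id=None,
                      **optional):
    """Return a JSON request body for POST /api/datachain/jobs/."""
    # The reference states these two fields are mutually exclusive.
    if compute_cluster_name is not None and compute_cluster_id is not None:
        raise ValueError(
            "compute_cluster_name and compute_cluster_id are mutually exclusive"
        )
    payload = {"query": query, "query_type": query_type, "team_name": team_name}
    if compute_cluster_name is not None:
        payload["compute_cluster_name"] = compute_cluster_name
    if compute_cluster_id is not None:
        payload["compute_cluster_id"] = compute_cluster_id
    # Copy any other JobInput fields (workers, files, ...) that were set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return json.dumps(payload)
```

For example, `build_job_payload("print('Hello, World!')", "TeamName", compute_cluster_id=1, workers=1)` yields a body that names the cluster by id only.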
Example responses
200 Response
{
"id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
"url": "https://studio.datachain.ai/team/team_name/datasets/jobs/0502eef6-a32e-45fa-8e3b-d20ecpabbcf0",
"status": "CREATED",
"created_at": "2021-01-01T00:00:00Z",
"created_by": "username",
"finished_at": "2021-01-01T00:00:00Z",
"query": "print('Hello, World!')",
"query_type": "PYTHON",
"team": "TeamName",
"name": "QueryName",
"workers": 1,
"python_version": "3.12",
"requirements": "numpy==1.24.0",
"repository": "https://github.com/user/repo",
"environment": {
"ENV_NAME": "ENV_VALUE"
},
"exit_code": 0,
"error_message": "Error message"
}
Responses
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | OK | JobOutput |
Cancel Job
Code samples
import http.client
conn = http.client.HTTPSConnection("example.com")
payload = "{\"team_name\":\"TeamName\"}"
headers = {
'Content-Type': "application/json",
'Accept': "application/json",
'Authorization': "API_KEY"
}
conn.request("POST", "/api/datachain/jobs/497f6eca-6276-4993-bfeb-53cbbbba6f08/cancel", payload, headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))
curl --request POST \
--url https://example.com/api/datachain/jobs/497f6eca-6276-4993-bfeb-53cbbbba6f08/cancel \
--header 'Accept: application/json' \
--header 'Authorization: API_KEY' \
--header 'Content-Type: application/json' \
--data '{"team_name":"TeamName"}'
POST /api/datachain/jobs/{job_id}/cancel
Cancel a running or queued job.
Requires a token with write access to JOB scope.
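The `{job_id}` segment of the path is replaced with the id of the job to cancel. A small sketch of assembling that request follows; `build_cancel_request` is a hypothetical helper, and the body mirrors the `team_name` field used in the code samples above.

```python
import json
from urllib.parse import quote


def build_cancel_request(job_id, team_name):
    """Return (method, path, body) for cancelling a job; nothing is sent."""
    # Percent-encode the id so unexpected characters cannot alter the path.
    path = f"/api/datachain/jobs/{quote(job_id, safe='')}/cancel"
    body = json.dumps({"team_name": team_name})
    return "POST", path, body
```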
Body parameter
Example responses
200 Response
Responses
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | OK | ActionFeedback |
Upload File
Code samples
import http.client
conn = http.client.HTTPSConnection("example.com")
payload = "-----011000010111000001101001\r\nContent-Disposition: form-data; name=\"file\"\r\n\r\nstring\r\n-----011000010111000001101001--\r\n"
headers = {
'Content-Type': "multipart/form-data; boundary=---011000010111000001101001",
'Accept': "application/json",
'Authorization': "API_KEY"
}
conn.request("POST", "/api/datachain/jobs/files?team_name=string", payload, headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))
curl --request POST \
--url 'https://example.com/api/datachain/jobs/files?team_name=string' \
--header 'Accept: application/json' \
--header 'Authorization: API_KEY' \
--header 'Content-Type: multipart/form-data; boundary=---011000010111000001101001' \
--form file=string
POST /api/datachain/jobs/files
Upload a file to use with a job.
Use the file ID returned by this endpoint in the files field of the job input.
Requires a token with write access to JOB scope.
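The code sample above hard-codes the multipart body. As an illustration of building it from real file content, the sketch below encodes a single `file` form field using only the standard library. `encode_multipart_file` is a hypothetical helper, and sending a `filename` and an `application/octet-stream` part type is an assumption about what the endpoint accepts.

```python
import uuid


def encode_multipart_file(filename, content, boundary=None):
    """Return (body, content_type) for a one-file multipart/form-data upload."""
    boundary = boundary or uuid.uuid4().hex
    marker = boundary.encode("ascii")
    disposition = (
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
    )
    body = b"".join([
        b"--" + marker + b"\r\n",                    # opening delimiter
        disposition.encode("utf-8"),
        b"Content-Type: application/octet-stream\r\n\r\n",
        content,                                     # raw file bytes
        b"\r\n--" + marker + b"--\r\n",              # closing delimiter
    ])
    return body, f"multipart/form-data; boundary={boundary}"
```

The returned `content_type` goes into the `Content-Type` header and `body` is passed as the request payload.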
Body parameter
Example responses
200 Response
Responses
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | OK | UploadFileOutput |
Get Clusters
Code samples
import http.client
conn = http.client.HTTPSConnection("example.com")
headers = {
'Accept': "application/json",
'Authorization': "API_KEY"
}
conn.request("GET", "/api/datachain/clusters/", headers=headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))
curl --request GET \
--url https://example.com/api/datachain/clusters/ \
--header 'Accept: application/json' \
--header 'Authorization: API_KEY'
GET /api/datachain/clusters/
Example responses
200 Response
[
{
"id": 1,
"name": "ComputeClusterName",
"status": "ACTIVE",
"cloud_provider": "AWS",
"cloud_credentials": "CredentialsName",
"is_active": true,
"default": true,
"max_workers": 1
}
]
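A cluster from this list can be passed to Create Job through compute_cluster_id or compute_cluster_name. The sketch below picks one, assuming the response has the shape shown in the example; `pick_cluster` is a hypothetical helper, and preferring the cluster flagged `default` among active ones is just one reasonable policy.

```python
def pick_cluster(clusters):
    """Return the default active cluster, else any active one, else None."""
    active = [c for c in clusters if c.get("is_active")]
    for cluster in active:
        if cluster.get("default"):
            return cluster
    return active[0] if active else None
```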
Responses
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | OK | Inline |
Response Schema
Status Code 200
Schemas
JobOutput
{
"id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
"url": "https://studio.datachain.ai/team/team_name/datasets/jobs/0502eef6-a32e-45fa-8e3b-d20ecpabbcf0",
"status": "CREATED",
"created_at": "2021-01-01T00:00:00Z",
"created_by": "username",
"finished_at": "2021-01-01T00:00:00Z",
"query": "print('Hello, World!')",
"query_type": "PYTHON",
"team": "TeamName",
"name": "QueryName",
"workers": 1,
"python_version": "3.12",
"requirements": "numpy==1.24.0",
"repository": "https://github.com/user/repo",
"environment": {
"ENV_NAME": "ENV_VALUE"
},
"exit_code": 0,
"error_message": "Error message"
}
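To work with a JobOutput object in typed form, a small wrapper can parse the timestamps. This is a sketch under stated assumptions: `JobSummary` is a hypothetical class covering only a few of the fields above, and it assumes `finished_at` and `exit_code` may be absent or null while a job has not finished.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


def _parse_ts(value):
    """Parse an ISO 8601 timestamp such as 2021-01-01T00:00:00Z."""
    # Replace the trailing Z so fromisoformat works on older Python versions.
    return datetime.fromisoformat(value.replace("Z", "+00:00")) if value else None


@dataclass
class JobSummary:
    id: str
    status: str
    created_at: datetime
    finished_at: Optional[datetime]
    exit_code: Optional[int]

    @classmethod
    def from_dict(cls, data):
        return cls(
            id=data["id"],
            status=data["status"],
            created_at=_parse_ts(data["created_at"]),
            finished_at=_parse_ts(data.get("finished_at")),
            exit_code=data.get("exit_code"),
        )

    @property
    def duration(self) -> Optional[timedelta]:
        """Elapsed time from creation to finish, or None if not finished."""
        if self.finished_at is None:
            return None
        return self.finished_at - self.created_at
```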
JobInput
{
"query": "print('Hello, World!')",
"query_type": "PYTHON",
"team_name": "TeamName",
"environment": "ENV_NAME=ENV_VALUE",
"workers": 1,
"query_name": "QueryName",
"files": [
"2",
"3"
],
"python_version": "3.12",
"requirements": "numpy==1.24.0",
"repository": "https://github.com/user/repo",
"priority": 1,
"compute_cluster_name": "ComputeClusterName",
"compute_cluster_id": 1,
"start_after": "2021-01-01T00:00:00Z",
"cron_expression": "0 0 * * *",
"credentials_name": "CredentialsName"
}