
SQL

Use SQL functions to operate on the underlying database that stores the chain data. They are useful in operations such as DataChain.filter and DataChain.mutate. Import these functions from datachain.sql.functions.
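To make the idea of database-side evaluation concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than the DataChain API. The table and column names are hypothetical, and the mapping of a filter to a WHERE clause and of a mutate to a computed column is an assumption made for illustration.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT, size INTEGER)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [("img/cat.jpg", 120), ("img/dog.png", 80), ("doc/readme.txt", 5)],
)

# Filter-like step: the database evaluates the predicate for each row.
large = conn.execute(
    "SELECT path FROM files WHERE size > 50 ORDER BY path"
).fetchall()

# Mutate-like step: a SQL function computes a new column inside the database.
with_len = conn.execute(
    "SELECT path, length(path) AS path_len FROM files ORDER BY path"
).fetchall()

print(large)
print(with_len)
```

In both queries the work happens inside the database engine, so no row has to be loaded into Python before it is filtered or transformed.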

avg

count module-attribute

count = count

greatest

Bases: ReturnTypeFromArgs

least

Bases: ReturnTypeFromArgs

max module-attribute

max = max

min module-attribute

min = min

rand

sum module-attribute

sum = sum

array

cosine_distance

Bases: GenericFunction

Takes a column and an array and returns the cosine distance between them.
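As a reference for the quantity involved, the following plain-Python sketch computes cosine distance under the conventional definition, one minus the cosine similarity. This is an assumption about the formula, not the library's implementation, and the helper name is chosen here for illustration only.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b) for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumes neither vector is all zeros; otherwise this divides by zero.
    return 1.0 - dot / (norm_a * norm_b)


orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])
parallel = cosine_distance([1.0, 2.0], [2.0, 4.0])
print(orthogonal, parallel)
```

Under this definition, orthogonal vectors are at distance 1 and vectors pointing in the same direction are at distance 0, up to floating-point rounding.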

euclidean_distance

Bases: GenericFunction

Takes a column and an array and returns the Euclidean distance between them.
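For comparison, Euclidean distance can be sketched in plain Python as the square root of the summed squared differences. The helper below is illustrative only and is not taken from the library.

```python
import math


def euclidean_distance(a, b):
    """Return the straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


d = euclidean_distance([0.0, 0.0], [3.0, 4.0])
print(d)
```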

length

Bases: GenericFunction

Returns the length of the array.

path

This module provides generic SQL functions for path logic.

These need to be implemented using dialect-specific compilation rules. See https://docs.sqlalchemy.org/en/14/core/compiler.html

file_ext

Bases: GenericFunction

Returns the extension of the given path.

file_stem

Bases: GenericFunction

Strips an extension from the given path.
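The extension and stem of a path can be illustrated with Python's posixpath module. This is a sketch of the general idea, not the library's behavior: this reference does not state whether file_ext includes the leading dot or whether file_stem keeps the directory part, so both may differ from what is shown.

```python
import posixpath

path = "data/images/cat.jpg"

# splitext splits at the last dot in the final path component.
stem, ext = posixpath.splitext(path)

print(stem)
print(ext)
```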

name

Bases: GenericFunction

Returns the final component of a posix-style path.

parent

Bases: GenericFunction

Returns the directory component of a posix-style path.
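The final component and the directory component of a posix-style path correspond to what posixpath calls basename and dirname. The sketch below uses those standard-library helpers as an analogy; how the SQL functions treat edge cases such as trailing slashes is not covered by this reference.

```python
import posixpath

path = "data/images/cat.jpg"

final_component = posixpath.basename(path)  # part after the last "/"
directory = posixpath.dirname(path)         # everything before the last "/"

print(final_component)
print(directory)
```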

string

length

Bases: GenericFunction

Returns the length of the string.

regexp_replace

Bases: GenericFunction

Replaces substrings that match a regular expression.
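Regular-expression replacement can be illustrated with Python's re.sub. This is an analogy only: the argument order of the SQL function and the regex dialect of the underlying database are not stated here and may differ.

```python
import re

# Replace every run of digits with a single "#".
result = re.sub(r"\d+", "#", "frame_001_of_120.png")

print(result)
```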

split

Bases: GenericFunction

Takes a column and a split character and returns an array of the parts.
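Splitting on a separator can be sketched with Python's str.split, which has the same general shape: a string and a separator in, a list of parts out. How the SQL function handles empty parts or a missing separator is an open question here.

```python
value = "data/images/cat.jpg"

# Split on the "/" separator to get the individual path segments.
parts = value.split("/")

print(parts)
```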