DataFrame.Expr Module

Composable column expressions for DataFrame operations.

The DataFrame.Expr module provides a functional, pipe-friendly API for building column expressions that compile directly to Polars with SIMD optimization and parallel execution. Use expressions with DataFrame.withColumns, DataFrame.filterExpr, and DataFrame.aggExprs.

Why Expressions?

  1. Performance: Expressions compile directly to Polars, so they always take the fast path with no fallback
  2. Composability: Expressions are values that can be bound, passed, and composed
  3. Window functions: Impossible with closures, natural with expressions
  4. Aggregations: Sum, mean, count as composable operations

Common patterns

import DataFrame
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Add computed columns
df |> DataFrame.withColumns
    [ col "price" |> Expr.mul (col "qty") |> Expr.named "total"
    , col "price" |> Expr.mul (lit 1.1) |> Expr.named "with_tax"
    ]

-- Filter with expressions
df |> DataFrame.filterExpr (col "status" |> Expr.eq (lit "active"))

-- Conditional logic
Expr.cond
    [ (col "age" |> Expr.lt (lit 18), lit "minor")
    , (col "age" |> Expr.lt (lit 65), lit "adult")
    ] (lit "senior")

-- Window functions
col "sales" |> Expr.sum |> Expr.over ["region"] |> Expr.named "region_total"

Aggregation with groupBy

df
    |> DataFrame.groupBy ["department"]
    |> DataFrame.aggExprs
        [ col "salary" |> Expr.mean |> Expr.named "avg_salary"
        , col "id" |> Expr.count |> Expr.named "employee_count"
        ]

Functions

Constructors

DataFrame.Expr.col

String -> Expr

Reference a DataFrame column by name. This is the primary way to start building expressions.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Reference a single column
col "name"

-- Use in arithmetic
col "price" |> Expr.mul (col "quantity")

-- Use in comparisons
col "age" |> Expr.gte (lit 18)

Notes: Column names are case-sensitive and must exactly match the DataFrame column names.

See also: lit, named

DataFrame.Expr.lit

a -> Expr

Create a literal (constant) expression from a value. Supports Int, Float, String, Bool, and Unit (null).

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Integer literal
lit 42

-- Float literal
lit 3.14159

-- String literal
lit "active"

-- Boolean literal
lit True

-- Use in expressions
col "price" |> Expr.mul (lit 1.1)  -- 10% markup
col "status" |> Expr.eq (lit "active")

Notes: Unit values become null. Wrap constants in lit so they can be combined with column expressions.

See also: col

DataFrame.Expr.named

String -> Expr -> Expr

Assign a name (alias) to an expression's output column. Required when using withColumns to define the result column name.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Name the output of a computation
col "price" |> Expr.mul (col "qty") |> Expr.named "total"

-- Multiple named expressions
df |> DataFrame.withColumns
    [ col "a" |> Expr.add (col "b") |> Expr.named "sum_ab"
    , col "a" |> Expr.mul (col "b") |> Expr.named "product_ab"
    ]

Notes: Column name cannot be empty. The alias only affects the output column name, not the expression itself.

See also: col, DataFrame.withColumns

Arithmetic

DataFrame.Expr.add

Expr -> Expr -> Expr

Add two expressions element-wise. Works with numeric columns (Int, Float).

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Add two columns
col "a" |> Expr.add (col "b")

-- Add a constant
col "price" |> Expr.add (lit 10)

-- Chain operations
col "a" |> Expr.add (col "b") |> Expr.add (col "c")

Notes: Follows pipe convention: lhs |> add rhs = lhs + rhs. Type coercion follows Polars rules (Int + Float = Float).

See also: sub, mul, div

DataFrame.Expr.sub

Expr -> Expr -> Expr

Subtract two expressions element-wise (lhs - rhs).

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Column difference
col "revenue" |> Expr.sub (col "cost")

-- Subtract a constant
col "score" |> Expr.sub (lit 5)

Notes: Follows pipe convention: lhs |> sub rhs = lhs - rhs.

See also: add, mul, div

DataFrame.Expr.mul

Expr -> Expr -> Expr

Multiply two expressions element-wise.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Calculate total
col "price" |> Expr.mul (col "quantity")

-- Apply percentage
col "salary" |> Expr.mul (lit 1.05)  -- 5% raise

-- Named result
col "hours" |> Expr.mul (col "rate") |> Expr.named "pay"

Notes: Follows pipe convention: lhs |> mul rhs = lhs * rhs.

See also: add, sub, div, pow

DataFrame.Expr.div

Expr -> Expr -> Expr

Divide two expressions element-wise (lhs / rhs). Dividing two Int expressions returns Float.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Calculate ratio
col "completed" |> Expr.div (col "total")

-- Per-unit value
col "total_cost" |> Expr.div (col "quantity")

-- Normalize (0-1 range)
col "value" |> Expr.div (col "max_value")

Notes: Division by zero returns null (not an error). Int operands produce a Float result.
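
The division rules in the notes can be modeled in plain Python. This is only a sketch of the documented behavior, not the library's implementation: `safe_div` is a hypothetical helper, `None` stands in for null, and the propagation of null operands is an assumption.

```python
from typing import Optional


def safe_div(lhs: Optional[float], rhs: Optional[float]) -> Optional[float]:
    """Model of the documented rule: a zero divisor yields null (None)."""
    if lhs is None or rhs is None:
        return None  # assumption: null operands propagate
    if rhs == 0:
        return None  # documented: division by zero returns null, not an error
    return lhs / rhs  # true division, so two ints give a float


completed = [3, 5, 0]
total = [4, 0, 2]
ratios = [safe_div(c, t) for c, t in zip(completed, total)]
print(ratios)  # [0.75, None, 0.0]
```

If a ratio column feeds later arithmetic, the null produced by a zero divisor can be replaced with fillNull.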

See also: mul, mod

DataFrame.Expr.mod

Expr -> Expr -> Expr

Modulo (remainder) of two expressions (lhs % rhs).

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Check if even
col "n" |> Expr.mod (lit 2) |> Expr.eq (lit 0)

-- Get last digit
col "id" |> Expr.mod (lit 10)

Notes: Result has the same sign as the dividend (lhs).
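
The sign rule matters for negative inputs. As a rough illustration in Python (not this library's code), `math.fmod` follows the same "sign of the dividend" convention described in the note, while Python's own `%` operator follows the divisor.

```python
import math

# math.fmod keeps the sign of the dividend, matching the rule in the notes.
dividend_sign = [math.fmod(-7, 3), math.fmod(7, -3)]
print(dividend_sign)  # [-1.0, 1.0]

# Python's % operator keeps the sign of the divisor instead.
divisor_sign = [-7 % 3, 7 % -3]
print(divisor_sign)  # [2, -2]
```

Under the documented rule, a parity check on negative numbers is safer written against 0 (as in the "Check if even" example) than against 1, because an odd negative dividend would give -1.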

See also: div

DataFrame.Expr.pow

Expr -> Expr -> Expr

Raise base to exponent power (base ^ exp).

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Square a column
col "x" |> Expr.pow (lit 2)

-- Cube root (exponent approximately 1/3)
col "volume" |> Expr.pow (lit 0.333333)

-- Compound interest
col "principal" |> Expr.mul (lit 1.05 |> Expr.pow (col "years"))

Notes: Follows pipe convention: base |> pow exp = base ^ exp.

See also: sqrt, mul

Comparison

DataFrame.Expr.eq

Expr -> Expr -> Expr

Test equality (==). Returns a boolean expression.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Filter by status
col "status" |> Expr.eq (lit "active")

-- Compare columns
col "actual" |> Expr.eq (col "expected")

-- Use with filterExpr
df |> DataFrame.filterExpr (col "country" |> Expr.eq (lit "USA"))

Notes: Null values: null == null returns null, not True. Use isNull for null checks.
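
A small Python model of this null rule may help. It is a sketch, with `None` standing in for null and `eq3` and `is_null` as hypothetical helper names rather than part of the API.

```python
from typing import Optional


def eq3(a: Optional[object], b: Optional[object]) -> Optional[bool]:
    """Equality where None models null: any null operand gives null."""
    if a is None or b is None:
        return None
    return a == b


def is_null(a: Optional[object]) -> bool:
    """A null check always returns a definite True or False."""
    return a is None


print(eq3(None, None))          # None
print(eq3("active", "active"))  # True
print(is_null(None))            # True
```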

See also: neq, gt, lt, isNull

DataFrame.Expr.neq

Expr -> Expr -> Expr

Test inequality (!=). Returns a boolean expression.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Exclude a status
col "status" |> Expr.neq (lit "deleted")

-- Filter non-matching
df |> DataFrame.filterExpr (col "type" |> Expr.neq (lit "test"))

Notes: Null values: null != value returns null, not True.

See also: eq, isNotNull

DataFrame.Expr.gt

Expr -> Expr -> Expr

Greater than comparison (>). Returns a boolean expression.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Age filter
col "age" |> Expr.gt (lit 18)

-- Compare columns
col "revenue" |> Expr.gt (col "cost")

-- Chain with boolean ops
col "score" |> Expr.gt (lit 90) |> Expr.and (col "passed" |> Expr.eq (lit True))

Notes: Follows pipe convention: lhs |> gt rhs = lhs > rhs.

See also: gte, lt, lte

DataFrame.Expr.gte

Expr -> Expr -> Expr

Greater than or equal comparison (>=). Returns a boolean expression.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Minimum threshold
col "quantity" |> Expr.gte (lit 10)

-- Year threshold
col "year" |> Expr.gte (lit 2020)

Notes: Follows pipe convention: lhs |> gte rhs = lhs >= rhs.

See also: gt, lte

DataFrame.Expr.lt

Expr -> Expr -> Expr

Less than comparison (<). Returns a boolean expression.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Below threshold
col "temperature" |> Expr.lt (lit 0)

-- Range check (combine with gt)
let inRange = col "x" |> Expr.gt (lit 0) |> Expr.and (col "x" |> Expr.lt (lit 100))

Notes: Follows pipe convention: lhs |> lt rhs = lhs < rhs.

See also: lte, gt, gte

DataFrame.Expr.lte

Expr -> Expr -> Expr

Less than or equal comparison (<=). Returns a boolean expression.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Maximum threshold
col "price" |> Expr.lte (lit 100)

-- Cohort bucketing with cond
Expr.cond
    [ (col "age" |> Expr.lte (lit 17), lit "minor")
    , (col "age" |> Expr.lte (lit 64), lit "adult")
    ]
    (lit "senior")

Notes: Follows pipe convention: lhs |> lte rhs = lhs <= rhs.

See also: lt, gte

Boolean

DataFrame.Expr.and

Expr -> Expr -> Expr

Logical AND of two boolean expressions. Both must be true for result to be true.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Combine conditions
let isActiveAdult =
    col "age" |> Expr.gte (lit 18)
    |> Expr.and (col "status" |> Expr.eq (lit "active"))

-- Multiple conditions
col "a" |> Expr.gt (lit 0)
    |> Expr.and (col "b" |> Expr.gt (lit 0))
    |> Expr.and (col "c" |> Expr.gt (lit 0))

Notes: Short-circuit evaluation is not guaranteed. Null AND True = Null, Null AND False = False.

See also: or, not

DataFrame.Expr.or

Expr -> Expr -> Expr

Logical OR of two boolean expressions. Either being true makes result true.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Either condition
let isSpecial =
    col "status" |> Expr.eq (lit "vip")
    |> Expr.or (col "status" |> Expr.eq (lit "admin"))

-- Fallback check
col "primary_email" |> Expr.isNotNull
    |> Expr.or (col "backup_email" |> Expr.isNotNull)

Notes: Null OR True = True, Null OR False = Null.

See also: and, not

DataFrame.Expr.not

Expr -> Expr

Logical NOT (negation) of a boolean expression.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Negate a condition
col "is_deleted" |> Expr.not

-- Filter for NOT matching
df |> DataFrame.filterExpr (col "status" |> Expr.eq (lit "spam") |> Expr.not)

Notes: NOT Null = Null.
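
The null rules listed in the notes for and, or, and not are those of three-valued (Kleene) logic. The following Python sketch reproduces them, with `None` standing in for null; `and3`, `or3`, and `not3` are hypothetical names used only for illustration.

```python
from typing import Optional

Tri = Optional[bool]  # None stands in for null


def and3(a: Tri, b: Tri) -> Tri:
    # False dominates: if either side is False, the result is False.
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a: Tri, b: Tri) -> Tri:
    # True dominates: if either side is True, the result is True.
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def not3(a: Tri) -> Tri:
    return None if a is None else (not a)


print(and3(None, True), and3(None, False))  # None False
print(or3(None, True), or3(None, False))    # True None
print(not3(None))                           # None
```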

See also: and, or

Aggregation

DataFrame.Expr.sum

Expr -> Expr

Sum of all values in a column. Use with groupBy and aggExprs for group-wise sums.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Total sum
col "amount" |> Expr.sum |> Expr.named "total_amount"

-- Group-wise sum
df
    |> DataFrame.groupBy ["category"]
    |> DataFrame.aggExprs [col "sales" |> Expr.sum |> Expr.named "total_sales"]

-- Window sum
col "value" |> Expr.sum |> Expr.over ["group_id"] |> Expr.named "group_total"

Notes: Null values are ignored (not treated as 0). Returns null for empty groups.

See also: mean, count, over

DataFrame.Expr.mean

Expr -> Expr

Arithmetic mean (average) of values. Returns Float.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Overall average
col "score" |> Expr.mean |> Expr.named "avg_score"

-- Group average
df
    |> DataFrame.groupBy ["department"]
    |> DataFrame.aggExprs [col "salary" |> Expr.mean |> Expr.named "avg_salary"]

Notes: Null values are excluded from both numerator and count. Empty groups return null.

See also: sum, median, std

DataFrame.Expr.min

Expr -> Expr

Minimum value in a column. Works with numeric, string, and date types.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Find minimum
col "price" |> Expr.min |> Expr.named "lowest_price"

-- Group minimum
df
    |> DataFrame.groupBy ["product"]
    |> DataFrame.aggExprs [col "date" |> Expr.min |> Expr.named "first_sale"]

Notes: Null values are ignored. Returns null for empty groups.

See also: max, first

DataFrame.Expr.max

Expr -> Expr

Maximum value in a column. Works with numeric, string, and date types.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Find maximum
col "temperature" |> Expr.max |> Expr.named "peak_temp"

-- Group maximum
df
    |> DataFrame.groupBy ["user_id"]
    |> DataFrame.aggExprs [col "login_time" |> Expr.max |> Expr.named "last_login"]

Notes: Null values are ignored. Returns null for empty groups.

See also: min, last

DataFrame.Expr.count

Expr -> Expr

Count of non-null values in a column.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Count non-null values
col "email" |> Expr.count |> Expr.named "emails_provided"

-- Group counts
df
    |> DataFrame.groupBy ["status"]
    |> DataFrame.aggExprs [col "id" |> Expr.count |> Expr.named "n"]

Notes: Counts non-null values only. For total rows including nulls, count a non-nullable column like id.
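
The notes for sum, mean, and count all describe the same treatment of nulls. As a plain-Python sketch of that behavior (not the library's code, with `None` standing in for null):

```python
from typing import List, Optional

values: List[Optional[float]] = [10.0, None, 20.0, None, 30.0]

# Drop nulls first; every aggregate below works on what remains.
present = [v for v in values if v is not None]

total = sum(present) if present else None                # nulls skipped, not counted as 0
mean = sum(present) / len(present) if present else None  # divides by the non-null count
non_null_count = len(present)                            # what count reports
row_count = len(values)                                  # rows including nulls

print(total, mean, non_null_count, row_count)  # 60.0 20.0 3 5
```

Treating the nulls as 0 would leave the sum unchanged but pull the mean down to 12.0, which is why the distinction matters.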

See also: sum, first, last

DataFrame.Expr.first

Expr -> Expr

First value in a group. Order depends on the DataFrame's current row order.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Get first value after sorting
df
    |> DataFrame.sort "date"
    |> DataFrame.groupBy ["customer"]
    |> DataFrame.aggExprs [col "order_id" |> Expr.first |> Expr.named "first_order"]

Notes: Returns first non-null value. Sort the DataFrame first if you need a specific ordering.

See also: last, min

DataFrame.Expr.last

Expr -> Expr

Last value in a group. Order depends on the DataFrame's current row order.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Get last value after sorting
df
    |> DataFrame.sort "timestamp"
    |> DataFrame.groupBy ["user"]
    |> DataFrame.aggExprs [col "action" |> Expr.last |> Expr.named "last_action"]

Notes: Returns last non-null value. Sort the DataFrame first if you need a specific ordering.

See also: first, max

DataFrame.Expr.std

Expr -> Expr

Sample standard deviation (with Bessel's correction, ddof=1).

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Calculate spread
col "score" |> Expr.std |> Expr.named "score_stddev"

-- Group variation
df
    |> DataFrame.groupBy ["treatment"]
    |> DataFrame.aggExprs [col "response" |> Expr.std |> Expr.named "response_std"]

Notes: Uses ddof=1 (sample standard deviation). Requires at least 2 values.

See also: var, mean

DataFrame.Expr.var

Expr -> Expr

Sample variance (with Bessel's correction, ddof=1).

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Calculate variance
col "measurement" |> Expr.var |> Expr.named "measurement_var"

Notes: Uses ddof=1 (sample variance). Variance = std^2.
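
To make the ddof=1 convention concrete, here is a short check using Python's standard `statistics` module, whose `variance` and `stdev` functions also use the n - 1 denominator. The data is made up for illustration.

```python
import math
import statistics

scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sample_var = statistics.variance(scores)       # divides by n - 1 (ddof=1)
sample_std = statistics.stdev(scores)          # square root of the sample variance
population_var = statistics.pvariance(scores)  # divides by n (ddof=0), for contrast

# The squared deviations from the mean (5.0) add up to 32.
print(math.isclose(sample_var, 32 / 7))          # True
print(math.isclose(sample_std ** 2, sample_var)) # True
print(population_var)                            # 4.0
```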

See also: std, mean

DataFrame.Expr.median

Expr -> Expr

Median (50th percentile) of values. More robust to outliers than mean.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Robust central tendency
col "income" |> Expr.median |> Expr.named "median_income"

-- Compare mean vs median
df
    |> DataFrame.groupBy ["region"]
    |> DataFrame.aggExprs
        [ col "price" |> Expr.mean |> Expr.named "mean_price"
        , col "price" |> Expr.median |> Expr.named "median_price"
        ]

Notes: For even-length groups, returns average of the two middle values.
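
Python's `statistics.median` follows the same even-length rule, so it can serve as a quick reference for what to expect. The numbers are illustrative only.

```python
import statistics

even_group = [1, 3, 5, 100]
odd_group = [1, 3, 5]

even_median = statistics.median(even_group)  # average of the two middle values, 3 and 5
odd_median = statistics.median(odd_group)    # the single middle value
even_mean = statistics.mean(even_group)      # pulled upward by the outlier 100

print(even_median, odd_median, even_mean)  # 4.0 3 27.25
```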

See also: mean, min, max

String

DataFrame.Expr.strLength

Expr -> Expr

Length of string in characters (not bytes). Returns Int.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- String length
col "name" |> Expr.strLength |> Expr.named "name_len"

-- Filter by length
df |> DataFrame.filterExpr (col "code" |> Expr.strLength |> Expr.eq (lit 5))

Notes: Counts Unicode characters, not bytes. Null strings return null.
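
The difference between characters and bytes shows up as soon as a string contains non-ASCII text. A Python illustration, assuming "characters" here means Unicode code points:

```python
name = "h\u00e9llo"  # "héllo" with a precomposed é (U+00E9)

char_count = len(name)                  # counts code points
byte_count = len(name.encode("utf-8"))  # é takes two bytes in UTF-8

print(char_count, byte_count)  # 5 6
```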

See also: strUpper, strLower, strTrim

DataFrame.Expr.strUpper

Expr -> Expr

Convert string to uppercase.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Normalize to uppercase
col "country_code" |> Expr.strUpper |> Expr.named "country_code_upper"

-- Case-insensitive comparison
col "status" |> Expr.strUpper |> Expr.eq (lit "ACTIVE")

Notes: Uses Unicode case mapping rules.

See also: strLower

DataFrame.Expr.strLower

Expr -> Expr

Convert string to lowercase.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Normalize to lowercase
col "email" |> Expr.strLower |> Expr.named "email_normalized"

Notes: Uses Unicode case mapping rules.

See also: strUpper

DataFrame.Expr.strTrim

Expr -> Expr

Remove leading and trailing whitespace from string.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Clean up user input
col "user_input" |> Expr.strTrim |> Expr.named "cleaned_input"

Notes: Removes spaces, tabs, newlines, and other Unicode whitespace.

See also: strReplace

DataFrame.Expr.strContains

String -> Expr -> Expr

Check if string contains the given pattern. Returns boolean.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Check for substring
col "email" |> Expr.strContains "@gmail.com"

-- Filter emails
df |> DataFrame.filterExpr (col "email" |> Expr.strContains "@company.com")

Notes: Literal string matching (not regex). Case-sensitive.

See also: strStartsWith, strEndsWith

DataFrame.Expr.strStartsWith

String -> Expr -> Expr

Check if string starts with the given prefix. Returns boolean.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Check prefix
col "phone" |> Expr.strStartsWith "+1"

-- Filter by title
df |> DataFrame.filterExpr (col "name" |> Expr.strStartsWith "Dr.")

Notes: Case-sensitive comparison.

See also: strEndsWith, strContains

DataFrame.Expr.strEndsWith

String -> Expr -> Expr

Check if string ends with the given suffix. Returns boolean.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Check file extension
col "filename" |> Expr.strEndsWith ".csv"

-- Filter by domain
df |> DataFrame.filterExpr (col "url" |> Expr.strEndsWith ".org")

Notes: Case-sensitive comparison.

See also: strStartsWith, strContains

DataFrame.Expr.strReplace

String -> String -> Expr -> Expr

Replace first occurrence of a pattern with replacement string.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Replace substring
col "text" |> Expr.strReplace "old" "new"

-- Remove prefix
col "id" |> Expr.strReplace "ID_" ""

Notes: Only replaces the first occurrence. Pattern is literal (not regex).
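
Both points in the note (first occurrence only, literal pattern) have a direct counterpart in Python's `str.replace` with a count of 1, shown here as a rough model of the behavior:

```python
text = "old habits, old friends"

first_only = text.replace("old", "new", 1)  # only the first match changes
all_matches = text.replace("old", "new")    # for contrast: every match changes
literal = "a.b.c".replace(".", "-", 1)      # "." is a plain dot, not a regex wildcard

print(first_only)   # new habits, old friends
print(all_matches)  # new habits, new friends
print(literal)      # a-b.c
```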

See also: strTrim

Math

DataFrame.Expr.abs

Expr -> Expr

Absolute value. Works with Int and Float.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Absolute difference
col "actual" |> Expr.sub (col "predicted") |> Expr.abs |> Expr.named "abs_error"

Notes: Returns same type as input.

See also: sqrt, round

DataFrame.Expr.sqrt

Expr -> Expr

Square root. Returns Float.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Calculate RMSE component
col "squared_error" |> Expr.sqrt

-- Distance calculation
col "x" |> Expr.pow (lit 2)
    |> Expr.add (col "y" |> Expr.pow (lit 2))
    |> Expr.sqrt
    |> Expr.named "distance"

Notes: Negative values return NaN, not an error.

See also: pow, abs

DataFrame.Expr.floor

Expr -> Expr

Round down to nearest integer (toward negative infinity).

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Round down to a whole number
col "price" |> Expr.floor |> Expr.named "price_floor"

-- floor(2.7) = 2, floor(-2.3) = -3

Notes: Returns Float type, not Int. Use for rounding, not type conversion.

See also: ceil, round

DataFrame.Expr.ceil

Expr -> Expr

Round up to nearest integer (toward positive infinity).

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Round up
col "quantity" |> Expr.ceil |> Expr.named "quantity_ceil"

-- ceil(2.1) = 3, ceil(-2.7) = -2

Notes: Returns Float type, not Int.
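
The directions of floor and ceil are easiest to see next to truncation, which rounds toward zero instead. This Python sketch converts the results to float to mirror the documented return type; it is an illustration, not the library's code.

```python
import math

rows = [
    (x, float(math.floor(x)), float(math.ceil(x)), float(math.trunc(x)))
    for x in (2.7, -2.3, 2.1, -2.7)
]

# Columns: input, floor, ceil, truncation toward zero
for row in rows:
    print(row)
# (2.7, 2.0, 3.0, 2.0)
# (-2.3, -3.0, -2.0, -2.0)
# (2.1, 2.0, 3.0, 2.0)
# (-2.7, -3.0, -2.0, -2.0)
```

For negative inputs floor and truncation disagree, so floor should not be read as "drop the decimals".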

See also: floor, round

DataFrame.Expr.round

Int -> Expr -> Expr

Round to specified number of decimal places.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Round to 2 decimal places
col "price" |> Expr.round 2 |> Expr.named "price_rounded"

-- Round to whole number
col "average" |> Expr.round 0

Notes: Uses banker's rounding (round half to even). Negative decimals round to tens, hundreds, etc.
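
Banker's rounding differs from the "round half up" rule many people expect. The sketch below uses Python's `decimal` module with `ROUND_HALF_EVEN` to show the documented tie-breaking, including a negative decimals argument. `round_half_even` is a hypothetical helper, and inputs are passed as strings because binary floats cannot represent values like 2.675 exactly.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_even(value: str, decimals: int) -> Decimal:
    """Round a decimal string to `decimals` places; ties go to the even digit."""
    # Build 10**(-decimals) with the right exponent, e.g. 0.01 or 1E+2.
    quantum = Decimal((0, (1,), -decimals))
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)


print(round_half_even("0.5", 0), round_half_even("1.5", 0), round_half_even("2.5", 0))  # 0 2 2
print(round_half_even("2.665", 2), round_half_even("2.675", 2))  # 2.66 2.68
print(int(round_half_even("1250", -2)), int(round_half_even("1350", -2)))  # 1200 1400
```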

See also: floor, ceil

Null

DataFrame.Expr.fillNull

Expr -> Expr -> Expr

Replace null values with a default value.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Fill with constant
col "score" |> Expr.fillNull (lit 0)

-- Fill with another column
col "nickname" |> Expr.fillNull (col "name")

-- Chain to handle multiple fallbacks
col "preferred_email"
    |> Expr.fillNull (col "work_email")
    |> Expr.fillNull (col "personal_email")

Notes: The default expression is only evaluated for null values.
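
Chained fillNull calls behave like a coalesce: each row takes the first non-null value in the chain. A row-by-row Python model follows, with `None` as null and `fill_null` as a hypothetical helper; the sample rows are invented.

```python
from typing import Optional


def fill_null(value: Optional[str], default: Optional[str]) -> Optional[str]:
    """Keep `value` unless it is null (None); otherwise use `default`."""
    return default if value is None else value


rows = [
    {"preferred_email": "a@x.org", "work_email": "a@work.org", "personal_email": None},
    {"preferred_email": None, "work_email": "b@work.org", "personal_email": "b@home.org"},
    {"preferred_email": None, "work_email": None, "personal_email": "c@home.org"},
    {"preferred_email": None, "work_email": None, "personal_email": None},
]

emails = [
    fill_null(fill_null(r["preferred_email"], r["work_email"]), r["personal_email"])
    for r in rows
]
print(emails)  # ['a@x.org', 'b@work.org', 'c@home.org', None]
```

The last row stays null because every fallback is null too; a final fillNull with a literal would guarantee a value.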

See also: isNull, isNotNull

DataFrame.Expr.isNull

Expr -> Expr

Check if value is null. Returns boolean.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Find missing values
col "email" |> Expr.isNull

-- Filter for nulls
df |> DataFrame.filterExpr (col "deleted_at" |> Expr.isNull)

-- Count nulls
col "score" |> Expr.isNull |> Expr.sum |> Expr.named "missing_count"

Notes: Null represents missing data. Use isNull instead of eq(lit null).

See also: isNotNull, fillNull

DataFrame.Expr.isNotNull

Expr -> Expr

Check if value is not null. Returns boolean.

Example:
import DataFrame.Expr exposing (col)
import DataFrame.Expr as Expr

-- Find present values
col "email" |> Expr.isNotNull

-- Filter for non-nulls
df |> DataFrame.filterExpr (col "verified_at" |> Expr.isNotNull)

Notes: Equivalent to isNull |> not, but more readable.

See also: isNull, fillNull

Conditional

DataFrame.Expr.cond

[(Expr, Expr)] -> Expr -> Expr

Multi-branch conditional expression (like SQL CASE WHEN). Takes a list of (condition, result) pairs and a default value.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Age categories
let ageGroup = Expr.cond
    [ (col "age" |> Expr.lt (lit 18), lit "minor")
    , (col "age" |> Expr.lt (lit 65), lit "adult")
    ]
    (lit "senior")
    |> Expr.named "age_group"

-- Numeric bucketing
let cohort = Expr.cond
    [ (col "year" |> Expr.lte (lit 1949), lit 1)
    , (col "year" |> Expr.lte (lit 1959), lit 2)
    , (col "year" |> Expr.lte (lit 1969), lit 3)
    ]
    (lit 4)
    |> Expr.named "cohort"

-- Use with withColumns
df |> DataFrame.withColumns [ageGroup, cohort]

Notes: Conditions are evaluated in order; first match wins. The default is required and used when no conditions match.
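
Because the first match wins, branch order determines the result. This Python sketch models that rule for a single value; the `cond` function here is a stand-in written for illustration, and null conditions are not modeled.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")
Branch = Tuple[Callable[[int], bool], T]


def cond(branches: List[Branch], default: T, value: int) -> T:
    """Return the result of the first branch whose test passes, else the default."""
    for test, result in branches:
        if test(value):
            return result
    return default


age_group = [(lambda age: age < 18, "minor"), (lambda age: age < 65, "adult")]
labels = [cond(age_group, "senior", age) for age in (10, 18, 64, 65)]
print(labels)  # ['minor', 'adult', 'adult', 'senior']

# Swapping the branches changes the answer: 10 is also < 65, so "adult" wins.
swapped = list(reversed(age_group))
print(cond(swapped, "senior", 10))  # adult
```

Ordering thresholds from most to least restrictive, as the examples above do, keeps the broader branches from shadowing the narrower ones.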

See also: and, or, eq

Window

DataFrame.Expr.over

[String] -> Expr -> Expr

Apply an expression as a window function over partition columns. Enables row-level access to aggregated values.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Group total repeated on each row
col "amount" |> Expr.sum |> Expr.over ["customer_id"] |> Expr.named "customer_total"

-- Percentage of group
col "sales"
    |> Expr.div (col "sales" |> Expr.sum |> Expr.over ["region"])
    |> Expr.mul (lit 100)
    |> Expr.named "pct_of_region"

-- Multiple partitions
col "value" |> Expr.mean |> Expr.over ["year", "category"] |> Expr.named "avg_by_year_cat"

-- Global window (no partitions)
col "score" |> Expr.mean |> Expr.over [] |> Expr.named "global_avg"

Notes: Empty partition list [] means global window (entire DataFrame). Results are broadcast back to each row.
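
The key difference from groupBy is that a window keeps every input row. The following plain-Python sketch mimics the "percentage of group" example: it computes one total per partition, then broadcasts it back to each row. The data and dictionary layout are invented for illustration.

```python
from collections import defaultdict

rows = [
    {"region": "east", "sales": 30.0},
    {"region": "east", "sales": 10.0},
    {"region": "west", "sales": 50.0},
    {"region": "west", "sales": 50.0},
]

# Step 1: aggregate once per partition key.
totals = defaultdict(float)
for row in rows:
    totals[row["region"]] += row["sales"]

# Step 2: broadcast the aggregate back to every row of its partition.
for row in rows:
    row["region_total"] = totals[row["region"]]
    row["pct_of_region"] = row["sales"] / row["region_total"] * 100

print([row["pct_of_region"] for row in rows])  # [75.0, 25.0, 50.0, 50.0]
print(len(rows))  # 4
```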

See also: sum, mean, rowNumber, lag, lead

DataFrame.Expr.rowNumber

Expr

Assign sequential row numbers within partitions (1-based). Use with over to partition.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Row number within groups
Expr.rowNumber |> Expr.over ["customer_id"] |> Expr.named "order_seq"

-- Global row number
Expr.rowNumber |> Expr.over [] |> Expr.named "row_num"

-- Get first row per group (filter where rowNumber == 1)
df
    |> DataFrame.withColumns [Expr.rowNumber |> Expr.over ["group"] |> Expr.named "rn"]
    |> DataFrame.filterExpr (col "rn" |> Expr.eq (lit 1))

Notes: Starts at 1. Order depends on current DataFrame sort order.

See also: rank, denseRank, over

DataFrame.Expr.rank

Expr

Rank values with gaps for ties. Ties get the same rank; next rank skips accordingly.

Example:
import DataFrame.Expr as Expr

-- Rank with gaps: [1, 2, 2, 4] for values [10, 20, 20, 30]
Expr.rank |> Expr.over ["department"] |> Expr.named "sales_rank"

Notes: Use with over for partitioned ranking. Sort the DataFrame first to control rank ordering.

See also: denseRank, rowNumber

DataFrame.Expr.denseRank

Expr

Rank values without gaps for ties. Consecutive ranks even when ties exist.

Example:
import DataFrame.Expr as Expr

-- Dense rank: [1, 2, 2, 3] for values [10, 20, 20, 30]
Expr.denseRank |> Expr.over ["category"] |> Expr.named "price_rank"

Notes: Unlike rank, denseRank doesn't skip numbers after ties.
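
The two ranking schemes can be compared side by side in a few lines of Python. This is a sketch of the numbering rules only, assuming ascending order over a single list of values; `rank_with_gaps` and `dense_rank` are hypothetical names.

```python
from typing import List


def rank_with_gaps(values: List[int]) -> List[int]:
    """Rank = 1 + number of strictly smaller values; ties share a rank and leave a gap."""
    return [1 + sum(other < v for other in values) for v in values]


def dense_rank(values: List[int]) -> List[int]:
    """Rank = position among the distinct sorted values; no ranks are skipped."""
    order = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    return [order[v] for v in values]


values = [10, 20, 20, 30]
print(rank_with_gaps(values))  # [1, 2, 2, 4]
print(dense_rank(values))      # [1, 2, 2, 3]
```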

See also: rank, rowNumber

DataFrame.Expr.lag

Int -> Expr -> Expr

Get value from n rows before the current row. Useful for comparing to previous values.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Previous row's value
col "price" |> Expr.lag 1 |> Expr.over ["stock"] |> Expr.named "prev_price"

-- Calculate change from previous
col "value" |> Expr.sub (col "value" |> Expr.lag 1 |> Expr.over [])
    |> Expr.named "change"

-- Look back 7 periods
col "sales" |> Expr.lag 7 |> Expr.over [] |> Expr.named "sales_last_week"

Notes: Returns null for rows without enough history (first n rows). Sort first for meaningful order.

See also: lead, over

DataFrame.Expr.lead

Int -> Expr -> Expr

Get value from n rows after the current row. Useful for comparing to future values.

Example:
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

-- Next row's value
col "price" |> Expr.lead 1 |> Expr.over ["stock"] |> Expr.named "next_price"

-- Days until next event
col "event_date" |> Expr.lead 1 |> Expr.over ["user"]
    |> Expr.sub (col "event_date")
    |> Expr.named "days_to_next"

Notes: Returns null for rows without enough future values (last n rows). Sort first for meaningful order.
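
lag and lead are mirror-image shifts. The Python sketch below models both on one already-sorted partition, with `None` as null and n assumed to be non-negative; the function names reuse lag and lead only for readability and are not the library's implementation.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def lag(values: Sequence[T], n: int) -> List[Optional[T]]:
    """Shift values down by n rows; the first n rows have no history and become None."""
    pad = min(n, len(values))
    return [None] * pad + list(values[: len(values) - pad])


def lead(values: Sequence[T], n: int) -> List[Optional[T]]:
    """Shift values up by n rows; the last n rows have no future value and become None."""
    pad = min(n, len(values))
    return list(values[pad:]) + [None] * pad


prices = [100, 102, 101, 105]
prev_price = lag(prices, 1)
next_price = lead(prices, 1)
change = [None if p is None else cur - p for cur, p in zip(prices, prev_price)]

print(prev_price)  # [None, 100, 102, 101]
print(next_price)  # [102, 101, 105, None]
print(change)      # [None, 2, -1, 4]
```

With several partitions, the same shift would apply inside each partition separately, so the nulls appear at the start (lag) or end (lead) of every group.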

See also: lag, over