DataFrame Expressions
The DataFrame.Expr module provides composable, type-safe column expressions that compile directly to Polars operations. Unlike closures (which may fall back to slow row-by-row evaluation), expressions always use Polars' optimized SIMD and parallel execution.
Getting Started
Import the Expr module with an alias for concise usage:
import DataFrame
import DataFrame.Expr as Expr
Column References and Literals
Build expressions from column references and literal values:
-- tags: dataframe, expr, expressions
-- expect: ["name", "revenue", "double_revenue"]
-- DataFrame.Expr for composable column operations
import DataFrame
import DataFrame.Expr as Expr
DataFrame.fromRecords
    [ { name = "Alice", revenue = 100 }
    , { name = "Bob", revenue = 200 }
    ]
    |> DataFrame.selectExpr
        [ Expr.col "name"
        , Expr.col "revenue"
        , Expr.col "revenue"
            |> Expr.mul (Expr.lit 2)
            |> Expr.named "double_revenue"
        ]
    |> DataFrame.columns
Expr.col "name": references a column by name
Expr.lit value: creates a constant expression from an Int, Float, or String
Expr.named "alias" expr: renames the output column
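To illustrate what "composable" means here, the following plain-Python sketch models expressions as small data structures that are evaluated one whole column at a time. It is not Keel code and makes no claim about how DataFrame.Expr is implemented. The names `Col`, `Lit`, `Mul`, `Named`, `evaluate`, `output_name`, and `select_expr` are hypothetical stand-ins for Expr.col, Expr.lit, Expr.mul, Expr.named, and DataFrame.selectExpr.

```python
from dataclasses import dataclass

# Hypothetical model: a table is a dict mapping column names to lists.

@dataclass(frozen=True)
class Col:
    name: str

@dataclass(frozen=True)
class Lit:
    value: object

@dataclass(frozen=True)
class Mul:
    left: object
    right: object

@dataclass(frozen=True)
class Named:
    alias: str
    expr: object


def evaluate(expr, table):
    """Evaluate an expression against all rows at once, returning a column."""
    n_rows = len(next(iter(table.values())))
    if isinstance(expr, Col):
        return list(table[expr.name])
    if isinstance(expr, Lit):
        return [expr.value] * n_rows  # a literal is broadcast to every row
    if isinstance(expr, Mul):
        left = evaluate(expr.left, table)
        right = evaluate(expr.right, table)
        return [a * b for a, b in zip(left, right)]
    if isinstance(expr, Named):
        return evaluate(expr.expr, table)
    raise TypeError(f"unsupported expression: {expr!r}")


def output_name(expr):
    """Alias if one was given, otherwise the leftmost column name."""
    if isinstance(expr, Named):
        return expr.alias
    if isinstance(expr, Col):
        return expr.name
    if isinstance(expr, Mul):
        return output_name(expr.left)
    raise ValueError("literal expressions need an explicit name")


def select_expr(exprs, table):
    """Build a new table with one output column per expression."""
    return {output_name(e): evaluate(e, table) for e in exprs}


table = {"name": ["Alice", "Bob"], "revenue": [100, 200]}
result = select_expr(
    [Col("name"), Col("revenue"), Named("double_revenue", Mul(Col("revenue"), Lit(2)))],
    table,
)
print(list(result))              # ['name', 'revenue', 'double_revenue']
print(result["double_revenue"])  # [200, 400]
```

Because an expression is plain data rather than an opaque function, an engine can inspect it and run it over the whole column, which is what makes the optimizations described on this page possible.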
Arithmetic
Combine expressions with arithmetic operators:
import DataFrame.Expr as Expr
-- Column arithmetic
Expr.col "price" |> Expr.mul (Expr.col "quantity")
-- Mixed column and literal
Expr.col "score" |> Expr.add (Expr.lit 10)
Available: add, sub, mul, div, mod, pow.
Comparison and Boolean Logic
import DataFrame.Expr as Expr
-- Filter-style expressions
Expr.col "age" |> Expr.gte (Expr.lit 18)
-- Combine with boolean logic
let isAdult = Expr.col "age" |> Expr.gte (Expr.lit 18)
let isActive = Expr.col "status" |> Expr.eq (Expr.lit "active")
Expr.and isAdult isActive
Comparison: eq, neq, gt, gte, lt, lte.
Boolean: and, or, not.
Conditional Expressions
Use cond for if-then-else logic:
-- tags: dataframe, expr, conditional
-- expect: ["name", "score", "grade"]
-- Conditional expressions with DataFrame.Expr
import DataFrame
import DataFrame.Expr as Expr
DataFrame.fromRecords
    [ { name = "Alice", score = 95 }
    , { name = "Bob", score = 72 }
    , { name = "Carol", score = 88 }
    ]
    |> DataFrame.selectExpr
        [ Expr.col "name"
        , Expr.col "score"
        , Expr.cond
            (Expr.col "score" |> Expr.gte (Expr.lit 90))
            (Expr.lit "A")
            (Expr.lit "B")
            |> Expr.named "grade"
        ]
    |> DataFrame.columns
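The example above only returns the column names. As a rough illustration of the values the grade column would hold, here is a plain-Python sketch that assumes Expr.cond behaves as an ordinary if-then-else applied to each row. It is not Keel code, and the helper `cond` is a hypothetical model.

```python
def cond(condition, when_true, when_false):
    """Pick element-wise from when_true or when_false based on a boolean column."""
    return [t if c else f for c, t, f in zip(condition, when_true, when_false)]


scores = [95, 72, 88]                      # Alice, Bob, Carol
is_a = [s >= 90 for s in scores]           # models Expr.gte (Expr.lit 90)
grades = cond(is_a, ["A"] * 3, ["B"] * 3)  # literals broadcast to every row
print(grades)  # ['A', 'B', 'B']
```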
Aggregations
Reduce columns to summary values:
-- tags: dataframe, expr, aggregation
-- expect: ["total", "average"]
-- Aggregation with DataFrame.Expr
import DataFrame
import DataFrame.Expr as Expr
DataFrame.fromRecords
    [ { value = 10 }
    , { value = 20 }
    , { value = 30 }
    ]
    |> DataFrame.selectExpr
        [ Expr.col "value" |> Expr.sum |> Expr.named "total"
        , Expr.col "value" |> Expr.mean |> Expr.named "average"
        ]
    |> DataFrame.columns
Available: sum, mean, min, max, count, first, last, std, var, median.
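For reference, the values these aggregations would produce on the sample column 10, 20, 30 can be checked with Python's standard library. This is a sketch of the arithmetic, not Keel code. One assumption to note: Polars computes std and var as sample statistics (dividing by n - 1) by default, and the sketch assumes the Keel wrappers keep that default, which this page does not state.

```python
import statistics

values = [10, 20, 30]

total = sum(values)                       # models Expr.sum: 60
average = sum(values) / len(values)       # models Expr.mean: 20.0
middle = statistics.median(values)        # models Expr.median: 20
sample_std = statistics.stdev(values)     # sample standard deviation: 10.0
sample_var = statistics.variance(values)  # sample variance: 100

print(total, average, middle, sample_std, sample_var)
```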
String Operations
Transform string columns:
-- tags: dataframe, expr, string
-- expect: ["name", "upper_name"]
-- String operations with DataFrame.Expr
import DataFrame
import DataFrame.Expr as Expr
DataFrame.fromRecords
    [ { name = "alice" }
    , { name = "bob" }
    ]
    |> DataFrame.selectExpr
        [ Expr.col "name"
        , Expr.col "name"
            |> Expr.strUpper
            |> Expr.named "upper_name"
        ]
    |> DataFrame.columns
Available: strLength, strUpper, strLower, strTrim, strContains, strStartsWith, strEndsWith, strReplace.
Math Functions
import DataFrame.Expr as Expr
Expr.col "value" |> Expr.abs
Expr.col "value" |> Expr.sqrt
Expr.col "value" |> Expr.round
Available: abs, sqrt, floor, ceil, round.
Null Handling
import DataFrame.Expr as Expr
-- Replace nulls with a default
Expr.col "score" |> Expr.fillNull (Expr.lit 0)
-- Check for nulls
Expr.col "email" |> Expr.isNull
Expr.col "email" |> Expr.isNotNull
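A small plain-Python model of the three operations above, with `None` standing in for a null value. It only illustrates the intended per-row results and is not Keel code. The column contents are made up for the example.

```python
scores = [90, None, 75]
emails = ["a@example.com", None, None]

filled = [0 if s is None else s for s in scores]  # models Expr.fillNull (Expr.lit 0)
is_null = [e is None for e in emails]             # models Expr.isNull
is_not_null = [e is not None for e in emails]     # models Expr.isNotNull

print(filled)       # [90, 0, 75]
print(is_null)      # [False, True, True]
print(is_not_null)  # [True, False, False]
```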
Window Functions
Apply expressions over partitions (SQL-style window functions):
import DataFrame.Expr as Expr
-- Total per region, repeated on every row of that region
Expr.col "revenue" |> Expr.sum |> Expr.over ["region"]
-- Ranking within groups
Expr.col "score" |> Expr.rank |> Expr.over ["department"]
Expr.col "score" |> Expr.denseRank |> Expr.over ["department"]
-- Access previous/next rows
Expr.col "value" |> Expr.lag 1 -- previous row's value
Expr.col "value" |> Expr.lead 1 -- next row's value
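Unlike a group-by aggregation, a window expression keeps one output row per input row. The plain-Python sketch below models that behaviour under several assumptions that this page does not confirm: Expr.over repeats the partition's aggregate on each of its rows (as the Polars `over` operation does), Expr.rank follows SQL RANK tie handling, Expr.denseRank follows SQL DENSE_RANK, ranking is ascending, and lag and lead yield null where no neighbouring row exists. All function names are hypothetical, and the ranking helpers cover a single partition for brevity.

```python
from collections import defaultdict


def sum_over(values, keys):
    """Partition total repeated on every row of that partition."""
    totals = defaultdict(int)
    for v, k in zip(values, keys):
        totals[k] += v
    return [totals[k] for k in keys]


def rank(values):
    """SQL-style RANK: ties share the lowest rank and leave gaps after them."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def dense_rank(values):
    """SQL-style DENSE_RANK: ties share a rank and no gaps follow."""
    distinct = sorted(set(values))
    return [distinct.index(v) + 1 for v in values]


def lag(values, n):
    """Shift values down by n rows; the first n rows have no predecessor."""
    return [None] * n + values[:len(values) - n]


def lead(values, n):
    """Shift values up by n rows; the last n rows have no successor."""
    return values[n:] + [None] * n


revenue = [10, 20, 5, 15]
region = ["east", "east", "west", "west"]
print(sum_over(revenue, region))  # [30, 30, 20, 20]

scores = [70, 85, 85, 90]
print(rank(scores))        # [1, 2, 2, 4]
print(dense_rank(scores))  # [1, 2, 2, 3]

print(lag(revenue, 1))   # [None, 10, 20, 5]
print(lead(revenue, 1))  # [20, 5, 15, None]
```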
When to Use Expressions vs Closures
| Use Expressions When | Use Closures When |
|---|---|
| Column arithmetic and comparisons | Complex logic needing full language features |
| Aggregations and window functions | Pattern matching on values |
| String transformations on columns | Calling other Keel functions |
| Performance is critical | Prototyping or one-off transforms |
Expressions compile to native Polars operations and benefit from SIMD vectorization, parallel execution, and query optimization. Use them for performance-critical data pipelines.
Next Steps
See the DataFrame stdlib page for the complete function reference, including how expressions integrate with selectExpr, filterExpr, and other DataFrame operations.