DataFrame Module

Tabular data manipulation backed by Polars DataFrames.

The DataFrame module provides a pipe-friendly API for loading, transforming, filtering, and aggregating tabular data. DataFrames are opaque native objects that can only be manipulated through DataFrame module functions.

Common patterns

import DataFrame

let data = DataFrame.readCsv "employees.csv"
let summary = data
    |> DataFrame.select ["name", "department", "salary"]
    |> DataFrame.filterGt "salary" 50000
    |> DataFrame.sort "salary"
    |> DataFrame.head 20
IO.println (DataFrame.shape summary)
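
To make the data flow of the pipeline above concrete, here is a minimal plain-Python sketch of the same four steps over a list of dicts. It only illustrates the intended semantics and is not how Keel or Polars implements them. The helper names and sample rows are invented for this example, and ascending order for `sort` is an assumption (suggested by the separate `DataFrame.sortDesc` function).

```python
# Illustrative only: plain-Python analogue of select |> filterGt |> sort |> head.
# The rows below are invented sample data, not the contents of employees.csv.

def select(rows, columns):
    """Keep only the named columns, in the given order."""
    return [{c: row[c] for c in columns} for row in rows]

def filter_gt(rows, column, threshold):
    """Keep rows whose value in `column` is strictly greater than `threshold`."""
    return [row for row in rows if row[column] > threshold]

def sort_by(rows, column):
    """Sort rows by `column`; ascending order is an assumption here."""
    return sorted(rows, key=lambda row: row[column])

def head(rows, n):
    """Return at most the first `n` rows."""
    return rows[:n]

def shape(rows):
    """Return (row count, column count)."""
    return (len(rows), len(rows[0]) if rows else 0)

employees = [
    {"name": "Ana", "department": "Ops", "salary": 48000, "age": 41},
    {"name": "Ben", "department": "Eng", "salary": 72000, "age": 29},
    {"name": "Cy", "department": "Eng", "salary": 65000, "age": 35},
]

# Each step takes the previous step's result, mirroring the |> chain.
selected = select(employees, ["name", "department", "salary"])
well_paid = filter_gt(selected, "salary", 50000)
ordered = sort_by(well_paid, "salary")
summary = head(ordered, 20)

print(shape(summary))  # (2, 3)
```

Each helper returns a new list rather than modifying its input, which is what lets the steps chain in any order, just as the `|>` operator chains the DataFrame functions.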

Display

DataFrames render as formatted, column-aligned tables when printed or displayed in the REPL. The output includes the shape, column names, dtypes, and data rows. DataFrames with more than 10 rows show only the first 5 and last 5 rows, separated by a row of … markers (the sample below is abbreviated to two rows on each side):

shape: (1000, 3)
  name | age |     city
   str | i64 |      str
-------+-----+---------
 Alice |  30 | New York
   Bob |  25 |   London
     … |   … |        …
  Yara |  31 |   Berlin
  Zach |  22 |    Tokyo
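
The truncation rule described above (more than 10 rows: first 5, a separator, last 5) can be sketched in a few lines of Python. This is a hedged illustration of the rule as stated, not Keel's actual rendering code. The function name `preview_rows` and its `edge` and `threshold` parameters are hypothetical.

```python
ELLIPSIS = "…"

def preview_rows(rows, edge=5, threshold=10):
    """Return the rows to display.

    All rows are kept when there are at most `threshold` of them; otherwise
    the result is the first `edge` rows, one marker row, and the last `edge`.
    """
    if len(rows) <= threshold:
        return list(rows)
    # The marker row has one ellipsis per column so it lines up with the data.
    marker = tuple(ELLIPSIS for _ in rows[0])
    return list(rows[:edge]) + [marker] + list(rows[-edge:])

# Invented sample data: 1000 rows of (name, age, city).
rows = [(f"user{i}", 20 + i % 50, "city") for i in range(1000)]
shown = preview_rows(rows)

print(len(shown))  # 11
print(shown[5])    # ('…', '…', '…')
```

A table with exactly 10 rows is returned in full, since the rule only applies above 10 rows.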

Security

| Variable | Effect |
| --- | --- |
| `KEEL_DATAFRAME_DISABLED=1` | Disable DataFrame operations |
| `KEEL_DATAFRAME_SANDBOX=/path` | Restrict file I/O to the given directory |
| `KEEL_DATAFRAME_MAX_ROWS=10000` | Limit the number of rows loaded from files |
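
As a rough illustration of what settings like these typically guard against, the following Python sketch shows one way such checks could be enforced. Everything here is an assumption made for illustration: that the variables are read from the process environment, that `1` is the only value that disables the module, and that the row limit truncates rather than raises an error. The helper names (`load_policy`, `is_inside`, `check_read`, `cap_rows`) are hypothetical and do not describe Keel's implementation.

```python
import os

def load_policy(env=os.environ):
    """Read the three settings from an environment-like mapping."""
    max_rows = env.get("KEEL_DATAFRAME_MAX_ROWS")
    return {
        "disabled": env.get("KEEL_DATAFRAME_DISABLED") == "1",  # assumed: only "1" disables
        "sandbox": env.get("KEEL_DATAFRAME_SANDBOX"),
        "max_rows": int(max_rows) if max_rows else None,
    }

def is_inside(path, sandbox):
    """True when `path` resolves to a location inside `sandbox`.

    Resolving both paths first defeats `..` segments and symlinks, and
    comparing whole path components avoids treating /srv/database as
    being inside /srv/data.
    """
    real_path = os.path.realpath(path)
    real_root = os.path.realpath(sandbox)
    return os.path.commonpath([real_path, real_root]) == real_root

def check_read(path, policy):
    """Raise PermissionError when the policy forbids reading `path`."""
    if policy["disabled"]:
        raise PermissionError("DataFrame operations are disabled")
    if policy["sandbox"] and not is_inside(path, policy["sandbox"]):
        raise PermissionError(f"{path} is outside the sandbox")

def cap_rows(rows, policy):
    """Truncate `rows` to the configured limit, if any (assumed behavior)."""
    limit = policy["max_rows"]
    return rows if limit is None else rows[:limit]

policy = load_policy({
    "KEEL_DATAFRAME_SANDBOX": "/srv/data",
    "KEEL_DATAFRAME_MAX_ROWS": "2",
})
check_read("/srv/data/employees.csv", policy)  # inside the sandbox: no error
print(cap_rows([1, 2, 3], policy))             # [1, 2]
```

The path check resolves both sides before comparing them because a plain string-prefix test would accept paths such as `/srv/data/../etc/passwd`.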

Functions

I/O

`DataFrame.readCsv`

String -> DataFrame

Read a CSV file into a DataFrame.

Example:

import DataFrame

DataFrame.readCsv "data.csv"

I/O

DataFrame.readCsv

DataFrame.readJson

DataFrame.readParquet

DataFrame.writeCsv

DataFrame.writeJson

DataFrame.writeParquet

Column ops

DataFrame.select

DataFrame.drop

DataFrame.rename

DataFrame.withColumn

DataFrame.column

DataFrame.columns

DataFrame.dtypes

Row ops

DataFrame.head

DataFrame.tail

DataFrame.slice

DataFrame.sort

DataFrame.sortDesc

DataFrame.unique

DataFrame.sample

Filters

DataFrame.filterEq

DataFrame.filterNeq

DataFrame.filterGt

DataFrame.filterGte

DataFrame.filterLt

DataFrame.filterLte

DataFrame.filterIn

Aggregation

DataFrame.groupBy

DataFrame.agg

DataFrame.count

DataFrame.describe

Multi-DataFrame

DataFrame.join

DataFrame.concat

DataFrame.pivot

Metadata

DataFrame.setMeta

DataFrame.getMeta

DataFrame.allMeta

DataFrame.setColumnMeta

DataFrame.getColumnMeta

DataFrame.allColumnMeta

DataFrame.describeMeta

Inspection

DataFrame.shape

Conversion

DataFrame.toRecords

DataFrame.fromRecords

Other

DataFrame.aggExprs

DataFrame.collect

DataFrame.columnLineage

DataFrame.describeLabel

DataFrame.describeLabels

DataFrame.describeVariables

DataFrame.filter

DataFrame.filterExpr

DataFrame.fromLists

DataFrame.getAllValueLabels

DataFrame.getDisplayMode

DataFrame.getValueLabels

DataFrame.getVarLabel

DataFrame.getVarLabels

DataFrame.lazy

DataFrame.lazyCollect

DataFrame.lazyFilter

DataFrame.lazySelect

DataFrame.lazyWithColumns

DataFrame.lineage

DataFrame.mutate

`DataFrame.orderBy`

`DataFrame.partitionBy`

`DataFrame.readCsvColumns`

`DataFrame.readDta`

`DataFrame.readDtaColumns`

`DataFrame.readJsonColumns`

`DataFrame.readParquetColumns`

`DataFrame.recode`

`DataFrame.removeValueLabels`

`DataFrame.removeVarLabel`

`DataFrame.searchVariables`

`DataFrame.withColumns`

`DataFrame.withCumMax`

`DataFrame.withCumMean`

`DataFrame.withCumMin`