So many types to choose from…
AggregateFunction
might sound like a function but it is actually a data type, most likely used in an incremental materilized view. This type doesn’t store a final value like a number or string. Instead, it stores the intermediate state of an aggregation—something that can be merged later.
messages_sent
is an Int64 column. This function doesn’t store the result as an Int64 like you might expect. Instead, it produces a column of type:
sum
, uniqExact
)Int64
, String
).AggregateFunction
, SimpleAggregateFunction
stores the final result of an aggregation, rather than its intermediate state. It’s a simpler alternative to AggregateFunction, and while less powerful, it can be useful when you don’t need to merge partial states across batches.
Use SimpleAggregateFunction when:
LowCardinality()
is a wrapper type that can be applied to certain data types to optimize performance and reduce storage. For example, LowCardinality(String)
behaves like a regular string—but under the hood, it’s optimized for columns with repeated values.
Instead of storing the full string value in every row, ClickHouse builds a dictionary of unique values and stores small integer references in the actual column. This is similar to dictionary encoding and can significantly reduce memory usage and speed up operations like filtering and grouping.
If you have a column with fewer than ~10,000 distinct values and an eligible base type, using LowCardinality can radically improve query performance and reduce storage costs compared to the equivalent non-encoded type.
You can wrap the following types with LowCardinality(…):
status
, role
, or priority
. It offers excellent performance and enforces strict value validation at insert time:
UUID = UUID
, not UUID = String
.
groupArray(message_id)
collects all message IDs per user into an array.arrayJoin()
explodes the array so that each message_id gets its own row—flattening the structure.