Optimize Queries
Optimizing queries with materialized pipes
Although endpoints alone can handle high volumes, they may slow down at larger scales.
Optimizing queries is therefore a crucial part of managing production workloads in Airfold. Materialized pipes do this by precomputing results and storing them incrementally.
What are Materialized Pipes?
Materialized pipes precompute and store results as data is ingested. They offer a more efficient way to handle repetitive or resource-intensive queries compared to draft or published pipes.
How they work:
- Historical Data Pre-Aggregation: When creating a materialized pipe, you have the option to pre-aggregate all historical data to create a baseline of results
- Incremental Processing: Materialized pipes compute results as new data flows in, maintaining up-to-date outputs without reprocessing the entire dataset
- Precomputed Results: Instead of executing a query from scratch every time, results are precomputed and stored in a target source
- Fast Queries: Precomputed results are served instantly, avoiding recomputation
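The steps above can be illustrated with a small conceptual sketch in Python. This is not Airfold's implementation. The class `MaterializedCount`, the function `full_scan_count`, and the sample rows are hypothetical names chosen for illustration. The sketch only shows why updating stored results as rows arrive gives the same answer as rescanning everything, while keeping each query down to a lookup.

```python
from collections import Counter


class MaterializedCount:
    """Toy model of a materialized pipe that counts log rows per level."""

    def __init__(self, historical_rows=()):
        # Optional backfill: pre-aggregate historical rows into a baseline.
        self.counts = Counter(row["level"] for row in historical_rows)

    def ingest(self, new_rows):
        # Incremental step: only the newly arrived rows are processed.
        self.counts.update(row["level"] for row in new_rows)

    def query(self):
        # Serving a query reads stored results; no raw rows are rescanned.
        return dict(self.counts)


def full_scan_count(all_rows):
    # What a non-materialized query does: rescan every row on each request.
    return dict(Counter(row["level"] for row in all_rows))


history = [{"level": "info"}, {"level": "error"}, {"level": "info"}]
batch = [{"level": "error"}, {"level": "warn"}]

pipe = MaterializedCount(history)  # baseline from historical data
pipe.ingest(batch)                 # incremental update on new data
result = pipe.query()
print(result)
```

Both approaches return the same counts, but the materialized version does work proportional to the new batch rather than to the full dataset.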
Use Cases
Frequent Queries:
- Ideal for scenarios where the same query is executed repeatedly, such as dashboards or real-time monitoring
Expected Aggregations:
- Useful when you know ahead of time which aggregations are needed, such as sales metrics or user activity trends
Note: Materialized pipes are not suitable for ad hoc queries where the structure of the data or the aggregations is determined dynamically.
Recent Data Analysis:
- Perfect for use cases where the focus is on recently ingested data
- Materialized pipes maintain results incrementally as new data arrives
Caveats
Immutable Results:
- Once data is materialized, you cannot change how the results were computed
- Any change in logic requires creating a new materialized pipe
No Retroactive Updates:
- Materialized pipes work with newly ingested data and cannot retroactively apply changes to older data
- This makes them unsuitable for scenarios where historical recalculations are needed
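A conceptual Python sketch of why a change in logic calls for a new materialized pipe follows. It assumes, for illustration only, that the stored output keeps aggregated values rather than raw rows. The `materialize` helper, the `service` column, and the sample rows are hypothetical and are not part of Airfold.

```python
from collections import Counter

# Hypothetical raw rows as they were ingested.
raw_rows = [
    {"level": "info", "service": "api"},
    {"level": "error", "service": "api"},
    {"level": "error", "service": "worker"},
    {"level": "info", "service": "worker"},
]


def materialize(rows, key):
    # Aggregate rows by the given key, standing in for a pipe's query logic.
    return Counter(key(row) for row in rows)


# Original logic: counts per level. Only these totals are stored.
by_level = materialize(raw_rows, key=lambda r: r["level"])

# New logic: counts per (level, service). The stored totals in `by_level`
# no longer carry the service column, so this result cannot be derived
# from them. It has to be computed by a new materialization over the rows.
by_level_service = materialize(raw_rows, key=lambda r: (r["level"], r["service"]))

print(dict(by_level))
print(dict(by_level_service))
```

The second result needs information that the first aggregation discarded, which is why the sketch builds it as a separate materialization instead of modifying the first.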
UI
Upon creating a pipe, navigate to the three dots (...) on the top right corner and click “materialize”:
Confirm your pipe, then select a compatible target source or “+ Add Source”:
You can also make the pipe refreshable, so that it updates automatically at a set time interval. Check “Refreshable” to enable this and customize the interval:
Click “Materialize” to finish.
Your materialized pipe will now store and incrementally update results.
YAML
Draft to Materialized Pipe
Define a Draft Pipe
Start by creating a draft pipe to develop and test your pipeline.
Push this draft pipe by running:
Materialize the Draft Pipe
To materialize the pipe, run the af pipe materialize command:
Refreshable Materialized Pipe
You may also make the materialized pipe refreshable by adding a refresh attribute to your YAML file:
Query Materialized Data
After materializing the pipe, the results are stored in the target source logs_by_level.
You can query it directly by running: