Bring your data into Airfold
When defining a Source's schema, you can configure:

- Engine type (required)
- Primary Key: defines how data is stored on disk (required)
- Partitions: used mainly for data management (optional)
- Data types: Airfold offers many data types that you might not find on other platforms, which can help cut storage costs and improve query efficiency.
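Since the Schema tab exposes ClickHouse settings, these are presumably ClickHouse types. As a sketch only (the column names and the `cols` field are hypothetical, not confirmed Airfold spec), compact type choices might look like:

```yaml
# Hypothetical schema fragment: compact ClickHouse types reduce storage
cols:
  country: LowCardinality(String)  # few distinct values compress well
  status_code: UInt16              # small non-negative integers
  ts: DateTime                     # second precision instead of DateTime64
```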
💡 Tip:
If you plan to ingest data using the Airfold API, you need to create a Source without a connector. Selecting the Text option in this window lets you create an unconnected Source while still defining its schema, ready for data to be ingested into it.
Below is an example of creating a Source using a file upload. This creates an unconnected Source that infers its schema from the uploaded file. You can choose whether to actually ingest the data from the file or to use the file only as a means of inferring a schema.
After selecting File Upload, you will be asked to specify the file.
Next, choose the Name of your Source. The Source name is how you will reference your data in your SQL queries, so make sure to pick a SQL-friendly name!
You can also choose whether to actually ingest the data from the file or only infer its schema: check the box if you want to ingest the data from the uploaded file into your Source.
Finally, review the schema and the table settings. You can update column names, data types, engine type, etc. Refer to the schema page of these docs for best practices.
Click Create to finalize the Source.
Once your Source is created, its page shows Usage metrics so that you can monitor your key Source metrics. Below the graph you will find the following tabs:
Data
Gives you a preview of your data. You can explore the data using:

- Filters: filter out rows based on a condition
- Sort: order rows in ascending or descending order based on a specified column
- Group by: organize rows by a specified column, grouping related rows together based on a shared value in that column

Schema
Provides the table schema and ClickHouse settings.
Data Graph
Shows all the dependencies for the Source. As you build downstream analytics from the Source using Pipes, the dependency chart populates, showing all nodes that reference this Source.
Logs
Provides error logs; any ingestion errors can be seen here.
Sources live in the /Sources directory. For a Source that isn't using a connector, your YAML should look like this:
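As a sketch only (the field names below, such as `cols`, are assumptions rather than confirmed Airfold spec; check the schema page for the exact format), an unconnected Source definition might look like:

```yaml
# Hypothetical Source definition (web_events.yaml).
# Field names are illustrative; consult the schema docs for the exact spec.
name: web_events
description: Raw web events ingested via the Airfold API
cols:
  timestamp: DateTime
  user_id: String
  event_type: LowCardinality(String)
# settings can be a String, a key-value pair, or an array of either/both
settings: |
  ENGINE = MergeTree
  ORDER BY (user_id, timestamp)
  PARTITION BY toYYYYMM(timestamp)
```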
settings
Table options such as ORDER BY, PARTITION BY, Table Engine, etc. The settings can be a String, a key-value pair, or an array comprising either or both. Optional.

To deploy a Source defined in YAML, use af push.
For example, to push web_events.yaml, run:
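A sketch of the command (the path prefix is an assumption based on the /Sources directory mentioned earlier; adjust it to wherever your file actually lives):

```shell
# Push the local YAML definition to your Airfold workspace
af push sources/web_events.yaml
```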