Process Batch Data

Load data into Rill's data store via storage locations

Getting Started

Many customers start by loading data into Rill from cloud storage locations such as Amazon S3 or Google Cloud Storage (GCS).

Data is loaded into Rill at a regular interval (usually hourly), either once client-side processing is complete or, in some cases, as raw data.

If your data is already in aggregate or final format, you can load directly into Rill:

Test your data manually (more details here), which will create your Druid spec for ingestion.
Using that ingestion spec, add an orchestration step after processing to load data into Druid. Any scheduling tool will work; we typically use Airflow. See this example Airflow DAG for more details.
If you plan to use Rill Explore, contact support@rilldata.com to create your first staging dashboard for review.
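
As a rough illustration of the orchestration step, the sketch below takes an ingestion spec produced during manual testing, scopes it to a single hour, and prepares the HTTP request that submits it as a Druid task. It is a minimal sketch under stated assumptions, not Rill's actual DAG: the host in `DRUID_TASK_URL`, the `events` datasource, the bucket name, and the one-folder-per-hour path layout are all hypothetical, and authentication is omitted. The `/druid/indexer/v1/task` path is Apache Druid's standard task submission endpoint; confirm the correct URL and credentials for your environment with the Rill team.

```python
import copy
import json
import urllib.request
from datetime import datetime, timedelta, timezone

# Hypothetical host; the path is Apache Druid's task submission endpoint.
DRUID_TASK_URL = "https://druid.example.com/druid/indexer/v1/task"

# Placeholder for the spec created during manual testing (all values hypothetical).
BASE_SPEC = {
    "type": "index_parallel",
    "spec": {
        "dataSchema": {
            "dataSource": "events",
            "timestampSpec": {"column": "timestamp", "format": "iso"},
            "dimensionsSpec": {"dimensions": ["country", "device"]},
            "granularitySpec": {
                "segmentGranularity": "hour",
                "queryGranularity": "none",
                "intervals": [],
            },
        },
        "ioConfig": {
            "type": "index_parallel",
            "inputSource": {"type": "s3", "prefixes": []},
            "inputFormat": {"type": "json"},
        },
        "tuningConfig": {"type": "index_parallel"},
    },
}


def hourly_spec(base_spec: dict, hour_start: datetime, bucket_prefix: str) -> dict:
    """Return a copy of base_spec limited to the hour containing hour_start (UTC)."""
    start = hour_start.astimezone(timezone.utc).replace(
        minute=0, second=0, microsecond=0
    )
    end = start + timedelta(hours=1)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    spec = copy.deepcopy(base_spec)  # leave the tested spec untouched
    # Restrict ingestion to the hour being loaded.
    spec["spec"]["dataSchema"]["granularitySpec"]["intervals"] = [
        f"{start.strftime(fmt)}/{end.strftime(fmt)}"
    ]
    # Assumed layout: one folder per hour, e.g. <prefix>/2024/01/15/13/
    spec["spec"]["ioConfig"]["inputSource"]["prefixes"] = [
        f"{bucket_prefix}/{start:%Y/%m/%d/%H}/"
    ]
    return spec


def build_request(spec: dict, url: str = DRUID_TASK_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request that submits the ingestion task."""
    body = json.dumps(spec).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

In a scheduler such as Airflow, a task that runs after processing completes could call `hourly_spec` with the run's logical hour and then send the result with `urllib.request.urlopen(build_request(spec))`. Keeping the request construction separate from the send makes the step easy to check without touching the cluster.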

Batch Ingestion

For more details on Druid ingestion, visit:

For customers with more complex joins or transformations that require Rill-managed pipelines:

Grant Rill access to the storage location (usually Amazon S3 or Google Cloud Storage).
Rill develops the pipeline logic as required.
Review the sample output with the Rill team to confirm layout and values.
Rill polls the source locations at regular intervals.

Rill Managed Pipelines

Email contact@rilldata.com if you're interested in having the Rill team build and manage ingestion for your data pipelines.