Getting Started
This guide will help you install Aqueducts and run your first pipeline.
Installation
CLI
The Aqueducts CLI is the primary way to execute pipelines locally or submit them to remote executors.
Download pre-built binaries from the latest release.
Executor (Optional)
For advanced use cases, you can deploy executors to run pipelines remotely within your infrastructure. Executors are typically deployed using Docker images. See Execution for detailed setup instructions.
Quick Start
Run Your First Pipeline
Execute the simple example pipeline:
This pipeline:
- Reads temperature data from CSV files
- Aggregates the data by date and location
- Enriches with location names
- Outputs to a Parquet file
Example Pipeline Structure
Here's what a basic pipeline looks like:
# yaml-language-server: $schema=https://raw.githubusercontent.com/vigimite/aqueducts/main/json_schema/aqueducts.schema.json
version: "v2"
sources:
  # Read temperature readings from CSV
  - type: file
    name: temp_readings
    format:
      type: csv
      options: {}
    location: ./examples/temp_readings_${month}_${year}.csv
stages:
  # Aggregate temperature data by date and location
  - - name: aggregated
      query: >
        SELECT
          cast(timestamp as date) date,
          location_id,
          round(avg(temperature_c),2) avg_temp_c
        FROM temp_readings
        GROUP by 1,2
        ORDER by 1 asc
# Write results to Parquet file
destination:
  type: file
  name: results
  format:
    type: parquet
    options: {}
  location: ./examples/output_${month}_${year}.parquet
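To make the `aggregated` stage concrete, here is a minimal Python sketch of what its SQL query computes: the average `temperature_c` per date and `location_id`, rounded to two decimals and ordered by date. The sample rows and the `aggregate` helper are hypothetical and exist only for illustration. Aqueducts runs the query through its SQL engine, not through code like this.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows with the columns the query reads:
# (timestamp, location_id, temperature_c)
rows = [
    ("2024-01-01T08:00:00", 1, 10.0),
    ("2024-01-01T20:00:00", 1, 12.5),
    ("2024-01-01T09:00:00", 2, -3.0),
    ("2024-01-02T08:00:00", 1, 9.0),
]


def aggregate(rows):
    """Average temperature_c per (date, location_id), ordered by date."""
    groups = defaultdict(list)
    for timestamp, location_id, temperature_c in rows:
        # Equivalent of: cast(timestamp as date)
        date = datetime.fromisoformat(timestamp).date()
        # Equivalent of: GROUP BY 1, 2
        groups[(date, location_id)].append(temperature_c)

    result = [
        # Equivalent of: round(avg(temperature_c), 2)
        (date, location_id, round(sum(temps) / len(temps), 2))
        for (date, location_id), temps in groups.items()
    ]
    # Equivalent of: ORDER BY 1 asc. location_id is added as a tie-breaker
    # (an assumption) so that this sketch's output is deterministic.
    result.sort(key=lambda r: (r[0], r[1]))
    return result


aggregated = aggregate(rows)
for date, location_id, avg_temp_c in aggregated:
    print(date.isoformat(), location_id, avg_temp_c)
```

Each output row corresponds to one row of the `aggregated` table that the destination then writes to Parquet.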
Next Steps
- Learn Pipeline Development: See Writing Pipelines to understand sources, stages, and destinations
- Explore Execution Options: Check Execution for local and remote execution patterns
- Browse Examples: View more complex examples in the examples folder
- Schema Reference: See Schema Reference for detailed configuration options
Parameter Templates
Notice the ${month} and ${year} parameters in the example. Aqueducts supports parameter substitution to make pipelines reusable across different inputs and outputs.
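The effect of substitution can be sketched in Python with the standard library's `string.Template`, which happens to use the same `${name}` placeholder syntax. This only illustrates the idea and is not how Aqueducts implements it. The `render` helper and the parameter values `jan` and `2024` are assumptions made for the example.

```python
from string import Template


def render(text: str, params: dict) -> str:
    """Replace every ${name} placeholder in text with its value from params."""
    # substitute() raises KeyError if a placeholder has no matching value,
    # which surfaces a missing parameter instead of silently ignoring it.
    return Template(text).substitute(params)


# The source location from the example pipeline.
location = "./examples/temp_readings_${month}_${year}.csv"

# Hypothetical parameter values.
rendered = render(location, {"month": "jan", "year": "2024"})
print(rendered)  # ./examples/temp_readings_jan_2024.csv
```

With different values for `month` and `year`, the same pipeline definition reads a different input file and writes a different output file.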
Editor Support
The yaml-language-server comment at the top enables autocompletion and validation in VS Code, Neovim, and other editors with YAML Language Server support.