Self-hosted storage
Self-hosted storage extends the base self-hosted agent model to store
input parameters, file uploads, logs, and outputs in your team's infrastructure, in addition to
running tasks there. When this feature is enabled, this data never transits through Airplane's
systems, which can help your organization satisfy rigorous security and compliance requirements for
your data.
Self-hosted storage can be enabled for all of your agents or just a subset.
Architecture
Self-hosted storage builds upon the existing self-hosted agent model by modifying the agent process
and adding a few new components. These are shown in the following diagram and discussed more in the
sections below.
Agent as a server
With self-hosted storage, each agent acts as a server in addition to handling task
orchestration. This server processes requests from your team members' browsers, which now
hit an agent instead of the Airplane API to set inputs or get outputs for runs that are managed by
the agent.
Each client request includes an Airplane-generated JWT so the agent can verify that the request is
from a user who's allowed to view the associated run data. The agent server instances are also
fronted by an external load balancer, using HTTPS with a certificate on an Airplane-managed domain,
to protect data in transit.
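As a rough illustration of this check, agent-side verification might look like the following sketch, which uses the PyJWT library. The claim names (`run_id`, `team_id`), the shared-secret signing, and the helper name are assumptions for illustration, not Airplane's actual token format.

```python
# Minimal sketch of agent-side JWT verification (assumed claim names and key handling).
import jwt  # PyJWT

SIGNING_KEY = "shared-secret-for-illustration"  # a real agent would use an Airplane-provided key

def can_access_run(token: str, run_id: str) -> bool:
    """Verify the Airplane-issued JWT and check that it grants access to run_id."""
    try:
        claims = jwt.decode(token, SIGNING_KEY, algorithms=["HS256"])
    except jwt.InvalidTokenError:
        return False
    return claims.get("run_id") == run_id

# Mint a token the way the API hypothetically would, then verify it.
token = jwt.encode({"run_id": "run123", "team_id": "team456"}, SIGNING_KEY, algorithm="HS256")
print(can_access_run(token, "run123"))   # True
print(can_access_run(token, "other"))    # False: token is for a different run
print(can_access_run("junk", "run123"))  # False: signature check fails
```

Rejecting the request outright on any decode failure means a tampered or expired token can never fall through to the data-access path.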
The agent server also receives requests from task runners to append new logs and outputs. These go
through an internal load balancer that's protected by network firewall rules.
Intermediate storage
With self-hosted storage enabled, the agent writes recent run data to an intermediate database,
currently implemented with Redis.
The intermediate data are made available to respond to user queries and are periodically aggregated
and flushed into the long-term store described in the next section.
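The append-then-flush flow can be sketched as follows, with in-memory stand-ins for Redis (the intermediate store) and the blobstore (long-term storage); the function names and the key layout are illustrative only.

```python
# Sketch of the intermediate-to-long-term flush, using in-memory stand-ins
# for Redis (intermediate) and the blobstore (long-term).
from collections import defaultdict

intermediate = defaultdict(list)  # run_id -> fresh log lines (stands in for Redis)
blobstore = {}                    # object key -> archived content (stands in for S3)

def append_log(run_id: str, line: str) -> None:
    """Task runners append fresh log lines to the intermediate store."""
    intermediate[run_id].append(line)

def flush(team_id: str) -> None:
    """Periodically aggregate intermediate data and move it to long-term storage."""
    for run_id, lines in list(intermediate.items()):
        key = f"teams/{team_id}/runs/{run_id}/logs"
        blobstore[key] = blobstore.get(key, "") + "".join(f"{l}\n" for l in lines)
        del intermediate[run_id]  # data now lives only in the long-term store

append_log("run1", "starting")
append_log("run1", "done")
flush("team1")
print(blobstore["teams/team1/runs/run1/logs"])  # "starting\ndone\n"
```

While data sits in the intermediate store it can answer user queries with low latency; the flush keeps the blobstore as the cheap system of record.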
Long-term storage
The agent uses a blobstore for low-cost, long-term persistence of run data. We currently
support AWS S3 for this, but in the future this may be extended to equivalents in other cloud
providers.
Storage zones
A "zone" is a grouping of storage servers and the associated infrastructure (buckets, etc.) for some
subset of task runs. Agents are currently 1:1 to zones, but in the future, this may be extended to
allow different groups of agents to share storage within a single zone so that there's less
replication of this infrastructure.
Each zone is identified by a slug, which the API uses to uniquely reference it. Zone slugs must
begin with a lower-case letter, contain only lower-case letters and digits, and be unique across
the agent installations in your account.
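The slug rules above can be expressed as a simple check; this regex is our reading of the stated constraints, not an official validator.

```python
import re

# Must start with a lower-case letter; only lower-case letters and digits after that.
SLUG_RE = re.compile(r"[a-z][a-z0-9]*")

def is_valid_zone_slug(slug: str) -> bool:
    return SLUG_RE.fullmatch(slug) is not None

print(is_valid_zone_slug("zone1"))   # True
print(is_valid_zone_slug("1zone"))   # False: starts with a digit
print(is_valid_zone_slug("Zone-1"))  # False: upper-case letter and hyphen
```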
If your team has multiple agents and zones, the zone storing a run's inputs may differ from the
zone where the run actually executes. This is because an inputs zone is picked, and the associated
data saved, before the run is queued in the Airplane API. The API uses the following criteria, in
the order listed, to pick an inputs zone:
- A zone matching the hard-coded (i.e., non-templated) constraints for the run
- The execution zone of the run's parent (if it was created from another run via an Airplane SDK)
- A zone matching the run's environment
- The first zone when sorting the team's zones by slug
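The selection order above can be sketched as an ordered fallback. The zone and run fields used here (`constraint_zone`, `parent_execution_zone`, `env`) are illustrative stand-ins, not the API's actual data model.

```python
from typing import Optional

def pick_inputs_zone(zones: list, run: dict) -> Optional[dict]:
    """Pick an inputs zone using the documented criteria, in order (illustrative shapes)."""
    # 1. A zone matching the run's hard-coded (non-templated) constraints.
    for z in zones:
        if run.get("constraint_zone") == z["slug"]:
            return z
    # 2. The execution zone of the run's parent, if it was created via an SDK.
    for z in zones:
        if run.get("parent_execution_zone") == z["slug"]:
            return z
    # 3. A zone matching the run's environment.
    for z in zones:
        if z.get("env") == run.get("env"):
            return z
    # 4. The first zone when sorting the team's zones by slug.
    return min(zones, key=lambda z: z["slug"]) if zones else None

zones = [{"slug": "zoneb", "env": "prod"}, {"slug": "zonea", "env": "dev"}]
print(pick_inputs_zone(zones, {"env": "prod"})["slug"])     # "zoneb": environment match
print(pick_inputs_zone(zones, {"env": "staging"})["slug"])  # "zonea": fallback, first by slug
```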
See the Verifying section below for more details on determining which zone(s) were
used for a specific run.
Inputs vs. outputs
Self-hosted storage can be enabled for just logs and outputs (the default) or also for inputs,
including task run parameters and file uploads. Enabling the latter introduces some additional
limitations, which are described in the Limitations section below.
Setup
The setup process for self-hosted storage varies based on your agent environment and the
configuration system that you're using. See the appropriate sections of the self-hosting docs for
more details:
- AWS ECS (Terraform)
- AWS ECS (CloudFormation)
- Kubernetes on AWS EKS
- Kubernetes on GCP GKE
- Generic Kubernetes
If you have an existing self-hosted agent setup, we'd advise setting up a storage-enabled one
separately as opposed to doing an in-place upgrade. See the
Migrating to self-hosted storage section
below for more details.
SDK support
Various self-hosted storage features rely on SDK-side code to properly route requests to agent-based
servers. To ensure these work properly, please update your JavaScript and Python tasks to use at
least v0.2.80 of the JS SDK and v0.3.45 of the Python SDK.
Verifying
If all goes well, you should see the zone listed in your team's settings page. In addition, the zone
slug will be shown next to a globe icon underneath the full name of each agent in the zone.
Logs, outputs, and inputs (if enabled) for subsequent runs scheduled on the associated agent(s) will
then be stored in your team's infrastructure. You can verify this by looking at the "Advanced" tab
for a run—the zones used for storing the run's inputs and for executing it will be listed there if
they were used:
An "Execution zone" indicates that logs and outputs were self-hosted whereas an "Inputs zone"
indicates that inputs were self-hosted. If no zones are listed, this means that both the run's
inputs and its outputs were stored in Airplane-hosted infrastructure.
You can also use your browser's development tools to verify that requests to set inputs, get logs,
etc. are hitting the load balancer listed in the settings page.
Finally, you can examine the actual run data in your configured storage bucket. The contents will
have the following layout:
```
YOUR_BUCKET/
├─ inputs/
│  ├─ INPUT_ID1/
│  ├─ INPUT_ID2/
│  └─ ...
├─ teams/
│  └─ YOUR_TEAM_ID/
│     └─ runs/
│        ├─ RUN_ID1/
│        │  ├─ logs/
│        │  └─ outputs/
│        ├─ RUN_ID2/
│        │  ├─ logs/
│        │  └─ outputs/
│        └─ ...
└─ uploads/
   ├─ UPLOAD_ID1/
   ├─ UPLOAD_ID2/
   └─ ...
```
Run logs and outputs are temporarily stored in Redis and may take up to 20 minutes to be archived
in your bucket.
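Given the layout above, the object keys for a run's archived data can be constructed mechanically; this hypothetical helper just mirrors the documented paths.

```python
def run_object_keys(team_id: str, run_id: str) -> dict:
    """Build bucket key prefixes for a run's archived logs and outputs,
    following the documented bucket layout."""
    prefix = f"teams/{team_id}/runs/{run_id}"
    return {"logs": f"{prefix}/logs/", "outputs": f"{prefix}/outputs/"}

keys = run_object_keys("team123", "run456")
print(keys["logs"])     # teams/team123/runs/run456/logs/
print(keys["outputs"])  # teams/team123/runs/run456/outputs/
```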
Migrating to self-hosted storage
As mentioned previously, we recommend creating agents with self-hosted storage from scratch as
opposed to doing in-place upgrades of existing, non-storage-enabled agents. To smoothly transition
from the old agents to the new ones:
- Spin up your storage-enabled agents with a required label, e.g. `required:storage:true`. This can be configured via the `agent_labels` parameter for ECS Terraform, the `AgentLabels` parameter for ECS CloudFormation, or the `airplane.agentLabels` parameter in Helm
- Add a `storage:true` constraint to one or more test tasks
- Verify that the storage-enabled agents are working for those tasks
- Remove the `required:storage:true` label from your agent configuration and then re-apply to update the agents
- Bring down your old agents
Limitations
The following are the current limitations of self-hosted storage:
- The workflow runtime is not supported because it depends on additional storage systems that aren't included in the self-hosted agent environment. Please continue to use regular self-hosted agents for workflow tasks, if applicable.
- Agents and zones are 1:1; this means that each agent needs its own load balancers, blobstore bucket, etc.
- Runbooks are not supported because they depend on our API "seeing" the inputs and outputs of individual steps. Please use tasks as an alternative.
- Self-hosted storage is not used for runs that are executed through Airplane Studio, and the Airplane API may be able to see inputs and outputs for these.
- (Inputs only) JavaScript templates that do comparisons against specific input values (e.g., `params.name === "Bob" ? ...`) won't work as expected because they'll be evaluated with temporary placeholders. On the other hand, templates that do regular string interpolation (e.g., `SELECT * from users where name = '{{params.name}}'`) are fine. These kinds of expressions are often used to configure inputs for Airplane built-ins like SQL and REST tasks.
- (Inputs only) Input parameters that are used for run constraints will be sent to the Airplane API without obfuscation. This is needed so that the associated runs can be scheduled properly by the API.
These limitations may be relaxed in the future.
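The placeholder limitation on templates can be illustrated with a toy round trip: interpolation survives placeholder substitution, but a comparison is evaluated while the placeholder is still in place. The placeholder format below is made up for illustration and is not Airplane's actual encoding.

```python
# Toy illustration of why value comparisons break under placeholder evaluation.
placeholder = "__PLACEHOLDER_name__"  # made-up placeholder format
real_value = "Bob"

# Interpolation: the placeholder is spliced into the string and swapped for the
# real value later, so the final result is correct.
interpolated = f"SELECT * from users where name = '{placeholder}'"
final = interpolated.replace(placeholder, real_value)
print(final)  # SELECT * from users where name = 'Bob'

# Comparison: the expression is evaluated while the placeholder is present,
# so it yields the wrong answer even though the real value would have matched.
comparison_result = (placeholder == "Bob")
print(comparison_result)  # False, despite the real value being "Bob"
```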