Self-hosted storage

Self-hosted storage extends the base self-hosted agent model to store logs and outputs in your team's infrastructure, in addition to running tasks there. When this feature is enabled, these logs and outputs never transit through Airplane's systems, which can help your organization satisfy rigorous security and compliance requirements for your data.
Self-hosted storage can be enabled for all of your agents or just a subset.

Architecture

Self-hosted storage builds upon the existing self-hosted agent model by modifying the agent process and adding a few new components. These are shown in the following diagram and discussed more in the sections below.

Agent as a server

With self-hosted storage, each agent becomes a server in addition to handling task orchestration. This server processes requests from your team members' browsers, which now hit an agent instead of the Airplane API to get logs and outputs for runs that are managed by the agent.
Each client request includes an Airplane-generated JWT so the agent can verify that the request is from a user who's allowed to view the associated run data. The agent server instances are also fronted by an external load balancer, using HTTPS with a certificate on an Airplane-managed domain, to protect data in transit.
The agent server also receives requests from task runners to append new logs and outputs. These go through an internal load balancer that's protected by network firewall rules.
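To make the request flow concrete, here is a minimal sketch of what an agent's HTTP surface might look like under this model. It is not Airplane's actual implementation: the endpoint paths, ports, and the verifyAirplaneJWT helper are illustrative placeholders, and TLS is assumed to be terminated at the load balancers.

```go
// Conceptual sketch of an agent's HTTP surface with self-hosted storage enabled.
// Endpoint paths, ports, and helper functions are illustrative, not Airplane's API.
package main

import (
	"errors"
	"log"
	"net/http"
	"strings"
)

// verifyAirplaneJWT is a placeholder: a real agent would validate the token's
// signature against Airplane's keys and check that the claims authorize access
// to the requested run.
func verifyAirplaneJWT(token, runID string) error {
	if token == "" {
		return errors.New("missing token")
	}
	_ = runID // signature and claim checks omitted in this sketch
	return nil
}

// withAuth wraps externally reachable handlers (fronted by the external,
// HTTPS-terminating load balancer) with JWT verification.
func withAuth(next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		token := strings.TrimPrefix(r.Header.Get("Authorization"), "Bearer ")
		runID := r.URL.Query().Get("runID")
		if err := verifyAirplaneJWT(token, runID); err != nil {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next(w, r)
	}
}

func main() {
	// External listener: browsers fetch logs and outputs for runs managed by
	// this agent. Each request carries an Airplane-generated JWT.
	external := http.NewServeMux()
	external.HandleFunc("/v0/runs/getLogs", withAuth(func(w http.ResponseWriter, r *http.Request) {
		// Read logs for the requested run from intermediate/long-term storage.
	}))
	external.HandleFunc("/v0/runs/getOutputs", withAuth(func(w http.ResponseWriter, r *http.Request) {
		// Read outputs for the requested run.
	}))

	// Internal listener: task runners append new logs and outputs. It sits
	// behind an internal load balancer and is protected by network firewall
	// rules rather than user JWTs.
	internal := http.NewServeMux()
	internal.HandleFunc("/internal/runs/appendLogs", func(w http.ResponseWriter, r *http.Request) {
		// Append incoming log lines to the intermediate store (e.g. Redis).
	})

	go func() { log.Fatal(http.ListenAndServe(":8080", external)) }()
	log.Fatal(http.ListenAndServe(":8081", internal))
}
```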

Intermediate storage

With self-hosted storage enabled, the agent uses an intermediate database for storing fresh run data. This database is currently implemented with Redis.
The intermediate data are made available to respond to user queries and are periodically aggregated and flushed into the long-term store described in the next section.
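The sketch below illustrates this pattern under some assumptions: one Redis list of log lines per run, appends as lines arrive from task runners, and user queries answered straight from Redis. The key layout and method names are hypothetical.

```go
// Minimal sketch of the intermediate-storage pattern, assuming one Redis list
// of log lines per run. The key layout here is illustrative, not the agent's
// actual schema.
package main

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

type intermediateStore struct {
	rdb *redis.Client
}

// logsKey builds a hypothetical per-run key for fresh log lines.
func (s *intermediateStore) logsKey(runID string) string {
	return "run:" + runID + ":logs"
}

// AppendLog records a new log line as it arrives from a task runner.
func (s *intermediateStore) AppendLog(ctx context.Context, runID, line string) error {
	return s.rdb.RPush(ctx, s.logsKey(runID), line).Err()
}

// GetLogs answers a user query for a run's recent logs straight from Redis.
func (s *intermediateStore) GetLogs(ctx context.Context, runID string) ([]string, error) {
	return s.rdb.LRange(ctx, s.logsKey(runID), 0, -1).Result()
}

func main() {
	ctx := context.Background()
	store := &intermediateStore{rdb: redis.NewClient(&redis.Options{Addr: "localhost:6379"})}

	_ = store.AppendLog(ctx, "run_123", "hello from a task runner")
	logs, _ := store.GetLogs(ctx, "run_123")
	fmt.Println(logs)

	// A background loop (not shown) would periodically aggregate each run's
	// entries and flush them to the long-term store described next.
}
```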

Long-term storage

The agent uses a blobstore system for low-cost, long-term persistence of run data. We currently support AWS S3 for this, but in the future this may be extended to support equivalents in other cloud providers.
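As a rough illustration, a flush from the intermediate store into the blobstore might look like the following sketch using the AWS SDK for Go. The bucket name and object key layout are assumptions for the example, not the agent's actual storage format.

```go
// Sketch of flushing aggregated run data into S3 for long-term persistence.
// Bucket name and key layout are hypothetical.
package main

import (
	"bytes"
	"context"
	"log"
	"strings"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// flushRunLogs writes a run's aggregated log lines to S3 as a single object.
func flushRunLogs(ctx context.Context, client *s3.Client, bucket, runID string, lines []string) error {
	body := strings.Join(lines, "\n")
	_, err := client.PutObject(ctx, &s3.PutObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String("runs/" + runID + "/logs.ndjson"), // hypothetical key layout
		Body:   bytes.NewReader([]byte(body)),
	})
	return err
}

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	client := s3.NewFromConfig(cfg)

	// Example: flush two aggregated log lines for a run to the zone's bucket.
	err = flushRunLogs(ctx, client, "example-airplane-zone-bucket", "run_123", []string{
		`{"ts":"2023-01-01T00:00:00Z","text":"started"}`,
		`{"ts":"2023-01-01T00:00:01Z","text":"done"}`,
	})
	if err != nil {
		log.Fatal(err)
	}
}
```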

Storage zones

A "zone" is a grouping of storage servers and the associated infrastructure (buckets, etc.) for some subset of task runs. Agents are currently 1:1 to zones, but in the future, this may be extended to allow different groups of agents to share storage within a single zone so that there's less replication of this infrastructure.
Each zone is identified by a slug, which the API uses to uniquely identify the zone. Zone slugs must begin with a lower-case letter and contain only lower-case letters and digits. They must also be unique for each agent installation in your account.
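These slug rules translate directly into a small validation check; the standalone sketch below just illustrates them.

```go
// Illustration of the zone slug rules: begins with a lower-case letter and
// contains only lower-case letters and digits.
package main

import (
	"fmt"
	"regexp"
)

var zoneSlugRE = regexp.MustCompile(`^[a-z][a-z0-9]*$`)

func main() {
	for _, slug := range []string{"zone1", "storage2", "1zone", "my-zone", "Zone"} {
		fmt.Printf("%-8s valid=%v\n", slug, zoneSlugRE.MatchString(slug))
	}
}
```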

Setup

The setup process for self-hosted storage varies based on your agent environment and the configuration system that you're using. See the appropriate sections of the self-hosting docs for more details.
If you have an existing self-hosted agent setup, we'd advise setting up a storage-enabled one separately as opposed to doing an in-place upgrade. See the Migrating to self-hosted storage section below for more details.

Verifying

If all goes well, you should see the zone listed in your team's settings page. In addition, the zone slug will be shown next to a globe icon underneath the full name of each agent in the zone.
Logs and outputs for subsequent runs scheduled on the associated agent(s) will then be stored in your team's infrastructure. You can verify this by looking at the "Advanced" tab for a run; the zone will be listed there if the run data is not hosted in Airplane's infrastructure.
You can also use your browser's developer tools to verify that requests to get logs and outputs are hitting the load balancer listed on the settings page.

Migrating to self-hosted storage

As mentioned previously, we recommend creating agents with self-hosted storage from scratch as opposed to doing in-place upgrades of existing, non-storage-enabled agents. To smoothly transition from the old agents to the new ones:
  1. Spin up your storage-enabled agents with a required label, e.g. required:storage:true. This can be configured via the agent_labels parameter for ECS Terraform, the AgentLabels parameter for ECS CloudFormation, or the airplane.agentLabels parameter in Helm.
  2. Add a storage:true constraint to one or more test tasks (see the matching sketch after this list)
  3. Verify that the storage-enabled agents are working for those tasks
  4. Remove the required:storage:true label from your agent configuration and then re-apply to update the agents
  5. Bring down your old agents
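The sketch below is a simplified illustration of why this sequence works; it approximates label/constraint matching conceptually and is not Airplane's actual scheduling logic. A run lands on an agent only when the run's constraints are satisfied by the agent's labels and every label the agent marks as required appears among the run's constraints.

```go
// Simplified, conceptual illustration of label/constraint matching during the
// migration. Not Airplane's actual scheduler.
package main

import "fmt"

func canSchedule(agentLabels, requiredLabels, runConstraints map[string]string) bool {
	// Every run constraint must be satisfied by an agent label.
	for k, v := range runConstraints {
		if agentLabels[k] != v {
			return false
		}
	}
	// Every required agent label must be explicitly requested by the run.
	for k, v := range requiredLabels {
		if runConstraints[k] != v {
			return false
		}
	}
	return true
}

func main() {
	agentLabels := map[string]string{"storage": "true"}
	required := map[string]string{"storage": "true"} // step 1: required:storage:true

	// Steps 2-3: only test tasks with the storage:true constraint land on the new agents.
	fmt.Println(canSchedule(agentLabels, required, map[string]string{"storage": "true"})) // true
	fmt.Println(canSchedule(agentLabels, required, map[string]string{}))                  // false

	// Step 4: once the required label is removed, unconstrained runs are accepted too.
	fmt.Println(canSchedule(agentLabels, nil, map[string]string{})) // true
}
```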

Limitations

The following are the current limitations of self-hosted storage:
  1. The workflow runtime is not supported. Please continue to use regular self-hosted agents for workflow tasks, if applicable.
  2. Agents and zones are 1:1; this means that each agent needs its own load balancers, blobstore bucket, etc.
  3. Runbooks that depend on fetching the outputs of previous blocks won't work because our API no longer has access to these. Please use tasks as an alternative.
These limitations may be relaxed in the future.