Multi-Agent Deployment

FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Persistent MLS state directory — mount as a volume
VOLUME /data/skytale
ENV SKYTALE_DATA_DIR=/data/skytale
CMD ["python", "agent.py"]

# docker-compose.yml
services:
  agent-alice:
    build: .
    environment:
      - SKYTALE_API_KEY=${SKYTALE_API_KEY}
      - SKYTALE_RELAY=https://relay.skytale.sh:5000
      - SKYTALE_API_URL=https://api.skytale.sh
    volumes:
      - alice-data:/data/skytale
    restart: unless-stopped
  agent-bob:
    build: .
    environment:
      - SKYTALE_API_KEY=${SKYTALE_API_KEY}
      - SKYTALE_RELAY=https://relay.skytale.sh:5000
      - SKYTALE_API_URL=https://api.skytale.sh
    volumes:
      - bob-data:/data/skytale
    restart: unless-stopped
volumes:
  alice-data:
  bob-data:

| Variable | Description | Default |
| --- | --- | --- |
| SKYTALE_API_KEY | API key for authentication (sk_live_...) | None (required for production) |
| SKYTALE_RELAY | Relay server URL | https://relay.skytale.sh:5000 |
| SKYTALE_API_URL | API server URL | https://api.skytale.sh |
| SKYTALE_DATA_DIR | Directory for MLS state persistence | ~/.skytale/<identity_hex> |
| SKYTALE_MOCK | Enable mock mode (1, true, yes) | false |
| SKYTALE_IDENTITY | Default agent identity (TypeScript SDK) | |
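
One way an agent might resolve these settings at startup is sketched below. The `Settings` type and `load_settings` helper are hypothetical and not part of the SDK. The defaults are copied from the table, and the sketch assumes that `<identity_hex>` means the hex encoding of the identity bytes and that the mock-mode values are matched case-insensitively.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

# Values the table lists as enabling mock mode.
_TRUTHY = {"1", "true", "yes"}


@dataclass(frozen=True)
class Settings:
    api_key: Optional[str]
    relay: str
    api_url: str
    data_dir: Path
    mock: bool


def load_settings(identity: bytes, env: Mapping[str, str] = os.environ) -> Settings:
    """Hypothetical helper: read the documented variables and apply the table's defaults."""
    raw_dir = env.get("SKYTALE_DATA_DIR")
    if raw_dir:
        data_dir = Path(raw_dir)
    else:
        # Assumption: <identity_hex> is the hex encoding of the identity bytes.
        data_dir = Path.home() / ".skytale" / identity.hex()
    return Settings(
        api_key=env.get("SKYTALE_API_KEY"),
        relay=env.get("SKYTALE_RELAY", "https://relay.skytale.sh:5000"),
        api_url=env.get("SKYTALE_API_URL", "https://api.skytale.sh"),
        data_dir=data_dir,
        # Assumption: comparison ignores case and surrounding whitespace.
        mock=env.get("SKYTALE_MOCK", "").strip().lower() in _TRUTHY,
    )
```

Passing a plain dict as `env` keeps the helper easy to test without touching the real process environment.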

One agent must create the channel first. This agent becomes the MLS group owner and processes join requests from other agents.

from skytale_sdk import SkytaleChannelManager

# The creator must run first and stay running
creator = SkytaleChannelManager(
    identity=b"orchestrator",
    data_dir="/data/skytale",
)
creator.create("myorg/agents/tasks")

# Generate invite tokens for other agents
tokens = []
for _ in range(5):
    token = creator.invite("myorg/agents/tasks", max_uses=1, ttl=3600)
    tokens.append(token)

# Distribute tokens to joining agents (env vars, config files, API, etc.)

Other agents join using invite tokens. They do not need to know about each other, only the name of the creator's channel.

import os
from skytale_sdk import SkytaleChannelManager

worker = SkytaleChannelManager(
    identity=b"worker-1",
    data_dir="/data/skytale",
)
token = os.environ["SKYTALE_INVITE_TOKEN"]
worker.join_with_token("myorg/agents/tasks", token)

# Now the worker can send and receive on the channel
worker.send("myorg/agents/tasks", "worker-1 online")

| Pattern | When to use | How |
| --- | --- | --- |
| Environment variable | Fixed agent set, known at deploy time | SKYTALE_INVITE_TOKEN in docker-compose |
| Config file | Agents read config on startup | JSON/TOML file mounted as a volume |
| API endpoint | Dynamic agent scaling | Creator exposes an endpoint that returns tokens |
| Shared store | Kubernetes or orchestrator-managed | Store tokens in a Secret or KV store |
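
The first two patterns can be combined in a small lookup, sketched here. `resolve_invite_token`, the config file path, and the `invite_token` JSON key are illustrative assumptions; only the `SKYTALE_INVITE_TOKEN` variable comes from the examples above.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_invite_token(
    config_path: Path,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return an invite token from the environment, else from a JSON config file."""
    token = env.get("SKYTALE_INVITE_TOKEN")
    if token:
        return token
    # Hypothetical config layout: {"invite_token": "..."}
    if config_path.is_file():
        data = json.loads(config_path.read_text(encoding="utf-8"))
        if isinstance(data, dict):
            return data.get("invite_token") or None
    return None
```

The environment variable takes precedence so that a deploy-time override always wins over a mounted file. A worker would pass the result to `join_with_token` and treat `None` as a signal to request a token some other way.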

For auto-scaling scenarios where new agents spin up dynamically:

# Create a reusable token (up to 100 uses, valid for 24 hours)
token = creator.invite("myorg/agents/tasks", max_uses=100, ttl=86400)
# Store in a shared secret manager
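
Because a reusable token still expires, whatever publishes it needs to reissue it before the TTL runs out. A minimal sketch of that bookkeeping follows. `TokenLease` and its `issue` callable are hypothetical, with `issue` standing in for a call such as `creator.invite(...)` above, and the five-minute refresh margin is an arbitrary choice.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class TokenLease:
    """Caches one invite token and reissues it shortly before its TTL lapses."""

    issue: Callable[[], str]                      # stand-in for creator.invite(..., ttl=ttl)
    ttl: float                                    # seconds the token stays valid
    margin: float = 300.0                         # reissue this many seconds early
    clock: Callable[[], float] = time.monotonic   # injectable for tests
    _token: Optional[str] = field(default=None, init=False)
    _issued_at: float = field(default=0.0, init=False)

    def current(self) -> str:
        """Return a token that has at least `margin` seconds of validity left."""
        now = self.clock()
        stale = now - self._issued_at >= self.ttl - self.margin
        if self._token is None or stale:
            self._token = self.issue()
            self._issued_at = now
        return self._token
```

This tracks time only. It does not count uses, so a token that reaches its `max_uses` limit early would still need separate handling.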

If an agent restarts but its data_dir is intact (volume-mounted), it can resume without rejoining:

# On restart, recreate the manager with the same identity and data_dir
mgr = SkytaleChannelManager(
    identity=b"worker-1",
    data_dir="/data/skytale",  # Volume-mounted, survived restart
)

# Channels are restored from local MLS state
# No need to rejoin; just start sending/receiving
mgr.send("myorg/agents/tasks", "worker-1 back online")

Agent restart without state (data_dir lost)

If the data_dir is lost, the agent must rejoin with a new invite token:

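mgr = SkytaleChannelManager(
    identity=b"worker-1",
    data_dir="/data/skytale",
)

# Need a fresh invite token from the channel owner.
# get_new_token_from_orchestrator() is a placeholder for your own
# token-delivery mechanism (API call, secret store, etc.).
new_token = get_new_token_from_orchestrator()
mgr.join_with_token("myorg/agents/tasks", new_token)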
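
A startup routine can choose between the two restart paths by inspecting the data directory first. This sketch assumes that a missing or empty directory means there is no MLS state to resume. The SDK's on-disk layout is not described here, so `has_local_state` is a heuristic and both function names are hypothetical.

```python
from pathlib import Path


def has_local_state(data_dir: str) -> bool:
    """Heuristic: treat any file under data_dir as evidence of saved MLS state."""
    root = Path(data_dir)
    return root.is_dir() and any(p.is_file() for p in root.rglob("*"))


def startup_action(data_dir: str) -> str:
    """Return "resume" when state survived the restart, otherwise "rejoin"."""
    return "resume" if has_local_state(data_dir) else "rejoin"
```

On `"rejoin"`, the agent would fetch a fresh token and call `join_with_token` as shown above. On `"resume"`, it can start sending right away.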

Always call close() before stopping an agent to cleanly shut down background threads:

import signal
import sys

def shutdown(signum, frame):
    mgr.close()
    sys.exit(0)

signal.signal(signal.SIGTERM, shutdown)
signal.signal(signal.SIGINT, shutdown)
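
A variation is to have the handler only set a flag and let the main loop do the cleanup, which keeps the signal handler trivial. This is a minimal sketch that assumes nothing about the manager beyond the `close()` method described above; `install_stop_handlers`, `run_until_stopped`, and `step` are hypothetical names.

```python
import signal
import threading
from typing import Callable


def install_stop_handlers(stop: threading.Event) -> None:
    """Make SIGTERM and SIGINT set the stop flag. Call this from the main thread."""
    def _handler(signum, frame):
        stop.set()

    signal.signal(signal.SIGTERM, _handler)
    signal.signal(signal.SIGINT, _handler)


def run_until_stopped(
    mgr,
    stop: threading.Event,
    step: Callable[[], None],
    interval: float = 1.0,
) -> None:
    """Call step() repeatedly until stop is set, then always close the manager."""
    try:
        while not stop.is_set():
            step()
            # Sleeps up to `interval` seconds but wakes early once stop is set.
            stop.wait(interval)
    finally:
        mgr.close()
```

A typical entry point would create a `threading.Event`, call `install_stop_handlers(stop)`, and then `run_until_stopped(mgr, stop, step)`. The `finally` clause means `close()` also runs if `step` raises.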

Or use the context manager:

with SkytaleChannelManager(identity=b"agent") as mgr:
    mgr.create("org/ns/chan")
    run_agent_loop(mgr)
# mgr.close() called automatically on exit

  • data_dir is mounted as a persistent volume (Docker volume, EBS, PVC)
  • Each agent has its own unique data_dir — never shared
  • Backup strategy for data_dir if channel state is critical
  • Volume permissions allow the agent process to read/write
  • SKYTALE_API_KEY set via environment variable, not hardcoded
  • API key stored in a secrets manager (AWS Secrets Manager, Vault, K8s Secret)
  • Separate API keys per environment (dev, staging, production)
  • Relay URL configured correctly (SKYTALE_RELAY)
  • Outbound TCP 5000 (gRPC) and UDP 4433 (QUIC) allowed through firewall
  • Health check: curl https://relay.skytale.sh:5000/health
  • Graceful shutdown handler (SIGTERM / SIGINT) calls mgr.close()
  • Container restart policy set to unless-stopped or always
  • Error handling for all 5 exception types (see Error handling guide)
  • Logging configured to capture SkytaleError codes for monitoring
  • Each agent has a unique, stable identity across restarts
  • Identity is deterministic (not randomly generated on each start)
  • No two running agents share the same identity
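
A few of the checklist items can be verified automatically before the agent starts. The sketch below is a hypothetical preflight check, not an SDK feature. It covers only the API key, the data directory, and the identity, and the function name and messages are illustrative.

```python
import os
import tempfile
from typing import List, Mapping


def preflight(identity: bytes, env: Mapping[str, str] = os.environ) -> List[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if not env.get("SKYTALE_API_KEY"):
        problems.append("SKYTALE_API_KEY is not set")
    if not identity:
        problems.append("identity is empty")

    data_dir = env.get("SKYTALE_DATA_DIR")
    if not data_dir:
        problems.append("SKYTALE_DATA_DIR is not set; the default may not be persistent")
    elif not os.path.isdir(data_dir):
        problems.append(f"{data_dir} is not a directory")
    else:
        try:
            # Creating and discarding a temp file shows the process can write here.
            with tempfile.TemporaryFile(dir=data_dir):
                pass
        except OSError:
            problems.append(f"{data_dir} is not writable")
    return problems
```

Running this at startup and exiting with a non-zero status when the list is non-empty makes a misconfigured container fail fast instead of failing later on the first decrypt.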

Missing persistent storage for data_dir is the most common production issue. Without persistent MLS state, agents cannot decrypt messages on existing channels. Always use a volume mount.

# docker-compose.yml (WRONG: no volume)
services:
  agent:
    build: .
    # data_dir lives in the container's writable layer and is lost
    # when the container is removed or recreated

# docker-compose.yml (CORRECT: persistent volume)
services:
  agent:
    build: .
    volumes:
      - agent-data:/data/skytale
    environment:
      - SKYTALE_DATA_DIR=/data/skytale
volumes:
  agent-data:

Agents on separate machines need invite tokens

Agents cannot join channels by just knowing the channel name. The MLS protocol requires a cryptographic handshake mediated by invite tokens. There is no “open” channel that anyone can join.

# WRONG: trying to join without a token
bob.create("org/ns/chan") # This creates a NEW channel, not joins Alice's
# CORRECT: use invite token from the channel owner
token = alice.invite("org/ns/chan")
# Send token to Bob (env var, API call, config file, etc.)
bob.join_with_token("org/ns/chan", token)

Each agent identity must have its own data_dir. Sharing causes MLS epoch conflicts and decryption failures.

# WRONG: shared volume
services:
  agent-1:
    volumes:
      - shared-data:/data/skytale  # Both agents write to same dir
  agent-2:
    volumes:
      - shared-data:/data/skytale  # MLS state conflicts

# CORRECT: separate volumes
services:
  agent-1:
    volumes:
      - agent1-data:/data/skytale
  agent-2:
    volumes:
      - agent2-data:/data/skytale
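
A shared volume is easy to catch before deployment. This sketch inspects an already-parsed compose file (a plain dict, as a YAML parser would produce) and reports volumes mounted at the data directory by more than one service. The function name is hypothetical, and it handles only the short `source:target` volume syntax used above.

```python
from collections import defaultdict
from typing import Dict, List


def shared_data_volumes(compose: dict, target: str = "/data/skytale") -> Dict[str, List[str]]:
    """Map each volume mounted at `target` by two or more services to those services."""
    users = defaultdict(list)
    for name, service in compose.get("services", {}).items():
        for entry in service.get("volumes", []):
            if not isinstance(entry, str):
                continue  # long (mapping) syntax is out of scope for this sketch
            # Short syntax: "source:target" or "source:target:mode".
            parts = entry.split(":")
            if len(parts) >= 2 and parts[1] == target:
                users[parts[0]].append(name)
    return {vol: svcs for vol, svcs in users.items() if len(svcs) > 1}
```

An empty result means every service has its own volume at the data directory. Any entry in the result names a volume that would cause the epoch conflicts described above.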

If your agent generates a random identity on each start, it creates a new MLS participant every time. Use a stable, deterministic identity:

# WRONG: random identity
import os
mgr = SkytaleChannelManager(identity=os.urandom(16))
# CORRECT: stable identity
mgr = SkytaleChannelManager(identity=b"order-processor-1")
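
When several replicas run from one image, a stable identity can be derived from values that survive restarts, such as a service name plus a replica index. This is a minimal sketch; the `AGENT_SERVICE` and `AGENT_INDEX` variable names are assumptions made for this example, not SDK settings.

```python
import os
from typing import Mapping


def stable_identity(env: Mapping[str, str] = os.environ) -> bytes:
    """Build a deterministic identity such as b"order-processor-1" from the environment."""
    # AGENT_SERVICE and AGENT_INDEX are hypothetical names used only in this sketch.
    service = env.get("AGENT_SERVICE", "").strip()
    index = env.get("AGENT_INDEX", "").strip()
    if not service or not index.isdigit():
        raise ValueError("AGENT_SERVICE and a numeric AGENT_INDEX must both be set")
    return f"{service}-{int(index)}".encode("utf-8")
```

The result can be passed as `identity=` when constructing the manager. Raising on missing values is deliberate: falling back to a random identity would reintroduce the problem this section warns about.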