Reference architectures
A single VPS running Fleet can support hundreds, if not thousands, of hosts. This page presents an opinionated view of running Fleet in a production environment, along with configuration strategies that enable High Availability (HA).
Availability components
There are a few strategies that can be used to ensure high availability:
- Database HA
- Traffic load balancing
Database HA
Fleet recommends RDS Aurora MySQL when running on AWS. More details about backups/snapshots can be found here. It is also possible to dynamically scale read replicas to increase performance and enable database failover. Aurora Global can span multiple regions for more advanced configurations (not included in the reference Terraform).
In some cases, adding a read replica can increase database performance for specific access patterns. Read performance can benefit, for example, when automating against the API or with fleetctl.
Note: Fleet servers need to talk to a writer in the same datacenter. Cross-region replication can be used for failover, but writes need to be local.
Traffic load balancing
Load balancing distributes request traffic over many instances of the backend application. Using an AWS Application Load Balancer (ALB) can also offload SSL termination, freeing Fleet to spend the majority of its allocated compute on its core functionality. More details about ALB can be found here.
Note: if using the Terraform reference architecture, all configurations can dynamically scale based on load (CPU/memory), and all configurations assume On-Demand pricing (savings are available through Reserved Instances). Calculations do not take into account NAT gateway charges or other networking-related ingress/egress costs.
Cloud providers
AWS
Example configuration breakpoints
Up to 1000 hosts
Fleet instances | CPU Units | RAM |
---|---|---|
1 Fargate task | 512 CPU Units | 4GB |
Dependencies | Version | Instance type |
---|---|---|
Redis | 6 | t4g.small |
MySQL | 5.7.mysql_aurora.2.10.0 | db.t3.small |
Up to 25000 hosts
Fleet instances | CPU Units | RAM |
---|---|---|
10 Fargate tasks | 1024 CPU Units | 4GB |
Dependencies | Version | Instance type |
---|---|---|
Redis | 6 | m6g.large |
MySQL | 5.7.mysql_aurora.2.10.0 | db.r6g.large |
Up to 150000 hosts
Fleet instances | CPU Units | RAM |
---|---|---|
20 Fargate tasks | 1024 CPU Units | 4GB |
Dependencies | Version | Instance type | Nodes |
---|---|---|---|
Redis | 6 | m6g.large | 3 |
MySQL | 5.7.mysql_aurora.2.10.0 | db.r6g.4xlarge | 1 |
Up to 300000 hosts
Fleet instances | CPU Units | RAM |
---|---|---|
20 Fargate tasks | 1024 CPU Units | 4GB |
Dependencies | Version | Instance type | Nodes |
---|---|---|---|
Redis | 6 | m6g.large | 3 |
MySQL | 5.7.mysql_aurora.2.10.0 | db.r6g.16xlarge | 2 |
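The breakpoints above can be treated as a lookup table keyed by host count. The following is a minimal sketch of that idea, not part of the reference Terraform: the function name `aws_sizing` and the dictionary keys are hypothetical, and the values are copied from the tables above.

```python
from bisect import bisect_left

# Host-count ceilings and sizing copied from the AWS breakpoint tables above.
# The key names are illustrative only; node counts are listed only where the
# tables list them.
AWS_BREAKPOINTS = [
    (1_000, {"fargate_tasks": 1, "cpu_units": 512, "ram_gb": 4,
             "redis": "t4g.small", "mysql": "db.t3.small"}),
    (25_000, {"fargate_tasks": 10, "cpu_units": 1024, "ram_gb": 4,
              "redis": "m6g.large", "mysql": "db.r6g.large"}),
    (150_000, {"fargate_tasks": 20, "cpu_units": 1024, "ram_gb": 4,
               "redis": "m6g.large", "redis_nodes": 3,
               "mysql": "db.r6g.4xlarge", "mysql_nodes": 1}),
    (300_000, {"fargate_tasks": 20, "cpu_units": 1024, "ram_gb": 4,
               "redis": "m6g.large", "redis_nodes": 3,
               "mysql": "db.r6g.16xlarge", "mysql_nodes": 2}),
]


def aws_sizing(host_count: int) -> dict:
    """Return the smallest documented tier whose ceiling covers host_count."""
    if host_count < 0:
        raise ValueError("host_count must be non-negative")
    ceilings = [ceiling for ceiling, _ in AWS_BREAKPOINTS]
    # bisect_left finds the first ceiling >= host_count ("up to N hosts").
    index = bisect_left(ceilings, host_count)
    if index == len(ceilings):
        raise ValueError("host_count exceeds the largest documented breakpoint")
    return AWS_BREAKPOINTS[index][1]


if __name__ == "__main__":
    print(aws_sizing(20_000)["mysql"])
```

A host count that falls exactly on a ceiling (for example 25000) maps to that tier, matching the "up to" wording of the headings. Counts above 300000 raise an error rather than extrapolate, since the tables give no guidance beyond that point.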
AWS reference architecture can be found here. This configuration includes:
- VPC
  - Subnets
    - Public & Private
  - ACLs
  - Security Groups
- ECS as the container orchestrator
  - Fargate for underlying compute
  - Task roles via IAM
- RDS Aurora MySQL 5.7
- Elasticache Redis Engine
- Firehose osquery log destination
  - S3 bucket sync to allow further ingestion/processing
- Monitoring via Cloudwatch alarms
Some AWS services used in the provider reference architecture, such as Firehose, are billed as pay-per-use. This means that osquery scheduled query frequency can directly affect how much these services cost, which is worth keeping in mind when configuring Fleet in AWS.
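To get a feel for how query frequency drives log volume, a back-of-the-envelope estimate can help. The sketch below is an assumption-laden illustration, not a pricing calculator: the function name and all input values are hypothetical, and it assumes every scheduled run emits one result of the average size, which overstates volume for queries that only log changes.

```python
SECONDS_PER_DAY = 86_400


def daily_log_gb(hosts: int, queries_per_host: int,
                 interval_seconds: int, avg_result_bytes: int) -> float:
    """Estimate the log volume (in GB, 10^9 bytes) sent to the log destination per day.

    Assumes each scheduled query runs every interval_seconds on every host
    and produces avg_result_bytes of log output per run.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    runs_per_day = SECONDS_PER_DAY / interval_seconds
    total_bytes = hosts * queries_per_host * runs_per_day * avg_result_bytes
    return total_bytes / 1e9


if __name__ == "__main__":
    # Hypothetical fleet: 1000 hosts, 10 scheduled queries, 2000-byte results.
    hourly = daily_log_gb(1000, 10, 3600, 2000)
    half_hourly = daily_log_gb(1000, 10, 1800, 2000)
    print(f"hourly schedule:      {hourly:.2f} GB/day")
    print(f"half-hourly schedule: {half_hourly:.2f} GB/day")
```

Under these assumptions, halving the interval doubles the estimated volume. Multiplying the result by the current per-GB rate from the provider's pricing page gives a rough daily cost for that service.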
AWS Terraform CI/CD IAM permissions
The following permissions are the minimum required to apply AWS terraform resources:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:*",
        "cloudwatch:*",
        "s3:*",
        "lambda:*",
        "ecs:*",
        "rds:*",
        "rds-data:*",
        "secretsmanager:*",
        "pi:*",
        "ecr:*",
        "iam:*",
        "aps:*",
        "vpc:*",
        "kms:*",
        "elasticloadbalancing:*",
        "ce:*",
        "cur:*",
        "logs:*",
        "cloudformation:*",
        "ssm:*",
        "sns:*",
        "elasticache:*",
        "application-autoscaling:*",
        "acm:*",
        "route53:*",
        "dynamodb:*",
        "kinesis:*",
        "firehose:*"
      ],
      "Resource": "*"
    }
  ]
}
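Because the policy above grants service-wide wildcards, it can be useful to list which services are covered before deciding whether to scope any of them down. The following is a small sketch for that purpose: the helper name `wildcard_services` is hypothetical, and the sample policy is a shortened stand-in rather than the full document above.

```python
import json


def wildcard_services(policy_json: str) -> list[str]:
    """Return the sorted service prefixes granted `service:*` by Allow statements."""
    policy = json.loads(policy_json)
    statements = policy["Statement"]
    # IAM allows a single statement object as well as a list of statements.
    if isinstance(statements, dict):
        statements = [statements]
    services = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        # "Action" may be a single string or a list of strings.
        if isinstance(actions, str):
            actions = [actions]
        for action in actions:
            service, _, name = action.partition(":")
            if name == "*":
                services.add(service)
    return sorted(services)


# Shortened, illustrative policy; paste the full policy to audit it instead.
SAMPLE_POLICY = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["ecs:*", "rds:*", "s3:GetObject"],
      "Resource": "*"
    }
  ]
}
"""

if __name__ == "__main__":
    print(wildcard_services(SAMPLE_POLICY))  # ['ecs', 'rds']
```

Actions that name a specific operation, such as `s3:GetObject` in the sample, are skipped, so the output lists only the services with full wildcard access.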
GCP
GCP reference architecture can be found in the Fleet repository. This configuration includes:
- Cloud Run (Fleet backend)
- Cloud SQL MySQL 5.7 (Fleet database)
- Memorystore Redis (Fleet cache & live query orchestrator)
Example configuration breakpoints
Up to 1000 hosts
Fleet instances | CPU | RAM |
---|---|---|
2 Cloud Run | 1 | 2GB |
Dependencies | Version | Instance type |
---|---|---|
Redis | MemoryStore Redis 6 | M1 Basic |
MySQL | Cloud SQL for MySQL 5.7 | db-standard-1 |
Up to 25000 hosts
Fleet instances | CPU | RAM |
---|---|---|
10 Cloud Run | 1 | 2GB |
Dependencies | Version | Instance type |
---|---|---|
Redis | MemoryStore Redis 6 | M1 2GB |
MySQL | Cloud SQL for MySQL 5.7 | db-standard-4 |
Up to 150000 hosts
Fleet instances | CPU | RAM |
---|---|---|
30 Cloud Run | 1 | 2GB |
Dependencies | Version | Instance type | Nodes |
---|---|---|---|
Redis | MemoryStore Redis 6 | M1 4GB | 1 |
MySQL | Cloud SQL for MySQL 5.7 | db-highmem-16 | 1 |
Azure
Coming soon
Render
Using Render's infrastructure as code (IaC); see the repository for full details:
services:
  - name: fleet
    plan: standard
    type: web
    env: docker
    healthCheckPath: /healthz
    envVars:
      - key: FLEET_MYSQL_ADDRESS
        fromService:
          name: fleet-mysql
          type: pserv
          property: hostport
      - key: FLEET_MYSQL_DATABASE
        fromService:
          name: fleet-mysql
          type: pserv
          envVarKey: MYSQL_DATABASE
      - key: FLEET_MYSQL_PASSWORD
        fromService:
          name: fleet-mysql
          type: pserv
          envVarKey: MYSQL_PASSWORD
      - key: FLEET_MYSQL_USERNAME
        fromService:
          name: fleet-mysql
          type: pserv
          envVarKey: MYSQL_USER
      - key: FLEET_REDIS_ADDRESS
        fromService:
          name: fleet-redis
          type: pserv
          property: hostport
      - key: FLEET_SERVER_TLS
        value: false
      - key: PORT
        value: 8080
  - name: fleet-mysql
    type: pserv
    env: docker
    repo: https://github.com/render-examples/mysql
    branch: mysql-5
    disk:
      name: mysql
      mountPath: /var/lib/mysql
      sizeGB: 10
    envVars:
      - key: MYSQL_DATABASE
        value: fleet
      - key: MYSQL_PASSWORD
        generateValue: true
      - key: MYSQL_ROOT_PASSWORD
        generateValue: true
      - key: MYSQL_USER
        value: fleet
  - name: fleet-redis
    type: pserv
    env: docker
    repo: https://github.com/render-examples/redis
    disk:
      name: redis
      mountPath: /var/lib/redis
      sizeGB: 10
Digital Ocean
Using Digital Ocean's App Spec to deploy on the App Platform:
alerts:
- rule: DEPLOYMENT_FAILED
- rule: DOMAIN_FAILED
databases:
- cluster_name: fleet-redis
  engine: REDIS
  name: fleet-redis
  production: true
  version: "6"
- cluster_name: fleet-mysql
  db_name: fleet
  db_user: fleet
  engine: MYSQL
  name: fleet-mysql
  production: true
  version: "8"
domains:
- domain: demo.fleetdm.com
  type: PRIMARY
envs:
- key: FLEET_MYSQL_ADDRESS
  scope: RUN_TIME
  value: ${fleet-mysql.HOSTNAME}:${fleet-mysql.PORT}
- key: FLEET_MYSQL_PASSWORD
  scope: RUN_TIME
  value: ${fleet-mysql.PASSWORD}
- key: FLEET_MYSQL_USERNAME
  scope: RUN_TIME
  value: ${fleet-mysql.USERNAME}
- key: FLEET_MYSQL_DATABASE
  scope: RUN_TIME
  value: ${fleet-mysql.DATABASE}
- key: FLEET_REDIS_ADDRESS
  scope: RUN_TIME
  value: ${fleet-redis.HOSTNAME}:${fleet-redis.PORT}
- key: FLEET_SERVER_TLS
  scope: RUN_AND_BUILD_TIME
  value: "false"
- key: FLEET_REDIS_PASSWORD
  scope: RUN_AND_BUILD_TIME
  value: ${fleet-redis.PASSWORD}
- key: FLEET_REDIS_USE_TLS
  scope: RUN_AND_BUILD_TIME
  value: "true"
jobs:
- envs:
  - key: DATABASE_URL
    scope: RUN_TIME
    value: ${fleet-redis.DATABASE_URL}
  image:
    registry: fleetdm
    registry_type: DOCKER_HUB
    repository: fleet
    tag: latest
  instance_count: 1
  instance_size_slug: basic-xs
  kind: PRE_DEPLOY
  name: fleet-migrate
  run_command: fleet prepare --no-prompt=true db
  source_dir: /
name: fleet
region: nyc
services:
- envs:
  - key: FLEET_VULNERABILITIES_DATABASES_PATH
    scope: RUN_TIME
    value: /home/fleet
  - key: FLEET_BETA_SOFTWARE_INVENTORY
    scope: RUN_TIME
    value: "1"
  health_check:
    http_path: /healthz
  http_port: 8080
  image:
    registry: fleetdm
    registry_type: DOCKER_HUB
    repository: fleet
    tag: latest
  instance_count: 1
  instance_size_slug: basic-xs
  name: fleet
  routes:
  - path: /
  run_command: fleet serve
  source_dir: /