Local - lite load testing and monitoring
Quick Start Commands
Step-1: (Skip if you are using Linux running Docker Engine)
Colima Setup: 10 CPU / 18GB
# Start colima with 10 cpu and 18gb memory
colima start --cpu 10 --memory 18
# Check the profile status and allocated resources
colima list
# TERMINAL OUTPUT -
PROFILE STATUS ARCH CPUS MEMORY DISK RUNTIME ADDRESS
default Running aarch64 10 18GiB 100GiB docker
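The check above can also be scripted. The following is a minimal sketch that parses the sample `colima list` output shown in this section and confirms the profile is running with at least 10 CPUs and 18GiB of memory. The function names `parse_colima_list` and `meets_requirements` are hypothetical helpers, and the parser assumes the whitespace-separated layout shown above; real output may differ between Colima versions.

```python
# Sample `colima list` output copied from the section above.
SAMPLE_OUTPUT = """\
PROFILE STATUS ARCH CPUS MEMORY DISK RUNTIME ADDRESS
default Running aarch64 10 18GiB 100GiB docker
"""


def parse_colima_list(text):
    """Parse whitespace-separated `colima list` output into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = [name.lower() for name in lines[0].split()]
    # zip() stops at the shorter sequence, so an empty ADDRESS column is omitted.
    return [dict(zip(header, line.split())) for line in lines[1:]]


def meets_requirements(profile, min_cpus=10, min_memory_gib=18):
    """Return True if a running profile has the CPU and memory this guide assumes."""
    memory = profile["memory"]
    if not memory.endswith("GiB"):
        # Assumption: memory is reported in GiB, as in the sample output.
        return False
    memory_gib = float(memory[: -len("GiB")])
    return (
        profile["status"] == "Running"
        and int(profile["cpus"]) >= min_cpus
        and memory_gib >= min_memory_gib
    )


profiles = parse_colima_list(SAMPLE_OUTPUT)
print(meets_requirements(profiles[0]))
```

To run it against a live profile, replace `SAMPLE_OUTPUT` with the captured output of `colima list`.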
Step-2: Application setup, load testing and monitoring setup
# Create production docker image
make docker-prod
# Note: Refer to docker-compose.override.lite.yml for CPU and memory allocation
# Run lite replicas (uses docker compose)
make compose-lite-down && make compose-lite-up
# Access the gateway via proxy - http://localhost:8080
# Run lite monitoring (Not running pgadmin and redis commander)
make monitoring-lite-down && make monitoring-lite-up
# Access monitoring - http://localhost:3000
# Start load test UI (Runs on local host)
make load-test-ui
# Access locust - http://localhost:8089 and run the test
# Concurrency = 800
# Ramp up = 100
# Run time = 10m
# Uncheck FastTestEchoUser & FastTestTimeUser
Note: The lite commands (make compose-lite-up/down and make monitoring-lite-up/down) apply the docker-compose.override.lite.yml override file for reduced resource limits.
Gateway Lite Service
| Aspect | Original (docker-compose.yml) | Lite (docker-compose.override.lite.yml) | Reason |
|---|---|---|---|
| Replicas | 3 | 2 | Reduce resource footprint for Colima |
| CPU Limit | 8 CPUs per replica | 3.5 CPUs per replica | More reasonable for 10 CPU host |
| CPU Reservation | 4 CPUs per replica | 2 CPUs per replica | Allows flexible burst up to limit |
| Memory Limit | 8GB per replica | 3GB per replica | Colima memory efficiency |
| Memory Reservation | 4GB per replica | 2.5GB per replica | Leaves headroom for other services |
| Total Gateway Only (limits) | 24 CPUs / 24GB | 7 CPUs / 6GB | 71% CPU reduction, 75% memory reduction |
| Total Gateway Only (reservations) | 12 CPUs / 12GB | 4 CPUs / 5GB | |
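The totals and reduction percentages in the table follow from multiplying the per-replica figures by the replica count. A minimal sketch of that arithmetic, using only the values from the table (the dictionary keys and function names are illustrative, not taken from the compose files):

```python
# Per-replica gateway figures copied from the table above.
ORIGINAL = {"replicas": 3, "cpu_limit": 8, "mem_limit_gb": 8,
            "cpu_reservation": 4, "mem_reservation_gb": 4}
LITE = {"replicas": 2, "cpu_limit": 3.5, "mem_limit_gb": 3,
        "cpu_reservation": 2, "mem_reservation_gb": 2.5}


def totals(profile):
    """Multiply every per-replica figure by the replica count."""
    n = profile["replicas"]
    return {key: n * value for key, value in profile.items() if key != "replicas"}


def reduction_pct(original, lite):
    """Percentage by which `lite` is smaller than `original`."""
    return (1 - lite / original) * 100


original_totals = totals(ORIGINAL)
lite_totals = totals(LITE)

for key in original_totals:
    pct = reduction_pct(original_totals[key], lite_totals[key])
    print(f"{key}: {original_totals[key]} -> {lite_totals[key]} ({pct:.0f}% reduction)")
```

The CPU limit reduction works out to about 70.8%, which the table rounds to 71%.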
HTTP Server Selection
Both the original and lite configurations use Gunicorn (battle-tested, lower memory than Granian)
- 24 workers per replica (matches CPU cores)
- Connection pooling via PgBouncer
Database Layer (PostgreSQL + PgBouncer)
| Service | CPU Limit | Memory Limit | Reservation CPU | Reservation Memory |
|---|---|---|---|---|
| PostgreSQL | 2 CPUs | 4GB | 1 CPU | 2GB |
| PgBouncer | 1 CPU | 256MB | 0.5 CPU | 128MB |
No changes between the original and lite configurations: both are tuned for 3000 concurrent users.
Cache (Redis)
| Aspect | Original | Lite |
|---|---|---|
| CPU Limit | 2 CPUs | 1.5 CPUs |
| Memory Limit | 2GB | 1.5GB |
| CPU Reservation | 1 CPU | 0.75 CPU |
| Memory Reservation | 1GB | 0.75GB |
Resource Allocation: Old vs New
Understanding Limits vs Reservations
- Limits = Hard ceiling per container (Docker throttles CPU at the limit). The sum of limits can exceed the available CPU.
- Reservations = Soft guarantee. The sum of reservations must fit within the available CPU.
- Actual Usage = Real consumption (typically 30-50% of limits)
Safety: The sum of CPU limits can exceed 10 CPUs because containers rarely hit their limits at the same time. When a container reaches its CPU limit, Docker throttles it rather than killing it.
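The distinction can be illustrated with the lite figures from this page: the CPU limits add up to more than the 10 CPU host, while the reservations fit. This is a sketch of the arithmetic only. The Nginx CPU limit is not stated on this page, so it is left out of the limit sum rather than guessed.

```python
HOST_CPUS = 10  # Colima VM size used in this guide

# (cpu_limit, cpu_reservation) per service, totalled across replicas.
# Figures come from the tables on this page. The Nginx CPU limit is not
# stated here, so it is recorded as None and excluded from the limit sum.
LITE_SERVICES = {
    "nginx": (None, 0.5),
    "gateway (2 replicas)": (2 * 3.5, 2 * 2.0),
    "postgresql": (2.0, 1.0),
    "pgbouncer": (1.0, 0.5),
    "redis": (1.5, 0.75),
}

known_limits = sum(limit for limit, _ in LITE_SERVICES.values() if limit is not None)
total_reserved = sum(reserved for _, reserved in LITE_SERVICES.values())

print(f"sum of known CPU limits: {known_limits}")
print(f"sum of CPU reservations: {total_reserved}")
print(f"limits exceed host CPUs: {known_limits > HOST_CPUS}")
print(f"reservations fit on host: {total_reserved <= HOST_CPUS}")
```

Even without Nginx, the limits total 11.5 CPUs against a 10 CPU host, while the reservations total 6.75 CPUs.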
Safety Verdict ✅
| Config | CPU Reserved | Memory Reserved | Status |
|---|---|---|---|
| compose-lite-up | 6.75 CPU (67.5%) | 8.125GB (45%) | ✅ SAFE |
| + monitoring-lite | 8.75 CPU (87.5%) | 9.625GB (53%) | ✅ SAFE |
| Actual peak load | 4.6 CPU (46%) | 5.25GB (29%) | ✅ Good headroom |
Original compose-up: 12 CPU reserved (120% of 10 CPU) ❌ Won't schedule
Resource Breakdown (Verified Against docker-compose.override.lite.yml)
compose-lite-up (core only):
Nginx: 0.5 CPU / 256MB reservation
Gateway (2×): 4.0 CPU / 5GB reservation
PostgreSQL: 1.0 CPU / 2GB reservation
PgBouncer: 0.5 CPU / 128MB reservation
Redis: 0.75 CPU / 0.75GB reservation
─────────────────────────────────────
Total: 6.75 CPU / 8.125GB ✅
+ monitoring-lite (10 services):
prometheus: 0.25 CPU / 256MB
grafana: 0.25 CPU / 256MB
loki: 0.25 CPU / 256MB
tempo: 0.25 CPU / 256MB
promtail: 0.25 CPU / 128MB
cadvisor: 0.25 CPU / 128MB
postgres_exporter: 0.125 CPU / 64MB
redis_exporter: 0.125 CPU / 64MB
pgbouncer_exporter: 0.125 CPU / 64MB
nginx_exporter: 0.125 CPU / 64MB
─────────────────────────────────────
Additional: 2.0 CPU / 1.5GB
Grand Total: 8.75 CPU / 9.625GB ✅
(87.5% utilization, 1.25 CPU headroom)
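The totals in the breakdown can be reproduced by summing the per-service reservations. The sketch below copies the figures from the breakdown and assumes 1GB = 1024MB, which is the conversion the page's own totals imply (256MB counted as 0.25GB). The dictionary and function names are illustrative.

```python
HOST_CPUS = 10
HOST_MEMORY_GB = 18
MB_PER_GB = 1024  # assumption: the page's totals treat 256MB as 0.25GB

# (cpu, memory_mb) reservations copied from the breakdown above.
CORE = {
    "nginx": (0.5, 256),
    "gateway_x2": (4.0, 5 * MB_PER_GB),
    "postgresql": (1.0, 2 * MB_PER_GB),
    "pgbouncer": (0.5, 128),
    "redis": (0.75, 0.75 * MB_PER_GB),
}
MONITORING = {
    "prometheus": (0.25, 256),
    "grafana": (0.25, 256),
    "loki": (0.25, 256),
    "tempo": (0.25, 256),
    "promtail": (0.25, 128),
    "cadvisor": (0.25, 128),
    "postgres_exporter": (0.125, 64),
    "redis_exporter": (0.125, 64),
    "pgbouncer_exporter": (0.125, 64),
    "nginx_exporter": (0.125, 64),
}


def totals(services):
    """Return (total CPU, total memory in GB) for a service map."""
    cpu = sum(cpu for cpu, _ in services.values())
    memory_gb = sum(mem for _, mem in services.values()) / MB_PER_GB
    return cpu, memory_gb


core_cpu, core_mem = totals(CORE)
mon_cpu, mon_mem = totals(MONITORING)
grand_cpu, grand_mem = core_cpu + mon_cpu, core_mem + mon_mem

print(f"core:       {core_cpu} CPU / {core_mem} GB")
print(f"monitoring: {mon_cpu} CPU / {mon_mem} GB")
print(f"grand:      {grand_cpu} CPU / {grand_mem} GB")
print(f"CPU reserved: {grand_cpu / HOST_CPUS:.1%}, headroom {HOST_CPUS - grand_cpu} CPU")
print(f"memory reserved: {grand_mem / HOST_MEMORY_GB:.1%}")
```

If a reservation in docker-compose.override.lite.yml changes, updating the matching entry here shows whether the stack still fits on the 10 CPU / 18GB host.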
make compose-lite-up: Services
Included:
- ✅ Nginx (caching proxy)
- ✅ Gateway (2 replicas)
- ✅ PostgreSQL 18
- ✅ PgBouncer (connection pooler)
- ✅ Redis (cache)
Use Case: Fast, lightweight local load testing without observability overhead
make monitoring-lite-up: Services
Included:
- ✅ Prometheus (metrics collection)
- ✅ Grafana (dashboards)
- ✅ Loki (log aggregation)
- ✅ Tempo (distributed tracing)
- ✅ Promtail (log shipper)
- ✅ postgres_exporter (DB metrics)
- ✅ redis_exporter (cache metrics)
- ✅ pgbouncer_exporter (pool metrics)
- ✅ nginx_exporter (proxy metrics)
- ✅ cAdvisor (container metrics)
Excluded (vs. monitoring profile):
- ❌ pgAdmin (0.5 CPU / 256MB) - Disabled for Colima
- ❌ redis_commander (0.25 CPU / 128MB) - Disabled for Colima
Resource Savings: 0.75 CPU / 384MB by excluding admin tools
Use Case: Full observability (metrics, logs, traces) without admin UIs for local load testing.
Original Profiles (Not for 10 CPU Colima / Docker Engine)
| Profile | Resources | Issue |
|---|---|---|
| compose-up | 12 CPU / 12GB reserved | 120% of 10 CPU - won't schedule |
| monitoring-up | + pgAdmin, redis_commander | Further overprovisioned |
Performance Characteristics & Actual Usage
Throughput Capacity
| Configuration | RPS | Users | Failure Rate |
|---|---|---|---|
| compose-lite (2 replicas) | 400-500 | 800 | ~0% |
Note: Database (PostgreSQL max_connections=800) is the bottleneck, not CPU.
Load test summary (800 users, ramp-up 100, 10 minutes)
Overall Metrics:
- Total Requests: 266,151
- Total Failures: 2 (0.00%)
- Requests/sec (RPS): 444.86
Response Times (ms):
- Average: 25.60
- Min: 0.92
- Max: 30011.87
- Median (p50): 8.00
- p90: 26.00
- p95: 60.00
- p99: 220.00
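The summary figures are internally consistent, which a short calculation can confirm. The sketch below uses only the numbers reported above and derives the unrounded failure rate and the run duration implied by the request count and RPS. The variable names are illustrative.

```python
# Figures copied from the load test summary above.
total_requests = 266_151
total_failures = 2
requests_per_second = 444.86

# Unrounded failure rate; the report shows it rounded to two decimals (0.00%).
failure_rate_pct = total_failures / total_requests * 100

# Duration implied by the request count and the reported RPS.
implied_duration_s = total_requests / requests_per_second

print(f"failure rate: {failure_rate_pct:.5f}%")
print(f"implied duration: {implied_duration_s:.0f}s ({implied_duration_s / 60:.2f} min)")
```

The implied duration is just under 600 seconds, which matches the 10 minute run time. The average (25.60 ms) sits well above the median (8.00 ms), which is consistent with a small number of very slow requests such as the 30011.87 ms maximum.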
[Figure: Lite local load test]
Monitoring-Lite vs Full Monitoring Comparison
| Feature | monitoring-lite | Full monitoring | Reason |
|---|---|---|---|
| Prometheus | ✅ | ✅ | Metrics collection (both have) |
| Grafana | ✅ | ✅ | Dashboards (both have) |
| Loki | ✅ | ✅ | Log aggregation (both have) |
| Tempo | ✅ | ✅ | Distributed tracing (both have) |
| Promtail | ✅ | ✅ | Log shipper (both have) |
| cAdvisor | ✅ | ✅ | Container metrics (both have) |
| postgres_exporter | ✅ | ✅ | DB metrics (both have) |
| redis_exporter | ✅ | ✅ | Cache metrics (both have) |
| pgbouncer_exporter | ✅ | ✅ | Pool metrics (both have) |
| nginx_exporter | ✅ | ✅ | Proxy metrics (both have) |
| pgAdmin | ❌ | ✅ | Admin UI - excluded from lite for resource saving |
| redis_commander | ❌ | ✅ | Admin UI - excluded from lite for resource saving |
When to use each:
- make monitoring-lite-up - Local load testing on Colima/Docker Desktop (same observability stack, just excludes admin UIs)
- make monitoring-up - Full monitoring including pgAdmin and redis_commander admin UIs (requires full docker-compose.yml)
