Timeouts Configuration
Timeouts control how long Go Overlay waits for various operations to complete. Proper timeout configuration ensures graceful shutdowns while preventing indefinite hangs.
Timeout Configuration Section
Timeouts are configured in the [timeouts] section of your configuration file:
[timeouts]
post_script_timeout = 30
service_shutdown_timeout = 30
global_shutdown_timeout = 60
dependency_wait_timeout = 300
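As a rough illustration of how the four keys and their documented defaults fit together, the sketch below merges a parsed [timeouts] table over the defaults listed on this page. The function name `resolve_timeouts` and the validation rules (rejecting unknown keys and non-positive values) are assumptions made for this example, not a description of how Go Overlay itself parses the file.

```python
# Defaults as documented on this page (all values in seconds).
DEFAULT_TIMEOUTS = {
    "post_script_timeout": 30,
    "service_shutdown_timeout": 30,
    "global_shutdown_timeout": 60,
    "dependency_wait_timeout": 300,
}


def resolve_timeouts(overrides):
    """Merge a parsed [timeouts] table (a dict) over the documented defaults.

    The strictness here is an assumption of this sketch, not Go Overlay's
    documented behavior.
    """
    unknown = set(overrides) - set(DEFAULT_TIMEOUTS)
    if unknown:
        raise ValueError(f"unknown timeout keys: {sorted(unknown)}")
    resolved = {**DEFAULT_TIMEOUTS, **overrides}
    for key, value in resolved.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{key} must be a positive integer (seconds)")
    return resolved


# Only one key is overridden; the other three fall back to the defaults.
print(resolve_timeouts({"service_shutdown_timeout": 60}))
```

A check like this can be handy in a deployment script to catch a misspelled key before the configuration reaches a container.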
Timeout Options
post_script_timeout
Type: integer (seconds)
Default: 30
Maximum time to wait for a service's pos_script to complete after the service starts.
When it applies:
- After a service process starts successfully
- Only if the service has a pos_script configured
What happens on timeout:
- The pos_script process is terminated
- The service continues running
- A warning is logged
Post-Script Timeout
If your post-scripts perform health checks or critical initialization, ensure this timeout is long enough. A timeout doesn't stop the service itself, only the post-script.
Example use case:
[[services]]
name = "database"
command = "/usr/bin/mysqld"
pos_script = "/scripts/verify-db-ready.sh" # May take 20 seconds
[timeouts]
post_script_timeout = 30 # Allow enough time for verification
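The post-script behavior described above (the script is terminated, the service keeps running, a warning is logged) can be sketched with Python's standard library. This is an illustration of the pattern only, not Go Overlay's implementation; `run_post_script` is a hypothetical helper, and the "scripts" are stand-ins built from the current Python interpreter.

```python
import subprocess
import sys


def run_post_script(command, timeout_seconds):
    """Run a post-script; return True if it finished in time, False on timeout."""
    try:
        subprocess.run(command, timeout=timeout_seconds, check=False)
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so only the
        # script is stopped; a separately started service is untouched.
        print(f"warning: post-script exceeded {timeout_seconds}s and was terminated")
        return False


# Stand-in "scripts": one that exits at once, one that outlives its timeout.
quick = [sys.executable, "-c", "pass"]
slow = [sys.executable, "-c", "import time; time.sleep(5)"]
print(run_post_script(quick, 5))
print(run_post_script(slow, 0.5))
```

The key design point is that the timeout is scoped to the script alone, which is why a slow verification script degrades to a warning rather than taking the service down.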
service_shutdown_timeout
Type: integer (seconds)
Default: 30
Maximum time to wait for an individual service to shut down gracefully after receiving SIGTERM.
When it applies:
- During system shutdown
- When stopping a service via CLI
- When a required service fails
What happens on timeout:
- SIGKILL is sent to force termination
- Service is marked as stopped
- Shutdown process continues
Forced Termination
After this timeout expires, the service is forcefully killed with SIGKILL. This may result in data loss or corruption for services that need time to flush buffers or complete transactions.
Example use case:
[[services]]
name = "database"
command = "/usr/bin/postgres"
# Database needs time to flush writes and close connections
[timeouts]
service_shutdown_timeout = 60 # Give the database more time to shut down cleanly
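The SIGTERM-then-SIGKILL escalation governed by service_shutdown_timeout follows a common pattern, sketched below with Python's subprocess module. It assumes a POSIX system, where `Popen.terminate()` sends SIGTERM and `Popen.kill()` sends SIGKILL; `stop_service` and the stand-in child process are hypothetical names for this example and do not reflect Go Overlay's source.

```python
import subprocess
import sys


def stop_service(proc, timeout_seconds):
    """Ask a process to exit with SIGTERM, escalating to SIGKILL on timeout."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=timeout_seconds)
        return "graceful"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; cannot be caught or ignored
        proc.wait()  # reap the process so it does not linger as a zombie
        return "killed"


# A stand-in service that ignores SIGTERM, forcing the escalation path.
stubborn_code = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(30)\n"
)
stubborn = subprocess.Popen(
    [sys.executable, "-c", stubborn_code], stdout=subprocess.PIPE
)
stubborn.stdout.readline()  # wait until the child has installed its handler
print(stop_service(stubborn, 0.5))
```

A service that handles SIGTERM and exits within the timeout takes the "graceful" branch, which is the outcome a longer service_shutdown_timeout is meant to protect.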
global_shutdown_timeout
Type: integer (seconds)
Default: 60
Maximum time for the entire system to complete shutdown of all services.
When it applies:
- During system shutdown (SIGTERM/SIGINT to Go Overlay)
- When a required service fails
- When shutdown is initiated via CLI
What happens on timeout:
- All remaining services are forcefully killed with SIGKILL
- Go Overlay exits immediately
- May result in ungraceful termination
System-Wide Timeout
This timeout applies to the entire shutdown sequence. If you have many services or services with long shutdown times, increase this value accordingly.
Calculation guideline:
global_shutdown_timeout >= (service_shutdown_timeout * number_of_services) + buffer
Example use case:
# System with 5 services, each needing up to 30 seconds
[timeouts]
service_shutdown_timeout = 30
global_shutdown_timeout = 180 # 5 * 30 + 30 second buffer
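The arithmetic in the example above (5 * 30 + 30) can be captured in a small helper when the number of services changes often. `recommended_global_timeout` and its 30-second default buffer are illustrative choices taken from that example, not a Go Overlay API; the formula is a worst-case estimate that assumes services are stopped one after another.

```python
def recommended_global_timeout(service_count, service_shutdown_timeout, buffer_seconds=30):
    """Worst-case total if every service uses its full timeout, plus a buffer."""
    if service_count < 0 or service_shutdown_timeout < 0 or buffer_seconds < 0:
        raise ValueError("inputs must be non-negative")
    return service_count * service_shutdown_timeout + buffer_seconds


print(recommended_global_timeout(5, 30))  # 5 * 30 + 30 = 180
```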
dependency_wait_timeout
Type: integer (seconds)
Default: 300
Maximum time to wait for service dependencies to become ready during startup.
When it applies:
- During system startup
- When a service has depends_on configured
- Waiting for dependent services to reach RUNNING state
What happens on timeout:
- The waiting service fails to start
- An error is logged
- System may shut down if the service is marked as required
Dependency Chains
For complex dependency chains, ensure this timeout accounts for the cumulative startup time of all dependencies plus their wait_after delays.
Example use case:
[[services]]
name = "database"
command = "/usr/bin/postgres"
wait_after = 10 # Needs 10 seconds to initialize
[[services]]
name = "app"
command = "/app/server"
depends_on = ["database"]
[timeouts]
dependency_wait_timeout = 60 # Allow time for database startup + wait_after
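To judge whether a dependency_wait_timeout is large enough for a chain, it can help to add up the estimated startup time and wait_after delay along the slowest path of dependencies. The sketch below does that over a plain dictionary. Note the assumptions: `startup_estimate` is a hypothetical, user-supplied guess and not a Go Overlay option, independent dependencies are assumed to start in parallel, and all function names are invented for this example.

```python
def ready_time(services, name, path=()):
    """Estimated seconds until `name` is ready, including its own dependencies."""
    if name in path:
        raise ValueError(f"dependency cycle involving {name!r}")
    svc = services[name]
    deps = svc.get("depends_on", [])
    # Assumption: independent dependencies start in parallel, so only the
    # slowest branch counts, not the sum of all branches.
    slowest = max((ready_time(services, d, path + (name,)) for d in deps), default=0)
    return slowest + svc.get("startup_estimate", 0) + svc.get("wait_after", 0)


def dependency_wait_needed(services, name):
    """Estimated seconds `name` spends waiting before it can start."""
    deps = services[name].get("depends_on", [])
    return max((ready_time(services, d, (name,)) for d in deps), default=0)


# Mirrors the example above, with a guessed 5-second database startup.
services = {
    "database": {"startup_estimate": 5, "wait_after": 10},
    "app": {"depends_on": ["database"]},
}
needed = dependency_wait_needed(services, "app")
print(needed, needed <= 60)
```

Comparing the estimate against the configured timeout gives a quick margin check; real startup times vary, so leave generous headroom.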
Timeout Reference Table
| Timeout | Default | Unit | Applies To | On Timeout |
|---|---|---|---|---|
| post_script_timeout | 30 | seconds | Post-start scripts | Script terminated, service continues |
| service_shutdown_timeout | 30 | seconds | Individual service shutdown | SIGKILL sent to service |
| global_shutdown_timeout | 60 | seconds | Complete system shutdown | All services killed with SIGKILL |
| dependency_wait_timeout | 300 | seconds | Waiting for dependencies | Service fails to start |
Complete Configuration Example
[timeouts]
# Post-script execution timeout
# Increase if health checks or verification scripts take longer
post_script_timeout = 45
# Individual service shutdown timeout
# Increase for databases or services that need time to flush data
service_shutdown_timeout = 60
# Global system shutdown timeout
# Should be larger than service_shutdown_timeout * number_of_services
global_shutdown_timeout = 180
# Dependency wait timeout
# Increase for complex dependency chains or slow-starting services
dependency_wait_timeout = 300
Timeout Scenarios
Scenario 1: Fast Microservices
For lightweight services that start and stop quickly:
[timeouts]
post_script_timeout = 10
service_shutdown_timeout = 15
global_shutdown_timeout = 30
dependency_wait_timeout = 60
Scenario 2: Database-Heavy Stack
For systems with databases that need time for graceful shutdown:
[timeouts]
post_script_timeout = 30
service_shutdown_timeout = 90 # Databases need time to flush
global_shutdown_timeout = 300 # Multiple services with long shutdowns
dependency_wait_timeout = 180 # Database initialization can be slow
Scenario 3: Complex Dependency Chain
For systems with many interdependent services:
[timeouts]
post_script_timeout = 30
service_shutdown_timeout = 30
global_shutdown_timeout = 120
dependency_wait_timeout = 600 # Long chain of dependencies
Best Practices
1. Set Realistic Timeouts
Don't set timeouts too short. Consider:
- Service initialization time
- Data flush/commit operations
- Network operations
- Dependency chains
2. Test Shutdown Behavior
Verify your timeouts work correctly:
# Start the system
go-overlay
# In another terminal, trigger shutdown
kill -TERM $(pidof go-overlay)
# Monitor logs to ensure graceful shutdown
3. Monitor Timeout Events
Watch logs for timeout warnings:
- Frequent timeouts indicate values are too low
- Services being killed with SIGKILL suggest insufficient shutdown time
4. Account for Dependencies
When using depends_on, ensure dependency_wait_timeout accounts for:
- Startup time of all dependencies
- Any wait_after delays
- Service initialization time
5. Balance Safety and Speed
- Too short: Risk of data loss, corruption, or incomplete operations
- Too long: Slow shutdown, delayed restarts, poor user experience
Timeout Interactions
Startup Sequence
Service with dependencies starts
↓
Wait for dependencies (dependency_wait_timeout)
↓
Start service process
↓
Run pos_script if configured (post_script_timeout)
↓
Service marked as RUNNING
Shutdown Sequence
Shutdown initiated
↓
Send SIGTERM to all services
↓
Wait for each service (service_shutdown_timeout)
↓
If timeout: Send SIGKILL
↓
All services must stop within (global_shutdown_timeout)
↓
If global timeout: Force kill all remaining services
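To see how the two shutdown timeouts interact, the sequence above can be modeled as a small pure-Python simulation. This is a sketch under a stated assumption, namely that services are waited on one after another (consistent with the service_shutdown_timeout * number_of_services guideline); it is not derived from Go Overlay's source, and `simulate_shutdown` and its outcome labels are hypothetical.

```python
def simulate_shutdown(stop_durations, service_timeout, global_timeout):
    """Model the shutdown flow for services waited on one after another.

    stop_durations maps service name -> seconds it needs after SIGTERM.
    Returns (outcomes, elapsed_seconds).
    """
    outcomes = {}
    elapsed = 0
    for name, needed in stop_durations.items():
        remaining = global_timeout - elapsed
        if remaining <= 0:
            # The global deadline has already passed: no grace period left.
            outcomes[name] = "SIGKILL (global timeout)"
            continue
        budget = min(service_timeout, remaining)
        if needed <= budget:
            outcomes[name] = "graceful"
            elapsed += needed
        elif budget == service_timeout:
            outcomes[name] = "SIGKILL (service timeout)"
            elapsed += budget
        else:
            # The global deadline cut this service's budget short.
            outcomes[name] = "SIGKILL (global timeout)"
            elapsed += budget
    return outcomes, elapsed


outcomes, elapsed = simulate_shutdown(
    {"app": 5, "cache": 40, "database": 28},
    service_timeout=30,
    global_timeout=60,
)
for name, outcome in outcomes.items():
    print(f"{name}: {outcome}")
print(f"elapsed: {elapsed}s")
```

In this run the slow cache exhausts its per-service budget, which leaves the database too little of the global budget to finish. Raising global_shutdown_timeout is what prevents that kind of knock-on kill.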
Troubleshooting
Services Being Killed During Shutdown
Symptom: Services are forcefully terminated with SIGKILL
Solution: Increase service_shutdown_timeout
System Shutdown Takes Too Long
Symptom: Shutdown hangs or takes excessive time
Solution: Decrease timeouts or investigate why services aren't stopping
[timeouts]
service_shutdown_timeout = 20 # Reduced from 30
global_shutdown_timeout = 60 # Reduced from 120
Services Fail to Start Due to Dependencies
Symptom: "dependency wait timeout exceeded" errors
Solution: Increase dependency_wait_timeout or reduce dependency wait_after values
Post-Scripts Being Terminated
Symptom: Post-scripts don't complete, warnings in logs
Solution: Increase post_script_timeout or optimize scripts
Advanced Timeout Patterns
Progressive Timeout Strategy
Use longer timeouts for critical services:
[timeouts]
service_shutdown_timeout = 30 # Default for most services
global_shutdown_timeout = 120 # Allow critical services extra time
[[services]]
name = "cache"
command = "/usr/bin/redis-server"
required = false # Can be killed quickly
[[services]]
name = "database"
command = "/usr/bin/postgres"
required = true # Gets full shutdown timeout
Development vs Production
Use different timeout configurations:
Development (fast iteration):
[timeouts]
post_script_timeout = 10
service_shutdown_timeout = 15
global_shutdown_timeout = 30
dependency_wait_timeout = 60
Production (data safety):
[timeouts]
post_script_timeout = 30
service_shutdown_timeout = 90
global_shutdown_timeout = 300
dependency_wait_timeout = 300
Next Steps
- Learn about service configuration
- Explore graceful shutdown behavior
- Understand dependency management