Timeouts Configuration¶

Timeouts control how long Go Overlay waits for various operations to complete. Proper timeout configuration ensures graceful shutdowns while preventing indefinite hangs.

Timeout Configuration Section¶

Timeouts are configured in the [timeouts] section of your configuration file:

[timeouts]
post_script_timeout = 30
service_shutdown_timeout = 30
global_shutdown_timeout = 60
dependency_wait_timeout = 300

Timeout Options¶

post_script_timeout¶

Type: integer (seconds)
Default: 30

Maximum time to wait for a service's pos_script to complete after the service starts.

[timeouts]
post_script_timeout = 30

When it applies:
- After a service process starts successfully
- Only if the service has a pos_script configured

What happens on timeout:
- The pos_script process is terminated
- The service continues running
- A warning is logged

Post-Script Timeout

If your post-scripts perform health checks or critical initialization, ensure this timeout is long enough. A timeout doesn't stop the service itself, only the post-script.
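
The behavior described above (the script is stopped, a warning is logged, the service keeps running) can be sketched in Python. This is only an illustration of the pattern, not Go Overlay's implementation: the function name `run_post_script` is hypothetical, and the sketch relies on Python's `subprocess.run`, which kills the child process when the timeout expires.

```python
import subprocess
import sys


def run_post_script(command: list[str], timeout_seconds: float) -> bool:
    """Run a post-script and report whether it finished within the timeout.

    Hypothetical helper for illustration. The service process is never
    touched here: only the script is stopped when the timeout expires.
    """
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        subprocess.run(command, timeout=timeout_seconds, check=False)
        return True
    except subprocess.TimeoutExpired:
        print(
            f"warning: post-script exceeded {timeout_seconds}s and was stopped",
            file=sys.stderr,
        )
        return False
```

A script that needs about 20 seconds would pass with `timeout_seconds=30` and be cut off with `timeout_seconds=10`, which is why the timeout should sit comfortably above the script's slowest expected run.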

Example use case:

[[services]]
name = "database"
command = "/usr/bin/mysqld"
pos_script = "/scripts/verify-db-ready.sh"  # May take 20 seconds

[timeouts]
post_script_timeout = 30  # Allow enough time for verification

service_shutdown_timeout¶

Type: integer (seconds)
Default: 30

Maximum time to wait for an individual service to shut down gracefully after receiving SIGTERM.

[timeouts]
service_shutdown_timeout = 30

When it applies:
- During system shutdown
- When stopping a service via CLI
- When a required service fails

What happens on timeout:
- SIGKILL is sent to force termination
- Service is marked as stopped
- Shutdown process continues

Forced Termination

After this timeout expires, the service is forcefully killed with SIGKILL. This may result in data loss or corruption for services that need time to flush buffers or complete transactions.
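
The escalation from SIGTERM to SIGKILL can be sketched as follows. This is a minimal Python illustration of the general pattern on a POSIX system, not Go Overlay's actual code; the function name `stop_with_timeout` and its return values are assumptions made for the example.

```python
import signal
import subprocess
import sys


def stop_with_timeout(proc: subprocess.Popen, shutdown_timeout: float) -> str:
    """Ask a process to exit, then force it if it does not exit in time.

    Hypothetical helper: returns "graceful" if the process exited after
    SIGTERM, or "killed" if SIGKILL was needed.
    """
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=shutdown_timeout)
        return "graceful"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX; cannot be caught or ignored
        proc.wait()  # reap the process so it does not linger as a zombie
        return "killed"


if __name__ == "__main__":
    sleeper = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    print(stop_with_timeout(sleeper, shutdown_timeout=5))
```

A process that handles SIGTERM and exits within the timeout takes the "graceful" path. One that ignores SIGTERM, or is still flushing data when the timeout expires, takes the "killed" path, which is where data loss can occur.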

Example use case:

[[services]]
name = "database"
command = "/usr/bin/postgres"
# Database needs time to flush writes and close connections

[timeouts]
service_shutdown_timeout = 60  # Give database more time to shut down cleanly

global_shutdown_timeout¶

Type: integer (seconds)
Default: 60

Maximum time for the entire system to complete shutdown of all services.

[timeouts]
global_shutdown_timeout = 60

When it applies:
- During system shutdown (SIGTERM/SIGINT to Go Overlay)
- When a required service fails
- When shutdown is initiated via CLI

What happens on timeout:
- All remaining services are forcefully killed with SIGKILL
- Go Overlay exits immediately
- May result in ungraceful termination

System-Wide Timeout

This timeout applies to the entire shutdown sequence. If you have many services or services with long shutdown times, increase this value accordingly.

Calculation guideline:

global_shutdown_timeout >= (number_of_services * service_shutdown_timeout) + buffer

Example use case:

# System with 5 services, each needing up to 30 seconds
[timeouts]
service_shutdown_timeout = 30
global_shutdown_timeout = 180  # 5 * 30 + 30 second buffer
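
The calculation guideline is simple enough to check by hand, but a small helper makes it easy to recompute when services are added. The sketch below is a hypothetical Python helper, not part of Go Overlay; the function name and the 30-second default buffer are assumptions taken from the example above.

```python
def min_global_shutdown_timeout(
    num_services: int,
    service_shutdown_timeout: int,
    buffer: int = 30,
) -> int:
    """Smallest global_shutdown_timeout that satisfies the guideline.

    Implements: (number_of_services * service_shutdown_timeout) + buffer
    """
    if num_services < 0 or service_shutdown_timeout < 0 or buffer < 0:
        raise ValueError("all inputs must be non-negative")
    return num_services * service_shutdown_timeout + buffer


# 5 services, each needing up to 30 seconds, plus a 30 second buffer
print(min_global_shutdown_timeout(5, 30))  # 180
```

The result is a lower bound: any `global_shutdown_timeout` at or above it satisfies the guideline.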

dependency_wait_timeout¶

Type: integer (seconds)
Default: 300

Maximum time to wait for service dependencies to become ready during startup.

[timeouts]
dependency_wait_timeout = 300

When it applies:
- During system startup
- When a service has depends_on configured
- While waiting for its dependencies to reach the RUNNING state

What happens on timeout:
- The waiting service fails to start
- An error is logged
- The system may shut down if the service is marked as required

Dependency Chains

For complex dependency chains, ensure this timeout accounts for the cumulative startup time of all dependencies plus their wait_after delays.

Example use case:

[[services]]
name = "database"
command = "/usr/bin/postgres"
wait_after = 10  # Needs 10 seconds to initialize

[[services]]
name = "app"
command = "/app/server"
depends_on = ["database"]

[timeouts]
dependency_wait_timeout = 60  # Allow time for database startup + wait_after
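
To size this timeout for a longer chain, it can help to add up the delays along the slowest path. The Python sketch below is one way to estimate that, under stated assumptions: it follows this page's description that a dependency's startup time and its `wait_after` both count toward the wait, and the `startup` field is a hypothetical measured value, not a Go Overlay configuration key. The function name is likewise made up for the example.

```python
def dependency_wait_estimate(services: dict, name: str) -> float:
    """Estimate how long `name` waits before all of its dependencies are ready.

    `services` maps a service name to a dict with optional keys:
      depends_on - list of dependency names (as in the config)
      wait_after - delay in seconds (as in the config)
      startup    - hypothetical measured startup time in seconds
    """

    def ready_at(service: str, path: frozenset) -> float:
        if service in path:
            raise ValueError(f"dependency cycle involving {service!r}")
        spec = services[service]
        # A service can only start once its slowest dependency is ready.
        deps_ready = max(
            (ready_at(dep, path | {service}) for dep in spec.get("depends_on", [])),
            default=0,
        )
        return deps_ready + spec.get("startup", 0) + spec.get("wait_after", 0)

    return max(
        (ready_at(dep, frozenset({name})) for dep in services[name].get("depends_on", [])),
        default=0,
    )


stack = {
    "database": {"startup": 5, "wait_after": 10},
    "app": {"depends_on": ["database"]},
}
print(dependency_wait_estimate(stack, "app"))  # 15
```

With an estimated wait of 15 seconds, a `dependency_wait_timeout` of 60 leaves generous headroom. For deeper chains the estimate grows with every link, so the timeout should grow with it.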

Timeout Reference Table¶

Timeout	Default	Unit	Applies To	On Timeout
`post_script_timeout`	30	seconds	Post-start scripts	Script terminated, service continues
`service_shutdown_timeout`	30	seconds	Individual service shutdown	SIGKILL sent to service
`global_shutdown_timeout`	60	seconds	Complete system shutdown	All services killed with SIGKILL
`dependency_wait_timeout`	300	seconds	Waiting for dependencies	Service fails to start
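
When reviewing a configuration, it can be useful to see the effective values once the defaults from this table are filled in. The following Python sketch does that for a `[timeouts]` table that has already been parsed into a dictionary. It is a hypothetical review aid, not part of Go Overlay, and rejecting unknown keys is a choice made for this example (to catch typos), not documented Go Overlay behavior.

```python
DEFAULT_TIMEOUTS = {
    "post_script_timeout": 30,
    "service_shutdown_timeout": 30,
    "global_shutdown_timeout": 60,
    "dependency_wait_timeout": 300,
}


def effective_timeouts(configured: dict) -> dict:
    """Merge configured timeouts over the documented defaults (in seconds)."""
    unknown = sorted(set(configured) - set(DEFAULT_TIMEOUTS))
    if unknown:
        # Assumption of this sketch: treat unrecognized names as mistakes.
        raise KeyError(f"unknown timeout option(s): {unknown}")
    return {**DEFAULT_TIMEOUTS, **configured}


print(effective_timeouts({"service_shutdown_timeout": 60}))
```

In this example only `service_shutdown_timeout` is overridden, so `global_shutdown_timeout` stays at its default of 60, which makes it easy to spot that the global limit may no longer leave room for more than one slow service.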

Complete Configuration Example¶

[timeouts]
# Post-script execution timeout
# Increase if health checks or verification scripts take longer
post_script_timeout = 45

# Individual service shutdown timeout
# Increase for databases or services that need time to flush data
service_shutdown_timeout = 60

# Global system shutdown timeout
# Should be at least service_shutdown_timeout * number_of_services, plus a buffer
global_shutdown_timeout = 180

# Dependency wait timeout
# Increase for complex dependency chains or slow-starting services
dependency_wait_timeout = 300

Timeout Scenarios¶

Scenario 1: Fast Microservices¶

For lightweight services that start and stop quickly:

[timeouts]
post_script_timeout = 10
service_shutdown_timeout = 15
global_shutdown_timeout = 30
dependency_wait_timeout = 60

Scenario 2: Database-Heavy Stack¶

For systems with databases that need time for graceful shutdown:

[timeouts]
post_script_timeout = 30
service_shutdown_timeout = 90  # Databases need time to flush
global_shutdown_timeout = 300  # Multiple services with long shutdowns
dependency_wait_timeout = 180  # Database initialization can be slow

Scenario 3: Complex Dependency Chain¶

For systems with many interdependent services:

[timeouts]
post_script_timeout = 30
service_shutdown_timeout = 30
global_shutdown_timeout = 120
dependency_wait_timeout = 600  # Long chain of dependencies

Best Practices¶

1. Set Realistic Timeouts¶

Don't set timeouts too short. Consider:
- Service initialization time
- Data flush/commit operations
- Network operations
- Dependency chains

2. Test Shutdown Behavior¶

Verify your timeouts work correctly:

# Start the system
go-overlay

# In another terminal, trigger shutdown
kill -TERM $(pidof go-overlay)

# Monitor logs to ensure graceful shutdown

3. Monitor Timeout Events¶

Watch logs for timeout warnings:
- Frequent timeouts indicate values are too low
- Services being killed with SIGKILL suggest insufficient shutdown time

4. Account for Dependencies¶

When using depends_on, ensure dependency_wait_timeout accounts for:
- Startup time of all dependencies
- Any wait_after delays
- Service initialization time

5. Balance Safety and Speed¶

Too short: Risk of data loss, corruption, or incomplete operations
Too long: Slow shutdown, delayed restarts, poor user experience

Timeout Interactions¶

Startup Sequence¶

Service with dependencies starts
    ↓
Wait for dependencies (dependency_wait_timeout)
    ↓
Start service process
    ↓
Run pos_script if configured (post_script_timeout)
    ↓
Service marked as RUNNING

Shutdown Sequence¶

Shutdown initiated
    ↓
Send SIGTERM to all services
    ↓
Wait for each service (service_shutdown_timeout)
    ↓
If timeout: Send SIGKILL
    ↓
All services must stop within (global_shutdown_timeout)
    ↓
If global timeout: Force kill all remaining services

Troubleshooting¶

Services Being Killed During Shutdown¶

Symptom: Services are forcefully terminated with SIGKILL

Solution: Increase service_shutdown_timeout

[timeouts]
service_shutdown_timeout = 60  # Increased from 30

System Shutdown Takes Too Long¶

Symptom: Shutdown hangs or takes excessive time

Solution: Decrease timeouts or investigate why services aren't stopping

[timeouts]
service_shutdown_timeout = 20  # Reduced from 30
global_shutdown_timeout = 60   # Reduced from 120

Services Fail to Start Due to Dependencies¶

Symptom: "dependency wait timeout exceeded" errors

Solution: Increase dependency_wait_timeout or reduce dependency wait_after values

[timeouts]
dependency_wait_timeout = 600  # Increased from 300

Post-Scripts Being Terminated¶

Symptom: Post-scripts don't complete, warnings in logs

Solution: Increase post_script_timeout or optimize scripts

[timeouts]
post_script_timeout = 60  # Increased from 30

Advanced Timeout Patterns¶

Progressive Timeout Strategy¶

Use longer timeouts for critical services:

[timeouts]
service_shutdown_timeout = 30  # Default for most services
global_shutdown_timeout = 120  # Allow critical services extra time

[[services]]
name = "cache"
command = "/usr/bin/redis-server"
required = false  # Can be killed quickly

[[services]]
name = "database"
command = "/usr/bin/postgres"
required = true  # Gets full shutdown timeout

Development vs Production¶

Use different timeout configurations:

Development (fast iteration):

[timeouts]
post_script_timeout = 10
service_shutdown_timeout = 15
global_shutdown_timeout = 30
dependency_wait_timeout = 60

Production (data safety):

[timeouts]
post_script_timeout = 30
service_shutdown_timeout = 90
global_shutdown_timeout = 300
dependency_wait_timeout = 300
