## Summary
A Circuit Breaker is used in service design to stop callers from repeatedly attempting an operation that is likely to fail.
## States
There are three states:
1. **Closed:** Operations are allowed to proceed, with failure counts monitored. Failure count is periodically reset. If failure count hits threshold, state changes to **Open**.
2. **Open:** Operations are rejected for a period of time, after which state changes to **Half-Open**.
3. **Half-Open:** Operations are allowed to proceed, with success counts monitored. If success count hits threshold, state changes to **Closed**. If any error is hit, state changes to **Open**.
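
The transitions above can be written down as a small lookup table. This is only a sketch: the `State` enum, the event names, and `next_state` are illustrative labels chosen here, not part of any particular library.

```python
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half-open"


# Event names are hypothetical labels for the conditions described in the list above.
TRANSITIONS = {
    (State.CLOSED, "failure_threshold_reached"): State.OPEN,
    (State.OPEN, "timeout_elapsed"): State.HALF_OPEN,
    (State.HALF_OPEN, "success_threshold_reached"): State.CLOSED,
    (State.HALF_OPEN, "failure"): State.OPEN,
}


def next_state(current: State, event: str) -> State:
    """Return the state that follows `event`; stay in `current` if no transition is defined."""
    return TRANSITIONS.get((current, event), current)
```

Keeping the transitions in one table makes it easy to check that every state has a way out and that a single failure in **Closed** does not trip the breaker on its own.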
## Logic
### Closed
```python
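# Pseudocode: failure_count and last_failure_timestamp are breaker-wide state;
# perform_operation() is assumed to yield `success` and `result`.
def on_state_activated():
    failure_count = 0
    last_failure_timestamp = None

def on_operation_request():
    # Reset the count once the last failure is old enough.
    if (last_failure_timestamp is not None and
            NOW() - last_failure_timestamp >= RESET_FAILURE_INTERVAL):
        failure_count = 0
    perform_operation()
    if success:
        return result
    else:
        failure_count += 1
        last_failure_timestamp = NOW()
        if failure_count >= FAILURE_THRESHOLD:
            set_state(OPEN)
        return failure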
```
### Open
```python
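def on_state_activated():
    state_activated_timestamp = NOW()
    # Pass a callable so the transition runs when the timer fires, not immediately.
    # Open times out into Half-Open, never straight back to Closed.
    set_timeout(timeout=OPEN_TIMEOUT,
                callback=lambda: set_state(HALF_OPEN))

def on_operation_request():
    # Fallback check in case the timer has not fired yet.
    if NOW() - state_activated_timestamp >= OPEN_TIMEOUT:
        set_state(HALF_OPEN)
        return rerun_perform_operation()
    else:
        return failure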
```
### Half-Open
```python
def on_state_activated():
    success_count = 0

def on_operation_request():
    perform_operation()
    if success:
        success_count += 1
        if success_count >= SUCCESS_THRESHOLD:
            set_state(CLOSED)
        return result
    else:
        # Any failure while probing sends the breaker back to Open.
        set_state(OPEN)
        return failure
```
## Additional Considerations
* Timeouts could use exponential (or similar) backoff.
* As this will often be global, all state should be thread-safe (governed by a lock).
Microsoft:
> The pattern is customizable and can be adapted according to the type of the possible failure. For example, you can apply an increasing timeout timer to a circuit breaker. You could place the circuit breaker in the **Open** state for a few seconds initially, and then if the failure hasn't been resolved increase the timeout to a few minutes, and so on. In some cases, rather than the **Open** state returning failure and raising an exception, it could be useful to return a default value that is meaningful to the application.
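
Both ideas in the quote, an increasing **Open** timeout and a default value in place of an exception, can be sketched in a few lines. The constants and the names `open_timeout_s` and `call_with_fallback` are assumptions made for illustration; the doubling schedule is one possible backoff, not something the quote prescribes.

```python
from typing import Any, Callable

BASE_OPEN_TIMEOUT_S = 5.0    # assumed initial Open duration ("a few seconds")
MAX_OPEN_TIMEOUT_S = 300.0   # assumed cap ("a few minutes")


def open_timeout_s(consecutive_trips: int) -> float:
    """Open duration after the breaker has tripped `consecutive_trips` times in a row (>= 1)."""
    # Double the timeout on each consecutive trip, up to the cap.
    return min(BASE_OPEN_TIMEOUT_S * 2 ** (consecutive_trips - 1), MAX_OPEN_TIMEOUT_S)


def call_with_fallback(operation: Callable[[], Any], default: Any) -> Any:
    """Return the operation's result, or `default` if it raises (for example, while the breaker is open)."""
    try:
        return operation()
    except Exception:
        return default
```

A breaker using this would reset `consecutive_trips` once it returns to **Closed**, so that a recovered dependency is not penalized for old failures.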
## Implementations
### Python
This is exception-based (any exception raised by the handler counts as a failure) and should be tailored for specific circumstances.
```python
import threading
from datetime import datetime, timedelta
from enum import Enum
from typing import Any, Callable, Optional


class CircuitBreakerState(Enum):
    CLOSED = 0
    OPEN = 1
    HALF_OPEN = 2


class CircuitBreaker:
    # Constants (placeholders; set real values before use)
    CLOSED_RESET_FAILURES_INTERVAL_MS = ...
    CLOSED_FAILURE_THRESHOLD = ...
    HALF_OPEN_SUCCESS_THRESHOLD = ...
    OPEN_TIMEOUT_MS = ...

    # Common state (the current state itself lives in self._state)
    state_active_timestamp: datetime
    lock: threading.Lock
    # Closed state
    closed_failure_count: int
    closed_last_failure_timestamp: Optional[datetime]
    # Half-open state
    half_open_success_count: int

    def __init__(self) -> None:
        self.lock = threading.Lock()
        # The setter also initializes the timestamp and the counters.
        self.state = CircuitBreakerState.CLOSED

    @property
    def state(self) -> CircuitBreakerState:
        return self._state

    @state.setter
    def state(
        self,
        new_state: CircuitBreakerState,
    ) -> None:
        # Non-blocking acquire: a plain acquire() would wait instead of bailing.
        if not self.lock.acquire(blocking=False):
            # Another thread is managing state. Bail.
            return
        try:
            # Assign the backing field; assigning self.state here would recurse.
            self._state = new_state
            self.state_active_timestamp = datetime.now()
            self.closed_failure_count = 0
            self.closed_last_failure_timestamp = None
            self.half_open_success_count = 0
        finally:
            self.lock.release()

    def run_operation(
        self,
        handler: Callable[[], Any],
    ) -> Any:
        now = datetime.now()
        if self.state == CircuitBreakerState.CLOSED:
            # The circuit breaker is closed.
            reset_interval = timedelta(
                milliseconds=self.CLOSED_RESET_FAILURES_INTERVAL_MS)
            if (self.closed_last_failure_timestamp is not None and
                    now - self.closed_last_failure_timestamp >= reset_interval):
                # Enough time passed. Reset the failure count.
                self.closed_failure_count = 0
            try:
                return handler()
            except Exception:
                self.closed_failure_count += 1
                # Record the failure time so the reset interval can be measured.
                self.closed_last_failure_timestamp = now
                if self.closed_failure_count >= self.CLOSED_FAILURE_THRESHOLD:
                    self.state = CircuitBreakerState.OPEN
                raise
        elif self.state == CircuitBreakerState.OPEN:
            # The circuit breaker is open.
            open_timeout = timedelta(milliseconds=self.OPEN_TIMEOUT_MS)
            if now - self.state_active_timestamp >= open_timeout:
                self.state = CircuitBreakerState.HALF_OPEN
                # Re-run the operation.
                return self.run_operation(handler)
            else:
                raise Exception('...')
        elif self.state == CircuitBreakerState.HALF_OPEN:
            # The circuit breaker is half-open.
            try:
                result = handler()
                self.half_open_success_count += 1
                if self.half_open_success_count >= self.HALF_OPEN_SUCCESS_THRESHOLD:
                    self.state = CircuitBreakerState.CLOSED
                return result
            except Exception:
                self.state = CircuitBreakerState.OPEN
                raise
```
> [!seealso] See Also:
> * [Circuit Breaker — #Microsoft](https://learn.microsoft.com/en-us/azure/architecture/patterns/circuit-breaker)