8.0 KiB
Daemon Threads and Event-Based Timing Refactor
Date: 2026-03-24 Status: Draft Scope: All APRSD thread classes
Overview
Refactor all APRSD thread classes to use daemon threads and replace counter-based sleep patterns with threading.Event() for interruptible waits and cleaner periodic timing.
Goals
- Faster shutdown — threads respond to shutdown signals immediately instead of waiting for
time.sleep()to complete - Cleaner periodic timing — replace counter-based timing (
loop_count % 60) with explicit period-based waits - Proper daemon semantics — most threads become daemon threads; critical I/O threads remain non-daemon for graceful shutdown
Current State
- 14 thread classes extend
APRSDThreadbase class - All use
time.sleep(1)with counter-based conditionals for periodic work - Shutdown via boolean
thread_stopflag, polled in while loop - No threads set as daemon — all block program exit
- No
threading.Event()usage anywhere
Current Problems
- Shutdown delay: up to 1-5 seconds waiting for sleep to finish
- Counter math is fragile and obscures intent
- Non-daemon threads prevent clean program exit
Design
Base Class Changes
File: aprsd/threads/aprsd.py
class APRSDThread(threading.Thread, metaclass=abc.ABCMeta):
# Class attributes (subclasses override as needed)
daemon = True # Most threads are daemon
period = 1 # Default wait period in seconds
def __init__(self, name):
super().__init__(name=name)
self.daemon = self.__class__.daemon
self._shutdown_event = threading.Event()
self.loop_count = 0 # Retained for debugging
self._last_loop = datetime.datetime.now()
APRSDThreadList().add(self)
def _should_quit(self) -> bool:
"""Check if thread should exit."""
return self._shutdown_event.is_set()
def stop(self):
"""Signal thread to stop. Returns immediately."""
self._shutdown_event.set()
def wait(self, timeout: float | None = None) -> bool:
"""Wait for shutdown signal or timeout.
Args:
timeout: Seconds to wait. Defaults to self.period.
Returns:
True if shutdown was signaled, False if timeout expired.
"""
wait_time = timeout if timeout is not None else self.period
return self._shutdown_event.wait(timeout=wait_time)
def run(self):
while not self._should_quit():
self.loop_count += 1
self._last_loop = datetime.datetime.now()
if not self.loop():
break
APRSDThreadList().remove(self)
Daemon vs Non-Daemon Threads
Non-daemon threads (3) — require graceful shutdown for I/O operations:
| Thread | File | Reason |
|---|---|---|
PacketSendSchedulerThread |
tx.py | Manages packet send queue |
AckSendSchedulerThread |
tx.py | Manages ACK send queue |
APRSDStatsStoreThread |
stats.py | Writes stats to disk |
Daemon threads (12) — can be terminated immediately:
APRSDRXThread,APRSDFilterThread,APRSDProcessPacketThread,APRSDPluginProcessPacketThreadAPRSDPushStatsThread,StatsLogThreadKeepAliveThread,APRSRegistryThreadSendPacketThread,SendAckThread,BeaconSendThreadAPRSDListenProcessThread(listen command)
Subclass Migration Pattern
Before:
class KeepAliveThread(APRSDThread):
def loop(self):
if self.loop_count % 60 == 0:
self._do_keepalive_work()
time.sleep(1)
return True
After:
class KeepAliveThread(APRSDThread):
period = 60
def loop(self):
self._do_keepalive_work()
self.wait()
return True
Thread Periods
| Thread | New period |
Source |
|---|---|---|
KeepAliveThread |
60 | Fixed |
APRSDStatsStoreThread |
10 | Fixed |
APRSDPushStatsThread |
config | CONF.push_stats.period |
StatsLogThread |
10 | Fixed |
APRSRegistryThread |
config | CONF.aprs_registry.frequency_seconds |
BeaconSendThread |
config | CONF.beacon_interval |
APRSDRXThread |
1 | Default |
APRSDFilterThread |
1 | Default |
APRSDProcessPacketThread |
1 | Default |
APRSDPluginProcessPacketThread |
1 | Default |
SendPacketThread |
1 | Default |
SendAckThread |
1 | Default |
PacketSendSchedulerThread |
1 | Default |
AckSendSchedulerThread |
1 | Default |
APRSDListenProcessThread |
1 | Default |
Config-based periods are set in __init__ or setup(). Note: setup() is called by subclasses in their __init__ before the thread starts; the base class does not call it automatically.
class APRSRegistryThread(APRSDThread):
period = 1 # Default
def setup(self):
self.period = CONF.aprs_registry.frequency_seconds
ThreadList Changes
File: aprsd/threads/aprsd.py
class APRSDThreadList:
def stop_all(self):
"""Signal all threads to stop."""
with self.lock:
for th in self.threads_list:
th.stop()
def join_non_daemon(self, timeout: float = 5.0):
"""Wait for non-daemon threads to complete gracefully."""
with self.lock:
for th in self.threads_list:
if not th.daemon and th.is_alive():
th.join(timeout=timeout)
Shutdown Handler Changes
File: aprsd/main.py
def signal_handler(sig, frame):
LOG.info("Shutdown signal received")
thread_list = threads.APRSDThreadList()
thread_list.stop_all()
thread_list.join_non_daemon(timeout=5.0)
# Daemon threads killed automatically on exit
Queue-Based Threads
Threads that block on queues use queue timeout as interruptible wait:
class APRSDFilterThread(APRSDThread):
period = 1
def loop(self):
try:
packet = self.queue.get(timeout=self.period)
self._process(packet)
except queue.Empty:
pass # Timeout, loop checks _should_quit
return True
Error Recovery Waits
Threads needing longer waits for error recovery use explicit timeout:
class APRSDRXThread(APRSDThread):
period = 1
def loop(self):
try:
self._process_packets()
except ConnectionError:
LOG.error("Connection lost, retrying in 5s")
if self.wait(timeout=5):
return False # Shutdown signaled
self.wait()
return True
Files to Modify
aprsd/threads/aprsd.py— Base class and ThreadListaprsd/threads/rx.py— RX thread classes (4)aprsd/threads/tx.py— TX thread classes (5)aprsd/threads/stats.py— Stats thread classes (3)aprsd/threads/keepalive.py— KeepAliveThreadaprsd/threads/registry.py— APRSRegistryThreadaprsd/main.py— Signal handleraprsd/cmds/listen.py— APRSDListenProcessThread
Testing Strategy
-
Unit tests for base class:
wait()returnsTrueimmediately when event is already setwait(timeout=5)returnsFalseafter 5 seconds if event not setstop()causes_should_quit()to returnTruedaemonattribute is set correctly from class attribute
-
Integration tests:
- Shutdown completes in <1s for daemon-only scenarios
- Non-daemon threads get up to 5s grace period
join_non_daemon()respects timeout parameter
-
Manual testing:
- Send SIGINT during operation, verify clean exit
- Verify no "thread still running" warnings on shutdown
-
Existing test updates:
- Update any tests that mock
thread_stopto mock_shutdown_eventinstead - Update tests that check
time.sleepcalls to checkwait()calls
- Update any tests that mock
Backwards Compatibility
loop_countretained for debugging/logging_should_quit()method signature unchanged- Default
period=1matches current 1-second sleep behavior
Rollout
Single PR with all changes — the refactor is atomic and affects thread behavior globally.