1
0
mirror of https://github.com/craigerl/aprsd.git synced 2026-03-31 20:25:34 -04:00
aprsd/docs/superpowers/plans/2026-03-24-daemon-threads-event-refactor.md

42 KiB

Daemon Threads and Event-Based Timing Implementation Plan

For agentic workers: REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Refactor all APRSD thread classes to use daemon threads and threading.Event() for interruptible waits and cleaner periodic timing.

Architecture: Modify APRSDThread base class to add daemon and period class attributes, replace boolean thread_stop with threading.Event, add wait() method. Update all 15 subclasses to use new pattern.

Tech Stack: Python threading, threading.Event, existing APRSD thread infrastructure


File Structure

File Action Responsibility
aprsd/threads/aprsd.py Modify Base class + ThreadList changes
aprsd/threads/keepalive.py Modify KeepAliveThread period migration
aprsd/threads/stats.py Modify 3 stats threads, APRSDStatsStoreThread non-daemon
aprsd/threads/rx.py Modify 4 RX threads with queue-based waits
aprsd/threads/tx.py Modify 5 TX threads, 2 schedulers non-daemon
aprsd/threads/registry.py Modify APRSRegistryThread period from config
aprsd/cmds/listen.py Modify APRSDListenProcessThread (inherits from APRSDFilterThread)
aprsd/main.py Modify Signal handler update
tests/threads/test_aprsd_thread.py Modify Update existing tests for Event-based API

Chunk 1: Base Class Refactor

Task 1: Update APRSDThread base class

Files:

  • Modify: aprsd/threads/aprsd.py:13-76

  • Test: tests/threads/test_aprsd_thread.py

  • Step 1: Write failing test for daemon attribute

Note: The existing test file already has a TestThread fixture class with a should_loop parameter. Tests in this plan use that existing fixture.

Add to tests/threads/test_aprsd_thread.py:

def test_daemon_attribute_default(self):
    """Test that daemon attribute defaults to True."""
    thread = TestThread('DaemonTest')
    self.assertTrue(thread.daemon)


def test_daemon_attribute_override(self):
    """Test that daemon attribute can be overridden via class attribute."""
    class NonDaemonThread(APRSDThread):
        daemon = False
        def loop(self):
            return False

    thread = NonDaemonThread('NonDaemonTest')
    self.assertFalse(thread.daemon)
  • Step 2: Run test to verify it fails

Run: pytest tests/threads/test_aprsd_thread.py::TestAPRSDThread::test_daemon_attribute_default -v Expected: FAIL with AssertionError (daemon is False by default currently)

  • Step 3: Write failing test for period attribute

Add to tests/threads/test_aprsd_thread.py:

def test_period_attribute_default(self):
    """Test that period attribute defaults to 1."""
    thread = TestThread('PeriodTest')
    self.assertEqual(thread.period, 1)


def test_period_attribute_override(self):
    """Test that period attribute can be overridden via class attribute."""
    class LongPeriodThread(APRSDThread):
        period = 60
        def loop(self):
            return False

    thread = LongPeriodThread('LongPeriodTest')
    self.assertEqual(thread.period, 60)
  • Step 4: Run test to verify it fails

Run: pytest tests/threads/test_aprsd_thread.py::TestAPRSDThread::test_period_attribute_default -v Expected: FAIL with AttributeError: 'TestThread' object has no attribute 'period'

  • Step 5: Write failing test for _shutdown_event and wait()

Add to tests/threads/test_aprsd_thread.py:

def test_shutdown_event_exists(self):
    """Test that _shutdown_event is created."""
    thread = TestThread('EventTest')
    self.assertIsInstance(thread._shutdown_event, threading.Event)
    self.assertFalse(thread._shutdown_event.is_set())


def test_wait_returns_false_on_timeout(self):
    """Test that wait() returns False when timeout expires."""
    thread = TestThread('WaitTimeoutTest')
    start = time.time()
    result = thread.wait(timeout=0.1)
    elapsed = time.time() - start
    self.assertFalse(result)
    self.assertGreaterEqual(elapsed, 0.1)


def test_wait_returns_true_when_stopped(self):
    """Test that wait() returns True immediately when stop() was called."""
    thread = TestThread('WaitStopTest')
    thread.stop()
    start = time.time()
    result = thread.wait(timeout=10)  # Would timeout in 10s if not stopped
    elapsed = time.time() - start
    self.assertTrue(result)
    self.assertLess(elapsed, 1)  # Should return immediately


def test_wait_uses_period_by_default(self):
    """Test that wait() uses self.period when no timeout specified."""
    class ShortPeriodThread(APRSDThread):
        period = 0.1
        def loop(self):
            return False

    thread = ShortPeriodThread('ShortPeriodTest')
    start = time.time()
    result = thread.wait()
    elapsed = time.time() - start
    self.assertFalse(result)
    self.assertGreaterEqual(elapsed, 0.1)
    self.assertLess(elapsed, 0.5)
  • Step 6: Run tests to verify they fail

Run: pytest tests/threads/test_aprsd_thread.py -k "shutdown_event or wait_" -v Expected: FAIL with AttributeError

  • Step 7: Implement base class changes

Update aprsd/threads/aprsd.py:

import abc
import datetime
import logging
import threading
import time
from typing import List

import wrapt

LOG = logging.getLogger('APRSD')


class APRSDThread(threading.Thread, metaclass=abc.ABCMeta):
    """Base class for all threads in APRSD."""

    # Class attributes - subclasses override as needed
    daemon = True  # Most threads are daemon threads
    period = 1     # Default wait period in seconds
    loop_count = 1
    _pause = False

    def __init__(self, name):
        super().__init__(name=name)
        # Set daemon from class attribute
        self.daemon = self.__class__.daemon
        # Set period from class attribute (can be overridden in __init__)
        self.period = self.__class__.period
        self._shutdown_event = threading.Event()
        self.loop_count = 0
        APRSDThreadList().add(self)
        self._last_loop = datetime.datetime.now()

    def _should_quit(self):
        """Check if thread should exit."""
        return self._shutdown_event.is_set()

    def pause(self):
        """Logically pause the processing of the main loop."""
        LOG.debug(f"Pausing thread '{self.name}' loop_count {self.loop_count}")
        self._pause = True

    def unpause(self):
        """Logically resume processing of the main loop."""
        LOG.debug(f"Resuming thread '{self.name}' loop_count {self.loop_count}")
        self._pause = False

    def stop(self):
        """Signal thread to stop. Returns immediately."""
        LOG.debug(f"Stopping thread '{self.name}'")
        self._shutdown_event.set()

    def wait(self, timeout: float | None = None) -> bool:
        """Wait for shutdown signal or timeout.

        Args:
            timeout: Seconds to wait. Defaults to self.period.

        Returns:
            True if shutdown was signaled, False if timeout expired.
        """
        wait_time = timeout if timeout is not None else self.period
        return self._shutdown_event.wait(timeout=wait_time)

    @abc.abstractmethod
    def loop(self):
        pass

    def _cleanup(self):
        """Add code to subclass to do any cleanup"""

    def __str__(self):
        out = (
            f'Thread <{self.__class__.__name__}({self.name}) Alive? {self.is_alive()}>'
        )
        return out

    def loop_age(self):
        """How old is the last loop call?"""
        return datetime.datetime.now() - self._last_loop

    def run(self):
        LOG.debug('Starting')
        while not self._should_quit():
            if self._pause:
                self.wait(timeout=1)
            else:
                self.loop_count += 1
                can_loop = self.loop()
                self._last_loop = datetime.datetime.now()
                if not can_loop:
                    self.stop()
        self._cleanup()
        APRSDThreadList().remove(self)
        LOG.debug('Exiting')
  • Step 8: Run base class tests to verify they pass

Run: pytest tests/threads/test_aprsd_thread.py::TestAPRSDThread -v Expected: PASS

  • Step 9: Update existing tests that reference thread_stop

Update tests/threads/test_aprsd_thread.py - change references from thread_stop to _shutdown_event.is_set():

def test_init(self):
    """Test thread initialization."""
    thread = TestThread('TestThread1')
    self.assertEqual(thread.name, 'TestThread1')
    self.assertFalse(thread._shutdown_event.is_set())
    self.assertFalse(thread._pause)
    # Note: loop_count starts at 0 now (was 1)
    self.assertEqual(thread.loop_count, 0)

    # Should be registered in thread list
    thread_list = APRSDThreadList()
    self.assertIn(thread, thread_list.threads_list)


def test_should_quit(self):
    """Test _should_quit() method."""
    thread = TestThread('TestThread2')
    self.assertFalse(thread._should_quit())

    thread._shutdown_event.set()
    self.assertTrue(thread._should_quit())


def test_stop(self):
    """Test stop() method."""
    thread = TestThread('TestThread4')
    self.assertFalse(thread._shutdown_event.is_set())

    thread.stop()
    self.assertTrue(thread._shutdown_event.is_set())


def test_stop_all(self):
    """Test stop_all() method."""
    thread_list = APRSDThreadList()
    thread1 = TestThread('TestThread8')
    thread2 = TestThread('TestThread9')
    thread_list.add(thread1)
    thread_list.add(thread2)

    thread_list.stop_all()
    self.assertTrue(thread1._shutdown_event.is_set())
    self.assertTrue(thread2._shutdown_event.is_set())
  • Step 10: Run all base class tests

Run: pytest tests/threads/test_aprsd_thread.py -v Expected: PASS

  • Step 11: Commit base class changes
git add aprsd/threads/aprsd.py tests/threads/test_aprsd_thread.py
git commit -m "refactor(threads): add daemon, period, Event-based shutdown to APRSDThread

- Add daemon=True class attribute (subclasses override to False)
- Add period=1 class attribute for wait interval
- Replace thread_stop boolean with _shutdown_event (threading.Event)
- Add wait() method for interruptible sleeps
- Update tests for new Event-based API

BREAKING: thread_stop boolean replaced with _shutdown_event.
Code checking thread.thread_stop directly must use thread._shutdown_event.is_set()"

Task 2: Add join_non_daemon to APRSDThreadList

Files:

  • Modify: aprsd/threads/aprsd.py:79-169

  • Test: tests/threads/test_aprsd_thread.py

  • Step 1: Write failing test for join_non_daemon

Add to tests/threads/test_aprsd_thread.py:

def test_join_non_daemon(self):
    """Test join_non_daemon() waits for non-daemon threads."""
    class NonDaemonTestThread(APRSDThread):
        daemon = False
        def __init__(self, name):
            super().__init__(name)
            self.finished = False

        def loop(self):
            time.sleep(0.2)
            self.finished = True
            return False

    thread_list = APRSDThreadList()
    thread = NonDaemonTestThread('NonDaemonJoinTest')
    thread_list.add(thread)
    thread.start()

    # Stop triggers the event, thread should finish its loop then exit
    thread.stop()
    thread_list.join_non_daemon(timeout=5.0)

    self.assertTrue(thread.finished or not thread.is_alive())


def test_join_non_daemon_skips_daemon_threads(self):
    """Test join_non_daemon() does not wait for daemon threads."""
    thread_list = APRSDThreadList()
    # Clear existing threads
    thread_list.threads_list = []

    # Create a daemon thread that loops forever
    thread = TestThread('DaemonSkipTest', should_loop=True)
    thread_list.add(thread)
    thread.start()

    # This should return quickly since it's a daemon thread
    start = time.time()
    thread_list.join_non_daemon(timeout=0.1)
    elapsed = time.time() - start

    self.assertLess(elapsed, 0.5)  # Should not wait for daemon

    # Cleanup
    thread.stop()
    thread.join(timeout=1)
  • Step 2: Run test to verify it fails

Run: pytest tests/threads/test_aprsd_thread.py::TestAPRSDThreadList::test_join_non_daemon -v Expected: FAIL with AttributeError: 'APRSDThreadList' object has no attribute 'join_non_daemon'

  • Step 3: Implement join_non_daemon method

Add to APRSDThreadList class in aprsd/threads/aprsd.py:

@wrapt.synchronized(lock)
def join_non_daemon(self, timeout: float = 5.0):
    """Wait for non-daemon threads to complete gracefully.

    Args:
        timeout: Maximum seconds to wait per thread.
    """
    for th in self.threads_list:
        if not th.daemon and th.is_alive():
            LOG.info(f'Waiting for non-daemon thread {th.name} to finish')
            th.join(timeout=timeout)
  • Step 4: Run tests to verify they pass

Run: pytest tests/threads/test_aprsd_thread.py::TestAPRSDThreadList::test_join_non_daemon -v Expected: PASS

  • Step 5: Commit ThreadList changes
git add aprsd/threads/aprsd.py tests/threads/test_aprsd_thread.py
git commit -m "feat(threads): add join_non_daemon() to APRSDThreadList

Allows graceful shutdown by waiting for non-daemon threads to complete
while allowing daemon threads to be terminated immediately."

Chunk 2: Stats and Keepalive Thread Migration

Task 3: Migrate KeepAliveThread

Files:

  • Modify: aprsd/threads/keepalive.py

  • Step 1: Update KeepAliveThread to use period

Replace the loop method in aprsd/threads/keepalive.py:

class KeepAliveThread(APRSDThread):
    cntr = 0
    checker_time = datetime.datetime.now()
    period = 60  # Run keepalive every 60 seconds

    def __init__(self):
        tracemalloc.start()
        super().__init__('KeepAlive')
        max_timeout = {'hours': 0.0, 'minutes': 2, 'seconds': 0}
        self.max_delta = datetime.timedelta(**max_timeout)

    def loop(self):
        stats_json = collector.Collector().collect()
        pl = packets.PacketList()
        thread_list = APRSDThreadList()
        now = datetime.datetime.now()

        if (
            'APRSClientStats' in stats_json
            and stats_json['APRSClientStats'].get('transport') == 'aprsis'
        ):
            if stats_json['APRSClientStats'].get('server_keepalive'):
                last_msg_time = utils.strfdelta(
                    now - stats_json['APRSClientStats']['server_keepalive']
                )
            else:
                last_msg_time = 'N/A'
        else:
            last_msg_time = 'N/A'

        tracked_packets = stats_json['PacketTrack']['total_tracked']
        tx_msg = 0
        rx_msg = 0
        if 'PacketList' in stats_json:
            msg_packets = stats_json['PacketList'].get('MessagePacket')
            if msg_packets:
                tx_msg = msg_packets.get('tx', 0)
                rx_msg = msg_packets.get('rx', 0)

        keepalive = (
            '{} - Uptime {} RX:{} TX:{} Tracker:{} Msgs TX:{} RX:{} '
            'Last:{} - RAM Current:{} Peak:{} Threads:{} LoggingQueue:{}'
        ).format(
            stats_json['APRSDStats']['callsign'],
            stats_json['APRSDStats']['uptime'],
            pl.total_rx(),
            pl.total_tx(),
            tracked_packets,
            tx_msg,
            rx_msg,
            last_msg_time,
            stats_json['APRSDStats']['memory_current_str'],
            stats_json['APRSDStats']['memory_peak_str'],
            len(thread_list),
            aprsd_log.logging_queue.qsize(),
        )
        LOG.info(keepalive)
        if 'APRSDThreadList' in stats_json:
            thread_list = stats_json['APRSDThreadList']
            for thread_name in thread_list:
                thread = thread_list[thread_name]
                alive = thread['alive']
                age = thread['age']
                key = thread['name']
                if not alive:
                    LOG.error(f'Thread {thread}')

                thread_hex = f'fg {utils.hex_from_name(key)}'
                t_name = f'<{thread_hex}>{key:<15}</{thread_hex}>'
                thread_msg = f'{t_name} Alive? {str(alive): <5} {str(age): <20}'
                LOGU.opt(colors=True).info(thread_msg)

        # Go through the registered keepalive collectors
        # and check them as well as call log.
        collect = keepalive_collector.KeepAliveCollector()
        collect.check()
        collect.log()

        # Check version every day
        delta = now - self.checker_time
        if delta > datetime.timedelta(hours=24):
            self.checker_time = now
            level, msg = utils._check_version()
            if level:
                LOG.warning(msg)
        self.cntr += 1

        self.wait()  # Wait for period (60s) or shutdown signal
        return True
  • Step 2: Run existing tests

Run: pytest tests/ -k keepalive -v Expected: PASS (or skip if no specific keepalive tests)

  • Step 3: Commit KeepAliveThread changes
git add aprsd/threads/keepalive.py
git commit -m "refactor(threads): migrate KeepAliveThread to Event-based timing

- Set period=60 class attribute
- Remove counter-based conditional (loop_count % 60)
- Replace time.sleep(1) with self.wait()"

Task 4: Migrate Stats Threads

Files:

  • Modify: aprsd/threads/stats.py

  • Step 1: Update APRSDStatsStoreThread (non-daemon)

class APRSDStatsStoreThread(APRSDThread):
    """Save APRSD Stats to disk periodically."""

    daemon = False  # Non-daemon for graceful disk writes
    period = 10     # Save every 10 seconds

    def __init__(self):
        super().__init__('StatsStore')

    def loop(self):
        stats = collector.Collector().collect()
        ss = StatsStore()
        ss.add(stats)
        ss.save()

        self.wait()  # Wait for period (10s) or shutdown signal
        return True
  • Step 2: Update APRSDPushStatsThread
class APRSDPushStatsThread(APRSDThread):
    """Push the local stats to a remote API."""

    def __init__(
        self, push_url=None, frequency_seconds=None, send_packetlist: bool = False
    ):
        super().__init__('PushStats')
        self.push_url = push_url if push_url else CONF.push_stats.push_url
        # Set period from config
        self.period = (
            frequency_seconds
            if frequency_seconds
            else CONF.push_stats.frequency_seconds
        )
        self.send_packetlist = send_packetlist

    def loop(self):
        stats_json = collector.Collector().collect(serializable=True)
        url = f'{self.push_url}/stats'
        headers = {'Content-Type': 'application/json'}
        # Remove the PacketList section to reduce payload size
        if not self.send_packetlist:
            if 'PacketList' in stats_json:
                del stats_json['PacketList']['packets']

        now = datetime.datetime.now()
        time_format = '%m-%d-%Y %H:%M:%S'
        stats = {
            'time': now.strftime(time_format),
            'stats': stats_json,
        }

        try:
            response = requests.post(url, json=stats, headers=headers, timeout=5)
            response.raise_for_status()

            if response.status_code == 200:
                LOGU.info(f'Successfully pushed stats to {self.push_url}')
            else:
                LOGU.warning(
                    f'Failed to push stats to {self.push_url}: HTTP {response.status_code}'
                )

        except requests.exceptions.RequestException as e:
            LOGU.error(f'Error pushing stats to {self.push_url}: {e}')
        except Exception as e:
            LOGU.error(f'Unexpected error in stats push: {e}')

        self.wait()  # Wait for period or shutdown signal
        return True
  • Step 3: Update StatsLogThread
class StatsLogThread(APRSDThread):
    """Log the stats from the PacketList."""

    period = 10  # Log every 10 seconds

    def __init__(self):
        super().__init__('PacketStatsLog')
        self._last_total_rx = 0
        self.start_time = time.time()

    def loop(self):
        # log the stats every period seconds
        stats_json = collector.Collector().collect(serializable=True)
        stats = stats_json['PacketList']
        total_rx = stats['rx']
        rx_delta = total_rx - self._last_total_rx
        rate = rx_delta / self.period

        # Get unique callsigns count from SeenList stats
        seen_list_instance = seen_list.SeenList()
        seen_list_stats = seen_list_instance.stats()
        seen_list_instance.save()
        seen_list_stats = seen_list_stats.copy()
        unique_callsigns_count = len(seen_list_stats)

        # Calculate uptime
        elapsed = time.time() - self.start_time
        elapsed_minutes = elapsed / 60
        elapsed_hours = elapsed / 3600

        # Log summary stats
        LOGU.opt(colors=True).info(
            f'<green>RX Rate: {rate:.2f} pps</green>  '
            f'<yellow>Total RX: {total_rx}</yellow> '
            f'<red>RX Last {self.period} secs: {rx_delta}</red> '
        )
        LOGU.opt(colors=True).info(
            f'<cyan>Uptime: {elapsed:.0f}s ({elapsed_minutes:.1f}m / {elapsed_hours:.2f}h)</cyan>  '
            f'<magenta>Unique Callsigns: {unique_callsigns_count}</magenta>',
        )
        self._last_total_rx = total_rx

        # Log individual type stats, sorted by RX count (descending)
        sorted_types = sorted(
            stats['types'].items(), key=lambda x: x[1]['rx'], reverse=True
        )
        for k, v in sorted_types:
            percentage = (v['rx'] / total_rx * 100) if total_rx > 0 else 0.0
            packet_type_str = f'{k:<15}'
            rx_count_str = f'{v["rx"]:6d}'
            tx_count_str = f'{v["tx"]:6d}'
            percentage_str = f'{percentage:5.1f}%'
            rx_color_tag = (
                'green' if v['rx'] > 100 else 'yellow' if v['rx'] > 10 else 'red'
            )
            LOGU.opt(colors=True).info(
                f'  <cyan>{packet_type_str}</cyan>: '
                f'<{rx_color_tag}>RX: {rx_count_str}</{rx_color_tag}> '
                f'<red>TX: {tx_count_str}</red> '
                f'<magenta>({percentage_str})</magenta>',
            )

        # Extract callsign counts from seen_list stats
        callsign_counts = {}
        for callsign, data in seen_list_stats.items():
            if isinstance(data, dict) and 'count' in data:
                callsign_counts[callsign] = data['count']

        # Sort callsigns by packet count (descending) and get top 10
        sorted_callsigns = sorted(
            callsign_counts.items(), key=lambda x: x[1], reverse=True
        )[:10]

        # Log top 10 callsigns
        if sorted_callsigns:
            LOGU.opt(colors=True).info(
                '<cyan>Top 10 Callsigns by Packet Count:</cyan>'
            )
            total_ranks = len(sorted_callsigns)
            for rank, (callsign, count) in enumerate(sorted_callsigns, 1):
                percentage = (count / total_rx * 100) if total_rx > 0 else 0.0
                if rank == 1:
                    count_color_tag = 'red'
                elif rank == total_ranks:
                    count_color_tag = 'green'
                else:
                    count_color_tag = 'yellow'
                LOGU.opt(colors=True).info(
                    f'  <cyan>{rank:2d}.</cyan> '
                    f'<white>{callsign:<12}</white>: '
                    f'<{count_color_tag}>{count:6d} packets</{count_color_tag}> '
                    f'<magenta>({percentage:5.1f}%)</magenta>',
                )

        self.wait()  # Wait for period (10s) or shutdown signal
        return True
  • Step 4: Run tests

Run: pytest tests/ -v Expected: PASS

  • Step 5: Commit stats thread changes
git add aprsd/threads/stats.py
git commit -m "refactor(threads): migrate stats threads to Event-based timing

- APRSDStatsStoreThread: daemon=False, period=10
- APRSDPushStatsThread: period from config
- StatsLogThread: period=10
- Remove counter-based conditionals
- Replace time.sleep(1) with self.wait()"

Chunk 3: RX Thread Migration

Task 5: Migrate RX Threads

Files:

  • Modify: aprsd/threads/rx.py

  • Step 1: Update APRSDRXThread

class APRSDRXThread(APRSDThread):
    """Thread to receive packets from the APRS Client."""

    _client = None
    packet_queue = None
    pkt_count = 0

    def __init__(self, packet_queue: queue.Queue):
        super().__init__('RX_PKT')
        self.packet_queue = packet_queue

    def stop(self):
        self._shutdown_event.set()
        if self._client:
            self._client.close()

    def loop(self):
        if not self._client:
            self._client = APRSDClient()
            self.wait(timeout=1)
            return True

        if not self._client.is_alive:
            self._client = APRSDClient()
            self.wait(timeout=1)
            return True

        try:
            self._client.consumer(
                self.process_packet,
                raw=True,
            )
        except (
            aprslib.exceptions.ConnectionDrop,
            aprslib.exceptions.ConnectionError,
        ):
            LOG.error('Connection dropped, reconnecting')
            self._client.reset()
            if self.wait(timeout=5):  # Interruptible 5s wait
                return False  # Shutdown signaled
        except Exception as ex:
            LOG.exception(ex)
            LOG.error('Resetting connection and trying again.')
            self._client.reset()
            if self.wait(timeout=5):  # Interruptible 5s wait
                return False  # Shutdown signaled
        return True

    def process_packet(self, *args, **kwargs):
        """Put the raw packet on the queue."""
        if args:
            data = args[0]
        elif 'raw' in kwargs:
            data = kwargs['raw']
        elif 'frame' in kwargs:
            data = kwargs['frame']
        elif 'packet' in kwargs:
            data = kwargs['packet']
        else:
            LOG.warning('No frame received to process?!?!')
            return

        self.pkt_count += 1
        self.packet_queue.put(data)
  • Step 2: Update APRSDFilterThread (queue-based wait)
class APRSDFilterThread(APRSDThread):
    """Thread to filter packets on the packet queue."""

    def __init__(self, thread_name: str, packet_queue: queue.Queue):
        super().__init__(thread_name)
        self.packet_queue = packet_queue
        self.packet_count = 0
        self._client = APRSDClient()

    def filter_packet(self, packet: type[core.Packet]) -> type[core.Packet] | None:
        if not filter.PacketFilter().filter(packet):
            return None
        return packet

    def print_packet(self, packet):
        packet_log.log(packet, packet_count=self.packet_count)

    def loop(self):
        try:
            # Queue timeout serves as interruptible wait
            pkt = self.packet_queue.get(timeout=self.period)
            self.packet_count += 1
            packet = self._client.decode_packet(pkt)
            if not packet:
                LOG.debug(f'Packet failed to parse. "{pkt}"')
                return True
            self.print_packet(packet)
            if packet:
                if self.filter_packet(packet):
                    collector.PacketCollector().rx(packet)
                    self.process_packet(packet)
        except queue.Empty:
            pass  # Normal timeout, loop will check _should_quit
        return True

Note: APRSDProcessPacketThread and APRSDPluginProcessPacketThread extend APRSDFilterThread and inherit the queue-based wait pattern. They don't need changes to their loop() method as they override process_packet() and process_our_message_packet().

  • Step 3: Verify APRSDListenProcessThread (listen.py) inherits changes

APRSDListenProcessThread in aprsd/cmds/listen.py extends APRSDFilterThread. It inherits the queue-based wait pattern automatically. No code changes needed in listen.py - verify the class still works by inspection:

# aprsd/cmds/listen.py - NO CHANGES NEEDED
# APRSDListenProcessThread extends APRSDFilterThread which now uses:
#   queue.get(timeout=self.period) for interruptible wait
# The daemon=True default is inherited from base APRSDThread
  • Step 4: Run tests

Run: pytest tests/ -v Expected: PASS

  • Step 5: Commit RX thread changes
git add aprsd/threads/rx.py
git commit -m "refactor(threads): migrate RX threads to Event-based timing

- APRSDRXThread: use wait(timeout=5) for error recovery
- APRSDFilterThread: use queue.get(timeout=period) as interruptible wait
- APRSDListenProcessThread inherits changes from APRSDFilterThread
- Remove time.sleep() calls"

Chunk 4: TX and Registry Thread Migration

Task 6: Migrate TX Threads

Files:

  • Modify: aprsd/threads/tx.py

  • Step 1: Update PacketSendSchedulerThread (non-daemon)

class PacketSendSchedulerThread(aprsd_threads.APRSDThread):
    """Scheduler thread that uses a threadpool to send packets."""

    daemon = False  # Non-daemon for graceful packet handling

    def __init__(self, max_workers=5):
        super().__init__('PacketSendSchedulerThread')
        self.executor = ThreadPoolExecutor(
            max_workers=max_workers, thread_name_prefix='PacketSendWorker'
        )
        self.max_workers = max_workers

    def loop(self):
        """Check all tracked packets and submit send tasks to threadpool."""
        pkt_tracker = tracker.PacketTrack()

        for msg_no in list(pkt_tracker.keys()):
            packet = pkt_tracker.get(msg_no)
            if not packet:
                continue

            if isinstance(packet, core.AckPacket):
                continue

            if packet.send_count >= packet.retry_count:
                continue

            self.executor.submit(_send_packet_worker, msg_no)

        self.wait()  # Wait for period (1s) or shutdown signal
        return True

    def _cleanup(self):
        """Cleanup threadpool executor on thread shutdown."""
        LOG.debug('Shutting down PacketSendSchedulerThread executor')
        self.executor.shutdown(wait=True)
  • Step 2: Update AckSendSchedulerThread (non-daemon)
class AckSendSchedulerThread(aprsd_threads.APRSDThread):
    """Scheduler thread that uses a threadpool to send ack packets."""

    daemon = False  # Non-daemon for graceful ACK handling

    def __init__(self, max_workers=3):
        super().__init__('AckSendSchedulerThread')
        self.executor = ThreadPoolExecutor(
            max_workers=max_workers, thread_name_prefix='AckSendWorker'
        )
        self.max_workers = max_workers
        self.max_retries = CONF.default_ack_send_count

    def loop(self):
        """Check all tracked ack packets and submit send tasks to threadpool."""
        pkt_tracker = tracker.PacketTrack()

        for msg_no in list(pkt_tracker.keys()):
            packet = pkt_tracker.get(msg_no)
            if not packet:
                continue

            if not isinstance(packet, core.AckPacket):
                continue

            if packet.send_count >= self.max_retries:
                continue

            self.executor.submit(_send_ack_worker, msg_no, self.max_retries)

        self.wait()  # Wait for period (1s) or shutdown signal
        return True

    def _cleanup(self):
        """Cleanup threadpool executor on thread shutdown."""
        LOG.debug('Shutting down AckSendSchedulerThread executor')
        self.executor.shutdown(wait=True)
  • Step 3: Update SendPacketThread
class SendPacketThread(aprsd_threads.APRSDThread):

    def __init__(self, packet):
        self.packet = packet
        super().__init__(f'TX-{packet.to_call}-{self.packet.msgNo}')

    def loop(self):
        """Loop until a message is acked or max retries reached."""
        pkt_tracker = tracker.PacketTrack()
        packet = pkt_tracker.get(self.packet.msgNo)
        if not packet:
            LOG.info(
                f'{self.packet.__class__.__name__}'
                f'({self.packet.msgNo}) '
                'Message Send Complete via Ack.',
            )
            return False

        send_now = False
        if packet.send_count >= packet.retry_count:
            LOG.info(
                f'{packet.__class__.__name__} '
                f'({packet.msgNo}) '
                'Message Send Complete. Max attempts reached'
                f' {packet.retry_count}',
            )
            pkt_tracker.remove(packet.msgNo)
            return False

        if packet.last_send_time:
            now = int(round(time.time()))
            sleeptime = (packet.send_count + 1) * 31
            delta = now - packet.last_send_time
            if delta > sleeptime:
                send_now = True
        else:
            send_now = True

        if send_now:
            packet.last_send_time = int(round(time.time()))
            sent = False
            try:
                sent = _send_direct(packet)
            except Exception as ex:
                LOG.error(f'Failed to send packet: {packet}')
                LOG.error(ex)
            else:
                if sent:
                    packet.send_count += 1

        self.wait()  # Wait for period (1s) or shutdown signal
        return True
  • Step 4: Update SendAckThread
class SendAckThread(aprsd_threads.APRSDThread):
    max_retries = 3

    def __init__(self, packet):
        self.packet = packet
        super().__init__(f'TXAck-{packet.to_call}-{self.packet.msgNo}')
        self.max_retries = CONF.default_ack_send_count

    def loop(self):
        """Separate thread to send acks with retries."""
        send_now = False
        if self.packet.send_count == self.max_retries:
            LOG.debug(
                f'{self.packet.__class__.__name__}'
                f'({self.packet.msgNo}) '
                'Send Complete. Max attempts reached'
                f' {self.max_retries}',
            )
            return False

        if self.packet.last_send_time:
            now = int(round(time.time()))
            sleep_time = 31
            delta = now - self.packet.last_send_time
            if delta > sleep_time:
                send_now = True
        else:
            send_now = True

        if send_now:
            sent = False
            try:
                sent = _send_direct(self.packet)
            except Exception:
                LOG.error(f'Failed to send packet: {self.packet}')
            else:
                if sent:
                    self.packet.send_count += 1

            self.packet.last_send_time = int(round(time.time()))

        self.wait()  # Wait for period (1s) or shutdown signal
        return True
  • Step 5: Update BeaconSendThread
class BeaconSendThread(aprsd_threads.APRSDThread):
    """Thread that sends a GPS beacon packet periodically."""

    def __init__(self):
        super().__init__('BeaconSendThread')
        # Set period from config
        self.period = CONF.beacon_interval
        if not CONF.latitude or not CONF.longitude:
            LOG.error(
                'Latitude and Longitude are not set in the config file.'
                'Beacon will not be sent and thread is STOPPED.',
            )
            self.stop()
        LOG.info(
            'Beacon thread is running and will send '
            f'beacons every {CONF.beacon_interval} seconds.',
        )

    def loop(self):
        pkt = core.BeaconPacket(
            from_call=CONF.callsign,
            to_call='APRS',
            latitude=float(CONF.latitude),
            longitude=float(CONF.longitude),
            comment='APRSD GPS Beacon',
            symbol=CONF.beacon_symbol,
        )
        try:
            pkt.retry_count = 1
            send(pkt, direct=True)
        except Exception as e:
            LOG.error(f'Failed to send beacon: {e}')
            APRSDClient().reset()
            if self.wait(timeout=5):  # Interruptible error recovery wait
                return False

        self.wait()  # Wait for beacon_interval or shutdown signal
        return True
  • Step 6: Run tests

Run: pytest tests/ -v Expected: PASS

  • Step 7: Commit TX thread changes
git add aprsd/threads/tx.py
git commit -m "refactor(threads): migrate TX threads to Event-based timing

- PacketSendSchedulerThread: daemon=False
- AckSendSchedulerThread: daemon=False
- SendPacketThread, SendAckThread, BeaconSendThread: daemon=True (default)
- BeaconSendThread: period from CONF.beacon_interval
- Replace time.sleep(1) with self.wait()
- Remove counter-based timing (_loop_cnt)"

Task 7: Migrate APRSRegistryThread

Files:

  • Modify: aprsd/threads/registry.py

  • Step 1: Update APRSRegistryThread

class APRSRegistryThread(aprsd_threads.APRSDThread):
    """This sends service information to the configured APRS Registry."""

    def __init__(self):
        super().__init__('APRSRegistryThread')
        # Set period from config
        self.period = CONF.aprs_registry.frequency_seconds
        if not CONF.aprs_registry.enabled:
            LOG.error('APRS Registry is not enabled.')
            LOG.error('APRS Registry thread is STOPPING.')
            self.stop()
        LOG.info(
            'APRS Registry thread is running and will send '
            f'info every {CONF.aprs_registry.frequency_seconds} seconds '
            f'to {CONF.aprs_registry.registry_url}.',
        )

    def loop(self):
        info = {
            'callsign': CONF.callsign,
            'owner_callsign': CONF.owner_callsign,
            'description': CONF.aprs_registry.description,
            'service_website': CONF.aprs_registry.service_website,
            'software': f'APRSD version {aprsd.__version__} '
            'https://github.com/craigerl/aprsd',
        }
        try:
            requests.post(
                f'{CONF.aprs_registry.registry_url}',
                json=info,
            )
        except Exception as e:
            LOG.error(f'Failed to send registry info: {e}')

        self.wait()  # Wait for frequency_seconds or shutdown signal
        return True
  • Step 2: Run tests

Run: pytest tests/ -v Expected: PASS

  • Step 3: Commit registry thread changes
git add aprsd/threads/registry.py
git commit -m "refactor(threads): migrate APRSRegistryThread to Event-based timing

- Set period from CONF.aprs_registry.frequency_seconds
- Remove counter-based timing (_loop_cnt)
- Replace time.sleep(1) with self.wait()"

Chunk 5: Signal Handler and Final Cleanup

Task 8: Update Signal Handler

Files:

  • Modify: aprsd/main.py:76-97

  • Step 1: Update signal_handler function

def signal_handler(sig, frame):
    click.echo('signal_handler: called')
    collector.Collector().stop_all()
    thread_list = threads.APRSDThreadList()
    thread_list.stop_all()

    if 'subprocess' not in str(frame):
        LOG.info(
            'Ctrl+C, Sending all threads exit! {}'.format(
                datetime.datetime.now(),
            ),
        )
        # Wait for non-daemon threads to finish gracefully
        thread_list.join_non_daemon(timeout=5.0)
        try:
            packets.PacketTrack().save()
            packets.WatchList().save()
            packets.SeenList().save()
            packets.PacketList().save()
            collector.Collector().collect()
        except Exception as e:
            LOG.error(f'Failed to save data: {e}')
            sys.exit(0)
  • Step 2: Run full test suite

Run: pytest tests/ -v Expected: PASS

  • Step 3: Commit signal handler changes
git add aprsd/main.py
git commit -m "refactor(main): update signal handler for Event-based thread shutdown

- Replace time.sleep(1.5) with join_non_daemon(timeout=5.0)
- Non-daemon threads get graceful shutdown
- Daemon threads terminate immediately on exit"

Task 9: Remove unused imports

Files:

  • Modify: aprsd/threads/keepalive.py

  • Modify: aprsd/threads/stats.py

  • Modify: aprsd/threads/rx.py

  • Modify: aprsd/threads/tx.py

  • Modify: aprsd/threads/registry.py

  • Step 1: Remove import time where no longer needed

Check each file and remove import time if time.sleep is no longer used. Keep it if time.time() is still used.

Files to check:

  • keepalive.py: Remove import time (no longer used)

  • stats.py: Keep import time (uses time.time())

  • rx.py: Remove import time (no longer used)

  • tx.py: Keep import time (uses time.time())

  • registry.py: Remove import time (no longer used)

  • Step 2: Run linter

Run: tox -e lint Expected: PASS

  • Step 3: Run full test suite

Run: pytest tests/ -v Expected: PASS

  • Step 4: Commit cleanup
git add aprsd/threads/keepalive.py aprsd/threads/rx.py aprsd/threads/registry.py
git commit -m "chore(threads): remove unused time imports"

Task 10: Final Verification

  • Step 1: Run full test suite with coverage

Run: pytest tests/ --cov=aprsd -v Expected: PASS with good coverage

  • Step 2: Run type checking

Run: tox -e type-check Expected: PASS (or existing errors only)

  • Step 3: Manual smoke test

Start the server briefly to verify threads start correctly:

aprsd server --help
  • Step 4: Create summary commit

If any fixes were needed during verification, commit them.


Summary

Total Tasks: 10 Total Steps: ~45

Files Modified:

  • aprsd/threads/aprsd.py (base class + ThreadList)
  • aprsd/threads/keepalive.py
  • aprsd/threads/stats.py
  • aprsd/threads/rx.py
  • aprsd/threads/tx.py
  • aprsd/threads/registry.py
  • aprsd/main.py
  • tests/threads/test_aprsd_thread.py

Key Changes:

  1. Base class: daemon, period attributes, _shutdown_event, wait() method
  2. Non-daemon threads: PacketSendSchedulerThread, AckSendSchedulerThread, APRSDStatsStoreThread
  3. All counter-based timing replaced with period-based wait()
  4. Signal handler uses join_non_daemon() instead of sleep(1.5)