Another set of additions to the paper, both text and figures.

git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@6352 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
Joe Taylor 2016-01-05 20:13:13 +00:00
parent 3576e0b868
commit 0f62680559
1 changed files with 156 additions and 123 deletions


@ -126,21 +126,18 @@ A major reason for the success and popularity of JT65 is its use of a strong
error-correction code: a short block-length, low-rate Reed-Solomon code
based on a 64-symbol alphabet.
Until now, nearly all programs implementing JT65 have used the patented
Koetter-Vardy (KV) algebraic soft-decision decoder
Koetter-Vardy (KV) algebraic soft-decision decoder
\begin_inset CommandInset citation
LatexCommand cite
key "kv2001"
\end_inset
, as licensed to K1JT and implemented in a closed-source program for use
only in amateur radio applications.
, licensed to and implemented by K1JT in a closed-source executable for
use only in amateur radio applications.
Since 2001 the KV decoder has been considered the best available soft-decision
decoder for Reed-Solomon codes.
\end_layout
\begin_layout Standard
We describe here a new open-source alternative called the Franke-Taylor
We describe here a new open-source alternative called the Franke-Taylor
(FT, or K9AN-K1JT) algorithm.
It is conceptually simple, built around the well-known Berlekamp-Massey
errors-and-erasures algorithm, and in this application it performs even
@ -149,8 +146,8 @@ We describe here a new open-source alternative called the Franke-Taylor
\emph on
WSJT-X
\emph default
, widely used for amateur weak-signal communication with JT65 and several
other specialized digital modes.
, widely used for amateur weak-signal communication with JT65 and other
specialized digital modes.
The program is freely available and licensed under the GNU General Public
License.
\end_layout
@ -160,22 +157,22 @@ The JT65 protocol specifies transmissions that normally start one second
into a UTC minute and last for 46.8 seconds.
Receiving software therefore has up to several seconds to decode a message,
before the operator sends a reply at the start of the next minute.
With today's personal computers, this relatively long time for decoding
a short message encourages experimentation with decoders of high computational
complexity.
As a result, on a typical fading channel the FT algorithm extends the decoding
threshold by many dB over the hard-decision Berlekamp-Massey decoder, and
by a meaningful amount over the KV decoder.
With today's personal computers, this relatively long time available for
decoding a short message encourages experimentation with decoders of high
computational complexity.
As a result, on a typical fading channel the FT algorithm can extend the
decoding threshold by many dB over the hard-decision Berlekamp-Massey decoder,
and by a meaningful amount over the KV decoder.
In addition to its excellent performance, the new algorithm has other desirable
properties---not the least of which is its conceptual simplicity.
properties, not least of which is its conceptual simplicity.
Decoding performance and complexity scale in a convenient way, providing
steadily increasing soft-decision decoding gain as a tunable computational
complexity parameter is increased over more than 5 orders of magnitude.
This means that appreciable gain is available from our decoder even on
very simple (and relatively slow) computers.
Appreciable gain is available from our decoder even on very simple (and
relatively slow) computers.
On the other hand, because the algorithm benefits from a large number of
independent decoding trials, it should be possible to obtain further performanc
e gains through parallelization on high-performance computers.
independent decoding trials, further performance gains should be achievable
through parallelization on high-performance computers.
\end_layout
\begin_layout Section
@ -943,7 +940,7 @@ Here
\end_inset
if the received symbol and codeword symbol are different, and
\begin_inset Formula $p_{1\,j}$
\begin_inset Formula $p_{1,\,j}$
\end_inset
is the fractional power associated with received symbol
@ -965,12 +962,7 @@ In practice we find that
\end_inset
can reliably identify the correct codeword if the signal-to-noise ratio
for individual symbols is greater than about 4 in linear power units, or
\begin_inset Formula $E_{s}/N_{0}\apprge6$
\end_inset
dB (*** check these numbers ***).
for individual symbols is greater than about 4 in linear power units.
We also find that significantly weaker signals can be decoded by using
soft-symbol information beyond that contained in
\begin_inset Formula $p_{1}$
@ -1117,7 +1109,7 @@ est metrics
will likely be close to 1.
We therefore apply a ratio threshold test, say
\begin_inset Formula $r<r_{0}$
\begin_inset Formula $r<r_{1}$
\end_inset
, to identify codewords with high probability of being correct.
@ -1128,7 +1120,7 @@ reference "sec:Theory,-Simulation,-and"
\end_inset
, we have used simulations to set an empirical acceptance threshold
, we use simulations to set an empirical acceptance threshold
\begin_inset Formula $r_{0}$
\end_inset
@ -1145,21 +1137,32 @@ Technically the FT algorithm is a list decoder.
is retained.
As with all such algorithms, a stopping criterion is necessary.
FT accepts a codeword unconditionally if the Hamming distance and soft
distance
FT accepts a codeword unconditionally if the Hamming distance
\begin_inset Formula $X$
\end_inset
and soft distance
\begin_inset Formula $d_{s}$
\end_inset
are less than some conservatively specified limits.
Secondary acceptance criteria
\begin_inset Formula $d_{s}<d_{0}$
are less than conservatively specified limits
\begin_inset Formula $X_{0}$
\end_inset
and
\begin_inset Formula $r<r_{0}$
\begin_inset Formula $d_{0}$
\end_inset
are used to validate additional decodes.
.
Secondary acceptance criteria
\begin_inset Formula $d_{s}<d_{1}$
\end_inset
and
\begin_inset Formula $r<r_{1}$
\end_inset
are used to validate additional decodes that did not pass the first test.
A timeout is used to limit the algorithm's execution time if no acceptable
codeword is found in a reasonable number of trials,
\begin_inset Formula $T$
@ -1227,7 +1230,7 @@ If BM decoding was not successful, go to step 2.
\begin_layout Enumerate
Calculate the hard-decision Hamming distance
\begin_inset Formula $h$
\begin_inset Formula $X$
\end_inset
between the candidate codeword and the received symbols, the corresponding
@ -1244,7 +1247,7 @@ Calculate the hard-decision Hamming distance
\begin_inset Formula $u$
\end_inset
is the largest one encountered so far, preserve the previous value of
is the largest one encountered so far, preserve any previous value of
\begin_inset Formula $u_{1}$
\end_inset
@ -1261,7 +1264,7 @@ Calculate the hard-decision Hamming distance
\begin_layout Enumerate
If
\begin_inset Formula $h<h_{0}$
\begin_inset Formula $X<X_{0}$
\end_inset
and
@ -1290,7 +1293,7 @@ If
\end_inset
and
\begin_inset Formula $r<r_{1}$
\begin_inset Formula $r<r_{1},$
\end_inset
go to step 10.
@ -1301,11 +1304,7 @@ Otherwise, declare decoding failure and exit.
\end_layout
\begin_layout Enumerate
An acceptable codeword with
\begin_inset Formula $u_{max}>u_{0}$
\end_inset
has been found.
An acceptable codeword has been found.
Declare a successful decode and return this codeword.
\end_layout
@ -1316,7 +1315,7 @@ An acceptable codeword with
\begin_layout Standard
Inspiration for the FT decoding algorithm came from a number of sources,
particularly references
particularly references
\begin_inset CommandInset citation
LatexCommand cite
key "lhmg2010"
@ -1330,7 +1329,7 @@ key "lk2008"
\end_inset
and the textbook by Lin and Costello
and the textbook by Lin and Costello
\begin_inset CommandInset citation
LatexCommand cite
key "lc2004"
@ -1365,8 +1364,8 @@ key "ls2009"
is applied to higher-rate Reed-Solomon codes on a binary-input channel
with BPSK-modulated symbols.
Our 64-ary input channel with 64-FSK modulation required us to develop
unique methods for assigning erasure probabilities and for defining an
acceptance criteria to select the best codeword from the list of candidates.
unique methods for assigning erasure probabilities and for defining acceptance
criteria to select the best codeword from the list of candidates.
\end_layout
@ -1381,21 +1380,24 @@ Hinted Decoding
\end_layout
\begin_layout Standard
The FT algorithm is completely general: it recovers with equal sensitivity
The FT algorithm is completely general: with equal sensitivity it recovers
any one of the
\begin_inset Formula $2^{72}\approx4.7\times10^{21}$
\end_inset
different messages that can be transmitted using the JT65 protocol.
In many circumstances it's easy to imagine a much smaller list of messages
(say, a few thousand or less) that may be among the most likely ones to
be received.
For example, one such situation exists when making short ham-radio contacts
exchanging minimal amounts of information such as callsigns, signal reports,
perhaps a Maidenhead locator, and acknowledgments.
Similarly, on the EME path or on a VHF or UHF band with limited geographical
coverage, the most likely received messages will often originate from callsigns
that have been decoded before.
different messages that can be transmitted with the JT65 protocol.
In some circumstances it's easy to imagine a
\emph on
much
\emph default
smaller list of messages (say, a few thousand messages or fewer) that may
be among the most likely ones to be received.
One such situation exists when making short ham-radio contacts that exchange
minimal information including callsigns, signal reports, perhaps Maidenhead
locators, and acknowledgments.
On the EME path or on a VHF or UHF band with limited geographical coverage,
the most likely received messages often originate from callsigns that have
been decoded before.
Saving a list of previously decoded callsigns makes it easy to generate
lists of hypothetical messages and their corresponding codewords, at very
little computational expense.
@ -1420,13 +1422,14 @@ hinted decoding;
\begin_inset Quotes eld
\end_inset
Deep Search
deep search
\begin_inset Quotes erd
\end_inset
algorithm.
In certain limited situations it can provide enhanced sensitivity for the
principal task of any decoder, namely to determine what message was sent.
principal task of any decoder, namely to determine precisely what message
was sent.
\end_layout
\begin_layout Standard
@ -1459,7 +1462,8 @@ small enough
\begin_inset Quotes erd
\end_inset
for adequate confidence, while still ensuring that false decodes are rare.
to establish adequate confidence, while still ensuring that false decodes
are rare.
Because tested candidate codewords are drawn from a list typically no longer
than a few thousand, rather than
\begin_inset Formula $2^{72},$
@ -1469,22 +1473,26 @@ small enough
\begin_inset Formula $r_{2}$
\end_inset
can be a more relaxed limit than the ones
\begin_inset Formula $r_{0}$
can set a more relaxed limit than
\begin_inset Formula $r_{1},$
\end_inset
and
\begin_inset Formula $r_{1}$
as used in the FT algorithm.
For the limited subset of messages established by operator experience as
\begin_inset Quotes eld
\end_inset
used in the FT algorithm.
For the limited subset of messages considered as likely, hinted decodes
can be obtained at lower signal levels than would be required for decodes
selected from the full universe of
likely,
\begin_inset Quotes erd
\end_inset
hinted decodes can be obtained at lower signal levels than required for
decodes obtained from the full universe of
\begin_inset Formula $2^{72}$
\end_inset
distinct messages.
possible messages.
\end_layout
\begin_layout Section
@ -1497,10 +1505,6 @@ name "sec:Theory,-Simulation,-and"
Decoder Performance Evaluation
\end_layout
\begin_layout Subsection
Simulated results on the AWGN channel
\end_layout
\begin_layout Standard
Comparisons of decoding performance are usually presented in the professional
literature as plots of word error rate versus
@ -1514,8 +1518,8 @@ Comparisons of decoding performance are usually presented in the professional
.
For weak-signal amateur radio work, performance is more conveniently presented
as the probability of successfully decoding a received word versus signal-to-no
ise ratio in a 2500 Hz reference bandwidth,
as the probability of successfully decoding a received word plotted against
signal-to-noise ratio in a 2500 Hz reference bandwidth,
\begin_inset Formula $\mathrm{SNR}{}_{2500}$
\end_inset
@ -1536,12 +1540,36 @@ reference "sec:Appendix:SNR"
\end_inset
.
Examples of both types of plot are included in the following discussion,
where we describe a number of simulations carried out to compare performance
of the FT algorithm with others, and with theoretical expectations.
We have also used simulations to establish suitable default values for
the acceptance parameters
\begin_inset Formula $X_{0},$
\end_inset
\begin_inset Formula $d_{0},$
\end_inset
\begin_inset Formula $d_{1},$
\end_inset
and
\begin_inset Formula $r_{1}.$
\end_inset
\end_layout
\begin_layout Subsection
Simulated results on the AWGN channel
\end_layout
\begin_layout Standard
Results of simulations using the BM, FT, and KV decoding algorithms on the
JT65 (63,12) code are presented in terms of word error-rate vs
JT65 code are presented in terms of word error rate versus
\begin_inset Formula $E_{b}/N_{0}$
\end_inset
@ -1556,9 +1584,9 @@ reference "fig:bodide"
For these tests we generated at least 1000 signals at each signal-to-noise
ratio, assuming the additive white Gaussian noise (AWGN) channel, and processed
the data using each algorithm.
For word error-rates less than 0.1 it was necessary to process 10,000 or
For word error rates less than 0.1 it was necessary to process 10,000 or
even 100,000 simulated signals in order to capture enough errors to make
the estimates of word-error-rate statistically meaningful.
the measurements statistically meaningful.
As a test of the fidelity of our numerical simulations, Figure
\begin_inset CommandInset ref
LatexCommand ref
@ -1566,8 +1594,7 @@ reference "fig:bodide"
\end_inset
also shows theoretical results (filled squares) for comparison with the
BM results.
also shows theoretical results for comparison with the BM results.
The simulated BM results agree with theory to within about 0.1 dB.
This difference between simulated BM results and theory is caused by small
errors in the estimates of time- and frequency-offset of the received signal
@ -1628,29 +1655,23 @@ Word error rates as a function of
\begin_inset Formula $E_{b}/N_{0},$
\end_inset
the signal-to-noise ratio per bit.
The single curve marked with filled squares shows a theoretical prediction
for the BM decoder.
Open squares illustrate simulation results for an AWGN channel with the
BM, FT (
\begin_inset Formula $T=10^{5}$
\end_inset
) and KV (
the signal-to-noise ratio per information bit.
Theory: theoretical prediction for the hard-decision BM decoder.
The remaining curves represent simulation results on an AWGN channel for
the BM, KV, and FT decoders.
The KV algorithm was executed with complexity coefficient
\begin_inset Formula $\lambda=15$
\end_inset
) decoders used in program
, the most aggressive setting historically used in the
\emph on
WSJT-X
WSJT
\emph default
.
The KV results are for decoding complexity coefficient
\begin_inset Formula $\lambda=15$
programs.
The FT algorithm was run with timeout setting
\begin_inset Formula $T=10^{5}.$
\end_inset
, the most aggressive setting that has historically been used in earlier
versions of the WSJT programs.
\end_layout
@ -1702,15 +1723,15 @@ reference "fig:bodide"
\end_inset
in this format along with additional FT results for
\begin_inset Formula $T=10^{4},10^{3},10^{2}$
\begin_inset Formula $T=10^{4},\:10^{3},\:10^{2}$
\end_inset
and
\begin_inset Formula $10^{1}$
\begin_inset Formula $10$
\end_inset
.
The KV results are plotted with open triangles.
The KV results are plotted with open squares.
It is apparent that the FT decoder produces more decodes than KV when
\begin_inset Formula $T=10^{4}$
\end_inset
@ -1747,24 +1768,19 @@ name "fig:WER2"
\end_inset
Percent of JT65 messages copied as a function of SNR in 2.5 kHz bandwidth.
Solid lines with filled round circles are results from the FT decoder with
Percent of JT65 messages copied as a function of SNR in 2500 Hz bandwidth.
Solid lines with filled circles are results from the FT decoder; numbers
adjacent to the curves specify values of the timeout parameter
\begin_inset Formula $T.$
\end_inset
The dotted line with open squares is the KV decoder with complexity coefficient
\begin_inset Formula $T=10^{5},10^{4},10^{3},10^{2}$
\end_inset
and
\begin_inset Formula $10$
\end_inset
, respectively, from left to right.
The dashed line with open triangles is the KV decoder with complexity coefficie
nt
\begin_inset Formula $\lambda=15$
\end_inset
.
Results from the BM algorithm are also shown with filled triangles.
Results from the BM algorithm are shown with a dashed line and crosses.
\end_layout
\end_inset
@ -1809,7 +1825,7 @@ reference "fig:N_vs_X"
\begin_inset Formula $X\le25$
\end_inset
because all such words were successfully decoded by the BM algorithm.
because all such words are successfully decoded by the BM algorithm.
Figure
\begin_inset CommandInset ref
LatexCommand ref
@ -1826,8 +1842,8 @@ reference "fig:N_vs_X"
with the number of errors in the received word.
The variability of the decoding time also increases dramatically with the
number of errors in the received word.
These results also provide insight into the mean and variance of the execution
time for the FT algorithm, as execution time will be roughly proportional
These results provide insight into the mean and variance of the execution
time for the FT algorithm, since execution time will be roughly proportional
to the number of required trials.
\end_layout
@ -1859,13 +1875,21 @@ name "fig:N_vs_X"
\end_inset
Number of trials needed to decode a received word versus Hamming distance
\begin_inset Formula $X$
\end_inset
between the received word and the decoded codeword, for 1000 simulated
frames on an AWGN channel with no fading.
The SNR in 2500 Hz bandwidth is -24 dB (
The SNR in 2500 Hz bandwidth is
\begin_inset Formula $-24$
\end_inset
dB, which corresponds to
\begin_inset Formula $E_{b}/N_{0}=5.1$
\end_inset
dB).
dB.
\end_layout
@ -1880,7 +1904,7 @@ Number of trials needed to decode a received word versus Hamming distance
\end_layout
\begin_layout Subsection
Simulated results for hinted decoding and Rayleigh fading
Simulated results for Rayleigh fading and hinted decoding
\end_layout
\begin_layout Standard
@ -1904,9 +1928,11 @@ reference "fig:Psuccess"
We include three curves for each decoding algorithm: one for the AWGN channel
and no fading, and two more for simulated Doppler spreads of 0.2 and 1.0
Hz.
For reference, we note that the JT65 symbol rate is about 2.69 Hz.
The simulated Doppler spreads are comparable to those encountered on HF
ionospheric paths and for EME at VHF and lower UHF bands.
For reference, we note that the JT65 symbol rate is about 2.69 Hz.
(*** A little more description of hinted decoding is needed here, and new
data for the DS curves.***)
\end_layout
\begin_layout Standard
@ -1948,7 +1974,14 @@ Deep Search
\begin_inset Quotes erd
\end_inset
) matched-filter algorithm.
) algorithm.
Numbers adjacent to the curves are the simulated Doppler spreads in Hz.
The curve labeled Sync illustrates the dependence of proper time and frequency
synchronization in the decoder presently implemented in
\emph on
WSJT-X
\emph default
.
\end_layout
\end_inset