Another set of additions to the paper, both text and figures.

git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@6352 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
2016-01-05 20:13:13 +00:00 · 0f62680559
parent 3576e0b868
commit 0f62680559
1 changed file with 156 additions and 123 deletions
--- a/lib/ftrsd/ftrsd_paper/ftrsd.lyx
+++ b/lib/ftrsd/ftrsd_paper/ftrsd.lyx
@@ -126,21 +126,18 @@ A major reason for the success and popularity of JT65 is its use of a strong
 error-correction code: a short block-length, low-rate Reed-Solomon code
 based on a 64-symbol alphabet.
 Until now, nearly all programs implementing JT65 have used the patented
- Koetter-Vardy (KV) algebraic soft-decision decoder
+ Koetter-Vardy (KV) algebraic soft-decision decoder 
 \begin_inset CommandInset citation
 LatexCommand cite
 key "kv2001"

 \end_inset

-, as licensed to K1JT and implemented in a closed-source program for use
- only in amateur radio applications.
+, licensed to and implemented by K1JT in a closed-source executable for
+ use only in amateur radio applications.
 Since 2001 the KV decoder has been considered the best available soft-decision
 decoder for Reed-Solomon codes.
-\end_layout
-
-\begin_layout Standard
-We describe here a new open-source alternative called the Franke-Taylor
+ We describe here a new open-source alternative called the Franke-Taylor
 (FT, or K9AN-K1JT) algorithm.
 It is conceptually simple, built around the well-known Berlekamp-Massey
 errors-and-erasures algorithm, and in this application it performs even
@@ -149,8 +146,8 @@ We describe here a new open-source alternative called the Franke-Taylor
 \emph on
 WSJT-X
 \emph default
-, widely used for amateur weak-signal communication with JT65 and several
- other specialized digital modes.
+, widely used for amateur weak-signal communication with JT65 and other
+ specialized digital modes.
 The program is freely available and licensed under the GNU General Public
 License.
 \end_layout
@@ -160,22 +157,22 @@ The JT65 protocol specifies transmissions that normally start one second
 into a UTC minute and last for 46.8 seconds.
 Receiving software therefore has up to several seconds to decode a message,
 before the operator sends a reply at the start of the next minute.
- With today's personal computers, this relatively long time for decoding
- a short message encourages experimentation with decoders of high computational
- complexity.
- As a result, on a typical fading channel the FT algorithm extends the decoding
- threshold by many dB over the hard-decision Berlekamp-Massey decoder, and
- by a meaningful amount over the KV decoder.
+ With today's personal computers, this relatively long time available for
+ decoding a short message encourages experimentation with decoders of high
+ computational complexity.
+ As a result, on a typical fading channel the FT algorithm can extend the
+ decoding threshold by many dB over the hard-decision Berlekamp-Massey decoder,
+ and by a meaningful amount over the KV decoder.
 In addition to its excellent performance, the new algorithm has other desirable
- properties---not the least of which is its conceptual simplicity.
+ properties, not least of which is its conceptual simplicity.
 Decoding performance and complexity scale in a convenient way, providing
 steadily increasing soft-decision decoding gain as a tunable computational
 complexity parameter is increased over more than 5 orders of magnitude.
- This means that appreciable gain is available from our decoder even on
- very simple (and relatively slow) computers.
+ Appreciable gain is available from our decoder even on very simple (and
+ relatively slow) computers.
 On the other hand, because the algorithm benefits from a large number of
- independent decoding trials, it should be possible to obtain further performanc
-e gains through parallelization on high-performance computers.
+ independent decoding trials, further performance gains should be achievable
+ through parallelization on high-performance computers.
 \end_layout

 \begin_layout Section
@@ -943,7 +940,7 @@ Here
 \end_inset

 if the received symbol and codeword symbol are different, and 
-\begin_inset Formula $p_{1\,j}$
+\begin_inset Formula $p_{1,\,j}$
 \end_inset

 is the fractional power associated with received symbol 
@@ -965,12 +962,7 @@ In practice we find that
 \end_inset

 can reliably identify the correct codeword if the signal-to-noise ratio
- for individual symbols is greater than about 4 in linear power units, or
- 
-\begin_inset Formula $E_{s}/N_{0}\apprge6$
-\end_inset
-
- dB (*** check these numbers ***).
+ for individual symbols is greater than about 4 in linear power units.
 We also find that significantly weaker signals can be decoded by using
 soft-symbol information beyond that contained in 
 \begin_inset Formula $p_{1}$
@@ -1117,7 +1109,7 @@ est metrics

 will likely be close to 1.
 We therefore apply a ratio threshold test, say 
-\begin_inset Formula $r<r_{0}$
+\begin_inset Formula $r<r_{1}$
 \end_inset

 , to identify codewords with high probability of being correct.
@@ -1128,7 +1120,7 @@ reference "sec:Theory,-Simulation,-and"

 \end_inset

-, we have used simulations to set an empirical acceptance threshold 
+, we use simulations to set an empirical acceptance threshold 
 \begin_inset Formula $r_{0}$
 \end_inset

@@ -1145,21 +1137,32 @@ Technically the FT algorithm is a list decoder.

 is retained.
 As with all such algorithms, a stopping criterion is necessary.
- FT accepts a codeword unconditionally if the Hamming distance and soft
- distance 
+ FT accepts a codeword unconditionally if the Hamming distance 
+\begin_inset Formula $X$
+\end_inset
+
+ and soft distance 
 \begin_inset Formula $d_{s}$
 \end_inset

- are less than some conservatively specified limits.
- Secondary acceptance criteria 
-\begin_inset Formula $d_{s}<d_{0}$
+ are less than conservatively specified limits 
+\begin_inset Formula $X_{0}$
 \end_inset

 and 
-\begin_inset Formula $r<r_{0}$
+\begin_inset Formula $d_{0}$
 \end_inset

- are used to validate additional decodes.
+.
+ Secondary acceptance criteria 
+\begin_inset Formula $d_{s}<d_{1}$
+\end_inset
+
+ and 
+\begin_inset Formula $r<r_{1}$
+\end_inset
+
+ are used to validate additional decodes that did not pass the first test.
 A timeout is used to limit the algorithm's execution time if no acceptable
 codeword is found in a reasonable number of trials, 
 \begin_inset Formula $T$
@@ -1227,7 +1230,7 @@ If BM decoding was not successful, go to step 2.

 \begin_layout Enumerate
 Calculate the hard-decision Hamming distance 
-\begin_inset Formula $h$
+\begin_inset Formula $X$
 \end_inset

 between the candidate codeword and the received symbols, the corresponding
@@ -1244,7 +1247,7 @@ Calculate the hard-decision Hamming distance
 \begin_inset Formula $u$
 \end_inset

- is the largest one encountered so far, preserve the previous value of 
+ is the largest one encountered so far, preserve any previous value of 
 \begin_inset Formula $u_{1}$
 \end_inset

@@ -1261,7 +1264,7 @@ Calculate the hard-decision Hamming distance

 \begin_layout Enumerate
 If 
-\begin_inset Formula $h<h_{0}$
+\begin_inset Formula $X<X_{0}$
 \end_inset

 and 
@@ -1290,7 +1293,7 @@ If
 \end_inset

 and 
-\begin_inset Formula $r<r_{1}$
+\begin_inset Formula $r<r_{1},$
 \end_inset

 go to step 10.
@@ -1301,11 +1304,7 @@ Otherwise, declare decoding failure and exit.
 \end_layout

 \begin_layout Enumerate
-An acceptable codeword with 
-\begin_inset Formula $u_{max}>u_{0}$
-\end_inset
-
- has been found.
+An acceptable codeword has been found.
 Declare a successful decode and return this codeword.
 \end_layout
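
Reviewer note (not part of the patch): the stopping rule that these reworded steps describe can be sketched as below. This is a minimal illustration under stated assumptions, not the WSJT-X implementation. The names `Limits` and `classify` are hypothetical, the numeric thresholds in the usage lines are placeholders rather than the paper's tuned values, and the ordering of the tests (primary test, then timeout check, then secondary test) is inferred from the surrounding text rather than quoted from it. The ratio `r` is taken as an input because its definition lies outside this diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Acceptance limits; field names mirror the paper's symbols (hypothetical container)."""
    x0: int      # limit on hard-decision Hamming distance X
    d0: float    # limit on soft distance d_s for unconditional acceptance
    d1: float    # relaxed soft-distance limit for the secondary test
    r1: float    # limit on the ratio r for the secondary test


def classify(x: int, ds: float, r: float, trials_done: int, timeout: int,
             lim: Limits) -> str:
    """Return 'accept', 'continue', or 'fail' for the best candidate so far."""
    # Primary test: accept unconditionally when both distances are small.
    if x < lim.x0 and ds < lim.d0:
        return "accept"
    # Keep trying new erasure patterns until the trial budget T is spent.
    if trials_done < timeout:
        return "continue"
    # Secondary test: validate decodes that did not pass the first test.
    if ds < lim.d1 and r < lim.r1:
        return "accept"
    return "fail"


# Placeholder thresholds chosen only to exercise each branch.
demo = Limits(x0=40, d0=60.0, d1=80.0, r1=0.7)
print(classify(30, 50.0, 0.9, 1, 100, demo))     # primary test passes
print(classify(45, 70.0, 0.5, 100, 100, demo))   # secondary test passes
```

Keeping the limits in one immutable record makes it straightforward to sweep them in simulation, which is how the text says the default values were chosen.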

@@ -1316,7 +1315,7 @@ An acceptable codeword with

 \begin_layout Standard
 Inspiration for the FT decoding algorithm came from a number of sources,
- particularly references
+ particularly references 
 \begin_inset CommandInset citation
 LatexCommand cite
 key "lhmg2010"
@@ -1330,7 +1329,7 @@ key "lk2008"

 \end_inset

- and the textbook by Lin and Costello
+ and the textbook by Lin and Costello 
 \begin_inset CommandInset citation
 LatexCommand cite
 key "lc2004"
@@ -1365,8 +1364,8 @@ key "ls2009"
 is applied to higher-rate Reed-Solomon codes on a binary-input channel
 with BPSK-modulated symbols.
 Our 64-ary input channel with 64-FSK modulation required us to develop
- unique methods for assigning erasure probabilities and for defining an
- acceptance criteria to select the best codeword from the list of candidates.
+ unique methods for assigning erasure probabilities and for defining acceptance
+ criteria to select the best codeword from the list of candidates.
 
 \end_layout

@@ -1381,21 +1380,24 @@ Hinted Decoding
 \end_layout

 \begin_layout Standard
-The FT algorithm is completely general: it recovers with equal sensitivity
+The FT algorithm is completely general: with equal sensitivity it recovers
 any one of the 
 \begin_inset Formula $2^{72}\approx4.7\times10^{21}$
 \end_inset

- different messages that can be transmitted using the JT65 protocol.
- In many circumstances it's easy to imagine a much smaller list of messages
- (say, a few thousand or less) that may be among the most likely ones to
- be received.
- For example, one such situation exists when making short ham-radio contacts
- exchanging minimal amounts of information such as callsigns, signal reports,
- perhaps a Maidenhead locator, and acknowledgments.
- Similarly, on the EME path or on a VHF or UHF band with limited geographical
- coverage, the most likely received messages will often originate from callsigns
- that have been decoded before.
+ different messages that can be transmitted with the JT65 protocol.
+ In some circumstances it's easy to imagine a 
+\emph on
+much
+\emph default
+ smaller list of messages (say, a few thousand messages or fewer) that may
+ be among the most likely ones to be received.
+ One such situation exists when making short ham-radio contacts that exchange
+ minimal information including callsigns, signal reports, perhaps Maidenhead
+ locators, and acknowledgments.
+ On the EME path or on a VHF or UHF band with limited geographical coverage,
+ the most likely received messages often originate from callsigns that have
+ been decoded before.
 Saving a list of previously decoded callsigns makes it easy to generate
 lists of hypothetical messages and their corresponding codewords, at very
 little computational expense.
@@ -1420,13 +1422,14 @@ hinted decoding;
 \begin_inset Quotes eld
 \end_inset

-Deep Search
+deep search
 \begin_inset Quotes erd
 \end_inset

 algorithm.
 In certain limited situations it can provide enhanced sensitivity for the
- principal task of any decoder, namely to determine what message was sent.
+ principal task of any decoder, namely to determine precisely what message
+ was sent.
 \end_layout

 \begin_layout Standard
@@ -1459,7 +1462,8 @@ small enough
 \begin_inset Quotes erd
 \end_inset

- for adequate confidence, while still ensuring that false decodes are rare.
+ to establish adequate confidence, while still ensuring that false decodes
+ are rare.
 Because tested candidate codewords are drawn from a list typically no longer
 than a few thousand, rather than 
 \begin_inset Formula $2^{72},$
@@ -1469,22 +1473,26 @@ small enough
 \begin_inset Formula $r_{2}$
 \end_inset

- can be a more relaxed limit than the ones 
-\begin_inset Formula $r_{0}$
+ can set a more relaxed limit than 
+\begin_inset Formula $r_{1},$
 \end_inset

- and 
-\begin_inset Formula $r_{1}$
+ as used in the FT algorithm.
+ For the limited subset of messages established by operator experience as
+ 
+\begin_inset Quotes eld
 \end_inset

- used in the FT algorithm.
- For the limited subset of messages considered as likely, hinted decodes
- can be obtained at lower signal levels than would be required for decodes
- selected from the full universe of 
+likely,
+\begin_inset Quotes erd
+\end_inset
+
+ hinted decodes can be obtained at lower signal levels than required for
+ decodes obtained from the full universe of 
 \begin_inset Formula $2^{72}$
 \end_inset

-distinct messages.
+ possible messages.
 \end_layout

 \begin_layout Section
@@ -1497,10 +1505,6 @@ name "sec:Theory,-Simulation,-and"
 Decoder Performance Evaluation
 \end_layout

-\begin_layout Subsection
-Simulated results on the AWGN channel
-\end_layout
-
 \begin_layout Standard
 Comparisons of decoding performance are usually presented in the professional
 literature as plots of word error rate versus 
@@ -1514,8 +1518,8 @@ Comparisons of decoding performance are usually presented in the professional

 .
 For weak-signal amateur radio work, performance is more conveniently presented
- as the probability of successfully decoding a received word versus signal-to-no
-ise ratio in a 2500 Hz reference bandwidth, 
+ as the probability of successfully decoding a received word plotted against
+ signal-to-noise ratio in a 2500 Hz reference bandwidth, 
 \begin_inset Formula $\mathrm{SNR}{}_{2500}$
 \end_inset

@@ -1536,12 +1540,36 @@ reference "sec:Appendix:SNR"
 \end_inset

 .
+ Examples of both types of plot are included in the following discussion,
+ where we describe a number of simulations carried out to compare performance
+ of the FT algorithm with others, and with theoretical expectations.
+ We have also used simulations to establish suitable default values for
+ the acceptance parameters 
+\begin_inset Formula $X_{0},$
+\end_inset
+
 
+\begin_inset Formula $d_{0},$
+\end_inset
+
+ 
+\begin_inset Formula $d_{1},$
+\end_inset
+
+ and 
+\begin_inset Formula $r_{1}.$
+\end_inset
+
+
+\end_layout
+
+\begin_layout Subsection
+Simulated results on the AWGN channel
 \end_layout

 \begin_layout Standard
 Results of simulations using the BM, FT, and KV decoding algorithms on the
- JT65 (63,12) code are presented in terms of word error-rate vs 
+ JT65 code are presented in terms of word error rate versus 
 \begin_inset Formula $E_{b}/N_{0}$
 \end_inset

@@ -1556,9 +1584,9 @@ reference "fig:bodide"
 For these tests we generated at least 1000 signals at each signal-to-noise
 ratio, assuming the additive white Gaussian noise (AWGN) channel, and processed
 the data using each algorithm.
- For word error-rates less than 0.1 it was necessary to process 10,000 or
+ For word error rates less than 0.1 it was necessary to process 10,000 or
 even 100,000 simulated signals in order to capture enough errors to make
- the estimates of word-error-rate statistically meaningful.
+ the measurements statistically meaningful.
 As a test of the fidelity of our numerical simulations, Figure 
 \begin_inset CommandInset ref
 LatexCommand ref
@@ -1566,8 +1594,7 @@ reference "fig:bodide"

 \end_inset

- also shows theoretical results (filled squares) for comparison with the
- BM results.
+ also shows theoretical results for comparison with the BM results.
 The simulated BM results agree with theory to within about 0.1 dB.
 This difference between simulated BM results and theory is caused by small
 errors in the estimates of time- and frequency-offset of the received signal
@@ -1628,29 +1655,23 @@ Word error rates as a function of
 \begin_inset Formula $E_{b}/N_{0},$
 \end_inset

- the signal-to-noise ratio per bit.
- The single curve marked with filled squares shows a theoretical prediction
- for the BM decoder.
- Open squares illustrate simulation results for an AWGN channel with the
- BM, FT (
-\begin_inset Formula $T=10^{5}$
-\end_inset
-
-) and KV (
+ the signal-to-noise ratio per information bit.
+ Theory: theoretical prediction for the hard-decision BM decoder.
+ The remaining curves represent simulation results on an AWGN channel for
+ the BM, KV, and FT decoders.
+ The KV algorithm was executed with complexity coefficient 
 \begin_inset Formula $\lambda=15$
 \end_inset

-) decoders used in program 
+, the most aggressive setting historically used in the 
 \emph on
-WSJT-X
+WSJT
 \emph default
-.
- The KV results are for decoding complexity coefficient 
-\begin_inset Formula $\lambda=15$
+ programs.
+ The FT algorithm was run with timeout setting 
+\begin_inset Formula $T=10^{5}.$
 \end_inset

-, the most aggressive setting that has historically been used in earlier
- versions of the WSJT programs.
 
 \end_layout

@@ -1702,15 +1723,15 @@ reference "fig:bodide"
 \end_inset

 in this format along with additional FT results for 
-\begin_inset Formula $T=10^{4},10^{3},10^{2}$
+\begin_inset Formula $T=10^{4},\:10^{3},\:10^{2}$
 \end_inset

 and 
-\begin_inset Formula $10^{1}$
+\begin_inset Formula $10$
 \end_inset

 .
- The KV results are plotted with open triangles.
+ The KV results are plotted with open squares.
 It is apparent that the FT decoder produces more decodes than KV when 
 \begin_inset Formula $T=10^{4}$
 \end_inset
@@ -1747,24 +1768,19 @@ name "fig:WER2"

 \end_inset

-Percent of JT65 messages copied as a function of SNR in 2.5 kHz bandwidth.
- Solid lines with filled round circles are results from the FT decoder with
+Percent of JT65 messages copied as a function of SNR in 2500 Hz bandwidth.
+ Solid lines with filled circles are results from the FT decoder; numbers
+ adjacent to the curves specify values of the timeout parameter 
+\begin_inset Formula $T.$
+\end_inset
+
+ The dotted line with open squares is the KV decoder with complexity coefficient
 
-\begin_inset Formula $T=10^{5},10^{4},10^{3},10^{2}$
-\end_inset
-
- and 
-\begin_inset Formula $10$
-\end_inset
-
-, respectively, from left to right.
- The dashed line with open triangles is the KV decoder with complexity coefficie
-nt 
 \begin_inset Formula $\lambda=15$
 \end_inset

 .
- Results from the BM algorithm are also shown with filled triangles.
+ Results from the BM algorithm are shown with a dashed line and crosses.
 \end_layout

 \end_inset
@@ -1809,7 +1825,7 @@ reference "fig:N_vs_X"
 \begin_inset Formula $X\le25$
 \end_inset

- because all such words were successfully decoded by the BM algorithm.
+ because all such words are successfully decoded by the BM algorithm.
 Figure 
 \begin_inset CommandInset ref
 LatexCommand ref
@@ -1826,8 +1842,8 @@ reference "fig:N_vs_X"
 with the number of errors in the received word.
 The variability of the decoding time also increases dramatically with the
 number of errors in the received word.
- These results also provide insight into the mean and variance of the execution
- time for the FT algorithm, as execution time will be roughly proportional
+ These results provide insight into the mean and variance of the execution
+ time for the FT algorithm, since execution time will be roughly proportional
 to the number of required trials.
 \end_layout

@@ -1859,13 +1875,21 @@ name "fig:N_vs_X"
 \end_inset

 Number of trials needed to decode a received word versus Hamming distance
+ 
+\begin_inset Formula $X$
+\end_inset
+
 between the received word and the decoded codeword, for 1000 simulated
 frames on an AWGN channel with no fading.
- The SNR in 2500 Hz bandwidth is -24 dB (
+ The SNR in 2500 Hz bandwidth is 
+\begin_inset Formula $-24$
+\end_inset
+
+ dB, which corresponds to 
 \begin_inset Formula $E_{b}/N_{0}=5.1$
 \end_inset

- dB).
+ dB.
 
 \end_layout
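
Reviewer note (not part of the patch): the caption's pairing of -24 dB in 2500 Hz with an E_b/N_0 of 5.1 dB can be checked with a short calculation. The paper's own conversion is in an appendix that this diff does not show, so the sketch below rests on assumptions: constant signal power, so that the energy per symbol is the power divided by the symbol rate; a symbol rate of 2.69 Hz as quoted in the text; and 72 information bits carried by the 63 data symbols of the (63,12) code. The function name is hypothetical.

```python
import math

REF_BANDWIDTH_HZ = 2500.0   # reference bandwidth used for SNR in the paper
SYMBOL_RATE_HZ = 2.69       # approximate JT65 symbol rate quoted in the text
DATA_SYMBOLS = 63           # channel symbols per codeword, (63,12) code
INFO_BITS = 72              # 12 information symbols x 6 bits each


def ebn0_db_from_snr2500(snr_db: float) -> float:
    """Convert SNR in a 2500 Hz bandwidth (dB) to E_b/N_0 (dB), under the stated assumptions."""
    # E_s/N_0 = SNR * (B / R_s): noise power is N_0 * B, symbol energy is P / R_s.
    es_n0_db = snr_db + 10.0 * math.log10(REF_BANDWIDTH_HZ / SYMBOL_RATE_HZ)
    # E_b = E_s * (data symbols / information bits) for one codeword.
    return es_n0_db + 10.0 * math.log10(DATA_SYMBOLS / INFO_BITS)


print(round(ebn0_db_from_snr2500(-24.0), 1))  # 5.1
```

Under these assumptions the offset between the two scales is about 29.1 dB, which reproduces the caption's figure.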

@@ -1880,7 +1904,7 @@ Number of trials needed to decode a received word versus Hamming distance
 \end_layout

 \begin_layout Subsection
-Simulated results for hinted decoding and Rayleigh fading
+Simulated results for Rayleigh fading and hinted decoding
 \end_layout

 \begin_layout Standard
@@ -1904,9 +1928,11 @@ reference "fig:Psuccess"
 We include three curves for each decoding algorithm: one for the AWGN channel
 and no fading, and two more for simulated Doppler spreads of 0.2 and 1.0
 Hz.
- For reference, we note that the JT65 symbol rate is about 2.69 Hz.
 The simulated Doppler spreads are comparable to those encountered on HF
 ionospheric paths and for EME at VHF and lower UHF bands.
+ For reference, we note that the JT65 symbol rate is about 2.69 Hz.
+ (*** A little more description of hinted decoding is needed here, and new
+ data for the DS curves.***)
 \end_layout

 \begin_layout Standard
@@ -1948,7 +1974,14 @@ Deep Search
 \begin_inset Quotes erd
 \end_inset

-) matched-filter algorithm.
+) algorithm.
+ Numbers adjacent to the curves are the simulated Doppler spreads in Hz.
+ The curve labeled Sync illustrates the dependence on SNR of proper time and
+ frequency synchronization in the decoder presently implemented in 
+\emph on
+WSJT-X
+\emph default
+.
 \end_layout

 \end_inset