Add separate folder sfrsd_paper to hold the whole project.

git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@6185 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
2025-08-20 14:32:28 -04:00 · 2015-11-26 03:55:56 +00:00 · 2015-11-26 03:55:56 +00:00 · 331fb62e23
commit 331fb62e23
parent 167b21ce7f
1 changed files with 755 additions and 0 deletions
--- a/lib/sfrsd2/sfrsd_paper/sfrsd.lyx
+++ b/lib/sfrsd2/sfrsd_paper/sfrsd.lyx
@@ -0,0 +1,755 @@
+#LyX 2.1 created this file. For more info see http://www.lyx.org/
+\lyxformat 474
+\begin_document
+\begin_header
+\textclass IEEEtran
+\use_default_options true
+\maintain_unincluded_children false
+\language english
+\language_package default
+\inputencoding auto
+\fontencoding global
+\font_roman default
+\font_sans default
+\font_typewriter default
+\font_math auto
+\font_default_family default
+\use_non_tex_fonts false
+\font_sc false
+\font_osf false
+\font_sf_scale 100
+\font_tt_scale 100
+\graphics default
+\default_output_format default
+\output_sync 0
+\bibtex_command default
+\index_command default
+\paperfontsize default
+\spacing single
+\use_hyperref false
+\papersize default
+\use_geometry false
+\use_package amsmath 1
+\use_package amssymb 1
+\use_package cancel 1
+\use_package esint 1
+\use_package mathdots 1
+\use_package mathtools 1
+\use_package mhchem 1
+\use_package stackrel 1
+\use_package stmaryrd 1
+\use_package undertilde 1
+\cite_engine basic
+\cite_engine_type default
+\biblio_style plain
+\use_bibtopic false
+\use_indices false
+\paperorientation portrait
+\suppress_date false
+\justification true
+\use_refstyle 1
+\index Index
+\shortcut idx
+\color #008000
+\end_index
+\secnumdepth 3
+\tocdepth 3
+\paragraph_separation indent
+\paragraph_indentation default
+\quotes_language english
+\papercolumns 1
+\papersides 1
+\paperpagestyle default
+\tracking_changes false
+\output_changes false
+\html_math_output 0
+\html_css_as_file 0
+\html_be_strict false
+\end_header
+
+\begin_body
+
+\begin_layout Title
+A stochastic successive erasures soft-decision decoder for the JT65 (63,12)
+ Reed-Solomon code
+\end_layout
+
+\begin_layout Author
+Steven J.
+ Franke, K9AN and Joseph H.
+ Taylor, K1JT
+\end_layout
+
+\begin_layout Abstract
+The JT65 mode has revolutionized amateur-radio weak-signal communication
+ by enabling amateur radio operators with small antennas and relatively
+ low-power transmitters to communicate over propagation paths that could
+ not be utilized using traditional technologies.
+ One reason for the success and popularity of the JT65 mode is its use of
+ strong error-correction coding.
+ The JT65 code is a short block-length, low-rate, Reed-Solomon code based
+ on a 64-symbol alphabet.
+ Since 200?, decoders for the JT65 code have used the 
+\begin_inset Quotes eld
+\end_inset
+
+Koetter-Vardy
+\begin_inset Quotes erd
+\end_inset
+
+ (KV) algebraic soft-decision decoder.
+ The KV decoder is implemented in a closed-source program that is licensed
+ to K1JT for use in amateur applications.
+ This note describes a new open-source alternative to the KV decoder called
+ the SFRSD decoder.
+ The SFRSD decoding algorithm is shown to perform at least as well as the
+ KV decoder.
+ The SFRSD algorithm is conceptually simple and is built around the well-known
+ Berlekamp-Massey errors-and-erasures decoder.
+ 
+\end_layout
+
+\begin_layout Standard
+JT65 message frames consist of a short, compressed, message that is encoded
+ for transmission using a Reed-Solomon code.
+ Reed-Solomon codes are block codes and, like all block codes, are characterized
+ by the length of their codewords, 
+\begin_inset Formula $n$
+\end_inset
+
+, the number of message symbols conveyed by the codeword, 
+\begin_inset Formula $k$
+\end_inset
+
+, and the number of possible values for each symbol in the codewords.
+ The codeword length and the number of message symbols are specified as
+ a tuple in the form 
+\begin_inset Formula $(n,k)$
+\end_inset
+
+.
+ JT65 uses a (63,12) Reed-Solomon code with 64 possible values for each
+ symbol, so each symbol represents 
+\begin_inset Formula $\log_{2}64=6$
+\end_inset
+
+ message bits.
+ The source-encoded messages conveyed by a 63-symbol JT65 frame consist
+ of 72 bits.
+ The JT65 code is systematic, which means that the 12 message symbols are
+ embedded in the codeword without modification and another 51 parity symbols
+ derived from the message symbols are added to form the codeword consisting
+ of 63 total symbols.
+ 
+\end_layout
+
+\begin_layout Standard
+The concept of Hamming distance is used as a measure of 
+\begin_inset Quotes eld
+\end_inset
+
+distance
+\begin_inset Quotes erd
+\end_inset
+
+ between different codewords, or between a received word and a codeword.
+ Hamming distance is the number of code symbols that differ in the two words
+ that are being compared.
+ Reed-Solomon codes have minimum Hamming distance 
+\begin_inset Formula $d$
+\end_inset
+
+, where 
+\begin_inset Formula 
+\begin{equation}
+d=n-k+1.\label{eq:minimum_distance}
+\end{equation}
+
+\end_inset
+
+The minimum Hamming distance of the JT65 code is 
+\begin_inset Formula $d=52$
+\end_inset
+
+, which means that any particular codeword differs from all other codewords
+ in at least 52 positions.
+ 
+\end_layout
+
+\begin_layout Standard
+Given only a received word containing some incorrect symbols (errors), the
+ received word can be decoded into the correct codeword using a deterministic,
+ algebraic, algorithm provided that no more than 
+\begin_inset Formula $t$
+\end_inset
+
+ symbols were received incorrectly, where
+\begin_inset Formula 
+\begin{equation}
+t=\left\lfloor \frac{n-k}{2}\right\rfloor .\label{eq:t}
+\end{equation}
+
+\end_inset
+
+For the JT65 code, 
+\begin_inset Formula $t=25$
+\end_inset
+
+, which means that it is always possible to efficiently decode a received
+ word that contains no more than 25 symbol errors.
+ 
+\end_layout
+
+\begin_layout Standard
+There are a number of well-known algebraic algorithms that can carry out
+ the process of decoding a received word that contains no more than
+ 
+\begin_inset Formula $t$
+\end_inset
+
+ errors.
+ One such algorithm is the Berlekamp-Massey (BM) decoding algorithm.
+\end_layout
+
+\begin_layout Standard
+A decoder, such as BM, must carry out two tasks: 
+\end_layout
+
+\begin_layout Enumerate
+figure out which symbols were received incorrectly 
+\end_layout
+
+\begin_layout Enumerate
+figure out the correct value of the incorrect symbols 
+\end_layout
+
+\begin_layout Standard
+If it is somehow known that certain symbols are incorrect, such information
+ can be used in the decoding algorithm to reduce the amount of work required
+ in step 1 and to allow step 2 to correct more than 
+\begin_inset Formula $t$
+\end_inset
+
+ errors.
+ In fact, in the unlikely event that the location of each and every error
+ is known and is provided to the BM decoder, and if no correct symbols are
+ accidentally labeled as errors, then the BM decoder can correct up to 
+\begin_inset Formula $d-1$
+\end_inset
+
+ errors! 
+\end_layout
+
+\begin_layout Standard
+In the decoding algorithm described herein, a list of symbols that are known
+ or suspected to be incorrect is sent to the BM decoder.
+ Symbols in the received word that are flagged as being incorrect are called
+ 
+\begin_inset Quotes eld
+\end_inset
+
+erasures
+\begin_inset Quotes erd
+\end_inset
+
+.
+ Symbols that are not erased and that are incorrect will be called 
+\begin_inset Quotes eld
+\end_inset
+
+errors
+\begin_inset Quotes erd
+\end_inset
+
+.
+ The BM decoder accepts erasure information in the form of a list of indices
+ corresponding to the incorrect, or suspected incorrect, symbols in the
+ received word.
+ As already noted, if the erasure information is perfect, then up to 51
+ errors will be corrected.
+ When the erasure information is imperfect, then some of the erased symbols
+ will actually be correct, and some of the unerased symbols will be in error.
+ If a total of 
+\begin_inset Formula $n_{era}$
+\end_inset
+
+ symbols are erased and the remaining unerased symbols contain 
+\begin_inset Formula $n_{err}$
+\end_inset
+
+ errors, then the BM algorithm can find the correct codeword as long as
+ 
+\begin_inset Formula 
+\begin{equation}
+n_{era}+2n_{err}\le d-1\label{eq:erasures_and_errors}
+\end{equation}
+
+\end_inset
+
+If 
+\begin_inset Formula $n_{era}=0$
+\end_inset
+
+, then the decoder is said to be an 
+\begin_inset Quotes eld
+\end_inset
+
+errors-only
+\begin_inset Quotes erd
+\end_inset
+
+ decoder and it can correct up to 
+\begin_inset Formula $t$
+\end_inset
+
+ errors (
+\begin_inset Formula $t$
+\end_inset
+
+=25 for JT65).
+ If 
+\begin_inset Formula $0<n_{era}\le d-1$
+\end_inset
+
+ (
+\begin_inset Formula $d-1=51$
+\end_inset
+
+ for JT65), then the decoder is said to be an 
+\begin_inset Quotes eld
+\end_inset
+
+errors-and-erasures
+\begin_inset Quotes erd
+\end_inset
+
+ decoder.
+ 
+\end_layout
+
+\begin_layout Standard
+For the JT65 code, (
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "eq:erasures_and_errors"
+
+\end_inset
+
+) says that if 
+\begin_inset Formula $n_{era}$
+\end_inset
+
+ symbols are declared to be erased, then the BM decoder will find the correct
+ codeword as long as the remaining un-erased symbols contain no more than
+ 
+\begin_inset Formula $\left\lfloor \frac{51-n_{era}}{2}\right\rfloor $
+\end_inset
+
+ errors.
+ The errors-and-erasures capability of the BM decoder is a very powerful
+ feature that serves as the core of the new soft-decision decoder described
+ herein.
+ 
+\end_layout
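+
+\begin_layout Standard
+As a worked illustration of this bound, with erasure counts chosen here
+ only as examples, the maximum number of errors that can be tolerated among
+ the unerased symbols is
+\begin_inset Formula 
+\begin{align*}
+n_{era}=0: & \quad\left\lfloor \frac{51-0}{2}\right\rfloor =25,\\
+n_{era}=40: & \quad\left\lfloor \frac{51-40}{2}\right\rfloor =5,\\
+n_{era}=45: & \quad\left\lfloor \frac{51-45}{2}\right\rfloor =3,\\
+n_{era}=51: & \quad\left\lfloor \frac{51-51}{2}\right\rfloor =0.
+\end{align*}
+
+\end_inset
+
+An erased symbol consumes one unit of the budget 
+\begin_inset Formula $d-1=51$
+\end_inset
+
+ whether or not it was actually wrong, whereas an unerased error consumes
+ two units.
+\end_layout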
+
+\begin_layout Standard
+It will be helpful to have some understanding of the errors and erasures
+ tradeoff described by (
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "eq:erasures_and_errors"
+
+\end_inset
+
+) to appreciate how the new decoder algorithm works.
+ Section 
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "sec:Errors-and-erasures-decoding-exa"
+
+\end_inset
+
+ describes some examples that should illustrate how the errors-and-erasures
+ capability can be combined with some information about the quality of the
+ received symbols to enable development of a decoding algorithm that can
+ reliably decode received words that contain many more than 25 errors.
+ The section after it describes the SFRSD decoding algorithm.
+\end_layout
+
+\begin_layout Section
+\begin_inset CommandInset label
+LatexCommand label
+name "sec:Errors-and-erasures-decoding-exa"
+
+\end_inset
+
+You've got to ask yourself.
+ Do I feel lucky?
+\end_layout
+
+\begin_layout Standard
+Consider a particular received word that contains 40 incorrect symbols
+ and 23 correct symbols.
+ It is not known which 40 symbols are in error.
+ 
+\begin_inset Foot
+status open
+
+\begin_layout Plain Layout
+In practice the number of errors will not be known either, but this is not
+ a serious problem.
+\end_layout
+
+\end_inset
+
+ Suppose that the decoder randomly chooses 40 symbols to erase (
+\begin_inset Formula $n_{era}=40$
+\end_inset
+
+), leaving 23 unerased symbols.
+ According to (
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "eq:erasures_and_errors"
+
+\end_inset
+
+), the BM decoder can successfully decode this word as long as the number
+ of errors present in the 23 unerased symbols is 5 or less.
+ This means that the number of errors captured in the set of 40 erased symbols
+ must be at least 35.
+ 
+\end_layout
+
+\begin_layout Standard
+The probability of selecting some particular number of bad symbols in a
+ randomly selected subset of the codeword symbols is governed by the hypergeomet
+ric probability distribution.
+\end_layout
+
+\begin_layout Standard
+Define:
+\end_layout
+
+\begin_layout Itemize
+\begin_inset Formula $N$
+\end_inset
+
+= number of symbols in a codeword (63 for JT65),
+\end_layout
+
+\begin_layout Itemize
+\begin_inset Formula $K$
+\end_inset
+
+= number of incorrect symbols in the received word,
+\end_layout
+
+\begin_layout Itemize
+\begin_inset Formula $n$
+\end_inset
+
+= number of symbols erased for errors-and-erasures decoding,
+\end_layout
+
+\begin_layout Itemize
+\begin_inset Formula $k$
+\end_inset
+
+= number of incorrect symbols in the set of erased symbols.
+\end_layout
+
+\begin_layout Standard
+Let 
+\begin_inset Formula $X$
+\end_inset
+
+ be the number of incorrect symbols in a set of 
+\begin_inset Formula $n$
+\end_inset
+
+ symbols chosen for erasure.
+ Then
+\begin_inset Formula 
+\begin{equation}
+P(X=k)=\frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}\label{eq:hypergeometric_pdf-1}
+\end{equation}
+
+\end_inset
+
+where 
+\begin_inset Formula $\binom{n}{m}=\frac{n!}{m!(n-m)!}$
+\end_inset
+
+ is the binomial coefficient.
+ The binomial coefficient can be calculated using the 
+\begin_inset Quotes eld
+\end_inset
+
+nchoosek(n,k)
+\begin_inset Quotes erd
+\end_inset
+
+ function in GNU Octave.
+ The hypergeometric probability mass function is available in GNU Octave
+ as function 
+\begin_inset Quotes eld
+\end_inset
+
+hygepdf(k,N,K,n)
+\begin_inset Quotes erd
+\end_inset
+
+.
+ 
+\end_layout
+
+\begin_layout Case
+A received word contains 
+\begin_inset Formula $K=40$
+\end_inset
+
+ incorrect symbols.
+ In an attempt to decode using an errors-and-erasures decoder, 
+\begin_inset Formula $n=40$
+\end_inset
+
+ symbols are randomly selected for erasure.
+ The probability that 
+\begin_inset Formula $35$
+\end_inset
+
+ of the erased symbols are incorrect is:
+\begin_inset Formula 
+\[
+P(X=35)=\frac{\binom{40}{35}\binom{63-40}{40-35}}{\binom{63}{40}}=2.356\times10^{-7}.
+\]
+
+\end_inset
+
+Similarly:
+\begin_inset Formula 
+\[
+P(X=36)=8.610\times10^{-9}.
+\]
+
+\end_inset
+
+Since the probability of catching 36 errors is so much smaller than the
+ probability of catching 35 errors, it is safe to say that the probability
+ of randomly selecting an erasure vector that can decode the received word
+ is essentially equal to 
+\begin_inset Formula $P(X=35)\simeq2.4\times10^{-7}$
+\end_inset
+
+.
+ The odds of successfully decoding the word on the first try are about 1
+ in 4 million.
+\end_layout
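+
+\begin_layout Standard
+As a numerical check of Case 1, the binomial coefficients can be written
+ out explicitly.
+ The value of 
+\begin_inset Formula $\binom{63}{40}$
+\end_inset
+
+ is rounded to two significant figures here, so the result is approximate:
+\begin_inset Formula 
+\begin{align*}
+\binom{40}{35} & =\binom{40}{5}=658008,\qquad\binom{23}{5}=33649,\qquad\binom{63}{40}\approx9.4\times10^{16},\\
+P(X=35) & \approx\frac{658008\times33649}{9.4\times10^{16}}\approx\frac{2.21\times10^{10}}{9.4\times10^{16}}\approx2.4\times10^{-7}.
+\end{align*}
+
+\end_inset
+
+
+\end_layout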
+
+\begin_layout Case
+A received word contains 
+\begin_inset Formula $K=40$
+\end_inset
+
+ incorrect symbols.
+ It is interesting to work out the best choice for the number of symbols
+ that should be selected at random for erasure if the goal is to maximize
+ the probability of successfully decoding the word.
+ By exhaustive search, it turns out that the best case is to erase 
+\begin_inset Formula $n=45$
+\end_inset
+
+ symbols, in which case the word will be decoded if the set of erased symbols
+ contains at least 37 errors.
+ With 
+\begin_inset Formula $N=63$
+\end_inset
+
+, 
+\begin_inset Formula $K=40$
+\end_inset
+
+, 
+\begin_inset Formula $n=45$
+\end_inset
+
+, then 
+\begin_inset Formula 
+\[
+P(X\ge37)\simeq2\times10^{-6}.
+\]
+
+\end_inset
+
+This probability is about 8 times higher than the probability of success
+ when only 
+\begin_inset Formula $40$
+\end_inset
+
+ symbols were erased, and the odds of successfully decoding on the first
+ try are roughly 1 in 500,000.
+ 
+\end_layout
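+
+\begin_layout Standard
+The threshold and tail probability quoted in Case 2 can be spelled out as
+ a short worked example.
+ With 
+\begin_inset Formula $n=45$
+\end_inset
+
+ erasures the bound allows 
+\begin_inset Formula $\left\lfloor \frac{51-45}{2}\right\rfloor =3$
+\end_inset
+
+ unerased errors, so at least 
+\begin_inset Formula $40-3=37$
+\end_inset
+
+ of the 40 errors must fall in the erased set:
+\begin_inset Formula 
+\[
+P(X\ge37)=\sum_{k=37}^{40}\frac{\binom{40}{k}\binom{23}{45-k}}{\binom{63}{45}}.
+\]
+
+\end_inset
+
+The first term dominates the sum, since
+\begin_inset Formula 
+\[
+\frac{P(X=38)}{P(X=37)}=\frac{\binom{40}{38}}{\binom{40}{37}}\cdot\frac{\binom{23}{7}}{\binom{23}{8}}=\frac{3}{38}\cdot\frac{8}{16}=\frac{3}{76}\approx0.04.
+\]
+
+\end_inset
+
+
+\end_layout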
+
+\begin_layout Case
+Cases 1 and 2 illustrate the fact that a strategy that tries to guess which
+ symbols to erase is not going to be very successful unless we are prepared
+ to wait all day for an answer.
+ Consider a slight modification to the strategy that can tip the odds in
+ our favor.
+ Suppose that the received word contains 
+\begin_inset Formula $K=40$
+\end_inset
+
+ incorrect symbols, as before.
+ In this case it is known that 10 of the symbols are much more reliable
+ than the other 53 symbols.
+ The 10 most reliable symbols are all correct and these 10 symbols are protected
+ from erasure, i.e.
+ the set of erasures is chosen from the smaller set of 53 less reliable
+ symbols.
+ If 
+\begin_inset Formula $n=40$
+\end_inset
+
+ symbols are chosen randomly from the set of 
+\begin_inset Formula $N=53$
+\end_inset
+
+ least reliable symbols, it is still necessary for the erased symbols to
+ include at least 35 errors (as in Case 1).
+ In this case, with 
+\begin_inset Formula $N=53$
+\end_inset
+
+, 
+\begin_inset Formula $K=40$
+\end_inset
+
+, 
+\begin_inset Formula $n=40$
+\end_inset
+
+, 
+\begin_inset Formula $P(X=35)=0.001$
+\end_inset
+
+! Now, the situation is much better.
+ The odds of decoding the word on the first try are approximately 1 in 1000.
+ The odds are even better if 41 symbols are erased, in which case 
+\begin_inset Formula $P(X=35)=0.0042$
+\end_inset
+
+, giving odds of about 1 in 200!
+\end_layout
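+
+\begin_layout Standard
+The Case 3 figures follow from the same hypergeometric formula with the
+ population reduced to the 
+\begin_inset Formula $N=53$
+\end_inset
+
+ candidates for erasure, of which 
+\begin_inset Formula $53-40=13$
+\end_inset
+
+ are correct.
+ As a worked check, with values rounded to three significant figures:
+\begin_inset Formula 
+\begin{align*}
+n=40: & \quad P(X=35)=\frac{\binom{40}{35}\binom{13}{5}}{\binom{53}{40}}\approx\frac{658008\times1287}{8.41\times10^{11}}\approx1.0\times10^{-3},\\
+n=41: & \quad P(X=35)=\frac{\binom{40}{35}\binom{13}{6}}{\binom{53}{41}}\approx\frac{658008\times1716}{2.67\times10^{11}}\approx4.2\times10^{-3}.
+\end{align*}
+
+\end_inset
+
+Relative to Case 1, protecting the ten reliable symbols raises the single-trial
+ success probability by a factor of roughly 4000.
+\end_layout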
+
+\begin_layout Standard
+Case 3 illustrates how, with the addition of some reliable information about
+ the quality of just 10 of the 63 symbols, it is possible to decode received
+ words containing a relatively large number of errors using only the BM
+ errors-and-erasures decoder.
+ The key to improving the odds enough to make the strategy of 
+\begin_inset Quotes eld
+\end_inset
+
+guessing
+\begin_inset Quotes erd
+\end_inset
+
+ at the erasure vector useful for practical implementation is to use information
+ about the quality of the received symbols to decide which ones are most
+ likely to be in error, and to assign a relatively high probability of erasure
+ to the lowest quality symbols and a relatively low probability of erasure
+ to the highest quality symbols.
+ It turns out that a good choice of the erasure probabilities can increase
+ the probability of a successful decode by several orders of magnitude relative
+ to a bad choice.
+\end_layout
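+
+\begin_layout Standard
+One way to quantify the cost of guessing is to assume, as a simplification
+ introduced here for illustration, that every trial draws a fresh and independent
+ random erasure vector with the same single-trial success probability 
+\begin_inset Formula $p$
+\end_inset
+
+.
+ The number of trials up to and including the first success is then geometrically
+ distributed, so that
+\begin_inset Formula 
+\[
+E[\mathrm{trials}]=\frac{1}{p},\qquad P(\mathrm{success\ within\ }T\mathrm{\ trials})=1-(1-p)^{T}.
+\]
+
+\end_inset
+
+With the probabilities from the cases above, the mean number of trials is
+ about 
+\begin_inset Formula $4.2\times10^{6}$
+\end_inset
+
+ for Case 1, 
+\begin_inset Formula $5\times10^{5}$
+\end_inset
+
+ for Case 2, and between about 240 and 1000 for Case 3.
+ For an illustrative budget of 
+\begin_inset Formula $T=1000$
+\end_inset
+
+ trials, 
+\begin_inset Formula $1-(1-0.0042)^{1000}\approx0.985$
+\end_inset
+
+, whereas 
+\begin_inset Formula $1-(1-2.4\times10^{-7})^{1000}\approx2.4\times10^{-4}$
+\end_inset
+
+.
+\end_layout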
+
+\begin_layout Standard
+Rather than selecting a fixed number of symbols to erase, the SFRSD algorithm
+ uses information available from the demodulator to assign a variable probabilit
+y of erasure to each received symbol.
+ Symbols that are determined to be of low quality and thus likely to be
+ incorrect are assigned a high probability of erasure, and symbols that
+ are likely to be correct are assigned low erasure probabilities.
+ The erasure probability for a symbol is determined using two quality indices
+ that are derived from information provided by the demodulator.
+ 
+\end_layout
+
+\begin_layout Section
+The decoding algorithm
+\end_layout
+
+\begin_layout Standard
+Preliminary setup: Using a large dataset of received words that have been
+ successfully decoded, estimate the probability of symbol error as a function
+ of the symbol's metrics P1-rank and P2/P1.
+ The resulting matrix is scaled by a factor (1.3) and used as the erasure-probabi
+lity matrix in step 2.
+\end_layout
+
+\begin_layout Standard
+For each received word:
+\end_layout
+
+\begin_layout Standard
+1.
+ Determine symbol metrics for each symbol in the received word.
+ The metrics are the rank {1,2,...,63} of the symbol's power percentage and
+ the ratio of the power percentages of the second most likely symbol and
+ the most likely symbol.
+ Denote these metrics by P1-rank and P2/P1.
+\end_layout
+
+\begin_layout Standard
+2.
+ Using the erasure probability for each symbol, make independent decisions
+ about whether or not to erase each symbol in the word.
+ Allow a total of up to 51 symbols to be erased.
+ 
+\end_layout
+
+\begin_layout Standard
+3.
+ Attempt errors-and-erasures decoding with the erasure vector that was determine
+d in step 2.
+ If the decoder is successful, it returns a candidate codeword.
+ Go to step 5.
+\end_layout
+
+\begin_layout Standard
+4.
+ If decoding is not successful, go to step 2.
+\end_layout
+
+\begin_layout Standard
+5.
+ If a candidate codeword is returned by the decoder, calculate its soft
+ distance from the received word and save the codeword if the soft distance
+ is the smallest one encountered so far.
+ If the soft distance is smaller than threshold dthresh, declare a successful
+ decode and return the codeword.
+\end_layout
+
+\begin_layout Standard
+6.
+ If the number of trials is equal to the maximum allowed number, exit and
+ return the current best codeword.
+ Otherwise, go to step 2.
+\end_layout
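+
+\begin_layout Standard
+The random erasure decision of step 2 can be sketched in formulas as follows.
+ This is one possible reading, not a definitive specification: the symbols
+ 
+\begin_inset Formula $\hat{P}_{i}$
+\end_inset
+
+, 
+\begin_inset Formula $p_{i}$
+\end_inset
+
+, 
+\begin_inset Formula $u_{i}$
+\end_inset
+
+ and 
+\begin_inset Formula $e_{i}$
+\end_inset
+
+ are notation introduced here, the clipping of 
+\begin_inset Formula $p_{i}$
+\end_inset
+
+ to 1 is an assumption, and the way in which the limit of 51 erasures is
+ enforced is not specified above.
+ Let 
+\begin_inset Formula $\hat{P}_{i}$
+\end_inset
+
+ be the estimated symbol-error probability looked up from the P1-rank and
+ P2/P1 metrics of symbol 
+\begin_inset Formula $i$
+\end_inset
+
+, and let 
+\begin_inset Formula $u_{i}$
+\end_inset
+
+ be independent random numbers drawn uniformly from 
+\begin_inset Formula $[0,1)$
+\end_inset
+
+.
+ Then
+\begin_inset Formula 
+\[
+p_{i}=\min\left(1,\,1.3\,\hat{P}_{i}\right),\qquad e_{i}=\begin{cases}
+1 & u_{i}<p_{i}\\
+0 & \mathrm{otherwise}
+\end{cases},\qquad n_{era}=\sum_{i=1}^{63}e_{i}\le51,
+\]
+
+\end_inset
+
+where 
+\begin_inset Formula $e_{i}=1$
+\end_inset
+
+ marks symbol 
+\begin_inset Formula $i$
+\end_inset
+
+ as erased in the vector passed to the BM decoder in step 3.
+\end_layout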
+
+\end_body
+\end_document