#LyX 2.1 created this file. For more info see http://www.lyx.org/ \lyxformat 474 \begin_document \begin_header \textclass paper \use_default_options true \maintain_unincluded_children false \language english \language_package default \inputencoding auto \fontencoding global \font_roman default \font_sans default \font_typewriter default \font_math auto \font_default_family default \use_non_tex_fonts false \font_sc false \font_osf false \font_sf_scale 100 \font_tt_scale 100 \graphics default \default_output_format default \output_sync 0 \bibtex_command default \index_command default \float_placement H \paperfontsize 12 \spacing onehalf \use_hyperref false \papersize default \use_geometry true \use_package amsmath 1 \use_package amssymb 1 \use_package cancel 1 \use_package esint 1 \use_package mathdots 1 \use_package mathtools 1 \use_package mhchem 1 \use_package stackrel 1 \use_package stmaryrd 1 \use_package undertilde 1 \cite_engine basic \cite_engine_type default \biblio_style plain \use_bibtopic false \use_indices false \paperorientation portrait \suppress_date false \justification true \use_refstyle 1 \index Index \shortcut idx \color #008000 \end_index \leftmargin 1in \topmargin 1in \rightmargin 1in \bottommargin 1in \secnumdepth 3 \tocdepth 3 \paragraph_separation indent \paragraph_indentation default \quotes_language english \papercolumns 1 \papersides 1 \paperpagestyle default \tracking_changes false \output_changes false \html_math_output 0 \html_css_as_file 0 \html_be_strict false \end_header \begin_body \begin_layout Title Open Source Soft-Decision Decoder for the JT65 (63,12) Reed-Solomon code \end_layout \begin_layout Author Steven J. Franke, K9AN and Joseph H. 
Taylor, K1JT \end_layout \begin_layout Section \begin_inset CommandInset label LatexCommand label name "sec:Introduction-and-Motivation" \end_inset Background and Motivation \end_layout \begin_layout Standard The JT65 protocol has revolutionized amateur-radio weak-signal communication by enabling operators with small or compromise antennas and relatively low-power transmitters to communicate over propagation paths not usable with traditional technologies. The protocol was developed in 2003 for Earth-Moon-Earth (EME, or \begin_inset Quotes eld \end_inset moonbounce \begin_inset Quotes erd \end_inset ) communication, where the scattered return signals are always weak. It was soon found that JT65 also enables worldwide communication on the HF bands with low power, modest antennas, and efficient spectral usage. \end_layout \begin_layout Standard A major reason for the success and popularity of JT65 is its use of a strong error-correction code: a short block-length, low-rate Reed-Solomon code based on a 64-symbol alphabet. Until now, nearly all programs implementing JT65 have used the patented Kötter-Vardy (KV) algebraic soft-decision decoder \begin_inset CommandInset citation LatexCommand cite key "kv2001" \end_inset , licensed to and implemented by K1JT as a closed-source executable for use only in amateur radio applications. Since 2001 the KV decoder has been considered the best available soft-decision decoder for Reed-Solomon codes. We describe here a new open-source alternative called the Franke-Taylor (FT, or K9AN-K1JT) algorithm. It is conceptually simple, built around the well-known Berlekamp-Massey errors-and-erasures algorithm, and in this application it performs even better than the KV decoder. The FT algorithm is implemented in the popular program \emph on WSJT-X \emph default , widely used for amateur weak-signal communication with JT65 and other specialized digital modes. 
The program is freely available and licensed under the GNU General Public License \begin_inset CommandInset citation LatexCommand cite key "wsjt" \end_inset . \end_layout \begin_layout Standard The JT65 protocol specifies transmissions that normally start one second into a UTC minute and last for 46.8 seconds. Receiving software therefore has up to several seconds to decode a message before the start of the next minute, when the operator sends a reply. With today's personal computers, this relatively long available time encourages experimentation with decoders of high computational complexity. As a result, on a typical fading channel the FT algorithm can extend the decoding threshold by many dB over the hard-decision Berlekamp-Massey decoder, and by a meaningful amount over the KV decoder. In addition to its excellent performance, the new algorithm has other desirable properties, not least of which is its conceptual simplicity. Decoding performance and complexity scale in a convenient way, providing steadily increasing soft-decision decoding gain as a tunable computational complexity parameter is increased over more than five orders of magnitude. Appreciable gain is available from our decoder even on very simple (and relatively slow) computers. On the other hand, because the algorithm benefits from a large number of independent decoding trials, further performance gains should be achievable through parallelization on high-performance computers. \end_layout \begin_layout Section \begin_inset CommandInset label LatexCommand label name "sec:JT65-messages-and" \end_inset JT65 Messages and Reed-Solomon Codes \end_layout \begin_layout Standard JT65 message frames consist of a short compressed message encoded for transmission with a Reed-Solomon code. 
Reed-Solomon codes are block codes characterized by \begin_inset Formula $n$ \end_inset , the length of their codewords, \begin_inset Formula $k$ \end_inset , the number of message symbols conveyed by the codeword, and the number of possible values for each symbol in the codewords. The codeword length and the number of message symbols are specified with the notation \begin_inset Formula $(n,k)$ \end_inset . JT65 uses a (63,12) Reed-Solomon code with 64 possible values for each symbol. Each of the 12 message symbols represents \begin_inset Formula $\log_{2}64=6$ \end_inset message bits. The source-encoded messages conveyed by a 63-symbol JT65 frame thus consist of 72 information bits. The JT65 code is systematic, which means that the 12 message symbols are embedded in the codeword without modification and another 51 parity symbols derived from the message symbols are added to form a codeword of 63 symbols. \end_layout \begin_layout Standard In coding theory the concept of Hamming distance is used as a measure of \begin_inset Quotes eld \end_inset distance \begin_inset Quotes erd \end_inset between different codewords, or between a received word and a codeword. Hamming distance is the number of code symbols that differ in two words being compared. Reed-Solomon codes have minimum Hamming distance \begin_inset Formula $d$ \end_inset , where \begin_inset Formula \begin{equation} d=n-k+1.\label{eq:minimum_distance} \end{equation} \end_inset The minimum Hamming distance of the JT65 code is \begin_inset Formula $d=52$ \end_inset , which means that any particular codeword differs from all other codewords in at least 52 of the 63 symbol positions. 
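\end_layout \begin_layout Standard As a quick numerical check of these parameters, the relations above can be worked through for JT65. This is an illustrative sketch that uses only the values \begin_inset Formula $n=63$ \end_inset , \begin_inset Formula $k=12$ \end_inset and the 64-symbol alphabet stated above; the code-rate symbol \begin_inset Formula $R$ \end_inset is introduced here for illustration only and is not used elsewhere in this paper. \begin_inset Formula \begin{align*} k\log_{2}64 & =12\times6=72\ \text{information bits},\\ n-k & =63-12=51\ \text{parity symbols},\\ R=k/n & =12/63\approx0.19,\\ d=n-k+1 & =63-12+1=52. \end{align*} \end_inset Roughly four of every five transmitted symbols are therefore parity, which is the price paid for the large minimum distance. 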
\end_layout \begin_layout Standard Given a received word containing some incorrect symbols (errors), the received word can be decoded into the correct codeword using a deterministic, algebraic algorithm provided that no more than \begin_inset Formula $t$ \end_inset symbols were received incorrectly, where \begin_inset Formula \begin{equation} t=\left\lfloor \frac{n-k}{2}\right\rfloor .\label{eq:t} \end{equation} \end_inset For the JT65 code \begin_inset Formula $t=25$ \end_inset , so it is always possible to decode a received word having 25 or fewer symbol errors. Any one of several well-known algebraic algorithms, such as the widely used Berlekamp-Massey (BM) algorithm, can carry out the decoding. Two steps are necessarily involved in this process. We must (1) determine which symbols were received incorrectly, and (2) find the correct value of the incorrect symbols. If we somehow know that certain symbols are incorrect, that information can be used to reduce the work involved in step 1 and allow step 2 to correct more than \begin_inset Formula $t$ \end_inset errors. In the unlikely event that the location of every error is known and if no correct symbols are accidentally labeled as errors, the BM algorithm can correct up to \begin_inset Formula $d-1=n-k$ \end_inset errors. \end_layout \begin_layout Standard The FT algorithm creates lists of symbols suspected of being incorrect and sends them to the BM decoder. Symbols flagged in this way are called \begin_inset Quotes eld \end_inset erasures, \begin_inset Quotes erd \end_inset while other incorrect symbols will be called \begin_inset Quotes eld \end_inset errors. \begin_inset Quotes erd \end_inset With perfect erasure information up to 51 incorrect symbols can be corrected for the JT65 code. Imperfect erasure information means that some erased symbols may be correct, and some other symbols in error. 
If \begin_inset Formula $s$ \end_inset symbols are erased and the remaining \begin_inset Formula $n-s$ \end_inset symbols contain \begin_inset Formula $e$ \end_inset errors, the BM algorithm can find the correct codeword as long as \begin_inset Formula \begin{equation} s+2e\le d-1.\label{eq:erasures_and_errors} \end{equation} \end_inset If \begin_inset Formula $s=0$ \end_inset , the decoder is said to be an \begin_inset Quotes eld \end_inset errors-only \begin_inset Quotes erd \end_inset decoder. If \begin_inset Formula $0X$ \end_inset . Correspondingly, the FT algorithm works best when the probability of erasing a symbol is somewhat larger than the probability that the symbol is incorrect. For the JT65 code we found empirically that good decoding performance is obtained when the symbol erasure probability is about 1.3 times the symbol error probability. \end_layout \begin_layout Standard The FT algorithm tries successively to decode the received word using independent educated guesses to select symbols for erasure. For each iteration a stochastic erasure vector is generated based on the symbol erasure probabilities. The erasure vector is sent to the BM decoder along with the full set of 63 hard-decision symbol values. When the BM decoder finds a candidate codeword it is assigned a quality metric \begin_inset Formula $d_{s}$ \end_inset , the soft distance between the received word and the codeword: \begin_inset Formula \begin{equation} d_{s}=\sum_{j=1}^{n}\alpha_{j}\,(1+p_{1,\, j}).\label{eq:soft_distance} \end{equation} \end_inset Here \begin_inset Formula $\alpha_{j}=0$ \end_inset if received symbol \begin_inset Formula $j$ \end_inset is the same as the corresponding symbol in the codeword, \begin_inset Formula $\alpha_{j}=1$ \end_inset if the received symbol and codeword symbol are different, and \begin_inset Formula $p_{1,\, j}$ \end_inset is the fractional power associated with received symbol \begin_inset Formula $j$ \end_inset . 
Think of the soft distance as made up of two terms: the first is the Hamming distance between the received word and the codeword, and the second ensures that if two candidate codewords have the same Hamming distance from the received word, a smaller soft distance will be assigned to the one where differences occur in symbols of lower estimated reliability. \end_layout \begin_layout Standard In practice we find that \begin_inset Formula $d_{s}$ \end_inset can reliably identify the correct codeword if the signal-to-noise ratio for individual symbols is greater than about 4 in linear power units. We also find that significantly weaker signals can be decoded by using soft-symbol information beyond that contained in \begin_inset Formula $p_{1}$ \end_inset and \begin_inset Formula $p_{2}$ \end_inset . To this end we define an additional metric \begin_inset Formula $u$ \end_inset , the average signal-plus-noise power in all symbols according to a candidate codeword's symbol values: \end_layout \begin_layout Standard \begin_inset Formula \begin{equation} u=\frac{1}{n}\sum_{j=1}^{n}S(c_{j},\, j).\label{eq:u-metric} \end{equation} \end_inset Here the \begin_inset Formula $c_{j}$ \end_inset 's are the symbol values for the candidate codeword being tested. \end_layout \begin_layout Standard The correct JT65 codeword produces a value for \begin_inset Formula $u$ \end_inset equal to the average of \begin_inset Formula $n=63$ \end_inset bins containing both signal and noise power. Incorrect codewords have at most \begin_inset Formula $k-1=11$ \end_inset such bins and at least \begin_inset Formula $n-k+1=52$ \end_inset bins containing noise only. 
Thus, if the spectral array \begin_inset Formula $S(i,\, j)$ \end_inset has been normalized so that the average value of the noise-only bins is unity, \begin_inset Formula $u$ \end_inset for the correct codeword has expectation value (average over many random realizations) \end_layout \begin_layout Standard \begin_inset Formula \begin{equation} \bar{u}_{1}=1+y,\label{eq:u1-exp} \end{equation} \end_inset where \begin_inset Formula $y$ \end_inset is the signal-to-noise ratio in linear power units. If we assume Gaussian statistics and a large number of trials, the standard deviation of measured values of \begin_inset Formula $u_{1}$ \end_inset is \end_layout \begin_layout Standard \begin_inset Formula \begin{equation} \sigma_{1}=\left(\frac{1+2y}{n}\right)^{1/2}.\label{eq:sigma1} \end{equation} \end_inset In contrast, the expected value and standard deviation of the \begin_inset Formula $u$ \end_inset -metric for an incorrect codeword (randomly selected from a population of all \begin_inset Quotes eld \end_inset worst case \begin_inset Quotes erd \end_inset codewords, \emph on i.e. 
\emph default , those with \begin_inset Formula $k-1$ \end_inset symbols identical to corresponding ones in the correct word) are given by \end_layout \begin_layout Standard \begin_inset Formula \begin{equation} \bar{u}_{i}=1+\left(\frac{k-1}{n}\right)y,\label{eq:u2-exp} \end{equation} \end_inset \end_layout \begin_layout Standard \begin_inset Formula \begin{equation} \sigma_{i}=\frac{1}{n}\left[n+2y(k-1)\right]^{1/2}.\label{eq:sigma2} \end{equation} \end_inset \end_layout \begin_layout Standard If \begin_inset Formula $u$ \end_inset is evaluated for a large number of candidate codewords, one of which is correct, we should expect the largest value \begin_inset Formula $u_{1}$ \end_inset to be drawn from a population with statistics described by \begin_inset Formula $\bar{u}_{1}$ \end_inset and \begin_inset Formula $\sigma_{1}.$ \end_inset If no tested codeword is correct, \begin_inset Formula $u_{1}$ \end_inset is likely to come from the \begin_inset Formula $(\bar{u}_{i},\,\sigma_{i})$ \end_inset population and to be several standard deviations above the mean. In either case the second-largest value, \begin_inset Formula $u_{2},$ \end_inset will likely come from the \begin_inset Formula $(\bar{u}_{i},\,\sigma_{i})$ \end_inset population, again several standard deviations above the mean. \end_layout \begin_layout Standard If the signal-to-noise ratio \begin_inset Formula $y$ \end_inset is too small for decoding to be possible, or for some other reason the correct codeword is never presented as a candidate, the ratio \begin_inset Formula $r=u_{2}/u_{1}$ \end_inset will likely be close to 1. On the other hand, correctly identified codewords will produce \begin_inset Formula $u_{1}$ \end_inset significantly larger than \begin_inset Formula $u_{2}$ \end_inset and thus smaller values of \begin_inset Formula $r$ \end_inset . 
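We therefore apply a ratio threshold test to \begin_inset Formula $r$ \end_inset . \end_layout \begin_layout Standard \begin_inset Float table wide false sideways false status open \begin_layout Plain Layout \align center \begin_inset Tabular <lyxtabular version="3" rows="5" columns="2"> <features rotate="0" tabularvalignment="middle"> <column alignment="center" valignment="top"> <column alignment="center" valignment="top"> <row> <cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none"> \begin_inset Text \begin_layout Plain Layout Case \end_layout \end_inset </cell> <cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none"> \begin_inset Text \begin_layout Plain Layout Messages decoded \end_layout \end_inset </cell> </row> <row> <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none"> \begin_inset Text \begin_layout Plain Layout one-pass, BM \end_layout \end_inset </cell> <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none"> \begin_inset Text \begin_layout Plain Layout 9 \end_layout \end_inset </cell> </row> <row> <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none"> \begin_inset Text \begin_layout Plain Layout one-pass, FT \end_layout \end_inset </cell> <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none"> \begin_inset Text \begin_layout Plain Layout 12 \end_layout \end_inset </cell> </row> <row> <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none"> \begin_inset Text \begin_layout Plain Layout two-pass, BM \end_layout \end_inset </cell> <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none"> \begin_inset Text \begin_layout Plain Layout 12 \end_layout \end_inset </cell> </row> <row> <cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none"> \begin_inset Text \begin_layout Plain Layout two-pass, FT \end_layout \end_inset </cell> <cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none"> \begin_inset Text \begin_layout Plain Layout 21 \end_layout \end_inset </cell> </row> </lyxtabular> \end_inset \end_layout \begin_layout Plain Layout \begin_inset Caption Standard \begin_layout Plain Layout The effect of soft-symbol decoding combined with signal subtraction on the data shown in Figure \begin_inset CommandInset ref LatexCommand ref reference "fig:spectrogram" \end_inset . \end_layout \end_inset \end_layout \begin_layout Plain Layout \end_layout \end_inset \end_layout \begin_layout Subsubsection* Experience with FT on crowded HF bands \end_layout \begin_layout Standard The JT65 mode has proven to be remarkably versatile. Thousands of operators regularly use the mode for two-way communication over terrestrial paths and the Earth-Moon-Earth ( \begin_inset Quotes eld \end_inset moonbounce \begin_inset Quotes erd \end_inset ) path at frequencies from VHF to microwaves, and over multi-hop ionospheric reflection paths at HF. 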
Use on HF was not originally an intended application for the mode, but at present HF use accounts for the largest number of two-way contacts. Figure \begin_inset CommandInset ref LatexCommand ref reference "fig:spectrogram" \end_inset is a spectrogram, with frequency in Hz on the horizontal axis, showing JT65 activity during a one-minute time segment (JT65 transmissions start one second into a minute and last for approximately 47 s). The data was collected in the 20m amateur band during daytime under crowded band conditions. With some straightforward signal processing to demodulate the signals and produce soft-symbol data for the FT decoder, it is possible to extract and decode 21 messages from the data summarized in Figure \begin_inset CommandInset ref LatexCommand ref reference "fig:spectrogram" \end_inset . This is achieved with a relatively small timeout parameter \begin_inset Formula $T=1000$ \end_inset and in spite of the fact that the 200 Hz-wide 65-FSK (sync plus 64-FSK) signals overlap, with as many as 4 signals superposed in some parts of the spectrum. Using exactly the same pre-processing, but without soft-symbol information, the errors-only BM decoder is able to decode only 12 messages. \end_layout \begin_layout Standard \begin_inset Float figure wide false sideways false status open \begin_layout Plain Layout \align center \begin_inset Graphics filename fig_waterfall.tiff width 6.5in BoundingBox 0bp 0bp 1124bp 200bp clip \end_inset \end_layout \begin_layout Plain Layout \begin_inset Caption Standard \begin_layout Plain Layout \begin_inset CommandInset label LatexCommand label name "fig:spectrogram" \end_inset A spectrogram showing one minute of data collected under crowded band conditions on 20m during daytime hours. WSJT-X extracted 21 JT65 messages in approximately the same bandwidth required for a single SSB signal. 
\end_layout \end_inset \end_layout \begin_layout Plain Layout \end_layout \end_inset \end_layout \begin_layout Bibliography \begin_inset CommandInset bibitem LatexCommand bibitem label "1" key "kv2001" \end_inset \begin_inset Quotes eld \end_inset Algebraic soft-decision decoding of Reed-Solomon codes, \begin_inset Quotes erd \end_inset R. Kötter and A. Vardy, \emph on IEEE Transactions on Information Theory \emph default , Vol. 49, Nov. 2003. \end_layout \begin_layout Bibliography \begin_inset CommandInset bibitem LatexCommand bibitem label "2" key "wsjt" \end_inset \emph on WSJT Home Page \emph default : http://www.physics.princeton.edu/pulsar/K1JT/. \end_layout \begin_layout Bibliography \begin_inset CommandInset bibitem LatexCommand bibitem label "3" key "lc2004" \end_inset \emph on Error Control Coding, 2nd Edition \emph default , Shu Lin and Daniel J. Costello, Pearson-Prentice Hall, 2004. \end_layout \begin_layout Bibliography \begin_inset CommandInset bibitem LatexCommand bibitem label "4" key "lhmg2010" \end_inset \begin_inset Quotes eld \end_inset Stochastic Chase Decoding of Reed-Solomon Codes, \begin_inset Quotes erd \end_inset Camille Leroux, Saied Hemati, Shie Mannor, Warren J. Gross, \emph on IEEE Communications Letters \emph default , Vol. 14, No. 9, September 2010. \end_layout \begin_layout Bibliography \begin_inset CommandInset bibitem LatexCommand bibitem label "5" key "lk2008" \end_inset \begin_inset Quotes eld \end_inset Soft-Decision Decoding of Reed-Solomon Codes Using Successive Error-and-Erasure Decoding, \begin_inset Quotes erd \end_inset Soo-Woong Lee and B. V. K. Vijaya Kumar, \emph on IEEE GLOBECOM 2008 Proceedings \emph default . 
\end_layout \begin_layout Bibliography \begin_inset CommandInset bibitem LatexCommand bibitem label "6" key "ls2009" \end_inset \begin_inset Quotes eld \end_inset Stochastic Erasure-Only List Decoding Algorithms for Reed-Solomon Codes, \begin_inset Quotes erd \end_inset Chang-Ming Lee and Yu T. Su, \emph on IEEE Signal Processing Letters, \emph default Vol. 16, No. 8, August 2009. \end_layout \begin_layout Bibliography \begin_inset CommandInset bibitem LatexCommand bibitem label "7" key "karn" \end_inset Berlekamp-Massey decoder written by Phil Karn, KA9Q: http://www.ka9q.net/code/fec/ \end_layout \begin_layout Section \start_of_appendix \begin_inset CommandInset label LatexCommand label name "sec:Appendix:SNR" \end_inset Appendix: Signal-to-Noise Ratios \end_layout \begin_layout Standard The signal-to-noise ratio in a bandwidth \begin_inset Formula $B$ \end_inset that is at least as large as the bandwidth occupied by the signal is \begin_inset Formula \begin{equation} \mathrm{SNR}_{B}=\frac{P_{s}}{N_{0}B},\label{eq:SNR} \end{equation} \end_inset where \begin_inset Formula $P_{s}$ \end_inset is the average signal power (W), \begin_inset Formula $N_{0}$ \end_inset is the one-sided noise power spectral density (W/Hz), and \begin_inset Formula $B$ \end_inset is the bandwidth in Hz. In amateur radio applications, digital modes are often compared based on the SNR defined in a 2.5 kHz reference bandwidth, \begin_inset Formula $\mathrm{SNR}_{2500}$ \end_inset . \end_layout \begin_layout Standard In the professional literature, decoder performance is characterized in terms of \begin_inset Formula $E_{b}/N_{0}$ \end_inset , the ratio of the energy collected per information bit, \begin_inset Formula $E_{b}$ \end_inset , to the one-sided noise power spectral density, \begin_inset Formula $N_{0}$ \end_inset . Denote the duration of a channel symbol by \begin_inset Formula $\tau_{s}$ \end_inset (for JT65, \begin_inset Formula $\tau_{s}=0.3715\,\mathrm{s}$ \end_inset ). 
JT65 signals have constant envelope, so the average signal power is related to the energy per symbol, \begin_inset Formula $E_{s}$ \end_inset , by \begin_inset Formula \begin{equation} P_{s}=E_{s}/\tau_{s}.\label{eq:signal_power} \end{equation} \end_inset The total energy in a received JT65 message consisting of \begin_inset Formula $n=63$ \end_inset channel symbols is \begin_inset Formula $63E_{s}$ \end_inset . The energy collected for each of the 72 bits of information conveyed by the message is then \begin_inset Formula \begin{equation} E_{b}=\frac{63E_{s}}{72}=0.875E_{s}.\label{eq:Eb_Es} \end{equation} \end_inset Using equations ( \begin_inset CommandInset ref LatexCommand ref reference "eq:SNR" \end_inset )-( \begin_inset CommandInset ref LatexCommand ref reference "eq:Eb_Es" \end_inset ), \begin_inset Formula $\mathrm{SNR}_{2500}$ \end_inset can be written in terms of \begin_inset Formula $E_{b}/N_{0}$ \end_inset : \begin_inset Formula \begin{equation} \mathrm{SNR}_{2500}=1.23\times10^{-3}\frac{E_{b}}{N_{0}}.\label{eq:SNR2500} \end{equation} \end_inset If all quantities are expressed in dB, then: \end_layout \begin_layout Standard \begin_inset Formula \begin{equation} (\mathrm{SNR}_{2500})_{\mathrm{dB}}=(E_{b}/N_{0})_{\mathrm{dB}}-29.1\,\mathrm{dB}=(E_{s}/N_{0})_{\mathrm{dB}}-29.7\,\mathrm{dB}.\label{eq:SNR_all_types} \end{equation} \end_inset \end_layout \end_body \end_document