Also moved the same large array from the stack to the heap, which,
along with other prior changes, now allows the Windows jt9 OpenMP
executable to run with the default stack size again.
This also removes a crash in the Mac version, which was probably due to
excessive stack usage.
Net result is an even faster JT9 decoder.
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4942 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
calls to a subroutine. I believe this fixes the known outstanding decode
issue.
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4941 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
This is only a temporary fix because if both decoders were to produce
results that need accumulating, e.g. the number of decodes, then more
complex code to merge the results would be needed.
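
A minimal sketch of what such a merge step could look like, written in
Python rather than the Fortran/OpenMP of the real decoder. The names
decode_jt65, decode_jt9 and run_both are hypothetical stand-ins, not
routines in the source tree. Each decoder keeps private results and the
totals are accumulated only after both have finished, so no shared
counter is updated concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


def decode_jt65(samples):
    # Hypothetical stand-in: treat every value above 0.5 as a JT65 decode.
    return [f"JT65:{i}" for i, s in enumerate(samples) if s > 0.5]


def decode_jt9(samples):
    # Hypothetical stand-in: treat every value below 0.2 as a JT9 decode.
    return [f"JT9:{i}" for i, s in enumerate(samples) if s < 0.2]


def run_both(samples):
    # Run the two decoders concurrently, each returning its own list.
    with ThreadPoolExecutor(max_workers=2) as pool:
        jt65 = pool.submit(decode_jt65, samples)
        jt9 = pool.submit(decode_jt9, samples)
        # Merge step: combine only after both decoders have finished.
        results = jt65.result() + jt9.result()
    return results, len(results)
```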
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4940 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
Accounts for each traced call per thread and accumulates by rolling up
calls with an identical call chain before printing the statistics. The
print now accounts for function calls in their call chain so the same
function will be reported more than once if it is called in different
places.
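
The roll-up can be sketched roughly as follows, in Python rather than
the project's Fortran. The record layout (thread, call chain, seconds)
and the leaf name "fft" are assumptions made for illustration only; the
real timer code may store things quite differently.

```python
from collections import defaultdict


def roll_up(records):
    """Sum call counts and elapsed time for records with the same call chain.

    Each record is (thread, call_chain, seconds), where call_chain is a
    tuple of function names from outermost to innermost.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for _thread, chain, seconds in records:
        entry = totals[chain]
        entry[0] += 1          # number of calls seen with this chain
        entry[1] += seconds    # accumulated elapsed time
    return {chain: (calls, secs) for chain, (calls, secs) in totals.items()}


# The same leaf function is reported once per distinct call chain.
records = [
    (0, ("decoder", "jt65a", "fft"), 0.5),
    (1, ("decoder", "decjt9", "fft"), 0.25),
    (1, ("decoder", "decjt9", "fft"), 0.25),
]
stats = roll_up(records)
```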
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4937 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
Disable timer.out generation in OpenMP builds as it is broken.
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4931 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
Also limit the required threads for parallel decoding to 2.
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4930 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
More detailed message to come, with comparative timing statistics.
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4926 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
This tries to account for function calls in different threads
separately by decorating the function name with the thread number it
is running in. This may not be the best strategy for performance
timing but it is the easiest way of making it thread safe that I can
see.
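
As a rough illustration of the decoration idea (Python, not the actual
Fortran timer), each accumulator key is the function name plus the
thread number, so a routine called from two threads gets two separate
entries. The "#" separator and the Timer class are hypothetical choices
for this sketch.

```python
from collections import defaultdict


def decorated(name, thread):
    # Hypothetical decoration scheme: append the thread number to the name.
    return f"{name}#{thread}"


class Timer:
    def __init__(self):
        # Accumulated seconds keyed by decorated function name.
        self.totals = defaultdict(float)

    def add(self, name, thread, seconds):
        # A given thread only ever updates keys carrying its own number.
        self.totals[decorated(name, thread)] += seconds
```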
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4924 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
This change introduces the program jt9_omp which is a testbed for a
multi-threaded version of the jt9 decoder program. The program jt9_omp
should be directly substitutable for jt9 except that JT65 and JT9
decodes are computed in parallel.
Also enable the OpenMP directives in decoder.f90 - note this is not
yet a working multi-threaded decoder and the existing jt9 is still the
correct decoder to be used in WSJT-X.
Increased the available stack size for jt9_omp.exe as this is a hard
limit on Windows and the default is not big enough for the OpenMP
version of jt9.
Also Fortran array bounds checking is now disabled for Release
configuration builds so as to improve performance a little.
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4922 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
Also note: something's wrong when trying to decode a file read
by the GUI from disk. Will fix it soon...
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4920 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
The long FFTs can now use the multi-threaded FFTW routines.
Subroutine decode9.f90 was renamed jt9fano.f90.
The JT9 decoder's top-level functions were removed from decoder.f90
and put into a separate subroutine decjt9.f90.
Subroutine decoder.f90 is now configured for possible use of OpenMP
SECTIONS, with the JT9 and JT65 decoders running concurrently on
a multi-core machine. Note, however, that this concurrent processing
is not yet fully implemented. Probably calls to timer need to be removed;
some variables used in calls to jt65a and decjt9 may need to be
declared PRIVATE in decoder; some sections probably need to be declared
CRITICAL; probably some SAVE statements in downstream routines have
made them not thread-safe; etc., etc.
I'm a neophyte at using OpenMP. Comments, suggestions, and/or tests by
others will be welcome!
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4919 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
New command-line option for jt9: [-m nthreads]. Default is nthreads=1.
Also refactored a loop in filbig.f90 that was taking far too much
time.
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4916 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
Pass the temporary directory to jt9 and use it to give the correct
paths to temporary files. Also jt9 passes the absolute path to
kvasd.dat in the temporary directory to kvasd.
Clear out all the annoying cruft that has accumulated due to having to
run with $CWD as the temporary directory.
Use QStandardPaths to find the writable data directory where needed
rather than passing it around between objects. This now works because
the $CWD hasn't been changed.
Do away with the CMake option WSJT_STANDARD_FILE_LOCATIONS as it is no
longer needed.
Fix astro status file azel.dat formatting.
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4732 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
Also complete the wrapper code in wisdom.c.
TBD: should be possible to use fftw3f.f03 instead of the ad hoc wisdom.c.
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4617 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
This means that the first decode from a saved data file will be slow,
but the saved wisdom for the decoded mode(s) will be better than
for the default npatience = 1. All subsequent decodes in the same
mode(s) will take advantage of the newly saved wisdom.
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4616 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
New optional argument to jt9: -w patience
Default is patience = 1
Example timing measurements for 130610_2343.wav:
patience     plan   execute   FFTW planner flag
              (s)       (s)
-----------------------------------------------
    0        0.01      1.25   FFTW_ESTIMATE
    1        0.69      1.25   FFTW_ESTIMATE_PATIENT
    2       16.97      1.15   FFTW_MEASURE
    3      390.88      1.15   FFTW_PATIENT
Conclusions, consistent with expectation based on past experience
with similar FFTs:
- First decode (in each mode) with patience = 2 is slow.
- Speed advantage of patience = 2 is small but measurable.
- No measurable advantage in using patience > 2.
Present mainwindow.cpp has "-w 1" hard-wired.
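
For a rough feel of the trade-off, the timings in the table can be
turned into a break-even count: how many decodes it takes for the
faster execution to repay the extra planning time relative to
patience = 1. This is only a sketch in Python; it assumes the planning
cost is paid once and that the timings for this single file are
representative. The function name break_even_decodes is made up for
the example.

```python
import math

# (plan seconds, execute seconds) per patience level, from the table.
TIMINGS = {
    0: (0.01, 1.25),
    1: (0.69, 1.25),
    2: (16.97, 1.15),
    3: (390.88, 1.15),
}


def break_even_decodes(patience, baseline=1):
    """Decodes needed to repay the extra planning time, or None if never."""
    plan, execute = TIMINGS[patience]
    base_plan, base_execute = TIMINGS[baseline]
    extra = plan - base_plan          # additional one-off planning cost
    saving = base_execute - execute   # time saved on every decode
    if extra <= 0:
        return 0                      # no extra planning cost to repay
    if saving <= 0:
        return None                   # slower to plan and no faster to run
    return math.ceil(extra / saving)
```

With these numbers patience = 2 repays itself after about 163 decodes
and patience = 3 only after about 3902.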
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4610 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
Notes:
1. Equivalents of wisdom1.bat will be needed for *nix and OS X. (The
version now added to the source .../lib directory is an example only.)
2. Installers should offer to run the wisdom1[.bat] script at installation
time.
3. wisdom1[.bat] and fftwf-wisdom[.exe] must be installed in the .../bin
directory.
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4607 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
Both decoders now have slightly better performance and faster
execution. The rare "duplicate decodes" in JT9 were eliminated.
On Windows, at least, calls to f90 routine system_clock() do not
provide correct wall time increments. Changed to using secnds()
instead.
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4571 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
The lib/Makefile.MinGW makefile has been enhanced to link to the DLL
version of fftw3, which is the normal version of the library that you
get with the Windows installer.
The library is located by passing the FFTW3_DIR variable on the make
command line. For example on my system:
$ # In a MinGW console
$ cd ~/src/wsjtx/lib
$ make QT_DIR=/c/Tools/Qt/5.2.1/mingw48_32 \
> FFTW3_DIR=/c/Tools/fftw-3.3.3-dll32-2
Similarly with the qmake project:
$ # In a Qt MinGW 32-bit console
$ cd ~/src/wsjtx
$ qmake \
> HAMLIB_DIR=c:/test-install/hamlib/mingw48_32 \
> FFTW3_DIR=c:/Tools/fftw-3.3.3-dll32-2
$ mingw32-make
git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@4551 ab8295b8-cf94-4d9e-aec4-7959e3be5d79