.. _readme-testsystem:

=====================
Chapel Testing System
=====================

The Chapel testing system is a key piece of technology for the Chapel
developer. We use it as a harness for doing test-driven development, for
performing sanity checks on code before committing it, for bug and issue
tracking, and for nightly correctness and performance regression testing.
Getting comfortable with it is one of the most important things a developer
can do early in the development cycle.

The tests themselves are located in ``$CHPL_HOME/test``. The main script
that drives the test system is ``$CHPL_HOME/util/start_test``, though it
relies on several helper scripts located in ``$CHPL_HOME/util/test``.

This document provides only a high-level introduction to the testing system.
For further details, ask a core Chapel developer for suggestions. You can
also get a sense for the test system by looking through the test directory
itself to see how it is used in practice.

Outline
=======

* `How to Make`_

  - `A Correctness Test`_
  - `With Outside Arguments`_
  - `Compile-time Arguments`_
  - `Execution-time Arguments`_
  - `Environment Variables`_
  - `Controlling How It Runs`_
  - `Limiting Time Taken`_
  - `With Varying Output`_
  - `Test Not Applicable In All Settings`_
  - `Testing Different Behavior in Different Settings`_
  - `Using precomp, preexec, and prediff files`_
  - `Using PRETEST`_
  - `A Performance Test`_
  - `Identifying Performance Keys`_
  - `Validating Performance Test Output`_
  - `Accumulating Performance Data in .dat files`_
  - `Other Performance Testing Options`_
  - `Comparing Multiple Versions`_
  - `Comparing to a C version`_
  - `Creating a graph comparing multiple variations`_
  - `Multilocale Performance Testing`_
  - `Multilocale Communication Counts Testing`_
  - `Test Your Test Before Submitting`_
  - `A Test That Tracks A Failure`_
  - `GitHub Issues`_
  - `Tracking Current Failure Mode`_
  - `Resolving a Future`_

* `Invoking start_test`_

  - `Correctness Testing`_
  - `Parallel Testing`_
  - `GPU Testing`_
  - `Performance Testing`_
  - `Sample Output`_

* `Summary of Testing Files`_

.. _With Outside Arguments: `Outside Arguments or Settings`_
.. _With Varying Output: `Tests With Varying Output`_
.. _Test Not Applicable In All Settings: `Limiting Where the Test Runs`_

How to Make
===========

A Correctness Test
------------------

Though trivial, this example test is available at
`$CHPL_HOME/test/Samples/Correctness`_ in the Chapel source repository.

.. _`$CHPL_HOME/test/Samples/Correctness`: https://github.com/chapel-lang/chapel/pull/295/commits/8c0aaf04dabc007e061588876082f5a1f95c0cae

The simplest use of the test system is to create a ``.chpl`` file containing
some Chapel code and a ``.good`` file containing the expected output. For
example, given a directory containing two files:

``hi.chpl``

.. code-block:: chapel

   writeln("Hi!");

``hi.good``

.. code-block:: text

   Hi!

The test system can be exercised by invoking ``start_test hi.chpl``. This
assumes that ``$CHPL_HOME/util/`` is in the user's ``$PATH``, which is taken
care of when sourcing ``$CHPL_HOME/util/setchplenv.bash``.

This will cause the compiler to compile ``hi.chpl``. If compiling ``hi.chpl``
does not cause a compilation failure, ``start_test`` will then execute the
resulting binary. The concatenation of the compiler and executable output
will then be compared against the ``.good`` file. A transcript of the test
system's actions is printed to the console and also stored in
``$CHPL_HOME/test/Logs/`` by default.

For more information on using ``start_test``, see `Invoking start_test`_.

Outside Arguments or Settings
+++++++++++++++++++++++++++++

In addition to the simplest form of test shown above, the test system
supports a number of additional options for creating more complex tests.
These options are all specified using files in the same directory as the
test. Some files apply to a directory as a whole, while others apply to a
single test by sharing the same base filename. Those files which impact the
entire directory are named in upper case, e.g. ``COMPOPTS`` or
``PERFNUMTRIALS``. They can be overridden or augmented with test-specific
settings using the same name but in lower case, e.g. ``foo.compopts``.

Compile-time Arguments
~~~~~~~~~~~~~~~~~~~~~~

To specify arguments to the compiler, provide a ``COMPOPTS`` or ``.compopts``
file for the test. All options for a single compilation should be on the same
line; specifying multiple lines will result in multiple compilations of the
test file. For instance, to specify that the program should be compiled
statically, this file would be provided:

``hi.compopts``

.. code-block:: bash

   --static

To specify that the program should be compiled once statically and once
dynamically, the file would look like this:

``hi.compopts``

.. code-block:: bash

   --static
   --dynamic

Note that sometimes different compilation arguments will result in different
output. `Testing Different Behavior in Different Settings`_ provides guidance
on how a test can respond to such differences without modifying the output
that is generated.

Execution-time Arguments
~~~~~~~~~~~~~~~~~~~~~~~~

Specification of arguments for execution time is performed similarly, using
an ``EXECOPTS`` or ``.execopts`` file. Should both an ``.execopts`` and a
``.compopts`` file be provided for a test, their options will be used in
combination. For example, a test specified like this:

``multiple-options.chpl``

.. code-block:: chapel

   config var x = true;

   if (x) then
     writeln(5);
   else
     writeln(7);

``multiple-options.compopts``

.. code-block:: bash

   --static
   --dynamic

``multiple-options.execopts``

.. code-block:: bash

   --x=true
   --x=false

will be compiled twice and executed four times by ``start_test``:

- Compilation 1: ``chpl --static multiple-options.chpl``
- Execution 1: ``./multiple-options --x=true``
- Execution 2: ``./multiple-options --x=false``
- Compilation 2: ``chpl --dynamic multiple-options.chpl``
- Execution 3: ``./multiple-options --x=true``
- Execution 4: ``./multiple-options --x=false``

Note that sometimes different execution arguments will result in different
output. `Testing Different Behavior in Different Settings`_ provides guidance
on how a test can respond to such differences without modifying the output
that is generated.

Environment Variables
~~~~~~~~~~~~~~~~~~~~~

Environment variables can be set for a particular test or directory using a
``.execenv`` or ``EXECENV`` file. Each environment variable must be specified
on a separate line, but all will be set for a particular run. Here is an
example ``.execenv`` file:

.. code-block:: bash

   CHPL_RT_NUM_THREADS_PER_LOCALE=100

Controlling How It Runs
+++++++++++++++++++++++

The testing system has a variety of files that can fine-tune when a test gets
run. If the test should only be compiled and not executed, mark it with an
empty file with the suffix ``.noexec``, e.g. ``foo.noexec``.

If the test should not be compiled or executed on its own (for instance, if
it is solely a helper file for another test), give it an empty file with the
suffix ``.notest``. A directory with an empty ``NOTEST`` file will similarly
not be run by the testing system (unless its contents are explicitly listed
in the call to ``start_test``).

Limiting Time Taken
~~~~~~~~~~~~~~~~~~~

Normally, ``start_test`` will kill a test that has taken longer than 300
seconds to execute or has been compiling for longer than four times the
execution timeout value. The execution timeout value can be overridden for a
test by specifying the number of seconds in a ``.timeout`` file. It can be
set either higher than the default timeout (for tests that take an unusually
long time to run) or lower (for tests that are expected to finish very
quickly). The former is used more frequently, but the latter is useful when
diagnosing a test failure: if the test is usually quick but occasionally
hangs, a smaller timeout value can reduce the time it takes to run the
testing system when the failure mode does occur.

Note that if the value in this file is longer than the global timeout, any
explicit ``-num-trials`` value or ``.perfnumtrials`` file will be ignored
(see `A Performance Test`_ for more details on the ``-num-trials`` setting).

Tests With Varying Output
+++++++++++++++++++++++++

Limiting Where the Test Runs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Sometimes a test is only applicable to certain test environments: it might
rely on multi-locale state, or change its behavior dramatically depending on
whether optimizations are used, for instance. If a test is only intended to
run in certain settings, a ``SKIPIF`` or ``.skipif`` file should be used.

A directory-wide ``SKIPIF`` file or a test-specific ``.skipif`` file can take
two forms. The first is a line-separated list of easily computed conditions,
any one of which will cause the test not to run in that particular setting.
For instance, the following file would only allow ``foo.chpl`` to run in a
single-locale setting:

``foo.skipif``

.. code-block:: bash

   CHPL_COMM != none

This is useful when the conditions required to skip a test can be easily
determined from the environment. A condition of ``<=`` indicates that the
test should be skipped when the environment variable on the left contains the
contents on the right, while ``>=`` indicates the opposite - this is useful
for imprecise matches, e.g. ``CHPL_HOST_PLATFORM >= cygwin`` would cause the
test to run on both ``cygwin64`` and ``cygwin32``.

The second form a ``.skipif`` or ``SKIPIF`` file can take is that of a
script. This form is intended for conditions that require some computation to
determine, or when a combination of conditions is necessary (i.e. this
setting **and** that setting are required for the behavior we want to avoid).
The script can be in any commonly supported scripting language, usually bash
or python. The ``.skipif`` or ``SKIPIF`` file must have executable
permissions for this form to work. Printing ``True`` to standard output will
result in the test being skipped, while printing ``False`` will result in the
test being run. For instance:

``foo.skipif``

.. code-block:: python

   #!/usr/bin/env python3
   import os
   print(os.getenv('CHPL_TEST_PERF') == 'on' and os.getenv('CHPL_ATOMICS') == 'locks')

would cause the test to be skipped when performance testing is done with
``CHPL_ATOMICS=locks``, but not during ordinary performance testing, or
correctness testing with ``CHPL_ATOMICS=locks``.

Testing Different Behavior in Different Settings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If a test is intended to work in all settings but will have slightly
different behavior in some situations, it is appropriate to add additional
``.good`` files for those settings. Some of these additional ``.good`` files
will be used automatically by the testing system, while others will need to
be specified explicitly in the ``.compopts`` or ``.execopts`` file for the
test.

``start_test`` automatically recognizes ``.good`` files with prefixes for
``--no-local``, communication layer, locale model, and ``chpldoc``. For
example:

- ``.comm-none.good``: used with ``CHPL_COMM=none`` (the unqualified
  ``.good`` file will then apply for ``CHPL_COMM != none``)
- ``.comm-gasnet.good``: used with ``CHPL_COMM=gasnet``
- ``.comm-ofi.good``: used with ``CHPL_COMM=ofi``
- ``.comm-ugni.good``: used with ``CHPL_COMM=ugni``
- ``.no-local.good``: used with ``--no-local`` testing
- ``.na-none.good``: used with ``CHPL_NETWORK_ATOMICS=none``
- ``.tasks-fifo.good``: used with ``CHPL_TASKS=fifo``
- ``.doc.good``: used when testing ``chpldoc`` instead of ``chpl``

Note that ``.comm-``, ``.na-``, and ``.lm-`` can be combined, in that order.
For instance, ``mytest.comm-none.tasks-fifo.good``. Requests can be made for
supporting additional formats if a common format does not appear to be
covered automatically.

If only some compilations or executions of a test need a specialized
``.good`` file, a comment on the same line as the relevant options can be
used. For instance:

``foo.execopts``

.. code-block:: bash

   --x=true # foo.true.good
   --x=false # foo.false.good

will compare test output to ``foo.true.good`` for the first execution and
``foo.false.good`` for the second. Any line that is unlabeled will use the
default ``.good`` for that test. Undefined behavior will occur when both the
``.compopts`` and ``.execopts`` files specify a ``.good`` file in this way.

If you want to use the default arguments for the test but specify a
different ``.good`` file, you can add a line to your compopts/execopts file
as follows (note the space before the ``#``):

.. code-block:: bash

   # foo.execopts
    # foo.true.good
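
As a concrete sketch of how setting-specific ``.good`` files come together
(the test name, ``.numlocales`` value, and expected outputs below are
hypothetical, not part of the sample tests above), consider a test whose
output depends on the number of locales it runs on:

``locCount.chpl``

.. code-block:: chapel

   // prints the number of locales the test was launched on
   writeln(numLocales);

With a ``locCount.numlocales`` file requesting 4 locales, the default
``locCount.good`` would contain ``4``, while a ``locCount.comm-none.good``
containing ``1`` would cover ``CHPL_COMM=none`` configurations, which always
run on a single locale.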

Using precomp, preexec, and prediff files
+++++++++++++++++++++++++++++++++++++++++

When creating a ``.precomp``, ``.preexec``, or ``.prediff`` file, the file
must be an executable. You can turn your script into an executable by
running ``chmod +x foo.precomp``. To specify these files for entire
directories, the files should be named ``PRECOMP``, ``PREEXEC``, and
``PREDIFF``, respectively.

If you wish to have a system-wide ``.prediff`` file, you can use the
``CHPL_SYSTEM_PREDIFF`` environment variable, which takes a comma-separated
list of prediffs to run after every test before comparing to the ``.good``
file.

Using PRETEST
+++++++++++++

``PRETEST`` allows you to run a script once before any test is run in a
directory. This can be used to set up a test, for example, by generating
``.good`` files or creating/building other programs that are used by the
test. The file must be an executable. You can turn your script into an
executable by running ``chmod +x PRETEST``.

Note that the ``PRETEST`` script will not be run for any subdirectories and
must be either duplicated or referenced via a symbolic link to the parent
directory. You can add a symlink to a file in a parent directory by running
``ln -s ../PRETEST PRETEST``.

A Performance Test
------------------

This section covers how to make a performance test, including:

- how to indicate it is a performance test
- how to specify which parts of the output should be tracked
- how to validate the output
- how to specify compilation and execution options that are different from
  the test's normal run
- how to track output for multiple tests
- how to compare against a version written in C
- how to graph the data that has been tracked

[Files used to illustrate the running example here can be found at
`$CHPL_HOME/test/Samples/Performance`_ in the Chapel source repository]

.. _`$CHPL_HOME/test/Samples/Performance`: https://github.com/chapel-lang/chapel/pull/8971

Identifying Performance Keys
++++++++++++++++++++++++++++

Most of the information above pertains to the creation of a correctness
test, in which the test's output is compared to a ``.good`` file. The
testing system also supports performance tests, in which one or more values
from a test's output can be tracked on a nightly basis and optionally
graphed. Information about running a performance test can be found in
`Performance Testing`_.

Performance tests are specified using a ``.perfkeys`` file, which lists
strings that the test system should look for in the output, serving as
prefixes for a piece of data to track. When crawling a directory hierarchy,
only tests with ``.perfkeys`` files will be considered when testing in
performance mode.

For example, if a test named ``foo.chpl`` generates output in the following
format:

.. code-block:: text

   Time: 194.3 seconds
   Memory: 24GB
   Validation: SUCCESS

one could track the two numeric values using a ``.perfkeys`` file as
follows:

``foo.perfkeys``

.. code-block:: text

   Time:
   Memory:

As the test system runs, it will look for the specified performance keys in
the test output and store the string following each key as part of the
performance test output (described below). Note that one could also track
the Validation string in this way, though there are better ways to track
success/failure conditions, described in the next section.

Validating Performance Test Output
++++++++++++++++++++++++++++++++++

In addition to identifying key-value pairs to track, performance testing can
also do some simple validation of test output using regular expression-based
matching. A line starting with ``verify:[line#:]`` (or ``reject:[line#:]``)
followed by a regular expression will ensure that the test output contains
(does not contain) the given regular expression, and count any surprises as
failures in the testing results. The optional line# constrains what line
number the output must appear on, where a negative number indicates that the
counting should start at the end of the file.

For example, adding a third line to the ``.perfkeys`` file, we can also
verify that the last line of output contains the string "SUCCESS":

``foo.perfkeys``

.. code-block:: text

   Time:
   Memory:
   verify:-1: SUCCESS

Accumulating Performance Data in .dat files
+++++++++++++++++++++++++++++++++++++++++++

The values collected during performance testing are stored in a ``.dat``
file in the directory specified by ``$CHPL_TEST_PERF_DIR`` (if undefined,
the test system defaults to ``$CHPL_HOME/test/perfdat/``).

Each time the test is run in performance mode, a new line of data is added
to the end of the ``.dat`` file. The line will start with the date, and the
data for each key will be tab-separated. The base name for the ``.dat`` file
is taken from the ``.perfkeys`` file. For example, the output for the test
above would be stored in a file named ``foo.dat``. Here is a sample ``.dat``
file, for the performance test at `$CHPL_HOME/test/Samples/Performance`_:

.. code-block:: text

   # Date     Time:   Memory:
   03/26/18   194.3   24
   04/02/18   194.3   24

Because the lines are tab-separated, the key will not necessarily "line up"
visually with the corresponding header. Modifying these files by hand is
inadvisable.

Performance tests submitted to the Chapel repository are run on a nightly
basis, generating these ``.dat`` files. Modifications to the ``.perfkeys``
files that specify them **will** impact the ``.dat`` files that have already
been generated, so please be careful when updating existing performance
tests.

Note that in practice, most tests are written to be run in both a
correctness and a performance mode, using a ``bool config const`` to skip
the printing of nondeterministic data such as the Time (and possibly Memory)
values above. We tend to make tests run in performance mode by default and
use a ``foo.execopts`` file to make the correctness testing flip this switch
(since end users will typically want the performance data on, and there's
nothing worse than firing off a long run only to find you didn't turn on the
performance metrics).
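
As an illustration of that pattern, here is a minimal sketch of what such a
test might look like. The test name, the ``printTiming`` flag, and the
computation being timed are hypothetical; the ``stopwatch`` type comes from
Chapel's standard ``Time`` module.

``elapsed.chpl``

.. code-block:: chapel

   use Time;

   // correctness testing would pass --printTiming=false via elapsed.execopts
   config const printTiming = true;

   var s: stopwatch;
   s.start();
   var sum = 0;
   forall i in 1..1000000 with (+ reduce sum) do
     sum += i;
   s.stop();

   // deterministic output, checked against the .good file in both modes
   writeln("Validation: ", if sum == 500000500000 then "SUCCESS" else "FAILURE");

   // nondeterministic output, only printed (and tracked) in performance mode
   if printTiming then
     writeln("Time: ", s.elapsed(), " seconds");

One arrangement of the helper files would then be an ``elapsed.execopts``
containing ``--printTiming=false`` (keeping correctness output
deterministic), an ``elapsed.perfexecopts`` containing ``--printTiming=true``
(overriding it for performance runs), and an ``elapsed.perfkeys`` containing
``Time:`` to track the timing.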

Other Performance Testing Options
+++++++++++++++++++++++++++++++++

Like correctness testing, performance testing supports the ability to
specify different compiler and execution-time options, etc. This is done
using files, as in correctness testing, where the filenames tend to start
with ``PERF*`` or ``.perf*``. For example, ``foo.perfcompopts`` would
specify compiler options that should be used when compiling the test for
performance mode, while ``foo.perfexecopts`` specifies execution-time
options for performance testing.

Comparing Multiple Versions
+++++++++++++++++++++++++++

Performance tests are usually most interesting when comparing multiple
things to one another -- for example, multiple implementations of an
algorithm, a test compiled in various configurations, a Chapel vs. C
version, etc. The approach typically taken here is to have each
configuration write output to its own ``.dat`` file and then to graph
columns from various ``.dat`` files against one another.

To compare multiple distinct Chapel tests, the approach is easy: simply make
each one a performance test with a distinct name. (In fact, Chapel
performance tests must have unique names across the entire testing system
because all ``.dat`` files are placed into a single directory at the end;
the system itself checks for conflicts and complains if it finds any.)

To compare a single Chapel test compiled or run in multiple configurations,
the approach taken is to use multi-line versions of the ``.perfcompopts``
or ``.perfexecopts`` files, where each line represents a different
configuration that should be tested. Each option line should be concluded
with a ``#`` comment delimiter, after which a ``.perfkeys`` file should be
named. For example, to compare two problem sizes, one might use:

``bar.perfexecopts``

.. code-block:: text

   --n=100 # bar-100.perfkeys
   --n=10000 # bar-10000.perfkeys

This would cause ``bar.chpl`` to be compiled once and executed twice, once
with ``--n=100`` and once with ``--n=10000``. The first execution would use
``bar-100.perfkeys`` for its performance keys and write its output to
``bar-100.dat``, while the second would use ``bar-10000.perfkeys`` and write
its output to ``bar-10000.dat``.

Comparing to a C version
~~~~~~~~~~~~~~~~~~~~~~~~

To compare a C version of a test to a Chapel version, the C version of the
test must end with the suffix ``.test.c`` for single-locale tests and
``.ml-test.c`` for multilocale tests. Since ``.dat`` files must have unique
names, the base name for the C test should differ from the Chapel
equivalent. For example, one might name the C version of the ``foo.chpl``
performance test ``foo-c.test.c``. Like any other test, the C test needs a
``.good`` file for correctness testing and a ``.perfkeys`` file for
performance testing. C versions do not have to be performance tests, but
this is their most common use case.

Creating a graph comparing multiple variations
++++++++++++++++++++++++++++++++++++++++++++++

Once you are creating multiple ``.dat`` files containing data you would like
to graph, you'll create a ``.graph`` file indicating which data from which
``.dat`` files should be graphed. For example, to compare the timing data
from the ``foo.chpl`` and ``foo-c.test.c`` tests described above, one might
use the following ``foo.graph`` file (note that the graph file's basename
need not have any relation to the tests it is graphing, since graphs
typically pull from multiple ``.dat`` files; making the filename useful to
human readers is the main consideration).

``foo.graph``

.. code-block:: text

   perfkeys: Time:, Time:
   files: foo.dat, foo-c.dat
   graphkeys: Chapel version, C version
   ylabel: Time (seconds)
   graphtitle: Sample Performance Test (Bogus)

Briefly, the following three entries need to have the same arity,
corresponding to the lines in the graph:

* ``perfkeys:`` is a comma-separated list of perfkeys to graph from...
* ``files:`` ...the comma-separated list of ``.dat`` files, respectively
* ``graphkeys:`` a comma-separated list of strings to use in the graph's
  legend

The following two entries are singletons:

* ``ylabel:`` a label for the graph's y-axis (the x-axis will be the date
  the test was run by default)
* ``graphtitle:`` a title for the graph as a whole

Finally, add the ``.graph`` file to ``$CHPL_HOME/test/GRAPHFILES``. This
file is separated into a number of suites (indicated by comments) followed
by graphs that should appear in those suites (a graph may appear in multiple
suites). This file determines how graphs are organized on the Chapel
performance graphing webpages (currently hosted at
``http://chapel-lang.org/perf/``).
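
For reference, entries in ``GRAPHFILES`` are paths to ``.graph`` files
(relative to ``$CHPL_HOME/test``) grouped under comments naming the suites.
A hypothetical entry for the running example might look like:

.. code-block:: text

   # Sample Tests
   Samples/Performance/foo.graph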

Once the ``.graph`` file exists and is listed in ``GRAPHFILES``, running
``start_test --performance`` will cause the test system to not only create
the ``.dat`` files, but also to create a graph as described in the
``.graph`` file. To view the graph, point your browser to
``$CHPL_TEST_PERF_DIR//html/index.html``. Then select the suite(s) in which
your graph appears, and you should see data for it. (Note that for a new
graph with only one day of data, it can be hard to see the singleton points
at first.)

Multilocale Performance Testing
+++++++++++++++++++++++++++++++

Writing a performance test for a multilocale setting has similarities to
both single-locale performance testing and multilocale correctness testing.
However, the helper file suffixes differ from the previously covered ones as
follows:

========================= =======================
Single Locale Performance Multilocale Performance
========================= =======================
``.perfexecopts``         ``.ml-execopts``
``.perfcompopts``         ``.ml-compopts``
``.perfkeys``             ``.ml-keys``
``.graph``                ``.ml-perf.graph``
``.execenv``              ``.ml-execenv``
========================= =======================

======================= =======================
Multilocale Correctness Multilocale Performance
======================= =======================
``.numlocales``         ``.ml-numlocales``
======================= =======================

Graph files for multilocale performance tests are listed in
``ML-GRAPHFILES`` instead of ``GRAPHFILES``. Finally, to run a multilocale
performance test, ``start_test --perflabel ml-`` must be used.

Multilocale Communication Counts Testing
++++++++++++++++++++++++++++++++++++++++

Another type of multilocale testing is one in which the number of
communication calls (e.g. GETs, PUTs, ONs) generated is tracked. These
numbers can be obtained with the help of the `CommDiagnostics`_ module and
printed out much like elapsed time or throughput.

.. _`CommDiagnostics`: https://chapel-lang.org/docs/modules/standard/CommDiagnostics.html

Communication counts testing is only applicable in a multilocale setting,
and it is similar to multilocale performance testing. However, the ``cc-``
label is used for helper files instead of ``ml-``.
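
As a sketch of the idea (the test name, the keys, and the communication
pattern below are made up for illustration), a communication counts test
might start and stop diagnostics around the region of interest and print the
totals as trackable keys using routines from the `CommDiagnostics`_ module:

``remoteWrite.chpl``

.. code-block:: chapel

   use CommDiagnostics;

   var x = 0;

   startCommDiagnostics();
   on Locales[numLocales-1] do
     x = 42;                  // remote write back to locale 0
   stopCommDiagnostics();

   // sum the per-locale counters and print them as trackable keys
   var gets, puts: uint = 0;
   for d in getCommDiagnostics() {
     gets += d.get;
     puts += d.put;
   }
   writeln("GETs: ", gets);
   writeln("PUTs: ", puts);

Following the substitution described above, the corresponding keys file
would be ``remoteWrite.cc-keys``, listing ``GETs:`` and ``PUTs:``.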

Test Your Test Before Submitting
++++++++++++++++++++++++++++++++

Before submitting your test for review, be sure that it works in each
applicable mode when running within the directory (or directories) in
question:

- ``start_test``
- ``start_test --performance``
- ``start_test --perflabel ml-`` (if applicable)
- ``start_test --perflabel cc-`` (if applicable)

Nothing is more embarrassing than committing a test that doesn't work on day
one. Once the test(s), ``.graph`` files, and ``GRAPHFILES`` entries are
committed to the Chapel repository, they will start showing up on the Chapel
public pages as well.

A Test That Tracks A Failure
----------------------------

The testing system also serves as our current system for tracking
code-driven bugs and open issues. When a bug is encountered (either by a
user or a developer), if it is not quickly resolved then it will be tracked
by making what is known as a *future*.

When making a new test that is a future, follow the guidelines for making a
correctness test. Like normal correctness tests, a future will specify a
``.good`` file with its intended output. However, the future is not expected
to match against the ``.good`` file when the future is filed - developer
effort is usually required to fix the bug. Once this test is created (or if
a test already exists), add a ``.future`` file sharing the same base name as
the test to mark it as a future. For example, adding a ``hi.future`` file
would make the simple correctness test at the start of this document into a
future test.

Marking a test as a future causes it to be tested every night, but not to be
counted against the compiler's success/failure statistics. If/when the
future matches its ``.good`` file, developers will be alerted by the testing
system.

The format of the ``.future`` file itself is minimally structured. The first
line should contain the type of future (see the list below) followed by a
brief (one 80-column line) description of the future, which ideally reflects
the associated GitHub issue title. The next line should contain the
associated GitHub issue number in the ``#issue-number`` format, e.g. ``#1``.
The rest of the file is optional and free-form. It can be used over the
future's lifetime to describe in what way the test isn't working or should
be working, implementation notes, philosophical arguments, etc.

The current categories of futures reflect GitHub labels:

* **bug**: this test exhibits a bug in the implementation
* **error message**: this test correctly generates an error message, but the
  error message needs clarification/improvement
* **feature request**: a way of filing a request for a particular feature in
  Chapel
* **performance**: indicates a performance issue that needs to be addressed
* **design**: this test raises a question about Chapel's semantics that we
  ultimately need to address
* **portability**: indicates a portability issue that needs to be addressed
* **unimplemented feature**: this test uses features that are specified, but
  which have not yet been implemented.
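
Putting that format together, a ``.future`` file might look like the
following (the description and issue number here are purely hypothetical):

``hi.future``

.. code-block:: text

   bug: hi.chpl prints its greeting twice
   #1234

   Optional free-form notes about the failure, possible workarounds, or
   design discussion can follow here.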

GitHub Issues
+++++++++++++

Currently, it is mandatory to include a GitHub issue number with any new
futures. That said, futures that pre-date Chapel's adoption of GitHub issues
may have a description instead of an issue number. When filing a bug report
as an issue, it is considered good practice to include a future for the
issue tracked on the `GitHub issues page`_.

.. _`GitHub issues page`: https://github.com/chapel-lang/chapel/issues

Tracking Current Failure Mode
+++++++++++++++++++++++++++++

Sometimes a future will change its behavior, but not be resolved. The future
should be updated to continue to track the issue as much as possible - to
alert developers when this happens, it is necessary to track not only the
expected good output but also the output indicating the current failure.
This is done via a ``.bad`` file.

The contents of a ``.bad`` file are similar to a ``.good`` file and should
match the currently generated output of the test. Tests whose
current/``.bad`` output varies based on the compiler version number, line
numbers of standard modules, and such are fragile since these things change
frequently; in such cases, either a ``.prediff`` should be used to filter
the output before comparing to the ``.bad``, or the ``.bad`` should be
omitted. Ultimately, our intention is to support a library of common recipes
for ``.bad`` files, but this has not been implemented yet.

An easy way to obtain this file is to run the future once using
``start_test`` - the output for that configuration can then be found in a
``.out.tmp`` file in the same directory as the test.

Resolving a Future
++++++++++++++++++

There are three situations under which a future will get resolved:

1) A developer explicitly works on resolving the future.

2) A developer works on another feature or issue and as a consequence the
   future gets resolved.

   - This could happen if the two issues appeared to be unrelated, or if the
     existence of the future had been forgotten.

3) A developer examines the future and determines the current behavior is
   correct.

   - The developer may then either remove the supporting files for the
     future, or remove the test entirely.

Invoking start_test
===================

A brief description of flags that can be used with ``start_test`` itself can
be obtained by calling ``start_test -h``.

Correctness Testing
-------------------

The section titled `A Correctness Test`_ demonstrates invoking
``start_test`` on a single explicitly-named file. More generally,
``start_test`` takes a list of test and directory names on the command line
and will run all tests explicitly named or contained within the directories
(or their subdirectories). For example:

  ``start_test foo.chpl bar/baz.chpl typeTests/ OOPTests/``

will test the two explicitly-named tests (``foo.chpl`` and ``baz.chpl``
stored in the ``bar/`` directory). It will also recursively search for any
tests stored in the ``typeTests/`` and ``OOPTests/`` subdirectories. If
invoked without any arguments, ``start_test`` will start in the current
directory and recursively look for tests in subdirectories.

If invoked with the ``--valgrindexe`` flag, ``start_test`` will compile the
program and execute it under ``valgrind``. The ``--valgrind`` flag does the
same, plus it also runs the compiler under ``valgrind``, which increases
testing time compared to ``--valgrindexe``. To learn about best practices
with ``valgrind``, see ``Valgrind.rst``.

Parallel Testing
++++++++++++++++

To run correctness tests in parallel, ``paratest.local`` can be invoked
directly. For example:

  ``(cd $CHPL_HOME/test && $CHPL_HOME/util/test/paratest.local -dirs deprecated -dirs unstable)``

This command will run all tests in ``$CHPL_HOME/test/deprecated`` and
``$CHPL_HOME/test/unstable`` using 10 processes. Note that the parallelism
is at directory-level granularity, so if a directory is flat (containing
only files) it will still run serially with this command.

GPU Testing
+++++++++++

To run tests with the GPU locale model, the environment variable
``CHPL_TEST_GPU`` needs to be set. For more information on running tests
with GPUs, see the :ref:`GPU tech note `.

Performance Testing
-------------------

To run performance testing, add the ``--performance`` flag to ``start_test``
along with the traditional options. So, for example, to run this single test
in performance mode, one could use:

  ``start_test --performance foo.chpl``

When crawling a directory hierarchy, only tests with ``.perfkeys`` files
will be considered when testing in performance mode. All performance tests
are compiled with ``--fast`` by default, and with ``--static`` when that is
not problematic for the target configuration.

Sample Output
-------------

The output from a ``start_test`` run will begin with a list of the settings
used, following the environment settings as obtained from ``printchplenv``
(see `Setting up Your Environment for Chapel`_). This will be followed by
information from running the individual tests or directories.

.. _Setting up Your Environment for Chapel: https://chapel-lang.org/docs/usingchapel/chplenv.html

The output from ``start_test`` will end with the location of the log file
containing all the output from its execution, as well as a summary of all
tests that failed and any futures that were run. This will look something
like this:

.. code-block:: text

   [Test Summary - 180328.134706]
   [Error matching program output for path/to/failing/correctness/test]
   Future (bug: description of bug from future file) [Error matching program output for path/to/failing/future]
   Future (bug: description of bug from future file) [Success matching program output for path/to/passing/future]
   [Summary: #Successes = 1 | #Failures = 1 | #Futures = 2 | #Warnings = 0 ]
   [Summary: #Passing Suppressions = 0 | #Passing Futures = 1 ]
   [END]

Successful tests will not be printed after the line beginning with ``[Test
Summary`` unless they had a ``.future`` file (see `A Test That Tracks A
Failure`_ for information about ``.future`` files).

When nightly testing is run, core developers will be notified of every
configuration with a new failure, warning, passing suppression, and/or
passing future.

Summary of Testing Files
========================

.. TODO: When we move these docs to Sphinx, add :ref:'s to other parts of
   file, within this table

The following table serves as a quick reference for the various test files,
and as a table of contents for this page. It is not necessarily complete,
and not all of it has been covered in this document. Please ask a member of
the core team for more information on a specific file. The file base name
``foo`` is used for the filenames in this table.

================= ===========================================================
File              Contents of file
================= ===========================================================
**correctness**
-----------------------------------------------------------------------------
foo.chpl          Chapel test program to compile and run
foo.test.c        Single locale C test program to compile and run. See
                  `Comparing to a C version`_ for more information
foo.ml-test.c     Multilocale C test program to compile and run. See
                  `Comparing to a C version`_ for more information
foo.good          expected output of the test program
..
-----------------------------------------------------------------------------
**Test Settings**
-----------------------------------------------------------------------------
foo.compopts      line separated compiler flag configurations. See
                  `Compile-time Arguments`_ for more information
COMPOPTS          directory-wide compiler flags
foo.execopts      line separated runtime flag configurations. See
                  `Execution-time Arguments`_ for more information
EXECOPTS          directory-wide runtime flags
foo.execenv       line separated list of environment variable settings. See
                  `Environment Variables`_ for more information
EXECENV           directory-wide environment variables
foo.numlocales    number of locales to use in a multilocale run
NUMLOCALES        directory-wide number of locales to use in multilocale runs
..
-----------------------------------------------------------------------------
**Helper files**
-----------------------------------------------------------------------------
foo.catfiles      line separated list of files to include when validating
                  the expected output
CATFILES          directory-wide list of files to compare with output
foo.prediff       script that is run on the test output, before taking the
                  diff between the output and .good file
PREDIFF           directory-wide script that is run over test output
foo.precomp       script that is run prior to compilation of the test program
PRECOMP           directory-wide script that is run prior to compilation
foo.preexec       script that is run prior to execution of the test program
PREEXEC           directory-wide script that is run prior to execution
PRETEST           script that is run once per directory prior to any test
                  being run
..
-----------------------------------------------------------------------------
**Testing System Settings**
-----------------------------------------------------------------------------
foo.cleanfiles    line separated list of files to remove before the next
                  test run
CLEANFILES        directory-wide list of files to remove before test runs
foo.noexec        empty file. Indicates the .chpl file should only be
                  compiled, not executed. See `Controlling How It Runs`_ for
                  more information.
NOEXEC            Indicates all .chpl files in this directory should only be
                  compiled, not executed.
foo.notest        empty file. Indicates the file should not be run
                  explicitly. See `Controlling How It Runs`_ for more
                  information.
NOTEST            empty file.
                  Indicates the directory should not be run
foo.skipif        line separated list of conditions under which the test
                  should not be run, or a script to compute the same. See
                  `Limiting Where the Test Runs`_ for more information
SKIPIF            same as above, but applied to the entire directory
foo.suppressif    line separated list of conditions under which the test is
                  expected to fail, or a script to compute the same. Note
                  that unless otherwise specified, a ``.skipif`` or
                  ``.future`` is likely more appropriate for the test.
foo.timeout       time in seconds after which start_test should stop this
                  test. See `Limiting Time Taken`_ for more information
..
-----------------------------------------------------------------------------
**performance** (replace "perf" with "ml-" and "cc-" as necessary)
-----------------------------------------------------------------------------
foo.perfcompopts  compiler flags, overrides .compopts for --performance
PERFCOMPOPTS      directory-wide performance compiler flags
foo.perfexecopts  runtime flags, overrides .execopts for --performance
PERFEXECOPTS      directory-wide performance runtime flags
foo.perfexecenv   environment variables, overrides .execenv for --performance
PERFEXECENV       directory-wide performance environment variables
foo.perfnumtrials number of execution trials to run if no timeout specified
PERFNUMTRIALS     directory-wide number of execution trials to run
foo.perftimeout   time in seconds after which start_test should stop this test
foo.perfkeys      keys to search for in the output
foo.graph         Specifies which data files and perfkeys to graph, and
                  contains meta-data associated with labeling data sets,
                  axes, and graphs
test/GRAPHFILES   Acts as an index that tracks all .graph files that should
                  be graphed.
..
-----------------------------------------------------------------------------
**futures**
-----------------------------------------------------------------------------
foo.future        Describes the future being tested, following the
                  newline-separated format of: *category*, *title*, *issue #*
foo.bad           output generated on a failing test, to track if a known
                  failing future begins failing a different way. See
                  `Tracking Current Failure Mode`_ for more information
..
================= ===========================================================