Testplan

One of the inputs to sample-tester is the “testplan”, which outlines how to run the samples and what checks to perform.

  1. The testplan is specified in any number of YAML documents that live inside any number of YAML files. Each YAML file may contain multiple YAML documents, separated by the standard --- YAML document separator. Each testplan document self-identifies as such via the top-level field type: test/samples (a sketch of such a multi-document file follows the directive list below).

    • For backwards compatibility, any document that does not have a top-level type: field will be treated as a testplan if the name of the file in which it appears ends in .yaml but not in .manifest.yaml.
  2. The testplan can have any number of test suites.

  3. Each test suite can have setup, teardown, and cases sections.

  4. The cases section is a list of test cases. For _each_ test case, setup is executed before running the test case and teardown is executed after.

  5. setup, teardown, and each cases[...].spec are each a list of directives and their arguments. The directives can be any of the following:

    • log: print the arguments, printf style
      • Substrings of the form {} are interpolated with the corresponding positional arguments specified

      • Substrings of the form {id:name} are substituted with the value of the manifest tag corresponding to the key name for the sample identified by id. This can be useful when debugging your test to make sure that all the tags are as you expect.

        • id must resolve to a sample ID specified in the manifest file
        • if name does not match any tag key for the sample id, the substring is substituted with the empty string
        • if name is not specified, the substring is substituted with a serialized representation of all the tags specified for the sample id
    • uuid: return a uuid (if called from YAML, it is assigned to the variable named in the argument; see the sketch after this list)

    • shell: run in the shell the command specified in the argument

    • call: call the artifact named in the argument; error if the call fails

    • call_may_fail: call the artifact named in the argument; do not error even if the call fails

    • assert_contains: require the output of the last call* (i.e., the most recent call or call_may_fail) to contain all of the strings provided (case-insensitively); abort the test case otherwise

    • assert_excludes (and its previous, deprecated form assert_not_contains): require the output of the last call* to contain none of the strings provided (case-insensitively); abort the test case otherwise

    • assert_contains_any: require the output of the last call* to contain at least one of the strings provided (case-insensitively); abort the test case otherwise

    • assert_excludes_any: require that at least one of the strings provided be absent (case-insensitively) from the output of the last call*; abort the test case otherwise

    • assert_success: require that the exit code of the last call_may_fail was 0; abort the test case otherwise. If the preceding call was just a call, it would have already failed on a non-zero exit code.

    • assert_failure: require that the exit code of the last call_may_fail or call was NOT 0; abort the test case otherwise. Note, though, that if we’re executing this after just a call, that call must have succeeded, so this assertion will fail.

    • env: assign the value of an environment variable (identified by variable) to a test case variable (given by name); see the sketch after this list

    • extract_match: extract regex matches into local variables

    • code: execute the argument as a chunk of Python code. The directives above are also available inside code as Python calls with the same names. In addition, the following functions are available inside Python code only:

      • fail: mark the test as having failed, but continue executing
      • abort: mark the test as having failed and stop executing
      • assert_that: if the condition in the first argument is false, abort the test case
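To make the file structure concrete, here is a minimal sketch of a single file holding two testplan documents separated by the --- document separator; the suite names, case names, and messages are hypothetical, and the layout mirrors the full example further below:

type: test/samples
schema_version: 1
test:
  suites:
  - name: "First suite"
    cases:
    - name: "First smoke test"
      spec:
      - log:
          - 'hello from the first document'
---
type: test/samples
schema_version: 1
test:
  suites:
  - name: "Second suite"
    cases:
    - name: "Second smoke test"
      spec:
      - log:
          - 'hello from the second document'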

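The uuid and env directives are not exercised in the example below, so here is a hedged sketch of a test case using them. The variable names are hypothetical, the env argument keys (variable and name) follow the description above, and the sketch assumes, as the interleaving example later in this file suggests, that variables set by one directive remain visible to later code directives in the same case:

    - name: "Variables from uuid and env"
      spec:
      - uuid: run_id          # assign a fresh uuid to the variable run_id
      - env:
          variable: HOME      # environment variable to read
          name: home_dir      # test case variable to assign
      - code: |
          log('run {} started from {}'.format(run_id, home_dir))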
Here is an illustrative example of a testplan file:

type: test/samples
schema_version: 1
test:
  suites:
  - name:  "Language samples test"
    setup:     # can have yaml and/or code, just as in the cases below
      - code:
          log('In setup "hi"')
    teardown:  # can have yaml and/or code, just as in the cases below
      - code:
          log('In teardown bye')          
    cases:
      
    - name: "A test defined via yaml directives"
      spec:
      - log:
          - 'Reading from manifest at {language_analyze_sentiment_text:@manifest_source}'
      - call:
          sample: "language_analyze_sentiment_text"
          params:
            content:
              literal: "happy happy smile @hope"
      - assert_success: [] # try assert_failure to see how failure looks
      - assert_contains:
          - message: "Have score and magnitude"
          - literal: "score"
          - literal: "magnitude"
      - assert_contains_any:
          - message: "Have magnitude or strength"
          - literal: "strength"
          - literal: "magnitude"
      - assert_contains:
          - message: "Score is very positive"
          - literal: "score: 0.8"
      - assert_contains:
          - message: "Magnitude is very positive"
          - literal: "magnitude: 0.8"
      - assert_excludes:
          - message: "Random message"
          - literal: "The rain in Spain falls mainly in the plain"
      - assert_not_contains:        # deprecated: use "assert_excludes" instead
          - message: "Random message"
          - literal: "The rain in Spain falls mainly in the plain"
# Above is the typical usage

    - name: "A test defined via 'code'"
      spec:
      - code: |
          log('Reading from manifest at {language_analyze_sentiment_text:@manifest_source}')
          
          out = call("language_analyze_sentiment_text", content="happy happy smile hope")
          assert_success("that should have worked", "well")

          assert_contains('score', 'magnitude', message='Have both score and magnitude')
          assert_contains_any('strength', 'magnitude', message='Have either strength or magnitude')
          
          import re

          score_found = re.search('score: ([0123456789.]+)', out) 
          assert_that(score_found is not None, 'score matches regexp')
          score = float(score_found.group(1))
          assert_that(score > 0.7, 'score is high')

          magnitude_found = re.search('magnitude: ([0123456789.]+)', out)
          assert_that(magnitude_found is not None, 'magnitude matches regexp')
          magnitude = float(magnitude_found.group(1))
          assert_that(magnitude > 0.7, 'magnitude is high')

          assert_excludes("the rain in Spain", message="random message")

          # deprecated: use "assert_excludes" instead
          assert_not_contains("the rain in Spain",message="random message")
          
    - name: "A test defined via 'code', with explicit calls to specific samples"
      spec:
      - code: |
          _, out = shell("python3 examples/mock-samples/python/language-v1/analyze_sentiment_request_language_sentiment_text.py -content='happy happy smile hope'")

      # You can interleave yaml and code!
      - assert_success:
        - "that should have worked {}"  
        - well

      - code: |
          import re

          score_found = re.search('score: ([0123456789.]+)', out)  # TODO: Can this be negative?
          assert_that(score_found is not None, 'score matches regexp')
          score = float(score_found.group(1))
          assert_that(score > 0.7, 'score is high')
          home = env('HOME')
          log('home directory: {}'.format(home))

          magnitude_found = re.search('magnitude: ([0123456789.]+)', out)
          assert_that(magnitude_found is not None, 'magnitude matches regexp')
          magnitude = float(magnitude_found.group(1))
          assert_that(magnitude > 0.7, 'magnitude is high')

This testplan contains three equivalent versions of the same test: the first uses canonical artifact paths in the declarative style (YAML directives), the second uses canonical artifact paths in the imperative style (a code block), and the third uses explicit artifact paths in the imperative style (which you would rarely do, since the point of this tool is to avoid hardcoding different paths to semantically identical samples).

Unless you specify explicit paths to each sample (which means your testplan cannot run for different languages/environments simultaneously), you will need one or more manifest files (*.manifest.yaml) listing the paths and identifiers of each sample in each language/environment. Refer to the Manifest file format page for an explanation of the structure of the *.manifest.yaml files.
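For orientation only, a manifest entry for the sample used above might look like the following minimal sketch; the type and schema_version values, the tag names (environment, bin), and the use of a YAML anchor for shared tags are all assumptions here, so treat the Manifest file format page as authoritative:

type: manifest/samples
schema_version: 3
base: &common
  environment: python   # hypothetical tag identifying the language/environment
  bin: python3          # hypothetical tag naming the interpreter used to run samples
samples:
- <<: *common
  sample: language_analyze_sentiment_text
  path: examples/mock-samples/python/language-v1/analyze_sentiment_request_language_sentiment_text.py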