Tool: mp_seq

Overview

The mp_seq tool provides an interface to generate stressmarks that execute sequences built by combining the provided instructions so that they comply with the given requirements. Sequence stressmarks are loops in which a particular sequence is repeated several times. They are well suited to core characterization during the first steps of the maximum power generation process (i.e. finding the sequence that maximizes the power consumption of the cores). Users can control the level of instruction-level parallelism and the loop size via configurable parameters.

Basic usage

> mp_seq -T TARGET -D OUTPUT_DIR -ig group1 group2

where:

Flag/Argument	Description
`-ig groups`, `--instruction-groups groups`	Comma separated list of instruction candidates per group. E.g. -ig ins1,ins2 ins3,ins4 defines two groups of instruction candidates: group 1: ins1,ins2 and group 2: ins3,ins4.
`-T TARGET`, `--target TARGET`	Target definition string. Check: Command line target definition scheme.
`-D OUTPUT_DIR`, `--seq-output-dir OUTPUT_DIR`	Output directory.

There are other parameters to tune the generated code. Check the rest of this document for details. The example section walks through a detailed use case that uses the main sequence-related parameters.

Full usage

mp_seq.py: INFO: Processing input arguments...
usage: mp_seq.py [-h] [-P SEARCH_PATH [SEARCH_PATH ...]] [-V] [-v] [-d]
                 [-c CONFIG_FILE [CONFIG_FILE ...]] [-C FORCE_CONFIG_FILE]
                 [--dump-configuration-file OUTPUT_CONFIG_FILE]
                 [--dump-full-configuration-file OUTPUT_CONFIG_FILE]
                 [-A ARCHITECTURE_PATHS] [-M MICROARCHITECTURE_PATHS]
                 [-E ENVIRONMENT_PATHS] -T TARGET [--list-architectures]
                 [--list-microarchitectures] [--list-environments]
                 [--traceback] [--profile PROFILE_OUTPUT] -D SEQ_OUTPUT_DIR
                 [-is INSTRUCTION_SLOTS]
                 [-ig INSTRUCTION_GROUPS [INSTRUCTION_GROUPS ...]]
                 [-im INSTRUCTION_MAP [INSTRUCTION_MAP ...]] [-bs BASE_SEQ]
                 [-gM GROUP_MAX [GROUP_MAX ...]]
                 [-gm GROUP_MIN [GROUP_MIN ...]] [-B BENCHMARK_SIZE]
                 [-dd DEPENDENCY_DISTANCE] [-fs] [-e] [-R] [-p]
                 [-bn BATCH_NUMBER] [-nb NUM_BATCHES] [-s] [-combinations]
                 [-sn] [-CC] [-N]

Microprobe seq tool

optional arguments:
  -h, --help            show this help message and exit
  -P SEARCH_PATH [SEARCH_PATH ...], --default_paths SEARCH_PATH [SEARCH_PATH ...]
                        Default search paths for microprobe target definitions
  -V, --version         Show Microprobe version and exit
  -v, --verbosity       Verbosity level (Values: [0,1,2,3,4]). Each time this
                        argument is specified the verbosity level is
                        increased. By default, no logging messages are shown.
                        These are the four levels available:
                        
                          -v (1): critical messages
                          -v -v (2): critical and error messages
                          -v -v -v (3): critical, error and warning messages
                          -v -v -v -v (4): critical, error, warning and info messages
                        
                        Specifying more than four verbosity flags, will
                        default to the maximum of four. If you need extra
                        information, enable the debug mode (--debug or -d
                        flags).
  -d, --debug           Enable debug mode in Microprobe framework. Lots of
                        output messages will be generated

Configuration arguments:

  Command arguments related to configuration file handling

  -c CONFIG_FILE [CONFIG_FILE ...], --configuration CONFIG_FILE [CONFIG_FILE ...]
                        Configuration file. The configuration files will be
                        readed in order of appearance. Values are reset by the
                        last configuration file in case of non-list values.
                        List values will be appended (not reset)
  -C FORCE_CONFIG_FILE, --force-configuration FORCE_CONFIG_FILE
                        Force configuration file. Use this configuration file
                        as the default start configuration. This disables any
                        system-wide, or user-provided configuration.
  --dump-configuration-file OUTPUT_CONFIG_FILE
                        Dump a configuration file with the actual
                        configuration used
  --dump-full-configuration-file OUTPUT_CONFIG_FILE
                        Dump a configuration file with the actual
                        configuration used plus all the configuration options
                        not set

Target path arguments:

  Command arguments related to target paths

  -A ARCHITECTURE_PATHS, --architecture-paths ARCHITECTURE_PATHS
                        Search path for architecture definitions. Microprobe
                        will search in these paths for architecture
                        definitions
  -M MICROARCHITECTURE_PATHS, --microarchitecture-paths MICROARCHITECTURE_PATHS
                        Search path for microarchitecture definitions.
                        Microprobe will search in these paths for
                        microarchitecture definitions
  -E ENVIRONMENT_PATHS, --environment-paths ENVIRONMENT_PATHS
                        Search path for environment definitions. Microprobe
                        will search in these paths for environment definitions

Target arguments:

  Command arguments related to target specification and queries

  -T TARGET, --target TARGET
                        Target tuple. Microprobe follows a GCC-like target
                        definition scheme, where a target is defined by a
                        tuple as following:
                        
                          <arch-name>-<uarch-name>-<env-name>
                        
                        where:
                        
                          <arch-name>: is the name of the architecture
                          <uarch-name>: is the name of the microarchitecture
                          <env-name>: is the name of the environment
                        
                        One can use --list-* options to get the list of
                        definitions available in the default search paths or
                        the paths specified by the different --*-paths options
  --list-architectures  Generate a list of architectures available in the
                        defined search paths and exit
  --list-microarchitectures
                        Generate a list of microarchitectures available in the
                        defined search paths and exit
  --list-environments   Generate a list of environments available in the
                        defined search paths and exit

Debug arguments:

  Command arguments related to debugging facilities

  --traceback           show a traceback and starts a python debugger (pdb)
                        when an error occurs. 'pdb' is an interactive python
                        shell that facilitates the debugging of errors
  --profile PROFILE_OUTPUT
                        dump profiling information into given file (see
                        'pstats' module)

SEQ arguments:

  Command arguments related to Sequence generation

  -D SEQ_OUTPUT_DIR, --seq-output-dir SEQ_OUTPUT_DIR
                        Output directory name
  -is INSTRUCTION_SLOTS, --instruction-slots INSTRUCTION_SLOTS
                        Number of instructions slots in the sequence. E.g.
                        '-is 4' will generate sequences of length 4.
  -ig INSTRUCTION_GROUPS [INSTRUCTION_GROUPS ...], --instruction-groups INSTRUCTION_GROUPS [INSTRUCTION_GROUPS ...]
                        Comma separated list of instruction candidates per
                        group. E.g. -ins ins1,ins2 ins3,ins4. Defines two
                        groups of instruction candidates: group 1: ins1,ins2
                        and group 2: ins3,ins4.
  -im INSTRUCTION_MAP [INSTRUCTION_MAP ...], --instruction-map INSTRUCTION_MAP [INSTRUCTION_MAP ...]
                        Comma separated list specifying groups instruction
                        candidate groups to be used on each instruction slot.
                        The list length should match the number of instruction
                        slots defined. A -1 value means all groups can be used
                        for that slot. E.g. -im 1 2,3 will generate sequences
                        containing in slot 1 instructions from group 1, and in
                        slot 2 instructions from groups 2 and 3.
  -bs BASE_SEQ, --base-seq BASE_SEQ
                        Comma separated list specifying the base instruction
                        sequence
  -gM GROUP_MAX [GROUP_MAX ...], --group-max GROUP_MAX [GROUP_MAX ...]
                        Comma separated list specifying the maximum number of
                        instructions of each group to be used. E.g. -gM 1,3
                        will generate sequences containing at most 1
                        instruction of group 1 and 3 instructions of group 2.
                        A -1 value means no maximum. The list length should
                        match the number of instruction groups defined.
  -gm GROUP_MIN [GROUP_MIN ...], --group-min GROUP_MIN [GROUP_MIN ...]
                        Comma separated list specifying the minimum number of
                        instructions of each group to be used. E.g. -gm 1,3
                        will generate sequences containing at least 1
                        instruction of group 1 and 3 instructions of group 2.
                        A -1 value means no minimum. The list length should
                        match the number of instruction groups defined.
  -B BENCHMARK_SIZE, --benchmark-size BENCHMARK_SIZE
                        Size in instructions of the microbenchmark. If more
                        instruction are needed, nested loops are automatically
                        generated
  -dd DEPENDENCY_DISTANCE, --dependency-distance DEPENDENCY_DISTANCE
                        Average dependency distance between instructions. A
                        value below 1 means not dependency between
                        instructions. A value of 1 means a chain of dependent
                        instructions.
  -fs, --force-switch   Force data switching in all instructions, fail if not
                        supported.
  -e, --endless         Some backends allow the control to wrap the sequence
                        generated in an endless loop. Depending on the target
                        specified, this flag will force to generate sequences
                        in an endless loop (some targets might ignore it)
  -R, --reset           Reset the register contents on each loop iteration
  -p, --parallel        Generate benchmarks in parallel
  -bn BATCH_NUMBER, --batch-number BATCH_NUMBER
                        Batch number to generate. Check --num-batches option
                        for more details
  -nb NUM_BATCHES, --num-batches NUM_BATCHES
                        Number of batches. The number of microbenchmark to
                        generate is divided by this number, and the number the
                        batch number specified using -bn option is generated.
                        This is useful to split the generation of many test
                        cases in various batches.
  -s, --skip            Skip benchmarks already generated
  -combinations, --combinations
                        Only generate combinations of the given instructions
  -sn, --shortnames     Use short output names
  -CC, --compress       Compress output files
  -N, --count           Only count the number of sequence to generate. Do not
                        generate anything

Environment variables:

  MICROPROBETEMPLATES    Default path for microprobe templates
  MICROPROBEDEBUG        If set, enable debug
  MICROPROBEDEBUGPASSES  If set, enable debug during passes
  MICROPROBEASMHEXFMT    Assembly hexadecimal format. Options:
                         'all' -> All immediates in hex format
                         'address' -> Address immediates in hex format (default)
                         'none' -> All immediate in integer format

Example use case

This use case uses the power_v300-power9-ppc64_linux_gcc target for illustrative purposes. The same can be done on other targets.

Let’s assume that you have analyzed the ISA of the target and selected some instruction candidates for your exploration, for instance:

Fixed point instructions: ADDI_V0, ORI_V0, AND_V0
Vector instructions: XVMULDP_V0, XVDIVDP_V0, XVMADDADP_V0
Load instructions: LD_V0, LDX_V0, LBZ_V0
Branch instructions: B_V0, BC_V0

Then, you have to decide the instruction sequence length. In this case we pick 6 because we know that the target machine can commit up to 6 instructions per cycle. You can increase that number further, but the number of combinations will explode.

To generate all the unique combinations of length 6 of the instructions above, issue the following command:

> mp_seq -T power_v300-power9-ppc64_linux_gcc -p -s -D <output_dir> -is 6 -ig ADDI_V0,ORI_V0,AND_V0 XVMULDP_V0,XVDIVDP_V0,XVMADDADP_V0 LD_V0,LDX_V0,LBZ_V0 B_V0,BC_V0

In the command above, we used the -ig parameter to specify 4 instruction groups (4 groups of comma-separated instruction names). Without any other restrictions, the number of combinations to generate is 1771561 (any of the 11 candidate instructions in each of the 6 slots, i.e. 11^6).
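
The count above can be reproduced with a few lines of Python. This is a minimal sketch and not part of mp_seq: it assumes that, without restrictions, every slot can hold any candidate instruction, and the variable names are purely illustrative.

```python
# Candidate instruction groups copied from the -ig argument of the example.
groups = [
    ["ADDI_V0", "ORI_V0", "AND_V0"],               # group 1: fixed point
    ["XVMULDP_V0", "XVDIVDP_V0", "XVMADDADP_V0"],  # group 2: vector
    ["LD_V0", "LDX_V0", "LBZ_V0"],                 # group 3: load
    ["B_V0", "BC_V0"],                             # group 4: branch
]
slots = 6  # value passed to -is

# With no restrictions, each slot can hold any of the candidates.
candidates = sum(len(group) for group in groups)
unconstrained = candidates ** slots
print(candidates, unconstrained)
```

The exponential growth in the number of slots is the reason why increasing the sequence length quickly becomes impractical.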

It is better to constrain the design space and discard the sequences that we know will not be useful for our study. First, let’s set a maximum number of instructions of each group allowed in the sequence. Let’s say that we want a maximum of 3 fixed point instructions, 2 vector instructions, 2 load instructions and 1 branch instruction per sequence. We can specify that using the following command:

> mp_seq -T power_v300-power9-ppc64_linux_gcc -p -s -D <output_dir> -is 6 -ig ADDI_V0,ORI_V0,AND_V0 XVMULDP_V0,XVDIVDP_V0,XVMADDADP_V0 LD_V0,LDX_V0,LBZ_V0 B_V0,BC_V0 -gM 3 2 2 1

That will reduce the number of combinations to 532170. We used the -gM parameter to specify a list of maximum instructions per group (3 2 2 1).

Similarly, we can further constrain the design space by specifying that we need at least one fixed point, one vector, one load and one branch instruction per sequence. We can specify that using the following command:

> mp_seq -T power_v300-power9-ppc64_linux_gcc -p -s -D <output_dir> -is 6 -ig ADDI_V0,ORI_V0,AND_V0 XVMULDP_V0,XVDIVDP_V0,XVMADDADP_V0 LD_V0,LDX_V0,LBZ_V0 B_V0,BC_V0 -gM 3 2 2 1 -gm 1 1 1 1

This reduces the number of combinations to 320760. We used the -gm parameter to specify a list of minimum instructions per group (1 1 1 1).
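
The two constrained counts (532170 with -gM only, 320760 with both -gM and -gm) can be cross-checked independently of the tool. The sketch below is an assumption about how to model the counting problem, not mp_seq's own implementation: for every admissible split of the slots among the groups, it multiplies the number of slot arrangements (a multinomial coefficient) by the number of concrete instruction choices. The function name `count_sequences` is hypothetical.

```python
from itertools import product
from math import factorial


def count_sequences(group_sizes, slots, group_max, group_min):
    """Count ordered sequences of `slots` instructions within per-group bounds."""
    total = 0
    ranges = [range(lo, hi + 1) for lo, hi in zip(group_min, group_max)]
    for counts in product(*ranges):
        if sum(counts) != slots:
            continue
        # Multinomial coefficient: ways to distribute the groups over the slots.
        arrangements = factorial(slots)
        for count in counts:
            arrangements //= factorial(count)
        # Ways to pick a concrete instruction for every slot.
        choices = 1
        for size, count in zip(group_sizes, counts):
            choices *= size ** count
        total += arrangements * choices
    return total


group_sizes = [3, 3, 3, 2]  # fixed point, vector, load, branch
only_max = count_sequences(group_sizes, 6, [3, 2, 2, 1], [0, 0, 0, 0])
max_and_min = count_sequences(group_sizes, 6, [3, 2, 2, 1], [1, 1, 1, 1])
print(only_max, max_and_min)
```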

Finally, we can further control the number of combinations to generate. Let’s say that we want the branches to be placed at the end of the sequence (position 6) and we want the vector instructions to be placed only at positions 1 or 4 of the sequence. We can specify that using the following command:

> mp_seq -T power_v300-power9-ppc64_linux_gcc -p -s -D <output_dir> -is 6 -ig ADDI_V0,ORI_V0,AND_V0 XVMULDP_V0,XVDIVDP_V0,XVMADDADP_V0 LD_V0,LDX_V0,LBZ_V0 B_V0,BC_V0 -gM 3 2 2 1 -gm 1 1 1 1 -im 1,2,3 1,3 1,3 1,2,3 1,3 4

We used the -im parameter to specify the list of instruction groups allowed in each sequence slot (1,2,3 1,3 1,3 1,2,3 1,3 4). Notice that group 1 (fixed point instructions) is allowed at positions 1 to 5, group 2 (vector instructions) is allowed only at positions 1 and 4, group 3 (load instructions) is allowed at positions 1 to 5, and group 4 (branch instructions) is allowed only at position 6. This instruction map reduces the number of sequences to 12636.
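
The effect of the instruction map can also be checked by hand. The following sketch is an illustrative model and not the tool's code: it enumerates which group occupies each slot (using the 1-based group numbers from the -im argument), discards assignments that violate the -gM and -gm bounds, and weights each remaining assignment by the number of instructions in the chosen groups. The function name `count_mapped_sequences` is hypothetical.

```python
from itertools import product


def count_mapped_sequences(group_sizes, slot_groups, group_max, group_min):
    """Count sequences when each slot only accepts some groups (1-based ids)."""
    total = 0
    for assignment in product(*slot_groups):
        counts = [assignment.count(g) for g in range(1, len(group_sizes) + 1)]
        bounds = zip(counts, group_min, group_max)
        if any(count < lo or count > hi for count, lo, hi in bounds):
            continue
        # Every slot can hold any instruction of the group assigned to it.
        ways = 1
        for group in assignment:
            ways *= group_sizes[group - 1]
        total += ways
    return total


# Groups allowed per slot, as in: -im 1,2,3 1,3 1,3 1,2,3 1,3 4
slot_groups = [(1, 2, 3), (1, 3), (1, 3), (1, 2, 3), (1, 3), (4,)]
mapped = count_mapped_sequences(
    [3, 3, 3, 2], slot_groups, [3, 2, 2, 1], [1, 1, 1, 1]
)
print(mapped)
```

Enumerating groups per slot rather than individual instructions keeps the search small (at most a few hundred assignments here), which makes it cheap to try alternative maps before running the generator.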

This example showed how to constrain the number of combinations to fit your needs. Notice that we used the -p flag to generate the microbenchmarks in parallel and the -s flag to skip benchmarks that were already generated.

Note

One can use the -N flag to check the sequence definition and the number of sequences that are going to be generated before starting the generation process.
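
When exploring several candidate definitions, it can be convenient to assemble the counting command programmatically. The helper below is a hypothetical sketch: it only builds the argument list from flags documented above (-T, -D, -is, -ig, -N) and prints it; the output directory name `out` is a placeholder, and the format of what mp_seq prints for -N is not assumed here.

```python
import shlex


def build_count_command(target, output_dir, slots, groups):
    """Assemble an mp_seq command line that only counts sequences (-N)."""
    argv = ["mp_seq", "-T", target, "-D", output_dir, "-is", str(slots), "-ig"]
    argv += [",".join(group) for group in groups]
    argv.append("-N")  # count only, do not generate anything
    return argv


argv = build_count_command(
    "power_v300-power9-ppc64_linux_gcc",
    "out",  # placeholder output directory
    6,
    [["ADDI_V0", "ORI_V0", "AND_V0"], ["B_V0", "BC_V0"]],
)
print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run(argv)` on a system where mp_seq is installed and available on the PATH.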