Examples on POWER
In the definitions/power/examples directory of the Microprobe distribution (available if you installed the microprobe_target_power package), you will find several examples showing how to use Microprobe for the POWER architecture. Although we have split the examples by architecture, the concepts introduced in these examples are common to all architectures.
We recommend that users go through the code of these examples to understand the specific details of how to use the framework.
isa_power_v206_info.py
The first example is isa_power_v206_info.py. It shows how to search for architecture definitions (e.g. the ISA properties), how to import a definition, and how to dump it.
Executing the following command:
> ./isa_power_v206_info.py
generates the following output, which shows all the details of the POWER v2.06 architecture (only the first and last 20 lines are shown for brevity):
--------------------------------------------------------------------------------
ISA Name: power_v206
ISA Description: power_v206
--------------------------------------------------------------------------------
Register Types:
GPR: General Register (bit size: 64)
VSCR: Vector Status and Control Register (bit size: 32)
FPR: Floating-Point Register (bit size: 64)
SPR: Special Purpose Register (64 bits) (bit size: 64)
VR: Vector Register (bit size: 128)
MSR: Machine State Register (bit size: 64)
SPR32: Special Purpose Register (32 bits) (bit size: 32)
VSR: Vector Scalar Register (bit size: 128)
FPSCR: Floating-Point Status and Control Register (bit size: 32)
CR: Condition Register (bit size: 4)
--------------------------------------------------------------------------------
Architected registers:
AESR : AESR Register (Type: SPR)
AMOR : AMOR Register (Type: SPR)
AMR : Authority Mask Register (Type: SPR)
...
access_storage : False (Boolean indicating if the instruction has storage operands )
access_storage_with_update : False (Boolean indicating if the instruction accesses to storage and updates the source register with the generated address)
algebraic : False (Boolean indicating if operation uses algebraic rules to keep values )
branch : False (Boolean indicating if the instruction is a branch )
branch_conditional : False (Boolean indicating if the instruction is a branch conditional )
branch_relative : False (Boolean indicating if the instruction is a relative branch )
category : VSX (String indicating if the instruction the instruction category )
decimal : False (Boolean indication if the instruction requires inputs in decimal format )
disable_asm : False (Boolean indicating if ASM generation is disabled for the instruction. If so, binary codification is used. )
hypervisor : False (Boolean indicating if the instruction need hypervisor mode )
privileged : False (Boolean indicating if the instruction is privileged )
privileged_optional : False (Boolean indicating the instrucion is priviledged or not depending on the input values )
switching : None (Input values required to maximize the computational switching )
syscall : False (Boolean indicating if the instruction is a syscall or return from one )
trap : False (Boolean indicating if the instruction is a trap )
Instructions defined: 938
Variants defined: 964
--------------------------------------------------------------------------------
This is the code that was executed:
 1  #!/usr/bin/env python
 2  # Copyright 2011-2021 IBM Corporation
 3  #
 4  # Licensed under the Apache License, Version 2.0 (the "License");
 5  # you may not use this file except in compliance with the License.
 6  # You may obtain a copy of the License at
 7  #
 8  # http://www.apache.org/licenses/LICENSE-2.0
 9  #
10  # Unless required by applicable law or agreed to in writing, software
11  # distributed under the License is distributed on an "AS IS" BASIS,
12  # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13  # See the License for the specific language governing permissions and
14  # limitations under the License.
15  """
16  isa_power_v206_info.py
17
18  Example module to show how to access to isa definitions.
19  """
20
21  # Futures
22  from __future__ import absolute_import, print_function
23
24  # Built-in modules
25  import os
26
27  # Own modules
28  from microprobe.target.isa import find_isa_definitions, import_isa_definition
29
30  __author__ = "Ramon Bertran"
31  __copyright__ = "Copyright 2011-2021 IBM Corporation"
32  __credits__ = []
33  __license__ = "IBM (c) 2011-2021 All rights reserved"
34  __version__ = "0.5"
35  __maintainer__ = "Ramon Bertran"
36  __email__ = "rbertra@us.ibm.com"
37  __status__ = "Development"  # "Prototype", "Development", or "Production"
38
39  # Constants
40  ISANAME = "power_v206"
41
42  # Functions
43
44  # Classes
45
46  # Main
47
48  # Search and import definition
49  ISADEF = import_isa_definition(
50      os.path.dirname([
51          isa for isa in find_isa_definitions() if isa.name == ISANAME
52      ][0].filename))
53
54  # Print definition
55  print((ISADEF.full_report()))
56  exit(0)
In this simple code, the find_isa_definitions and import_isa_definition functions are first imported from the microprobe.target.isa module (line 28). The first function is then used to look for architecture definitions; the list it returns is filtered, and only the definition named power_v206 is imported using the second function, import_isa_definition (lines 49-52). Finally, the full report of the ISADEF object is printed to standard output (line 55).
In this case the full report is printed, but the user can query any information about the imported ISA by using the microprobe.target.isa.ISA API.
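The search-and-select step of the example indexes the filtered list with [0], which raises a bare IndexError when no definition matches. The following is a minimal sketch of a more explicit lookup. It does not import Microprobe: the Definition record is a hypothetical stand-in for the objects returned by find_isa_definitions() (only the name and filename attributes used in the example are modelled), and the paths are made up for illustration.

```python
import os
from collections import namedtuple

# Hypothetical stand-in for the entries returned by find_isa_definitions();
# only the two attributes used in the example script are modelled.
Definition = namedtuple("Definition", ["name", "filename"])


def definition_dir(definitions, name):
    """Return the directory of the definition called `name`.

    Raises LookupError with the list of known names when nothing matches,
    instead of the IndexError produced by `[...][0]`.
    """
    match = next((d for d in definitions if d.name == name), None)
    if match is None:
        known = ", ".join(sorted(d.name for d in definitions))
        raise LookupError(f"ISA '{name}' not found. Known ISAs: {known}")
    return os.path.dirname(match.filename)


# Illustrative data only; real paths depend on the installation.
DEFINITIONS = [
    Definition("power_v206", "/defs/power_v206/isa.yaml"),
    Definition("other_isa", "/defs/other_isa/isa.yaml"),
]
```

With the real framework, the returned directory would be the value passed to import_isa_definition, as in lines 49-52 of the listing.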
power_v206_power7_ppc64_linux_gcc_profile.py
The aim of this example is to show how code generation works in Microprobe. In particular, this example shows how to generate, for each instruction of the ISA, an endless loop containing that instruction. The size of the loop and the dependency distance between the instructions of the loop can be specified as parameters. Using Microprobe you can generate thousands of microbenchmarks in a few minutes. Let's start with the command line interface. Executing:
> ./power_v206_power7_ppc64_linux_gcc_profile.py --help
will generate the following output:
power_v206_power7_ppc64_linux_gcc_profile.py: INFO: Processing input arguments...
usage: power_v206_power7_ppc64_linux_gcc_profile.py [-h]
[-P SEARCH_PATH [SEARCH_PATH ...]]
[-V] [-v] [-d]
[-i INSTRUCTION_NAME [INSTRUCTION_NAME ...]]
[--output_prefix PREFIX]
[-O PATH] [-p NUM_JOBS]
[-S BENCHMARK_SIZE]
[-D DEPENDECY_DISTANCE]
ISA power v206 profile example
optional arguments:
-h, --help show this help message and exit
-P SEARCH_PATH [SEARCH_PATH ...], --default_paths SEARCH_PATH [SEARCH_PATH ...]
Default search paths for microprobe target definitions
-V, --version Show Microprobe version and exit
-v, --verbosity Verbosity level (Values: [0,1,2,3,4]). Each time this
argument is specified the verbosity level is
increased. By default, no logging messages are shown.
These are the four levels available:
-v (1): critical messages
-v -v (2): critical and error messages
-v -v -v (3): critical, error and warning messages
-v -v -v -v (4): critical, error, warning and info messages
Specifying more than four verbosity flags, will
default to the maximum of four. If you need extra
information, enable the debug mode (--debug or -d
flags).
-d, --debug Enable debug mode in Microprobe framework. Lots of
output messages will be generated
-i INSTRUCTION_NAME [INSTRUCTION_NAME ...], --instruction INSTRUCTION_NAME [INSTRUCTION_NAME ...]
Instruction names to generate. Default: All
instructions
--output_prefix PREFIX
Output prefix of the generated files. Default:
POWER_V206_PROFILE
-O PATH, --output_path PATH
Output path. Default: current path
-p NUM_JOBS, --parallel NUM_JOBS
Number of parallel jobs. Default: number of CPUs
available (16). Valid values: 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16
-S BENCHMARK_SIZE, --size BENCHMARK_SIZE
Benchmark size (number of instructions in the endless
loop). Default: 64 instructions
-D DEPENDECY_DISTANCE, --dependency_distance DEPENDECY_DISTANCE
Average dependency distance between the instructions.
Default: 1000 (no dependencies)
Environment variables:
MICROPROBETEMPLATES Default path for microprobe templates
MICROPROBEDEBUG If set, enable debug
MICROPROBEDEBUGPASSES If set, enable debug during passes
MICROPROBEASMHEXFMT Assembly hexadecimal format. Options:
'all' -> All immediates in hex format
'address' -> Address immediates in hex format (default)
'none' -> All immediate in integer format
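The verbosity option in the help text above is a repeated flag whose count selects the logging threshold, capped at four. The following sketch shows one way such a flag could be handled with the standard argparse and logging modules. It is an illustration of the behavior described in the help text, not the code used by microprobe.utils.cmdline; the function names are hypothetical.

```python
import argparse
import logging

# Mapping taken from the help text: 1 critical, 2 error, 3 warning, 4 info.
# Zero flags means that no logging messages are shown (represented as None).
LEVELS = {
    0: None,
    1: logging.CRITICAL,
    2: logging.ERROR,
    3: logging.WARNING,
    4: logging.INFO,
}


def parse_verbosity(argv):
    """Count how many times -v/--verbosity appears in `argv`."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbosity", action="count", default=0)
    return parser.parse_args(argv).verbosity


def verbosity_to_level(count):
    """Translate a flag count to a logging level, capping the count at four."""
    return LEVELS[min(count, 4)]
```

For instance, verbosity_to_level(parse_verbosity(["-v", "-v"])) selects logging.ERROR, so critical and error messages would be shown.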
Let's look at the code to see how this command line tool is implemented. This is the complete code of the script:
#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
power_v206_power7_ppc64_linux_gcc_profile.py

Example module to show how to generate a benchmark for each instruction
of the ISA
"""

# Futures
from __future__ import absolute_import

# Built-in modules
import multiprocessing as mp
import os
import sys
import traceback

# Own modules
import microprobe.code.ins
import microprobe.passes.address
import microprobe.passes.branch
import microprobe.passes.decimal
import microprobe.passes.float
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.memory
import microprobe.passes.register
import microprobe.passes.structure
import microprobe.utils.cmdline
from microprobe import MICROPROBE_RC
from microprobe.exceptions import MicroprobeException
from microprobe.target import import_definition
from microprobe.utils.cmdline import (
    existing_dir,
    int_type,
    print_error,
    print_info,
    print_warning,
)
from microprobe.utils.logger import get_logger
from microprobe.utils.misc import RND, RNDINT

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Constants
LOG = get_logger(__name__)  # Get the generic logging interface


# Functions
def main_setup():
    """
    Set up the command line interface (CLI) with the arguments required by
    this command line tool.
    """

    args = sys.argv[1:]

    # Create the CLI interface object
    cmdline = microprobe.utils.cmdline.CLI(
        "ISA power v206 profile example",
        config_options=False,
        target_options=False,
        debug_options=False,
    )

    # Add the different parameters for this particular tool
    cmdline.add_option(
        "instruction",
        "i",
        None,
        "Instruction names to generate. Default: All instructions",
        required=False,
        nargs="+",
        metavar="INSTRUCTION_NAME",
    )

    cmdline.add_option(
        "output_prefix",
        None,
        "POWER_V206_PROFILE",
        "Output prefix of the generated files. Default: POWER_V206_PROFILE",
        opt_type=str,
        required=False,
        metavar="PREFIX",
    )

    cmdline.add_option(
        "output_path",
        "O",
        "./",
        "Output path. Default: current path",
        opt_type=existing_dir,
        metavar="PATH",
    )

    cmdline.add_option(
        "parallel",
        "p",
        MICROPROBE_RC["cpus"],
        "Number of parallel jobs. Default: number of CPUs available (%s)"
        % mp.cpu_count(),
        opt_type=int,
        choices=list(range(1, MICROPROBE_RC["cpus"] + 1)),
        metavar="NUM_JOBS",
    )

    cmdline.add_option(
        "size",
        "S",
        64,
        "Benchmark size (number of instructions in the endless loop). "
        "Default: 64 instructions",
        opt_type=int_type(1, 2**20),
        metavar="BENCHMARK_SIZE",
    )

    cmdline.add_option(
        "dependency_distance",
        "D",
        1000,
        "Average dependency distance between the instructions. "
        "Default: 1000 (no dependencies)",
        opt_type=int_type(1, 1000),
        metavar="DEPENDECY_DISTANCE",
    )

    # Start the main
    print_info("Processing input arguments...")
    cmdline.main(args, _main)


def _main(arguments):
    """
    Main program. Called after the arguments from the CLI interface have
    been processed.
    """

    print_info("Arguments processed!")

    print_info(
        "Importing target definition " "'power_v206-power7-ppc64_linux_gcc'..."
    )
    target = import_definition("power_v206-power7-ppc64_linux_gcc")

    # Get the arguments
    instructions = arguments.get("instruction", None)
    prefix = arguments["output_prefix"]
    output_path = arguments["output_path"]
    parallel_jobs = arguments["parallel"]
    size = arguments["size"]
    distance = arguments["dependency_distance"]

    # Process the arguments
    if instructions is not None:

        # If the user has provided some instructions, make sure they
        # exist and then call the generation function

        instructions = _validate_instructions(instructions, target)

        if len(instructions) == 0:
            print_error("No valid instructions defined.")
            exit(-1)

        # Set more verbose level
        # set_log_level(10)
        #
        list(
            map(
                _generate_benchmark,
                [
                    (instruction, prefix, output_path, target, size, distance)
                    for instruction in instructions
                ],
            )
        )

    else:

        # If the user has not provided any instruction, go for all of them
        # and then call the generation function

        instructions = _generate_instructions(target, output_path, prefix)

        # Since several benchmarks will be generated, reduce verbose level
        # and call the generation function in parallel

        # set_log_level(30)

        if parallel_jobs > 1:
            pool = mp.Pool(processes=parallel_jobs)
            pool.map(
                _generate_benchmark,
                [
                    (instruction, prefix, output_path, target, size, distance)
                    for instruction in instructions
                ],
                1,
            )
        else:
            list(
                map(
                    _generate_benchmark,
                    [
                        (
                            instruction,
                            prefix,
                            output_path,
                            target,
                            size,
                            distance,
                        )
                        for instruction in instructions
                    ],
                )
            )


def _validate_instructions(instructions, target):
    """
    Validate the provided instructions for a given target
    """

    nins = []
    for instruction in instructions:

        if instruction not in list(target.isa.instructions.keys()):
            print_warning(
                "'%s' not defined in the ISA. Skipping..." % instruction
            )
            continue
        nins.append(instruction)
    return nins


def _generate_instructions(target, path, prefix):
    """
    Generate the list of instructions to be generated for a given target
    """

    instructions = []
    for name, instr in target.instructions.items():

        if instr.privileged or instr.hypervisor:
            # Skip priv/hyper instructions
            continue

        if instr.branch and not instr.branch_relative:
            # Skip branch absolute due to relocation problems
            continue

        if instr.category in ["LMA", "LMV", "DS", "EC"]:
            # Skip some instruction categories
            continue

        if name in [
            "LSWI_V0",
            "LSWX_V0",
            "LMW_V0",
            "STSWX_V0",
            "LD_V1",
            "LWZ_V1",
            "STW_V1",
        ]:
            # Some instructions are not completely supported yet
            # String-related instructions and load multiple

            continue

        # Skip if the file already exists

        fname = f"{path}/{prefix}_{name}.c"
        ffname = f"{path}/{prefix}_{name}.c.fail"

        if os.path.isfile(fname):
            print_warning(f"Skip {name}. '{fname}' already generated")
            continue

        if os.path.isfile(ffname):
            print_warning(
                "Skip %s. '%s' already generated (failed)" % (name, ffname)
            )
            continue

        instructions.append(name)

    return instructions


def _generate_benchmark(args):
    """
    Actual benchmark generation policy. This is the function that defines
    how the microbenchmarks are going to be generated
    """

    instr_name, prefix, output_path, target, size, distance = args

    try:

        # Name of the output file
        fname = f"{output_path}/{prefix}_{instr_name}"

        # Name of the fail output file (generated in case of exception)
        ffname = f"{fname}.c.fail"

        print_info(f"Generating {fname} ...")

        instruction = microprobe.code.ins.Instruction()
        instruction.set_arch_type(target.instructions[instr_name])
        sequence = [target.instructions[instr_name]]

        # Get the wrapper object. The wrapper object is in charge of
        # translating the internal representation of the microbenchmark
        # to the final output format.
        #
        # In this case, we obtain the 'CInfGen' wrapper, which embeds
        # the generated code within an infinite loop using C plus
        # in-line assembly statements.
        cwrapper = microprobe.code.get_wrapper("CInfGen")

        # Create the synthesizer object, which is in charge of driving the
        # generation of the microbenchmark, given a set of passes
        # (a.k.a. transformations) to apply to an empty internal
        # representation of the microbenchmark
        synth = microprobe.code.Synthesizer(target, cwrapper(), value=RNDINT)

        rand = RND

        # Add the transformation passes

        #######################################################################
        # Pass 1: Init integer registers to a given value #
        #######################################################################
        synth.add_pass(
            microprobe.passes.initialization.InitializeRegistersPass(
                value=_init_value()
            )
        )
        floating = False
        vector = False

        for operand in instruction.operands():
            if operand.type.immediate:
                continue

            if operand.type.float:
                floating = True

            if operand.type.vector:
                vector = True

        if vector and floating:
            ###################################################################
            # Pass 1.A: if instruction uses vector floats, init vector #
            # registers to float values #
            ###################################################################
            synth.add_pass(
                microprobe.passes.initialization.InitializeRegistersPass(
                    v_value=(1.000000000000001, 64)
                )
            )
        elif vector:
            ###################################################################
            # Pass 1.B: if instruction uses vector but not floats, init #
            # vector registers to integer value #
            ###################################################################
            synth.add_pass(
                microprobe.passes.initialization.InitializeRegistersPass(
                    v_value=(_init_value(), 64)
                )
            )
        elif floating:
            ###################################################################
            # Pass 1.C: if instruction uses floats, init float #
            # registers to float values #
            ###################################################################
            synth.add_pass(
                microprobe.passes.initialization.InitializeRegistersPass(
                    fp_value=1.000000000000001
                )
            )

        #######################################################################
        # Pass 2: Add a building block of size 'size' #
        #######################################################################
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(size)
        )

        #######################################################################
        # Pass 3: Fill the building block with the instruction sequence #
        #######################################################################
        synth.add_pass(
            microprobe.passes.instruction.SetInstructionTypeBySequencePass(
                sequence
            )
        )

        #######################################################################
        # Pass 4: Compute addresses of instructions (this pass is needed to #
        # update the internal representation information so that in #
        # case addresses are required, they are up to date). #
        #######################################################################
        synth.add_pass(
            microprobe.passes.address.UpdateInstructionAddressesPass()
        )

        #######################################################################
        # Pass 5: Set target of branches to be the next instruction in the #
        # instruction stream #
        #######################################################################
        synth.add_pass(microprobe.passes.branch.BranchNextPass())

        #######################################################################
        # Pass 6: Set memory-related operands to access 16 storage locations #
        # in a round-robin fashion in stride 256 bytes. #
        # The pattern would be: 0, 256, 512, .... 3840, 0, 256, ... #
        #######################################################################
        synth.add_pass(
            microprobe.passes.memory.SingleMemoryStreamPass(16, 256)
        )

        #######################################################################
        # Pass 7.A: Initialize the storage locations accessed by floating #
        # point instructions to have a valid floating point value #
        #######################################################################
        synth.add_pass(
            microprobe.passes.float.InitializeMemoryFloatPass(
                value=1.000000000000001
            )
        )

        #######################################################################
        # Pass 7.B: Initialize the storage locations accessed by decimal #
        # instructions to have a valid decimal value #
        #######################################################################
        synth.add_pass(
            microprobe.passes.decimal.InitializeMemoryDecimalPass(value=1)
        )

        #######################################################################
        # Pass 8: Set the remaining instructions operands (if not set) #
        # (Required to set remaining immediate operands) #
        #######################################################################
        synth.add_pass(
            microprobe.passes.register.DefaultRegisterAllocationPass(
                rand, dd=distance
            )
        )

        # Synthesize the microbenchmark. The synthesizer applies the set of
        # transformation passes added before and returns an object
        # representing the microbenchmark
        bench = synth.synthesize()

        # Save the microbenchmark to the file 'fname'
        synth.save(fname, bench=bench)

        print_info(f"{fname} generated!")

        # Remove fail file if exists
        if os.path.isfile(ffname):
            os.remove(ffname)

    except MicroprobeException:

        # In case of exception during the generation of the microbenchmark,
        # print the error, write the fail file and exit
        print_error(traceback.format_exc())
        open(ffname, "a").close()
        exit(-1)


def _init_value():
    """Return a init value"""
    return RNDINT


# Main
if __name__ == "__main__":
    # run main if executed from the command line
    # and the main method exists

    if callable(locals().get("main_setup")):
        main_setup()
        exit(0)
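In the script above, _generate_benchmark takes a single tuple and unpacks it, because both map and Pool.map call the function with exactly one item per task. The following is a minimal sketch of that dispatch pattern using only the standard library. The generate function is a hypothetical stand-in that returns a string instead of generating a benchmark, and the instruction names in the demo are placeholders.

```python
import multiprocessing as mp


def generate(task):
    """Stand-in worker: unpack the single tuple argument, as the script does."""
    name, prefix, size = task
    return f"{prefix}_{name}:{size}"


def run_all(names, prefix, size, jobs=1):
    """Run `generate` once per name, serially or in a process pool."""
    # One tuple per task, so the same worker serves both code paths.
    tasks = [(name, prefix, size) for name in names]
    if jobs > 1:
        with mp.Pool(processes=jobs) as pool:
            # chunksize=1 hands out one task at a time, which balances
            # the load when individual tasks take different amounts of time.
            return pool.map(generate, tasks, 1)
    return list(map(generate, tasks))


if __name__ == "__main__":
    # Placeholder names, for illustration only.
    print(run_all(["INSTR_A", "INSTR_B"], "DEMO", 64))
```

The example script applies the same idea with a six-element tuple that also carries the output path, the target, and the dependency distance.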
The code is self-documented, so you can read it to understand the basic concepts of code generation in Microprobe. To help the reader, we summarize and elaborate on the explanations in the code. These are the suggested steps to implement a command line tool that generates microbenchmarks using Microprobe:
1. Define the command line interface and parameters (main_setup() function in the example). This includes:
   - Create a command line interface object
   - Define parameters using the add_option interface
   - Call the actual main with the arguments
2. Define the function to process the input parameters (_main() function in the example). This includes:
   - Import the target definition
   - Get the processed arguments
   - Validate and use the arguments to call the actual microbenchmark generation function
3. Define the function to generate the microbenchmark (_generate_benchmark function in the example). The main elements are the following:
   - Get the wrapper object. The wrapper object defines the general characteristics of the code being generated (i.e. how the internal representation will be translated to the final file being generated). General characteristics are, for instance, code prologs such as #include <header.h> directives, the main function declaration, epilogs, etc. In this case, the wrapper selected is CInfGen. This wrapper generates C code with an infinite loop of instructions. This results in the following code:
       #include <stdio.h>
       #include <string.h>
       // <declaration of variables>
       int main(int argc, char** argv, char** envp) {
           // <initialization_code>
           while(1) {
               // <generated_code>
           } // end while
       }
     The user can subclass or define their own wrappers to fulfill their needs. See microprobe.code.wrapper.Wrapper for more details.
   - Instantiate the synthesizer. The benchmark synthesizer object is in charge of driving the code generation by applying the set of transformation passes defined by the user.
   - Define the transformation passes. The transformation passes fill the <declaration of variables>, <initialization_code> and <generated_code> sections of the previous code block. Depending on the order and the type of passes applied, the generated code will be different. The user has plenty of transformation passes to apply. See microprobe.passes and all its submodules for further details. Also, the user can define their own passes by subclassing the class microprobe.passes.Pass.
   - Finally, once the generation policy is defined, the user only has to synthesize the benchmark and save it to a file.
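The relationship between the wrapper, the synthesizer, and the passes can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not the Microprobe API: MiniSynthesizer, c_infinite_loop, and the three pass factories are hypothetical names, and instructions are represented as plain strings. It only illustrates that passes run in the order they were added and that the wrapper turns the result into the C template shown above.

```python
from itertools import cycle


def c_infinite_loop(sections):
    """Hypothetical wrapper: render the three sections into the C template."""
    lines = ["#include <stdio.h>", "#include <string.h>"]
    lines += sections["declarations"]
    lines.append("int main(int argc, char** argv, char** envp) {")
    lines += ["    " + stmt for stmt in sections["init"]]
    lines.append("    while(1) {")
    for stmt in sections["body"]:
        # Slots that no pass has filled are rendered as a comment.
        lines.append("        " + (stmt or "// <unfilled slot>"))
    lines.append("    } // end while")
    lines.append("}")
    return "\n".join(lines)


class MiniSynthesizer:
    """Hypothetical stand-in for a synthesizer: applies passes in order."""

    def __init__(self, wrapper):
        self._wrapper = wrapper
        self._passes = []

    def add_pass(self, transformation):
        self._passes.append(transformation)

    def synthesize(self):
        sections = {"declarations": [], "init": [], "body": []}
        for transformation in self._passes:
            transformation(sections)
        return self._wrapper(sections)


def init_variable(name, value):
    """Pass: declare a variable and initialize it before the loop."""
    def apply(sections):
        sections["declarations"].append(f"long {name};")
        sections["init"].append(f"{name} = {value};")
    return apply


def building_block(size):
    """Pass: reserve `size` empty instruction slots in the loop body."""
    def apply(sections):
        sections["body"].extend([None] * size)
    return apply


def fill_by_sequence(sequence):
    """Pass: fill the slots that exist so far by repeating `sequence`."""
    def apply(sections):
        source = cycle(sequence)
        sections["body"] = [
            f'__asm__("{next(source)}");' for _ in sections["body"]
        ]
    return apply


def build(passes):
    synth = MiniSynthesizer(c_infinite_loop)
    for transformation in passes:
        synth.add_pass(transformation)
    return synth.synthesize()


in_order = build(
    [init_variable("v0", 1), building_block(4), fill_by_sequence(["nop"])]
)
swapped = build([fill_by_sequence(["nop"]), building_block(4)])
```

Here in_order contains four inline-assembly statements inside the loop, while swapped contains four unfilled slots, because the fill pass ran before any slot existed. This is the sense in which the order of the passes changes the generated code.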
power_v206_power7_ppc64_linux_gcc_fu_stress.py
The following example shows how to generate microbenchmarks that stress a particular functional unit of the architecture. The code is self-explanatory:
1#!/usr/bin/env python
2# Copyright 2011-2021 IBM Corporation
3#
4# Licensed under the Apache License, Version 2.0 (the "License");
5# you may not use this file except in compliance with the License.
6# You may obtain a copy of the License at
7#
8# http://www.apache.org/licenses/LICENSE-2.0
9#
10# Unless required by applicable law or agreed to in writing, software
11# distributed under the License is distributed on an "AS IS" BASIS,
12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13# See the License for the specific language governing permissions and
14# limitations under the License.
15"""
16power_v206_power7_ppc64_linux_gcc_fu_stress.py
17
18Example module to show how to generate a benchmark stressing a particular
19functional unit of the microarchitecture at different rate using the
20average latency of instructions as well as the average dependency distance
21between the instructions
22"""
23
24# Futures
25from __future__ import absolute_import
26
27# Built-in modules
28import os
29import sys
30import traceback
31
32# Own modules
33import microprobe.code.ins
34import microprobe.passes.address
35import microprobe.passes.branch
36import microprobe.passes.decimal
37import microprobe.passes.float
38import microprobe.passes.ilp
39import microprobe.passes.initialization
40import microprobe.passes.instruction
41import microprobe.passes.memory
42import microprobe.passes.register
43import microprobe.passes.structure
44import microprobe.utils.cmdline
45from microprobe.exceptions import (
46 MicroprobeException,
47 MicroprobeTargetDefinitionError,
48)
49from microprobe.target import import_definition
50from microprobe.utils.cmdline import (
51 dict_key,
52 existing_dir,
53 float_type,
54 int_type,
55 print_error,
56 print_info,
57)
58from microprobe.utils.logger import get_logger
59from microprobe.utils.misc import RND, RNDINT
60
61__author__ = "Ramon Bertran"
62__copyright__ = "Copyright 2011-2021 IBM Corporation"
63__credits__ = []
64__license__ = "IBM (c) 2011-2021 All rights reserved"
65__version__ = "0.5"
66__maintainer__ = "Ramon Bertran"
67__email__ = "rbertra@us.ibm.com"
68__status__ = "Development" # "Prototype", "Development", or "Production"
69
70# Constants
71LOG = get_logger(__name__) # Get the generic logging interface
72
73
74# Functions
75def main_setup():
76 """
77 Set up the command line interface (CLI) with the arguments required by
78 this command line tool.
79 """
80
81 args = sys.argv[1:]
82
83 # Get the target definition
84 try:
85 target = import_definition("power_v206-power7-ppc64_linux_gcc")
86 except MicroprobeTargetDefinitionError as exc:
87 print_error("Unable to import target definition")
88 print_error(f"Exception message: {str(exc)}")
89 exit(-1)
90
91 func_units = {}
92 valid_units = [elem.name for elem in target.elements.values()]
93
94 for instr in target.isa.instructions.values():
95 if instr.execution_units == "None":
96 LOG.debug("Execution units for: '%s' not defined", instr.name)
97 continue
98
99 for unit in instr.execution_units:
100 if unit not in valid_units:
101 continue
102
103 if unit not in func_units:
104 func_units[unit] = [
105 elem
106 for elem in target.elements.values()
107 if elem.name == unit
108 ][0]
109
110 # Create the CLI interface object
111 cmdline = microprobe.utils.cmdline.CLI(
112 "ISA power v206 profile example",
113 config_options=False,
114 target_options=False,
115 debug_options=False,
116 )
117
118 # Add the different parameters for this particular tool
119 cmdline.add_option(
120 "functional_unit",
121 "f",
122 [func_units["ALU"]],
123 "Functional units to stress. Default: ALU",
124 required=False,
125 nargs="+",
126 choices=func_units,
127 opt_type=dict_key(func_units),
128 metavar="FUNCTIONAL_UNIT_NAME",
129 )

    cmdline.add_option(
        "output_prefix",
        None,
        "POWER_V206_FU_STRESS",
        "Output prefix of the generated files. Default: POWER_V206_FU_STRESS",
        opt_type=str,
        required=False,
        metavar="PREFIX",
    )

    cmdline.add_option(
        "output_path",
        "O",
        "./",
        "Output path. Default: current path",
        opt_type=existing_dir,
        metavar="PATH",
    )

    cmdline.add_option(
        "size",
        "S",
        64,
        "Benchmark size (number of instructions in the endless loop). "
        "Default: 64 instructions",
        opt_type=int_type(1, 2**20),
        metavar="BENCHMARK_SIZE",
    )

    cmdline.add_option(
        "dependency_distance",
        "D",
        1000,
        "Average dependency distance between the instructions. "
        "Default: 1000 (no dependencies)",
        opt_type=int_type(1, 1000),
        metavar="DEPENDENCY_DISTANCE",
    )

    cmdline.add_option(
        "average_latency",
        "L",
        2,
        "Average latency of the selected instructions. Default: 2 cycles",
        opt_type=float_type(1, 1000),
        metavar="AVERAGE_LATENCY",
    )

    # Start the main
    print_info("Processing input arguments...")
    cmdline.main(args, _main)


def _main(arguments):
    """
    Main program. Called after the arguments from the CLI interface have
    been processed.
    """

    print_info("Arguments processed!")

    print_info(
        "Importing target definition 'power_v206-power7-ppc64_linux_gcc'..."
    )
    target = import_definition("power_v206-power7-ppc64_linux_gcc")

    # Get the arguments
    functional_units = arguments["functional_unit"]
    prefix = arguments["output_prefix"]
    output_path = arguments["output_path"]
    size = arguments["size"]
    latency = arguments["average_latency"]
    distance = arguments["dependency_distance"]

    if functional_units is None:
        functional_units = ["ALL"]

    _generate_benchmark(
        target,
        f"{output_path}/{prefix}_",
        (functional_units, size, latency, distance),
    )


def _generate_benchmark(target, output_prefix, args):
    """
    Actual benchmark generation policy. This is the function that defines
    how the microbenchmarks are going to be generated
    """

    functional_units, size, latency, distance = args

    try:

        # Name of the output file
        func_unit_names = [unit.name for unit in functional_units]
        fname = f"{output_prefix}{'_'.join(func_unit_names)}"
        fname = f"{fname}_LAT_{latency}"
        fname = f"{fname}_DEP_{distance}"

        # Name of the fail output file (generated in case of exception)
        ffname = f"{fname}.c.fail"

        print_info(f"Generating {fname} ...")

        # Get the wrapper object. The wrapper object is in charge of
        # translating the internal representation of the microbenchmark
        # to the final output format.
        #
        # In this case, we obtain the 'CInfGen' wrapper, which embeds
        # the generated code within an infinite loop using C plus
        # in-line assembly statements.
        cwrapper = microprobe.code.get_wrapper("CInfGen")

        # Create the synthesizer object, which is in charge of driving the
        # generation of the microbenchmark, given a set of passes
        # (a.k.a. transformations) to apply to an empty internal
        # representation of the microbenchmark
        synth = microprobe.code.Synthesizer(target, cwrapper(), value=RNDINT)

        rand = RND

        # Add the transformation passes

        #######################################################################
        # Pass 1: Init integer registers to a given value                     #
        #######################################################################
        synth.add_pass(
            microprobe.passes.initialization.InitializeRegistersPass(
                value=_init_value()
            )
        )

        #######################################################################
        # Pass 2: Add a building block of size 'size'                         #
        #######################################################################
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(size)
        )

        #######################################################################
        # Pass 3: Fill the building block with the instruction sequence       #
        #######################################################################
        synth.add_pass(
            microprobe.passes.instruction.SetInstructionTypeByElementPass(
                target, functional_units, {}
            )
        )

        #######################################################################
        # Pass 4: Compute addresses of instructions (this pass is needed to   #
        #         update the internal representation information so that in   #
        #         case addresses are required, they are up to date).          #
        #######################################################################
        synth.add_pass(
            microprobe.passes.address.UpdateInstructionAddressesPass()
        )

        #######################################################################
        # Pass 5: Set target of branches to be the next instruction in the    #
        #         instruction stream                                          #
        #######################################################################
        synth.add_pass(microprobe.passes.branch.BranchNextPass())

        #######################################################################
        # Pass 6: Set memory-related operands to access 16 storage locations  #
        #         in a round-robin fashion in stride 256 bytes.               #
        #         The pattern would be: 0, 256, 512, .... 3840, 0, 256, ...   #
        #######################################################################
        synth.add_pass(
            microprobe.passes.memory.SingleMemoryStreamPass(16, 256)
        )

        #######################################################################
        # Pass 7.A: Initialize the storage locations accessed by floating     #
        #           point instructions to have a valid floating point value   #
        #######################################################################
        synth.add_pass(
            microprobe.passes.float.InitializeMemoryFloatPass(
                value=1.000000000000001
            )
        )

        #######################################################################
        # Pass 7.B: Initialize the storage locations accessed by decimal      #
        #           instructions to have a valid decimal value                #
        #######################################################################
        synth.add_pass(
            microprobe.passes.decimal.InitializeMemoryDecimalPass(value=1)
        )

        #######################################################################
        # Pass 8: Set the remaining instructions operands (if not set)        #
        #         (Required to set remaining immediate operands)              #
        #######################################################################
        synth.add_pass(
            microprobe.passes.register.DefaultRegisterAllocationPass(
                rand, dd=distance
            )
        )

        # Synthesize the microbenchmark. The synthesizer applies the set of
        # transformation passes added before and returns an object
        # representing the microbenchmark
        bench = synth.synthesize()

        # Save the microbenchmark to the file 'fname'
        synth.save(fname, bench=bench)

        print_info(f"{fname} generated!")

        # Remove fail file if exists
        if os.path.isfile(ffname):
            os.remove(ffname)

    except MicroprobeException:

        # In case of exception during the generation of the microbenchmark,
        # print the error, write the fail file and exit
        print_error(traceback.format_exc())
        open(ffname, "a").close()
        exit(-1)


def _init_value():
    """Return an init value"""
    return RNDINT


# Main
if __name__ == "__main__":
    # run main if executed from the command line
    # and the main method exists

    if callable(locals().get("main_setup")):
        main_setup()
        exit(0)
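The comment on Pass 6 in the listing above describes a single memory stream that visits 16 locations spaced 256 bytes apart in round-robin order. The sketch below reproduces only that offset pattern in plain Python. It does not call Microprobe, and the helper name `stream_offsets` is hypothetical, introduced here purely for illustration.

```python
from itertools import islice


def stream_offsets(locations, stride):
    """Yield byte offsets cycling over `locations` slots, `stride` bytes apart."""
    index = 0
    while True:
        yield index * stride
        index = (index + 1) % locations


# First 18 accesses of a 16-location stream with a 256-byte stride
pattern = list(islice(stream_offsets(16, 256), 18))
print(pattern)
```

The sequence wraps back to offset 0 after the sixteenth access, which matches the pattern given in the Pass 6 comment.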
power_v206_power7_ppc64_linux_gcc_memory.py
The following example shows how to create microbenchmarks that generate different levels of activity (stress) at each level of the cache hierarchy. As the example shows, using the built-in command line interface provided by Microprobe is optional.
#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
power_v206_power7_ppc64_linux_gcc_memory.py

Example python script to show how to generate microbenchmarks with particular
levels of activity in the memory hierarchy.
"""

# Futures
from __future__ import absolute_import

# Built-in modules
import multiprocessing as mp
import os
import sys
from typing import List, Tuple

# Own modules
import microprobe.code
import microprobe.passes.address
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.memory
import microprobe.passes.register
import microprobe.passes.structure
from microprobe import MICROPROBE_RC
from microprobe.exceptions import MicroprobeTargetDefinitionError
from microprobe.model.memory import EndlessLoopDataMemoryModel
from microprobe.target import import_definition
from microprobe.target.isa.instruction import InstructionType
from microprobe.target.uarch.cache import SetAssociativeCache
from microprobe.utils.cmdline import print_error, print_info
from microprobe.utils.typeguard_decorator import typeguard_testsuite
from microprobe.utils.misc import RND

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Get the target definition
try:
    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
except MicroprobeTargetDefinitionError as exc:
    print_error("Unable to import target definition")
    print_error("Exception message: %s" % str(exc))
    exit(-1)

assert TARGET.microarchitecture is not None, \
    "Target must have a defined microarchitecture"

BASE_ELEMENT = [
    element for element in TARGET.microarchitecture.elements.values()
    if element.name == 'L1D'
][0]
CACHE_HIERARCHY: List[SetAssociativeCache] = \
    TARGET.microarchitecture.cache_hierarchy.get_data_hierarchy_from_element(
        BASE_ELEMENT)

# Benchmark size
BENCHMARK_SIZE = 8 * 1024

# Fill a list of the models to be generated

MEMORY_MODELS: List[Tuple[str, List[SetAssociativeCache], List[int]]] = []

#
# Due to performance issues (long exec. time) this
# model is disabled
#
# MEMORY_MODELS.append(
#    (
#        "ALL", CACHE_HIERARCHY, [
#            25, 25, 25, 25]))

MEMORY_MODELS.append(("L1", CACHE_HIERARCHY, [100, 0, 0, 0]))
MEMORY_MODELS.append(("L2", CACHE_HIERARCHY, [0, 100, 0, 0]))
MEMORY_MODELS.append(("L3", CACHE_HIERARCHY, [0, 0, 100, 0]))
MEMORY_MODELS.append(("L1L3", CACHE_HIERARCHY, [50, 0, 50, 0]))
MEMORY_MODELS.append(("L1L2", CACHE_HIERARCHY, [50, 50, 0, 0]))
MEMORY_MODELS.append(("L2L3", CACHE_HIERARCHY, [0, 50, 50, 0]))
MEMORY_MODELS.append(("CACHES", CACHE_HIERARCHY, [33, 33, 34, 0]))
MEMORY_MODELS.append(("MEM", CACHE_HIERARCHY, [0, 0, 0, 100]))

# Enable parallel generation
PARALLEL = False

DIRECTORY = None


@typeguard_testsuite
def main():
    """Main function. """
    # call the generate method for each model in the memory model list

    if PARALLEL:
        print_info("Start parallel execution...")
        pool = mp.Pool(processes=MICROPROBE_RC['cpus'])
        pool.map(generate, MEMORY_MODELS, 1)
    else:
        print_info("Start sequential execution...")
        list(map(generate, MEMORY_MODELS))

    exit(0)


@typeguard_testsuite
def generate(model: Tuple[str, List[SetAssociativeCache], List[int]]):
    """Benchmark generation policy function. """

    assert DIRECTORY is not None, "DIRECTORY variable cannot be None"

    print_info(f"Creating memory model '{model[0]}' ...")
    memmodel = EndlessLoopDataMemoryModel(*model)

    modelname = memmodel.name

    print_info(f"Generating Benchmark mem-{modelname} ...")

    # Get the architecture
    garch = TARGET

    # For all the supported instructions, get the memory operations.
    sequence: List[InstructionType] = []
    for instr_name in sorted(garch.isa.instructions.keys()):

        instr = garch.isa.instructions[instr_name]

        if not instr.access_storage:
            continue
        if instr.privileged:  # Skip privileged
            continue
        if instr.hypervisor:  # Skip hypervisor
            continue
        if instr.trap:  # Skip traps
            continue
        if "String" in instr.description:  # Skip unsupported string instr.
            continue
        if "Multiple" in instr.description:  # Skip unsupported mult. ld/sts
            continue
        if instr.category in ['LMA', 'LMV', 'DS', 'EC',
                              'WT']:  # Skip unsupported categories
            continue
        if instr.access_storage_with_update:  # Not supported by mem. model
            continue
        if "Reserve Indexed" in instr.description:  # Skip (illegal instr.)
            continue
        if "Conditional Indexed" in instr.description:  # Skip (illegal instr.)
            continue
        if instr.name in ['LD_V1', 'LWZ_V1', 'STW_V1']:
            continue

        sequence.append(instr)

    # Get the loop wrapper. In this case we take the 'CInfPpc', which
    # generates an infinite loop in C using PowerPC embedded assembly.
    cwrapper = microprobe.code.get_wrapper("CInfPpc")

    # Define function to return random numbers (used afterwards)
    def rnd():
        """Return a random value. """
        return RND.randrange(0, (1 << 64) - 1)

    # Create the benchmark synthesizer
    synth = microprobe.code.Synthesizer(garch, cwrapper())

    rand = RND

    ##########################################################################
    # Add the passes we want to apply to synthesize benchmarks               #
    ##########################################################################

    # --> Init registers to random values
    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=rnd))

    # --> Add a single basic block (four times larger for the 'MEM' model)
    if memmodel.name in ['MEM']:
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(
                BENCHMARK_SIZE * 4))
    else:
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(
                BENCHMARK_SIZE))

    # --> Fill the basic block using the sequence of instructions provided
    synth.add_pass(
        microprobe.passes.instruction.SetInstructionTypeBySequencePass(
            sequence))

    # --> Set the memory operations parameters to fulfill the given model
    synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(memmodel))

    # --> Set the dependency distance and the default allocation. Sets the
    # remaining undefined instruction operands (register allocation,...)
    synth.add_pass(microprobe.passes.register.NoHazardsAllocationPass())
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(rand, dd=0))

    # Generate the benchmark (applies the passes).
    bench = synth.synthesize()

    print_info(f"Benchmark mem-{modelname} saving to disk...")

    # Save the benchmark
    synth.save(f"{DIRECTORY}/mem-{modelname}", bench=bench)

    print_info(f"Benchmark mem-{modelname} generated")
    return True


if __name__ == '__main__':
    # run main if executed from the command line
    # and the main method exists

    if len(sys.argv) != 2:
        print_info("Usage:")
        print_info("%s output_dir" % (sys.argv[0]))
        exit(-1)

    DIRECTORY = sys.argv[1]

    if not os.path.isdir(DIRECTORY):
        print_error(f"Output directory '{DIRECTORY}' does not exist")
        exit(-1)

    main()
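Each entry of `MEMORY_MODELS` in the listing above pairs a model name with one percentage per level of the data hierarchy. The plain-Python sketch below shows one way such a table could be sanity-checked before generation. The helper `check_models` is hypothetical, and the rule that the percentages add up to 100 is an assumption inferred from the values in the listing, not a documented Microprobe requirement.

```python
def check_models(models, levels):
    """Return the names of models whose weight lists look malformed."""
    bad = []
    for name, weights in models:
        wrong_shape = len(weights) != levels
        wrong_total = sum(weights) != 100
        negative = any(weight < 0 for weight in weights)
        if wrong_shape or wrong_total or negative:
            bad.append(name)
    return bad


MODELS = [
    ("L1", [100, 0, 0, 0]),
    ("L1L3", [50, 0, 50, 0]),
    ("CACHES", [33, 33, 34, 0]),
    ("MEM", [0, 0, 0, 100]),
    ("BROKEN", [50, 50, 50, 0]),  # deliberately invalid entry
]

bad_models = check_models(MODELS, levels=4)
print(bad_models)
```

Only the deliberately invalid entry is reported; the weight lists copied from the listing all pass.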
power_v206_power7_ppc64_linux_gcc_random.py
The following example generates random microbenchmarks:
#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
power_v206_power7_ppc64_linux_gcc_random.py

Example python script to show how to generate random microbenchmarks.
"""

# Futures
from __future__ import absolute_import

# Built-in modules
import multiprocessing as mp
import os
import sys
from typing import List

# Own modules
import microprobe.code
import microprobe.passes.address
import microprobe.passes.branch
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.memory
import microprobe.passes.register
import microprobe.passes.structure
from microprobe import MICROPROBE_RC
from microprobe.exceptions import (
    MicroprobeError,
    MicroprobeTargetDefinitionError,
)
from microprobe.model.memory import EndlessLoopDataMemoryModel
from microprobe.target import import_definition
from microprobe.target.isa.instruction import InstructionType
from microprobe.utils.cmdline import print_error, print_info
from microprobe.utils.misc import RND
from microprobe.utils.typeguard_decorator import typeguard_testsuite

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Benchmark size
BENCHMARK_SIZE = 8 * 1024

# Get the target definition
try:
    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
except MicroprobeTargetDefinitionError as exc:
    print_error("Unable to import target definition")
    print_error(f"Exception message: {str(exc)}")
    exit(-1)

assert (
    TARGET.microarchitecture is not None
), "Target must have a defined microarchitecture"
BASE_ELEMENT = [
    element
    for element in TARGET.microarchitecture.elements.values()
    if element.name == "L1D"
][0]
CACHE_HIERARCHY = (
    TARGET.microarchitecture.cache_hierarchy.get_data_hierarchy_from_element(
        BASE_ELEMENT
    )
)

PARALLEL = True

DIRECTORY = None


@typeguard_testsuite
def main():
    """Main program."""
    if PARALLEL:
        pool = mp.Pool(processes=MICROPROBE_RC["cpus"])
        pool.map(generate, list(range(0, 100)), 1)
    else:
        list(map(generate, list(range(0, 100))))


@typeguard_testsuite
def generate(name: int):
    """Benchmark generation policy."""

    assert DIRECTORY is not None, "DIRECTORY variable cannot be None"

    if os.path.isfile(f"{DIRECTORY}/random-{name}.c"):
        print_info(f"Skip {name}")
        return

    print_info(f"Generating {name}...")

    # Use the shared random number generator
    rand = RND

    # Generate a random memory model (used afterwards)
    model: List[int] = []
    total = 100
    for mcomp in CACHE_HIERARCHY[0:-1]:
        weight = rand.randint(0, total)
        model.append(weight)
        print_info("%s: %d%%" % (mcomp, weight))
        total = total - weight

    # Fix remaining
    level = rand.randint(0, len(CACHE_HIERARCHY[0:-1]) - 1)
    model[level] += total

    # Last level always zero
    model.append(0)

    # Sanity check
    psum = 0
    for elem in model:
        psum += elem
    assert psum == 100

    modelobj = EndlessLoopDataMemoryModel(
        f"random-{name}", CACHE_HIERARCHY, model
    )

    # Get the loop wrapper. In this case we take the 'CInfPpc', which
    # generates an infinite loop in C using PowerPC embedded assembly.
    cwrapper = microprobe.code.get_wrapper("CInfPpc")

    # Define function to return random numbers (used afterwards)
    def rnd():
        """Return a random value."""
        return rand.randrange(0, (1 << 64) - 1)

    # Create the benchmark synthesizer
    synth = microprobe.code.Synthesizer(TARGET, cwrapper())

    ##########################################################################
    # Add the passes we want to apply to synthesize benchmarks               #
    ##########################################################################

    # --> Init registers to random values
    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=rnd)
    )

    # --> Add a single basic block of size BENCHMARK_SIZE
    synth.add_pass(
        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE)
    )

    # --> Fill the basic block with instructions picked randomly from the list
    #     provided

    instructions: List[InstructionType] = []
    for instr in TARGET.isa.instructions.values():

        if instr.privileged:  # Skip privileged
            continue
        if instr.hypervisor:  # Skip hypervisor
            continue
        if instr.trap:  # Skip traps
            continue
        if instr.syscall:  # Skip syscall
            continue
        if "String" in instr.description:  # Skip unsupported string instr.
            continue
        if "Multiple" in instr.description:  # Skip unsupported mult. ld/sts
            continue
        if instr.category in [
            "LMA",
            "LMV",
            "DS",
            "EC",
            "WT",
        ]:  # Skip unsupported categories
            continue
        if instr.access_storage_with_update:  # Not supported by mem. model
            continue
        if instr.branch and not instr.branch_relative:  # Skip branches
            continue
        if "Reserve Indexed" in instr.description:  # Skip (illegal instr.)
            continue
        if "Conditional Indexed" in instr.description:  # Skip (illegal instr.)
            continue
        if instr.name in [
            "LD_V1",
            "LWZ_V1",
            "STW_V1",
        ]:
            continue

        instructions.append(instr)

    synth.add_pass(
        microprobe.passes.instruction.SetRandomInstructionTypePass(
            instructions, rand
        )
    )

    # --> Set the memory operations parameters to fulfill the given model
    synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(modelobj))

    # --> Set target of branches to next instruction (first compute addresses)
    synth.add_pass(microprobe.passes.address.UpdateInstructionAddressesPass())
    synth.add_pass(microprobe.passes.branch.BranchNextPass())

    # --> Set the dependency distance and the default allocation. Dependency
    # distance is randomly picked
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(
            rand, dd=rand.randint(1, 20)
        )
    )

    # Generate the benchmark (applies the passes)
    # Since it is a randomly generated code, the generation might fail
    # (e.g. not enough accesses to fulfill the requested memory model, etc.)
    # Because of that, we handle the exception accordingly.
    try:
        print_info(f"Synthesizing {name}...")
        bench = synth.synthesize()
        print_info(f"Synthesized {name}!")
        # Save the benchmark
        synth.save(f"{DIRECTORY}/random-{name}", bench=bench)
    except MicroprobeError:
        print_info(f"Synthesizing error in '{name}'. This is Ok.")

    return True


if __name__ == "__main__":
    # run main if executed from the command line
    # and the main method exists

    if len(sys.argv) != 2:
        print_info("Usage:")
        print_info(f"{sys.argv[0]} output_dir")
        exit(-1)

    DIRECTORY = sys.argv[1]

    if not os.path.isdir(DIRECTORY):
        print_error(f"Output directory '{DIRECTORY}' does not exist")
        exit(-1)

    if callable(locals().get("main")):
        main()
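The random memory model built inside `generate` above can be studied in isolation. The sketch below mirrors that logic with the standard `random` module instead of Microprobe's `RND`: it splits 100% across every level except the last, assigns any remainder to one randomly chosen level, and pins the last level to zero. The function name `random_model` and the fixed seed are assumptions made only so the sketch is repeatable.

```python
import random


def random_model(levels, rng):
    """Split 100% at random across all levels except the last one."""
    weights = []
    remaining = 100
    for _ in range(levels - 1):
        weight = rng.randint(0, remaining)
        weights.append(weight)
        remaining -= weight
    # Give whatever is left to one randomly chosen level
    weights[rng.randint(0, levels - 2)] += remaining
    # The last level always gets zero, as in the listing
    weights.append(0)
    return weights


rng = random.Random(1234)  # fixed seed, chosen arbitrarily for this sketch
model = random_model(4, rng)
print(model, sum(model))
```

Whatever the seed, the weights are non-negative and add up to 100, which is the same invariant the listing checks with its `assert psum == 100`.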
power_v206_power7_ppc64_linux_gcc_custom.py
The following script contains several examples of how to customize the generation of microbenchmarks:
1#!/usr/bin/env python
2# Copyright 2011-2021 IBM Corporation
3#
4# Licensed under the Apache License, Version 2.0 (the "License");
5# you may not use this file except in compliance with the License.
6# You may obtain a copy of the License at
7#
8# http://www.apache.org/licenses/LICENSE-2.0
9#
10# Unless required by applicable law or agreed to in writing, software
11# distributed under the License is distributed on an "AS IS" BASIS,
12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13# See the License for the specific language governing permissions and
14# limitations under the License.
15"""
16power_v206_power7_ppc64_linux_gcc_custom.py
17
18Example python script to show how to generate random microbenchmarks.
19"""
20
21# Futures
22from __future__ import absolute_import
23
24# Built-in modules
25import os
26import sys
27
28# Own modules
29import microprobe.code
30import microprobe.passes.initialization
31import microprobe.passes.instruction
32import microprobe.passes.memory
33import microprobe.passes.register
34import microprobe.passes.structure
35from microprobe.exceptions import MicroprobeTargetDefinitionError
36from microprobe.model.memory import EndlessLoopDataMemoryModel
37from microprobe.target import import_definition
38from microprobe.utils.cmdline import print_error, print_info
39from microprobe.utils.misc import RND, RNDINT
40
41__author__ = "Ramon Bertran"
42__copyright__ = "Copyright 2011-2021 IBM Corporation"
43__credits__ = []
44__license__ = "IBM (c) 2011-2021 All rights reserved"
45__version__ = "0.5"
46__maintainer__ = "Ramon Bertran"
47__email__ = "rbertra@us.ibm.com"
48__status__ = "Development" # "Prototype", "Development", or "Production"
49
50# Benchmark size
51BENCHMARK_SIZE = 8 * 1024
52
53if len(sys.argv) != 2:
54 print_info("Usage:")
55 print_info(f"{sys.argv[0]} output_dir")
56 exit(-1)
57
58DIRECTORY = sys.argv[1]
59
60if not os.path.isdir(DIRECTORY):
61 print_info(f"Output DIRECTORY '{DIRECTORY}' does not exists")
62 exit(-1)
63
64# Get the target definition
65try:
66 TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
67except MicroprobeTargetDefinitionError as exc:
68 print_error("Unable to import target definition")
69 print_error(f"Exception message: {str(exc)}")
70 exit(-1)
71
72
73###############################################################################
74# Example 1: loop with instructions accessing storage , hitting the first #
75# level of cache and with dependency distance of 3 #
76###############################################################################
77def example_1():
78 """Example 1"""
79 name = "L1-LOADS"
80
81 base_element = [
82 element
83 for element in TARGET.elements.values()
84 if element.name == "L1D"
85 ][0]
86 cache_hierarchy = TARGET.cache_hierarchy.get_data_hierarchy_from_element(
87 base_element
88 )
89
90 model = [0] * len(cache_hierarchy)
91 model[0] = 100
92
93 mmodel = EndlessLoopDataMemoryModel("random-%s", cache_hierarchy, model)
94
95 profile = {}
96 for instr_name in sorted(TARGET.instructions.keys()):
97 instr = TARGET.instructions[instr_name]
98 if not instr.access_storage:
99 continue
100 if instr.privileged: # Skip privileged
101 continue
102 if instr.hypervisor: # Skip hypervisor
103 continue
104 if "String" in instr.description: # Skip unsupported string instr.
105 continue
106 if "ultiple" in instr.description: # Skip unsupported mult. ld/sts
107 continue
108 if instr.category in [
109 "DS",
110 "LMA",
111 "LMV",
112 "EC",
113 ]: # Skip unsupported categories
114 continue
115 if instr.access_storage_with_update: # Not supported
116 continue
117
118 if instr.name in [
119 "LD_V1",
120 "LWZ_V1",
121 "STW_V1",
122 ]:
123 continue
124
125 if any(
126 [moper.is_load for moper in instr.memory_operand_descriptors]
127 ) and all(
128 [not moper.is_store for moper in instr.memory_operand_descriptors]
129 ):
130 profile[instr] = 1
131
132 rand = RND
133
134 cwrapper = microprobe.code.get_wrapper("CInfPpc")
135 synth = microprobe.code.Synthesizer(TARGET, cwrapper())
136
137 synth.add_pass(
138 microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE)
139 )
140 synth.add_pass(
141 microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT)
142 )
143 synth.add_pass(
144 microprobe.passes.initialization.InitializeRegisterPass(
145 "GPR1", 0, force=True, reserve=True
146 )
147 )
148 synth.add_pass(
149 microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile)
150 )
151 synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(mmodel))
152 synth.add_pass(
153 microprobe.passes.register.DefaultRegisterAllocationPass(rand, dd=3)
154 )
155
156 print_info(f"Generating {name}...")
157 bench = synth.synthesize()
158 print_info(f"{name} Generated!")
159 synth.save(f"{DIRECTORY}/{name}", bench=bench) # Save the benchmark
160
161
162###############################################################################
163# Example 2: loop with instructions using the MUL unit and with dependency #
164# distance of 4 #
165###############################################################################
166def example_2():
167 """Example 2"""
168 name = "FXU-MUL"
169
170 cwrapper = microprobe.code.get_wrapper("CInfPpc")
171 synth = microprobe.code.Synthesizer(TARGET, cwrapper())
172
173 rand = RND
174
175 synth.add_pass(
176 microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT)
177 )
178 synth.add_pass(
179 microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE)
180 )
181 synth.add_pass(
182 microprobe.passes.instruction.SetInstructionTypeByElementPass(
183 TARGET, [TARGET.elements["MUL_FXU0_Core0_SCM_Processor"]], {}
184 )
185 )
186 synth.add_pass(
187 microprobe.passes.register.DefaultRegisterAllocationPass(rand, dd=4)
188 )
189
190 print_info(f"Generating {name}...")
191 bench = synth.synthesize()
192 print_info(f"{name} Generated!")
193 synth.save(f"{DIRECTORY}/{name}", bench=bench) # Save the benchmark
194
195
196###############################################################################
197# Example 3: loop with instructions using the ALU unit and with dependency #
198# distance of 1 #
199###############################################################################
200def example_3():
201 """Example 3"""
202 name = "FXU-ALU"
203
204 cwrapper = microprobe.code.get_wrapper("CInfPpc")
205 synth = microprobe.code.Synthesizer(TARGET, cwrapper())
206
207 rand = RND
208
209 synth.add_pass(
210 microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT)
211 )
212 synth.add_pass(
213 microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE)
214 )
215 synth.add_pass(
216 microprobe.passes.instruction.SetInstructionTypeByElementPass(
217 TARGET, [TARGET.elements["ALU_FXU0_Core0_SCM_Processor"]], {}
218 )
219 )
220 synth.add_pass(
221 microprobe.passes.register.DefaultRegisterAllocationPass(rand, dd=1)
222 )
223
224 print_info(f"Generating {name}...")
225 bench = synth.synthesize()
226 print_info(f"{name} Generated!")
227 synth.save(f"{DIRECTORY}/{name}", bench=bench) # Save the benchmark
228
229
230###############################################################################
231# Example 4: loop with FMUL* instructions with different weights and with #
232# dependency distance 10 #
233###############################################################################
234def example_4():
235 """Example 4"""
236 name = "VSU-FMUL"
237
238 profile = {}
239 profile[TARGET.instructions["FMUL_V0"]] = 4
240 profile[TARGET.instructions["FMULS_V0"]] = 3
    profile[TARGET.instructions["FMULx_V0"]] = 2
    profile[TARGET.instructions["FMULSx_V0"]] = 1

    cwrapper = microprobe.code.get_wrapper("CInfPpc")
    synth = microprobe.code.Synthesizer(TARGET, cwrapper())

    rand = RND

    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT)
    )
    synth.add_pass(
        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE)
    )
    synth.add_pass(
        microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile)
    )
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(rand, dd=10)
    )

    print_info(f"Generating {name}...")
    bench = synth.synthesize()
    print_info(f"{name} Generated!")
    synth.save(f"{DIRECTORY}/{name}", bench=bench)  # Save the benchmark


###############################################################################
# Example 5: loop with FADD* instructions with different weights and with     #
# dependency distance 1                                                       #
###############################################################################
def example_5():
    """Example 5"""
    name = "VSU-FADD"

    profile = {}
    profile[TARGET.instructions["FADD_V0"]] = 100
    profile[TARGET.instructions["FADDx_V0"]] = 1
    profile[TARGET.instructions["FADDS_V0"]] = 10
    profile[TARGET.instructions["FADDSx_V0"]] = 1

    cwrapper = microprobe.code.get_wrapper("CInfPpc")
    synth = microprobe.code.Synthesizer(TARGET, cwrapper())

    rand = RND

    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT)
    )
    synth.add_pass(
        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE)
    )
    synth.add_pass(
        microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile)
    )
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(rand, dd=1)
    )

    print_info(f"Generating {name}...")
    bench = synth.synthesize()
    print_info(f"{name} Generated!")
    synth.save(f"{DIRECTORY}/{name}", bench=bench)  # Save the benchmark


###############################################################################
# Call the examples                                                           #
###############################################################################
example_1()
example_2()
example_3()
example_4()
example_5()
exit(0)
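Example 5 gives the four FADD variants the integer weights 100, 1, 10 and 1. Assuming these weights act as relative selection frequencies (an assumption about SetInstructionTypeByProfilePass that the listing itself does not state), the expected instruction mix follows from normalizing them. The sketch below uses plain strings in place of the TARGET.instructions objects, and expected_mix is a hypothetical helper, not part of Microprobe.

```python
# Hypothetical helper: normalize profile weights into expected shares.
# Plain strings stand in for the TARGET.instructions[...] objects.


def expected_mix(profile):
    """Return each key's share of the total weight (shares sum to 1)."""
    total = sum(profile.values())
    if total <= 0:
        raise ValueError("profile needs at least one positive weight")
    return {name: weight / total for name, weight in profile.items()}


# Weights copied from Example 5 above.
profile = {"FADD_V0": 100, "FADDx_V0": 1, "FADDS_V0": 10, "FADDSx_V0": 1}
mix = expected_mix(profile)

for name, share in mix.items():
    print(f"{name}: {share:.1%}")
```

Under that assumption, FADD_V0 would account for 100 of every 112 instructions, roughly 89% of the loop body, with the two record-form variants contributing under 1% each.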
power_v206_power7_ppc64_linux_gcc_genetic.py
Deprecated since version 0.5: Support for PyEvolve and genetic-algorithm-based searches has been discontinued.
The following example shows how to use the design exploration module and genetic-algorithm-based searches to look for a solution. In particular, for each functional unit of the architecture and a range of IPCs (instructions per cycle), the example looks for a solution that stresses that functional unit at the given IPC. External commands (not included) are needed to evaluate the generated microbenchmarks on the target platform.
#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
power_v206_power7_ppc64_linux_gcc_genetic.py

Example Python script to show how to generate a set of microbenchmarks
stressing a particular unit but at different IPC ratios using a genetic
search algorithm to play with two knobs: average latency and dependency
distance.

An IPC evaluation and scoring script is required. For instance:

.. code:: bash

    #!/bin/bash
    # ARGS: $1 is the target IPC
    #       $2 is the name of the generated benchmark
    target_ipc=$1
    source_bench=$2

    # Compile the benchmark
    gcc -O0 -mcpu=power7 -mtune=power7 -std=c99 $source_bench.c -o $source_bench

    # Evaluate the ipc
    ipc=< your preferred commands to evaluate the IPC >

    # Compute the score (the closer to the target IPC, the higher the score)
    score=$(echo "(1/($ipc-$target_ipc))^2" | bc -l)

    echo $score

Use the script above as a template for your own GA-based search.
"""

# Futures
from __future__ import absolute_import, division

# Built-in modules
import datetime
import os
import sys
import time as runtime
from typing import List, Tuple

# Own modules
import microprobe.code
import microprobe.driver.genetic
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.register
import microprobe.passes.structure
from microprobe.exceptions import MicroprobeTargetDefinitionError
from microprobe.target import import_definition
from microprobe.utils.cmdline import print_error, print_info, print_warning
from microprobe.utils.misc import RND, RNDINT
from microprobe.utils.typeguard_decorator import typeguard_testsuite

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Benchmark size
BENCHMARK_SIZE = 20

COMMAND = None
DIRECTORY = None

# Get the target definition
try:
    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
except MicroprobeTargetDefinitionError as exc:
    print_error("Unable to import target definition")
    print_error(f"Exception message: {str(exc)}")
    exit(-1)


@typeguard_testsuite
def main():
    """Main function."""

    component_list = ["FXU", "FXU-noLSU", "FXU-LSU", "VSU", "VSU-FXU"]
    # Target IPCs 0.1 .. 4.0, rotated so the sweep starts at 0.6
    ipcs = [float(x) / 10 for x in range(1, 41)]
    ipcs = ipcs[5:] + ipcs[:5]

    for name in component_list:
        for ipc in ipcs:
            generate_genetic(name, ipc)


@typeguard_testsuite
def generate_genetic(compname: str, ipc: float):
    """Generate a microbenchmark stressing compname at the given ipc."""

    assert COMMAND is not None, "COMMAND variable cannot be None"
    assert DIRECTORY is not None, "DIRECTORY variable cannot be None"

    comps = []
    bcomps = []
    any_comp: bool = False

    assert (
        TARGET.microarchitecture is not None
    ), "Target must have a defined microarchitecture"

    if compname.find("FXU") >= 0:
        comps.append(
            TARGET.microarchitecture.elements["FXU0_Core0_SCM_Processor"]
        )

    if compname.find("VSU") >= 0:
        comps.append(
            TARGET.microarchitecture.elements["VSU0_Core0_SCM_Processor"]
        )

    if len(comps) == 2:
        any_comp = True
    elif compname.find("noLSU") >= 0:
        bcomps.append(
            TARGET.microarchitecture.elements["LSU0_Core0_SCM_Processor"]
        )
    elif compname.find("LSU") >= 0:
        comps.append(
            TARGET.microarchitecture.elements["LSU_Core0_SCM_Processor"]
        )

    # Skip target IPCs above the limit for the selected number of units
    if (len(comps) == 1 and ipc > 2) or (len(comps) == 2 and ipc > 4):
        return True

    # Skip combinations that already have a generated benchmark
    for elem in os.listdir(DIRECTORY):
        if not elem.endswith(".c"):
            continue
        if elem.startswith(f"{compname}:IPC:{ipc:.2f}:DIST"):
            print_info(f"Already generated: {compname} {ipc:.2f}")
            return True

    print_info(f"Going for IPC: {ipc} and Element: {compname}")

    def generate(name: str, dist: float, latency: float):
        """Benchmark generation function.

        First argument is name, second the dependency distance and the
        third is the average instruction latency.
        """
        wrapper = microprobe.code.get_wrapper("CInfPpc")
        synth = microprobe.code.Synthesizer(TARGET, wrapper())
        rand = RND
        synth.add_pass(
            microprobe.passes.initialization.InitializeRegistersPass(
                value=RNDINT
            )
        )
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE)
        )
        synth.add_pass(
            microprobe.passes.instruction.SetInstructionTypeByElementPass(
                TARGET,
                comps,
                {},
                block=bcomps,
                avelatency=latency,
                any_comp=any_comp,
            )
        )
        synth.add_pass(
            microprobe.passes.register.DefaultRegisterAllocationPass(
                rand, dd=dist
            )
        )
        bench = synth.synthesize()
        synth.save(name, bench=bench)

    # Set the genetic algorithm parameters
    ga_params: List[Tuple[int, int, float]] = []
    ga_params.append((0, 20, 0.05))  # Average dependency distance design space
    ga_params.append((2, 8, 0.05))  # Average instruction latency design space

    # Set up the search driver
    driver = microprobe.driver.genetic.ExecCmdDriver(
        generate, 20, 30, 30, f"'{COMMAND}' {ipc} ", ga_params
    )

    starttime = runtime.time()
    print_info("Start search...")
    driver.run(1)
    print_info("Search end")
    endtime = runtime.time()

    print_info(
        "Genetic time::" f"{datetime.timedelta(seconds=endtime - starttime)}"
    )

    # Check if we found a solution
    ga_sol_params: Tuple[float, float] = driver.solution()
    score = driver.score()

    print_info(f"IPC found: {ipc}, score: {score}")

    if score < 20:
        print_warning(f"Unable to find an optimal solution with IPC: {ipc}:")
        print_info("Generating the closest solution...")
        generate(
            f"{DIRECTORY}/{compname}:IPC:{ipc:.2f}:"
            f"DIST:{ga_sol_params[0]:.2f}:LAT:{ga_sol_params[1]:.2f}-check",
            ga_sol_params[0],
            ga_sol_params[1],
        )
        print_info("Closest solution generated")
    else:
        print_info(
            "Solution found for %s and IPC %f -> dist: %f , "
            "latency: %f "
            % (compname, ipc, ga_sol_params[0], ga_sol_params[1])
        )
        print_info("Generating solution...")
        generate(
            f"{DIRECTORY}/{compname}:IPC:{ipc:.2f}:"
            f"DIST:{ga_sol_params[0]:.2f}:LAT:{ga_sol_params[1]:.2f}",
            ga_sol_params[0],
            ga_sol_params[1],
        )
        print_info("Solution generated")
    return True


if __name__ == "__main__":
    # run main if executed from the COMMAND line
    # and the main method exists

    if len(sys.argv) != 3:
        print_info("Usage:")
        print_info(f"{sys.argv[0]} output_dir eval_cmd")
        print_info("")
        print_info("Output dir: output directory for the generated benchmarks")
        print_info("eval_cmd: command accepting 2 parameters: the target IPC")
        print_info("          and the filename of the generated benchmark.")
        print_info("          Output: the score used for the GA search. E.g.,")
        print_info("          the closer the benchmark IPC is to the target")
        print_info("          IPC, the higher the score the cmd should give.")
        exit(-1)

    DIRECTORY = sys.argv[1]
    COMMAND = sys.argv[2]

    if not os.path.isdir(DIRECTORY):
        print_info(f"Output DIRECTORY '{DIRECTORY}' does not exist")
        exit(-1)

    if not os.path.isfile(COMMAND):
        print_info(f"The COMMAND '{COMMAND}' does not exist")
        exit(-1)

    if callable(locals().get("main")):
        main()
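The script expects eval_cmd to be an existing file that receives the target IPC and the benchmark name and prints a score, with higher scores meaning a closer match. As a hedged sketch of the scoring step only, the function below mirrors the inverse-squared-error formula from the bash template in the docstring. The names ipc_score and max_score, and the cap that avoids a division by zero on an exact match, are assumptions added for illustration. Measuring the IPC is platform specific and is left out.

```python
def ipc_score(measured_ipc, target_ipc, max_score=1e6):
    """Score a benchmark: larger when measured_ipc is nearer target_ipc.

    Follows score = (1 / (ipc - target_ipc)) ** 2 from the template,
    capped at max_score (an added assumption) so that an exact match
    does not divide by zero.
    """
    error = measured_ipc - target_ipc
    if error == 0:
        return max_score
    return min((1.0 / error) ** 2, max_score)


# An IPC 0.1 away from the target scores higher than one 0.5 away.
near = ipc_score(1.9, 2.0)
far = ipc_score(1.5, 2.0)
print(near > far)
```

If the evaluation command uses this formula, the `score < 20` check in generate_genetic corresponds to a measured IPC that is more than about 0.22 away from the target, since 1 / sqrt(20) is roughly 0.224.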