Examples on POWER
In the definitions/power/examples directory of the Microprobe distribution (available if you installed the microprobe_target_power package), you will find several examples showing how to use Microprobe for the POWER architecture. Although the examples are split by architecture, the concepts they introduce are common to all architectures.
We recommend that users go through the code of these examples to understand the specific details of how to use the framework.
isa_power_v206_info.py
The first example is isa_power_v206_info.py. It shows how to search for architecture definitions (e.g. the ISA properties), how to import a definition, and how to dump its contents.
If you execute the following command:
> ./isa_power_v206_info.py
it will generate the following output, which shows all the details of the POWER v2.06 architecture (only the first and last 20 lines are shown for brevity):
--------------------------------------------------------------------------------
ISA Name: power_v206
ISA Description: power_v206
--------------------------------------------------------------------------------
Register Types:
GPR: General Register (bit size: 64)
VSCR: Vector Status and Control Register (bit size: 32)
FPR: Floating-Point Register (bit size: 64)
SPR: Special Purpose Register (64 bits) (bit size: 64)
VR: Vector Register (bit size: 128)
MSR: Machine State Register (bit size: 64)
SPR32: Special Purpose Register (32 bits) (bit size: 32)
VSR: Vector Scalar Register (bit size: 128)
FPSCR: Floating-Point Status and Control Register (bit size: 32)
CR: Condition Register (bit size: 4)
--------------------------------------------------------------------------------
Architected registers:
AESR : AESR Register (Type: SPR)
AMOR : AMOR Register (Type: SPR)
AMR : Authority Mask Register (Type: SPR)
...
access_storage : False (Boolean indicating if the instruction has storage operands )
access_storage_with_update : False (Boolean indicating if the instruction accesses to storage and updates the source register with the generated address)
algebraic : False (Boolean indicating if operation uses algebraic rules to keep values )
branch : False (Boolean indicating if the instruction is a branch )
branch_conditional : False (Boolean indicating if the instruction is a branch conditional )
branch_relative : False (Boolean indicating if the instruction is a relative branch )
category : VSX (String indicating if the instruction the instruction category )
decimal : False (Boolean indication if the instruction requires inputs in decimal format )
disable_asm : False (Boolean indicating if ASM generation is disabled for the instruction. If so, binary codification is used. )
hypervisor : False (Boolean indicating if the instruction need hypervisor mode )
privileged : False (Boolean indicating if the instruction is privileged )
privileged_optional : False (Boolean indicating the instrucion is priviledged or not depending on the input values )
switching : None (Input values required to maximize the computational switching )
syscall : False (Boolean indicating if the instruction is a syscall or return from one )
trap : False (Boolean indicating if the instruction is a trap )
Instructions defined: 938
Variants defined: 964
--------------------------------------------------------------------------------
The following code is what has been executed:
1#!/usr/bin/env python
2# Copyright 2011-2021 IBM Corporation
3#
4# Licensed under the Apache License, Version 2.0 (the "License");
5# you may not use this file except in compliance with the License.
6# You may obtain a copy of the License at
7#
8# http://www.apache.org/licenses/LICENSE-2.0
9#
10# Unless required by applicable law or agreed to in writing, software
11# distributed under the License is distributed on an "AS IS" BASIS,
12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13# See the License for the specific language governing permissions and
14# limitations under the License.
15"""
16isa_power_v206_info.py
17
18Example module to show how to access to isa definitions.
19"""
20
21# Futures
22from __future__ import absolute_import, print_function
23
24# Built-in modules
25import os
26
27# Own modules
28from microprobe.target.isa import find_isa_definitions, import_isa_definition
29
30__author__ = "Ramon Bertran"
31__copyright__ = "Copyright 2011-2021 IBM Corporation"
32__credits__ = []
33__license__ = "IBM (c) 2011-2021 All rights reserved"
34__version__ = "0.5"
35__maintainer__ = "Ramon Bertran"
36__email__ = "rbertra@us.ibm.com"
37__status__ = "Development" # "Prototype", "Development", or "Production"
38
39# Constants
40ISANAME = "power_v206"
41
42# Functions
43
44# Classes
45
46# Main
47
48# Search and import definition
49ISADEF = import_isa_definition(
50 os.path.dirname([
51 isa for isa in find_isa_definitions() if isa.name == ISANAME
52 ][0].filename))
53
54# Print definition
55print((ISADEF.full_report()))
56exit(0)
This simple script first imports the find_isa_definitions and import_isa_definition functions from the microprobe.target.isa module (line 28). The first function searches for the available architecture definitions and returns a list. The list is filtered by name, and only the definition named power_v206 is imported using the second function, import_isa_definition (lines 49-52). Finally, the full report of the ISADEF object is printed to standard output (line 55).
In this case the full report is printed, but the user can query any information about the imported ISA through the microprobe.target.isa.ISA API.
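Note that the search-and-filter expression in the listing indexes the filtered list with [0], so it raises an IndexError when no definition has the requested name. The following is a minimal sketch of a more defensive lookup. It runs without Microprobe installed because it uses a hypothetical stand-in list in place of the result of find_isa_definitions(). The only assumption taken from the listing is that each entry exposes name and filename attributes. The Definition tuple, the sample names and paths, and the pick_definition helper are illustrative and are not part of the Microprobe API.

```python
import os
from collections import namedtuple

# Hypothetical stand-in for the entries returned by find_isa_definitions().
# Only the 'name' and 'filename' attributes are assumed, as in the listing.
Definition = namedtuple("Definition", ["name", "filename"])

# Illustrative sample data: these names and paths are made up.
SAMPLE_DEFINITIONS = [
    Definition("power_v206", "/opt/defs/power_v206/isa.yaml"),
    Definition("other_isa", "/opt/defs/other_isa/isa.yaml"),
]


def pick_definition(definitions, isa_name):
    """Return the directory of the first definition named isa_name."""
    match = next((d for d in definitions if d.name == isa_name), None)
    if match is None:
        known = ", ".join(sorted(d.name for d in definitions))
        raise LookupError("ISA '%s' not found. Known: %s" % (isa_name, known))
    return os.path.dirname(match.filename)


print(pick_definition(SAMPLE_DEFINITIONS, "power_v206"))
```

With real definitions, the returned directory would be the value passed to import_isa_definition, as the listing does.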
power_v206_power7_ppc64_linux_gcc_profile.py
The aim of this example is to show how code generation works in Microprobe. In particular, it shows how to generate, for each instruction of the ISA, an endless loop containing that instruction. The size of the loop and the dependency distance between the instructions of the loop can be specified as parameters. Using Microprobe you can generate thousands of microbenchmarks in a few minutes. Let's start with the command line interface. Executing:
> ./power_v206_power7_ppc64_linux_gcc_profile.py --help
will generate the following output:
power_v206_power7_ppc64_linux_gcc_profile.py: INFO: Processing input arguments...
usage: power_v206_power7_ppc64_linux_gcc_profile.py [-h]
[-P SEARCH_PATH [SEARCH_PATH ...]]
[-V] [-v] [-d]
[-i INSTRUCTION_NAME [INSTRUCTION_NAME ...]]
[--output_prefix PREFIX]
[-O PATH] [-p NUM_JOBS]
[-S BENCHMARK_SIZE]
[-D DEPENDECY_DISTANCE]
ISA power v206 profile example
optional arguments:
-h, --help show this help message and exit
-P SEARCH_PATH [SEARCH_PATH ...], --default_paths SEARCH_PATH [SEARCH_PATH ...]
Default search paths for microprobe target definitions
-V, --version Show Microprobe version and exit
-v, --verbosity Verbosity level (Values: [0,1,2,3,4]). Each time this
argument is specified the verbosity level is
increased. By default, no logging messages are shown.
These are the four levels available:
-v (1): critical messages
-v -v (2): critical and error messages
-v -v -v (3): critical, error and warning messages
-v -v -v -v (4): critical, error, warning and info messages
Specifying more than four verbosity flags, will
default to the maximum of four. If you need extra
information, enable the debug mode (--debug or -d
flags).
-d, --debug Enable debug mode in Microprobe framework. Lots of
output messages will be generated
-i INSTRUCTION_NAME [INSTRUCTION_NAME ...], --instruction INSTRUCTION_NAME [INSTRUCTION_NAME ...]
Instruction names to generate. Default: All
instructions
--output_prefix PREFIX
Output prefix of the generated files. Default:
POWER_V206_PROFILE
-O PATH, --output_path PATH
Output path. Default: current path
-p NUM_JOBS, --parallel NUM_JOBS
Number of parallel jobs. Default: number of CPUs
available (80). Valid values: 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61,
62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74,
75, 76, 77, 78, 79, 80
-S BENCHMARK_SIZE, --size BENCHMARK_SIZE
Benchmark size (number of instructions in the endless
loop). Default: 64 instructions
-D DEPENDECY_DISTANCE, --dependency_distance DEPENDECY_DISTANCE
Average dependency distance between the instructions.
Default: 1000 (no dependencies)
Environment variables:
MICROPROBETEMPLATES Default path for microprobe templates
MICROPROBEDEBUG If set, enable debug
MICROPROBEDEBUGPASSES If set, enable debug during passes
MICROPROBEASMHEXFMT Assembly hexadecimal format. Options:
'all' -> All immediates in hex format
'address' -> Address immediates in hex format (default)
'none' -> All immediate in integer format
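To make the behaviour of some of these options concrete, here is a rough sketch written with the standard argparse module. It is not how Microprobe implements its command line (the tool uses microprobe.utils.cmdline.CLI). It only mimics what the help text describes: the repeated -v flag capped at four, and the range-checked -S and -D values. The bounded_int, build_parser and parse helpers are hypothetical names, and ADD_V0 is a placeholder instruction name.

```python
import argparse


def bounded_int(low, high):
    """Build an argparse type that accepts integers in [low, high]."""
    def convert(text):
        value = int(text)
        if not low <= value <= high:
            raise argparse.ArgumentTypeError(
                "%d is not in [%d, %d]" % (value, low, high))
        return value
    return convert


def build_parser():
    parser = argparse.ArgumentParser(description="profile example (sketch)")
    # Each -v on the command line raises the verbosity by one level.
    parser.add_argument("-v", "--verbosity", action="count", default=0)
    parser.add_argument("-i", "--instruction", nargs="+", default=None,
                        metavar="INSTRUCTION_NAME")
    parser.add_argument("-S", "--size", type=bounded_int(1, 2**20),
                        default=64, metavar="BENCHMARK_SIZE")
    parser.add_argument("-D", "--dependency_distance",
                        type=bounded_int(1, 1000), default=1000,
                        metavar="DEPENDENCY_DISTANCE")
    return parser


def parse(argv):
    args = build_parser().parse_args(argv)
    # More than four -v flags fall back to the maximum level of four.
    args.verbosity = min(args.verbosity, 4)
    return args


# 'ADD_V0' is a placeholder instruction name used only for illustration.
args = parse(["-v", "-v", "-v", "-v", "-v", "-S", "128", "-i", "ADD_V0"])
print(args.verbosity, args.size, args.dependency_distance, args.instruction)
```

An out-of-range value such as -S 0 makes argparse print an error and exit, which is similar in spirit to the range checks applied to the size and dependency distance options.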
Let's look at the code to see how this command line tool is implemented. This is the complete code of the script:
1#!/usr/bin/env python
2# Copyright 2011-2021 IBM Corporation
3#
4# Licensed under the Apache License, Version 2.0 (the "License");
5# you may not use this file except in compliance with the License.
6# You may obtain a copy of the License at
7#
8# http://www.apache.org/licenses/LICENSE-2.0
9#
10# Unless required by applicable law or agreed to in writing, software
11# distributed under the License is distributed on an "AS IS" BASIS,
12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13# See the License for the specific language governing permissions and
14# limitations under the License.
15"""
16power_v206_power7_ppc64_linux_gcc_profile.py
17
18Example module to show how to generate a benchmark for each instruction
19of the ISA
20"""
21
22# Futures
23from __future__ import absolute_import
24
25# Built-in modules
26import multiprocessing as mp
27import os
28import sys
29import traceback
30import random
31
32# Third party modules
33
34# Own modules
35import microprobe.code.ins
36import microprobe.passes.address
37import microprobe.passes.branch
38import microprobe.passes.decimal
39import microprobe.passes.float
40import microprobe.passes.ilp
41import microprobe.passes.initialization
42import microprobe.passes.instruction
43import microprobe.passes.memory
44import microprobe.passes.register
45import microprobe.passes.structure
46import microprobe.utils.cmdline
47from microprobe import MICROPROBE_RC
48from microprobe.exceptions import MicroprobeException
49from microprobe.target import import_definition
50from microprobe.utils.cmdline import existing_dir, \
51 int_type, print_error, print_info, print_warning
52from microprobe.utils.logger import get_logger
53
54__author__ = "Ramon Bertran"
55__copyright__ = "Copyright 2011-2021 IBM Corporation"
56__credits__ = []
57__license__ = "IBM (c) 2011-2021 All rights reserved"
58__version__ = "0.5"
59__maintainer__ = "Ramon Bertran"
60__email__ = "rbertra@us.ibm.com"
61__status__ = "Development" # "Prototype", "Development", or "Production"
62
63# Constants
64LOG = get_logger(__name__) # Get the generic logging interface
65
66
67# Functions
68def main_setup():
69 """
70 Set up the command line interface (CLI) with the arguments required by
71 this command line tool.
72 """
73
74 args = sys.argv[1:]
75
76 # Create the CLI interface object
77 cmdline = microprobe.utils.cmdline.CLI("ISA power v206 profile example",
78 config_options=False,
79 target_options=False,
80 debug_options=False)
81
82 # Add the different parameters for this particular tool
83 cmdline.add_option(
84 "instruction",
85 "i",
86 None,
87 "Instruction names to generate. Default: All instructions",
88 required=False,
89 nargs="+",
90 metavar="INSTRUCTION_NAME")
91
92 cmdline.add_option(
93 "output_prefix",
94 None,
95 "POWER_V206_PROFILE",
96 "Output prefix of the generated files. Default: POWER_V206_PROFILE",
97 opt_type=str,
98 required=False,
99 metavar="PREFIX")
100
101 cmdline.add_option("output_path",
102 "O",
103 "./",
104 "Output path. Default: current path",
105 opt_type=existing_dir,
106 metavar="PATH")
107
108 cmdline.add_option(
109 "parallel",
110 "p",
111 MICROPROBE_RC['cpus'],
112 "Number of parallel jobs. Default: number of CPUs available (%s)" %
113 mp.cpu_count(),
114 opt_type=int,
115 choices=list(range(1, MICROPROBE_RC['cpus'] + 1)),
116 metavar="NUM_JOBS")
117
118 cmdline.add_option(
119 "size",
120 "S",
121 64, "Benchmark size (number of instructions in the endless loop). "
122 "Default: 64 instructions",
123 opt_type=int_type(1, 2**20),
124 metavar="BENCHMARK_SIZE")
125
126 cmdline.add_option("dependency_distance",
127 "D",
128 1000,
129 "Average dependency distance between the instructions. "
130 "Default: 1000 (no dependencies)",
131 opt_type=int_type(1, 1000),
132 metavar="DEPENDECY_DISTANCE")
133
134 # Start the main
135 print_info("Processing input arguments...")
136 cmdline.main(args, _main)
137
138
139def _main(arguments):
140 """
141 Main program. Called after the arguments from the CLI interface have
142 been processed.
143 """
144
145 print_info("Arguments processed!")
146
147 print_info("Importing target definition "
148 "'power_v206-power7-ppc64_linux_gcc'...")
149 target = import_definition("power_v206-power7-ppc64_linux_gcc")
150
151 # Get the arguments
152 instructions = arguments.get("instruction", None)
153 prefix = arguments["output_prefix"]
154 output_path = arguments["output_path"]
155 parallel_jobs = arguments["parallel"]
156 size = arguments["size"]
157 distance = arguments["dependency_distance"]
158
159 # Process the arguments
160 if instructions is not None:
161
162 # If the user has provided some instructions, make sure they
163 # exists and then we call the generation function
164
165 instructions = _validate_instructions(instructions, target)
166
167 if len(instructions) == 0:
168 print_error("No valid instructions defined.")
169 exit(-1)
170
171 # Set more verbose level
172 # set_log_level(10)
173 #
174 list(
175 map(_generate_benchmark,
176 [(instruction, prefix, output_path, target, size, distance)
177 for instruction in instructions]))
178
179 else:
180
181 # If the user has not provided any instruction, go for all of them
182 # and then call he generation function
183
184 instructions = _generate_instructions(target, output_path, prefix)
185
186 # Since several benchmark will be generated, reduce verbose level
187 # and call the generation function in parallel
188
189 # set_log_level(30)
190
191 if parallel_jobs > 1:
192 pool = mp.Pool(processes=parallel_jobs)
193 pool.map(
194 _generate_benchmark,
195 [(instruction, prefix, output_path, target, size, distance)
196 for instruction in instructions], 1)
197 else:
198 list(
199 map(_generate_benchmark,
200 [(instruction, prefix, output_path, target, size, distance)
201 for instruction in instructions]))
202
203
204def _validate_instructions(instructions, target):
205 """
206 Validate the provided instruction for a given target
207 """
208
209 nins = []
210 for instruction in instructions:
211
212 if instruction not in list(target.isa.instructions.keys()):
213 print_warning("'%s' not defined in the ISA. Skipping..." %
214 instruction)
215 continue
216 nins.append(instruction)
217 return nins
218
219
220def _generate_instructions(target, path, prefix):
221 """
222 Generate the list of instruction to be generated for a given target
223 """
224
225 instructions = []
226 for name, instr in target.instructions.items():
227
228 if instr.privileged or instr.hypervisor:
229 # Skip priv/hyper instructions
230 continue
231
232 if instr.branch and not instr.branch_relative:
233 # Skip branch absolute due to relocation problems
234 continue
235
236 if instr.category in ['LMA', 'LMV', 'DS', 'EC']:
237 # Skip some instruction categories
238 continue
239
240 if name in [
241 'LSWI_V0', 'LSWX_V0', 'LMW_V0', 'STSWX_V0', 'LD_V1', 'LWZ_V1',
242 'STW_V1'
243 ]:
244 # Some instructions are not completely supported yet
245 # String-related instructions and load multiple
246
247 continue
248
249 # Skip if the files already exists
250
251 fname = "%s/%s_%s.c" % (path, prefix, name)
252 ffname = "%s/%s_%s.c.fail" % (path, prefix, name)
253
254 if os.path.isfile(fname):
255 print_warning("Skip %s. '%s' already generated" % (name, fname))
256 continue
257
258 if os.path.isfile(ffname):
259 print_warning("Skip %s. '%s' already generated (failed)" %
260 (name, ffname))
261 continue
262
263 instructions.append(name)
264
265 return instructions
266
267
268def _generate_benchmark(args):
269 """
270 Actual benchmark generation policy. This is the function that defines
271 how the microbenchmark are going to be generated
272 """
273
274 instr_name, prefix, output_path, target, size, distance = args
275
276 try:
277
278 # Name of the output file
279 fname = "%s/%s_%s" % (output_path, prefix, instr_name)
280
281 # Name of the fail output file (generated in case of exception)
282 ffname = "%s.c.fail" % (fname)
283
284 print_info("Generating %s ..." % (fname))
285
286 instruction = microprobe.code.ins.Instruction()
287 instruction.set_arch_type(target.instructions[instr_name])
288 sequence = [target.instructions[instr_name]]
289
290 # Get the wrapper object. The wrapper object is in charge of
291 # translating the internal representation of the microbenchmark
292 # to the final output format.
293 #
294 # In this case, we obtain the 'CInfGen' wrapper, which embeds
295 # the generated code within an infinite loop using C plus
296 # in-line assembly statements.
297 cwrapper = microprobe.code.get_wrapper("CInfGen")
298
299 # Create the synthesizer object, which is in charge of driving the
300 # generation of the microbenchmark, given a set of passes
301 # (a.k.a. transformations) to apply to the an empty internal
302 # representation of the microbenchmark
303 synth = microprobe.code.Synthesizer(target,
304 cwrapper(),
305 value=0b01010101)
306
307 rand = random.Random()
308 rand.seed(13)
309
310 # Add the transformation passes
311
312 #######################################################################
313 # Pass 1: Init integer registers to a given value #
314 #######################################################################
315 synth.add_pass(
316 microprobe.passes.initialization.InitializeRegistersPass(
317 value=_init_value()))
318 floating = False
319 vector = False
320
321 for operand in instruction.operands():
322 if operand.type.immediate:
323 continue
324
325 if operand.type.float:
326 floating = True
327
328 if operand.type.vector:
329 vector = True
330
331 if vector and floating:
332 ###################################################################
333 # Pass 1.A: if instruction uses vector floats, init vector #
334 # registers to float values #
335 ###################################################################
336 synth.add_pass(
337 microprobe.passes.initialization.InitializeRegistersPass(
338 v_value=(1.000000000000001, 64)))
339 elif vector:
340 ###################################################################
341 # Pass 1.B: if instruction uses vector but not floats, init #
342 # vector registers to integer value #
343 ###################################################################
344 synth.add_pass(
345 microprobe.passes.initialization.InitializeRegistersPass(
346 v_value=(_init_value(), 64)))
347 elif floating:
348 ###################################################################
349 # Pass 1.C: if instruction uses floats, init float #
350 # registers to float values #
351 ###################################################################
352 synth.add_pass(
353 microprobe.passes.initialization.InitializeRegistersPass(
354 fp_value=1.000000000000001))
355
356 #######################################################################
357 # Pass 2: Add a building block of size 'size' #
358 #######################################################################
359 synth.add_pass(
360 microprobe.passes.structure.SimpleBuildingBlockPass(size))
361
362 #######################################################################
363 # Pass 3: Fill the building block with the instruction sequence #
364 #######################################################################
365 synth.add_pass(
366 microprobe.passes.instruction.SetInstructionTypeBySequencePass(
367 sequence))
368
369 #######################################################################
370 # Pass 4: Compute addresses of instructions (this pass is needed to #
371 # update the internal representation information so that in #
372 # case addresses are required, they are up to date). #
373 #######################################################################
374 synth.add_pass(
375 microprobe.passes.address.UpdateInstructionAddressesPass())
376
377 #######################################################################
378 # Pass 5: Set target of branches to be the next instruction in the #
379 # instruction stream #
380 #######################################################################
381 synth.add_pass(microprobe.passes.branch.BranchNextPass())
382
383 #######################################################################
384 # Pass 6: Set memory-related operands to access 16 storage locations #
385 # in a round-robin fashion in stride 256 bytes. #
386 # The pattern would be: 0, 256, 512, .... 3840, 0, 256, ... #
387 #######################################################################
388 synth.add_pass(microprobe.passes.memory.SingleMemoryStreamPass(
389 16, 256))
390
391 #######################################################################
392 # Pass 7.A: Initialize the storage locations accessed by floating #
393 # point instructions to have a valid floating point value #
394 #######################################################################
395 synth.add_pass(
396 microprobe.passes.float.InitializeMemoryFloatPass(
397 value=1.000000000000001))
398
399 #######################################################################
400 # Pass 7.B: Initialize the storage locations accessed by decimal #
401 # instructions to have a valid decimal value #
402 #######################################################################
403 synth.add_pass(
404 microprobe.passes.decimal.InitializeMemoryDecimalPass(value=1))
405
406 #######################################################################
407 # Pass 8: Set the remaining instructions operands (if not set) #
408 # (Required to set remaining immediate operands) #
409 #######################################################################
410 synth.add_pass(
411 microprobe.passes.register.DefaultRegisterAllocationPass(
412 rand, dd=distance))
413
414 # Synthesize the microbenchmark.The synthesize applies the set of
415 # transformation passes added before and returns object representing
416 # the microbenchmark
417 bench = synth.synthesize()
418
419 # Save the microbenchmark to the file 'fname'
420 synth.save(fname, bench=bench)
421
422 print_info("%s generated!" % (fname))
423
424 # Remove fail file if exists
425 if os.path.isfile(ffname):
426 os.remove(ffname)
427
428 except MicroprobeException:
429
430 # In case of exception during the generation of the microbenchmark,
431 # print the error, write the fail file and exit
432 print_error(traceback.format_exc())
433 open(ffname, 'a').close()
434 exit(-1)
435
436
437def _init_value():
438 """ Return a init value """
439 return 0b0101010101010101010101010101010101010101010101010101010101010101
440
441
442# Main
443if __name__ == '__main__':
444 # run main if executed from the command line
445 # and the main method exists
446
447 if callable(locals().get('main_setup')):
448 main_setup()
449 exit(0)
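The comment for Pass 6 in the listing describes a round-robin access pattern over 16 storage locations with a stride of 256 bytes. The following sketch reproduces only that arithmetic, to show where the sequence 0, 256, 512, ..., 3840, 0, 256, ... comes from. It says nothing about how SingleMemoryStreamPass is implemented internally, and stream_offsets is a hypothetical helper.

```python
def stream_offsets(num_locations, stride, count):
    """Return the first `count` byte offsets of a round-robin stream."""
    return [(i % num_locations) * stride for i in range(count)]


# 16 locations, 256 bytes apart: the pattern wraps after the 16th access.
offsets = stream_offsets(16, 256, 18)
print(offsets[:3], offsets[15], offsets[16:])
# prints: [0, 256, 512] 3840 [0, 256]
```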
The code is self-documented, and reading it is a good way to understand the basic concepts of code generation in Microprobe. To help the reader, we summarize and elaborate on the explanations in the code. The following are the suggested steps to implement a command line tool that generates microbenchmarks using Microprobe:
1. Define the command line interface and its parameters (main_setup() function in the example). This includes:
   - Create a command line interface object.
   - Define the parameters using the add_option interface.
   - Call the actual main with the arguments.
2. Define the function to process the input parameters (_main() function in the example). This includes:
   - Import the target definition.
   - Get the processed arguments.
   - Validate and use the arguments to call the actual microbenchmark generation function.
3. Define the function to generate the microbenchmark (_generate_benchmark function in the example). The main elements are the following:
   - Get the wrapper object. The wrapper object defines the general characteristics of the code being generated (i.e. how the internal representation will be translated to the final file being generated). General characteristics are, for instance, code prologs such as #include <header.h> directives, the main function declaration, epilogs, etc. In this case, the wrapper selected is CInfGen. This wrapper generates C code with an infinite loop of instructions. This results in the following code:

         #include <stdio.h>
         #include <string.h>

         // <declaration of variables>

         int main(int argc, char** argv, char** envp) {
             // <initialization_code>
             while(1) {
                 // <generated_code>
             } // end while
         }

     The user can subclass or define their own wrappers to fulfill their needs. See microprobe.code.wrapper.Wrapper for more details.
   - Instantiate the synthesizer. The benchmark synthesizer object is in charge of driving the code generation by applying the set of transformation passes defined by the user.
   - Define the transformation passes. The transformation passes will fill the <declaration of variables>, <initialization_code> and <generated_code> sections of the previous code block. Depending on the order and the type of passes applied, the generated code will be different. The user has plenty of transformation passes to apply. See microprobe.passes and all its submodules for further details. Also, the user can define their own passes by subclassing the class microprobe.passes.Pass.
   - Finally, once the generation policy is defined, the user only has to synthesize the benchmark and save it to a file.
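The relationship between the synthesizer, the passes and the wrapper can be illustrated with a toy model. The sketch below is not Microprobe code and does not use its API. ToySynthesizer, the three pass functions and render are hypothetical names, and the internal representation is reduced to a dictionary with two sections. It only shows the idea that passes are applied in the order they were added, each one transforming the representation, and that a wrapper-like step renders the result.

```python
class ToySynthesizer:
    """Apply passes in order to a dict with two code sections."""

    def __init__(self):
        self._passes = []

    def add_pass(self, pass_fn):
        self._passes.append(pass_fn)

    def synthesize(self):
        bench = {"init": [], "body": []}
        for pass_fn in self._passes:
            pass_fn(bench)
        return bench


def init_registers(value):
    def apply(bench):
        bench["init"].append("set all registers to %#x" % value)
    return apply


def building_block(size):
    def apply(bench):
        # Reserve 'size' instruction slots to be filled by later passes.
        bench["body"] = ["<unset>"] * size
    return apply


def fill_with_sequence(sequence):
    def apply(bench):
        # Repeat the sequence over the slots that already exist.
        bench["body"] = [sequence[i % len(sequence)]
                         for i in range(len(bench["body"]))]
    return apply


def render(bench):
    """Mimic the layout of an infinite-loop C wrapper, using comments only."""
    lines = ["int main(void) {"]
    lines += ["    // %s" % step for step in bench["init"]]
    lines.append("    while (1) {")
    lines += ["        // %s" % ins for ins in bench["body"]]
    lines += ["    }", "}"]
    return "\n".join(lines)


synth = ToySynthesizer()
synth.add_pass(init_registers(0x55))
synth.add_pass(building_block(4))
synth.add_pass(fill_with_sequence(["add", "mul"]))
print(render(synth.synthesize()))
```

In this toy model, adding fill_with_sequence before building_block leaves every slot unset, because the fill pass finds no slots to work on. This mirrors the remark that the order of the passes changes the generated code.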
power_v206_power7_ppc64_linux_gcc_fu_stress.py
The following example shows how to generate microbenchmarks that stress a particular functional unit of the architecture. The code is self-explanatory:
1#!/usr/bin/env python
2# Copyright 2011-2021 IBM Corporation
3#
4# Licensed under the Apache License, Version 2.0 (the "License");
5# you may not use this file except in compliance with the License.
6# You may obtain a copy of the License at
7#
8# http://www.apache.org/licenses/LICENSE-2.0
9#
10# Unless required by applicable law or agreed to in writing, software
11# distributed under the License is distributed on an "AS IS" BASIS,
12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13# See the License for the specific language governing permissions and
14# limitations under the License.
15"""
16power_v206_power7_ppc64_linux_gcc_fu_stress.py
17
18Example module to show how to generate a benchmark stressing a particular
19functional unit of the microarchitecture at different rate using the
20average latency of instructions as well as the average dependency distance
21between the instructions
22"""
23
24# Futures
25from __future__ import absolute_import
26
27# Built-in modules
28import os
29import sys
30import traceback
31import random
32
33# Own modules
34import microprobe.code.ins
35import microprobe.passes.address
36import microprobe.passes.branch
37import microprobe.passes.decimal
38import microprobe.passes.float
39import microprobe.passes.ilp
40import microprobe.passes.initialization
41import microprobe.passes.instruction
42import microprobe.passes.memory
43import microprobe.passes.register
44import microprobe.passes.structure
45import microprobe.utils.cmdline
46from microprobe.exceptions import MicroprobeException, \
47 MicroprobeTargetDefinitionError
48from microprobe.target import import_definition
49from microprobe.utils.cmdline import dict_key, existing_dir, \
50 float_type, int_type, print_error, print_info
51from microprobe.utils.logger import get_logger
52
53__author__ = "Ramon Bertran"
54__copyright__ = "Copyright 2011-2021 IBM Corporation"
55__credits__ = []
56__license__ = "IBM (c) 2011-2021 All rights reserved"
57__version__ = "0.5"
58__maintainer__ = "Ramon Bertran"
59__email__ = "rbertra@us.ibm.com"
60__status__ = "Development" # "Prototype", "Development", or "Production"
61
62# Constants
63LOG = get_logger(__name__) # Get the generic logging interface
64
65
66# Functions
67def main_setup():
68 """
69 Set up the command line interface (CLI) with the arguments required by
70 this command line tool.
71 """
72
73 args = sys.argv[1:]
74
75 # Get the target definition
76 try:
77 target = import_definition("power_v206-power7-ppc64_linux_gcc")
78 except MicroprobeTargetDefinitionError as exc:
79 print_error("Unable to import target definition")
80 print_error("Exception message: %s" % str(exc))
81 exit(-1)
82
83 func_units = {}
84 valid_units = [elem.name for elem in target.elements.values()]
85
86 for instr in target.isa.instructions.values():
87 if instr.execution_units == "None":
88 LOG.debug("Execution units for: '%s' not defined", instr.name)
89 continue
90
91 for unit in instr.execution_units:
92 if unit not in valid_units:
93 continue
94
95 if unit not in func_units:
96 func_units[unit] = [
97 elem for elem in target.elements.values()
98 if elem.name == unit
99 ][0]
100
101 # Create the CLI interface object
102 cmdline = microprobe.utils.cmdline.CLI("ISA power v206 profile example",
103 config_options=False,
104 target_options=False,
105 debug_options=False)
106
107 # Add the different parameters for this particular tool
108 cmdline.add_option("functional_unit",
109 "f", [func_units['ALU']],
110 "Functional units to stress. Default: ALU",
111 required=False,
112 nargs="+",
113 choices=func_units,
114 opt_type=dict_key(func_units),
115 metavar="FUNCTIONAL_UNIT_NAME")
116
117 cmdline.add_option(
118 "output_prefix",
119 None,
120 "POWER_V206_FU_STRESS",
121 "Output prefix of the generated files. Default: POWER_V206_FU_STRESS",
122 opt_type=str,
123 required=False,
124 metavar="PREFIX")
125
126 cmdline.add_option("output_path",
127 "O",
128 "./",
129 "Output path. Default: current path",
130 opt_type=existing_dir,
131 metavar="PATH")
132
133 cmdline.add_option(
134 "size",
135 "S",
136 64, "Benchmark size (number of instructions in the endless loop). "
137 "Default: 64 instructions",
138 opt_type=int_type(1, 2**20),
139 metavar="BENCHMARK_SIZE")
140
141 cmdline.add_option("dependency_distance",
142 "D",
143 1000,
144 "Average dependency distance between the instructions. "
145 "Default: 1000 (no dependencies)",
146 opt_type=int_type(1, 1000),
147 metavar="DEPENDECY_DISTANCE")
148
149 cmdline.add_option("average_latency",
150 "L",
151 2, "Average latency of the selected instructins. "
152 "Default: 2 cycles",
153 opt_type=float_type(1, 1000),
154 metavar="AVERAGE_LATENCY")
155
156 # Start the main
157 print_info("Processing input arguments...")
158 cmdline.main(args, _main)
159
160
161def _main(arguments):
162 """
163 Main program. Called after the arguments from the CLI interface have
164 been processed.
165 """
166
167 print_info("Arguments processed!")
168
169 print_info("Importing target definition "
170 "'power_v206-power7-ppc64_linux_gcc'...")
171 target = import_definition("power_v206-power7-ppc64_linux_gcc")
172
173 # Get the arguments
174 functional_units = arguments["functional_unit"]
175 prefix = arguments["output_prefix"]
176 output_path = arguments["output_path"]
177 size = arguments["size"]
178 latency = arguments["average_latency"]
179 distance = arguments["dependency_distance"]
180
181 if functional_units is None:
182 functional_units = ["ALL"]
183
184 _generate_benchmark(target, "%s/%s_" % (output_path, prefix),
185 (functional_units, size, latency, distance))
186
187
188def _generate_benchmark(target, output_prefix, args):
189 """
190 Actual benchmark generation policy. This is the function that defines
191 how the microbenchmarks are going to be generated
192 """
193
194 functional_units, size, latency, distance = args
195
196 try:
197
198 # Name of the output file
199 func_unit_names = [unit.name for unit in functional_units]
200 fname = "%s%s" % (output_prefix, "_".join(func_unit_names))
201 fname = "%s_LAT_%s" % (fname, latency)
202 fname = "%s_DEP_%s" % (fname, distance)
203
204 # Name of the fail output file (generated in case of exception)
205 ffname = "%s.c.fail" % (fname)
206
207 print_info("Generating %s ..." % (fname))
208
209 # Get the wrapper object. The wrapper object is in charge of
210 # translating the internal representation of the microbenchmark
211 # to the final output format.
212 #
213 # In this case, we obtain the 'CInfGen' wrapper, which embeds
214 # the generated code within an infinite loop using C plus
215 # in-line assembly statements.
216 cwrapper = microprobe.code.get_wrapper("CInfGen")
217
218 # Create the synthesizer object, which is in charge of driving the
219 # generation of the microbenchmark, given a set of passes
220 # (a.k.a. transformations) to apply to an empty internal
221 # representation of the microbenchmark
222 synth = microprobe.code.Synthesizer(target,
223 cwrapper(),
224 value=0b01010101)
225
226 rand = random.Random()
227 rand.seed(13)
228
229 # Add the transformation passes
230
231 #######################################################################
232 # Pass 1: Init integer registers to a given value #
233 #######################################################################
234 synth.add_pass(
235 microprobe.passes.initialization.InitializeRegistersPass(
236 value=_init_value()))
237
238 #######################################################################
239 # Pass 2: Add a building block of size 'size' #
240 #######################################################################
241 synth.add_pass(
242 microprobe.passes.structure.SimpleBuildingBlockPass(size))
243
244 #######################################################################
245 # Pass 3: Fill the building block with the instruction sequence #
246 #######################################################################
247 synth.add_pass(
248 microprobe.passes.instruction.SetInstructionTypeByElementPass(
249 target, functional_units, {}))
250
251 #######################################################################
252 # Pass 4: Compute addresses of instructions (this pass is needed to #
253 # update the internal representation information so that in #
254 # case addresses are required, they are up to date). #
255 #######################################################################
256 synth.add_pass(
257 microprobe.passes.address.UpdateInstructionAddressesPass())
258
259 #######################################################################
260 # Pass 5: Set target of branches to be the next instruction in the #
261 # instruction stream #
262 #######################################################################
263 synth.add_pass(microprobe.passes.branch.BranchNextPass())
264
265 #######################################################################
266 # Pass 6: Set memory-related operands to access 16 storage locations #
267 # in a round-robin fashion with a stride of 256 bytes. #
268 # The pattern would be: 0, 256, 512, ..., 3840, 0, 256, ... #
269 #######################################################################
270 synth.add_pass(microprobe.passes.memory.SingleMemoryStreamPass(
271 16, 256))
272
273 #######################################################################
274 # Pass 7.A: Initialize the storage locations accessed by floating #
275 # point instructions to have a valid floating point value #
276 #######################################################################
277 synth.add_pass(
278 microprobe.passes.float.InitializeMemoryFloatPass(
279 value=1.000000000000001))
280
281 #######################################################################
282 # Pass 7.B: Initialize the storage locations accessed by decimal #
283 # instructions to have a valid decimal value #
284 #######################################################################
285 synth.add_pass(
286 microprobe.passes.decimal.InitializeMemoryDecimalPass(value=1))
287
288 #######################################################################
289 # Pass 8: Set the remaining instruction operands (if not set) #
290 # (Required to set remaining immediate operands) #
291 #######################################################################
292 synth.add_pass(
293 microprobe.passes.register.DefaultRegisterAllocationPass(
294 rand,
295 dd=distance))
296
297 # Synthesize the microbenchmark. The synthesizer applies the set of
298 # transformation passes added before and returns an object representing
299 # the microbenchmark
300 bench = synth.synthesize()
301
302 # Save the microbenchmark to the file 'fname'
303 synth.save(fname, bench=bench)
304
305 print_info("%s generated!" % (fname))
306
307 # Remove fail file if exists
308 if os.path.isfile(ffname):
309 os.remove(ffname)
310
311 except MicroprobeException:
312
313 # In case of exception during the generation of the microbenchmark,
314 # print the error, write the fail file and exit
315 print_error(traceback.format_exc())
316 open(ffname, 'a').close()
317 exit(-1)
318
319
320def _init_value():
321 """ Return a init value """
322 return 0b0101010101010101010101010101010101010101010101010101010101010101
323
324
325# Main
326if __name__ == '__main__':
327 # run main if executed from the command line
328 # and the main method exists
329
330 if callable(locals().get('main_setup')):
331 main_setup()
332 exit(0)
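The access pattern described in the Pass 6 comment (16 storage locations visited round-robin with a 256-byte stride) can be written out in plain Python. This is only an illustration of the offsets named in that comment: it does not call Microprobe, and the helper name `stream_offsets` is hypothetical.

```python
from itertools import islice


def stream_offsets(locations=16, stride=256):
    """Yield byte offsets visited round-robin: 0, stride, 2*stride, ..."""
    index = 0
    while True:
        yield (index % locations) * stride
        index += 1


# First 18 accesses: the stream wraps back to offset 0 after 3840.
first = list(islice(stream_offsets(), 18))
print(first[:3], first[15], first[16:18])  # prints: [0, 256, 512] 3840 [0, 256]
```

The footprint touched by such a stream is locations * stride bytes (4096 bytes here), which is why the pattern repeats after the sixteenth access.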
power_v206_power7_ppc64_linux_gcc_memory.py
The following example shows how to create microbenchmarks with different activity (stress levels) at each level of the cache hierarchy. As the example shows, the built-in command line interface provided by Microprobe is optional; this script reads its single argument, the output directory, from sys.argv.
1#!/usr/bin/env python
2# Copyright 2011-2021 IBM Corporation
3#
4# Licensed under the Apache License, Version 2.0 (the "License");
5# you may not use this file except in compliance with the License.
6# You may obtain a copy of the License at
7#
8# http://www.apache.org/licenses/LICENSE-2.0
9#
10# Unless required by applicable law or agreed to in writing, software
11# distributed under the License is distributed on an "AS IS" BASIS,
12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13# See the License for the specific language governing permissions and
14# limitations under the License.
15"""
16power_v206_power7_ppc64_linux_gcc_memory.py
17
18Example python script to show how to generate microbenchmarks with particular
19levels of activity in the memory hierarchy.
20"""
21
22# Futures
23from __future__ import absolute_import
24
25# Built-in modules
26import multiprocessing as mp
27import os
28import random
29import sys
30from typing import List, Tuple
31
32# Own modules
33import microprobe.code
34import microprobe.passes.address
35import microprobe.passes.ilp
36import microprobe.passes.initialization
37import microprobe.passes.instruction
38import microprobe.passes.memory
39import microprobe.passes.register
40import microprobe.passes.structure
41from microprobe import MICROPROBE_RC
42from microprobe.exceptions import MicroprobeTargetDefinitionError
43from microprobe.model.memory import EndlessLoopDataMemoryModel
44from microprobe.target import import_definition
45from microprobe.target.isa.instruction import InstructionType
46from microprobe.target.uarch.cache import SetAssociativeCache
47from microprobe.utils.cmdline import print_error, print_info
48from microprobe.utils.typeguard_decorator import typeguard_testsuite
49
50__author__ = "Ramon Bertran"
51__copyright__ = "Copyright 2011-2021 IBM Corporation"
52__credits__ = []
53__license__ = "IBM (c) 2011-2021 All rights reserved"
54__version__ = "0.5"
55__maintainer__ = "Ramon Bertran"
56__email__ = "rbertra@us.ibm.com"
57__status__ = "Development" # "Prototype", "Development", or "Production"
58
59# Get the target definition
60try:
61 TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
62except MicroprobeTargetDefinitionError as exc:
63 print_error("Unable to import target definition")
64 print_error("Exception message: %s" % str(exc))
65 exit(-1)
66
67assert TARGET.microarchitecture is not None, \
68 "Target must have a defined microarchitecture"
69
70BASE_ELEMENT = [
71 element for element in TARGET.microarchitecture.elements.values()
72 if element.name == 'L1D'
73][0]
74CACHE_HIERARCHY: List[SetAssociativeCache] = \
75 TARGET.microarchitecture.cache_hierarchy.get_data_hierarchy_from_element(
76 BASE_ELEMENT)
77
78# Benchmark size
79BENCHMARK_SIZE = 8 * 1024
80
81# Fill a list of the models to be generated
82
83MEMORY_MODELS: List[Tuple[str, List[SetAssociativeCache], List[int]]] = []
84
85#
86# Due to performance issues (long exec. time) this
87# model is disabled
88#
89# MEMORY_MODELS.append(
90# (
91# "ALL", CACHE_HIERARCHY, [
92# 25, 25, 25, 25]))
93
94MEMORY_MODELS.append(("L1", CACHE_HIERARCHY, [100, 0, 0, 0]))
95MEMORY_MODELS.append(("L2", CACHE_HIERARCHY, [0, 100, 0, 0]))
96MEMORY_MODELS.append(("L3", CACHE_HIERARCHY, [0, 0, 100, 0]))
97MEMORY_MODELS.append(("L1L3", CACHE_HIERARCHY, [50, 0, 50, 0]))
98MEMORY_MODELS.append(("L1L2", CACHE_HIERARCHY, [50, 50, 0, 0]))
99MEMORY_MODELS.append(("L2L3", CACHE_HIERARCHY, [0, 50, 50, 0]))
100MEMORY_MODELS.append(("CACHES", CACHE_HIERARCHY, [33, 33, 34, 0]))
101MEMORY_MODELS.append(("MEM", CACHE_HIERARCHY, [0, 0, 0, 100]))
102
103# Enable parallel generation
104PARALLEL = False
105
106DIRECTORY = None
107
108
109@typeguard_testsuite
110def main():
111 """Main function. """
112 # call the generate method for each model in the memory model list
113
114 if PARALLEL:
115 print_info("Start parallel execution...")
116 pool = mp.Pool(processes=MICROPROBE_RC['cpus'])
117 pool.map(generate, MEMORY_MODELS, 1)
118 else:
119 print_info("Start sequential execution...")
120 list(map(generate, MEMORY_MODELS))
121
122 exit(0)
123
124
125@typeguard_testsuite
126def generate(model: Tuple[str, List[SetAssociativeCache], List[int]]):
127 """Benchmark generation policy function. """
128
129 assert DIRECTORY is not None, "DIRECTORY variable cannot be None"
130
131 print_info(f"Creating memory model '{model[0]}' ...")
132 memmodel = EndlessLoopDataMemoryModel(*model)
133
134 modelname = memmodel.name
135
136 print_info(f"Generating Benchmark mem-{modelname} ...")
137
138 # Get the architecture
139 garch = TARGET
140
141 # From all the supported instructions, select the memory operations
142 sequence: List[InstructionType] = []
143 for instr_name in sorted(garch.isa.instructions.keys()):
144
145 instr = garch.isa.instructions[instr_name]
146
147 if not instr.access_storage:
148 continue
149 if instr.privileged: # Skip privileged
150 continue
151 if instr.hypervisor: # Skip hypervisor
152 continue
153 if instr.trap: # Skip traps
154 continue
155 if "String" in instr.description: # Skip unsupported string instr.
156 continue
157 if "Multiple" in instr.description: # Skip unsupported mult. ld/sts
158 continue
159 if instr.category in ['LMA', 'LMV', 'DS', 'EC',
160 'WT']: # Skip unsupported categories
161 continue
162 if instr.access_storage_with_update: # Not supported by mem. model
163 continue
164 if "Reserve Indexed" in instr.description: # Skip (illegal instr.)
165 continue
166 if "Conditional Indexed" in instr.description: # Skip (illegal instr.)
167 continue
168 if instr.name in ['LD_V1', 'LWZ_V1', 'STW_V1']:
169 continue
170
171 sequence.append(instr)
172
173 # Get the loop wrapper. In this case we take the 'CInfPpc', which
174 # generates an infinite loop in C using PowerPC embedded assembly.
175 cwrapper = microprobe.code.get_wrapper("CInfPpc")
176
177 # Define function to return random numbers (used afterwards)
178 def rnd():
179 """Return a random value. """
180 return random.randrange(0, (1 << 64) - 1)
181
182 # Create the benchmark synthesizer
183 synth = microprobe.code.Synthesizer(garch, cwrapper())
184
185 rand = random.Random()
186 rand.seed(13)
187
188 ##########################################################################
189 # Add the passes we want to apply to synthesize benchmarks #
190 ##########################################################################
191
192 # --> Init registers to random values
193 synth.add_pass(
194 microprobe.passes.initialization.InitializeRegistersPass(value=rnd))
195
196 # --> Add a single basic block (four times larger for the 'MEM' model)
197 if memmodel.name in ['MEM']:
198 synth.add_pass(
199 microprobe.passes.structure.SimpleBuildingBlockPass(
200 BENCHMARK_SIZE * 4))
201 else:
202 synth.add_pass(
203 microprobe.passes.structure.SimpleBuildingBlockPass(
204 BENCHMARK_SIZE))
205
206 # --> Fill the basic block using the sequence of instructions provided
207 synth.add_pass(
208 microprobe.passes.instruction.SetInstructionTypeBySequencePass(
209 sequence))
210
211 # --> Set the memory operations parameters to fulfill the given model
212 synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(memmodel))
213
214 # --> Set the dependency distance and the default allocation. Sets the
215 # remaining undefined instruction operands (register allocation,...)
216 synth.add_pass(microprobe.passes.register.NoHazardsAllocationPass())
217 synth.add_pass(
218 microprobe.passes.register.DefaultRegisterAllocationPass(rand, dd=0))
219
220 # Generate the benchmark (applies the passes).
221 bench = synth.synthesize()
222
223 print_info(f"Benchmark mem-{modelname} saving to disk...")
224
225 # Save the benchmark
226 synth.save(f"{DIRECTORY}/mem-{modelname}", bench=bench)
227
228 print_info(f"Benchmark mem-{modelname} generated")
229 return True
230
231
232if __name__ == '__main__':
233 # run main if executed from the command line
234 # and the main method exists
235
236 if len(sys.argv) != 2:
237 print_info("Usage:")
238 print_info("%s output_dir" % (sys.argv[0]))
239 exit(-1)
240
241 DIRECTORY = sys.argv[1]
242
243 if not os.path.isdir(DIRECTORY):
244 print_error(f"Output directory '{DIRECTORY}' does not exist")
245 exit(-1)
246
247 main()
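Each entry in MEMORY_MODELS pairs a name with one percentage per level of the data hierarchy. The stand-alone check below, written without Microprobe, makes the implied constraint explicit: the weights must cover every level and add up to 100. The four-level order (L1, L2, L3, memory) is inferred from the model names in the listing, and the `check_model` helper is hypothetical.

```python
LEVELS = ["L1", "L2", "L3", "MEM"]  # assumed order, inferred from the listing

MODELS = [
    ("L1", [100, 0, 0, 0]),
    ("L1L3", [50, 0, 50, 0]),
    ("CACHES", [33, 33, 34, 0]),
    ("MEM", [0, 0, 0, 100]),
]


def check_model(name, weights, levels=LEVELS):
    """Return the weights keyed by level, or raise ValueError if invalid."""
    if len(weights) != len(levels):
        raise ValueError(f"{name}: expected {len(levels)} weights")
    if any(w < 0 for w in weights) or sum(weights) != 100:
        raise ValueError(f"{name}: weights must be >= 0 and sum to 100")
    return dict(zip(levels, weights))


checked = {name: check_model(name, weights) for name, weights in MODELS}
print(checked["CACHES"])
```

Running a check like this before handing the tuples to the memory model would surface a mistyped weight list early, instead of during synthesis.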
power_v206_power7_ppc64_linux_gcc_random.py
The following example generates random microbenchmarks:
1#!/usr/bin/env python
2# Copyright 2011-2021 IBM Corporation
3#
4# Licensed under the Apache License, Version 2.0 (the "License");
5# you may not use this file except in compliance with the License.
6# You may obtain a copy of the License at
7#
8# http://www.apache.org/licenses/LICENSE-2.0
9#
10# Unless required by applicable law or agreed to in writing, software
11# distributed under the License is distributed on an "AS IS" BASIS,
12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13# See the License for the specific language governing permissions and
14# limitations under the License.
15"""
16power_v206_power7_ppc64_linux_gcc_random.py
17
18Example python script to show how to generate random microbenchmarks.
19"""
20
21# Futures
22from __future__ import absolute_import
23
24# Built-in modules
25import multiprocessing as mp
26import os
27import random
28import sys
29from typing import List
30
31# Own modules
32import microprobe.code
33import microprobe.passes.address
34import microprobe.passes.branch
35import microprobe.passes.ilp
36import microprobe.passes.initialization
37import microprobe.passes.instruction
38import microprobe.passes.memory
39import microprobe.passes.register
40import microprobe.passes.structure
41from microprobe import MICROPROBE_RC
42from microprobe.exceptions import MicroprobeError, \
43 MicroprobeTargetDefinitionError
44from microprobe.model.memory import EndlessLoopDataMemoryModel
45from microprobe.target import import_definition
46from microprobe.target.isa.instruction import InstructionType
47from microprobe.utils.cmdline import print_error, print_info
48from microprobe.utils.typeguard_decorator import typeguard_testsuite
49
50__author__ = "Ramon Bertran"
51__copyright__ = "Copyright 2011-2021 IBM Corporation"
52__credits__ = []
53__license__ = "IBM (c) 2011-2021 All rights reserved"
54__version__ = "0.5"
55__maintainer__ = "Ramon Bertran"
56__email__ = "rbertra@us.ibm.com"
57__status__ = "Development" # "Prototype", "Development", or "Production"
58
59# Benchmark size
60BENCHMARK_SIZE = 8 * 1024
61
62# Get the target definition
63try:
64 TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
65except MicroprobeTargetDefinitionError as exc:
66 print_error("Unable to import target definition")
67 print_error("Exception message: %s" % str(exc))
68 exit(-1)
69
70assert TARGET.microarchitecture is not None, \
71 "Target must have a defined microarchitecture"
72BASE_ELEMENT = [
73 element for element in TARGET.microarchitecture.elements.values()
74 if element.name == 'L1D'
75][0]
76CACHE_HIERARCHY = \
77 TARGET.microarchitecture.cache_hierarchy.get_data_hierarchy_from_element(
78 BASE_ELEMENT)
79
80PARALLEL = True
81
82DIRECTORY = None
83
84
85@typeguard_testsuite
86def main():
87 """ Main program. """
88 if PARALLEL:
89 pool = mp.Pool(processes=MICROPROBE_RC['cpus'])
90 pool.map(generate, list(range(0, 100)), 1)
91 else:
92 list(map(generate, list(range(0, 100))))
93
94
95@typeguard_testsuite
96def generate(name: int):
97 """ Benchmark generation policy. """
98
99 assert DIRECTORY is not None, "DIRECTORY variable cannot be None"
100
101 if os.path.isfile(f"{DIRECTORY}/random-{name}.c"):
102 print_info(f"Skip {name}")
103 return
104
105 print_info(f"Generating {name}...")
106
107 # Seed the randomness
108 rand = random.Random()
109 rand.seed(64) # My favorite number ;)
110
111 # Generate a random memory model (used afterwards)
112 model: List[int] = []
113 total = 100
114 for mcomp in CACHE_HIERARCHY[0:-1]:
115 weight = rand.randint(0, total)
116 model.append(weight)
117 print_info("%s: %d%%" % (mcomp, weight))
118 total = total - weight
119
120 # Fix remaining
121 level = rand.randint(0, len(CACHE_HIERARCHY[0:-1]) - 1)
122 model[level] += total
123
124 # Last level always zero
125 model.append(0)
126
127 # Sanity check
128 psum = 0
129 for elem in model:
130 psum += elem
131 assert psum == 100
132
133 modelobj = EndlessLoopDataMemoryModel(f"random-{name}", CACHE_HIERARCHY, model)
134
135 # Get the loop wrapper. In this case we take the 'CInfPpc', which
136 # generates an infinite loop in C using PowerPC embedded assembly.
137 cwrapper = microprobe.code.get_wrapper("CInfPpc")
138
139 # Define function to return random numbers (used afterwards)
140 def rnd():
141 """Return a random value. """
142 return rand.randrange(0, (1 << 64) - 1)
143
144 # Create the benchmark synthesizer
145 synth = microprobe.code.Synthesizer(TARGET, cwrapper())
146
147 ##########################################################################
148 # Add the passes we want to apply to synthesize benchmarks #
149 ##########################################################################
150
151 # --> Init registers to random values
152 synth.add_pass(
153 microprobe.passes.initialization.InitializeRegistersPass(value=rnd))
154
155 # --> Add a single basic block of size BENCHMARK_SIZE
156 synth.add_pass(
157 microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
158
159 # --> Fill the basic block with instructions picked randomly from the list
160 # provided
161
162 instructions: List[InstructionType] = []
163 for instr in TARGET.isa.instructions.values():
164
165 if instr.privileged: # Skip privileged
166 continue
167 if instr.hypervisor: # Skip hypervisor
168 continue
169 if instr.trap: # Skip traps
170 continue
171 if instr.syscall: # Skip syscall
172 continue
173 if "String" in instr.description: # Skip unsupported string instr.
174 continue
175 if "Multiple" in instr.description: # Skip unsupported mult. ld/sts
176 continue
177 if instr.category in ['LMA', 'LMV', 'DS', 'EC',
178 'WT']: # Skip unsupported categories
179 continue
180 if instr.access_storage_with_update: # Not supported by mem. model
181 continue
182 if instr.branch and not instr.branch_relative: # Skip branches
183 continue
184 if "Reserve Indexed" in instr.description: # Skip (illegal instr.)
185 continue
186 if "Conditional Indexed" in instr.description: # Skip (illegal instr.)
187 continue
188 if instr.name in [
189 'LD_V1',
190 'LWZ_V1',
191 'STW_V1',
192 ]:
193 continue
194
195 instructions.append(instr)
196
197 synth.add_pass(
198 microprobe.passes.instruction.SetRandomInstructionTypePass(
199 instructions, rand))
200
201 # --> Set the memory operations parameters to fulfill the given model
202 synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(modelobj))
203
204 # --> Set target of branches to next instruction (first compute addresses)
205 synth.add_pass(microprobe.passes.address.UpdateInstructionAddressesPass())
206 synth.add_pass(microprobe.passes.branch.BranchNextPass())
207
208 # --> Set the dependency distance and the default allocation. Dependency
209 # distance is randomly picked
210 synth.add_pass(
211 microprobe.passes.register.DefaultRegisterAllocationPass(
212 rand, dd=rand.randint(1, 20)))
213
214 # Generate the benchmark (applies the passes)
215 # Since the code is randomly generated, the generation might fail
216 # (e.g. too few memory accesses to fulfill the requested memory model).
217 # Because of that, we handle the exception accordingly.
218 try:
219 print_info(f"Synthesizing {name}...")
220 bench = synth.synthesize()
221 print_info(f"Synthesized {name}!")
222 # Save the benchmark
223 synth.save(f"{DIRECTORY}/random-{name}", bench=bench)
224 except MicroprobeError:
225 print_info(f"Synthesizing error in '{name}'. This is Ok.")
226
227 return True
228
229
230if __name__ == '__main__':
231 # run main if executed from the command line
232 # and the main method exists
233
234 if len(sys.argv) != 2:
235 print_info("Usage:")
236 print_info("%s output_dir" % (sys.argv[0]))
237 exit(-1)
238
239 DIRECTORY = sys.argv[1]
240
241 if not os.path.isdir(DIRECTORY):
242 print_error(f"Output directory '{DIRECTORY}' does not exist")
243 exit(-1)
244
245 if callable(locals().get('main')):
246 main()
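The random memory model built inside generate() can be isolated as a small function, which makes its invariants easier to see: the weights are non-negative integers, they sum to 100, and the last level always gets zero. The sketch below uses only the standard library. The `random_model` name is hypothetical, and seeding one generator per benchmark index is an assumption made here so that the models differ; the listing itself seeds every call with the same constant.

```python
import random


def random_model(levels, rand):
    """Return `levels` integer percentages summing to 100, last one zero."""
    model = []
    total = 100
    # Draw a weight for every level except the last from what is left.
    for _ in range(levels - 1):
        weight = rand.randint(0, total)
        model.append(weight)
        total -= weight
    # Give any remainder to one randomly chosen level.
    model[rand.randint(0, levels - 2)] += total
    # The last level (main memory in the listing) always gets zero.
    model.append(0)
    return model


# Assumption: one seed per benchmark index, so each model can differ.
models = [random_model(4, random.Random(index)) for index in range(3)]
for index, model in enumerate(models):
    print(index, model)
```

Because each draw is taken from the remaining budget, earlier levels tend to receive larger weights than later ones; the final redistribution step only guarantees that the total reaches 100.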
power_v206_power7_ppc64_linux_gcc_custom.py
The following script contains five examples of how to customize the generation of microbenchmarks:
1#!/usr/bin/env python
2# Copyright 2011-2021 IBM Corporation
3#
4# Licensed under the Apache License, Version 2.0 (the "License");
5# you may not use this file except in compliance with the License.
6# You may obtain a copy of the License at
7#
8# http://www.apache.org/licenses/LICENSE-2.0
9#
10# Unless required by applicable law or agreed to in writing, software
11# distributed under the License is distributed on an "AS IS" BASIS,
12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13# See the License for the specific language governing permissions and
14# limitations under the License.
15"""
16power_v206_power7_ppc64_linux_gcc_custom.py
17
18Example python script to show how to customize the generation of microbenchmarks.
19"""
20
21# Futures
22from __future__ import absolute_import
23
24# Built-in modules
25import os
26import sys
27import random
28
29# Own modules
30import microprobe.code
31import microprobe.passes.initialization
32import microprobe.passes.instruction
33import microprobe.passes.memory
34import microprobe.passes.register
35import microprobe.passes.structure
36from microprobe.exceptions import MicroprobeTargetDefinitionError
37from microprobe.model.memory import EndlessLoopDataMemoryModel
38from microprobe.target import import_definition
39from microprobe.utils.cmdline import print_error, print_info
40from microprobe.utils.misc import RNDINT
41
42__author__ = "Ramon Bertran"
43__copyright__ = "Copyright 2011-2021 IBM Corporation"
44__credits__ = []
45__license__ = "IBM (c) 2011-2021 All rights reserved"
46__version__ = "0.5"
47__maintainer__ = "Ramon Bertran"
48__email__ = "rbertra@us.ibm.com"
49__status__ = "Development" # "Prototype", "Development", or "Production"
50
51# Benchmark size
52BENCHMARK_SIZE = 8 * 1024
53
54if len(sys.argv) != 2:
55 print_info("Usage:")
56 print_info("%s output_dir" % (sys.argv[0]))
57 exit(-1)
58
59DIRECTORY = sys.argv[1]
60
61if not os.path.isdir(DIRECTORY):
62 print_error("Output directory '%s' does not exist" % (DIRECTORY))
63 exit(-1)
64
65# Get the target definition
66try:
67 TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
68except MicroprobeTargetDefinitionError as exc:
69 print_error("Unable to import target definition")
70 print_error("Exception message: %s" % str(exc))
71 exit(-1)
72
73
74###############################################################################
75# Example 1: loop with load instructions accessing storage, hitting the #
76# first level of cache and with dependency distance of 3 #
77###############################################################################
78def example_1():
79 """ Example 1 """
80 name = "L1-LOADS"
81
82 base_element = [
83 element for element in TARGET.elements.values()
84 if element.name == 'L1D'
85 ][0]
86 cache_hierarchy = TARGET.cache_hierarchy.get_data_hierarchy_from_element(
87 base_element)
88
89 model = [0] * len(cache_hierarchy)
90 model[0] = 100
91
92 mmodel = EndlessLoopDataMemoryModel(name, cache_hierarchy, model)
93
94 profile = {}
95 for instr_name in sorted(TARGET.instructions.keys()):
96 instr = TARGET.instructions[instr_name]
97 if not instr.access_storage:
98 continue
99 if instr.privileged: # Skip privileged
100 continue
101 if instr.hypervisor: # Skip hypervisor
102 continue
103 if "String" in instr.description: # Skip unsupported string instr.
104 continue
105 if "ultiple" in instr.description: # Skip unsupported mult. ld/sts
106 continue
107 if instr.category in ['DS', 'LMA', 'LMV',
108 'EC']: # Skip unsupported categories
109 continue
110 if instr.access_storage_with_update: # Not supported
111 continue
112
113 if instr.name in [
114 'LD_V1',
115 'LWZ_V1',
116 'STW_V1',
117 ]:
118 continue
119
120 if (any([moper.is_load for moper in instr.memory_operand_descriptors])
121 and all([
122 not moper.is_store
123 for moper in instr.memory_operand_descriptors
124 ])):
125 profile[instr] = 1
126
127 rand = random.Random()
128 rand.seed(13)
129
130 cwrapper = microprobe.code.get_wrapper("CInfPpc")
131 synth = microprobe.code.Synthesizer(TARGET, cwrapper())
132
133 synth.add_pass(
134 microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
135 synth.add_pass(
136 microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
137 synth.add_pass(
138 microprobe.passes.initialization.InitializeRegisterPass("GPR1",
139 0,
140 force=True,
141 reserve=True))
142 synth.add_pass(
143 microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile))
144 synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(mmodel))
145 synth.add_pass(
146 microprobe.passes.register.DefaultRegisterAllocationPass(rand, dd=3))
147
148 print_info("Generating %s..." % name)
149 bench = synth.synthesize()
150 print_info("%s Generated!" % name)
151 synth.save("%s/%s" % (DIRECTORY, name), bench=bench) # Save the benchmark
152
153
154###############################################################################
155# Example 2: loop with instructions using the MUL unit and with dependency #
156# distance of 4 #
157###############################################################################
158def example_2():
159 """ Example 2 """
160 name = "FXU-MUL"
161
162 cwrapper = microprobe.code.get_wrapper("CInfPpc")
163 synth = microprobe.code.Synthesizer(TARGET, cwrapper())
164
165 rand = random.Random()
166 rand.seed(13)
167
168 synth.add_pass(
169 microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
170 synth.add_pass(
171 microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
172 synth.add_pass(
173 microprobe.passes.instruction.SetInstructionTypeByElementPass(
174 TARGET, [TARGET.elements['MUL_FXU0_Core0_SCM_Processor']], {}))
175 synth.add_pass(
176 microprobe.passes.register.DefaultRegisterAllocationPass(rand, dd=4))
177
178 print_info("Generating %s..." % name)
179 bench = synth.synthesize()
180 print_info("%s Generated!" % name)
181 synth.save("%s/%s" % (DIRECTORY, name), bench=bench) # Save the benchmark
182
183
184###############################################################################
185# Example 3: loop with instructions using the ALU unit and with dependency #
186# distance of 1 #
187###############################################################################
188def example_3():
189 """ Example 3 """
190 name = "FXU-ALU"
191
192 cwrapper = microprobe.code.get_wrapper("CInfPpc")
193 synth = microprobe.code.Synthesizer(TARGET, cwrapper())
194
195 rand = random.Random()
196 rand.seed(13)
197
198 synth.add_pass(
199 microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
200 synth.add_pass(
201 microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
202 synth.add_pass(
203 microprobe.passes.instruction.SetInstructionTypeByElementPass(
204 TARGET, [TARGET.elements['ALU_FXU0_Core0_SCM_Processor']], {}))
205 synth.add_pass(
206 microprobe.passes.register.DefaultRegisterAllocationPass(rand, dd=1))
207
208 print_info("Generating %s..." % name)
209 bench = synth.synthesize()
210 print_info("%s Generated!" % name)
211 synth.save("%s/%s" % (DIRECTORY, name), bench=bench) # Save the benchmark
212
213
214###############################################################################
215# Example 4: loop with FMUL* instructions with different weights and with #
216# dependency distance 10 #
217###############################################################################
218def example_4():
219 """ Example 4 """
220 name = "VSU-FMUL"
221
222 profile = {}
223 profile[TARGET.instructions['FMUL_V0']] = 4
224 profile[TARGET.instructions['FMULS_V0']] = 3
    profile[TARGET.instructions['FMULx_V0']] = 2
    profile[TARGET.instructions['FMULSx_V0']] = 1

    cwrapper = microprobe.code.get_wrapper("CInfPpc")
    synth = microprobe.code.Synthesizer(TARGET, cwrapper())

    rand = random.Random()
    rand.seed(13)

    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
    synth.add_pass(
        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
    synth.add_pass(
        microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile))
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(rand, dd=10))

    print_info("Generating %s..." % name)
    bench = synth.synthesize()
    print_info("%s Generated!" % name)
    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark


###############################################################################
# Example 5: loop with FADD* instructions with different weights and with     #
# dependency distance 1                                                       #
###############################################################################
def example_5():
    """ Example 5 """
    name = "VSU-FADD"

    profile = {}
    profile[TARGET.instructions['FADD_V0']] = 100
    profile[TARGET.instructions['FADDx_V0']] = 1
    profile[TARGET.instructions['FADDS_V0']] = 10
    profile[TARGET.instructions['FADDSx_V0']] = 1

    cwrapper = microprobe.code.get_wrapper("CInfPpc")
    synth = microprobe.code.Synthesizer(TARGET, cwrapper())

    rand = random.Random()
    rand.seed(13)

    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
    synth.add_pass(
        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
    synth.add_pass(
        microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile))
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(rand, dd=1))

    print_info("Generating %s..." % name)
    bench = synth.synthesize()
    print_info("%s Generated!" % name)
    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark


###############################################################################
# Call the examples                                                           #
###############################################################################
example_1()
example_2()
example_3()
example_4()
example_5()
exit(0)
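Example 5 weights the four FADD variants 100:1:10:1. Assuming the profile pass distributes instructions roughly in proportion to those weights (an assumption about SetInstructionTypeByProfilePass, not something the listing states), the expected share of each instruction can be worked out in plain Python. The helper name expected_shares is hypothetical and the snippet does not call Microprobe.

```python
def expected_shares(profile):
    """Normalize integer weights into fractions that sum to 1."""
    total = sum(profile.values())
    return {name: weight / total for name, weight in profile.items()}


# Weights copied from example_5; plain strings stand in for the
# TARGET.instructions[...] objects used in the real script.
fadd_profile = {
    "FADD_V0": 100,
    "FADDx_V0": 1,
    "FADDS_V0": 10,
    "FADDSx_V0": 1,
}

shares = expected_shares(fadd_profile)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under that assumption, FADD_V0 would account for about 89% of the loop body, FADDS_V0 for about 9%, and each record-form variant for under 1%.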
power_v206_power7_ppc64_linux_gcc_genetic.py
Deprecated since version 0.5: Support for the PyEvolve and genetic algorithm based searches has been discontinued.
The following example shows how to use the design exploration module and the genetic algorithm based searches to look for a solution. In particular, for each functional unit of the architecture and a range of IPCs (instructions per cycle), the example looks for a solution that stresses that functional unit at the given IPC. External commands (not included) are needed to evaluate the generated microbenchmarks on the target platform.
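As a rough illustration of that sweep, the sketch below enumerates the (component, IPC) pairs the script would search. It is plain Python and does not call Microprobe. The component names, the IPC range, and the skip rule (IPC above 2 for a single stressed unit, above 4 for two) are taken from the script listed on this page; stressed_units and search_points are hypothetical helper names.

```python
# Component names copied from the example script.
COMPONENTS = ["FXU", "FXU-noLSU", "FXU-LSU", "VSU", "VSU-FXU"]


def ipc_targets():
    """IPC targets 0.1 .. 4.0, rotated so that the sweep starts at 0.6."""
    ipcs = [float(x) / 10 for x in range(1, 41)]
    return ipcs[5:] + ipcs[:5]


def stressed_units(compname):
    """Count the units a name selects, following the script's if/elif chain."""
    count = 0
    if "FXU" in compname:
        count += 1
    if "VSU" in compname:
        count += 1
    if count == 2:
        return count
    if "noLSU" in compname:
        return count  # the LSU is blocked here, not stressed
    if "LSU" in compname:
        count += 1
    return count


def search_points():
    """List every (component, IPC) pair that is not skipped."""
    limits = {1: 2, 2: 4}  # highest IPC searched per number of stressed units
    points = []
    for name in COMPONENTS:
        limit = limits.get(stressed_units(name))
        for ipc in ipc_targets():
            if limit is not None and ipc > limit:
                continue
            points.append((name, ipc))
    return points


print(len(search_points()))  # 140
```

Each of these points triggers one genetic search, so the evaluation command runs many times per point. The full example script follows.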
#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
power_v206_power7_ppc64_linux_gcc_genetic.py

Example python script to show how to generate a set of microbenchmarks
stressing a particular unit at different IPC ratios, using a genetic
search algorithm to play with two knobs: average latency and dependency
distance.

An IPC evaluation and scoring script is required. For instance:

.. code:: bash

    #!/bin/bash
    # ARGS: $1 is the target IPC
    #       $2 is the name of the generated benchmark
    target_ipc=$1
    source_bench=$2

    # Compile the benchmark
    gcc -O0 -mcpu=power7 -mtune=power7 -std=c99 $source_bench.c -o $source_bench

    # Evaluate the ipc
    ipc=< your preferred commands to evaluate the IPC >

    # Compute the score (the closer to the target IPC, the higher the score)
    score=$(echo "(1/($ipc-$target_ipc))^2" | bc -l)

    echo $score

Use the script above as a template for your own GA-based search.
"""

# Futures
from __future__ import absolute_import, division

# Built-in modules
import datetime
import os
import sys
import time as runtime
from typing import List, Tuple
import random

# Own modules
import microprobe.code
import microprobe.driver.genetic
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.register
import microprobe.passes.structure
from microprobe.exceptions import MicroprobeTargetDefinitionError
from microprobe.target import import_definition
from microprobe.utils.cmdline import print_error, print_info, print_warning
from microprobe.utils.misc import RNDINT
from microprobe.utils.typeguard_decorator import typeguard_testsuite

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Benchmark size
BENCHMARK_SIZE = 20

COMMAND = None
DIRECTORY = None

# Get the target definition
try:
    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
except MicroprobeTargetDefinitionError as exc:
    print_error("Unable to import target definition")
    print_error("Exception message: %s" % str(exc))
    exit(-1)


@typeguard_testsuite
def main():
    """Main function."""

    component_list = ["FXU", "FXU-noLSU", "FXU-LSU", "VSU", "VSU-FXU"]
    ipcs = [float(x) / 10 for x in range(1, 41)]
    ipcs = ipcs[5:] + ipcs[:5]

    for name in component_list:
        for ipc in ipcs:
            generate_genetic(name, ipc)


@typeguard_testsuite
def generate_genetic(compname: str, ipc: float):
    """Generate a microbenchmark stressing compname at the given ipc."""

    assert COMMAND is not None, "COMMAND variable cannot be None"
    assert DIRECTORY is not None, "DIRECTORY variable cannot be None"

    comps = []
    bcomps = []
    any_comp: bool = False

    assert TARGET.microarchitecture is not None, \
        "Target must have a defined microarchitecture"

    if compname.find("FXU") >= 0:
        comps.append(
            TARGET.microarchitecture.elements["FXU0_Core0_SCM_Processor"])

    if compname.find("VSU") >= 0:
        comps.append(
            TARGET.microarchitecture.elements["VSU0_Core0_SCM_Processor"])

    if len(comps) == 2:
        any_comp = True
    elif compname.find("noLSU") >= 0:
        bcomps.append(
            TARGET.microarchitecture.elements["LSU0_Core0_SCM_Processor"])
    elif compname.find("LSU") >= 0:
        comps.append(
            TARGET.microarchitecture.elements["LSU_Core0_SCM_Processor"])

    # Skip IPC targets that are out of range for the number of stressed units
    if (len(comps) == 1 and ipc > 2) or (len(comps) == 2 and ipc > 4):
        return True

    for elem in os.listdir(DIRECTORY):
        if not elem.endswith(".c"):
            continue
        if elem.startswith("%s:IPC:%.2f:DIST" % (compname, ipc)):
            print_info("Already generated: %s %.2f" % (compname, ipc))
            return True

    print_info(f"Going for IPC: {ipc} and Element: {compname}")

    def generate(name: str, dist: float, latency: float):
        """Benchmark generation function.

        First argument is name, second the dependency distance and the
        third is the average instruction latency.
        """
        wrapper = microprobe.code.get_wrapper("CInfPpc")
        synth = microprobe.code.Synthesizer(TARGET, wrapper())
        rand = random.Random()
        rand.seed(13)
        synth.add_pass(
            microprobe.passes.initialization.InitializeRegistersPass(
                value=RNDINT))
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(
                BENCHMARK_SIZE))
        synth.add_pass(
            microprobe.passes.instruction.SetInstructionTypeByElementPass(
                TARGET,
                comps, {},
                block=bcomps,
                avelatency=latency,
                any_comp=any_comp))
        synth.add_pass(
            microprobe.passes.register.DefaultRegisterAllocationPass(
                rand, dd=dist))
        bench = synth.synthesize()
        synth.save(name, bench=bench)

    # Set the genetic algorithm parameters
    ga_params: List[Tuple[int, int, float]] = []
    ga_params.append((0, 20, 0.05))  # Average dependency distance design space
    ga_params.append((2, 8, 0.05))  # Average instruction latency design space

    # Set up the search driver
    driver = microprobe.driver.genetic.ExecCmdDriver(
        generate, 20, 30, 30, f"'{COMMAND}' {ipc} ", ga_params)

    starttime = runtime.time()
    print_info("Start search...")
    driver.run(1)
    print_info("Search end")
    endtime = runtime.time()

    print_info("Genetic time::"
               f"{datetime.timedelta(seconds=endtime - starttime)}")

    # Check if we found a solution
    ga_sol_params: Tuple[float, float] = driver.solution()
    score = driver.score()

    print_info(f"IPC found: {ipc}, score: {score}")

    if score < 20:
        print_warning(f"Unable to find an optimal solution with IPC: {ipc}:")
        print_info("Generating the closest solution...")
        generate(
            f"{DIRECTORY}/{compname}:IPC:{ipc:.2f}:"
            f"DIST:{ga_sol_params[0]:.2f}:LAT:{ga_sol_params[1]:.2f}-check",
            ga_sol_params[0], ga_sol_params[1])
        print_info("Closest solution generated")
    else:
        print_info("Solution found for %s and IPC %f -> dist: %f , "
                   "latency: %f " %
                   (compname, ipc, ga_sol_params[0], ga_sol_params[1]))
        print_info("Generating solution...")
        generate(
            f"{DIRECTORY}/{compname}:IPC:{ipc:.2f}:"
            f"DIST:{ga_sol_params[0]:.2f}:LAT:{ga_sol_params[1]:.2f}",
            ga_sol_params[0], ga_sol_params[1])
        print_info("Solution generated")
    return True


if __name__ == '__main__':
    # run main if executed from the COMMAND line
    # and the main method exists

    if len(sys.argv) != 3:
        print_info("Usage:")
        print_info("%s output_dir eval_cmd" % (sys.argv[0]))
        print_info("")
        print_info("Output dir: output directory for the generated benchmarks")
        print_info("eval_cmd: command accepting 2 parameters: the target IPC")
        print_info("          and the filename of the generated benchmark. ")
        print_info("          Output: the score used for the GA search. E.g.")
        print_info("          the closer the IPC of the generated benchmark to")
        print_info("          the target IPC, the cmd should give a higher ")
        print_info("          score. ")
        exit(-1)

    DIRECTORY = sys.argv[1]
    COMMAND = sys.argv[2]

    if not os.path.isdir(DIRECTORY):
        print_info("Output DIRECTORY '%s' does not exist" % (DIRECTORY))
        exit(-1)

    if not os.path.isfile(COMMAND):
        print_info("The COMMAND '%s' does not exist" % (COMMAND))
        exit(-1)

    if callable(locals().get('main')):
        main()
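The scoring rule suggested in the script's docstring, the inverse squared distance between measured and target IPC, can also be sketched in Python. This is a minimal sketch under stated assumptions, not part of Microprobe: the function names and the max_score cap (added so that an exact match does not divide by zero) are hypothetical, and measuring the IPC on the target machine is left out because it depends on your own tooling.

```python
import math


def ipc_score(measured_ipc, target_ipc, max_score=1e6):
    """Inverse squared distance to the target IPC, capped at max_score.

    The cap is an assumption of this sketch; it only exists to avoid a
    division by zero when the measured IPC hits the target exactly.
    """
    diff = measured_ipc - target_ipc
    if diff == 0:
        return max_score
    return min((1.0 / diff) ** 2, max_score)


def meets_threshold(measured_ipc, target_ipc, threshold=20.0):
    """Mirror the script's `score < 20` check: True means 'solution found'."""
    return ipc_score(measured_ipc, target_ipc) >= threshold


# Largest IPC error that still reaches a score of 20 with this formula.
max_error = 1.0 / math.sqrt(20.0)
print(f"tolerated IPC error: {max_error:.3f}")
```

With this formula, the threshold of 20 used in the script corresponds to a measured IPC within roughly 0.22 of the target. A different scoring function would need a different threshold.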