
SPECaccel Frequently Asked Questions (FAQ)

This document has frequently asked technical questions and answers. The latest version of this document may be found at http://www.spec.org/accel2023/Docs/faq.html.

If you are looking for the list of known problems with SPECaccel, please see http://www.spec.org/accel2023/Docs/errata.html.

Contents

Requirements

Require.01 How much memory do I need?

Require.02 Does this work with Windows?

Require.03 What software do I need?

Installation

Install.01 ./install.sh: /bin/sh: bad interpreter: Permission denied

Install.02 The DVD drive is on system A, but I want to install on system B. What do I do?

Install.03 Do I need to be root?

runaccel

runaccel.01 Can't locate strict.pm

runaccel.02 specperl: bad interpreter: No such file or directory

runaccel.03 Do I need to be root?

Building benchmarks

Build.01 Why is it rebuilding the benchmarks?

Setting up

Setup.01 hash doesn't match after copy

Setup.02 Copying executable failed

Running benchmarks

Run.01 Why does this benchmark take so long to run?

Run.02 Why was there this cryptic message from the operating system?

Run.03 What happens with the compilation of accelerator code?

Run.04 Can I run on a 32-bit system?

Run.05 My runtimes vary quite a lot. Is there a way to fix it?

Run.06 How do I run on a particular device?

Run.07 How do I know what devices are available?

Run.08 Benchmark 470.bt keeps failing on me with a runtime error.

Miscompares

Miscompare.01 I got a message about a miscompare

Miscompare.02 The benchmark took less than 1 second

Miscompare.03 The .mis file says "short"

Miscompare.04 My compiler is generating bad code!

Miscompare.05 The code is bad even with low optimization!

Miscompare.06 The .mis file is just a bunch of numbers.

Results reporting

Results.01 It's hard to cut/paste into my spreadsheet

Results.02 What is a "flags file"? What does Unknown Flags mean?

Results.03 Submission Check -> FAILED

Results.04 Why does the report have an (*) that says ...

Power

Power.01 What is the power component of SPECaccel?

Requirements

Require.01 q. How much memory do I need?

The system requirements may be found in the document system-requirements.html. Currently, the minimum is 16 GB of accelerator memory and 16 GB of host memory.
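As a rough check of the host side, the sketch below prints the memory that a Linux kernel reports in /proc/meminfo, converted to GiB. It assumes a Linux host; the helper name kb_to_gib is illustrative and not part of SPECaccel. The reported value is usually a little lower than the installed amount, so treat it as a sanity check only. Accelerator memory has to be read with your vendor's own tool.

```shell
# Convert a size in KiB (the unit used by /proc/meminfo) to GiB, one decimal place.
kb_to_gib() {
  awk -v kb="$1" 'BEGIN { printf "%.1f\n", kb / 1048576 }'
}

# Linux only: report the host memory that the kernel sees.
if [ -r /proc/meminfo ]; then
  host_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
  echo "Host memory: $(kb_to_gib "$host_kb") GiB"
fi
```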

Require.02 q. Does this work with Windows?

The SPECaccel suite has been tested on a number of platforms, but Windows is not one of them. Because this benchmark shares components with the SPEC CPU benchmarks, it might work on Windows. However, Windows is not a supported operating system, so SPEC will not be able to support you if you buy this benchmark expecting to run it there.

Require.03 q. What software do I need?

The system requirements may be found in the document system-requirements.html. If you want to test the OpenACC parallel model, you will need a compiler that accepts OpenACC. If you want to test the OpenMP parallel model, you will need a compiler that accepts OpenMP 5.1. Note that the OpenMP codes use metadirectives to select which type of directive is used and how it is applied. If your compiler does not yet support metadirectives, SPECaccel includes alternate sources with the metadirectives removed.

Installation

Install.01 q. Why am I getting a message such as "./install.sh: /bin/sh: bad interpreter: Permission denied"?

a. If you are installing from a DVD you created, check to be sure that your operating system allows programs to be executed from the DVD. For example, some Linux man pages for mount suggest setting the properties for the CD or DVD drive in /etc/fstab to "/dev/cdrom /cd iso9660 ro,user,noauto,unhide", which is notably missing the property "exec". Add exec to that list in /etc/fstab, or add it to your mount command. Notice that the sample Linux mount command in install-guide-unix.html does include exec.
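To see whether a mounted disc currently forbids execution, something like the following sketch may help. It assumes Linux with the util-linux findmnt tool and uses the /cd mount point from the fstab example; the helper name has_noexec is illustrative. Note that, according to the mount man page, the user option implies noexec unless a later option overrides it, so in /etc/fstab exec should be listed after user.

```shell
# Succeed if a comma-separated mount option list contains "noexec".
has_noexec() {
  case ",$1," in
    *,noexec,*) return 0 ;;
    *)          return 1 ;;
  esac
}

mount_point=/cd   # mount point from the fstab example; adjust to your system
if command -v findmnt >/dev/null 2>&1 && [ -d "$mount_point" ]; then
  opts=$(findmnt -n -o OPTIONS --target "$mount_point")
  if has_noexec "$opts"; then
    echo "$mount_point is mounted noexec; remount it with the exec option" >&2
  fi
fi
```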

Perhaps install.sh lacks permission to run because you tried to copy all the files from the DVD, in order to move them to another system. If so, please don't do that. There's an easier way. See the next question.

Install.02 q. The DVD drive is on system A, but I want to install on system B. What do I do?

a. The installation guides have an appendix just for you, which describes installing from the network or installing from a tarfile. See Appendix 1 in install-guide-unix.html.

Install.03 Do I need to be root?

Occasionally, users of Unix systems have asked whether it is necessary to elevate privileges, or to become 'root', prior to installing or running SPECaccel.

a. SPEC recommends (*) that you do not become root, because: (1) To the best of SPEC's knowledge, no component of SPECaccel needs to modify system directories, nor does any component need to call privileged system interfaces. (2) Therefore, if you find that it appears that there is some reason why you need to be root, the cause is likely to be outside the SPEC toolset - for example, disk protections, or quota limits. (3) For safe benchmarking, it is better to avoid being root, for the same reason that it is a good idea to wear seat belts in a car: accidents happen, humans make mistakes. For example, if you accidentally type:

kill 1

when you meant to say:

kill %1

then you will be very grateful if you are not privileged at that moment.

(*) This is only a recommendation, not a requirement nor a rule.
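If you drive your runs from a wrapper script, a small guard such as the sketch below can remind you when you are accidentally running as root. The helper name is_root_uid is illustrative; nothing like it ships with SPECaccel, and the check is a warning only, in keeping with the recommendation above.

```shell
# Succeed if the given numeric user ID is root's (0).
is_root_uid() {
  [ "$1" -eq 0 ]
}

if is_root_uid "$(id -u)"; then
  echo "Warning: running as root; consider switching to an ordinary account." >&2
fi
```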

runaccel

runaccel.01 q. When I say runaccel, why does it say Can't locate strict.pm? For example:

Can't locate strict.pm in @INC (@INC contains: .) at runaccel line 28.
BEGIN failed--compilation aborted at runaccel line 28.

a. You can't use runaccel if its path is not set correctly. You should source shrc or cshrc, as described in Section 6 of the install guide.
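A minimal sketch of that sequence for sh or bash users follows. The installation directory /specaccel is a hypothetical example, and the helper name have_cmd is illustrative; csh users would source cshrc instead.

```shell
# Succeed if a command can be found on the current PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

spec_top=/specaccel   # hypothetical installation directory; use your own
if [ -f "$spec_top/shrc" ]; then
  cd "$spec_top" && . ./shrc   # sets up the paths in the current shell
fi

if have_cmd runaccel; then
  echo "runaccel found at: $(command -v runaccel)"
else
  echo "runaccel is not on PATH; source shrc from the top of the SPECaccel tree" >&2
fi
```

The shrc file must be sourced in the shell that will run runaccel; executing it as a separate script would not change the paths of the calling shell.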

runaccel.02 q. Why am I getting messages about specperl: bad interpreter? For example:

bash: /specaccel.new/bin/runaccel: /specaccel/bin/specperl: bad interpreter: No such file or directory

a. Did you move the directory where runaccel was installed? If so, you can probably put everything to rights, just by going to the new top of the directory tree and typing "bin/relocate".

For example, the following unwise sequence of events is repaired after completion of the final line.

Top of SPEC benchmark tree is '/specaccel'
Everything looks okay.  cd to /specaccel, source the shrc file and have at it!
$ cd /specaccel
$ . ./shrc
$ cd ..
$ mv specaccel specaccel.new
$ runaccel -h | head
bash: runaccel: command not found
$ cd specaccel.new/
$ . ./shrc
$ runaccel --help | head
bash: /specaccel.new/bin/runaccel: /specaccel/bin/specperl: bad interpreter: No such file or directory
$ bin/relocate
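To confirm that a stale interpreter path is the cause, you can inspect the first line of the script. The sketch below assumes, as the error message suggests, that bin/runaccel begins with a #! line naming specperl; the helper name shebang_interpreter is illustrative, and the path is the one from the example above.

```shell
# Print the interpreter path named on the first line of a script, if any.
shebang_interpreter() {
  sed -n '1s/^#![[:space:]]*\([^[:space:]]*\).*/\1/p' "$1"
}

script=/specaccel.new/bin/runaccel   # path from the example above
if [ -f "$script" ]; then
  interp=$(shebang_interpreter "$script")
  [ -x "$interp" ] || echo "interpreter '$interp' is missing; try bin/relocate" >&2
fi
```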

runaccel.03 Do I need to be root?

Regarding the root account, the answer for runaccel is the same as the answer to Install.03, above.

Building benchmarks

Build.01 q. Why is it rebuilding the benchmarks?

a. You changed something, and the tools thought that it might affect the generated binaries. See the section about automatic rebuilds in the config.html document.

Setting up

Setup.01 q. What does hash doesn't match after copy mean?

I got this strange, difficult to reproduce message:
    hash doesn't match after copy ... in copy_file (1 try total)! Sleeping 2 seconds...
followed by several more tries and sleeps. Why?

a. During benchmark setup, certain files are checked. If they don't match what they are expected to, you might see this message. Check:

If the condition persists, try turning up the verbosity level. Look at the files with other tools; do they exist? Can you see differences? Try a different disk and controller. And, check for the specific instance of this message described in the next question.

Setup.02 q. Why does it say ERROR: Copying executable failed?

I got this strange, difficult to reproduce message:
    ERROR: Copying executable to run directory FAILED
or
    ERROR: Copying executable from build dir to exe dir FAILED!
along with the bit about hashes not matching from the previous question. Why?

a. Perhaps you have attempted to build the same benchmark twice in two simultaneous jobs.

On most operating systems, the SPEC tools don't mind concurrent jobs. They use your operating system's locking facilities to write the correct outputs to the correct files, even if you fire off many runaccel commands at the same time.

But there's one case of simultaneous building that is difficult for the tools to defend against: please don't try to build the very same executable from two different jobs at the same time. Notice that if you say something like this:

$ tail myconfig.cfg
450.md=peak:
basepeak=yes
$ runaccel --config myconfig --size test --tune base 450.md &
$ runaccel --config myconfig --size test --tune peak 450.md &

then you are trying to build the same benchmark twice in two different jobs, because of the presence of basepeak=yes. Please don't try to do that.
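The simplest remedy is to run such jobs one after the other. If your jobs are launched by scripts that might overlap, one possible safeguard is a lock directory, sketched below. This is not a SPEC tool feature: the helper name with_build_lock and the lock path are hypothetical, and the technique relies only on mkdir being atomic. A job that finds the lock taken reports the conflict and does nothing, so rerun it once the first job has finished.

```shell
# Run a command only if the lock directory can be created. mkdir either
# creates the directory or fails, so two jobs cannot both hold the lock.
with_build_lock() {
  lock_dir=$1
  shift
  if ! mkdir "$lock_dir" 2>/dev/null; then
    echo "another job holds $lock_dir; not starting" >&2
    return 1
  fi
  "$@"
  rc=$?
  rmdir "$lock_dir"
  return $rc
}

lock=/tmp/specaccel-450.md.lock   # hypothetical lock location, one per benchmark
if command -v runaccel >/dev/null 2>&1; then
  with_build_lock "$lock" runaccel --config myconfig --size test --tune base 450.md
fi
```

If a job is killed while holding the lock, the directory is left behind and has to be removed by hand.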

Running benchmarks

Run.01 q. Why does this benchmark suite take so long to run?

The benchmarks target a runtime of about 500 seconds each (6000 seconds total across all 12 benchmarks) when run on an NVIDIA V100. The target time is intentionally long at first, to ensure that the suite remains useful for future generations of accelerators. However, if the benchmarks are running very long, such as over an hour each, you should check:

Run.02 q. Why was there this cryptic message from the operating system?

If you are getting cryptic, hard-to-reproduce, unpredictable error messages from your system, one possible reason may be that the benchmarks consume substantial resources, of several types. If an OS runs out of some resource - for example, pagefile space, or process heap space - it might not give you a very clear message. Instead, you might see only a very brief message, or a dialog box with a hex error code in it. Please see the hints and suggestions in the section about resources in system-requirements.html.

Run.03 q. Is the compilation time of the accelerator code included in the total execution time?

If you use "just-in-time" (JIT) compilation, which occurs when the binary is first loaded, then the time needed to compile the accelerator code is included in the total execution time. The system often caches JIT-compiled code, in which case the compilation time is not included in later runs. Because SPECaccel allows the use of cached JIT code, consider performing a warm-up run on the system under test before running reportable results.

Run.04 q. Can I run on a 32-bit system?

The benchmarks have been tested extensively as 64-bit binaries on a range of systems, but not as 32-bit binaries. Given that several of the benchmarks' workloads use objects larger than 2 GB, it is unlikely that they can be run successfully as 32-bit binaries.

Run.05 q. My runtimes vary quite a lot. Is there a way to fix it?

This usually happens on multi-socket systems when your host process runs on a different socket from your accelerator. Try pinning runaccel to the right socket using the NUMA tool of your choice.
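As one possible approach on Linux, the sketch below looks up the NUMA node that sysfs records for the accelerator's PCI device and then starts runaccel under numactl bound to that node. The PCI address is a hypothetical placeholder, the helper name pci_numa_node is illustrative, and the runaccel options are borrowed from the examples elsewhere in this FAQ. When sysfs has no NUMA information (a value of -1, or a missing file), the helper falls back to node 0.

```shell
# Print the NUMA node recorded in sysfs for a PCI device, or 0 if unknown.
# $1 = PCI address, $2 = optional sysfs directory (defaults to the real one).
pci_numa_node() {
  node_file="${2:-/sys/bus/pci/devices}/$1/numa_node"
  node=$(cat "$node_file" 2>/dev/null) || node=-1
  [ "$node" -ge 0 ] 2>/dev/null || node=0   # -1 means "no NUMA information"
  echo "$node"
}

gpu_pci=0000:3b:00.0   # hypothetical PCI address of the accelerator
node=$(pci_numa_node "$gpu_pci")
if command -v numactl >/dev/null 2>&1 && command -v runaccel >/dev/null 2>&1; then
  numactl --cpunodebind="$node" --membind="$node" \
    runaccel --config myconfig --size test --tune base 450.md
fi
```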

Run.06 q. How do I run on a particular device?

When running on a system with multiple accelerators, SPECaccel will use the default device, typically device 0. To select a different device, use the vendor-supplied method, such as setting the environment variable CUDA_VISIBLE_DEVICES (NVIDIA), or HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES (AMD).
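A minimal sketch of setting such a variable for a single command, without changing the rest of your shell session, is shown below. The helper name run_on_device is illustrative, and the device index 1 is only an example. The demonstration uses a harmless echo; in practice you would put your runaccel command line in its place.

```shell
# Run a command with one device-selection variable set, leaving the
# calling shell's environment unchanged.
# $1 = variable name, $2 = device index, remaining arguments = command.
run_on_device() {
  var=$1
  dev=$2
  shift 2
  env "$var=$dev" "$@"
}

# Demonstration with a harmless command; replace it with your runaccel line.
run_on_device CUDA_VISIBLE_DEVICES 1 sh -c 'echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"'
```

Only one selection variable is set at a time here, since combining several of them may filter the device list more than once.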

Run.07 q. How do I know what devices are available?

Check with the tools provided by your compiler or accelerator vendor.
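For example, NVIDIA drivers ship nvidia-smi and AMD's ROCm stack ships rocminfo. The sketch below runs whichever of the two it finds first; the helper name first_available is illustrative, and other vendors or compilers provide their own equivalents.

```shell
# Print the first of the named tools that is found on PATH.
first_available() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      return 0
    fi
  done
  return 1
}

case "$(first_available nvidia-smi rocminfo)" in
  nvidia-smi) nvidia-smi -L ;;   # one line per GPU
  rocminfo)   rocminfo ;;        # lists the agents known to the ROCm runtime
  *)          echo "No known vendor tool found on PATH" >&2 ;;
esac
```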

Run.08 q. Benchmark 470.bt keeps failing on me with a runtime error.

On some Linux distributions, 470.bt may require more stack space than the default limit allows. Please increase the stack size limit, or set it to unlimited, and try again.
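In sh or bash, the limit can be raised with the ulimit builtin, as in the sketch below. It first tries to remove the soft limit and, if the system does not permit that, raises the soft limit as far as the hard limit allows. Run it in the same shell session that will start runaccel, because the setting applies only to that shell and its child processes.

```shell
echo "Soft stack limit before: $(ulimit -S -s)"

# Try to remove the soft limit; if that is not permitted, raise it to the hard limit.
ulimit -S -s unlimited 2>/dev/null || ulimit -S -s "$(ulimit -H -s)"

echo "Soft stack limit after:  $(ulimit -S -s)"
```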

Miscompares

Miscompare.01 I got a message about a miscompare. The tools said something like:

Running Benchmarks
  Running 450.md ref base 12.3 default 
/spec/accel/bin/specinvoke -d /spec/accel/benchspec/ACCEL/450.md/run/run_base_ref_12.3.0000 
-e speccmds.err -o speccmds.stdout -f speccmds.cmd -C -q
/spec/accel/bin/specinvoke -E -d /spec/accel/benchspec/ACCEL/450.md/run/run_base_ref_12.3.0000 
-c 1 -e compare.err -o compare.stdout -f compare.cmd -k

*** Miscompare of md.log.01228060000; for details see
    /spec/accel/benchspec/ACCEL/450.md/run/run_base_ref_12.3.0000/md.log.01228060000.mis
Error: 1x450.md
Producing Raw Reports
mach: default
  ext: 12.3
    size: ref
      set: accel

Why did it say that? What's the problem?

a. We don't know. Many things can cause a benchmark to miscompare, so we really can't tell you exactly what's wrong based only on the fact that a miscompare occurred.

But don't panic.

Please notice, if you read the message carefully, that there's a suggestion of a very specific file to look in. It may be a little hard to read if you have a narrow terminal window, as in the example above, but if you look carefully you'll see that it says:

*** Miscompare of md.log.01228060000; for details see
    /spec/accel/benchspec/ACCEL/450.md/run/run_base_ref_12.3.0000/md.log.01228060000.mis

Now's the time to look inside that file. Simply doing so may provide a clue as to the nature of your problem.

Change your current directory to the run directory using the path mentioned in the message, for example:

cd /spec/accel/benchspec/ACCEL/450.md/run/run_base_ref_12.3.0000

Then, have a look at the file that was mentioned, using your favorite text editor. If the file does not exist, then check your paths, and check to see whether you have run out of disk space.
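If you prefer the command line to an editor, a sketch like the following prints the first few lines of every .mis file in a run directory. The run directory is the one from the example message; the helper name show_mis is illustrative.

```shell
# List each .mis file in a run directory with its first few lines.
show_mis() {
  for f in "$1"/*.mis; do
    [ -f "$f" ] || continue   # no .mis files: the glob stays unexpanded
    echo "== $f"
    head -n 5 "$f"
  done
}

run_dir=/spec/accel/benchspec/ACCEL/450.md/run/run_base_ref_12.3.0000
if [ -d "$run_dir" ]; then
  show_mis "$run_dir"
else
  echo "Run directory not found: $run_dir (check the path and free disk space)" >&2
fi
```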

Miscompare.02 The benchmark ran, but it took less than 1 second and there was a miscompare. Help!

a. If the benchmark took less than 1 second to execute, it didn't execute properly. There should be one or more .err files in the run directory which will contain some clues about why the benchmark failed to run. Common causes include libraries that were used for compilation but not available during the run, executables that crash with access violations or other exceptions, and permissions problems. See also the suggestions in the next question.
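To find the .err files that actually contain something, a sketch such as this one can be run against the run directory. The helper name nonempty_err_files is illustrative, and the directory is the one from the example in Miscompare.01.

```shell
# Print the non-empty .err files directly inside a run directory.
nonempty_err_files() {
  for f in "$1"/*.err; do
    [ -s "$f" ] && echo "$f"   # -s: file exists and has a size greater than zero
  done
  return 0
}

run_dir=/spec/accel/benchspec/ACCEL/450.md/run/run_base_ref_12.3.0000
nonempty_err_files "$run_dir" | while IFS= read -r f; do
  echo "== $f"
  tail -n 5 "$f"
done
```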

Miscompare.03 I looked in the .mis file and it said something like:

   'md.log.01228060000' short

What does "short" mean?

a. If a line like the above is the only line in the .mis file, it means that the benchmark failed to produce any output. In this case, the corresponding error file (look for files with .err extensions in the run directory) may have a clue. In this case, it was Segmentation Fault - core dumped. For problems like this, the first things to examine are the portability flags used to build the benchmark.

Have a look at the sample config files in $SPEC/config. If you constructed your own config file based on one of those, maybe you picked a starting point that was not really appropriate (e.g. you picked a 32-bit config file but are using 64-bit compilation options). Have a look at other samples in that directory. Check at www.spec.org/accel2023 to see if there have been any result submissions using the platform that you are trying to test. If so, compare your portability flags to the ones in the config files for those results.

If the portability flags are okay, your compiler may be generating bad code.

Miscompare.04 My compiler is generating bad code! Help!

a. Try reducing the optimization that the compiler is doing. Instructions for doing this will vary from compiler to compiler, so it's best to ask your compiler vendor for advice if you can't figure out how to do it for yourself.

Miscompare.05 My compiler is generating bad code with low or no optimization! Help!

a. If you're using a beta compiler, try dropping down to the last released version, or get a newer copy of the beta. If you're using a version of GCC that shipped with your OS, you may want to try getting a "vanilla" (no patches) version and building it yourself.

Miscompare.06 I looked in the .mis file and it was just full of a bunch of numbers.

a. In this case, the benchmark is probably running, but it's not generating answers that are within the tolerances set. See the suggestions for how to deal with compilers that generate bad code in the previous two questions. In particular, you might see if there is a way to encourage your compiler to be careful about optimization of floating point expressions.

Results reporting

Results.01 q. It's hard to cut/paste into my spreadsheet

a. Please don't do that. With SPECaccel, there's a handy .csv format file right next to the other result formats on the index page. Or, you can go up to the address bar of your browser and change the .pdf (or .whichever) to .csv.

Results.02 q. What is a "flags file"? What does the message Unknown Flags mean in a report?

a. SPECaccel provides benchmarks in source code form, which are compiled under control of SPEC's toolset. Compilation flags (such as -O5 or -unroll) are detected and reported by the tools with the help of flag description files. Therefore, to do a complete run, you need to (1) point to an existing flags file (easy) or (2) modify an existing flags file (slightly harder) or (3) write one from scratch (definitely harder).

  1. Find an existing flags file by noticing the address of the .xml file at the bottom of any result published at www.spec.org/accel2023. You can use the --flagsurl switch to point your own runaccel command at that file, or you can reference it from your config file with the flagsurl option. For example:
       runaccel --config=myconfig --flagsurl=http://www.spec.org/accel2023/flags/sun-studio.xml int
  2. You can download the .xml flags file referenced at the bottom of any published result at www.spec.org/accel2023. Warning: clicking on the .xml link may just confuse your web browser; it's probably better to use whatever methods your browser provides to download a file without viewing it - for example, control-click in Safari, right click in Internet Explorer. Then, look at it with a text editor.
  3. You can write your own flags file by following the instructions in flag-description.html.

Notice that you do not need to re-run your tests if the only problem was Unknown flags. You can just use runaccel --rawformat --flagsurl

Results.03 q. What's all this about Submission Check -> FAILED littering my log file and my screen?

At the end of my run, why did it print something like this?

format: Submission Check -> FAILED.  Found the following errors:
        - The "hw_memory" field is invalid.
            It must contain leading digits, followed by a space,
            and a standard unit abbreviation.  Acceptable
            abbreviations are KB, MB, GB, and TB.
           The current value is "20480 Megabytes".

a. A complete, reportable result has various information filled in for readers. These fields are listed in the table of contents for config.html. If you wish to submit a result to SPEC for publication at www.spec.org/accel2023, these fields not only have to be filled in; they also have to follow certain formats. Although you are not required to submit your result to SPEC, for convenience the tools try to tell you as much as they can about how the result should be improved if you were to submit it. In the above example, the tools would stop complaining if the field hw_memory said something like "20 GB" instead of "20480 Megabytes".

Notice that you can repair minor formatting problems such as these without doing a re-run of your tests. You are allowed to edit the rawfile, as described in utility.html.
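The rule quoted in the message above can be approximated with a regular expression, which is handy for checking a value before you put it in your config file. The sketch below is only an approximation written from the wording of that message, not the check the tools themselves perform; the helper name valid_hw_memory is illustrative.

```shell
# Succeed if the value is digits, one space, and one of KB, MB, GB, or TB.
valid_hw_memory() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+ (KB|MB|GB|TB)$'
}

for value in "20 GB" "20480 Megabytes"; do
  if valid_hw_memory "$value"; then
    echo "ok:      $value"
  else
    echo "invalid: $value"
  fi
done
```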

Results.04 q. Why does the report have an (*) that says ...

The report has a line that says

(*) Indicates compiler flags found in non-compiler variables

What does this mean, and how do I make it go away?

a. There are potentially a number of errors that will show up like this. They usually mean that you have a conflict of some kind between flags file specifications and how you ran. If you specify a compiler flag that is listed as portability but put it in a config file variable for optimization, the reporter will notice this and warn you about a potential problem. Unfortunately, many of these kinds of problems require a rerun to make everything report nicely. Sometimes you can get lucky and fix your flags file to make the error go away. So the first thing to look at is your flags file. If that isn't the issue and you have a config file issue, you will need to rerun to make the error go away.

Power

Power.01 q. What is the power component of SPECaccel?

a. A power measurement component was an optional feature in SPEC ACCEL v1 that allowed the power consumed during the benchmark to be reported. Given this feature was rarely used, it was not carried forward to SPECaccel 2023. If there is a demand, SPEC may consider adding it to SPECaccel in a future version.

 


Copyright 2014-2023 Standard Performance Evaluation Corporation

All Rights Reserved