WCET Tool Challenge 2006

... results presented at ...

ISoLA 2006

2nd International Symposium on Leveraging Applications of Formal Methods, Verification and Validation
15-19 November 2006 - Coral Beach Resort (Paphos, Cyprus)


Introduction

The purpose of the WCET Tool Challenge is to enable the study, comparison and discussion of the properties of different WCET tools and approaches, to define common metrics, and to enhance the existing benchmarks. The WCET Tool Challenge has been designed to strike a good balance between openness to a wide range of analysis approaches and specific participation guidelines that provide a level playing field. This should make results transparent and facilitate friendly competition among the participants. However, participants and other interested parties should be aware that results from different WCET tools will still be hard to compare directly, and there is not yet an established classification or set of performance metrics in this field. Therefore, the purpose of this Challenge is not to establish "winning tools". For a more detailed discussion of the actual goals, see here.

The WCET Tool Challenge will take place during the autumn of 2006. This specification of the WCET Tool Challenge is based on discussions at the WCET 2006 workshop and the ARTIST2 Timing Analysis group meeting in Dresden at the beginning of July 2006.

The WCET Tool Challenge will concentrate on three aspects of WCET analysis (more info below): flow analysis, required user interaction, and performance.

Companies as well as research groups are welcome to participate. The actual work with the tools will be done by an external student (see below) and/or the development teams. The evaluation will target a set of benchmark programs.

The report to ISoLA 2006 will be based on the reports from the developers and the student, and will be compiled by the working group.


Important Dates and Submission Information

WCET tool developers are asked to enroll by sending an email to Jan Gustafsson no later than 2006-08-31. In your email, please state your information and choices according to the following:

*) The WCET survey has a detailed treatment of the classification and clarifies this point.

More information concerning some of these items is found further down in this document.

More detailed directions, as decided by the working group, will be sent out during August, together with a more detailed time schedule. If the measurements are to be done by the development team, the results should be sent in during the first half of October, so that the summary report to ISoLA 2006 can be produced in time.


Working group

A working group has been set up with Dr. Jan Gustafsson, Mälardalen University, Prof. Dr. Reinhard Wilhelm, Saarland University, Prof. Dr. Reinhard v. Hanxleden, Kiel University, Dr. rer. nat. Steffen Goerzig, DaimlerChrysler, and Prof. Dr. Paul Levi, Stuttgart University, as its current members. We are currently looking for a student who would set up the logistics and do the experimentation, supervised by a competent person.


Aspects of WCET analysis

Area 1: Flow analysis
The purpose of the flow analysis phase is to extract the dynamic behaviour of the program. This includes information on which functions get called, loop bounds, whether there are dependencies between if-statements, etc. We propose the following flow analysis metrics to be measured:
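
To make the notion of flow facts concrete, here is a small, purely illustrative C fragment (it is not taken from any of the challenge benchmarks). A flow analysis would have to discover that the loop iterates at most 32 times, and that the two if-statements are mutually exclusive within an iteration, so the path taking both branches is infeasible:

    /* Illustrative example only: the kinds of flow facts a flow analysis
       must discover.                                                      */
    int filter(const int *buf, int n)     /* callers guarantee 0 <= n <= 32 */
    {
        int i, sum = 0;
        for (i = 0; i < n; i++) {         /* loop bound: at most 32 iterations */
            if (buf[i] > 0)
                sum += buf[i];
            if (buf[i] < 0)               /* mutually exclusive with the branch
                                             above: taking both in the same
                                             iteration is an infeasible path   */
                sum -= buf[i];
        }
        return sum;
    }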

Area 2: Required user interaction
This area of evaluation is concerned with the amount of work involved in setting up a WCET calculation to obtain a result. One important metric for this area is the number of program-specific manual annotations. Necessary annotations (like CPU type, frequency, etc.) can be excluded from this number.
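
As a purely hypothetical illustration of how this metric might be counted (the annotation syntax below is invented for this example; every tool has its own format), a source-level loop-bound annotation would count as one program-specific annotation, whereas target settings given in the tool configuration would not:

    /* Hypothetical annotation syntax, for illustration only. */
    int sum_samples(const int *buf, int n)
    {
        int i, s = 0;
        for (i = 0; i < n; i++) {         /* @loopbound max 32  -- counted as one
                                             program-specific manual annotation  */
            s += buf[i];
        }
        return s;
    }
    /* A target setting such as "CPU type = ARM7, clock = 50 MHz", given in
       the tool configuration, would not be counted.                       */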

Area 3: Performance
This area is about the bottom line: the final WCET value and the performance of the tool. We propose the following metrics:

WCET tool rounds

Aspects may be non-orthogonal and influence each other. For example, more preparation work may give a better (tighter) WCET bound. This is expected and normal, and is a sign of a flexible WCET tool. The same tool can be used for different aspects with different setups. We therefore suggest that each tool is used in three rounds for each target processor:
  1. One initial round with no manual annotations for loop bounds etc. Necessary annotations (like CPU type, frequency, etc.) are however allowed. This round may not give a WCET bound at all for some tools and benchmarks.
  2. A basic round with the smallest set of manual annotations possible to get a WCET bound.
  3. An optimal round with the largest set of annotations, to get as tight a WCET bound as possible.

The required user interaction of course grows with each round. For each round, the metrics for the three aspects are measured. For each metric, the complete setup is described.
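
As a brief sketch of how the annotation sets could differ between the rounds, the filter() example from Area 1 is repeated below, with each (still hypothetical) annotation marked with the round in which it would first be used:

    /* Round 1 uses none of the flow annotations below, only the necessary
       target settings (CPU type, clock frequency); round 2 adds the loop
       bound; round 3 adds the remaining flow facts as well.               */
    int filter_annotated(const int *buf, int n)
    {
        int i, sum = 0;
        for (i = 0; i < n; i++) {         /* @loopbound max 32 (rounds 2 and 3) */
            if (buf[i] > 0)
                sum += buf[i];
            if (buf[i] < 0)               /* @exclusive with the previous branch
                                             (round 3 only)                     */
                sum -= buf[i];
        }
        return sum;
    }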


Carrying out the evaluation

There will be two possibilities:
  1. The evaluation is carried out by Lili Tan (lili.tan AT icb DOT uni-due DOT de), who is a research assistant in the research group Dependability of Computing Systems at the University of Duisburg-Essen. This has the advantage of letting an external person try out the tools and give independent feedback on their usability, without the bias of any WCET tool developer. She will write a report on the evaluation and give it to the working group.
  2. The evaluation is carried out by the development team. Results are expected to be sent in for inclusion in the report to ISoLA.

Participants may choose to use one or both of these approaches.


Selection of benchmark programs

The benchmarks will represent different types of code, for example code with different types of loops, infeasible paths, automatically generated code, hardware-specialized code, and also large "real world" programs. A mix of single-path programs and multi-path programs will be included.
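
For readers less familiar with the distinction, the two small functions below (again only a sketch, not taken from the benchmark suites) contrast the two kinds of programs: a single-path program executes the same sequence of instructions regardless of its input, while the execution path of a multi-path program depends on the input data:

    /* Single-path: the executed path does not depend on the input values. */
    int sum8(const int a[8])
    {
        int i, s = 0;
        for (i = 0; i < 8; i++)
            s += a[i];
        return s;
    }

    /* Multi-path: the executed path depends on the input values. */
    int max8(const int a[8])
    {
        int i, m = 0;
        for (i = 0; i < 8; i++)
            if (a[i] > m)                 /* data-dependent branch */
                m = a[i];
        return m;
    }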

There will be two main types of code:

  1. Open source benchmark programs from the Mälardalen WCET benchmarks and PapaBench (MiBench is skipped this time). These benchmarks are all available on the web. For more details, see here.
  2. Proprietary software from DaimlerChrysler will be available. This code will, however, not be open for obvious reasons, and the analysis of it can only be made by the student mentioned above, inside DaimlerChrysler. (This second option will not be used this time.)

Since this is the first event of this kind, we expect some difficulties when analysing the software. For example, not all benchmarks have been tested with all tools. Please have patience and see this as part of the learning process. Analyse as many of the benchmarks as possible, and try to solve the problems as they appear. There are a number of considerations when analysing the benchmarks; see here and here.


Selection of processors

We suggest that each participant selects up to three processors for which to do the analyses; for example one simple (e.g., Renesas H8), one of medium complexity (e.g., ARM7/9, C167, NEC V850E) and one very complex (e.g., PowerPC), if possible.

To be able to compare results, we propose that the most commonly supported processors are selected. See here for a list of processors currently supported by WCET tools. If possible, avoid processors supported by only one tool.

We are aware that not all WCET tools support all processors, and that results will sometimes be hard to compare.


Selection of compilers

As there is no overview of which compilers are supported by which tools, we let the participants decide on one or two compilers. We ask you to choose common compilers for the chosen processors, if possible.

We are aware that the results will be hard to compare, since we cannot mandate the use of particular compilers.


Welcome to participate in the WCET Tool Challenge 2006!

Version 2006-11-06
The Working Team

Jan Gustafsson <jan DOT gustafsson AT mdh DOT se>, Reinhard Wilhelm <wilhelm AT cs DOT uni-sb DOT de>, Reinhard v. Hanxleden < rvh AT informatik DOT uni-kiel DOT de>, Steffen Goerzig <steffen DOT goerzig AT daimlerchrysler DOT com>, and Paul Levi <Paul DOT Levi AT informatik DOT uni-stuttgart DOT de>