Logic
- Run es2panda for benchmark files from this directory
- Dump perf metrics to <work_dir>/test-current-perf.txt for current es2panda
- Dump perf metrics to <work_dir>/test-pre_merge-perf.txt for another (pre-merge) es2panda
- Dump comparison report to <work_dir>/test-report.txt in format like time=-90.00ms (-1.9%)
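A minimal sketch of how one entry of the comparison report could be formatted. The function name format_delta and its parameters are hypothetical and are not taken from runner.py; only the output shape follows the format quoted above.

```python
def format_delta(name: str, base: float, new: float, unit: str) -> str:
    """Format the change from base to new, e.g. 'time=-90.00ms (-1.9%)'."""
    diff = new - base
    # Guard against a zero baseline; reporting 0.0% in that case is an assumption.
    pct = (diff / base * 100.0) if base else 0.0
    return f"{name}={diff:+.2f}{unit} ({pct:+.1f}%)"


if __name__ == "__main__":
    print(format_delta("time", 4736.84, 4646.84, "ms"))
```

The explicit "+" in the format spec makes regressions and improvements easy to tell apart at a glance.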
Static mode
If actual_perf > max_perf * (1 + static_regression), an error occurs. Example:
[PERF REGRESSION] Failed for bench_1-current-perf.txt: Memory exceeded threshold.
Limit: 5.0%, Actual: +406.25%
Base: 32.00MB, New: 162.00MB
Threshold: < 33.60MB
If actual_perf < max_perf * (1 - 3 * static_regression), an error also occurs, because the stored max values are out of date. Example:
[UPDATE REQUIRED] Very good perf for bench_1-current-perf.txt: Please update *-max.txt.
Hint: use flag '--dump-perf-metrics' and Release build of es2panda.
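The static-mode decision can be sketched as follows. This is an illustration, not the code from runner.py: the function name check_static and the returned labels are hypothetical, and the lower bound is assumed to be max_perf * (1 - 3 * static_regression), the reading under which a "very good perf" result sits well below the baseline.

```python
def check_static(actual: float, max_value: float, static_regression: float) -> str:
    """Classify one metric against its *-max.txt value."""
    upper = max_value * (1 + static_regression)      # regression threshold
    lower = max_value * (1 - 3 * static_regression)  # assumed "too good" threshold
    if actual > upper:
        return "PERF REGRESSION"
    if actual < lower:
        return "UPDATE REQUIRED"
    return "OK"


if __name__ == "__main__":
    # Numbers from the example above: base 32.00MB, new 162.00MB, limit 5%.
    print(check_static(162.0, 32.0, 0.05))
```

With a 5% limit and a 32.00MB baseline the upper threshold is 33.60MB, which matches the threshold shown in the example message.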
Dynamic mode
If actual_perf > pre_merge_perf * dynamic_regression, an error occurs (the same as in static mode).
Error reporting
Errors are printed to the console and also to <work_dir>/error_log.txt.
Arguments
- --mode - 'static' to compare with *-max.txt files. 'dynamic' to compare with pre-merge es2panda.
- --es2panda - Path to current es2panda (e.g. /bin/es2panda)
- --es2panda-pre-merge - Path to pre-merge es2panda (e.g. <pre_merge_build>/bin/es2panda)
- --test-dir - Path to test directory with test files
- --work-dir - Path to the working temp folder with gen, intermediate and report folders
- --dynamic-regression - Acceptable regression compared to the other (pre-merge) es2panda
- --static-regression - Acceptable regression compared to static values from *-max.txt files
- --runs - The number of runs to average
- --werror - Warnings as errors
Max values
Each test file has a companion: for test.ets the companion is test-max.txt. This file contains the max values for the metrics.
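The naming rule can be expressed as a small helper. The name companion_for is hypothetical, and the sketch assumes the companion file sits next to the test file, which the text above does not state explicitly.

```python
from pathlib import Path


def companion_for(test_file: str) -> Path:
    """Map e.g. 'test.ets' to 'test-max.txt' in the same directory (assumed)."""
    path = Path(test_file)
    return path.with_name(path.stem + "-max.txt")


if __name__ == "__main__":
    print(companion_for("test.ets"))
```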
Local reproduction
# static mode
python3 <ets_frontend>/ets2panda/test/benchmarks/runner/runner.py --mode=static --es2panda=<build>/bin/es2panda --work-dir=<build>/e2p_benchmarks --test-dir=<static_core>/tools/es2panda/test/benchmarks
# dynamic mode
python3 <ets_frontend>/ets2panda/test/benchmarks/runner/runner.py --mode=dynamic --es2panda=<build>/bin/es2panda --work-dir=<build>/e2p_benchmarks --test-dir=<static_core>/tools/es2panda/test/benchmarks --es2panda-pre-merge=<pre_merge_build>/bin/es2panda
See --help if needed.
CI
You can download the artifacts for this job, which contain the perf statistics.
Artifacts example
test-perf.txt
================ es2panda perf metrics (Averaged over 3 runs) ================
:@phases : time=891.00ms maxrss=140.00MB
:@phases/ConstantExpressionLowering : time=233.00ms maxrss=0.26MB
:@phases/TopLevelStatements : time=193.00ms maxrss=79.00MB
:@phases/ResolveIdentifiers : time=83.40ms maxrss=6.00MB
:@phases/CheckerPhase : time=78.60ms maxrss=19.00MB
test-report.txt
Performance Comparison: 'bench_1-max.txt' vs 'bench_1-current-perf.txt'
================================================================================
:@EmitProgram : time=+2.90ms (+4.6%) maxrss=+0.00MB (+0.0%)
:@GenerateProgram : time=+4.67ms (+6.7%) maxrss=0.00MB (0.0%)
:@GenerateProgram/OptimizeBytecode : time=0.00ms (0.0%) maxrss=0.00MB (0.0%)
:@phases : time=+22.67ms (+2.6%) maxrss=+0.00MB (+0.0%)
:@phases/AmbientLowering : time=-0.07ms (-0.6%) maxrss=0.00MB (0.0%)
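For manual inspection of a downloaded perf file, the metric lines shown in the test-perf.txt example can be read with a short script. This is a sketch under the assumption that every metric line has the shape ":@name : time=...ms maxrss=...MB"; parse_perf is a hypothetical helper and not part of runner.py.

```python
import re

# Matches lines such as ":@phases/CheckerPhase : time=78.60ms maxrss=19.00MB".
LINE_RE = re.compile(
    r"^:@(?P<name>\S+)\s*:\s*time=(?P<time>[\d.]+)ms\s+maxrss=(?P<rss>[\d.]+)MB"
)


def parse_perf(text: str) -> dict:
    """Return {metric name: (time in ms, maxrss in MB)}; other lines are skipped."""
    metrics = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            metrics[match["name"]] = (float(match["time"]), float(match["rss"]))
    return metrics


if __name__ == "__main__":
    sample = (
        "===== es2panda perf metrics (Averaged over 3 runs) =====\n"
        ":@phases : time=891.00ms maxrss=140.00MB\n"
        ":@phases/CheckerPhase : time=78.60ms maxrss=19.00MB\n"
    )
    print(parse_perf(sample))
```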
Update *-max.txt files.
To update the static baseline values using multiple runs from CI:
- Create a PR.
- Run runonly for job "es2panda static benchmarks" for 28 job repeats and 1 test repeat.
- Download all zip artifacts from these runs into a single directory (e.g. downloaded_artifacts).
- Run the update script:
python3 -B <ets_frontend>/ets2panda/test/benchmarks/runner/update_static_values.py \
--test-dir=<static_core>/tools/es2panda/test/benchmarks \
--zips-from-ci=./downloaded_artifacts
The script will unpack the archives, find all *-current-perf.txt files, aggregate the metrics across all runs, and overwrite the corresponding *-max.txt files.
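The aggregation step might look like the sketch below. It assumes that "aggregate" means taking the per-metric maximum across runs, which is only a guess based on the *-max.txt name; update_static_values.py may combine the runs differently. The helper name aggregate_max and the data layout are hypothetical.

```python
def aggregate_max(runs: list) -> dict:
    """Combine per-run metrics {name: (time_ms, rss_mb)} by taking the maximum.

    Taking the maximum is an assumption, not the documented behaviour.
    """
    result = {}
    for run in runs:
        for name, (time_ms, rss_mb) in run.items():
            old_time, old_rss = result.get(name, (0.0, 0.0))
            result[name] = (max(old_time, time_ms), max(old_rss, rss_mb))
    return result


if __name__ == "__main__":
    runs = [
        {"phases": (891.0, 140.0)},
        {"phases": (905.5, 139.0)},
    ]
    print(aggregate_max(runs))
```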
Warning diagnostics workloads
- Additional warning-focused workloads live in test/benchmarks/warning_diagnostics.
- These files are not picked up by runner.py (it scans only test/benchmarks/*.ets) and are intended for manual performance experiments.