OptaPlanner / PLANNER-401

Statistical benchmarking: run each single benchmark n times and average the results

    Details

    • Docs QE Status:
      NEW
    • QE Status:
      NEW

      Description

      Watch Joshua Bloch's presentation Performance Anxiety on Parleys: https://www.parleys.com/tutorial/performance-anxiety

      Running a single benchmark just once is unreliable:

      • 2 JVM processes on the same hardware, running the same code, can behave very differently performance-wise (= score calculation count per second).
      • If the randomSeed isn't fixed (for example in PRODUCTION mode), a different randomSeed will influence the score quality to a certain degree.

      Our benchmarker needs to show the impact of this by making it easy to run every single benchmark n times. The benchmark report should show the average, the minimum and the maximum (maybe even with a candlestick diagram?) and maybe even the raw result of every separate single benchmark run. Requirements need to be discussed further before implementation starts.
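The aggregation described above could be sketched as follows. This is a minimal illustration, not OptaPlanner's actual benchmarker API: `runSingleBenchmark` is a hypothetical placeholder for launching one benchmark run and returning its score calculation count per second (a deterministic dummy value here, so the example is runnable).

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;

public class BenchmarkStats {

    // Hypothetical stand-in for one benchmark run. In reality this would
    // start a fresh solver (ideally a fresh JVM) and measure the score
    // calculation count per second; here it returns a deterministic dummy.
    static double runSingleBenchmark(int runIndex) {
        return 1000.0 + runIndex * 10.0;
    }

    public static void main(String[] args) {
        int n = 5; // run the same single benchmark n times
        double[] results = new double[n];
        for (int i = 0; i < n; i++) {
            results[i] = runSingleBenchmark(i);
        }
        // Average, minimum and maximum for the report.
        DoubleSummaryStatistics stats =
                Arrays.stream(results).summaryStatistics();
        System.out.println("average = " + stats.getAverage());
        System.out.println("min     = " + stats.getMin());
        System.out.println("max     = " + stats.getMax());
        // The raw per-run results could also be kept for the report:
        System.out.println("raw     = " + Arrays.toString(results));
    }
}
```

For a real measurement each run should ideally happen in a separate JVM process, since JIT warm-up and process-level variance are exactly what the issue wants to expose.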

      After this is implemented, this feature can be used to validate or invalidate the information in old blog posts that ran just a single benchmark and presumed it was representative:
      http://www.optaplanner.org/blog/tags/production/


                People

                • Assignee: oskopek Ondrej Skopek
                • Reporter: ge0ffrey Geoffrey De Smet
                • Votes: 0
                • Watchers: 3

                  Dates

                  • Created:
                  • Updated:
                  • Resolved: