Splash-4 benchmark suite

  • Over the past three decades, the parallel applications of the Splash-2 benchmark suite have been instrumental in advancing multiprocessor research. Recently, the Splash-3 benchmarks eliminated the performance bugs, data races, and improper synchronization that plagued the Splash-2 benchmarks after the definition of the C memory model. In Splash-4, we revisit the Splash-3 benchmarks and adapt them for contemporary architectures with atomic operations and lock-free constructs. With our changes, we improve the scalability of most benchmarks for up to 32 and 64 cores, showing an improvement of up to 9x on real machines, and up to 5x in simulation, over the unmodified Splash-3 benchmarks. To denote the substantive nature of the improvements over the Splash-3 benchmarks, the new suite is named Splash-4.

    Reference: Eduardo José Gómez-Hernández, Ruixiang Shao, Christos Sakalis, Stefanos Kaxiras, Alberto Ros, "Splash-4: Improving Scalability with Lock-Free Constructs". International Symposium on Performance Analysis of Systems and Software (ISPASS), pages 235--236, Worldwide event, March 2021.[PDF]

    Link to Splash-4
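
    The kind of change described above, replacing a lock-protected update with an atomic operation, can be sketched as follows. This is a minimal illustration and is not taken from the Splash-4 sources: the counter, the thread count, and the names run_locked and run_atomic are all hypothetical, and POSIX threads plus C11 atomics are assumed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_THREADS 4        /* hypothetical thread count */
#define INCREMENTS  100000   /* hypothetical updates per thread */

/* Lock-based version: every update serializes on a single mutex. */
static long locked_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *locked_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        locked_counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Lock-free version: the same update as one atomic read-modify-write. */
static atomic_long atomic_counter = 0;

static void *atomic_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        /* Relaxed ordering suffices here because only the final sum is
         * read, and only after all threads have been joined. */
        atomic_fetch_add_explicit(&atomic_counter, 1, memory_order_relaxed);
    }
    return NULL;
}

/* Start NUM_THREADS copies of worker and wait for all of them. */
static void run_threads(void *(*worker)(void *)) {
    pthread_t threads[NUM_THREADS];
    for (int t = 0; t < NUM_THREADS; t++) {
        int rc = pthread_create(&threads[t], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < NUM_THREADS; t++) {
        pthread_join(threads[t], NULL);
    }
}

long run_locked(void) {
    locked_counter = 0;
    run_threads(locked_worker);
    return locked_counter;
}

long run_atomic(void) {
    atomic_store(&atomic_counter, 0);
    run_threads(atomic_worker);
    return atomic_load(&atomic_counter);
}
```

    Calling run_locked() and run_atomic() from a main function should yield the same total in both cases. The atomic version avoids acquiring and releasing a lock on every update, which is the sort of overhead that can limit scalability at high core counts.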

Splash-3 benchmark suite

  • A well-known benchmark suite of parallel applications is the Splash-2 suite. Since its creation in the context of the DASH project, Splash-2 benchmarks have been widely used in research. However, Splash-2 was released over two decades ago and does not adhere to the recent C memory consistency model. This leads to unexpected and often incorrect behavior when some Splash-2 benchmarks are used in conjunction with contemporary compilers and hardware (simulated or real). Most importantly, we discovered critical performance bugs. In the Splash-3 benchmark suite we rectify the problematic benchmarks and contribute to the community a new sanitized version of the Splash-2 benchmarks.

    Reference: Christos Sakalis, Carl Leonardsson, Stefanos Kaxiras, Alberto Ros, "Splash-3: A Properly Synchronized Benchmark Suite for Contemporary Research". International Symposium on Performance Analysis of Systems and Software (ISPASS), pages 101--111, April 2016.[PDF]

    Link to Splash-3

Fast&Furious tool

  • Existing multi-threaded applications synchronize either explicitly, e.g., through the functionality provided by synchronization libraries, or implicitly, e.g., through shared variables. Unfortunately, implicit synchronization constructs are error-prone and difficult to detect. We developed a tool that detects implicit synchronization in multi-threaded applications. The tool checks that, when an application executes under a memory model that provides sequential consistency for data-race-free applications (SC for DRF), every read returns the same value it would return under sequential consistency. If an execution violates this condition, the application has data races, which may be intended to perform implicit synchronization.

    Reference: Alberto Ros, Stefanos Kaxiras, "Fast&Furious: A Tool for Detecting Covert Racing". 6th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures (PARMA) and 4th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (DITAM), pages 1--6, January 2015.[PDF]

    Link to Fast&Furious
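
    As an illustration of the implicit synchronization discussed above, the sketch below shows a flag-based hand-off between two threads. It is not part of the Fast&Furious tool; the names payload, ready, and run_handoff are hypothetical, and POSIX threads plus C11 atomics are assumed. The racy form appears only in a comment, while the compiled code uses an atomic flag so that the hand-off is data-race-free.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;        /* ordinary shared data */
static atomic_int ready;   /* flag that carries the synchronization */

/* Implicit (racy) form, shown only as a comment:
 *
 *     int ready;                          plain shared variable
 *     payload = 42; ready = 1;            producer
 *     while (!ready) { } use(payload);    consumer
 *
 * With a plain int flag this is a data race under the C11 memory model,
 * so the compiler may hoist the load out of the loop or reorder the
 * stores. The code below expresses the same hand-off with atomics.
 */

static void *producer(void *arg) {
    (void)arg;
    payload = 42;
    /* Release store: the write to payload becomes visible to any thread
     * that observes ready == 1 through an acquire load. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg) {
    int *out = arg;
    /* Spin until the producer publishes the flag. */
    while (!atomic_load_explicit(&ready, memory_order_acquire)) {
    }
    *out = payload;
    return NULL;
}

/* Run one producer and one consumer; return the value the consumer saw. */
int run_handoff(void) {
    pthread_t prod, cons;
    int seen = 0;
    atomic_store(&ready, 0);
    int rc = pthread_create(&cons, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&prod, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    return seen;
}
```

    With the release/acquire pair, the consumer is guaranteed to read the value the producer wrote. In the commented racy form no such guarantee exists, which is why synchronization of that kind is both error-prone and hard to spot by inspection.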