Toggle configurations

(gnu+none)
(intel+none)

Select and display specific tests

Global STREAM Time
HPCC STREAM-EP Time
LULESH (release)
HPCC RA Time
BigInteger Performance (fast ops)
BigInteger Performance (slow ops)
Associative Type Add/Remove
1D ArrayView Parallel Indexing
1D ArrayView Parallel Iteration
Array View Creation
List Operations
One Billion Row Challenge (using 10 million row input set)
Sin LUT single locale
Dot Product Vectorization with real(64)
Dot Product Vectorization with int(32)
Dot Product Vectorization with real(64) with large arrays
Clenshaw
Loop Division Scalar Kernel
Global STREAM
Global STREAM using Promotion
HPCC FFT Time
HPCC PTRANS Time (numrows=5733)
LSU 1D heat diffusion codes (n=10,000,000)
Global STREAM Performance (GB/s)
HPCC RA Performance (GUPS)
HPCC STREAM-EP Performance (GB/s)
EP STREAM (fragmented) (GB/s)
Global STREAM (GB/s)
Global STREAM using Promotion (GB/s)
HPCC FFT Performance (Gflop/s)
HPCC HPL Performance (Gflop/s)
HPCC PTRANS Performance (GB/sec) (nrows=5733)
SSCA#2 Kernel 4 TEPS (SCALE=8)
SSCA#2 Kernels 2 & 3 (SCALE=8)
SSCA#2 Kernel 4 (SCALE=8)
SSCA#2 Graph Construction (SCALE=8)
SSCA#2 (SCALE=8)
LULESH (bradc study version)
miniMD LJ (--size=10) Time
LLNL CoMD Time (sec)
Elegant AoS CoMD Time (sec)
LCALS (raw, short)
LCALS (raw, medium)
LCALS (raw, long)
LCALS (raw_omp, short)
LCALS (raw_omp, medium)
LCALS (raw_omp, long)
LCALS (raw_vector_only, short)
LCALS (raw_vector_only, medium)
LCALS (raw_vector_only, long)
Binary Trees Shootout Benchmark (n=21)
Chameneos Redux Shootout Benchmark (n=6,000,000)
Fasta Shootout Benchmark (n=25,000,000)
K-nucleotide Shootout Benchmark
Meteor Shootout Benchmark (n=2098)
N-body variations
Pi digits variations
Regex-dna Shootout Benchmark
Regex-dna redux Shootout Benchmark
Reverse-complement Shootout Benchmark
Thread Ring Shootout Benchmark (n=50,000,000)
Submitted Binary Trees Shootout Benchmark (n=21)
Submitted Fannkuch-Redux Shootout Benchmark (n=12)
Submitted Fasta Shootout Benchmark
Submitted K-nucleotide Shootout Benchmark
Submitted Mandelbrot Shootout Benchmark
Submitted N-body Shootout Benchmark
Submitted Pi digits Shootout Benchmark
Submitted Regex-dna Shootout Benchmark
Submitted Regex-dna redux Shootout Benchmark
Submitted Reverse-complement Shootout Benchmark
Submitted Spectral Norm Shootout Benchmark
Fasta Shootout Benchmark (historical)
Fasta-redux Shootout Benchmark (n=25,000,000)
Mandelbrot all
Spectral Norm Step Size Times
Parboil BFS Execution Time
Parboil Histo Serial Execution Time
Parboil SAD Serial Execution Time
Parboil Stencil 3D Execution Time
Black-Scholes (PARSEC benchmark)
Black-Scholes (PARSEC benchmark) using Promotion
ISx variations
ISx (SPMD over Buckets)
ISx (Hand Optimized)
ISx (Avoids array returns)
ISx (Pure SPMD)
ISx (Release)
PRK synch_p2p
PRK synch_p2p time
PRK stencil
PRK stencil time
PRK transpose
PRK transpose time
NAS Parallel Benchmarks: CG timings - size S
NAS Parallel Benchmarks: CG timings - size W
NAS Parallel Benchmarks: CG timings - size A
NAS Parallel Benchmarks: FT timings - size S
NAS Parallel Benchmarks: FT timings - size W
NAS Parallel Benchmarks: FT timings - size A
NPB MG Size S
NPB MG Size A
NPB MG Size B
Scalar Multiplication 2D Array Execution Time
Serial 1D Array Performance
array vs tuple serial accesses
array vs ddata serial accesses
Serial Array Access vs. Reference Access
Array Access vs. Reference Access in forall loop
Serial Array Access vs. Reference Access (Multidimensional)
1D Domain vs. Range Parallel Iteration
1D Array Parallel Iteration
Promoted op= Time (local)
Promoted op= Time (no-local)
Array copy from Global to Local (Cyclic)
2D Array Assignment (256x256)
DGEMM Performance (128x128)
2D Array Assignment (1024x1024, faster idioms)
2D Array Assignment (1024x1024, slower idioms)
2D non-local Array Non-local Assignment (256x256)
DGEMM Performance (64x64)
2D non-local Array Assignment (1024x1024)
Array initialization (faster idioms)
Array initialization (slower idioms)
Associative Array Iteration
Associative Domain Iteration
Identical domain equality Timings (sec)
2D Matrix Multiply
Array of string element access
array return performance, serial/parallel, 8 element array
array return performance, serial/parallel, 20000000 element array
array return performance, serial/parallel, 40000000 element array
Building up associative domains
Sparse Domain Assignment (similar layouts, ~5M indices)
Sparse Domain Assignment (dissimilar layouts, ~50K indices)
Array init/deinit performance, 40000000 element array
Parallel Atomic fetchAdd Time (sec)
Dynamic Iterator
Adaptive WS v1 Iterator (BS, BS)
Adaptive WS v2 Iterator (BS, BS)
Guided Iterator
Adaptive WS Iterator (BS, BS)
For+Begin Timings
Coforall Timings
Coforall+Begin Timings
Early Exit Task Parallel Timings
Task Parallel Timings
Empty Task Spawn Timings (500,000 x maxTaskPar)
Empty Serialized Task Spawn Timings (500,000 x maxTaskPar)
Task Yield Timings (500,000 yields/task)
Task Placement STREAM Performance (GB/s)
Empty Barrier Timings (500,000 x maxTaskPar)
SPMD Barrier STREAM Performance (GB/s)
Memory Usage for examples Tests
Memory Leaks for examples Tests
Number of Tests with Leaks (examples only)
Memory Usage for all Tests
Memory Leaks for all Tests
Number of Tests with Leaks
Memory Usage for Multilocale Tests
Memory Leaks for Multilocale Tests
Number of Multilocale Tests with Leaks
Serial 'object' Allocation
Jacobi AST size
Jacobi Emitted Code Size
no-op (non-user code startup time)
NAS Parallel Benchmarks: EP timings - size S
NAS Parallel Benchmarks: EP timings - size W
NAS Parallel Benchmarks: EP timings - size A
NAS Parallel Benchmarks: EP timings - size B
HPCC HPL Time
EP STREAM (fragmented)
Fannkuch-Redux (n=12)
Spectral Norm Shootout Benchmark
Mandelbrot variations
cast from string time by dest type
String temporary copies
Splitting a string on whitespace
Allocating string operations
Passing and returning strings
Searches over n strings (historical)
Searches over n strings
Search within a string
Linearithmic sorts on 2^24 bytes of shuffled data
Quadratic sorts on 2^12 bytes of shuffled data
LinearAlgebra.dot() performance (2048*2048 matrix)
Transpose 10x10
Transpose 1000x1000
LU factorization - 1000x1000
Sparse matrix-matrix multiplication variations - big (N = 10e7)
Sparse matrix-matrix multiplication variations - small (N = 10e3)
LinearAlgebra.Sparse.dot() - squaring NxN matrices - big (N = 10e7)
LinearAlgebra.Sparse.dot() - squaring NxN matrices - small (N = 10e3)
Jacobi method - solving 512 unknowns - dense and sparse
CSR-CSC multiplication
CSDom._grow()
CSDom._bulkGrow()
Radix sorts on 2^24 bytes of shuffled data
Spiral 5.0 Chapel FFT example
Serial Reduction Styles
Reductions Time (sec)
Forall with AoA In and Reduce Intents
PARACR Block to Cyclic redistribution
BitOps - popcount - C
BitOps - clz - C
BitOps - ctz - C
randomStream.next (50M trials)
RBC benchmark
C-Ray Tracer
1D scan time
Repeated Large Binary IO (n=10000)
Performance of Various Jacobi1D Implementations
Performance of Various Jacobi2D Implementations
Performance of Various CFD-mini Implementations
Training Time of Multiple CNNs With Different Architectures and Hyperparameters
Training Time of XOR Perceptron
Time to build trees
Time to tree paircount
Elegance arrayAdd
Elegance promotedOp
Elegance tuplesVSarrays
Bale: Toposort Perf (Rows/s)
Bale: Toposort Time (seconds)
Storing Words in Moby Dick into a Map
Performance of primality test
For Loop Time Compare
While Loop Time Compare
c vs chpl writing single newlines to stdout
c vs chpl serial accesses
c vs chpl multidimensional and array of arrays serial accesses
Extern Method Call Time Compare
Loop Time Compare
Conditional Statement Compare (Balanced)
Conditional Statement Compare (random)
Conditional Statement Compare (Reversed)
Conditional Statement Compare (SkewedLow)
Conditional Statement Compare (SkewedMid)
Conditional Statement Compare (SkewedHigh)
Module versus Method variable access
Module versus Method multiplication/division
Module versus Method mod comparison
plb2 matmul
plb2 nqueen
Sample Performance Test (Bogus)
My Sample Performance Test (Bogus)

Chapel Performance Graphs for 1-node-xc

Graphs Last Updated on 2025-04-03