Arrow Release Benchmark Report

Benchmark Run Summary

| Run Type [1] | Commit SHA | Time of Commit | Hardware | Languages | Benchmark Type | Number of Benchmarks |
|---|---|---|---|---|---|---|
| | | 2023-10-19 09:12:19 | Intel(R) Core(TM) i9-9960X CPU @ 3.10GHz | Python, R | | 194 |
| | | 2024-01-16 14:38:51 | Intel(R) Core(TM) i9-9960X CPU @ 3.10GHz | Python, R | | 193 |
| | | 2023-10-19 09:12:19 | AMD Ryzen 5 PRO 4650GE with Radeon Graphics | C++, Java | | 3368 |
| | | 2024-01-16 14:38:51 | AMD Ryzen 5 PRO 4650GE with Radeon Graphics | C++, Java | | 3386 |

[1] When we compare benchmark results, we always have a contender (the new code under consideration) and a baseline (the old code that we are comparing against). The historic distribution is drawn from all benchmark results on commits in the baseline commit's git ancestry, up to and including all runs on the baseline commit itself. In this context, the baseline is typically the last Arrow release and the contender is the current release candidate.
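The comparison of a contender against the historic distribution can be sketched as follows. This is a minimal illustration, assuming the z-score is the contender's distance from the mean of the historic results in units of their sample standard deviation, with the sign flipped for less-is-better units so that positive always means faster. The function name, the `less_is_better` flag, and the sample numbers are hypothetical; the exact statistics used by Conbench are not stated in this report.

```python
from statistics import mean, stdev


def z_score(history, contender, less_is_better=False):
    """Distance of the contender from the historic mean, in standard deviations.

    `history` holds results from the baseline commit and its git ancestry
    (an assumption about how the historic distribution is represented).
    """
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs at least 2 results
    if sigma == 0:
        raise ValueError("historic results have zero variance")
    z = (contender - mu) / sigma
    # For units such as ns, a larger value is slower, so flip the sign
    # to keep "positive = improvement" (assumed convention).
    return -z if less_is_better else z


# Hypothetical throughput history (MB/s) and a faster contender.
history = [100.0, 102.0, 98.0, 101.0, 99.0]
print(round(z_score(history, 110.0), 2))
```

A contender that falls inside the usual run-to-run noise of the history gets a z-score near 0, whatever the absolute size of the change.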

Macrobenchmarks

Live Conbench UI views for the macrobenchmarks are available at this url. Conbench offers an additional way to explore the benchmark results, particularly if you want to see more of the history or more metadata.

Benchmark Percent Changes

  • Benchmarks are plotted using the percent change from baseline to contender.
  • Additional information on each benchmark is available by hovering over the relevant bar.
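The percent change plotted for each benchmark can be sketched as below. This is an illustration under assumptions: the function name and the `less_is_better` flag are hypothetical, the sample values are made up, and the sign is normalized so that a negative value always means the contender is slower.

```python
def percent_change(baseline, contender, less_is_better=False):
    """Percent change from baseline to contender; negative means slower."""
    if baseline == 0:
        raise ValueError("baseline result must be non-zero")
    change = (contender - baseline) * 100.0 / baseline
    # For time-like units (e.g. ns) an increase is a slowdown, so flip the sign.
    return -change if less_is_better else change


# Hypothetical results: throughput drops from 200 to 150 MB/s.
print(percent_change(200.0, 150.0))
# Hypothetical results: run time grows from 1000 to 1100 ns.
print(percent_change(1000.0, 1100.0, less_is_better=True))
```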

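Python

[Plots not reproduced here: one percent-change bar chart each for dataframe-to-table, dataset-filter, dataset-read, dataset-select, dataset-selectivity, file-read, file-write, recursive-get-file-info, wide-dataframe]

R

[Plots not reproduced here: one percent-change bar chart each for dataframe-to-table, file-read, file-write, partitioned-dataset-filter, tpch]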

Microbenchmarks

There are currently 3352 microbenchmarks in the Arrow benchmarks. The following comparisons are also available to be viewed in the Conbench UI.

| Language | z-score threshold | Stable | Improvements | Regressions | Total microbenchmarks |
|---|---|---|---|---|---|
| C++ | 5 | 2733 | 321 | 263 | 3317 |
| Java | 5 | 26 | 5 | 4 | 35 |
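The counts in the table above follow from comparing each z-score with the threshold. A minimal sketch, assuming a result counts as an improvement above +5, a regression below −5, and stable otherwise; how a value exactly at the threshold is treated is a guess, and the function names are hypothetical.

```python
from collections import Counter

Z_THRESHOLD = 5.0  # threshold listed in the table above


def classify(z, threshold=Z_THRESHOLD):
    """Label one z-score as improvement, regression, or stable."""
    if z > threshold:
        return "improvement"
    if z < -threshold:
        return "regression"
    return "stable"  # boundary values are treated as stable (assumption)


def summarize(z_scores, threshold=Z_THRESHOLD):
    """Count how many results fall into each category."""
    return Counter(classify(z, threshold) for z in z_scores)


# Hypothetical z-scores for a handful of microbenchmarks.
print(summarize([0.3, -1.2, 7.5, -25.9, 4.9, 164.4]))
```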

Because of the large number of benchmarks, only the 20 results that deviate most from the baseline in each direction (negative and positive) are presented below. All microbenchmark results for this comparison can be explored interactively in the microbenchmark explorer.

| Benchmark | Language | Params | z-score | Percent Change | Baseline result | Contender result | Unit [1] |
|---|---|---|---|---|---|---|---|
| arrow-acero-aggregate-benchmark | C++ | 1048576/1 | −25.92 | −34.23% | 287,200 | 188,900 | MB/s |
| arrow-acero-expression-benchmark | C++ | <ComplexExpressionBaseline>/rows_per_batch:10000/real_time/threads:1 | −29.06 | −8.30% | 1,419,000 | 1,537,000 | ns |
| arrow-acero-expression-benchmark | C++ | <ComplexExpressionBaseline>/rows_per_batch:100000/real_time/threads:1 | −25.43 | −7.88% | 1,412,000 | 1,523,000 | ns |
| arrow-compute-function-benchmark | C++ | 1024 | −35.50 | −22.68% | 12,930 | 9,998 | i/s |
| arrow-compute-function-benchmark | C++ | 16384 | −36.12 | −19.64% | 172,800 | 138,900 | i/s |
| arrow-compute-function-benchmark | C++ | 4096 | −42.24 | −22.44% | 49,840 | 38,650 | i/s |
| arrow-compute-vector-hash-benchmark | C++ | 0 | −26.86 | −14.74% | 837 | 714 | MB/s |
| arrow-compute-vector-selection-benchmark | C++ | 524288/3 | −50.81 | −37.39% | 3,339 | 2,090 | MB/s |
| arrow-compute-vector-sort-benchmark | C++ | 32768/10 | −25.61 | −13.03% | 42 | 37 | MB/s |
| arrow-io-file-benchmark | C++ | real_time | −26.24 | −12.75% | 245 | 214 | MB/s |
| arrow-small-vector-benchmark | C++ | <SMALL_VECTOR(std::string)> | −248.30 | −16.17% | 126,400,000 | 106,000,000 | i/s |
| arrow-small-vector-benchmark | C++ | <SMALL_VECTOR(int)> | −121.30 | −6.15% | 875,300,000 | 821,400,000 | i/s |
| arrow-value-parsing-benchmark | C++ | <UInt16Type> | −29.58 | −4.49% | 162,600,000 | 155,300,000 | i/s |
| parquet-encoding-benchmark | C++ | 4096 | −47.59 | −15.93% | 523 | 439 | MB/s |
| parquet-encoding-benchmark | C++ | 65536 | −67.53 | −36.49% | 38,020 | 24,150 | MB/s |
| parquet-encoding-benchmark | C++ | 65536 | −68.45 | −36.52% | 38,000 | 24,130 | MB/s |
| parquet-encoding-benchmark | C++ | 65536 | −52.23 | −34.82% | 35,680 | 23,260 | MB/s |
| parquet-encoding-benchmark | C++ | 65536 | −53.58 | −35.01% | 35,790 | 23,260 | MB/s |
| parquet-encoding-benchmark | C++ | 65536 | −58.99 | −34.94% | 35,740 | 23,250 | MB/s |
| arrow.memory.ArrowBufBenchmarks | Java | source=java-micro, suite=arrow.memory.ArrowBufBenchmarks | −57.60 | −47.68% | 47,800 | 25,010 | i/s |

[1] MB/s = megabytes per second; ns = nanoseconds; i/s = iterations per second
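| Benchmark | Language | Params | z-score | Percent Change | Baseline result | Contender result | Unit [1] |
|---|---|---|---|---|---|---|---|
| arrow-compute-scalar-if-else-benchmark | C++ | 0 | 164.40 | 44.49% | 5,058 | 7,309 | MB/s |
| arrow-compute-vector-selection-benchmark | C++ | 524288/2 | 494.30 | 320.2% | 12,760 | 53,640 | MB/s |
| arrow-compute-vector-selection-benchmark | C++ | 524288/1 | 102.40 | 91.97% | 1,420 | 2,726 | MB/s |
| arrow-compute-vector-selection-benchmark | C++ | 524288/2 | 108.40 | 223.4% | 23,500 | 75,990 | MB/s |
| arrow-io-memory-benchmark | C++ | threads:1/real_time | 175.70 | 73.86% | 6,518 | 11,330 | MB/s |
| arrow-io-memory-benchmark | C++ | threads:2/real_time | 278.50 | 72.14% | 6,618 | 11,390 | MB/s |
| arrow-io-memory-benchmark | C++ | threads:4/real_time | 436.70 | 70.62% | 6,420 | 10,950 | MB/s |
| arrow-io-memory-benchmark | C++ | threads:6/real_time | 257.80 | 63.71% | 6,289 | 10,300 | MB/s |
| parquet-encoding-benchmark | C++ | 32768 | 347.30 | 29.85% | 2,374 | 3,083 | MB/s |
| parquet-encoding-benchmark | C++ | 65536 | 604.30 | 29.76% | 2,371 | 3,076 | MB/s |
| parquet-encoding-benchmark | C++ | 1024 | 115.70 | 114.2% | 1,510 | 3,236 | MB/s |
| parquet-encoding-benchmark | C++ | 4096 | 137.40 | 115.1% | 1,513 | 3,255 | MB/s |
| parquet-encoding-benchmark | C++ | 1024 | 1,343.00 | 95.31% | 2,097 | 4,096 | MB/s |
| parquet-encoding-benchmark | C++ | 32768 | 1,796.00 | 102.4% | 1,992 | 4,033 | MB/s |
| parquet-encoding-benchmark | C++ | 4096 | 1,887.00 | 101.8% | 1,997 | 4,032 | MB/s |
| parquet-encoding-benchmark | C++ | 65536 | 2,123.00 | 102.2% | 1,992 | 4,029 | MB/s |
| parquet-encoding-benchmark | C++ | 1024 | 118.90 | 182.8% | 1,473 | 4,167 | MB/s |
| parquet-encoding-benchmark | C++ | 4096 | 125.30 | 183.3% | 1,481 | 4,196 | MB/s |
| arrow.vector.VectorUnloaderBenchmark | Java | source=java-micro, suite=arrow.vector.VectorUnloaderBenchmark | 8,848.00 | 22,100% | 4,326 | 960,100 | i/s |
| arrow.vector.ipc.message.ArrowRecordBatchBenchmarks | Java | source=java-micro, suite=arrow.vector.ipc.message.ArrowRecordBatchBenchmarks | 8,674.00 | 19,660% | 39,410 | 7,786,000 | i/s |

[1] MB/s = megabytes per second; ns = nanoseconds; i/s = iterations per second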
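Selecting the results that deviate most from the baseline in each direction, as in the two tables above, could look roughly like this. It is a sketch under assumptions: results are represented as dictionaries with a hypothetical `z_score` key, and ranking is by z-score alone; the report does not describe how its own tables were produced.

```python
def most_extreme(results, n=20):
    """Return (regressions, improvements): the n most negative and n most positive z-scores."""
    ordered = sorted(results, key=lambda r: r["z_score"])
    # Most negative first.
    regressions = [r for r in ordered if r["z_score"] < 0][:n]
    # Most positive first.
    improvements = [r for r in reversed(ordered) if r["z_score"] > 0][:n]
    return regressions, improvements


# Hypothetical results; only the z-score matters for the ranking.
results = [
    {"name": "a", "z_score": -50.8},
    {"name": "b", "z_score": 3.1},
    {"name": "c", "z_score": 494.3},
    {"name": "d", "z_score": -0.4},
]
worst, best = most_extreme(results, n=1)
print(worst[0]["name"], best[0]["name"])
```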

z-score distribution

Plotting the distribution of z-scores for all microbenchmark results helps identify systematic differences between the baseline and the contender. The shape of the distribution gives a sense of the contender's overall performance relative to the baseline. A narrow distribution centered on 0 indicates that the contender performs similarly to the baseline. A wider distribution indicates that the contender performs differently, with a left skew indicating regressions and a right skew indicating improvements.
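The same reading of the distribution can be approximated numerically. The sketch below is an illustration with hypothetical names and data: it reports the center, the spread, and a simple moment-based skewness, where a negative skew corresponds to a tail of regressions and a positive skew to a tail of improvements.

```python
from statistics import mean, median, pstdev


def describe(z_scores):
    """Summarize center, spread, and skew of a list of z-scores."""
    mu = mean(z_scores)
    sigma = pstdev(z_scores)  # population standard deviation
    if sigma == 0:
        skew = 0.0
    else:
        # Third standardized moment: > 0 for a right tail, < 0 for a left tail.
        skew = mean([((z - mu) / sigma) ** 3 for z in z_scores])
    return {"mean": mu, "median": median(z_scores), "stdev": sigma, "skew": skew}


# Hypothetical z-scores: mostly near 0 with one large improvement.
summary = describe([-1.0, 0.0, 0.0, 1.0, 10.0])
print(summary["median"], summary["skew"] > 0)
```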

Microbenchmark explorer

The microbenchmark explorer lets you filter the microbenchmark results by language, suite, and benchmark name, and toggle regressions and improvements based on a threshold of 5 z-scores. A language, a suite, and a benchmark name must all be selected before a benchmark plot is shown. Additional benchmark parameters are displayed on the vertical axis, so each bar represents one case permutation. If a benchmark has no additional parameters, the full case permutation string is displayed. Clicking a bar opens the Conbench UI page for that benchmark, which provides additional history and metadata for that case permutation.
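The filtering described above can be sketched in a few lines. This is not the explorer's implementation: the record keys (`language`, `suite`, `name`, `z_score`) and the function name are hypothetical, and the toggles are assumed to hide results beyond the threshold in the corresponding direction.

```python
def select_cases(results, language, suite, name, threshold=5.0,
                 show_regressions=True, show_improvements=True):
    """Return the case permutations that match the selection and the toggles."""
    selected = []
    for r in results:
        if (r["language"], r["suite"], r["name"]) != (language, suite, name):
            continue
        if r["z_score"] < -threshold and not show_regressions:
            continue
        if r["z_score"] > threshold and not show_improvements:
            continue
        selected.append(r)
    return selected


# Hypothetical records for one benchmark with three case permutations.
records = [
    {"language": "C++", "suite": "s", "name": "b", "params": "1024", "z_score": -35.5},
    {"language": "C++", "suite": "s", "name": "b", "params": "4096", "z_score": 0.7},
    {"language": "C++", "suite": "s", "name": "b", "params": "16384", "z_score": 12.0},
]
visible = select_cases(records, "C++", "s", "b", show_regressions=False)
print([r["params"] for r in visible])
```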