pyarrow.compute.rank

pyarrow.compute.rank(input, /, sort_keys='ascending', *, null_placement='at_end', tiebreaker='first', options=None, memory_pool=None)

Compute ordinal ranks of an array (1-based).

This function computes the rank of each element of the input array. By default, null values are considered greater than any other value and are therefore sorted at the end of the input. For floating-point types, NaNs are considered greater than any other non-null value, but smaller than null values. By default, ties are broken by assigning ranks in the order in which the tied values appear in the input.

The handling of nulls, NaNs and tiebreakers can be changed in RankOptions.

Parameters:
input : Array-like or scalar-like

Argument to compute function.

sort_keys : sequence of (name, order) tuples or str, default “ascending”

Names of field/column keys to sort the input on, along with the order each field/column is sorted in. Accepted values for order are “ascending”, “descending”. The field name can be a string column name or expression. Alternatively, one can simply pass “ascending” or “descending” as a string if the input is array-like.

null_placement : str, default “at_end”

Where nulls in input should be sorted. Accepted values are “at_start”, “at_end”.

tiebreaker : str, default “first”

Configure how ties between equal values are handled. Accepted values are:

  • “min”: Ties get the smallest possible rank in sorted order.

  • “max”: Ties get the largest possible rank in sorted order.

  • “first”: Ranks are assigned in order of when ties appear in the input. This ensures the ranks are a stable permutation of the input.

  • “dense”: The ranks span a dense [1, M] interval where M is the number of distinct values in the input.

options : pyarrow.compute.RankOptions, optional

Alternative way of passing options.

memory_pool : pyarrow.MemoryPool, optional

If not passed, will allocate memory from the default memory pool.