Extension#

Extension Type classes#

class Bool8Type : public arrow::ExtensionType#

Bool8 is an alternate representation for boolean arrays using 8 bits instead of 1 bit per value.

The underlying storage type is int8.
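
As a rough illustration of the 8-bit versus 1-bit trade-off, the sketch below packs int8-backed booleans into a 1-bit-per-value bitmap. It uses only the C++ standard library, not the Arrow API, and the helper names Bool8ToBool, PackBool8 and GetBit are hypothetical. It assumes the convention from the canonical extension specification that 0 means false and any non-zero value means true, and it assumes least-significant-bit-first bitmap order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interpret one int8 storage value as a boolean (assumed convention:
// 0 is false, any non-zero value is true).
inline bool Bool8ToBool(std::int8_t value) { return value != 0; }

// Pack int8-backed booleans into a bitmap with one bit per value,
// least-significant bit first within each byte.
std::vector<std::uint8_t> PackBool8(const std::vector<std::int8_t>& values) {
  std::vector<std::uint8_t> bitmap((values.size() + 7) / 8, 0);
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (Bool8ToBool(values[i])) {
      bitmap[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
    }
  }
  return bitmap;
}

// Read bit i back out of a packed bitmap.
bool GetBit(const std::vector<std::uint8_t>& bitmap, std::size_t i) {
  assert(i / 8 < bitmap.size());
  return ((bitmap[i / 8] >> (i % 8)) & 1u) != 0;
}
```

Nine int8 values occupy nine bytes in Bool8 storage but only two bytes once packed; the price of the packed form is the shift-and-mask work on every access.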

Public Functions

inline Bool8Type()#

Construct a Bool8Type.

inline virtual std::string extension_name() const override#

Unique name of extension type used to identify type for serialization.

Returns:

the string name of the extension

virtual std::string ToString(bool show_metadata = false) const override#

A string representation of the type, including any children.

virtual bool ExtensionEquals(const ExtensionType &other) const override#

Determine whether two instances of the same extension type are equal.

Invoked from ExtensionType::Equals.

Parameters:

other[in] the type to compare this type with

Returns:

true if the type instances are equal

virtual std::string Serialize() const override#

Create a serialized representation of the extension type’s metadata.

The storage type is handled automatically in IPC code paths.

Returns:

the serialized representation

virtual Result<std::shared_ptr<DataType>> Deserialize(std::shared_ptr<DataType> storage_type, const std::string &serialized_data) const override#

Create an instance of the ExtensionType given the actual storage type and the serialized representation.

Parameters:
  • storage_type[in] the physical storage type of the extension

  • serialized_data[in] the serialized representation produced by Serialize

virtual std::shared_ptr<Array> MakeArray(std::shared_ptr<ArrayData> data) const override#

Create a Bool8Array from ArrayData.

class FixedShapeTensorType : public arrow::ExtensionType#

Concrete type class for constant-size Tensor data.

This is a canonical Arrow extension type. See: https://arrow.apache.org/docs/format/CanonicalExtensions.html

Public Functions

inline virtual std::string extension_name() const override#

Unique name of extension type used to identify type for serialization.

Returns:

the string name of the extension

virtual std::string ToString(bool show_metadata = false) const override#

A string representation of the type, including any children.

inline size_t ndim() const#

Number of dimensions of tensor elements.

inline const std::vector<int64_t> &shape() const#

Shape of tensor elements.

inline const std::shared_ptr<DataType> &value_type() const#

Value type of tensor elements.

const std::vector<int64_t> &strides()#

Strides of tensor elements.

Strides give the offset in bytes between adjacent elements along each dimension. If the permutation is non-empty, strides are computed from the permuted shape of the tensor element.
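
To make the stride definition concrete, here is a minimal standard-library sketch of row-major byte strides, with an optional step that permutes the shape first. It is not Arrow's implementation: the helpers RowMajorStrides and PermuteShape are hypothetical, and the reading that the permuted shape is result[i] = shape[permutation[i]] is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-major (C-order) strides in bytes: the last dimension is contiguous,
// and each earlier stride is the next stride times the next extent.
std::vector<std::int64_t> RowMajorStrides(const std::vector<std::int64_t>& shape,
                                          std::int64_t byte_width) {
  std::vector<std::int64_t> strides(shape.size());
  std::int64_t stride = byte_width;
  for (std::size_t i = shape.size(); i-- > 0;) {
    strides[i] = stride;
    stride *= shape[i];
  }
  return strides;
}

// Reorder a shape so that result[i] = shape[permutation[i]] (assumed reading).
std::vector<std::int64_t> PermuteShape(const std::vector<std::int64_t>& shape,
                                       const std::vector<std::int64_t>& permutation) {
  assert(permutation.size() == shape.size());
  std::vector<std::int64_t> out(shape.size());
  for (std::size_t i = 0; i < shape.size(); ++i) {
    out[i] = shape[static_cast<std::size_t>(permutation[i])];
  }
  return out;
}
```

For a shape of {2, 3, 4} with 4-byte values, RowMajorStrides yields {48, 16, 4}. Applying the permutation {2, 0, 1} first gives the shape {4, 2, 3} and strides {24, 12, 4}.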

inline const std::vector<int64_t> &permutation() const#

Permutation mapping from logical to physical memory layout of tensor elements.

inline const std::vector<std::string> &dim_names() const#

Dimension names of tensor elements. Dimensions are ordered physically.

virtual bool ExtensionEquals(const ExtensionType &other) const override#

Determine whether two instances of the same extension type are equal.

Invoked from ExtensionType::Equals.

Parameters:

other[in] the type to compare this type with

Returns:

true if the type instances are equal

virtual std::string Serialize() const override#

Create a serialized representation of the extension type’s metadata.

The storage type is handled automatically in IPC code paths.

Returns:

the serialized representation
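
The canonical extension specification describes the FixedShapeTensor metadata as a JSON object with a "shape" key and optional "permutation" and "dim_names" keys. The sketch below builds such a string with the standard library only. The function names are hypothetical, and the key order and the absence of whitespace are assumptions that may differ from what Serialize() actually emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Render integers as a JSON array, e.g. [2,0,1].
std::string JsonIntArray(const std::vector<std::int64_t>& values) {
  std::string out = "[";
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (i > 0) out += ",";
    out += std::to_string(values[i]);
  }
  return out + "]";
}

// Render names as a JSON array of strings.
// Simplification: assumes no name contains a character that needs escaping.
std::string JsonStringArray(const std::vector<std::string>& values) {
  std::string out = "[";
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (i > 0) out += ",";
    out += "\"" + values[i] + "\"";
  }
  return out + "]";
}

// Build the metadata object; optional keys are omitted when empty.
std::string TensorMetadataJson(const std::vector<std::int64_t>& shape,
                               const std::vector<std::int64_t>& permutation = {},
                               const std::vector<std::string>& dim_names = {}) {
  assert(permutation.empty() || permutation.size() == shape.size());
  assert(dim_names.empty() || dim_names.size() == shape.size());
  std::string out = "{\"shape\":" + JsonIntArray(shape);
  if (!permutation.empty()) out += ",\"permutation\":" + JsonIntArray(permutation);
  if (!dim_names.empty()) out += ",\"dim_names\":" + JsonStringArray(dim_names);
  return out + "}";
}
```

Only the metadata is encoded here; the value type travels with the storage type, which the IPC code paths handle separately.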

virtual Result<std::shared_ptr<DataType>> Deserialize(std::shared_ptr<DataType> storage_type, const std::string &serialized_data) const override#

Create an instance of the ExtensionType given the actual storage type and the serialized representation.

Parameters:
  • storage_type[in] the physical storage type of the extension

  • serialized_data[in] the serialized representation produced by Serialize

virtual std::shared_ptr<Array> MakeArray(std::shared_ptr<ArrayData> data) const override#

Create a FixedShapeTensorArray from ArrayData.

Public Static Functions

static Result<std::shared_ptr<Tensor>> MakeTensor(const std::shared_ptr<ExtensionScalar> &scalar)#

Create a Tensor from an ExtensionScalar taken from a FixedShapeTensorArray.

This method returns a Tensor built from the ExtensionScalar, with strides derived from the shape and permutation of the FixedShapeTensorType. Shape and dim_names are permuted according to the permutation stored in the FixedShapeTensorType metadata.

static Result<std::shared_ptr<DataType>> Make(const std::shared_ptr<DataType> &value_type, const std::vector<int64_t> &shape, const std::vector<int64_t> &permutation = {}, const std::vector<std::string> &dim_names = {})#

Create a FixedShapeTensorType instance.

class OpaqueType : public arrow::ExtensionType#

Opaque is a placeholder for a type from an external (usually non-Arrow) system that could not be interpreted.

Public Functions

inline explicit OpaqueType(std::shared_ptr<DataType> storage_type, std::string type_name, std::string vendor_name)#

Construct an OpaqueType.

Parameters:
  • storage_type[in] The underlying storage type. Should be arrow::null if there is no data.

  • type_name[in] The name of the type in the external system.

  • vendor_name[in] The name of the external system.

inline virtual std::string extension_name() const override#

Unique name of extension type used to identify type for serialization.

Returns:

the string name of the extension

virtual std::string ToString(bool show_metadata) const override#

A string representation of the type, including any children.

virtual bool ExtensionEquals(const ExtensionType &other) const override#

Determine whether two instances of the same extension type are equal.

Invoked from ExtensionType::Equals.

Parameters:

other[in] the type to compare this type with

Returns:

true if the type instances are equal

virtual std::string Serialize() const override#

Create a serialized representation of the extension type’s metadata.

The storage type is handled automatically in IPC code paths.

Returns:

the serialized representation

virtual Result<std::shared_ptr<DataType>> Deserialize(std::shared_ptr<DataType> storage_type, const std::string &serialized_data) const override#

Create an instance of the ExtensionType given the actual storage type and the serialized representation.

Parameters:
  • storage_type[in] the physical storage type of the extension

  • serialized_data[in] the serialized representation produced by Serialize

virtual std::shared_ptr<Array> MakeArray(std::shared_ptr<ArrayData> data) const override#

Create an OpaqueArray from ArrayData.

class JsonExtensionType : public arrow::ExtensionType#

Concrete type class for variable-size, UTF-8 encoded JSON data.

Public Functions

inline virtual std::string extension_name() const override#

Unique name of extension type used to identify type for serialization.

Returns:

the string name of the extension

virtual bool ExtensionEquals(const ExtensionType &other) const override#

Determine whether two instances of the same extension type are equal.

Invoked from ExtensionType::Equals.

Parameters:

other[in] the type to compare this type with

Returns:

true if the type instances are equal

virtual Result<std::shared_ptr<DataType>> Deserialize(std::shared_ptr<DataType> storage_type, const std::string &serialized_data) const override#

Create an instance of the ExtensionType given the actual storage type and the serialized representation.

Parameters:
  • storage_type[in] the physical storage type of the extension

  • serialized_data[in] the serialized representation produced by Serialize

virtual std::string Serialize() const override#

Create a serialized representation of the extension type’s metadata.

The storage type is handled automatically in IPC code paths.

Returns:

the serialized representation

virtual std::shared_ptr<Array> MakeArray(std::shared_ptr<ArrayData> data) const override#

Wrap built-in Array type in a user-defined ExtensionArray instance.

Parameters:

data[in] the physical storage for the extension type

class UuidType : public arrow::ExtensionType#

UuidType is a canonical Arrow extension type for UUIDs.

UUIDs are stored as FixedSizeBinary(16) in big-endian byte order; the type does not interpret the bytes in any way. No specific UUID version is required or guaranteed.
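
Because the 16 bytes are stored in big-endian order, they can be rendered left to right as the usual 8-4-4-4-12 hexadecimal text form. The following is a minimal standard-library sketch of that rendering; FormatUuid is a hypothetical helper and not part of the Arrow API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Format 16 big-endian UUID bytes as xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.
std::string FormatUuid(const std::array<std::uint8_t, 16>& bytes) {
  static const char kHex[] = "0123456789abcdef";
  std::string out;
  out.reserve(36);
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    // Group boundaries fall after bytes 4, 6, 8 and 10.
    if (i == 4 || i == 6 || i == 8 || i == 10) out.push_back('-');
    out.push_back(kHex[bytes[i] >> 4]);
    out.push_back(kHex[bytes[i] & 0x0F]);
  }
  assert(out.size() == 36);
  return out;
}
```

The function never inspects the version or variant bits, which mirrors the statement that the type does not interpret the bytes.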

Public Functions

inline UuidType()#

Construct a UuidType.

inline virtual std::string extension_name() const override#

Unique name of extension type used to identify type for serialization.

Returns:

the string name of the extension

virtual std::string ToString(bool show_metadata = false) const override#

A string representation of the type, including any children.

virtual bool ExtensionEquals(const ExtensionType &other) const override#

Determine whether two instances of the same extension type are equal.

Invoked from ExtensionType::Equals.

Parameters:

other[in] the type to compare this type with

Returns:

true if the type instances are equal

virtual std::shared_ptr<Array> MakeArray(std::shared_ptr<ArrayData> data) const override#

Create a UuidArray from ArrayData.

virtual Result<std::shared_ptr<DataType>> Deserialize(std::shared_ptr<DataType> storage_type, const std::string &serialized) const override#

Create an instance of the ExtensionType given the actual storage type and the serialized representation.

Parameters:
  • storage_type[in] the physical storage type of the extension

  • serialized[in] the serialized representation produced by Serialize

inline virtual std::string Serialize() const override#

Create a serialized representation of the extension type’s metadata.

The storage type is handled automatically in IPC code paths.

Returns:

the serialized representation

Public Static Functions

static inline Result<std::shared_ptr<DataType>> Make()#

Create a UuidType instance.

Extension Array classes#

class Bool8Array : public arrow::ExtensionArray#

Bool8 is an alternate representation for boolean arrays using 8 bits instead of 1 bit per value.

The underlying storage type is int8.

Public Functions

explicit ExtensionArray(const std::shared_ptr<ArrayData> &data)#

Construct an ExtensionArray from an ArrayData.

The ArrayData must have the right ExtensionType.

ExtensionArray(const std::shared_ptr<DataType> &type, const std::shared_ptr<Array> &storage)#

Construct an ExtensionArray from a type and the underlying storage.

class FixedShapeTensorArray : public arrow::ExtensionArray#

Public Functions

const Result<std::shared_ptr<Tensor>> ToTensor() const#

Create a Tensor from FixedShapeTensorArray.

This method creates a Tensor from a FixedShapeTensorArray. The first dimension of the Tensor has length equal to the FixedShapeTensorArray’s length, and the remaining dimensions follow the FixedShapeTensorType’s shape. Shape and dim_names are permuted according to the permutation stored in the FixedShapeTensorType metadata.

explicit ExtensionArray(const std::shared_ptr<ArrayData> &data)#

Construct an ExtensionArray from an ArrayData.

The ArrayData must have the right ExtensionType.

ExtensionArray(const std::shared_ptr<DataType> &type, const std::shared_ptr<Array> &storage)#

Construct an ExtensionArray from a type and the underlying storage.

Public Static Functions

static Result<std::shared_ptr<FixedShapeTensorArray>> FromTensor(const std::shared_ptr<Tensor> &tensor)#

Create a FixedShapeTensorArray from a Tensor.

This method creates a FixedShapeTensorArray from a Tensor, taking the first dimension as the number of elements in the resulting array and the remaining dimensions as the shape of the individual tensors. If the Tensor provides strides, they are used to determine the dimension permutation. Otherwise, row-major layout (i.e. no permutation) is assumed.

Parameters:

tensor[in] The Tensor to convert to a FixedShapeTensorArray
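
The shape split that FromTensor describes can be sketched with the standard library alone. The struct TensorArrayLayout and the function SplitTensorShape are hypothetical names. The values_per_element field assumes, following the canonical extension specification, that each element is stored as a fixed-size list whose size is the product of the element shape. Permutation handling is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// How a Tensor of shape {N, d1, d2, ...} maps onto an array of N tensors.
struct TensorArrayLayout {
  std::int64_t length;                      // number of array elements (N)
  std::vector<std::int64_t> element_shape;  // shape of each element {d1, d2, ...}
  std::int64_t values_per_element;          // product of element_shape
};

TensorArrayLayout SplitTensorShape(const std::vector<std::int64_t>& tensor_shape) {
  assert(!tensor_shape.empty());
  TensorArrayLayout layout;
  layout.length = tensor_shape.front();
  layout.element_shape.assign(tensor_shape.begin() + 1, tensor_shape.end());
  layout.values_per_element =
      std::accumulate(layout.element_shape.begin(), layout.element_shape.end(),
                      std::int64_t{1}, std::multiplies<std::int64_t>());
  return layout;
}
```

A Tensor of shape {10, 2, 3} therefore corresponds to an array of 10 elements, each a 2 by 3 tensor holding 6 values.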

class OpaqueArray : public arrow::ExtensionArray#

Opaque is a wrapper for (usually binary) data from an external (often non-Arrow) system that could not be interpreted.

Public Functions

explicit ExtensionArray(const std::shared_ptr<ArrayData> &data)#

Construct an ExtensionArray from an ArrayData.

The ArrayData must have the right ExtensionType.

ExtensionArray(const std::shared_ptr<DataType> &type, const std::shared_ptr<Array> &storage)#

Construct an ExtensionArray from a type and the underlying storage.

class UuidArray : public arrow::ExtensionArray#

UuidArray stores an array of UUIDs.

The underlying storage type is FixedSizeBinary(16).

Public Functions

explicit ExtensionArray(const std::shared_ptr<ArrayData> &data)#

Construct an ExtensionArray from an ArrayData.

The ArrayData must have the right ExtensionType.

ExtensionArray(const std::shared_ptr<DataType> &type, const std::shared_ptr<Array> &storage)#

Construct an ExtensionArray from a type and the underlying storage.