Skip to content

Migration Guides

Lance aims to avoid breaking changes when possible. Currently, we are refining the Rust public API so that we can move it out of experimental status and make stronger commitments to backwards compatibility. The python API is considered stable and breaking changes should generally be communicated (via warnings) for 1-2 months prior to being finalized to give users a chance to migrate. This page documents the breaking changes between releases and gives advice on how to migrate.

5.0.0

  • The default data storage version changed from 2.0 to 2.1. This affects the column_indices field in the DataFile protobuf message. In 2.0, every field (including non-leaf fields like struct containers and list containers) was assigned a sequential column index. In 2.1, non-leaf fields (unpacked structs, list containers) are assigned -1 instead since their validity information is now folded into repetition/definition levels. Only leaf fields and packed structs are assigned column indices.

    For example, given the schema:

    x: i32, y: [f32], z: { a: i32 }
    

    The fields (in depth-first order) are:

    Field ID Field
    0 x (i32)
    1 y (list)
    2 y.item (f32)
    3 z (struct)
    4 z.a (i32)

    In 2.0, column_indices = [0, 1, 2, 3, 4] — every field gets a column.

    In 2.1, column_indices = [0, -1, 1, -1, 2] — non-leaf fields (y and z) get -1.

  • This change only affects advanced users who construct DataFile messages directly, for example when building operations by hand for Dataset.commit. Normal read and write paths are unaffected.

  • To opt back to 2.0 format, set data_storage_version="2.0" when creating a dataset.

1.0.0

  • The SearchResult returned by scalar indices must now output information about null values. Instead of containing a RowIdTreeMap, it now contains a NullableRowIdSet. Expressions that resolve to null values must be included in search results in the null set. This ensures that NOT can be applied to index search results correctly.

0.39

  • The lance crate no longer re-exports utilities from lance-arrow such as RecordBatchExt or SchemaExt. In the short term, if you are relying on these utilities, you can add a dependency on the lance-arrow crate. However, we do not expect lance-arrow to ever be stable, and you may want to consider forking these utilities.

  • Previously, we exported Error and Result as both lance::Error and lance::error::Error. We have now reduced this to just lance::Error. We have also removed some internal error utilities (such as OptionExt) from the public API and do not plan on reintroducing these.

  • The Python and Rust dataset::diff_meta API has been removed in favor of dataset::delta, which returns a DatasetDelta that offers both metadata diff through list_transactions and data diff through get_inserted_rows and get_updated_rows.

  • Some other minor utilities which had previously been public are now private. It is unlikely anyone was utilizing' these. Please open an issue if you were relying on any of these.

  • The lance-namespace Rust crate now splits into lance-namespace that contains the main LanceNamespace trait and data models, and lance-namespace-impls that has different implementations of the namespace. The DirectoryNamespace and RestNamespace interfaces have been refactored to be more user friendly. The DirectoryNamespace also now uses Lance ObjectStore for IO instead of directly depending on Apache OpenDAL.