Manage Tags¶
Lance, much like Git, employs the LanceDataset.tags
property to label specific versions within a dataset's history.
Tags
are particularly useful for tracking the evolution of datasets,
especially in machine learning workflows where datasets are frequently updated.
For example, you can create
, update
,
and delete
or list
tags.
Note
Creating or deleting tags does not generate new dataset versions. Tags exist as auxiliary metadata stored in a separate directory.
import lance
ds = lance.dataset("./tags.lance")
print(len(ds.versions()))
# 2
print(ds.tags.list())
# {}
ds.tags.create("v1-prod", 1)
print(ds.tags.list())
# {'v1-prod': {'version': 1, 'manifest_size': ...}}
ds.tags.update("v1-prod", 2)
print(ds.tags.list())
# {'v1-prod': {'version': 2, 'manifest_size': ...}}
ds.tags.delete("v1-prod")
print(ds.tags.list())
# {}
print(ds.tags.list_ordered())
# []
ds.tags.create("v1-prod", 1)
print(ds.tags.list_ordered())
# [('v1-prod', {'version': 1, 'manifest_size': ...})]
ds.tags.update("v1-prod", 2)
print(ds.tags.list_ordered())
# [('v1-prod', {'version': 2, 'manifest_size': ...})]
ds.tags.delete("v1-prod")
print(ds.tags.list_ordered())
# []
Note
Tagged versions are exempted from the LanceDataset.cleanup_old_versions()
process.
To remove a version that has been tagged, you must first LanceDataset.tags.delete()
the associated tag.