
Technical Note on Catalog Uid Handling

Specifications of catalog uid handling
  • Last Update: 2016-06-28
  • Version: 001
  • Language: en

This specification documents the conditions and rules for allocating uids and keeping the catalog consistent. As is the nature of a specification, this document does not state whether a given rule is already followed, not yet implemented, or violated. It only describes how things should be.

Preconditions

  • Every ERP5 document (basically, an object which has a portal type) must be assigned its own uid as soon as possible; that is, a new document should get a uid right after being created. We therefore assume that every document always has a uid, except for a short period while it is being created. More precisely, the API Folder.newContent must return a document with a uid already assigned.
  • Uids are unique within a single site. (Here "uids" refers only to the uids used for the catalog; they are different from other uids used in ERP5, for example, message uids in CMFActivity.) In other words, all documents under an ERP5 site must have different uids. This requirement is stricter than the one on oids in ZODB, which may overlap among different objects under a single ERP5 site, because the ERP5 site may mount multiple ZODB storages.
  • Uids are unsigned 64-bit integers.
  • Uids must be kept inside ZODB so that a site can be rebuilt from ZODB alone.
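
As an illustration of the range and uniqueness preconditions, the following Python sketch checks a collection of (path, uid) pairs. The names UID_MAX, is_valid_uid and find_duplicate_uids are hypothetical and are not part of ERP5; the sketch only assumes that uids are plain integers.

```python
from collections import defaultdict

UID_MAX = 2**64 - 1  # largest unsigned 64-bit integer


def is_valid_uid(uid):
    """Return True if uid is an integer in the unsigned 64-bit range."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(uid, int) and not isinstance(uid, bool) and 0 <= uid <= UID_MAX


def find_duplicate_uids(path_uid_pairs):
    """Map every uid used by more than one path to the sorted list of those paths."""
    paths_by_uid = defaultdict(list)
    for path, uid in path_uid_pairs:
        paths_by_uid[uid].append(path)
    return {
        uid: sorted(paths)
        for uid, paths in paths_by_uid.items()
        if len(paths) > 1
    }
```

An empty result from find_duplicate_uids means the uniqueness precondition holds for the pairs that were checked.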

Components

We assume that the following components are involved in uid handling:

Component  Description
Document   is assigned a uid and stores its uid in ZODB
Catalog    records the path and the uid of every Document
Generator  produces a uid for a new Document, and may pool uids that are not yet used

API

Here the API is defined in a very abstract way, as it can be implemented in various ways for optimization or convenience. The API requires the following methods:

Method              Description
Generator.allocate  allocates an unused uid for a Document
Generator.reset     resets information on which uids are in use
Catalog.catalog     records the path and the uid of a Document
Catalog.uncatalog   deletes the record of a Document, if any
Document.store      stores a uid allocated for itself
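
One way to write this abstract API down is as a set of Python abstract base classes. This is only a sketch: the class layout and the argument lists are assumptions made for illustration (the last_uid parameter of reset follows the practical definition given in the next section), and none of this is taken from the ERP5 code base.

```python
from abc import ABC, abstractmethod


class Generator(ABC):
    """Produces uids for new Documents."""

    @abstractmethod
    def allocate(self):
        """Return an unused uid for a Document."""

    @abstractmethod
    def reset(self, last_uid):
        """Reset the information on which uids are in use."""


class Catalog(ABC):
    """Records the path and the uid of every Document."""

    @abstractmethod
    def catalog(self, document):
        """Record the path and the uid of a Document."""

    @abstractmethod
    def uncatalog(self, document):
        """Delete the record of a Document, if any."""


class Document(ABC):
    """An object that is assigned a uid."""

    @abstractmethod
    def store(self, uid):
        """Store a uid allocated for this Document."""
```

Because every method is abstract, a concrete implementation has to provide all of them before it can be instantiated, which keeps the three roles separate while leaving the storage and optimization choices open.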

Practical API definitions

To implement the above API in practice, the following implementation is suggested:

Generator.allocate
  • Input: None
  • Output: a uid
  • Invariant: A uid may be taken from a pool of unused uids. Pooled uids, when generated, must be strictly greater than the last allocated uid, and the pool must be discarded when Generator.reset is called. The last allocated uid must be stored in ZODB; if it is greater than the value stored in any other storage, it must override that value. For optimization, this method may nevertheless use a storage other than ZODB when generating new uids.

Generator.reset
  • Input: the last uid
  • Output: None
  • Invariant: After this method is called, Generator.allocate must not return any uid less than or equal to the input uid.

Catalog.catalog
  • Input: a Document
  • Output: None
  • Invariant: If the input Document has a uid that conflicts with another Document recorded in the catalog, the catalog should proceed with the conflict handling described below. If the input Document has a uid which is greater than the last uid held by the Generator, this method must invoke Generator.reset with that uid as its input.

Catalog.uncatalog
  • Input: a Document (or a path and a uid)
  • Output: None
  • Invariant: The catalog must forget that the path and the uid were used.

Document.store
  • Input: a uid
  • Output: None
  • Invariant: This method must store the input uid in ZODB.
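
The invariants above can be sketched in plain Python as follows. This is a minimal in-memory illustration, not the ERP5 implementation: ordinary attributes stand in for values persisted in ZODB, pool_size is a hypothetical parameter, conflicts simply raise ValueError because conflict handling is out of scope here, and reset keeping the larger of the two uids is an assumption that goes slightly beyond what the invariant requires.

```python
class Document:
    def __init__(self, path):
        self.path = path
        self.uid = None

    def store(self, uid):
        # Stand-in for persisting the uid in ZODB.
        self.uid = uid


class Generator:
    def __init__(self, pool_size=3):
        self.last_uid = 0  # stand-in for the last allocated uid stored in ZODB
        self.pool_size = pool_size  # hypothetical tuning parameter
        self._pool = []

    def allocate(self):
        if not self._pool:
            # Pooled uids are strictly greater than the last allocated uid.
            start = self.last_uid + 1
            self._pool = list(range(start, start + self.pool_size))
        uid = self._pool.pop(0)
        self.last_uid = uid
        return uid

    def reset(self, last_uid):
        # The pool is discarded; no later uid may be <= last_uid.
        self._pool = []
        self.last_uid = max(self.last_uid, last_uid)  # assumption: never move backwards


class Catalog:
    def __init__(self, generator):
        self.generator = generator
        self._path_by_uid = {}

    def catalog(self, document):
        owner = self._path_by_uid.get(document.uid)
        if owner is not None and owner != document.path:
            # Conflict handling is not covered by this sketch.
            raise ValueError("uid %r is already used by %r" % (document.uid, owner))
        self._path_by_uid[document.uid] = document.path
        if document.uid > self.generator.last_uid:
            self.generator.reset(document.uid)

    def uncatalog(self, document):
        # Forget that this path and uid were used.
        if self._path_by_uid.get(document.uid) == document.path:
            del self._path_by_uid[document.uid]
```

For example, cataloging a Document that already carries a uid of 100 while the Generator has only reached 1 triggers Generator.reset(100), so the next call to Generator.allocate returns a uid above 100 instead of one from the old pool.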

Conflict handling

When a Catalog detects a uid conflict, there are two modes for dealing with it:

Mode            Description
Operation Mode  The catalog should raise an exception and refuse to catalog such objects. The catalog must notify the administrator of its site that there is a critical problem and urge him/her to resolve it carefully.
Recovery Mode   The catalog must allocate new uids to all Documents which have a conflicting uid, then reindex those Documents as well as all Documents which refer to those Documents.

Normally, every catalog should be in Operation Mode. When a full indexing of an ERP5 site is performed, however, it should be in Recovery Mode.
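
The two modes can be sketched as a single branch inside Catalog.catalog. The code below is an illustration under stated assumptions, not ERP5 code: UidConflictError, the recovery_mode flag, the notify callback and the reindex_queue list are hypothetical names; allocate is any callable that returns an unused uid; and reindexing of Documents that refer to the conflicting ones is omitted because the sketch does not track references.

```python
class UidConflictError(Exception):
    """Raised in Operation Mode when two Documents share a uid."""


class Document:
    def __init__(self, path, uid):
        self.path = path
        self.uid = uid


class Catalog:
    def __init__(self, allocate, recovery_mode=False, notify=None):
        self.allocate = allocate  # callable returning an unused uid
        self.recovery_mode = recovery_mode
        self.notify = notify or (lambda message: None)
        self._document_by_uid = {}
        self.reindex_queue = []  # paths waiting to be reindexed

    def catalog(self, document):
        other = self._document_by_uid.get(document.uid)
        if other is None or other is document:
            self._document_by_uid[document.uid] = document
            return
        if not self.recovery_mode:
            # Operation Mode: notify the administrator and refuse to catalog.
            message = "uid %r is used by both %r and %r" % (
                document.uid, other.path, document.path)
            self.notify(message)
            raise UidConflictError(message)
        # Recovery Mode: every Document sharing the uid gets a new one
        # and is queued for reindexing.
        del self._document_by_uid[document.uid]
        for conflicting in (other, document):
            conflicting.uid = self.allocate()
            self._document_by_uid[conflicting.uid] = conflicting
            self.reindex_queue.append(conflicting.path)
```

In this sketch a full reindexing run would create the Catalog with recovery_mode=True, while normal operation keeps the default and surfaces conflicts as exceptions.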
