Fragmented authoring concepts

The fragmentation network

A fragmented document is a structured document composed of one or more fragments. A fragment is a self-contained unit that can be referenced by other fragments and may itself reference other fragments. The relationships between fragments are directed and resemble containment (i.e. part-of). The root fragment is the fragment that has no parent and includes, transitively, all fragments in the structured document. A fragment may be referenced more than once in a fragmented document, but a fragment may not reference one of its ancestors. Consequently, the fragmentation structure can be represented as a directed acyclic graph. Furthermore, a fragmented document conforms to a predefined document model, which is comparable to XML Schema or RELAX NG but adapted for fragmented documents (for instance, by including referential integrity). As a result, the relationships between fragments are constrained by this document model.
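The acyclicity rule can be illustrated with a minimal sketch. The `Fragment` class and the `add_reference` and `reaches` helpers are hypothetical names chosen for this example, not part of any actual tool; the sketch only assumes that a fragment has a name and a list of child references.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """Hypothetical fragment: a name plus references to child fragments."""
    name: str
    children: list["Fragment"] = field(default_factory=list)


def reaches(start: Fragment, target: Fragment) -> bool:
    """Return True if `target` is `start` itself or one of its descendants."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node is target:
            return True
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.children)
    return False


def add_reference(parent: Fragment, child: Fragment) -> None:
    """Make `parent` reference `child`, refusing references to an ancestor."""
    # If `parent` is reachable from `child`, the new edge would close a cycle.
    if reaches(child, parent):
        raise ValueError(f"{child.name!r} already includes {parent.name!r}")
    parent.children.append(child)
```

With this check, a fragment such as a legal note can be referenced from both the root and a chapter (the graph stays acyclic), whereas an attempt to make the note reference the root is rejected.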

Re-purposing

Re-purposing refers to the use of one fragment in two or more documents. This is typically useful when a document should be made available on several media (e.g. paper, website, presentation). However, there are other examples of re-purposing.

Re-purposing and document consistency

In Scenari (and other document engineering tools) there is a clear separation between the authoring and publishing phases, a separation that is less clearly defined in a wiki environment. In a wiki, documents are accessible to users in their "published" state. When a user decides to modify a document, the document is opened in an editor and, as such, moves from a published state to an editable state. Since documents in a wiki are always "published", document consistency should be maintained at all times. Consistency means that all documents are presented as intended by their authors. This is different from Scenari, where an author publishes documents one at a time. If an author changes a fragment that is re-purposed by another document, this might render the other document inconsistent. However, this is acceptable in Scenari because an author does not intend to publish both documents simultaneously. If the second document is to be published, the author has the opportunity to revise it first.

Archiving the fragmentation network

To archive a fragmented document, all referenced fragments, including metadata that may change between versions (such as author and rights), should be archived as well. To this end, a fragment should be identifiable by an identifier that is guaranteed to be unique over time. We use the Unix mtime (i.e. a numerical representation of the date and time of last modification) appended to the name of the fragment to achieve this. Note that a hash such as MD5, which identifies a fragment by its content, is insufficient, since different versions of the fragment might have identical content.
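The difference between the two kinds of identifier can be sketched as follows. The function names and the `@` separator are assumptions made for illustration; the text only states that the mtime is appended to the fragment name.

```python
import hashlib


def fragment_id(name: str, mtime: int) -> str:
    """Build a version identifier by appending the Unix mtime to the name.

    The '@' separator is an assumption of this sketch.
    """
    return f"{name}@{mtime}"


def content_hash(content: bytes) -> str:
    """Identify a fragment by its content only (MD5 hex digest)."""
    return hashlib.md5(content).hexdigest()


# Three versions of one fragment; the third reverts to the first content.
versions = [
    ("intro", 1000, b"Welcome."),
    ("intro", 2000, b"Welcome, reader."),
    ("intro", 3000, b"Welcome."),
]
ids = [fragment_id(name, mtime) for name, mtime, _ in versions]
hashes = [content_hash(content) for _, _, content in versions]
```

Here the first and third versions share a content hash although they are distinct versions, while the three name-plus-mtime identifiers remain different.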

This might suggest that archiving a fragmented document is simply a matter of copying its fragments into the archive. However, if the archive already contains a prior version of the fragmented document, and some fragments did not change between the two versions, then the references to these unaltered fragments are identical in both versions. Consequently, an archived version of a fragmented document may contain references that point to fragments stored with prior versions. If a fragment has changed since its prior version, its parent has changed as well, since at least the reference to the child has changed. Consequently, the root fragment changes with every new version and, as such, accumulates the changes made to the fragmented document. To restore a prior version of the document as the current version, it therefore suffices to restore the root fragment.
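A minimal sketch of this sharing behaviour, assuming the archive is a simple mapping from fragment identifier to stored fragment. The `Stored` type, the `archive_version` and `restore` functions, and the identifier format are hypothetical and serve only to illustrate the reasoning.

```python
from typing import NamedTuple


class Stored(NamedTuple):
    """Hypothetical archived fragment: content plus child identifiers."""
    content: str
    child_ids: tuple[str, ...]


Archive = dict[str, Stored]


def archive_version(archive: Archive, fragments: dict[str, Stored]) -> int:
    """Add one document version; return how many fragments were new."""
    added = 0
    for frag_id, stored in fragments.items():
        # Unchanged fragments keep their identifier and are already archived.
        if frag_id not in archive:
            archive[frag_id] = stored
            added += 1
    return added


def restore(archive: Archive, root_id: str) -> dict[str, Stored]:
    """Collect every fragment reachable from an archived root fragment."""
    result: dict[str, Stored] = {}
    stack = [root_id]
    while stack:
        frag_id = stack.pop()
        if frag_id not in result:
            result[frag_id] = archive[frag_id]
            stack.extend(archive[frag_id].child_ids)
    return result
```

If a second version changes only one child, archiving it adds two entries: the changed child and the new root that references it. Restoring either version then needs only the identifier of its root, since all other fragments are reached through the references.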