Overview

This guide details the ordering of messages in the Attivio platform.

Unordered Messaging

Attivio does not enforce the order of messages as they flow through the system. Messages may arrive out of order at any component in any workflow, including the index engine, so the order in which the index engine processes documents may differ from the order in which the connector(s) sent them. For general bulk loads, or systems without high update rates for single documents, unordered messaging is not a problem, and the system defaults allow for the highest throughput and lowest memory utilization.

For example, a client may send a series of documents as follows:

feed doc1
feed doc2
feed doc3

However, those documents may arrive and be processed by the indexer as follows:

index doc2
index doc1
index doc3

Issues with Unordered Messaging

Unordered messaging can be a problem if updates or deletes are sent in between index commits, as follows:

feed doc1 // original version
feed doc2
delete doc1
feed doc1 // new version
feed doc2 // an update to doc 2
commit

In the series above, with unordered messaging, there is no guarantee that the original versions of doc1 and doc2 will reach the indexer first and then be deleted or updated by the later versions of those documents. For example, the delete could be processed first, followed by the update, followed by the original version, which would leave an old version of the document in the index. To solve this problem, a commit must be issued and completed before the series of updates and deletes is sent.
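The effect can be illustrated with a toy model. The following Python sketch is not Attivio code: it assumes a simplified index in which the last message processed for a document id wins, and it enumerates every possible arrival order of the five messages sent before the commit. The names SENT, apply_in_order and possible_outcomes are hypothetical and exist only for this illustration.

```python
from itertools import permutations

# Messages sent by the client before the commit, in send order.
# Each message is (operation, document id, version); version is None for deletes.
SENT = [
    ("feed", "doc1", "original"),
    ("feed", "doc2", "original"),
    ("delete", "doc1", None),
    ("feed", "doc1", "new"),
    ("feed", "doc2", "update"),
]


def apply_in_order(messages):
    """Apply messages to a toy index where the last message processed wins."""
    index = {}
    for op, doc_id, version in messages:
        if op == "feed":
            index[doc_id] = version
        else:
            index.pop(doc_id, None)
    return index


def possible_outcomes(messages):
    """Collect every distinct final index state over all arrival orders."""
    outcomes = set()
    for arrival in permutations(messages):
        final = apply_in_order(arrival)
        outcomes.add(tuple(sorted(final.items())))
    return outcomes


# The state the client intended: messages applied in send order.
intended = tuple(sorted(apply_in_order(SENT).items()))
outcomes = possible_outcomes(SENT)
print("intended:", intended)
print("distinct final states:", len(outcomes))
```

In this model only one of the reachable final states matches what the client intended; the others leave doc1 missing or leave an original version of doc1 or doc2 in the index.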

Ordered Commits Mode

By default, Attivio connectors run with "ordered commits" mode enabled. In this mode, a connector waits for all previously sent documents to be indexed before it sends a commit message. This ensures that any time a commit is sent to a workflow, all documents previously sent have been flushed through the workflow first.

For example, a client may make the following calls to the API:

feed doc1
feed doc2
feed doc3
commit
feed doc4
feed doc1  // an update to doc1
commit

Using ordered commits, the feeder waits for doc1, doc2, and doc3 to reach the end of the workflow before the commit is sent. This guarantees that an update for doc1 sent later will not arrive at the indexer before the original document. See the Javadoc for ContentFeeder for more information on ordered commits.

Because of unordered messaging, in the example above, the indexer might see the messages in either of the following orders, among others. The important point is that each group of documents between commits is guaranteed to have completed the workflow before the commit completes and before any other documents are sent to Attivio.

doc2
doc1
doc3
commit
doc1  // an update to doc1
doc4
commit

or

doc3
doc2
doc1
commit
doc4
doc1  // an update to doc1 (after first occurrence of doc1)
commit
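The guarantee can be checked with a small simulation. The Python sketch below is a toy model, not the Attivio implementation: it assumes documents may be reordered freely within a group but never overtake documents from an earlier group, and that the last version processed for a document id wins. It does not model how the commit message itself travels through the workflow. All names (SENT, split_batches, arrival_orders, final_index) are hypothetical.

```python
from itertools import permutations, product

# Client calls from the example above; "id:version" marks a fed document.
SENT = ["doc1:v1", "doc2:v1", "doc3:v1", "commit", "doc4:v1", "doc1:v2", "commit"]


def split_batches(messages):
    """Group documents into batches, each closed by a commit.

    Documents sent after the last commit are ignored in this toy model.
    """
    batches, current = [], []
    for msg in messages:
        if msg == "commit":
            batches.append(current)
            current = []
        else:
            current.append(msg)
    return batches


def arrival_orders(messages):
    """Yield every order the toy indexer may see: shuffled within a batch only."""
    per_batch = [list(permutations(batch)) for batch in split_batches(messages)]
    for choice in product(*per_batch):
        order = []
        for batch in choice:
            order.extend(batch)
            order.append("commit")
        yield order


def final_index(order):
    """Apply documents in arrival order; the last version seen for an id wins."""
    index = {}
    for msg in order:
        if msg != "commit":
            doc_id, version = msg.split(":")
            index[doc_id] = version
    return index


orders = list(arrival_orders(SENT))
finals = {tuple(sorted(final_index(o).items())) for o in orders}
print("arrival orders:", len(orders))
print("distinct final states:", finals)
```

Under these assumptions every arrival order produces the same final index, with doc1 at its updated version, because the update is sent only after the first group has completed the workflow.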

Enabling/Disabling Ordered Commits Mode on a Connector

The default "ordered commits" mode can be overridden in the connector configuration.

Disable Ordered Commits

Configure the connector component with the following property set:

<property name="orderedCommits" value="false" />

Enable Ordered Commits

Configure the connector component with the following property set:

<property name="orderedCommits" value="true" />

Issues with Ordered Commits Mode

Ordered commits mode guarantees only that all documents sent before a commit will be included in that commit. It makes no guarantee that documents sent after the commit will not also be included in it. For instance, if the following messages were sent:

feed doc1
feed doc2
commit
feed doc3
feed doc4
commit

they may be seen and indexed as follows:

feed doc1
feed doc2
feed doc3
commit
feed doc4
commit

If client code is dependent on the number of commits, waitForCompletion() should be called after each call to commit(), or ordered messaging should be enabled. Client code may be dependent on the number of commits if, for example, the number of documents committed is sent, or updates and deletes are sent together.

Manual Ordered Messaging

In addition to ordered commits mode, any ingest client can call waitForCompletion() at any time to ensure that all documents previously sent to Attivio have completed processing. Calling this method has significant overhead: the client must wait for all messages to be processed, which may take a long time if the server is busy or otherwise resource bound. However, it allows clients to ensure message order, when necessary, without the added cost of a commit. For example:

feed doc1
feed doc2
waitForCompletion
feed doc1 update  // this is guaranteed to replace the version sent earlier
feed doc2 update
commit
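The barrier behavior can be sketched with standard Python concurrency primitives. This is an illustration only: ToyFeeder, feed and wait_for_completion are hypothetical stand-ins and not the Attivio ContentFeeder API, and the per-message delay is an assumption used to force the originals to finish out of order.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait
from threading import Lock


class ToyFeeder:
    """Hypothetical stand-in for an ingest client; not the Attivio ContentFeeder."""

    def __init__(self, workers=4):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = []
        self._lock = Lock()
        self.index = {}      # toy index: last version processed wins
        self.arrivals = []   # order in which documents were actually processed

    def _process(self, doc_id, version, delay):
        time.sleep(delay)  # simulated workflow latency, different per message
        with self._lock:
            self.index[doc_id] = version
            self.arrivals.append((doc_id, version))

    def feed(self, doc_id, version, delay=0.0):
        """Send a document asynchronously; returns immediately."""
        future = self._pool.submit(self._process, doc_id, version, delay)
        self._pending.append(future)

    def wait_for_completion(self):
        """Block until every document sent so far has been processed."""
        wait(self._pending)
        self._pending.clear()

    def close(self):
        self._pool.shutdown(wait=True)


feeder = ToyFeeder()
feeder.feed("doc1", "original", delay=0.05)  # slow: could lose a race with its update
feeder.feed("doc2", "original", delay=0.01)
feeder.wait_for_completion()                 # barrier: both originals are processed
feeder.feed("doc1", "update")
feeder.feed("doc2", "update")
feeder.wait_for_completion()
feeder.close()
print(feeder.index)
```

Without the first wait_for_completion() call, the slow original of doc1 could be processed after its update and overwrite it; with the barrier, both updates are guaranteed to be processed after both originals.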

Multiple Clients

There is no coordination between separate clients feeding the same document. This applies to clients in separate threads and clients in the same thread (the ContentFeeder client is NOT thread-safe). If different clients send the same document (as identified by its unique Attivio document id), an external synchronization method based on waiting for commits to complete needs to be implemented. Timing is not sufficient as a means to guarantee message ordering between clients, as SEDA, fault-tolerant configurations, multiple component instances and a variety of other workflow stages could reorder the messages.

For example, if two clients are feeding documents as shown below, there is no guarantee which versions of the documents will get indexed. Document doc1 will either be deleted or updated. Documents doc2 and doc3 will both be in the index; however, there is no guarantee which client's version will be returned from a search.

Client 1

feed doc1
feed doc2
commit
feed doc3
feed doc1 // update to doc1
commit

Client 2

feed doc8
feed doc2
commit
feed doc3
delete doc1
commit
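The possible results of the two-client example can be enumerated with a toy model. The Python sketch below is not Attivio code. It assumes each client's own messages are processed in the order that client sent them, which is a simplification (no document id repeats within a group, so it does not change the final state), and that the last message processed for a document id wins. Commits are omitted, and all names are hypothetical.

```python
from itertools import combinations

# Document operations per client, in send order (commits omitted).
CLIENT_1 = [("feed", "doc1"), ("feed", "doc2"), ("feed", "doc3"), ("feed", "doc1")]
CLIENT_2 = [("feed", "doc8"), ("feed", "doc2"), ("feed", "doc3"), ("delete", "doc1")]


def interleavings(a, b):
    """Yield every merge of a and b that keeps each client's own order."""
    total = len(a) + len(b)
    for slots in combinations(range(total), len(a)):
        slot_set = set(slots)
        ia, ib = iter(a), iter(b)
        yield [
            ("client1", *next(ia)) if i in slot_set else ("client2", *next(ib))
            for i in range(total)
        ]


def final_index(merged):
    """Last message processed per document id wins; the value records the writer."""
    index = {}
    for client, op, doc_id in merged:
        if op == "feed":
            index[doc_id] = client
        else:
            index.pop(doc_id, None)
    return index


outcomes = {
    tuple(sorted(final_index(m).items()))
    for m in interleavings(CLIENT_1, CLIENT_2)
}
doc1_states = {dict(o).get("doc1", "deleted") for o in outcomes}
doc2_owners = {dict(o)["doc2"] for o in outcomes}
print("distinct final states:", len(outcomes))
print("doc1:", doc1_states)
print("doc2 written by:", doc2_owners)
```

In this model doc1 ends up either deleted or holding Client 1's update, while doc2 and doc3 are always present but may hold either client's version, which matches the behavior described above.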