
Overview

Every message and document processed by the ingestion system is audited. The audit system allows Attivio users to answer questions such as:

  • Is processing for my connector complete?
  • How many document ingestion warnings or errors were encountered?
  • When did document Y become searchable?
  • What were the processing errors for document Z?
  • What occurred on the system between date/time A and date/time B?


Best Practice

This audit information contains valuable feedback on how well your Attivio ingestion is running. A lack of errors in the aie-node.log and aie-node.error.log files does not mean that ingestion is running cleanly. Audit information should be reviewed for every connector run, especially after changes are made to your project. System ingestion should be tuned until there are no warnings or failures in the audit trail.

Interpreting Audit Information 

Initial Ingestion

Every message and document processed by the ingestion system is audited and associated with a unique clientId.  Auditing tracks the moment a document or message is added to the system (CREATE), processing events (WARN, DROP, FAIL, OK), arrival for processing on a particular node (RECEIVE), loss due to system or node failures (LOST), re-feeding of lost data (REFEED), and completion of normal processing (COMPLETE).  The clientId is the key for almost all audit and connector execution API calls.  Every run of a connector creates a unique clientId associated with all the documents that were created by that connector.  The execution of a connector results in a connector history record, an audit summary record, and numerous audit detail records.

Once a connector is running or has run, summary information about the current or last execution can be seen on the connector's page.  The store admin page also lists every client whose audit data has not been purged along with the connector name (if any).  The history of all connector executions (including anything currently executing) can be obtained via the ConnectorHistoryApi.  This API can be used to programmatically discover the clientId for connector executions.

While Running

While a connector (or programmatic ingestion) is running, the clientId may be used to interrogate the audit system for the current status.  The status for each client is continuously updated in an audit summary record.  This summary record may be viewed in the Admin Audit UI or obtained via the AuditReaderApi getSummary method.  The summary record contains counts of the various types of events (CREATE, COMPLETE, etc) that are tracked by the audit system for a specific client.  Also included are the first and last action times.  When  CREATE + FOUND = COMPLETE + DROPPED + LOST then the summary counts are balanced and the client is currently considered finished.  Descriptions of each event type can be found here.
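The balance rule above can be expressed as a small check. The following is a minimal sketch in Python, not the Attivio API: the `counts` mapping and its keys are hypothetical stand-ins for whatever counts the summary record exposes, and the numbers are illustrative only.

```python
def is_balanced(counts):
    """Return True when CREATE + FOUND == COMPLETE + DROPPED + LOST.

    `counts` is a hypothetical mapping of event-type name to count;
    event types that are absent are treated as zero.
    """
    entered = counts.get("CREATE", 0) + counts.get("FOUND", 0)
    left = (
        counts.get("COMPLETE", 0)
        + counts.get("DROPPED", 0)
        + counts.get("LOST", 0)
    )
    return entered == left


# Illustrative counts only, not real audit data.
print(is_balanced({"CREATE": 100, "COMPLETE": 97, "DROPPED": 2, "LOST": 1}))  # True
print(is_balanced({"CREATE": 101, "COMPLETE": 97, "DROPPED": 2, "LOST": 1}))  # False
```

A balanced result only means the client is finished at the moment the counts were read; a later CREATE can unbalance them again.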

Attivio ingestion is inherently asynchronous.  The Audit UI and the summary record may indicate a client is currently complete, but this can change.  For instance, the connector could send in a new document (CREATE) and the summary counts would no longer be balanced.  A client is considered completely finished when it is inactive (getInactiveTime() is non-null) and isComplete() returns true.  On the Audit Admin UI the client id will have (inactive) after it and Feed Complete will show as TRUE.
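The difference between "currently complete" and "completely finished" can be sketched as follows. This is an illustration under stated assumptions, not the Attivio API: `SummarySnapshot` is a hypothetical stand-in whose two fields mirror the values that getInactiveTime() and isComplete() are described as returning.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class SummarySnapshot:
    """Hypothetical stand-in for an audit summary record."""
    inactive_time: Optional[datetime]  # mirrors getInactiveTime(); None while the client is active
    complete: bool                     # mirrors isComplete(); True when the counts balance


def is_completely_finished(summary: SummarySnapshot) -> bool:
    """A client is completely finished only when it is inactive AND complete."""
    return summary.inactive_time is not None and summary.complete


# Counts balance, but the client is still active: more documents may arrive.
still_running = SummarySnapshot(inactive_time=None, complete=True)
# Client is inactive and its counts balance.
done = SummarySnapshot(inactive_time=datetime(2020, 1, 1, 12, 0), complete=True)

print(is_completely_finished(still_running))  # False
print(is_completely_finished(done))           # True
```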

Document and message CREATE audit events are counted as Document Adds and Sent Messages. If a document that is created has a parent document ID set for it, it will be counted as a Child Document Add.  Creates are audited synchronously (the documents are not sent into the system until the audit of the creation succeeds) and all other audit events are asynchronous and generally batched for efficiency.

Every component that processes a document may report FAIL, OK, WARN, or DROPPED audit records.  By default, OK audits are only created by the indexing component.  When the processing of a document or message has finished, an associated COMPLETE audit record is recorded.

Warn and Fail

Audit warnings and failures indicate ingestion problems.  The goal of any ingestion should be to run cleanly without any warnings or failures.  If a majority of documents have warnings, then something is seriously wrong with document processing during ingestion and the application is likely to have data errors.  A FAIL for a document means that it was not processed properly by a component but was passed on without any changes to the document.  A WARN indicates the document may have been updated by the component, but that something unusual happened.  The Admin Audit UI may be used to inspect the audit detail records for current and unpurged ingestions.  For WARN and FAIL audit records, the component name, error code (there are several categories), exception class, and exception message can be used to determine the cause.

A poorly configured system which generates warnings and failures for a majority of ingested documents will suffer performance issues.  These types of audit records contain much more data than the typical CREATE, COMPLETE, and OK audit records.

To conduct a more detailed analysis of ingestion activity, there are two avenues: CSV export and the API.  The CSV export option is available on the Admin Audit UI screen.  There is no REST API for CSV export, as it is intended for diagnostic efforts only.  The Audit API may also be used directly to interrogate the system about the status of a client's ingestion run or at the individual document level.  This API (AuditReaderApi) can be used to develop automated responses to document errors, failed ingestion, etc.
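As an illustration of the kind of diagnostic analysis a CSV export allows, the sketch below tallies WARN and FAIL records by component and error code. The column names (`type`, `component`, `errorCode`) and the sample rows are assumptions made for this example; the actual layout of the exported file may differ, so adjust the names to match the real header row.

```python
import csv
import io
from collections import Counter


def count_problems(csv_file, type_col="type", component_col="component",
                   code_col="errorCode"):
    """Tally WARN and FAIL rows by (component, error code, record type).

    The column names are hypothetical defaults, not a documented export format.
    """
    tally = Counter()
    for row in csv.DictReader(csv_file):
        if row[type_col] in ("WARN", "FAIL"):
            tally[(row[component_col], row[code_col], row[type_col])] += 1
    return tally


# Made-up sample standing in for an exported file.
sample = io.StringIO(
    "type,component,errorCode\n"
    "CREATE,,\n"
    "WARN,componentA,CODE-1\n"
    "WARN,componentA,CODE-1\n"
    "FAIL,componentB,CODE-2\n"
    "COMPLETE,,\n"
)

for (component, code, kind), n in count_problems(sample).most_common():
    print(f"{kind} {component} {code}: {n}")
```

With a real export, pass a file object opened with `open(path, newline="")` in place of the in-memory sample.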

Guaranteeing Ingestion (introducing DROP and LOST)

A DROP for a document means that the document was not passed on to the next component in the processing workflow.  In other words, dropped documents do not appear in the index and will not be searchable.  Generally this is intentional or indicates a document's ID was changed (ID change results in paired DROP/CREATE audit records).  A LOST audit record indicates a system error or node failure (documents being processed on a node while it crashes are eventually marked as LOST).  A LOST document is not automatically recovered or otherwise handled (i.e., operator intervention is required to refeed documents) unless fault tolerance is active.

Audit Record Type   May occur multiple times per document   Application Implication
FAIL                Yes                                     Document unaltered by component, data incomplete
WARN                Yes                                     Document possibly updated by component, data incomplete
DROP                No                                      Document dropped by component, not indexed
LOST                No                                      Document lost due to system error or node failure, not indexed

If ingestion must be guaranteed for each document, then after the conclusion of a client (see the info bar in the While Running section) the audit summary must be checked for DROP and LOST counts.  Unexplained DROPs and any LOST counts indicate that the ingestion must be re-executed.  Generally this can be done by re-running the connector.  If, however, the connector is running incrementally, then either the connector must be reset (requiring a full refeed) or the problematic documents must be individually identified and refed.  This exercise may be difficult depending on the connector type.
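The decision described above can be sketched as a simple rule. This is an illustration only: the function and its parameters are hypothetical, and how many DROPs count as "explained" (for example, intentional drops or paired DROP/CREATE records from ID changes) must be determined separately for each workflow.

```python
def needs_reexecution(lost, dropped, explained_drops=0):
    """Return True if the ingestion should be re-executed.

    Any LOST count, or any DROP not accounted for by `explained_drops`,
    means the ingestion cannot be considered guaranteed.
    All parameters are hypothetical inputs read from the audit summary.
    """
    unexplained_drops = max(dropped - explained_drops, 0)
    return lost > 0 or unexplained_drops > 0


print(needs_reexecution(lost=0, dropped=3, explained_drops=3))  # False
print(needs_reexecution(lost=0, dropped=3, explained_drops=1))  # True
print(needs_reexecution(lost=2, dropped=0))                     # True
```

This check is only meaningful once the client is completely finished, since counts can still change while ingestion is active.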

 
