This document details how users register signals to be tracked in their Attivio Platform project.
What is a signal?
A signal is an action in the search application that could be used to refine future search results. An administrator can choose the signal type when creating a signal, but the default is the search result click signal, which is created when a user searches the application and clicks on one of the search results. The signal contains information such as the query, the relevancy model used, the relevancy feature vector, the query timestamp and other fields. As these signals accumulate, they can be used to train relevancy models. Relevancy models are then applied to queries based on user, group and other criteria, which optimizes what a user sees when they enter a certain query.
What is signal type and signal strength?
The signal type is a descriptor of the kind of action that produced the signal. Signals are not compared to signals of other types when being run through the relevancy model. Two types are statically defined, "click" and "rating", corresponding to signals created by clicking on a search result and signals created by rating a search result. Beyond these, users can register signals with any type they like, depending on which actions they want to influence how their relevancy models are trained.
Signal strength quantifies how much weight a signal should get. The higher the strength, the more heavily that signal is weighted when training the relevancy model. For instance, for a rating signal, the higher the rating given, the higher the strength of the signal. Signal strength can be set to 0 to nullify bad signals. Strength values are relative only to the other strengths of the same signal type. That is, a strength of 1 for a signal type with a strength range of 0-1 will not be given the same weight within its type as a strength of 1 for a signal type with a strength range of 0-100.
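The point about strengths being relative within a type can be illustrated with a small sketch. Scaling each strength by the largest strength seen for its type is an assumption made purely for illustration; it is not a description of how Attivio trains its relevancy models, and the function name is hypothetical.

```python
from collections import defaultdict


def relative_strengths(signals):
    """Scale each strength by the largest strength seen for its signal type.

    `signals` is a list of (signal_type, strength) pairs. This scaling is an
    illustrative assumption, not Attivio's actual weighting.
    """
    # Find the largest strength per type; types are never mixed together.
    max_by_type = defaultdict(float)
    for signal_type, strength in signals:
        max_by_type[signal_type] = max(max_by_type[signal_type], strength)

    scaled = []
    for signal_type, strength in signals:
        top = max_by_type[signal_type]
        # A strength of 0 (or a type with only zeros) carries no weight.
        scaled.append((signal_type, strength / top if top > 0 else 0.0))
    return scaled


signals = [
    ("click", 1.0),     # "click" strengths range over 0-1 in this example
    ("click", 0.0),     # a nullified signal
    ("rating", 1.0),    # "rating" strengths range over 0-100 in this example
    ("rating", 100.0),
]
for signal_type, weight in relative_strengths(signals):
    print(signal_type, weight)
```

Under this scaling, the click signal with strength 1 carries the full weight for its type, while the rating signal with strength 1 carries only a hundredth of the weight of the strongest rating.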
How can I track signals?
This API is available for both Java and REST. The Java classes needed to interface with it are in our SDK and can be accessed by including our SDK JAR libraries on the classpath of any Java project that needs to use the signal tracking feature. The REST API can be accessed using HTTP requests to any node running in your system. See the section below for more details.
To interface with Attivio using Java, you first need to include our SDK JAR libraries on your classpath. The SDK JAR libraries exist in the Attivio Platform installation directory at this location:
This will allow you to use our service factory to access the signal tracking API. For this you will need to enter the host and port on which ZooKeeper is running in Attivio Platform as well as the project name. Below is a sample program that will query the index and add a signal as though a user clicked on a result:
To add signals using REST requests, you need to have an Attivio node running. The root for all of Attivio's REST API endpoints is
To add a signal you need to make an HTTP POST request to the following endpoint
with a JSON object with the following parameters:
docId- id of the document for which the signal is being registered
principal- the Attivio principal formatted like so: <realmId>:<principalId>:<principalName>
docOrdinal- the ordinal of the document in the search result
featureVector- the feature vector associated with the document
query- the query string that returned the document
locale- the locale of the query
relevancyModelName- the name of the relevancy model associated with the query
relevancyModelVersion- the version of the relevancy model associated with the query
relevancyModelNames- the names of all relevancy models associated with the query request
queryTimestamp- the time the query was entered
signalType- the type of the signal
signalStrength- the strength of the signal
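As a rough sketch, the parameters above can be assembled into a JSON payload and wrapped in an HTTP POST request using only the Python standard library. The helper names, the placeholder URL, and every example value (including the feature vector format, the timestamp encoding, and the model version) are assumptions for illustration; substitute the real endpoint and the values returned by your own queries.

```python
import json
import urllib.request

# Field names taken from the parameter list above.
DOCUMENTED_FIELDS = (
    "docId", "principal", "docOrdinal", "featureVector", "query", "locale",
    "relevancyModelName", "relevancyModelVersion", "relevancyModelNames",
    "queryTimestamp", "signalType", "signalStrength",
)


def format_principal(realm_id, principal_id, principal_name):
    """Format a principal as <realmId>:<principalId>:<principalName>."""
    return f"{realm_id}:{principal_id}:{principal_name}"


def build_add_signal_request(endpoint_url, signal):
    """Build (but do not send) a POST request carrying the signal as JSON."""
    unknown = set(signal) - set(DOCUMENTED_FIELDS)
    if unknown:
        raise ValueError(f"undocumented signal fields: {sorted(unknown)}")
    body = json.dumps(signal).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# All values below are hypothetical examples.
signal = {
    "docId": "doc-123",
    "principal": format_principal("realmA", "jdoe", "Jane Doe"),
    "docOrdinal": 2,
    "featureVector": "example-feature-vector",  # real format not shown here
    "query": "annual report",
    "locale": "en",
    "relevancyModelName": "default",
    "relevancyModelVersion": 1,                 # assumed to be numeric
    "relevancyModelNames": ["default"],
    "queryTimestamp": 1700000000000,            # assumed epoch milliseconds
    "signalType": "click",
    "signalStrength": 1.0,
}
# Placeholder URL: replace with the add-signal endpoint of a running node.
request = build_add_signal_request(
    "http://attivio-node.example/path/to/add-signal-endpoint", signal
)
print(request.get_method(), request.data.decode("utf-8"))
```

Passing the resulting object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without a live node.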
Below is a sample curl request to add a signal to Attivio:
Other endpoints include:
This requires no input and will return a JSON list of all relevancy model names associated with all signals being tracked.
This requires no input and will return a JSON list of all unique signal types associated with all signals being tracked.
This will return a JSON object that maps document ordinal to number of signals associated with that ordinal for a relevancy model specified with required URL parameters
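To make the shape of that mapping concrete, the sketch below computes the same kind of histogram locally from a list of signal objects. It assumes each signal is a dictionary with the `docOrdinal` and `relevancyModelName` fields described earlier; the function name is hypothetical and the endpoint's exact response format may differ.

```python
from collections import Counter


def ordinal_histogram(signals, model_name):
    """Count signals per document ordinal for one relevancy model."""
    counts = Counter(
        signal["docOrdinal"]
        for signal in signals
        if signal["relevancyModelName"] == model_name
    )
    # Return a plain dict ordered by ordinal for readability.
    return dict(sorted(counts.items()))


# Hypothetical signals, reduced to the two fields this sketch needs.
sample_signals = [
    {"docOrdinal": 1, "relevancyModelName": "default"},
    {"docOrdinal": 3, "relevancyModelName": "default"},
    {"docOrdinal": 1, "relevancyModelName": "default"},
    {"docOrdinal": 2, "relevancyModelName": "other"},
]
print(ordinal_histogram(sample_signals, "default"))
```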
This will return a JSON list of all signals being tracked. Signals will be modeled as JSON objects that are in the same form as those in the payload of the signals/add endpoint above. Signals can be filtered using the following optional URL parameters:
modelName- relevancy model name
startTime- the earliest time at which the query was made for the signal
endTime- the latest time at which the query was made for the signal
signalType- type of the signal; multiple signalType fields may be entered to filter on more than one signal type
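The optional filters above, including a repeated `signalType` field, can be encoded into a query string as sketched below. The function name is hypothetical, and passing times as epoch milliseconds is an assumption; use whatever time format your installation expects.

```python
from urllib.parse import urlencode


def signal_filter_query(model_name=None, start_time=None, end_time=None,
                        signal_types=()):
    """Build the query string for the optional signal filters."""
    params = []
    if model_name is not None:
        params.append(("modelName", model_name))
    if start_time is not None:
        params.append(("startTime", start_time))
    if end_time is not None:
        params.append(("endTime", end_time))
    # One signalType field per type filters on more than one signal type.
    params.extend(("signalType", signal_type) for signal_type in signal_types)
    return urlencode(params)


query_string = signal_filter_query(
    model_name="default",
    start_time=1700000000000,  # assumed epoch milliseconds
    signal_types=["click", "rating"],
)
print(query_string)
```

The resulting string is appended after a `?` to the URL of the endpoint that lists signals.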
Command Line Tool
The uploadSignals command line tool can be used to bulk upload signal data. This tool is especially useful for uploading signal data for "golden set" queries indicating a supervised set of ranked documents for given queries. This tool will automatically resolve feature vectors for specified documents according to the given query. Any external signal data that does not include feature vectors can be uploaded using this tool.
The following project properties are available to configure the signal tracking API. To change from the defaults, put the properties in the attivio.core-app.properties file located in <install_dir>/conf/properties/core-app
The delay in milliseconds between refreshes of the signal store for non-UI-originated access requests to the store. There should be no need to change this unless the relevancy models for the project need to be trained more frequently than every 3 hours.
|signal.flush.ui.timeout||The delay in milliseconds between refreshes of the signal store for UI-originated access requests to the store. Currently the only UI-originated request to which this applies is the getModelHistogram method||60000|
|signal.flush.delay||The delay in seconds after starting the application that the signal information will be automatically centralized||86400|
|signal.flush.period||The amount of time in seconds between automatic centralizations of signal data after the initial delay (configured with signal.flush.delay)||86400|
|signal.flush.batch.size||The number of signals to have in a batch while we sort/centralize the data. If there is enough memory available in the application, increasing this value may slightly improve performance of signal centralization/access||10000|
|signal.purge.period||The amount of time in milliseconds from being created that a signal will be stored in Attivio. Once signals are older than this they will be deleted.||3888000000|
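Because the properties above mix seconds and milliseconds, it can help to convert the defaults into readable durations before overriding them. The sketch below does this with Python's `timedelta`; the `days_to_millis` helper is hypothetical and simply shows how a replacement value for a millisecond-based property could be computed.

```python
from datetime import timedelta

# Defaults from the table above, converted using each property's stated unit.
DEFAULTS = {
    "signal.flush.ui.timeout": timedelta(milliseconds=60000),
    "signal.flush.delay": timedelta(seconds=86400),
    "signal.flush.period": timedelta(seconds=86400),
    "signal.purge.period": timedelta(milliseconds=3888000000),
}


def days_to_millis(days):
    """Convert whole days to milliseconds, e.g. for signal.purge.period."""
    return days * 24 * 60 * 60 * 1000


for name, duration in DEFAULTS.items():
    print(f"{name}: {duration}")

# Example: value to keep signals for 90 days instead of the default.
print(f"signal.purge.period={days_to_millis(90)}")
```

Read this way, the default flush delay and period are each one day, and the default purge period keeps signals for 45 days.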