Overview

This document details how users register signals to be tracked in their Attivio Platform project.

What is a signal?

A signal is an action in the search application that can be used to refine future search results. An administrator can choose the signal type when creating a signal, but the default is the search result click signal: a user searches the application and clicks on one of the search results. The signal created from this click contains information such as the query, the relevancy model used, the relevancy feature vector, the query timestamp, and other fields. Accumulated signals are used to train relevancy models. Relevancy models are then applied to queries based on user, group, and other criteria, which optimizes what a user sees when they enter a given query.

What is signal type and signal strength?

The signal type is a descriptor of the kind of signal. Signals are not compared to signals of other types when they are run through the relevancy model. Two types are statically defined, "click" and "rating", corresponding to signals created by clicking on a search result and signals created by rating a search result. Although these types are statically defined, users can register signals with any type they like, depending on which actions they want to affect how their relevancy models are trained.
Signal strength quantifies how much weight a signal should get. The higher the strength, the more that signal is weighted when training the relevancy model. For instance, for a rating signal, the higher the rating given, the higher the strength of the signal. Signal strength can be set to 0 to nullify bad signals. Strength values are relative only to the strengths of other signals of the same type. That is, a strength of 1 for a signal type with a strength range of 0-1 will not be given the same weight within its type as a strength of 1 for a signal type with a strength range of 0-100.
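
To make the "relative within a type" idea concrete, here is a minimal sketch that expresses a strength as a fraction of its type's range. The class and method names are hypothetical and are not part of the Attivio SDK, and this is only an illustration of the 0-1 versus 0-100 example above, not a description of how Attivio weights signals internally.

```java
class SignalStrengthSketch {

  // Hypothetical helper: position of a strength within its type's range, from 0.0 to 1.0.
  static double relativeStrength(double strength, double typeMin, double typeMax) {
    if (typeMax <= typeMin) {
      throw new IllegalArgumentException("typeMax must be greater than typeMin");
    }
    return (strength - typeMin) / (typeMax - typeMin);
  }

  public static void main(String[] args) {
    // A strength of 1 is the maximum for a type whose range is 0-1 ...
    System.out.println(relativeStrength(1, 0, 1));   // prints 1.0
    // ... but close to the minimum for a type whose range is 0-100.
    System.out.println(relativeStrength(1, 0, 100)); // prints 0.01
  }
}
```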

How can I track signals?

This API is available for both Java and REST. The Java classes needed to interface with it are in our SDK and can be accessed by including our SDK JAR libraries on the classpath of any Java project that needs to use the signal tracking feature. The REST API can be accessed using HTTP requests to any node running in your system. See the sections below for more details.

Java API

To interface with Attivio using Java, you first need to include our SDK JAR libraries on your classpath. The SDK JAR libraries are in the Attivio Platform installation directory at this location:

<install_directory>/sdk/java/client/lib/

This will allow you to use our service factory to access the signal tracking API. For this you will need to enter the host and port on which ZooKeeper is running in Attivio Platform as well as the project name. Below is a sample program that will query the index and add a signal as though a user clicked on a result:

package com.test;

// Package names below are assumed from the SDK Javadoc; verify them against your SDK version.
import com.attivio.sdk.client.SearchClient;
import com.attivio.sdk.search.QueryRequest;
import com.attivio.sdk.search.QueryResponse;
import com.attivio.sdk.search.query.Query;
import com.attivio.sdk.search.query.QueryString;
import com.attivio.service.Platform;
import com.attivio.service.ServiceFactory;
// Signal and SignalTrackingApi must also be imported from the SDK;
// check the SDK Javadoc for their packages.

public class TestTest {

  public static void main(String[] args) throws Exception {

    // initialization needed to access services using the service factory
    System.setProperty("AIE_ZOOKEEPER", "localhost:16980");
    Platform.instance.setProjectName("sampleproject");

    // create a query request to get all documents with a table of orders
    Query query = new QueryString("table:orders");
    QueryRequest request = new QueryRequest(query);

    // query against the index
    QueryResponse response = ServiceFactory.getService(SearchClient.class).search(request);

    // retrieve the signal tracking API from the service factory
    SignalTrackingApi signalTrackingApi = ServiceFactory.getService(SignalTrackingApi.class);

    // create the signal on the first result document and add it to the signals being tracked
    // this is simulating a user click on the first result of the query
    Signal signal = Signal.getSignal(response, 1);
    signalTrackingApi.addSignal(signal);

    // create the signal on the second document and add it to the signals being tracked
    // this is simulating a user rating on the second result of the query with three stars
    signal = Signal.getSignal(response, 2, Signal.RATING, 3);
    signalTrackingApi.addSignal(signal);
  }
}


REST API

To add signals using REST requests, you need to have an Attivio node running. The root for all of Attivio's REST API endpoints is http://<host>:<port>/rest.
To add a signal you need to make an HTTP POST request to the following endpoint

http://<host>:<port>/rest/signals/add

with a JSON object with the following parameters:

  • docId - the ID of the document the signal is for
  • principal - the Attivio principal formatted like so: <realmId>:<principalId>:<principalName>
  • docOrdinal - the ordinal of the document in the search result
  • featureVector - the feature vector associated with the document
  • query - the query string that returned the document
  • locale - the locale of the query
  • relevancyModelName - the name of the relevancy model associated with the query
  • relevancyModelVersion - the version of the relevancy model associated with the query
  • relevancyModelNames - the names of all relevancy models associated with the query request
  • queryTimestamp - the time the query was entered
  • signalType - the type of the signal
  • signalStrength - the strength of the signal

Below is a sample curl request to add a signal to Attivio:

curl -X POST \
http://localhost:17000/rest/signals/add \
-H 'cache-control: no-cache' \
-H 'content-type: application/json' \
-d '{
"principal": "default:user1:aieadmin",
"docId": "doc1",
"docOrdinal": 1,
"featureVector": "score=0.16604789,freshness=0.16604789,phrase_title=0.0,phrase_anchortext=0.0,title=0.0,anchortext=0.0",
"query": "*",
"locale": "en",
"relevancyModelName": "default",
"relevancyModelVersion": 0,
"relevancyModelNames": [
"default",
"noop"
],
"queryTimestamp": 1491847672751,
"signalTimestamp": 1491847672879,
"type": "click",
"weight": 1
}'
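
The same request can be sketched in Java using the standard java.net.http client (Java 11 or later). This is a minimal sketch, not part of the Attivio SDK: the class and method names are hypothetical, the payload is copied from the curl sample above, and the response body format is not documented here, so only the HTTP status code is returned.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class AddSignalRequestSketch {

  // JSON payload copied from the curl sample above.
  static final String SAMPLE_PAYLOAD = "{"
      + "\"principal\": \"default:user1:aieadmin\","
      + "\"docId\": \"doc1\","
      + "\"docOrdinal\": 1,"
      + "\"featureVector\": \"score=0.16604789,freshness=0.16604789,phrase_title=0.0,"
      + "phrase_anchortext=0.0,title=0.0,anchortext=0.0\","
      + "\"query\": \"*\","
      + "\"locale\": \"en\","
      + "\"relevancyModelName\": \"default\","
      + "\"relevancyModelVersion\": 0,"
      + "\"relevancyModelNames\": [\"default\", \"noop\"],"
      + "\"queryTimestamp\": 1491847672751,"
      + "\"signalTimestamp\": 1491847672879,"
      + "\"type\": \"click\","
      + "\"weight\": 1"
      + "}";

  // Builds (but does not send) the POST request for the signals/add endpoint.
  static HttpRequest buildRequest(String host, int port, String jsonPayload) {
    URI endpoint = URI.create("http://" + host + ":" + port + "/rest/signals/add");
    return HttpRequest.newBuilder(endpoint)
        .header("content-type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(jsonPayload))
        .build();
  }

  // Sends the request and returns the HTTP status code; requires a running Attivio node.
  static int send(HttpRequest request) throws Exception {
    HttpClient client = HttpClient.newHttpClient();
    HttpResponse<String> response =
        client.send(request, HttpResponse.BodyHandlers.ofString());
    return response.statusCode();
  }
}
```

With a node listening on localhost:17000, calling AddSignalRequestSketch.send(AddSignalRequestSketch.buildRequest("localhost", 17000, AddSignalRequestSketch.SAMPLE_PAYLOAD)) would issue the same request as the curl command.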

 

Other endpoints include:

GET http://<host>:<port>/rest/signals/relevancyModels
This requires no input and will return a JSON list of all relevancy model names associated with all signals being tracked.

GET http://<host>:<port>/rest/signals/signalTypes
This requires no input and will return a JSON list of all unique signal types associated with all signals being tracked.

GET http://<host>:<port>/rest/signals/modelHistogram
This will return a JSON object that maps document ordinal to number of signals associated with that ordinal for a relevancy model specified with required URL parameters modelName and modelVersion.

GET http://<host>:<port>/rest/signals
This will return a JSON list of all signals being tracked. Signals will be modeled as JSON objects that are in the same form as those in the payload of the signals/add endpoint above. Signals can be filtered using the following optional URL parameters:

  • modelName - relevancy model name
  • startTime - the earliest time at which the query was made for the signal
  • endTime - the latest time at which the query was made for the signal
  • signalType - type of the signal, multiple signalType fields may be entered to filter on more than one signal type

Command Line Tool

The uploadSignals command line tool can be used to bulk upload signal data. This tool is especially useful for uploading signal data for "golden set" queries indicating a supervised set of ranked documents for given queries. This tool will automatically resolve feature vectors for specified documents according to the given query. Any external signal data that does not include feature vectors can be uploaded using this tool.

 

Project Properties

The following project properties are available to configure the signal tracking API. To change from the defaults, put the properties in the attivio.core-app.properties file located in <install_dir>/conf/properties/core-app

Each property is listed as name, description, and default value:

  • signal.flush.timeout - The delay in milliseconds between refreshes of the signal store for access requests to the store that do not originate from the UI. There should be no need to change this unless the relevancy models for the project need to be trained more frequently than every 3 hours. Default: 3600000
  • signal.flush.ui.timeout - The delay in milliseconds between refreshes of the signal store for access requests to the store that originate from the UI. Currently the only UI-originated request to which this applies is the getModelHistogram method. Default: 60000
  • signal.flush.delay - The delay in seconds after starting the application before the signal information is automatically centralized. Default: 86400
  • signal.flush.period - The amount of time in seconds between automatic centralizations of signal data after the initial delay (configured with signal.flush.delay). Default: 86400
  • signal.flush.batch.size - The number of signals in a batch while the data is sorted and centralized. If there is enough memory available in the application, increasing this value may slightly improve the performance of signal centralization and access. Default: 10000
  • signal.purge.period - The amount of time in milliseconds after creation that a signal is stored in Attivio. Signals older than this are deleted. Default: 3888000000
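
Because these properties mix milliseconds and seconds, it can help to convert the defaults into readable durations. The sketch below does only that arithmetic with java.time.Duration; the class and constant names are hypothetical and it does not read or set any Attivio configuration.

```java
import java.time.Duration;

class SignalPropertyDefaultsSketch {

  // Default values from the property list above, in the units each property uses.
  static final Duration FLUSH_TIMEOUT = Duration.ofMillis(3_600_000L);    // signal.flush.timeout
  static final Duration FLUSH_UI_TIMEOUT = Duration.ofMillis(60_000L);    // signal.flush.ui.timeout
  static final Duration FLUSH_DELAY = Duration.ofSeconds(86_400L);        // signal.flush.delay
  static final Duration FLUSH_PERIOD = Duration.ofSeconds(86_400L);       // signal.flush.period
  static final Duration PURGE_PERIOD = Duration.ofMillis(3_888_000_000L); // signal.purge.period

  public static void main(String[] args) {
    System.out.println("signal.flush.timeout    = " + FLUSH_TIMEOUT.toHours() + " hour(s)");      // 1
    System.out.println("signal.flush.ui.timeout = " + FLUSH_UI_TIMEOUT.toMinutes() + " minute(s)"); // 1
    System.out.println("signal.flush.delay      = " + FLUSH_DELAY.toDays() + " day(s)");          // 1
    System.out.println("signal.flush.period     = " + FLUSH_PERIOD.toDays() + " day(s)");         // 1
    System.out.println("signal.purge.period     = " + PURGE_PERIOD.toDays() + " day(s)");         // 45
  }
}
```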

 

 

 
