Quick Start
The following tutorial walks you through the process of creating a Machine Learning (ML) Relevancy Model. After completing this exercise, you should be familiar with creating an ML Relevancy Model, generating signal data for that model, and training the model with that signal data. Understanding these fundamental concepts and how they have been implemented within Search UI will allow you to implement ML Relevancy Models within your custom search applications and even make use of external signal data.
Due to an issue with the 5.5.1 Windows installer, an executable required to complete this tutorial is missing. Before beginning this exercise, please download train.zip and extract the .exe within it to <install-dir>\lib\liblinear.
1. Deploy the Attivio Factbook project
Open the Attivio Quick Start Tutorial in a new tab in your browser, follow the instructions for Steps 1–6 to deploy and start the Factbook project and run its connectors, and return to this page when complete.
When we created the Factbook project, we included the demo group of modules (-g demo), which includes the relevancy module. Without any further changes to our project, we are able to create new relevancy models, generate signal data and train those models.
In this tutorial, we will focus only on the documents in the country and news tables when building our relevancy models, so let's delete our index and rerun those connectors only by executing the following steps:
Go to http://localhost:17000/admin/engines to open the Indexes page of the Attivio Admin UI.
Click the "delete all" link under the index named "index".
Enter "index" in the popup. Click OK.
Go to http://localhost:17000/admin/connectors to open the Connectors page of the Attivio Admin UI.
Select the "country" and "news" connectors and click the button in the top toolbar to run them.
2. Create users
Attivio's Search UI is programmed to feed signal data back into Attivio when users click on search results. However, this signal data must come from more than a single user to be useful. Unless the signal data includes multiple signals from multiple users for at least one query, there will not be enough data to differentiate between the documents returned for a query, and attempting to train the model will fail. Therefore, we need to create a couple of users and log into Search UI as each user to generate some signal data by clicking on some results for a number of queries.
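The requirement described above can be illustrated with a small sketch. The (user, query, document) tuple layout and the function name has_trainable_query are hypothetical and chosen only for illustration. This is one reading of the stated condition, not Attivio's actual training check.

```python
from collections import defaultdict


def has_trainable_query(signals):
    """Return True if some query has two or more signals from two or more users.

    `signals` is an iterable of (user, query, document) tuples. This layout is
    an assumption for illustration, not the Signal Tracking API's format.
    """
    by_query = defaultdict(list)
    for user, query, document in signals:
        by_query[query].append((user, document))
    for clicks in by_query.values():
        users = {user for user, _ in clicks}
        if len(clicks) >= 2 and len(users) >= 2:
            return True
    return False
```

Under this reading, clicks from a single user would not be enough, no matter how many results that user clicks on.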
Execute the following steps to create and load users into Attivio. Afterwards, we'll be able to log into Search UI as each user, execute queries and click on some results to produce our signal data.
Create a file named users.xml with the contents below and save it to <project-dir>\resources\users.xml.
<?xml version="1.0" encoding="UTF-8"?>
<principals>
  <user id="user1@test" name="user1" password="password1"/>
  <user id="user2@test" name="user2" password="password2"/>
</principals>
Create a file named default-users-authentication-provider.xml with the contents below and save it to <project-dir>\conf\bean\.
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns="http://www.springframework.org/schema/beans"
       xmlns:util="http://www.springframework.org/schema/util"
       xmlns:sec="http://www.springframework.org/schema/security"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
         http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util.xsd
         http://www.springframework.org/schema/security http://www.springframework.org/schema/security/spring-security-3.1.xsd">
  <bean name="default-users-authentication-provider" class="com.attivio.security.authentication.XmlBasedAuthenticationProvider">
    <property name="xmlFile" value="users.xml"/>
  </bean>
</beans>
Edit the file <project-dir>\conf\features\core\DeployWebapp._searchui.xml and replace its contents with the following:
<?xml version="1.0" encoding="UTF-8"?>
<ff:features xmlns:ff="http://www.attivio.com/configuration/config"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xmlns:fbase="http://www.attivio.com/configuration/features/base"
             xmlns:f="http://www.attivio.com/configuration/features/core"
             xsi:schemaLocation="http://www.attivio.com/configuration/config http://www.attivio.com/configuration/config.xsd
               http://www.attivio.com/configuration/features/base http://www.attivio.com/configuration/features/baseFeatures.xsd
               http://www.attivio.com/configuration/features/core http://www.attivio.com/configuration/features/coreFeatures.xsd">
  <f:deployWebapp authentication-provider-ref="default-users-authentication-provider"
                  context-path="/searchui"
                  directory="webapps/searchui"
                  enabled="true"
                  featureNameSource="contextPath"
                  nodeset="*"
                  requires-authentication="true">
    <f:menuEntry blank="true" label="Search UI" path="search" uri="/searchui"/>
  </f:deployWebapp>
</ff:features>
Save and close the file.
Edit the file <project-dir>\conf\features\core\DeployWebapp._rest.xml and replace its contents with the following:
<?xml version="1.0" encoding="UTF-8"?>
<ff:features xmlns:ff="http://www.attivio.com/configuration/config"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xmlns:fbase="http://www.attivio.com/configuration/features/base"
             xmlns:f="http://www.attivio.com/configuration/features/core"
             xsi:schemaLocation="http://www.attivio.com/configuration/config http://www.attivio.com/configuration/config.xsd
               http://www.attivio.com/configuration/features/base http://www.attivio.com/configuration/features/baseFeatures.xsd
               http://www.attivio.com/configuration/features/core http://www.attivio.com/configuration/features/coreFeatures.xsd">
  <f:deployWebapp authentication-provider-ref="default-users-authentication-provider"
                  context-path="/rest"
                  directory="webapps/rest"
                  enabled="true"
                  featureNameSource="contextPath"
                  nodeset="*"
                  requires-authentication="true">
    <f:menuEntry label="Debug Search" name="debug-search" path="search" uri="/rest/search"/>
  </f:deployWebapp>
</ff:features>
Save and close the file.
Open the CLI and execute the deploy command.
<install-dir>\bin\aie-cli.exe -p C:\attivio\projects\Factbook
aie> deploy
Once the system restarts, you should be able to log into Search UI using any of the name/password combinations in the users.xml file.
To completely log out of Search UI, use the Log Out link on a page of the Admin UI such as http://localhost:17000/rest/search or close all browser windows. You can also launch a new incognito window.
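If more test users are needed, the users.xml file from this section can be generated rather than typed by hand. The following is a minimal sketch using Python's standard library. The function name build_users_xml is hypothetical, and the element and attribute names are copied from the users.xml example above.

```python
import xml.etree.ElementTree as ET


def build_users_xml(users):
    """Build the <principals> document from (id, name, password) tuples."""
    root = ET.Element("principals")
    for user_id, name, password in users:
        ET.SubElement(root, "user", {"id": user_id, "name": name, "password": password})
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# The two users defined in this tutorial.
USERS = [
    ("user1@test", "user1", "password1"),
    ("user2@test", "user2", "password2"),
]
```

Writing the string returned by build_users_xml(USERS) to <project-dir>\resources\users.xml should give a file equivalent to the hand-written one, apart from whitespace.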
3. Create the initial relevancy models
Execute the following steps to create two new relevancy models, "R&D" and "Sales".
If you are logged into Search UI as one of the users in users.xml, be sure to log out prior to executing the following steps.
Open the Relevancy Model Admin UI at http://localhost:17000/relevancy/. Log in with username aieadmin and password attivio.
Select the model named "default" in the list so that it appears in the inspector on the right-hand side of the window. Then choose the Copy command from the inspector's menu (the gear icon). The Copy Relevancy Model dialog appears, in which you can enter a name for the new model. Name it "R&D". Click the Copy button to complete the process. The new model will appear in the list.
Click the yellow circle next to version 1 in the inspector on the right-hand side of the window to publish it.
Repeat steps 2 - 3 to create and publish another relevancy model named "Sales".
4. Display the new relevancy models in Search UI
Search UI applies the default relevancy model to all queries out of the box. Execute the following steps to expose the two new relevancy models in Search UI so that we can apply them to searches.
Edit the file <install-dir>/modules/searchui/resources/searchui/configuration.properties.js and locate the following text:
SearchUISearchPage: {
  // The names of the relevancy models to be able to switch between. If this is an empty array,
  // the server will be queried for the list of available relevancy models and they will be used.
  // To force the UI to always use a single model when making queries, set this to an array with
  // that single name as its sole element.
  relevancyModels: [
    'default',
  ],
Replace the above snippet with the following:
SearchUISearchPage: {
  // The names of the relevancy models to be able to switch between. If this is an empty array,
  // the server will be queried for the list of available relevancy models and they will be used.
  // To force the UI to always use a single model when making queries, set this to an array with
  // that single name as its sole element.
  relevancyModels: [
    'default',
    'R&D',
    'Sales',
  ],
Open the CLI and execute the deploy command.
<install-dir>\bin\aie-cli.exe -p C:\attivio\projects\Factbook
aie> deploy
Once Attivio restarts, the new relevancy models should be displayed on the search results of Search UI. You may need to clear your browser's cache in order to see them.
5. Create the signal data
Now that we have our users defined and two new relevancy models created and accessible in Search UI, we need to log in as various users, execute some queries and click on some results to produce the signal data which we will use later to train our models.
If Search UI shows you logged in as aieadmin, log out first so that you can log in as one of the users defined in users.xml. To completely log out of Search UI, use the Log Out link on a page of the Admin UI such as http://localhost:17000/rest/search or close all browser windows. You can also launch a new incognito window.
Execute the following steps for each of our 2 users:
Log into Search UI as user1 and click the Go button to go to the results page for the default search of *:*.
Select R&D from the drop-down menu next to the Relevancy Model drop-down menu.
Enter "United States" in the search box and click Go.
Click on the titles of one or more results (try to avoid clicking on the #1 document). Make a note of the documents you clicked on so you can check that they appear higher in the results after training the model.
Repeat steps 3 and 4 doing searches for "China" and then "Russia", each time clicking on some of the results lower in the result set.
Log out of Search UI as that user.
Repeat steps 1 - 6 with user2's credentials, executing the same queries and clicking on either the same or other results, still taking care to avoid clicking on the first result.
Repeat steps 1 - 7 again for each user, but this time select Sales from the drop-down menu next to the Relevancy Model drop-down menu.
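To keep track of which searches still need clicks, the combinations covered by the steps above can be enumerated. This is only a checklist sketch. The function name click_sessions is hypothetical, and the user, model and query names are the ones used in this tutorial.

```python
from itertools import product

USERS = ["user1", "user2"]
MODELS = ["R&D", "Sales"]
QUERIES = ["United States", "China", "Russia"]


def click_sessions(users=USERS, models=MODELS, queries=QUERIES):
    """Enumerate every (model, user, query) combination the steps above cover."""
    return list(product(models, users, queries))
```

With two models, two users and three queries, the list contains twelve searches, each of which should receive at least one click on a result other than the first.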
6. Flush the signal data
By default, Attivio consumes signal data collected via the Signal Tracking API only once per day. To speed things up for the sake of our tutorial, we can use a tool like Postman to force Attivio to flush the signal data we just created immediately. Postman is a tool which allows you to interact with RESTful APIs such as Attivio's Signal Tracking API.
Execute the following steps to flush the signal data we've produced in the preceding steps.
Download and install Postman from https://www.getpostman.com/. Once installed, run Postman.
Change the drop-down which displays "Get" to "Post".
Enter http://localhost:17000/rest/signals/flush?force=true into the request URL field.
Click Send. If successful, you should receive a 200 response code.
Now that our signal data is ready, we can train our models.
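The same POST request can also be sent from a short script instead of Postman. The following is a minimal sketch using only Python's standard library. The function names are hypothetical, and the URL is the one given in the steps above. It assumes the endpoint accepts an empty request body and does not ask for credentials. If it does ask, an authentication handler would need to be added.

```python
import urllib.request

# URL taken from the Postman steps above.
FLUSH_URL = "http://localhost:17000/rest/signals/flush?force=true"


def build_flush_request(url=FLUSH_URL):
    """Build an empty-bodied POST request for the signal flush endpoint."""
    return urllib.request.Request(url, data=b"", method="POST")


def flush_signals(url=FLUSH_URL, timeout=30):
    """Send the flush request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(build_flush_request(url), timeout=timeout) as response:
        return response.status
```

Calling flush_signals() against a running Attivio instance should return the same response code that Postman reports.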
If you are unable to install Postman to flush the signal data, you can set the following two properties in the <project-dir>\conf\properties\core-app\attivio.core-app.properties file:
signal.flush.delay=60
signal.flush.period=60
Open the CLI and execute the deploy command.
<install-dir>\bin\aie-cli.exe -p C:\attivio\projects\Factbook
aie> deploy
This will configure Attivio to flush signal data every minute instead of the default period of 24 hours. Once http://localhost:17000/rest/signals is returning signal data, you can proceed to the next step.
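The properties edit described above can be scripted if it has to be repeated across projects. The following is a minimal sketch. The helper name set_properties is hypothetical, and it assumes simple key=value lines with no line continuations.

```python
def set_properties(text, updates):
    """Return properties text with each key in `updates` set, appending missing keys."""
    remaining = dict(updates)
    lines = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and key in remaining:
            # Overwrite the existing value in place.
            lines.append(f"{key}={remaining.pop(key)}")
        else:
            lines.append(line)
    # Append any keys that were not already present.
    lines.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(lines) + "\n"


# The two properties named in this section.
FLUSH_SETTINGS = {"signal.flush.delay": "60", "signal.flush.period": "60"}
```

Reading attivio.core-app.properties, passing its text through set_properties(text, FLUSH_SETTINGS) and writing the result back leaves other properties and comments untouched. The deploy command is still required afterwards.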
7. Train the relevancy models
Now that we have created some signal data for our two relevancy models, let's return to the Relevancy Model Admin UI at http://localhost:17000/relevancy/ so we can train them with this signal data. Log in with username aieadmin and password attivio.
Select the model named "R&D" in the list so that it appears in the inspector on the right-hand side of the window. Click the edit icon to edit the model.
Convert the model into a Machine Learning Configured Model by selecting "Machine-Learning-Configured-Model" from the Model Type drop-down menu.
Check "Remove Old Versions" and set "Versions to Keep" to 10.
Click NEXT.
Switch "Include/Exclude" to "Include only the features selected below, ignoring all others" and use the controls to add "table_country" and "table_news" to the right panel.
Click NEXT.
Enter "click" into the "Signal Types" field and click UPDATE RELEVANCY MODEL.
Training of the model begins automatically. Once it completes, the inspector on the right-hand side of the window lists two versions of our model: version 1 is the original user-defined model, and version 2 is our Machine Learning Configured Model.
Click the icon to display the features which were automatically configured as a result of the training.
Click the yellow circle next to version 2 to publish it.
Repeat steps 1 - 6 for the "Sales" relevancy model.
8. Test the converted Machine Learning relevancy models
Log back into Search UI as user1 and click the Go button to go to the results page for the default search of *:*.
Select R&D from the drop-down menu next to the Relevancy Model drop-down menu.
Re-execute your searches for "United States", "China" and "Russia". You should notice that the results you clicked on earlier, or other results similar to them, have moved higher in the results.
9. Compare relevancy models
We've seen how the machine learning relevancy model has affected our results, but let's dive deeper into how much. Return to the Relevancy Model Admin UI at http://localhost:17000/relevancy/ so we can compare the default relevancy model with one of our new ones. Log in with username aieadmin and password attivio.
Click the icon to access the "Compare Relevancy Models" tool.
Use the controls at the top of the page to compare version 2 of the "R&D" model with version 0 of the "default" model. Switch "Advanced Query Language" to "Simple Query Language", enter "United States" in the "Query to compare" field and hit ENTER.
In our example, result 1 using the R&D relevancy model moved up 2 places from its position using the default relevancy model.
You can click the two icons shown for a document to display its feature vector and its score explanation, respectively.
Other documents moved lower in the results.
The "Compare Relevancy Models" tool can also be used to compare different versions of the same model.
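The movement that the comparison tool reports can be reproduced by hand for any two ranked result lists. The following is a minimal sketch of that arithmetic. The function name rank_changes is hypothetical, and the document IDs in the usage note are placeholders, not actual Factbook results.

```python
def rank_changes(baseline, candidate):
    """Map each document found in both lists to the number of places it moved up.

    Positive values mean the document ranks higher in `candidate` than in
    `baseline`; negative values mean it moved lower.
    """
    baseline_rank = {doc: position for position, doc in enumerate(baseline, start=1)}
    return {
        doc: baseline_rank[doc] - position
        for position, doc in enumerate(candidate, start=1)
        if doc in baseline_rank
    }
```

For example, rank_changes(["a", "b", "c"], ["c", "a", "b"]) reports that "c" moved up 2 places while "a" and "b" each moved down 1.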
What Next?
- See Machine Learning Relevancy to learn more about the various components that make up Machine Learning Relevancy.