The Attivio Cognitive Search and Insight Platform is designed to scale up or down to meet the needs of your situation.
While our minimum and recommended hardware specifications reflect these possibilities, it is important to consider the size and complexity of your development or production system when choosing hardware. For instance, a standard laptop is not a good choice for prototyping a large production system, because certain features may require more resources than the laptop can provide.
If you have any questions about hardware, contact Attivio Support for help in properly sizing your hardware to meet your application requirements.
Attivio uses its own internal Java Virtual Machine, so project systems do not need to have Java installed separately. Systems on which API development is performed require the Java SE Development Kit 11 (JDK 11).
In order to install and configure Attivio, all systems must be able to communicate with each other via their primary IP addresses. There is no requirement that the machines have access to the Internet.
Attivio allows developers to run in environments with limited memory. Before running in a low-memory environment, read the Memory Usage Tuning guide.
Machines with limited memory should only be used in production environments with small indexes and very light query load. In addition, certain content types and/or modules such as binary file formats and text extraction require more memory than a baseline system.
Follow these configuration guidelines on systems used to develop Attivio applications:
Follow these configuration guidelines on systems used to deploy Attivio to end-users:
When deploying Attivio, whether in an unclustered or a multi-node topology, it is critical that each Attivio host can communicate properly with itself and with every other host. The following steps are recommended when configuring Linux servers for a multi-node environment.
Set the hostname of each host to the FULLY QUALIFIED NAME.
attivio1.lab.mycompany.com rather than simply
set this in
Reboot the host for hostname settings to take effect. Repeat this for every node.
Edit the /etc/hosts file to statically define the forward and reverse lookup for each node in the cluster.
Add the following *AFTER* the entries already present in the file:
### Attivio cluster
# format
# ip_address fully_qualified_hostname alias
10.1.1.101 attivio1.lab.mycompany.com attivio1
10.1.1.102 attivio2.lab.mycompany.com attivio2
# etc…
A common mistake here is to swap the host alias and the fully qualified hostname.
The following is incorrect:
10.1.1.101 attivio1 attivio1.lab.mycompany.com
Aliases should follow fully qualified host names.
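The ordering rule above can be checked mechanically. The following is a minimal sketch, not an Attivio tool: `find_swapped_host_entries` is a hypothetical helper that treats any entry whose first name contains no dot as a swapped alias. It is meant to be fed only the Attivio cluster section, since default entries such as `localhost` would also be flagged.

```shell
#!/usr/bin/env bash
# find_swapped_host_entries (hypothetical helper) reads hosts-file entries on
# stdin and prints every entry whose first name (field 2) contains no dot,
# that is, an alias listed before the fully qualified host name.
# It returns non-zero if at least one such entry is found.
find_swapped_host_entries() {
    awk '
        /^[ \t]*#/ { next }               # skip comment lines
        NF < 2     { next }               # skip blank or incomplete lines
        $2 !~ /\./ { print; bad = 1 }     # field 2 should be the FQDN
        END        { exit bad }
    '
}
```

For example, `sed -n '/### Attivio cluster/,$p' /etc/hosts | find_swapped_host_entries` would print nothing and succeed when every cluster entry lists the fully qualified name first.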
The Attivio cluster section of the /etc/hosts file has to be distributed to all cluster nodes.
In order to validate the DNS settings, run the following commands on each host in the cluster.
hostname -f
attivio1.lab.mycompany.com
The value returned should be the fully qualified host name for the current host.
Run getent hosts for all other hosts and their respective IPs in the cluster.
getent hosts attivio2.lab.mycompany.com
10.1.1.102 attivio2.lab.mycompany.com
getent hosts 10.1.1.102
10.1.1.102 attivio2.lab.mycompany.com
The values returned should be the fully qualified host names for the other hosts. Repeat this process on each host to confirm consistent values are returned from each host to each other host.
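When a cluster has more than a couple of nodes, the forward-lookup half of this check can be scripted. This is a minimal sketch under stated assumptions: `fqdn_from_getent` and `check_peer` are hypothetical helper names, and the host names passed in are the example names used in this section.

```shell
#!/usr/bin/env bash
# fqdn_from_getent (hypothetical helper) prints the first name from a line of
# `getent hosts` output, which has the form: ip_address name [alias...]
fqdn_from_getent() {
    awk 'NR == 1 { print $2 }'
}

# check_peer (hypothetical helper) resolves one peer and compares the name
# returned with the expected fully qualified host name.
check_peer() {
    local expected="$1" actual
    actual=$(getent hosts "$expected" | fqdn_from_getent)
    if [ "$actual" = "$expected" ]; then
        echo "OK   $expected"
    else
        echo "FAIL $expected (resolved to '${actual:-nothing}')"
        return 1
    fi
}
```

A loop such as `for h in attivio1.lab.mycompany.com attivio2.lab.mycompany.com; do check_peer "$h"; done`, run on each host, gives a quick pass/fail summary. Reverse lookups by IP address still need the same comparison.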
Once all DNS settings have been validated, be sure to restart all Attivio processes, including the Attivio Agent.
When using SSL certificates, it is critical that the common name (CN) that is used in the certificate match the fully qualified domain name that is resolved by reverse DNS lookup.
The settings described here are required for proper execution of Attivio processes on Linux. Modify any settings on your Linux OS to meet the following requirements.
View the current limits using the ulimit -a command, and then change the settings by adding the following entries to your limits.conf file:
The following setting is not in the limits.conf file. To make it permanent, add the following line to /etc/sysctl.conf and reboot the machine:
Sometimes the Attivio process may spawn one or more child processes to execute certain commands or functionality. For example, on a system configured with replication, or when Advanced Text Extraction is being used, child processes will be forked and executed. On Linux, if the machine does not have sufficient free memory during the fork stage, these child processes will fail. The easiest way to avoid this problem is by setting the Linux kernel vm.overcommit_memory parameter to a non-zero value. The recommended value for
To set this value permanently on Linux, you must modify the /etc/sysctl.conf file and reboot, as with the vm.max_map_count setting mentioned above.
With vm.overcommit_memory=0, Attivio will likely run without errors as long as free memory is at least 1x the amount of memory allocated to Attivio. For example, on a machine with 16 GB of memory (including swap space), if Attivio is configured to run with 4 GB of memory, then as long as at least 4 GB is free, Attivio will be able to fork these child processes.
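The 1x rule of thumb above is simple arithmetic and can be expressed as a quick check. This is an illustrative sketch only: `fork_headroom_ok` and `current_overcommit_mode` are hypothetical helper names, and the numbers passed in are whatever you measure on your own host.

```shell
#!/usr/bin/env bash
# fork_headroom_ok (hypothetical helper) applies the 1x rule of thumb: free
# memory (including swap) must be at least the memory allocated to Attivio.
# Both arguments are whole numbers in the same unit, for example GB.
fork_headroom_ok() {
    local free="$1" allocated="$2"
    [ "$free" -ge "$allocated" ]
}

# current_overcommit_mode prints the kernel's current policy on Linux
# (0, 1, or 2), read from the standard procfs location.
current_overcommit_mode() {
    cat /proc/sys/vm/overcommit_memory
}
```

With the figures from the example, `fork_headroom_ok 4 4` succeeds, while `fork_headroom_ok 3 4` fails, which indicates that forked child processes may not start.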
For Linux, the en_US.UTF-8 encoding must be installed.
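One way to confirm that the locale is present is to search the output of `locale -a`. The sketch below assumes a glibc-based distribution, where the name is commonly listed as `en_US.utf8`; `has_en_us_utf8` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# has_en_us_utf8 (hypothetical helper) reads locale names on stdin, one per
# line, and succeeds if en_US.UTF-8 is among them. Both the en_US.utf8 and
# en_US.UTF-8 spellings are accepted, case-insensitively.
has_en_us_utf8() {
    grep -qiE '^en_US\.utf-?8$'
}
```

Typical usage would be `locale -a | has_en_us_utf8 && echo "en_US.UTF-8 is installed"`.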
On Linux systems, Attivio only supports the bash shell.
Ensure that SELinux has been disabled or put into permissive mode.
Attivio recommends synchronizing clocks for all nodes in the topology using a tool such as NTP (http://www.ntp.org/). This helps when reconciling logs across nodes during troubleshooting. Synchronized clocks are required for the Hadoop master and slave nodes when configuring a multi-node topology.
Ensure that the libstdc++ related libraries are installed and up to date. One way to check which libraries you may need to update is to run the following command on RHEL/CentOS Linux:
If you are running Red Hat Enterprise Linux or CentOS 7.1, you may also need to install the 32-bit version of the zlib library, as it is not included in these distributions by default.
You must have glibc version 2.3 or later installed.
Attivio takes advantage of features introduced in glibc 2.13 when monitoring resource limits for Advanced Text Extraction processes. For best results, please install glibc 2.13 or later if offered for your Linux distribution and version.
Ensure that Python version 2.6.6 or higher is available.
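Version thresholds such as glibc 2.3, glibc 2.13, and Python 2.6.6 need a component-wise comparison, because a plain string or decimal comparison would rank 2.13 below 2.3. The following is a minimal sketch, assuming GNU `sort` with the `-V` option is available; `version_at_least` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# version_at_least (hypothetical helper) succeeds when version $1 is equal to
# or newer than version $2. It relies on the version sort of GNU sort (-V):
# if the required version sorts first, the installed one is new enough.
version_at_least() {
    local have="$1" want="$2"
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}
```

On a glibc-based system, `version_at_least "$(getconf GNU_LIBC_VERSION | awk '{print $2}')" 2.13` tests the recommended glibc level, and `version_at_least "$(python --version 2>&1 | awk '{print $2}')" 2.6.6` tests the Python requirement, assuming the interpreter is available on the path as `python`.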
The Attivio Platform installation's <INSTALL_DIR>/bin directory includes a linux_checker.sh Bash script. Run this script from the <INSTALL_DIR>/bin directory and review its output to ensure that your current system configuration meets Attivio's requirements and recommendations.
The script writes its output to stdout and to a log file named linux_checker.<HOSTNAME>.log. The output includes entries for a number of system tests. Each test returns one of three results:
The system meets Attivio's requirement and recommendation for the tested component.
The system meets Attivio's minimum requirement for the tested component, but does not meet the recommendation for this component.
The system does not meet Attivio's minimum requirement for the tested component.
For [FAIL] results, a message displays below the test result with additional details.
Make appropriate changes to address any system settings marked with [FAIL] messages in the Linux checker script's output.
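Because the same output goes to a log file, failures can be counted without reading the whole report. This is a rough sketch: `count_fail_results` is a hypothetical helper, and it assumes only that failing tests are marked with the literal text `[FAIL]`, as described above. The rest of the log layout is not assumed.

```shell
#!/usr/bin/env bash
# count_fail_results (hypothetical helper) prints the number of lines in a
# checker log that contain the literal marker [FAIL].
count_fail_results() {
    local log_file="$1"
    # grep -c exits non-zero when the count is 0, so mask the exit status.
    grep -cF '[FAIL]' "$log_file" || true
}
```

After running linux_checker.sh, pass the path of the generated linux_checker.<HOSTNAME>.log file to the function. A result of 0 means no test reported a failure.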
Attivio Platform 5.6.1 on Windows requires libraries from the Microsoft Visual C++ 2010 Redistributable Package (x64). Download the vcredist_x64.exe installer for this package and execute it on each Attivio Platform host to install the required libraries.
As of Attivio Platform 5.6.2 these libraries are included with the Attivio installer, and no separate download or installation step is required for Windows hosts.
On Windows systems, the only shell supported is the Command Prompt window. Other shells (in particular Cygwin) are not supported.
Attivio may fail to start on Windows installations that have only non-English language support. To work around this issue, install the English language pack for Windows.
Deprecations are called out in the tables below as needed.
Business Center UI
Attivio can be configured to run in multi-node mode with a Linux Hadoop cluster. The following versions of Hadoop and related packages have been tested and are supported:
The Hadoop cluster must have the following installed and configured for proper Attivio execution:
HDFS, ZooKeeper, and HBase installed and running
Software system specific parameters:
Attivio cannot offer generic advice about Linux requirements for Hadoop and HDFS; however, a few configuration changes are needed on every Hadoop node and every Attivio node for proper Attivio execution. These are noted below.
Set swappiness to 1 to reduce the amount of swapping to a minimum. To do this, edit the /etc/sysctl.conf file as root:
And then add this setting to the bottom of the file:
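As a rough sketch, assuming the standard kernel key `vm.swappiness`, the entry and a check that it is present might look like the following. `sysctl_conf_has` is a hypothetical helper, not part of Attivio or of the sysctl tooling.

```shell
#!/usr/bin/env bash
# Entry to append to /etc/sysctl.conf (standard kernel key):
#   vm.swappiness=1

# sysctl_conf_has (hypothetical helper) succeeds if FILE contains an
# uncommented "KEY = VALUE" entry, with optional spaces around "=".
sysctl_conf_has() {
    local file="$1" key="$2" value="$3"
    awk -F= -v k="$key" -v v="$value" '
        /^[ \t]*[#;]/ { next }                      # skip commented entries
        {
            gsub(/[ \t]/, "", $1); gsub(/[ \t]/, "", $2)
            if ($1 == k && $2 == v) found = 1
        }
        END { exit !found }
    ' "$file"
}
```

For example, `sysctl_conf_has /etc/sysctl.conf vm.swappiness 1` checks the persistent setting, and `sysctl -n vm.swappiness` shows the value the running kernel is using.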
Disable Transparent Huge Pages for better performance by editing /etc/rc.local as root:
And add this to the bottom of the file:
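A line commonly used for this purpose is `echo never > /sys/kernel/mm/transparent_hugepage/enabled`. The path is an assumption about your kernel, since some older Red Hat kernels expose it under `redhat_transparent_hugepage` instead. The kernel reports the active mode in square brackets, and the sketch below extracts it. `thp_mode` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# thp_mode (hypothetical helper) extracts the active mode, that is, the
# bracketed word, from a string such as "always madvise [never]".
thp_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}
```

After a reboot, `thp_mode "$(cat /sys/kernel/mm/transparent_hugepage/enabled)"` should print `never` if the change took effect.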
Set nofile and nproc high to allow for the large numbers of files and processes needed by Attivio. To do this, edit /etc/security/limits.conf as root:
And change or add these settings:
After making these changes, reboot your machines for them to take effect:
The minimum recommended Hadoop nodes and memory for a test system are as follows:
1 Instance, 32 GB RAM Hadoop Cluster
The minimum recommended Hadoop nodes and memory for a production system are as follows:
3 Instance, 64 GB RAM per node Hadoop Cluster
The actual memory required by Attivio on the cluster is determined by the Hadoop components, Attivio index processes, and Attivio modules. If the cluster does not have enough memory to launch a process, it will silently wait until enough resources become available.
To calculate the amount of memory required for the Hadoop cluster and determine what yarn.nodemanager.resource.memory-mb should be set to, follow the memory calculations described here.
Attivio cannot meet production SLAs in virtual environments where physical resources (CPU, memory, and I/O) are over-subscribed or not reserved, because overall system performance in such environments is unpredictable. All virtual resources for Attivio VMs should be backed with physical resources on a 1:1 basis. Over-subscribed or non-reserved virtual environments can be used for development and QA; however, performance may vary significantly based on available resources.
Applications using large memory footprints such as Attivio are particularly sensitive to VMotion pauses as it can take significant time to copy in-memory data. If you are hosting Attivio in a cluster with DRS/VMotion enabled, we recommend using VM/Host Affinity rules on the Attivio VMs to avoid VMotion on these systems during normal operation.
Amazon's default Linux settings set the max open file (nofile) and the max user process (nproc) limits to very low levels, which are incompatible with Attivio. It is good policy to set these properties to high values whether on a cloud server or not. See the Recommended Linux Settings above and Linux limit instructions here. Also, Amazon's Linux servers do not have any swap configured by default. Configuring a minimum of 16 GB of swap space is a reasonable starting point for Attivio hosts.
Although we have encountered this issue with Amazon's cloud, it may also be true of other cloud providers’ Linux images and other Linux distributions.
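On any freshly provisioned image, the effective limits can be compared against a threshold before Attivio is started. This is a minimal sketch: `limit_at_least` is a hypothetical helper, and the threshold in the usage example is a placeholder rather than Attivio's documented value.

```shell
#!/usr/bin/env bash
# limit_at_least (hypothetical helper) succeeds when a limit value, as
# printed by `ulimit -n` (nofile) or `ulimit -u` (nproc), is either
# "unlimited" or numerically greater than or equal to the minimum.
limit_at_least() {
    local current="$1" minimum="$2"
    [ "$current" = "unlimited" ] || [ "$current" -ge "$minimum" ]
}
```

For example, `limit_at_least "$(ulimit -n)" 65536 || echo "nofile is too low"`, run as the user that starts Attivio, flags a low open-file limit. Substitute the values required for your Attivio version for the placeholder 65536.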
It is a best practice to run only one Attivio project at a time on a given host. If a multi-project host crashes, it can be very difficult to determine the cause.