
Overview

The Attivio Command Line Interface (AIE-CLI or just CLI) is a small-footprint utility that runs in an interactive command window. It lets you start, stop, and monitor multiple servers, and also provides tools for deploying the project's source files to the configuration servers.

An instance of the AIE-CLI loads the configuration of a single project. Its actions are confined to the servers, services, and configuration files of that project.

Multiple AIE-CLI instances can all communicate with the same project.


Running AIE-CLI and Starting the Attivio Project

Start and stop nodes using the Attivio Command Line Interface executable, aie-cli. The AIE-CLI is based on jline, which closely follows the features of GNU shell interfaces. Command completion, history, and help are part of the interface. Commands issued from the AIE-CLI are sent to the Attivio Platform Agent services running on Attivio project hosts and let you start or stop any Attivio processes on any given host. As such, the host running the AIE-CLI must have network connectivity to all Attivio project hosts.

The AIE-CLI is invoked for a specific project. You must either start aie-cli in a project directory or specify where the project directory is located with the -p option.

Also, the createproject tool generates a project configuration appropriate for execution on your local system. When modifying the project to run on multiple nodes, make sure the projectdir property in your topology-nodes.xml file is adjusted correctly for the target nodes. See Multi-Node Topologies and Attivio Configuration for additional details.

  To start your Attivio project using the AIE-CLI, perform these steps:

  1. Open a Command Prompt or terminal window (a separate one from the one running the Attivio Platform Agent, if this service was started at the command line), and execute the following commands, replacing the <INSTALL_DIR> and <PROJECT_DIR> placeholders with your own Attivio Platform installation path and your Attivio project directory path:

    Linux
    cd <INSTALL_DIR>/bin
    sh aie-cli -p <PROJECT_DIR> -l /opt/attivio/data-agent
    Windows
    cd <INSTALL_DIR>\bin
    aie-cli.exe -p <PROJECT_DIR> -l C:\attivio\data-agent

    The "Loading configuration" message appears. The project automatically loads and becomes the active project for the AIE-CLI's management functions. By default, if the project or resource files change while the AIE-CLI is running, they reload when a deploy command executes.

  2. When the configuration finishes loading and the aie> prompt appears, type start all and press Enter to start all of your Attivio project's service and node components.

    1. If this is the first time your project has been started, its configuration will be automatically deployed; this will not occur on subsequent startups.

  3. At the prompt, type status and press Enter to see the current status of all your Attivio project's server, node, and Hadoop service components. (Hadoop services are shown only for clustered projects that use the Attivio Hadoop cluster.) Repeat this command until all components show RUNNING status. When this occurs, your project is ready for use.
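The "repeat until all components show RUNNING" step above amounts to a polling loop. The following Python sketch illustrates only that loop, under stated assumptions: get_statuses is a hypothetical callable that you supply and that returns the current status strings (for example, by parsing the output of the status command). It is not part of the AIE-CLI, and the interval and timeout values are arbitrary.

```python
import time


def wait_until_running(get_statuses, interval_seconds=5.0, timeout_seconds=600.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until every reported status is RUNNING or the timeout expires.

    get_statuses is a hypothetical, caller-supplied function returning an
    iterable of status strings such as "RUNNING" or "NOT_STARTED".
    Returns True if all components reached RUNNING, False on timeout.
    """
    deadline = clock() + timeout_seconds
    while True:
        statuses = list(get_statuses())
        # An empty list is treated as "not ready" rather than "all running".
        if statuses and all(status == "RUNNING" for status in statuses):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_seconds)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting.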

AIE-CLI Startup Options

This table lists the available options for the Command Line Interface executable, <INSTALL_DIR>/bin/aie-cli.

Option | Default Value | Description
-allowPartialAgents

Allows aie-cli to run when the Attivio Platform Agent service is offline or inaccessible on one or more hosts defined in the Attivio project topology. (Without this option, aie-cli will refuse to start if any project host's Agent service is offline or inaccessible.)

Use this option only when necessary.

If a previously inaccessible Agent service instance becomes accessible again, the AIE-CLI will not recognize it until aie-cli is restarted. When restarting the AIE-CLI to pick up newly-active Agent instances, re-run any deploy commands to ensure all systems are up to date.

-dp

Disables the profiler.
-e

"default", or the value denoted by the currentEnvironment attribute in the <PROJECT_DIR>/conf/configuration.xml file

Required for multiple environments. Sets the current environment for the project (one of the environments defined in the environments subdirectory of the project's conf subdirectory).
-l

current working directory

Lets you specify the directory to which the aie-cli.log file is written.

If you do not use this option, the aie-cli.log file is written to the current working directory.

-mm, --minmem

Attivio typically detects available memory and adjusts accordingly. This option forces a minimum memory setting for running a simple, single-node localhost topology on a non-production machine. For example, you might want to use the minmem option when running Attivio on a laptop with only 4 GB of memory.
-ni

Specifies that aie-cli runs in a non-interactive mode. Input is read from stdin until it is empty, at which point aie-cli exits. This option is intended for scripting.

As an example, on Linux you can use this command to start all of your project's servers and processes non-interactively from the system prompt:

echo start all | aie-cli -p <PROJECT_DIR> -ni
-p

current working directory

Specifies the project directory for the aie-cli command. The specified project loads and becomes the active project for the AIE-CLI's management functions. If the -p option isn't specified, aie-cli looks in the current working directory for the project's configuration files.

-r

none (must be specified)

Given the name of a remote project (e.g., -r myproject), this option downloads the project configuration to the local directory. Requires that the list of ZooKeepers be specified in the topology-nodes.xml file, which identifies the host and port of the ZooKeeper instance to communicate with. For full functionality, the AIE-CLI must be restarted after the download completes.

-sc

Provides a simple console mode to preserve console scrolling and simplify stdin/stdout displays. This mode does not show the automatically updating status on the bottom of the window and does not auto-complete commands. This mode may be useful with the status command when the Attivio project contains enough nodes that they will not all display in the default AIE-CLI console window.

-sysPct

0

Sets the percentage of physical memory reserved for the OS, expressed as an integer between 0 and 100 inclusive.

Notes

  • Attivio Platform Agent service instances are inferred from the agentPort attribute in your Attivio project's topology-nodes.xml for your current environment and the hosts used therein.
  • All Agent instances for a given environment in a project must use the same port.
  • The agentPort may not be the same as any other port used in the topology.
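The -ni option described in the table above reads commands from stdin, so it can be driven from a script. The following Python sketch shows one way to do that. It is an illustration, not an official Attivio tool: the function names and the project path in the usage note are hypothetical, and sending several newline-separated commands in one run is an assumption based on the "read from stdin until it is empty" description, so verify it on your system.

```python
import subprocess


def build_cli_invocation(project_dir, commands, cli_path="aie-cli"):
    """Return (argv, stdin_text) for a non-interactive aie-cli run.

    Only the documented -p and -ni options are used. cli_path is assumed
    to resolve to <INSTALL_DIR>/bin/aie-cli on your system.
    """
    argv = [cli_path, "-p", project_dir, "-ni"]
    # One command per line; aie-cli exits when stdin is exhausted.
    stdin_text = "\n".join(commands) + "\n"
    return argv, stdin_text


def run_cli(project_dir, commands, cli_path="aie-cli"):
    """Run aie-cli non-interactively and return the CompletedProcess."""
    argv, stdin_text = build_cli_invocation(project_dir, commands, cli_path)
    return subprocess.run(argv, input=stdin_text, text=True, capture_output=True)
```

For example, run_cli("/path/to/project", ["start all"]) is intended to be equivalent to the echo start all | aie-cli -p <PROJECT_DIR> -ni command shown in the table.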

Stopping the Attivio Project and Quitting the AIE-CLI

To stop the project, issue the stop all command at the AIE-CLI prompt, then issue the quit command to exit the AIE-CLI and return to the system prompt.

AIE-CLI Commands

The table below lists all available Command Line Interface commands. From within the AIE-CLI itself, you can also use the help command to find information about a given command.

Command | Arguments | Description
breakprojectlock —
Breaks a project lock that was created and left around by a previous Configuration Server that was killed prior to completion. WARNING: This is a crash-recovery tool. It is possible to misuse this tool in ways that degrade system performance and possibly corrupt configuration data.
clean

<name>

The state for the specified node is cleaned. When a node experiences a hard crash, the system waits up to 10 minutes before declaring the node dead. This wait time is necessary to avoid false positives due to network interrupts, excessive GC times, etc. Cleaning the node's state ends this waiting, declaring to the rest of the system that the node is indeed dead.
clear
Clears the screen.
cleardynamicchanges
This command clears all the sessions created on the config server for dynamic changes made through the Admin UI. In rare cases it may be necessary to clear these sessions if the user is unable to save dynamic changes through the Admin UI or if a deploy fails due to pending dynamic changes.
compare

Shows which files are different locally and remotely.

Status codes are:

  • C - a conflict exists between a local and remote version of the file. This shows up for .remote files found after the update command.
  • U - a local update to this file exists that is different from the remote version.
  • D - a file was deleted locally.
  • A - a file was added locally.
  • R - a file that was remotely changed since the last 'update' command. Appears as a single R for all dynamic changes, config and generic.
  • ? - When possible, this appears for files that are not proper configuration files.
delete

[stopIfRunning] [force]

Deletes the project from the cluster. Note:

  • If your project currently has dynamic changes made through the Admin UI, the delete command fails unless the force flag is provided.
  • If your project is currently running, then the delete command fails unless you pass the stopIfRunning flag.

deletesnapshot

<snapshotName>

Deletes the named snapshot. See additional information.
deploy

[force] [norestart] [noyarn]

Deploys local config and resource files to the configuration server/ZooKeeper. Note:

  • An error occurs if the remote configuration files or resources have changed since the last deploy. You must update first to incorporate those changes into the local project or issue the deploy force command to overwrite them.
  • The deploy command only uploads resource files that are different than those on the server, and deletes those resources that have been removed locally.
  • Use the force flag to overwrite newer configuration changes and force the re-upload of resources.
  • To avoid an automatic node restart when nodes are running, use deploy norestart. Use this flag carefully. If the configuration changes, and one node restarts before others, it will use the newer configuration, which may cause errors. Also, if resources change on a running node, this may cause problems depending on the resource and use of it.
  • Use the noyarn flag to avoid rebuilding applications/containers and pushing code to the Hadoop cluster.
deploykeytab

<keytab file> <hdfs directory>

Deploys a keytab file to the specified HDFS directory. The keytab file parameter must point to a file on the local machine. The HDFS directory parameter is a directory relative to the Slider keytabs directory specified by the slider.hdfs.keytab.dir property in attivio.core-app.properties (e.g., /attivio/.slider/keytabs/<hdfs directory> for the default value of /attivio/.slider/keytabs). Deploying a keytab file is necessary when deploying to HDFS and YARN using a non-privileged principal.

exportsnapshot

<snapshotName> <targetHdfsUrl>

Copies the snapshot data to any arbitrary HDFS location, potentially in another datacenter. See additional information.

localexportsnapshot

<snapshotName> <targetUrl>

Copies the snapshot data to any arbitrary file system location.

importconfig

<full path to configuration file>

Imports connectors, components, and workflows in 4.0/4.1 configuration file formats and serializes them into the new configuration file formats in the proper locations.

importsnapshot

<sourceHdfsUrl>

Restores a snapshot that was previously exported, given the HDFS path as the only identifier. The path must exist on the same HDFS instance as the target Attivio system. See additional information.

localimportsnapshot

<sourceUrl>

Restores a snapshot that was previously exported to a standard file system, given the path as the only identifier. The path must exist on the same node that the AIE-CLI is running on.

help

[<command>]

Prints usage information for the command. If no argument is present, the list of all commands is printed.
info
Shows the path information for the current environment.
kill

node <name>

index <name>

perfmon

config|store <host-regex> <port>

java <host-regex> <process handle ID>

When used with node, kills the node with the given name. The clean command is automatically executed after the node is killed.

When used with index, kills the index with the given name.

When used with perfmon, kills the single performance monitor instance.

When used with other types of processes, the associated process on the indicated host and port is killed.

Use this command only as a last resort, as it bypasses normal shutdown procedures. The stop command is preferred. You can use a regular expression for name or host.

listsnapshots

 Lists all snapshots in descending order, sorted by the time they were requested. See additional information.
quit

Quits the AIE-CLI.

resolved

Assumes all conflicts are resolved, then deletes any *.remote files found.

restoresnapshot

<snapshotName>

Restores the state of the index, store, or ZooKeeper from a previous snapshot. See additional information.

set

shutdownImmediately | shutdownSafely

Sets an aie-cli option:

  • shutdownImmediately — all stop-type operations happen as quickly as possible; documents in flight are dropped and index replication operations are interrupted.
  • shutdownSafely — all stop-type operations occur safely; documents in flight and index replication operations are allowed to complete. Shutdown can take an indeterminate amount of time due to this processing. This value is the default.
showerr

node <name>

config | store <host>

perfmon

When an Attivio process (configserver, perfmon, store, or node) encounters an error during startup, use the showerr command to retrieve the startup error. The command detects the context and complete host names and node names as appropriate.
snapshot

index | store | zk | all

Creates a new snapshot based on the current state of the index, store, and/or ZooKeeper. See additional information.
start

all

node <name> | nodeset <name>

perfmon

config | store [<host>]

index <name>

java <host> <class name> [args]

Start the specified process or processes. The start command does not return until all processes have started up.

 

  • all: Starts all processes associated with the project, in this order:
    • aie-configserver
    • aie-store
    • upload files from project conf, lib, lib-override, and resources
    • transfer the uploaded files
    • deploy project to configuration server
    • aie-perfmon
    • all aie-node processes
  • node: The named node is started. You can use a regular expression for name.
  • nodeset: All nodes in the named nodeset are started.
  • config: The specified config servers start (if no host is specified, all config servers start).
  • perfmon: The perfmon server starts.
  • store: The store starts (if no host is specified, all stores start).
  • index: The named index is started in YARN.
  • java: A process is started on the specified host, running the Java class specified by its fully qualified name. All remaining arguments are passed to the process that is started. By default, the following environment variables are set appropriately: HOSTNAME, PROJECT_NAME, AIE_ZOOKEEPER, and PROJECT_ENVIRONMENT. Additionally, the log directory for the process is set to the job directory for the process under the aie-agent data directory.

 

status

Lists all agent-controlled processes associated with the project and their current status.
stop

all

node <name> | nodeset <name>

perfmon

config | store [<host>]

index <name>

java <host> <class name> [args]

Stop the specified process or processes. The stop command does not return until all processes have stopped. The stop all command stops all processes associated with the project, in the opposite order of the start all command.

The stop command cannot stop servers while they are starting. To stop servers while they are starting, use the kill command. Also, the shutdown mode defaults to shutdownSafely; see the set command for more information.

tail

node <nodeName> <logFile>

store | config <hostName> <logFile>

perfmon <logFile>

Tail the log file of a node. The possible values for the logFile parameter can be seen using the completion feature of the AIE-CLI. Some common values are "log," "stdout," and "stderr."
update

[force [deleteLocalChanges]]

Downloads remote config files and resources and tries to merge them locally. If no new changes have been made to the deployed configuration since the last update, the update command reports this and does nothing.

Use the force flag to force the update and attempted merge. If there are changes to a file both locally and remotely, a version of the file with the extension ".remote" is created in your local configuration, which you must merge manually and then clean up.

Use update force deleteLocalChanges to revert/delete all local changes and update to the latest configuration and resources without resulting in any potential conflicts or .remote files.

wait

Waits for the latest start/stop commands to complete.

waitwhile

starting | stopping

Waits for all processes to exit the given state.
zk

get | ls | set | delquota | printwatches | create | stat | listquota | setAcl | getAcl | sync | redo | addauth | delete | deleteall | setquota | help

The zk command lets the AIE-CLI user communicate directly with ZooKeeper in a Hadoop environment. For an explanation of ZooKeeper commands see the ZooKeeper Getting Started Guide.
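For the deploykeytab command in the table above, the <hdfs directory> argument is interpreted relative to the Slider keytabs directory. The following Python sketch illustrates how the resulting HDFS path is composed for the default slider.hdfs.keytab.dir value given in the table. The function name is hypothetical, and stripping surrounding slashes from the argument is an assumption made for this illustration; how aie-cli itself normalizes the argument is not documented here.

```python
import posixpath

# Default value of slider.hdfs.keytab.dir, as stated in the deploykeytab row.
DEFAULT_KEYTAB_DIR = "/attivio/.slider/keytabs"


def resolve_keytab_dir(hdfs_directory, keytab_dir=DEFAULT_KEYTAB_DIR):
    """Return the HDFS directory that a keytab would be deployed to.

    hdfs_directory is treated as relative to keytab_dir. Surrounding
    slashes are stripped (an assumption for this sketch) so that
    posixpath.join does not discard the base directory.
    """
    return posixpath.join(keytab_dir, hdfs_directory.strip("/"))
```

With the default base directory, resolve_keytab_dir("myproject") yields /attivio/.slider/keytabs/myproject, where myproject is a placeholder directory name.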

AIE-CLI Commands For Clustered Indexes

The following commands are for dynamically changing the index. These commands only apply to clustered systems.

Command | Description
flexindex <indexName> addrow [<rowsToAdd>]

Add rows of searchers to the specified index. If rowsToAdd is not specified, it defaults to 1.

flexindex <indexName> removerow [<rowsToRemove>]

Remove rows of searchers from the specified index. If rowsToRemove is not specified, it defaults to 1.
flexindex <indexName> loadfactor <newLoadFactor>

Dynamically change the specified index's loadfactor (the number of index partitions hosted by a single process in YARN).

The number of partitions for the index must be evenly divisible by the specified loadfactor value: if your index has 6 partitions, for instance, you can specify loadfactor values of 1, 2, 3, or 6, but not values of 4 or 5. The default loadfactor value is 1.

Increasing an index's loadfactor value reduces the number of processes (and the amount of hardware) required to run the index, but also means that any process or hardware issue will affect more partitions. Decreasing the index's loadfactor value will split the index partitions over more processes/hardware, increasing the requirements but isolating partitions from one another and potentially improving stability and performance.

flexindex <indexName> abort

Reset the number of YARN containers for the specified index to the persisted state.

This command should only be used to abort a failed addrow, removerow or loadfactor command; it will otherwise have no effect.

The flexindex addrow, flexindex removerow, and flexindex loadfactor commands are re-entrant. That is, if the command fails or times out, you can run it again to complete the operation. If you wish to abort the command, you should run the flexindex abort command in order to ensure the correct number of YARN containers are running.
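The divisibility rule for flexindex loadfactor can be checked before issuing the command. The following Python sketch lists the loadfactor values that evenly divide a given partition count. The function names are hypothetical and are not part of the AIE-CLI. The process count is derived only from the definition above (partitions hosted per process) and covers one full set of partitions; it is an assumption that this is how you want to size a deployment.

```python
def valid_loadfactors(partitions):
    """Return every loadfactor value that evenly divides the partition count."""
    if partitions < 1:
        raise ValueError("partitions must be a positive integer")
    return [n for n in range(1, partitions + 1) if partitions % n == 0]


def processes_for_partitions(partitions, loadfactor):
    """Return how many processes host one full set of partitions.

    Follows from the definition of loadfactor as the number of partitions
    hosted by a single process.
    """
    if loadfactor not in valid_loadfactors(partitions):
        raise ValueError("partitions must be evenly divisible by loadfactor")
    return partitions // loadfactor
```

For the 6-partition index used as the example above, valid_loadfactors(6) returns [1, 2, 3, 6], and a loadfactor of 4 or 5 is rejected.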

AIE-CLI Commands For Attivio Clustered Projects

The following AIE-CLI commands are for managing clustered projects which are not deployed to an external Hadoop cluster. In these cases, Attivio manages the Hadoop services necessary to run Attivio and therefore provides commands to manage these services. These commands only apply to Attivio clustered systems. 

Command | Description
start hadoop [<hostRegex> <serviceRegex>]

Starts the Hadoop processes required to run Attivio in the required order (listed below). The command accepts two optional arguments. The first is a regular expression to be used to match the nodes where the service is configured to run in the project's topology-nodes.xml file. The second is a regular expression to be used to match the service name. For example, issuing the command start hadoop prodClusterA* Hdfs* starts all Hadoop cluster services with names beginning with Hdfs which run on hosts whose names begin with prodClusterA.

When starting individual Hadoop components, the order in which they should be started is as follows:

  1. HdfsJournalNode
  2. HdfsNameNode
  3. HdfsFailoverController
  4. HdfsDataNode
  5. HBaseMaster*
  6. HBaseRegionServer*
  7. YarnNodeManager
  8. YarnResourceManager
  9. JobHistoryService

It is recommended that the HBase services be started together using start hadoop .* HBase.*

stop hadoop [<hostRegex> <serviceRegex>]

Stops the Hadoop processes required to run Attivio in the required order (listed below). The command accepts two optional arguments. The first is a regular expression to be used to match the nodes where the service is configured to run in the project's topology-nodes.xml file. The second is a regular expression to be used to match the service name. For example, issuing the command stop hadoop prodClusterA* Hdfs* stops all Hadoop cluster services with names beginning with Hdfs which run on hosts whose names begin with prodClusterA.

When stopping individual Hadoop components, the order in which they should be stopped is as follows:

  1. YarnJobHistoryServer (patch 124 and higher only)

  2. YarnResourceManager
  3. YarnNodeManager
  4. HBaseRegionServer
  5. HBaseMaster
  6. HdfsDataNode
  7. HdfsFailoverController
  8. HdfsNameNode
  9. HdfsJournalNode

start all

With an Attivio Cluster, the start all command will first check whether the Hadoop services are running. If not, they will be started first before starting the rest of the system. However, the stop all command will not stop any Hadoop services. You must use the stop hadoop command explicitly to stop these services.

Recovering partially running Hadoop services

In some cases, not all of the Hadoop services are running, as in the following example:

Hadoop Cluster Services
HOST                        TYPE                   PID   STATUS 
-------------------------------------------------------------
server-45.attivio.com       HdfsFailoverController 4462  RUNNING
server-45.attivio.com       YarnResourceManager    4573  RUNNING
server-45.attivio.com       HdfsDataNode           4464  RUNNING
server-45.attivio.com       YarnNodeManager        4738  RUNNING
server-45.attivio.com       HdfsNameNode           4434  RUNNING
server-45.attivio.com       HBaseRegionServer      4567  NOT_STARTED
server-45.attivio.com       HdfsJournalNode        4348  RUNNING
server-45.attivio.com       HBaseMaster            4525  RUNNING
server-68.attivio.com       HBaseMaster            20762 RUNNING
server-68.attivio.com       HdfsJournalNode        20495 RUNNING
server-68.attivio.com       YarnNodeManager        21045 RUNNING
server-68.attivio.com       HdfsDataNode           20702 RUNNING
server-68.attivio.com       YarnResourceManager    20834 RUNNING
server-68.attivio.com       HdfsFailoverController 20701 RUNNING
server-68.attivio.com       HBaseRegionServer      20828 NOT_STARTED
server-68.attivio.com       HdfsNameNode           20591 RUNNING
server-88.attivio.com       HBaseRegionServer      16980 NOT_STARTED
server-88.attivio.com       YarnNodeManager        17094 RUNNING
server-88.attivio.com       HdfsJournalNode        16854 RUNNING
server-88.attivio.com       HdfsDataNode           16903 RUNNING 

To start the services that show NOT_STARTED status, do not use the start all command. Instead, start them individually on each server using start hadoop [<hostRegex> <serviceRegex>].

For example:

 start hadoop server-45\.attivio\.com HBaseRegionServer

The services should be started in the following order:

  1. HdfsJournalNode
  2. HdfsNameNode
  3. HdfsFailoverController
  4. HdfsDataNode
  5. HBaseMaster
  6. HBaseRegionServer
  7. YarnNodeManager
  8. YarnResourceManager
  9. YarnJobHistoryServer (patch 124 and higher only)
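The recovery procedure above can be sketched as a small Python helper that reads a status listing in the four-column format shown earlier (HOST, TYPE, PID, STATUS), picks out the NOT_STARTED services, and prints start hadoop commands in the recommended order. This is an illustration only: the function names are hypothetical, the parser assumes the status output matches the example listing exactly, and hosts are escaped by replacing each "." with "\." to mirror the example command.

```python
# Start order as listed in this section.
START_ORDER = [
    "HdfsJournalNode", "HdfsNameNode", "HdfsFailoverController",
    "HdfsDataNode", "HBaseMaster", "HBaseRegionServer",
    "YarnNodeManager", "YarnResourceManager", "YarnJobHistoryServer",
]


def parse_status_table(text):
    """Return (host, service, pid, status) tuples from a status listing."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have four columns with a numeric PID; the title,
        # header, and separator lines do not and are skipped.
        if len(parts) == 4 and parts[2].isdigit():
            host, service, pid, status = parts
            rows.append((host, service, int(pid), status))
    return rows


def recovery_commands(text):
    """Return start hadoop commands for NOT_STARTED services, in start order."""
    pending = [(host, service)
               for host, service, _pid, status in parse_status_table(text)
               if status == "NOT_STARTED"]

    def rank(item):
        host, service = item
        # Services not in START_ORDER sort last.
        position = (START_ORDER.index(service)
                    if service in START_ORDER else len(START_ORDER))
        return (position, host)

    return ["start hadoop {} {}".format(host.replace(".", r"\."), service)
            for host, service in sorted(pending, key=rank)]
```

The returned strings are meant to be reviewed and then typed at the aie> prompt; the helper does not run anything itself.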


Changing the Project Configuration after Attivio has Started

See Developing in Attivio - Concepts and Tools for a discussion of updating the configuration of an Attivio project after it has been deployed for the first time.