Overview
The Advanced Query Language provides sophisticated tools for use from a client program where the query can be assembled by software. Unlike the Simple Query Language, it is not intended for end users. However, a project developer can use Search UI or the Debug Search page to cut-and-paste advanced queries to test syntax.
The Advanced Query Language is a prefix notation query language. In other words, the operator is specified first, followed by the parameters to be used by that operator.
The Advanced Query Language specifies search criteria, similar to the WHERE half of a SQL query. It does not specify the form of the search results (which would be the SELECT side of a SQL query). To format, concatenate, or otherwise transform the returned field values, see the discussion of Field Expressions.
Syntactic Elements
Expressions
Each expression in Advanced Query Language is structured as follows:
OPERATOR(ARG1, ARG2, ..., PARAM1 = PARAMVALUE1, PARAM2 = PARAMVALUE2, ...)
- Each expression has an operator, 1 or more arguments, and 0 or more parameters.
- Operator names are case insensitive. For clarity and as a best practice, all operators will appear in UPPER CASE on this page.
- Some operators ("unary" operators) take only a single argument.
- Other operators ("n-ary" operators) may take one or more arguments.
The Advanced Query Language inherits Simple Query Language constructs for Data Types, Terms, Fields, and Ranges.
Special Characters
The following characters have special meanings in Advanced Query Language:
\ " , = < > [ ] { } ( ) : ~ ^
The above list includes the space character. To treat any of these characters literally, you must escape it with a backslash \ . You need not escape any characters inside double-quotes, except for the double-quote character itself.
Example Value | Escaped Value | Quoted Value |
---|---|---|
dog | dog | "dog" |
attivio engine | attivio\ engine | "attivio engine" |
dog=cat | dog\=cat | "dog=cat" |
double"quote | double\"quote | "double\"quote" |
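Because advanced queries are usually assembled by software, the escaping rules above are worth automating. The following is a minimal Python sketch, not part of the product; the names `SPECIAL_CHARS`, `escape_value`, and `quote_value` are hypothetical. It assumes the special-character list above is complete and that only the double-quote needs escaping inside a quoted value.

```python
# Characters listed as special in the Advanced Query Language, plus the space.
SPECIAL_CHARS = set('\\",=<>[]{}():~^ ')


def escape_value(value: str) -> str:
    """Backslash-escape every special character in an unquoted value."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in value)


def quote_value(value: str) -> str:
    """Wrap a value in double quotes, escaping only embedded double quotes."""
    return '"' + value.replace('"', '\\"') + '"'


# Print the same three columns as the table above.
for raw in ["dog", "attivio engine", "dog=cat", 'double"quote']:
    print(raw, "|", escape_value(raw), "|", quote_value(raw))
```

Quoting is usually the simpler choice for generated queries, since only one character needs special handling.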
Query Language Parameters
In the following expression, each PARAM and PARAMVALUE can have an arbitrary value (except that PARAM cannot be an operator name).
OPERATOR(ARG1, ARG2, ..., PARAM1 = PARAMVALUE1, PARAM2 = PARAMVALUE2, ...)
Certain parameters have specific meaning to certain operators; these are listed below in each operator's description section. Note that parameters specified in an expression will be inherited by those sub-expressions that do not define the parameter. In other words, the expression:
AND( NEAR( HELLO, WORLD ), NEAR( DOG, CAT ), NEAR( ACTIVE, ENGINE, distance=3 ), distance=4 )
...is equivalent to the expression...
AND( NEAR( HELLO, WORLD, distance=4 ), NEAR( DOG, CAT, distance=4 ), NEAR( ACTIVE, ENGINE, distance=3 ))
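The inheritance rule can be modeled with a small expression tree. This Python sketch is only an illustration of the rule as described here, not the engine's implementation; the `Expr` class, `push_down`, `render`, and the `NOT_INHERITED` set are hypothetical names, and the sketch assumes that `boost` is the only non-inherited parameter.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union

# Assumption: only "boost" is excluded from inheritance in this sketch.
NOT_INHERITED = {"boost"}


@dataclass
class Expr:
    op: str
    args: List[Union["Expr", str]]
    params: Dict[str, str] = field(default_factory=dict)


def push_down(expr: Expr, inherited: Optional[Dict[str, str]] = None) -> Expr:
    """Write inherited parameters explicitly onto the innermost expressions."""
    effective = {**(inherited or {}), **expr.params}  # own values win
    if not any(isinstance(arg, Expr) for arg in expr.args):
        return Expr(expr.op, list(expr.args), effective)
    passed = {k: v for k, v in effective.items() if k not in NOT_INHERITED}
    kept = {k: v for k, v in expr.params.items() if k in NOT_INHERITED}
    new_args = [push_down(a, passed) if isinstance(a, Expr) else a for a in expr.args]
    return Expr(expr.op, new_args, kept)


def render(expr: Expr) -> str:
    """Render the tree in prefix notation."""
    parts = [render(a) if isinstance(a, Expr) else a for a in expr.args]
    parts += [f"{name}={value}" for name, value in expr.params.items()]
    return f"{expr.op}({', '.join(parts)})"


query = Expr(
    "AND",
    [
        Expr("NEAR", ["HELLO", "WORLD"]),
        Expr("NEAR", ["DOG", "CAT"]),
        Expr("NEAR", ["ACTIVE", "ENGINE"], {"distance": "3"}),
    ],
    {"distance": "4"},
)
print(render(push_down(query)))
```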
Shared Query Parameters
Some parameters that are shared among several different operators include:
Parameter | Data Type | Default | Inherited | Description |
---|---|---|---|---|
boost | integer | 100 | no | Boost for query |
language | string | "en" | yes | The language to use when performing linguistic processing on the query. |
tokenize | boolean | true | yes | If false, terms will not be tokenized |
stopwords | enum (on, off, block) | off | yes | Turn stopword removal on (remove stopwords from query), off (do no processing of stopwords), or block (down boost stopwords). Defaults to off, except where schema stopword.mode is "index". |
spellcheck | enum (off, suggest, auto_correct, auto_expand) | off | yes | Specify spellchecking mode |
Scope Operator
Under certain circumstances, various features of AIE permit searching for, or within, a "scope." A scope is a bounded region in a token list that represents a sentence, a person, a location, a company, a keyphrase, a date, an entity-sentiment value, or is based on one of the XML elements found in the source document.
See Scope Search for details and examples.
Date Operator
The Advanced Query Language provides a DATE operator for specifying date values in custom formats. The syntax is as follows:
DATE(date_string[, date_format[, date_timezone]])
If not specified, date_format is the standard ISO-8601 format ("yyyy-MM-dd'T'HH:mm:ss"). If not specified, date_timezone is assumed to be UTC.
Examples:
Date Expression |
---|
DATE("1983-12-01T00:00:00") |
DATE("12/01/1983", "MM/dd/yyyy") |
DATE("12/01/1983", "MM/dd/yyyy", EST) |
For the details of AIE's date-matching behavior, see Dates and Date Formats.
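When a client program holds dates as native objects, it can render them in the default format and avoid passing a custom date_format. The helper below is a minimal Python sketch with a hypothetical name (`date_expr`). It assumes that naive datetimes are already in UTC, and that the `MM/dd/yyyy` pattern corresponds to Python's `%m/%d/%Y`.

```python
from datetime import datetime, timedelta, timezone


def date_expr(moment: datetime) -> str:
    """Render a datetime as a DATE(...) expression in the default format, in UTC."""
    if moment.tzinfo is not None:
        # Convert aware datetimes to UTC, the zone assumed when none is given.
        moment = moment.astimezone(timezone.utc)
    return 'DATE("' + moment.strftime("%Y-%m-%dT%H:%M:%S") + '")'


# A date parsed from the custom format renders the same as the ISO example.
parsed = datetime.strptime("12/01/1983", "%m/%d/%Y")
print(date_expr(parsed))

# An aware datetime five hours ahead of UTC is shifted back to UTC.
local = datetime(1983, 12, 1, 5, 0, tzinfo=timezone(timedelta(hours=5)))
print(date_expr(local))
```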
Term Operator
The TERM operator is a unary operator that provides the ability to search for terms or phrases.
The argument for the TERM operator must be a String.
TERM specific parameters
Parameter | Type | Default | Description |
---|---|---|---|
score | enum (default, constant, position) | default | See Term Score Mode |
norms | boolean | true | Use indexed field norms in score computation |
wildcard | boolean | true | If false, treat * and ? literally (not as wildcard characters) |
Term Score Mode
The TERM operator supports 3 scoring modes.
- DEFAULT - TF/IDF Scoring calculation
- POSITION - score calculated based on first match's position
- CONSTANT - score will be constant for all documents that match the term
Examples
Example Query | Description |
---|---|
TERM(attivio) | Matches all documents that contain the term "attivio" |
TERM("attivio search") | Matches all documents that contain the phrase "attivio search" |
TERM(attivio, boost=200) | Matches all documents with the term "attivio" and applies a boost of 200% to the score for this match |
TERM(attivio, score=default) | Explicitly use TF/IDF scoring |
TERM(attivio, score=constant) | Use constant scoring |
TERM(attivio, score=constant, boost=150) | Use constant scoring with a boost |
Fuzzy Operator
The FUZZY operator provides the ability to match documents with terms that are similar to a search term. Similarity between terms is computed using the Levenshtein distance.
FUZZY Specific Parameters
Parameter | Type | Default | Description |
---|---|---|---|
similarity | integer | 50 | The similarity threshold for declaring a match. (Smaller number means more fuzzy.) (max value = 99) |
similarity.prefix | integer | 0 | Requires that the first N characters of the fuzzy match be identical to the search term. |
Maximum Levenshtein distance allowed for matching terms is roughly (Search Term Length) - (Search Term Length) * (Similarity percentage)
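To get a feel for the threshold, the approximation above can be evaluated directly. This Python sketch applies the formula as written and includes a textbook Levenshtein implementation for comparison. Both function names are hypothetical, and the engine's exact rounding is not specified here, so treat the results as rough guides.

```python
def max_edit_distance(term: str, similarity: int = 50) -> float:
    """Approximate maximum edit distance allowed for a term at a given similarity."""
    length = len(term)
    return length - length * (similarity / 100)


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


print(max_edit_distance("ativio"))          # default 50% similarity
print(max_edit_distance("ativio", 75))      # stricter threshold
print(levenshtein("ativio", "attivio"))     # one inserted letter
```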
Examples
Example Query | Description |
---|---|
FUZZY(ativio) | Matches documents that have terms that are a small edit distance away from "ativio". Uses the default 50% similarity threshold. |
FUZZY(ativio, similarity=75) | Matches documents that have terms that are 75% similar to "ativio". |
FUZZY(ativio, similarity.prefix=1) | Matches documents that contain terms that are 50% similar to "ativio" and must start with "a". |
OR(FUZZY("dollar", similarity=80, similarity.prefix=1), FUZZY("collar", similarity=80, similarity.prefix=1)) | This query restricts the match to terms that begin with either "d" or "c". See the tip below. |
Use similarity.prefix to speed up matches
A FUZZY match calculates the Levenshtein distance between the search term and every word in the index. This can take a long time. You can speed up the search by using the similarity.prefix=1 parameter. This restricts the FUZZY matching to those terms that have the same first letter as the search term. If you want to simulate fuzziness in the first letter, you can OR multiple FUZZY queries, as shown in the example above.
Regex Operator
The REGEX operator provides the ability to match documents with terms that match a specified regular expression. The full list of REGEX operators appears on the Lucene RegExp page.
Example Query | Description |
---|---|
REGEX("c[aeiou]t") | Matches documents that have terms that match the supplied regular expression (cat, cet, cit, cot, cut) |
REGEX expressions cannot cross token boundaries.
Note that regular expressions cannot cross token boundaries in tokenized text: a query for REGEX(a.*d) will match the single token "and" but will not match "a fund" in tokenized text, because that text contains two separate tokens, "a" and "fund". If you need regular expressions that cross token boundaries, see RegexMatch (Field Expression).
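The token-boundary rule can be illustrated outside the engine. The Python sketch below is an approximation only: it assumes whitespace tokenization and uses Python's `re` syntax, which overlaps with Lucene's for simple patterns like these but is not identical. The function name is hypothetical.

```python
import re


def regex_matches_any_token(pattern: str, text: str) -> bool:
    """True if at least one whitespace-separated token fully matches the pattern."""
    compiled = re.compile(pattern)
    return any(compiled.fullmatch(token) for token in text.split())


print(regex_matches_any_token("a.*d", "salt and pepper"))  # "and" is one token
print(regex_matches_any_token("a.*d", "a fund"))           # "a" and "fund" are separate
print(regex_matches_any_token("c[aeiou]t", "the cat sat"))
```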
Anchoring Operators
Term anchoring is provided by the STARTSWITH, ENDSWITH and EQUALS operators.
These operators are unary operators that follow the same semantics as the TERM operator.
Operator | Example Query | Description |
---|---|---|
STARTSWITH | title:STARTSWITH("query language") | Matches documents which contain a title value that starts with the phrase "query language" |
ENDSWITH | title:ENDSWITH("query language") | Matches documents which contain a title value that ends with the phrase "query language" |
EQUALS | title:EQUALS("query language") | Matches documents which contain a title value that equals the phrase "query language" |
Field Operators
The following operators apply to field names:
ISNULL Operator
The ISNULL operator can be applied to a field name. It will match documents that contain no value for the specified field.
Example Query | Description |
---|---|
ISNULL(foo) | Matches all documents that have no indexed value for the "foo" field |
Special Internal Fields
There is one internal field (not in the AIE Schema) that can be used in queries.
Document ID Field
The AIE document ID field is .id. You can write Simple Query Language queries that directly match this field.
Example Query | Description |
---|---|
.id:NA | Match a document whose document ID is "NA". |
OR(.id:NA, .id:NAM) | Match either of these two documents. "NA" and "NAM" are document ID values. |
.id:OR(NA, NAM) | Same as the previous example. The expression that follows the field name can be any legal query. |
.id:"NAM\-2" | If the ID contains special characters, put it inside double-quotes and escape the characters. |
Proximity Query Operators
Proximity searching is provided by the NEAR, ONEAR and PHRASE operators.
The NEAR operator searches for terms that are within a specified distance of each other. NEAR is an n-ary operator. Its arguments can be Strings, a special form of the OR operator or the SCOPE operator. If using an OR operator inside a NEAR, its arguments must be Strings. The OR operator inside a NEAR provides the ability to "stack" terms in the same position.
The ONEAR operator is similar to the NEAR operator, except it requires matching documents have terms in the same order as specified in the ONEAR expression.
The PHRASE operator is an ordered near with a default distance of 0 between terms.
NEAR/ONEAR/PHRASE Specific Parameters
Parameter | Type | Default | Description |
---|---|---|---|
wildcard | boolean | true | If false, treat * and ? literally (not as wildcard characters) |
distance | integer | 10 (0 for PHRASE) | Defines the maximum distance between terms for proximity; a distance of 0 is equivalent to exact phrase query (terms right next to each other). Must be non-negative. |
Examples
Example Query | Description |
---|---|
NEAR(hello, world) | Matches documents where the term "hello" is up to 10 terms away from "world" |
NEAR("hello world", project) | Matches documents where the term "project" is up to 10 terms away from the "world" term in a "hello world" phrase |
NEAR(hello, world, distance=5) | Matches documents where the term "hello" is up to 5 terms away from "world" |
NEAR(hello, OR(world, worlds)) | Roughly equivalent to OR(NEAR(hello, world), NEAR(hello, worlds)) |
PHRASE(hello, world) | Equivalent to TERM("hello world") |
PHRASE(hello, scope(person)) | Matches documents where "hello" directly precedes a "person" scope |
Two Query Terms
Given the content "the big dogs went running to the market"...
Query | Description |
---|---|
NEAR(big, running, market) | Matches |
NEAR(big, running, distance=2) | Matches |
NEAR(big, running, distance=1) | Does not match as "running" is 2 away from "big" |
NEAR(running, big, distance=2) | Matches |
ONEAR(big, running) | Matches |
ONEAR(running, big) | Does not match because "big" must follow "running" |
ONEAR(big, running, distance=4) | Matches |
More Than Two Query Terms
Proximity operators can take an arbitrary number of query terms. The query NEAR(word1, word2, ... wordn, distance=d) will match documents that have at least one block of n+x words (where x is at most d) that includes at least one instance of each word in the query. The corresponding ONEAR query works similarly and requires that the query terms appear in order. For example:
Example data | NEAR(a, b, c, distance=5) | ONEAR(a, b, c, distance=5) |
---|---|---|
a z b z c | matches | matches |
a z z z z c z b | matches | does not match |
a z b z z z z b z z c | does not match | does not match |
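The block rule can be checked against the tables above with a small model. This Python sketch is not the engine's algorithm. It assumes whitespace tokenization, distinct single-word query terms, and that a match requires all terms to fall inside a block of at most n+d words. The function names are hypothetical.

```python
from typing import Sequence


def _in_order(terms: Sequence[str], window: Sequence[str]) -> bool:
    """True if the terms occur in the window as an ordered subsequence."""
    remaining = iter(window)
    return all(term in remaining for term in terms)


def near(terms: Sequence[str], text: str, distance: int = 10, ordered: bool = False) -> bool:
    """Model NEAR (ordered=False) or ONEAR (ordered=True) over plain text."""
    tokens = text.split()
    size = len(terms) + distance  # largest block that may contain a match
    for start in range(len(tokens)):
        window = tokens[start:start + size]
        if ordered:
            if _in_order(terms, window):
                return True
        elif all(term in window for term in terms):
            return True
    return False


content = "the big dogs went running to the market"
print(near(["big", "running"], content, distance=2))   # NEAR(big, running, distance=2)
print(near(["big", "running"], content, distance=1))   # NEAR(big, running, distance=1)
print(near(["running", "big"], content, ordered=True)) # ONEAR(running, big)
```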
Boolean Query Operators
AND
The AND operator is an n-ary operator whose arguments can be any other expression. It will only match documents when all sub-expressions match.
Examples
Example Query | Description |
---|---|
AND(cat, dog) | Matches all documents that contain both "cat" and "dog" |
AND(TERM(cat), TERM(dog)) | Matches all documents that contain both "cat" and "dog" |
AND(cat, "hound dog") | Matches all documents that contain both "cat" and the phrase "hound dog" |
OR
The OR operator is an n-ary operator whose arguments can be any other expression. It will match any document where one or more of the sub-expressions matches.
OR Specific Parameters
Parameter | Type | Default | Description |
---|---|---|---|
minimum | integer | 1 | The minimum number of operands that must match for a document to match |
disableCoord | boolean | false | If true, disables coordination for OR queries; this means that no extra weight will be given to documents which match more terms - the relevancy score will be based on matched terms only |
Examples
Example Query | Description |
---|---|
OR(cat, dog) | Matches all documents that contain either "cat" or "dog" |
OR(TERM(cat), TERM(dog)) | Matches all documents that contain either "cat" or "dog" |
OR(cat, "hound dog") | Matches all documents that contain either "cat" or the phrase "hound dog" |
Score Mode
AND and OR queries support specifying a scoring mode using the score query parameter.
Scoring Mode | Description |
---|---|
default | Score for all clauses is summed. OR queries will also apply a coordination boost based on the number of clauses that match |
max | Score is the maximum score across all clauses |
Examples
Example Query | Description |
---|---|
OR(a, b, c, score=default) | Score will be sum of scores for a , b , and c with coordination |
OR(a, b, c, score=max) | Score will be max of scores for a , b , and c |
AND(a, b, c, score=default) | Score will be sum of scores for a , b , and c |
AND(a, b, c, score=max) | Score will be max of scores for a , b , and c |
NOT
The NOT operator is a unary operator whose argument is another expression. It will match the inverse of the sub-expression.
Examples
Example Query | Description |
---|---|
NOT(cat) | Matches any document that does not contain "cat" |
NOT(AND(cat, dog)) | Matches any document that does not contain both "cat" and "dog" |
NOT(OR(cat, dog)) | Matches any document that does not contain either "cat" or "dog" |
Boost Query Operator
The BOOST operator can be used to apply a boost query and/or score functions to a search query. The search query determines which documents will be returned. The boost query determines which of the returned documents will be sorted to the top of the list.
Syntax
BOOST(<query>[, <boost-query>][, function("<score-function>")]...[, <parameters>])
The <query> specifies the search query for matching documents, and can be any valid advanced query language query. The <boost-query> specifies a query that will be used for boosting the score of documents matching the search query and can be any valid advanced query language query. The <boost-query> will not impact which documents match the <query> in any way. The <score-function> is a field expression.
BOOST Specific Parameters
Parameter | Data Type | Default | Description |
---|---|---|---|
method | enum (SUM, PRODUCT) | SUM | Specifies how the scores for the boost query/score functions will be applied to the search query's score |
Examples
Example Query | Description |
---|---|
BOOST(*:*, title:news) | Match all documents, boosting documents with "news" in the title |
BOOST(title:news, function("freshness(date)")) | Match all documents with "news" in the title, boosting documents using a "freshness" function on the "date" field |
BOOST(title:news, function("freshness(date)"), method=product) | Same as the last one, except the freshness score will be multiplied into the document's score |
BOOST(*:*,title:Japan) | Adds a boost to the base score of documents that have "Japan" in the title. |
BOOST(*:*,title:Japan, function("multiply(1, gold_i)")) | For all documents, multiplies the base score by the number in the gold_i field. Adds a boost to the documents that match the boost query. |
BOOST(*:*, fileext:xml^1000) or BOOST(*:*, fileext:term(xml, boost=1000)) | Match *:* (all documents). If fileext:xml matches, add score for fileext:xml with boost of 1000 to score for *:* match. |
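A client program can assemble BOOST expressions from their parts. The helper below is a minimal Python sketch with a hypothetical name (`boost`). It assumes its inputs are already valid query fragments and that score functions contain no double quotes, since it performs no escaping.

```python
from typing import Optional, Sequence


def boost(query: str, boost_query: Optional[str] = None,
          functions: Sequence[str] = (), **params: str) -> str:
    """Assemble BOOST(<query>[, <boost-query>][, function("...")]...[, <parameters>])."""
    parts = [query]
    if boost_query is not None:
        parts.append(boost_query)
    parts += ['function("' + fn + '")' for fn in functions]
    parts += [f"{name}={value}" for name, value in params.items()]
    return "BOOST(" + ", ".join(parts) + ")"


print(boost("*:*", "title:news"))
print(boost("title:news", functions=["freshness(date)"], method="product"))
```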
Subquery Operator
The SUBQUERY operator allows annotating sections of a query. It does not modify what documents match the query in any way. It takes any legal advanced query language query as an argument.
This operator is especially useful in annotating sections of a query with additional parameters. For example, it allows annotating the "user" component of a query. These annotations can be then used by custom query transformers to perform different transformations based on parameters set on the SUBQUERY operator.
Example Query | Description |
---|---|
AND(SUBQUERY(OR(a, b), alias="user"), table:documents) | Annotate the user query OR(a, b) with the "user" alias |
Any inherited parameters can be set on the SUBQUERY operator to affect query transformations. Refer to the section on Shared Query Parameters for more information.
Embedded Query Operator
The QUERY operator can be used to embed a user query string in an Advanced Query Language query.
This is useful when an application offers a text box in which users enter a Simple Query Language query, and the application then wraps that query string in advanced query syntax.
The query string provided in the QUERY operator must have any special characters escaped.
QUERY Specific Parameters
Parameter | Type | Default | Description |
---|---|---|---|
qlang | string | simple | The query language for parsing an embedded query |
Examples
Example Query | Description |
---|---|
QUERY("+this -that OR dog", qlang="simple") | Simple query language query written in advanced query language |
AND(QUERY("user query", qlang="simple"), autofilterterm1, autofilterterm2) | Adds application provided filter terms to user query |
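Wrapping user input is a typical job for client code. The following Python sketch uses a hypothetical helper name (`embed_user_query`). It assumes the user text is passed as a double-quoted string, so that only embedded double quotes need escaping, and that the filter terms are already valid query fragments.

```python
def embed_user_query(user_text: str, *filter_terms: str) -> str:
    """Wrap raw user input in QUERY(...) and optionally AND it with filter terms."""
    # Inside double quotes, only the double-quote character needs a backslash.
    escaped = user_text.replace('"', '\\"')
    embedded = f'QUERY("{escaped}", qlang="simple")'
    if not filter_terms:
        return embedded
    return "AND(" + ", ".join([embedded, *filter_terms]) + ")"


print(embed_user_query("+this -that OR dog"))
print(embed_user_query("user query", "autofilterterm1", "autofilterterm2"))
```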
Join Query Operator
The JOIN operator can be used to perform relational queries inside an index.
Syntax
JOIN(<query>, <clause>[, <clause>][, <parameters>])
The <query> can be any valid advanced query language query. The JOIN query must also specify one or more <clause>. A <clause> defines the criteria for joining. Multiple types of join clause are supported and provide different functionality.
INNER and OUTER JOIN Clauses
Syntax
INNER | OUTER(<clause-query>[, <parameters>])
The <clause-query> can be any advanced query language query (including JOIN queries). If it is a JOIN query, this will result in a multi-level join.
INNER/OUTER Specific Parameters
Parameter | Data Type | Default | Description |
---|---|---|---|
on | String | "" | Specify the primary/foreign key fields (Ex: "id = countryId"). The field to the left of the operator is the primary key and the field to the right of the operator is the foreign key. |
boost | int | 100 | The boost to apply to hits from this clause |
facet | boolean | true | If false, do not use hits for this clause in facet bucket count calculation |
rollup | int | 100 | Specify the number of child documents to collect for each parent document for this clause. Set to 0 if child documents are not needed. Defaults to 100 to avoid out-of-memory errors, but may be manually overridden if necessary by setting the value higher. NOTE: true = 100, false = 0 |
minimum | int | 1 | INNER only. Specify the minimum number of child documents a parent document must have in order for them to be returned. If there are not enough child documents, the parent document will also not be returned. This parameter has no effect on an OUTER join. |
relevancyModel | String | null | Specify the relevancy model to apply to this join clause; if not specified, the relevancy model set on the QueryRequest will be used |
order | String | null | Specify a sort order in REST format for ordering child documents. (Ex: order="fieldname:asc") Only a single level of sort is supported here, and ordering by relevancy is not currently supported. Ordering of child documents from cross-partition joins is not supported; if a cross-partition join query is submitted with an order parameter value, the query will fail with a "Child document ordering not supported for join <primary-key>=<foreign-key>" error message. |
On Parameter
The on parameter is used for specifying the primary/foreign key relationship. The format for this parameter is: primaryKey[ <OP> foreignKey]
The operator and foreign key are both optional. If not specified the operator will be =
and the foreignKey will be the same as the primaryKey.
Supported join operators are:
Operator | Description |
---|---|
= | primary key equal to foreign key |
<> | primary key does not equal foreign key. |
>= | primary key greater than or equal to foreign key |
> | primary key greater than foreign key |
<= | primary key less than or equal to foreign key |
< | primary key less than foreign key |
The data type for the primary and foreign key must be the same.
For join inequalities, the data type of the join fields is important. Inequalities may not work as expected with string fields. In general, use numeric or date fields when joining with inequalities.
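The defaults for the on parameter can be made concrete with a small parser. This Python sketch only illustrates the format described above and is not the engine's parser. The name `parse_on` is hypothetical, and the sketch assumes field names contain no whitespace and none of the characters `<`, `>`, or `=`.

```python
import re
from typing import Tuple

# Two-character operators are listed first so ">=" is not read as ">" then "=".
_ON_PATTERN = re.compile(
    r"^\s*([^\s<>=]+)\s*(?:(<>|>=|<=|=|>|<)\s*([^\s<>=]+))?\s*$"
)


def parse_on(value: str) -> Tuple[str, str, str]:
    """Split an on value into (primary key, operator, foreign key), applying defaults."""
    match = _ON_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognized on value: {value!r}")
    primary, operator, foreign = match.groups()
    # Operator defaults to "=", foreign key defaults to the primary key.
    return primary, operator or "=", foreign or primary


print(parse_on("cpuid"))
print(parse_on("id = countryId"))
print(parse_on("start >= end"))
```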
INNER/OUTER Clause Examples
Example | Description |
---|---|
JOIN(table:cpus, INNER(AND(table:processes, memory:>500), ON="cpuid")) | Select all cpus that have a process associated with the cpu using more than 500k of memory; all processes matching will be rolled up into CPU records returned |
JOIN(table:disks, OUTER(AND(table:partitions, type:NTFS), ON="diskid")) | Return all records in the "disks" table, only rolling up partitions that are NTFS partitions |
JOIN(table:cpus, INNER(AND(table:processes, memory:>500), ON="cpuid"), INNER(AND(table:cpuinfo, model:amd), ON="cpuid")) | Return all CPU records where a process is using more than 500k of memory, and the cpuinfo table indicates the cpu is an AMD |
JOIN(table:region, INNER(table:country, ON="id=regionId")) | Return all regions with country data joined in |
JOIN(table:region, INNER(table:country, ON="id=regionId" , facet="false")) | Same as above, but with facets disabled. |
The JOIN <query> and each clause's queries should only match records in a single "table". This is shown in the examples above by adding a restriction of table:<TABLENAME> to the join clauses. Unexpected results may occur if this is not done.
Composite Join Query
The COMPOSITEJOIN operator can be used to perform a search across a document that is represented as multiple IngestDocuments. This differs from typical JOIN queries in that it allows complex boolean queries to be evaluated against a composite document consisting of multiple Attivio documents. This is most useful when a document is indexed as two separate documents: a metadata record containing metadata fields, and a content record containing a large extracted text field.
Syntax
COMPOSITEJOIN(<query>, FROM(<rootquery>), <clause>[, <clause>], on=<joinfield>[, <parameters>])
The <query> can be any valid advanced query language query. The COMPOSITEJOIN query must also specify one or more <clause>. A <clause> defines the criteria for selecting child documents to join to documents matching <rootquery>.
Parameters
Parameter | Data Type | Default | Description |
---|---|---|---|
on | String | <required> | Specify the join field. This field must represent a one-to-one or one-to-many join. |
Clause Syntax
INNER | OUTER(<clause-query>[, <parameters>])
Clause Parameters
Parameter | Data Type | Default | Description |
---|---|---|---|
boost | int | 100 | The boost to apply to hits from this clause |
facet | boolean | true | If false, do not use hits for this clause in facet bucket count calculation |
rollup | int | 100 | Specify the number of child documents to collect for each parent document for this clause. Set to 0 if child documents are not needed. Defaults to 100 to avoid out-of-memory errors, but may be manually overridden if necessary by setting the value higher. NOTE: true = 100, false = 0 |
Example
COMPOSITEJOIN(AND(searchterm1, OR(searchterm2, searchterm3)), FROM(table:metadata), OUTER(table:content), on=joinfield)
The COMPOSITEJOIN operator requires that documents are co-located.
The join field should produce a one-to-one or one-to-many join between the <rootquery> and all joined clauses. If you construct your query to produce many-to-one or many-to-many joins, you may get unexpected matches.
Filters
Filter expressions can be added to any part of the query tree, or can be specified separately from the query.
To apply a filter in a query, use the FILTER operator:
FILTER(<search-query>, <filter>)
- <search-query> - any advanced query language query expression
- <filter> - The filter to apply to <search-query>. Must be one of the filters described below.
You can also specify a filter expression alongside a query, using the appropriate parameter defined for your query API.
Format / API | Parameter | Format | Example |
---|---|---|---|
Simplified JSON Query Request (used for JSON REST Query APIs) | filters | Array of Advanced Query Language filter expressions | { } |
HTTP REST GET/POST parameter (used for JSON REST Query APIs) | filter | Advanced Query Language filter expression (use AND operator to specify multiple expressions) | workflows=search&q=*:*&filter=AND(language:English, author:Austen) |
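When the filter is sent as an HTTP parameter, the expression must be URL-encoded. The Python sketch below builds only the parameter string shown in the table, using the standard library. The helper name is hypothetical, and the endpoint to which the string is sent is deliberately left out because it depends on your installation.

```python
from typing import Sequence
from urllib.parse import parse_qs, urlencode


def build_query_string(query: str, filters: Sequence[str], workflows: str = "search") -> str:
    """URL-encode the q, workflows, and filter parameters for a REST query."""
    params = {"workflows": workflows, "q": query}
    if len(filters) == 1:
        params["filter"] = filters[0]
    elif filters:
        # Multiple filter expressions are combined with the AND operator.
        params["filter"] = "AND(" + ", ".join(filters) + ")"
    return urlencode(params)


encoded = build_query_string("*:*", ["language:English", "author:Austen"])
print(encoded)
print(parse_qs(encoded)["filter"][0])  # decodes back to the original expression
```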
Query Filters
Query Filters are the simplest type of filters. They can be any other Advanced Query Language query. The query used for filtering will not be scored and will only be used for filtering the results from the search query.
Syntax:
FILTER(<search-query>, <filter-query>)
The <filter-query> can be any valid Advanced Query Language query.
Example:
FILTER(*:*, table:authors)
Geographic Distance Filtering
The DISTANCE filter expression limits a result set to include only documents with position field values (of type POINT) that lie inside a great circle defined by the specified center longitude and latitude and radius. See Geographic Searching for more information.
DISTANCE Syntax:
FILTER(<search-query>, <position-field>:DISTANCE(<center-longitude>, <center-latitude>, <maximum-distance>[, <parameters>]))
The <position-field> parameter value should be the name of an Attivio schema field of type POINT. The <center-longitude> and <center-latitude> parameter values are expressed in decimal degrees, with positive values representing northern latitudes and eastern longitudes. The <maximum-distance> parameter value is specified in decimal kilometers by default (but see discussion of the units parameter, below).
Note that the DISTANCE operator takes its arguments in (longitude, latitude) order, while many sources provide coordinates in (latitude, longitude) order. Be sure to double-check that you're entering longitude first.
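One way to guard against swapped coordinates in client code is to require them as named arguments and emit them in the order the operator expects. This is a minimal Python sketch with a hypothetical helper name (`distance_filter`); the range checks are an added safeguard, not something the operator is documented to perform.

```python
def distance_filter(search_query: str, field: str, *, latitude: float,
                    longitude: float, max_distance: float,
                    units: str = "kilometers") -> str:
    """Build a FILTER with a DISTANCE expression, emitting longitude before latitude."""
    if not -90.0 <= latitude <= 90.0:
        raise ValueError("latitude must be between -90 and 90 degrees")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    return (f"FILTER({search_query}, {field}:DISTANCE("
            f"{longitude}, {latitude}, {max_distance}, units={units}))")


# Vienna, given in the common (latitude, longitude) form.
print(distance_filter("*:*", "position", latitude=48.208176,
                      longitude=16.373819, max_distance=500, units="miles"))
```

A latitude outside the valid range, which often signals swapped values, raises an error instead of silently producing a query for the wrong place.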
DISTANCE Syntax Variations
You can omit the <center-longitude> and <center-latitude> arguments if your query request specifies these values via the HTTP REST APIs' geo.longitude and geo.latitude query parameters:
FILTER(<search-query>, <position-field>:DISTANCE(<maximum-distance>[, <parameters>]))
Similarly, you can omit the <position-field> prefix if your query request specifies the desired field via a geo.field query parameter:
FILTER(<search-query>, DISTANCE(<center-longitude>, <center-latitude>, <maximum-distance>[, <parameters>]))
If you set all three query parameters, you can use this syntax:
FILTER(<search-query>, DISTANCE(<maximum-distance>[, <parameters>]))
DISTANCE Specific Parameters
Parameter | Data Type | Default | Description |
---|---|---|---|
units | enum | KILOMETERS | Units in which maximum-distance value is specified (one of KILOMETERS, MILES, NAUTICAL_MILES, METERS, or YARDS, case-insensitive) |
Examples:
Return the set of documents with position field values within 500 miles of Vienna:
FILTER(*:*, position:DISTANCE(16.373819, 48.208176, 500, units=miles, boost=120))
All of the example queries below should return the set of documents with position field values within 1,600 kilometers of the point at 41° N., 93° W.:
Query | Notes |
---|---|
FILTER(*:*, position:DISTANCE(-93.0, 41.0, 1600, units=kilometers)) | all parameters specified in query |
FILTER(*:*, position:DISTANCE(-93.0, 41.0, 1600)) | units parameter defaults to KILOMETERS |
FILTER(*:*, DISTANCE(-93.0, 41.0, 1600, units=kilometers)) | if submitted with geo.field=position query-request parameter value |
FILTER(*:*, DISTANCE(-93.0, 41.0, 1600)) | as above, with default units |
FILTER(*:*, position:DISTANCE(1600, units=kilometers)) | if submitted with geo.longitude=-93.0 and geo.latitude=41.0 query-request parameter values |
FILTER(*:*, position:DISTANCE(1600)) | as above, with default units |
FILTER(*:*, DISTANCE(1600, units=kilometers)) | if submitted with all three query-request parameter values shown above |
FILTER(*:*, DISTANCE(1600)) | as above, with default units |
Shape Intersection Filtering
The RECTANGLE, POLYGON, CIRCLE, and ELLIPSE shape filter expressions limit a result set to include only documents with position field values (of type POINT) that lie within a specified shape of the given type, or those with shape field values (of type SHAPE) that intersect with a specified shape of the listed type. See Shape Intersection Filtering for more information.
Syntax:
FILTER(*:*, <position-or-shape-field>:RECTANGLE((<min-x>, <min-y>), (<max-x>, <max-y>)))
FILTER(*:*, <position-or-shape-field>:POLYGON((<x1>, <y1>), (<x2>, <y2>), ...))
FILTER(*:*, <position-or-shape-field>:CIRCLE((<center-x>, <center-y>), <radius>))
FILTER(*:*, <position-or-shape-field>:ELLIPSE((<center-x>, <center-y>), <radius-a>, <radius-b>, <rotation-angle>))
Examples:
// Rectangle intersection filtering
FILTER(*:*, shape:RECTANGLE((-0.5, -0.5), (0.5, 0.5)))
// Polygon intersection filtering
FILTER(*:*, position:POLYGON((0.0, 0.0), (0.0, 1.0), (1.0, 1.0)))
// Circle intersection filtering
FILTER(*:*, shape:CIRCLE((0.0, 0.0), 0.5))
// Ellipse intersection filtering
FILTER(*:*, shape:ELLIPSE((0.0, 0.0), 0.25, 0.5, 0.0))
Saved Filters
Saved filters allow external systems to provide arbitrarily large sets of document IDs to be used as a filter against a result set from the index. The SAVEDFILTER operator takes a single argument, which must be a URI that all engine partitions can access. Each partition will request the URI and expects a newline separated list of document IDs to be returned. Note that each partition will fetch its own copy of the URI via an HTTP GET; caching (etc.) of the document IDs is left up to the web service implementing the back end.
Syntax:
SAVEDFILTER(<URI>)
Examples:
// search for the word 'dog' in the index but return only documents whose document IDs are in the file hosted at http://myhost/someListOfDocumentIds.txt
AND(dog, SAVEDFILTER("http://myhost/someListOfDocumentIds.txt"))
// same as above but use a URI that takes CGI arguments to generate document IDs
AND(dog, SAVEDFILTER("http://myhost/myService?arg1=val1&arg2=val2"))
Sample return content from URI
documentId1
documentId2
documentId3
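The back-end service only needs to answer an HTTP GET with a newline-separated list of IDs. The following Python sketch uses the standard library's `http.server`. The handler name, the fixed ID list, the port, the `text/plain` content type, and the trailing newline are all assumptions made for illustration. A real service would generate the IDs per request and handle its own caching.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Iterable

# Hypothetical, fixed list of IDs; a real service would compute these.
DOCUMENT_IDS = ["documentId1", "documentId2", "documentId3"]


def render_id_list(ids: Iterable[str]) -> bytes:
    """Encode document IDs as a newline-separated response body."""
    return ("\n".join(ids) + "\n").encode("utf-8")


class SavedFilterHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Each engine partition issues its own GET, so every request gets the full list.
        body = render_id_list(DOCUMENT_IDS)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Run the sketch server until interrupted (call this explicitly to start it)."""
    HTTPServer(("", port), SavedFilterHandler).serve_forever()
```

Calling `serve()` starts the server; the URI passed to SAVEDFILTER must then point at a host and port that every engine partition can reach.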
Query Builder User Interface
In some places in the Attivio user interfaces where an Advanced Query Language query may be entered, it is possible to use the Query Builder dialog box to construct the text of the query using graphical building blocks. There will be an Edit icon (like the pencil shown in the Create Rule dialog, below) next to the query field; clicking it opens the Query Builder dialog box. Edit the query using the controls in the dialog box and click "Save" when you are done to update the text field with your new query.