Overview
Several AIE connectors provide the option of including or excluding incoming documents based on the content of a field. Regular expressions are used to match field values.
You can recognize these connectors by the presence of Include Property Filter and Exclude Property Filter fields on the Advanced tab of the Connector Editor.
Regular expressions are powerful, but describing their syntax is outside the scope of the AIE documentation. Many references are available online, and we urge you to consult them.
Some fields cannot be used for filtering by a connector!
Some fields are not populated by the connector itself, so the connector cannot include or exclude documents based on their values. This is often the case with the text field, which is populated by the Advanced Text Extraction module. To exclude documents based on the text field, consider inserting a drop component downstream of text extraction.
Include/Exclude Examples
The following example configures an Include Filter that includes documents with the word 'Apple' or the word 'Orange' in the 'fruit' field. The defaultValue property is false, which means that documents without a 'fruit' field are not included.
The example also configures an Exclude Filter that excludes documents with the word 'Plum' in the 'fruit' field. Its default value is true, which means that the filter excludes by default: documents without a 'fruit' field are excluded.
Together, these two beans cause the scanner to ingest only documents with a 'fruit' field that exactly matches the word 'Apple' or the word 'Orange', as long as the word 'Plum' does not match.
You can define the beans in configuration files and set them in the connector definition using the Connector Editor.
<beans xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns="http://www.springframework.org/schema/beans"
       xmlns:util="http://www.springframework.org/schema/util"
       xmlns:sec="http://www.springframework.org/schema/security"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
         http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util.xsd
         http://www.springframework.org/schema/security http://www.springframework.org/schema/security/spring-security-3.1.xsd">
  <bean name="includeFilter" class="com.attivio.connector.scanner.filters.PropertyFilter">
    <property name="fieldName" value="fruit"/>
    <property name="defaultValue" value="false"/>
    <property name="expressions">
      <list>
        <value>Apple</value>
        <value>Orange</value>
      </list>
    </property>
  </bean>
</beans>
<beans xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns="http://www.springframework.org/schema/beans"
       xmlns:util="http://www.springframework.org/schema/util"
       xmlns:sec="http://www.springframework.org/schema/security"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
         http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util.xsd
         http://www.springframework.org/schema/security http://www.springframework.org/schema/security/spring-security-3.1.xsd">
  <bean name="excludeFilter" class="com.attivio.connector.scanner.filters.PropertyFilter">
    <property name="fieldName" value="fruit"/>
    <!-- true: documents without a 'fruit' field are excluded by default -->
    <property name="defaultValue" value="true"/>
    <property name="expressions">
      <list>
        <value>Plum</value>
      </list>
    </property>
  </bean>
</beans>
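Because each entry in the expressions list is a regular expression, a single filter can cover several spellings of a value. The following is a minimal sketch rather than a shipped configuration: the bean name fruitPatternFilter and the patterns themselves are hypothetical, and the sketch assumes that expressions are evaluated with Java regular-expression syntax, so that the inline (?i) flag makes a pattern case-insensitive. Whether a pattern must match the whole field value or only part of it is not stated here, so verify the patterns against your own data before relying on them.

```xml
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">
  <!-- Hypothetical bean name; the class and property names are those used in the examples above. -->
  <bean name="fruitPatternFilter" class="com.attivio.connector.scanner.filters.PropertyFilter">
    <property name="fieldName" value="fruit"/>
    <!-- Used as an Include filter, false means documents without a 'fruit' field are not included. -->
    <property name="defaultValue" value="false"/>
    <property name="expressions">
      <list>
        <!-- Assumption: Java regex syntax. (?i) = case-insensitive, s? = optional trailing "s". -->
        <value>(?i)apples?</value>
        <value>(?i)oranges?</value>
      </list>
    </property>
  </bean>
</beans>
```

Under those assumptions, this bean set as the Include Property Filter would accept values such as 'apple', 'Apples', or 'ORANGE' in the 'fruit' field.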