Quick Start

The following exercise builds on the Attivio Quick Start tutorial to introduce the concept of building a custom connector using the Attivio Connector SDK. We'll deploy the Factbook project and then create a custom connector which will connect to an email server via IMAP and ingest the emails present in the Inbox for the account into our Attivio index.

Prerequisites

In order to complete the following exercise, you'll need to have the following software installed:

Application | Recommended Version | Download Location | Notes
Maven | 3.5 or later | https://maven.apache.org/ |
Eclipse | Oxygen or later | https://www.eclipse.org/ | Be sure to install the M2Eclipse plugin, or use the Eclipse IDE for Java Developers installer, which includes Maven integration.

 

1. Deploy the Attivio Factbook Project

Open the Attivio Quick Start Tutorial in a new tab in your browser, follow the instructions to deploy the Factbook project, and return to this page when complete.

2. Create an Attivio Java SDK project 

The Attivio Java SDK allows custom connectors to be added to your Attivio projects. The SDK produces a custom module which you can install in any compatible Attivio project. In the next several steps, we will create a custom connector which will log into an email server via IMAP and ingest all the emails that are in the Inbox. 

    Navigate to the directory where you want to create your connector module. This should be a separate location from the Attivio installation or any Attivio projects you have created.

    cd C:\attivio\connectorsdkprojects\

    Create a new module using the Maven archetype command:

     mvn archetype:generate -DarchetypeGroupId=com.attivio.platform.archetypes -DarchetypeArtifactId=attivio-archetype-module -DarchetypeVersion=5.6.1.0

    This will download a number of artifacts and prompt you for the following:

    • groupId: enter 'com.attivio.platform'
    • artifactId: enter 'imapconnector'
    • package: accept the default package
    • Confirm the settings by typing 'Y'

    This command will result in a module project directory as follows:

    pom.xml
    tree.txt
    
    src
    	assembly
    		dist.xml
    		
    	main
    		java
    			com
    				attivio
    					platform
    						imapconnector
    							ModuleInit.java
    							SampleAPIErrorDescriptorsScanner.java
    							SampleAttivioRunnable.java
    							SampleConcurrentDataSourceScanner.java
    							SampleCustomHttpDataSourceScanner.java
    							SampleDataSourceScanner.java
    							SampleDataSourceScannerUI.java
    							SampleDocumentModifyingTransformer.java
    							SampleFieldValueCreatingTransformer.java
    							SampleFilterAwareDataSourceScanner.java
    							SampleHttpDataSourceScanner.java
    							SampleIncrementalDataSourceScanner.java
    							SampleParentChildAssociationScanner.java
    							SamplePrincipalScanner.java
    							SampleQueryRewriteTransformer.java
    							SampleQueryTransformer.java
    							SampleResponseTransformer.java
    							SampleThrottlerAwareDataSourceScanner.java
    							
    		resources
    			attivio.module.json
    			
    			imapconnector
    				beans.xml
    				features.xml
    				imapconnector.properties
    				module.xml
    				
    	test
    		java
    			com
    				attivio
    					platform
    						imapconnector
    							SampleAPIErrorDescriptorsScannerTest.java
    							SampleAttivioRunnableTest.java
    							SampleConcurrentDataSourceScannerTest.java
    							SampleCustomHttpDataSourceScannerTest.java
    							SampleDataSourceScannerTest.java
    							SampleDataSourceScannerUITest.java
    							SampleDocumentModifyingTransformerTest.java
    							SampleFieldValueCreatingTransformerTest.java
    							SampleFilterAwareDataSourceScannerTest.java
    							SampleHttpDataSourceScannerTest.java
    							SampleIncrementalDataSourceScannerTest.java
    							SampleParentChildAssociationScannerTest.java
    							SamplePrincipalScannerTest.java
    							SampleQueryRewriterTransformerTest.java
    							SampleQueryTransformerTest.java
    							SampleResponseTransformerTest.java
    							SampleThrottlingDataSourceScannerTest.java
    							
    		resources
    			attivio.test.json
    			SampleAPIErrorDescriptorsScanner_errors.json

    A number of sample source files and their respective tests are included by default. Feel free to review them; SampleDataSourceScanner.java is a good place to start, since we'll copy it in a later step.

    Import the imapconnector project into Eclipse:

    1. In Eclipse, choose File > Import.
    2. Expand the Maven group, select Existing Maven Projects, and click Next.
    3. Navigate to the project directory.
    4. The project's pom.xml file should be displayed and selected. Click Finish.
    5. In Eclipse's Package Explorer, expand the src/main/java > com.attivio.platform.imapconnector package.
    6. In the next step, we will create the code for the IMAP Connector in a class named IMapConnector in the com.attivio.platform.imapconnector package.

    3. Copy the SampleDataSourceScanner Example Class

    The Connector SDK provides a number of code samples to get you started in creating all types of connectors. To begin, copy the file at /imapconnector/src/main/java/com/attivio/platform/imapconnector/SampleDataSourceScanner.java and create a new file from it named IMapConnector.java in the same package.

    4. Define the Configuration of the Connector

      Observe the first part of the code that contains a comment, followed by the @ScannerInfo and @ConfigurationOptionInfo annotations. These describe the purpose of the connector to users when using the Connector Admin wizard and specify its preferred workflow. We can also configure any custom properties we want users to be able to set and where they will be displayed.

      Edit the comment to adequately describe the connector:

      /**
       * A connector which logs into any IMap server and ingests the emails in a configurable list of folders along with their attachments.
       */

      Since emails can have attachments which will require text extraction, we'll configure our connector to submit the documents it feeds to the fileIngest workflow which Attivio provides for this purpose.

      @ScannerInfo(
          suggestedWorkflow =
              "fileIngest") // Use fileIngest for large documents and documents with complex formats such as PDF files

      Next, let's specify the name and description which will be displayed when users create a new connector in the Connector Admin. Ignore the groups section for now; we'll come back to it.

      @ConfigurationOptionInfo(
          displayName = "IMap Connector",
          description = "A connector which logs into any IMap server and ingests the emails along with their attachments",
          groups = {
            @ConfigurationOptionInfo.Group(
                path = ConfigurationOptionInfo.SCANNER,
                propertyNames = {"testText"})
          })

      Next, we will identify all the properties we wish to allow users to set. These include:

      Property | Description | Data Type
      emailServer | The mail server which supports IMap, such as outlook.office365.com | String
      port | The port you need to use to connect using IMAP securely, such as 993 | Integer
      username | The username which will be used when logging in to the server | String
      password | The password which will be used when logging in to the server | String
      folders | The list of folders from which to retrieve emails | List of Strings


      1. The sample code includes a single property definition as well as its getter and setter. We'll modify this existing one for our emailServer property.

          private String emailServer;
        
          @ConfigurationOption(
              displayName = "Email Server",
              description = "The mail server which supports IMap, such as outlook.office365.com")
          public String getEmailServer() {
            return emailServer;
          }
          public void setEmailServer(String emailServer) {
            this.emailServer = emailServer;
          }

        (info) Notice the naming convention of the methods. This is important for the Connector Admin UI to function properly. The getter is named by prepending "get" to the name of the variable, with the variable's first character capitalized. The setter is likewise named by prepending "set". Therefore, the getter and setter for the emailServer variable are getEmailServer and setEmailServer respectively. The @ConfigurationOption annotation on the getter lets us provide a displayName and description, which give guidance to the user in the UI. In the next step, we'll also mark this field as required using the optionLevel setting.
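        The get/set naming rule described above is the standard JavaBeans convention. As a rough illustration only (this page does not describe how the Connector Admin actually discovers properties), the following sketch uses the JDK's java.beans.Introspector to show how a property name is derived from a getter/setter pair. PropertyNamingSketch and ExampleConnector are hypothetical names used just for this example.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

public class PropertyNamingSketch {

  /** Hypothetical stand-in for a connector class with one configurable property. */
  public static class ExampleConnector {
    private String emailServer;

    public String getEmailServer() {
      return emailServer;
    }

    public void setEmailServer(String emailServer) {
      this.emailServer = emailServer;
    }
  }

  /** Lists the property names that JavaBeans introspection derives from getter/setter pairs. */
  public static List<String> propertyNames(Class<?> type) {
    try {
      // Stop at Object.class so the inherited "class" property is not reported
      BeanInfo info = Introspector.getBeanInfo(type, Object.class);
      List<String> names = new ArrayList<>();
      for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
        // Keep only properties that have both a getter and a setter
        if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
          names.add(pd.getName());
        }
      }
      return names;
    } catch (IntrospectionException e) {
      throw new IllegalStateException("Could not introspect " + type, e);
    }
  }

  public static void main(String[] args) {
    // getEmailServer/setEmailServer map to the property name "emailServer"
    System.out.println(propertyNames(ExampleConnector.class));
  }
}
```

        A getter that does not follow the pattern (for example one named emailServer() or fetchEmailServer()) would not show up in this list, which is why the convention matters.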

      2. Next, we'll add the additional properties.

        	import java.util.List;
        
        	import com.attivio.sdk.AttivioException;
        	import com.attivio.sdk.connector.DocumentPublisher;
        	import com.attivio.sdk.ingest.IngestDocument;
        	import com.attivio.sdk.scanner.DataSourceScanner;
        	import com.attivio.sdk.schema.FieldNames;
        	import com.attivio.sdk.server.annotation.ConfigurationOption;
        	import com.attivio.sdk.server.annotation.ConfigurationOption.OptionLevel;
        	import com.attivio.sdk.server.annotation.ConfigurationOptionInfo;
        	import com.attivio.sdk.server.annotation.ScannerInfo;
        
        
        	...
        
        
        	private String emailServer;
        	private int port;
        	private String username;
        	private String password;
        	private List<String> folders;
        
        	@ConfigurationOption(
        			displayName = "Email Server", 
        			description = "The mail server which supports IMap, such as outlook.office365.com", 
        			optionLevel = OptionLevel.Required)
        	public String getEmailServer() {
        		return emailServer;
        	}
        
        	public void setEmailServer(String emailServer) {
        		this.emailServer = emailServer;
        	}
        
        	@ConfigurationOption(
        			displayName = "Port", 
        			description = "The port you need to use to connect using IMAP securely, such as 993", 
        			optionLevel = OptionLevel.Required)
        	public int getPort() {
        		return port;
        	}
        
        	public void setPort(int port) {
        		this.port = port;
        	}
        
        	@ConfigurationOption(
        			displayName = "Username", 
        			description = "The username which will be used when logging in to the server", 
        			optionLevel = OptionLevel.Required)
        	public String getUsername() {
        		return username;
        	}
        
        	public void setUsername(String username) {
        		this.username = username;
        	}
        
        	@ConfigurationOption(
        			displayName = "Password", 
        			description = "The password which will be used when logging in to the server", 
        			optionLevel = OptionLevel.Required, 
        			formEntryClass = ConfigurationOption.PASSWORD)
        	public String getPassword() {
        		return password;
        	}
        
        	public void setPassword(String password) {
        		this.password = password;
        	}
        
        	@ConfigurationOption(
        			displayName = "Folders", 
        			description = "The list of folders from which to retrieve emails", 
        			optionLevel = OptionLevel.Required, 
        			formEntryClass = ConfigurationOption.STRING_LIST)
        	public List<String> getFolders() {
        		return folders;
        	}
        
        
        	public void setFolders(List<String> folders) {
        		this.folders = folders;
        	}

        (info) Notice we've made each of these required. We've also set a formEntryClass on the password and folders getters, so that the password text is masked in the UI and the folders property is presented as a list, complete with buttons to add and remove entries. There are a number of other formEntryClass options which you might find useful when building connectors. Feel free to explore the examples in the SampleDataSourceScannerUI.java sample class.

      Now we can return to the @ConfigurationOptionInfo annotation of our class and edit the list of property names.

      @ConfigurationOptionInfo(
      		displayName = "IMap Connector", 
      		description = "A connector which logs into any IMap server and ingests the emails along with their attachments", 
      		groups = {
      				@ConfigurationOptionInfo.Group(
      						path = {"IMap Settings"}, 
      						propertyNames = { "emailServer", "port", "username", "password", "folders" }) }
      		)

      (info) We've specified that our 5 properties should be displayed on a tab labelled "IMap Settings" in the new connector wizard. We could also choose to display them on an existing tab.
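      Marking the five properties as required tells the UI to ask for them, but a connector may still want to sanity-check the values before it tries to connect. The following is a minimal plain-Java sketch of such a check. It is not tied to any SDK method; the class name ImapSettingsCheck, the method findProblems and the specific rules (for example the 1 to 65535 port range) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ImapSettingsCheck {

  /** Returns a list of human-readable problems; an empty list means the settings look usable. */
  public static List<String> findProblems(
      String emailServer, int port, String username, String password, List<String> folders) {
    List<String> problems = new ArrayList<>();
    if (isBlank(emailServer)) {
      problems.add("emailServer is required");
    }
    if (port < 1 || port > 65535) {
      problems.add("port must be between 1 and 65535");
    }
    if (isBlank(username)) {
      problems.add("username is required");
    }
    if (isBlank(password)) {
      problems.add("password is required");
    }
    if (folders == null || folders.isEmpty()) {
      problems.add("at least one folder is required");
    }
    return problems;
  }

  private static boolean isBlank(String s) {
    return s == null || s.trim().isEmpty();
  }

  public static void main(String[] args) {
    // A complete configuration, followed by one with a missing server, bad port and no folders
    System.out.println(findProblems("outlook.office365.com", 993, "user", "secret", Arrays.asList("INBOX")));
    System.out.println(findProblems("", 0, "user", "secret", null));
  }
}
```

      Returning a list of problems rather than failing on the first one lets a caller report everything that is wrong with the configuration at once.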

      5. Create a Test Class and Define a Temporary start Method 

      Before we get too far along, we should take some time to establish a test class. Once again, we'll copy the provided sample code.

        Copy the file at /imapconnector/src/test/java/com/attivio/platform/imapconnector/SampleDataSourceScannerTest.java and create a new file from it named IMapConnectorTest.java in the same package.

        Modify the source code of the test class to match the following:

        /** Copyright 2019 Attivio Inc., All rights reserved. */
        package com.attivio.platform.imapconnector;
        
        import com.attivio.platform.imapconnector.IMapConnector;
        import com.attivio.sdk.AttivioException;
        import com.attivio.sdk.ingest.DocumentList;
        import com.attivio.sdk.ingest.IngestDocument;
        import com.attivio.sdk.scanner.TestScannerRunner;
        import com.attivio.sdk.schema.FieldNames;
        import com.attivio.sdk.test.SdkTestUtils;
        import org.junit.Assert;
        import org.junit.Test;
        
        /** Run a simple test scanner using SdkTestUtils */
        public class IMapConnectorTest {
        	private static final String HELLO_WORLD = "Hello World!";
        
        	@Test
        	public void test() throws AttivioException {
        		IMapConnector sampleScanner = new IMapConnector();
        		TestScannerRunner scannerRunner = SdkTestUtils.createTestScannerRunner("IMapConnector", sampleScanner);
        		scannerRunner.start();
        		DocumentList documentList = (DocumentList) scannerRunner.getSentMessages().get(0);
        		Assert.assertEquals(1, documentList.size());
        
        		for (IngestDocument doc : documentList) {
        			if (doc.getFirstValue(FieldNames.TITLE).stringValue().equals(HELLO_WORLD))
        				return;
        		}
        		Assert.fail("A document with 'Hello World!' in its title field was not published");
        	}
        }

        Replace the definition of the start method of the IMapConnector class with the following:

          @Override
          public void start(String name, DocumentPublisher publisher) throws AttivioException {
            IngestDocument doc = new IngestDocument("1");
            doc.addValue(FieldNames.TITLE, "Hello World!");
            publisher.feed(doc);
          }
        1. Temporarily comment out the validateConfiguration method. We'll return to it later.
        2. Right-click the IMapConnectorTest.java class in Eclipse's package explorer and choose Run As > JUnit Test. The test should pass. However, we've yet to implement our custom functionality. Let's continue on, but now that we have a test class created, we can continue to update it as we progress.


        6. Implement Authentication

          Now the fun begins. We can start to implement the logic of the connector. We'll start with logging in.

          Start by modifying the project's pom.xml file to add the following dependency. We'll be using this library in our connector code.

          		<dependency>
          			<groupId>com.sun.mail</groupId>
          			<artifactId>javax.mail</artifactId>
          			<version>1.6.2</version>
          		</dependency>

          Add the following imports to our connector class:

          import java.util.Properties;
          import javax.mail.Address;
          import javax.mail.Folder;
          import javax.mail.Message;
          import javax.mail.Session;
          import javax.mail.Store;
          import javax.mail.internet.MimeMessage;

          Next, modify the start method to match the following. This will connect to the server, open one or more folders and create a document with the name of the folder as its title for each.

          	@Override
          	public void start(String name, DocumentPublisher publisher) throws AttivioException {
          		
          		Properties props = new Properties();
          		props.setProperty("mail.store.protocol", "imaps");
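          		// JavaMail property keys are per protocol, so the "imaps" store uses mail.imaps.* keys
          		props.setProperty("mail.imaps.port", String.valueOf(port));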
          
          		try {
          			Session session = Session.getInstance(props, null);
          			Store store = session.getStore();
          			store.connect(emailServer, port, username, password);
          			for (String folder : folders) {
          				Folder currentFolder = store.getFolder(folder);
          				// READ_ONLY is enough for ingestion and avoids changing message flags on the server
          				currentFolder.open(Folder.READ_ONLY);
          				IngestDocument doc = new IngestDocument(folder);
          				doc.addValue(FieldNames.TITLE, currentFolder.getName());
          				publisher.feed(doc);
          				currentFolder.close();
          			}
          		} catch (Exception ex) {
          			ex.printStackTrace();
          		}
          	}
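          JavaMail reads per-protocol settings from keys of the form mail.<protocol>.<setting>, so with the store protocol set to "imaps" the port key is mail.imaps.port rather than mail.imap.port. Because the start method also passes the port directly to store.connect, the property mostly matters when connect is called without an explicit port. The sketch below builds the two properties using only java.util.Properties; ImapsPropertiesSketch and build are hypothetical names for illustration.

```java
import java.util.Properties;

public class ImapsPropertiesSketch {

  /** Builds session properties for the given store protocol (for example "imaps") and port. */
  public static Properties build(String protocol, int port) {
    Properties props = new Properties();
    props.setProperty("mail.store.protocol", protocol);
    // Per-protocol keys follow the pattern mail.<protocol>.<setting>
    props.setProperty("mail." + protocol + ".port", String.valueOf(port));
    return props;
  }

  public static void main(String[] args) {
    Properties props = build("imaps", 993);
    System.out.println(props.getProperty("mail.store.protocol"));
    System.out.println(props.getProperty("mail.imaps.port"));
  }
}
```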

          Let's update our test class to see if our connector code successfully logs into the server. Update your test class to match the following, filling in your own values for emailServer, port, username and password in place of redacted.

          /** Copyright 2019 Attivio Inc., All rights reserved. */
          package com.attivio.platform.imapconnector;
          
          
          import com.attivio.sdk.AttivioException;
          import com.attivio.sdk.ingest.DocumentList;
          import com.attivio.sdk.ingest.IngestDocument;
          import com.attivio.sdk.scanner.TestScannerRunner;
          import com.attivio.sdk.schema.FieldNames;
          import com.attivio.sdk.test.SdkTestUtils;
          
          import java.util.ArrayList;
          
          import org.junit.Assert;
          import org.junit.Test;
          
          /** Run a simple test scanner using SdkTestUtils */
          public class IMapConnectorTest {
          
          	@Test
          	public void test() throws AttivioException {
          		IMapConnector scanner = new IMapConnector();
          		scanner.setEmailServer("redacted");
          		scanner.setPort(redacted);
          		scanner.setUsername("redacted");
          		scanner.setPassword("redacted");
          		ArrayList<String> list = new ArrayList<String>();
          		list.add("INBOX");
          		scanner.setFolders(list);
          		TestScannerRunner scannerRunner = SdkTestUtils.createTestScannerRunner("IMapConnector", scanner);
          		scannerRunner.start();
          		DocumentList documentList = (DocumentList) scannerRunner.getSentMessages().get(0);
          		Assert.assertEquals(1, documentList.size());
          
          		for (IngestDocument doc : documentList) {
          			if (doc.getFirstValue(FieldNames.TITLE).stringValue().equalsIgnoreCase("INBOX"))
          				return;
          		}
          		Assert.fail("A document with 'INBOX' in its title field was not published");
          	}
          }

          Re-run the test. If it does not pass, take the time to debug your code now and continue once your connector can successfully log into the server.
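          When the login test fails, it can help to first rule out a basic network problem before looking at credentials or connector code. The following sketch, which uses only the JDK and is not part of the Connector SDK, checks whether a TCP connection to the mail server's host and port can be opened at all. PortCheckSketch and canConnect are hypothetical names; a successful result only shows that the port is reachable, not that IMAP login will succeed.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheckSketch {

  /** Returns true if a TCP connection to host:port can be opened within the timeout. */
  public static boolean canConnect(String host, int port, int timeoutMillis) {
    try (Socket socket = new Socket()) {
      socket.connect(new InetSocketAddress(host, port), timeoutMillis);
      return true;
    } catch (IOException | IllegalArgumentException e) {
      // IOException: unreachable, refused, timed out or unknown host
      // IllegalArgumentException: port outside the valid 0-65535 range
      return false;
    }
  }

  public static void main(String[] args) {
    // Replace with your own mail server and port
    System.out.println(canConnect("outlook.office365.com", 993, 3000));
  }
}
```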

          7. Iterate Through Messages and Create Documents

          Now that we can connect to our server and open and close one or more folders, we're ready to start ingesting the messages themselves.

            Edit the imports and the start method, and add a new setEnvelope method, so that the class matches the following:

            import java.util.List;
            import java.util.Properties;
            
            import javax.mail.Address;
            import javax.mail.Folder;
            import javax.mail.Message;
            import javax.mail.MessagingException;
            import javax.mail.Multipart;
            import javax.mail.Session;
            import javax.mail.Store;
            import javax.mail.internet.MimeMessage;
            
            import com.attivio.sdk.AttivioException;
            import com.attivio.sdk.connector.DocumentPublisher;
            import com.attivio.sdk.ingest.IngestDocument;
            import com.attivio.sdk.scanner.DataSourceScanner;
            import com.attivio.sdk.schema.FieldNames;
            import com.attivio.sdk.server.annotation.ConfigurationOption;
            import com.attivio.sdk.server.annotation.ConfigurationOption.OptionLevel;
            import com.attivio.sdk.server.annotation.ConfigurationOptionInfo;
            import com.attivio.sdk.server.annotation.ScannerInfo;
            
            
            ...
            
            
            	@Override
            	public void start(String name, DocumentPublisher publisher) throws AttivioException {
            		
            		Properties props = new Properties();
            		props.setProperty("mail.store.protocol", "imaps");
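            		// JavaMail property keys are per protocol, so the "imaps" store uses mail.imaps.* keys
            		props.setProperty("mail.imaps.port", String.valueOf(port));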
            
            		try {
            			// connect
            			Session session = Session.getInstance(props, null);
            			Store store = session.getStore();
            			store.connect(emailServer, port, username, password);
            			
            			// iterate through folders
            			for (String folder : folders) {
            				Folder currentFolder = store.getFolder(folder);
            				// READ_ONLY is enough for ingestion and avoids changing message flags on the server
            				currentFolder.open(Folder.READ_ONLY);
            				
            				// iterate through messages
            				for (int i = currentFolder.getMessageCount(); i > 0; i = i - 1) {
            					// copy the message from IMAP to MimeMessage
            			        Message msg = new MimeMessage((MimeMessage) currentFolder.getMessage(i));
            			 
            			        // get from and to
            			        Address[] sender = msg.getFrom();
            			        Address[] recipients = msg.getAllRecipients();
            			 
            			        String author = sender[0].toString();
            			        
            			        //create attivio document
            			        String docID = author + msg.getSentDate().toString()
            			                + msg.getSubject();
            			        IngestDocument doc = new IngestDocument(docID);
            			        
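            			        // record the sender ("From") addresses
            			        for (Address address : sender) {
            			          doc.addValue("From", address.toString());
            			        }
            			        // record the recipient ("To") addresses;
            			        // getAllRecipients() returns null when no recipient headers are present
            			        if (recipients != null) {
            			          for (Address address : recipients) {
            			            doc.addValue("To", address.toString());
            			          }
            			        }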
            			        
            			        setEnvelope(doc, msg);
            			        publisher.feed(doc);
            			        
            			        // added temporarily to short-circuit
            			        break;
            				}
            				currentFolder.close();
            			}
            		} catch (Exception ex) {
            			ex.printStackTrace();
            		}
            	}
            
            
            	  public static void setEnvelope(IngestDocument thisDoc, Message msg)
            	          throws MessagingException {
            	    thisDoc.addValue(FieldNames.DATE, msg.getSentDate());
            	    thisDoc.addValue(FieldNames.TITLE, msg.getSubject());
            	  }

            Edit the test class to match the following, filling in your own values for emailServer, port, username and password in place of redacted:

            /** Copyright 2019 Attivio Inc., All rights reserved. */
            package com.attivio.platform.imapconnector;
            
            
            import com.attivio.sdk.AttivioException;
            import com.attivio.sdk.ingest.DocumentList;
            import com.attivio.sdk.ingest.IngestDocument;
            import com.attivio.sdk.scanner.TestScannerRunner;
            import com.attivio.sdk.schema.FieldNames;
            import com.attivio.sdk.test.SdkTestUtils;
            
            import java.util.ArrayList;
            
            import org.junit.Assert;
            import org.junit.Test;
            
            /** Run a simple test scanner using SdkTestUtils */
            public class IMapConnectorTest {
            
            	@Test
            	public void test() throws AttivioException {
            		IMapConnector scanner = new IMapConnector();
            		scanner.setEmailServer("redacted");
            		scanner.setPort(redacted);
            		scanner.setUsername("redacted");
            		scanner.setPassword("redacted");
            		ArrayList<String> list = new ArrayList<String>();
            		list.add("INBOX");
            		scanner.setFolders(list);
            		TestScannerRunner scannerRunner = SdkTestUtils.createTestScannerRunner("IMapConnector", scanner);
            		scannerRunner.start();
            		DocumentList documentList = (DocumentList) scannerRunner.getSentMessages().get(0);
            		Assert.assertEquals(1, documentList.size());
            
            		for (IngestDocument doc : documentList) {
            			if (doc.containsField(FieldNames.TITLE))
            				return;
            		}
            		Assert.fail("No document with a title field was published");
            	}
            }

            Re-run the test to confirm we have ingested the first message.
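            The start method above builds the document id by concatenating the author, the sent date and the subject. Two details are worth watching: the sent date and subject of a message can be null, and Date.toString() depends on the JVM's default time zone, so the same message could produce different ids on different machines. The sketch below is one possible null-safe variant that uses the epoch milliseconds instead. DocIdSketch, buildDocId, the "|" separator and the "no-date" placeholder are assumptions for illustration, not part of the SDK.

```java
import java.util.Date;

public class DocIdSketch {

  /** Builds an id from author, sent date and subject that is stable across time zones. */
  public static String buildDocId(String author, Date sentDate, String subject) {
    // Epoch milliseconds do not depend on the JVM's default time zone or locale
    String datePart = (sentDate == null) ? "no-date" : String.valueOf(sentDate.getTime());
    String subjectPart = (subject == null) ? "" : subject;
    return author + "|" + datePart + "|" + subjectPart;
  }

  public static void main(String[] args) {
    System.out.println(buildDocId("alice@example.com", new Date(0L), "Hello"));
    System.out.println(buildDocId("alice@example.com", null, null));
  }
}
```

            In the connector, the three arguments would come from sender[0].toString(), msg.getSentDate() and msg.getSubject().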

            Sidebar on IngestDocument

            In some of the previous code snippets, we've created IngestDocument objects. These represent each searchable document we are adding to the Attivio index and are a central part of creating custom connectors.

            Following are some basics when working with the IngestDocument objects:

            Code Sample | Explanation
            import com.attivio.sdk.ingest.IngestDocument; | Import statement which allows you to create and modify documents to feed into Attivio.
            IngestDocument doc = new IngestDocument(String id); | Creates a new IngestDocument. It is a best practice to set a unique id which will never change. The id acts like a primary key. Any document in the index will be replaced if a new document is fed with the same id. Sometimes this involves concatenating multiple fields together. Take care not to create duplicate documents in the index by feeding the same content with different ids.
            doc.addValue(String name, String value) | The addValue method will create a field on the document, set its value, or append the value if field values already exist. The method is overloaded to allow you to pass in different data types such as Boolean, Date, Number etc.
            import com.attivio.sdk.schema.FieldNames; | Import statement which allows you to access the FieldNames constants.
            FieldNames | Contains document field names that are frequently used in Attivio.
            publisher.feed(doc); | Submits the document to the ingestion workflow. This should be done once each document is completely constructed.
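            Two behaviours from the table above are easy to overlook: addValue appends when a field already has values, and feeding a document with an existing id replaces the earlier one. The following toy model illustrates just those two statements using plain Java collections. ToyDocument and ToyIndex are hypothetical stand-ins written for this example; they are not the SDK's IngestDocument or the Attivio index, and they model nothing beyond these two rules.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IngestSemanticsSketch {

  /** Hypothetical stand-in for a document: an id plus multi-valued fields. */
  public static class ToyDocument {
    public final String id;
    public final Map<String, List<Object>> fields = new LinkedHashMap<>();

    public ToyDocument(String id) {
      this.id = id;
    }

    /** Creates the field on first use, otherwise appends to its existing values. */
    public void addValue(String name, Object value) {
      fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }
  }

  /** Hypothetical stand-in for the index: the id behaves like a primary key. */
  public static class ToyIndex {
    public final Map<String, ToyDocument> docs = new LinkedHashMap<>();

    /** Stores the document, replacing any earlier document with the same id. */
    public void feed(ToyDocument doc) {
      docs.put(doc.id, doc);
    }
  }

  public static void main(String[] args) {
    ToyDocument first = new ToyDocument("msg-1");
    first.addValue("To", "a@example.com");
    first.addValue("To", "b@example.com"); // appended, so "To" now holds two values

    ToyDocument second = new ToyDocument("msg-1"); // same id as the first document
    second.addValue("title", "Updated");

    ToyIndex index = new ToyIndex();
    index.feed(first);
    index.feed(second); // replaces the first document

    System.out.println(first.fields.get("To").size());
    System.out.println(index.docs.size());
  }
}
```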


            8. Handle Attachments

            So far, we've only written code to create some simple IngestDocument objects with ids, titles, and a couple of other metadata fields. Emails retrieved using IMAP can be plain text or HTML and can also have attachments. The following code should create the necessary documents for us. Beneath the code is an explanation of the snippets we have not seen yet.

            package com.attivio.platform.imapconnector;
            
            import java.io.ByteArrayInputStream;
            import java.io.IOException;
            import java.util.List;
            import java.util.Properties;
            import java.util.UUID;
            
            import javax.mail.Address;
            import javax.mail.Folder;
            import javax.mail.Message;
            import javax.mail.MessagingException;
            import javax.mail.Multipart;
            import javax.mail.Part;
            import javax.mail.Session;
            import javax.mail.Store;
            import javax.mail.internet.MimeBodyPart;
            import javax.mail.internet.MimeMessage;
            import javax.mail.internet.ParseException;
            
            import org.apache.commons.io.IOUtils;
            import org.apache.commons.lang.StringUtils;
            
            import com.attivio.sdk.AttivioException;
            import com.attivio.sdk.connector.DocumentPublisher;
            import com.attivio.sdk.ingest.IngestDocument;
            import com.attivio.sdk.scanner.DataSourceScanner;
            import com.attivio.sdk.schema.FieldNames;
            import com.attivio.sdk.server.annotation.ConfigurationOption;
            import com.attivio.sdk.server.annotation.ConfigurationOption.OptionLevel;
            import com.attivio.sdk.server.annotation.ConfigurationOptionInfo;
            import com.attivio.sdk.server.annotation.ScannerInfo;
            
            /**
             * A connector which logs into any IMap server and ingests the emails in a
             * configurable list of folders along with their attachments.
             */
            @ScannerInfo(suggestedWorkflow = "fileIngest") // Use fileIngest for large documents and documents with complex formats
            												// such as PDF files
            @ConfigurationOptionInfo(
            		displayName = "IMap Connector", 
            		description = "A connector which logs into any IMap server and ingests the emails along with their attachments", 
            		groups = {
            				@ConfigurationOptionInfo.Group(
            						path = {"IMap Settings"}, 
            						propertyNames = { "emailServer", "port", "username", "password", "folders" }) }
            		)
            public class IMapConnector implements DataSourceScanner {
            	private String emailServer;
            	private int port;
            	private String username;
            	private String password;
            	private List<String> folders;
            
            	@ConfigurationOption(
            			displayName = "Email Server", 
            			description = "The mail server which supports IMap, such as outlook.office365.com", 
            			optionLevel = OptionLevel.Required)
            	public String getEmailServer() {
            		return emailServer;
            	}
            
            	public void setEmailServer(String emailServer) {
            		this.emailServer = emailServer;
            	}
            
            	@ConfigurationOption(
            			displayName = "Port", 
            			description = "The port you need to use to connect using IMAP securely, such as 993", 
            			optionLevel = OptionLevel.Required)
            	public int getPort() {
            		return port;
            	}
            
            	public void setPort(int port) {
            		this.port = port;
            	}
            
            	@ConfigurationOption(
            			displayName = "Username", 
            			description = "The username which will be used when logging in to the server", 
            			optionLevel = OptionLevel.Required)
            	public String getUsername() {
            		return username;
            	}
            
            	public void setUsername(String username) {
            		this.username = username;
            	}
            
            	@ConfigurationOption(
            			displayName = "Password", 
            			description = "The password which will be used when logging in to the server", 
            			optionLevel = OptionLevel.Required, 
            			formEntryClass = ConfigurationOption.PASSWORD)
            	public String getPassword() {
            		return password;
            	}
            
            	public void setPassword(String password) {
            		this.password = password;
            	}
            
            	@ConfigurationOption(
            			displayName = "Folders", 
            			description = "The list of folders from which to retrieve emails", 
            			optionLevel = OptionLevel.Required, 
            			formEntryClass = ConfigurationOption.STRING_LIST)
            	public List<String> getFolders() {
            		return folders;
            	}
            
            	public void setFolders(List<String> folders) {
            		this.folders = folders;
            	}
            
            	@Override
            	public void start(String name, DocumentPublisher publisher) throws AttivioException {
            		
            		Properties props = new Properties();
            		props.setProperty("mail.store.protocol", "imaps");
            		// the "imaps" protocol reads its settings from mail.imaps.* properties
            		props.setProperty("mail.imaps.port", String.valueOf(port));
            
            		try {
            			// connect
            			Session session = Session.getInstance(props, null);
            			Store store = session.getStore();
            			store.connect(emailServer, port, username, password);
            			
            			// iterate through folders
            			for (String folder : folders) {
            				Folder currentFolder = store.getFolder(folder);
            				// read-only access is enough for ingestion and leaves message flags untouched
            				currentFolder.open(Folder.READ_ONLY);
            				
            				// iterate through messages
            				for (int i = currentFolder.getMessageCount(); i > 0; i = i - 1) {
            					// copy the message from IMAP to MimeMessage
            			        Message msg = new MimeMessage((MimeMessage) currentFolder.getMessage(i));
            			 
            			        // get from and to
            			        Address[] sender = msg.getFrom();
            			        Address[] recipients = msg.getAllRecipients();
            			 
            			        String author = sender[0].toString();
            			        
            			        //create attivio document
            			        String docID = author + msg.getSentDate().toString()
            			                + msg.getSubject();
            			        System.out.println("Starting to process message: " + docID);
            			        IngestDocument doc = new IngestDocument(docID);
            			        
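            					// add "From" and "To" addresses
            					for (Address address : sender) {
            						doc.addValue("From", address.toString());
            					}
            					// getAllRecipients() returns null when the message has no recipient headers
            					if (recipients != null) {
            						for (Address address : recipients) {
            							doc.addValue("To", address.toString());
            						}
            					}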
            			        
            			        // get mail content object
            			        Object content = msg.getContent();
            			        
            			        if (content instanceof String) {
            			        	System.out.println("Message is String");
            			        	setEnvelope(doc, msg);
            			        	publisher.put(
            								 doc,
            							          FieldNames.CONTENT_POINTER,
            							          UUID.randomUUID().toString(),
            							          new ByteArrayInputStream(msg.getContent().toString().getBytes()));
            			        } else if (content instanceof Multipart) {
            			        	System.out.println("Message is Multipart");
            			          Multipart innerMultiPart = (Multipart) content;
            			          checkType(innerMultiPart, doc, msg, publisher);
            			          setEnvelope(doc, msg);
            			        }
            			        System.out.println("Feeding " + docID);
            			        publisher.feed(doc);
            			        
            			        // added temporarily to short-circuit
            			        //break;
            				}
            			}
            		} catch (Exception ex) {
            			ex.printStackTrace();
            		}
            	}
            
            	  public static void setEnvelope(IngestDocument thisDoc, Message msg)
            	          throws MessagingException {
            	    thisDoc.addValue(FieldNames.DATE, msg.getSentDate());
            	    thisDoc.addValue(FieldNames.TITLE, msg.getSubject());
            	  }
            	  
            	  public void checkType(Multipart part, IngestDocument thisDoc, Message msg, DocumentPublisher publisher)
            				throws MessagingException, IOException, AttivioException,
            				ParseException {
            
            			for (int i = 0; i < part.getCount(); i++) {
            				MimeBodyPart body = (MimeBodyPart) part.getBodyPart(i);
            				String disposition = body.getDisposition();
            				body.getContent();
            				body.getContentType();
            				
            				if (body.isMimeType("text/*")) {
            					System.out.println("Message is Mimetype text/*");
            					if (disposition == null) {
            						System.out.println("disposition is null");
            						publisher.put(
            								thisDoc,
            							          FieldNames.CONTENT_POINTER,
            							          UUID.randomUUID().toString(),
            							          new ByteArrayInputStream(body.getContent().toString().getBytes()));
            					} else if (disposition != null) {
            						System.out.println("disposition is NOT null");
            						if ((body.getDisposition().equalsIgnoreCase(Part.ATTACHMENT) && StringUtils
            								.isNotBlank(body.getFileName()))) {
            							System.out.println("Attachment");
            
            							thisDoc.addValue("hasAttachment", true);
            							thisDoc.addValue("attachmentName", body.getFileName());
            							Address[] sender = msg.getFrom();
            							String author = sender[0].toString();
            							// set child document ID
            							String attachDocID = "attachment-" + author
            									+ msg.getSentDate().toString()
            									+ body.getFileName();
            							IngestDocument attachmentDoc = new IngestDocument(
            									attachDocID);
            							setEnvelope(attachmentDoc, msg);
            							
            							byte[] bytes = IOUtils.toByteArray(body.getInputStream());
            							
            							publisher.put(
            									attachmentDoc,
            							          FieldNames.CONTENT_POINTER,
            							          UUID.randomUUID().toString(),
            							          new ByteArrayInputStream(bytes));
            							attachmentDoc.addValue("parentid", thisDoc.getId());
            							attachmentDoc.addValue("attachmentName",
            									body.getFileName());
            							
            							try {
            								publisher.feed(attachmentDoc);
            							} catch (Exception mex) {
            								System.out.println(mex.getMessage());
            							}
            							
            						}
            					}
            				}
            
            				else if (body.isMimeType("application/*")) {
            					System.out.println("Mimetype is application/*");
            					// Check that body type is an attachment and has a filename.
            					// The disposition may be null, so compare from the constant to avoid a NullPointerException.
            					if ((Part.ATTACHMENT.equalsIgnoreCase(disposition) && StringUtils
            							.isNotBlank(body.getFileName()))) {
            						System.out.println("Attachment");
            			
            						thisDoc.addValue("hasAttachment", true);
            						thisDoc.addValue("attachmentName", body.getFileName());
            						// Create IngestDocument for each attachment
            						if (!body.isMimeType("application/x-zip-compressed") && !body.isMimeType("application/octet-stream")) {
            							System.out.println("Mimetype is NOT application/x-zip-compressed");
            							Address[] sender = msg.getFrom();
            							String author = sender[0].toString();
            							// set child document ID
            							String attachDocID = "attachment-" + author
            									+ msg.getSentDate().toString()
            									+ body.getFileName();
            							IngestDocument attachmentDoc = new IngestDocument(
            									attachDocID);
            							setEnvelope(attachmentDoc, msg);
            							
            							// set content pointer
            							byte[] bytes = IOUtils.toByteArray(body.getInputStream());
            							
            							publisher.put(
            									attachmentDoc,
            							          FieldNames.CONTENT_POINTER,
            							          UUID.randomUUID().toString(),
            							          new ByteArrayInputStream(bytes));
            							attachmentDoc.addValue("parentid", thisDoc.getId());
            							attachmentDoc.addValue("attachmentName",
            									body.getFileName());
            						
            								try {
            									publisher.feed(attachmentDoc);
            								} catch (Exception mex) {
            									System.out.println(mex.getMessage());
            								}
            								
            						}
            					}
            				}
            
            
            				else if (body.isMimeType("image/*")) {
            					System.out.println("Mimetype is image/*");
            					Address[] sender = msg.getFrom();
            					String author = sender[0].toString();
            					// set child document ID
            					String attachDocID = "attachment-" + author
            							+ msg.getSentDate().toString()
            							+ body.getFileName();
            					IngestDocument attachmentDoc = new IngestDocument(
            							attachDocID);
            					setEnvelope(attachmentDoc, msg);
            					// set content pointer
            					byte[] bytes = IOUtils.toByteArray(body.getInputStream());
            					
            					publisher.put(
            							attachmentDoc,
            					          FieldNames.CONTENT_POINTER,
            					          UUID.randomUUID().toString(),
            					          new ByteArrayInputStream(bytes));
            
            					attachmentDoc.addValue("parentid", thisDoc.getId());
            					attachmentDoc.addValue("hasImg", true);
            					attachmentDoc.addValue("imgName", body.getFileName());
            
            				
            						try {
            							publisher.feed(attachmentDoc);
            						} catch (Exception mex) {
            							System.out.println(mex.getMessage());
            						}
            
            				}
            
            				// If bodypart is a multipart, recursively pull out pieces
            				else if (body.isMimeType("multipart/*")) {
            					System.out.println("RECURSE: Mimetype is multipart/*");
            					Multipart mPart = (Multipart) body.getContent();
            					System.out.println("Starting RECURSE");
            					checkType(mPart, thisDoc, msg, publisher);
            				}
            
            				// If bodypart is a message, recursively pull out pieces
            				
            				 else if (body.isMimeType("message/rfc822")) { 
            					 System.out.println("Mimetype is message/rfc822");
            					 Message mess = new MimeMessage((MimeMessage) body.getContent());
            					 if(mess.isMimeType("multipart/*")) { 
            						 System.out.println("RECURSE message/rfc822");
            						 Multipart mPart =(Multipart) mess.getContent(); 
            						 checkType(mPart, thisDoc, msg, publisher);
            						 mess.setContent(mPart); mess.saveChanges();
            					 } 
            					 else if(mess.getContent() instanceof String) { 
            						 System.out.println("Multi-part String");
            						 setEnvelope(thisDoc,msg); 
            						 publisher.put(
            								 thisDoc,
            							          FieldNames.CONTENT_POINTER,
            							          UUID.randomUUID().toString(),
            							          new ByteArrayInputStream(mess.getContent().toString().getBytes()));
            					} 
            					 body.setDataHandler(body.getDataHandler());
            					 System.out.println("Set DataHandler as " + body.getDataHandler());
            				 }
            						 
            				 
            			}
            		}
            	
            	@Override
            	public void validateConfiguration() throws AttivioException {
            		// No validation yet; it is added in step 10, "Add Configuration Validation"
            	}
            }

            Another Sidebar

            In the most recent code, we've introduced a bit more of the SDK:

            Code Sample / Explanation

            publisher.put(
            doc,
            FieldNames.CONTENT_POINTER,
            UUID.randomUUID().toString(),
            new ByteArrayInputStream(msg.getContent().toString().getBytes()));

            To avoid consuming excessive memory during ingestion, binary content such as a PDF, or even a large amount of text (which some emails have), is placed in the Content Store, and the IngestDocument carries only a pointer to it. At the appropriate point in the ingestion workflow, the binary content is retrieved and its text is extracted efficiently. The extracted text ends up in the field named "text" at the end of the workflow. If your documents are small, you can populate the "text" field directly in your connector, as you would any other field.

            attachmentDoc.addValue("parentid", thisDoc.getId());

            It's likely we'll want to display an email along with its attachments in our search application. It is a best practice to maintain this "parent-child" relationship by populating a field named "parentid" on each child document with the .id value of the document from which it came.

            String attachDocID = "attachment-" + author
            + msg.getSentDate().toString()
            + body.getFileName();

            Notice that when we create the attachment documents, we take care to give them distinct ids by combining the prefix "attachment-" with the sender and sent date of the parent email and the filename of the attachment.
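
The id scheme above can be sketched in plain Java, outside the connector. The MessageIds class and its method names are hypothetical and not part of the Attivio SDK. The sketch also assumes you want a placeholder when the sent date is missing (JavaMail's getSentDate() can return null), and it keys on epoch milliseconds rather than Date.toString(), whose output depends on the JVM's default time zone.

```java
import java.util.Date;

/** Hypothetical helper (not part of the Attivio SDK) for building document ids. */
final class MessageIds {
    private MessageIds() {}

    /** Id for an email: sender + sent date + subject, mirroring the connector. */
    static String emailId(String author, Date sentDate, String subject) {
        return author + dateKey(sentDate) + nullToEmpty(subject);
    }

    /** Id for an attachment: "attachment-" + sender + sent date + file name. */
    static String attachmentId(String author, Date sentDate, String fileName) {
        return "attachment-" + author + dateKey(sentDate) + nullToEmpty(fileName);
    }

    // Assumption: a fixed placeholder is acceptable when the Date header is absent.
    private static String dateKey(Date sentDate) {
        return sentDate == null ? "unknown-date" : Long.toString(sentDate.getTime());
    }

    private static String nullToEmpty(String value) {
        return value == null ? "" : value;
    }
}
```

In the connector, attachDocID could then be built with MessageIds.attachmentId(author, msg.getSentDate(), body.getFileName()). Note that two attachments with the same filename in one email would still share an id under this scheme.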

            9. Add Logging

            In the above code, you may have noticed several System.out.println() statements. While these are convenient when developing and testing exclusively in Eclipse, they are not what we want when our connector is ready for deployment to a running Attivio project. Instead, we want to write to the same logs as all other connectors.

              Add the following import:

              import com.attivio.util.AttivioLogger;

              Add the following to the class:

              private static final AttivioLogger LOG = AttivioLogger.getLogger(IMapConnector.class);

              Replace all the System.out.println() statements with LOG.info(), LOG.debug(), or LOG.trace() statements as appropriate.


              10. Add Configuration Validation

              Next, we want to catch any bad configurations. We can validate the values set for our custom properties. We would implement any such logic in the provided validateConfiguration() method.

              For example, we could check whether the port that is set is a number greater than 0.

              import com.attivio.sdk.error.ConnectorError;
              
              
              ...
              
              
              	@Override
              	public void validateConfiguration() throws AttivioException {
              		if (port <= 0) {
              			throw new AttivioException(ConnectorError.CONFIGURATION_ERROR, "The value for port must be a number greater than 0.");
              		}
              	}

              We could take this much further to ensure that the emailServer property is either a valid domain or IP address. If there are rules for the username, password and folders, we could validate those as well.
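
As a rough sketch of what that broader validation might look like, the helper below collects every problem it finds instead of stopping at the first. The ImapConfigCheck class, its method name, and the specific rules (a simple host-name pattern, a 1 to 65535 port range, non-blank username, at least one folder) are assumptions made for illustration, not requirements of the Attivio SDK or of any particular mail server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of the Attivio SDK) that checks connector settings. */
final class ImapConfigCheck {
    // Rough host-name shape: dot-separated labels of letters, digits and hyphens,
    // at most 253 characters overall. IPv4 literals such as 192.168.1.10 also match.
    private static final Pattern HOST = Pattern.compile(
            "^(?=.{1,253}$)[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
            + "(\\.[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?)*$");

    private ImapConfigCheck() {}

    /** Returns one message per invalid setting; an empty list means the settings look usable. */
    static List<String> problems(String emailServer, int port, String username, List<String> folders) {
        List<String> problems = new ArrayList<>();
        if (emailServer == null || !HOST.matcher(emailServer.trim()).matches()) {
            problems.add("emailServer must be a host name or IPv4 address");
        }
        if (port <= 0 || port > 65535) {
            problems.add("port must be between 1 and 65535");
        }
        if (username == null || username.trim().isEmpty()) {
            problems.add("username must not be blank");
        }
        if (folders == null || folders.isEmpty()) {
            problems.add("at least one folder must be listed");
        }
        return problems;
    }
}
```

Inside validateConfiguration(), a non-empty result could be joined into a single message and thrown as an AttivioException with ConnectorError.CONFIGURATION_ERROR, in the same way as the port check above.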

              11. Handle Exceptions

              Another improvement we should make is to throw appropriate errors when things go wrong. In the validation code we added in the previous step, you can see that we throw an AttivioException when we hit an issue. We have access to a number of error codes via the ConnectorError object. You can pick the most appropriate error code to throw, defaulting to ConnectorError.CRAWL_FAILED.
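
One way to pick the error code is to inspect the exception that was caught. The sketch below is only an illustration under stated assumptions: CrawlErrors and its Kind enum are hypothetical stand-ins that mirror the two ConnectorError names mentioned in this tutorial, and the rule that an unresolvable host or an illegal argument indicates a configuration problem is a guess about your environment, not SDK behavior.

```java
import java.net.UnknownHostException;

/** Hypothetical helper (not part of the Attivio SDK) that picks an error category. */
final class CrawlErrors {
    /** Stand-in for the ConnectorError codes named in this tutorial. */
    enum Kind { CONFIGURATION_ERROR, CRAWL_FAILED }

    private CrawlErrors() {}

    /** Walks the cause chain and defaults to CRAWL_FAILED when nothing more specific fits. */
    static Kind classify(Throwable error) {
        Throwable current = error;
        // The depth limit guards against a cyclic cause chain.
        for (int depth = 0; current != null && depth < 16; depth++) {
            if (current instanceof UnknownHostException
                    || current instanceof IllegalArgumentException) {
                return Kind.CONFIGURATION_ERROR;
            }
            current = current.getCause();
        }
        return Kind.CRAWL_FAILED;
    }
}
```

In the catch block of start(), the result would be mapped to the matching ConnectorError value and passed to an AttivioException, using the same two-argument constructor as in the validation step.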

              12. Set the Module Configuration

              Edit the src/main/resources/attivio.module.json file to add the new component [connector]:

              {
              	"name": "imapconnector",
              	"moduleVersion": "${version}",
              	"description": "A connector which logs into any IMap server and ingests the emails in a configurable list of folders along with their attachments.",
              	"initClassName": "com.attivio.platform.imapconnector.ModuleInit",
              	"connectors": {
              		"ImapConnector": "com.attivio.platform.imapconnector.IMapConnector"
              	},
              	"newFiles": {
              		"lib/imapconnector.jar": "lib/imapconnector-${project.version}.jar"
              	}
              }

              13. Build and Install the imapconnector Module in the Factbook Project

              At this point, our IMapConnector class should look like the following:

              package com.attivio.platform.imapconnector;
               
              import java.io.ByteArrayInputStream;
              import java.io.IOException;
              import java.util.List;
              import java.util.Properties;
              import java.util.UUID;
               
              import javax.mail.Address;
              import javax.mail.Folder;
              import javax.mail.Message;
              import javax.mail.MessagingException;
              import javax.mail.Multipart;
              import javax.mail.Part;
              import javax.mail.Session;
              import javax.mail.Store;
              import javax.mail.internet.MimeBodyPart;
              import javax.mail.internet.MimeMessage;
              import javax.mail.internet.ParseException;
               
              import org.apache.commons.io.IOUtils;
              import org.apache.commons.lang.StringUtils;
               
              import com.attivio.sdk.AttivioException;
              import com.attivio.sdk.connector.DocumentPublisher;
              import com.attivio.sdk.error.ConnectorError;
              import com.attivio.sdk.ingest.IngestDocument;
              import com.attivio.sdk.scanner.DataSourceScanner;
              import com.attivio.sdk.schema.FieldNames;
              import com.attivio.sdk.server.annotation.ConfigurationOption;
              import com.attivio.sdk.server.annotation.ConfigurationOption.OptionLevel;
              import com.attivio.sdk.server.annotation.ConfigurationOptionInfo;
              import com.attivio.sdk.server.annotation.ScannerInfo;
              import com.attivio.util.AttivioLogger;
               
              /**
               * A connector which logs into any IMap server and ingests the emails in a
               * configurable list of folders along with their attachments.
               */
              @ScannerInfo(suggestedWorkflow = "fileIngest") // Use fileIngest for large documents and documents with complex formats
                                                              // such as PDF files
              @ConfigurationOptionInfo(
                      displayName = "IMap Connector",
                      description = "A connector which logs into any IMap server and ingests the emails along with their attachments",
                      groups = {
                              @ConfigurationOptionInfo.Group(
                                      path = {"IMap Settings"},
                                      propertyNames = { "emailServer", "port", "username", "password", "folders" }) }
                      )
              public class IMapConnector implements DataSourceScanner {
              	
              	private static final AttivioLogger LOG = AttivioLogger.getLogger(IMapConnector.class);
              	
                  private String emailServer;
                  private int port;
                  private String username;
                  private String password;
                  private List<String> folders;
               
                  @ConfigurationOption(
                          displayName = "Email Server",
                          description = "The mail server which supports IMap, such as outlook.office365.com",
                          optionLevel = OptionLevel.Required)
                  public String getEmailServer() {
                      return emailServer;
                  }
               
                  public void setEmailServer(String emailServer) {
                      this.emailServer = emailServer;
                  }
               
                  @ConfigurationOption(
                          displayName = "Port",
                          description = "The port you need to use to connect using IMAP securely, such as 993",
                          optionLevel = OptionLevel.Required)
                  public int getPort() {
                      return port;
                  }
               
                  public void setPort(int port) {
                      this.port = port;
                  }
               
                  @ConfigurationOption(
                          displayName = "Username",
                          description = "The username which will be used when logging in to the server",
                          optionLevel = OptionLevel.Required)
                  public String getUsername() {
                      return username;
                  }
               
                  public void setUsername(String username) {
                      this.username = username;
                  }
               
                  @ConfigurationOption(
                          displayName = "Password",
                          description = "The password which will be used when logging in to the server",
                          optionLevel = OptionLevel.Required,
                          formEntryClass = ConfigurationOption.PASSWORD)
                  public String getPassword() {
                      return password;
                  }
               
                  public void setPassword(String password) {
                      this.password = password;
                  }
               
                  @ConfigurationOption(
                          displayName = "Folders",
                          description = "The list of folders from which to retrieve emails",
                          optionLevel = OptionLevel.Required,
                          formEntryClass = ConfigurationOption.STRING_LIST)
                  public List<String> getFolders() {
                      return folders;
                  }
               
                  public void setFolders(List<String> folders) {
                      this.folders = folders;
                  }
               
                  @Override
                  public void start(String name, DocumentPublisher publisher) throws AttivioException {
                       
                      Properties props = new Properties();
                      props.setProperty("mail.store.protocol", "imaps");
                      // the "imaps" protocol reads its settings from mail.imaps.* properties
                      props.setProperty("mail.imaps.port", String.valueOf(port));
               
                      try {
                          // connect
                          Session session = Session.getInstance(props, null);
                          Store store = session.getStore();
                          store.connect(emailServer, port, username, password);
                           
                          // iterate through folders
                          for (String folder : folders) {
                              Folder currentFolder = store.getFolder(folder);
                              // read-only access is enough for ingestion and leaves message flags untouched
                              currentFolder.open(Folder.READ_ONLY);
                               
                              // iterate through messages
                              for (int i = currentFolder.getMessageCount(); i > 0; i = i - 1) {
                                  // copy the message from IMAP to MimeMessage
                                  Message msg = new MimeMessage((MimeMessage) currentFolder.getMessage(i));
                            
                                  // get from and to
                                  Address[] sender = msg.getFrom();
                                  Address[] recipients = msg.getAllRecipients();
                            
                                  String author = sender[0].toString();
                                   
                                  //create attivio document
                                  String docID = author + msg.getSentDate().toString()
                                          + msg.getSubject();
                                  LOG.trace("Starting to process message: " + docID);
                                  IngestDocument doc = new IngestDocument(docID);
                                   
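                                  // add "From" and "To" addresses
                                  for (Address address : sender) {
                                    doc.addValue("From", address.toString());
                                  }
                                  // getAllRecipients() returns null when the message has no recipient headers
                                  if (recipients != null) {
                                    for (Address address : recipients) {
                                      doc.addValue("To", address.toString());
                                    }
                                  }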
                                   
                                  // get mail content object
                                  Object content = msg.getContent();
                                   
                                  if (content instanceof String) {
                                      LOG.trace("Message is String");
                                      setEnvelope(doc, msg);
                                      publisher.put(
                                               doc,
                                                    FieldNames.CONTENT_POINTER,
                                                    UUID.randomUUID().toString(),
                                                    new ByteArrayInputStream(msg.getContent().toString().getBytes()));
                                  } else if (content instanceof Multipart) {
                                      LOG.trace("Message is Multipart");
                                    Multipart innerMultiPart = (Multipart) content;
                                    checkType(innerMultiPart, doc, msg, publisher);
                                    setEnvelope(doc, msg);
                                  }
                                  LOG.trace("Feeding " + docID);
                                  publisher.feed(doc);
                                   
                                  // add temporarily to short-circuit after 1 email is ingested
                                  //break;
                              }
                          }
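                      } catch (Exception ex) {
                          // Report the failure to Attivio instead of swallowing it (see step 11)
                          throw new AttivioException(ConnectorError.CRAWL_FAILED, "IMAP crawl failed: " + ex.getMessage());
                      }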
                  }
               
                    public static void setEnvelope(IngestDocument thisDoc, Message msg)
                            throws MessagingException {
                      thisDoc.addValue(FieldNames.DATE, msg.getSentDate());
                      thisDoc.addValue(FieldNames.TITLE, msg.getSubject());
                    }
                     
                    public void checkType(Multipart part, IngestDocument thisDoc, Message msg, DocumentPublisher publisher)
                              throws MessagingException, IOException, AttivioException,
                              ParseException {
               
                          for (int i = 0; i < part.getCount(); i++) {
                              MimeBodyPart body = (MimeBodyPart) part.getBodyPart(i);
                              String disposition = body.getDisposition();
                              body.getContent();
                              body.getContentType();
                               
                              if (body.isMimeType("text/*")) {
                                  LOG.trace("Message is Mimetype text/*");
                                  if (disposition == null) {
                                      LOG.trace("disposition is null");
                                      publisher.put(
                                              thisDoc,
                                                    FieldNames.CONTENT_POINTER,
                                                    UUID.randomUUID().toString(),
                                                    new ByteArrayInputStream(body.getContent().toString().getBytes()));
                                  } else if (disposition != null) {
                                      LOG.trace("disposition is NOT null");
                                      if ((body.getDisposition().equalsIgnoreCase(Part.ATTACHMENT) && StringUtils
                                              .isNotBlank(body.getFileName()))) {
                                          LOG.trace("Attachment");
               
                                          thisDoc.addValue("hasAttachment", true);
                                          thisDoc.addValue("attachmentName", body.getFileName());
                                          Address[] sender = msg.getFrom();
                                          String author = sender[0].toString();
                                          // set child document ID
                                          String attachDocID = "attachment-" + author
                                                  + msg.getSentDate().toString()
                                                  + body.getFileName();
                                          IngestDocument attachmentDoc = new IngestDocument(
                                                  attachDocID);
                                          setEnvelope(attachmentDoc, msg);
                                           
                                          byte[] bytes = IOUtils.toByteArray(body.getInputStream());
                                           
                                          publisher.put(
                                                  attachmentDoc,
                                                    FieldNames.CONTENT_POINTER,
                                                    UUID.randomUUID().toString(),
                                                    new ByteArrayInputStream(bytes));
                                          attachmentDoc.addValue("parentid", thisDoc.getId());
                                          attachmentDoc.addValue("attachmentName",
                                                  body.getFileName());
                                           
                                          try {
                                              publisher.feed(attachmentDoc);
                                          } catch (Exception mex) {
                                              LOG.trace(mex.getMessage());
                                          }
                                           
                                      }
                                  }
                              }
               
                              if (body.isMimeType("application/*")) {
                                  LOG.trace("Mimetype is application/*");
                                  // Check that body type is an attachment and has a filename
                                  // getDisposition() may return null, so compare from the constant side
                                  if (Part.ATTACHMENT.equalsIgnoreCase(body.getDisposition())
                                          && StringUtils.isNotBlank(body.getFileName())) {
                                      LOG.trace("Attachment");
                           
                                      thisDoc.addValue("hasAttachment", true);
                                      thisDoc.addValue("attachmentName", body.getFileName());
                                      // Create IngestDocument for each attachment
                                      if (!body.isMimeType("application/x-zip-compressed") && !body.isMimeType("application/octet-stream")) {
                                          LOG.trace("Mimetype is neither application/x-zip-compressed nor application/octet-stream");
                                          Address[] sender = msg.getFrom();
                                          String author = sender[0].toString();
                                          // set child document ID
                                          String attachDocID = "attachment-" + author
                                                  + msg.getSentDate().toString()
                                                  + body.getFileName();
                                          IngestDocument attachmentDoc = new IngestDocument(
                                                  attachDocID);
                                          setEnvelope(attachmentDoc, msg);
                                           
                                          // set content pointer
                                          byte[] bytes = IOUtils.toByteArray(body.getInputStream());
                                           
                                          publisher.put(
                                                  attachmentDoc,
                                                    FieldNames.CONTENT_POINTER,
                                                    UUID.randomUUID().toString(),
                                                    new ByteArrayInputStream(bytes));
                                          attachmentDoc.addValue("parentid", thisDoc.getId());
                                          attachmentDoc.addValue("attachmentName",
                                                  body.getFileName());
                                       
                                          try {
                                              publisher.feed(attachmentDoc);
                                          } catch (Exception mex) {
                                              LOG.trace(mex.getMessage());
                                          }
                                               
                                      }
                                  }
                              }
               
               
                              else if (body.isMimeType("image/*")) {
                                  LOG.trace("Mimetype is image/*");
                                  Address[] sender = msg.getFrom();
                                  String author = sender[0].toString();
                                  // set child document ID
                                  String attachDocID = "attachment-" + author
                                          + msg.getSentDate().toString()
                                          + body.getFileName();
                                  IngestDocument attachmentDoc = new IngestDocument(
                                          attachDocID);
                                  setEnvelope(attachmentDoc, msg);
                                  // set content pointer
                                  byte[] bytes = IOUtils.toByteArray(body.getInputStream());
                                   
                                  publisher.put(
                                          attachmentDoc,
                                            FieldNames.CONTENT_POINTER,
                                            UUID.randomUUID().toString(),
                                            new ByteArrayInputStream(bytes));
               
                                  attachmentDoc.addValue("parentid", thisDoc.getId());
                                  attachmentDoc.addValue("hasImg", true);
                                  attachmentDoc.addValue("imgName", body.getFileName());
               
                               
                                  try {
                                      publisher.feed(attachmentDoc);
                                  } catch (Exception mex) {
                                      LOG.trace(mex.getMessage());
                                  }
               
                              }
               
                              // If bodypart is a multipart, recursively pull out pieces
                              else if (body.isMimeType("multipart/*")) {
                                  LOG.trace("RECURSE: Mimetype is multipart/*");
                                  Multipart mPart = (Multipart) body.getContent();
                                  LOG.trace("Starting RECURSE");
                                  checkType(mPart, thisDoc, msg, publisher);
                              }
               
                              // If bodypart is a message, recursively pull out pieces
                               
                               else if (body.isMimeType("message/rfc822")) {
                                   LOG.trace("Mimetype is message/rfc822");
                                   Message mess = new MimeMessage((MimeMessage) body.getContent());
                                   if(mess.isMimeType("multipart/*")) {
                                       LOG.trace("RECURSE message/rfc822");
                                       Multipart mPart =(Multipart) mess.getContent();
                                       checkType(mPart, thisDoc, msg, publisher);
                                       mess.setContent(mPart); mess.saveChanges();
                                   }
                                   else if(mess.getContent() instanceof String) {
                                       LOG.trace("Multi-part String");
                                       setEnvelope(thisDoc,msg);
                                       publisher.put(
                                               thisDoc,
                                                    FieldNames.CONTENT_POINTER,
                                                    UUID.randomUUID().toString(),
                                                    new ByteArrayInputStream(mess.getContent().toString().getBytes()));
                                  }
                                   body.setDataHandler(body.getDataHandler());
                                   LOG.trace("Set DataHandler as " + body.getDataHandler());
                               }
                                        
                                
                          }
                      }
                   
                    @Override
                    public void validateConfiguration() throws AttivioException {
                        // Valid TCP ports are 1 through 65535
                        if (port <= 0 || port > 65535) {
                            throw new AttivioException(ConnectorError.CONFIGURATION_ERROR, "The value for port must be a number between 1 and 65535.");
                        }
                    }
              }
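
The three attachment branches in checkType build the child document ID from the sender, the sent date, and the file name. JavaMail can return null for each of these (for example, getSentDate() when the Date header is missing), which would throw a NullPointerException in the code above. A minimal sketch of a null-tolerant helper follows. The class name AttachmentIds and its placeholder strings are hypothetical and are not part of the tutorial code or the Attivio SDK.

```java
import java.util.Date;

/** Hypothetical helper; not part of the Attivio SDK or the tutorial code. */
final class AttachmentIds {
    private AttachmentIds() {}

    /**
     * Builds the same "attachment-" + author + sentDate + fileName ID used in
     * checkType, substituting a placeholder for any missing part instead of throwing.
     */
    static String build(String author, Date sentDate, String fileName) {
        String safeAuthor = (author == null || author.trim().isEmpty()) ? "unknown-sender" : author;
        String safeDate = (sentDate == null) ? "unknown-date" : sentDate.toString();
        String safeName = (fileName == null || fileName.trim().isEmpty()) ? "unnamed" : fileName;
        return "attachment-" + safeAuthor + safeDate + safeName;
    }

    public static void main(String[] args) {
        // A fully populated message
        System.out.println(build("alice@example.com", new Date(0L), "report.pdf"));
        // A message with no From header, no Date header, and an unnamed part
        System.out.println(build(null, null, null));
    }
}
```

In checkType, the caller would pass sender[0].toString() only after checking that msg.getFrom() is non-null and non-empty, and pass null otherwise.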

              Now, let's build it and install it in our Factbook project.

              Warning: When building a production-quality connector, you should delete all the sample classes and their corresponding tests before building your module.

                Before building, modify your test so that it ingests only a small number of emails; otherwise the test may take a long time to run.

                Run the following commands to build the project:


                cd C:\attivio\connectorsdkprojects\imapconnector
                mvn clean install
                ...
                [INFO] ------------------------------------------------------------------------
                [INFO] BUILD SUCCESS
                [INFO] ------------------------------------------------------------------------
                [INFO] Total time: 18.290 s
                [INFO] Finished at: 2018-06-04T15:49:09-04:00
                [INFO] Final Memory: 27M/272M
                [INFO] ------------------------------------------------------------------------

                This will create a new file in the target directory of your module named imapconnector-0.1.0-SNAPSHOT-dist.zip. Next, we'll install this module into our Factbook project.

                Run the following command to install the imapconnector module:

                 <install-dir>\bin\aie-exec modulemanager -i file:///c:/attivio/connectorsdkprojects/imapconnector/target/imapconnector-0.1.0-SNAPSHOT-dist.zip

                Confirm the module has been installed:

                 <install-dir>\bin\aie-exec modulemanager -l
                 
                Name            Version        User      Installed On        Description
                -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                cloudsupport    1.0.1          localuser 2018-05-24T13:00:27 Adds support for interacting with Cloud services
                module-alm      1.0.1.aie55    localuser 2018-05-24T12:59:57 Provides English language advanced linguistics capabilities, including language detection and entity extraction
                imapconnector   0.1.0-SNAPSHOT apaquette 2018-06-04T16:06:13 A description of my module
                searchanalytics 1.0.1          localuser 2018-05-24T13:00:05 Archive, scan and query search information
                searchui        1.0.1          localuser 2018-05-24T13:00:14 Used to host the Search UI application from within Attivio nodes
                webcrawler      1.0.4          localuser 2018-05-24T13:00:21 Configure connectors to crawl web content and process the pages as ingest documents. 

                Incrementally add the imapconnector module to the Factbook project:

                <install-dir>\bin\createproject.exe --name Factbook -i -m imapconnector -o C:\attivio\projects 

                Open the Attivio CLI:

                <install-dir>\bin\aie-cli.exe -p C:\attivio\projects\Factbook 

                Type the update command and hit Enter 

                Type the deploy command and hit Enter 

                Once the project is running again, move on to the next step.

                14. Add an Instance of the ImapConnector

                1. Click Business Center > Connector Admin UI
                2. Click New Connector
                3. Select the Imap Connector type and click Next
                4. Name the connector "emails" and click Next

                5. On the IMap Settings tab of the Configure page of the new connector wizard, enter the following:

                  Field      Value
                  Name       emails
                  Port       993
                  Username   redacted
                  Password   redacted
                  Folders    Inbox
                6. Click Validate
                7. Click Next
                8. Once the field mappings are previewed, click Save
                9. Run the connector

                15. Test the Results

                1. Go to http://localhost:17000/searchui/ to open Search UI. Log in with username aieadmin and password attivio.
                2. Execute a search for table:emails
                3. Search UI should display the emails from the specified mailbox(es), along with any attachments they contain.

                Summary

                We've covered a lot in this tutorial. We created a custom connector and adjusted how the custom properties it requires are displayed. Our connector logs to the Attivio standard and error logs and throws appropriate exceptions when things go wrong. 

                If we were to continue building our connector for production use, we'd want it to ingest documents incrementally, meaning that each run ingests only content that is new or has been edited or deleted since the previous run. We might also speed up ingestion by running multiple threads concurrently. Some sources push back when requests arrive too frequently. If we can determine what those "push backs" look like, we can make our connector respond to them gracefully, for example by pausing ingestion for a period of time.
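
One common way to respond to a source that pushes back is to retry with exponential back-off: wait, retry, and double the wait after each further refusal, up to a cap. The sketch below illustrates the idea with the Java standard library only. BackoffRetry, ThrottledException, and the parameter names are hypothetical; how a real mail server signals throttling, and how the Attivio SDK would surface it, are not covered by this tutorial.

```java
import java.util.concurrent.Callable;

/** Hypothetical sketch; none of these names come from the Attivio SDK. */
final class BackoffRetry {
    /** Hypothetical marker for a "slow down" response from the source. */
    static final class ThrottledException extends Exception {
        ThrottledException(String message) {
            super(message);
        }
    }

    private BackoffRetry() {}

    /** Delay before retry number attempt (0-based): baseMillis * 2^attempt, capped at maxMillis. */
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis;
        // Stop doubling once the cap is reached, which also avoids overflow
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    /** Runs the request, sleeping and retrying while the source reports throttling. */
    static <T> T call(Callable<T> request, int maxAttempts, long baseMillis, long maxMillis)
            throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return request.call();
            } catch (ThrottledException e) {
                if (attempt + 1 >= maxAttempts) {
                    throw e; // give up after the last allowed attempt
                }
                Thread.sleep(delayMillis(attempt, baseMillis, maxMillis));
            }
        }
    }
}
```

A connector could wrap each fetch from the source in BackoffRetry.call(...), for example with a base delay of one second and a cap of one minute. Any exception other than ThrottledException propagates immediately, so genuine failures are not retried.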

                These advanced concepts will be handled in future tutorials. For now, we encourage you to start building connectors and reach out to us with any questions.

                What's Next?
