Overview
AIE supports a hierarchical drill-down of date ranges through the DateFacetRequest mechanism. The DateFacetRequest is attached to the QueryRequest at some point in the defaultQuery workflow.
A date facet uses a calendar-based collection of facet buckets:
- The first query returns one facet bucket for each year.
- Drilling down on a year returns month buckets for that year.
- Drilling down on a month returns day buckets for that month.
- Drilling down on a day returns hour buckets for that day.
- Drilling down on an hour returns minute buckets for that hour.
- Drilling down on a minute returns second buckets for that minute.
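The drill-down order in the list above can be sketched as a small enumeration. This is only an illustrative model: the `DateScale` type and its `drillDown()` method are hypothetical names, not part of the AIE SDK, and the sketch covers only the six levels named in the list.

```java
import java.util.Optional;

/** Illustrative model of the calendar drill-down order; not an AIE SDK type. */
enum DateScale {
    YEAR, MONTH, DAY, HOUR, MINUTE, SECOND;

    /** Returns the scale of the buckets produced by drilling down on this scale. */
    Optional<DateScale> drillDown() {
        DateScale[] all = values();
        int next = ordinal() + 1;
        return next < all.length ? Optional.of(all[next]) : Optional.empty();
    }
}

class DrillDownDemo {
    public static void main(String[] args) {
        // Print each level together with the level reached by one drill-down step.
        for (DateScale scale : DateScale.values()) {
            String finer = scale.drillDown().map(DateScale::name).orElse("(finest level)");
            System.out.println(scale + " -> " + finer);
        }
    }
}
```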
General information common to all facet features can be found on the Facets page.
Creating Date Test Data
A date facet performs best when the index contains numerous documents that have a wide range of dates. If you don't have access to an appropriate data set for testing, you can create one by passing documents through an ingest transformer that inserts randomly-generated timestamps in the date field.
The following transformer class is an elaboration of the simple one described in Creating Custom Ingest Transformers.
package com.acme.examples;

import com.attivio.sdk.ingest.IngestDocument;
import com.attivio.sdk.server.component.ingest.DocumentModifyingTransformer;
import com.ibm.icu.util.GregorianCalendar;

// DateRandomizer is an ingest transformer that overwrites the creation date
// of every passing document with a random date between 2000 and 2010.
// Put it in the ingest workflow just before the indexer.
// The purpose of DateRandomizer is to provide data to exercise the
// DateFacetRequest capability.
public class DateRandomizer implements DocumentModifyingTransformer {

  public boolean processDocument(IngestDocument doc) {
    int year = randBetween(2000, 2010);  // Adjust range of dates here.
    int month = randBetween(0, 11);      // Calendar months are zero-based.
    int day = randBetween(1, 28);        // Days are one-based; 28 is valid in every month.
    int hour = randBetween(0, 23);
    int min = randBetween(0, 59);
    int sec = randBetween(0, 59);
    GregorianCalendar calendar = new GregorianCalendar(year, month, day, hour, min, sec);
    doc.setField("date", calendar.getTime());
    return true;
  }

  // Returns a uniformly distributed random integer between start and end, inclusive.
  public static int randBetween(int start, int end) {
    return start + (int) (Math.random() * (end - start + 1));
  }
}
Create a new ingest transformer based on this class, and insert it in the ingest workflow just before the indexer subflow. Then load the Factbook country feed (or any other documents). The transformer will generate a random timestamp and insert it in each document's date field. The granularity of the timestamps can be controlled by adjusting the limits on the random values.
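One way to control timestamp granularity is to draw a random instant from a range and then truncate it to the desired unit. The following is a minimal standalone sketch using only the JDK; the class and method names are hypothetical, and it assumes the start of the range is already aligned to the chosen unit (otherwise truncation could produce a value slightly before the start). `Date.from(...)` yields a `java.util.Date`, the same type the transformer above passes to `setField`.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Date;
import java.util.concurrent.ThreadLocalRandom;

class RandomTimestamps {
    /**
     * Returns a random instant in [startInclusive, endExclusive), truncated to the
     * given granularity (for example DAYS, HOURS, or SECONDS).
     */
    static Instant randomInstant(Instant startInclusive, Instant endExclusive, ChronoUnit granularity) {
        long seconds = ThreadLocalRandom.current()
                .nextLong(startInclusive.getEpochSecond(), endExclusive.getEpochSecond());
        return Instant.ofEpochSecond(seconds).truncatedTo(granularity);
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2000-01-01T00:00:00Z");
        Instant end = Instant.parse("2011-01-01T00:00:00Z");
        // Coarser units give fewer distinct timestamps, so drill-down bottoms out sooner.
        for (ChronoUnit unit : new ChronoUnit[] {ChronoUnit.DAYS, ChronoUnit.HOURS, ChronoUnit.SECONDS}) {
            Date value = Date.from(randomInstant(start, end, unit));
            System.out.println(unit + ": " + value.toInstant());
        }
    }
}
```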
Requesting a Date Facet
Java Query API
It is very simple to add a DateFacetRequest to a QueryRequest. The following QueryTransformer can be inserted at the beginning of the defaultQuery workflow. It will append a DateFacetRequest to every passing query. This example is based on the one in Creating Custom Query Transformers.
package com.acme.examples;

import java.util.ArrayList;
import java.util.List;

import com.attivio.sdk.AttivioException;
import com.attivio.sdk.search.QueryFeedback;
import com.attivio.sdk.search.QueryRequest;
import com.attivio.sdk.search.facet.DateFacetRequest;
import com.attivio.sdk.server.annotation.ConfigurationOption;
import com.attivio.sdk.server.component.query.QueryTransformer;

/** Sample query transformer that adds a date facet request to each query. */
public class AddDateFacetRequest implements QueryTransformer {

  // Name of the date field to facet on; configurable through the option below.
  private String facetField = "date";

  @Override
  public List<QueryFeedback> processQuery(QueryRequest query) throws AttivioException {
    query.addFacet(new DateFacetRequest(facetField));

    // Adding feedback is optional but is useful for letting end users know
    // what happened. Return null if there is no feedback.
    List<QueryFeedback> feedback = new ArrayList<QueryFeedback>();
    feedback.add(new QueryFeedback(this.getClass().getSimpleName(), "sample",
        "added a date facet request on field: " + facetField));
    return feedback;
  }

  @ConfigurationOption(displayName = "Facet Field",
      description = "Date field to add a date facet request for to each query request")
  public String getFacetField() {
    return facetField;
  }

  public void setFacetField(String facetField) {
    this.facetField = facetField;
  }
}
The critical details are in these two lines:
public List<QueryFeedback> processQuery(QueryRequest query) throws AttivioException {
  query.addFacet(new DateFacetRequest(facetField));
The processQuery() method picks up the incoming QueryRequest and adds a DateFacetRequest for the date field. That's really all you have to do. The default date-facet behavior adapts flexibly to most situations.
HTTP REST Query API
A date facet request can be added to a query request through the REST Query API by appending this parameter to the query URL (replace datefield with the name of your date field):
&facet=datefield(dateIntervals=auto)
Setting dateIntervals to "auto" enables the date facet's sophisticated automatic interval discovery mechanism.
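When the query URL is assembled in code, the facet parameter can be appended with a small helper. This is a minimal sketch under stated assumptions: the class name, method name, and the `aie.example.com` URL are hypothetical placeholders rather than a real AIE endpoint, and the sketch percent-encodes the parameter value on the assumption that the server decodes it in the usual way.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class DateFacetUrl {
    /** Appends facet=<dateField>(dateIntervals=auto) to an existing query URL. */
    static String withDateFacet(String queryUrl, String dateField) {
        String facet = dateField + "(dateIntervals=auto)";
        // Use "&" if the URL already has a query string, otherwise start one with "?".
        String separator = queryUrl.contains("?") ? "&" : "?";
        return queryUrl + separator + "facet=" + URLEncoder.encode(facet, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical base URL, used only for illustration.
        System.out.println(withDateFacet("http://aie.example.com/query?q=test", "date"));
    }
}
```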
JSON REST Query API
To add a date-facet request to a query request using JSON, specify the date schema field in the query's facets array with the (dateIntervals=auto) parameter value, as in this example:
{
  "query" : "*:*",
  "facets" : [ "date(dateIntervals=auto)" ]
}
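A client that builds this request body in code might look like the following sketch. It uses plain string assembly to stay dependency-free; the class and method names are hypothetical, and a real client would more likely use a JSON library.

```java
import java.util.List;
import java.util.stream.Collectors;

class DateFacetJson {
    /** Wraps a value in double quotes, escaping backslashes and quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds a query body that requests an automatic date facet on each given field. */
    static String queryBody(String query, List<String> dateFields) {
        String facets = dateFields.stream()
                .map(field -> quote(field + "(dateIntervals=auto)"))
                .collect(Collectors.joining(", "));
        return "{ \"query\" : " + quote(query) + ", \"facets\" : [" + facets + "] }";
    }

    public static void main(String[] args) {
        System.out.println(queryBody("*:*", List.of("date")));
    }
}
```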
You should see the facet and its buckets displayed in the JSON response's facets list as usual, along these lines:
"facets": [
  {
    "label": "Date",
    "buckets": [
      {
        "label": "January 1, 2000",
        "filter": "PHRASE(RANGE(\"2000-01-01T00:00:00\", \"2002-01-01T00:00:00\", upper=exclusive), facet=true, tokenize=false, filter=true)",
        "count": 500
      }
    ]
  }
]
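Because a bucket's label shows only its starting date, a client may want to recover both bounds from the bucket's filter string. The sketch below does this with a regular expression; it assumes the filter always has the RANGE("start", "end", ...) shape shown in the response above, which may not hold for every response, and the class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BucketFilterRange {
    // Captures the two quoted timestamps inside RANGE("...", "...", ...).
    private static final Pattern RANGE =
            Pattern.compile("RANGE\\(\"([^\"]+)\",\\s*\"([^\"]+)\"");

    /** Returns {lower bound, upper bound} parsed from a bucket filter string. */
    static LocalDateTime[] bounds(String filter) {
        Matcher m = RANGE.matcher(filter);
        if (!m.find()) {
            throw new IllegalArgumentException("No RANGE(...) clause in: " + filter);
        }
        return new LocalDateTime[] {LocalDateTime.parse(m.group(1)), LocalDateTime.parse(m.group(2))};
    }

    public static void main(String[] args) {
        // The filter value from the response above, after JSON unescaping.
        String filter = "PHRASE(RANGE(\"2000-01-01T00:00:00\", \"2002-01-01T00:00:00\", "
                + "upper=exclusive), facet=true, tokenize=false, filter=true)";
        LocalDateTime[] b = bounds(filter);
        System.out.println(b[0] + " to " + b[1]);
    }
}
```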
Facet Finder
The Facetfinder will automatically replace facet requests on date fields with a DateFacetRequest (unless that facet has the "facetfinder=false" parameter value set on it).
Viewing Date-Facet Behavior
Note that the Facets page contains examples of facet response values returned from AIE searches.
Viewing Date Facets in SAIL
As of the release of AIE 5.0, the SAIL search interface has not entirely caught up with recent advances in DateFacetRequest. You can use SAIL to verify that your date facet is working, but the labels on the facet buckets may require some interpretation. Specifically, each facet bucket is a range, but the label on the bucket is the bucket's minimum limit. In the example below, "January 1, 2000" is a six-month bucket that begins on January 1.
After ingesting 260 documents with creation dates between 2000 and 2002, inclusive, we searched for *:*.
Date Facet Display | Remarks |
---|---|
(screenshot: half-year buckets) | Each facet bucket in this list represents half a year. (With a wider range of dates, each bucket would have represented a year or some other increment that yields about ten buckets, the default granularity.) We chose January 1, 2001, which covers creation dates falling in the first half of 2001. |
(screenshot: month buckets) | Each bucket in the next display represents a calendar month. We selected June 1, 2001, meaning the entire month of June, 2001. |
(screenshot: week buckets) | These buckets represent "weeks in June, 2001," but the first few days of June are actually in the week of May 27, 2001. That's why we have a "May" bucket in the "weeks of June." We selected the June 3, 2001 bucket. |
(screenshot: day buckets) | These buckets are the days of "the week of June 3, 2001." With sufficiently dense data, the date facet can continue to drill down to individual hours, minutes, seconds and milliseconds. |
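The "May" bucket among the weeks of June can be reproduced with a short calculation. The sketch below lists the start date of every week that overlaps a given month, assuming weeks start on Sunday, as the bucket labels in this example suggest (May 27 and June 3, 2001 are both Sundays). The class and method names are hypothetical, and this illustrates only the calendar arithmetic, not how AIE computes its buckets.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.temporal.TemporalAdjusters;
import java.util.ArrayList;
import java.util.List;

class WeekBuckets {
    /** Returns the start date (a Sunday) of every week that overlaps the given month. */
    static List<LocalDate> weekStarts(YearMonth month) {
        // Step back from the first of the month to the Sunday that begins its week.
        LocalDate start = month.atDay(1).with(TemporalAdjusters.previousOrSame(DayOfWeek.SUNDAY));
        List<LocalDate> starts = new ArrayList<>();
        for (LocalDate d = start; !d.isAfter(month.atEndOfMonth()); d = d.plusWeeks(1)) {
            starts.add(d);
        }
        return starts;
    }

    public static void main(String[] args) {
        // June 1, 2001 is a Friday, so the first week of June starts on May 27.
        System.out.println(weekStarts(YearMonth.of(2001, 6)));
    }
}
```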
See the Debug Search Results (below) to view the returned buckets in more detail.
Viewing Date Facet in Debug Search Results
The date facet information returned in the QueryResponse object contains a full description of each "bucket." To view it, use the Debug Search page. Set the Output Format to Legacy-XML and perform a search. This returns a page of search results in XML format. The date facet information is very near the bottom of the page. Here's an example:
<facets time="0">
  <facet name="date" field="date" displayName="Date" count="2">
    <attributes>
      <string name="date.tz">UTC</string>
      <string name="date.scale">WEEK</string>
      <int name="date.step">1</int>
    </attributes>
    <statistics count="2" min="990613412000" max="991557101000" sum="0.0" sumOfSquares="0.0"/>
    <bucket count="1" ordinal="-1">
      <label>May 20, 2001</label>
      <min class="date">2001-05-20T00:00:00.000</min>
      <max class="date">2001-05-27T00:00:00.000</max>
    </bucket>
    <bucket count="0" ordinal="-1">
      <label>May 27, 2001</label>
      <min class="date">2001-05-27T00:00:00.000</min>
      <max class="date">2001-06-03T00:00:00.000</max>
    </bucket>
    <bucket count="1" ordinal="-1">
      <label>June 3, 2001</label>
      <min class="date">2001-06-03T00:00:00.000</min>
      <max class="date">2001-06-10T00:00:00.000</max>
    </bucket>
  </facet>
</facets>
This excerpt defines a date facet consisting of three buckets, each of which represents a calendar week:
- May 20-27, 2001
- May 27 to June 3, 2001
- June 3-10, 2001
Consult the min and max timestamps for a bucket if there is any question about what range it represents.
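To check bucket ranges programmatically, the min and max values can be read straight from the XML. The following sketch uses only the JDK's DOM parser and java.time; it assumes the element and attribute names shown in the excerpt above (bucket, label, min, max, count), and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BucketRanges {
    /** Describes each bucket in a facets XML fragment by label, range, length, and count. */
    static List<String> describe(String facetsXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(facetsXml)));
            NodeList buckets = doc.getElementsByTagName("bucket");
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < buckets.getLength(); i++) {
                Element bucket = (Element) buckets.item(i);
                LocalDateTime min = LocalDateTime.parse(text(bucket, "min"));
                LocalDateTime max = LocalDateTime.parse(text(bucket, "max"));
                long days = Duration.between(min, max).toDays();
                lines.add(text(bucket, "label") + ": " + min.toLocalDate() + " to " + max.toLocalDate()
                        + " (" + days + " days, count=" + bucket.getAttribute("count") + ")");
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read facet XML", e);
        }
    }

    /** Returns the trimmed text of the first child element with the given tag name. */
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String xml = "<facets><facet name=\"date\"><bucket count=\"1\" ordinal=\"-1\">"
                + "<label>May 20, 2001</label>"
                + "<min class=\"date\">2001-05-20T00:00:00.000</min>"
                + "<max class=\"date\">2001-05-27T00:00:00.000</max>"
                + "</bucket></facet></facets>";
        describe(xml).forEach(System.out::println);
    }
}
```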