Repository

public interface Repository

An object that provides the necessary methods to access a data repository.

Depending on the traversal strategy being used, only a subset of these methods need to be implemented. The methods are typically called from a Connector object such as the ListingConnector or FullTraversalConnector.

Public Method Summary

abstract void
close()
Closes the data repository and releases resources such as connections or executors.
abstract boolean
exists(Item item)
Checks whether the document corresponding to Item exists in the data repository.
abstract CheckpointCloseableIterable<ApiOperation>
getAllDocs(byte[] checkpoint)
Fetches all the documents from the data repository.
abstract CheckpointCloseableIterable<ApiOperation>
getChanges(byte[] checkpoint)
Fetches all changed documents since the last traversal.
abstract ApiOperation
getDoc(Item item)
Fetches a single document from the data repository.
abstract CheckpointCloseableIterable<ApiOperation>
getIds(byte[] checkpoint)
Fetches all the document ids from the data repository.
abstract void
init(RepositoryContext context)
Performs data repository set-up and initialization.

Public Methods

public abstract void close ()

Closes the data repository and releases resources such as connections or executors.
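
As an illustration of the kind of cleanup close() is meant for, the following is a minimal sketch of a class that owns an executor. The class and method names are hypothetical and it does not implement the real Repository interface; the 5-second timeout is an arbitrary assumption.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch, not part of the SDK: owns an executor that close() must release. */
final class ExecutorBackedRepositorySketch {
  private ExecutorService executor;

  /** Plays the role of init(): acquires the resources the repository needs. */
  void init() {
    executor = Executors.newFixedThreadPool(2);
  }

  /** Plays the role of close(): stops accepting work and waits briefly for running tasks. */
  void close() {
    if (executor == null) {
      return; // init() was never called, nothing to release
    }
    executor.shutdown();
    try {
      // Assumed timeout; pick a value that suits the data repository.
      if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
        executor.shutdownNow();
      }
    } catch (InterruptedException e) {
      executor.shutdownNow();
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
    }
  }

  /** True when no executor is accepting work. */
  boolean isClosed() {
    return executor == null || executor.isShutdown();
  }
}
```

Making close() safe to call before init() keeps shutdown paths simple when initialization fails part-way.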

public abstract boolean exists (Item item)

Checks whether the document corresponding to Item exists in the data repository.

Parameters
item the Item to check
Returns
  • true if the document exists in the data repository
Throws
RepositoryException when processing the requested document fails

public abstract CheckpointCloseableIterable<ApiOperation> getAllDocs (byte[] checkpoint)

Fetches all the documents from the data repository.

This method typically returns a RepositoryDoc object for each document that exists in the repository. However, depending on the data repository's capabilities, delete operations might also be returned.

Parameters
checkpoint encoded checkpoint bytes
Returns
Throws
RepositoryException when fetching documents from the data repository fails

public abstract CheckpointCloseableIterable<ApiOperation> getChanges (byte[] checkpoint)

Fetches all changed documents since the last traversal.

This method is only called if the data repository supports document change detection and the Connector implements the IncrementalChangeHandler interface.

The checkpoint is defined and maintained within the Repository for determining and saving the state from the previous traversal. The Cloud Search SDK stores and retrieves the checkpoint from its queue so the Repository doesn't have to manage its state between traversals or connector invocations.
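
Because the checkpoint is an opaque byte array whose content the Repository defines, one possible encoding is a "last seen" timestamp. The helper below is a sketch under that assumption; TimestampCheckpoint is a hypothetical name, and treating a null or empty checkpoint as "no previous traversal" is also an assumption made for this example.

```java
import java.nio.ByteBuffer;
import java.time.Instant;

/** Hypothetical helper, not part of the SDK: keeps a timestamp inside a checkpoint. */
final class TimestampCheckpoint {
  private TimestampCheckpoint() {}

  /** Encodes the timestamp as 8 big-endian bytes holding epoch milliseconds. */
  static byte[] encode(Instant lastSeen) {
    return ByteBuffer.allocate(Long.BYTES).putLong(lastSeen.toEpochMilli()).array();
  }

  /** Decodes a checkpoint produced by encode(). */
  static Instant decode(byte[] checkpoint) {
    if (checkpoint == null || checkpoint.length == 0) {
      // Assumption: no checkpoint means there is no earlier traversal to resume from.
      return Instant.EPOCH;
    }
    if (checkpoint.length != Long.BYTES) {
      throw new IllegalArgumentException("unexpected checkpoint length: " + checkpoint.length);
    }
    return Instant.ofEpochMilli(ByteBuffer.wrap(checkpoint).getLong());
  }
}
```

A getChanges implementation could decode the incoming bytes, query for documents modified after that instant, and attach a freshly encoded checkpoint to the result it returns.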

Parameters
checkpoint encoded checkpoint bytes
Returns
Throws
RepositoryException when change detection fails

public abstract ApiOperation getDoc (Item item)

Fetches a single document from the data repository.

This method typically returns a RepositoryDoc object corresponding to the passed Item. However, if the requested document is no longer in the data repository, a DeleteItem operation might be returned instead.
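
The choice between the two kinds of result can be sketched as follows. To keep the example compilable on its own, it uses hypothetical stand-in types (SketchOperation, UpsertSketch, DeleteSketch) in place of the SDK's ApiOperation, RepositoryDoc, and DeleteItem, and an in-memory map in place of a real data repository.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in for an indexing operation; defined only so this sketch compiles by itself. */
interface SketchOperation {
  String describe();
}

/** Stand-in for "index this document". */
final class UpsertSketch implements SketchOperation {
  private final String id;
  private final String content;

  UpsertSketch(String id, String content) {
    this.id = id;
    this.content = content;
  }

  @Override
  public String describe() {
    return "upsert " + id + ": " + content;
  }
}

/** Stand-in for "remove this document from the index". */
final class DeleteSketch implements SketchOperation {
  private final String id;

  DeleteSketch(String id) {
    this.id = id;
  }

  @Override
  public String describe() {
    return "delete " + id;
  }
}

/** Hypothetical repository backed by an in-memory map. */
final class GetDocSketch {
  private final Map<String, String> store = new HashMap<>();

  GetDocSketch(Map<String, String> initial) {
    store.putAll(initial);
  }

  /** Returns an upsert when the document exists, otherwise a delete for the same id. */
  SketchOperation getDoc(String id) {
    String content = store.get(id);
    if (content == null) {
      return new DeleteSketch(id);
    }
    return new UpsertSketch(id, content);
  }
}
```

Returning a delete for a missing document lets the index drop stale entries without a separate cleanup pass.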

Parameters
item the Item to process
Returns
Throws
RepositoryException when the processing of the Item fails

public abstract CheckpointCloseableIterable<ApiOperation> getIds (byte[] checkpoint)

Fetches all the document ids from the data repository.

This method is typically used by a list or graph traversal connector such as the ListingConnector to push document ids to the Cloud Search queue. These ids are then polled individually for uploading.
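
The two-phase flow described above (push ids first, fetch documents later) can be modeled with a toy example. Everything here is hypothetical: a java.util.ArrayDeque stands in for the Cloud Search queue, a TreeMap stands in for the data repository, and the real SDK performs the polling itself.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.TreeMap;

/** Toy model of a list traversal; none of these names come from the SDK. */
final class ListTraversalSketch {
  private final Map<String, String> source = new TreeMap<>(); // sorted by id
  private final Queue<String> queue = new ArrayDeque<>();

  ListTraversalSketch(Map<String, String> docs) {
    source.putAll(docs);
  }

  /** Phase 1 (the role of getIds): enumerate ids only and push them to the queue. */
  void pushIds() {
    queue.addAll(source.keySet());
  }

  /** Phase 2: poll ids one at a time and fetch each document (the role of getDoc). */
  List<String> pollAndFetch() {
    List<String> uploaded = new ArrayList<>();
    String id;
    while ((id = queue.poll()) != null) {
      uploaded.add(id + "=" + source.get(id));
    }
    return uploaded;
  }
}
```

Separating enumeration from fetching keeps the id listing cheap, since document content is only read when an id is actually polled.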

Parameters
checkpoint encoded checkpoint bytes
Returns
Throws
RepositoryException when fetching document ids from the data repository fails

public abstract void init (RepositoryContext context)

Performs data repository set-up and initialization.

This is the first call made to the Repository, issued from init(ConnectorContext). It indicates that the Configuration is initialized and ready for use.

Parameters
context the RepositoryContext
Throws
RepositoryException when repository initialization fails