Define synonyms

It's common for an organization to have unique terminology or multiple ways to refer to a concept or thing. Define synonyms to establish equivalency between terms and help users find items when searching.

Synonyms are defined by indexing items with the _dictionaryEntry well-known schema.

Items of type _dictionaryEntry require two properties:

Property    Type               Description
_term       string             The term to define. Recommended values are unhyphenated words or phrases without punctuation.
_synonym    string (repeated)  Alternate terms to be included in queries matching the string defined in _term.

When a user includes the value of the _term property in a query, the effective query becomes "term OR synonyms." For example, if the term "scifi" is defined with the synonym "science fiction" then a query containing the word "scifi" matches items containing either "scifi" or "science fiction."

Synonyms are not applied bidirectionally. If the query is instead for "science fiction," Cloud Search doesn't apply any synonyms to the query. The query only matches items containing "science fiction." Items containing "scifi" are omitted.
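The one-directional behavior can be modeled with a small lookup. The following is only an illustrative sketch, not how Cloud Search implements query processing: the class and method names are hypothetical, and it assumes the whole query equals the term, whereas Cloud Search applies synonyms to terms contained in a longer query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model of one-directional synonym expansion (hypothetical, not Cloud Search code). */
final class SynonymExpansionSketch {

  /** Returns the query followed by any synonyms defined for an entry whose _term equals the query. */
  static List<String> expand(Map<String, List<String>> dictionary, String query) {
    List<String> alternatives = new ArrayList<>();
    alternatives.add(query);
    // Only the _term side of an entry triggers expansion; synonym values never do.
    alternatives.addAll(dictionary.getOrDefault(query, List.of()));
    return alternatives;
  }

  public static void main(String[] args) {
    Map<String, List<String>> dictionary = new LinkedHashMap<>();
    dictionary.put("scifi", List.of("science fiction"));

    // prints "scifi OR science fiction"
    System.out.println(String.join(" OR ", expand(dictionary, "scifi")));
    // prints "science fiction" because no entry has that phrase as its _term
    System.out.println(String.join(" OR ", expand(dictionary, "science fiction")));
  }
}
```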

To make both terms interchangeable, define each term separately:

Term             Synonyms
scifi            science fiction
science fiction  scifi

During query processing, hyphenation and other punctuation are removed prior to applying synonyms. The user query "sci-fi" matches the _term "sci fi." To create synonyms for terms which may be hyphenated by users, first normalize the _term to use whitespace instead of hyphens.
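A connector could normalize terms before indexing them so that they line up with the processed query. The helper below is a minimal sketch under an assumption: it treats every ASCII punctuation character as a separator and replaces it with a space. The exact set of characters Cloud Search strips is not specified here, and the class and method names are hypothetical.

```java
/** Hypothetical helper that normalizes a dictionary term before indexing. */
final class TermNormalizerSketch {

  /** Replaces runs of ASCII punctuation with a space, then trims and collapses whitespace. */
  static String normalizeTerm(String raw) {
    return raw
        .replaceAll("\\p{Punct}+", " ") // assumption: all ASCII punctuation acts as a separator
        .trim()
        .replaceAll("\\s+", " ");
  }

  public static void main(String[] args) {
    System.out.println(normalizeTerm("sci-fi")); // prints "sci fi"
  }
}
```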

Continuing the example, the following definitions treat the user queries "sci-fi," "sci fi," "scifi," and "science fiction" as interchangeable:

Term             Synonyms
scifi            science fiction, sci fi
sci fi           science fiction, scifi
science fiction  scifi, sci fi
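Writing one entry per term by hand gets tedious as a group of equivalent terms grows. As a rough sketch, the entries can be generated from a single list of interchangeable terms. The class and method names are hypothetical, and the code assumes the terms in the group are distinct and already normalized.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that expands one group of equivalent terms into per-term entries. */
final class SynonymGroupSketch {

  /** Maps each term in the group to every other term in the group. */
  static Map<String, List<String>> toEntries(List<String> group) {
    Map<String, List<String>> entries = new LinkedHashMap<>();
    for (String term : group) {
      List<String> synonyms = new ArrayList<>(group);
      synonyms.remove(term); // a term is not listed as its own synonym
      entries.put(term, synonyms);
    }
    return entries;
  }

  public static void main(String[] args) {
    // Prints one comma-separated row per term: the term first, then its synonyms.
    toEntries(List.of("scifi", "sci fi", "science fiction"))
        .forEach((term, synonyms) ->
            System.out.println(term + "," + String.join(",", synonyms)));
  }
}
```

Each printed row has the term in the first column and its synonyms in the remaining columns, so every term in the group ends up with its own entry.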

By default, synonyms in any data source apply across an entire domain. Specifically, synonyms are applied across search applications for all searches regardless of the data source. If you want data source-specific synonyms, refer to Define data source-specific synonyms.

Define global synonyms using the Cloud Search SDK

You can use the Content Connector SDK to define terms and their synonyms. See Create a content connector for instructions on building a connector.

The following snippet illustrates building a RepositoryDoc representing the term and synonym based on a CSV file record:

DictionaryConnector.java
/**
 * Creates a document for indexing.
 *
 * For this connector sample, the created document is searchable by the
 * entire domain (domain public). It carries only structured data: the
 * term and its synonyms.
 *
 * @param record The current CSV record to convert
 * @return the fully formed document ready for indexing
 */
private ApiOperation buildDocument(CSVRecord record) {
  // Extract term and synonyms from record
  String term = record.get(0);
  List<String> synonyms = StreamSupport.stream(record.spliterator(), false)
      .skip(1) // Skip term
      .collect(Collectors.toList());

  Multimap<String, Object> structuredData = ArrayListMultimap.create();
  structuredData.put("_term", term);
  structuredData.putAll("_synonym", synonyms);

  if (Configuration.getBoolean("dictionary.attachedToSearchApp", false).get()) {
    structuredData.put("_onlyApplicableForAttachedSearchApplications", true);
  }

  String itemName = String.format("dictionary/%s", term);

  // Using the SDK item builder class to create the item
  Item item =
      IndexingItemBuilder.fromConfiguration(itemName)
          .setItemType(IndexingItemBuilder.ItemType.CONTENT_ITEM)
          .setObjectType("_dictionaryEntry")
          .setValues(structuredData)
          .setAcl(DOMAIN_PUBLIC_ACL)
          .build();

  // Create the fully formed document
  return new RepositoryDoc.Builder()
      .setItem(item)
      .build();
}

Note the following when defining synonyms:

  • Synonym entries are required to be domain public. In the previous example, this is accomplished by setting the ACL to DOMAIN_PUBLIC_ACL.
  • Do not define the following properties in your configuration file, because they override the domain public setting in your code:
    • defaultAcl.mode=FALLBACK
    • defaultAcl.public=true
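A connector could guard against these settings at startup. The check below is a sketch that uses only java.util.Properties rather than the SDK's configuration classes. The class and method names are hypothetical, and it assumes that the mere presence of either key is worth flagging, whatever its value.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical startup check for settings that would override the domain public ACL. */
final class DictionaryConfigCheckSketch {

  /** Returns the ACL-related keys that are present in the given configuration. */
  static List<String> findAclOverrides(Properties config) {
    List<String> found = new ArrayList<>();
    for (String key : List.of("defaultAcl.mode", "defaultAcl.public")) {
      if (config.containsKey(key)) { // assumption: flag the key regardless of its value
        found.add(key);
      }
    }
    return found;
  }

  public static void main(String[] args) throws IOException {
    Properties config = new Properties();
    config.load(new StringReader(
        "defaultAcl.mode=FALLBACK\ndictionary.attachedToSearchApp=true\n"));
    System.out.println(findAclOverrides(config)); // prints "[defaultAcl.mode]"
  }
}
```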

Define data source-specific synonyms

By default, synonyms are applied to all data sources across all search applications. However, you can group synonyms by data source and search application. For example, maybe your organization has separate engineering and sales teams, and you want to provide each team with a different search experience, including job role-specific synonyms. In this case, you could create one search application with an engineering-specific data source and synonyms and another search application with a sales-specific data source and synonyms.

To limit where synonyms are applied, set _onlyApplicableForAttachedSearchApplications to true during indexing. This setting limits the synonyms such that they are only applied to search applications that include a specific data source. For example, adding the following line of code to the previous code sample ensures the indexed synonyms are data source-specific:

structuredData.put("_onlyApplicableForAttachedSearchApplications", true);