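Create an identity connector

By default, Google Cloud Search recognizes only Google identities (users and groups) stored in Google Cloud Directory. Identity connectors sync your enterprise's identities with the Google identities that Google Cloud Search uses.

Google provides the following options for developing identity connectors:

The Identity Connector SDK. This option is for developers who program in Java. The Identity Connector SDK is a wrapper around the REST API that lets you create connectors quickly. To create an identity connector with this SDK, see Create an identity connector using the Identity Connector SDK.
A low-level REST API and API libraries. These options are for developers who don't program in Java, or whose codebase is better suited to a REST API or an API library. To create an identity connector with the REST API, see Directory API: User Accounts for information on mapping users, and the Cloud Identity documentation for information on mapping groups.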

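Create an identity connector using the Identity Connector SDK

A typical identity connector performs the following tasks:

Configures the connector.
Retrieves all users from your enterprise identity system and sends them to Google for syncing with Google identities.
Retrieves all groups from your enterprise identity system and sends them to Google for syncing with Google identities.

Set up dependencies

You must include certain dependencies in your build file to use the SDK. The dependencies for each build environment are listed below: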

Maven

<dependency>
  <groupId>com.google.enterprise.cloudsearch</groupId>
  <artifactId>google-cloudsearch-identity-connector-sdk</artifactId>
  <version>v1-0.0.3</version>
</dependency>

Gradle

 compile group: 'com.google.enterprise.cloudsearch',
         name: 'google-cloudsearch-identity-connector-sdk',
         version: 'v1-0.0.3'

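Create your connector configuration

Every connector has a configuration file containing the parameters the connector uses, such as the ID of your repository. Parameters are defined as key-value pairs, such as api.sourceId=1234567890abcdef.

The Google Cloud Search SDK contains several Google-supplied configuration parameters that are available to all connectors. You must declare the following Google-supplied parameters in your configuration file:

For a content connector, you must declare api.sourceId and api.serviceAccountPrivateKeyFile, because these parameters identify the location of your repository and the private key needed to access it.

For an identity connector, you must declare api.identitySourceId, because this parameter identifies the location of your external identity source. If you are syncing users, you must also declare api.customerId as the unique ID of your enterprise's Google Workspace account.

You don't need to declare the other Google-supplied parameters in your configuration file unless you want to override their default values. For more information about the Google-supplied configuration parameters, such as how to generate certain IDs and keys, see Google-supplied configuration parameters.

You can also define your own repository-specific parameters for use in your configuration file.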
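
As a rough illustration of the required-parameter rules above, the following sketch loads key-value pairs and reports which identity connector keys are missing. It assumes the configuration file uses standard Java .properties syntax, as its file extension suggests. The class IdentityConfigCheck and its methods are hypothetical helpers written for this example and are not part of the SDK, which performs its own validation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Illustrative only: checks a config for the keys an identity connector must declare. */
final class IdentityConfigCheck {

  /** Parses key-value pairs written in java.util.Properties syntax. */
  static Properties parse(String text) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  /** Returns the required keys that are absent or blank. */
  static List<String> missingKeys(Properties props, boolean syncUsers) {
    List<String> required = new ArrayList<>();
    required.add("api.identitySourceId");
    if (syncUsers) {
      required.add("api.customerId"); // required only when users are synced
    }
    List<String> missing = new ArrayList<>();
    for (String key : required) {
      String value = props.getProperty(key);
      if (value == null || value.trim().isEmpty()) {
        missing.add(key);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    // Placeholder values, not real IDs. sample.usersFile is a custom parameter.
    String config =
        "api.identitySourceId=1234567890abcdef\n"
            + "sample.usersFile=users.csv\n";
    System.out.println(missingKeys(parse(config), true));
  }
}
```

Run as written, the example prints [api.customerId], because the placeholder configuration syncs users but declares no customer ID.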

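Pass the configuration file to the connector

Set the system property config to pass the configuration file to your connector. You can set the property with the -D argument when you start the connector. For example, the following command starts the connector with the MyConfig.properties configuration file: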

java -classpath myconnector.jar;... -Dconfig=MyConfig.properties MyConnector

If this argument is missing, the SDK attempts to access a default configuration file named connector-config.properties.
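
The fallback described above can be pictured with a minimal sketch. ConfigPathSketch is a hypothetical class written for this example; it mirrors the documented behavior (read the config system property, otherwise fall back to the default file name) and does not reproduce the SDK's internal code.

```java
import java.util.Properties;

/** Illustrative only: resolves the config file name the way the text describes. */
final class ConfigPathSketch {
  static final String DEFAULT_CONFIG = "connector-config.properties";

  /** Returns the value of the "config" property, or the default name if it is unset. */
  static String resolve(Properties systemProps) {
    return systemProps.getProperty("config", DEFAULT_CONFIG);
  }

  public static void main(String[] args) {
    // Started with -Dconfig=MyConfig.properties, this prints MyConfig.properties.
    // Started without the -D argument, it prints connector-config.properties.
    System.out.println(resolve(System.getProperties()));
  }
}
```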

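Create a full sync identity connector using a template class

The Identity Connector SDK contains a FullSyncIdentityConnector template class that you can use to sync all users and groups from your identity repository with Google identities. This section explains how to use the FullSyncIdentityConnector template to perform a full sync of users and groups from a non-Google identity repository.

This section refers to code snippets from the IdentityConnectorSample.java sample. The sample reads user and group identities from two CSV files and syncs them with Google identities.

Implement the connector's entry point

The entry point to a connector is the main() method. Its primary task is to create an instance of the Application class (IdentityApplication in the sample) and invoke its start() method to run the connector.

Before calling application.start(), use the IdentityApplication.Builder class to instantiate the FullSyncIdentityConnector template. The FullSyncIdentityConnector accepts a Repository object whose methods you implement. The following code snippet shows how to implement the main() method: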

IdentityConnectorSample.java


/**
 * This sample connector uses the Cloud Search SDK template class for a full
 * sync connector. In the full sync case, the repository is responsible
 * for providing a snapshot of the complete identity mappings and
 * group rosters. This is then reconciled against the current set
 * of mappings and groups in Cloud Directory.
 *
 * @param args program command line arguments
 * @throws InterruptedException thrown if an abort is issued during initialization
 */
public static void main(String[] args) throws InterruptedException {
  Repository repository = new CsvRepository();
  IdentityConnector connector = new FullSyncIdentityConnector(repository);
  IdentityApplication application = new IdentityApplication.Builder(connector, args).build();
  application.start();
}

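Behind the scenes, the SDK calls the initConfig() method after your connector's main() method calls Application.build. The initConfig() method performs the following tasks:

Calls the Configuration.isInitialized() method to ensure that the Configuration hasn't already been initialized.
Initializes a Configuration object with the Google-supplied key-value pairs. Each key-value pair is stored in a ConfigValue object within the Configuration object.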

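Implement the `Repository` interface

The sole purpose of the Repository object is to sync repository identities with Google identities. When you use a template, you only need to override certain methods of the Repository interface to create an identity connector. For the FullSyncIdentityConnector, you will likely override the following methods:

The init() method. To set up and initialize the identity repository, override the init() method.
The listUsers() method. To sync all users in the identity repository with Google users, override the listUsers() method.
The listGroups() method. To sync all groups in the identity repository with Google groups, override the listGroups() method.
(optional) The close() method. If you need to clean up the repository, override the close() method. This method is called once during connector shutdown.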
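
The division of labor between the template and your Repository can be sketched in plain Java. Everything in this example is a stand-in: SketchRepository is a hypothetical, heavily simplified interface (the real SDK methods take a RepositoryContext and a checkpoint and return CheckpointCloseableIterable objects), and the call order inside runFullSync is an assumption made for illustration. In a real connector the SDK drives these calls for you.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a template drives the methods that a repository overrides. */
final class LifecycleSketch {

  /** Hypothetical, simplified stand-in for the SDK's Repository interface. */
  interface SketchRepository {
    void init();
    List<String> listUsers();
    List<String> listGroups();
    void close();
  }

  /** Records each call so the order can be inspected. */
  static final class RecordingRepository implements SketchRepository {
    final List<String> calls = new ArrayList<>();

    @Override public void init() { calls.add("init"); }

    @Override public List<String> listUsers() {
      calls.add("listUsers");
      return new ArrayList<>();
    }

    @Override public List<String> listGroups() {
      calls.add("listGroups");
      return new ArrayList<>();
    }

    @Override public void close() { calls.add("close"); }
  }

  /** Assumed order: initialize, list users and groups, then always close. */
  static void runFullSync(SketchRepository repo) {
    repo.init();
    try {
      repo.listUsers();
      repo.listGroups();
    } finally {
      repo.close(); // called once, even if listing fails
    }
  }

  public static void main(String[] args) {
    RecordingRepository repo = new RecordingRepository();
    runFullSync(repo);
    System.out.println(repo.calls);
  }
}
```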

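Get custom configuration parameters

As part of handling your connector's configuration, you need to get any custom parameters from the Configuration object. This task is usually performed in the init() method of your Repository class.

The Configuration class has several methods for getting different data types from a configuration. Each method returns a ConfigValue object. You then use the ConfigValue object's get() method to retrieve the actual value. The following snippet shows how to retrieve the userMappingCsvPath and groupMappingCsvPath values from a Configuration object: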

IdentityConnectorSample.java


/**
 * Initializes the repository once the SDK is initialized.
 *
 * @param context Injected context, contains convenience methods
 *                for building users & groups
 * @throws IOException if unable to initialize.
 */
@Override
public void init(RepositoryContext context) throws IOException {
  log.info("Initializing repository");
  this.context = context;
  userMappingCsvPath = Configuration.getString(
      "sample.usersFile", "users.csv").get().trim();
  groupMappingCsvPath = Configuration.getString(
      "sample.groupsFile", "groups.csv").get().trim();
}

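To get and parse a parameter that contains several values, use one of the Configuration class's type parsers to parse the data into discrete chunks. The following snippet, from the tutorial connector, uses the getMultiValue method to get a list of GitHub repository names: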

GithubRepository.java


ConfigValue<List<String>> repos = Configuration.getMultiValue(
    "github.repos",
    Collections.emptyList(),
    Configuration.STRING_PARSER);
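
To make the idea of a type parser concrete, here is a minimal plain-Java sketch of splitting one raw value into typed chunks. MultiValueSketch and parseMultiValue are hypothetical names, and the comma delimiter is an assumption for this example; it is not a description of how the SDK's getMultiValue is implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Illustrative only: splits a raw config value and parses each chunk. */
final class MultiValueSketch {

  /**
   * Splits raw on commas (assumed delimiter), trims each chunk, drops empty
   * chunks, and applies parser to the rest. Returns defaultValues when raw
   * is null or blank.
   */
  static <T> List<T> parseMultiValue(
      String raw, List<T> defaultValues, Function<String, T> parser) {
    if (raw == null || raw.trim().isEmpty()) {
      return defaultValues;
    }
    List<T> values = new ArrayList<>();
    for (String part : raw.split(",")) {
      String trimmed = part.trim();
      if (!trimmed.isEmpty()) {
        values.add(parser.apply(trimmed));
      }
    }
    return values;
  }

  public static void main(String[] args) {
    // Placeholder repository names for illustration.
    List<String> repos = parseMultiValue(
        "org/repo-one, org/repo-two",
        Collections.<String>emptyList(),
        Function.<String>identity());
    System.out.println(repos);
  }
}
```

Passing a different parser, such as Integer::valueOf, yields a typed list from the same raw text, which is the role a parser argument plays in the snippet above.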

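Get the mapping for all users

Override listUsers() to retrieve the mapping for all users from your identity repository. The listUsers() method accepts a checkpoint representing the last identity to be synced. If the process is interrupted, the checkpoint can be used to resume syncing. For each user in your repository, perform these steps in the listUsers() method:

Get a mapping consisting of the Google identity and the associated external identity.
Package the pair into the iterator returned by the listUsers() method.

Get a user mapping

The following code snippet demonstrates how to retrieve the identity mappings stored in a CSV file: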

IdentityConnectorSample.java


/**
 * Retrieves all user identity mappings for the identity source. For the
 * full sync connector, the repository must provide a complete snapshot
 * of the mappings. This is reconciled against the current mappings
 * in Cloud Directory. All identity mappings returned here are
 * set in Cloud Directory. Any previously mapped users that are omitted
 * are unmapped.
 *
 * The connector does not create new users. All users are assumed to
 * exist in Cloud Directory.
 *
 * @param checkpoint Saved state if paging over large result sets. Not used
 *                   for this sample.
 * @return Iterator of user identity mappings
 * @throws IOException if unable to read user identity mappings
 */
@Override
public CheckpointCloseableIterable<IdentityUser> listUsers(byte[] checkpoint)
    throws IOException {
  List<IdentityUser> users = new ArrayList<>();
  try (Reader in = new FileReader(userMappingCsvPath)) {
    // Read user mappings from CSV file
    CSVParser parser = CSVFormat.RFC4180
        .withIgnoreSurroundingSpaces()
        .withIgnoreEmptyLines()
        .withCommentMarker('#')
        .parse(in);
    for (CSVRecord record : parser.getRecords()) {
      // Each record is in form: "primary_email", "external_id"
      String primaryEmailAddress = record.get(0);
      String externalId = record.get(1);
      if (primaryEmailAddress.isEmpty() || externalId.isEmpty()) {
        // Skip any malformed mappings
        continue;
      }
      log.info(() -> String.format("Adding user %s/%s",
          primaryEmailAddress, externalId));

      // Add the identity mapping
      IdentityUser user = context.buildIdentityUser(
          primaryEmailAddress, externalId);
      users.add(user);
    }
  }
  // ...
}
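
The row-filtering logic in the snippet above can be tried without the SDK or a CSV library. The following sketch is a simplified stand-in: UserMappingSketch is a hypothetical class, it splits on plain commas rather than implementing RFC 4180 quoting, and it returns string pairs instead of IdentityUser objects. It also skips rows with fewer than two columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: extracts (primary email, external ID) pairs from CSV lines. */
final class UserMappingSketch {

  /** Returns one {primaryEmail, externalId} array per well-formed row. */
  static List<String[]> parse(List<String> lines) {
    List<String[]> mappings = new ArrayList<>();
    for (String line : lines) {
      String trimmed = line.trim();
      if (trimmed.isEmpty() || trimmed.startsWith("#")) {
        continue; // skip blank lines and comments
      }
      String[] cols = trimmed.split(",", -1);
      if (cols.length < 2) {
        continue; // skip rows without an external ID column
      }
      String email = cols[0].trim();
      String externalId = cols[1].trim();
      if (email.isEmpty() || externalId.isEmpty()) {
        continue; // skip malformed mappings
      }
      mappings.add(new String[] {email, externalId});
    }
    return mappings;
  }

  public static void main(String[] args) {
    // Placeholder data for illustration.
    List<String> lines = Arrays.asList(
        "# primary_email, external_id",
        "user1@example.com, ext1",
        "user2@example.com,",
        "user3@example.com, ext3");
    for (String[] mapping : parse(lines)) {
      System.out.println(mapping[0] + " -> " + mapping[1]);
    }
  }
}
```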

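Package a user mapping into an iterator

The listUsers() method returns an Iterator, specifically a CheckpointCloseableIterable, of IdentityUser objects. You can use the CheckpointCloseableIterableImpl.Builder class to construct and return an iterator. The following code snippet shows how to package each mapping into a list and build the iterator from that list: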

IdentityConnectorSample.java


CheckpointCloseableIterable<IdentityUser> iterator =
  new CheckpointCloseableIterableImpl.Builder<IdentityUser>(users)
      .setHasMore(false)
      .setCheckpoint((byte[])null)
      .build();
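
The sample returns every mapping in a single pass, so it sets hasMore to false and passes a null checkpoint. For large result sets, the two fields could carry paging state instead. The sketch below shows one way that might work in plain Java. CheckpointPagingSketch and its Page class are hypothetical, and encoding the checkpoint as a 4-byte offset is an assumption for this example; the SDK treats the checkpoint as opaque bytes that you define.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: pages through a list using a byte[] checkpoint. */
final class CheckpointPagingSketch {

  /** One page of results plus the state needed to fetch the next page. */
  static final class Page {
    final List<String> items;
    final boolean hasMore;
    final byte[] checkpoint; // null when there is nothing left to resume

    Page(List<String> items, boolean hasMore, byte[] checkpoint) {
      this.items = items;
      this.hasMore = hasMore;
      this.checkpoint = checkpoint;
    }
  }

  /** Returns up to pageSize items, starting at the offset stored in checkpoint. */
  static Page listPage(List<String> all, byte[] checkpoint, int pageSize) {
    int start = (checkpoint == null) ? 0 : ByteBuffer.wrap(checkpoint).getInt();
    start = Math.min(start, all.size()); // guard against a stale offset
    int end = Math.min(start + pageSize, all.size());
    boolean hasMore = end < all.size();
    byte[] next = hasMore ? ByteBuffer.allocate(4).putInt(end).array() : null;
    return new Page(new ArrayList<>(all.subList(start, end)), hasMore, next);
  }

  public static void main(String[] args) {
    List<String> users = Arrays.asList("u1", "u2", "u3", "u4", "u5");
    byte[] checkpoint = null;
    Page page;
    do {
      page = listPage(users, checkpoint, 2);
      System.out.println(page.items + " hasMore=" + page.hasMore);
      checkpoint = page.checkpoint; // resume from here on the next call
    } while (page.hasMore);
  }
}
```

With five placeholder users and a page size of two, the loop prints three pages, and only the last one reports hasMore=false.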

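Get a group

Override listGroups() to retrieve all groups and their members from your identity repository. The listGroups() method accepts a checkpoint representing the last identity to be synced. If the process is interrupted, the checkpoint can be used to resume syncing. For each group in your repository, perform these steps in the listGroups() method:

Get the group and its members.
Package each group and its members into the iterator returned by the listGroups() method.

Get the group identities

The following code snippet demonstrates how to retrieve the groups and members stored in a CSV file: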

IdentityConnectorSample.java


/**
 * Retrieves all group rosters for the identity source. For the
 * full sync connector, the repository must provide a complete snapshot
 * of the rosters. This is reconciled against the current rosters
 * in Cloud Directory. All groups and members returned here are
 * set in Cloud Directory. Any previously created groups or members
 * that are omitted are removed.
 *
 * @param checkpoint Saved state if paging over large result sets. Not used
 *                   for this sample.
 * @return Iterator of group rosters
 * @throws IOException if unable to read groups
 */
@Override
public CheckpointCloseableIterable<IdentityGroup> listGroups(byte[] checkpoint)
    throws IOException {
  List<IdentityGroup> groups = new ArrayList<>();
  try (Reader in = new FileReader(groupMappingCsvPath)) {
    // Read group rosters from CSV
    CSVParser parser = CSVFormat.RFC4180
        .withIgnoreSurroundingSpaces()
        .withIgnoreEmptyLines()
        .withCommentMarker('#')
        .parse(in);
    for (CSVRecord record : parser.getRecords()) {
      // Each record is in form: "group_id", "member"[, ..., "memberN"]
      String groupName = record.get(0);
      log.info(() -> String.format("Adding group %s", groupName));
      // Parse the remaining columns as group memberships
      Supplier<Set<Membership>> members = new MembershipsSupplier(record);
      IdentityGroup group = context.buildIdentityGroup(groupName, members);
      groups.add(group);
    }
  }
  // ...

}
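
The snippet above passes a Supplier for the members rather than a ready-made set, which allows the remaining columns to be parsed only when the roster is requested. The MembershipsSupplier class itself is not shown in this section, so the sketch below is a hypothetical analogue in plain Java: RosterSketch works on a list of column strings and returns member names as strings instead of Membership objects.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/** Illustrative only: lazily turns the trailing columns of a row into a member set. */
final class RosterSketch {

  /** Defers parsing of the member columns until get() is called. */
  static Supplier<Set<String>> membersOf(final List<String> record) {
    return () -> {
      Set<String> members = new LinkedHashSet<>();
      for (int i = 1; i < record.size(); i++) { // column 0 is the group ID
        String member = record.get(i).trim();
        if (!member.isEmpty()) {
          members.add(member); // the set drops duplicate members
        }
      }
      return members;
    };
  }

  public static void main(String[] args) {
    // Placeholder row in the form: "group_id", "member"[, ..., "memberN"]
    List<String> record = Arrays.asList("group1", "alice", " bob ", "alice");
    Supplier<Set<String>> members = membersOf(record);
    System.out.println(record.get(0) + " -> " + members.get());
  }
}
```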

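Package the group and members into an iterator

The listGroups() method returns an Iterator, specifically a CheckpointCloseableIterable, of IdentityGroup objects. You can use the CheckpointCloseableIterableImpl.Builder class to construct and return an iterator. The following code snippet shows how to package each group and its members into a list and build the iterator from that list: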

IdentityConnectorSample.java


CheckpointCloseableIterable<IdentityGroup> iterator =
   new CheckpointCloseableIterableImpl.Builder<IdentityGroup>(groups)
      .setHasMore(false)
      .setCheckpoint((byte[])null)
      .build();

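Next steps

Here are a few next steps you might take:

(optional) Implement the close() method to release any resources before shutdown.
(optional) Create a content connector using the Content Connector SDK.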
