After you register a CDH or CDP cluster in DataWorks, you can map a DataWorks tenant member's Alibaba Cloud account to a specified identity account in the cluster. This mapping allows the tenant member to access the CDH or CDP cluster using the mapped cluster identity. The configuration procedure is similar for both CDP and CDH clusters. This topic uses a CDH cluster as an example to describe the procedure in detail.
Mapping types
When you register a CDH cluster in DataWorks, you can use the Default Access Identity parameter to configure the account used to execute CDH cluster tasks. For more information, see Configure the default access identity for the cluster. The following table describes the accounts that you can configure for the Default Access Identity parameter and their supported mapping types.
Supported account type | Description | Supported mapping type | Description
Cluster account | The cluster account that you specify is used to execute the code of CDH tasks in the CDH cluster regardless of who runs the CDH tasks in DataWorks. For example, the configured cluster account is used to run CDH tasks that are submitted by an Alibaba Cloud account, a RAM user that is assigned the Workspace Administrator role, or a RAM user that is assigned only the Developer role. | No Authentication | If you set Default Access Identity to a cluster account, the Mapping Type between the Alibaba Cloud account and the cluster account is No Authentication by default. Important: If you set Default Access Identity to a mapping account, you cannot set the Mapping Type to No Authentication. Otherwise, the CDH task fails because no access identity is available for the Alibaba Cloud account. You can select System Account Mapping, OPEN LDAP Account Mapping, or Kerberos Account Mapping as needed.
Mapping account | When different workspace members run CDH tasks in DataWorks, the CDH system account, Kerberos account, or OPEN LDAP account that is mapped to their Alibaba Cloud accounts (Alibaba Cloud account or RAM user) is used to execute the task code in the cluster. If you select a mapping account, you must go to the cluster account mapping configuration page to configure the CDH account to which the Alibaba Cloud account is mapped after you register the CDH cluster. Important: An Alibaba Cloud account fails to run a CDH task in the following cases: | System Account Mapping |
 | | OPEN LDAP Account Mapping |
 | | Kerberos Account Mapping |
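The relationship between the Default Access Identity and the mapping type can be sketched as a small decision function. This is only an illustration of the rules in the table, not DataWorks code: the enum names, the `resolve_cluster_identity` function, and the `mappings` dictionary are all hypothetical, and the sketch assumes that `mappings` already contains every configured entry (including same-name entries).

```python
from enum import Enum
from typing import Dict, Optional


class DefaultAccessIdentity(Enum):
    CLUSTER_ACCOUNT = "cluster_account"
    MAPPING_ACCOUNT = "mapping_account"


class MappingType(Enum):
    NO_AUTHENTICATION = "no_authentication"
    SYSTEM_ACCOUNT = "system_account"
    OPEN_LDAP = "open_ldap"
    KERBEROS = "kerberos"


def resolve_cluster_identity(
    default_identity: DefaultAccessIdentity,
    mapping_type: MappingType,
    submitter: str,
    cluster_account: Optional[str] = None,
    mappings: Optional[Dict[str, str]] = None,
) -> str:
    """Return the cluster account that would execute a task (illustrative only)."""
    if default_identity is DefaultAccessIdentity.CLUSTER_ACCOUNT:
        # The fixed cluster account is used regardless of who submits the task.
        if not cluster_account:
            raise ValueError("a cluster account must be specified")
        return cluster_account

    # Mapping account: No Authentication would leave the submitter
    # without any access identity, so it is rejected here.
    if mapping_type is MappingType.NO_AUTHENTICATION:
        raise ValueError("No Authentication cannot be used with a mapping account")

    mapped = (mappings or {}).get(submitter)
    if mapped is None:
        # Assumption of this sketch: a missing entry means the task cannot run.
        raise LookupError(f"no cluster account is mapped to {submitter}")
    return mapped
```

With a cluster account, the submitter is ignored; with a mapping account, the result depends on the entry configured for the submitting Alibaba Cloud account or RAM user.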
Prerequisites
The CDH cluster account that you want to map is created.
Before you use Kerberos account mapping, ensure that the Kerberos service is enabled for the cluster.
Before you use OPEN LDAP account mapping, ensure that the OPEN LDAP service is enabled for the cluster.
You have attached a CDH computing resource to a DataWorks workspace.
Step 1: Go to the cluster account mapping configuration page
Log on to the DataWorks console. After you switch to the destination region, in the navigation pane on the left, click . Select the desired workspace from the drop-down list and click Go To Management Center.
In the navigation pane on the left, click Computing Resources.
On the computing resources page, find the target CDH cluster and click the account mapping entry under the cluster name.
On the page that appears, you can configure the mapping between a DataWorks Alibaba Cloud account and a CDH cluster account. This mapping determines which cluster account is used to run CDH tasks that are submitted by the Alibaba Cloud account.
Step 2: Set up cluster account mapping
Follow these steps to configure the cluster account that is used to execute CDH tasks in DataWorks.
Select a mapping type.
You can select No Authentication, System Account Mapping, OPEN LDAP Account Mapping, or Kerberos Account Mapping. For more information, see Mapping types.
Configure the cluster account mapping.
Configure the account mapping based on the selected type.
Note: If you select No Authentication, no mapping configuration is required. The platform runs tasks using the cluster account that was configured in the basic information when the CDH or CDP cluster was registered. For more information, see Attach a CDH computing resource.
System Account Mapping
Configure the mapping between an Alibaba Cloud account (an Alibaba Cloud account or a RAM user) and a CDH cluster system account. Add the required account information as instructed on the page.
Run tasks using an Alibaba Cloud account: Select an Alibaba Cloud account and configure the mapped cluster system account.
Run tasks using a RAM user: Select a RAM user and configure the mapped cluster system account. The following two types of mappings are supported:
Same-name mapping (default): Runs a CDH task using a cluster system account that has the same name as the mapped RAM user. For example:
RAM user: ram_user_1@xxx.onaliyun.com
Cluster account with the same name: ram_user_1
CDH tasks submitted using ram_user_1@xxx.onaliyun.com are run by ram_user_1.
Note: By default, when you use a RAM user to run a CDH task in DataWorks, a cluster system account with the same name is used to run the task in the CDH cluster. You can also use a cluster account that has a different name.
To prevent task failures, ensure that an account with the same name exists in the CDH cluster. You can go to the
page to configure the account.
Different-name mapping: Runs a CDH task using a cluster system account that has a different name from the mapped RAM user. Configure the mapping as instructed on the page.
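The two mapping types for RAM users can be illustrated with a short sketch. It assumes, based on the example above, that the same-name cluster account is the part of the RAM user name before the `@` sign. The function names and the `overrides` dictionary are hypothetical and only stand in for what is configured on the page.

```python
from typing import Dict, Optional


def same_name_account(ram_user: str) -> str:
    """Derive the same-name cluster account: the part before the first '@'."""
    name, _sep, _domain = ram_user.partition("@")
    if not name:
        raise ValueError(f"cannot derive an account name from {ram_user!r}")
    return name


def system_account_for(ram_user: str, overrides: Optional[Dict[str, str]] = None) -> str:
    """Use a different-name mapping when one is configured, else same-name."""
    if overrides and ram_user in overrides:
        return overrides[ram_user]
    return same_name_account(ram_user)


# Same-name mapping (default)
print(system_account_for("ram_user_1@xxx.onaliyun.com"))  # ram_user_1

# Different-name mapping; "etl_runner" is a made-up cluster account
print(system_account_for(
    "ram_user_1@xxx.onaliyun.com",
    {"ram_user_1@xxx.onaliyun.com": "etl_runner"},
))  # etl_runner
```

In either case, the resulting account must already exist in the CDH cluster for the task to run.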
Kerberos Account Mapping
Configure the mapping between an Alibaba Cloud account (an Alibaba Cloud account or a RAM user) and a CDH cluster Kerberos account. A Kerberos account uses the Instance name@Realm name format, for example, cdn_test@HADOOP.COM.
Kerberos authentication requires a keytab authentication file and a krb5.conf configuration file.
krb5.conf configuration file: Stores the configurations of the Key Distribution Center (KDC) server.
keytab authentication file: Stores the identity verification credentials of the resource entity. The file name must consist of the Kerberos account followed by the .keytab extension.
Add the required account and upload the required files as instructed on the page.
Note: If Kerberos authentication is enabled for Hive MetaStore of the CDH cluster, you must use this mapping type. Otherwise, metadata acquisition is affected.
If you use the Presto component and select Kerberos account mapping, you must configure the Config.Properties and Presto.Jks files in the basic information of the cluster.
Ensure that the Kerberos service is enabled for the cluster.
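Before you enter a Kerberos account on the page, it can help to check that it follows the Instance name@Realm name format described above. The following is a minimal sketch of such a check. The `KerberosAccount` type and the `parse_kerberos_account` function are hypothetical helpers, not part of DataWorks or of any Kerberos library, and the check covers only the shape of the string.

```python
from typing import NamedTuple


class KerberosAccount(NamedTuple):
    instance: str
    realm: str


def parse_kerberos_account(account: str) -> KerberosAccount:
    """Split an account of the form 'Instance name@Realm name'."""
    # Split on the last '@' so that the realm is everything after it.
    instance, sep, realm = account.rpartition("@")
    if not sep or not instance or not realm:
        raise ValueError(f"{account!r} is not in the Instance name@Realm name format")
    return KerberosAccount(instance, realm)


parsed = parse_kerberos_account("cdn_test@HADOOP.COM")
print(parsed.instance)  # cdn_test
print(parsed.realm)     # HADOOP.COM
```

A string without an `@` sign, or with an empty part on either side of it, is rejected with a `ValueError`.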
OPEN LDAP Account Mapping
Configure the mapping between an Alibaba Cloud account (an Alibaba Cloud account or a RAM user) and a CDH cluster OPEN LDAP account. Add the required account information as instructed on the page.
Note: If you use the Presto component and select OPEN LDAP account mapping, you must configure the Config.Properties and Presto.Jks files in the basic information of the cluster.
Ensure that the OPEN LDAP service is enabled for the cluster.
Click Finish Editing. The account mapping is now configured. Tasks run by an Alibaba Cloud account will use the mapped cluster account.