You can use exclusive resource groups for Data Integration in DataWorks to allocate independent computing resources for data synchronization tasks. This improves the efficiency and stability of task execution. After you purchase an exclusive resource group, you must configure network bindings and whitelists before you can use it. This topic describes how to purchase and use an exclusive resource group for Data Integration.
If you did not activate DataWorks in any region before June 10, 2024, you can purchase and use only Serverless resource groups after you activate DataWorks. You cannot purchase or use legacy resource groups. If you are an existing DataWorks user and need to switch to Serverless resource groups, see Upgrading legacy resource groups.
Background information
Exclusive resource groups for Data Integration are no longer recommended in DataWorks. We recommend that you use serverless resource groups. Serverless resource groups support the core features of old-version resource groups, including exclusive resource groups for scheduling, exclusive resource groups for Data Integration, exclusive resource groups for DataService Studio, and shared resource groups. You can perform operations such as data synchronization, task scheduling and running, and API calling and management by using one serverless resource group.
Prerequisites
You must understand the details of an exclusive resource group for Data Integration to plan the required specifications and subscription duration based on your business scenarios. These details include performance, such as the number of tasks that can run concurrently for different specifications, and billing, such as how charges vary based on specifications. For more information, see Billing of exclusive resource groups for Data Integration.
Precautions
Exclusive resource groups for Data Integration support data synchronization in complex network environments. For example, you can use an exclusive resource group for Data Integration to synchronize data across cloud environments, such as Alibaba Finance Cloud and Alibaba Gov Cloud, across Alibaba Cloud accounts, or to and from data centers. Before you run a synchronization task, make sure that network connections are established between your resource group and the data sources and that the IP address whitelists of the data sources allow access from the resource group. If the network connections are not established, your synchronization task cannot run. For more information about network connectivity solutions for an exclusive resource group for Data Integration and a data source, and about precautions for configuring an IP address whitelist for a data source, see Network connectivity solutions.
If you do not need to connect to a database instance and want to resolve only task latency caused by insufficient public resources, you can ignore the network-related issues in this topic. You can purchase an exclusive resource group for Data Integration in any zone without configuring network settings.
By default, an exclusive resource group for Data Integration can access the Internet. However, the quality of Internet access cannot be guaranteed because the resource group uses Internet Shared Bandwidth. If you have a strong dependency on the Internet, use a serverless resource group.
Limits
Only users with the AliyunBSSOrderAccess and AliyunDataWorksFullAccess permissions can purchase resource groups.
Only workspace administrators can bind a resource group to a workspace or change the binding.
For more information about the permissions required to perform operations on the Resource Groups page, see Resource group permission control policies.
For more information about how to create a custom policy and grant permissions, see Create a custom policy (Optional).
An exclusive resource group for Data Integration with 4 vCPUs and 8 GiB of memory can be bound to a maximum of two VPCs. Other specifications can be bound to a maximum of three VPCs.
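The VPC binding limit above can be expressed as a small planning check. The following is a minimal sketch, not part of any DataWorks SDK: the function names `max_vpc_bindings` and `can_add_binding` and the VPC IDs are hypothetical, and the limits are taken only from the statement in this topic.

```python
def max_vpc_bindings(vcpus: int, memory_gib: int) -> int:
    """Return the VPC binding limit stated in this topic for a specification.

    Hypothetical helper: the 4 vCPU / 8 GiB specification allows two VPCs,
    and every other specification allows three.
    """
    if (vcpus, memory_gib) == (4, 8):
        return 2
    return 3


def can_add_binding(vcpus: int, memory_gib: int, bound_vpcs: list) -> bool:
    """Check whether one more VPC can be bound without exceeding the limit."""
    # Count distinct VPC IDs so that a repeated ID is not counted twice.
    return len(set(bound_vpcs)) < max_vpc_bindings(vcpus, memory_gib)


if __name__ == "__main__":
    # The VPC IDs below are placeholders for illustration only.
    print(can_add_binding(4, 8, ["vpc-a", "vpc-b"]))
    print(can_add_binding(8, 16, ["vpc-a", "vpc-b"]))
```

A check like this is useful when you plan how many data source networks a single resource group must reach before you choose a specification.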
Step 1: Purchase a resource group
DataWorks provides subscription-based exclusive resource groups, which you must purchase separately. Follow these steps to purchase a resource group.
Only users with the AliyunBSSOrderAccess and AliyunDataWorksFullAccess permissions can purchase a resource group.
Log on to the DataWorks console.
In the navigation pane on the left, click Resource Group. On the Exclusive Resource Groups tab of the Resource Groups page, click Create Resource Group For Data Integration Of Old Version. On the buy page, configure the parameters. The key parameters are described as follows.
Parameter | Description |
Region And Zone | The region where the exclusive resource group will be used. Note: An exclusive resource group for Data Integration cannot be shared across regions. For example, exclusive resource groups in the China (Shanghai) region can be used only by workspaces in the China (Shanghai) region. |
Dedicated Resource Type | Select Exclusive Resource Groups For Data Integration. |
Resource Group Name | Enter a name for the resource group. The name must be unique within the tenant. Duplicate names cause an error when you confirm the operation. Note: A tenant is an Alibaba Cloud account. Each tenant can have multiple RAM users. |
Resource Group Description | A brief description of the resource group. |
Billing Period | Exclusive resource groups are a subscription service. To prevent service interruptions, select Auto-renewal. After the resource group is created, you can also go to the Alibaba Cloud Renewal Management page to enable or disable the auto-renewal service. For more information, see Billing deactivation instructions. |
Configure other items as needed.
Click Buy Now and follow the on-screen instructions to complete the purchase of the exclusive resource group for Data Integration.
After the purchase, DataWorks begins to initialize the exclusive resource group. The resource group is added to the console after its status changes to Running.
Note: The exclusive resource group takes approximately 20 minutes to initialize. Wait until its status changes to Running.
After the exclusive resource group is added to the console, you must bind it to a workspace before you can select it in the task configuration.
Step 2: Bind the resource group to a workspace
An exclusive resource group for Data Integration must be bound to a workspace before it can be used. A single exclusive resource group can be bound to multiple workspaces but cannot be used across regions. For example, a resource group in the China (Shanghai) region can be bound only to workspaces in the China (Shanghai) region. It cannot be bound to workspaces in other regions. Follow these steps to bind the resource group to a workspace.
Only workspace administrators can bind a resource group to a workspace or change the binding.
Log on to the DataWorks console.
On the Exclusive Resource Groups tab of the Resource Groups page, click Attach Workspace for the corresponding resource group.
On the Attach Workspace page, find the workspace to which you want to bind the exclusive resource group and click Attach.
Step 3: Configure the network
Bind a VPC
Exclusive resources are deployed in a VPC that is managed by DataWorks and isolated from other network environments. To use an exclusive resource, you must configure network settings by binding the resource to a VPC that can connect to your data sources. This establishes network connectivity with the data sources. Follow these steps to bind a VPC.
An exclusive resource group for Data Integration with 4 vCPUs and 8 GiB of memory can be bound to a maximum of two VPCs. Other specifications can be bound to a maximum of three VPCs.
Log on to the DataWorks console.
On the Exclusive Resource Groups tab of the Resource Groups page, click Network Settings for the target resource group to go to the VPC binding page.
Before you bind a resource group, you must use your Alibaba Cloud account to log on to the Resource Access Management (RAM) console and authorize DataWorks to access your cloud resources. You can go to the Cloud Resource Access Authorization page to grant the authorization. Alternatively, you can grant the authorization in the dialog box that appears the first time you log on to the DataWorks console with your Alibaba Cloud account.
Bind a VPC.
Click Add Binding in the upper-left corner of the VPC Binding page. In the Add VPC Binding dialog box, configure the parameters. The parameters vary based on the network environment.
Note: For scenarios such as Alibaba Cloud instances and self-managed ECS instances, you can select a network connectivity solution and configure settings based on whether the DataWorks workspace and the data source are under the same Alibaba Cloud account.
Parameter | Configuration (data source and exclusive resource group in the same account and region) | Configuration (data source and exclusive resource group in different accounts or regions) |
Attach VPC | If the data source and the exclusive resource group are in the same Alibaba Cloud account, select the VPC where the data source resides. If they are in different Alibaba Cloud accounts, follow the instructions for the scenario where they are in different regions. | If your data source and exclusive resource group are in different regions or belong to different Alibaba Cloud accounts, select a VPC that is connected to the network of the target data source. For example, if the data source is not in an Alibaba Cloud VPC, click Create VPC to create a VPC for the exclusive resource group. After the VPC is created, select the new VPC or a VPC that is already connected to the target database network. Note: If the DataWorks workspace and the data source are in different regions or belong to different Alibaba Cloud accounts, use VPN Gateway or Express Connect to connect the VPC that is bound to the exclusive resource group with the VPC of the data source. Then, you must manually add a route that points to the IP address of the target database to ensure connectivity between the two networks. For more information, see Network connectivity solutions. |
Zone | Select the zone where the database resides. | Select a zone that has a network connection to the target database. |
VSwitch | If you select the VPC where the data source resides, select the vSwitch to which the data source is connected. Note: When you bind the exclusive resource group for Data Integration to the VPC of the data source and any vSwitch in that VPC, a route to the VPC's CIDR block is automatically added. This ensures that the resource group can access data sources in the VPC. | Select a vSwitch that is connected to the target database network. If no vSwitch is available, click Create VSwitch to create one for the exclusive resource group. After you create the vSwitch, select it. |
Click OK to bind the VPC.
Note: If the data source and the exclusive resource group are in different regions or under different Alibaba Cloud accounts, you must add a routing rule that points to the target database IP address after you bind the VPC.
Optional: Configure hosts.
If your data source is accessed by a domain name instead of an IP address, you must configure a host mapping. Otherwise, the connectivity test fails when you add the data source using its domain name.
Click Hostname-to-IP Mapping and then click Add in the upper-left corner. In the Create Hostname-to-IP Mapping dialog box, configure the parameters. The parameters are described as follows.
Parameter | Description |
IP Address | The actual IP address of the data source. |
Hostname | The hostname used to access the data source. To add multiple hostnames, place each one on a separate row. |
To map the domain name to multiple IP addresses, click Add.
Note: The IP address and domain names in a new host configuration cannot be the same as those in an existing host configuration.
In a host configuration, the relationship between an IP address and domain names is one-to-many. One IP address can be mapped to multiple domain names, but one domain name can be mapped to only one IP address.
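The mapping rules above (one IP address to many hostnames, each hostname to exactly one IP address, and no repeated entries) can be checked before you enter the values in the console. The following is a minimal sketch under stated assumptions: `validate_host_mappings` is a hypothetical helper, the input shape is invented for illustration, and hostnames are compared case-insensitively, which this topic does not specify.

```python
def validate_host_mappings(mappings):
    """Return a list of problems found in planned hostname-to-IP mappings.

    `mappings` is a list of (ip, [hostname, ...]) tuples, one per planned
    host configuration entry. An empty result means that no rule stated
    in this topic is violated.
    """
    problems = []
    seen_ips = set()
    host_to_ip = {}
    for ip, hostnames in mappings:
        # An IP address must not repeat across host configuration entries.
        if ip in seen_ips:
            problems.append(f"IP address {ip} appears in more than one entry")
        seen_ips.add(ip)
        for name in hostnames:
            key = name.lower()  # assumption: hostnames are case-insensitive
            previous = host_to_ip.get(key)
            if previous is not None:
                # A hostname can be mapped to only one IP address.
                problems.append(f"hostname {name} is already mapped to {previous}")
            else:
                host_to_ip[key] = ip
    return problems


if __name__ == "__main__":
    # Example addresses and hostnames are placeholders.
    planned = [
        ("10.0.0.5", ["db.internal", "db-replica.internal"]),
        ("10.0.0.6", ["db.internal"]),
    ]
    for problem in validate_host_mappings(planned):
        print(problem)
```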
Add to a whitelist
Even if the exclusive resource group for Data Integration and the data source are in the same zone and connected by a VPC and vSwitch, the network may be unreachable. This can occur due to whitelist restrictions on the data source. In this case, you must add the following information to the IP address whitelist of the data source.
To connect the data source and the exclusive resource group over an internal network, add the vSwitch CIDR block of the bound VPC to the IP address whitelist of the data source.
After you bind a virtual private cloud (VPC) to the exclusive resource group, on the Exclusive Resource Groups tab of the Resource Groups page, click Network Settings for the target resource group. On the VPC Binding tab, you can view the VSwitch CIDR Block.
If you want to establish a network connection between the exclusive resource group and your data source over the Internet, you must add the EIP of the resource group to the IP address whitelist of your data source.
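The two whitelist cases above can be sketched with the Python standard library `ipaddress` module. This is an illustrative sketch only: `whitelist_entries` and `is_allowed` are hypothetical helpers, the CIDR block and EIP values are placeholders, and the code does not call any DataWorks or data source API.

```python
import ipaddress


def whitelist_entries(access_path, vswitch_cidr=None, eip=None):
    """Return the entries to add to the data source whitelist.

    access_path is "internal" (add the vSwitch CIDR block of the bound VPC)
    or "internet" (add the EIP of the resource group), as described above.
    """
    if access_path == "internal":
        if vswitch_cidr is None:
            raise ValueError("vswitch_cidr is required for internal access")
        # ip_network validates the CIDR notation and normalizes the string.
        return [str(ipaddress.ip_network(vswitch_cidr))]
    if access_path == "internet":
        if eip is None:
            raise ValueError("eip is required for Internet access")
        return [str(ipaddress.ip_address(eip))]
    raise ValueError(f"unknown access path: {access_path}")


def is_allowed(source_ip, whitelist):
    """Check whether source_ip matches any whitelist entry (IP or CIDR)."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        addr in ipaddress.ip_network(entry, strict=False) for entry in whitelist
    )


if __name__ == "__main__":
    internal = whitelist_entries("internal", vswitch_cidr="192.168.10.0/24")
    print(internal, is_allowed("192.168.10.37", internal))
```

Whitelisting the vSwitch CIDR block rather than a single address means that any address the resource group takes within that vSwitch is still accepted by the data source.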
Step 4: Test network connectivity
After you complete the network configuration, test the network connectivity between the resource group and the data source by following these steps.
Go to the Data Sources page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Management Center.
In the left-side navigation pane of the SettingCenter page, click Data Sources.
In the Actions column of the desired data source, click Edit.
On the Data Integration tab, click Test Network Connectivity next to the name of the target resource group. If the connectivity status is Connected, a network connection is established.
Note: If a network connection cannot be established, click Self-service Troubleshoot to use a diagnostic tool to identify the cause of the network connection issue. For more information about network connectivity between data sources and exclusive resource groups in various network environments, see Network connectivity solutions.
Click Complete Modification.
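If the console test fails, a basic TCP probe from a host in the same VPC and vSwitch as the resource group can help narrow down whether the data source port is reachable at all. The following is a generic sketch, not the mechanism that DataWorks uses for Test Network Connectivity: `tcp_reachable` is a hypothetical helper, and the host and port are placeholders. A result from your own machine does not prove that the resource group itself can connect.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # Placeholder endpoint: replace with the address and port of your data source.
    print(tcp_reachable("192.168.10.20", 3306))
```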
More operations
View resource group utilization and monitor the resource group
You can view the resource usage of a resource group and the number of instances that are waiting for resources in the DataWorks console. You can also use the intelligent monitoring feature in the Operation Center to monitor the usage rate of the resource group and the number of instances that are waiting for resources. For more information about how to view the resource group usage rate, see View the usage rate of an exclusive resource group. For more information about how to monitor a resource group, see Create a custom rule.
Change the zone of a resource group
Follow these steps to change the zone of a resource group:
Log on to the DataWorks console.
In the navigation pane on the left, click Resource Group. On the Exclusive Resource Groups tab of the Resource Groups page, find an exclusive resource group where the Purpose is Data Integration.
Click the icon in the Actions column of the resource group and select Change Zone. The Change Zone For Resource Group dialog box appears.
In the Change Zone For Resource Group dialog box, select the Current Zone and Number Of Machines, and then select the New Zone and Number Of Machines To Replace.
Click Confirm Replacement to change the zone of the resource group.
Moving the resources of a resource group from one zone to another may cause the following network changes:
Resource group CIDR block: Each zone in the resource group corresponds to a separate CIDR block. If the zone of the resource group changes, the corresponding CIDR block also changes.
Resource group primary elastic network interface (ENI) IP: The primary ENI IP address of the moved ECS instance changes. A new IP address is reassigned from the CIDR block of the destination zone.
ENIs bound to the resource group: If the vSwitch CIDR block is added to the whitelist, no impact occurs. However, if the IP address of the ENI that is bound to the resource group is added to the whitelist, you must update the whitelist configuration to ensure continued access and normal operations.
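Before you change the zone, you can review the data source whitelist for entries that will stop matching. The following is a minimal sketch under stated assumptions: `entries_to_review` is a hypothetical helper, the whitelist is supplied as a plain list of strings, and all addresses are placeholders. It flags single IP addresses that fall inside the vSwitch CIDR block, because those are the entries that this topic says must be updated, and leaves CIDR block entries alone.

```python
import ipaddress


def entries_to_review(whitelist, vswitch_cidr):
    """Return whitelist entries that are single IPs inside vswitch_cidr.

    Entries that are CIDR blocks are not flagged, because whitelisting the
    vSwitch CIDR block is unaffected by the ENI IP address change.
    """
    vswitch_net = ipaddress.ip_network(vswitch_cidr)
    flagged = []
    for entry in whitelist:
        net = ipaddress.ip_network(entry, strict=False)
        # A bare IP address parses as a network that contains one address.
        is_single_host = net.num_addresses == 1
        if is_single_host and net.network_address in vswitch_net:
            flagged.append(entry)
    return flagged


if __name__ == "__main__":
    current = ["192.168.10.0/24", "192.168.10.37", "203.0.113.10"]
    print(entries_to_review(current, "192.168.10.0/24"))
```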
Appendix: Switch the Data Integration resource group
After you create and configure an exclusive resource group for Data Integration, you can switch the resource group that is used by a task. The following table describes how to switch the resource group.
Operating environment | Supported switch operations | Entry point |
Switch the resource group for the production environment | Batch switch | Go to the page. Select the tasks for which you want to change the resource group and click |
Switch the resource group for the development environment | | Go to the DataStudio page. Note: If you cannot find the option to modify the Data Integration resource group, filter the node type by selecting Offline Sync. |