To develop and manage AnalyticDB for Spark tasks in DataWorks, you must first attach your cloud-native data warehouse AnalyticDB for MySQL cluster as an AnalyticDB for Spark computing resource. After the cluster is attached, you can use the computing resource for data development in DataWorks.
Prerequisites
You have created an AnalyticDB for MySQL cluster.
After you create the cluster, you must also create an Interactive resource group of the Spark engine type for it. Otherwise, you cannot attach the cluster as an AnalyticDB for Spark computing resource.
Note: We recommend that you purchase the AnalyticDB for MySQL cluster in the same region as the DataWorks workspace. If the cluster and workspace are in different regions, you cannot attach the cluster as a computing resource to the workspace.
You have created a workspace in DataWorks. A RAM user has been added to the workspace and assigned the Workspace Administrator role.
Important: This feature is supported only in workspaces that are participating in the public preview of the new version of DataStudio.
A resource group is attached to the workspace.
If you use a Serverless resource group, you only need to ensure that the AnalyticDB for Spark computing resource can connect to the Serverless resource group.
If you use legacy exclusive resource groups, ensure that the AnalyticDB for Spark computing resource can connect to the exclusive resource group for scheduling for the corresponding scenario.
The resource group must be in the same VPC as the AnalyticDB for MySQL cluster. The IP address of the resource group must be added to the whitelist of the AnalyticDB for MySQL cluster.
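The whitelist requirement above can be checked offline before you attach the cluster. The following is a minimal Python sketch, assuming you have copied the resource group's IP address and the cluster's whitelist entries by hand. The function name `ip_in_whitelist` and the sample addresses are hypothetical and are not part of any DataWorks or AnalyticDB for MySQL API.

```python
import ipaddress


def ip_in_whitelist(ip: str, whitelist: list) -> bool:
    """Return True if ip matches any whitelist entry (a single IP or a CIDR block)."""
    addr = ipaddress.ip_address(ip)
    # strict=False lets an entry such as "192.168.1.10/24" be read as its network.
    return any(
        addr in ipaddress.ip_network(entry, strict=False) for entry in whitelist
    )


# Hypothetical values for illustration only.
sample_whitelist = ["192.168.1.0/24", "10.0.0.5"]
covered = ip_in_whitelist("192.168.1.10", sample_whitelist)
```

If the function returns False for the resource group's address, add that address or its CIDR block to the cluster whitelist before you test connectivity.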
Limits
Region: China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), and Indonesia (Jakarta).
Permissions (by user type):
Alibaba Cloud account: No additional authorization is required.
RAM user/RAM role:
DataWorks management permissions: Only workspace members who have the O&M role, the Workspace Administrator role, or the AliyunDataWorksFullAccess permission can create computing resources. For more information, see Grant workspace administrator permissions.
AnalyticDB for MySQL service permissions: To create a database for the AnalyticDB for MySQL cluster when you attach an AnalyticDB for Spark computing resource, grant the AliyunADBFullAccess access policy to the RAM user. This policy gives the user full operational permissions on the AnalyticDB for MySQL cluster.
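The permission rules listed above can be expressed as a small pre-flight check. This is only an illustrative sketch: the `Principal` class and the `missing_permissions` function are hypothetical helpers, not DataWorks or RAM API objects, and the sketch assumes you already know which roles and policies the user holds. Only the role and policy names come from the text above.

```python
from dataclasses import dataclass, field

# Role and policy names as given in the permission requirements above.
CREATOR_ROLES = {"O&M", "Workspace Administrator"}
DATAWORKS_POLICY = "AliyunDataWorksFullAccess"
ADB_POLICY = "AliyunADBFullAccess"


@dataclass
class Principal:
    """Hypothetical description of the identity that attaches the resource."""
    is_alibaba_cloud_account: bool = False
    workspace_roles: set = field(default_factory=set)
    ram_policies: set = field(default_factory=set)


def missing_permissions(p: Principal) -> list:
    """List the requirements the principal does not yet meet."""
    if p.is_alibaba_cloud_account:
        # An Alibaba Cloud account needs no additional authorization.
        return []
    missing = []
    has_role = bool(p.workspace_roles & CREATOR_ROLES)
    if not (has_role or DATAWORKS_POLICY in p.ram_policies):
        missing.append("O&M role, Workspace Administrator role, or " + DATAWORKS_POLICY)
    if ADB_POLICY not in p.ram_policies:
        missing.append(ADB_POLICY)
    return missing
```

An empty list means the identity meets both the DataWorks and the AnalyticDB for MySQL requirements described above.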
Go to the computing resource list page
Log on to the DataWorks console. In the top navigation bar, select the destination region. In the navigation pane on the left, choose . From the drop-down list, select the target workspace and click Go To Management Center.
In the navigation pane on the left, click Computing Resources to go to the Computing Resources page.
Attach an AnalyticDB for Spark computing resource
On the Computing Resources page, you can configure and attach an AnalyticDB for Spark computing resource.
Select the computing resource type.
Click Attach Computing Resource to go to the Attach Computing Resource page.
On the Attach Computing Resource page, set the computing resource type to AnalyticDB for Spark. You are then redirected to the Attach AnalyticDB For Spark Computing Resource configuration page.
Configure the AnalyticDB for Spark computing resource.
On the Attach AnalyticDB For Spark Computing Resource configuration page, configure the parameters as described in the following table.
Parameter: Description
Configuration Mode: Only the Alibaba Cloud Instance Pattern is supported.
Alibaba Cloud Account: Only Current Alibaba Cloud Account is supported.
Instance: Select the AnalyticDB for MySQL cluster to attach. You can also click New in the drop-down menu to create an AnalyticDB for MySQL cluster.
Note: When you create an AnalyticDB for MySQL cluster, you must create an Interactive resource group with the engine type set to Spark for the cluster. Otherwise, you cannot attach the cluster as an AnalyticDB for Spark computing resource.
Database Name: Enter the name of the database that you created in the AnalyticDB for MySQL cluster.
Computing Resource Instance Name: Enter a custom name for the computing resource. At runtime, you can select the computing resource for a task based on this name.
Test the network connection.
In the Connection Configuration section, select the resource group that DataWorks uses to run AnalyticDB for Spark tasks and click Test Connectivity to verify that the resource group can connect to your AnalyticDB for MySQL cluster. For more information, see Network Connectivity Solutions.
Click Confirm to complete the configuration.
Note: When you attach an AnalyticDB for Spark computing resource, the system automatically syncs a new AnalyticDB for Spark data source with the same name to the Data Source page in the current workspace.
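If Test Connectivity fails, a plain TCP probe from a machine inside the same VPC can help separate network problems from configuration problems. The sketch below is an assumption-laden illustration, not a replacement for the console test: it runs from wherever you execute it rather than from the DataWorks resource group, and the function name `can_connect` is hypothetical. Supply the host and port of your own cluster endpoint; none are assumed here.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A True result shows only that the port is reachable from the probing machine. It does not confirm that the resource group's IP address is on the cluster whitelist or that the database credentials are valid.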
What to do next
After you configure the AnalyticDB for Spark computing resource, you can use it in Data Development to develop tasks for ADB Spark nodes and ADB Spark SQL nodes.