A scaling group is a core component of Auto Scaling that manages a group of instances that share the same application scenario and instance types. You can use a scaling group to scale out the instances in a cluster quickly. You can also use a scaling group to dynamically adjust the number of instances based on your business requirements, which helps you reduce costs.
Benefits
Rapid scale-out and high service availability
You can use a scaling group to efficiently expand service clusters and improve service availability.
Cost control
Scaling out a service cluster means maintaining more computing resources, which increases costs. However, your business might not always run at full capacity. You can leverage the elasticity of the cloud to reduce resource investment when demand is low, thereby controlling costs.
Supported scaling solutions
Solution 1: Maintenance of a fixed number of available instances
Scenario: High availability maintenance without cluster scaling
Implementation method: Enable the Instance Health Check and Expected Number of Instances features.
After you enable the Instance Health Check feature for your scaling group, Auto Scaling automatically removes unhealthy instances from the scaling group. If the current number of instances in your scaling group is less than the expected number of instances, Auto Scaling automatically triggers a scale-out event to maintain a fixed number of available instances in the scaling group.
Example
You enable the Expected Number of Instances feature for your scaling group and specify 10 as the expected number. If the actual number of instances in the scaling group falls below 10, Auto Scaling automatically triggers a scale-out event to bring the actual number back to 10.
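The behavior described above can be sketched as a small reconciliation step. This is a minimal illustration, not the logic that Auto Scaling actually runs: the function name `reconcile`, the instance IDs, and the health flags are all hypothetical.

```python
from typing import Dict, List, Tuple


def reconcile(instances: Dict[str, bool], expected: int) -> Tuple[List[str], int]:
    """Return (unhealthy instance IDs to remove, number of instances to add).

    `instances` maps a hypothetical instance ID to its health status.
    """
    # Unhealthy instances are removed from the scaling group.
    unhealthy = [iid for iid, healthy in instances.items() if not healthy]
    healthy_count = len(instances) - len(unhealthy)
    # A scale-out event fills the gap up to the expected number.
    to_add = max(0, expected - healthy_count)
    return unhealthy, to_add


# Ten instances, two of which fail the health check (hypothetical data).
group = {f"i-{n:02d}": n not in (3, 7) for n in range(1, 11)}
to_remove, to_add = reconcile(group, expected=10)
print(to_remove, to_add)
```

In this sketch, the two unhealthy instances are removed and two new instances are requested, so the group returns to 10 available instances.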
Solution 2: Regularly scheduled autoscaling
Scenario: Predictable workload fluctuations
Implementation method: Create scheduled tasks to enable regular autoscaling.
You can schedule one task to trigger a scale-out event at the time when resource utilization in the cluster is expected to increase, and another task to trigger a scale-in event at the time when resource utilization is expected to decrease. For more information, see Scale ECS instances as scheduled.
Example
For example, your cluster experiences an increase in traffic every evening at 19:00 and a decrease every morning at 01:00. To handle the fluctuations in business demand, you can create the following scheduled tasks:
Increased traffic: You can enable a scheduled task to increase the number of service replicas every evening at 19:00. This improves the capability of the cluster to handle the increased traffic.
Decreased traffic: You can enable a scheduled task to decrease the number of service replicas every morning at 01:00. This improves resource utilization and maximizes cost efficiency.
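The two scheduled tasks above can be modeled as a daily schedule. The following sketch only illustrates the idea. The capacities (20 and 10) and the names `TASKS` and `capacity_at` are assumptions made for illustration and do not come from the Auto Scaling API.

```python
from datetime import time
from typing import List, Tuple

# Hypothetical daily tasks: (time at which the task runs, capacity it sets).
TASKS: List[Tuple[time, int]] = [
    (time(19, 0), 20),  # scale out before the evening peak
    (time(1, 0), 10),   # scale in after traffic drops
]


def capacity_at(now: time, tasks: List[Tuple[time, int]]) -> int:
    """Return the capacity set by the most recent task at the given time of day."""
    ordered = sorted(tasks, key=lambda task: task[0])
    # Before the first task of the day, the last task of the previous day applies.
    current = ordered[-1][1]
    for run_time, capacity in ordered:
        if run_time <= now:
            current = capacity
    return current


print(capacity_at(time(20, 30), TASKS))  # during the evening peak
print(capacity_at(time(0, 30), TASKS))   # after midnight, before the scale-in task
print(capacity_at(time(9, 0), TASKS))    # daytime
```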
Solution 3: Autoscaling based on resource utilization thresholds
Scenario: Sudden fluctuations in workloads
Implementation method:
Trigger scaling events when resource utilization exceeds or falls below the specified threshold
You can create event-triggered tasks to trigger scaling events. When resource utilization exceeds or falls below the specified threshold, the event-triggered tasks are automatically executed to trigger scaling events.
Maintain the desired resource utilization
You can create a target tracking scaling rule in your scaling group to maintain the desired resource utilization.
Example
You create a target tracking scaling rule in a scaling group of the Elastic Compute Service (ECS) type and specify 80% as the desired average CPU utilization. In this case, Auto Scaling dynamically adds or removes instances to maintain the average CPU utilization at 80%.
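To make the example concrete, the following sketch estimates the number of instances needed to bring the average CPU utilization back to the target. It assumes a simple proportional model in which total load stays constant and spreads evenly across instances. The source does not describe the algorithm that Auto Scaling actually uses, and the function name and the minimum and maximum sizes are hypothetical.

```python
import math


def target_tracking_capacity(
    current_instances: int,
    current_cpu: float,
    target_cpu: float,
    min_size: int,
    max_size: int,
) -> int:
    """Estimate the instance count that brings average CPU close to the target.

    Assumption: total load is constant and evenly distributed across instances.
    """
    if target_cpu <= 0:
        raise ValueError("target_cpu must be positive")
    raw = current_instances * current_cpu / target_cpu
    # Round first to absorb floating-point noise, then round up to whole instances.
    desired = math.ceil(round(raw, 6))
    # Keep the result within the (hypothetical) limits of the scaling group.
    return max(min_size, min(max_size, desired))


# 10 instances at 96% average CPU with an 80% target: scale out.
print(target_tracking_capacity(10, 96, 80, min_size=1, max_size=20))
# 10 instances at 40% average CPU with an 80% target: scale in.
print(target_tracking_capacity(10, 40, 80, min_size=1, max_size=20))
```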
Differences between the implementation methods
Event-triggered tasks that execute simple or step scaling rules offer more flexibility and customization. You control the exact number of instances to add or remove for each resource utilization tier.
In contrast, target tracking scaling rules are simpler to configure because you only need to define the target value of the metric that you want to maintain.
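The tier-based control that step scaling rules provide can be illustrated as follows. The thresholds and adjustments in `STEPS` are hypothetical values chosen for this sketch, and `step_adjustment` is not an Auto Scaling API.

```python
from typing import List, Tuple

# Hypothetical tiers: (lower bound of average CPU utilization in %, instances to add).
STEPS: List[Tuple[float, int]] = [(70, 1), (80, 2), (90, 3)]


def step_adjustment(cpu_percent: float, steps: List[Tuple[float, int]]) -> int:
    """Return the number of instances to add for the highest tier that is reached."""
    adjustment = 0  # below every tier: no scaling
    for lower_bound, instances_to_add in sorted(steps):
        if cpu_percent >= lower_bound:
            adjustment = instances_to_add
    return adjustment


for cpu in (65, 75, 85, 95):
    print(cpu, step_adjustment(cpu, STEPS))
```

With a target tracking scaling rule, you would specify only the target value, such as 80%, and leave the size of each adjustment to Auto Scaling.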
Solution 4: Custom scaling
If none of the preceding solutions meets your business requirements, you can configure a custom scaling solution.
You can manually execute scaling rules or modify the number of instances in the scaling group to trigger scaling events. For more information, see Manually scale ECS instances with a few clicks.
Custom scaling supports API calls. You can call API operations to configure custom scaling solutions based on your business requirements.
Solution 5: Predictive scaling
Auto Scaling can also automatically make adjustments to meet predicted resource demands.
This solution allows you to first run a predictive scaling rule in prediction-only mode to evaluate its accuracy and effectiveness. If the results are satisfactory, you can then enable both prediction and scaling, which automatically generates scheduled tasks based on the forecast to scale your instances. For more information, see View the prediction of a predictive scaling rule.
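The step from a forecast to scheduled tasks can be sketched as follows. This is an illustration under stated assumptions, not the forecasting model that Auto Scaling uses: the hourly forecast values, the per-instance capacity, the group limits, and the function name `plan_from_forecast` are all hypothetical.

```python
import math
from typing import List, Tuple


def plan_from_forecast(
    forecast: List[Tuple[int, float]],
    per_instance_capacity: float,
    min_size: int,
    max_size: int,
) -> List[Tuple[int, int]]:
    """Turn an hourly load forecast into (hour, capacity) scheduled tasks.

    A task is emitted only when the required capacity changes.
    """
    plan: List[Tuple[int, int]] = []
    previous = None
    for hour, load in forecast:
        needed = math.ceil(load / per_instance_capacity)
        needed = max(min_size, min(max_size, needed))
        if needed != previous:
            plan.append((hour, needed))
            previous = needed
    return plan


# Hypothetical forecast: (hour of day, predicted requests per second).
forecast = [(0, 300), (1, 300), (2, 900), (3, 1000)]
plan = plan_from_forecast(forecast, per_instance_capacity=100, min_size=2, max_size=20)
print(plan)
```

In prediction-only mode, you would only review a plan like this one. After you enable both prediction and scaling, the corresponding scheduled tasks would be generated and executed.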
Usage notes
Before you use a scaling group, make sure that the instances on which you deploy your business support horizontal scaling.
Auto Scaling scales instances horizontally, which means that it adds instances to or removes instances from the scaling group. We recommend that you consider the following potential impacts of horizontal scaling on your business.
Data consistency
If your database is deployed on the instances being scaled, you risk data inconsistency when new instances are added. To prevent this, we recommend that you move your database to a separate, dedicated service. This allows all instances to access the same central database and keeps the scaled instances stateless.
Data security
Instances in scaling groups are automatically created and released. If you store data on the instances, make sure that you back up the data so that it is not lost when an instance is released.