
Function Compute: Create a GPU function

Last Updated: Oct 01, 2025

Popular AI projects such as Stable Diffusion WebUI, ComfyUI, retrieval-augmented generation (RAG), and TensorRT need GPU-accelerated instances to speed up computation. You can deploy such function applications as container images to improve development and delivery efficiency.

Create a function

  1. Log on to the Function Compute console. In the navigation pane on the left, choose Functions > Function List.

  2. In the top navigation bar, select a region, and on the Functions page, click Create Function.

  3. In the dialog box, select a GPU Function type based on the prompts and your scenario, and then click Create GPU Function.

  4. On the Create GPU Function page, set the following configuration items and click Create.

    • Basic Configuration: Enter a Function Name. The name must be unique within the same account and region, and must follow the naming conventions.

    • Elastic Configuration: Select an instance type. You cannot use provisioned instances and elastic instances at the same time. After a function is created, the instance type cannot be changed.

      • Elastic instances

        • Instance Type: Select Elastic Instance. Instances are automatically scaled based on the request volume and reclaimed when there are no requests. You are billed based on usage and are not charged for instances that you do not use.

          Example: Elastic Instance

        • GPU Type: Select a GPU type. For more information about the specifications supported by each type, see Instance types and specifications.

          Example: Ada Series

        • Specifications: Set the GPU Memory, vCPU, Memory, and Disk specifications for the function as needed. After you set the specifications, the usage of each resource generated by actual function calls is measured by multiplying the specification by the duration of use. For more information, see Billing overview.

          Note
          • Data can be written to all directories on the disk. The disk space is shared.

          • The lifecycle of the disk is the same as the lifecycle of the underlying function instance. When the instance is reclaimed by the system, the data on the disk is also deleted. To persistently store files, you can mount a NAS file system or an OSS bucket. For more information, see Configure a NAS file system and Configure an OSS file system.

          Example:
          • GPU Memory: 48 GB

          • vCPU: 8 vCPU

          • Memory: 64 GB

          • Disk: 512 MB (free of charge; Function Compute provides a free quota of 512 MB for disk usage)

        • Minimum Instances: If your business is sensitive to latency, set Minimum Instances to a value greater than or equal to 1 after you select Elastic Instance. This locks in resources in advance and reduces cold start latency.

          Note

          After you set Minimum Instances to a value greater than or equal to 1, the current number of minimum instances is the value you set here whenever no auto scaling policy for minimum instances is configured or no auto scaling policy is active.

          If you configure multiple auto scaling policies, the system calculates the Minimum Instances when each policy is triggered and uses the maximum value among the active policies as the current Minimum Instances.

          For more information, see How is the current number of minimum instances calculated?.

          Example: 1

        • Instance Concurrency: You can configure a single GPU function instance to process multiple concurrent requests. For more information, see Configure instance concurrency.
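The rule in the Minimum Instances note can be sketched in a few lines of Python. This is only an illustration of the wording above, not the service's implementation: the function name and parameters are hypothetical, and it assumes that the value configured on the function is not compared against the policies while at least one policy is active, which the text does not state explicitly.

```python
from typing import Iterable


def current_minimum_instances(configured: int,
                              active_policy_minimums: Iterable[int]) -> int:
    """Return the minimum-instance count that applies right now (hypothetical helper)."""
    active = list(active_policy_minimums)
    if not active:
        # No auto scaling policy is active: the value set on the function applies.
        return configured
    # One or more policies are active: the largest minimum among them is used.
    return max(active)
```

For example, with Minimum Instances set to 1 and three active policies that ask for 2, 5, and 3 instances, the sketch returns 5. With no active policy it returns 1.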

      • Provisioned instances

        • Instance Type: Select Provisioned Instance. Instances are allocated to the function from a purchased provisioned resource pool.

          Use provisioned instances to ensure business stability in scenarios that require predictable costs, are sensitive to latency, and have high resource utilization.

          Example: Provisioned Instance

        • Provisioned Resource Pool: A provisioned resource pool is a pool of provisioned instances that can be allocated to the target function. If the remaining quota of your provisioned resource pool is insufficient, click Scale Out in the Operation column and follow the on-screen instructions to scale out. For more information, see Provisioned resource pools (Subscription).

          Example:
          • Provisioned Resource Pool: fc-pool-****

          • GPU Type: Ada

        • Specifications: Set the GPU Memory, vCPU, Memory, and Disk specifications for the function as needed. After you set the specifications, the usage of each resource generated by actual function calls is measured by multiplying the specification by the duration of use. For more information, see Billing overview.

          Note
          • Data can be written to all directories on the disk. The disk space is shared.

          • The lifecycle of the disk is the same as the lifecycle of the underlying function instance. When the instance is reclaimed by the system, the data on the disk is also deleted. To persistently store files, you can mount a NAS file system or an OSS bucket. For more information, see Configure a NAS file system and Configure an OSS file system.

          Example:
          • GPU Memory: 48 GB

          • vCPU: 8 vCPU

          • Memory: 64 GB

          • Disk: 512 MB (free of charge; Function Compute provides a free quota of 512 MB for disk usage)

        • Provisioned Instances: Allocate the number of provisioned instances to the target function based on the resources of the provisioned resource pool.

          Example: 1

        • Instance Concurrency: You can configure a single GPU function instance to process multiple concurrent requests. For more information, see Configure instance concurrency.

          Example: 20
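The Specifications descriptions above say that resource usage is the specification multiplied by the duration of use. The following Python sketch works through that arithmetic for the example specification (48 GB GPU memory, 8 vCPU, 64 GB memory). The class, function, and unit names are hypothetical, and no prices are included because this page does not state any. See Billing overview for the authoritative metering rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuSpec:
    """Hypothetical container for the values chosen under Specifications."""
    gpu_memory_gb: float
    vcpu: float
    memory_gb: float


def resource_usage(spec: GpuSpec, duration_seconds: float) -> dict:
    """Usage per resource = specification x duration of use."""
    return {
        "gpu_memory_gb_seconds": spec.gpu_memory_gb * duration_seconds,
        "vcpu_seconds": spec.vcpu * duration_seconds,
        "memory_gb_seconds": spec.memory_gb * duration_seconds,
    }


# Example specification from this page, used for 30 seconds.
usage = resource_usage(GpuSpec(gpu_memory_gb=48, vcpu=8, memory_gb=64), 30)
# usage == {"gpu_memory_gb_seconds": 1440, "vcpu_seconds": 240, "memory_gb_seconds": 1920}
```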

    • Function Code: Configure the runtime environment and code-related information for the function.

      • Runtime: Select how the image is provided.

        • Use sample image: Select a sample image provided by Function Compute to quickly deploy an image-based function. You need to select the target image from the image list under the Container Image configuration item.

        • Use image in ACR: Click Select Image In ACR under the Container Image configuration item. In the Select Container Image panel, select the created Container Image Instance and ACR Image Repository. Then, find the target image in the image area below and click Select in the Operation column. For more information, see Create a function that uses a custom image.

        Example: Custom Image > Use Sample Image

      • Container Image: Select the target image.

        Example: SpringBoot Web Application Sample Image

      • Startup Command: The startup command for the program. If you do not configure a startup command, the Entrypoint/CMD in the image is used by default.

        Example: None

      • Listening Port: The port on which the HTTP server in your code listens.

        Example: 9000

      • Execution Timeout: Set the timeout period. The Execution Timeout is 60 seconds by default and can be up to 86,400 seconds.

        Example: 60
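The Listening Port item means that the program started by the image must run an HTTP server on that port. As a minimal sketch, assuming a Python-based image and using only the standard library, such a server could look like the following. The handler and the `make_server` helper are hypothetical names, and a real inference service would replace the fixed `ok` response with its own logic.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    """Answers every GET or POST request with a fixed plain-text body."""

    def _reply(self) -> None:
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self) -> None:
        self._reply()

    def do_POST(self) -> None:
        # Read the request body before replying so the connection stays usable.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        self._reply()


def make_server(port: int = 9000) -> ThreadingHTTPServer:
    """Bind to all interfaces on the port configured as Listening Port."""
    return ThreadingHTTPServer(("0.0.0.0", port), Handler)
```

The startup command of the image would then call `make_server().serve_forever()`. The port passed to `make_server` has to match the Listening Port value, which is 9000 in the example above.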

    • Instance Prefetch: For AI inference scenarios, configure instance prefetch to prefetch the model and reduce long initial request times.

      • Instance Prefetch: By configuring an Initializer hook, you can load the model by running a specified script or calling an interface after the function instance starts successfully and before it processes requests. This prefetches the model in advance and optimizes cold starts.

        For more information about the Initializer hook, see Configure instance lifecycle.

        Example: Enable

      • Timeout: Set the timeout period for the Initializer hook.

        Example: 60

      • Prefetch Program Type: You can configure two types of Initializer hooks to prefetch the model: Execute Instruction and Invoke Code.

        Example: Execute Instruction

      • Instruction Content: Configure the content of the instruction to be executed. You can use custom Shell implementations, such as /bin/bash, /bin/sh, /bin/csh, and /bin/zsh. Make sure that the function runtime environment supports the corresponding Shell implementation.

        Example: See Hook implementation
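The idea behind instance prefetch is that the model is loaded once, before the first request, instead of during it. The following Python sketch shows one way application code could support that pattern. It does not show how Function Compute invokes the Initializer hook. `ModelHolder`, `prefetch`, and `load_dummy_model` are hypothetical names, and the "model" is a stand-in function rather than a real network.

```python
import threading


class ModelHolder:
    """Loads a model at most once and serves predictions from it."""

    def __init__(self, loader):
        self._loader = loader          # callable that returns the loaded model
        self._model = None
        self._lock = threading.Lock()  # guards against concurrent first loads

    def prefetch(self):
        """Intended to be triggered by the Initializer hook before any request."""
        with self._lock:
            if self._model is None:
                self._model = self._loader()
        return self._model

    def predict(self, x):
        # Falls back to loading on demand if the hook did not run.
        model = self.prefetch()
        return model(x)


def load_dummy_model():
    """Stand-in for an expensive model load (assumption for illustration)."""
    return lambda x: x * 2


holder = ModelHolder(load_dummy_model)
holder.prefetch()  # done ahead of time, so requests do not pay the load cost
```

With Invoke Code, the hook would call an interface of the service that runs `holder.prefetch()`. With Execute Instruction, a shell command would start a script that does the equivalent work.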

    • Permissions, Network, And Storage: Configure the function's access role, network settings, and storage mounts.

      • Function Role: Function Compute uses this RAM role to generate temporary credentials for accessing your Alibaba Cloud resources and passes them to your code. For more information, see Grant a function permissions to access other Alibaba Cloud services using a function role.

        Example: mytestrole

      • Allow Access To VPC: Enable this to allow the function to access resources in a VPC. For more information, see Configure network settings.

        Example: Enable

      • VPC: This parameter is required if you enable Allow Access To VPC. Create a new VPC or select the ID of the VPC that you want to access from the drop-down list.

        Example: fc.auto.create.vpc.1632317****

      • VSwitch: This parameter is required if you enable Allow Access To VPC. Create a new vSwitch or select a vSwitch ID from the drop-down list.

        Example: fc.auto.create.vswitch.vpc-bp1p8248****

      • Security Group: This parameter is required if you enable Allow Access To VPC. Create a new security group or select a security group from the drop-down list.

        Example: fc.auto.create.SecurityGroup.vsw-bp15ftbbbbd****

      • Allow Default ENI To Access Internet: Specifies whether to allow the function to access the Internet through the default elastic network interface (ENI).

        Important

        When you use the static public IP address feature, you must disable Allow Default ENI To Access Internet. Otherwise, the configured static public IP address does not take effect. For more information, see Configure a static public IP address.

        Example: Enable

      • Mount NAS File System: Configure a NAS file system for the function to persistently store data shared between functions, such as models shared by multiple inference functions.

        If you select automatic configuration, the system uses an existing General-purpose NAS file system named Alibaba-Fc-V3-Component-Generated by default. If no eligible NAS file system is found under the current account, the system automatically creates one.

        Example: Enable

      • Mount OSS Object Storage: Mount an OSS bucket for the function to persistently store logs, business files, and more. For more information, see Configure an OSS file system.

        Example: Enable

    • Logs And Tracing Analysis

      • Log Feature: Set this to persistently save the function's execution logs to Simple Log Service. This helps you debug code, analyze faults, and analyze data. For more information, see Configure the log feature.

        • Automatic Configuration: Automatically selects a log project that starts with serverless-<region_id>.

          Only one such log project is created in each region. If the system finds that this log project already exists in the current region, it is used directly.

        • Custom Configuration: You must manually specify the target Log Project and Logstore.

        Example: Enable

    • More Settings

      • Time Zone: Select the time zone for the function. After you set the time zone here, an environment variable TZ is automatically added to the function, with its value set to the target time zone.

        Example: UTC

      • Tags: Set tags for the function to manage functions by group. You must set both a tag key and a tag value.

        Example: key : value

      • Resource Group: Select the resource group where the function resides to manage functions by group.

        Example: Default Resource Group

      • Environment Variables: Use environment variables to flexibly adjust the function's behavior without modifying the code. For more information, see Configure environment variables.

        Example:

        {
            "BUCKET_NAME": "MY_BUCKET",
            "TABLE_NAME": "MY_TABLE"
        }
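Inside the function, the values configured under Environment Variables and Time Zone are read like any other process environment variable. The sketch below uses the example keys from this page (`BUCKET_NAME`, `TABLE_NAME`) and the `TZ` variable described under Time Zone. The `load_settings` helper and the fallback to `UTC` when `TZ` is absent are assumptions made for illustration.

```python
import os
from typing import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the configuration that the function code depends on."""
    return {
        # Raises KeyError early if a required variable was not configured.
        "bucket": env["BUCKET_NAME"],
        "table": env["TABLE_NAME"],
        # TZ is added when a time zone is selected; the UTC default is an assumption.
        "timezone": env.get("TZ", "UTC"),
    }
```

Passing the mapping as a parameter keeps the helper easy to test locally: `load_settings({"BUCKET_NAME": "MY_BUCKET", "TABLE_NAME": "MY_TABLE"})` returns the two values plus the default time zone, without touching the real environment.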

Edit a function

To change the image after a function is created, go to the function details page, click the Configuration tab, and edit the runtime as shown in the following figure.

image

For more information about other modifications, such as changing environment variables and log storage settings, see Configure a function.

Delete a function

  1. Log on to the Function Compute console. In the left-side navigation pane, click Functions.

  2. In the top navigation bar, select a region. On the Functions page, find the function that you want to delete and choose More > Delete in the Actions column.

  3. In the dialog box that appears, confirm that the function that you want to delete is not bound to any resources such as triggers and reserved instances. Then, click Delete.


Obtain the ARN of a function

An Alibaba Cloud Resource Name (ARN) identifies a specific Alibaba Cloud resource, so you can use it to refer to that resource in code. To obtain the ARN of a function, perform the following steps:

  1. Log on to the Function Compute console. In the left-side navigation pane, click Functions.

  2. In the top navigation bar, select a region. On the Functions page, click the function that you want to manage.

  3. On the function details page, click the Configurations tab. On the Basic Configurations tab, view and copy the ARN of the function.
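After you copy the ARN, code that receives it often needs the individual parts, such as the region. The following Python sketch splits an ARN into fields. It assumes the general colon-separated layout `acs:<service>:<region>:<account-id>:<resource>`, which this page does not spell out, so check it against the value shown in the console. The sample ARN, the account ID, and the function name are made up.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    """Split an ARN of the assumed form acs:<service>:<region>:<account-id>:<resource>."""
    # Split at most four times so that colons inside the resource part are preserved.
    parts = arn.split(":", 4)
    if len(parts) != 5 or parts[0] != "acs":
        raise ValueError(f"unexpected ARN layout: {arn!r}")
    return Arn(*parts[1:])


# Hypothetical value for illustration only; copy the real ARN from the console.
sample = parse_arn("acs:fc:cn-hangzhou:1234567890:functions/my-gpu-function")
```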


References