Welcome to KubeCampus Course 10 – Kubernetes Autoscaling

You must complete all lessons in this course to receive your badge!

Autoscaling in Kubernetes refers to the ability to automatically adjust the number of pods in a deployment based on the current workload or traffic. Autoscaling allows Kubernetes to dynamically increase or decrease the resources allocated to a deployment, ensuring optimal utilization of resources and efficient workload management.

In this comprehensive Kubernetes Autoscaling tutorial, you will learn how to manage resources efficiently and maintain application performance during varying workloads by implementing both horizontal and vertical pod autoscaling. This lab is designed to guide you through the process of deploying a sample application with multiple replicas, setting up the Horizontal Pod Autoscaler (HPA), and configuring custom metrics for HPA. You will also implement the Vertical Pod Autoscaler.

By the end of this tutorial, you will have a thorough understanding of the different autoscaling techniques available in Kubernetes and be well-equipped to implement them in your own applications.

Topics covered in this lab include:

  • Introduction to Kubernetes Autoscaling
  • Importance of Autoscaling in application performance and resource efficiency
  • Implementing Horizontal Pod Autoscaler (HPA)
  • Implementing Vertical Pod Autoscaler (VPA)
  • Configuring Custom Metrics

What is the structure of the lab?

The lab consists of two sections:

  1. Autoscaling Theory
  2. Hands-on keyboard command line experience implementing Autoscaling

The lab will take about 45 minutes to 1 hour to complete, depending on your skill level.


  • Please read all material referenced in the links in the lab.
  • On multiple-choice questions, note that more than one answer may be correct.
  • Please note this lab is timed and should be completed in one sitting.

Autoscaling theory

This section will cover terminology for autoscaling. You will review material on-screen, then answer a challenge question. You must answer the question correctly to proceed to the hands-on section.

During the theory section, we will cover the following topics:

  • Introduction to Autoscaling: We will begin by describing autoscaling, what it is used for and why it is important. We will also delve into HPA and VPA, the horizontal and vertical approaches to autoscaling, and why they matter.
  • Benefits of Autoscaling: We’ll talk about the benefits of Autoscaling and how it plays a crucial role in maintaining application performance and resource efficiency.
  • Configuring Custom Metrics: Finally, we will turn to custom metrics in Kubernetes, what they allow users to do, what benefits they bring and how they form an important piece of the Autoscaling story.

Understanding Autoscaling

  • Kubernetes Autoscaling is a powerful feature that allows you to dynamically adjust the number of replicas or resources allocated to your application’s pods, based on real-time performance metrics. This enables your applications to maintain optimal performance and resource efficiency, even under varying workloads.
  • Autoscaling in Kubernetes is achieved through two primary techniques: HPA and VPA.
  • HPA focuses on adjusting the number of pod replicas in a deployment or replication controller based on the observed CPU utilization or custom performance metrics. This helps to ensure that your application can scale out to handle increased load or scale in when demand is low, thus optimizing resource utilization and maintaining application performance.
  • HPA relies on the Metrics Server or other monitoring solutions such as Prometheus to gather the necessary metrics for scaling decisions.
  • VPA, on the other hand, is responsible for adjusting the CPU and memory resources allocated to individual pods based on historical usage patterns and current requirements. This technique allows your application to maintain optimal performance without over-provisioning or under-provisioning resources.
  • VPA operates by continuously monitoring pod resource usage, generating recommendations and updating the pod’s resource limits, as needed.
  • When combined with HPA, these autoscaling techniques can significantly enhance the performance, efficiency, and resilience of your Kubernetes applications.
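As a concrete sketch of the two techniques described above, the following manifests show a hypothetical HPA and VPA targeting the same Deployment (the name `sample-app` and all numeric values are illustrative, not part of the lab):

```yaml
# Hypothetical HPA: scale the Deployment between 2 and 10 replicas,
# aiming for an average CPU utilization of 50% (requires the Metrics Server).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: sample-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
---
# Hypothetical VPA: monitor pod resource usage and apply recommended
# CPU/memory requests (requires the VPA components to be installed).
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: sample-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  updatePolicy:
    updateMode: "Auto"
```

Note that running VPA in `Auto` mode alongside an HPA that scales on the same CPU or memory metric is generally discouraged, since the two controllers can work against each other; in practice VPA is often run in recommendation-only (`Off`) mode when combined with a CPU-based HPA.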

Benefits of Autoscaling

  • Autoscaling plays a crucial role in maintaining application performance and resource efficiency by dynamically adjusting the resources allocated to an application based on its current needs. This ensures that applications can handle fluctuations in workload and user demands effectively, without compromising their performance or stability.
  • One of the primary benefits of autoscaling is that it prevents over-provisioning or under-provisioning of resources. Over-provisioning leads to unnecessary resource consumption and increased costs, while under-provisioning can result in poor application performance, slow response times and even service outages.
  • By monitoring the application’s performance metrics and adjusting resources accordingly, autoscaling ensures that applications always have the right amount of resources to meet their needs at any given time.
  • Autoscaling contributes to the high availability and resilience of applications. In the case of HPA, the number of pod replicas can be increased to distribute the workload evenly, ensuring that the application remains responsive, even during peak traffic periods. This also helps in mitigating the impact of pod failures, as additional replicas can be created to replace the failed ones.
  • On the other hand, VPA optimizes the resource allocation for individual pods, ensuring that each pod has sufficient resources to function optimally without over-utilizing or under-utilizing the available resources.
  • By leveraging both horizontal and vertical autoscaling techniques, you can create a robust and efficient application infrastructure that can handle varying workloads while minimizing costs and maintaining optimal performance.

Configuring Custom Metrics

  • Custom metrics in Kubernetes allow users to define and utilize their own performance indicators to make more informed scaling decisions.
  • By configuring custom metrics, you can extend the capabilities of HPA beyond the default CPU and memory utilization metrics, enabling autoscaling based on application-specific or business-related metrics. This ensures that your application scales in response to the most relevant performance factors, leading to better resource utilization and overall performance.
  • To configure custom metrics in a Kubernetes cluster, you’ll first need a monitoring solution that supports custom metrics collection, such as Prometheus. Prometheus is a powerful open-source monitoring and alerting toolkit that can collect and store a wide range of metrics from your applications and infrastructure.
  • Next, you’ll need to install an adapter like the Prometheus Adapter, which exposes custom metrics to the Kubernetes Metrics API, making them available for use by the HPA.
  • Once you have the monitoring and adapter components in place, you can create an HPA configuration that uses custom metrics for scaling decisions.
  • In the HPA configuration, you’ll need to define the custom metrics you want to use, set target values or thresholds, and specify the scaling policies. With the custom metrics configured, the HPA can then make more informed scaling decisions based on the real-time performance of your application, leading to more efficient resource usage and improved application performance.
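The steps above can be sketched as a single HPA manifest. This is a hypothetical example that assumes the Prometheus Adapter is installed and already exposes a per-pod metric named `http_requests_per_second` through the custom metrics API; the metric name, target value, and Deployment name are all assumptions for illustration:

```yaml
# Hypothetical custom-metrics HPA: scale on per-pod request rate
# rather than the default CPU/memory resource metrics.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: sample-app-custom-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second  # assumed to be served by the Prometheus Adapter
        target:
          type: AverageValue
          averageValue: "100"  # scale out when pods average more than 100 req/s
```

With a `Pods`-type metric, the HPA queries the `custom.metrics.k8s.io` API, averages the metric across all pods of the target, and adjusts the replica count to bring that average toward the target value.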

Hands-on Commands Section

In this hands-on challenge, we will walk through the process of rolling out a sample application, with exercises on how to scale pods to support the deployment.

The challenge will cover the following steps:

  • Deploying a Sample Application with Multiple Replicas
  • Configuring automatic horizontal scaling of pods based on CPU utilization
  • Familiarizing yourself with HPA’s components, including the Metrics Server
  • Simulating load and observing the autoscaling process
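The steps above can be sketched with `kubectl` roughly as follows. This is a minimal outline, not the lab's exact commands: the image, resource names, and thresholds are illustrative, and the commands assume a cluster with the Metrics Server installed:

```shell
# Deploy a sample application with 3 replicas (image name is illustrative).
kubectl create deployment sample-app --image=registry.k8s.io/hpa-example --replicas=3

# CPU requests must be set for CPU-utilization-based scaling to work.
kubectl set resources deployment sample-app --requests=cpu=200m

# Expose the deployment inside the cluster so we can send traffic to it.
kubectl expose deployment sample-app --port=80

# Configure automatic horizontal scaling: 2–10 replicas at 50% CPU utilization.
kubectl autoscale deployment sample-app --cpu-percent=50 --min=2 --max=10

# Verify that the Metrics Server is reporting pod metrics.
kubectl top pods

# Simulate load from a temporary pod, then watch the HPA react.
kubectl run load-generator --rm -it --image=busybox -- \
  /bin/sh -c "while true; do wget -q -O- http://sample-app; done"
kubectl get hpa sample-app --watch
```

As load drives CPU utilization above the 50% target, the HPA increases the replica count; once the load generator is stopped, the replica count scales back down after a stabilization period.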

Is There Pre-work for the Course?

Yes. Be sure to read and study this blog post, watch the video showing the work to be performed and view the accompanying slides that form part of each course.

How Do I Access Course 10?

Navigate to the “Courses” tab to start. All the best. Enjoy!

Note on additional learning: To extend your learning experience, Kasten offers a variety of resources such as white papers, case studies, data sheets and eBooks on Kubernetes backup. Follow this link to explore those learning materials!

Ready to start?

Take the skill level self-assessment quiz


Welcome to the KubeCampus Learning Community! The site has now relaunched and re-branded.

For technical support, please email [email protected].

Connect with other users and Kasten support on Kasten’s Learning Slack Channel.
