Application architecture
The application includes the following Compute Engine components:
Launching the web application
This tutorial uses a web application that is stored on GitHub. If you would
like to learn more about how the application was implemented, see the
GoogleCloudPlatform/python-docs-samples repository on GitHub.
Launch the web application on every VM in a managed instance group by including
a startup script in an instance template. To allow HTTP traffic to the web
application, create a firewall rule.
Create a firewall rule
Create a firewall rule to allow HTTP traffic to the web application:
In the Google Cloud console, go to the Firewalls page.
Click Create firewall rule.
Under Name, enter default-allow-http.
Set Network to default.
Set Targets to Specified target tags.
Under Target Tags, enter http-server.
Set Source filter to IPv4 ranges.
Under Source IPv4 ranges, enter 0.0.0.0/0 to allow access for all IP addresses.
Under Protocols and ports, select Specified protocols and ports. Then, select TCP and enter 80 to allow access for HTTP traffic.
Click Create.
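The console steps above can also be expressed with the gcloud CLI. The following is a minimal sketch, assuming the gcloud CLI is installed, authenticated, and pointed at the right project. The function name create_http_firewall_rule is hypothetical; the flag values mirror the console settings.

```shell
# Hypothetical helper: creates the same ingress rule as the console steps.
# Assumes the gcloud CLI is installed and a default project is configured.
create_http_firewall_rule() {
  gcloud compute firewall-rules create default-allow-http \
    --network=default \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:80 \
    --source-ranges=0.0.0.0/0 \
    --target-tags=http-server
}
```

Call create_http_firewall_rule once per project. The command is expected to fail if a rule with the same name already exists.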
Create an instance template
Create an instance template that launches the demo web application on startup:
In the Google Cloud console, go to the Instance templates page.
Click Create instance template.
Under Name, enter autoscaling-web-app-template.
Under Machine configuration, set the Machine type to e2-standard-2.
Under Firewall, select the Allow HTTP traffic checkbox. This applies the http-server networking tag to each instance created from this template.
Expand the Advanced options section to see advanced settings.
Expand the Management section.
In the Automation section, enter the following startup script:
sudo apt update && sudo apt -y install git gunicorn3 python3-pip
git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
cd python-docs-samples/compute/managed-instances/demo
sudo pip3 install -r requirements.txt
sudo gunicorn3 --bind 0.0.0.0:80 app:app --daemon
This script causes each VM to run the web application during startup.
Click Create.
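As a command-line counterpart to the console steps above, the sketch below saves the startup script to a file and passes it to the template as metadata. It assumes the gcloud CLI is installed and that the default boot image supports apt, as the startup script requires. The function names and the startup.sh file name are hypothetical.

```shell
# Hypothetical helper: writes the tutorial's startup script to the given file.
write_startup_script() {
  cat > "$1" <<'EOF'
sudo apt update && sudo apt -y install git gunicorn3 python3-pip
git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
cd python-docs-samples/compute/managed-instances/demo
sudo pip3 install -r requirements.txt
sudo gunicorn3 --bind 0.0.0.0:80 app:app --daemon
EOF
}

# Hypothetical helper: creates the template with the same name, machine type,
# and network tag as the console steps. Assumes gcloud is installed.
create_web_app_template() {
  script_file="${1:-startup.sh}"
  write_startup_script "$script_file"
  gcloud compute instance-templates create autoscaling-web-app-template \
    --machine-type=e2-standard-2 \
    --tags=http-server \
    --metadata-from-file=startup-script="$script_file"
}
```

The --tags=http-server flag plays the role of the Allow HTTP traffic checkbox: it attaches the tag that the firewall rule targets.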
Create a managed instance group
Create a regional instance group to begin running the web application:
In the Google Cloud console, go to the Instance groups page.
Click Create instance group to create a new instance group.
Select New managed instance group (stateless).
For Name, enter autoscaling-web-app-group.
For Instance template, select autoscaling-web-app-template.
For Location, select Multiple zones.
For Region, select us-central1.
For Zones, select the following zones from the drop-down list:
- us-central1-b
- us-central1-c
- us-central1-f
Configure autoscaling for the instance group:
- For Autoscaling mode, select On: add and remove instances to the group.
- Set the Minimum number of instances to 3.
- Set the Maximum number of instances to 6.
- Set the Initialization period to 120 seconds.
- Under Autoscaling Metrics, select CPU utilization as the metric type. To learn more about autoscaling metrics, see Autoscaling policy.
- Set the Target CPU utilization to 60.
- Click Done.
Under Autohealing, select No health check from the Health check drop-down list.
Click Create. This redirects you to the Instance groups page.
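The group and its autoscaling policy from the steps above can be sketched as two gcloud commands. This is an illustration under assumptions: the gcloud CLI is installed, the template already exists, and --cool-down-period is used here as the command-line counterpart of the Initialization period setting. The function name is hypothetical.

```shell
# Hypothetical helper: creates the regional group, then attaches the
# CPU-based autoscaling policy described in the console steps.
create_autoscaled_group() {
  # Passing --zones makes the group regional, spread across the listed zones.
  gcloud compute instance-groups managed create autoscaling-web-app-group \
    --template=autoscaling-web-app-template \
    --size=3 \
    --zones=us-central1-b,us-central1-c,us-central1-f &&
  # Target utilization is given as a fraction: 0.60 corresponds to 60%.
  gcloud compute instance-groups managed set-autoscaling autoscaling-web-app-group \
    --region=us-central1 \
    --mode=on \
    --min-num-replicas=3 \
    --max-num-replicas=6 \
    --target-cpu-utilization=0.60 \
    --cool-down-period=120
}
```

No health check is configured in this sketch, which matches the No health check choice under Autohealing.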
To verify that your instances are running:
- On the Instance groups page in the Google Cloud console, click autoscaling-web-app-group to see the instances in that group.
- Under External IP, click an IP address to connect to that instance. A new browser tab opens and displays the demo web application.
When you are done, close the browser tab for the demo web application.
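The same check can be approximated from a terminal. The sketch below is an assumption-laden alternative to the browser check: it lists the group's instances with gcloud and requests the root page of one instance with curl. Both function names are hypothetical, and the IP address in the usage note is a documentation placeholder.

```shell
# Hypothetical helper: lists the VMs in the group with their current status.
list_group_instances() {
  gcloud compute instance-groups managed list-instances autoscaling-web-app-group \
    --region=us-central1
}

# Hypothetical helper: prints the HTTP status code returned by the web
# application at the given external IP address.
check_web_app() {
  curl -s -o /dev/null -w '%{http_code}\n' --max-time 10 "http://$1/"
}
```

For example, check_web_app 203.0.113.10 prints the status code for that address; substitute an external IP from your own group.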
Observing autoscaling
For more information about autoscaling behaviors, see Understanding autoscaling decisions.
Monitor autoscaling
The instance group you created uses an autoscaling policy based on CPU usage. This means that the autoscaler grows or shrinks the group as needed
to maintain the target CPU utilization of 60%.
To monitor the size and CPU utilization of your instance group, use the autoscaling charts in the Google Cloud console:
- On the Instance groups page for the autoscaling-web-app-group instance group, click the Monitoring tab.
- You can monitor autoscaling from the Group size chart. The graph displays Instances, which represents the number of VM instances in the group over time.
- Optional: To monitor autoscaled capacity versus utilization, see the Autoscaler utilization (CPU) chart. The graph displays Utilization, which is the total CPU utilization of VM instances in the group, and Capacity, which is the cumulative target CPU utilization of the group (target CPU utilization multiplied by the number of VM instances). Autoscaling attempts to make Capacity match Utilization by changing the number of Instances, when possible.
- Keep this window open.
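The relationship between Capacity, Utilization, and Instances described above can be worked through with integer arithmetic. This is a simplified sketch, not the autoscaler's actual algorithm: it only computes Capacity as target utilization times instance count, and the smallest group size whose Capacity covers a given total Utilization, clamped to the group's limits. The function names and sample values are hypothetical.

```shell
# capacity_pct TARGET_PCT INSTANCES
# Capacity as defined above: target CPU utilization times number of VMs.
capacity_pct() {
  echo $(( $1 * $2 ))
}

# instances_needed TOTAL_UTIL_PCT TARGET_PCT MIN MAX
# Smallest instance count whose capacity covers the total utilization,
# clamped to the minimum and maximum group size.
instances_needed() {
  needed=$(( ($1 + $2 - 1) / $2 ))   # ceiling division
  if [ "$needed" -lt "$3" ]; then needed=$3; fi
  if [ "$needed" -gt "$4" ]; then needed=$4; fi
  echo "$needed"
}

capacity_pct 60 3            # prints 180
instances_needed 270 60 3 6  # prints 5
instances_needed 540 60 3 6  # prints 6
```

With the tutorial's settings, 3 VMs give a Capacity of 180%. If those VMs each run at 90% CPU, total Utilization is 270%, which 5 VMs at a 60% target would cover. A total of 540% would call for 9 VMs, so the result is capped at the maximum of 6.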
Simulate scale out
Scale out occurs when the average CPU utilization of the instance group is
significantly higher than the target value. During scale out, the autoscaler
gradually increases the size of the instance group until CPU utilization
decreases to the target CPU utilization value or until the instance group size
equals the Maximum number of instances, which was set to 6.
To trigger scale out, increase the CPU utilization for your instances:
In the Google Cloud console, open Cloud Shell.
Cloud Shell opens at the bottom of the Google Cloud console. It can take a few seconds for the session to initialize.
Create a local bash variable for the project ID:
export PROJECT_ID=[PROJECT_ID]
where [PROJECT_ID] is the project ID for your current project, which is displayed on each new line in Cloud Shell:
user@cloudshell:~ ([PROJECT_ID])$
Run the following bash script. This script causes the demo web application
instances to have an increased load, which increases CPU utilization.
After a few minutes, the CPU utilization will surpass the target value,
prompting the autoscaler to increase the instance group size.
# Collect "name,external IP" CSV rows for the VMs in the instance group.
export MACHINES=$(gcloud --project=$PROJECT_ID compute instances list --format="csv(name,networkInterfaces[0].accessConfigs[0].natIP)" | grep "autoscaling-web-app-group")
for i in $MACHINES;
do
  NAME=$(echo "$i" | cut -f1 -d,)
  IP=$(echo "$i" | cut -f2 -d,)
  echo "Simulating high load for instance $NAME"
  # Ask the demo application on this VM to start generating CPU load.
  curl -q -s "http://$IP/startLoad" >/dev/null --retry 2
done
Open the Monitoring tab in the Google Cloud console. After a few minutes, the Monitoring tab displays that the CPU Utilization increased, which triggers autoscaling to increase Capacity by increasing the number of Instances.
You might also notice that 6 instances are now listed under the Overview tab.
Keep both windows open.
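As a complement to the Monitoring tab, the group's target size can be polled from Cloud Shell. The following is a sketch under assumptions: it reads the targetSize field of the managed instance group with gcloud and retries on a fixed delay. The function names, the default of 20 attempts, and the 30-second delay are hypothetical choices.

```shell
# Hypothetical helper: prints the group's current target size.
group_target_size() {
  gcloud compute instance-groups managed describe autoscaling-web-app-group \
    --region=us-central1 \
    --format="value(targetSize)"
}

# wait_for_group_size EXPECTED [ATTEMPTS] [DELAY_SECONDS]
# Polls until the target size equals EXPECTED. Returns 0 on success,
# or 1 if the attempts run out first.
wait_for_group_size() {
  expected=$1
  attempts=${2:-20}
  delay=${3:-30}
  while [ "$attempts" -gt 0 ]; do
    size=$(group_target_size)
    echo "Current target size: $size"
    if [ "$size" = "$expected" ]; then
      return 0
    fi
    attempts=$(( attempts - 1 ))
    sleep "$delay"
  done
  return 1
}
```

For example, wait_for_group_size 6 waits for the group to reach the maximum after load starts. The same function with 3 could be used later to watch the group shrink back to its minimum.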
Simulate scale in
Scale in occurs when the average CPU utilization of the instance group is
significantly lower than the target value. During scale in, the autoscaler
gradually decreases the size of the instance group until CPU utilization
increases to the target CPU utilization or until the instance
group size equals the Minimum number of instances, which was set to 3.
To trigger scale in, decrease the CPU utilization for your instances:
Run the following bash script. This script causes the demo web application
instances to have a decreased load, which decreases CPU utilization.
After a few minutes, the CPU utilization will fall below the target value,
prompting the autoscaler to decrease the instance group size.
# Collect "name,external IP" CSV rows for the VMs in the instance group.
export MACHINES=$(gcloud --project=$PROJECT_ID compute instances list --format="csv(name,networkInterfaces[0].accessConfigs[0].natIP)" | grep "autoscaling-web-app-group")
for i in $MACHINES;
do
  NAME=$(echo "$i" | cut -f1 -d,)
  IP=$(echo "$i" | cut -f2 -d,)
  echo "Simulating low load for instance $NAME"
  # Ask the demo application on this VM to stop generating CPU load.
  curl -q -s "http://$IP/stopLoad" >/dev/null --retry 2
done
Open the Monitoring tab in the Google Cloud console. After a few minutes, the Monitoring tab displays that the CPU Utilization decreased. After the stabilization period, which verifies that the load is consistently lower, autoscaling decreases Capacity by decreasing the number of Instances.
You might also notice that only 3 instances are listed under the Overview tab.
Close both windows when you have finished.
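The effect of the stabilization period on scale in can be sketched as follows. This is a simplified illustration, not the autoscaler's implementation: it assumes the autoscaler keeps the largest group size recommended during a recent window, so the group shrinks only once every recent recommendation is lower. The function name and the sample values are hypothetical.

```shell
# stabilized_size SIZE...
# Given the group sizes recommended at recent sampling points, prints the
# largest one. The group shrinks only when all recent samples are smaller.
stabilized_size() {
  max=0
  for s in "$@"; do
    if [ "$s" -gt "$max" ]; then max=$s; fi
  done
  echo "$max"
}

stabilized_size 6 5 3 3   # prints 6
stabilized_size 3 3 3 3   # prints 3
```

In the first call, load has only recently dropped, so the sketch still reports 6. In the second call, every sample in the window recommends 3, so the sketch reports the minimum size.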