IaC for Compute & Storage Resources
Automating compute resources with Terraform and Schematics: the compute image and instance life cycle
Compute is one of the basic, atomic services every cloud provides. Compute on IBM Cloud covers several services; this pattern guide focuses on IBM Cloud Virtual Servers, also called Virtual Server Instances or VSIs, and shows how to customize public and private virtual servers that scale up or down to suit your needs, using Terraform and IBM Cloud Schematics. Other IBM Cloud compute services such as Kubernetes, OpenShift, Container Registry, and Cloud Functions are covered in other pattern guides.
Building on the Network Resources pattern, we will add a few compute and storage resources to host and expose a movies API.
The following diagram shows the proposed architecture.
The code to build these resources can be downloaded from the GitHub repository https://github.com/IBM/cloud-enterprise-examples/ in the directory 07-compute.
- Virtual Server Instance
- Virtual Server Instance with IBM Cloud Schematics
- Load Balance a Cluster of VSI
- IBM Cloud Object Storage
- Volumes
- Final Terraform code
- Clean up
- Compute Resources & Data Source Reference
Virtual Server Instance
Before creating a VSI we need all the networking resources in place. The list includes: a VPC, subnets in one or more zones, a public gateway for public internet communication, and ACLs for inbound and outbound traffic to the subnets. The management of these resources is covered in the Networking section, where the required network resources were created in the `networking.tf` file.
The following code is an example of creating a VSI using the `ibm_is_instance` resource.
`compute.tf`

```hcl
resource "ibm_is_ssh_key" "iac_app_key" {
  name       = "${var.project_name}-${var.environment}-key"
  public_key = var.public_key
}

resource "ibm_is_instance" "iac_app_instance" {
  name    = "${var.project_name}-${var.environment}-instance"
  image   = "r006-14140f94-fcc4-11e9-96e7-a72723715315"
  profile = "cx2-2x4"
  ...
}
```
The variables used in this code are defined in `variables.tf` like so:
`variables.tf`

```hcl
variable "project_name" {}
variable "environment" {}
variable "public_key" {}
variable "port" {
  default = 8080
}
```
The values of the variables are set in the `terraform.tfvars` and `*.auto.tfvars` files. Because we are going to use this code both with the Terraform CLI on our local host and with IBM Cloud Schematics, we should not use `file(pathexpand(var.public_key_file))` to read the value of a public key file such as `~/.ssh/id_rsa.pub`, because that does not work on IBM Cloud Schematics. Instead, let's write the content of the public key file into the variable `public_key` in the `secrets.auto.tfvars` file using the following command.
```shell
echo "public_key = \"$(cat ~/.ssh/id_rsa.pub)\"" > secrets.auto.tfvars
```
Make sure to add the `secrets.auto.tfvars` file to `.gitignore` so you do not share your secrets with the world.
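As a sketch, one way to add that entry idempotently from the shell (assuming `.gitignore` lives in the current directory):

```shell
# Add secrets.auto.tfvars to .gitignore, but only if it is not already listed
grep -qxF 'secrets.auto.tfvars' .gitignore 2>/dev/null || echo 'secrets.auto.tfvars' >> .gitignore
```

Running the command twice is safe; the `grep` guard prevents duplicate entries.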
To test this VSI we’ll deploy a server to expose an API from a movies database JSON file. This is a sample of it:
```json
{
  "movies": [
    {
      "id": "83",
      "title": "Akira Kurosawa's Dreams",
      "summary": "This is essentially eight separate short films, with some overlaps in characters and thematic material - that of man's relationship with his environment. 'Sunshine Through The Rain' - 'The Peach Orchard' - 'The Blizzard' - 'Crows' - 'Mount Fuji in Red' - 'The Weeping Demon' - 'Village of the Watermills'",
      "year": "1990",
      "duration": "7173610",
      "originallyAvailableAt": "1990-05-11",
      "addedAt": "1348382738",
      ...
```
To deploy the JSON file to the provisioned VSI, the first step is to load the file using the `local_file` data source. The file content can be obtained with the `content` attribute, or with `content_base64` if you need the content encoded.
In the `user_data` attribute, use the `echo` command with `base64` to print the decoded content of the JSON file that was previously encoded with the `content_base64` attribute of the `local_file` data source. Terraform sends the content of the file to the IBM Cloud engine over HTTP, and it is recommended to encode this text; otherwise we can get unexpected results. That is why we use the `content_base64` attribute instead of `content`, and the `base64` command on the server side to decode the received text.
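To see what happens on each side, here is a quick illustration of the encode/decode round trip, using a trimmed-down JSON document rather than the real database file:

```shell
# Simulate what Terraform does: encode the file content in base64
echo -n '{"movies": []}' > /tmp/db.min.json
ENCODED=$(base64 < /tmp/db.min.json)

# Simulate what the user_data script does on the VSI: decode it back
echo "$ENCODED" | base64 --decode   # prints {"movies": []}
```

The decoded output is byte-for-byte identical to the original file, which is exactly what we rely on when writing `/var/lib/db.min.json` on the instance.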
Modify the value of the `user_data` attribute of `ibm_is_instance.iac_app_instance` and add the `local_file` data source, like this:
`compute.tf`

```hcl
data "local_file" "db" {
  filename = "${path.module}/db.min.json"
}

resource "ibm_is_instance" "iac_app_instance" {
  ...
  user_data = <<-EOUD
    #!/bin/bash
    echo '${data.local_file.db.content_base64}' | base64 --decode > /var/lib/db.min.json
    ...
  EOUD
}
```
We are introducing a new Terraform plugin (`local`) with the `local_file` data source, so before applying the infrastructure code and deploying the application we need to download this plugin with the `terraform init` command. Execute the following commands to apply the new code:
```shell
terraform init
terraform plan
terraform apply
```
The VSI is provisioned in about a minute, but the API server needs roughly one more minute to start. To verify it is working, use the `entrypoint` output variable to fetch some movies, like so:
```shell
curl "$(terraform output entrypoint)/movies" | jq
curl "$(terraform output entrypoint)/movies/675"
curl "$(terraform output entrypoint)/movies?id=1067&id=1649"
```
If you do not see any response, give it a minute or two; it takes some time to install all the dependencies for the API server.
Virtual Server Instance with IBM Cloud Schematics
As seen in the IBM Cloud Schematics pattern, the same HCL code can be used with IBM Cloud Schematics and the Terraform CLI. The only difference is that Schematics cannot access your local filesystem, but it can access files located in the Git repository or remote files.
To create this project on IBM Cloud Schematics create a JSON Workspace template file like so:
`workspace.tmpl.json`

```json
{
  "name": "iac_schematics_test",
  "type": ["terraform_v0.12"],
  "description": "Sample workspace to test IBM Cloud Schematics. Deploys a web server on a VSI with a Hello World response",
  "tags": [
    "app:helloworld",
    "env:dev"
  ],
  ...
```
Then execute the following commands to create the Workspace, provision the VSI and deploy the application.
```shell
# Verify you are logged in to the right account
ibmcloud target

# Get the content of the SSH public key and render it into the JSON file using the template
PUBLIC_KEY="$(cat ~/.ssh/id_rsa.pub)"
sed "s|{ PUBLIC_KEY }|$PUBLIC_KEY|" workspace.tmpl.json > workspace.json

# Create the Schematics Workspace
ibmcloud schematics workspace new --file workspace.json
```
You may have to wait 2-3 seconds between the execution of an action (e.g. `apply`) and the retrieval of the logs. Also, you may have to retrieve the `apply` logs several times until the task is completed.
If the Workspace JSON file is modified, for example a variable value, you can update it with the `ibmcloud schematics workspace update` command.
To verify the application is running, get the output variables and use `curl` to fetch the page:
```shell
ibmcloud schematics workspace output --id $ID --json
IP=$(ibmcloud schematics workspace output --id $ID --json | jq -r '.[].output_values[].ip_address.value')
ADDR=$(ibmcloud schematics workspace output --id $ID --json | jq -r '.[].output_values[].entrypoint.value')
curl "${ADDR}/movies/675"
```
Identify Input Parameter Values
The Getting Started with Terraform section explained how to get the values of input parameters such as the instance `image` and `profile` using the IBM Cloud CLI and some Unix commands.
It is also possible to get these parameters using the data sources `ibm_is_instance_profile` and `ibm_is_image`. Modify the `compute.tf` file to add the following data source.
`compute.tf`

```hcl
data "ibm_is_image" "ds_iac_app_image" {
  name = "ibm-ubuntu-18-04-1-minimal-amd64-1"
}

resource "ibm_is_instance" "iac_app_instance" {
  ...
  image = data.ibm_is_image.ds_iac_app_image.id
  ...
}
```
This change is small, but using the image name instead of the ID gives more information to the developer reading the code. You still need to execute the following IBM Cloud CLI commands to find the image and profile names, like so:
```shell
ibmcloud is images
ibmcloud is images | grep available | grep ubuntu-18 | grep amd64 | cut -f2 -d" "
ibmcloud is images --json | jq -r '.[] | select(.status=="available" and .operating_system.name=="ubuntu-18-04-amd64").name'
ibmcloud is instance-profiles
ibmcloud is instance-profiles | grep amd64 | sort -k4 -k5 -n | head -1 | cut -f1 -d" "
ibmcloud is instance-profiles --json | jq -r 'map(select(.vcpu_architecture.value=="amd64")) | sort_by(.memory.value)[0].name'
```
Load Balance a Cluster of VSI
Running a single server serves the purpose, but it is a single point of failure: if this server fails, the API is not accessible. The solution is a cluster of virtual servers, routing traffic to the servers that are working and scaling the cluster up or down based on the traffic load. IBM Cloud provides these features through the Load Balancer service and Terraform resources.
You can read more about the types of load balancers, listeners, pools, LB methods, and more in the Load Balancer documentation for IBM Cloud Gen 2.
We'll begin the load balancer code by creating an `ibm_is_lb` resource in the `lb.tf` file, like so:
`lb.tf`

```hcl
resource "ibm_is_lb" "iac_app_lb" {
  name    = "${var.project_name}-${var.environment}-lb"
  subnets = [ibm_is_subnet.iac_app_subnet.id]
}
```
It requires the IDs of the subnets where the LB is located. The VSIs served by this load balancer should be in the same VPC and region. This LB name is formed from the project name and environment so it does not collide with other load balancers. Another important parameter is `type`, which defines whether the LB is `public` (the default) or `private`. This one is public, so the fully qualified domain name (FQDN) is accessible from the internet and has multiple public IP addresses assigned.
For private load balancers, access is restricted to internal clients in the same subnet, region, and VPC. A private LB also has an FQDN with multiple IP addresses assigned, and only accepts traffic from RFC 1918 address spaces such as the blocks 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16.
The load balancer needs a listener to listen on a given port and protocol. The FQDN and port assigned to the load balancer are exposed to the public internet; the traffic is then redirected to a pool member, a VSI from the assigned default pool. The API on the VSI listens on port `8080` by default, unless a different port is set in the `port` variable. The LB listener can use the same port or a different one; in this case, let's use the same.
The available protocols for the LB listener are HTTP, HTTPS, and TCP; this one uses HTTP. The supported pool (VSI) protocols are only HTTP and TCP; in this case it is HTTP. Let's append the following code to `lb.tf` to define the LB listener:
`lb.tf`

```hcl
resource "ibm_is_lb_listener" "iac_app_lb_listener" {
  lb           = ibm_is_lb.iac_app_lb.id
  port         = var.port
  protocol     = "http"
  default_pool = ibm_is_lb_pool.iac_app_lb_pool.id
}
```
The `ibm_is_lb` ID, `port`, and `protocol` are required parameters. Optional parameters are `default_pool`, `certificate_instance`, and `connection_limit`; the last two are not needed for this project.
On the other side of the load balancer are the VSIs, or backend applications, each identified as a pool member (`ibm_is_lb_pool_member`) that belongs to the defined pool (`ibm_is_lb_pool`). Let's begin by modifying the VSI resource `ibm_is_instance.iac_app_instance` in the `compute.tf` file to create multiple instances, adding the Terraform `count` parameter and modifying the `name` parameter to include the instance number, like so:
`compute.tf`

```hcl
...
resource "ibm_is_instance" "iac_app_instance" {
  name  = "${var.project_name}-${var.environment}-instance-${format("%02s", count.index)}"
  ...
  count = var.max_size
  ...
}
```
The new variable `max_size` defines how many VSIs, and therefore how many pool members, will be created. It is defined like this in the `variables.tf` file:
```hcl
variable "max_size" {
  default = 3
}
```
Everything is set to create the pool and the pool members; add the following code to the `lb.tf` file:
`lb.tf`

```hcl
resource "ibm_is_lb_pool" "iac_app_lb_pool" {
  name           = "${var.project_name}-${var.environment}-lb-pool"
  lb             = ibm_is_lb.iac_app_lb.id
  algorithm      = "round_robin"
  protocol       = "http"
  health_delay   = 5
  health_retries = 2
  health_timeout = 2
  health_type    = "http"
}
```
The `ibm_is_lb_pool.iac_app_lb_pool` resource requires the following input attributes:
Input parameter | Description |
---|---|
`name` | Name of the pool. In this case it includes the project name and environment so it does not collide with other pools |
`lb` | ID of the load balancer the pool is linked to |
`algorithm` | Load balancing algorithm. Supported values are `round_robin`, `weighted_round_robin`, or `least_connections` |
`protocol` | Pool protocol. Supported values are `http` and `tcp` |
`health_delay` | Health check interval in seconds. The interval must be greater than the timeout value |
`health_retries` | Maximum number of health check retries |
`health_timeout` | Health check timeout in seconds |
`health_type` | Health check protocol. Supported values are `http` and `tcp` |
Other input and output parameters are described in the `ibm_is_lb_pool` resource documentation. The three load balancing methods available for the `algorithm` input parameter are described in the Load Balancers documentation.
The `ibm_is_lb_pool_member.iac_app_lb_pool_mem` is actually a list of resources because it also has the Terraform `count` attribute, just like the `ibm_is_instance` resource; we need one pool member per VSI.
The `ibm_is_lb_pool_member` uses the `target_address` attribute to link the pool member to the VSI through its IP address. Here we use `count.index` to reference the VSI with the same index, so pool member 0 is linked to VSI 0, and so on. As we did when linking the floating IP to the IP address of the VSI, this IP address comes from the `ibm_is_instance.iac_app_instance` resource output parameter `primary_network_interface.0.primary_ipv4_address`.
The required input attributes for `ibm_is_lb_pool_member` are:
Input parameter | Description |
---|---|
`pool` | ID of the load balancer pool |
`lb` | Load balancer ID |
`port` | Port number of the application running on the member server, in this case held in the variable `port` |
`target_address` | IP address of the pool member or VSI |
`weight` | Weight of the member server. This parameter is optional and takes effect only when the load balancing algorithm of its pool is `weighted_round_robin` |
More information about this resource can be found in the `ibm_is_lb_pool_member` resource documentation.
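Putting this together, a pool member definition in `lb.tf` could look like the following sketch. Note that, according to the provider documentation, the `ibm_is_lb_pool` ID may be returned in the form `<lbID>/<poolID>`, hence the `split`; verify this against the provider version you use.

```hcl
resource "ibm_is_lb_pool_member" "iac_app_lb_pool_mem" {
  count = var.max_size
  lb    = ibm_is_lb.iac_app_lb.id
  # Keep only the pool part of the "<lbID>/<poolID>" identifier
  pool  = element(split("/", ibm_is_lb_pool.iac_app_lb_pool.id), 1)
  port  = var.port
  # Link pool member N to VSI N through its private IPv4 address
  target_address = ibm_is_instance.iac_app_instance[count.index].primary_network_interface.0.primary_ipv4_address
}
```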
The load balancer adjusts its capacity automatically according to the load. When this adjustment occurs, you may see a change in the number of IP addresses associated with the load balancer's DNS name. To get the load balancer's DNS name and the associated IP addresses, we add the following output variables to the `output.tf` file:
`output.tf`

```hcl
output "lb_ip_address" {
  value = ibm_is_lb.iac_app_lb.public_ips
}

output "entrypoint" {
  value = "http://${ibm_is_lb.iac_app_lb.hostname}:${var.port}"
}
```
The new value of `entrypoint` now contains the hostname, or FQDN, of the load balancer.
With a load balancer there is no need for a floating IP on every VSI, so you can remove them. If you want to keep them, you need one per VSI, so do something similar to what was done with the pool members: modify `ibm_is_floating_ip.iac_app_floating_ip` and the `ip_address` output variable that uses this resource, like so:
```hcl
resource "ibm_is_floating_ip" "iac_app_floating_ip" {
  # Include the index so each floating IP gets a unique name
  name   = "${var.project_name}-${var.environment}-ip-${format("%02s", count.index)}"
  target = ibm_is_instance.iac_app_instance[count.index].primary_network_interface.0.id
  count  = var.max_size
}
```
```hcl
output "ip_address" {
  value = ibm_is_floating_ip.iac_app_floating_ip[*].address
}
```
Here we use `[*]` to tell Terraform that we want the `address` of all the `iac_app_floating_ip` resources. This variable will be a list, just like `lb_ip_address`.
But, again, there is no need for a floating IP per VSI; the load balancer provides the FQDN and an IP per pool member. Remove the floating IPs once you verify the load balancer works.
Health Checks
Health check definitions are mandatory for back-end pools. Without health checks, the pool identifies the pool members as unhealthy and does not forward new connections to them.
The health check is configured in the `ibm_is_lb_pool` resource using the `health_*` attributes. Read the Health Check documentation and the `ibm_is_lb_pool` input attributes that can be used for health checks.
Health checks can be configured on back-end ports or on a separate health check port, depending on the application. For this application the API is served over HTTP, so we set the input parameter `health_type` of `ibm_is_lb_pool` to `http`. The port to monitor is the same port where the API is exposed, so the `health_monitor_port` parameter is set to the value of the variable `port`; if not set, the pool member port is used. Finally, the monitor URL is the path used for the health check; the default value is `/`, but we may have something like `/health`. If the HTTP response code is `200`, the pool member is considered healthy.
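As a sketch, the health check settings described above would extend the pool definition like this (`health_monitor_url` and `health_monitor_port` are optional attributes of `ibm_is_lb_pool`; verify the exact names against your provider version):

```hcl
resource "ibm_is_lb_pool" "iac_app_lb_pool" {
  ...
  health_type         = "http"   # probe with HTTP requests
  health_monitor_url  = "/"      # path to probe; use something like /health if the API provides it
  health_monitor_port = var.port # optional; defaults to the pool member port when unset
  health_delay        = 5        # seconds between probes, must be greater than the timeout
  health_retries      = 2
  health_timeout      = 2
}
```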
IBM Cloud Object Storage
There are different types of storage in a cloud environment, and you use different types for different situations. Review the IBM Cloud Object Storage resources to learn how to use them and which storage type to choose.
We use the `ibm_cos_bucket` resource to create an Object Storage bucket to store data, but it requires an `ibm_resource_instance` to be created in advance. There are five storage classes to choose from:
- Smart Tier (`smart`) can be used for any workload, especially dynamic workloads where access patterns are unknown or difficult to predict. Smart Tier provides a simplified pricing structure and automatic cost optimization by classifying the data into "hot", "cool", and "cold" tiers based on monthly usage patterns. All data in the bucket is then billed at the lowest applicable rate. There are no threshold object sizes or storage periods, and there are no retrieval fees.
- Standard (`standard`) is used for active workloads, with no charge for data retrieved (other than the cost of the operational request itself).
- Vault (`vault`) is used for cool workloads where data is accessed less than once a month; an extra retrieval charge ($/GB) is applied each time data is read. The service includes a minimum threshold for object size and storage period consistent with the intended use of this service for cooler, less-active data.
- Cold Vault (`cold`) is used for cold workloads where data is accessed every 90 days or less; a larger extra retrieval charge ($/GB) is applied each time data is read. The service includes a longer minimum threshold for object size and storage period consistent with the intended use of this service for cold, inactive data.
- Flex (`flex`) is being replaced by Smart Tier for dynamic workloads.
In our demo application we use this bucket to store images of the movie covers, which are actively used, so the storage class to choose is `standard`. For more information about storage classes, see Use storage classes.
Let's create the storage with the following code:
```hcl
resource "ibm_resource_instance" "iac_app_cos_instance" {
  name     = "${var.project_name}-${var.environment}-cos-instance"
  service  = "cloud-object-storage"
  plan     = "standard"
  location = "global"
}

resource "ibm_cos_bucket" "iac_app_cos_bucket" {
  bucket_name = "${var.project_name}-${var.environment}-bucket"
  ...
}
```
The `ibm_cos_bucket` resource requires the following input parameters:
Input parameter | Description |
---|---|
`bucket_name` | Name of the bucket |
`resource_instance_id` | ID of the `ibm_resource_instance` service instance in which to create the bucket |
`storage_class` | The storage class to use for the bucket |
`region_location` | Location of a regional bucket. Do not use this parameter with the other `*_location` parameters |
`single_site_location` | Location of a single-site bucket. Do not use this parameter with the other `*_location` parameters |
`cross_region_location` | Location of a cross-regional bucket. Do not use this parameter with the other `*_location` parameters |
For more information about other optional input parameters and the output parameters, read the `ibm_cos_bucket` IBM Cloud Object Storage resource documentation.
Uploading data to the bucket is not done with Terraform or Schematics. It can be done with code in languages such as Go, Python, Node, or Java using the SDKs; with Linux commands (e.g. `curl`) against the Cloud Object Storage API; or with file transfer tools such as Cyberduck or Transmit and command-line utilities like `s3cmd` or Minio Client, among others.
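As one illustration, a single object can be uploaded with `curl` against the S3-style Cloud Object Storage API. This is only a sketch: the region endpoint, the bucket name, and the way you obtain `$IAM_TOKEN` are placeholders to replace for your account (an IAM token can be generated with `ibmcloud iam oauth-tokens`):

```shell
# Upload cover.jpg to the bucket over the COS S3 API.
# Placeholders: region endpoint, <bucket-name>, and $IAM_TOKEN.
curl -X PUT "https://s3.us-south.cloud-object-storage.appdomain.cloud/<bucket-name>/cover.jpg" \
  -H "Authorization: Bearer $IAM_TOKEN" \
  -T cover.jpg
```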
Block Storage for Volumes
Besides IBM Cloud Object Storage, we can attach volumes to the instances to store data or files, accessed through a file system interface with file system semantics (such as strong consistency and file locking) and concurrently-accessible storage.
Block Storage provides block-level volumes that can be attached to a VSI as either a boot volume or a data volume. Boot volumes are attached by default. The Block Storage for VPC documentation has more information.
To create a data volume per instance we use the `ibm_is_volume` resource with code like this:
`storage.tf`

```hcl
resource "ibm_is_volume" "iac_app_volume" {
  count    = var.max_size
  name     = "${var.project_name}-${var.environment}-volume-${format("%02s", count.index)}"
  profile  = "10iops-tier"
  zone     = "us-south-1"
  capacity = 100
}
```
The main input parameters for `ibm_is_volume` are described in the following table.
Input parameter | Description |
---|---|
`name` | Name of the volume |
`profile` | Volume profile |
`zone` | Location of the volume |
`capacity` | Capacity of the volume in gigabytes. The default value is `100` |
`iops` | Total input/output operations per second (IOPS) for your storage. This value is required for custom storage profiles only |
The Block Storage Capacity and Performance documentation gives you information about the available capacities and the performance (IOPS and throughput) per capacity.
To get a list of volume profiles with the CLI, use the following command; the Profiles documentation also lists the available profiles and explains how to define a custom one.
```shell
ibmcloud is volume-profiles
```
Having the block storage is not enough; you need to assign the volume to the VSI using the `volumes` list attribute of the `ibm_is_instance` resource, like so:
`compute.tf`

```hcl
resource "ibm_is_instance" "iac_app_instance" {
  ...
  volumes = [ibm_is_volume.iac_app_volume[count.index].id]
  ...
}
```
Notice the use of `count.index` to get one volume ID. This way we ensure volume `0` is assigned to instance `0`, and so on. The volume is mounted in the root partition, so there is no need to modify the user data to change the location of the JSON DB file.
Verify and apply all the changes by executing the commands:
```shell
terraform plan
terraform apply
```
To test that the API works, execute the following command, which uses the load balancer FQDN, or entrypoint, with `curl`:
```shell
curl "$(terraform output entrypoint)/movies/675"
```
Final Terraform code
You can download the code from the GitHub repository https://github.com/IBM/cloud-enterprise-examples/ in the directory 07-compute where the main files are:
`compute.tf`

```hcl
resource "ibm_is_ssh_key" "iac_app_key" {
  name       = "${var.project_name}-${var.environment}-key"
  public_key = var.public_key
}

data "local_file" "db" {
  filename = "${path.module}/db.min.json"
}
...
```
`lb.tf`

```hcl
resource "ibm_is_lb" "iac_app_lb" {
  name    = "${var.project_name}-${var.environment}-lb"
  subnets = [ibm_is_subnet.iac_app_subnet.id]
}

resource "ibm_is_lb_listener" "iac_app_lb_listener" {
  lb       = ibm_is_lb.iac_app_lb.id
  port     = var.port
  protocol = "http"
  ...
}
```
`storage.tf`

```hcl
resource "ibm_resource_instance" "iac_app_cos_instance" {
  name     = "${var.project_name}-${var.environment}-cos-instance"
  service  = "cloud-object-storage"
  plan     = "standard"
  location = "global"
}

resource "ibm_cos_bucket" "iac_app_cos_bucket" {
  bucket_name = "${var.project_name}-${var.environment}-bucket"
  ...
}
```
`output.tf`

```hcl
output "lb_ip_address" {
  value = ibm_is_lb.iac_app_lb.public_ips
}

output "entrypoint" {
  value = "http://${ibm_is_lb.iac_app_lb.hostname}:${var.port}"
}
```
`variables.tf`

```hcl
variable "project_name" {}
variable "environment" {}
variable "public_key" {}
variable "port" {
  default = 8080
}
variable "max_size" {
  default = 3
}
```
`workspace.tmpl.json`

```json
{
  "name": "iac_schematics_test",
  "type": ["terraform_v0.12"],
  "description": "Sample workspace to test IBM Cloud Schematics. Deploys a web server on a VSI with a Hello World response",
  "tags": [
    "app:helloworld",
    "env:dev"
  ],
  ...
```
Clean up
In this section you have created the same infrastructure using Terraform CLI and IBM Cloud Schematics.
To destroy everything created with Terraform CLI, execute:
```shell
terraform destroy
```
To delete everything you’ve created with IBM Cloud Schematics, execute the following command to destroy the infrastructure:
```shell
ID=$(ibmcloud schematics workspace list --json | jq -r '.workspaces[] | select(.name == "iac_schematics_test") | .id')
ibmcloud schematics destroy --id $ID

# Or:
act_ID=$(ibmcloud schematics destroy --id $ID --force --json | jq -r '.activityid')
ibmcloud schematics logs --id $ID --act-id $act_ID
```
Finally, to delete the Workspace, execute these commands:
```shell
ibmcloud schematics workspace delete --id $ID --force
ibmcloud schematics workspace list
```
Compute Resources & Data Source Reference
The following Terraform resources and data sources are used to handle compute resources. Most of them are covered in this section; for those that are not, the links include a description, examples, and the input and output parameters. There are also other related links that you may find useful.
- ibm_is_ssh_key
- ibm_is_instance
- ibm_is_lb
- ibm_is_lb_listener
- ibm_is_lb_listener_policy
- ibm_is_lb_listener_policy_rule
- ibm_is_lb_pool
- ibm_is_lb_pool_member
- ibm_is_volume
- ibm_is_images
- ibm_is_instance_profile
- ibm_is_instance_profiles
- ibm_is_region
ibm_is_ssh_key
- Resource documentation with examples, input and output parameters.
- Data Source documentation retrieves non-sensitive information for a given SSH key name.
- Terraform input variables are required to load the content of the SSH public key, using filesystem functions such as `file` and Terraform data sources like `local_file`.
- The `ssh-keygen` command is required to generate the SSH key pair files.
ibm_is_instance
- Resource documentation with examples, input and output parameters.
ibm_is_lb
- Resource documentation with examples, input and output parameters.
ibm_is_lb_listener
- Resource documentation with examples, input and output parameters.
ibm_is_lb_listener_policy
- Resource documentation with examples, input and output parameters.
ibm_is_lb_listener_policy_rule
- Resource documentation with examples, input and output parameters.
ibm_is_lb_pool
- Resource documentation with examples, input and output parameters.
ibm_is_lb_pool_member
- Resource documentation with examples, input and output parameters.
ibm_is_volume
- Resource documentation with examples, input and output parameters.
ibm_cos_bucket
- Resource documentation with examples, input and output parameters.
- Data Source documentation retrieves information about the given bucket
ibm_is_images
- Data Source documentation retrieves all the compute images
ibm_is_instance_profile
- Data Source documentation retrieves an instance profile from a given name.
ibm_is_instance_profiles
- Data Source documentation retrieves all the instance profiles
ibm_is_region
- Data Source documentation retrieves all the regions.