Following our previous post explaining how to deploy CKAN on an AWS EKS cluster, we’ll now look into creating the necessary infrastructure for CKAN’s dependencies, as well as deploying CKAN on Kubernetes using AKS.

For CKAN to run, it needs a PostgreSQL database server, a Redis server, an instance of Solr (with ZooKeeper as its own dependency) and its own DataPusher. To fully leverage Azure while also ridding ourselves of the need to manage some of these resources, we’ll use Azure Database for PostgreSQL and Azure Cache for Redis. This will all be done using Terraform and the Azure CLI. You will also need to have kubectl installed.

The Tools

To install the Azure CLI, you can follow the docs. After this, you’ll need to log in with the CLI by running

az login

This should open a web page in your default browser where you’ll need to log in with your credentials. You will need an Azure subscription for this. Similarly, for Terraform you can follow the docs to get up and running.
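If your account has access to more than one subscription, make sure the CLI points at the one you intend to use before going further (the subscription name below is just a placeholder):

$ az account show --query name -o tsv
$ az account set --subscription "My CKAN Subscription"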

Getting ready for Terraform

To begin setting up everything with Terraform, you’ll need a place for Terraform to store its state. For this you’ll need to:

  • Create a resource group under which you’ll store the different Terraform-related resources
$ az group create -n RG-CompanyName-Terraform -l northeurope

  • Create a storage account where you can create the storage container needed for the Terraform state (note that storage account names must be 3–24 lowercase letters and digits, with no hyphens)
$ az storage account create -n companynametfbackend -g RG-CompanyName-Terraform -l northeurope

  • Create a storage container
$ az storage container create -n tfstate --account-name companynametfbackend

  • Create an Azure Key Vault where we’ll keep different access secrets
$ az keyvault create -n companyname-keyvault -g RG-CompanyName-Terraform -l northeurope

  • Create a Shared Access Signature (SAS) token and store it in the key vault
$ az storage container generate-sas --account-name companynametfbackend --expiry 2023-01-01 --name tfstate --permissions dlrw -o json | xargs az keyvault secret set --vault-name companyname-keyvault --name TerraformSASToken --value

  • Create a Service Principal, which will allow the AKS cluster to access the different Azure services that you create, and an SSH key (both are shown below).

Keep in mind that I’m showing how to create the SSH key on a Linux machine.

# creating a Service Principal for AKS and Azure DevOps
$ az ad sp create-for-rbac -n "AksTerraformSPN"

# save the tenant, appId and password from the output and use them
# to add the secrets to the key vault below

# creating an ssh key if you don't already have one
$ ssh-keygen -f ~/.ssh/id_rsa_terraform

# store the public key in Azure KeyVault
$ az keyvault secret set --vault-name companyname-keyvault --name LinuxSSHPubKey -f ~/.ssh/id_rsa_terraform.pub > /dev/null

# store the service principal id in Azure KeyVault
$ az keyvault secret set --name spn-id --value "SPN_APP_ID" --vault-name companyname-keyvault

# store the service principal secret in Azure KeyVault
$ az keyvault secret set --vault-name companyname-keyvault --name spn-secret --value "SPN_PASSWORD"
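Before moving on, it’s worth sanity-checking that the secrets actually landed in the vault. A quick check using the names from above:

$ az keyvault secret show --vault-name companyname-keyvault --name spn-id --query value -o tsv
$ az keyvault secret show --vault-name companyname-keyvault --name LinuxSSHPubKey --query value -o tsv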

The IaC repo

At this point, you’ll want to create a separate directory for the code that we’ll create. It’s up to you whether this will be a version-controlled repo or just a local folder.

mkdir -p azure-ckan-terraform/environment
cd azure-ckan-terraform/environment
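For orientation, by the end of this section the directory will contain the following files, which we’ll create one by one:

environment/
├── provider.tf
├── vars.tf
├── keyvault_data.tf
├── vn.tf
├── aks.tf
├── psql.tf
└── redis.tf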

Once inside this directory, create a provider.tf file and add this to it:

provider "azurerm" {
  version = "=2.47.0"
  features {}
}
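One thing to note: the terraform init command we’ll run later passes -backend-config values, which only works if an azurerm backend is declared somewhere in the configuration. Add this minimal block as well, either in provider.tf or in a separate backend.tf:

terraform {
  backend "azurerm" {}
}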

From here on, pay attention to variable names, resource names and resource sizes, and change them according to your needs (organization name, budget, etc.). Create a vars.tf file and add this inside:

# KeyVault resource group and key vault name
variable "keyvault_rg" {
  type    = string
  default = "RG-CompanyName-Terraform"
}

variable "keyvault_name" {
  type    = string
  default = "companyname-keyvault"
}

variable "azure_region" {
  type    = string
  default = "northeurope"
}

# resource group for the CKAN/AKS resources (referenced in vn.tf below;
# name it to suit your organization)
variable "resource_group" {
  type    = string
  default = "RG-CompanyName-CKAN"
}

variable "aks_vnet_name" {
  type    = string
  default = "aksvnet"
}

Because we need to access the data we saved in the key vault, we’ll also need to create a keyvault_data.tf file and add this to it:

data "azurerm_key_vault" "terraform_vault" {
  name                = var.keyvault_name
  resource_group_name = var.keyvault_rg
}

data "azurerm_key_vault_secret" "ssh_public_key" {
  name         = "LinuxSSHPubKey"
  key_vault_id = data.azurerm_key_vault.terraform_vault.id
}

data "azurerm_key_vault_secret" "spn_id" {
  name         = "spn-id"
  key_vault_id = data.azurerm_key_vault.terraform_vault.id
}

data "azurerm_key_vault_secret" "spn_secret" {
  name         = "spn-secret"
  key_vault_id = data.azurerm_key_vault.terraform_vault.id
}
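The same pattern works for any other secret we stored earlier. For example, if you ever want the SAS token available inside Terraform as well, you could add this (optional, not needed for anything below):

data "azurerm_key_vault_secret" "terraform_sas_token" {
  name         = "TerraformSASToken"
  key_vault_id = data.azurerm_key_vault.terraform_vault.id
}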

Now we’ll start setting up the required resources for our CKAN deployment. First, we need an Azure Virtual Network and a subnet. Create a file called vn.tf and add this to it:

resource "azurerm_resource_group" "aks_companyname_rg" {
  name     = var.resource_group
  location = var.azure_region
}

resource "azurerm_virtual_network" "aks_vnet" {
  name                = var.aks_vnet_name
  resource_group_name = azurerm_resource_group.aks_companyname_rg.name
  location            = azurerm_resource_group.aks_companyname_rg.location
  address_space       = ["10.0.0.0/12"]
}

resource "azurerm_subnet" "aks_subnet" {
  name                 = "aks_subnet"
  resource_group_name  = azurerm_resource_group.aks_companyname_rg.name
  virtual_network_name = azurerm_virtual_network.aks_vnet.name
  address_prefixes     = ["10.1.0.0/16"]

  enforce_private_link_endpoint_network_policies = true

  service_endpoints = ["Microsoft.Sql"]
}

This creates a resource group, which does exactly what its name says. After this, we create a Virtual Network in that resource group, as well as a subnet inside the virtual network we just created. The important thing to note here is the service endpoint: it is needed so that resources in the subnet can access the PostgreSQL database service that we will create a bit later. Next, we’ll create the AKS instance in the same resource group as before. Create an aks.tf file and add this to it:

resource "azurerm_kubernetes_cluster" "aks_cluster01" {
  name                = "companynameAKS"
  location            = azurerm_resource_group.aks_companyname_rg.location
  resource_group_name = azurerm_resource_group.aks_companyname_rg.name
  dns_prefix          = "AKSTerraform"
  kubernetes_version  = "1.18.14"

  default_node_pool {
    name            = "ckanaks"
    node_count      = 1
    vm_size         = "Standard_D2_v4"
    os_disk_size_gb = 30
    type            = "VirtualMachineScaleSets"
    vnet_subnet_id  = azurerm_subnet.aks_subnet.id
  }

  linux_profile {
    admin_username = "aksadmin"
    ssh_key {
      key_data = data.azurerm_key_vault_secret.ssh_public_key.value
    }
  }

  network_profile {
    network_plugin     = "azure"
    network_policy     = "azure"       # options are calico or azure - only if network plugin is set to azure
    dns_service_ip     = "172.16.0.10" # required when network plugin is set to azure; must be within service_cidr and above .1
    docker_bridge_cidr = "172.17.0.1/16"
    service_cidr       = "172.16.0.0/16" # must not overlap any address in the VNet
  }

  role_based_access_control {
    enabled = true
  }

  service_principal {
    client_id     = data.azurerm_key_vault_secret.spn_id.value
    client_secret = data.azurerm_key_vault_secret.spn_secret.value
  }

  tags = {
    Environment = "CKAN"
  }
}
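If you’d like Terraform itself to expose the cluster’s kubeconfig (optional; later on we’ll fetch credentials with az aks get-credentials instead), a minimal output would look like this:

output "kube_config" {
  value     = azurerm_kubernetes_cluster.aks_cluster01.kube_config_raw
  sensitive = true
}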

With this, we’ll have an AKS cluster, so we only have the PostgreSQL and Redis services left. Create a psql.tf file and add this to it:

resource "azurerm_resource_group" "az_psql_rg" {
  name     = "companyname-ckan-postgresql-rg"
  location = var.azure_region
}

resource "azurerm_postgresql_server" "companyname_ckan_psql" {
  name                = "ckan-psql-server"
  location            = azurerm_resource_group.az_psql_rg.location
  resource_group_name = azurerm_resource_group.az_psql_rg.name

  administrator_login          = "psqladminun"
  administrator_login_password = "ChangeMe-Sup3rSecret!" # placeholder - use your own strong password, ideally kept in the key vault

  sku_name   = "GP_Gen5_2" #"GP_Gen5_4"
  version    = "11"
  storage_mb = 5120 #640000

  # backup_retention_days        = 7
  geo_redundant_backup_enabled = false #true
  auto_grow_enabled            = false #true

  public_network_access_enabled = true
  ssl_enforcement_enabled       = false
  # ssl_minimal_tls_version_enforced = "TLS1_2"

  tags = {
    Environment = "CKAN"
  }
}

resource "azurerm_private_endpoint" "psql" {
  name                = "ckan-psql-private-endpoint"
  location            = azurerm_resource_group.az_psql_rg.location
  resource_group_name = azurerm_resource_group.az_psql_rg.name
  subnet_id           = azurerm_subnet.aks_subnet.id

  private_service_connection {
    name                           = "ckan-psql-privateserviceconnection"
    private_connection_resource_id = azurerm_postgresql_server.companyname_ckan_psql.id
    subresource_names              = ["postgresqlServer"]
    is_manual_connection           = false
  }
}

resource "azurerm_postgresql_virtual_network_rule" "psql_vnet_rule" {
  name                                 = "postgresql-vnet-rule"
  resource_group_name                  = azurerm_resource_group.az_psql_rg.name
  server_name                          = azurerm_postgresql_server.companyname_ckan_psql.name
  subnet_id                            = azurerm_subnet.aks_subnet.id
  ignore_missing_vnet_service_endpoint = true
}

This one creates a few other things apart from the PSQL service. Nothing too complicated, though: the virtual network rule and the private endpoint are needed to allow resources on the subnet to access the PSQL service, which is why we added the service endpoint to the subnet earlier. We only have the Redis service left. Create a redis.tf file and add this to it:

resource "azurerm_resource_group" "companyname_redis_rg" {
  name     = "companyname-ckan-redis-rg"
  location = var.azure_region
}

resource "azurerm_redis_cache" "companyname_ckan_redis" {
  name                = "ckan-redis"
  location            = azurerm_resource_group.companyname_redis_rg.location
  resource_group_name = azurerm_resource_group.companyname_redis_rg.name
  capacity            = 0
  family              = "C"
  sku_name            = "Basic"
  enable_non_ssl_port = false
  minimum_tls_version = "1.2"

  tags = {
    Environment = "CKAN"
  }

  redis_configuration {
  }
}
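Since the question of where to find the connection details comes up often (see the replies below), you can optionally collect the values CKAN will need in an outputs.tf. A minimal sketch, assuming the resource names used above:

# hostname of the managed PostgreSQL server
output "psql_fqdn" {
  value = azurerm_postgresql_server.companyname_ckan_psql.fqdn
}

# hostname and primary access key of the Redis cache
output "redis_hostname" {
  value = azurerm_redis_cache.companyname_ckan_redis.hostname
}

output "redis_primary_key" {
  value     = azurerm_redis_cache.companyname_ckan_redis.primary_access_key
  sensitive = true
}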

With this, we have everything we need for CKAN to run on Kubernetes on Azure. To sum up: we requested an Azure Virtual Network where we’ll create a subnet, in which we’ll place our Azure Kubernetes Service (AKS) cluster. We’ll also create a PostgreSQL and a Redis server (both managed services on Azure).

If you want to further explore how to leverage Terraform for Azure management, you can go through the azurerm provider docs here.

Creating the Infrastructure

Before we create the infrastructure, we have to initialize Terraform, its providers and the backend it uses. To do this, run:

terraform init \
    -backend-config="resource_group_name=RG-CompanyName-Terraform" \
    -backend-config="storage_account_name=companynametfbackend" \
    -backend-config="container_name=tfstate" \
    -backend-config="key=ckan.terraform.tfstate"
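The backend also needs credentials to reach the storage container. One option is to hand it the SAS token we stored in the key vault earlier, which the azurerm backend picks up from the ARM_SAS_TOKEN environment variable (a sketch, assuming the names used above):

$ export ARM_SAS_TOKEN=$(az keyvault secret show --vault-name companyname-keyvault --name TerraformSASToken --query value -o tsv)

Run this in the same shell before terraform init and terraform apply.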

This will set up Terraform to keep its state in the storage container we created earlier. It’s important to do this rather than keep the state locally. After this, you can run terraform apply in the same directory. Terraform will generate a plan to create:

  • Azure Virtual Network with a subnet
  • AKS cluster on the created subnet
  • Azure PostgreSQL managed database server accessible by the resources on the subnet
  • Azure Cache for Redis managed server

After you review the output and check that everything is as you want it, type yes in the terminal and hit enter. Depending on the types and sizes of the machines you set for each resource, the creation time will vary, but expect it to take several minutes (the Redis cache in particular can take a while to provision). When everything completes successfully, you’ll need to set your kubectl context to point to your new cluster; note that the resource group in the command is the AKS resource group we defined in var.resource_group, not the Terraform one. To do this, run:
az aks get-credentials --resource-group RG-CompanyName-CKAN --name companynameAKS
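A quick way to confirm that kubectl now talks to the new cluster:

$ kubectl get nodes

You should see the single node from the default node pool in the Ready state.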

With this, you’ll be ready to deploy CKAN on the Kubernetes cluster. For this you can follow our guide here; just remember to change the values for the PostgreSQL and Redis services to use the newly created ones in Azure. You can get these either from the CLI or from the Azure Portal, under the Access keys sub-menu for the specific service.
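If you prefer the CLI, these commands (using the resource names from the Terraform code above) pull out the hostnames and keys you’ll need:

# PostgreSQL server hostname
$ az postgres server show -n ckan-psql-server -g companyname-ckan-postgresql-rg --query fullyQualifiedDomainName -o tsv

# Redis hostname and primary access key
$ az redis show -n ckan-redis -g companyname-ckan-redis-rg --query hostName -o tsv
$ az redis list-keys -n ckan-redis -g companyname-ckan-redis-rg --query primaryKey -o tsv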

12 Replies to “Creating infrastructure for CKAN on Azure using Terraform”

    1. Thanks for catching this and sorry for the inconvenience. I just updated the post with the info. Search for keyvault_data.tf and check the update.

  1. Which values in “just remember to change the values for the PostgreSQL and Redis services to use the newly created ones in Azure” should we change?

    1. Specifically the connection strings for your Redis and Postgres servers that you created with the Terraform code.

        1. No, you will have to check the connection string in your Azure Dev Portal under the specific resource type.

  2. Hey, I am trying to create an AKS cluster using Terraform and I am implementing the same code as yours. This block of code is giving me an error:

    code –

    linux_profile {
      admin_username = var.vm_user_name
      ssh_key {
        key_data = data.azurerm_key_vault_secret.ssh_public_key.value
      }
    }

    error –

    │ Error: creating Managed Kubernetes Cluster "aks-cluster1" (Resource Group "ct-entry-platform-poc"): containerservice.ManagedClustersClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="InvalidParameter" Message="The value of parameter linuxProfile.ssh.publicKeys.keyData is invalid. Please see https://aka.ms/aks-naming-rules for more details." Target="linuxProfile.ssh.publicKeys.keyData"

    │ with azurerm_kubernetes_cluster.k8s,
    │ on resources.tf line 164, in resource "azurerm_kubernetes_cluster" "k8s":
    │ 164: resource "azurerm_kubernetes_cluster" "k8s" {

    Can you please help me?

    1. Hi!

      From the error message I’m seeing that you have an issue with the ssh public key data that is stored in the keyvault. You need to check if you have created the data in your keyvault successfully and if it is your actual public key.

      Also check whether you’re running into any naming issues, as the error message suggests. Keep in mind that many Azure resource names are global, so you really have to pay attention to keeping them unique.

  3. Hi, a question about this step: “Create a Shared Access Signature token and store it in the key vault”. What happens when the expiry date passes? Will we not be able to use the CKAN service anymore? Thanks

    1. No, it’s just that Terraform won’t have access to its shared state, and you will have to renew the token.

  4. Hi, I have a bit of a question: where does CKAN save the dataset files being uploaded? I assume it’s in PSQL, but I’m still not sure how CKAN handles it.

    1. No, PSQL is used only for keeping track of all of the uploaded resources. The physical storage (the CKAN FileStore) is a PVC inside the Kubernetes cluster in this example. There are also options for using cloud storage like Azure Blob Storage, S3 on AWS and so on; these are separate extensions for CKAN that you can look into.
