Public clouds are sexy these days. If you work as a developer, you have no doubt had, or soon will have, the opportunity to work with them. The technology is developing rapidly, the number of clients is growing, and apps require more computing power every day. In the past, you needed a huge budget (for starters) just to plan the production deployment of an app. But how about nowadays?
You can do this from the comfort of your sofa within a few hours using a web browser. But is a web browser the right tool for this? Could the process be more standardized and automated? Yes, of course! And with that, I would like to move smoothly into an explanation of an approach called “Infrastructure as Code” (IaC).
What is IaC?
In simple terms, it’s code that describes the configuration of all the services our app needs in order to run. It acts as a kind of contract which ensures that what is defined in the code is also what gets deployed. Among other things, IaC tries to solve the lack of a standardized structure that would make it easy to deploy several identical (twin) environments. It does so through a standardized way of describing the environment and by making sure that our expectations (as defined in the infrastructure schema) correspond to the actual state of things.
IaC assumes that our entire infrastructure is defined as code, but it does not prescribe how that description should look. In particular, neither the language nor the technology is fixed.
Creating a Kubernetes cluster on MS Azure using Terraform
Follow the link for code: https://pastebin.com/80wnAz4Y
Nowadays there are solutions for most popular programming languages (e.g. TypeScript / JavaScript, Python, Go or C#), so app developers can read the stack definition behind their apps without any problems.
There are numerous technologies for describing infrastructure schemas available, but I would like to divide them into two main categories.
I call the first category dedicated technologies, as these are services that particular providers offer on their own platforms. Each provider is matched with its solution below (I list the 3 most popular public cloud operators).
- Amazon Web Services – CloudFormation – Schema formats supported: yaml and json,
- Microsoft Azure – Azure Resource Manager (ARM) – Schema format supported: json,
- Google Cloud Platform – Google Cloud Deployment Manager – Schema formats supported: yaml, py and jinja.
The second category is community-developed technologies, for example (the 2 most popular are listed below):
- Pulumi – Languages supported: TypeScript / JavaScript, Python, Go and C#,
- Terraform – Language supported: HCL.
The operating principle of all the technologies presented in this article is quite simple – each process consists of the following steps, among others:
- Checking the correctness of the schemas.
- Retrieving the initial / actual state of the pre-existing cloud resources.
- Building the expected state on the basis of the described resources.
- Comparing the initial state with the expected state and applying changes so that the expectations match reality.
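The steps above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how any particular tool works: the resource dictionaries and the names `validate`, `plan`, `apply_plan` and `Plan` are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    create: dict   # resources present only in the expected state
    update: dict   # resources present in both states but with different specs
    delete: list   # names of resources present only in the actual state


def validate(schema):
    """Step 1: reject schemas in which a resource has no 'type' field."""
    for name, spec in schema.items():
        if "type" not in spec:
            raise ValueError(f"resource {name!r} has no type")


def plan(actual, expected):
    """Step 4, first half: compare the actual state with the expected state."""
    create = {n: s for n, s in expected.items() if n not in actual}
    update = {n: s for n, s in expected.items() if n in actual and actual[n] != s}
    delete = sorted(n for n in actual if n not in expected)
    return Plan(create, update, delete)


def apply_plan(actual, p):
    """Step 4, second half: return the state after carrying out the plan."""
    new_state = {n: s for n, s in actual.items() if n not in p.delete}
    new_state.update(p.create)
    new_state.update(p.update)
    return new_state


# Step 2: the actual state, as if it had been retrieved from the cloud.
actual = {
    "dns": {"type": "dns_zone", "name": "example.com"},
    "old_vm": {"type": "vm", "size": "small"},
}
# Step 3: the expected state, built from the described resources.
expected = {
    "dns": {"type": "dns_zone", "name": "example.com"},
    "cluster": {"type": "kubernetes", "nodes": 3},
}

validate(expected)
changes = plan(actual, expected)
result = apply_plan(actual, changes)
```

Here the plan creates `cluster`, deletes `old_vm` and leaves `dns` alone, so after applying it the actual state equals the expected one.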
Generating both states has several advantages:
- If an error occurs during deployment, it’s possible to roll back to the last working stack.
- Once the described infrastructure has been deployed, we can delete/destroy it quite efficiently, because we know precisely which resources were created and what identifiers they have. This is especially important when, for whatever reason, we wish to quickly delete all the resources of a given environment.
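Both advantages can be shown with a small, purely hypothetical in-memory model. The `Stack` class, its methods and the `fail_on` parameter are invented for this sketch, and real tools handle rollback and deletion in their own ways.

```python
import copy


class Stack:
    """Hypothetical stack that remembers every resource it created."""

    def __init__(self):
        self.resources = {}  # identifier -> spec, kept in creation order

    def deploy(self, desired, fail_on=None):
        """Create resources one by one; restore the previous state on error.

        `fail_on` simulates a provider error for the given identifier.
        """
        snapshot = copy.deepcopy(self.resources)
        try:
            for identifier, spec in desired.items():
                if identifier == fail_on:
                    raise RuntimeError(f"provider rejected {identifier}")
                self.resources[identifier] = spec
        except RuntimeError:
            self.resources = snapshot  # roll back to the last working stack
            return False
        return True

    def destroy(self):
        """Delete every known resource, newest first, and report the order."""
        destroyed = list(reversed(self.resources))
        self.resources.clear()
        return destroyed


stack = Stack()
stack.deploy({"network": {"cidr": "10.0.0.0/16"}, "cluster": {"nodes": 3}})

# The second deployment fails halfway, so "database" must not survive.
ok = stack.deploy({"database": {"tier": "small"}, "broken": {}}, fail_on="broken")
after_failure = list(stack.resources)

destroyed = stack.destroy()
```

After the failed deployment the stack still holds only `network` and `cluster`, and `destroy()` removes them in reverse creation order because the identifiers were recorded when the resources were created.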
Using IaC, regardless of the technology, lets us extend the entire process with the available linters and formatters. On top of that, any modification to existing schemas can be checked by team members or infrastructure specialists in code review. The code describing the stack (if it follows good quality practices) is also an excellent form of documentation and should make onboarding new employees much more efficient. By adding automated tests, we can significantly reduce the risk of making a mistake.
The aforementioned community technologies use either general-purpose programming languages or languages designed for describing infrastructure, so we get a standard syntax and a wide range of integrations. They are an ideal choice if our infrastructure is distributed and requires configuring a number of services from different providers in order to function properly. For example: the architecture of our app is based on Kubernetes (MS Azure AKS), the database is delivered using MongoDB Atlas, and the domain is managed in Cloudflare. By choosing Pulumi or Terraform, we can automate creating the MongoDB database and the Kubernetes cluster on MS Azure, connecting to the cluster and creating deployments, configuring Ingress, and finally pointing the corresponding DNS record on the Cloudflare platform at the address obtained in the cluster creation process.
It is worth remembering…
that the state synchronization process can be quite problematic when we modify the parameters of the described infrastructure resources dynamically from our target app (or manually through the browser), for example when a DNS zone is created in the schemas and specific records are added at a later stage. In this case, when we manually add a new record, there is a risk that this record will be deleted the next time we deploy our stack. This is because, when the IaC states are synchronized, the tool notices that the configuration of a given resource (in this case DNS) differs from the one described in the schemas. The given parameter is then changed, and if it cannot be changed in place, the old resource is destroyed and a new one is created with the correct value, which causes downtime of the app. So any modification of the existing infrastructure schemas should be carefully thought through and analyzed in this context.
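To make the risk more tangible, here is a minimal sketch of the DNS scenario. Everything in it is an assumption for illustration: the record names, the `IMMUTABLE` set (fields we pretend cannot be changed in place) and the helper functions do not come from any real tool.

```python
# Assumption: fields that the imaginary provider cannot change in place.
IMMUTABLE = {"region"}


def reconcile_records(declared, live):
    """Return (to_delete, to_create) so that live records match the schema."""
    to_delete = sorted(set(live) - set(declared))
    to_create = sorted(set(declared) - set(live))
    return to_delete, to_create


def change_kind(live_spec, declared_spec):
    """Classify a difference as 'no-op', an in-place 'update' or a 'replace'."""
    changed = {k for k in declared_spec if live_spec.get(k) != declared_spec[k]}
    if not changed:
        return "no-op"
    return "replace" if changed & IMMUTABLE else "update"


declared = {"www.example.com", "api.example.com"}
# Someone added promo.example.com by hand, outside of the schemas.
live = {"www.example.com", "api.example.com", "promo.example.com"}

to_delete, to_create = reconcile_records(declared, live)

scale = change_kind({"region": "westeurope", "nodes": 3},
                    {"region": "westeurope", "nodes": 5})
move = change_kind({"region": "westeurope", "nodes": 3},
                   {"region": "northeurope", "nodes": 3})
```

The manually added `promo.example.com` ends up in `to_delete`, changing the node count is an in-place `update`, and changing the (assumed immutable) region is a `replace`, which is the case that causes downtime.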
The use of formats like YAML and JSON limits our possibilities, because there is often a need to refer to other resources (their identifiers) or to perform string operations such as concatenation. These problems forced the publishers of the available solutions to add extensions that make the formats more expressive. And that’s where the problem arises, as these “extensions” vary significantly between platforms and complicate development as well.
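For comparison, in a general-purpose language both needs are covered by ordinary language features. The sketch below uses plain dataclasses as stand-ins for real resources; `Cluster`, `DnsRecord`, the names and the IP address are all made up, and this is not the API of Pulumi or any other tool.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    ip_address: str


@dataclass
class DnsRecord:
    name: str
    value: str


env = "staging"

# String operation: the resource name is built with a regular f-string.
cluster = Cluster(name=f"app-{env}-aks", ip_address="203.0.113.10")

# Reference to another resource: an ordinary attribute access.
record = DnsRecord(name=f"{env}.example.com", value=cluster.ip_address)
```

No template functions or special reference syntax are needed here, which is exactly what YAML and JSON have to add through extensions.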
When we use the infrastructure description technology made available by a provider, the entire process of interacting with state files is invisible to us, because it takes place in a specially adapted data store which we usually have no access to. With solutions other than the dedicated ones, state storage becomes a real concern. It is not a problem for one person, because both states (real and expected) can be kept on our own disk (after all, no one else will overwrite them), but it is far from trivial when several people are working on the infrastructure. Luckily, there are ready-made solutions at hand. Most of the abovementioned technologies also come with a hosted data store, which simplifies the entire process, although it requires additional configuration. If this is not suitable, we can integrate with any service that lets us store and update our state files (e.g. AWS S3 or Azure Blob Storage).
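Why is a shared store more than just a shared folder? One possible answer is that it can refuse a write based on an outdated copy of the state. The sketch below shows that idea with a simple version counter. `SharedStateStore`, `StaleStateError` and the `serial` field are hypothetical names for this example, and real backends may protect the state differently (for instance with locks).

```python
class StaleStateError(Exception):
    """Raised when a write is based on an outdated copy of the state."""


class SharedStateStore:
    """Hypothetical shared store that versions every write."""

    def __init__(self):
        self.serial = 0
        self.state = {}

    def read(self):
        """Return the current version together with a copy of the state."""
        return self.serial, dict(self.state)

    def write(self, based_on_serial, new_state):
        """Accept the write only if nobody else has written in the meantime."""
        if based_on_serial != self.serial:
            raise StaleStateError(
                f"state is at serial {self.serial}, not {based_on_serial}"
            )
        self.state = dict(new_state)
        self.serial += 1


store = SharedStateStore()

# Two people read the same version of the state.
alice_serial, alice_state = store.read()
bob_serial, bob_state = store.read()

alice_state["cluster"] = "aks-1"
store.write(alice_serial, alice_state)  # accepted, serial becomes 1

bob_state["dns"] = "example.com"
try:
    store.write(bob_serial, bob_state)  # based on serial 0, so it is rejected
    bob_rejected = False
except StaleStateError:
    bob_rejected = True
```

The second write is rejected instead of silently overwriting the first one, so the second person has to read the fresh state and plan again.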
Sign up for a meetup:
I hope that I have succeeded in introducing the concept behind IaC, delving into its true advantages and disadvantages, and showing that there is nothing too scary about it. If you want to learn more about IaC, follow the link or click the picture below and sign up for an upcoming meetup at our office in Katowice. Ah, and before I forget, good luck persuading your bosses / CTOs to use IaC!