Over the past four articles, you've seen how to distribute a Docker application across multiple nodes and load balance it. This was performed using Docker Swarm and Kubernetes. These techniques are powerful and increase the availability and fault tolerance of your applications. However, up until now, none of the examples explained managing state.
In this article, you'll take a look at how state can be handled with Kubernetes as well as a little theory for tackling bigger and more complex applications.
What Is State?
In case you're new to this, it helps to identify what state actually is.
When creating the examples in the previous articles, each of the nodes was agnostic of both user- and application-based information. The containers created were simply processing handlers. Information went in one end and came out transformed; a simple request -> response transaction.
In the wild, very few applications in their entirety function in this way. While we endeavor to make microservices as stateless as possible, an application on the whole has many elements of state, such as database files, uploaded user assets, user session files, etc.
When dealing with state in a microservice application, state should be reduced to as few services as possible. If multiple services utilize the same data, then that data should be extracted to a single independent service, allowing other interested services access to the data.
Categorizing State
While state can be categorized into multiple types, such as database data, static assets, user assets, log files, user session files, etc., it helps instead to consider the state in terms of availability. At its root, state is usually either remote or local data and exists either ephemerally or physically (in memory or on disk). By grouping data into as few categories as possible, you are then able to reduce the necessary resources required to serve this information.
For instance, providing a stored file resource for logs, user assets and static assets means requiring fewer marshalling services (those services in charge of managing such files) as well as fewer attached storage mediums when spread across a cluster of nodes.
The caveat of this approach is with regard to black-box data. Databases, for instance, will typically run independently of other services in larger applications and, as they will require direct management of their files, will likely warrant an independent file storage medium.
Ultimately, the strategy you leverage will be unique to your application's requirements. Since application state requirements can vary wildly, this will be an important consideration at every stage of your development.
Managing State: an Example
Any physical state in your application will require file storage space. With Docker, this means using volumes.
Kubernetes provides a wealth of functionality for attaching and managing persistent storage using the PersistentVolume and PersistentVolumeClaim resources, which you'll see shortly. The majority of this functionality is oriented around third-party storage mediums, such as GlusterFS, CephFS, Amazon Web Services Elastic Block Storage, Google Compute Engine's Persistent Disk, and many others, most of which are remote storage options. A decent explanation of the supported mediums is provided in the Kubernetes documentation.
For Kubernetes applications, these are considered the go-to options, as they increase the reliability of your application should any nodes become unstable.
Despite this, however, storing data simply and locally is still an important requirement, and one that is surprisingly poorly documented on the web. As such, the example detailed in this article will demonstrate just that.
A note about PersistentVolume and PersistentVolumeClaim
Kubernetes is designed to abstract pods (those services performing functionality within your cluster) and the orchestration of those pods. The same is true of any and all storage or volumes.
For persistent volumes, we have PersistentVolume and PersistentVolumeClaim; the former is the representation of your physical storage volume and the latter is the association assigned to any given pod type. I like to think of them as velcro! You attach the spiky strip (PersistentVolume) to the volume and the fluffy strip (PersistentVolumeClaim) to your pods. This way, pods can be attached and detached as needed.
A PersistentVolumeClaim can be assigned to a single pod type only, and with local storage, a given volume can be associated with pods on the same node only. This means that with storage that can only be attached to a single node, those pods that need to access the storage medium must also be assigned to that node.
Storage such as AWS Elastic File Storage, which can be attached to multiple nodes, allows pods to exist on any of those nodes. You can also assign multiple single-node storage mediums if you need to distribute pods across multiple nodes, which is useful if you're hosting your own GlusterFS or CephFS cluster.
Getting started
This example continues from the previous article, so if you do not have your cluster running with Kubernetes installed, go ahead and do that now.
Ensure the services in the previous article are no longer running. To check, simply run the following:
$ kubectl get pods
If you receive the following error:
The connection to the server localhost:8080 was refused - did you specify the right host or port?
then make sure you switch to the kubeuser, using su - kubeuser, before executing any kubectl commands.
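In other words, something like this, assuming kubeuser is the user you created when setting up the cluster:

$ su - kubeuser
$ kubectl get pods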
You'll create new services for this tutorial and, while it may seem a little contrived, you will find yourself utilizing a similar approach for many applications in Kubernetes.
Attaching storage
Now that your nodes are ready to run services, you'll need to allocate some local storage. Civo provides block storage for this purpose. This storage type is a single-node-only medium, so it illustrates the purpose of this example nicely.
An excellent how-to for attaching the storage is presented on Civo's website, so I will not be reproducing that info. Go ahead and attach a 1GB block storage volume to kube-worker1 with the path /mnt/assets.
Attaching storage to the controller node is always a bad idea. Kubernetes prefers not to allocate any services to the controller node if possible. If you were to follow this tutorial and attempt to attach pods to storage on the controller node, the pods would remain in a pending state. Investigating the events of a stuck pod would show that the designated node has taints; this is because Kubernetes marks the controller node as NoSchedule.
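If you want to see this for yourself, the following commands will show the taint and the scheduling events; the exact taint reported depends on how your cluster was bootstrapped, and kube-controller here is assumed to be the name of your controller node:

$ kubectl describe node kube-controller | grep Taints
$ kubectl describe pod <pending-pod-name>

The Events section at the bottom of the pod description explains why the pod could not be scheduled.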
Now that your storage is attached, you'll need to place a file or two in there. For this demonstration, I added an image called cats.png. After all, who doesn't like cats? If you wish to do the same, simply run the following on kube-worker1:
$ wget http://pngimg.com/uploads/cat/cat_PNG133.png -O /mnt/assets/cats.png
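A quick check on kube-worker1 confirms the file is where you expect it:

$ ls -l /mnt/assets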
Backend service
With that out of the way, let's get down to building the services. As before, you'll use a PHP service to serve a file, but with a slight twist. The content of the served page will contain an added img tag, to display our cute little cats image.
<?php
  echo getHostName();
  echo "<br />";
  echo getHostByName(getHostName());
?>
<img src="assets/cats.png" />
And the Dockerfile:
FROM webgriffe/php-apache-base:5.5
COPY src/ /var/www/html/
I've already created Docker images for each of the services in this tutorial on Docker Hub, so I'll use those in the following YAML files, but you're free to upload your own if you wish.
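If you'd prefer to build and push your own backend image, the standard Docker workflow will do; the image name here is only an example, so substitute your own Docker Hub username and repository:

$ docker build -t <your-dockerhub-user>/php-backend .
$ docker push <your-dockerhub-user>/php-backend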
With the Docker image in Docker Hub, you can now create the service. Create a new file on kube-controller called backend.yaml with the following content:
apiVersion: v1
kind: Service
metadata:
  name: webserver
spec:
  selector:
    app: php-server
    srv: backend
  ports:
    - protocol: TCP
      port: 80
      targetPort: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webserver
spec:
  selector:
    matchLabels:
      app: php-server
      srv: backend
  replicas: 3
  template:
    metadata:
      labels:
        app: php-server
        srv: backend
    spec:
      containers:
        - name: php-server
          image: "leesylvester/codeship-p5-server"
          ports:
            - name: http
              containerPort: 80
This includes both the pod deployment definition and the service definition in a single file. You can then go ahead and create the service, using:
$ kubectl create -f backend.yaml
Now, check that the service is running:
$ kubectl get pods
NAME                         READY     STATUS    RESTARTS   AGE
webserver-5dc468f9bd-8kcf9   1/1       Running   0          1m
webserver-5dc468f9bd-fbz8c   1/1       Running   0          1m
webserver-5dc468f9bd-mw24f   1/1       Running   0          1m
Great!
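It's also worth confirming that the webserver service has found its pods; since the service and deployment share the same labels, the following should list three pod IPs under ENDPOINTS:

$ kubectl get endpoints webserver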
Assigning PersistentVolume and PersistentVolumeClaim
You now have a server for serving your PHP files, but before you can add one for the static assets, you'll first need to inform Kubernetes of the storage medium.
As stated previously, this is done using the PersistentVolume resource. For the block storage you have attached to kube-worker1, your YAML file will look like this:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: bls-pv
  labels:
    name: assets
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  storageClassName: local-storage
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
  local:
    path: /mnt/assets
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - kube-worker1
This file contains a number of new properties, but each is relatively easy to explain:

capacity: provides a means to tell Kubernetes how much space the storage has. This is important, because if a claim requests storage that is greater than this, Kubernetes will continue searching through any other attached PersistentVolume definitions until a suitable size is found.

accessModes: is a little misleading. One would assume this enforces how the storage can be accessed, but it actually acts as a simple guide label for the benefit of storage service providers. Possible options are ReadWriteOnce, ReadWriteMany, and ReadOnlyMany.

storageClassName: identifies the class of storage; a claim will only bind to a volume with a matching class name. Here, local-storage is used for, well, local storage.

persistentVolumeReclaimPolicy: dictates whether the data stored in the storage medium should be kept when its attached claim has been deleted. Options include Retain, Recycle, and Delete.

volumeMode: should be set to Filesystem for a formatted volume mounted at a path, as is the case here, or Block for a raw block device.

local.path: provides the path to the storage directory or mount.

nodeAffinity: provides the node selection requirements for the volume; pods that use the volume will be scheduled onto a matching node.
The PersistentVolume definition itself is applied cluster-wide; it is the nodeAffinity requirement that ties it, and any pods that use it, to particular nodes. Therefore, if you wished to have this apply to pods on multiple nodes, the defined storage would be expected to exist on each of those nodes with the same configuration.
Copy the above YAML into a file called persistent_volume.yaml and execute it on kube-controller with:
$ kubectl create -f persistent_volume.yaml
Next, you need to partner it with a claim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bls-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: local-storage
  resources:
    requests:
      storage: 1Gi
  selector:
    matchLabels:
      name: assets
This definition matches the volume definition with the same accessModes and a storage request that is equal to or less than the volume's capacity value.
The selector property pairs the claim with the PersistentVolume labeled name: assets, so the claim binds to the volume you just created; the upcoming service will then reference the claim by name.
Once more, copy the definition to a file called persistent_volume_claim.yaml and execute with:
$ kubectl create -f persistent_volume_claim.yaml
You can check if this worked by executing the following:
$ kubectl get pv
NAME     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS    REASON   AGE
bls-pv   1Gi        RWX            Retain           Bound    default/bls-pvc   local-storage            1m
Labeling your storage
Kubernetes is unaware of the hardware capabilities of each node, at least as far as additional storage is concerned. As such, you'll need to manually mark the node that has the storage attached. You do this with labels.
Labels can be anything you choose and can be associated with any resource you choose. In the case of the backend service, you labeled it with the keys app and srv and the values php-server and backend respectively.
Labeling nodes works in the same way. However, in this instance, you'll assign the label directly, without the use of a YAML file, with the following command:
$ kubectl label node kube-worker1 node_type=storage
node "kube-worker1" labeled
With that done, you can then assign your asset server pod directly to the node with the storage, without needing to identify the node by IP or other such specific notation in the pod definition.
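If you want to double-check that the label is in place, you can filter the node list by it:

$ kubectl get nodes -l node_type=storage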
The asset service
With your persistent volume in place, it's now time to create the asset service. This service will be a simple NGINX service that serves any files in a given directory. The nginx.conf file for this server will look like this:
worker_processes 2;

events {
  worker_connections 1024;
}

http {
  server {
    root /var/www/html;
    listen 80;

    location / {
    }
  }
}
and the Dockerfile:
FROM nginx
COPY nginx.conf /etc/nginx/nginx.conf
Super simple!
You'll then pair this with the PersistentVolumeClaim in its YAML file:
apiVersion: v1
kind: Service
metadata:
  name: assetserver
spec:
  selector:
    app: asset-server
    srv: backend
  ports:
    - protocol: TCP
      port: 80
      targetPort: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: assetserver
spec:
  selector:
    matchLabels:
      app: asset-server
      srv: backend
  template:
    metadata:
      labels:
        app: asset-server
        srv: backend
    spec:
      volumes:
        - name: assetvol
          persistentVolumeClaim:
            claimName: bls-pvc
      nodeSelector:
        node_type: storage
      containers:
        - name: asset-server
          image: "leesylvester/codeship-p5-asset"
          volumeMounts:
            - name: assetvol
              mountPath: /var/www/html/assets
          ports:
            - name: http
              containerPort: 80
This file should look similar to the backend.yaml definition. However, it also supplies a volumes segment, which associates it with the PersistentVolumeClaim. The definition then references this volume in the container's volumeMounts, which is supplied to the associated Docker container as a Docker volume.
The mountPath extends the root parameter in the nginx.conf file. This means that the block storage will be mapped to ./assets relative to the NGINX root.
Also, note the nodeSelector option. This binds the pod to only those nodes that have the label node_type with the value storage, which we created earlier for kube-worker1. This pod will not, therefore, be assigned to any other node!
Go ahead and copy the above definition to asset_backend.yaml and execute with:
$ kubectl create -f asset_backend.yaml
Now, check that the pod is running:
$ kubectl get pods
NAME                         READY     STATUS    RESTARTS   AGE
assetserver-8746b67d-rlcxh   1/1       Running   0          1m
webserver-5dc468f9bd-8kcf9   1/1       Running   0          3m
webserver-5dc468f9bd-fbz8c   1/1       Running   0          3m
webserver-5dc468f9bd-mw24f   1/1       Running   0          3m
As you can see, only one asset pod has been deployed, which will be running on the node with the block storage.
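If you want to see the placement for yourself, ask kubectl for the wide output, which includes a NODE column; the assetserver pod should show kube-worker1:

$ kubectl get pods -o wide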
The load balancer
Finally, you will need the load balancer. As before, this will also be an NGINX service but will distribute to the appropriate backend service based on the supplied URL.
The nginx.conf will therefore contain:
worker_processes 2;

events {
  worker_connections 1024;
}

http {
  server {
    listen 80;

    location / {
      proxy_pass http://webserver;
      proxy_http_version 1.1;
    }

    location /assets/ {
      proxy_pass http://assetserver/assets/;
      proxy_http_version 1.1;
    }
  }
}
Here, the NGINX instance will route all requests to the PHP server, unless the URL path starts with /assets/, in which case the request will be routed to the asset server. The upstream names webserver and assetserver resolve through the cluster's DNS to the Kubernetes services you created earlier.
The Dockerfile for this instance will look like this:
FROM nginx
COPY nginx.conf /etc/nginx/nginx.conf
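The frontend's Kubernetes definition follows the same pattern as the backend; a minimal frontend.yaml could look something like this, with a NodePort service exposing port 30001 (the port you'll use to test the cluster shortly) in front of three replicas of the load balancer. The image name leesylvester/codeship-p5-frontend and the labels here are just placeholders; use whatever image you built and pushed for this service.

apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  type: NodePort
  selector:
    app: nginx-frontend
    srv: frontend
  ports:
    - protocol: TCP
      port: 80
      targetPort: http
      nodePort: 30001   # the port you'll browse to below
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  selector:
    matchLabels:
      app: nginx-frontend
      srv: frontend
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx-frontend
        srv: frontend
    spec:
      containers:
        - name: nginx-frontend
          image: "leesylvester/codeship-p5-frontend"   # placeholder; substitute your own image
          ports:
            - name: http
              containerPort: 80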
Copy the definition above to frontend.yaml and execute with:
$ kubectl create -f frontend.yaml
Checking the running pods will now show all of the services in attendance:
$ kubectl get pods
NAME                         READY     STATUS    RESTARTS   AGE
assetserver-8746b67d-rlcxh   1/1       Running   0          2m
frontend-5cdddc6458-5ltvw    1/1       Running   0          1m
frontend-5cdddc6458-6rpbv    1/1       Running   0          1m
frontend-5cdddc6458-x9576    1/1       Running   0          1m
webserver-5dc468f9bd-8kcf9   1/1       Running   0          4m
webserver-5dc468f9bd-fbz8c   1/1       Running   0          4m
webserver-5dc468f9bd-mw24f   1/1       Running   0          4m
Checking your handiwork
If you navigate to port 30001 of any of your nodes in a web browser, you should be presented with a page showing the serving pod's hostname and IP address, along with the cats image.
Refreshing the page will update the hostname and IP shown, informing you that the page was served by a different pod. However, the image will remain present, as it is always served by the single asset server.
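You can run the same check from the command line with curl if you prefer, replacing <node-ip> with the address of any of your nodes:

$ curl http://<node-ip>:30001/
$ curl -I http://<node-ip>:30001/assets/cats.png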
Taking It Further
As with any example, this tutorial has been a little contrived, but I'm sure it's easy to see the power and potential this route offers.
When applied to a controller -> agent MySQL deployment, a single block storage volume is more than adequate, allowing the controller to store its necessary files within the attached storage while the agent nodes manage their data ephemerally. Likewise, utilizing block storage on each node is perfect for GlusterFS and CephFS deployments, where the stored contents are mirrored across nodes for high availability.
The important point to note is that there is no "one solution fits all" with distributed applications. Spending some time to work out what your application needs is paramount before tackling how it should be deployed. Then, it's simply a matter of getting creative and keeping an eye on its performance.