Kubernetes Any% Speedrun

21 May 2019

They say the key to confidence is knowing something that nobody else in the room does.

It’s probably why I’ve never seen a guy named Sean/Shawn/Shaun/Shon who wasn’t confident. He’s the only person in the room who knows how to spell his name. I would imagine that the same goes for the people who know how to properly pronounce Kubernetes. At worst, if the other party gets it right, you can pretty easily claim “it’s not pronounced that way” and put on the smug face you usually reserve for finding out six months deep into a project that you were correct about the budget overruns, flex your vocal cords, and pound out that correct pronunciation.

The difficulty of pronouncing or typing the word Kubernetes (or, for the sake of my fingers, “k8s” - because there’s eight characters between the first and last letter. Clever, huh?) correctly is basically all the introduction you need as to exactly how complex just about every single component of the environment is. When I got started trying to deploy something to DigitalOcean’s managed k8s environment, my best friend described it to me in four simple lines:

dude
kubernetes is like
it can span someone’s entire career
just using and understanding it

Shortly following this was:

kubernetes is a hellscape

I’m not entirely sure I disagree with that assessment. It somewhat reminds me of Postfix configuration, which was pretty obviously designed by people who wanted to ensure limitless job security configuring Postfix servers. Postfix has one key advantage in this race, though: Postfix is old-world, old-style UNIX. It hasn’t really fundamentally changed since its inception. K8s, on the other hand, follows the new-age methodology of changing as much as possible in every iteration so that nobody ever gets complacent, or something.

My introduction to k8s began after work when I was trying to move a Ruby on Rails app to a place where I didn’t have to manage a server anymore, because I was honestly pretty sick of managing servers. I wanted to Experience Tranquility™, running the application in an environment where I didn’t manage the application server, I didn’t manage the database, and I didn’t manage the file storage server.

File storage was fettled pretty quickly: I could offload that to ~~Backblaze B2~~ ~~Amazon S3~~ DigitalOcean Spaces, thanks to the Rails ActiveStorage module drastically simplifying file storage, including uploading those files to The Cloud™. After we’d chosen DigitalOcean Spaces for file storage it made sense to also use their managed database offering. So all that left was the application environment itself.

Now to be clear, in times past when I’ve decided to run something that doesn’t need to have its environment managed, I’ve turned to Heroku. They’re pretty much the masters of allowing you to just shunt your application code to GitHub and have it magically appear at whatever address you set. But they don’t offer managed storage at all, and k8s has been up and coming a bit so I figured I’d check it out. Six hours would be enough for a whirlwind tour, right?

Not even close.

I’m not going into what I tried and failed, but let me assure you it consisted of 15 “blogs” and Medium dot com thought leadership pieces, four articles from the DigitalOcean knowledgebase, and two articles from Engine Yard. None of those places actually allowed me to move this app to a k8s environment by themselves. By the powers combined of all 21 sources, I was able to get this deployed. So here’s my journey. A thinkpiece, if you will. I am writing it and you are reading it, making me a thought leader. The future is a strange and exciting place.

Terminology

Absolutely nothing in the glossary of k8s is standard. I’m only going to touch on the pieces we actually care about, which are Deployments, Ingresses, Pods, and Services. Those are in alphabetical order because I honestly have no idea what logical order they go in. Apparently “it depends”. Great.

Cluster: The whole thing. We won’t be using this word going forward.
Deployment: An ultra-abstract combination of a couple of other abstract concepts, notably ReplicaSets (which you will basically never write by hand, because the Deployment creates and manages them for you) and Pods. This is basically a definition of the state for your application. No networking concepts really happen here.
Ingress: Pretty much a way of defining how external traffic gets into your service.
Pod: A group of one or more (probably Docker) containers on one host. They have shared networking. If you’re only dropping one container, you can safely replace the word ‘pod’ with ‘container’ without violating your warranty.
Service: Kind of the entry door to your Deployment.

The reason I’m even laying this out is because if I don’t, going forward will make literally no sense. Unlike almost everything else in life there doesn’t seem to be a real shortcut to k8s without paying an absurd amount of money for a “bootcamp”. You can’t speed through and learn things by osmosis. This is a design flaw and should be rectified immediately.

Note I mentioned Docker above. I should probably clarify that k8s works around/with Docker, not instead of it. The thing that it deprecated is Docker Swarm. Or Docker Compose. Or both. Honestly, I have no idea. It was probably none of them.

The Speedrun

Alright, now we’ve got the terms out of the way. Let’s begin with dockerising our application. Note that this is literally baby’s first k8s setup. Things in here are ~~definitely~~ probably not best practice. A good example is the fact that a Deployment should be one service, apparently. The below speedrun will show you that I decided I would put Rails and Nginx in a single deployment. Some people will probably hate that, and they’re probably right. That’s why it’s an any% speedrun.

I won’t walk you through dockerising an application because honestly it depends entirely on pretty much every single facet of your application and Docker by itself is honestly pretty easy. Instead, I’ll show you my Dockerfile:

FROM ruby:alpine
ENV RAILS_ENV development
ENV BIND 127.0.0.1
RUN apk add --no-cache build-base postgresql-dev git nodejs npm tzdata ffmpeg graphicsmagick
RUN mkdir /app
WORKDIR /app
COPY Gemfile /app/Gemfile
COPY Gemfile.lock /app/Gemfile.lock
RUN bundle install --jobs `expr $(cat /proc/cpuinfo | grep -c "cpu cores") - 1` --retry 3
COPY . /app
RUN npm install --global yarn && yarn install --check-files
EXPOSE 3000
ENTRYPOINT ["sh", "-c", "rails db:migrate; rails assets:precompile; rails server -b $BIND"]

Pretty simple. It pulls the latest Ruby on Alpine Linux image, sets a couple of variables, installs some stuff using the apk package manager for Alpine Linux, then does the usual work with Bundler.
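A minimal sketch of building and booting that image, wrapped in functions so nothing runs until you call it. The tecuane/neko name is the image used later in this post; the published port and the BIND override are assumptions about how you would reach it from your host, and the migrate step in the entrypoint will still want a database it can actually talk to.

```shell
# Sketch only: assumes Docker is installed and the current directory
# contains the Dockerfile. Call neko_build_and_run to use it.
neko_build_and_run() {
  # Build the image under the name the Deployment refers to later.
  docker build -t tecuane/neko . || return 1
  # The Dockerfile defaults BIND to 127.0.0.1, which a published port
  # cannot reach, so override it here. Ctrl-C stops the container.
  docker run --rm -p 3000:3000 -e BIND=0.0.0.0 tecuane/neko
}

# Push once you are happy with it (needs a prior `docker login`).
neko_push() {
  docker push tecuane/neko
}
```

If the container comes up, http://localhost:3000 should answer.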

Once we’ve verified that we can indeed create a working Docker image of our application, it’s time to push it to The Cloud™. You’ll need two things. First, the Kubernetes command line application, kubectl. Second, the Kubernetes local VM, Minikube. The second one is for sanity reasons. I would recommend, if you’re on Linux, using the kvm2 driver instead of the standard VirtualBox driver.
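Starting the local VM with the kvm2 driver might look like the sketch below. It assumes minikube, kubectl and the KVM driver packages are already installed. The flag spelling depends on your Minikube version: newer releases use --driver, older ones use --vm-driver.

```shell
# Sketch only: call neko_start_minikube to use it.
neko_start_minikube() {
  # On older Minikube releases this flag is spelled --vm-driver=kvm2.
  minikube start --driver=kvm2 || return 1
  # Confirm kubectl points at the local VM and the node reports Ready.
  kubectl config current-context
  kubectl get nodes
}
```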

Next up, here’s almost FIFTY (50) lines of Yaml Ain’t Markup Language (YAML), the chosen declarative language of our lords and saviours at Google Incorporated, detailing that I want k8s to use two Docker images and expose the ports:

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: neko
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 10
  replicas: 1
  template:
    metadata:
      labels:
        app: neko
    spec:
      volumes:
        - name: static-assets
          emptyDir: {}
      imagePullSecrets:
      - name: regcred
      containers:
      - image: tecuane/neko
        name: neko
        volumeMounts:
        - name: static-assets
          mountPath: /app/public
        imagePullPolicy: Always
        ports:
        - containerPort: 3000
        env:
        - name: BIND
          value: 0.0.0.0
        - name: RAILS_ENV
          value: production
        - name: DB_URL
          value: postgres://neko:[email protected]:5432/neko
      - image: nginx
        name: nginx
        volumeMounts:
        - name: static-assets
          mountPath: /usr/share/nginx/html
        imagePullPolicy: Always
        ports:
        - containerPort: 80

The first line clocking in with v1beta1 indicates you’re about to have a Really Good Time™. Apparently there’s feature branches everywhere with different versions, reminding me very much of XMPP which we all still use extremely heavily because it’s all widely implemented the same and standardised.

Wait, no, that was IRC. Never mind.

Let’s call out some lines or clauses that are kind of important:

kind: Deployment tells k8s that this particular block is of the Deployment class we talked about earlier.
metadata and its sub-key name the Deployment itself. The metadata key under template a bit later down matters more: its labels are what we will use later to point the Service in the direction of our application containers.
strategy basically defines how the deployment works. The rollingUpdate value indicates that if we have more than a single replica up, we want to bring one down, update it, check it’s good, then move on to the next replica.
replicas tells k8s you want x copies of the Pod described by this Deployment running in the cluster.
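Once the Deployment exists in the cluster, you can watch that strategy do its thing with something like the sketch below. It assumes the Deployment is called neko, as in the YAML above, and that your kubectl is 1.15 or newer, since rollout restart does not exist before that.

```shell
# Sketch only: call neko_redeploy or neko_rollback to use them.
neko_redeploy() {
  # Trigger a fresh rollout; with imagePullPolicy: Always the image is re-pulled.
  kubectl rollout restart deployment/neko || return 1
  # Block until the replacement Pods are ready, or the rollout fails.
  kubectl rollout status deployment/neko
}

neko_rollback() {
  # Go back to the previous revision if the new one misbehaves.
  kubectl rollout undo deployment/neko
}
```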

The template is basically the root key for the template for those replicas. Inside this, we define how this template should be instantiated.

volumes allows you to define shared volumes for the containers. Here, they aren’t persistent. An emptyDir volume dies along with the Pod it belongs to. I’m using it here to share assets precompiled by the Rails pipeline with Nginx, so it can serve them statically.
imagePullSecrets is a neat key allowing you to define the name of a Secret (another k8s type like Deployments etc.) that you have set in the cluster. This secret can be used to log in to a private Docker registry if you have private images. You can see that I have a secret called regcred, which k8s uses when it pulls the images for the containers. In this particular case it’s the API key for Docker Hub.
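Creating a registry secret like that could look like the sketch below. The secret name has to match whatever your imagePullSecrets entry says. DOCKER_USER, DOCKER_TOKEN and DOCKER_EMAIL are hypothetical variable names for this example, and the server URL assumes Docker Hub.

```shell
# Sketch only: export the three variables, then call neko_create_regcred.
# DOCKER_USER, DOCKER_TOKEN and DOCKER_EMAIL are made-up names for this example.
neko_create_regcred() {
  kubectl create secret docker-registry regcred \
    --docker-server=https://index.docker.io/v1/ \
    --docker-username="$DOCKER_USER" \
    --docker-password="$DOCKER_TOKEN" \
    --docker-email="$DOCKER_EMAIL"
}
```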

Finally, containers is the beginning of your container definitions. Basically explaining how you want k8s to handle your Docker images when they’re running. This includes environment variables you want to set (which can use secrets as well), exposed ports on the containers, and so forth.

I imagine the rest, beyond there, is pretty self-explanatory. Problem is, it wasn’t when I was putting it together. Originally, I didn’t have an Nginx container because you can basically sideload Nginx into k8s itself. K8s doesn’t actually know what HTTP is in order to keep it somewhat technology agnostic and give you more freedom, or at least I heard something to that effect. Interestingly it was from the same organisation that brought us the Go programming language and ignores when you use quotation marks around search terms. Turns out that you can’t define a volume from which to serve files that way, though. Without that, I’m not entirely sure what the point is. Why would you add that but not add a Lisp REPL, or the ability to read mail? Come on, CNCF. You can do better.

So if you massage that config a bit by throwing out the things you don’t need, tweaking various values to match what you need (e.g., change the docker container source to be your actual container and/or changing volume mount points), you can load this into your k8s cluster by saving it as a yml file and running the following:

kubectl apply -f /path/to/the/config.yml
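If you would rather check that it landed before celebrating, a rough sketch follows. It assumes the names from the YAML above: a Deployment called neko, Pods labelled app=neko, and a container called neko.

```shell
# Sketch only: call neko_check_deployment to use it.
neko_check_deployment() {
  # Desired versus ready replica counts for the Deployment.
  kubectl get deployment neko || return 1
  # The Pods it created, selected by the label from the template.
  kubectl get pods -l app=neko
  # Recent output from the Rails container; use -c nginx for the other one.
  kubectl logs deployment/neko -c neko --tail=50
}
```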

Congratulations! Now you have a Deployment. You should feel so proud. I did. To the outside world, that means utterly nothing. In fact, to things outside that Deployment it still means utterly nothing too. That’s because you haven’t defined a Service yet, which tells everything else what that Deployment is and how to access it. It looks a bit like this:

apiVersion: v1
kind: Service
metadata:
  name: neko
spec:
  selector:
    app: neko
  ports:
  - port: 3000
    name: api
    targetPort: 3000
    protocol: TCP
  - port: 80
    name: static
    targetPort: 80
    protocol: TCP

We can see here that now we’re on API version 1. Why? Who knows. I wanted to ask but the official chat medium is apparently Slack and honestly I’d rather write a Python script to fax them and poll for a response via smoke signals than install Slack.

To summarise this block, the Service is called neko, and exposes two ports over TCP: 3000 and 80. Port 3000 is called ‘api’, and port 80 is called ‘static’. This is probably the easiest piece of k8s to understand.
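Once the Service has been applied the same way as the Deployment, a sketch like the following should show whether it actually found your Pods. The names come from the YAML above; forwarding local port 3000 is just an assumption to make poking at it easy.

```shell
# Sketch only: call neko_check_service or neko_forward to use them.
neko_check_service() {
  kubectl get service neko || return 1
  # Endpoints should list a Pod IP for each port; an empty list means
  # the selector matched nothing.
  kubectl get endpoints neko
}

neko_forward() {
  # Forward local port 3000 to the Service. Ctrl-C to stop.
  kubectl port-forward service/neko 3000:3000
}
```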

So now other services in your cluster can talk to this service if they so desire. However, it’s still not contactable from the outside world. There’s a couple of ways to do this but I’m only really going to show you the vendor-agnostic bit:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: neko-ingress
spec:
  rules:
  - http:
      paths:
        - path: /
          backend:
            serviceName: neko
            servicePort: 3000
        - path: /assets
          backend:
            serviceName: neko
            servicePort: 80
        - path: /packs
          backend:
            serviceName: neko
            servicePort: 80

Back on our v1beta1, apparently. “This time we have extensions” I yell, my cursor hovering over the Kubernetes Slack link as I slowly break down into tears.

Thankfully (for you, desperately hoping for an end to this storm of bitter sarcasm) this is also pretty easy to parse. An Ingress is kind of like a load balancer, but instead of just balancing traffic per port you can do it per path or per host. You can see that any requests coming to / go to the neko service on port 3000, and any requests coming to /assets or /packs go to port 80. You might remember those ports from the Deployment and Service definitions. If you don’t that’s fine. I’m not mad, I’m just disappointed. I said it’s fine.

At this point, with Minikube, you can simply run the following:

minikube addons enable ingress

What this does under the hood is a bunch of other stuff that it doesn’t actually tell you about. That’s why when I went to move this from Minikube to DigitalOcean it all fell over again. Turns out, what it does is:

Sets up a pretty large amount of configuration for the Nginx load balancer
Actually adds the Nginx load balancer
Creates a service you can use

The third one we don’t need, because we are using our own services. You can delete it:

kubectl delete svc default-http-backend

And that’s pretty much it. You have a working Kubernetes installation with «Your Simple Application Here».
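For a quick sanity check on Minikube, something along these lines should do. It assumes the Ingress is named neko-ingress as above and that the ingress addon is listening on the VM’s IP over plain HTTP.

```shell
# Sketch only: call neko_smoke_test to use it.
neko_smoke_test() {
  # The Ingress should exist and, after a moment, show an address.
  kubectl get ingress neko-ingress || return 1
  # Ask Minikube for the VM address, then request the root path through the Ingress.
  neko_ip=$(minikube ip) || return 1
  curl -i "http://$neko_ip/"
}
```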

Elliot Speck
