Let’s make the composition a little more stable.
Enforcing Container Dependency #
Previously, we managed to get the database and game server containers launched together with Docker Compose, and it seemed to work. But without any coordination, the game server could easily start before the database server was ready, and then fail when it couldn’t connect. Luckily this is a very common issue, and we have ways to avoid it.
Healthy Database #
First, let’s add a health check to the MSSQL container. Waiting for it to start listening on port 1433 should work:
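The check itself boils down to a command that exits 0 once something is listening on 1433. Here is a minimal sketch of that idea, assuming the usual column layout of `netstat -tln` output; the function name `port_is_listening` is made up for illustration and is not taken from the actual image.

```shell
# Succeeds if the netstat output on stdin shows a LISTEN socket on the
# given port. Assumes the "netstat -tln" layout, where the local address
# is column 4 and the state is column 6.
port_is_listening() {
  awk -v port="$1" '
    $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit !found }
  '
}
```

It would be used as `netstat -tln | port_is_listening 1433`. Inside a real health check, where the function isn't available, a one-liner such as `netstat -tln | grep -q ':1433 '` does roughly the same job.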
Of course we need netstat for that, so we’ll install that first:
…and throw in an upgrade that gets rid of the high-severity vulnerabilities that were being detected in the container:
Using docker image sha256:e3c8f81f90d576ee0f1322ed05436ddd6b45743f3f5905b10fbc42512beb0670 for
registry.gitlab.com/security-products/container-scanning:6 with digest
registry.gitlab.com/security-products/container-scanning@sha256:83ce4e1e1872540ebc27159b36a6d4d3d37970d23ee3913e027e9964fb35b82c ...
$ gtcs scan
[INFO] [2023-06-05 16:56:34 +0000] [container-scanning] > Scanning container from registry
registry.gitlab.com/301days/mssql-dragspinexp: for vulnerabilities with severity level HIGH or
higher, with gcs 6.0.1 and Trivy Version: 0.36.1, advisories updated at 2023-06-05T12:09:02+00:00
[INFO] [2023-06-05 16:56:55 +0000] [container-scanning] > Scanning container from registry
registry.gitlab.com/301days/mssql-dragspinexp: for vulnerabilities with severity level HIGH or
higher, with gcs 6.0.1 and Trivy Version: 0.36.1, advisories updated at 2023-06-05T12:09:02+00:00
[INFO] [2023-06-05 16:57:12 +0000] [container-scanning] > Scanning container from registry
registry.gitlab.com/301days/mssql-dragspinexp: for vulnerabilities with severity level HIGH or
higher, with gcs 6.0.1 and Trivy Version: 0.36.1, advisories updated at 2023-06-05T12:09:02+00:00
Uploading artifacts for successful job
Patient Game Server #
Now that the MSSQL container has a health check, we just tweak the docker-compose.yml to make the game server wait for it to be healthy with depends_on.
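As a rough sketch of what that tweak can look like, the function below prints a Compose fragment using the long form of `depends_on`. The service names `sql` and `gameserver` are inferred from the container names in the output further down; the `healthcheck` timings are pure assumptions, and the whole `healthcheck` key can be dropped if the image already defines its own health check. The function name `print_compose_fragment` is hypothetical, and image names are left out on purpose.

```shell
# Print a Compose fragment that makes gameserver wait for a healthy sql.
# Timings are illustrative guesses, not values from the real project.
print_compose_fragment() {
  cat <<'EOF'
services:
  sql:
    healthcheck:
      test: ["CMD-SHELL", "netstat -tln | grep -q ':1433 '"]
      interval: 10s
      timeout: 5s
      retries: 12
  gameserver:
    depends_on:
      sql:
        condition: service_healthy
EOF
}
```

With `condition: service_healthy`, Compose holds back the dependent service until the health check on `sql` passes, instead of only waiting for the container to be started.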
And now running docker compose up shows the modified sequence:
$ docker compose up --quiet-pull --detach
sql Pulling
gameserver Pulling
gameserver Pulled
sql Pulled
Network drag-spin-exp_default Creating
Network drag-spin-exp_default Created
Container drag-spin-exp-sql-1 Creating
Container drag-spin-exp-sql-1 Created
Container drag-spin-exp-gameserver-1 Creating
Container drag-spin-exp-gameserver-1 Created
Container drag-spin-exp-sql-1 Starting
Container drag-spin-exp-sql-1 Started
Container drag-spin-exp-sql-1 Waiting
Container drag-spin-exp-sql-1 Healthy
Container drag-spin-exp-gameserver-1 Starting
Container drag-spin-exp-gameserver-1 Started
And a Very Patient Integration Test #
All this work makes it more reasonable to add some integration testing into the CI configuration. Before kicking off any more complete testing, we should at least know we can log in and pet a dog in a minimal world.
First we need a way to make sure the game server is ready for business. I tried just waiting for it to be listening, like the database server, but that didn’t always work. So instead we can use a simple script to keep checking the logs for the regular status output, e.g.:
06/01/2023 23:58:12: NPCs: [1] | Players: [0] | CPU: [??%] | Rnd: [6]
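A minimal sketch of such a log-polling script follows. It is an illustration rather than the actual script: the function names, the 5-second poll interval, and the 300-second default timeout are all assumptions, and the default container name is simply the one Compose printed earlier.

```shell
# Succeeds if stdin contains a status line such as
# "NPCs: [1] | Players: [0] | CPU: [??%] | Rnd: [6]".
has_status_line() {
  grep -Eq 'NPCs: \[[0-9]+\] [|] Players: \[[0-9]+\]'
}

# Poll the container logs until the status line appears or we time out.
wait_for_game_server() {
  container="${1:-drag-spin-exp-gameserver-1}"  # assumed container name
  timeout="${2:-300}"                           # seconds, assumed default
  waited=0
  until docker logs "$container" 2>&1 | has_status_line; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "Timed out waiting for $container" >&2
      return 1
    fi
    sleep 5
    waited=$((waited + 5))
  done
  echo "$container is reporting status after about ${waited}s"
}
```

Checking for the periodic status line is a stronger signal than an open port, since the server only prints it once its main loop is actually running.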
Then we can use VHS running in its own container to script logging in and petting a dog, and check the output.
Most of the docker calls in that job are there to put plenty of useful info into the output for troubleshooting. The fun one is using docker run to:
- pull down the VHS image from the repo
- bring up the container on the same network as the database and game servers
- map the current directory into the VHS container
- run VHS with the dog.tape file of instructions
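Those four steps fit into a single command. The sketch below follows the invocation documented in the VHS README (image `ghcr.io/charmbracelet/vhs`, working directory mounted at `/vhs`) and uses the network name Compose printed earlier. The wrapper function `run_vhs_tape` and the `DOCKER` override are hypothetical conveniences, and the telnet install mentioned below is not covered here.

```shell
# Run a VHS tape in a container attached to the Compose network.
# DOCKER can be overridden (e.g. DOCKER=echo) to preview the command.
run_vhs_tape() {
  tape="$1"
  network="${2:-drag-spin-exp_default}"  # network name from the compose output
  "${DOCKER:-docker}" run --rm \
    --network "$network" \
    --volume "$PWD:/vhs" \
    ghcr.io/charmbracelet/vhs "$tape"
}
```

Calling `run_vhs_tape dog.tape` pulls the image if it isn't already present, and anything the tape writes ends up in the current directory thanks to the volume mapping.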
When run, the tape will create both a dog.ascii and a dog.gif (most of the Set commands are for the gif) with the results of the inputs and timing. We have to install telnet first, since it’s not already in the VHS image. The gif is useful for troubleshooting, but in the CI flow we just check the ascii file for the expected result of the pet dog command.
$ grep "You pet the dog." dog.ascii
You pet the dog.
You pet the dog.
You pet the dog.
Uploading artifacts for successful job 00:03
Cleaning up project directory and file based variables 00:00
Job succeeded
[Image: The whole VHS run in GitLab’s container]
The whole job takes about 5 minutes on GitLab’s free runners, using the minimal database. Of course I also tried doing the same using the production database, but found that there was much more work to be done. Tomorrow.
One More Thing #
I also spent some time and effort trying to get rid of some of the vulnerabilities in the game server container image.
Adding the buster-backports package repo allowed some upgrades, and removing curl and perl obviously resolves the vulnerabilities in those; but the analysis of the container is still pretty grim.
$ gtcs scan
[INFO] [2023-06-07 01:19:18 +0000] [container-scanning] > Scanning container from registry registry.gitlab.com/libipljoe/drag-spin-exp: for vulnerabilities with severity level HIGH or higher, with gcs 6.0.1 and Trivy Version: 0.36.1, advisories updated at 2023-06-06T12:10:02+00:00
+------------+--------------+--------------+------------------------+------------------------------------------------------------------------+
| STATUS | CVE SEVERITY | PACKAGE NAME | PACKAGE VERSION | CVE DESCRIPTION |
+------------+--------------+--------------+------------------------+------------------------------------------------------------------------+
| Unapproved | High | e2fsprogs | 1.46.2-1~bpo10+2 | An out-of-bounds read/write vulnerability was found in e2fsprogs 1.46. |
| | | | | 5. This issue leads to a segmentation fault and possibly arbitrary cod |
| | | | | e execution via a specially crafted filesystem. |
+------------+--------------+--------------+------------------------+------------------------------------------------------------------------+
| Unapproved | High | gcc-8-base | 8.3.0-6 | stack_protect_prologue in cfgexpand.c and stack_protect_epilogue in fu |
...
+------------+--------------+--------------+------------------------+------------------------------------------------------------------------+
| Unapproved | High | gcc-8-base | 8.3.0-6 | The POWER9 backend in GNU Compiler Collection (GCC) before version 10 |
...
+------------+--------------+--------------+------------------------+------------------------------------------------------------------------+
| Unapproved | High | libc-bin | 2.28-10+deb10u2 | An out-of-bounds write vulnerability was found in glibc before 2.31 wh |
...
+------------+--------------+--------------+------------------------+------------------------------------------------------------------------+
| Unapproved | High | libc6 | 2.28-10+deb10u2 | An out-of-bounds write vulnerability was found in glibc before 2.31 wh |
...
+------------+--------------+--------------+------------------------+------------------------------------------------------------------------+
| Unapproved | High | libcom-err2 | 1.46.2-1~bpo10+2 | An out-of-bounds read/write vulnerability was found in e2fsprogs 1.46. |
...
Before using this image in production, we’d want to peel another layer back and rebuild the mono container itself with a more recent and secure OS image. The benefit of an open source infrastructure is that we can do that pretty easily, at least in theory.