Victor Franzi's Home Page

13 November 2020

Find all the sources here

From Google to Franzi

For a few years, I have been thinking about the impact of Google on my life. I have a GMail account, a GDrive, an Android phone, I use GSearch, and so much more.

During the first lockdown, I set up a torrenting server to watch movies without paying for Netflix. This time, I decided to remove myself from Google as much as I can.

I have a few needs concerning my digital life and the way I store and use my data, and there are a few services I really want to have.

Architecture

Once again, Docker rules everything. Each service defined above corresponds to a docker-compose service.

I chose to host three different services:

- my personal webpage and posts, built with Jekyll
- a personal cloud for my files, NextCloud
- a git server for my repositories

Alongside those, some cross-cutting services are required, such as a firewall, a reverse proxy and TLS.

DNS

First, we need franzi.fr to point to my server. To do so, I just create DNS records so that the bare domain and all of its sub-domains are linked to my server. And don’t forget IPv6.

*    1800  IN  A      $VPS_IP4
@    1800  IN  A      $VPS_IP4
*    1800  IN  AAAA   $VPS_IP6
@    1800  IN  AAAA   $VPS_IP6
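A quick sanity check that the records resolve once the zone is live:

dig +short A franzi.fr
dig +short AAAA franzi.fr
dig +short A whatever.franzi.fr   # matched by the wildcard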

To test my environment, I also play with some AWS VPCs, so I often create one and point aws.franzi.fr at its IP to type it faster and reuse my shell history.

./manage.sh

As all this crap needs some configuration, plus an initialization process that requires system interactions, I wrapped all the garbage in an all-in-one script.

This script is very simple to use as it consists of two (two?) commands:

./manage.sh init
./manage.sh <cmd>  # where <cmd> is passed to docker-compose

It just reads the configuration before running docker-compose. As I wanted it to be as predictable as possible, running init multiple times won’t erase anything (unless the underlying configuration changes, obviously).
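For the curious, here is a minimal sketch of what such a wrapper can look like; the config file name and the init step are illustrative, not the real script:

#!/bin/bash
# manage.sh -- minimal sketch, not the real script
set -euo pipefail

# hypothetical configuration file, sourced if present
[ -f ./config.env ] && . ./config.env

case "${1:-}" in
  init)
    # idempotent setup: only create what is missing
    docker network inspect proxy &>/dev/null || docker network create proxy
    ;;
  "")
    echo "usage: ./manage.sh init|<docker-compose command>" >&2
    exit 1
    ;;
  *)
    # everything else goes straight to docker-compose
    exec docker-compose "$@"
    ;;
esac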

Firewall

But before firing up my services, let’s add some security.

As a firewall, I use the simple UFW. It is configured as below:

out:
    allow all

in:
    default deny
    allow http/https
    allow ssh
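In actual ufw invocations, that policy translates to something like:

ufw default allow outgoing
ufw default deny incoming
ufw allow http
ufw allow https
ufw allow ssh
ufw enable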

By the way, my server runs Arch Linux :)

Webpage & posts

So, we got the DNS working, and we only allow supposedly valid traffic. The landing page is my personal homepage.

It shows my interests, lists my posts and includes some useful links about me. This is managed using Jekyll, as mentioned previously.

Before having my own instance, I used GitHub Pages, and I had been lazy about switching to something else. When I finally did, all I had to do was add a Gemfile and rename README.md to index.md.

Easy peasy.
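For reference, bundler can generate that Gemfile, and serving the site locally takes one more command (the standard Jekyll workflow, nothing specific to my setup):

bundle init                # creates the Gemfile
bundle add jekyll          # declares the dependency
bundle exec jekyll serve   # builds and serves the site locally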

NextCloud

For the webpage, that’s settled; now let’s have a look at the storage and this cloud thing. NextCloud wasn’t a piece of cake to configure. To be honest, it’s the only reason why I didn’t self-host earlier: I kept failing at configuring it.

One morning, I just cloned the Docker examples and it ran smoothly on the first try. I ended up customizing the docker-compose.yml that was provided.

Once NextCloud is up, the administrator is invited to finish the configuration by adding an admin account and filling in the database credentials. As we use Docker services, the database is (by default) accessible at db:5432; the dedicated database is nextcloud, owned by the eponymous user.
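As a sketch, the relevant services of the compose file, close to the upstream examples, look like this (image tags and credentials are placeholders):

db:
  image: postgres
  restart: always
  volumes:
    - db:/var/lib/postgresql/data
  environment:
    - POSTGRES_DB=nextcloud
    - POSTGRES_USER=nextcloud
    - POSTGRES_PASSWORD=changeme

app:
  image: nextcloud
  restart: always
  depends_on:
    - db
  environment:
    - POSTGRES_HOST=db
  volumes:
    - nextcloud:/var/www/html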

Git

A feature that I really appreciate is having my own git web interface. For all those projects I don’t want to end up on GitHub, or that are more related to self-hosting for example, it is also a convenient way to have them under my own domain.

This service is composed of two smaller ones. For the web interface, I use klaus to fill this role. As I said, it is very minimal and doesn’t integrate big features such as CI/CD or accounts.

Next to the web interface, I need a git user to manage my repositories and allow SSH connections.

Web interface

klaus is available as a Docker image, but the Dockerfile ends by only providing an entry point. Instead, the project page suggests using uwsgi. The following line did the job.

command: >-
    uwsgi --plugin python --http11-socket 0.0.0.0:80
    -w klaus.contrib.wsgi_autoreload --env KLAUS_REPOS=/repos --env KLAUS_USE_SMARTHTTP=1

Notice the last environment variable: it enables smart HTTP. This option allows go get to fetch the repositories.

To go get a repository, the trailing .git in the URL is mandatory.
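For example, with a hypothetical repository name:

go get franzi.fr/myrepo.git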

SSH remote

The SSH remote is, along with the firewall, the second service to require system interactions.

First, create the git user. useradd makes this pretty easy: just set the shell to git-shell and don’t deploy the home skeleton. Once the user is created, the script copies the allowed commands to the git home directory.
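Roughly, that step boils down to this (the source path of the commands is illustrative):

# create the git user: git-shell as login shell, no skeleton deployed
useradd --home-dir /home/git --shell /usr/bin/git-shell git
mkdir -p /home/git/git-shell-commands /home/git/.ssh

# git-shell only executes what lives in ~/git-shell-commands
cp git-shell-commands/* /home/git/git-shell-commands/
chown -R git:git /home/git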

I shamelessly stole those commands from here, but I should rewrite them, as the user check doesn’t work.

Reverse proxy and TLS

All those services are accessible through different endpoints. The most common solution is to use a reverse proxy to handle the sub-domains.

This service works in tandem with the SSL certificate provider. Well, there isn’t much more to say about it: I copy-pasted and it just worked.
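For illustration, a classic pair in the Docker world is nginx-proxy with its acme companion; trimmed down, the compose services look roughly like this (not necessarily my exact stack, and hostnames are examples):

proxy:
  image: nginxproxy/nginx-proxy
  ports:
    - "80:80"
    - "443:443"
  volumes:
    - certs:/etc/nginx/certs
    - /var/run/docker.sock:/tmp/docker.sock:ro

acme:
  image: nginxproxy/acme-companion
  volumes:
    - certs:/etc/nginx/certs
    - /var/run/docker.sock:/var/run/docker.sock:ro

Each proxied service then only declares its hostname:

environment:
  - VIRTUAL_HOST=git.franzi.fr
  - LETSENCRYPT_HOST=git.franzi.fr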

Continuous Delivery

The first issue I faced was that my website wasn’t staying up-to-date. The repository is located at /home/git/website.git, but this is a bare repository: Jekyll wants a regular working copy as input, not this git mess.

To fix this, I took advantage of having the bare and the cloned repository on the same host by cloning the bare one into a place accessible to Jekyll.

To keep the site up-to-date, I added a hook in the bare repository triggered after data is received. It looks like this:

#!/bin/bash
# <bare>/hooks/post-receive

WEB_DIR=/path/to/website-clone
git --work-tree="$WEB_DIR" --git-dir="$WEB_DIR/.git" pull

I also use the same mechanism to mirror the self-hosted repository on GitHub.
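Concretely, one line appended to the same post-receive hook is enough (the remote URL is a placeholder):

# mirror every accepted push to GitHub
git push --mirror git@github.com:<user>/website.git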

I consider this trick the first step of a bigger project: automating the whole server deployment.

Further ideas

Real CI/CD

As a project never really ends, this one won’t either. Since I was just speaking about automation, I really would like to automate things on this server.

I’d like to setup a DevOps pipeline to test the infrastructure (testing what?), as much from an operational point of view as a security one. This pipeline would also turn off the services, update them and turn them on again.

I tried to play with Buildbot but had a few issues with some mysterious Python libraries. Again, lazy as I am, I gave up and worked on something else.

Ad-blocker

I. Can’t. Take. Any. More. Ads. I disabled JavaScript, I customized my /etc/hosts, I block cookies, but that’s only working on my laptop. What about my phone, my TV, my Windows machine?

The easy way would be to set up a Pi-Hole at home, but my ISP doesn’t allow changing the router’s DNS and I’m too lazy to change it on each device. Another way is to use my server to block the ads.

To do so, I thought about configuring a WireGuard endpoint and hosting the Pi-Hole there. Each of the WireGuard peers would have its DNS set to point to the Pi-Hole, and boom, no more ads.
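A peer configuration would look something like this (keys, addresses, and the endpoint port are placeholders):

[Interface]
PrivateKey = <peer-private-key>
Address = 10.0.0.2/24
DNS = 10.0.0.1              # the Pi-Hole, reached through the tunnel

[Peer]
PublicKey = <server-public-key>
Endpoint = franzi.fr:51820
AllowedIPs = 0.0.0.0/0, ::/0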

As I use Arch, I have an up-to-date kernel that includes WireGuard natively :)

Backups

Ah, interesting point.

All of the presented services work with sensitive data. By sensitive, read: data that I can’t lose. My whole server is therefore backed up regularly using an ugly rsync script.

The goal would be to use Borg to automate and tidy this up.
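Something along these lines would replace the rsync script (the repository location and backed-up paths are placeholders):

borg init --encryption=repokey /backups/franzi                        # once
borg create --stats /backups/franzi::'{hostname}-{now}' /srv /home    # nightly
borg prune --keep-daily=7 --keep-weekly=4 /backups/franzi             # rotate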

Conclusion

Well, I had a lot of fun setting up this infrastructure. The beginning was mostly a headache, but as it grew I started enjoying it.

There is still a lot of work left to improve the automation and make those scripts prettier, but I’m happy with it for now, until I break it and make it even better!