---
title: "Infrastructure overview"
date: 2020-07-20T18:30:00+02:00
---

The idea behind this infrastructure is to run on commodity hardware. There is no need to buy racks of expensive servers
like the ones found in data centers: simple homemade computers do the job. At work, I have access to cheap hard drives
that were used in servers and are either out of warranty or no longer suitable for enterprise workloads. They generally
sell for half their market price. Each host mixes brand new and re-used drives to reduce the risk of two disks failing
at the same time in the same host.

There are three components in the infrastructure:
* **storage** servers that hold the data
* a **monitoring** server that collects metrics and sends alerts
* a **vps**[^1] server that hosts the VPN[^2] and watches the availability of the monitoring server

{{< rawhtml >}}
<p style="text-align: center;"><img src="/infrastructure-overview.svg" alt="Infrastructure overview" style="width: 65%;"></p>
{{< /rawhtml >}}

# Storage

Every storage server is designed to be hosted at a different location. Each one can be unplugged, moved somewhere else,
plugged back in, and work the same way as before. All it needs is Internet access to reach the VPS and join the VPN.

The data is stored on **[ZFS](https://en.wikipedia.org/wiki/ZFS)**. I am lucky enough to use it at work for production
workloads and it makes life much easier. I am used to managing GNU/Linux servers
([Debian](https://www.debian.org/)), but I knew that [FreeBSD](https://www.freebsd.org/) has built-in ZFS support, so I
wanted to give it a try. I didn't choose [FreeNAS](https://www.freenas.org/) because I wanted to do everything myself,
both to learn and to use only the features I need.

The balance I found between usable disk space and data safety is **three disks** in a
[RAID-Z](https://en.wikipedia.org/wiki/ZFS#RAID_(%22RaidZ%22)). A storage server can lose one disk at a time without
breaking the service, while about two thirds of the raw capacity remains usable. Datasets are configured to use **lz4**
compression because it saves disk space without putting much pressure on the CPU.

| Host     | Disk capacity |
| -------- | ------------: |
| storage1 |         5.44T |
| storage2 |         2.72T |
| storage3 |         10.9T |
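
The space trade-off of a three-disk RAID-Z can be estimated with a little arithmetic. The sketch below is only an
approximation: it assumes single parity (one disk's worth of space per group) and assumes the capacities in the table
are raw pool sizes, which the table itself does not state. The function names are made up for this example, and ZFS
metadata and reserved space are ignored, so real figures will be somewhat lower.

```python
def raidz_usable_fraction(disks: int, parity: int = 1) -> float:
    """Approximate share of raw capacity left for data in one RAID-Z group.

    Ignores ZFS metadata, padding and reserved space.
    """
    if disks <= parity:
        raise ValueError("a RAID-Z group needs more disks than parity devices")
    return (disks - parity) / disks


def raidz_usable_capacity(raw_capacity: float, disks: int = 3, parity: int = 1) -> float:
    """Scale a raw capacity by the usable fraction of the RAID-Z layout."""
    return raw_capacity * raidz_usable_fraction(disks, parity)


# Assumption: these are raw sizes (all three disks combined), copied from the table.
raw_sizes = {"storage1": 5.44, "storage2": 2.72, "storage3": 10.9}

for host, raw in raw_sizes.items():
    usable = raidz_usable_capacity(raw)
    print(f"{host}: roughly {usable:.2f}T usable out of {raw}T raw")
```

Adding a fourth disk to the same layout would raise the usable fraction to three quarters, at the cost of one more
drive that can fail per host.
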

# Monitoring

Like any system administrator, I want to be alerted when something goes wrong in the infrastructure. I also want to
browse historical graphs to spot trends. A [Raspberry Pi](https://www.raspberrypi.org/) was sitting unused in a drawer.
It is now connected to the Wi-Fi network somewhere in the house, perfectly hidden, quietly doing this job in the
background.
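
The monitoring server can fail too, which is why the overview has the VPS watch its availability. How that check is
actually done is not described here, so the following is only a minimal sketch, assuming that the monitoring server
exposes some TCP port and that `alert` is any callable able to deliver a message. `is_reachable` and
`check_monitoring` are names chosen for this example.

```python
import socket
from typing import Callable


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def check_monitoring(
    host: str,
    port: int,
    alert: Callable[[str], None],
    probe: Callable[[str, int], bool] = is_reachable,
) -> bool:
    """Probe the monitoring server and call alert(message) if it is down."""
    up = probe(host, port)
    if not up:
        alert(f"monitoring server {host}:{port} is unreachable")
    return up
```

Run periodically on the VPS, something like `check_monitoring("10.8.0.10", 443, print)` would print a message whenever
the probe fails. The address and port are placeholders, and `print` stands in for a real notification channel.
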

# VPS

I am not a network engineer. It is not my job and I don't want it to be. Numerous experts in the field do this very
well and I am thankful to them. But a computer without network connectivity is not very useful. When self-hosting, you
have to deal with your ISP modem settings, and as far as I know there is no standard. My connection has no fixed public
IPv4 address. I wrote scripts that automatically updated a subdomain with the current public IP address, then tried to
reach it from the outside. The name resolved, but the connection always failed.

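
For illustration, the name-update half of such a script can be sketched as follows. This is not the original script:
the `https://api.ipify.org` lookup service is an assumption, and `update_record` is a hypothetical callback standing
in for whatever API the DNS provider offers. It only covers keeping the name up to date and says nothing about why the
connections failed.

```python
import ipaddress
import urllib.request
from typing import Callable, Optional


def fetch_public_ip(url: str = "https://api.ipify.org") -> str:
    """Ask an external service for the current public address (assumed plain-text reply)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("ascii").strip()


def sync_dns(
    get_ip: Callable[[], str],
    last_ip: Optional[str],
    update_record: Callable[[str], None],
) -> str:
    """Update the DNS record only when the public address has changed.

    Returns the current address so the caller can store it for the next run.
    """
    # ip_address() raises ValueError if the service returned garbage.
    current = str(ipaddress.ip_address(get_ip()))
    if current != last_ip:
        update_record(current)
    return current
```

A cron job could call `sync_dns(fetch_public_ip, last_ip, update_record)` every few minutes and persist the returned
value between runs.
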

To solve this problem, I [rent a VPS](https://www.ovhcloud.com/fr/vps/) hosted close to the storage locations and
configured an [OpenVPN](https://openvpn.net/) server on it. This is a single point of failure and a potential
*bottleneck*, because all traffic between hosts goes through this server. In practice, Internet bandwidth at home is
the real bottleneck, so the VPS should not be a problem. It also acts as the entry point from the outside world for the
metrics and monitoring websites.

[^1]: [Virtual Private Server](https://en.wikipedia.org/wiki/Virtual_private_server)

[^2]: [Virtual Private Network](https://en.wikipedia.org/wiki/Virtual_private_network)