CNM Bureau Farm
CNM Bureau Farm (formerly known as CNM EndUser Farm; hereinafter, #The Farm) is the segment of Opplet that is responsible for providing users with CNM Corp and CNM Social. Most likely, #The Farm will also handle the applications that CNM Fed Farm currently provides.
CNMCyber Team (hereinafter, #The Team) develops and administers #The Farm. To sustain #The Farm online, #The Team rents Bureau Infrastructure.
In a nutshell
For the purposes of this very wikipage, end-users are called #The Patrons. Collectively, end-user applications are called #The User Apps.
Architecture
- While using #The Farm, #The Patrons work with #The User Apps that are installed in #The Cluster, which is, in turn, hosted by #The Infrastructure. #The Cluster consists of #The Storage, #The Environment, and #The Gateway. #The Infrastructure includes #The Bridges and #The Metal, which is #The Farm's hardware.
Cluster-based
- To mitigate a single point of failure (SPOF), #The Farm is built on not just one, but three bare-metal servers. Each of those hardware servers, together with all of the software installed on top of it (hereinafter, #The Node), is self-sufficient to host #The User Apps. Various solutions such as #The uptime tools coordinate #The Nodes.
COTS-powered
- #The Farm's software is a collection of commercial off-the-shelf (COTS) packages (hereinafter, #The COTS). In plain English, #The COTS is the software that is already available on the market. Not a single line of programming code is written specifically for #The Farm. Only time-tested, market-proven solutions have been used. #The User Apps use HumHub, Jitsi, and Odoo instances. #The Cluster uses Ceph, Proxmox, and pfSense.
Addressing the needs
Development and sustenance of #The Farm address two business needs of #The Team. For #The Team, #The Farm shall serve as both #Tech side of the Apps and #Worksite.
Tech side of the Apps
- #The Team needs to provide #The Patrons with the services of #The User Apps 24 hours a day, 7 days a week. #The Farm shall power #The User Apps technologically; their content is outside of #The Farm's scope.
Worksite
- #The Team needs to provide those incumbents of CNMCyber practices who work with #The Farm's technologies with their worksite.
Farm users
For the purposes of this very wikipage, a farm user refers to any user of #The Farm. The farm user's authentication and authorization administration is a part of identity and access management (IAM).
The Patrons
- For the purposes of this very wikipage, the Patron refers to an end-user of #The User Apps. The Patrons access #The User Apps via the graphic user interfaces (GUIs) that are associated with the particular application. At #The Farm, those #User interfaces (UIs) are located at IPv4 addresses. Opplet.net grants the Patrons access automatically; alternatively, #The Power-Users grant it manually. The Patrons can access #The User Apps only without their administrative tools. The Patrons may or may not be members of #The Team.
The Power-Users
- For the purposes of this very wikipage, the Power-User refers to a power-user of #The User Apps. By definition, the Power-Users are those members of #The Team who are authorized to access more of #The User Apps' resources than #The Patrons. Normally, those resources are administrative tools that allow the Power-Users to administer one or more of #The User Apps.
- While having administrative access, the Power-Users serve as application-level administrators. They access #The User Apps via the same graphic user interfaces (GUIs) as #The Patrons, but administrative-level UIs display administration tools in addition to those tools that are available to non-privileged end-users.
- At the moment, #The Sysadmins can grant the Power-User rights to one or more of #The Patrons manually. Administrative access credentials are classified and securely stored in the application project spaces of CNM Lab. Those incumbents of CNMCyber practice who work with #The Farm's technologies may or may not be the Power-Users.
The Sysadmins
- For the purposes of this very wikipage, the Sysadmin refers to a system administrator of #The Cluster. By definition, the Sysadmins are those members of #The Team who are authorized to access at least some of #The Cluster's resources:
- #The Environment access is provided via #UI for the Environment. That access also includes root-level access to all of the applications that #The Environment hosts. With such access, the Sysadmins are able to delete or re-install applications such as #The User Apps.
- #The Storage access is provided via #UI for the Storage.
- #The Gateway access is provided via #UI for the Gateway.
- #The Superusers can grant the Sysadmin rights to one or more of #The Patrons manually. Administrative access credentials are classified and securely stored in the cluster project spaces of CNM Lab. No incumbent of CNMCyber practice can be the Sysadmin; only those who work in CNMCyber Office can.
The Superusers
- For the purposes of this very wikipage, the Superuser refers to a superuser of #The Farm. By definition, the Superusers are those members of #The Team who are authorized to access any part of #The Farm starting with its hardware. #The Metal's access is provided via #UI for the Metal.
- #The Provider grants the Superuser rights to CNMCyber Customer, who can manually pass it to one or more members of CNMCyber Office. The Superuser credentials are not stored in CNM Lab.
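The four farm-user roles above can be read as a ladder of increasing access. The following is a minimal Python sketch of that reading, assuming a strictly cumulative model in which each role also has the access of the roles below it; the wikipage implies this but does not state it. The names `Role`, `MINIMUM_ROLE`, and `can_access`, as well as the layer labels, are hypothetical and exist only for illustration.

```python
from enum import IntEnum


class Role(IntEnum):
    """Farm-user roles, ordered from least to most access (assumed ordering)."""
    PATRON = 1
    POWER_USER = 2
    SYSADMIN = 3
    SUPERUSER = 4


# Lowest role assumed to reach each layer of the Farm (illustrative labels).
MINIMUM_ROLE = {
    "user_apps": Role.PATRON,
    "user_apps_admin_tools": Role.POWER_USER,
    "cluster": Role.SYSADMIN,
    "metal": Role.SUPERUSER,
}


def can_access(role: Role, layer: str) -> bool:
    """Return True if the role is at or above the minimum role for the layer."""
    return role >= MINIMUM_ROLE[layer]
```

For example, `can_access(Role.PATRON, "user_apps_admin_tools")` is false, while `can_access(Role.SUPERUSER, "cluster")` is true. In the real Farm, a Sysadmin may hold access to only some of #The Cluster's resources, so a per-resource mapping would be closer to practice than this single "cluster" label.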
User interfaces (UIs)
For the purposes of this very wikipage, a user interface (UI) refers to #The COTS' feature that allows #The COTS' instance and its users to interact. UIs of #The User Apps are described in the wikipages that are dedicated to particular applications. The other UIs of #The Farm are described in the #UI of the Environment, #UI of the Storage, #UI of the Gateway, and #UI for the Metal sections of this very wikipage.
Dashboards
- For the purposes of this very wikipage, a Dashboard refers to a graphic user interface (GUI) that either belongs to any #The COTS package installed in #The Farm or is provided by #The Provider. This screen-based interface allows #Farm users to interact with #The User Apps and other software through graphical buttons, icons, or hyperlinked texts rather than typed commands in command line interfaces (CLIs).
Third-party UIs
- For the purposes of this very wikipage, a Third-party UI refers to any user interface (UI) that neither belongs to any #The COTS package installed in #The Farm nor is provided by #The Provider. #The Sysadmins and #The Superusers may use access tools such as PuTTY and Midnight Commander to access #UI for the Metal.
The User Apps
For the purposes of this very wikipage, the User Apps refer to those end-user applications with which #The Patrons interact.
HumHub
- CNM Social, which is the end-user instance of CNM HumHub.
Jitsi
Odoo
The Cluster
For the purposes of this very wikipage, the Cluster refers to all of the software between #The User Apps and #The Infrastructure.
Cluster components
- #The Cluster consists of #The Nodes and their management tools. The following components compose #The Cluster:
- #The Storage that is powered by CNM Ceph to provide #The Environment with stored objects, blocks, and files. Three storage spaces of #The Nodes create one distributed storage foundation.
- #The Environment that is powered by CNM Proxmox to make containers and virtual machines (VMs) available to #The User Apps, so #The User Apps can function properly. Each of #The Nodes features its own environment; #The uptime tools orchestrate them all.
- #The Gateway that is powered by CNM pfSense to create a gateway between #The Environment and the outside world. There is only one gateway; if it fails, #The Farm fails.
Choice of COTS
- While building #The Farm generally and #Cluster components specifically, #The Team utilized only #The COTS that is both open-source and free of charge. Other considerations for the choice are stated in the #COTS for the Environment, #COTS for the Storage, #COTS for the Gateway, #COTS for backups sections of this very wikipage.
Cluster provisioning
- Provisioning of #The Cluster occurs in the following sequence:
- Since #The Environment is installed on the top of #The Infrastructure, the #Environment provisions shall be accommodated first.
- Since #The Storage is a part of #The Environment, #Storage provisions shall be accommodated second.
- Since #The Gateway is installed in #The Environment, #Gateway provisions shall be accommodated third.
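The provisioning sequence above follows from which component is installed on top of which. The sketch below derives an order from those dependencies using Python's standard `graphlib` module (Python 3.9 or later). The dictionary `DEPENDS_ON` and both function names are hypothetical; the dependencies themselves are taken from the three bullets above.

```python
from graphlib import TopologicalSorter

# component -> components that must be provisioned before it
DEPENDS_ON = {
    "environment": {"infrastructure"},
    "storage": {"environment"},
    "gateway": {"environment"},
}


def provisioning_order() -> list[str]:
    """Return one provisioning order that satisfies every dependency."""
    return list(TopologicalSorter(DEPENDS_ON).static_order())


def respects_dependencies(order: list[str]) -> bool:
    """Check that every component appears after all of its prerequisites."""
    position = {name: index for index, name in enumerate(order)}
    return all(
        position[dependency] < position[name]
        for name, dependencies in DEPENDS_ON.items()
        for dependency in dependencies
    )
```

The dependency graph alone only requires that #The Storage and #The Gateway both follow #The Environment; placing the Storage before the Gateway, as this wikipage does, is an additional convention that `respects_dependencies(["infrastructure", "environment", "storage", "gateway"])` accepts.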
Cluster monitoring
- Monitoring features are to be identified. Previously, various candidates offered three options:
- Stack -- prometheus + node-exporter + grafana
- Prometheus to monitor VMs, InfluxDB to monitor PVE nodes, Grafana for dashboards
- grafana + influxdb + telegraf, as well as zabbix. To monitor websites, use uptimerobot
Observability vs. APM vs. Monitoring
application performance management
Cluster recovery
High availability (HA)
Generally speaking, high availability (HA) of any system means higher uptime in comparison with a similar system that lacks HA features. HA of #The Farm means higher uptime in comparison with a similar farm built on just one of #The Nodes. Before #The uptime tools were deployed, #The Farm functioned only on one of #The Nodes and, when it failed, services of #The User Apps were no longer available until the failure was fixed. Now, as long as at least one of #The Nodes is operational, #The Farm is operational.
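The uptime gain can be illustrated with a short calculation. The sketch below is only an estimate under two assumptions that this wikipage does not make: that #The Nodes fail independently of each other and that every node has the same availability. It also ignores #The Gateway, which exists in a single instance. The function name `farm_availability` and the 99% figure are hypothetical.

```python
def farm_availability(node_availability: float, nodes: int = 3) -> float:
    """Probability that at least one of `nodes` independent nodes is up."""
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    # The farm is down only when every node is down at the same time.
    return 1.0 - (1.0 - node_availability) ** nodes


single_node = farm_availability(0.99, nodes=1)
three_nodes = farm_availability(0.99, nodes=3)
```

With a node that is up 99% of the time, a single-node farm is up 99% of the time, while under these assumptions a three-node farm is down only when all three nodes are down at once, which is one millionth of the time.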
The uptime tools
- Both #The Environment and #The Storage feature advanced tools for #High availability (HA).
- The CNM Proxmox instances. With regards to #The Farm's applications, when any application fails, its sister application installed on the second of #The Nodes continues its work. If that application fails too, the sister application installed on the third of #The Nodes takes over. If the third application fails, #The Farm can no longer provide #The Patrons with #The Farm's services in full. To ensure that failover, #The Farm utilizes tools that come with ProxmoxVE. Every virtual machine (VM) or container is kept on at least two of #The Nodes. When the operational resource, VM or container, fails on one instance, the second CNM Proxmox instance activates its own resource and requests the third instance to create the third resource as a reserve. As a result, the VM or container "migrates" from one of #The Nodes to another.
- The CNM Ceph instance. Because of the distributed nature of Ceph, #High availability (HA) is the native feature of #The Storage. When one DBMS fails, its sister DBMS installed on the second of #The Nodes continues its work. When that DBMS fails too, the sister DBMS installed on the third of #The Nodes takes over. If the third DBMS fails, #The Farm can no longer provide #The User Apps with the data they require to work properly.
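The failover behavior described for the CNM Proxmox instances can be sketched as a small state transition. This is an illustration of the description above, not ProxmoxVE's internal algorithm; the `Resource` class, the `fail_over` function, and the node names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A VM or container with one active copy and one reserve copy."""
    name: str
    active_on: str
    reserve_on: str


def fail_over(resource: Resource, failed_node: str, nodes: list[str]) -> Resource:
    """Return the resource placement after `failed_node` goes down."""
    if failed_node == resource.active_on:
        survivor = resource.reserve_on   # the reserve copy is activated
    elif failed_node == resource.reserve_on:
        survivor = resource.active_on    # the active copy keeps running
    else:
        return resource                  # the failure does not touch this resource
    spares = [node for node in nodes if node not in (failed_node, survivor)]
    if not spares:
        raise RuntimeError("no node left to host a new reserve copy")
    # A remaining node is asked to create a fresh reserve copy.
    return Resource(resource.name, active_on=survivor, reserve_on=spares[0])
```

For example, a resource that is active on the first node and reserved on the second ends up active on the second node and reserved on the third once the first node fails.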
Uptime limitations
- Generally speaking, HA comes with significant costs. So do #The uptime tools. At the very least, running three of #The Nodes is more expensive than running one. The cost cannot exceed the benefit, so high availability (HA) cannot be equal to failure tolerance.
Uptime management
- To manage redundant resources, #The uptime tools:
- Monitor their resources to identify whether they are operational or failed as described in the #Monitoring section of this very wikipage.
- Fence those resources that are identified as failed. As a result, non-operational resources are withdrawn from the list of available resources.
- Restore those resources that are fenced. The #Recovery supports that feature by constantly creating snapshots and reserve copies of #The Farm and its parts in order to make them available for restoring when needed.
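The monitor, fence, and restore steps above can be sketched as two passes over a table of resource states. This is a simplified illustration, not the code of ProxmoxVE or Ceph; the `State` enum, the function names, and the `is_healthy` and `restore` callbacks are hypothetical stand-ins for real health checks and recovery procedures.

```python
from enum import Enum
from typing import Callable


class State(Enum):
    OPERATIONAL = "operational"
    FENCED = "fenced"


def fence_failed(states: dict[str, State],
                 is_healthy: Callable[[str], bool]) -> dict[str, State]:
    """Monitor every operational resource and fence those that fail the check."""
    return {
        name: (State.FENCED
               if state is State.OPERATIONAL and not is_healthy(name)
               else state)
        for name, state in states.items()
    }


def restore_fenced(states: dict[str, State],
                   restore: Callable[[str], bool]) -> dict[str, State]:
    """Try to restore every fenced resource; it stays fenced if restoring fails."""
    return {
        name: (State.OPERATIONAL
               if state is State.FENCED and restore(name)
               else state)
        for name, state in states.items()
    }


def available(states: dict[str, State]) -> list[str]:
    """List the resources that may currently receive work."""
    return sorted(name for name, state in states.items()
                  if state is State.OPERATIONAL)
```

Keeping fencing and restoring as separate passes mirrors the list above: a failed resource first disappears from `available(...)` and returns only after a successful restore.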
Uptime principles
- Principally, #High availability (HA) of #The Cluster is based on:
- A principle of redundancy. Each of #The User Apps, as well as every object, block, or file that #The User Apps may use is stored at least twice on different hardware servers of #The Nodes as #The uptime tools and #Uptime of the Storage sections describe.
- Management of redundant resources. #The Cluster needs to put into operation those and only those resources that are in good standing and operational shape as described in the #Uptime management section.
The Environment
For the purposes of this very wikipage, the Environment refers to the virtual environment (VE) of #The Cluster or, allegorically, to the environment where #The User Apps "live".
COTS for the Environment
- As #The COTS for the Environment, #The Team utilizes CNM Proxmox. For a while, #The Team has also tried OpenStack and VirtualBox as its virtualization tools. The trials suggested that OpenStack required more hardware resources and VirtualBox didn't allow for required sophistication in comparison with ProxmoxVE, which has been chosen as #The COTS for #The Farm's virtualization.
Environment features
- #The Team uses virtualization to divide hardware resources of #The Node's bare-metal servers into smaller containers and virtual machines (VMs), which are created in the Environment to run #The User Apps, #The Gateway, and other applications. In #The Farm, the ProxmoxVE instance tightly integrates the KVM hypervisor, LXC containers, CNM Ceph as software-defined storage, as well as networking functionality on a single virtualization platform.
Environment functions
- #The Environment executes four major functions. It:
- Runs #The User Apps, #The Gateway, and other applications. They can be deployed utilizing two models:
- Using containers; they already contain operating systems tailored specifically to the needs of the App.
- In virtual machines (VM) and without containers. In that model, the App is installed on the operating system of its VM.
- Hosts #The Storage and #Backup box
- Connects the applications it runs and the storages it hosts to each other and to #The Bridges, while creating networks.
- Creates backups and accommodates its own recovery when requested.
Environment provisions
- Every instance of ProxmoxVE requires one "physical" bare-metal server. The interaction between ProxmoxVE instances and #The Infrastructure is carried out by Debian operating system (OS) that comes in the same "box" of #The COTS as ProxmoxVE and is specifically configured for that interaction. #The Farm's ProxmoxVE also hosts #The Storage as its storage.
UI of the Environment
- With regards to #User interfaces (UIs), #The Environment features its administrative interface, which belongs to #Dashboards.
The Storage
For the purposes of this very wikipage, the Storage refers to the storage platform or database management system (DBMS) that provides #The User Apps with the storage they need to operate. Thus, the Storage supports #The Environment's non-emergency operations and differs from the #Backup box that comes into play in emergencies.
COTS for the Storage
- As #The COTS for #The Storage, #The Team utilizes CNM Ceph. Any ProxmoxVE instance requires some storage to operate.
- Before deploying #The uptime tools, #The Team used RAID to make the doubled hard disks redundant. So, the ProxmoxVE instance was installed on one disk and replicated to the other disk automatically. ProxmoxVE, however, allows for more flexible usage of hard disks: it can be configured to work with many storage-type software packages such as ZFS, NFS, GlusterFS, and so on.
- Initially, the cluster developer proposed using Ceph. Later, #The Team substituted one node with another that had a larger hard disk but no SSD or NVMe; as a result, #The Farm's storage collapsed. The substituted node was disconnected (today, it serves as hardware for CNM Lab Farm), a new bare-metal server was purchased (today, it is the #Node 3 hardware), and Ceph was restored.
- As #The COTS, ProxmoxVE comes with OpenZFS. #The Team has deployed the combination of both in its CNM Lab Farm.
Storage features
- #The Storage features are:
- File system
- Distributed
- Fault tolerant
- Object storage
- Block device
Storage functions
- To make objects, blocks, and files immediately available for #The User Apps' operations, #The Cluster uses a common distributed cluster foundation that orchestrates storage spaces of #The Nodes.
Storage provisions
- Since #The Storage is installed on the top of #The Environment, the Storage provisioning entails configuring a ProxmoxVE instance to work with a CNM Ceph instance.
- At #The Farm, CNM Ceph is deployed at all of #The Nodes. Each of #The Node's servers features doubled hard disks. Physically, a ProxmoxVE instance is installed on one disk of each of #The Nodes; CNM Ceph uses three "second" disks. So, #The Farm features three instances of ProxmoxVE and one instance of Ceph.
- While experimenting with OpenZFS and RAID, #The Team has also tried another model. The second disks then served as reserve copies of the first ones. Since every disk is just 512 GB, that model halved #The Farm's capacity because both #The User Apps and their storage needed to fit into the 512 GB limit together.
- In the current model, #The User Apps do not have to share their 512 GB with the storage. On the other hand, #The Farm's CNM Ceph capacity is about 3 * 512 GB = 1,536 GB (roughly 1.5 TB).
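The capacity arithmetic above can be checked in a few lines. The sketch below also shows how the usable capacity shrinks once every object is stored more than once. The replica count is a parameter here because this wikipage only says that data is stored "at least twice"; the function names are hypothetical.

```python
DISK_GB = 512   # size of each "second" disk, as stated above
NODES = 3       # one Ceph disk per node


def raw_capacity_gb(nodes: int = NODES, disk_gb: int = DISK_GB) -> int:
    """Total size of all disks that Ceph uses, before any replication."""
    return nodes * disk_gb


def usable_capacity_gb(raw_gb: int, replicas: int) -> float:
    """Approximate space left for data when every object is kept `replicas` times."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return raw_gb / replicas


raw = raw_capacity_gb()                        # 3 * 512
two_copies = usable_capacity_gb(raw, replicas=2)
three_copies = usable_capacity_gb(raw, replicas=3)
```

So the 1,536 GB figure is the raw capacity; with two copies of everything roughly 768 GB remains for data, and with three copies roughly 512 GB. Real Ceph pools also reserve some space for metadata and rebalancing, which this estimate leaves out.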
UI of the Storage
- With regards to #User interfaces (UIs), #The Storage features its administrative interface, which belongs to #Dashboards.
The Gateway
For the purposes of this very wikipage, the Gateway refers to the composition of software, such as a load balancer or reverse proxy, that is built on the #External Bridge. The Gateway is the hub for both #The Farm's wide area network (WAN) and local area network (LAN). To power the Gateway, CNM pfSense is deployed.
COTS for the Gateway
- As #The COTS for the Gateway, #The Team utilizes CNM pfSense. For a while, #The Team has also tried iptables as a firewall together with Fail2ban, which operates by monitoring log files (e.g. /var/log/auth.log, /var/log/apache/access.log, etc.) for selected entries and running scripts based on them. Most commonly, Fail2ban is used to block selected IP addresses that may belong to hosts that are trying to breach the system's security. It can ban any host IP address that makes too many login attempts or performs any other unwanted action within a time frame defined by the administrator. Fail2ban supports both IPv4 and IPv6.
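The banning rule described above (too many failed attempts from one address within a time frame) can be sketched as a sliding-window counter. This is an illustration of the idea only, not Fail2ban's code: the real tool parses log files and runs configurable actions. The class `LoginBanTracker`, its parameters, and the sample addresses are hypothetical.

```python
from collections import defaultdict, deque


class LoginBanTracker:
    """Ban an address after `max_attempts` failures within `window_seconds`."""

    def __init__(self, max_attempts: int, window_seconds: float) -> None:
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.banned: set[str] = set()
        self._failures: dict[str, deque] = defaultdict(deque)

    def record_failure(self, ip: str, timestamp: float) -> bool:
        """Record one failed attempt; return True if the address is banned."""
        attempts = self._failures[ip]
        attempts.append(timestamp)
        # Forget failures that are older than the time frame.
        while timestamp - attempts[0] > self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            self.banned.add(ip)
        return ip in self.banned
```

With `LoginBanTracker(max_attempts=3, window_seconds=600)`, three failures within ten minutes ban the address, while three failures spread over a longer period do not.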
Gateway features
Gateway functions
FreeBSD, HA, VPN, LDAP, backups, CARP VIP
- #The Gateway can be compared to an executive secretary, who (a) takes external client's requests, (b) serves as a gatekeeper, while checking validity of those requests, (c) when the request is valid, selects to which internal resource to dispatch it, (d) dispatches those requests to the selected resource, (e) gets internal responses, and (f) returns them back to the client in the outside world.
- Thus, #The Gateway:
- Constantly monitors the state of internal resources of #The Farm.
- Receives requests from the world outside of #The Farm.
- Checks validity of external requests, while serving as a firewall.
- When the request is valid, selects to which of #The Nodes to dispatch it. #The Gateway is responsible for dispatching external requests to those and only to those internal resources that #Cluster monitoring has identified as operational.
- Dispatches those requests to #The Node that was selected.
- Collects internal responses.
- Returns those responses to the outside world.
- To be more accessible to its clients, #The Gateway utilizes public IPv4 addresses.
Gateway provisions
- #The Gateway is deployed in a virtual machine (VM) of #The Environment.
UI of the Gateway
- With regards to #User interfaces (UIs), #The Gateway features its administrative interface, which belongs to #Dashboards.
Gateway components
#The Gateway includes #Firewall and router, #Load balancer, and #Web server.
Firewall and router
- CNM pfSense plays the roles of a firewall, a reverse proxy, and the platform to which #Load balancer and #Web server are attached.
Load balancer
- As a load balancer, CNM pfSense uses a select version of HAProxy that is specifically configured as pfSense's add-on. As of summer of 2023, no full HAProxy Manager exists in #The Farm, and a round robin model is activated for load balancing.
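Round robin means that requests are handed to #The Nodes in a fixed rotation. The sketch below illustrates that model and skips nodes that a health check reports as down, in line with the rule that requests go only to operational resources. It is not HAProxy's implementation; the function `round_robin`, the `is_operational` callback, and the node names are hypothetical.

```python
from itertools import cycle
from typing import Callable, Iterator


def round_robin(nodes: list[str],
                is_operational: Callable[[str], bool]) -> Iterator[str]:
    """Yield nodes in rotation, skipping those reported as non-operational."""
    rotation = cycle(nodes)
    while True:
        # Look at each node at most once per request.
        for _ in range(len(nodes)):
            node = next(rotation)
            if is_operational(node):
                yield node
                break
        else:
            raise RuntimeError("no operational node available")


down = {"node2"}
picker = round_robin(["node1", "node2", "node3"], lambda node: node not in down)
first_four = [next(picker) for _ in range(4)]
```

While the second node is down, `first_four` alternates between the first and the third node; once the health check reports the second node as operational again, it rejoins the rotation.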
Web server
- As its web server, pfSense utilizes lighttpd. Prior to deployment of CNM pfSense, #The Team utilized two web servers to communicate with the outside world via HTTP. Nginx handled requests initially, and Apache HTTP Server handled those requests that had not been handled by Nginx.
Web architecture
For the purposes of this wikipage, "web architecture" refers to #The Farm's outline of DNS records and IP addresses.
Channels and networks
- #The Farm's communication channels are built on #The Metal and #The Bridges. Currently, #The Cluster uses three communication channels, each of which serves one of the networks as follows:
- WAN (wide area network), which is #The Farm's public network that uses external, public IPv4 addresses to integrate #The Gateway into the Internet. The public network is described in the #The Gateway section of this wikipage.
- LAN (local area network), which is #The Farm's private network that uses internal, private IPv6 addresses to integrate #The Gateway and #The Nodes into one network cluster. This network cluster is described in #The Environment section of this very wikipage.
- SAN (storage area network), which is #The Farm's private network that uses internal, private IPv6 addresses to integrate storage spaces of #The Nodes into one storage cluster. This storage cluster is described in #The Storage section of this wikipage.
- #The Farm's usage of IP addresses is best described in the #IP addresses section.
DNS zone
- To locate #The Farm's public resources in the Internet, the following DNS records are created in #The Farm's DNS zone:
| Field | Type | Data | Comment (not a part of the records) | Review |
|---|---|---|---|---|
| pm1.bskol.com | AAAA record | 2a01:4f8:10a:439b::2 | Node 1 | No data |
| pm2.bskol.com | AAAA record | 2a01:4f8:10a:1791::2 | Node 2 | No data |
| pm?.bskol.com | AAAA record | | Node ? | No data |
| pf.bskol.com | A record | 88.99.71.85 | CNM pfSense | Record is not operational |
| talk.cnmcyber.com | A record | 188.34.147.106 | CNM Talk (CNM Jitsi) | Passed |
| corp.cnmcyber.com | A record | 188.34.147.106 | CNM Corp (CNM Odoo) | Passed |
| social.cnmcyber.com | A record | 188.34.147.106 | CNM Social (CNM HumHub) | Passed |
| portainer.cnmcyber.com | A record | 188.34.147.107 | Docker server, dockers are used for all monitoring | Passed |
| dash-status.cnmcyber.com | A record | 188.34.147.107 | Dashboard for monitoring status powered by Uptime Kuma | Passed |
| status.cnmcyber.com | A record | 188.34.147.107 | | Passed |
| influxdb.cnmcyber.com | A record | 188.34.147.107 | InfluxDB | Passed |
| monitor.cnmcyber.com | A record | 188.34.147.107 | Grafana | Passed |
| npm.cnmcyber.com | A record | 188.34.147.107 | Nginx Proxy Manager | Passed |
| pass.cnmcyber.com | A record | 188.34.147.107 | Passbolt | Passed |
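A quick consistency check for records like those above is to confirm that every A record holds an IPv4 address and every AAAA record holds an IPv6 address. The sketch below does that with Python's standard `ipaddress` module on a few of the listed records; the names `RECORDS`, `EXPECTED_VERSION`, and `record_is_consistent` are hypothetical, and the check says nothing about whether a record actually resolves.

```python
import ipaddress

# (field, type, data) triples copied from the DNS zone records above.
RECORDS = [
    ("pm1.bskol.com", "AAAA", "2a01:4f8:10a:439b::2"),
    ("pm2.bskol.com", "AAAA", "2a01:4f8:10a:1791::2"),
    ("pf.bskol.com", "A", "88.99.71.85"),
    ("talk.cnmcyber.com", "A", "188.34.147.106"),
    ("portainer.cnmcyber.com", "A", "188.34.147.107"),
]

# IP version that each record type is expected to carry.
EXPECTED_VERSION = {"A": 4, "AAAA": 6}


def record_is_consistent(record_type: str, data: str) -> bool:
    """Return True if `data` is a valid address of the version the type requires."""
    try:
        address = ipaddress.ip_address(data)
    except ValueError:
        return False  # empty or malformed data, such as a record still to be filled in
    return address.version == EXPECTED_VERSION.get(record_type)


inconsistent = [field for field, record_type, data in RECORDS
                if not record_is_consistent(record_type, data)]
```

A record with empty data, such as the one reserved for the third node, would be reported as inconsistent until its address is filled in.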
Web server files
Legacy
- haproxy.bskol.com 86400 A 0 185.213.25.206
- influx.bskol.com 86400 A 0 49.12.5.41
- monitor.bskol.com 86400 A 0 49.12.5.41
- pbs.bskol.com 86400 A 0 88.99.214.92
- pf.bskol.com 86400 A 0 88.99.71.85
- pm1.bskol.com 86400 A 0 88.99.218.172
- pm2.bskol.com 86400 A 0 88.99.71.85
- pm3.bskol.com 86400 A 0 88.99.214.92
- zabbix.bskol.com 86400 A 0 167.235.255.244
- pbs.bskol.com 86400 AAAA 0 2a01:4f8:10a:3f60::2
- pf.bskol.com 86400 AAAA 0 2a01:4f8:fff0:53::6
IP addresses
- To locate its resources in the #Communication channels, #The Farm uses three types of IP addresses:
- To access #The Environment from the outside world, #The Farm features public IPv6 addresses. One address is assigned to each of #The Nodes. Since there are three of them, three addresses of that type are created.
- For an internal network of #The Nodes, which is assembled on the #Internal Bridge, private IP addresses are used. This network is not accessible from the Internet and is not included in #The Farm's DNS zone. For instance, #The Storage utilizes this network to synchronize its data. For this network, a "/24" address range is selected.
- For an external network of three Nodes, which is assembled on the #External Bridge, #The Farm features public IPv4 addresses. They are handled by #Web intermediaries.
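To show what a "/24" range means in practice, the sketch below inspects one with Python's standard `ipaddress` module. The range 10.10.10.0/24 and the node address are hypothetical examples; this wikipage does not disclose the actual internal addresses.

```python
import ipaddress

# Hypothetical private range; only the "/24" prefix length comes from the text above.
internal = ipaddress.ip_network("10.10.10.0/24")

total_addresses = internal.num_addresses       # all addresses in the range
host_addresses = len(list(internal.hosts()))   # excludes network and broadcast addresses
example_node = ipaddress.ip_address("10.10.10.11")  # hypothetical node address
node_in_range = example_node in internal
```

A "/24" range holds 256 addresses, 254 of which can be assigned to hosts, which leaves ample room for three Nodes and #The Gateway. `internal.is_private` confirms that such a range is not routable on the Internet.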
SSL certificates
Backup box
A backup box is deployed on a 1 TB, unlimited traffic storage box BX-11 that has been rented for that purpose.
COTS for backups
- #The Team utilizes no additional software beyond ProxmoxVE for backups. Initially, Proxmox Backup Server was used; however, it consumed the active storage. As a result, the storage box was attached directly to #The Environment, and backups now go directly from #The Environment to that storage and back.
Box features
- 10 concurrent connections, 100 sub-accounts, 10 snapshots, 10 automated snapshots, FTP, FTPS, SFTP, SCP, Samba/CIFS, BorgBackup, Restic, Rclone, rsync via SSH, HTTPS, WebDAV, Usable as network drive
Box functions
- #The Provider's description: Storage Boxes provide you with safe and convenient online storage for your data. Score a Storage Box from one of Hetzner Online's German or Finnish data centers! With Hetzner Online Storage Boxes, you can access your data on the go wherever you have internet access. Storage Boxes can be used like an additional storage drive that you can conveniently access from your home PC, your smartphone, or your tablet. Hetzner Online Storage Boxes are available with various standard protocols which all support a wide array of apps. We have an assortment of diverse packages, so you can choose the storage capacity that best fits your individual needs. And upgrading or downgrading your choice at any time is hassle-free!
Box provisions
UI for backups
Used terms
On this very wikipage, a few abbreviations and terms are commonly used.
- Bridge. Either of the two Hetzner vSwitches that #The Farm utilizes.
- UI. A user interface, which is the COTS feature that allows a COTS instance and its users to interact.
The COTS
- Commercial off-the-shelf (COTS) software.
The Farm
CNM Bureau Farm, this very wikipage describes it.
The Node
One hardware, bare-metal server of #The Infrastructure with all of software installed on the top of it.
The Team
See also
Related lectures
Useful recommendations
- https://www.informaticar.net/how-to-setup-proxmox-cluster-ha/ (using Ceph without Hetzner vSwitch)
- https://community.hetzner.com/tutorials/hyperconverged-proxmox-cloud (using Ceph with Hetzner vSwitch)
- https://pve.proxmox.com/wiki/High_Availability (general ProxmoxVE HA functionality)
- https://docs.hetzner.com/robot/dedicated-server/network/vswitch/ (general Hetzner vSwitch functionality)
Jitsi/Odoo/HumHub, DNS, CNM pfSense, monitoring -- Telegraf + InfluxDB + Grafana, Uptime Kuma, Passbolt, mail server, LDAP, DNS zone, IPv4, what else to do? HA test, alpha testing, DNS