LAB project
Latest revision as of 14:16, 4 April 2026
Resilient, low-power and encrypted LAB CLOUD project.
Diagram
Data security
- Availability is provided by the HA script (see the HA section), RSYNC, a second location, a second internet provider and public DNS
- Integrity is provided by BTRFS and will be provided by MINIO for large static files
- Theft protection is provided by AC and FDE encryption
- Loss protection relies on AC, RSYNC, BTRFS and the locking/unlocking of FDE drives
- Scalability will be provided by CEPH
My LAB tools
- MEDIAWIKI (sharing editor)
- DEBIAN
- app.diagrams.net (diagram editor)
- x86 CPU (for ARM I will use QEMU emulator)
- OPENCL GPU (AI&SIGNAL processing)
- LXD/SNAP/MICROK8S (virtualization&container)
- LSOF (system&network audit)
- NMAP (Network scanner)
- TCPDUMP (Network monitoring)
Tested in this LAB
- NEXTCLOUD nextcloud.infocepo.com (aFM8yKYmDa)
- MEDIAWIKI infocepo.com
- KUBERNETES Cluster wiki.infocepo.com (demo available, ask me)
- MARIADB GALERA Cluster
- OPENSTACK
- PROMETHEUS prometheus.infocepo.com (demo available, ask me)
- YACY google.infocepo.com (decentralized search engine) (demo available, ask me)
- GLUSTERFS
- KATA (container runtime like a VM)
- AWS CLI
- ANSIBLE
- GIT
HA
I love COROSYNC/PACEMAKER, but for this LAB I wanted something from scratch:
#!/usr/bin/env bash
# ynotopec at gmail.com
set -u
domainName="$(<domainName)"
portNumber="$(<portNumber)"
publicHost1="$(<publicHost1)"
publicHost2="$(<publicHost2)"
[ -n "$domainName" ] || exit 1
[ -n "$portNumber" ] || exit 1
[ -n "$publicHost1" ] || exit 1
[ -n "$publicHost2" ] || exit 1
command -v dig >/dev/null 2>&1 || exit 1
command -v nc >/dev/null 2>&1 || exit 1
command -v rsync >/dev/null 2>&1 || exit 1
command -v ping >/dev/null 2>&1 || exit 1
command -v flock >/dev/null 2>&1 || exit 1
exec 9>/tmp/"${domainName}".ha.lock
flock -n 9 || exit 0
mkdir -p /storage/rsync-rollback ha_synced
resolve_ipv4() {
    # Print the first non-empty A record for the given name
    dig +time=2 +tries=1 +short A "$1" | awk 'NF{print; exit}'
}
echo "$(date): init"
# Clean
stop"${domainName}".sh
# If I was the master sleep a little !
ipPublic="$(dig +time=2 +tries=1 +short myip.opendns.com @resolver1.opendns.com | awk 'NF{print; exit}')"
ipMasterOld="$(resolve_ipv4 "${domainName}")"
[ -n "${ipPublic}" ] && [ "${ipPublic}" = "${ipMasterOld}" ] && sleep 240
# Wait Internet
while ! ping -w2 -c1 "${publicHost1}" >/dev/null 2>&1; do
    sleep 10
done
# Wait Admin Unlock Backup
waitAdminUnLockBCK.sh
# If passive merge backup
if nc -zw2 "${domainName}" 443 >/dev/null 2>&1 || { sleep 10; nc -zw4 "${domainName}" 443 >/dev/null 2>&1; }; then
    echo "$(date): merge backup from ${domainName}:443"
    # The exclude pattern is quoted so the local shell never expands it
    rsync --max-size=4M --ignore-existing --numeric-ids --modify-window=1 --ignore-errors --block-size=128.00K --inplace --no-whole-file \
        -z --compress-level=9 \
        -aAXx \
        --exclude='.rsync_*' \
        --rsh="ssh -i ~/.ssh/storage@${domainName}.key -p ${portNumber} -oStrictHostKeyChecking=no" \
        "${domainName}:/storage/rsync-rollback/" /storage/rsync-rollback/
fi
# lock bck source after merge
lockBckSource.sh
# Wait master down (big loop)
while nc -zw2 "${domainName}" "${portNumber}" >/dev/null 2>&1 || { sleep 10; nc -zw4 "${domainName}" "${portNumber}" >/dev/null 2>&1; }; do
    # Sync at most once per hour: skip while ha_synced holds a file newer than 60 minutes
    if [ -z "$(find ha_synced -type f -mmin -60 2>/dev/null)" ]; then
        echo "$(date): sync from ${domainName}:${portNumber}"
        rsync --numeric-ids --delete --force --modify-window=1 --ignore-errors --block-size=128.00K --inplace --no-whole-file \
            -z --compress-level=9 \
            -aAXx \
            --backup-dir="rsync-rollback/$(date '+%Y-%m-%d')" \
            --exclude=rsync-rollback \
            --exclude='.rsync_*' \
            --rsh="ssh -i ~/.ssh/storage@${domainName}.key -p ${portNumber} -oStrictHostKeyChecking=no" \
            "${domainName}:/storage/" /storage/ && touch ha_synced/stamp
        # ha_synced is a directory (see mkdir above), so the stamp must be a file inside it
    fi
    sleep 10
done
echo "$(date): master detected down on ${domainName}:${portNumber}"
# Maybe it's me down ! If INTERNET down, reboot
if ! ping -w2 -c1 "${publicHost1}" >/dev/null 2>&1; then
    echo "$(date): reboot because connectivity confirmation failed"
    reboot
    # reboot may return before the shutdown starts: do not go on to become master
    exit 1
fi
# Become master
# lock Backup (integrity protection against attacks)
lockBCK.sh
# Register DNS
ipMasterDown="$(resolve_ipv4 "${domainName}")"
[ -n "${ipPublic}" ] || { reboot; exit 1; }
updateDns.sh "${ipPublic}"
echo "$(date): dns updated to ${ipPublic}"
# Random startup time
sleep $((RANDOM % 10))
"${domainName}"Start.sh &
# Wait DNS propagation
sleep 215
# Monitor
echo "$(date): up"
[ "${ipPublic}" = "${ipMasterDown}" ] && ipMasterDown=""
while [ "${ipPublic}" = "$(resolve_ipv4 "${domainName}")" ] \
    && { [ -z "${ipMasterDown}" ] || ! nc -zw2 "${ipMasterDown}" 443 >/dev/null 2>&1; } \
    && { nc -zw2 "${ipPublic}" 443 >/dev/null 2>&1 || nc -zw4 "${ipPublic}" 443 >/dev/null 2>&1; } \
    && { [ -n "${ipMasterDown}" ] && nc -zw2 "${ipMasterDown}" "${portNumber}" >/dev/null 2>&1 || ping -w2 -c1 "${publicHost1}" >/dev/null 2>&1 || ping -w2 -c1 "${publicHost2}" >/dev/null 2>&1; }; do
    sleep 4
done
echo "$(date): down"
# Stop
stop"${domainName}".sh
reboot
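
The script repeats two small patterns: a port probe that is retried once before the master is declared down, and an hourly throttle based on the modification time of a stamp file. A minimal sketch of both as reusable functions follows. The names `probe_with_retry` and `is_fresh` are hypothetical and are not part of the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not taken from the HA script above.

# Run a probe command; if it fails, wait <delay> seconds and try once more.
# Usage: probe_with_retry <delay_seconds> <command> [args...]
probe_with_retry() {
    local delay="$1"
    shift
    "$@" >/dev/null 2>&1 && return 0
    sleep "$delay"
    "$@" >/dev/null 2>&1
}

# Succeed when <stamp> is a regular file modified less than <minutes> ago.
# Usage: is_fresh <stamp_file> <minutes>
is_fresh() {
    local stamp="$1" minutes="$2"
    [ -n "$(find "$stamp" -maxdepth 0 -type f -mmin "-$minutes" 2>/dev/null)" ]
}
```

With these, the big loop could read `while probe_with_retry 10 nc -zw2 "$domainName" "$portNumber"; do is_fresh ha_synced/stamp 60 || sync_once; sleep 10; done`, where `sync_once` stands for the rsync call and is also hypothetical. Unlike the script, this sketch uses the same probe timeout for both attempts.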
Optimizations explained
To reduce network traffic I added these options to rsync:
- only send the modified data blocks of each file
--inplace --block-size=128.00K
- compress the transfer
-z --compress-level=9
To increase flash storage life (files are updated in place instead of being rewritten in full):
--inplace --no-whole-file
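
To see what these options do, the following sketch runs a local transfer and prints rsync's own statistics. It assumes rsync is installed. The function name, the temporary files and the 1 MiB test size are made up for the demonstration, and the block size is written as 131072 bytes, the same value as 128.00K.

```shell
#!/usr/bin/env bash
# Hypothetical demo: change a few bytes in a 1 MiB file and let rsync
# update the copy in place, sending only the affected block.
demo_delta_sync() {
    local dir status
    dir="$(mktemp -d)" || return 1
    head -c 1048576 /dev/urandom > "$dir/src.img"
    cp "$dir/src.img" "$dir/dst.img"
    # Overwrite 7 bytes in the middle of the source without changing its size.
    printf 'CHANGED' | dd of="$dir/src.img" bs=1 seek=500000 conv=notrunc 2>/dev/null
    # -I (--ignore-times) forces a comparison even if size and mtime still match.
    rsync -I --inplace --no-whole-file --block-size=131072 --stats \
        "$dir/src.img" "$dir/dst.img" | grep -E '^(Literal|Matched) data'
    # The copy must be identical to the source afterwards.
    cmp -s "$dir/src.img" "$dir/dst.img"
    status=$?
    rm -rf "$dir"
    return "$status"
}
```

Calling `demo_delta_sync` prints the "Literal data" (bytes actually sent) and "Matched data" (bytes reused from the existing copy) lines; for a small change the matched part should cover most of the file.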
Cost per month
800€*2=1600€ :hypervisors (popular x86 CPU notebook, 8GB+)
-400€*2=-800€ :sale after one year
800€/12m~=67€ :price/month
30€*2=60€ :Internet (~1PB outbound/month)
(7.5W+7.5W)*24h*30.5d/1000Wh*.15€*2~=3.3€ :electricity
67+60+3.3~=130€ :cost/month
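
The same estimate can be recomputed from its inputs. This is a sketch under stated assumptions: the net hardware cost (purchase minus resale) of both machines is spread over 12 months, each location draws 15 W, and the function name `monthly_cost` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical calculator for the monthly cost of the LAB.
# Usage: monthly_cost <price> <resale> <nodes> <months> <internet_per_node> <watts_per_node> <eur_per_kwh>
monthly_cost() {
    LC_ALL=C awk -v price="$1" -v resale="$2" -v nodes="$3" -v months="$4" \
        -v internet="$5" -v watts="$6" -v kwh_price="$7" 'BEGIN {
        hardware = (price - resale) * nodes / months               # amortised hardware
        net      = internet * nodes                                # internet subscriptions
        power    = watts * 24 * 30.5 / 1000 * kwh_price * nodes    # electricity
        printf "%.2f\n", hardware + net + power
    }'
}

monthly_cost 800 400 2 12 30 15 0.15   # prints 129.96
```

Changing the fourth argument to 24 shows the effect of keeping the notebooks for two years before resale.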
Future
- Migrate from rsync to real-time replication.
- Add S3 for object storage and streaming.
- Improve the synchronization algorithm (bisync.sh):
  - The NEXTCLOUD algorithm is good on average, but very poor for large, frequently changing files such as databases or virtual machine images.
  - OSYNC is slow, and I have not tested it with big files.
  - RSYNC is good on average but does not track inodes, so a moved file is transferred again.
- Alert when the passive location is locked, and add an unlock page.
- Double the storage to get active/active locations: storage 1 replicates from location A to B, storage 2 from location B to A.
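
The remark about inodes can be illustrated with a short sketch: a file moved within one filesystem keeps its inode number, so a tool that records inodes can recognise a move instead of seeing a deletion plus a new file. This is not the content of bisync.sh; the function names and temporary paths are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical illustration of move detection through inode numbers.

# Print the inode number of a file (POSIX ls -i, first column).
inode_of() {
    ls -di -- "$1" | awk '{print $1}'
}

demo_move_detection() {
    local dir before after
    dir="$(mktemp -d)" || return 1
    echo data > "$dir/a.img"
    before="$(inode_of "$dir/a.img")"
    mv "$dir/a.img" "$dir/b.img"        # rename inside the same filesystem
    after="$(inode_of "$dir/b.img")"
    rm -rf "$dir"
    [ "$before" = "$after" ] && echo "same inode: a move, not a new file"
}
```

A synchronizer built on this idea would store path and inode pairs from the previous run and replay renames on the other side before transferring data.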
