Main Page: Difference between revisions

From Essential
 
(46 intermediate revisions by the same user not shown)
[[File:DALL·E 2024-01-06 13.58.36 - Logo for a website named 'Infocepo.com', focusing on cloud computing, AI, IT SRE salaries, and best practices in open source. The design should be mod.png|thumb|right]]
[[File:Infocepo-picture.png|thumb|right|Discover cloud computing on infocepo.com]]
'''Discover cloud computing on infocepo.com''':
* Master cloud infrastructure
* Explore AI
* Compare Kubernetes and AWS
* Advance your IT skills with hands-on labs and open-source software.


Start your journey to expertise.
= Discover Cloud Computing on infocepo.com =


<br>
Welcome! This portal is designed for IT professionals, engineers, students, and enthusiasts who want to master cloud infrastructure, explore AI tools, and accelerate their IT skills through hands-on labs and open-source solutions.
== AI Tools ==
*[https://chat.openai.com ChatGPT4] - public assistant with learning abilities.
*[https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard LLM] + [https://www.scaleway.com/en/h100-pcie-try-it-now/ GPU H100] + [https://ollama.com Ollama] - private assistant and API.
*[https://github.com/lm-sys/FastChat/blob/main/docs/langchain_integration.md LANGCHAIN] - [https://www.anyscale.com/blog/a-comprehensive-guide-for-building-rag-based-llm-applications-part-1#embed-data RAG] and automation.
*[https://github.com/ynotopec/summarize private summary]


=== DEV ===
__TOC__
*[https://huggingface.co/models Models Trending]
*[https://github.com/trending Project Trending]
*[https://chat.lmsys.org ChatBot Evaluate]
*[https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard LLM Ranking]
*[https://huggingface.co/spaces/mteb/leaderboard Embeddings Ranking]
*[https://huggingface.co/spaces/TIGER-Lab/GenAI-Arena Image Evaluate]
*[https://www.perplexity.ai Perplexity AI] - R&D
*[https://github.com/THUDM/CogVLM CogVLM] - Private API for multimodal purposes. Usable with RAG.
*[https://ann-benchmarks.com Vectors DB Ranking]
*[https://github.com/chatchat-space/Langchain-Chatchat Chatchat] - private assistant with RAG capabilities; Chinese-language project.
*[https://www.nvidia.com/en-us/data-center/h200 NVIDIA H200] - KUBERNETES or HPC clusters for DATASCIENCE.
*[https://www.nvidia.com/fr-fr/geforce/graphics-cards/40-series/rtx-4070-family NVIDIA 4070] - GPU card for a private assistant.
==== INTERESTING LLMs (updated 15/02/2024) ====
* Vicuna-33B (private assistant)
* HanNayeoniee/LHK_DPO_v1 (32k, RAG)
* Vicuna-7B (summary)
* cognitivecomputations/dolphin-2.2.1-mistral-7b (processing text)
* HuggingFaceH4/zephyr-7b-beta (efficient)
* FastChat-T5-3B (small devices)


=== NEWS ===
== Quick Start ==
* LLM + VISION [https://huggingface.co/deepseek-ai/deepseek-vl-7b-chat deepseek-ai/deepseek-vl-7b-chat]
* '''Master cloud infrastructure:''' Practical guides and labs
* LLM [https://huggingface.co/01-ai/Yi-34B-200K Yi-34B 200k] for long context available
* '''Explore artificial intelligence:''' Trends and hands-on tools
* Small vision language model [https://huggingface.co/vikhyatk/moondream2 moondream2] for embedded systems. Not yet available under Ollama
* '''Compare cloud providers:''' Kubernetes, AWS, OpenStack, and more
* Real-time [https://betterprogramming.pub/color-your-captions-streamlining-live-transcriptions-with-diart-and-openais-whisper-6203350234ef '''transcription'''] with Diart can follow the individual speakers
* '''Develop expertise:''' Training, open-source, and real-world projects
* With LANGCHAIN, the [https://python.langchain.com/docs/integrations/chat/openai '''OpenAI library'''] seems preferable to LiteLLM for avoiding broken text output
* [https://github.com/openai-translator/openai-translator translation] tools like Google translate are becoming popular
* Does Claude 3 beat ChatGPT4? Not on these [https://infocepo.com/wiki/index.php/Enigme riddles].
* Mixtral runs 10x faster and cheaper on the [https://www.mouser.fr/ProductDetail/BittWare/RS-GQ-GC1-0109?qs=ST9lo4GX8V2eGrFMeVQmFw%3D%3D '''GROQ accelerator''']
* [https://opensearch.org/docs/latest/search-plugins/conversational-search Opensearch with RAG]
*Mistral-new
*HanNayeoniee/LHK_DPO_v1 13b (processing)
*ACCEL: a very efficient and powerful vision AI chip.
*IBM NorthPole: a very efficient and powerful AI chip.


=== TRAINING ===
----
*[https://www.youtube.com/watch?v=4Bdc55j80l8 TRANSFORMERS ALGORITHM]


=== Cloud Native Install ===
= AI & Cloud Tools =
* [https://github.com/ynotopec/gpu-cluster GPU cluster]
* [https://github.com/ynotopec/llm-k8s LLM API]
[[File:AI-API.drawio.png]]


== CLOUD LAB ==
; '''AI Assistants'''
[[file:Infocepo.drawio.png]]
* [https://chat.openai.com ChatGPT4] – Public conversational AI with strong learning capabilities
<br><br>
* [https://github.com/open-webui/open-webui Open WebUI] + [https://www.scaleway.com/en/h100-pcie-try-it-now/ GPU H100] + [https://ollama.com Ollama] – Private assistants and self-hosted LLM APIs
Presenting my [[LAB project]].
* [https://github.com/ynotopec/summarize Private summary] – Fast, offline summarizer for your data


== CLOUD Audit ==
; '''Development & Model Tracking'''
Created [[ServerDiff.sh]] for server audits. Enables configuration drift tracking and environment consistency checks.
* [https://ollama.com/library LLM Trending] – Latest open-source LLMs
* [https://github.com/search?q=stars%3A%3E15000+forks%3A%3E1500+created%3A%3E2022-06-01&type=repositories&s=updated&o=desc Project Trending] – Top trending codebases since 2022
* [https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard LLM Leaderboard] – Community benchmarks
* [https://chat.lmsys.org ChatBot Evaluation] – Compare chatbot performance
* [https://www.perplexity.ai Perplexity AI] – Cutting-edge research and question answering
* [https://huggingface.co/models Models Trending] – Model marketplace
* [https://github.com/hiyouga/LLaMA-Factory LLM Fine Tuning] – Advanced training framework
* [https://huggingface.co/spaces/mteb/leaderboard Embedding Leaderboard] – Ranking for vector search models
* [https://ann-benchmarks.com Vectors DB Ranking] – Database speed and feature comparison
* [https://www.nvidia.com/en-us/data-center/h100/ NVIDIA H100] – HPC/AI GPUs for Kubernetes clusters
* [https://www.nvidia.com/fr-fr/geforce/graphics-cards/40-series/rtx-4080-family NVIDIA 4080] – Prosumer GPU for private deployments
* [https://huggingface.co/models?pipeline_tag=image-text-to-text&sort=trending Img2txt Trending] – Vision-language models
* [https://huggingface.co/spaces/TIGER-Lab/GenAI-Arena Txt2img Evaluation] – Compare generative image models
* [https://github.com/chatchat-space/Langchain-Chatchat Chatchat] – Private RAG assistant (multi-lingual)
* [https://top500.org/lists/green500/ HPC Efficiency] – Top green supercomputers


== CLOUD Migration Example ==
[[File:Diagram-migration-ORACLE-KVM-v2.drawio.png]]
*1.5d: Infrastructure audit of 82 services ([https://infocepo.com/wiki/index.php/ServerDiff.sh ServerDiff.sh])
*1.5d: Create cloud architecture diagram
*1.5d: Compliance check of 2 clouds (6 hypervisors, 6TB memory)
*1d: Cloud installations
*0.5d: Stability check
{| style="border-spacing:0;width:18.12cm;"
|- style="background-color:#ffc000;border:0.05pt solid #000000;padding:0.049cm;"
| align=center style="color:#000000;" | '''ACTION'''
| align=center style="color:#000000;" | '''RESULT'''
| align=center style="color:#000000;" | '''OK/KO'''
|-
| style="border:0.05pt solid #000000;padding:0.049cm;color:#000000;" | Put n/2-1 nodes into maintenance (1 node for a 2-node cluster).
| style="border:0.05pt solid #000000;padding:0.049cm;color:#000000;" | All resources are started.
| style="background-color:#d8e4bc;border:0.05pt solid #000000;padding:0.049cm;color:#000000;" |
|-
| style="border:0.05pt solid #000000;padding:0.049cm;color:#000000;" | Take all nodes out of maintenance. Power off n/2-1 nodes (1 node for a 2-node cluster), different from those in the previous test.
| style="border:0.05pt solid #000000;padding:0.049cm;color:#000000;" | All resources are started.
| style="background-color:#d8e4bc;border:0.05pt solid #000000;padding:0.049cm;color:#000000;" |
|-
| style="border:0.05pt solid #000000;padding:0.049cm;color:#000000;" | Power off all nodes simultaneously, then power them all on simultaneously.
| style="border:0.05pt solid #000000;padding:0.049cm;color:#000000;" | All resources are started.
| style="background-color:#d8e4bc;border:0.05pt solid #000000;padding:0.049cm;color:#000000;" |
|}
*1.5d: Cloud automation study
*1.5d: Develop 6 templates (2 clouds, 2 OS, 8 environments, 2 versions)
*1d: Create migration diagram
*1.5d: Write 138 lines of migration code ([https://infocepo.com/wiki/index.php/MigrationApp.sh MigrationApp.sh])
*1.5d: Process stabilization
*1.5d: Cloud vs old infrastructure benchmark
*0.5d: Unavailability time calibration per migration unit
*5min: Load 82 VMs (env, os, application_code, 2 IP)

Total = 15 man-days

----

== Notable Open LLMs ==
''(Last updated: 25/04/2025)''

{| class="wikitable"
! Model !! Description / Notable Features
|-
| '''ai-chat''' || gemma3-12b, cost efficient
|-
| '''ai-chat-hq''' || gemma3-27b, higher quality
|-
| '''ai-translate''' || gemma2, temperature=0 (deterministic translation)
|-
| '''ai-summary''' || qwen2.5, optimized for summarization
|-
| '''ai-code''' || gemma3-27b, advanced code reasoning
|-
| '''ai-code-completion''' || gemma3-1b, fast code suggestions
|-
| '''ai-parse''' || gemma2-simpo, parsing & extraction
|-
| '''ai-RAG-FR''' || qwen2.5, French RAG applications
|-
| '''mannix/gemma2-9b-simpo''' || OllamaFunctions integration
|}

----

= Industry News & Trends =

* [https://www.youtube.com/@lev-selector/videos Top AI News] – Video digest
* [https://betterprogramming.pub/color-your-captions-streamlining-live-transcriptions-with-diart-and-openais-whisper-6203350234ef Real-time transcription with Diart + Whisper] – Speaker tracking
* [https://github.com/openai-translator/openai-translator OpenAI Translator] – Modern open-source translation
* [https://www.mouser.fr/ProductDetail/BittWare/RS-GQ-GC1-0109?qs=ST9lo4GX8V2eGrFMeVQmFw%3D%3D GROQ LLM accelerator] – Fast, low-cost inference hardware
* [https://opensearch.org/docs/latest/search-plugins/conversational-search Opensearch with LLM] – Enhanced search experiences

----

= Training & Learning =

* [https://www.youtube.com/watch?v=4Bdc55j80l8 Transformers Explained] – Intro to Transformers algorithm
* Hands-on labs and scripts in the [[LAB project|CLOUD LAB]] below

----

== WEB Enhancement ==
[[File:WebModelDiagram.drawio.png]]
* Formalize infrastructure for flexibility and reduced complexity.
* Use a name server that tracks customer location, such as GDNS.
* Use minimal instances with a network load balancer like LVS.
* Compare prices of dynamic computing services and beware of tech lock-in.
* Use an efficient frontend TLS decoder like HAPROXY.
* Opt for a fast HTTP cache like VARNISH, and Apache Traffic Server for large files.
* Use a PROXY with TLS decoder like ENVOY for service compatibility.
* Consider serverless services for standard runtimes, mindful of potential incompatibilities.
* Employ load balancing or native services for dynamic computing power.
* Use open source STACKs where possible.
* Employ database caches like MEMCACHED.
* Use queues when possible.
* More information at [https://wikitech.wikimedia.org/wiki/Wikimedia_infrastructure CLOUD WIKIPEDIA] and [https://github.com/systemdesign42/system-design GITHUB].

= Cloud Lab & Audit Projects =

[[File:Infocepo.drawio.png|400px|Cloud Lab Reference Diagram]]

; '''Lab Project'''
Experiment with high-availability, cloud migration, and audit automation.

=== Cloud Audit ===
* '''[[ServerDiff.sh]]''' – Bash script for auditing servers, tracking config drift, and checking environment consistency

=== Cloud Migration Example ===
[[File:Diagram-migration-ORACLE-KVM-v2.drawio.png|400px|Cloud Migration Diagram]]

{| class="wikitable"
! Task !! Description !! Duration (days)
|-
| Audit infrastructure || 82 services, automated via ServerDiff.sh || 1.5
|-
| Diagram cloud architecture || Visual design || 1.5
|-
| Compliance check || 2 clouds, 6 hypervisors, 6TB RAM || 1.5
|-
| Install cloud platforms || Deploy core cloud environments || 1.0
|-
| Stability check || Early operations || 0.5
|-
| Automation study || Automate deployment/tasks || 1.5
|-
| Develop templates || 6 templates, 8 envs, 2 clouds/OS || 1.5
|-
| Migration diagram || Process illustration || 1.0
|-
| Write migration code || 138 lines (see MigrationApp.sh) || 1.5
|-
| Process stabilization || Ensure repeatable migration || 1.5
|-
| Cloud benchmarking || Performance test vs legacy || 1.5
|-
| Downtime calibration || Per-migration time calculation || 0.5
|-
| VM loading || 82 VMs: OS, code, 2 IPs each || 0.1
|-
! colspan=2 align="right"| '''Total''' !! 15 man-days
|}


==== Stability check ====

{| class="wikitable"
! Action !! Expected Result
|-
| Power off one node || All resources started
|-
| Power off/on all nodes simultaneously || All resources started
|}

----

= Web Infrastructure & Best Practices =

[[File:WebModelDiagram.drawio.png|400px|Web Architecture Reference]]

* Favor minimal, flexible infrastructure
* Track customer location via GDNS or similar
* Use network load balancers (LVS, IPVS) for scaling
* Compare prices and beware of vendor lock-in
* For TLS: use HAProxy for fast frontend, Envoy for compatibility
* Caching: Varnish, Apache Traffic Server for large content
* Prefer open-source stacks and database caches (e.g. Memcached)
* Use message queues and buffers for workload smoothing
* For more examples: [https://wikitech.wikimedia.org/wiki/Wikimedia_infrastructure Wikimedia Cloud Architecture], [https://github.com/systemdesign42/system-design System Design GitHub]

----

= Major Cloud Platforms: Feature Comparison =

{| class="wikitable"
! Function !! Kubernetes !! OpenStack !! AWS !! Bare-metal !! HPC !! CRM !! oVirt
|-
| '''Deployment Tools''' || Helm, YAML, ArgoCD, Juju || Ansible, Terraform, Juju || CloudFormation, Terraform, Juju || Ansible, Shell || xCAT, Clush || Ansible, Shell || Ansible, Python
|-
| '''Bootstrap Method''' || API || API, PXE || API || PXE, IPMI || PXE, IPMI || PXE, IPMI || PXE, API
|-
| '''Router Control''' || Kube-router || Router/Subnet API || Route Table/Subnet API || Linux, OVS || xCAT || Linux || API
|-
| '''Firewall Control''' || Istio, NetworkPolicy || Security Groups API || Security Group API || Linux Firewall || Linux Firewall || Linux Firewall || API
|-
| '''Network Virtualization''' || VLAN, VxLAN, others || VPC || VPC || OVS, Linux || xCAT || Linux || API
|-
| '''DNS''' || CoreDNS || DNS-Nameserver || Route 53 || GDNS || xCAT || Linux || API
|-
| '''Load Balancer''' || Kube-proxy, LVS || LVS || Network Load Balancer || LVS || SLURM || Ldirectord || N/A
|-
| '''Storage Options''' || Local, Cloud, PVC || Swift, Cinder, Nova || S3, EFS, EBS, FSx || Swift, XFS, EXT4, RAID10 || GPFS || SAN || NFS, SAN
|}

== CLOUD WIKIPEDIA ==
* [https://wikitech.wikimedia.org/wiki/Wikimedia_infrastructure CLOUD WIKIPEDIA]

== CLOUD vs HW ==
{| class="wikitable"
! Function !! Kubernetes !! OpenStack !! AWS !! Bare-metal !! HPC !! CRM !! oVirt
|-
| '''Deployment Tools'''<br>''(Tools used for deployment)'' || Helm, YAML, Operator, Ansible, Juju, ArgoCD || Ansible, Packer, Terraform, Juju || Ansible, Terraform, CloudFormation, Juju || Ansible, Shell Scripts || xCAT, Clush || Ansible, Shell Scripts || Ansible, Python, Shell Scripts
|-
| '''Bootstrap Method'''<br>''(Initial configuration and setup)'' || API || API, PXE || API || PXE, IPMI || PXE, IPMI || PXE, IPMI || PXE, API
|-
| '''Router Control'''<br>''(Routing services)'' || API (Kube-router) || API (Router/Subnet) || API (Route Table/Subnet) || Linux, OVS, External Hardware || xCAT, External Hardware || Linux, External Hardware || API
|-
| '''Firewall Control'''<br>''(Firewall rules and policies)'' || Ingress, Egress, Istio, NetworkPolicy || API (Security Groups) || API (Security Group) || Linux Firewall || Linux Firewall || Linux Firewall || API
|-
| '''Network Virtualization'''<br>''(VLAN/VxLAN technologies)'' || Multiple Options || VPC || VPC || OVS, Linux, External Hardware || xCAT, External Hardware || Linux, External Hardware || API
|-
| '''Name Server Control'''<br>''(DNS services)'' || CoreDNS || DNS-Nameserver || Amazon Route 53 || GDNS || xCAT || Linux, External Hardware || API, External Hardware
|-
| '''Load Balancer'''<br>''(Load balancing options)'' || Kube-proxy, LVS (IPVS) || LVS || Network Load Balancer || LVS || SLURM || Ldirectord || N/A
|-
| '''Storage Options'''<br>''(Available storage technologies)'' || Multiple Options || Swift, Cinder, Nova || S3, EFS, FSx, EBS || Swift, XFS, EXT4, RAID10 || GPFS || SAN || NFS, SAN
|}


== CLOUD providers ==
----
* [https://cloud.google.com/free/docs/aws-azure-gcp-service-comparison CLOUD providers]
 
== CLOUD INTERNET NETWORK ==
= Useful Cloud & IT Links =
* [https://global-internet-map-2021.telegeography.com/ CLOUD INTERNET NETWORK]
 
== CLOUD NATIVE ==
* [https://cloud.google.com/free/docs/aws-azure-gcp-service-comparison Cloud Providers Compared]
* [https://landscape.cncf.io/?fullscreen=yes OFFICIAL STACKS]
* [https://global-internet-map-2021.telegeography.com/ Global Internet Topology Map]
* DevSecOps :
* [https://landscape.cncf.io/?fullscreen=yes CNCF Official Landscape]
[[File:DSO-POC-V3.drawio.png]]
* [https://wikitech.wikimedia.org/wiki/Wikimedia_infrastructure Wikimedia Cloud Wiki]
* [https://openapm.io OpenAPM – SRE Tools]
* [https://access.redhat.com/downloads/content/package-browser RedHat Package Browser]
* [https://www.silkhom.com/barometre-2021-des-tjm-dans-informatique-digital Freelance IT Rates]
* [https://www.glassdoor.fr/salaire/Hays-Salaires-E10166.htm IT Salaries (Glassdoor)]
 
----
 
= Advanced: High-Availability, HPC & DevSecOps =
 
== High Availability with Corosync & Pacemaker ==
[[File:HA-REF.drawio.png|400px|HA Cluster Architecture]]


== High Availability (HA) with Corosync+Pacemaker ==
* Multi-node or dual-room clusters for redundancy
[[File:HA-REF.drawio.png]]
* Use IPMI for fencing, provision via PXE/NTP/DNS/TFTP
* For 2-node clusters: stagger fencing for stability; 3+ nodes recommended


=== Typical Architecture ===
=== Common Resources Pattern ===
* Multipath storage, LUN, LVM, NFS
* User and process resources
* IP, DNS, Listener management


*Dual-room.
== HPC ==
*IPMI LAN (fencing).
[[File:HPC.drawio.png|400px|HPC Cluster Overview]]
*NTP, DNS+DHCP+PXE+TFTP+HTTP (auto-provisioning), PROXY (updates or internal REPOSITORY).
*Choose 2+ node clusters.
*For 2-node clusters, use the COROSYNC 2-node configuration with 10-second staggered fencing for stability. For better stability, choose a 3+ node architecture.
*Allocate 4 GB per database for DB resources. CPU requirements are generally low.


=== Typical Service Pattern ===
== DevSecOps ==
*Multipath
[[File:DSO-POC-V3.drawio.png|400px|DevSecOps Reference Design]]
*LUN
*LVM (LVM resource)
*FS (FS resource)
*NFS (FS resource)
*User
*IP (IP resource)
*DNS name
*Process (Process resource)
*Listener (Listener resource)


== IT wage ==
----
*[http://jobsearchtech.about.com/od/educationfortechcareers/tp/HighestCerts.htm Best IT certifications]
*[https://www.silkhom.com/barometre-2021-des-tjm-dans-informatique-digital/ FREELANCE]
*[http://www.journaldunet.com/solutions/emploi-rh/salaire-dans-l-informatique-hays/ IT]


== SRE ==
'''For more examples, guides, and scripts, visit [https://infocepo.com infocepo.com]. Contributions and suggestions welcome!'''
* [https://openapm.io SRE]
== REDHAT package browser ==
* [https://access.redhat.com/downloads/content/package-browser REDHAT package browser]

Latest revision as of 10:06, 17 July 2025

Discover cloud computing on infocepo.com

Discover Cloud Computing on infocepo.com

Welcome! This portal is designed for IT professionals, engineers, students, and enthusiasts who want to master cloud infrastructure, explore AI tools, and accelerate their IT skills through hands-on labs and open-source solutions.

Quick Start

  • Master cloud infrastructure: Practical guides and labs
  • Explore artificial intelligence: Trends and hands-on tools
  • Compare cloud providers: Kubernetes, AWS, OpenStack, and more
  • Develop expertise: Training, open-source, and real-world projects

AI & Cloud Tools

AI Assistants
Development & Model Tracking

Notable Open LLMs

(Last updated: 25/04/2025)

{| class="wikitable"
! Model !! Description / Notable Features
|-
| ai-chat || gemma3-12b, cost efficient
|-
| ai-chat-hq || gemma3-27b, higher quality
|-
| ai-translate || gemma2, temperature=0 (deterministic translation)
|-
| ai-summary || qwen2.5, optimized for summarization
|-
| ai-code || gemma3-27b, advanced code reasoning
|-
| ai-code-completion || gemma3-1b, fast code suggestions
|-
| ai-parse || gemma2-simpo, parsing & extraction
|-
| ai-RAG-FR || qwen2.5, French RAG applications
|-
| mannix/gemma2-9b-simpo || OllamaFunctions integration
|}
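
The alias-to-model mapping above, including the temperature=0 setting for translation, could be expressed as a small registry. The following is only a sketch under assumptions: the `ALIASES` layout and the `build_generate_request` helper are hypothetical, the model strings are copied as written in the table (actual Ollama tags may differ), and the payload shape follows my understanding of Ollama's `/api/generate` endpoint. No request is sent.

```python
import json

# Hypothetical registry: aliases and model strings copied from the table above.
# The dict layout and the "options" key are assumptions for illustration.
ALIASES = {
    "ai-chat": {"model": "gemma3-12b", "options": {}},
    "ai-translate": {"model": "gemma2", "options": {"temperature": 0}},
    "ai-summary": {"model": "qwen2.5", "options": {}},
}


def build_generate_request(alias: str, prompt: str) -> dict:
    """Build a JSON-serializable payload for a text generation call."""
    entry = ALIASES[alias]
    return {
        "model": entry["model"],
        "prompt": prompt,
        "stream": False,  # ask for a single response instead of chunks
        "options": dict(entry["options"]),  # copy so callers cannot mutate the registry
    }


request = build_generate_request("ai-translate", "Translate to English: Bonjour le monde")
print(json.dumps(request, indent=2))
```

Setting the temperature to 0 makes sampling greedy, which is why the table describes the translation alias as deterministic; in practice results can still vary slightly across hardware or model versions.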

Industry News & Trends


Training & Learning


Cloud Lab & Audit Projects

Cloud Lab Reference Diagram

Lab Project

Experiment with high-availability, cloud migration, and audit automation.

Cloud Audit

  • ServerDiff.sh – Bash script for auditing servers, tracking config drift, and checking environment consistency
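
To illustrate what tracking config drift means in practice, here is a minimal Python sketch that compares two configuration snapshots with the standard `difflib` module. It is not the ServerDiff.sh script itself; the `config_drift` function, the server names, and the sample settings are all assumptions made for the example.

```python
import difflib


def config_drift(reference: str, candidate: str,
                 ref_name: str = "server-a", cand_name: str = "server-b") -> list[str]:
    """Return unified-diff lines between two configuration snapshots."""
    return list(difflib.unified_diff(
        reference.splitlines(),
        candidate.splitlines(),
        fromfile=ref_name,
        tofile=cand_name,
        lineterm="",
    ))


# Hypothetical snapshots of the same file taken on two servers.
reference_conf = "PermitRootLogin no\nPort 22\n"
candidate_conf = "PermitRootLogin yes\nPort 22\n"

drift = config_drift(reference_conf, candidate_conf)
for line in drift:
    print(line)
```

An empty result means the two snapshots are consistent; any `-`/`+` pair points at a setting that has drifted.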

Cloud Migration Example

Cloud Migration Diagram

{| class="wikitable"
! Task !! Description !! Duration (days)
|-
| Audit infrastructure || 82 services, automated via ServerDiff.sh || 1.5
|-
| Diagram cloud architecture || Visual design || 1.5
|-
| Compliance check || 2 clouds, 6 hypervisors, 6TB RAM || 1.5
|-
| Install cloud platforms || Deploy core cloud environments || 1.0
|-
| Stability check || Early operations || 0.5
|-
| Automation study || Automate deployment/tasks || 1.5
|-
| Develop templates || 6 templates, 8 envs, 2 clouds/OS || 1.5
|-
| Migration diagram || Process illustration || 1.0
|-
| Write migration code || 138 lines (see MigrationApp.sh) || 1.5
|-
| Process stabilization || Ensure repeatable migration || 1.5
|-
| Cloud benchmarking || Performance test vs legacy || 1.5
|-
| Downtime calibration || Per-migration time calculation || 0.5
|-
| VM loading || 82 VMs: OS, code, 2 IPs each || 0.1
|-
! colspan=2 align="right"| '''Total''' !! 15 man-days
|}
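
As a quick arithmetic check, the durations listed above add up to 15.1 days, which the total rounds to 15 man-days. The sketch below copies the values from the table; the dictionary name is only illustrative.

```python
from decimal import Decimal

# Durations in days, copied from the migration table above.
durations = {
    "Audit infrastructure": "1.5",
    "Diagram cloud architecture": "1.5",
    "Compliance check": "1.5",
    "Install cloud platforms": "1.0",
    "Stability check": "0.5",
    "Automation study": "1.5",
    "Develop templates": "1.5",
    "Migration diagram": "1.0",
    "Write migration code": "1.5",
    "Process stabilization": "1.5",
    "Cloud benchmarking": "1.5",
    "Downtime calibration": "0.5",
    "VM loading": "0.1",
}

# Decimal avoids binary floating-point noise when adding tenths.
total = sum(Decimal(days) for days in durations.values())
rounded = total.quantize(Decimal("1"))

print(f"exact total: {total} days, rounded: {rounded} man-days")
```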

Stability check

{| class="wikitable"
! Action !! Expected Result
|-
| Power off one node || All resources started
|-
| Power off/on all nodes simultaneously || All resources started
|}

Web Infrastructure & Best Practices

Web Architecture Reference

  • Favor minimal, flexible infrastructure
  • Track customer location via GDNS or similar
  • Use network load balancers (LVS, IPVS) for scaling
  • Compare prices and beware of vendor lock-in
  • For TLS: use HAProxy for fast frontend, Envoy for compatibility
  • Caching: Varnish, Apache Traffic Server for large content
  • Prefer open-source stacks and database caches (e.g. Memcached)
  • Use message queues and buffers for workload smoothing
  • For more examples: Wikimedia Cloud Architecture, System Design GitHub
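
The database-cache recommendation above is often implemented as a cache-aside lookup. The sketch below uses an in-process dictionary as a stand-in for Memcached so that it runs without any server; the `CacheAside` class, the `slow_db_lookup` function, and the 30-second TTL are hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Cache-aside lookup: read the cache first, fall back to the loader on a miss."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, value)
        self.misses = 0

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh cache hit
        # Miss or expired entry: load from the backing store and cache the result.
        self.misses += 1
        value = self._loader(key)
        self._store[key] = (now + self._ttl, value)
        return value


def slow_db_lookup(key: str) -> str:
    """Stand-in for a database query."""
    return f"row-for-{key}"


cache = CacheAside(slow_db_lookup, ttl_seconds=30)
first = cache.get("user:42")   # miss: goes to the "database"
second = cache.get("user:42")  # hit: served from the cache
print(first, second, cache.misses)
```

With a real Memcached deployment the dictionary would be replaced by client calls, but the read path (check cache, load on miss, store with a TTL) stays the same.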

Major Cloud Platforms: Feature Comparison

{| class="wikitable"
! Function !! Kubernetes !! OpenStack !! AWS !! Bare-metal !! HPC !! CRM !! oVirt
|-
| Deployment Tools || Helm, YAML, ArgoCD, Juju || Ansible, Terraform, Juju || CloudFormation, Terraform, Juju || Ansible, Shell || xCAT, Clush || Ansible, Shell || Ansible, Python
|-
| Bootstrap Method || API || API, PXE || API || PXE, IPMI || PXE, IPMI || PXE, IPMI || PXE, API
|-
| Router Control || Kube-router || Router/Subnet API || Route Table/Subnet API || Linux, OVS || xCAT || Linux || API
|-
| Firewall Control || Istio, NetworkPolicy || Security Groups API || Security Group API || Linux Firewall || Linux Firewall || Linux Firewall || API
|-
| Network Virtualization || VLAN, VxLAN, others || VPC || VPC || OVS, Linux || xCAT || Linux || API
|-
| DNS || CoreDNS || DNS-Nameserver || Route 53 || GDNS || xCAT || Linux || API
|-
| Load Balancer || Kube-proxy, LVS || LVS || Network Load Balancer || LVS || SLURM || Ldirectord || N/A
|-
| Storage Options || Local, Cloud, PVC || Swift, Cinder, Nova || S3, EFS, EBS, FSx || Swift, XFS, EXT4, RAID10 || GPFS || SAN || NFS, SAN
|}

Useful Cloud & IT Links


Advanced: High-Availability, HPC & DevSecOps

High Availability with Corosync & Pacemaker

HA Cluster Architecture

  • Multi-node or dual-room clusters for redundancy
  • Use IPMI for fencing, provision via PXE/NTP/DNS/TFTP
  • For 2-node clusters: stagger fencing for stability; 3+ nodes recommended
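
The preference for 3+ nodes follows from majority quorum arithmetic. The sketch below assumes the usual majority rule (more than half of the votes, one vote per node); the function names are illustrative, and Corosync's dedicated two-node mode is a special case that this simple rule does not model.

```python
def quorum(nodes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can fail while the rest still hold a majority."""
    return nodes - quorum(nodes)


for n in (2, 3, 5):
    print(f"{n} nodes: quorum={quorum(n)}, tolerated failures={tolerated_failures(n)}")
```

Under this rule a 2-node cluster cannot lose any node without losing quorum, which is why it needs special handling such as staggered fencing, while a 3-node cluster survives one failure.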

Common Resources Pattern

  • Multipath storage, LUN, LVM, NFS
  • User and process resources
  • IP, DNS, Listener management

HPC

HPC Cluster Overview

DevSecOps

DevSecOps Reference Design


For more examples, guides, and scripts, visit infocepo.com. Contributions and suggestions welcome!