Main Page: Difference between revisions

From Essential
(241 intermediate revisions by the same user not shown)
[[File:Ynotopec elementary particles motion interaction science 4f947bd8-3a57-49f5-a5b5-df3128737f22.png|thumb|right]]
[[File:Infocepo-illustration.jpg|thumb|right]]
Welcome to my experimental WIKI.
'''Discover cloud computing on infocepo.com''':
* Master cloud infrastructure
* Explore AI
* Compare Kubernetes and AWS
* Advance your IT skills with hands-on labs and open-source software.


This wiki lists topics in cloud computing and information technology, including the cloud lab, infrastructure audit, cloud migration, cloud improvement, cloud providers, cloud mapping, infrastructure examples, IT salaries, and high availability with Corosync and Pacemaker.
Start your journey to expertise.


The pages cover tools and technologies used in cloud computing, such as Kubernetes, OpenStack, AWS, and oVirt (Open Virtualization), and stress the value of open-source software and of minimizing technology lock-in.
<br>
== AI Tools ==
* [https://chat.openai.com ChatGPT4] - Public assistant with learning abilities.
* [https://github.com/open-webui/open-webui open-webui] + [https://www.scaleway.com/en/h100-pcie-try-it-now/ GPU H100] + [https://ollama.com Ollama] - Private assistant and API.
* [https://github.com/ynotopec/summarize Private summary]


The pages also cover cloud migration: the preparation steps, the tools and technologies that can automate the process, and the need to weigh the costs, benefits, and trade-offs of the different cloud providers.
=== DEV ===
(28/08/2024)
* [https://ollama.com/library LLM Trending]
* [https://github.com/search?q=stars%3A%3E15000+forks%3A%3E1500+created%3A%3E2022-06-01&type=repositories&s=updated&o=desc Project Trending]
* [https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard LLM Ranking]
* [https://chat.lmsys.org ChatBot Evaluate]
* [https://www.perplexity.ai Perplexity AI] - R&D
* [https://huggingface.co/models Models Trending]
* [https://github.com/hiyouga/LLaMA-Factory LLM Fine Tuning]
* [https://huggingface.co/spaces/mteb/leaderboard Embeddings Ranking]
* [https://ann-benchmarks.com Vectors DB Ranking]
* [https://www.nvidia.com/en-us/data-center/h100/ NVIDIA H100] - KUBERNETES or HPC clusters for DATASCIENCE.
* [https://www.nvidia.com/fr-fr/geforce/graphics-cards/40-series/rtx-4080-family NVIDIA 4080] - GPU card for private assistance.
* [https://huggingface.co/models?pipeline_tag=image-text-to-text&sort=trending Img2txt Trending]
* [https://huggingface.co/spaces/TIGER-Lab/GenAI-Arena Txt2img Evaluate]
* [https://github.com/chatchat-space/Langchain-Chatchat Chatchat] - Private assistant with RAG capabilities in Chinese.
* [https://top500.org/lists/green500/ HPC Efficiency]


Overall, these pages cover the factors to consider when designing and implementing a cloud computing infrastructure.
==== INTERESTING LLMs ====
(28/08/2024)
{| class="wikitable"
! Model
! Comment
|-
| '''RAG'''
| (gemma2-27b), $$
|-
| '''RAG-FR'''
| (glm4)
|-
| '''code'''
| (gemma2-27b), $$
|-
| '''math'''
| (gemma2-27b), $$
|-
| '''summary'''
| (llama3.1)
|-
| '''gemma2'''
| Fast
|-
| '''gemma2-27b'''
| Medium, best, $$
|-
| '''gemma2'''
| OllamaFunctions
|}


<br>
=== NEWS ===
==[https://openai.com/ AI tools]==
(04/05/2024)
==CLOUD LAB==
* [https://www.youtube.com/@lev-selector/videos Very good AI News]
I want to share my [[LAB project]].<br>
* Real-time [https://betterprogramming.pub/color-your-captions-streamlining-live-transcriptions-with-diart-and-openais-whisper-6203350234ef '''transcription'''] with Diart can tell the speakers apart.
<br>
* [https://github.com/openai-translator/openai-translator Translation] tools like Google Translate are becoming popular.
[[file:Infocepo.drawio.png]]
* [https://www.mouser.fr/ProductDetail/BittWare/RS-GQ-GC1-0109?qs=ST9lo4GX8V2eGrFMeVQmFw%3D%3D '''LLM 10x accelerator'''] and cheaper with GROQ.
==INFRA audit==
* [https://opensearch.org/docs/latest/search-plugins/conversational-search Opensearch with LLM]
I wrote the [[ServerDiff.sh]] script to audit servers.
It tracks configuration drift and checks whether your environments are identical.


==CLOUD migration example==
=== TRAINING ===
*1.5 days: infrastructure audit of 82 clustered services ([https://infocepo.com/wiki/index.php/ServerDiff.sh my own audit tool])
* [https://www.youtube.com/watch?v=4Bdc55j80l8 TRANSFORMERS ALGORITHM]


*1.5 days: physical and virtual target CLOUD architecture diagram
== CLOUD LAB ==
[[File:Infocepo.drawio.png]]
<br><br>
Presenting my [[LAB project]].


*1.5 days: physical compliance of 2 CLOUD (6 hypervisors, 6TB memory)
== CLOUD Audit ==
Created [[ServerDiff.sh]] for server audits. Enables configuration drift tracking and environment consistency checks.


*1 day: installation of the 2 clouds
== CLOUD Migration Example ==
[[File:Diagram-migration-ORACLE-KVM-v2.drawio.png]]
* 1.5d: Infrastructure audit of 82 services ([https://infocepo.com/wiki/index.php/ServerDiff.sh ServerDiff.sh])
* 1.5d: Create cloud architecture diagram.
* 1.5d: Compliance check of 2 clouds (6 hypervisors, 6TB memory).
* 1d: Cloud installations.
* 0.5d: Stability check.


*0.5 day: stability check
{| style="border-spacing:0;width:18.12cm;"
{| style="border-spacing:0;width:18.12cm;"
|- style="background-color:#ffc000;border:0.05pt solid #000000;padding:0.049cm;"
|- style="background-color:#ffc000;border:0.05pt solid #000000;padding:0.049cm;"
| style="background-color:#d8e4bc;border:0.05pt solid #000000;padding:0.049cm;color:#000000;" |  
| style="background-color:#d8e4bc;border:0.05pt solid #000000;padding:0.049cm;color:#000000;" |  
|-
|-
| style="border:0.05pt solid #000000;padding:0.049cm;color:#000000;" | Power off simultaneous all nodes. Power on simultaneous all nodes.
| style="border:0.05pt solid #000000;padding:0.049cm;color:#000000;" | Power off all nodes simultaneously. Power on all nodes simultaneously.
| style="border:0.05pt solid #000000;padding:0.049cm;color:#000000;" | All resources are started.
| style="border:0.05pt solid #000000;padding:0.049cm;color:#000000;" | All resources are started.
| style="background-color:#d8e4bc;border:0.05pt solid #000000;padding:0.049cm;color:#000000;" |  
| style="background-color:#d8e4bc;border:0.05pt solid #000000;padding:0.049cm;color:#000000;" |  
|-
|-
|}
|}
*1.5 days: CLOUD automation study
* 1.5d: Cloud automation study.
* 1.5d: Develop 6 templates (2 clouds, 2 OS, 8 environments, 2 versions).
* 1d: Create migration diagram.
* 1.5d: Write 138 lines of migration code ([https://infocepo.com/wiki/index.php/MigrationApp.sh MigrationApp.sh]).
* 1.5d: Process stabilization.
* 1.5d: Cloud vs. old infrastructure benchmark.
* 0.5d: Unavailability time calibration per migration unit.
* 5 min: Load 82 VMs (env, OS, application code, 2 IPs).


*1.5 days: 6 templates (2 CLOUD, 2 OS, 8 environments, 2 versions)
Total = 15 man-days.


*1 day: migration diagram
== WEB Enhancement ==
[[File:Diagram-migration-ORACLE-KVM-v2.drawio.png]]
[[File:WebModelDiagram.drawio.png]]


*1.5 days: 138 lines of industrialization code for migration ([https://infocepo.com/wiki/index.php/MigrationApp.sh my own migration code])
* Formalize infrastructure for flexibility and reduced complexity.
* Use a name server that takes customer location into account, such as GDNS.
* Use minimal instances with a network load balancer such as LVS.
* Compare prices of dynamic computing services, and beware of technology lock-in.
* Use an efficient frontend TLS decoder such as HAPROXY.
* Use a fast HTTP cache such as VARNISH, and Apache Traffic Server for large files.
* Use a PROXY with a TLS decoder, such as ENVOY, for service compatibility.
* Consider serverless services for standard runtimes, keeping potential incompatibilities in mind.
* Use load balancing or native provider services for dynamic computing power.
* Use open-source STACKs where possible.
* Use database caches such as MEMCACHED.
* Use queues for long-running batches.
* Use buffers to stabilize real-time streams.
* More information at [https://wikitech.wikimedia.org/wiki/Wikimedia_infrastructure CLOUD WIKIPEDIA] and [https://github.com/systemdesign42/system-design GITHUB].


*1.5 days: process stabilization
== CLOUD WIKIPEDIA ==
* [https://wikitech.wikimedia.org/wiki/Wikimedia_infrastructure CLOUD WIKIPEDIA]


*1.5 days: CLOUD benchmark vs old INFRA
== CLOUD vs HW ==
 
*0.5 day: calibration of unavailability time per migration unit
 
*5 minutes (effective load): 82 VM (env, os, application_code, 2 IP)
 
Total = 15 man-days
 
==CLOUD improvement==
[[File:WebModelDiagram.drawio.png]]
*Formalize your infrastructure as much as possible for more flexibility, lower complexity, and less technology lock-in.
*Use a name server that can take your customers' location into account, such as GDNS.
*Start with a minimal instance behind a network load balancer such as LVS. Monitor the global load of your instances and add or remove instances dynamically as needed.
*Alternatively, many providers offer dynamic computing services. Compare prices, but beware of technology lock-in.
*Use a very efficient TLS decoder for the frontend, such as HAPROXY.
*Use a very fast HTTP cache such as VARNISH.
*Use a large cache for big files, such as Apache Traffic Server.
*...
*Use a REVERSE PROXY with a TLS decoder, such as ENVOY, for broader service compatibility.
*Use serverless services for standard runtimes such as Java, Python, and PHP, but beware of certain incompatibilities and a lack of consistency over time.
*...
*Whenever you need dynamic computing power, consider load balancing or a native provider service (be cautious with provider services!).
*...
*Try to use open-source STACKs as much as possible.
*...
*Use a cache for your databases, such as MEMCACHED.
 
==CLOUD vs HW==
{| class="wikitable"
{| class="wikitable"
|'''Function'''
|'''KUBERNETES'''
|'''OPENSTACK'''
|'''AWS'''
|'''Bare-metal'''
|'''HPC'''
|'''CRM'''
|'''OVIRT'''
|-
|DEPLOY
|HELM/ANSIBLE/SH
|TERRAFORM/ANSIBLE/SH/JUJU
|TERRAFORM/CLOUDFOUNDATION/ANSIBLE/JUJU
|ANSIBLE/SH
|XCAT/CLUSH
|ANSIBLE/SH
|ANSIBLE/PYTHON/SH
|-
|-
|BOOTSTRAP
! Function
|API/CLI
! Kubernetes
|PXE/API/CLI
! OpenStack
|API/CLI
! AWS
|PXE/IPMI
! Bare-metal
|PXE/IPMI
! HPC
|PXE/IPMI
! CRM
|PXE/API
! oVirt
|-
|-
|
| '''Deployment Tools'''<br>''(Tools used for deployment)''
|
| Helm, YAML, Operator, Ansible, Juju, ArgoCD
|
| Ansible, Packer, Terraform, Juju
|
| Ansible, Terraform, CloudFormation, Juju
|
| Ansible, Shell Scripts
|
| xCAT, Clush
|
| Ansible, Shell Scripts
|
| Ansible, Python, Shell Scripts
|-
|-
|Router
| '''Bootstrap Method'''<br>''(Initial configuration and setup)''
|API/CLI (kube-router)
| API
|API/CLI (router/subnet)
| API, PXE
|API/CLI (Route table/subnet)
| API
|LINUX/OVS/external
| PXE, IPMI
|XCAT/external
| PXE, IPMI
|LINUX/external
| PXE, IPMI
|API
| PXE, API
|-
|-
|Firewall
| '''Router Control'''<br>''(Routing services)''
|INGRESS/EGRESS/ISTIO
| API (Kube-router)
|API/CLI (Security groups)
| API (Router/Subnet)
|API/CLI (Security group)
| API (Route Table/Subnet)
|LINUX (NFT)
| Linux, OVS, External Hardware
|LINUX (NFT)
| xCAT, External Hardware
|LINUX (NFT)
| Linux, External Hardware
|API
| API
|-
|-
|Vlan/Vxlan
| '''Firewall Control'''<br>''(Firewall rules and policies)''
|many
| Ingress, Egress, Istio, NetworkPolicy
|API/CLI (VPC)
| API (Security Groups)
|API/CLI (VPC)
| API (Security Group)
|OVS/LINUX/external
| Linux Firewall
|XCAT/external
| Linux Firewall
|LINUX/external
| Linux Firewall
|API
| API
|-
|-
|
| '''Network Virtualization'''<br>''(VLAN/VxLAN technologies)''
|
| Multiple Options
|
| VPC
|
| VPC
|
| OVS, Linux, External Hardware
|
| xCAT, External Hardware
|
| Linux, External Hardware
|
| API
|-
|-
|Name server
| '''Name Server Control'''<br>''(DNS services)''
|coredns
| CoreDNS
|dns-nameserver
| DNS-Nameserver
|Amazon Route 53
| Amazon Route 53
|GDNS
| GDNS
|XCAT
| xCAT
|LINUX/external
| Linux, External Hardware
|API/external
| API, External Hardware
|-
|-
|Load balancer
| '''Load Balancer'''<br>''(Load balancing options)''
|kube-proxy/LVS(IPVS)
| Kube-proxy, LVS (IPVS)
|LVS
| LVS
|Network Load Balancer
| Network Load Balancer
|LVS
| LVS
|SLURM
| SLURM
|Ldirectord
| Ldirectord
|
| N/A
|-
|-
|Storage
| '''Storage Options'''<br>''(Available storage technologies)''
|many
| Multiple Options
|SWIFT/CINDER/NOVA
| Swift, Cinder, Nova
|S3/EFS/FSX/EBS
| S3, EFS, FSx, EBS
|OPENSTACK SWIFT/XFS/EXT4/RAID10
| Swift, XFS, EXT4, RAID10
|GPFS
| GPFS
|SAN
| SAN
|NFS/SAN
| NFS, SAN
|}
|}


==[https://landscape.cncf.io/?fullscreen=yes CLOUD REF]==
== CLOUD providers ==
==[https://cloud.google.com/free/docs/aws-azure-gcp-service-comparison CLOUD providers]==
* [https://cloud.google.com/free/docs/aws-azure-gcp-service-comparison CLOUD providers]
==[https://global-internet-map-2021.telegeography.com/ CLOUD map]==
 
==[https://wikitech.wikimedia.org/wiki/Wikimedia_infrastructure Infrastructure example]==
== CLOUD INTERNET NETWORK ==
==IT salaries==
* [https://global-internet-map-2021.telegeography.com/ CLOUD INTERNET NETWORK]
*[http://jobsearchtech.about.com/od/educationfortechcareers/tp/HighestCerts.htm Best IT certifications]
 
*[https://www.silkhom.com/barometre-2021-des-tjm-dans-informatique-digital/ FREELANCE]
== CLOUD NATIVE ==
*[http://www.journaldunet.com/solutions/emploi-rh/salaire-dans-l-informatique-hays/ IT]
* [https://landscape.cncf.io/?fullscreen=yes OFFICIAL STACKS]
* DevSecOps :
[[File:DSO-POC-V3.drawio.png]]
 
== High Availability (HA) with Corosync+Pacemaker ==
[[File:HA-REF.drawio.png]]


==[https://access.redhat.com/downloads/content/package-browser REDHAT package browser]==
=== Typical Architecture ===
==HA COROSYNC+PACEMAKER==
===Typical architecture===


*2 rooms
* Dual-room.
*2 power supply
* IPMI LAN (fencing).
*2FC / server (active/active) (SAN)
* NTP, DNS+DHCP+PXE+TFTP+HTTP (auto-provisioning), PROXY (updates or internal REPOSITORY).
*2*10Gbit/s ethernet / server (active/passive, possible active/active if PXE on native VLAN 0)
* Choose 2+ node clusters.
*IPMI VLAN (for the fence)
* A 2-node cluster requires the COROSYNC 2-node configuration and a 10-second staggered closing on one node for stability. For better stability, choose an architecture with 3 or more nodes.
*VLAN ADMIN which must be the native VLAN if BOOTSTRAP by PXE (admin, provisioning, heartbeat)
* Allocate 4 GB per database for DB resources. CPU requirements are generally low.
*USER VLAN (application services)
*NTP
*DNS+DHCP+PXE+TFTP+HTTP for auto-provisioning
*PROXY (for update or otherwise internal REPOSITORY)


*Choose between 2 or more node clusters.
=== Typical Service Pattern ===
* Multipath
* LUN
* LVM (LVM resource)
* FS (FS resource)
* NFS (FS resource)
* User
* IP (IP resource)
* DNS name
* Process (Process resource)
* Listener (Listener resource)


*For a 2-node architecture, you need a 2-node configuration on COROSYNC and make sure to configure a 10-second staggered closing for one of the nodes (otherwise, an unstable cluster results).
== HPC ==
[[File:HPC.drawio.png]]


*Resources are stateless.
== IT Wage ==
* [http://jobsearchtech.about.com/od/educationfortechcareers/tp/HighestCerts.htm Best IT certifications]
* [https://www.silkhom.com/barometre-2021-des-tjm-dans-informatique-digital FREELANCE]
* [http://www.journaldunet.com/solutions/emploi-rh/salaire-dans-l-informatique-hays IT]


For DB resources, plan for about 4 GB per database.
== SRE ==
CPU requirements are generally modest. Tip: for time-critical compression, use PZSTD.
* [https://openapm.io SRE]


===Typical service pattern===
== REDHAT Package Browser ==
*MULTIPATH
* [https://access.redhat.com/downloads/content/package-browser REDHAT Package Browser]
*LUN
*LVM (LVM resource)
*FS (FS resource)
*NFS (FS resource)
*USER
*IP (IP resource)
*DNS name
*PROCESS (PROCESS resource)
*LISTENER (LISTENER resource)
==[https://openapm.io SRE]==

Latest revision as of 19:08, 17 September 2024

Infocepo-illustration.jpg

Discover cloud computing on infocepo.com:

  • Master cloud infrastructure
  • Explore AI
  • Compare Kubernetes and AWS
  • Advance your IT skills with hands-on labs and open-source software.

Start your journey to expertise.


AI Tools

DEV

(28/08/2024)

INTERESTING LLMs

(28/08/2024)

{| class="wikitable"
! Model
! Comment
|-
| RAG
| (gemma2-27b), $$
|-
| RAG-FR
| (glm4)
|-
| code
| (gemma2-27b), $$
|-
| math
| (gemma2-27b), $$
|-
| summary
| (llama3.1)
|-
| gemma2
| Fast
|-
| gemma2-27b
| Medium, best, $$
|-
| gemma2
| OllamaFunctions
|}

NEWS

(04/05/2024)

TRAINING

CLOUD LAB

Infocepo.drawio.png

Presenting my LAB project.

CLOUD Audit

Created ServerDiff.sh for server audits. Enables configuration drift tracking and environment consistency checks.
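
ServerDiff.sh itself is not reproduced on this page. As a rough illustration of what configuration drift tracking means, the sketch below compares two key/value snapshots. The function name `config_drift` and the sample settings are hypothetical and are not taken from the script.

```python
def config_drift(reference, candidate):
    """Return settings that differ between two configuration snapshots.

    Both arguments are plain dicts mapping a setting name to its value.
    The result maps each drifting key to a (reference, candidate) pair,
    with None standing in for a key that is missing on one side.
    """
    drift = {}
    for key in sorted(set(reference) | set(candidate)):
        ref_value = reference.get(key)
        cand_value = candidate.get(key)
        if ref_value != cand_value:
            drift[key] = (ref_value, cand_value)
    return drift


# Hypothetical snapshots of two servers that are supposed to be identical.
prod = {"kernel": "5.14.0", "ntp_server": "10.0.0.1", "selinux": "enforcing"}
preprod = {"kernel": "5.14.0", "ntp_server": "10.0.0.2", "swap": "off"}

for name, (expected, actual) in config_drift(prod, preprod).items():
    print(f"{name}: reference={expected!r} candidate={actual!r}")
```

An empty result means the two environments match on every recorded setting.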

CLOUD Migration Example

Diagram-migration-ORACLE-KVM-v2.drawio.png

  • 1.5d: Infrastructure audit of 82 services (ServerDiff.sh)
  • 1.5d: Create cloud architecture diagram.
  • 1.5d: Compliance check of 2 clouds (6 hypervisors, 6TB memory).
  • 1d: Cloud installations.
  • 0.5d: Stability check.
{| class="wikitable"
! ACTION
! RESULT
! OK/KO
|-
| Activate maintenance for n/2-1 nodes, or 1 node if there are 2 nodes.
| All resources are started.
|
|-
| Take all nodes out of maintenance. Power off n/2-1 nodes, or 1 node if there are 2 nodes, different from the previous test.
| All resources are started.
|
|-
| Power off all nodes simultaneously. Power on all nodes simultaneously.
| All resources are started.
|
|}
  • 1.5d: Cloud automation study.
  • 1.5d: Develop 6 templates (2 clouds, 2 OS, 8 environments, 2 versions).
  • 1d: Create migration diagram.
  • 1.5d: Write 138 lines of migration code (MigrationApp.sh).
  • 1.5d: Process stabilization.
  • 1.5d: Cloud vs. old infrastructure benchmark.
  • 0.5d: Unavailability time calibration per migration unit.
  • 5 min: Load 82 VMs (env, OS, application code, 2 IPs).

Total = 15 man-days.
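
The total can be checked by adding up the step durations listed above. The snippet below only re-adds those figures. The 5-minute load is left out because it is negligible at the man-day scale, and the step labels are shortened for readability.

```python
# Durations in man-days, copied from the migration steps listed above.
steps = {
    "infrastructure audit": 1.5,
    "cloud architecture diagram": 1.5,
    "compliance check": 1.5,
    "cloud installations": 1.0,
    "stability check": 0.5,
    "cloud automation study": 1.5,
    "templates": 1.5,
    "migration diagram": 1.0,
    "migration code": 1.5,
    "process stabilization": 1.5,
    "benchmark": 1.5,
    "unavailability calibration": 0.5,
}

total = sum(steps.values())
print(f"Total = {total} man-days")
```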

WEB Enhancement

WebModelDiagram.drawio.png

  • Formalize infrastructure for flexibility and reduced complexity.
  • Use a name server that takes customer location into account, such as GDNS.
  • Use minimal instances with a network load balancer such as LVS.
  • Compare prices of dynamic computing services, and beware of technology lock-in.
  • Use an efficient frontend TLS decoder such as HAPROXY.
  • Use a fast HTTP cache such as VARNISH, and Apache Traffic Server for large files.
  • Use a PROXY with a TLS decoder, such as ENVOY, for service compatibility.
  • Consider serverless services for standard runtimes, keeping potential incompatibilities in mind.
  • Use load balancing or native provider services for dynamic computing power.
  • Use open-source STACKs where possible.
  • Use database caches such as MEMCACHED.
  • Use queues for long-running batches.
  • Use buffers to stabilize real-time streams.
  • More information at CLOUD WIKIPEDIA and GITHUB.
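
As an illustration of the database-cache item in the list above, here is a minimal cache-aside sketch. An in-process dict stands in for MEMCACHED and a plain function stands in for the database query. `CacheAside`, `slow_lookup`, and the TTL values are hypothetical names and numbers chosen for the example.

```python
import time


class CacheAside:
    """Check the cache first; on a miss, load from the backing store and cache the result."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader      # called on a cache miss
        self._ttl = ttl_seconds    # how long an entry stays valid
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expiry_time, value)
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # cache hit
        self.misses += 1
        value = self._loader(key)  # cache miss: query the backing store
        self._entries[key] = (now + self._ttl, value)
        return value


def slow_lookup(user_id):
    # Stand-in for a database query.
    return {"id": user_id, "name": f"user-{user_id}"}


cache = CacheAside(slow_lookup, ttl_seconds=30.0)
cache.get(42)  # miss: calls slow_lookup
cache.get(42)  # hit: served from the dict
print(cache.misses)
```

The time-to-live bounds how stale a cached row can get. A real deployment would also need to invalidate or update entries when the database is written.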

CLOUD WIKIPEDIA

CLOUD vs HW

{| class="wikitable"
|-
! Function
! Kubernetes
! OpenStack
! AWS
! Bare-metal
! HPC
! CRM
! oVirt
|-
| '''Deployment Tools'''<br>''(Tools used for deployment)''
| Helm, YAML, Operator, Ansible, Juju, ArgoCD
| Ansible, Packer, Terraform, Juju
| Ansible, Terraform, CloudFormation, Juju
| Ansible, Shell Scripts
| xCAT, Clush
| Ansible, Shell Scripts
| Ansible, Python, Shell Scripts
|-
| '''Bootstrap Method'''<br>''(Initial configuration and setup)''
| API
| API, PXE
| API
| PXE, IPMI
| PXE, IPMI
| PXE, IPMI
| PXE, API
|-
| '''Router Control'''<br>''(Routing services)''
| API (Kube-router)
| API (Router/Subnet)
| API (Route Table/Subnet)
| Linux, OVS, External Hardware
| xCAT, External Hardware
| Linux, External Hardware
| API
|-
| '''Firewall Control'''<br>''(Firewall rules and policies)''
| Ingress, Egress, Istio, NetworkPolicy
| API (Security Groups)
| API (Security Group)
| Linux Firewall
| Linux Firewall
| Linux Firewall
| API
|-
| '''Network Virtualization'''<br>''(VLAN/VxLAN technologies)''
| Multiple Options
| VPC
| VPC
| OVS, Linux, External Hardware
| xCAT, External Hardware
| Linux, External Hardware
| API
|-
| '''Name Server Control'''<br>''(DNS services)''
| CoreDNS
| DNS-Nameserver
| Amazon Route 53
| GDNS
| xCAT
| Linux, External Hardware
| API, External Hardware
|-
| '''Load Balancer'''<br>''(Load balancing options)''
| Kube-proxy, LVS (IPVS)
| LVS
| Network Load Balancer
| LVS
| SLURM
| Ldirectord
| N/A
|-
| '''Storage Options'''<br>''(Available storage technologies)''
| Multiple Options
| Swift, Cinder, Nova
| S3, EFS, FSx, EBS
| Swift, XFS, EXT4, RAID10
| GPFS
| SAN
| NFS, SAN
|}

CLOUD providers

CLOUD INTERNET NETWORK

CLOUD NATIVE

DSO-POC-V3.drawio.png

High Availability (HA) with Corosync+Pacemaker

HA-REF.drawio.png

Typical Architecture

  • Dual-room.
  • IPMI LAN (fencing).
  • NTP, DNS+DHCP+PXE+TFTP+HTTP (auto-provisioning), PROXY (updates or internal REPOSITORY).
  • Use clusters of 2 or more nodes.
  • A 2-node cluster requires the COROSYNC 2-node configuration and a 10-second staggered closing on one node for stability. For better stability, choose an architecture with 3 or more nodes.
  • Allocate 4 GB per database for DB resources. CPU requirements are generally low.
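
The preference for 3 or more nodes follows from majority quorum. The sketch below assumes the usual rule that a partition needs more than half of the votes, with one vote per node. It is a general illustration, not something taken from this page's cluster configuration, and the helper names are hypothetical.

```python
def quorum(nodes):
    """Votes needed for a majority, assuming one vote per node."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """Nodes that can be lost while the remaining ones still hold a majority."""
    return nodes - quorum(nodes)


for n in (2, 3, 4, 5):
    print(f"{n} nodes: quorum={quorum(n)}, tolerated failures={tolerated_failures(n)}")
```

With 2 nodes, a plain majority rule tolerates no failure at all, which is why the dedicated COROSYNC 2-node configuration is needed in that case.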

Typical Service Pattern

  • Multipath
  • LUN
  • LVM (LVM resource)
  • FS (FS resource)
  • NFS (FS resource)
  • User
  • IP (IP resource)
  • DNS name
  • Process (Process resource)
  • Listener (Listener resource)

HPC

HPC.drawio.png

IT Wage

SRE

REDHAT Package Browser