Main Page: Difference between revisions
Revision as of 07:13, 13 May 2023
Welcome to my WIKI.
This wiki explores cloud computing, focusing on migration, infrastructure, and high availability. It discusses tools such as Kubernetes, OpenStack, and AWS, emphasizes open-source software, and outlines key factors for implementing cloud infrastructure.
AI Solutions
- Science Research Tool - AI for science.
- Platform - AI-powered design.
Next Steps
- LANGCHAIN - Upcoming semantic tool.
- VICUNA (LLAMA) - Open-source AI chat.
POCs
- AUDIO - Real-time translation.
- Auto-GPT - AI for coding.
- OpenAI's AI Tools - Industry-transforming AI.
- Benchmark - AI performance standard.
CLOUD LAB
Presenting my LAB project.
Infrastructure Audit
I created ServerDiff.sh for server audits. It enables configuration drift tracking and environment consistency checks.
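To illustrate what configuration drift tracking involves, here is a minimal sketch in Python. It is not the content of ServerDiff.sh, which is not shown on this page; the `key=value` snapshot format and the function names `parse_snapshot` and `diff_configs` are assumptions made for the example.

```python
def parse_snapshot(text):
    """Parse 'key=value' lines into a dict, skipping blanks and '#' comments.

    The 'key=value' format is an assumption for this sketch.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def diff_configs(reference, candidate):
    """Report keys added, removed, or changed between two snapshots."""
    added = {k: candidate[k] for k in candidate.keys() - reference.keys()}
    removed = {k: reference[k] for k in reference.keys() - candidate.keys()}
    changed = {
        k: (reference[k], candidate[k])
        for k in reference.keys() & candidate.keys()
        if reference[k] != candidate[k]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

Comparing a snapshot of each server against a reference server would then flag every setting that has drifted, for example `diff_configs(parse_snapshot(ref_text), parse_snapshot(server_text))`.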
Cloud Migration Example
- 1.5d: Infrastructure audit of 82 services (ServerDiff.sh)
- 1.5d: Create cloud architecture diagram
- 1.5d: Compliance check of 2 clouds (6 hypervisors, 6TB memory)
- 1d: Cloud installations
- .5d: Stability check
ACTION | RESULT | OK/KO |
---|---|---|
Put n/2-1 nodes in maintenance (1 node if the cluster has 2 nodes). | All resources are started. | |
Take all nodes out of maintenance. Power off n/2-1 nodes (1 node if the cluster has 2 nodes), choosing different nodes from the previous test. | All resources are started. | |
Power off all nodes simultaneously, then power on all nodes simultaneously. | All resources are started. | |
- 1.5d: Cloud automation study
- 1.5d: Develop 6 templates (2 clouds, 2 OS, 8 environments, 2 versions)
- 1d: Create migration diagram
- 1.5d: Write 138 lines of migration code (MigrationApp.sh)
- 1.5d: Process stabilization
- 1.5d: Cloud vs old infrastructure benchmark
- .5d: Unavailability time calibration per migration unit
- 5min: Load 82 VMs (env, os, application_code, 2 IP)
Total = 15 man-days
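The total can be checked by summing the durations listed above. The sketch below is only that arithmetic; the task labels are shortened from the list, and the 5-minute VM load is left out because it is negligible at man-day scale.

```python
# Durations in man-days, copied from the migration task list above.
TASKS = [
    ("Infrastructure audit", 1.5),
    ("Cloud architecture diagram", 1.5),
    ("Compliance check", 1.5),
    ("Cloud installations", 1.0),
    ("Stability check", 0.5),
    ("Cloud automation study", 1.5),
    ("Develop templates", 1.5),
    ("Migration diagram", 1.0),
    ("Migration code", 1.5),
    ("Process stabilization", 1.5),
    ("Benchmark", 1.5),
    ("Unavailability time calibration", 0.5),
]


def total_man_days(tasks):
    """Sum the duration column of (label, days) pairs."""
    return sum(days for _, days in tasks)


print(total_man_days(TASKS))
```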
CLOUD improvement
- Formalize your infrastructure as much as possible to gain flexibility, reduce complexity, and limit technology lock-in.
- Use a name server that can take your customers' location into account, such as GDNS.
- Use minimal instances behind a network load balancer such as LVS. Monitor the overall load of your instances and add or remove instances dynamically as needed.
- Alternatively, many providers offer dynamic computing services. Compare prices, but beware of technology lock-in.
- Use a very efficient TLS decoder for the frontend, such as the one in HAPROXY.
- Use a very fast HTTP cache such as VARNISH.
- Use a large cache for large files, such as Apache Traffic Server.
- ...
- Use a REVERSE PROXY with a TLS decoder, such as ENVOY, for broader service compatibility.
- Use serverless services for standard runtimes such as Java, Python, and PHP, but beware of certain incompatibilities and a lack of consistency over time.
- ...
- Whenever you need dynamic computing power, consider load balancing or a native service from your provider (be cautious with provider services!).
- ...
- Try to use open source STACKs as much as possible.
- ...
- Use a cache for your databases, such as MEMCACHED.
- ...
- For more information, see CLOUD WIKIPEDIA.
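The advice above to monitor the overall load of your instances and add or remove them dynamically can be sketched as a simple threshold rule. This is an illustration only: the thresholds, the one-instance step, and the names `average_load` and `desired_instance_count` are assumptions, and a real setup would also register or deregister the instance with the load balancer.

```python
def average_load(loads):
    """Mean load across instances, each expressed as a fraction of capacity."""
    return sum(loads) / len(loads)


def desired_instance_count(current, load, low=0.30, high=0.70,
                           minimum=1, maximum=10):
    """Return the instance count to aim for, changing by at most one.

    All thresholds and bounds are hypothetical defaults for this sketch.
    """
    if load > high:
        target = current + 1   # overloaded: add an instance
    elif load < low:
        target = current - 1   # underused: remove an instance
    else:
        target = current       # within the comfort band: do nothing
    return max(minimum, min(maximum, target))
```

Changing by a single instance per evaluation and keeping a gap between `low` and `high` is one way to avoid oscillating between adding and removing instances.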
CLOUD WIKIPEDIA
CLOUD vs HW
Function | KUBERNETES | OPENSTACK | AWS | Bare-metal | HPC | CRM | OVIRT |
---|---|---|---|---|---|---|---|
DEPLOY | HELM/ANSIBLE/SH | TERRAFORM/ANSIBLE/SH/JUJU | TERRAFORM/CLOUDFORMATION/ANSIBLE/JUJU | ANSIBLE/SH | XCAT/CLUSH | ANSIBLE/SH | ANSIBLE/PYTHON/SH |
BOOTSTRAP | API/CLI | PXE/API/CLI | API/CLI | PXE/IPMI | PXE/IPMI | PXE/IPMI | PXE/API |
Router | API/CLI (kube-router) | API/CLI (router/subnet) | API/CLI (Route table/subnet) | LINUX/OVS/external | XCAT/external | LINUX/external | API |
Firewall | INGRESS/EGRESS/ISTIO | API/CLI (Security groups) | API/CLI (Security group) | LINUX (NFT) | LINUX (NFT) | LINUX (NFT) | API |
Vlan/Vxlan | many | API/CLI (VPC) | API/CLI (VPC) | OVS/LINUX/external | XCAT/external | LINUX/external | API |
Name server | coredns | dns-nameserver | Amazon Route 53 | GDNS | XCAT | LINUX/external | API/external |
Load balancer | kube-proxy/LVS(IPVS) | LVS | Network Load Balancer | LVS | SLURM | Ldirectord | |
Storage | many | SWIFT/CINDER/NOVA | S3/EFS/FSX/EBS | OPENSTACK SWIFT/XFS/EXT4/RAID10 | GPFS | SAN | NFS/SAN |
CLOUD providers
CLOUD INTERNET NETWORK
CLOUD NATIVE
HA COROSYNC+PACEMAKER
Typical architecture
- 2 rooms
- 2 power supplies
- 2FC / server (active/active) (SAN)
- 2*10Gbit/s ethernet / server (active/passive, possible active/active if PXE on native VLAN 0)
- IPMI VLAN (for the fence)
- VLAN ADMIN which must be the native VLAN if BOOTSTRAP by PXE (admin, provisioning, heartbeat)
- USER VLAN (application services)
- NTP
- DNS+DHCP+PXE+TFTP+HTTP for auto-provisioning
- PROXY (for update or otherwise internal REPOSITORY)
- Choose between 2 or more node clusters.
- For a 2-node architecture, enable the 2-node configuration in COROSYNC and configure a 10-second staggered shutdown for one of the nodes (otherwise the cluster is unstable).
- Resources are stateless.
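The reason a 2-node cluster needs its own configuration follows from majority quorum arithmetic, which also explains the n/2-1 figure used in the stability checks earlier on this page. The sketch below assumes one vote per node; the function names are made up for the example, and the `two_node` flag only mirrors the idea of the COROSYNC 2-node setting, not its implementation.

```python
def quorum(total_nodes, two_node=False):
    """Votes needed to keep quorum, assuming one vote per node."""
    if two_node and total_nodes == 2:
        return 1                      # special case: a single node may continue
    return total_nodes // 2 + 1       # strict majority


def max_nodes_down(total_nodes, two_node=False):
    """How many nodes can be lost while the rest still hold quorum."""
    return total_nodes - quorum(total_nodes, two_node)


for n in (2, 4, 6):
    print(n, quorum(n), max_nodes_down(n))
```

With a strict majority, a 4-node cluster tolerates 1 node down and a 6-node cluster tolerates 2, which is n/2-1. A 2-node cluster tolerates none unless the 2-node special case is enabled, in which case `max_nodes_down(2, two_node=True)` is 1.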
For DB resources, plan for 4 GB per database in general. CPU requirements are usually modest. Tip: for time-critical compression, use PZSTD.
Typical service pattern
- MULTIPATH
- LUN
- LVM (LVM resource)
- FS (FS resource)
- NFS (FS resource)
- USER
- IP (IP resource)
- DNS name
- PROCESS (PROCESS resource)
- LISTENER (LISTENER resource)
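If the pattern above is read as a dependency stack, each cluster resource starts only after the one listed before it, and the stack stops in reverse order. The sketch below illustrates that ordering under this assumption; `RESOURCE_STACK` keeps only the items marked as resources in the list, and `bring_up` with its `can_start` callback is a hypothetical helper, not a cluster manager API.

```python
# Items marked as resources in the pattern above, in listed order.
RESOURCE_STACK = ["LVM", "FS", "NFS", "IP", "PROCESS", "LISTENER"]


def stop_order(stack):
    """Resources stop in the reverse of their start order."""
    return list(reversed(stack))


def bring_up(stack, can_start):
    """Start resources in order; on the first failure, roll back.

    Returns (started, rolled_back): the resources left running and the
    resources that were stopped again, in the order they were stopped.
    """
    started = []
    for name in stack:
        if not can_start(name):
            return [], stop_order(started)
        started.append(name)
    return started, []
```

For example, if the IP resource fails to start, `bring_up(RESOURCE_STACK, lambda r: r != "IP")` leaves nothing running and reports that NFS, FS, and LVM were stopped, in that order.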