Main Page
Revision as of 07:13, 13 May 2023
Welcome to my WIKI.
It explores cloud computing, focusing on migration, infrastructure, and high availability. It discusses tools such as Kubernetes, OpenStack, and AWS, emphasizes open-source software, and outlines key factors for implementing cloud infrastructure.
AI Solutions
- Science Research Tool - AI for science.
 - Platform - AI-powered design.
 
Next Steps
- LANGCHAIN - Upcoming semantic tool.
 - VICUNA (LLAMA) - Open-source AI chat.
 
POCs
- AUDIO - Real-time translation.
 - Auto-GPT - AI for coding.
 - OpenAI's AI Tools - Industry-transforming AI.
 - Benchmark - AI performance standard.
 
CLOUD LAB
Presenting my LAB project.
Infrastructure Audit
I created ServerDiff.sh for server audits. It enables configuration drift tracking and environment consistency checks.
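ServerDiff.sh itself is not reproduced here. As a rough illustration of the kind of drift check it enables, the Python sketch below compares two configuration snapshots held as dictionaries. The function name `diff_config` and the sample keys are hypothetical and not taken from the script.

```python
def diff_config(reference, candidate):
    """Return {key: (reference_value, candidate_value)} for every differing key.

    A key that is absent on one side is reported with None on that side.
    """
    drift = {}
    for key in sorted(set(reference) | set(candidate)):
        ref_value = reference.get(key)
        cand_value = candidate.get(key)
        if ref_value != cand_value:
            drift[key] = (ref_value, cand_value)
    return drift


if __name__ == "__main__":
    # Hypothetical snapshots of two servers that are expected to match.
    server_a = {"os": "debian-11", "kernel": "5.10", "ntp": "enabled"}
    server_b = {"os": "debian-11", "kernel": "5.15", "swap": "off"}
    for key, (ref, cand) in diff_config(server_a, server_b).items():
        print(f"{key}: {ref!r} -> {cand!r}")
```

An empty result means the two snapshots are consistent; any entry is a candidate for drift tracking.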
Cloud Migration Example
- 1.5d: Infrastructure audit of 82 services (ServerDiff.sh)
 
- 1.5d: Create cloud architecture diagram
 
- 1.5d: Compliance check of 2 clouds (6 hypervisors, 6TB memory)
 
- 1d: Cloud installations
 
- .5d: Stability check
 
| ACTION | RESULT | OK/KO |
|---|---|---|
| Put n/2-1 nodes into maintenance (or 1 node if the cluster has 2 nodes). | All resources are started. | |
| Take all nodes out of maintenance. Power off n/2-1 nodes (or 1 node if the cluster has 2 nodes), different from those in the previous test. | All resources are started. | |
| Power off all nodes simultaneously. Power on all nodes simultaneously. | All resources are started. | |
- 1.5d: Cloud automation study
 
- 1.5d: Develop 6 templates (2 clouds, 2 OS, 8 environments, 2 versions)
 
- 1d: Create migration diagram
 
- 1.5d: Write 138 lines of migration code (MigrationApp.sh)
 
- 1.5d: Process stabilization
 
- 1.5d: Cloud vs old infrastructure benchmark
 
- .5d: Unavailability time calibration per migration unit
 
- 5min: Load 82 VMs (env, os, application_code, 2 IP)
 
Total = 15 man-days
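The total can be checked by summing the step estimates. The short Python sketch below does this; the step labels are abbreviated from the list above, and the 5-minute VM load is left out as negligible at the man-day scale (an assumption of this sketch).

```python
# Effort per step in man-days, copied from the migration list above.
STEPS = [
    ("Infrastructure audit", 1.5),
    ("Cloud architecture diagram", 1.5),
    ("Compliance check", 1.5),
    ("Cloud installations", 1.0),
    ("Stability check", 0.5),
    ("Cloud automation study", 1.5),
    ("Templates", 1.5),
    ("Migration diagram", 1.0),
    ("Migration code", 1.5),
    ("Process stabilization", 1.5),
    ("Benchmark", 1.5),
    ("Unavailability time calibration", 0.5),
]


def total_man_days(steps):
    """Sum the effort of all (label, man_days) pairs."""
    return sum(days for _label, days in steps)


if __name__ == "__main__":
    print(f"Total = {total_man_days(STEPS):g} man-days")
```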
CLOUD improvement
- Formalize your infrastructure as much as possible to gain flexibility, reduce complexity, and limit technology lock-in.
 - Use a name server that can take your customers' location into account, such as GDNS.
 - Use minimal instances behind a network load balancer such as LVS. Monitor the overall load of your instances and add or remove instances dynamically as needed.
 - Alternatively, many providers offer dynamic computing services. Compare prices, but beware of technology lock-in.
 - Use a very efficient TLS decoder for the frontend, such as the HAPROXY decoder.
 - Use a very fast HTTP cache such as VARNISH.
 - Use a large cache for large files, such as Apache Traffic Server.
 - ...
 - Use a REVERSE PROXY with a TLS decoder, such as ENVOY, for broader service compatibility.
 - Use serverless services for standard runtimes such as Java, Python, and PHP, but beware of certain incompatibilities and a lack of consistency over time.
 - ...
 - Whenever you need dynamic computing power, consider load balancing or a native service from your provider (be cautious with provider services!).
 - ...
 - Use open source STACKs as much as possible.
 - ...
 - Use a cache for your databases, such as MEMCACHED.
 - ...
 - For more information, see CLOUD WIKIPEDIA.
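The advice to monitor the overall load and add or remove instances can be sketched as a simple threshold rule. The Python example below is only an illustration: the function name, the thresholds (`low`, `high`), and the instance bounds are hypothetical values, not recommendations from this page.

```python
def desired_instances(current, avg_load, low=0.3, high=0.7, minimum=1, maximum=10):
    """Return the instance count to aim for, given the average load (0.0 to 1.0).

    One instance is added above `high`, one is removed below `low`,
    and the result is clamped to [minimum, maximum].
    """
    if avg_load > high:
        target = current + 1
    elif avg_load < low:
        target = current - 1
    else:
        target = current
    return max(minimum, min(maximum, target))


if __name__ == "__main__":
    # Hypothetical load samples for a pool that starts with 2 instances.
    count = 2
    for load in (0.9, 0.8, 0.5, 0.1):
        count = desired_instances(count, load)
        print(f"load={load:.1f} -> {count} instance(s)")
```

Changing the count by one instance per evaluation keeps the pool from oscillating when the load briefly spikes; a real setup would also need a cooldown between changes.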
 
CLOUD WIKIPEDIA
CLOUD vs HW
| Function | KUBERNETES | OPENSTACK | AWS | Bare-metal | HPC | CRM | OVIRT | 
|---|---|---|---|---|---|---|---|
| DEPLOY | HELM/ANSIBLE/SH | TERRAFORM/ANSIBLE/SH/JUJU | TERRAFORM/CLOUDFORMATION/ANSIBLE/JUJU | ANSIBLE/SH | XCAT/CLUSH | ANSIBLE/SH | ANSIBLE/PYTHON/SH | 
| BOOTSTRAP | API/CLI | PXE/API/CLI | API/CLI | PXE/IPMI | PXE/IPMI | PXE/IPMI | PXE/API | 
| Router | API/CLI (kube-router) | API/CLI (router/subnet) | API/CLI (Route table/subnet) | LINUX/OVS/external | XCAT/external | LINUX/external | API | 
| Firewall | INGRESS/EGRESS/ISTIO | API/CLI (Security groups) | API/CLI (Security group) | LINUX (NFT) | LINUX (NFT) | LINUX (NFT) | API | 
| Vlan/Vxlan | many | API/CLI (VPC) | API/CLI (VPC) | OVS/LINUX/external | XCAT/external | LINUX/external | API | 
| Name server | coredns | dns-nameserver | Amazon Route 53 | GDNS | XCAT | LINUX/external | API/external | 
| Load balancer | kube-proxy/LVS(IPVS) | LVS | Network Load Balancer | LVS | SLURM | Ldirectord | |
| Storage | many | SWIFT/CINDER/NOVA | S3/EFS/FSX/EBS | OPENSTACK SWIFT/XFS/EXT4/RAID10 | GPFS | SAN | NFS/SAN | 
CLOUD providers
CLOUD INTERNET NETWORK
CLOUD NATIVE
HA COROSYNC+PACEMAKER
Typical architecture
- 2 rooms
 - 2 power supplies
 - 2FC / server (active/active) (SAN)
 - 2*10Gbit/s ethernet / server (active/passive, possible active/active if PXE on native VLAN 0)
 - IPMI VLAN (for the fence)
 - ADMIN VLAN, which must be the native VLAN when bootstrapping via PXE (admin, provisioning, heartbeat)
 - USER VLAN (application services)
 - NTP
 - DNS+DHCP+PXE+TFTP+HTTP for auto-provisioning
 - PROXY (for updates, or otherwise an internal REPOSITORY)
 
- Choose between a 2-node cluster and a cluster with more nodes.
 
- For a 2-node architecture, you need the two-node configuration in COROSYNC, and one of the nodes must be configured with a 10-second staggered shutdown delay (otherwise the cluster becomes unstable).
 
- Resources are stateless.
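The reason a 2-node cluster needs a dedicated configuration can be illustrated with the usual majority rule: quorum requires more than half of the votes, so a plain 2-node cluster loses quorum as soon as one node fails. The Python sketch below models this under the assumption of one vote per node; the function names are hypothetical, and the `two_node` flag only mimics the idea of a two-node mode, not COROSYNC's actual implementation.

```python
def votes_for_quorum(total_nodes):
    """Strict majority: more than half of the votes (one vote per node)."""
    return total_nodes // 2 + 1


def has_quorum(total_nodes, nodes_alive, two_node=False):
    """Tell whether the surviving nodes may keep running resources."""
    if two_node and total_nodes == 2:
        # Two-node mode: a single survivor keeps quorum. Fencing must then
        # decide which node survives, hence the staggered delay on one node.
        return nodes_alive >= 1
    return nodes_alive >= votes_for_quorum(total_nodes)


if __name__ == "__main__":
    print("2 nodes, 1 alive, plain majority:", has_quorum(2, 1))
    print("2 nodes, 1 alive, two-node mode: ", has_quorum(2, 1, two_node=True))
    print("3 nodes, 2 alive, plain majority:", has_quorum(3, 2))
```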
 
As a general rule, provision 4GB per database for DB resources. CPU requirements are usually modest. Tip: for time-critical compression, use PZSTD.
Typical service pattern
- MULTIPATH
 - LUN
 - LVM (LVM resource)
 - FS (FS resource)
 - NFS (FS resource)
 - USER
 - IP (IP resource)
 - DNS name
 - PROCESS (PROCESS resource)
 - LISTENER (LISTENER resource)
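If the pattern above is read from top to bottom as a dependency chain (an assumption; the page does not state the ordering explicitly), each layer starts after the one before it and stops in the reverse order. A minimal Python sketch of that reading, with hypothetical function names:

```python
# Layers of the service pattern, in the order listed above.
SERVICE_STACK = [
    "MULTIPATH",
    "LUN",
    "LVM",
    "FS",
    "NFS",
    "USER",
    "IP",
    "DNS name",
    "PROCESS",
    "LISTENER",
]


def start_order(stack):
    """Start from the storage layers up to the listener."""
    return list(stack)


def stop_order(stack):
    """Stop in the reverse order, listener first and storage last."""
    return list(reversed(stack))


if __name__ == "__main__":
    print("start:", " -> ".join(start_order(SERVICE_STACK)))
    print("stop: ", " -> ".join(stop_order(SERVICE_STACK)))
```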
 


