Main Page
Revision as of 23:10, 13 May 2023
Welcome to my WIKI.
This wiki covers cloud computing, with a focus on migration, infrastructure, and high availability. It discusses tools such as Kubernetes, OpenStack, and AWS, emphasizes open-source software, and outlines key factors for implementing cloud infrastructure.
AI Solutions
- Science Research Tool - AI for science.
- Platform - AI-powered design.
 
Next Steps
- LANGCHAIN - Upcoming semantic tool.
- VICUNA (LLAMA) - Open-source AI chat.
 
POCs
- AUDIO - Real-time translation.
- Auto-GPT - AI for coding.
- OpenAI's AI Tools - Industry-transforming AI.
- Benchmark - AI performance standard.
- PICTURE - NVIDIA neural compression technique.
- HuggingFace - AI models.
 
CLOUD LAB
Presenting my LAB project.
Infrastructure Audit
Created ServerDiff.sh for server audits. Enables configuration drift tracking and environment consistency checks.
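The idea of configuration drift tracking can be illustrated with a minimal Python sketch. This is not ServerDiff.sh itself, whose contents are not shown here. It assumes each server's settings have been exported as simple `key=value` lines, and the names `parse_snapshot` and `config_drift` are hypothetical.

```python
def parse_snapshot(text):
    """Parse 'key=value' lines into a dict, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def config_drift(reference, candidate):
    """Return keys added, removed, or changed in candidate relative to reference."""
    added = {k: candidate[k] for k in candidate.keys() - reference.keys()}
    removed = {k: reference[k] for k in reference.keys() - candidate.keys()}
    changed = {
        k: (reference[k], candidate[k])
        for k in reference.keys() & candidate.keys()
        if reference[k] != candidate[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots of two environments (sample data, not real servers).
prod = parse_snapshot("kernel=5.15\nntp=10.0.0.1\nswap=off\n")
staging = parse_snapshot("# staging\nkernel=5.15\nntp=10.0.0.2\nselinux=enforcing\n")

drift = config_drift(prod, staging)
for kind, entries in drift.items():
    print(kind, entries)
```

An empty result in all three categories would mean the two environments are consistent for the settings that were captured.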
Cloud Migration Example
- 1.5d: Infrastructure audit of 82 services (ServerDiff.sh)
- 1.5d: Create cloud architecture diagram
- 1.5d: Compliance check of 2 clouds (6 hypervisors, 6TB memory)
- 1d: Cloud installations
- 0.5d: Stability check
- 1.5d: Cloud automation study
- 1.5d: Develop 6 templates (2 clouds, 2 OS, 8 environments, 2 versions)
- 1d: Create migration diagram
- 1.5d: Write 138 lines of migration code (MigrationApp.sh)
- 1.5d: Process stabilization
- 1.5d: Cloud vs old infrastructure benchmark
- 0.5d: Unavailability time calibration per migration unit
- 5min: Load 82 VMs (env, os, application_code, 2 IP)
 
Total = 15 man-days
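The total can be checked by summing the per-step estimates. The short sketch below copies the day values from the plan above. The 5-minute VM load is left out because it is not expressed in days, and the names `PLAN` and `total_effort` are only illustrative.

```python
# Effort per step in man-days, copied from the migration plan.
# The 5-minute VM load is omitted because it is not counted in days.
PLAN = [
    (1.5, "Infrastructure audit of 82 services"),
    (1.5, "Create cloud architecture diagram"),
    (1.5, "Compliance check of 2 clouds"),
    (1.0, "Cloud installations"),
    (0.5, "Stability check"),
    (1.5, "Cloud automation study"),
    (1.5, "Develop 6 templates"),
    (1.0, "Create migration diagram"),
    (1.5, "Write migration code"),
    (1.5, "Process stabilization"),
    (1.5, "Cloud vs old infrastructure benchmark"),
    (0.5, "Unavailability time calibration"),
]


def total_effort(plan):
    """Sum the man-day estimates of every step in the plan."""
    return sum(days for days, _ in plan)


total = total_effort(PLAN)
print(f"{len(PLAN)} steps, {total:g} man-days")  # prints "12 steps, 15 man-days"
```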
| ACTION | RESULT | OK/KO |
|---|---|---|
| Put n/2-1 nodes into maintenance (1 node for a 2-node cluster). | All resources are started. | |
| Take all nodes out of maintenance. Power off n/2-1 nodes (1 node for a 2-node cluster), different from those in the previous test. | All resources are started. | |
| Power off all nodes simultaneously, then power them all on simultaneously. | All resources are started. | |
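The node counts used in the first two tests can be worked out with a small helper. The sketch below is one reading of the "n/2-1 nodes, or 1 node if 2 nodes" rule. It assumes integer division for odd cluster sizes and a minimum of one node, neither of which is stated in the table, and the function names are hypothetical.

```python
def nodes_to_disable(n):
    """Number of nodes to put in maintenance or power off in a test.

    Assumption: n/2-1 uses integer division and at least one node is
    always taken down; a 2-node cluster is the stated special case.
    """
    if n < 2:
        raise ValueError("a cluster needs at least 2 nodes")
    if n == 2:
        return 1
    return max(1, n // 2 - 1)


def keeps_majority(n, disabled):
    """True if the remaining nodes are still a strict majority of the cluster."""
    return (n - disabled) > n / 2


for size in (2, 3, 4, 6, 8):
    down = nodes_to_disable(size)
    print(f"{size} nodes: disable {down}, majority kept: {keeps_majority(size, down)}")
```

With these assumptions, only the 2-node case loses its strict majority during the test, which is consistent with that case needing a dedicated two-node quorum configuration.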
Cloud Enhancement
- Formalize the infrastructure to gain flexibility and reduce complexity.
- Use a name server that tracks customer location, such as GDNS.
- Use minimal instances behind a network load balancer such as LVS.
- Compare prices of dynamic computing services, and beware of technology lock-in.
- Use an efficient frontend TLS terminator such as HAPROXY.
- Use a fast HTTP cache such as VARNISH, and Apache Traffic Server for large files.
- Use a REVERSE PROXY with TLS termination, such as ENVOY, for service compatibility.
- Consider serverless services for standard runtimes, keeping potential incompatibilities in mind.
- Use load balancing or native services to scale computing power dynamically.
- Use open-source STACKs where possible.
- Use database caches such as MEMCACHED.
- More information at CLOUD WIKIPEDIA.
 
CLOUD WIKIPEDIA
CLOUD vs HW
| Function | KUBERNETES | OPENSTACK | AWS | Bare-metal | HPC | CRM | OVIRT | 
|---|---|---|---|---|---|---|---|
| DEPLOY | HELM/YAML/OPERATOR/ANSIBLE/JUJU | ANSIBLE+PACKER+TERRAFORM/JUJU | ANSIBLE/TERRAFORM/CLOUDFORMATION/JUJU | ANSIBLE/SH | XCAT/CLUSH | ANSIBLE/SH | ANSIBLE/PYTHON/SH | 
| BOOTSTRAP | API | API/PXE | API | PXE/IPMI | PXE/IPMI | PXE/IPMI | PXE/API | 
| Router (control) | API (Kube-router) | API (Router/Subnet) | API (Route table/Subnet) | LINUX/OVS/external | XCAT/external | LINUX/external | API | 
| Firewall (control) | INGRESS/EGRESS/ISTIO/NETWORKPOLICY | API (Security groups) | API (Security group) | LINUX | LINUX | LINUX | API | 
| Vlan/Vxlan | many | VPC | VPC | OVS/LINUX/external | XCAT/external | LINUX/external | API | 
| Name server (control) | coredns | dns-nameserver | Amazon Route 53 | GDNS | XCAT | LINUX/external | API/external | 
| Load balancer | kube-proxy/LVS(IPVS) | LVS | Network Load Balancer | LVS | SLURM | Ldirectord | |
| Storage | many | SWIFT/CINDER/NOVA | S3/EFS/FSX/EBS | SWIFT/XFS/EXT4/RAID10 | GPFS | SAN | NFS/SAN | 
CLOUD providers
CLOUD INTERNET NETWORK
CLOUD NATIVE
High Availability (HA) with Corosync+Pacemaker
Typical Architecture
- Dual-room.
- IPMI LAN (fencing).
- NTP, DNS+DHCP+PXE+TFTP+HTTP (auto-provisioning), PROXY (updates or internal REPOSITORY).
- Use clusters of 2 or more nodes.
- For a 2-node cluster, use the COROSYNC two-node configuration and stagger node shutdowns by 10 seconds for stability.
- Prefer stateless resources. Allocate 4 GB per database for DB resources. CPU requirements are generally low.
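For the 2-node case, the relevant part of `corosync.conf` is the `quorum` section. The fragment below is a minimal sketch using the standard votequorum options. The rest of the file (totem, nodelist, logging) is omitted and depends on the environment.

```
quorum {
    provider: corosync_votequorum
    # Special two-node mode: the cluster stays quorate when one node is lost.
    two_node: 1
}
```

Setting `two_node: 1` also enables `wait_for_all` by default, so after a full shutdown both nodes must be seen at least once before the cluster becomes quorate. Fencing (here over the IPMI LAN) remains necessary to avoid split-brain.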
 
Typical Service Pattern
- Multipath
- LUN
- LVM (LVM resource)
- FS (FS resource)
- NFS (FS resource)
- User
- IP (IP resource)
- DNS name
- Process (Process resource)
- Listener (Listener resource)
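The pattern above reads as a dependency stack: each layer relies on the ones listed before it. The Python sketch below assumes that reading (the list does not state an order explicitly) and models the behavior of an ordered resource group, which starts members in order and stops them in reverse. The `bring_up` function and the simulated start and stop callbacks are hypothetical and do not call Pacemaker.

```python
SERVICE_STACK = [
    "Multipath", "LUN", "LVM", "FS", "NFS",
    "User", "IP", "DNS name", "Process", "Listener",
]


def bring_up(stack, start, stop):
    """Start resources in order; on the first failure, stop the ones
    already started in reverse order and return an empty list."""
    started = []
    for resource in stack:
        if start(resource):
            started.append(resource)
        else:
            for done in reversed(started):
                stop(done)
            return []
    return started


# Simulated run in which the IP resource fails to start.
log = []


def fake_start(resource):
    log.append(("start", resource))
    return resource != "IP"


def fake_stop(resource):
    log.append(("stop", resource))


result = bring_up(SERVICE_STACK, fake_start, fake_stop)
stopped = [name for action, name in log if action == "stop"]
print("started:", result)
print("rolled back:", stopped)
```

In the simulated failure, the storage and user layers are released from the top down, so the filesystem is unmounted before the volume group and the LUN underneath it are released.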
 
