Main Page: Difference between revisions

[[File:Infocepo-picture.png|thumb|right|Cloud, AI and Labs on infocepo.com]]


= infocepo.com – Cloud, AI & Labs =


Welcome to the '''infocepo.com''' portal.


This wiki is intended for system administrators, cloud engineers, developers, students, and enthusiasts who want to:

* Understand modern architectures (Kubernetes, OpenStack, bare-metal, HPC…)
* Deploy private AI assistants and productivity tools
* Build hands-on labs to learn by doing
* Prepare large-scale audits, migrations, and automations


The goal: turn theory into '''reusable scripts, diagrams, and architectures'''.


__TOC__
----


= Getting started quickly =

== Quick links ==

* [https://chat.infocepo.com '''AI assistant''']
* [https://infocepo.com '''Main portal''']
* [[Special:AllPages|'''All pages''']]
* [https://github.com/ynotopec '''GitHub''']
* [https://uptime-kuma.ai.lab.infocepo.com/status/ai '''Service status''']
* [https://grafana.ai.lab.infocepo.com '''Monitoring''']

== Recommended paths ==

; 1. Build a private AI assistant
* Deploy a typical stack: '''Open WebUI + Ollama + GPU''' (H100 or consumer-grade GPU)
* Add a chat model and a summarization model
* Integrate internal data (RAG, embeddings)

; 2. Launch a Cloud lab
* Create a small cluster (Kubernetes, OpenStack, or bare-metal)
* Set up a deployment pipeline (Helm, Ansible, Terraform…)
* Add an AI service (transcription, summarization, chatbot…)
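
The typical assistant stack above can be sketched as a minimal Docker Compose file. Image names, ports, and the <code>OLLAMA_BASE_URL</code> variable follow the projects' published defaults, but treat this as an illustrative sketch to verify against the Open WebUI and Ollama documentation (GPU passthrough is omitted for brevity):

```yaml
# Minimal sketch: Open WebUI in front of a local Ollama backend.
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama            # model cache survives restarts
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"                     # UI reachable on http://localhost:3000
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama
volumes:
  ollama:
```

Once it is up, pull a model into the backend, for example with <code>docker compose exec ollama ollama pull <model></code>.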
 
; 3. Prepare an audit / migration
* Inventory servers with '''ServerDiff.sh'''
* Design the target architecture (cloud diagrams)
* Automate the migration with reproducible scripts
 
== Content overview ==
 
* '''AI guides & tools''': assistants, models, evaluations, GPUs
* '''Cloud & infrastructure''': HA, HPC, web-scale, DevSecOps
* '''Labs & scripts''': audit, migration, automation
* '''Comparison tables''': Kubernetes vs OpenStack vs AWS vs bare-metal, etc.


----


= Future =
[[File:Automation-full-vs-humans.png|thumb|right|The world after automation]]


= AI Assistants & Cloud Tools =

== AI Assistants ==

; '''ChatGPT'''
* https://chatgpt.com ChatGPT – Public conversational assistant, suited for exploration, writing, and rapid experimentation.

; '''Self-hosted AI assistants'''
* https://github.com/open-webui/open-webui Open WebUI + https://www.scaleway.com/en/h100-pcie-try-it-now/ H100 GPU + https://ollama.com Ollama
: Typical stack for private assistants, self-hosted LLMs, and OpenAI-compatible APIs.
* https://github.com/ynotopec/summarize Private summary – Local, fast, offline summarizer for your own data.
 
== Development, models & tracking ==
 
; '''Discovering and tracking models'''
* https://ollama.com/library LLM Trending – Model library (chat, code, RAG…) for local deployment.
* https://huggingface.co/models Models Trending – Model marketplace, filterable by task, size, and license.
* https://huggingface.co/models?pipeline_tag=image-text-to-text&sort=trending Img2txt Trending – Vision-language models (image → text).
* https://huggingface.co/spaces/TIGER-Lab/GenAI-Arena Txt2img Evaluation – Image generation model comparisons.
 
; '''Evaluation & benchmarks'''
* https://lmarena.ai/leaderboard ChatBot Evaluation – Chatbot rankings (open-source and proprietary models).
* https://huggingface.co/spaces/mteb/leaderboard Embedding Leaderboard – Benchmark of embedding models for RAG and semantic search.
* https://ann-benchmarks.com Vectors DB Ranking – Vector database comparison (latency, memory, features).
* https://top500.org/lists/green500/ HPC Efficiency – Ranking of the most energy-efficient supercomputers.
 
; '''Development & fine-tuning tools'''
* https://github.com/search?q=stars%3A%3E15000+forks%3A%3E1500+created%3A%3E2022-06-01&type=repositories&s=updated&o=desc Project Trending – Major recent open-source projects, sorted by popularity and activity.
* https://github.com/hiyouga/LLaMA-Factory LLM Fine Tuning – Advanced framework for LLM fine-tuning (instruction tuning, LoRA, etc.).
* https://www.perplexity.ai Perplexity AI – Search and synthesis tool positioned as a "research copilot".
 
== AI Hardware & GPUs ==
 
; '''GPUs & accelerators'''
* https://www.nvidia.com/en-us/data-center/h100/ NVIDIA H100 – Datacenter GPU for Kubernetes clusters and intensive AI workloads.
* NVIDIA RTX 5080 – Consumer GPU for lower-cost private LLM deployments.
* https://www.mouser.fr/ProductDetail/BittWare/RS-GQ-GC1-0109?qs=ST9lo4GX8V2eGrFMeVQmFw%3D%3D GROQ LLM accelerator – Hardware accelerator dedicated to LLM inference.
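
A rough rule of thumb motivates the datacenter-vs-consumer GPU choice above: weight memory ≈ parameter count × bytes per parameter, and quantization shrinks the latter. This back-of-the-envelope helper deliberately ignores KV-cache and activation overhead, so treat the numbers as lower bounds:

```shell
# Weights-only VRAM estimate in GB: params (in billions) * bytes per parameter.
vram_gb() { awk -v p="$1" -v b="$2" 'BEGIN { printf "%.0f\n", p * b }'; }

vram_gb 20 2     # fp16 20B model: 40 GB -> datacenter GPU (e.g. H100) territory
vram_gb 20 0.5   # 4-bit 20B model: 10 GB -> fits a 16 GB consumer card
```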
 
----
 
= Open models & internal endpoints =
 
''(Last update: 2026-02-13)''
 
The models below correspond to '''logical endpoints''' (for example via a proxy or gateway), selected for specific use cases.
 
{| class="wikitable"
! Endpoint !! Description / Primary use case
|-
| '''ai-chat''' || Based on '''gpt-oss-20b''' – General-purpose chat, good cost / quality balance.
|-
| '''ai-translate''' || gpt-oss-20b, temperature = 0 – Deterministic, reproducible translation (FR, EN, other languages).
|-
| '''ai-summary''' || qwen3 – Model optimized for summarizing long texts (reports, documents, transcriptions).
|-
| '''ai-code''' || gpt-oss-20b – Code reasoning, explanation, and refactoring.
|-
| '''ai-code-completion''' || gpt-oss-20b – Fast code completion, designed for IDE auto-completion.
|-
| '''ai-parse''' || qwen3 – Structured extraction, log / JSON / table parsing.
|-
| '''ai-RAG-FR''' || qwen3 – RAG usage in French (business knowledge, internal FAQs).
|-
| '''gpt-oss-20b''' || Agentic tasks.
|}


Usage idea: each endpoint is associated with one or more labs (chat, summary, parsing, RAG, etc.) in the Cloud Lab section.
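
As a sketch of how such a logical endpoint might be called, assuming the gateway (for example the LLM API service at api.ai.lab.infocepo.com) exposes an OpenAI-compatible chat route — the <code>/v1</code> path and bearer-token auth are assumptions, not documented on this wiki:

```shell
# Build an OpenAI-compatible chat payload; the endpoint name from the
# table above goes in the "model" field.
payload='{
  "model": "ai-translate",
  "temperature": 0,
  "messages": [{"role": "user", "content": "Translate to English: bonjour"}]
}'
echo "$payload"

# Hypothetical send (base URL from the services list; /v1 path is assumed):
#   curl -s https://api.ai.lab.infocepo.com/v1/chat/completions \
#     -H "Authorization: Bearer $API_KEY" \
#     -H "Content-Type: application/json" \
#     -d "$payload"
```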


----


= News & Trends =


* https://www.youtube.com/@lev-selector/videos Top AI News – Curated AI news videos.
* https://betterprogramming.pub/color-your-captions-streamlining-live-transcriptions-with-diart-and-openais-whisper-6203350234ef Real-time transcription with Diart + Whisper – Example of real-time transcription with speaker detection.
* https://github.com/openai-translator/openai-translator OpenAI Translator – Modern extension / client for LLM-assisted translation.
* https://opensearch.org/docs/latest/search-plugins/conversational-search Opensearch with LLM – Conversational search based on LLMs and OpenSearch.


----


= Training & Learning =


* https://www.youtube.com/watch?v=4Bdc55j80l8 Transformers Explained – Introduction to Transformers, the core architecture of LLMs.
* Hands-on labs, scripts, and real-world feedback in the [[LAB project|CLOUD LAB]] project below.


----


= Cloud Lab & Audit Projects =
 
[[File:Infocepo.drawio.png|400px|Cloud Lab reference diagram]]
 
The '''Cloud Lab''' provides reproducible scenarios: infrastructure audits, cloud migration, automation, high availability.
 
== Audit project – Cloud Audit ==
 
; '''[[ServerDiff.sh]]'''
: Bash audit script to:
 
* detect configuration drift,
* compare multiple environments,
* prepare a migration or remediation plan.
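
The idea can be sketched as follows. This is an illustrative reimplementation, not the actual ServerDiff.sh; the capture-file names and sample package lists are placeholders:

```shell
# Sketch of a drift audit: build one normalized inventory per host,
# then diff the inventories pairwise.
inventory() {
  # On a live host this would run remotely, e.g.:
  #   ssh "$1" 'rpm -qa | sort; sysctl -a 2>/dev/null | sort'
  # Here a pre-captured file stands in, so the sketch runs offline.
  sort "$1"
}

printf 'httpd-2.4\nkernel-5.14\n' > host1.capture   # sample captures
printf 'httpd-2.4\nkernel-4.18\n' > host2.capture

inventory host1.capture > host1.inv
inventory host2.capture > host2.inv

# A non-empty diff means the two environments have drifted apart.
if ! diff -u host1.inv host2.inv; then
  echo "drift detected"
fi
```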
 
== Example of Cloud migration ==
 
[[File:Diagram-migration-ORACLE-KVM-v2.drawio.png|400px|Cloud migration diagram]]
 
Example: migration of virtual environments to a modernized cloud, including audit, architecture design, and automation.


{| class="wikitable"
! Task !! Description !! Duration (days)
|-
| Infrastructure audit || 82 services, automated audit via '''ServerDiff.sh''' || 1.5
|-
| Cloud architecture diagram || Visual design and documentation || 1.5
|-
| Compliance checks || 2 clouds, 6 hypervisors, 6 TB of RAM || 1.5
|-
| Cloud platform installation || Deployment of main target environments || 1.0
|-
| Stability verification || Early functional tests || 0.5
|-
| Automation study || Identification and automation of repetitive tasks || 1.5
|-
| Template development || 6 templates, 8 environments, 2 clouds / OS || 1.5
|-
| Migration diagram || Illustration of the migration process || 1.0
|-
| Migration code writing || 138 lines (see '''MigrationApp.sh''') || 1.5
|-
| Process stabilization || Validation that the migration is reproducible || 1.5
|-
| Cloud benchmarking || Performance comparison vs legacy infrastructure || 1.5
|-
| Downtime tuning || Calculation of outage time per migration || 0.5
|-
| VM loading || 82 VMs: OS, code, 2 IPs per VM || 0.1
|-
! colspan=2 align="right"| '''Total''' !! 15 person-days
|}

=== Stability checks (minimal HA) ===

{| class="wikitable"
! Action !! Expected result
|-
| Shutdown of one node || All services must automatically restart on the remaining nodes.
|-
| Simultaneous shutdown / restart of all nodes || All services must recover correctly after reboot.
|}

= Featured services =

{| class="wikitable"
! Service !! Role
|-
| [https://api.ai.lab.infocepo.com '''LLM API'''] || Chat, code, RAG, OCR
|-
| [https://stt.ai.lab.infocepo.com/docs '''STT API'''] || Audio transcription
|-
| [https://tts.ai.lab.infocepo.com/docs '''TTS API'''] || Text-to-speech
|-
| [https://api-summary.ai.lab.infocepo.com/docs '''Summary API'''] || Text summarization
|-
| [https://text-embeddings.ai.lab.infocepo.com/docs '''Text Embeddings'''] || Embeddings for RAG
|-
| [https://chromadb.ai.lab.infocepo.com '''ChromaDB'''] || Vector database
|-
| [https://datalab.ai.lab.infocepo.com '''DataLab'''] || Working and experimentation environment
|}


----


= Web Architecture & Best Practices =

[[File:WebModelDiagram.drawio.png|400px|Reference web architecture]]

Principles for designing scalable and portable web architectures:

* Favor '''simple, modular, and flexible''' infrastructure.
* Route clients by location (GDNS or equivalent) to bring content closer to users.
* Use network load balancers (LVS, IPVS) for scalability.
* Systematically compare costs and beware of '''vendor lock-in'''.
* TLS termination:
** HAProxy for fast frontends,
** Envoy for compatibility and advanced use cases (mTLS, HTTP/2/3).
* Caching:
** Varnish or Apache Traffic Server for large content volumes.
* Favor open-source stacks and database caches (e.g., Memcached).
* Use message queues, buffers, and quotas to smooth traffic spikes.
* For complete architectures:
** https://wikitech.wikimedia.org/wiki/Wikimedia_infrastructure Wikimedia Cloud Architecture
** https://github.com/systemdesign42/system-design System Design GitHub


----


= Comparison of major Cloud platforms =

{| class="wikitable"
! Feature !! Kubernetes !! OpenStack !! AWS !! Bare-metal !! HPC !! CRM !! oVirt
|-
| '''Deployment tools''' || Helm, YAML, ArgoCD, Juju || Ansible, Terraform, Juju || CloudFormation, Terraform, Juju || Ansible, Shell || xCAT, Clush || Ansible, Shell || Ansible, Python
|-
| '''Bootstrap method''' || API || API, PXE || API || PXE, IPMI || PXE, IPMI || PXE, IPMI || PXE, API
|-
| '''Router control''' || Kube-router || Router/Subnet API || Route Table / Subnet API || Linux, OVS || xCAT || Linux || API
|-
| '''Firewall control''' || Istio, NetworkPolicy || Security Groups API || Security Group API || Linux firewall || Linux firewall || Linux firewall || API
|-
| '''Network virtualization''' || VLAN, VxLAN, others || VPC || VPC || OVS, Linux || xCAT || Linux || API
|-
| '''DNS''' || CoreDNS || DNS-Nameserver || Route 53 || GDNS || xCAT || Linux || API
|-
| '''Load Balancer''' || Kube-proxy, LVS || LVS || Network Load Balancer || LVS || SLURM || Ldirectord || N/A
|-
| '''Storage options''' || Local, Cloud, PVC || Swift, Cinder, Nova || S3, EFS, EBS, FSx || Swift, XFS, EXT4, RAID10 || GPFS || SAN || NFS, SAN
|}

This table serves as a starting point for choosing the right stack based on:

* Desired level of control (API vs bare-metal),
* Context (on-prem, public cloud, HPC, CRM…),
* Existing automation tooling.

== Going further ==

* [[ServerDiff.sh|'''Audit with ServerDiff.sh''']]
* [[MLFlow|'''MLFlow''']]
* [[CI-CD-GITHUB-K8S|'''CI/CD + GitHub + Kubernetes''']]
* For more details, browse [[Special:AllPages|'''the full list of pages''']].


----


= Useful Cloud & IT links =
 
* https://cloud.google.com/free/docs/aws-azure-gcp-service-comparison Cloud Providers Compared – AWS / Azure / GCP service mapping.
* https://global-internet-map-2021.telegeography.com/ Global Internet Topology Map – Global Internet mapping.
* https://landscape.cncf.io/?fullscreen=yes CNCF Official Landscape – Overview of cloud-native projects (CNCF).
* https://wikitech.wikimedia.org/wiki/Wikimedia_infrastructure Wikimedia Cloud Wiki – Wikimedia infrastructure, real large-scale example.
* https://openapm.io OpenAPM – SRE Tools – APM / observability tooling.
* https://access.redhat.com/downloads/content/package-browser RedHat Package Browser – Package and version search at Red Hat.
* https://www.silkhom.com/barometre-2021-des-tjm-dans-informatique-digital IT Daily Rates – 2021 barometer of freelance daily rates (TJM) in IT / digital.
* https://www.glassdoor.fr/salaire/Hays-Salaires-E10166.htm IT Salaries (Glassdoor) – Salary indicators.
 
----
 
= Advanced: High Availability, HPC & DevSecOps =
 
== High Availability with Corosync & Pacemaker ==
 
[[File:HA-REF.drawio.png|400px|HA cluster architecture]]
 
Basic principles:
 
* Multi-node or multi-site clusters for redundancy.
* Use of IPMI for fencing, provisioning via PXE/NTP/DNS/TFTP.
* For a 2-node cluster:
** carefully sequence fencing to avoid split-brain,
** 3 or more nodes remain recommended for production.
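
The 2-node caveat comes down to quorum arithmetic: a cluster needs a strict majority of votes, floor(n/2) + 1, so a 2-node cluster loses quorum the moment either node fails — which is why fencing order matters there, and why 3 or more nodes are preferred:

```shell
# Minimum live nodes needed for a strict majority (quorum).
quorum() { echo $(( $1 / 2 + 1 )); }

quorum 2   # -> 2 : losing either node loses quorum entirely
quorum 3   # -> 2 : one node may fail and the cluster keeps quorum
quorum 5   # -> 3 : two nodes may fail
```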
 
=== Common resource patterns ===
 
* Multipath storage, LUNs, LVM, NFS.
* User resources and application processes.
* Virtual IPs, DNS records, network listeners.
 
== HPC ==
 
[[File:HPC.drawio.png|400px|Overview of an HPC cluster]]
 
* Job orchestration (SLURM or equivalent).
* High-performance shared storage (GPFS, Lustre…).
* Possible integration with AI workloads (large-scale training, GPU inference).
 
== DevSecOps ==
 
[[File:DSO-POC-V3.drawio.png|400px|DevSecOps reference design]]
 
* CI/CD pipelines with built-in security checks (linting, SAST, DAST, SBOM).
* Observability (logs, metrics, traces) integrated from design time.
* Automated vulnerability scanning, secret management, policy-as-code.
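
As a sketch, a pipeline with the security checks above might look like this in GitHub Actions syntax. The job layout and tool choices (Semgrep for SAST, Syft for the SBOM) are illustrative assumptions, not a documented pipeline from this wiki:

```yaml
# Illustrative DevSecOps pipeline: lint, SAST gate, and SBOM generation.
name: devsecops
on: [push]
jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Lint
        run: make lint                        # or your linter of choice
      - name: SAST
        run: semgrep scan --error             # fail the build on findings
      - name: SBOM
        run: syft . -o spdx-json > sbom.json  # software bill of materials
```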
 
----
 
= About & Contributions =
 
For more examples, scripts, diagrams, and feedback, see:


* https://infocepo.com infocepo.com

Suggestions for corrections, diagram improvements, or new labs are welcome.

This wiki aims to remain a '''living laboratory''' for cloud, AI, and automation.

Revision as of 18:09, 26 March 2026
