Tmp

== Migration LDAP ==
=== script test env create ===
* Create a test directory
* Synchronise the small files of the first three trees of the WS storage into the test directory
* Create a directory with files that use every user ID and every group ID, as in the sketch below
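
A minimal sketch of such a script, assuming a hypothetical WS storage mount point /mnt/ws and a 1 MiB cut-off for "small" files (neither value comes from this page), with user and group IDs enumerated through getent:
<syntaxhighlight lang="bash">
#!/usr/bin/env bash
# Hypothetical sketch: build the LDAP-migration test directory.
set -euo pipefail

testDir=~/ldap-test
mkdir -p "$testDir"

# Small files only, from the first three top-level trees of the WS storage.
n=0
for tree in /mnt/ws/*/; do
  (( ++n > 3 )) && break
  rsync -a --max-size=1m "$tree" "$testDir/$(basename "$tree")/"
done

# One file per user ID and one per group ID, so every ID occurs at least
# once in the test tree (requires root to chown arbitrary IDs).
mkdir -p "$testDir/ids"
while IFS=: read -r name _ uid gid _; do
  touch "$testDir/ids/u-$name"
  chown "$uid:$gid" "$testDir/ids/u-$name"
done < <(getent passwd)
while IFS=: read -r gname _ gid _; do
  touch "$testDir/ids/g-$gname"
  chgrp "$gid" "$testDir/ids/g-$gname"
done < <(getent group)
</syntaxhighlight>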

=== script files&ACL list ===
* Write a script that lists the files together with their ACLs, as sketched below
* Run it on the whole infrastructure, taking care to create a separate list file for the test directory
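
A sketch using GNU getfacl; the output directory and script name are assumptions:
<syntaxhighlight lang="bash">
#!/usr/bin/env bash
# Hypothetical sketch: one ACL list file per storage root passed as argument.
set -euo pipefail

outDir=~/acl-lists
mkdir -p "$outDir"

for root in "$@"; do
  out="$outDir/$(printf '%s' "$root" | tr '/' '_').acl"
  # -R: recurse; -p: keep absolute paths; -n: numeric IDs, easier to rewrite
  getfacl -R -p -n "$root" > "$out"
done
</syntaxhighlight>
Running it as "bash list-acls.sh /srv/storage ~/ldap-test" (hypothetical paths) produces one list per root, which keeps the test directory in its own file.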

=== Steps ===
* Export the old LDAP
==== script LDIF to CSV ====
* Sort the attributes of each DN
* Convert each DN to a single line, its attributes separated by a CSV delimiter (see the sketch after this list)
* Import into a spreadsheet
* Count the anomalies
* Build four tables: attribute modifications, attribute additions, attribute deletions, and DN additions
* Define the changes for the simple cases
* If any DN IDs were changed, add a copy with the "old" suffix to the DN-addition table, renaming the CN with the "old" suffix as well and restoring the old ID, so that the old IDs stay readable until they are permanently removed from the LDAP
* Build summaries: users, and users per group
* Build the data tree
* Have the client validate the changes
* Make a fresh export of the old LDAP
* Run the conversion script again
* If a diff shows differences, redo this step with the delta
* Export to CSV format
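
A minimal gawk sketch of the conversion; the file names are placeholders, and LDIF continuation lines and base64 values are ignored here:
<syntaxhighlight lang="bash">
#!/usr/bin/env bash
# Hypothetical sketch: flatten an LDIF export to CSV, one line per DN,
# with attributes sorted so two exports can be diffed field by field.
set -euo pipefail

awk -F': ' -v OFS=';' '
  function flush(   n, i, line, sorted) {
    if (dn == "") return
    n = asorti(attrs, sorted)          # gawk extension: sort attribute keys
    line = dn
    for (i = 1; i <= n; i++) line = line OFS sorted[i]
    print line
    dn = ""
    delete attrs
  }
  /^dn: / { flush(); dn = $2; next }   # a new entry starts
  /^$/    { flush(); next }            # blank line ends the entry
  NF >= 2 && $0 !~ /^[ #]/ { attrs[$1 "=" $2] = 1 }
  END     { flush() }
' old-export.ldif > old-export.csv
</syntaxhighlight>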

=== script parsing list files ===
* From the attribute-modification table, pick up each changed ID together with its old ID from the original CSV
* For each ID to change, parse all the file lists and replace the old IDs with the new ones, as sketched below
* Run a diff so that only the files needing modification remain
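
A sketch using GNU sed; the id-map.csv name and its oldID;newID format are assumptions, not taken from this page:
<syntaxhighlight lang="bash">
#!/usr/bin/env bash
# Hypothetical sketch: rewrite old IDs to new IDs in every ACL list.
set -euo pipefail

# Keep pristine copies once, so the final diff is against the originals.
for f in ~/acl-lists/*.acl; do cp -p "$f" "$f.orig"; done

while IFS=';' read -r oldId newId; do
  [[ $oldId && $newId ]] || continue
  # \b guards word boundaries so ID 100 never matches inside 1000.
  # Caveat: if a new ID equals another old ID, order the map to avoid chains.
  sed -i -E "s/\b${oldId}\b/${newId}/g" ~/acl-lists/*.acl
done < id-map.csv

# Keep only the diffs of files that actually changed.
for f in ~/acl-lists/*.acl; do
  diff -u "$f.orig" "$f" > "$f.diff" || true   # diff exits 1 on changes
  [[ -s $f.diff ]] || rm -f "$f.diff"          # drop empty diffs
done
</syntaxhighlight>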

=== script LDAP fix ===
* Convert the change tables to the LDIF file format
* Export the LDAP
* Apply the LDIF files, as sketched below
* Apply first with the list file of the test directory

=== script ACL change ===
* Change the ACLs according to the list file, as sketched below
* Have the client validate before applying in production
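
A sketch of applying both kinds of change; the server URI, bind DN, and file names are placeholders:
<syntaxhighlight lang="bash">
# Apply an LDIF of changes (prompts for the bind password).
ldapmodify -H ldap://ldap.example.com \
  -D "cn=admin,dc=example,dc=com" -W \
  -f changes.ldif

# Re-apply ACLs from a rewritten getfacl-style list file;
# try the test-directory list first, then the production lists.
setfacl --restore=ldap-test.acl
</syntaxhighlight>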

=== New LDAP audit ===
* Create the users and groups according to the specification
* Check that the old LDAP object classes are compatible with the new ones, and raise an alert otherwise
* Take a backup
* Study backup/restore, as sketched below
* Determine the format
* Determine the minimal information of the old LDAP
* Convert it to CSV format
* Have it validated
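
A possible backup/restore/diff round trip, assuming an OpenLDAP server (slapcat/slapadd) and hypothetical file names; note that slapadd expects the target database to be empty:
<syntaxhighlight lang="bash">
slapcat -n 1 -l backup-before.ldif      # export database 1 to LDIF

# Restore a modified copy of the backup (stop slapd while loading).
systemctl stop slapd
slapadd -n 1 -l backup-modified.ldif
systemctl start slapd

# A fresh export diffed against the previous one shows exactly what changed.
slapcat -n 1 -l backup-after.ldif
diff -u backup-before.ldif backup-after.ldif
</syntaxhighlight>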

=== script convert old/new LDAP ===
* Back up the new LDAP
* Modify a copy of the backup with the validated CSV file
* Restore the modified backup (see the backup/restore sketch above)
* Take a new backup
* Run a diff
* Connect to the LDAP interface and run some quick checks
* If OK, have the client validate

===AUTOMATED===
* Set variables:
<pre>
#export resultDiff=~/resultDiff
export filesList=""
</pre>
* Execute (the command below downloads this very page, extracts the text between the two #24cc42# markers, unescapes the HTML entities, and runs the result with bash):
<syntaxhighlight lang="bash">
mkdir -p ~/old &&\
curl https://infocepo.com/wiki/index.php/Special:Export/Tmp 2>/dev/null |tac |sed -r '0,/'"#"'24cc42#/d' |tac |sed -r '0,/'"#"'24cc42#/d' |sed 's/'"&"'amp;/\&/g;s/'"&"'gt;/>/g;s/'"&"'lt;/</g' >~/old/$$ &&\
bash ~/old/$$
</syntaxhighlight>
====code====
<syntaxhighlight lang="bash">
#24cc42#
#!/usr/bin/env bash
# diff-multi-optimized.sh — multi‑file analysis & diff
# https://github.com/ynotopec/diff-multi
#
# Changes vs. original:
#  * Added usage & error reporting helpers
#  * Added -o to choose output dir, -k to keep temp
#  * Uses $(mktemp -d) once & avoids copy when hard‑link suffices
#  * Parallel (pigz) decompression when available
#  * Faster unique‑word extraction with LC_ALL=C grep + sort -u
#  * Reduces tmp files, pipes, and subshells
#  * Strict globbing (nullglob) & safe defaults
#  * POSIX‑portable where feasible
#
set -euo pipefail
shopt -s nullglob

IFS=$'\n\t'
LC_ALL=C

usage() {
  cat <<EOF
Usage: ${0##*/} [-o DIR] [-k] [FILE...]
  -o DIR   write results in DIR (default: ./diff-out)
  -k       keep temporary working directory
  FILE...  list of files to analyse (default: all plain files in cwd)
EOF
}

err() { printf 'Error: %s\n' "$*" >&2; exit 1; }
# --- options ---------------------------------------------------------------
OUTDIR="./diff-out"
KEEP_TMP=false

while getopts ":o:kh" opt; do
  case $opt in
    o) OUTDIR=$OPTARG ;;
    k) KEEP_TMP=true ;;
    h) usage; exit 0 ;;
    *) usage; exit 1 ;;
  esac
done
shift $((OPTIND-1))

# --- working directories ----------------------------------------------------
TMP_ROOT=$(mktemp -d -t diffmulti.XXXXXXXX)
trap '[[ $KEEP_TMP == true ]] || rm -rf "$TMP_ROOT"' EXIT INT TERM

FILES_DIR="$TMP_ROOT/files"
CACHE_DIR="$TMP_ROOT/cache"
mkdir -p "$FILES_DIR" "$CACHE_DIR" "$OUTDIR"


=== Audit nouveau LDAP ===
# --- gather input files -----------------------------------------------------
* Créer des utilisateurs et groupes selon le cahier des charges
readarray -t INPUT_FILES < <(
  if [[ $# -gt 0 ]]; then printf '%s\n' "$@"
  else find . -maxdepth 1 -type f ! -name '.*' -print
  fi | sort -u
)

if [[ ${#INPUT_FILES[@]} -eq 0 ]]; then err "no files given"; fi
log() { printf '[%(%F %T)T] %s\n' -1 "$*"; }
log "Copying ${#INPUT_FILES[@]} file(s) to workspace"
# hard‑link instead of copy where possible
for f in "${INPUT_FILES[@]}"; do
  ln -f "$f" "$FILES_DIR/" 2>/dev/null || cp -p "$f" "$FILES_DIR/"
done

# --- decompress .gz ---------------------------------------------------------
gz_files=("$FILES_DIR"/*.gz)
if (( ${#gz_files[@]} )); then
  log "Decompressing ${#gz_files[@]} .gz file(s)"
  if command -v pigz >/dev/null; then
    pigz -d --keep --force "${gz_files[@]}"
  else
    gunzip --force "${gz_files[@]}"
  fi
fi
 
# --- unique words -----------------------------------------------------------
STAT_WORDS="$TMP_ROOT/statWords"
log "Extracting unique words"
grep -hoE '\b[[:alnum:]_]+\b' "$FILES_DIR"/* \
  | tr '[:upper:]' '[:lower:]' \
  | sort -u > "$STAT_WORDS"
 
mapfile -t uniq_words < "$STAT_WORDS"
trigger=$(( (${#uniq_words[@]} + 1) / 2 ))
log "Trigger for common‑line filtering: > $trigger occurrence(s)"
 
# --- optional variable substitution ----------------------------------------
if [[ -f "$TMP_ROOT/statWords.vars" ]]; then
  log "Applying variable patterns from statWords.vars"
  cp -aT "$FILES_DIR" "$CACHE_DIR"
  while read -r var; do
    [[ $var ]] || continue
    sed -i -E "s/\b$var\b/\${${var}My}/g" "$CACHE_DIR"/*
  done < "$TMP_ROOT/statWords.vars"
else
  cp -aT "$FILES_DIR" "$CACHE_DIR"
fi
 
# --- filter frequent common lines ------------------------------------------
log "Computing over‑represented lines"
sort "$CACHE_DIR"/* \
  | uniq -c \
  | awk -v t="$trigger" '$1 > t { sub(/^[[:space:]]+[0-9]+[[:space:]]+/,""); print }' \
  > "$TMP_ROOT/comm"
 
# --- generate cleaned diffs -------------------------------------------------
log "Generating diffs in $OUTDIR"
for f in "$CACHE_DIR"/*; do
  base=${f##*/}
  grep -Fvxf "$TMP_ROOT/comm" "$f" > "$OUTDIR/$base"
  chmod --reference="$f" "$OUTDIR/$base"
done
 
log "Finished 🎉  Results in $OUTDIR"
#24cc42#
</syntaxhighlight>
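
For a direct run without the wrapper above, assuming the script body was saved as diff-multi-optimized.sh (a hypothetical name taken from its header comment):
<syntaxhighlight lang="bash">
# Hypothetical usage: compare three node configs, keep the temp workspace,
# and write the cleaned per-file results into ./results
bash diff-multi-optimized.sh -k -o results node1.cfg node2.cfg node3.cfg
ls results/   # one cleaned file per input, over-represented lines removed
</syntaxhighlight>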
 
===Summary from ChatGPT===
This is a Bash wrapper followed by the analysis script it downloads. The wrapper at the top of the page performs the following actions:

#    Creates a directory called "old" in the home directory.
#    Downloads this page from "https://infocepo.com/wiki/index.php/Special:Export/Tmp", keeps only the text between the two #24cc42# markers, unescapes the HTML entities, and saves the result to a temporary file named after the current PID.
#    Runs the temporary file with bash.

The second part of the code is the analysis script, diff-multi-optimized.sh, which compares a set of files by stripping out the content they have in common. The script does the following:

#    Parses the -o (output directory) and -k (keep temporary files) options.
#    Creates a temporary workspace with mktemp, containing "files" and "cache" directories, and creates the output directory.
#    Hard-links (or, failing that, copies) the input files into the workspace and decompresses any ".gz" files, using pigz in parallel when available.
#    Extracts the sorted list of unique lowercase words from all the files and derives a threshold equal to half the number of unique words.
#    If a statWords.vars file is present in the workspace, rewrites each listed word into a ${wordMy} variable placeholder.
#    Counts every line across all the files and records the lines occurring more often than the threshold in a "comm" file.
#    Writes, for each input file, a copy with those over-represented common lines removed into the output directory, preserving file permissions.
#    Removes the temporary workspace on exit unless -k was given.
