Multidiff

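Multidiff is an algorithm for analysing the differences between several sources. It compares a set of files line by line, treats the lines found in more than half of the files as the common baseline, and reports for each file the lines that deviate from that baseline and the baseline lines it is missing.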

AUTOMATED INSTALL

  • Optional: set the directory to compare (defaults to the current directory):
export optMultidiff=<dir>
  • Run:
mkdir -p ~/old &&\
curl https://infocepo.com/wiki/index.php/Special:Export/Multidiff 2>/dev/null |tac |sed -r '0,/'"#"'24cc42#/d' |tac |sed -r '0,/'"#"'24cc42#/d' |sed 's/'"&"'amp;/\&/g;s/'"&"'gt;/>/g;s/'"&"'lt;/</g' >~/old/$$ &&\
bash ~/old/$$ $optMultidiff
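
The install command downloads the wiki export of this page, keeps only the lines between the first and last `#24cc42#` marker, decodes the HTML entities and runs the result. Below is a minimal sketch of the extraction step alone, which may help when inspecting the script before executing it. The function name `extract_marked_block` is hypothetical, and awk is swapped in here for the `tac`/`sed` pipeline.

```shell
#!/bin/bash
# Hypothetical helper (not part of dirDiff.sh): print the lines strictly
# between the first and the last line that contains the marker.
extract_marked_block() {
  local marker="$1" file="$2"
  # The file is read twice: the first pass records the positions of the
  # first and last marker lines, the second pass prints what lies between.
  awk -v m="$marker" '
    NR == FNR {
      if (index($0, m)) {
        if (!first) first = FNR
        last = FNR
      }
      next
    }
    FNR > first && FNR < last
  ' "$file" "$file"
}
```

For example, `extract_marked_block '#24cc42#' page.xml >dirDiff.sh` would write the script to a file for review, assuming `page.xml` (a hypothetical name) holds the downloaded export. The entity decoding done by the last `sed` of the install command would still be needed afterwards.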

code dirDiff.sh

#24cc42#
#!/bin/bash
# diff "<directory>"
# APA 20180712
#ynotopec at gmail.com
#https://github.com/ynotopec/diff-multi

# accept at most one argument: the directory to analyse
[ $# -gt 1 ] &&exit 1
[ $# -eq 1 ] &&dirDiff=$1

# initialise variables
baseDir="$(realpath "$(dirname "$0")"/..)"
cacheFile=/tmp/"$(basename "$0")"$$

########### common algorithm
# remove work directories left by previous runs
rm -rf /tmp/analyse*
[ -z "${dirDiff}" ] &&dirDiff=.
filesList="$(ls -1d "${dirDiff}"/*)"

# work dirs
mkdir -p /tmp/analyse$$/files /tmp/analyse$$/diff

# cp files and unzip
echo "${filesList}" |while read fileName ;do
  cp -p "${fileName}" /tmp/analyse$$/files/.
done
gunzip /tmp/analyse$$/files/*.gz 2>/dev/null
filesList="$(ls -1d /tmp/analyse$$/files/*)"

## find words
# stat words
echo "${filesList}" |while read fileName ;do
  cat "${fileName}" \
    |tr -c "[:alnum:]_" "[\n*]" |grep -v "^\s*$" |sort -u
done \
  >/tmp/analyse$$/statWords

# majority threshold: half the number of files (integer division)
triggerValue=$(($(ls -1d /tmp/analyse$$/files/* |wc -l) / 2))

# keep vars words (disabled: statWords.vars is only created when the next line is uncommented)
#awk 'NR == FNR {count[$0]++; next}; count[$0] <= '"${triggerValue}" /tmp/analyse$$/statWords /tmp/analyse$$/statWords |sort -u >/tmp/analyse$$/statWords.vars

# replace vars (skipped while statWords.vars does not exist)
cp -a /tmp/analyse$$/files /tmp/analyse$$/files.cache
[ -f /tmp/analyse$$/statWords.vars ] &&cat /tmp/analyse$$/statWords.vars |while read lineMy ;do sed -i "s#\b${lineMy}\b#\${varMy}#g" /tmp/analyse$$/files.cache/* ;done 2>/dev/null

# comm = /tmp/analyse$$/comm
ls -1d /tmp/analyse$$/files.cache/* |while read fileName ;do
  cat "${fileName}" \
    |awk '!seen[$0]++'
done \
  >/tmp/analyse$$/comm

awk 'NR == FNR {count[$0]++; next}; count[$0] > '"${triggerValue}" /tmp/analyse$$/comm /tmp/analyse$$/comm \
  |awk '!seen[$0]++' >/tmp/analyse$$/comm2
mv -f /tmp/analyse$$/comm2 /tmp/analyse$$/comm

# diff = /tmp/analyse$$/diff/
ls -1d /tmp/analyse$$/files.cache/* |while read fileName ;do
  ( echo "== $(basename "${fileName}") =="
    cat "${fileName}"
    echo "=== missing ==="
    cat /tmp/analyse$$/comm
  ) >/tmp/analyse$$/tmp

  awk 'NR == FNR {count[$0]++; next}; count[$0] == 1' /tmp/analyse$$/tmp /tmp/analyse$$/tmp \
    |tee /tmp/analyse$$/diff/"$(basename "${fileName}")"
done

# clean up the cache
rm -f /tmp/"$(basename $0)"$$*
#24cc42#
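
The step that builds `/tmp/analyse$$/comm` can be read in isolation: each file contributes every distinct line once, and a line is kept when it occurs in more than half of the files. The following is a minimal sketch of that idea, not the script's exact code. The function name `majority_lines` is hypothetical, and a single awk pass with an `END` block replaces the script's two-pass `NR == FNR` idiom.

```shell
#!/bin/bash
# Hypothetical helper (not part of dirDiff.sh): print, in first-seen order,
# the lines that occur in more than half of the regular files in a directory.
majority_lines() {
  local dir="$1" total=0 threshold f
  for f in "$dir"/*; do
    if [ -f "$f" ]; then total=$((total + 1)); fi
  done
  # Same integer division as triggerValue in the script.
  threshold=$((total / 2))
  for f in "$dir"/*; do
    # Deduplicate inside each file so a line counts once per file.
    if [ -f "$f" ]; then awk '!seen[$0]++' "$f"; fi
  done | awk -v t="$threshold" '
    !($0 in count) { order[++n] = $0 }
    { count[$0]++ }
    END { for (i = 1; i <= n; i++) if (count[order[i]] > t) print order[i] }
  '
}
```

With three files, the threshold is 1, so a line has to appear in at least two of them to be treated as common.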

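Example

Input files (only a1 to a6 and b1 to b6 are listed; a7, a8, b7, b8 and c1 to c6 appear in the result but their contents are not shown):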

== a1 ==

I made this
code to
find the
drifY
between
multiple
systems

== a2 ==

I made this
code to
find the
orift
between
multiple
systems

== a3 ==

I made this
yode to
find the
drift
between
multiple
systems

== a4 ==

I made this
codetto
find the
drift
between
multiple
systems

== a5 ==

I made this
code to
find the
drift
between
mbltiple
systems

== a6 ==

I made this
code to
find the
dBift
between
multiple
systems

== b1 ==

I made this
code to
find the
drift
betqeen
multiple
systems

== b2 ==

I made this
code to
find tXe
drift
between
multiple
systems

== b3 ==

I made this
code to
find the
drift
between
mustiple
systems

== b4 ==

I made this
code to
find the
drift
between
mulpiple
systems

== b5 ==

I made Ihis
code to
find the
drift
between
multiple
systems

== b6 ==

I made this
code to
find the
drift
between
mEltiple
systems

Result

== a1 ==
drifY
=== missing ===
drift
== a2 ==
orift
=== missing ===
drift
== a3 ==
yode to
=== missing ===
code to
== a4 ==
codetto
=== missing ===
code to
== a5 ==
mbltiple
=== missing ===
multiple
== a6 ==
dBift
=== missing ===
drift
== a7 ==
code 5o
=== missing ===
code to
== a8 ==
I made thiY
=== missing ===
I made this
== b1 ==
betqeen
=== missing ===
between
== b2 ==
find tXe
=== missing ===
find the
== b3 ==
mustiple
=== missing ===
multiple
== b4 ==
mulpiple
=== missing ===
multiple
== b5 ==
I made Ihis
=== missing ===
I made this
== b6 ==
mEltiple
=== missing ===
multiple
== b7 ==
betfeen
=== missing ===
between
== b8 ==
systemI
=== missing ===
systems
== c1 ==
fiZd the
=== missing ===
find the
== c2 ==
I mSde this
=== missing ===
I made this
== c3 ==
mu5tiple
=== missing ===
multiple
== c4 ==
I maAe this
=== missing ===
I made this
== c5 ==
cJde to
=== missing ===
code to
== c6 ==
Qystems
=== missing ===
systems
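
Each result block comes from the same trick: the file and the common lines are concatenated with two header lines, and only the lines that occur exactly once in that concatenation are printed. Lines above `=== missing ===` therefore exist only in the file, and lines below it exist only in the common baseline. The following is a minimal sketch of that step for a single file, assuming the common lines are already in a file of their own. The function name `drift_report` is hypothetical, and it uses a pipe where the script uses a temporary file.

```shell
#!/bin/bash
# Hypothetical helper (not part of dirDiff.sh): report the lines that are
# specific to one file and the common lines that the file is missing.
drift_report() {
  local file="$1" common="$2"
  {
    echo "== $(basename "$file") =="
    cat "$file"
    echo "=== missing ==="
    cat "$common"
  } | awk '
    # Remember every line in order and count how often each one occurs.
    { line[NR] = $0; count[$0]++ }
    # Print only the lines seen exactly once, keeping the original order.
    END { for (i = 1; i <= NR; i++) if (count[line[i]] == 1) print line[i] }
  '
}
```

Called on the a1 sample with the seven common lines, this prints the four lines shown for a1 in the result. As in the script, a line that is repeated inside the file itself is counted more than once and is therefore not reported.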