Multidiff

Latest revision as of 22:18, 30 March 2023

Multidiff is an algorithm for analysing the differences between multiple sources.

AUTOMATED INSTALL

  • Optional: set the directory to analyse (defaults to the current directory):
export optMultidiff=<dir>
  • Run:
mkdir -p ~/old &&\
curl https://infocepo.com/wiki/index.php/Special:Export/Multidiff 2>/dev/null |tac |sed -r '0,/'"#"'24cc42#/d' |tac |sed -r '0,/'"#"'24cc42#/d' |sed 's/'"&"'amp;/\&/g;s/'"&"'gt;/>/g;s/'"&"'lt;/</g' >~/old/$$ &&\
bash ~/old/$$ $optMultidiff
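
The install command above keeps only the text between the two #24cc42# marker lines of the exported page and then undoes the XML escaping. The following is a minimal sketch of that extraction on a local sample instead of the live page. The helper names extract_between_markers and decode_entities and the sample text are illustrative assumptions, and the sketch assumes GNU tac and GNU sed (the 0,/regexp/ address is a GNU extension). Unlike the original pipeline, it decodes &amp;amp; last, which is a deliberate variation.

```shell
#!/bin/bash
# Hypothetical helpers, not part of dirDiff.sh. Requires GNU tac and GNU sed.
marker='#24cc42#'

extract_between_markers() {
  # Reverse the input, drop everything through the last marker,
  # reverse again, then drop everything through the first marker.
  tac | sed -r "0,/${marker}/d" | tac | sed -r "0,/${marker}/d"
}

decode_entities() {
  # Decode &amp; last so that a doubly escaped "&amp;lt;" stays a literal "&lt;".
  sed 's/&gt;/>/g;s/&lt;/</g;s/&amp;/\&/g'
}

# Made-up stand-in for the exported page.
sample='<text>intro
#24cc42#
[ $# -gt 1 ] &amp;&amp;exit
echo a &gt;out
#24cc42#
footer</text>'

extracted="$(printf '%s\n' "$sample" | extract_between_markers | decode_entities)"
printf '%s\n' "$extracted"
```

Only the two lines between the markers survive, with `&&` and `>` restored.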

code dirDiff.sh

#24cc42#
#!/bin/bash
# diff "<directory>"
# APA 20180712
#ynotopec at gmail.com
#https://github.com/ynotopec/diff-multi

[ $# -gt 1 ] &&exit
[ $# -eq 1 ] &&dirDiff=$1

# variable initialisation
baseDir="$(realpath "$(dirname $0)"/..)"
cacheFile=/tmp/"$(basename $0)"$$

########### common algorithm
rm -rf /tmp/analyse*
[ -z "${dirDiff}" ] &&dirDiff=.
filesList="$(ls -1d "${dirDiff}"/*)"

# work dirs
mkdir -p /tmp/analyse$$/files /tmp/analyse$$/diff

# cp files and unzip
echo "${filesList}" |while read fileName ;do
  cp -p "${fileName}" /tmp/analyse$$/files/.
done
gunzip /tmp/analyse$$/files/*.gz 2>/dev/null
filesList="$(ls -1d /tmp/analyse$$/files/*)"

## find words
# stat words
echo "${filesList}" |while read fileName ;do
  cat "${fileName}" \
    |tr -c "[:alnum:]_" "[\n*]" |grep -v "^\s*$" |sort -u
done \
  >/tmp/analyse$$/statWords

triggerValue=$(($(ls -1d /tmp/analyse$$/files/* |wc -l) / 2))

# keep vars words (disabled: statWords.vars is then never created)
#awk 'NR == FNR {count[$0]++; next}; count[$0] <= '"${triggerValue}" /tmp/analyse$$/statWords /tmp/analyse$$/statWords |sort -u >/tmp/analyse$$/statWords.vars

# replace vars (skipped when statWords.vars does not exist)
cp -a /tmp/analyse$$/files /tmp/analyse$$/files.cache
[ -f /tmp/analyse$$/statWords.vars ] &&cat /tmp/analyse$$/statWords.vars |while read lineMy ;do sed -i "s#\b${lineMy}\b#\${varMy}#g" /tmp/analyse$$/files.cache/* ;done 2>/dev/null

# comm = /tmp/analyse$$/comm
ls -1d /tmp/analyse$$/files.cache/* |while read fileName ;do
  cat "${fileName}" \
    |awk '!seen[$0]++'
done \
  >/tmp/analyse$$/comm

awk 'NR == FNR {count[$0]++; next}; count[$0] > '"${triggerValue}" /tmp/analyse$$/comm /tmp/analyse$$/comm \
  |awk '!seen[$0]++' >/tmp/analyse$$/comm2
mv -f /tmp/analyse$$/comm2 /tmp/analyse$$/comm

# diff = /tmp/analyse$$/diff/
ls -1d /tmp/analyse$$/files.cache/* |while read fileName ;do
  ( echo "== $(basename "${fileName}") =="
    cat "${fileName}"
    echo "=== missing ==="
    cat /tmp/analyse$$/comm
  ) >/tmp/analyse$$/tmp

  awk 'NR == FNR {count[$0]++; next}; count[$0] == 1' /tmp/analyse$$/tmp /tmp/analyse$$/tmp \
    |tee /tmp/analyse$$/diff/"$(basename "${fileName}")"
done

# free the cache
rm -f /tmp/"$(basename $0)"$$*
#24cc42#
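
The central step of the script is the "comm" file: the set of lines found in more than half of the input files. The sketch below illustrates that majority rule in isolation. The function name majority_lines and the three sample files are assumptions made for this illustration, and it counts in a single awk pass with an END block rather than reading the file twice as the script does.

```shell
#!/bin/bash
# majority_lines is a hypothetical helper, not part of dirDiff.sh.
majority_lines() {
  # Usage: majority_lines FILE...
  # Same threshold as the script: integer half of the number of files.
  threshold=$(( $# / 2 ))
  for f in "$@"; do
    awk '!seen[$0]++' "$f"        # count each line at most once per file
  done |
    awk -v t="$threshold" '
      { if (!($0 in count)) line[++n] = $0; count[$0]++ }
      END { for (i = 1; i <= n; i++) if (count[line[i]] > t) print line[i] }'
}

# Made-up sample: three files, each with at most one drifting line.
workDir="$(mktemp -d)"
printf '%s\n' 'code to' 'find the' 'drift' >"$workDir/a1"
printf '%s\n' 'code to' 'find the' 'drifY' >"$workDir/a2"
printf '%s\n' 'yode to' 'find the' 'drift' >"$workDir/a3"

common="$(majority_lines "$workDir/a1" "$workDir/a2" "$workDir/a3")"
printf '%s\n' "$common"
rm -rf "$workDir"
```

With three files the threshold is 1, so a line must occur in at least two files to be kept. The drifting lines `drifY` and `yode to` occur once each and are excluded.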

Example

Only the input files a1 to a6 and b1 to b6 are listed here; the Result section also reports a7, a8, b7, b8 and c1 to c6, whose contents are not shown.

== a1 ==

I made this
code to
find the
drifY
between
multiple
systems

== a2 ==

I made this
code to
find the
orift
between
multiple
systems

== a3 ==

I made this
yode to
find the
drift
between
multiple
systems

== a4 ==

I made this
codetto
find the
drift
between
multiple
systems

== a5 ==

I made this
code to
find the
drift
between
mbltiple
systems

== a6 ==

I made this
code to
find the
dBift
between
multiple
systems

== b1 ==

I made this
code to
find the
drift
betqeen
multiple
systems

== b2 ==

I made this
code to
find tXe
drift
between
multiple
systems

== b3 ==

I made this
code to
find the
drift
between
mustiple
systems

== b4 ==

I made this
code to
find the
drift
between
mulpiple
systems

== b5 ==

I made Ihis
code to
find the
drift
between
multiple
systems

== b6 ==

I made this
code to
find the
drift
between
mEltiple
systems

Result

== a1 ==
drifY
=== missing ===
drift
== a2 ==
orift
=== missing ===
drift
== a3 ==
yode to
=== missing ===
code to
== a4 ==
codetto
=== missing ===
code to
== a5 ==
mbltiple
=== missing ===
multiple
== a6 ==
dBift
=== missing ===
drift
== a7 ==
code 5o
=== missing ===
code to
== a8 ==
I made thiY
=== missing ===
I made this
== b1 ==
betqeen
=== missing ===
between
== b2 ==
find tXe
=== missing ===
find the
== b3 ==
mustiple
=== missing ===
multiple
== b4 ==
mulpiple
=== missing ===
multiple
== b5 ==
I made Ihis
=== missing ===
I made this
== b6 ==
mEltiple
=== missing ===
multiple
== b7 ==
betfeen
=== missing ===
between
== b8 ==
systemI
=== missing ===
systems
== c1 ==
fiZd the
=== missing ===
find the
== c2 ==
I mSde this
=== missing ===
I made this
== c3 ==
mu5tiple
=== missing ===
multiple
== c4 ==
I maAe this
=== missing ===
I made this
== c5 ==
cJde to
=== missing ===
code to
== c6 ==
Qystems
=== missing ===
systems
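
Each block of the Result listing can be read as a symmetric difference between one file and the majority lines: what precedes "=== missing ===" exists only in that file, and what follows it is expected but absent. The sketch below reproduces the a1 block under that reading. The helper name report_drift is an assumption, and the common lines are written out by hand here rather than computed from all inputs.

```shell
#!/bin/bash
# report_drift is a hypothetical helper, not part of dirDiff.sh.
report_drift() {
  # Usage: report_drift FILE COMMON_FILE
  # Concatenate header, file, separator and common lines, then keep
  # only the lines that occur exactly once in that combined stream.
  {
    echo "== $(basename "$1") =="
    cat "$1"
    echo "=== missing ==="
    cat "$2"
  } | awk '{ line[NR] = $0; count[$0]++ }
           END { for (i = 1; i <= NR; i++) if (count[line[i]] == 1) print line[i] }'
}

workDir="$(mktemp -d)"
# Contents of a1 as listed in the Example section.
printf '%s\n' 'I made this' 'code to' 'find the' 'drifY' 'between' 'multiple' 'systems' >"$workDir/a1"
# Majority lines, written by hand for this illustration.
printf '%s\n' 'I made this' 'code to' 'find the' 'drift' 'between' 'multiple' 'systems' >"$workDir/comm"

report="$(report_drift "$workDir/a1" "$workDir/comm")"
printf '%s\n' "$report"
rm -rf "$workDir"
```

One consequence of counting occurrences this way is that a line repeated inside a single file has a count above one and is therefore never reported, even if it is absent from the common set.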