
hashlab-0.4-1
--------------

---

Welcome to 'hashlab', your one-stop shop for testing hashing routines. This
program exists because, while trying to pick the best all-in-one hashing
algorithm, I kept running into the belief that whatever routine you choose
will be good enough. That sounded like nonsense to me, so I decided to
investigate. The test tools I found on the net were really poor: they judge
the usefulness of a routine by hashing just a few lines of text. I could
not believe my eyes... In those tests every routine is the best and, of
course, has no or very few collisions. At first I thought that might even
be true. Then I wrote a quick and dirty hash evaluator, and my jaw hit the
floor: the supposedly very best hash routines scored 666+ collisions after
a few seconds of runtime on randomly picked text files...

After testing a few simple algorithms I quickly realised that prime numbers
rule and non-prime numbers do not! No matter how nifty your strategy is,
without primes it all sucks :-] .

It turns out that the FNV (32 and 64 bit) and BKDR (32 bit only) routines
are the best in this package. Is there anything else that is better and
simpler? I do not think so, but I might be wrong. Anyway, I am happy with
FNV.
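
For readers who have not met these two routines, here is a minimal Python
sketch of the textbook FNV-1a and BKDR algorithms. It is not the code of
the .hl modules: the choice of the FNV-1a variant, the BKDR seed of 131 and
all function names are assumptions made for illustration. What it does
show is the point about primes: both routines are little more than a
multiplication by a prime per input byte.

```python
# Sketch only: textbook FNV-1a and BKDR, not the code of the .hl modules.
FNV32_OFFSET = 0x811C9DC5
FNV32_PRIME = 0x01000193            # 16777619, a prime
FNV64_OFFSET = 0xCBF29CE484222325
FNV64_PRIME = 0x00000100000001B3    # 1099511628211, a prime


def fnv1a(data: bytes, bits: int = 32) -> int:
    """FNV-1a: XOR each byte in, then multiply by the FNV prime."""
    if bits == 32:
        h, prime = FNV32_OFFSET, FNV32_PRIME
    elif bits == 64:
        h, prime = FNV64_OFFSET, FNV64_PRIME
    else:
        raise ValueError("only the 32 and 64 bit variants are sketched")
    mask = (1 << bits) - 1
    for byte in data:
        h = ((h ^ byte) * prime) & mask
    return h


def bkdr32(data: bytes, seed: int = 131) -> int:
    """BKDR: multiply by a small prime seed, then add the byte."""
    h = 0
    for byte in data:
        h = (h * seed + byte) & 0xFFFFFFFF
    return h


if __name__ == "__main__":
    print(hex(fnv1a(b"hashlab", 32)))
    print(hex(fnv1a(b"hashlab", 64)))
    print(hex(bkdr32(b"hashlab")))
```

The masks keep the values at 32 or 64 bits, which is what fixed-width
registers do for free on real hardware.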

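The table below gives a quick overview of all routines in this package. The
columns are: module name, dictionary name, dictionary lines, hashing time,
collisions found and collision detection time. A collision count of 666 is
the default MAXCOLLS limit, at which a test gives up: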

---------------------------------------------------------------------------
|  mod. name  |  dict. name  |   d. lines | h. time |  colls |  det. time |
---------------------------------------------------------------------------
| d_ugly32.hl | d1ct_all.txt |     102305 |    7.42 |    666 | 0:00:00:00 |
| quick32i.hl | d1ct_all.txt |     102305 |    7.72 |     48 | 0:00:01:27 |
| d_sfh32i.hl | d1ct_all.txt |     102305 |    7.78 |     13 | 0:00:01:32 |
| _fnv64ai.hl | d1ct_all.txt |     102305 |    8.88 |      0 | 0:00:01:18 |
| d_pjw64i.hl | d1ct_all.txt |     102305 |    7.78 |      0 | 0:00:01:14 |
| od_dummy.hl | d1ct_all.txt |     102305 |    7.02 |      0 | 0:00:01:26 |
| mod_rs32.hl | d1ct_all.txt |     102305 |    7.36 |      1 | 0:00:01:27 |
| mod_js32.hl | d1ct_all.txt |     102305 |    7.30 |    666 | 0:00:00:21 |
| od_elf32.hl | d1ct_all.txt |     102305 |    7.48 |    666 | 0:00:00:10 |
| d_bkdr32.hl | d1ct_all.txt |     102305 |    7.22 |      0 | 0:00:01:27 |
| d_sdbm32.hl | d1ct_all.txt |     102305 |    7.34 |      2 | 0:00:01:27 |
| od_djb32.hl | d1ct_all.txt |     102305 |    7.22 |    222 | 0:00:01:28 |
| od_dek32.hl | d1ct_all.txt |     102305 |    7.40 |    515 | 0:00:01:29 |
| mod_bp32.hl | d1ct_all.txt |     102305 |    7.16 |    666 | 0:00:00:00 |
| od_fnv32.hl | d1ct_all.txt |     102305 |    7.24 |      0 | 0:00:01:29 |
| mod_ap32.hl | d1ct_all.txt |     102305 |    7.72 |     12 | 0:00:01:27 |
| od_pjw32.hl | d1ct_all.txt |     102305 |    7.44 |    666 | 0:00:00:10 |
---------------------------------------------------------------------------
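
The collision counts above come from hashing every dictionary line and
counting the lines whose hash value was already produced by a different
line. A rough Python sketch of that idea follows, assuming a duplicate-free
dictionary. The function names, the deliberately weak byte-sum hash and the
in-memory dict are illustrative assumptions, not how 'hashlab' is built.

```python
def byte_sum(data: bytes) -> int:
    """Deliberately weak hash: any reordering of the bytes collides."""
    return sum(data) & 0xFFFFFFFF


def fnv1a_32(data: bytes) -> int:
    """Textbook 32-bit FNV-1a, used here as the 'strong' candidate."""
    h = 0x811C9DC5
    for byte in data:
        h = ((h ^ byte) * 0x01000193) & 0xFFFFFFFF
    return h


def count_collisions(lines, hash_func, max_colls=666):
    """Count lines whose hash was already produced by a different line.

    Stops early once max_colls collisions have been caught.
    """
    seen = {}                      # hash value -> first line that made it
    collisions = 0
    for line in lines:
        first = seen.setdefault(hash_func(line), line)
        if first != line:
            collisions += 1
            if collisions >= max_colls:
                break              # routine considered useless, give up
    return collisions


if __name__ == "__main__":
    words = [b"ab", b"ba", b"ac", b"ca"]
    print("byte_sum:", count_collisions(words, byte_sum))
    print("fnv1a_32:", count_collisions(words, fnv1a_32))
```

A toy word list like this one is exactly the kind of test the introduction
warns about, so feed it a real dictionary before drawing conclusions.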


 -           Beware! This program is considered experimental!            -

---

NEWS:

   [19-Aug-2014]   0.4-1 * Created  FNV128-1a case insensitive hash module.



   [15-Jan-2013]   0.4   * Twin-hash mode has been  added to observe if two
                           weak routines can act strong when merged.



   [19-Jul-2011]   0.3   * Memory  consumption  was reduced  greatly in the
                           area of literal data.



   [10-Feb-2011]   0.2   * Network  support  added.  Now  we  are  talking!

---

NOTES:

[*]
Requires 68020 (no FPU)+, OS2.04 (theoretically)+, LOTS of megs of free
memory, bsdsocket.library 3+ (if remote labs are needed) and lots of time,
depending on CPU power.

[*]
Text files must contain lines of reasonable length, and the last line must
be newline terminated!

[*]
Really! Do not waste your time testing routines with 'hashlab' alone if you
have high-powered machines around. Just compile 'hashlabd' for those
platforms and use them over the network! On the other hand, when you know
that a certain routine is weak, using 7 processes may shorten the test even
on a single CPU, but that seems pointless anyway :-) .

[*]
If you know that your platform supports multi-core CPUs and 'pthread' is
able to use them, pass the IP address of that remote lab as many times as
there are CPU cores. The same applies to the number of CPUs!

[*]
Read the comment block in the source code for more geeky details, or study
the code if you dare... You will also have to read this and that in order
to create new modules.

[*]
In this package you will find a few dictionaries that can be merged into
one big dictionary with the 't_dictmerge' ARexx script.

[*]
Visit ftp://ftp.openwall.com/pub/passwords/wordlists/all.gz for a nice 50+
meg dictionary. You can also create your own by making a listing of your
hard disk or CDs and then passing it through a duplicate line remover and
maybe a tokenizer.

[*]
The new thing is the so-called "twinhash" mode, which merges the results of
two hash routines. This may be useful when you need to examine a certain
dataset to see if one routine can support the other when a collision
happens. I can tell you that this works and is much better than dealing
with collisions when they occur. Combining two different philosophies,
zero-based and non-zero-based, should be really effective!

[*]
Short permutations, say 3 chars long, are often a good indication of hash
routine strength, so you should start with these when testing your own
idea.
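
The last two notes can be tried out together. The Python sketch below
builds every 3-letter lowercase string and compares two deliberately weak
32-bit routines on their own and merged twinhash style into one 64-bit
key. Both toy routines and all names are assumptions made for this
example. They are not the ELF/JS modules from the package, and the way
'hashlab' merges results internally may differ.

```python
from itertools import product
from string import ascii_lowercase


def shift_add(data: bytes) -> int:
    """Weak hash: shift left by 4 bits, then add the byte (no prime)."""
    h = 0
    for byte in data:
        h = ((h << 4) + byte) & 0xFFFFFFFF
    return h


def byte_sum(data: bytes) -> int:
    """Weak hash: plain sum of the bytes, blind to their order."""
    return sum(data) & 0xFFFFFFFF


def twin(first, second):
    """Merge two 32-bit routines into one 64-bit key."""
    def combined(data: bytes) -> int:
        return (first(data) << 32) | second(data)
    return combined


def collisions(lines, hash_func) -> int:
    """Unique input lines minus distinct hash values."""
    return len(lines) - len({hash_func(line) for line in lines})


# All 26 * 26 * 26 three-letter strings, each one unique.
WORDS = ["".join(p).encode("ascii") for p in product(ascii_lowercase, repeat=3)]

if __name__ == "__main__":
    print("shift_add:", collisions(WORDS, shift_add))
    print("byte_sum :", collisions(WORDS, byte_sum))
    print("twin     :", collisions(WORDS, twin(shift_add, byte_sum)))
```

The merged key can only collide when both routines collide on the same
pair of lines, so its count can never be higher than that of either
routine alone.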

---

HELP:

   > hashlab ?

     IF=INPUTFILE/A,HM=HASHMODULE/A,BL=BUFFERLEN/N,TC=TASKCOUNT/N,
     SC=STARTCELL/N,IR=ITERARATE/N,NE=NETENTRY/K,MC=MAXCOLLS/N,
     SH=SHOWCOLLS/S,32=32BITHASH/S,64=64BITHASH/S,96=96BITHASH/S,
     LO=LOADONLY/S,NO=NOSTATUS/S



   IF=INPUTFILE/A   -  Text file of any kind. It is important that this
                       file has newline separated lines. By default the
                       maximum line length allowed is 128 bytes!

   HM=HASHMODULE/A  -  Hashing routine to be put to the test. These are
                       plain 'LoadSeg()' objects with a special header and
                       internals. You can pass two such routines here by
                       separating them with a comma (','). This activates
                       twinhash mode.

   BL=BUFFERLEN/N   -  Maximum size of one line. In early versions this
                       mattered; now lines are allocated dynamically, so
                       you can set it to a very big value without causing
                       enormous memory consumption.

   TC=TASKCOUNT/N   -  Number of subprocesses/remote labs involved. This is
                       mainly useful with 'hashlabd', but might also speed
                       up processing under emulation. It certainly helps if
                       your remote labs run on multi-core CPUs. A maximum
                       of 7 subprocesses can be used.

   SC=STARTCELL/N   -  Start processing at this line/cell. This should only
                       be used with one process, since subprocesses are fed
                       with dynamic ranges.

   IR=ITERARATE/N   -  When this is specified (as a number of loops), a
                       speed test of the given hashing routine is
                       performed. You should pass at least 32000 loops.

   NE=NETENTRY/K    -  Network support! Do not be afraid, uncle megacz made
                       sure it is easy. You can either pass the lab IP
                       addresses separated by commas or use the broadcast
                       address so the labs are located automagically. You
                       can even write .255 as shorthand for
                       255.255.255.255. On wireless networks the broadcast
                       address must be narrowed to a particular network
                       (192.168.255.255 or 10.255.255.255, ...)!

   MC=MAXCOLLS/N    -  How many collisions to catch before considering this
                       algorithm useless? By default this is 666, evil
                       muahahaha!

   SH=SHOWCOLLS/S   -  Show collisions  as they happen. You can toggle that
                       with 'CTRL + F' keys.

   32=32BITHASH/S   -  Force  32bit hashing. The rest of the 128bit virtual
                       datatype will be zero.

   64=64BITHASH/S   -  Force  64bit hashing. The rest of the 128bit virtual
                       datatype will be zero.

   96=96BITHASH/S   -  Force  96bit hashing. The rest of the 128bit virtual
                       datatype will be zero.

   LO=LOADONLY/S    -  Only load and hash the data, then quit. This can be
                       used to measure the time and such.

   NO=NOSTATUS/S    -  Do  not display status line during test. Useful when
                       redirecting the output.
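
The 32/64/96 switches above all describe the same thing: a 128-bit virtual
value of which only part is used while the rest stays zero. In Python this
can be pictured as a simple mask. Note that keeping the LOW bits is an
assumption of this sketch (the text does not say which end survives), and
'force_width' is a made-up name.

```python
ALLOWED_WIDTHS = (32, 64, 96, 128)


def force_width(value128: int, bits: int) -> int:
    """Keep the low `bits` bits of a 128-bit value and zero the rest.

    Assumption: the low end is kept. The README only states that the
    remainder of the 128-bit virtual datatype will be zero.
    """
    if bits not in ALLOWED_WIDTHS:
        raise ValueError("width must be one of 32, 64, 96 or 128")
    return value128 & ((1 << bits) - 1)


if __name__ == "__main__":
    full = (1 << 128) - 1          # all 128 bits set
    for width in ALLOWED_WIDTHS:
        print(width, hex(force_width(full, width)))
```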

---

USAGE:

   ; Check on DJB routine using up to 2 remote labs available
   ; in your LAN
   hashlab dict_step.txt mod_djb32.hl tc 2 ne .255

   ; Turn on twin-hash mode using the two weakest 32bit routines
   ; (this proves the concept, so two strong routines can in
   ; theory be just unbeatable at only 64 bits)
   hashlab dict_perm.txt mod_elf32.hl,mod_js32.hl mc 1
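
The 'ne .255' argument in the first example uses the shorthand described
under NETENTRY. The Python sketch below shows how such a value could be
expanded into a list of lab addresses. It only mirrors the syntax
documented above and is not the parser inside 'hashlab'; the function name
and the validation step are assumptions.

```python
import ipaddress

BROADCAST_SHORTHAND = ".255"
BROADCAST_ADDRESS = "255.255.255.255"


def expand_net_entry(value: str):
    """Split a comma separated NETENTRY value into IPv4 address strings.

    The '.255' shorthand becomes the full broadcast address. Repeated
    addresses are kept, since one lab may be listed once per CPU core.
    """
    labs = []
    for item in value.split(","):
        item = item.strip()
        if item == BROADCAST_SHORTHAND:
            item = BROADCAST_ADDRESS
        ipaddress.IPv4Address(item)    # raises ValueError if malformed
        labs.append(item)
    return labs


if __name__ == "__main__":
    print(expand_net_entry(".255"))
    print(expand_net_entry("192.168.1.10,192.168.1.10,192.168.1.11"))
```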

---
megacz
    


Copyright (C) 2013-2014 by Burnt Chip Dominators