|
| | | | H A S H L A B
| | | | |
hashlab-0.4-1
--------------
---
Welcome to 'hashlab', your one-stop shop for testing hashing routines. The
reason this program exists: when I was trying to determine the best
all-in-one hashing algorithm, I ran into the belief that no matter which
routine you choose, it will be good enough. I found this rather hard to
swallow and decided to investigate. On the net I found some really poor
test tools which judge the usefulness of a routine just by hashing a few
lines of text. I could not believe my eyes... In those tests every routine
is the best and of course has no or a very low collision rate... At first
I thought, oh well, maybe that is true. Can you imagine the look on my
face when I actually wrote a quick and dirty hash evaluator? My jaw hit
the floor after I saw the very-best hash routines scoring 666+ collisions
after a few seconds of runtime on randomly picked text files...
After testing a few simple algorithms I quickly realised that prime
numbers rule and non-prime numbers do not! No matter how nifty your
strategy is, without primes it all sucks :-] .
It turns out that the FNV (32 & 64 bit) and BKDR (32 bit only) routines
are the best of all in this package. Is there anything else that is better
and simpler? I don't think so, but I might be wrong. Anyway, I'm happy
with FNV.
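For reference, here is a minimal C sketch of the two winners, assuming the
standard published algorithms (FNV-1a with the usual 32-bit offset basis
and prime, BKDR with the common seed 131); the actual '.hl' modules may
differ in header and internals:

```c
#include <stdint.h>

/* FNV-1a, 32-bit: XOR each byte into the hash, then multiply by
 * the FNV prime.  Constants are the standard published ones. */
uint32_t fnv1a32(const char *s)
{
    uint32_t h = 2166136261u;          /* FNV offset basis */
    while (*s) {
        h ^= (uint8_t)*s++;
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* BKDR, 32-bit: classic multiply-and-add with seed 131. */
uint32_t bkdr32(const char *s)
{
    uint32_t h = 0;
    while (*s)
        h = h * 131u + (uint8_t)*s++;
    return h;
}
```

Note how both lean on prime multipliers (16777619 and 131), which fits
the observation about primes above.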
Below is a table that gives a quick overview of all routines in this package:
---------------------------------------------------------------------------
| mod. name | dict. name | d. lines | h. time | colls | det. time |
---------------------------------------------------------------------------
| d_ugly32.hl | d1ct_all.txt | 102305 | 7.42 | 666 | 0:00:00:00 |
| quick32i.hl | d1ct_all.txt | 102305 | 7.72 | 48 | 0:00:01:27 |
| d_sfh32i.hl | d1ct_all.txt | 102305 | 7.78 | 13 | 0:00:01:32 |
| _fnv64ai.hl | d1ct_all.txt | 102305 | 8.88 | 0 | 0:00:01:18 |
| d_pjw64i.hl | d1ct_all.txt | 102305 | 7.78 | 0 | 0:00:01:14 |
| od_dummy.hl | d1ct_all.txt | 102305 | 7.02 | 0 | 0:00:01:26 |
| mod_rs32.hl | d1ct_all.txt | 102305 | 7.36 | 1 | 0:00:01:27 |
| mod_js32.hl | d1ct_all.txt | 102305 | 7.30 | 666 | 0:00:00:21 |
| od_elf32.hl | d1ct_all.txt | 102305 | 7.48 | 666 | 0:00:00:10 |
| d_bkdr32.hl | d1ct_all.txt | 102305 | 7.22 | 0 | 0:00:01:27 |
| d_sdbm32.hl | d1ct_all.txt | 102305 | 7.34 | 2 | 0:00:01:27 |
| od_djb32.hl | d1ct_all.txt | 102305 | 7.22 | 222 | 0:00:01:28 |
| od_dek32.hl | d1ct_all.txt | 102305 | 7.40 | 515 | 0:00:01:29 |
| mod_bp32.hl | d1ct_all.txt | 102305 | 7.16 | 666 | 0:00:00:00 |
| od_fnv32.hl | d1ct_all.txt | 102305 | 7.24 | 0 | 0:00:01:29 |
| mod_ap32.hl | d1ct_all.txt | 102305 | 7.72 | 12 | 0:00:01:27 |
| od_pjw32.hl | d1ct_all.txt | 102305 | 7.44 | 666 | 0:00:00:10 |
---------------------------------------------------------------------------
 ( mod. name = hash module, d. lines = dictionary lines, h. time = hashing
   time, colls = collisions found, det. time = collision detection time )
- Beware! This program is considered experimental! -
---
NEWS:
[19-Aug-2014] 0.4-1 * Created FNV128-1a case-insensitive hash module.
[15-Jan-2013] 0.4 * Twin-hash mode has been added to observe if two
weak routines can act strong when merged.
[19-Jul-2011] 0.3 * Memory consumption was reduced greatly in the
area of literal data.
[10-Feb-2011] 0.2 * Network support added. Now we are talking!
---
NOTES:
[*]
Requires a 68020+ (no FPU needed), OS 2.04+ (theoretically), lots of megs
of free memory, bsdsocket.library 3+ (if remote labs are needed), and lots
of time depending on CPU power.
[*]
Text files must contain lines of reasonable length, and the last line must
be newline-terminated!
[*]
Really! Do not waste your time testing routines with 'hashlab' alone if
you have horse-powered machines around. Just compile 'hashlabd' for those
platforms and utilise them through the network! On the other hand, when
you know that a certain routine is weak, using 7 processes may shorten
the test even on a single CPU, but that seems pointless anyway :-) .
[*]
If you know that your platform supports multi-core CPUs and 'pthread' is
able to use this facility, then pass the single IP address of the remote
lab as many times as there are CPU cores. The same applies to the number
of CPUs!
[*]
Read the comment block in the source code for more geeky details or study
the code if you dare... You will also have to read this and that in order
to create new modules.
[*]
In this package you will find a few dictionaries that can be merged into
one big dictionary with the 't_dictmerge' ARexx script.
[*]
Visit ftp://ftp.openwall.com/pub/passwords/wordlists/all.gz for a nice
50+ meg dictionary. You can also create your own by making a listing of
your harddisk or CDs and then passing it through a duplicate-line remover
and maybe a tokenizer.
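A duplicate-line remover is easy to sketch yourself; assuming the whole
listing fits in memory, sort the lines and compact away adjacent
duplicates (the function below is a hypothetical helper, not part of this
package):

```c
#include <stdlib.h>
#include <string.h>

static int cmpstr(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Sort an array of lines and compact away adjacent duplicates.
 * Returns the number of unique lines left at the front of 'lines'. */
size_t dedup_lines(char **lines, size_t n)
{
    size_t i, out;

    if (n == 0)
        return 0;
    qsort(lines, n, sizeof *lines, cmpstr);
    for (out = 1, i = 1; i < n; i++)
        if (strcmp(lines[i], lines[out - 1]) != 0)
            lines[out++] = lines[i];
    return out;
}
```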
[*]
The new thing is the so-called "twinhash" mode, which allows merging the
results of two hash routines. This may be useful when you need to examine
a certain dataset to see if one routine can back the other up when a
collision happens. I can tell you that this works and is much better than
dealing with collisions when they occur. Combining two different
philosophies, one zero-based and one non-zero-based, should be really
effective!
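A minimal sketch of the twinhash idea (all names here are hypothetical,
not hashlab's internals): run two independent 32-bit routines over the
same key and pack both results into one 64-bit value, so two keys only
collide when both routines collide at once.

```c
#include <stdint.h>

typedef uint32_t (*hashfn)(const char *);

/* Two deliberately simple routines standing in for real modules. */
uint32_t add32(const char *s)          /* zero-based philosophy */
{
    uint32_t h = 0;
    while (*s) h += (uint8_t)*s++;
    return h;
}

uint32_t djb32(const char *s)          /* non-zero-based philosophy */
{
    uint32_t h = 5381;
    while (*s) h = h * 33u + (uint8_t)*s++;
    return h;
}

/* Twinhash sketch: both 32-bit results packed into 64 bits. */
uint64_t twinhash(hashfn a, hashfn b, const char *key)
{
    return ((uint64_t)a(key) << 32) | (uint64_t)b(key);
}
```

For example, "ab" and "ba" collide under the additive routine, but the
second routine keeps the combined 64-bit values apart.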
[*]
Short permutations, say 3 chars long, are often a good indication of a
hash routine's strength, so you should start with these when testing your
own ideas.
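As an illustration, a quick permutation test of this kind, assuming a
plain FNV-1a 32-bit routine and all 26^3 = 17576 lowercase 3-char keys
(a standalone sketch, not hashlab's own test loop):

```c
#include <stdint.h>
#include <stdlib.h>

/* FNV-1a, 32-bit, standard published constants. */
uint32_t fnv1a32(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) { h ^= (uint8_t)*s++; h *= 16777619u; }
    return h;
}

static int cmp32(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Hash every 3-char lowercase permutation, sort the hashes and
 * count duplicates: a quick first indication of routine strength. */
unsigned count_collisions(void)
{
    enum { N = 26 * 26 * 26 };          /* 17576 keys */
    static uint32_t h[N];
    char key[4] = "aaa";                /* key[3] stays '\0' */
    unsigned i = 0, colls = 0;

    for (key[0] = 'a'; key[0] <= 'z'; key[0]++)
        for (key[1] = 'a'; key[1] <= 'z'; key[1]++)
            for (key[2] = 'a'; key[2] <= 'z'; key[2]++)
                h[i++] = fnv1a32(key);

    qsort(h, N, sizeof h[0], cmp32);
    for (i = 1; i < N; i++)
        colls += (h[i] == h[i - 1]);
    return colls;
}
```

A weak routine will already rack up collisions on a key set this small.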
---
HELP:
> hashlab ?
IF=INPUTFILE/A,HM=HASHMODULE/A,BL=BUFFERLEN/N,TC=TASKCOUNT/N,
SC=STARTCELL/N,IR=ITERARATE/N,NE=NETENTRY/K,MC=MAXCOLLS/N,
SH=SHOWCOLLS/S,32=32BITHASH/S,64=64BITHASH/S,96=96BITHASH/S,
LO=LOADONLY/S,NO=NOSTATUS/S
IF=INPUTFILE/A - Text file of any kind. It is important for this file
                 to have newline-separated lines. By default the maximum
                 line length allowed is 128 bytes!
HM=HASHMODULE/A - Hashing routine to be put to the test. These are plain
                  'LoadSeg()' objects with a special header & internals.
                  You can pass two such routines here by separating them
                  with a comma (','). This will activate twinhash mode.
BL=BUFFERLEN/N - Maximum size of one line. In early versions this
                 mattered; now lines are fully dynamic anyway, so you can
                 set it to a very big value without noticing enormous
                 memory consumption.
TC=TASKCOUNT/N - Number of subprocesses/remote labs involved. This is
                 only useful when using 'hashlabd', but can possibly be
                 of use under emulation to speed up processing. It
                 certainly helps if your remote labs run on multi-core
                 CPUs. A maximum of 7 subprocesses can be used.
SC=STARTCELL/N - Start processing at this line/cell. This should only
                 be used with one process, since subprocesses get fed
                 with dynamic ranges.
IR=ITERARATE/N - When this is specified (number of loops), a speed test
                 of this very hashing routine will be performed. You
                 should pass at least 32000 loops.
NE=NETENTRY/K - Network support! Do not be afraid, uncle megacz took
                care of making it easy. You can either pass the lab IP
                addresses separated by commas or use a broadcast address
                so they will be located automagically. You can even use
                .255 (255.255.255.255) for short. In case of wireless
                networks the broadcast address must be narrowed to a
                particular network (192.168.255.255 or 10.255.255.255,
                ...)!
MC=MAXCOLLS/N - How many collisions to catch before considering this
                algorithm useless? By default this is 666, evil
                muahahaha!
SH=SHOWCOLLS/S - Show collisions as they happen. You can toggle this
                 with the 'CTRL + F' keys.
32=32BITHASH/S - Force 32-bit hashing. The rest of the 128-bit virtual
                 datatype will be zero.
64=64BITHASH/S - Force 64-bit hashing. The rest of the 128-bit virtual
                 datatype will be zero.
96=96BITHASH/S - Force 96-bit hashing. The rest of the 128-bit virtual
                 datatype will be zero.
LO=LOADONLY/S - Load and hash the data only, then quit. This can be
                used to measure the load time and such.
NO=NOSTATUS/S - Do not display status line during test. Useful when
redirecting the output.
---
USAGE:
; Check the DJB routine using up to 2 remote labs available
; in your LAN
hashlab dict_step.txt mod_djb32.hl tc 2 ne .255
; Turn on twin-hash mode using the two weakest 32-bit routines
; (this proves the concept, so two strong routines could in
; theory be just unbeatable at only 64 bits)
hashlab dict_perm.txt mod_elf32.hl,mod_js32.hl mc 1
---
megacz
| |
| | | | |
|