Commit 402b1332 by heffnercj

Initial check-in to git.

parent 4001b759
DESCRIPTION
The binwalk Python module can be used by any Python script to programmatically perform binwalk scans and
obtain the results of those scans.
The classes, methods, and objects in the binwalk module are documented via pydoc, including examples,
so those interested in using the binwalk module are encouraged to look there. However, several common usage
examples are provided here to help jump-start development efforts.
BASIC SCAN
The following is an example of the simplest scan, and is equivalent to running binwalk on the command line
with no additional arguments:
import pprint
from binwalk import Binwalk

with Binwalk() as bw:
    pprint.PrettyPrinter().pprint(bw.scan('firmware.bin'))
The scan() method will return a dictionary of results, and may also be passed a list of files:
from binwalk import Binwalk

with Binwalk() as bw:
    for (filename, file_results) in bw.scan(['firmware1.bin', 'firmware2.bin']).iteritems():
        print "Results for %s:" % filename
        for (offset, results) in file_results:
            for result in results:
                print offset, result['description']
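For reference, scan() returns a dictionary keyed by file name; each value is a list of (offset, results)
tuples, and each results entry is a list of dictionaries describing the signatures found at that offset.
A sketch of the shape (the offsets and description strings below are illustrative placeholders, not real output):

{
    'firmware1.bin': [
        (0, [{'description': 'TRX firmware header, ...'}]),
        (131072, [{'description': 'Squashfs filesystem, ...'}]),
    ],
    'firmware2.bin': []
}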
Alternatively, a callback function may be specified; it is called as soon as a match is found.
It is passed two arguments: the offset at which the match was found, and a list of result dictionaries (one dictionary
per result found at that offset):
from binwalk import Binwalk

def my_callback(offset, results):
    print "Found %d results at offset %d:" % (len(results), offset)
    for result in results:
        print "    %s" % result['description']

with Binwalk() as bw:
    bw.scan('firmware.bin', callback=my_callback)
ADDING FILTERS
Include and exclude filters may be specified; these operate identically to the --include and --exclude binwalk
command line options:
from binwalk import Binwalk
binwalk = Binwalk()
# Include only signatures containing the string 'filesystem' (same as --include)
binwalk.filter.include('filesystem')
# Excludes all results that contain the string 'jffs2' (same as --exclude)
binwalk.filter.exclude('jffs2')
binwalk.scan('firmware.bin')
binwalk.cleanup()
EXTRACTING FILES
Extract rules may be specified; these operate identically to the --dd and --extract binwalk command line options.
Extraction is automatically enabled when one or more extraction rules are specified.
To add a custom extract rule, or a list of extract rules (such as with the --dd option):
from binwalk import Binwalk
binwalk = Binwalk()
# Extract results containing the string 'gzip' with a file extension of 'gz', then run 'gunzip %e'
# ('%e' is a placeholder for the path to the extracted file)
binwalk.extractor.add_rule('gzip:gz:gunzip %e')
# Extract 'gzip' and 'filesystem' results
binwalk.extractor.add_rule(['gzip:gz', 'filesystem:fs'])
binwalk.scan('firmware.bin')
binwalk.cleanup()
To load the default extraction rules from the extract.conf file (such as with the --extract option):
from binwalk import Binwalk
binwalk = Binwalk()
binwalk.extractor.load_defaults()
binwalk.scan('firmware.bin')
binwalk.cleanup()
To enable delayed file extraction (such as with the --delay option):
from binwalk import Binwalk
binwalk = Binwalk()
binwalk.extractor.enable_delayed_extract(True)
binwalk.scan('firmware.bin')
binwalk.cleanup()
To enable file cleanup after extraction (such as with the --rm option):
from binwalk import Binwalk
binwalk = Binwalk()
binwalk.extractor.cleanup_extracted_files(True)
binwalk.scan('firmware.bin')
binwalk.cleanup()
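Putting the pieces together, the following sketch combines an include filter, the default extraction rules,
and a callback (all methods as documented above):

from binwalk import Binwalk

def my_callback(offset, results):
    for result in results:
        print offset, result['description']

binwalk = Binwalk()
binwalk.filter.include('filesystem')
binwalk.extractor.load_defaults()
binwalk.scan('firmware.bin', callback=my_callback)
binwalk.cleanup()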
export CC=@CC@
export CFLAGS=@CFLAGS@
export SONAME=@SONAME@
export LIBDIR=@libdir@
all: clean
	make -C miniz
	make -C compress
install:
	make -C miniz install
	make -C compress install
.PHONY: clean distclean
clean:
	make -C miniz clean
	make -C compress clean
distclean:
	make -C miniz distclean
	make -C compress distclean
	rm -rf *.cache config.* Makefile
LIBNAME=libcompress42.so
all: clean $(LIBNAME)
$(LIBNAME): compress42.o
	$(CC) $(CFLAGS) -shared -Wl,$(SONAME),$(LIBNAME) compress42.o -o $(LIBNAME) $(LDFLAGS)
compress42.o:
	$(CC) $(CFLAGS) compress42.c -c
install:
	install -D -m644 $(LIBNAME) $(DESTDIR)$(LIBDIR)/$(LIBNAME)
.PHONY: clean distclean
clean:
	rm -f *.o
distclean: clean
	rm -f $(LIBNAME)
Unix compress implementation of LZW (from the Debian source repository).
Used by the compressd plugin to validate potential compress'd candidates.
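A minimal sketch of a magic-byte pre-check (illustrative only; the two magic bytes come from MAGIC_1 ('\037')
and MAGIC_2 ('\235') defined in compress42.c below):

import sys

# compress'd data begins with the bytes 0x1f 0x9d.
data = open(sys.argv[1], 'rb').read(2)
if data == '\x1f\x9d':
    print "%s starts with the compress'd magic bytes" % sys.argv[1]
else:
    print "%s lacks the compress'd magic bytes" % sys.argv[1]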
/* (N)compress42.c - File compression ala IEEE Computer, Mar 1992.
*
* Authors:
* Spencer W. Thomas (decvax!harpo!utah-cs!utah-gr!thomas)
* Jim McKie (decvax!mcvax!jim)
* Steve Davies (decvax!vax135!petsd!peora!srd)
* Ken Turkowski (decvax!decwrl!turtlevax!ken)
* James A. Woods (decvax!ihnp4!ames!jaw)
* Joe Orost (decvax!vax135!petsd!joe)
* Dave Mack (csu@alembic.acs.com)
* Peter Jannesen, Network Communication Systems
* (peter@ncs.nl)
*
* Revision 4.2.3 92/03/14 peter@ncs.nl
* Optimise compress and decompress function and a lot of cleanups.
* New fast hash algorithm added (if more than 800Kb available).
*
* Revision 4.1 91/05/26 csu@alembic.acs.com
* Modified to recursively compress directories ('r' flag). As a side
* effect, compress will no longer attempt to compress things that
* aren't "regular" files. See Changes.
*
* Revision 4.0 85/07/30 12:50:00 joe
* Removed ferror() calls in output routine on every output except first.
* Prepared for release to the world.
*
* Revision 3.6 85/07/04 01:22:21 joe
* Remove much wasted storage by overlaying hash table with the tables
* used by decompress: tab_suffix[1<<BITS], stack[8000]. Updated USERMEM
* computations. Fixed dump_tab() DEBUG routine.
*
* Revision 3.5 85/06/30 20:47:21 jaw
* Change hash function to use exclusive-or. Rip out hash cache. These
* speedups render the megamemory version defunct, for now. Make decoder
* stack global. Parts of the RCS trunks 2.7, 2.6, and 2.1 no longer apply.
*
* Revision 3.4 85/06/27 12:00:00 ken
* Get rid of all floating-point calculations by doing all compression ratio
* calculations in fixed point.
*
* Revision 3.3 85/06/24 21:53:24 joe
* Incorporate portability suggestion for M_XENIX. Got rid of text on #else
* and #endif lines. Cleaned up #ifdefs for vax and interdata.
*
* Revision 3.2 85/06/06 21:53:24 jaw
* Incorporate portability suggestions for Z8000, IBM PC/XT from mailing list.
* Default to "quiet" output (no compression statistics).
*
* Revision 3.1 85/05/12 18:56:13 jaw
* Integrate decompress() stack speedups (from early pointer mods by McKie).
* Repair multi-file USERMEM gaffe. Unify 'force' flags to mimic semantics
* of SVR2 'pack'. Streamline block-compress table clear logic. Increase
* output byte count by magic number size.
*
* Revision 3.0 84/11/27 11:50:00 petsd!joe
* Set HSIZE depending on BITS. Set BITS depending on USERMEM. Unrolled
* loops in clear routines. Added "-C" flag for 2.0 compatibility. Used
* unsigned compares on Perkin-Elmer. Fixed foreground check.
*
* Revision 2.7 84/11/16 19:35:39 ames!jaw
* Cache common hash codes based on input statistics; this improves
* performance for low-density raster images. Pass on #ifdef bundle
* from Turkowski.
*
* Revision 2.6 84/11/05 19:18:21 ames!jaw
* Vary size of hash tables to reduce time for small files.
* Tune PDP-11 hash function.
*
* Revision 2.5 84/10/30 20:15:14 ames!jaw
* Junk chaining; replace with the simpler (and, on the VAX, faster)
* double hashing, discussed within. Make block compression standard.
*
* Revision 2.4 84/10/16 11:11:11 ames!jaw
* Introduce adaptive reset for block compression, to boost the rate
* another several percent. (See mailing list notes.)
*
* Revision 2.3 84/09/22 22:00:00 petsd!joe
* Implemented "-B" block compress. Implemented REVERSE sorting of tab_next.
* Bug fix for last bits. Changed fwrite to putchar loop everywhere.
*
* Revision 2.2 84/09/18 14:12:21 ames!jaw
* Fold in news changes, small machine typedef from thomas,
* #ifdef interdata from joe.
*
* Revision 2.1 84/09/10 12:34:56 ames!jaw
* Configured fast table lookup for 32-bit machines.
* This cuts user time in half for b <= FBITS, and is useful for news batching
* from VAX to PDP sites. Also sped up decompress() [fwrite->putc] and
* added signal catcher [plus beef in write_error()] to delete effluvia.
*
* Revision 2.0 84/08/28 22:00:00 petsd!joe
* Add check for foreground before prompting user. Insert maxbits into
* compressed file. Force file being uncompressed to end with ".Z".
* Added "-c" flag and "zcat". Prepared for release.
*
* Revision 1.10 84/08/24 18:28:00 turtlevax!ken
* Will only compress regular files (no directories), added a magic number
* header (plus an undocumented -n flag to handle old files without headers),
* added -f flag to force overwriting of possibly existing destination file,
* otherwise the user is prompted for a response. Will tack on a .Z to a
* filename if it doesn't have one when decompressing. Will only replace
* file if it was compressed.
*
* Revision 1.9 84/08/16 17:28:00 turtlevax!ken
* Removed scanargs(), getopt(), added .Z extension and unlimited number of
* filenames to compress. Flags may be clustered (-Ddvb12) or separated
* (-D -d -v -b 12), or combination thereof. Modes and other status is
* copied with copystat(). -O bug for 4.2 seems to have disappeared with
* 1.8.
*
* Revision 1.8 84/08/09 23:15:00 joe
* Made it compatible with vax version, installed jim's fixes/enhancements
*
* Revision 1.6 84/08/01 22:08:00 joe
* Sped up algorithm significantly by sorting the compress chain.
*
* Revision 1.5 84/07/13 13:11:00 srd
* Added C version of vax asm routines. Changed structure to arrays to
* save much memory. Do unsigned compares where possible (faster on
* Perkin-Elmer)
*
* Revision 1.4 84/07/05 03:11:11 thomas
* Clean up the code a little and lint it. (Lint complains about all
* the regs used in the asm, but I'm not going to "fix" this.)
*
* Revision 1.3 84/07/05 02:06:54 thomas
* Minor fixes.
*
* Revision 1.2 84/07/05 00:27:27 thomas
* Add variable bit length output.
*
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <ctype.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#ifdef DIRENT
# include <dirent.h>
# define RECURSIVE 1
# undef SYSDIR
#endif
#ifdef SYSDIR
# include <sys/dir.h>
# define RECURSIVE 1
#endif
#ifdef UTIME_H
# include <utime.h>
#else
struct utimbuf {
time_t actime;
time_t modtime;
};
#endif
#ifdef __STDC__
# define ARGS(a) a
#else
# define ARGS(a) ()
#endif
#define LARGS(a) () /* Rely on include files for library func defs. */
#ifndef SIG_TYPE
# define SIG_TYPE void (*)()
#endif
/* CJH */
#define NOFUNCDEF
#ifndef NOFUNCDEF
extern void *malloc LARGS((int));
extern void free LARGS((void *));
#ifndef _IBMR2
extern int open LARGS((char const *,int,...));
#endif
extern int close LARGS((int));
extern int read LARGS((int,void *,int));
extern int write LARGS((int,void const *,int));
extern int chmod LARGS((char const *,int));
extern int unlink LARGS((char const *));
extern int chown LARGS((char const *,int,int));
extern int utime LARGS((char const *,struct utimbuf const *));
extern char *strcpy LARGS((char *,char const *));
extern char *strcat LARGS((char *,char const *));
extern int strcmp LARGS((char const *,char const *));
extern unsigned strlen LARGS((char const *));
extern void *memset LARGS((void *,char,unsigned int));
extern void *memcpy LARGS((void *,void const *,unsigned int));
extern int atoi LARGS((char const *));
extern void exit LARGS((int));
extern int isatty LARGS((int));
#endif
#define MARK(a) { asm(" .globl M.a"); asm("M.a:"); }
#ifdef DEF_ERRNO
extern int errno;
#endif
/* CJH */
//#include "patchlevel.h"
#undef min
#define min(a,b) ((a>b) ? b : a)
#ifndef IBUFSIZ
# define IBUFSIZ BUFSIZ /* Default input buffer size */
#endif
#ifndef OBUFSIZ
# define OBUFSIZ BUFSIZ /* Default output buffer size */
#endif
#define MAXPATHLEN 1024 /* MAXPATHLEN - maximum length of a pathname we allow */
#define SIZE_INNER_LOOP 256 /* Size of the inner (fast) compress loop */
/* Defines for third byte of header */
#define MAGIC_1 (char_type)'\037'/* First byte of compressed file */
#define MAGIC_2 (char_type)'\235'/* Second byte of compressed file */
#define BIT_MASK 0x1f /* Mask for 'number of compression bits' */
/* Masks 0x20 and 0x40 are free. */
/* I think 0x20 should mean that there is */
/* a fourth header byte (for expansion). */
#define BLOCK_MODE 0x80 /* Block compression: if table is full and */
/* compression rate is dropping, flush tables */
/* the next two codes should not be changed lightly, as they must not */
/* lie within the contiguous general code space. */
#define FIRST 257 /* first free entry */
#define CLEAR 256 /* table clear output code */
#define INIT_BITS 9 /* initial number of bits/code */
#ifndef SACREDMEM
/*
* SACREDMEM is the amount of physical memory saved for others; compress
* will hog the rest.
*/
# define SACREDMEM 0
#endif
#ifndef USERMEM
/*
* Set USERMEM to the maximum amount of physical user memory available
* in bytes. USERMEM is used to determine the maximum BITS that can be used
* for compression.
*/
# define USERMEM 450000 /* default user memory */
#endif
#ifndef BYTEORDER
# define BYTEORDER 0000
#endif
#ifndef NOALLIGN
# define NOALLIGN 0
#endif
/*
* machine variants which require cc -Dmachine: pdp11, z8000, DOS
*/
#ifdef interdata /* Perkin-Elmer */
# define SIGNED_COMPARE_SLOW /* signed compare is slower than unsigned */
#endif
#ifdef pdp11 /* PDP11: don't forget to compile with -i */
# define BITS 12 /* max bits/code for 16-bit machine */
# define NO_UCHAR /* also if "unsigned char" functions as signed char */
#endif /* pdp11 */
#ifdef z8000 /* Z8000: */
# define BITS 12 /* 16-bits processor max 12 bits */
# undef vax /* weird preprocessor */
#endif /* z8000 */
#ifdef DOS /* PC/XT/AT (8088) processor */
# define BITS 16 /* 16-bits processor max 12 bits */
# if BITS == 16
# define MAXSEG_64K
# endif
# undef BYTEORDER
# define BYTEORDER 4321
# undef NOALLIGN
# define NOALLIGN 1
# define COMPILE_DATE __DATE__
#endif /* DOS */
#ifndef O_BINARY
# define O_BINARY 0 /* System has no binary mode */
#endif
#ifdef M_XENIX /* Stupid compiler can't handle arrays with */
# if BITS == 16 /* more than 65535 bytes - so we fake it */
# define MAXSEG_64K
# else
# if BITS > 13 /* Code only handles BITS = 12, 13, or 16 */
# define BITS 13
# endif
# endif
#endif
#ifndef BITS /* General processor calculate BITS */
# if USERMEM >= (800000+SACREDMEM)
# define FAST
# else
# if USERMEM >= (433484+SACREDMEM)
# define BITS 16
# else
# if USERMEM >= (229600+SACREDMEM)
# define BITS 15
# else
# if USERMEM >= (127536+SACREDMEM)
# define BITS 14
# else
# if USERMEM >= (73464+SACREDMEM)
# define BITS 13
# else
# define BITS 12
# endif
# endif
# endif
# endif
# endif
#endif /* BITS */
#ifdef FAST
# define HBITS 17 /* 50% occupancy */
# define HSIZE (1<<HBITS)
# define HMASK (HSIZE-1)
# define HPRIME 9941
# define BITS 16
# undef MAXSEG_64K
#else
# if BITS == 16
# define HSIZE 69001 /* 95% occupancy */
# endif
# if BITS == 15
# define HSIZE 35023 /* 94% occupancy */
# endif
# if BITS == 14
# define HSIZE 18013 /* 91% occupancy */
# endif
# if BITS == 13
# define HSIZE 9001 /* 91% occupancy */
# endif
# if BITS <= 12
# define HSIZE 5003 /* 80% occupancy */
# endif
#endif
#define CHECK_GAP 10000
typedef long int code_int;
#ifdef SIGNED_COMPARE_SLOW
typedef unsigned long int count_int;
typedef unsigned short int count_short;
typedef unsigned long int cmp_code_int; /* Cast to make compare faster */
#else
typedef long int count_int;
typedef long int cmp_code_int;
#endif
typedef unsigned char char_type;
#define ARGVAL() (*++(*argv) || (--argc && *++argv))
#define MAXCODE(n) (1L << (n))
#ifndef REGISTERS
# define REGISTERS 2
#endif
#define REG1
#define REG2
#define REG3
#define REG4
#define REG5
#define REG6
#define REG7
#define REG8
#define REG9
#define REG10
#define REG11
#define REG12
#define REG13
#define REG14
#define REG15
#define REG16
#if REGISTERS >= 1
# undef REG1
# define REG1 register
#endif
#if REGISTERS >= 2
# undef REG2
# define REG2 register
#endif
#if REGISTERS >= 3
# undef REG3
# define REG3 register
#endif
#if REGISTERS >= 4
# undef REG4
# define REG4 register
#endif
#if REGISTERS >= 5
# undef REG5
# define REG5 register
#endif
#if REGISTERS >= 6
# undef REG6
# define REG6 register
#endif
#if REGISTERS >= 7
# undef REG7
# define REG7 register
#endif
#if REGISTERS >= 8
# undef REG8
# define REG8 register
#endif
#if REGISTERS >= 9
# undef REG9
# define REG9 register
#endif
#if REGISTERS >= 10
# undef REG10
# define REG10 register
#endif
#if REGISTERS >= 11
# undef REG11
# define REG11 register
#endif
#if REGISTERS >= 12
# undef REG12
# define REG12 register
#endif
#if REGISTERS >= 13
# undef REG13
# define REG13 register
#endif
#if REGISTERS >= 14
# undef REG14
# define REG14 register
#endif
#if REGISTERS >= 15
# undef REG15
# define REG15 register
#endif
#if REGISTERS >= 16
# undef REG16
# define REG16 register
#endif
union bytes
{
long word;
struct
{
#if BYTEORDER == 4321
char_type b1;
char_type b2;
char_type b3;
char_type b4;
#else
#if BYTEORDER == 1234
char_type b4;
char_type b3;
char_type b2;
char_type b1;
#else
# undef BYTEORDER
int dummy;
#endif
#endif
} bytes;
} ;
#if BYTEORDER == 4321 && NOALLIGN == 1
#define output(b,o,c,n) { \
*(long *)&((b)[(o)>>3]) |= ((long)(c))<<((o)&0x7);\
(o) += (n); \
}
#else
#ifdef BYTEORDER
#define output(b,o,c,n) { REG1 char_type *p = &(b)[(o)>>3]; \
union bytes i; \
i.word = ((long)(c))<<((o)&0x7); \
p[0] |= i.bytes.b1; \
p[1] |= i.bytes.b2; \
p[2] |= i.bytes.b3; \
(o) += (n); \
}
#else
#define output(b,o,c,n) { REG1 char_type *p = &(b)[(o)>>3]; \
REG2 long i = ((long)(c))<<((o)&0x7); \
p[0] |= (char_type)(i); \
p[1] |= (char_type)(i>>8); \
p[2] |= (char_type)(i>>16); \
(o) += (n); \
}
#endif
#endif
#if BYTEORDER == 4321 && NOALLIGN == 1
#define input(b,o,c,n,m){ \
(c) = (*(long *)(&(b)[(o)>>3])>>((o)&0x7))&(m); \
(o) += (n); \
}
#else
#define input(b,o,c,n,m){ REG1 char_type *p = &(b)[(o)>>3]; \
(c) = ((((long)(p[0]))|((long)(p[1])<<8)| \
((long)(p[2])<<16))>>((o)&0x7))&(m); \
(o) += (n); \
}
#endif
char *progname; /* Program name */
int silent = 0; /* don't tell me about errors */
int quiet = 1; /* don't tell me about compression */
int do_decomp = 0; /* Decompress mode */
int force = 0; /* Force overwrite of files and links */
int nomagic = 0; /* Use a 3-byte magic number header, */
/* unless old file */
int block_mode = BLOCK_MODE;/* Block compress mode -C compatible with 2.0*/
int maxbits = BITS; /* user settable max # bits/code */
int zcat_flg = 0; /* Write output on stdout, suppress messages */
int recursive = 0; /* compress directories */
int exit_code = -1; /* Exitcode of compress (-1 no file compressed) */
char_type inbuf[IBUFSIZ+64]; /* Input buffer */
char_type outbuf[OBUFSIZ+2048];/* Output buffer */
struct stat infstat; /* Input file status */
char *ifname; /* Input filename */
int remove_ofname = 0; /* Remove output file on a error */
char ofname[MAXPATHLEN]; /* Output filename */
int fgnd_flag = 0; /* Running in background (SIGINT=SIGIGN) */
long bytes_in; /* Total number of bytes from input */
long bytes_out; /* Total number of bytes to output */
/*
* 8086 & 80286 Has a problem with array bigger than 64K so fake the array
* For processors with a limited address space and segments.
*/
/*
* To save much memory, we overlay the table used by compress() with those
* used by decompress(). The tab_prefix table is the same size and type
* as the codetab. The tab_suffix table needs 2**BITS characters. We
* get this from the beginning of htab. The output stack uses the rest
* of htab, and contains characters. There is plenty of room for any
* possible stack (stack used to be 8000 characters).
*/
#ifdef MAXSEG_64K
count_int htab0[8192];
count_int htab1[8192];
count_int htab2[8192];
count_int htab3[8192];
count_int htab4[8192];
count_int htab5[8192];
count_int htab6[8192];
count_int htab7[8192];
count_int htab8[HSIZE-65536];
count_int * htab[9] = {htab0,htab1,htab2,htab3,htab4,htab5,htab6,htab7,htab8};
unsigned short code0tab[16384];
unsigned short code1tab[16384];
unsigned short code2tab[16384];
unsigned short code3tab[16384];
unsigned short code4tab[16384];
unsigned short * codetab[5] = {code0tab,code1tab,code2tab,code3tab,code4tab};
# define htabof(i) (htab[(i) >> 13][(i) & 0x1fff])
# define codetabof(i) (codetab[(i) >> 14][(i) & 0x3fff])
# define tab_prefixof(i) codetabof(i)
# define tab_suffixof(i) ((char_type *)htab[(i)>>15])[(i) & 0x7fff]
# define de_stack ((char_type *)(&htab2[8191]))
void clear_htab()
{
memset(htab0, -1, sizeof(htab0));
memset(htab1, -1, sizeof(htab1));
memset(htab2, -1, sizeof(htab2));
memset(htab3, -1, sizeof(htab3));
memset(htab4, -1, sizeof(htab4));
memset(htab5, -1, sizeof(htab5));
memset(htab6, -1, sizeof(htab6));
memset(htab7, -1, sizeof(htab7));
memset(htab8, -1, sizeof(htab8));
}
# define clear_tab_prefixof() memset(code0tab, 0, 256);
#else /* Normal machine */
count_int htab[HSIZE];
unsigned short codetab[HSIZE];
# define htabof(i) htab[i]
# define codetabof(i) codetab[i]
# define tab_prefixof(i) codetabof(i)
# define tab_suffixof(i) ((char_type *)(htab))[i]
# define de_stack ((char_type *)&(htab[HSIZE-1]))
# define clear_htab() memset(htab, -1, sizeof(htab))
# define clear_tab_prefixof() memset(codetab, 0, 256);
#endif /* MAXSEG_64K */
#ifdef FAST
int primetab[256] = /* Special secondary hash table. */
{
1013, -1061, 1109, -1181, 1231, -1291, 1361, -1429,
1481, -1531, 1583, -1627, 1699, -1759, 1831, -1889,
1973, -2017, 2083, -2137, 2213, -2273, 2339, -2383,
2441, -2531, 2593, -2663, 2707, -2753, 2819, -2887,
2957, -3023, 3089, -3181, 3251, -3313, 3361, -3449,
3511, -3557, 3617, -3677, 3739, -3821, 3881, -3931,
4013, -4079, 4139, -4219, 4271, -4349, 4423, -4493,
4561, -4639, 4691, -4783, 4831, -4931, 4973, -5023,
5101, -5179, 5261, -5333, 5413, -5471, 5521, -5591,
5659, -5737, 5807, -5857, 5923, -6029, 6089, -6151,
6221, -6287, 6343, -6397, 6491, -6571, 6659, -6709,
6791, -6857, 6917, -6983, 7043, -7129, 7213, -7297,
7369, -7477, 7529, -7577, 7643, -7703, 7789, -7873,
7933, -8017, 8093, -8171, 8237, -8297, 8387, -8461,
8543, -8627, 8689, -8741, 8819, -8867, 8963, -9029,
9109, -9181, 9241, -9323, 9397, -9439, 9511, -9613,
9677, -9743, 9811, -9871, 9941,-10061,10111,-10177,
10259,-10321,10399,-10477,10567,-10639,10711,-10789,
10867,-10949,11047,-11113,11173,-11261,11329,-11423,
11491,-11587,11681,-11777,11827,-11903,11959,-12041,
12109,-12197,12263,-12343,12413,-12487,12541,-12611,
12671,-12757,12829,-12917,12979,-13043,13127,-13187,
13291,-13367,13451,-13523,13619,-13691,13751,-13829,
13901,-13967,14057,-14153,14249,-14341,14419,-14489,
14557,-14633,14717,-14767,14831,-14897,14983,-15083,
15149,-15233,15289,-15359,15427,-15497,15583,-15649,
15733,-15791,15881,-15937,16057,-16097,16189,-16267,
16363,-16447,16529,-16619,16691,-16763,16879,-16937,
17021,-17093,17183,-17257,17341,-17401,17477,-17551,
17623,-17713,17791,-17891,17957,-18041,18097,-18169,
18233,-18307,18379,-18451,18523,-18637,18731,-18803,
18919,-19031,19121,-19211,19273,-19381,19429,-19477
} ;
#endif
int main ARGS((int,char **));
void Usage ARGS((void));
void comprexx ARGS((char **));
void compdir ARGS((char *));
void compress ARGS((int,int));
/* CJH */
//void decompress ARGS((int,int));
int is_compressed ARGS((char *,int));
void read_error ARGS((void));
void write_error ARGS((void));
void abort_compress ARGS((void));
void prratio ARGS((FILE *,long,long));
void about ARGS((void));
/*****************************************************************
* TAG( main )
*
* Algorithm from "A Technique for High Performance Data Compression",
* Terry A. Welch, IEEE Computer Vol 17, No 6 (June 1984), pp 8-19.
*
* Usage: compress [-dfvc] [-b bits] [file ...]
* Inputs:
* -d: If given, decompression is done instead.
*
* -c: Write output on stdout, don't remove original.
*
* -b: Parameter limits the max number of bits/code.
*
* -f: Forces output file to be generated, even if one already
* exists, and even if no space is saved by compressing.
* If -f is not used, the user will be prompted if stdin is
* a tty, otherwise, the output file will not be overwritten.
*
* -v: Write compression statistics
*
* -r: Recursive. If a filename is a directory, descend
* into it and compress everything in it.
*
* file ...:
* Files to be compressed. If none specified, stdin is used.
* Outputs:
* file.Z: Compressed form of file with same mode, owner, and utimes
* or stdout (if stdin used as input)
*
* Assumptions:
* When filenames are given, replaces with the compressed version
* (.Z suffix) only if the file decreases in size.
*
* Algorithm:
* Modified Lempel-Ziv method (LZW). Basically finds common
* substrings and replaces them with a variable size code. This is
* deterministic, and can be done on the fly. Thus, the decompression
* procedure needs no input table, but tracks the way the table was built.
*/
/* CJH */
#ifdef DO_ARGV
int
main(argc, argv)
REG1 int argc;
REG2 char *argv[];
{
REG3 char **filelist;
REG4 char **fileptr;
if (fgnd_flag = (signal(SIGINT, SIG_IGN) != SIG_IGN))
signal(SIGINT, (SIG_TYPE)abort_compress);
signal(SIGTERM, (SIG_TYPE)abort_compress);
#ifndef DOS
signal(SIGHUP, (SIG_TYPE)abort_compress);
#endif
#ifdef COMPATIBLE
nomagic = 1; /* Original didn't have a magic number */
#endif
filelist = (char **)malloc(argc*sizeof(char *));
if (filelist == NULL)
{
fprintf(stderr, "Cannot allocate memory for file list.\n");
exit (1);
}
fileptr = filelist;
*filelist = NULL;
if((progname = strrchr(argv[0], '/')) != 0)
progname++;
else
progname = argv[0];
if (strcmp(progname, "uncompress") == 0
|| strcmp(progname, "uncompress.real") == 0)
do_decomp = 1;
else
if (strcmp(progname, "zcat") == 0)
do_decomp = zcat_flg = 1;
/* Argument Processing
* All flags are optional.
* -V => print Version; debug verbose
* -d => do_decomp
* -v => unquiet
* -f => force overwrite of output file
* -n => no header: useful to uncompress old files
* -b maxbits => maxbits. If -b is specified, then maxbits MUST be given also.
* -c => cat all output to stdout
* -C => generate output compatible with compress 2.0.
* -r => recursively compress directories
* if a string is left, must be an input filename.
*/
for (argc--, argv++; argc > 0; argc--, argv++)
{
if (**argv == '-')
{/* A flag argument */
while (*++(*argv))
{/* Process all flags in this arg */
switch (**argv)
{
case 'V':
about();
break;
case 's':
silent = 1;
quiet = 1;
break;
case 'v':
silent = 0;
quiet = 0;
break;
case 'd':
do_decomp = 1;
break;
case 'f':
case 'F':
force = 1;
break;
case 'n':
nomagic = 1;
break;
case 'C':
block_mode = 0;
break;
case 'b':
if (!ARGVAL())
{
fprintf(stderr, "Missing maxbits\n");
Usage();
}
maxbits = atoi(*argv);
goto nextarg;
case 'c':
zcat_flg = 1;
break;
case 'q':
quiet = 1;
break;
case 'r':
case 'R':
#ifdef RECURSIVE
recursive = 1;
#else
fprintf(stderr, "%s -r not available (due to missing directory functions)\n", *argv);
#endif
break;
default:
fprintf(stderr, "Unknown flag: '%c'; ", **argv);
Usage();
}
}
}
else
{
*fileptr++ = *argv; /* Build input file list */
*fileptr = NULL;
}
nextarg: continue;
}
if (maxbits < INIT_BITS) maxbits = INIT_BITS;
if (maxbits > BITS) maxbits = BITS;
if (*filelist != NULL)
{
for (fileptr = filelist; *fileptr; fileptr++)
comprexx(fileptr);
}
else
{/* Standard input */
ifname = "";
exit_code = 0;
remove_ofname = 0;
if (do_decomp == 0)
{
compress(0, 1);
if (zcat_flg == 0 && !quiet)
{
fprintf(stderr, "Compression: ");
prratio(stderr, bytes_in-bytes_out, bytes_in);
fprintf(stderr, "\n");
}
if (bytes_out >= bytes_in && !(force))
exit_code = 2;
}
//else
//decompress(0, 1);
}
//int rsize = 0;
//char input_buffer[64] = { 0 };
//rsize = read(0, input_buffer, sizeof(input_buffer));
//decompress(input_buffer, rsize);
//exit((exit_code== -1) ? 1:exit_code);
}
void
Usage()
{
fprintf(stderr, "\
Usage: %s [-dfvcVr] [-b maxbits] [file ...]\n\
-d If given, decompression is done instead.\n\
-c Write output on stdout, don't remove original.\n\
-b Parameter limits the max number of bits/code.\n", progname);
fprintf(stderr, "\
-f Forces output file to be generated, even if one already\n\
exists, and even if no space is saved by compressing.\n\
If -f is not used, the user will be prompted if stdin is\n\
a tty, otherwise, the output file will not be overwritten.\n\
-v Write compression statistics.\n\
-V Output version and compile options.\n\
-r Recursive. If a filename is a directory, descend\n\
into it and compress everything in it.\n");
exit(1);
}
void
comprexx(fileptr)
char **fileptr;
{
int fdin;
int fdout;
char tempname[MAXPATHLEN];
if (strlen(*fileptr) > sizeof(tempname) - 3) {
fprintf(stderr, "Pathname too long: %s\n", *fileptr);
exit_code = 1;
return;
}
strcpy(tempname,*fileptr);
errno = 0;
#ifdef LSTAT
if (lstat(tempname,&infstat) == -1)
#else
if (stat(tempname,&infstat) == -1)
#endif
{
if (do_decomp)
{
switch (errno)
{
case ENOENT: /* file doesn't exist */
/*
** if the given name doesn't end with .Z, try appending one
** This is obviously the wrong thing to do if it's a
** directory, but it shouldn't do any harm.
*/
if (strcmp(tempname + strlen(tempname) - 2, ".Z") != 0)
{
strcat(tempname,".Z");
errno = 0;
#ifdef LSTAT
if (lstat(tempname,&infstat) == -1)
#else
if (stat(tempname,&infstat) == -1)
#endif
{
perror(tempname);
exit_code = 1;
return;
}
if ((infstat.st_mode & S_IFMT) != S_IFREG)
{
fprintf(stderr, "%s: Not a regular file.\n", tempname);
exit_code = 1;
return ;
}
}
else
{
perror(tempname);
exit_code = 1;
return;
}
break;
default:
perror(tempname);
exit_code = 1;
return;
}
}
else
{
perror(tempname);
exit_code = 1;
return;
}
}
switch (infstat.st_mode & S_IFMT)
{
case S_IFDIR: /* directory */
#ifdef RECURSIVE
if (recursive)
compdir(tempname);
else
#endif
if (!quiet)
fprintf(stderr,"%s is a directory -- ignored\n", tempname);
break;
case S_IFREG: /* regular file */
if (do_decomp != 0)
{/* DECOMPRESSION */
if (!zcat_flg)
{
if (strcmp(tempname + strlen(tempname) - 2, ".Z") != 0)
{
if (!quiet)
fprintf(stderr,"%s - no .Z suffix\n",tempname);
return;
}
}
strcpy(ofname, tempname);
/* Strip off the .Z suffix */
if (strcmp(tempname + strlen(tempname) - 2, ".Z") == 0)
ofname[strlen(tempname) - 2] = '\0';
}
else
{/* COMPRESSION */
if (!zcat_flg)
{
if (strcmp(tempname + strlen(tempname) - 2, ".Z") == 0)
{
fprintf(stderr, "%s: already has .Z suffix -- no change\n", tempname);
return;
}
if (infstat.st_nlink > 1 && (!force))
{
fprintf(stderr, "%s has %d other links: unchanged\n",
tempname, infstat.st_nlink - 1);
exit_code = 1;
return;
}
}
strcpy(ofname, tempname);
strcat(ofname, ".Z");
}
if ((fdin = open(ifname = tempname, O_RDONLY|O_BINARY)) == -1)
{
perror(tempname);
exit_code = 1;
return;
}
if (zcat_flg == 0)
{
int c;
int s;
struct stat statbuf;
struct stat statbuf2;
if (stat(ofname, &statbuf) == 0)
{
if ((s = strlen(ofname)) > 8)
{
c = ofname[s-1];
ofname[s-1] = '\0';
statbuf2 = statbuf;
if (!stat(ofname, &statbuf2) &&
statbuf.st_mode == statbuf2.st_mode &&
statbuf.st_ino == statbuf2.st_ino &&
statbuf.st_dev == statbuf2.st_dev &&
statbuf.st_uid == statbuf2.st_uid &&
statbuf.st_gid == statbuf2.st_gid &&
statbuf.st_size == statbuf2.st_size &&
statbuf.st_atime == statbuf2.st_atime &&
statbuf.st_mtime == statbuf2.st_mtime &&
statbuf.st_ctime == statbuf2.st_ctime)
{
fprintf(stderr, "%s: filename too long to tack on .Z\n", tempname);
exit_code = 1;
return;
}
ofname[s-1] = (char)c;
}
if (!force)
{
inbuf[0] = 'n';
fprintf(stderr, "%s already exists.\n", ofname);
if (fgnd_flag && isatty(0))
{
fprintf(stderr, "Do you wish to overwrite %s (y or n)? ", ofname);
fflush(stderr);
if (read(0, inbuf, 1) > 0)
{
if (inbuf[0] != '\n')
{
do
{
if (read(0, inbuf+1, 1) <= 0)
{
perror("stdin");
break;
}
}
while (inbuf[1] != '\n');
}
}
else
perror("stdin");
}
if (inbuf[0] != 'y')
{
fprintf(stderr, "%s not overwritten\n", ofname);
exit_code = 1;
return;
}
}
if (unlink(ofname))
{
fprintf(stderr, "Can't remove old output file\n");
perror(ofname);
exit_code = 1;
return ;
}
}
if ((fdout = open(ofname, O_WRONLY|O_CREAT|O_EXCL|O_BINARY,0600)) == -1)
{
perror(tempname);
return;
}
if ((s = strlen(ofname)) > 8)
{
if (fstat(fdout, &statbuf))
{
fprintf(stderr, "Can't get status of output file\n");
perror(ofname);
exit_code = 1;
return ;
}
c = ofname[s-1];
ofname[s-1] = '\0';
statbuf2 = statbuf;
if (!stat(ofname, &statbuf2) &&
statbuf.st_mode == statbuf2.st_mode &&
statbuf.st_ino == statbuf2.st_ino &&
statbuf.st_dev == statbuf2.st_dev &&
statbuf.st_uid == statbuf2.st_uid &&
statbuf.st_gid == statbuf2.st_gid &&
statbuf.st_size == statbuf2.st_size &&
statbuf.st_atime == statbuf2.st_atime &&
statbuf.st_mtime == statbuf2.st_mtime &&
statbuf.st_ctime == statbuf2.st_ctime)
{
fprintf(stderr, "%s: filename too long to tack on .Z\n", tempname);
if (unlink(ofname))
{
fprintf(stderr, "can't remove bad output file\n");
perror(ofname);
}
exit_code = 1;
return;
}
ofname[s-1] = (char)c;
}
if(!quiet)
fprintf(stderr, "%s: ", tempname);
remove_ofname = 1;
}
else
{
fdout = 1;
ofname[0] = '\0';
remove_ofname = 0;
}
if (do_decomp == 0)
compress(fdin, fdout);
else
decompress(fdin, fdout);
close(fdin);
if (fdout != 1 && close(fdout))
write_error();
if ( (bytes_in == 0) && (force == 0 ) )
{
if (remove_ofname)
{
if(!quiet)
fprintf(stderr, "No compression -- %s unchanged\n", ifname);
if (unlink(ofname)) /* Remove input file */
{
fprintf(stderr, "\nunlink error (ignored) ");
perror(ofname);
exit_code = 1;
}
remove_ofname = 0;
exit_code = 2;
}
}
else
if (zcat_flg == 0)
{
struct utimbuf timep;
if (!do_decomp && bytes_out >= bytes_in && (!force))
{/* No compression: remove file.Z */
if(!quiet)
fprintf(stderr, "No compression -- %s unchanged\n", ifname);
if (unlink(ofname))
{
fprintf(stderr, "unlink error (ignored) ");
perror(ofname);
}
remove_ofname = 0;
exit_code = 2;
}
else
{/* ***** Successful Compression ***** */
if(!quiet)
{
fprintf(stderr, " -- replaced with %s",ofname);
if (!do_decomp)
{
fprintf(stderr, " Compression: ");
prratio(stderr, bytes_in-bytes_out, bytes_in);
}
fprintf(stderr, "\n");
}
timep.actime = infstat.st_atime;
timep.modtime = infstat.st_mtime;
if (utime(ofname, &timep))
{
fprintf(stderr, "\nutime error (ignored) ");
perror(ofname);
exit_code = 1;
}
#ifndef AMIGA
if (chmod(ofname, infstat.st_mode & 07777)) /* Copy modes */
{
fprintf(stderr, "\nchmod error (ignored) ");
perror(ofname);
exit_code = 1;
}
#ifndef DOS
if (chown(ofname, infstat.st_uid, infstat.st_gid)) /* Copy ownership */
{
fprintf(stderr, "\nchown error (ignored) ");
perror(ofname);
exit_code = 1;
}
#endif
#endif
remove_ofname = 0;
if (unlink(ifname)) /* Remove input file */
{
fprintf(stderr, "\nunlink error (ignored) ");
perror(ifname);
exit_code = 1;
}
}
}
if (exit_code == -1)
exit_code = 0;
break;
default:
fprintf(stderr,"%s is not a directory or a regular file - ignored\n",
tempname);
break;
}
}
#endif // DO_ARGV
#ifdef RECURSIVE
void
compdir(dir)
REG3 char *dir;
{
#ifndef DIRENT
REG1 struct direct *dp;
#else
REG1 struct dirent *dp;
#endif
REG2 DIR *dirp;
char nbuf[MAXPATHLEN];
char *nptr = nbuf;
dirp = opendir(dir);
if (dirp == NULL)
{
printf("%s unreadable\n", dir); /* not stderr! */
return ;
}
/*
** WARNING: the following algorithm will occasionally cause
** compress to produce error warnings of the form "<filename>.Z
** already has .Z suffix - ignored". This occurs when the
** .Z output file is inserted into the directory below
** readdir's current pointer.
** These warnings are harmless but annoying. The alternative
** to allowing this would be to store the entire directory
** list in memory, then compress the entries in the stored
** list. Given the depth-first recursive algorithm used here,
** this could use up a tremendous amount of memory. I don't
** think it's worth it. -- Dave Mack
*/
while (dp = readdir(dirp))
{
if (dp->d_ino == 0)
continue;
if (strcmp(dp->d_name,".") == 0 || strcmp(dp->d_name,"..") == 0)
continue;
if ((strlen(dir)+strlen(dp->d_name)+1) < (MAXPATHLEN - 1))
{
strcpy(nbuf,dir);
strcat(nbuf,"/");
strcat(nbuf,dp->d_name);
comprexx(&nptr);
}
else
fprintf(stderr,"Pathname too long: %s/%s\n", dir, dp->d_name);
}
closedir(dirp);
return;
}
#endif
/*
* compress fdin to fdout
*
* Algorithm: use open addressing double hashing (no chaining) on the
* prefix code / next character combination. We do a variant of Knuth's
* algorithm D (vol. 3, sec. 6.4) along with G. Knott's relatively-prime
* secondary probe. Here, the modular division first probe gives way
* to a faster exclusive-or manipulation. Also do block compression with
* an adaptive reset, whereby the code table is cleared when the compression
* ratio decreases, but after the table fills. The variable-length output
* codes are re-sized at this point, and a special CLEAR code is generated
* for the decompressor. Late addition: construct the table according to
* file size for noticeable speed improvement on small files. Please direct
* questions about this implementation to ames!jaw.
*/
void
compress(fdin, fdout)
int fdin;
int fdout;
{
REG2 long hp;
REG3 int rpos;
#if REGISTERS >= 5
REG5 long fc;
#endif
REG6 int outbits;
REG7 int rlop;
REG8 int rsize;
REG9 int stcode;
REG10 code_int free_ent;
REG11 int boff;
REG12 int n_bits;
REG13 int ratio;
REG14 long checkpoint;
REG15 code_int extcode;
union
{
long code;
struct
{
char_type c;
unsigned short ent;
} e;
} fcode;
ratio = 0;
checkpoint = CHECK_GAP;
extcode = MAXCODE(n_bits = INIT_BITS)+1;
stcode = 1;
free_ent = FIRST;
memset(outbuf, 0, sizeof(outbuf));
bytes_out = 0; bytes_in = 0;
outbuf[0] = MAGIC_1;
outbuf[1] = MAGIC_2;
outbuf[2] = (char)(maxbits | block_mode);
boff = outbits = (3<<3);
fcode.code = 0;
clear_htab();
while ((rsize = read(fdin, inbuf, IBUFSIZ)) > 0)
{
if (bytes_in == 0)
{
fcode.e.ent = inbuf[0];
rpos = 1;
}
else
rpos = 0;
rlop = 0;
do
{
if (free_ent >= extcode && fcode.e.ent < FIRST)
{
if (n_bits < maxbits)
{
boff = outbits = (outbits-1)+((n_bits<<3)-
((outbits-boff-1+(n_bits<<3))%(n_bits<<3)));
if (++n_bits < maxbits)
extcode = MAXCODE(n_bits)+1;
else
extcode = MAXCODE(n_bits);
}
else
{
extcode = MAXCODE(16)+OBUFSIZ;
stcode = 0;
}
}
if (!stcode && bytes_in >= checkpoint && fcode.e.ent < FIRST)
{
REG1 long int rat;
checkpoint = bytes_in + CHECK_GAP;
if (bytes_in > 0x007fffff)
{ /* shift will overflow */
rat = (bytes_out+(outbits>>3)) >> 8;
if (rat == 0) /* Don't divide by zero */
rat = 0x7fffffff;
else
rat = bytes_in / rat;
}
else
rat = (bytes_in << 8) / (bytes_out+(outbits>>3)); /* 8 fractional bits */
if (rat >= ratio)
ratio = (int)rat;
else
{
ratio = 0;
clear_htab();
output(outbuf,outbits,CLEAR,n_bits);
boff = outbits = (outbits-1)+((n_bits<<3)-
((outbits-boff-1+(n_bits<<3))%(n_bits<<3)));
extcode = MAXCODE(n_bits = INIT_BITS)+1;
free_ent = FIRST;
stcode = 1;
}
}
if (outbits >= (OBUFSIZ<<3))
{
if (write(fdout, outbuf, OBUFSIZ) != OBUFSIZ)
write_error();
outbits -= (OBUFSIZ<<3);
boff = -(((OBUFSIZ<<3)-boff)%(n_bits<<3));
bytes_out += OBUFSIZ;
memcpy(outbuf, outbuf+OBUFSIZ, (outbits>>3)+1);
memset(outbuf+(outbits>>3)+1, '\0', OBUFSIZ);
}
{
REG1 int i;
i = rsize-rlop;
if ((code_int)i > extcode-free_ent) i = (int)(extcode-free_ent);
if (i > ((sizeof(outbuf) - 32)*8 - outbits)/n_bits)
i = ((sizeof(outbuf) - 32)*8 - outbits)/n_bits;
if (!stcode && (long)i > checkpoint-bytes_in)
i = (int)(checkpoint-bytes_in);
rlop += i;
bytes_in += i;
}
goto next;
hfound: fcode.e.ent = codetabof(hp);
next: if (rpos >= rlop)
goto endlop;
next2: fcode.e.c = inbuf[rpos++];
#ifndef FAST
{
REG1 code_int i;
#if REGISTERS >= 5
fc = fcode.code;
#else
# define fc fcode.code
#endif
hp = (((long)(fcode.e.c)) << (BITS-8)) ^ (long)(fcode.e.ent);
if ((i = htabof(hp)) == fc)
goto hfound;
if (i != -1)
{
REG4 long disp;
disp = (HSIZE - hp)-1; /* secondary hash (after G. Knott) */
do
{
if ((hp -= disp) < 0) hp += HSIZE;
if ((i = htabof(hp)) == fc)
goto hfound;
}
while (i != -1);
}
}
#else
{
REG1 long i;
REG4 long p;
#if REGISTERS >= 5
fc = fcode.code;
#else
# define fc fcode.code
#endif
hp = ((((long)(fcode.e.c)) << (HBITS-8)) ^ (long)(fcode.e.ent));
if ((i = htabof(hp)) == fc) goto hfound;
if (i == -1) goto out;
p = primetab[fcode.e.c];
lookup: hp = (hp+p)&HMASK;
if ((i = htabof(hp)) == fc) goto hfound;
if (i == -1) goto out;
hp = (hp+p)&HMASK;
if ((i = htabof(hp)) == fc) goto hfound;
if (i == -1) goto out;
hp = (hp+p)&HMASK;
if ((i = htabof(hp)) == fc) goto hfound;
if (i == -1) goto out;
goto lookup;
}
out: ;
#endif
output(outbuf,outbits,fcode.e.ent,n_bits);
{
#if REGISTERS < 5
# undef fc
REG1 long fc;
fc = fcode.code;
#endif
fcode.e.ent = fcode.e.c;
if (stcode)
{
codetabof(hp) = (unsigned short)free_ent++;
htabof(hp) = fc;
}
}
goto next;
endlop: if (fcode.e.ent >= FIRST && rpos < rsize)
goto next2;
if (rpos > rlop)
{
bytes_in += rpos-rlop;
rlop = rpos;
}
}
while (rlop < rsize);
}
if (rsize < 0)
read_error();
if (bytes_in > 0)
output(outbuf,outbits,fcode.e.ent,n_bits);
if (write(fdout, outbuf, (outbits+7)>>3) != (outbits+7)>>3)
write_error();
bytes_out += (outbits+7)>>3;
return;
}
/*
* Decompress stdin to stdout. This routine adapts to the codes in the
* file building the "string" table on-the-fly; requiring no table to
* be stored in the compressed file. The tables used herein are shared
* with those of the compress() routine. See the definitions above.
*/
/* CJH
void
decompress(fdin, fdout)
int fdin;
int fdout;
{
*/
int is_compressed(char *inbuffer, int insize)
{
REG2 char_type *stackp;
REG3 code_int code;
REG4 int finchar;
REG5 code_int oldcode;
REG6 code_int incode;
REG7 int inbits;
REG8 int posbits;
REG9 int outpos;
// CJH REG10 int insize;
REG11 int bitmask;
REG12 code_int free_ent;
REG13 code_int maxcode;
REG14 code_int maxmaxcode;
REG15 int n_bits;
REG16 int rsize;
bytes_in = 0;
bytes_out = 0;
// CJH insize = 0;
/* CJH
while (insize < 3 && (rsize = read(fdin, inbuf+insize, IBUFSIZ)) > 0)
insize += rsize;
*/
if (insize < sizeof(inbuf))
{
memcpy(inbuf, inbuffer, insize);
rsize = insize;
}
else
{
memcpy(inbuf, inbuffer, sizeof(inbuf));
rsize = insize = sizeof(inbuf);
}
if (insize < 3 || inbuf[0] != MAGIC_1 || inbuf[1] != MAGIC_2)
{
if (rsize < 0)
read_error();
if (insize > 0)
{
//fprintf(stderr, "%s: not in compressed format\n",
// (ifname[0] != '\0'? ifname : "stdin"));
exit_code = 1;
}
return 0;
}
maxbits = inbuf[2] & BIT_MASK;
block_mode = inbuf[2] & BLOCK_MODE;
maxmaxcode = MAXCODE(maxbits);
if (maxbits > BITS)
{
//fprintf(stderr,
// "%s: compressed with %d bits, can only handle %d bits\n",
// (*ifname != '\0' ? ifname : "stdin"), maxbits, BITS);
exit_code = 4;
return 0;
}
bytes_in = insize;
maxcode = MAXCODE(n_bits = INIT_BITS)-1;
bitmask = (1<<n_bits)-1;
oldcode = -1;
finchar = 0;
outpos = 0;
posbits = 3<<3;
free_ent = ((block_mode) ? FIRST : 256);
clear_tab_prefixof(); /* As above, initialize the first
256 entries in the table. */
for (code = 255 ; code >= 0 ; --code)
tab_suffixof(code) = (char_type)code;
do
{
resetbuf: ;
{
REG1 int i;
int e;
int o;
o = posbits >> 3;
e = o <= insize ? insize - o : 0;
for (i = 0 ; i < e ; ++i)
inbuf[i] = inbuf[i+o];
insize = e;
posbits = 0;
}
if (insize < sizeof(inbuf)-IBUFSIZ)
{
rsize = 0;
/*
if ((rsize = read(fdin, inbuf+insize, IBUFSIZ)) < 0)
{
printf("Read error!!\n");
read_error();
}
*/
insize += rsize;
}
inbits = ((rsize > 0) ? (insize - insize%n_bits)<<3 :
(insize<<3)-(n_bits-1));
while (inbits > posbits)
{
if (free_ent > maxcode)
{
posbits = ((posbits-1) + ((n_bits<<3) -
(posbits-1+(n_bits<<3))%(n_bits<<3)));
++n_bits;
if (n_bits == maxbits)
maxcode = maxmaxcode;
else
maxcode = MAXCODE(n_bits)-1;
bitmask = (1<<n_bits)-1;
goto resetbuf;
}
input(inbuf,posbits,code,n_bits,bitmask);
if (oldcode == -1)
{
if (code >= 256) {
//fprintf(stderr, "oldcode:-1 code:%i\n", (int)(code));
//fprintf(stderr, "uncompress: corrupt input\n");
//abort_compress();
return 0;
}
outbuf[outpos++] = (char_type)(finchar = (int)(oldcode = code));
continue;
}
if (code == CLEAR && block_mode)
{
clear_tab_prefixof();
free_ent = FIRST - 1;
posbits = ((posbits-1) + ((n_bits<<3) -
(posbits-1+(n_bits<<3))%(n_bits<<3)));
maxcode = MAXCODE(n_bits = INIT_BITS)-1;
bitmask = (1<<n_bits)-1;
goto resetbuf;
}
incode = code;
stackp = de_stack;
if (code >= free_ent) /* Special case for KwKwK string. */
{
if (code > free_ent)
{
//REG1 char_type *p;
posbits -= n_bits;
//p = &inbuf[posbits>>3];
//fprintf(stderr, "insize:%d posbits:%d inbuf:%02X %02X %02X %02X %02X (%d)\n", insize, posbits,
// p[-1],p[0],p[1],p[2],p[3], (posbits&07));
//fprintf(stderr, "uncompress: corrupt input\n");
//abort_compress();
return 0;
}
*--stackp = (char_type)finchar;
code = oldcode;
}
while ((cmp_code_int)code >= (cmp_code_int)256)
{ /* Generate output characters in reverse order */
*--stackp = tab_suffixof(code);
code = tab_prefixof(code);
}
*--stackp = (char_type)(finchar = tab_suffixof(code));
/* And put them out in forward order */
{
REG1 int i;
if (outpos+(i = (de_stack-stackp)) >= OBUFSIZ)
{
do
{
if (i > OBUFSIZ-outpos) i = OBUFSIZ-outpos;
if (i > 0)
{
memcpy(outbuf+outpos, stackp, i);
outpos += i;
}
if (outpos >= OBUFSIZ)
{
/*
if (write(fdout, outbuf, outpos) != outpos)
write_error();
*/
outpos = 0;
}
stackp+= i;
}
while ((i = (de_stack-stackp)) > 0);
}
else
{
memcpy(outbuf+outpos, stackp, i);
outpos += i;
}
}
if ((code = free_ent) < maxmaxcode) /* Generate the new entry. */
{
tab_prefixof(code) = (unsigned short)oldcode;
tab_suffixof(code) = (char_type)finchar;
free_ent = code+1;
}
oldcode = incode; /* Remember previous code. */
}
bytes_in += rsize;
}
while (rsize > 0);
/*
if (outpos > 0 && write(fdout, outbuf, outpos) != outpos)
write_error();
*/
return 1;
}
void
read_error()
{
fprintf(stderr, "\nread error on");
perror((ifname[0] != '\0') ? ifname : "stdin");
abort_compress();
}
void
write_error()
{
fprintf(stderr, "\nwrite error on");
perror((ofname[0] != '\0') ? ofname : "stdout");
abort_compress();
}
void
abort_compress()
{
if (remove_ofname)
unlink(ofname);
exit(1);
}
void
prratio(stream, num, den)
FILE *stream;
long int num;
long int den;
{
REG1 int q; /* Doesn't need to be long */
if (den > 0)
{
if (num > 214748L)
q = (int)(num/(den/10000L)); /* 2147483647/10000 */
else
q = (int)(10000L*num/den); /* Long calculations, though */
}
else
q = 10000;
if (q < 0)
{
putc('-', stream);
q = -q;
}
fprintf(stream, "%d.%02d%%", q / 100, q % 100);
}
void
about()
{
/* CJH */
//fprintf(stderr, "Compress version: %s, compiled: %s\n", version_id, COMPILE_DATE);
fprintf(stderr, "Compile options:\n ");
#if BYTEORDER == 4321 && NOALLIGN == 1
fprintf(stderr, "USE_BYTEORDER, ");
#endif
#ifdef FAST
fprintf(stderr, "FAST, ");
#endif
#ifdef vax
fprintf(stderr, "vax, ");
#endif
#ifdef DIRENT
fprintf(stderr,"DIRENT, ");
#endif
#ifdef SYSDIR
fprintf(stderr,"SYSDIR, ");
#endif
#ifdef NO_UCHAR
fprintf(stderr, "NO_UCHAR, ");
#endif
#ifdef SIGNED_COMPARE_SLOW
fprintf(stderr, "SIGNED_COMPARE_SLOW, ");
#endif
#ifdef MAXSEG_64K
fprintf(stderr, "MAXSEG_64K, ");
#endif
#ifdef DOS
fprintf(stderr, "DOS, ");
#endif
#ifdef DEBUG
fprintf(stderr, "DEBUG, ");
#endif
#ifdef LSTAT
fprintf(stderr, "LSTAT, ");
#endif
fprintf(stderr, "\n REGISTERS=%d IBUFSIZ=%d, OBUFSIZ=%d, BITS=%d\n",
REGISTERS, IBUFSIZ, OBUFSIZ, BITS);
fprintf(stderr, "\n\
Author version 4.2 (Speed improvement & source cleanup):\n\
Peter Jannesen (peter@ncs.nl)\n\
\n\
Author version 4.1 (Added recursive directory compress):\n\
Dave Mack (csu@alembic.acs.com)\n\
\n\
Authors version 4.0 (World release in 1985):\n\
Spencer W. Thomas, Jim McKie, Steve Davies,\n\
Ken Turkowski, James A. Woods, Joe Orost\n");
exit(0);
}
#!/usr/bin/env python
import sys
import ctypes
import ctypes.util
SIZE = 64
try:
    data = open(sys.argv[1], "rb").read(SIZE)
except:
    print "Usage: %s <input file>" % sys.argv[0]
    sys.exit(1)

# Load the shared library built from compress42.c and ask it whether the data looks compress'd.
comp = ctypes.cdll.LoadLibrary(ctypes.util.find_library("compress42"))

if comp.is_compressed(data, len(data)):
    print "%s is compress'd." % (sys.argv[1])
else:
    print "%s is not compress'd." % sys.argv[1]
AC_PREREQ([2.65])
AC_INIT()
AC_PROG_CC
AC_LANG(C)
AC_TYPE_SIZE_T
AC_FUNC_MALLOC
CFLAGS="-Wall -fPIC $CFLAGS"
if test "$(uname)" = "Darwin"
then
    SONAME="-install_name"
else
    SONAME="-soname"
fi
AC_SUBST(SONAME, $SONAME)
AC_CONFIG_FILES([Makefile])
AC_OUTPUT
LIBNAME=libtinfl.so
all: clean $(LIBNAME)
$(LIBNAME): tinfl.o
	$(CC) $(CFLAGS) -shared -Wl,$(SONAME),$(LIBNAME) tinfl.o -o $(LIBNAME) $(LDFLAGS)
tinfl.o:
	$(CC) $(CFLAGS) -c tinfl.c
install:
	install -D -m644 $(LIBNAME) $(DESTDIR)$(LIBDIR)/$(LIBNAME)
.PHONY: clean distclean
clean:
	rm -f *.o
distclean: clean
	rm -f $(LIBNAME)
deflate/inflate implementation library from http://code.google.com/p/miniz.
Used by the zlib plugin to validate potential zlib candidates.
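As a minimal sketch of the cheap header pre-check that full inflation complements (illustrative only; per
RFC 1950, which tinfl.c cites below, a zlib stream begins with a two-byte header whose big-endian value is
divisible by 31):

import sys

# Read the two-byte zlib header (CMF, FLG) and verify the FCHECK property.
hdr = open(sys.argv[1], 'rb').read(2)
if len(hdr) == 2 and ((ord(hdr[0]) << 8) | ord(hdr[1])) % 31 == 0:
    print "%s starts with a plausible zlib header" % sys.argv[1]
else:
    print "%s does not start with a zlib header" % sys.argv[1]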
#!/usr/bin/env python
import sys
import ctypes
import ctypes.util
from binwalk.common import BlockFile
class Foo:
    SIZE = 33*1024

    def __init__(self):
        self.tinfl = ctypes.cdll.LoadLibrary(ctypes.util.find_library("tinfl"))

    def _extractor(self, file_name):
        processed = 0
        inflated_data = ''
        fd = BlockFile(file_name, 'rb')
        fd.READ_BLOCK_SIZE = self.SIZE
        while processed < fd.length:
            (data, dlen) = fd.read_block()
            inflated_block = self.tinfl.inflate_block(data, dlen)
            if inflated_block:
                inflated_data += ctypes.c_char_p(inflated_block).value[0:4]
            else:
                break
            processed += dlen
        fd.close()
        print inflated_data
        print "%s inflated to %d bytes" % (file_name, len(inflated_data))

Foo()._extractor(sys.argv[1])
#!/usr/bin/env python
import sys
import ctypes
import ctypes.util
SIZE = 33*1024
try:
    data = open(sys.argv[1], "rb").read(SIZE)
except:
    print "Usage: %s <input file>" % sys.argv[0]
    sys.exit(1)

tinfl = ctypes.cdll.LoadLibrary(ctypes.util.find_library("tinfl"))

if tinfl.is_deflated(data, len(data), 1):
    print "%s is zlib compressed." % (sys.argv[1])
else:
    print "%s is not zlib compressed." % sys.argv[1]
/* tinfl.c v1.11 - public domain inflate with zlib header parsing/adler32 checking (inflate-only subset of miniz.c)
See "unlicense" statement at the end of this file.
Rich Geldreich <richgel99@gmail.com>, last updated May 20, 2011
Implements RFC 1950: http://www.ietf.org/rfc/rfc1950.txt and RFC 1951: http://www.ietf.org/rfc/rfc1951.txt
The entire decompressor coroutine is implemented in tinfl_decompress(). The other functions are optional high-level helpers.
*/
#ifndef TINFL_HEADER_INCLUDED
#define TINFL_HEADER_INCLUDED
#include <stdio.h>
#include <stdlib.h>
typedef unsigned char mz_uint8;
typedef signed short mz_int16;
typedef unsigned short mz_uint16;
typedef unsigned int mz_uint32;
typedef unsigned int mz_uint;
typedef unsigned long long mz_uint64;
#if defined(_M_IX86) || defined(_M_X64)
// Set MINIZ_USE_UNALIGNED_LOADS_AND_STORES to 1 if integer loads and stores to unaligned addresses are acceptable on the target platform (slightly faster).
#define MINIZ_USE_UNALIGNED_LOADS_AND_STORES 1
// Set MINIZ_LITTLE_ENDIAN to 1 if the processor is little endian.
#define MINIZ_LITTLE_ENDIAN 1
#endif
#if defined(_WIN64) || defined(__MINGW64__) || defined(_LP64) || defined(__LP64__)
// Set MINIZ_HAS_64BIT_REGISTERS to 1 if the processor has 64-bit general purpose registers (enables 64-bit bitbuffer in inflator)
#define MINIZ_HAS_64BIT_REGISTERS 1
#endif
// Works around MSVC's spammy "warning C4127: conditional expression is constant" message.
#ifdef _MSC_VER
#define MZ_MACRO_END while (0, 0)
#else
#define MZ_MACRO_END while (0)
#endif
// Decompression flags used by tinfl_decompress().
// TINFL_FLAG_PARSE_ZLIB_HEADER: If set, the input has a valid zlib header and ends with an adler32 checksum (it's a valid zlib stream). Otherwise, the input is a raw deflate stream.
// TINFL_FLAG_HAS_MORE_INPUT: If set, there are more input bytes available beyond the end of the supplied input buffer. If clear, the input buffer contains all remaining input.
// TINFL_FLAG_USING_NON_WRAPPING_OUTPUT_BUF: If set, the output buffer is large enough to hold the entire decompressed stream. If clear, the output buffer is at least the size of the dictionary (typically 32KB).
// TINFL_FLAG_COMPUTE_ADLER32: Force adler-32 checksum computation of the decompressed bytes.
enum
{
TINFL_FLAG_PARSE_ZLIB_HEADER = 1,
TINFL_FLAG_HAS_MORE_INPUT = 2,
TINFL_FLAG_USING_NON_WRAPPING_OUTPUT_BUF = 4,
TINFL_FLAG_COMPUTE_ADLER32 = 8
};
// High level decompression functions:
// tinfl_decompress_mem_to_heap() decompresses a block in memory to a heap block allocated via malloc().
// On entry:
// pSrc_buf, src_buf_len: Pointer and size of the Deflate or zlib source data to decompress.
// On return:
// Function returns a pointer to the decompressed data, or NULL on failure.
// *pOut_len will be set to the decompressed data's size, which could be larger than src_buf_len on uncompressible data.
// The caller must free() the returned block when it's no longer needed.
void *tinfl_decompress_mem_to_heap(const void *pSrc_buf, size_t src_buf_len, size_t *pOut_len, int flags);
// tinfl_decompress_mem_to_mem() decompresses a block in memory to another block in memory.
// Returns TINFL_DECOMPRESS_MEM_TO_MEM_FAILED on failure, or the number of bytes written on success.
#define TINFL_DECOMPRESS_MEM_TO_MEM_FAILED ((size_t)(-1))
size_t tinfl_decompress_mem_to_mem(void *pOut_buf, size_t out_buf_len, const void *pSrc_buf, size_t src_buf_len, int flags);
// tinfl_decompress_mem_to_callback() decompresses a block in memory to an internal 32KB buffer, and a user provided callback function will be called to flush the buffer.
// Returns 1 on success or 0 on failure.
typedef int (*tinfl_put_buf_func_ptr)(const void* pBuf, int len, void *pUser);
int tinfl_decompress_mem_to_callback(const void *pIn_buf, size_t *pIn_buf_size, tinfl_put_buf_func_ptr pPut_buf_func, void *pPut_buf_user, int flags);
// Checks to see if the first block of data in in_buf is valid zlib compressed data.
// Returns 1 if valid, 0 if invalid.
int is_valid_zlib_data(char *in_buf, size_t in_buf_size);
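// Illustrative usage sketch (comment only, not shipped code) of the heap helper
// tinfl_decompress_mem_to_heap() declared above:
//
//   size_t out_len;
//   void *p = tinfl_decompress_mem_to_heap(src_buf, src_len, &out_len,
//                                          TINFL_FLAG_PARSE_ZLIB_HEADER);
//   if (p) { /* out_len bytes of decompressed data at p */ free(p); }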
struct tinfl_decompressor_tag; typedef struct tinfl_decompressor_tag tinfl_decompressor;
// Max size of LZ dictionary.
#define TINFL_LZ_DICT_SIZE 32768
// Return status.
typedef enum
{
TINFL_STATUS_BAD_PARAM = -3,
TINFL_STATUS_ADLER32_MISMATCH = -2,
TINFL_STATUS_FAILED = -1,
TINFL_STATUS_DONE = 0,
TINFL_STATUS_NEEDS_MORE_INPUT = 1,
TINFL_STATUS_HAS_MORE_OUTPUT = 2
} tinfl_status;
// Initializes the decompressor to its initial state.
#define tinfl_init(r) do { (r)->m_state = 0; } MZ_MACRO_END
#define tinfl_get_adler32(r) (r)->m_check_adler32
// Main low-level decompressor coroutine function. This is the only function actually needed for decompression. All the other functions are just high-level helpers for improved usability.
// This is a universal API, i.e. it can be used as a building block to build any desired higher level decompression API. In the limit case, it can be called once per every byte input or output.
tinfl_status tinfl_decompress(tinfl_decompressor *r, const mz_uint8 *pIn_buf_next, size_t *pIn_buf_size, mz_uint8 *pOut_buf_start, mz_uint8 *pOut_buf_next, size_t *pOut_buf_size, const mz_uint32 decomp_flags);
// Internal/private bits follow.
enum
{
TINFL_MAX_HUFF_TABLES = 3, TINFL_MAX_HUFF_SYMBOLS_0 = 288, TINFL_MAX_HUFF_SYMBOLS_1 = 32, TINFL_MAX_HUFF_SYMBOLS_2 = 19,
TINFL_FAST_LOOKUP_BITS = 10, TINFL_FAST_LOOKUP_SIZE = 1 << TINFL_FAST_LOOKUP_BITS
};
typedef struct
{
mz_uint8 m_code_size[TINFL_MAX_HUFF_SYMBOLS_0];
mz_int16 m_look_up[TINFL_FAST_LOOKUP_SIZE], m_tree[TINFL_MAX_HUFF_SYMBOLS_0 * 2];
} tinfl_huff_table;
#if MINIZ_HAS_64BIT_REGISTERS
#define TINFL_USE_64BIT_BITBUF 1
#endif
#if TINFL_USE_64BIT_BITBUF
typedef mz_uint64 tinfl_bit_buf_t;
#define TINFL_BITBUF_SIZE (64)
#else
typedef mz_uint32 tinfl_bit_buf_t;
#define TINFL_BITBUF_SIZE (32)
#endif
struct tinfl_decompressor_tag
{
mz_uint32 m_state, m_num_bits, m_zhdr0, m_zhdr1, m_z_adler32, m_final, m_type, m_check_adler32, m_dist, m_counter, m_num_extra, m_table_sizes[TINFL_MAX_HUFF_TABLES];
tinfl_bit_buf_t m_bit_buf;
size_t m_dist_from_out_buf_start;
tinfl_huff_table m_tables[TINFL_MAX_HUFF_TABLES];
mz_uint8 m_raw_header[4], m_len_codes[TINFL_MAX_HUFF_SYMBOLS_0 + TINFL_MAX_HUFF_SYMBOLS_1 + 137];
};
#endif // #ifdef TINFL_HEADER_INCLUDED
// ------------------- End of Header: Implementation follows. (If you only want the header, define MINIZ_HEADER_FILE_ONLY.)
#ifndef TINFL_HEADER_FILE_ONLY
#include <string.h>
// MZ_MALLOC, etc. are only used by the optional high-level helper functions.
#ifdef MINIZ_NO_MALLOC
#define MZ_MALLOC(x) NULL
#define MZ_FREE(x) x, ((void)0)
#define MZ_REALLOC(p, x) NULL
#else
#define MZ_MALLOC(x) malloc(x)
#define MZ_FREE(x) free(x)
#define MZ_REALLOC(p, x) realloc(p, x)
#endif
#define MZ_MAX(a,b) (((a)>(b))?(a):(b))
#define MZ_MIN(a,b) (((a)<(b))?(a):(b))
#define MZ_CLEAR_OBJ(obj) memset(&(obj), 0, sizeof(obj))
#if MINIZ_USE_UNALIGNED_LOADS_AND_STORES && MINIZ_LITTLE_ENDIAN
#define MZ_READ_LE16(p) *((const mz_uint16 *)(p))
#define MZ_READ_LE32(p) *((const mz_uint32 *)(p))
#else
#define MZ_READ_LE16(p) ((mz_uint32)(((const mz_uint8 *)(p))[0]) | ((mz_uint32)(((const mz_uint8 *)(p))[1]) << 8U))
#define MZ_READ_LE32(p) ((mz_uint32)(((const mz_uint8 *)(p))[0]) | ((mz_uint32)(((const mz_uint8 *)(p))[1]) << 8U) | ((mz_uint32)(((const mz_uint8 *)(p))[2]) << 16U) | ((mz_uint32)(((const mz_uint8 *)(p))[3]) << 24U))
#endif
#define TINFL_MEMCPY(d, s, l) memcpy(d, s, l)
#define TINFL_MEMSET(p, c, l) memset(p, c, l)
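// The TINFL_CR_* macros below implement a switch-based coroutine: r->m_state records
// where the decompressor last yielded, TINFL_CR_RETURN() emits a case label at the
// yield point, and TINFL_CR_BEGIN's switch jumps straight back to it on re-entry.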
#define TINFL_CR_BEGIN switch(r->m_state) { case 0:
#define TINFL_CR_RETURN(state_index, result) do { status = result; r->m_state = state_index; goto common_exit; case state_index:; } MZ_MACRO_END
#define TINFL_CR_RETURN_FOREVER(state_index, result) do { for ( ; ; ) { TINFL_CR_RETURN(state_index, result); } } MZ_MACRO_END
#define TINFL_CR_FINISH }
// TODO: If the caller has indicated that there's no more input, and we attempt to read beyond the input buf, then something is wrong with the input because the inflator never
// reads ahead more than it needs to. Currently TINFL_GET_BYTE() pads the end of the stream with 0's in this scenario.
#define TINFL_GET_BYTE(state_index, c) do { \
if (pIn_buf_cur >= pIn_buf_end) { \
for ( ; ; ) { \
if (decomp_flags & TINFL_FLAG_HAS_MORE_INPUT) { \
TINFL_CR_RETURN(state_index, TINFL_STATUS_NEEDS_MORE_INPUT); \
if (pIn_buf_cur < pIn_buf_end) { \
c = *pIn_buf_cur++; \
break; \
} \
} else { \
c = 0; \
break; \
} \
} \
} else c = *pIn_buf_cur++; } MZ_MACRO_END
#define TINFL_NEED_BITS(state_index, n) do { mz_uint c; TINFL_GET_BYTE(state_index, c); bit_buf |= (((tinfl_bit_buf_t)c) << num_bits); num_bits += 8; } while (num_bits < (mz_uint)(n))
#define TINFL_SKIP_BITS(state_index, n) do { if (num_bits < (mz_uint)(n)) { TINFL_NEED_BITS(state_index, n); } bit_buf >>= (n); num_bits -= (n); } MZ_MACRO_END
#define TINFL_GET_BITS(state_index, b, n) do { if (num_bits < (mz_uint)(n)) { TINFL_NEED_BITS(state_index, n); } b = bit_buf & ((1 << (n)) - 1); bit_buf >>= (n); num_bits -= (n); } MZ_MACRO_END
// TINFL_HUFF_BITBUF_FILL() is only used rarely, when the number of bytes remaining in the input buffer falls below 2.
// It reads just enough bytes from the input stream that are needed to decode the next Huffman code (and absolutely no more). It works by trying to fully decode a
// Huffman code by using whatever bits are currently present in the bit buffer. If this fails, it reads another byte, and tries again until it succeeds or until the
// bit buffer contains >=15 bits (deflate's max. Huffman code size).
#define TINFL_HUFF_BITBUF_FILL(state_index, pHuff) \
do { \
temp = (pHuff)->m_look_up[bit_buf & (TINFL_FAST_LOOKUP_SIZE - 1)]; \
if (temp >= 0) { \
code_len = temp >> 9; \
if ((code_len) && (num_bits >= code_len)) \
break; \
} else if (num_bits > TINFL_FAST_LOOKUP_BITS) { \
code_len = TINFL_FAST_LOOKUP_BITS; \
do { \
temp = (pHuff)->m_tree[~temp + ((bit_buf >> code_len++) & 1)]; \
} while ((temp < 0) && (num_bits >= (code_len + 1))); if (temp >= 0) break; \
} TINFL_GET_BYTE(state_index, c); bit_buf |= (((tinfl_bit_buf_t)c) << num_bits); num_bits += 8; \
} while (num_bits < 15);
// TINFL_HUFF_DECODE() decodes the next Huffman coded symbol. It's more complex than you would initially expect because the zlib API expects the decompressor to never read
// beyond the final byte of the deflate stream. (In other words, when this macro wants to read another byte from the input, it REALLY needs another byte in order to fully
// decode the next Huffman code.) Handling this properly is particularly important on raw deflate (non-zlib) streams, which aren't followed by a byte aligned adler-32.
// The slow path is only executed at the very end of the input buffer.
#define TINFL_HUFF_DECODE(state_index, sym, pHuff) do { \
int temp; mz_uint code_len, c; \
if (num_bits < 15) { \
if ((pIn_buf_end - pIn_buf_cur) < 2) { \
TINFL_HUFF_BITBUF_FILL(state_index, pHuff); \
} else { \
bit_buf |= (((tinfl_bit_buf_t)pIn_buf_cur[0]) << num_bits) | (((tinfl_bit_buf_t)pIn_buf_cur[1]) << (num_bits + 8)); pIn_buf_cur += 2; num_bits += 16; \
} \
} \
if ((temp = (pHuff)->m_look_up[bit_buf & (TINFL_FAST_LOOKUP_SIZE - 1)]) >= 0) \
code_len = temp >> 9, temp &= 511; \
else { \
code_len = TINFL_FAST_LOOKUP_BITS; do { temp = (pHuff)->m_tree[~temp + ((bit_buf >> code_len++) & 1)]; } while (temp < 0); \
} sym = temp; bit_buf >>= code_len; num_bits -= code_len; } MZ_MACRO_END
tinfl_status tinfl_decompress(tinfl_decompressor *r, const mz_uint8 *pIn_buf_next, size_t *pIn_buf_size, mz_uint8 *pOut_buf_start, mz_uint8 *pOut_buf_next, size_t *pOut_buf_size, const mz_uint32 decomp_flags)
{
static const int s_length_base[31] = { 3,4,5,6,7,8,9,10,11,13, 15,17,19,23,27,31,35,43,51,59, 67,83,99,115,131,163,195,227,258,0,0 };
static const int s_length_extra[31]= { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
static const int s_dist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193, 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
static const int s_dist_extra[32] = { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
static const mz_uint8 s_length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
static const int s_min_table_sizes[3] = { 257, 1, 4 };
tinfl_status status = TINFL_STATUS_FAILED; mz_uint32 num_bits, dist, counter, num_extra; tinfl_bit_buf_t bit_buf;
const mz_uint8 *pIn_buf_cur = pIn_buf_next, *const pIn_buf_end = pIn_buf_next + *pIn_buf_size;
mz_uint8 *pOut_buf_cur = pOut_buf_next, *const pOut_buf_end = pOut_buf_next + *pOut_buf_size;
size_t out_buf_size_mask = (decomp_flags & TINFL_FLAG_USING_NON_WRAPPING_OUTPUT_BUF) ? (size_t)-1 : ((pOut_buf_next - pOut_buf_start) + *pOut_buf_size) - 1, dist_from_out_buf_start;
// Ensure the output buffer's size is a power of 2, unless the output buffer is large enough to hold the entire output file (in which case it doesn't matter).
if (((out_buf_size_mask + 1) & out_buf_size_mask) || (pOut_buf_next < pOut_buf_start)) { *pIn_buf_size = *pOut_buf_size = 0; return TINFL_STATUS_BAD_PARAM; }
num_bits = r->m_num_bits; bit_buf = r->m_bit_buf; dist = r->m_dist; counter = r->m_counter; num_extra = r->m_num_extra; dist_from_out_buf_start = r->m_dist_from_out_buf_start;
TINFL_CR_BEGIN
bit_buf = num_bits = dist = counter = num_extra = r->m_zhdr0 = r->m_zhdr1 = 0; r->m_z_adler32 = r->m_check_adler32 = 1;
if (decomp_flags & TINFL_FLAG_PARSE_ZLIB_HEADER)
{
TINFL_GET_BYTE(1, r->m_zhdr0); TINFL_GET_BYTE(2, r->m_zhdr1);
counter = (((r->m_zhdr0 * 256 + r->m_zhdr1) % 31 != 0) || (r->m_zhdr1 & 32) || ((r->m_zhdr0 & 15) != 8));
if (!(decomp_flags & TINFL_FLAG_USING_NON_WRAPPING_OUTPUT_BUF)) counter |= (((1U << (8U + (r->m_zhdr0 >> 4))) > 32768U) || ((out_buf_size_mask + 1) < (size_t)(1U << (8U + (r->m_zhdr0 >> 4)))));
if (counter) { TINFL_CR_RETURN_FOREVER(36, TINFL_STATUS_FAILED); }
}
do
{
TINFL_GET_BITS(3, r->m_final, 3); r->m_type = r->m_final >> 1;
if (r->m_type == 0)
{
TINFL_SKIP_BITS(5, num_bits & 7);
for (counter = 0; counter < 4; ++counter) { if (num_bits) TINFL_GET_BITS(6, r->m_raw_header[counter], 8); else TINFL_GET_BYTE(7, r->m_raw_header[counter]); }
if ((counter = (r->m_raw_header[0] | (r->m_raw_header[1] << 8))) != (mz_uint)(0xFFFF ^ (r->m_raw_header[2] | (r->m_raw_header[3] << 8)))) { TINFL_CR_RETURN_FOREVER(39, TINFL_STATUS_FAILED); }
while ((counter) && (num_bits))
{
TINFL_GET_BITS(51, dist, 8);
while (pOut_buf_cur >= pOut_buf_end) { TINFL_CR_RETURN(52, TINFL_STATUS_HAS_MORE_OUTPUT); }
*pOut_buf_cur++ = (mz_uint8)dist;
counter--;
}
while (counter)
{
size_t n; while (pOut_buf_cur >= pOut_buf_end) { TINFL_CR_RETURN(9, TINFL_STATUS_HAS_MORE_OUTPUT); }
while (pIn_buf_cur >= pIn_buf_end)
{
if (decomp_flags & TINFL_FLAG_HAS_MORE_INPUT)
{
TINFL_CR_RETURN(38, TINFL_STATUS_NEEDS_MORE_INPUT);
}
else
{
TINFL_CR_RETURN_FOREVER(40, TINFL_STATUS_FAILED);
}
}
n = MZ_MIN(MZ_MIN((size_t)(pOut_buf_end - pOut_buf_cur), (size_t)(pIn_buf_end - pIn_buf_cur)), counter);
TINFL_MEMCPY(pOut_buf_cur, pIn_buf_cur, n); pIn_buf_cur += n; pOut_buf_cur += n; counter -= (mz_uint)n;
}
}
else if (r->m_type == 3)
{
TINFL_CR_RETURN_FOREVER(10, TINFL_STATUS_FAILED);
}
else
{
if (r->m_type == 1)
{
mz_uint8 *p = r->m_tables[0].m_code_size; mz_uint i;
r->m_table_sizes[0] = 288; r->m_table_sizes[1] = 32; TINFL_MEMSET(r->m_tables[1].m_code_size, 5, 32);
for ( i = 0; i <= 143; ++i) *p++ = 8; for ( ; i <= 255; ++i) *p++ = 9; for ( ; i <= 279; ++i) *p++ = 7; for ( ; i <= 287; ++i) *p++ = 8;
}
else
{
for (counter = 0; counter < 3; counter++) { TINFL_GET_BITS(11, r->m_table_sizes[counter], "\05\05\04"[counter]); r->m_table_sizes[counter] += s_min_table_sizes[counter]; }
MZ_CLEAR_OBJ(r->m_tables[2].m_code_size); for (counter = 0; counter < r->m_table_sizes[2]; counter++) { mz_uint s; TINFL_GET_BITS(14, s, 3); r->m_tables[2].m_code_size[s_length_dezigzag[counter]] = (mz_uint8)s; }
r->m_table_sizes[2] = 19;
}
for ( ; (int)r->m_type >= 0; r->m_type--)
{
int tree_next, tree_cur; tinfl_huff_table *pTable;
mz_uint i, j, used_syms, total, sym_index, next_code[17], total_syms[16]; pTable = &r->m_tables[r->m_type]; MZ_CLEAR_OBJ(total_syms); MZ_CLEAR_OBJ(pTable->m_look_up); MZ_CLEAR_OBJ(pTable->m_tree);
for (i = 0; i < r->m_table_sizes[r->m_type]; ++i) total_syms[pTable->m_code_size[i]]++;
used_syms = 0, total = 0; next_code[0] = next_code[1] = 0;
for (i = 1; i <= 15; ++i) { used_syms += total_syms[i]; next_code[i + 1] = (total = ((total + total_syms[i]) << 1)); }
if ((65536 != total) && (used_syms > 1))
{
TINFL_CR_RETURN_FOREVER(35, TINFL_STATUS_FAILED);
}
for (tree_next = -1, sym_index = 0; sym_index < r->m_table_sizes[r->m_type]; ++sym_index)
{
mz_uint rev_code = 0, l, cur_code, code_size = pTable->m_code_size[sym_index]; if (!code_size) continue;
cur_code = next_code[code_size]++; for (l = code_size; l > 0; l--, cur_code >>= 1) rev_code = (rev_code << 1) | (cur_code & 1);
if (code_size <= TINFL_FAST_LOOKUP_BITS) { mz_int16 k = (mz_int16)((code_size << 9) | sym_index); while (rev_code < TINFL_FAST_LOOKUP_SIZE) { pTable->m_look_up[rev_code] = k; rev_code += (1 << code_size); } continue; }
if (0 == (tree_cur = pTable->m_look_up[rev_code & (TINFL_FAST_LOOKUP_SIZE - 1)])) { pTable->m_look_up[rev_code & (TINFL_FAST_LOOKUP_SIZE - 1)] = (mz_int16)tree_next; tree_cur = tree_next; tree_next -= 2; }
rev_code >>= (TINFL_FAST_LOOKUP_BITS - 1);
for (j = code_size; j > (TINFL_FAST_LOOKUP_BITS + 1); j--)
{
tree_cur -= ((rev_code >>= 1) & 1);
if (!pTable->m_tree[-tree_cur - 1]) { pTable->m_tree[-tree_cur - 1] = (mz_int16)tree_next; tree_cur = tree_next; tree_next -= 2; } else tree_cur = pTable->m_tree[-tree_cur - 1];
}
tree_cur -= ((rev_code >>= 1) & 1); pTable->m_tree[-tree_cur - 1] = (mz_int16)sym_index;
}
if (r->m_type == 2)
{
for (counter = 0; counter < (r->m_table_sizes[0] + r->m_table_sizes[1]); )
{
mz_uint s; TINFL_HUFF_DECODE(16, dist, &r->m_tables[2]); if (dist < 16) { r->m_len_codes[counter++] = (mz_uint8)dist; continue; }
if ((dist == 16) && (!counter))
{
TINFL_CR_RETURN_FOREVER(17, TINFL_STATUS_FAILED);
}
num_extra = "\02\03\07"[dist - 16]; TINFL_GET_BITS(18, s, num_extra); s += "\03\03\013"[dist - 16];
TINFL_MEMSET(r->m_len_codes + counter, (dist == 16) ? r->m_len_codes[counter - 1] : 0, s); counter += s;
}
if ((r->m_table_sizes[0] + r->m_table_sizes[1]) != counter)
{
TINFL_CR_RETURN_FOREVER(21, TINFL_STATUS_FAILED);
}
TINFL_MEMCPY(r->m_tables[0].m_code_size, r->m_len_codes, r->m_table_sizes[0]); TINFL_MEMCPY(r->m_tables[1].m_code_size, r->m_len_codes + r->m_table_sizes[0], r->m_table_sizes[1]);
}
}
for ( ; ; )
{
mz_uint8 *pSrc;
for ( ; ; )
{
if (((pIn_buf_end - pIn_buf_cur) < 4) || ((pOut_buf_end - pOut_buf_cur) < 2))
{
TINFL_HUFF_DECODE(23, counter, &r->m_tables[0]);
if (counter >= 256)
break;
while (pOut_buf_cur >= pOut_buf_end) { TINFL_CR_RETURN(24, TINFL_STATUS_HAS_MORE_OUTPUT); }
*pOut_buf_cur++ = (mz_uint8)counter;
}
else
{
int sym2; mz_uint code_len;
#if TINFL_USE_64BIT_BITBUF
if (num_bits < 30) { bit_buf |= (((tinfl_bit_buf_t)MZ_READ_LE32(pIn_buf_cur)) << num_bits); pIn_buf_cur += 4; num_bits += 32; }
#else
if (num_bits < 15) { bit_buf |= (((tinfl_bit_buf_t)MZ_READ_LE16(pIn_buf_cur)) << num_bits); pIn_buf_cur += 2; num_bits += 16; }
#endif
if ((sym2 = r->m_tables[0].m_look_up[bit_buf & (TINFL_FAST_LOOKUP_SIZE - 1)]) >= 0)
code_len = sym2 >> 9;
else
{
code_len = TINFL_FAST_LOOKUP_BITS; do { sym2 = r->m_tables[0].m_tree[~sym2 + ((bit_buf >> code_len++) & 1)]; } while (sym2 < 0);
}
counter = sym2; bit_buf >>= code_len; num_bits -= code_len;
if (counter & 256)
break;
#if !TINFL_USE_64BIT_BITBUF
if (num_bits < 15) { bit_buf |= (((tinfl_bit_buf_t)MZ_READ_LE16(pIn_buf_cur)) << num_bits); pIn_buf_cur += 2; num_bits += 16; }
#endif
if ((sym2 = r->m_tables[0].m_look_up[bit_buf & (TINFL_FAST_LOOKUP_SIZE - 1)]) >= 0)
code_len = sym2 >> 9;
else
{
code_len = TINFL_FAST_LOOKUP_BITS; do { sym2 = r->m_tables[0].m_tree[~sym2 + ((bit_buf >> code_len++) & 1)]; } while (sym2 < 0);
}
bit_buf >>= code_len; num_bits -= code_len;
pOut_buf_cur[0] = (mz_uint8)counter;
if (sym2 & 256)
{
pOut_buf_cur++;
counter = sym2;
break;
}
pOut_buf_cur[1] = (mz_uint8)sym2;
pOut_buf_cur += 2;
}
}
if ((counter &= 511) == 256) break;
num_extra = s_length_extra[counter - 257]; counter = s_length_base[counter - 257];
if (num_extra) { mz_uint extra_bits; TINFL_GET_BITS(25, extra_bits, num_extra); counter += extra_bits; }
TINFL_HUFF_DECODE(26, dist, &r->m_tables[1]);
num_extra = s_dist_extra[dist]; dist = s_dist_base[dist];
if (num_extra) { mz_uint extra_bits; TINFL_GET_BITS(27, extra_bits, num_extra); dist += extra_bits; }
dist_from_out_buf_start = pOut_buf_cur - pOut_buf_start;
if ((dist > dist_from_out_buf_start) && (decomp_flags & TINFL_FLAG_USING_NON_WRAPPING_OUTPUT_BUF))
{
TINFL_CR_RETURN_FOREVER(37, TINFL_STATUS_FAILED);
}
pSrc = pOut_buf_start + ((dist_from_out_buf_start - dist) & out_buf_size_mask);
if ((MZ_MAX(pOut_buf_cur, pSrc) + counter) > pOut_buf_end)
{
while (counter--)
{
while (pOut_buf_cur >= pOut_buf_end) { TINFL_CR_RETURN(53, TINFL_STATUS_HAS_MORE_OUTPUT); }
*pOut_buf_cur++ = pOut_buf_start[(dist_from_out_buf_start++ - dist) & out_buf_size_mask];
}
continue;
}
#if MINIZ_USE_UNALIGNED_LOADS_AND_STORES
else if ((counter >= 9) && (counter <= dist))
{
const mz_uint8 *pSrc_end = pSrc + (counter & ~7);
do
{
((mz_uint32 *)pOut_buf_cur)[0] = ((const mz_uint32 *)pSrc)[0];
((mz_uint32 *)pOut_buf_cur)[1] = ((const mz_uint32 *)pSrc)[1];
pOut_buf_cur += 8;
} while ((pSrc += 8) < pSrc_end);
if ((counter &= 7) < 3)
{
if (counter)
{
pOut_buf_cur[0] = pSrc[0];
if (counter > 1)
pOut_buf_cur[1] = pSrc[1];
pOut_buf_cur += counter;
}
continue;
}
}
#endif
do
{
pOut_buf_cur[0] = pSrc[0];
pOut_buf_cur[1] = pSrc[1];
pOut_buf_cur[2] = pSrc[2];
pOut_buf_cur += 3; pSrc += 3;
} while ((int)(counter -= 3) > 2);
if ((int)counter > 0)
{
pOut_buf_cur[0] = pSrc[0];
if ((int)counter > 1)
pOut_buf_cur[1] = pSrc[1];
pOut_buf_cur += counter;
}
}
}
} while (!(r->m_final & 1));
if (decomp_flags & TINFL_FLAG_PARSE_ZLIB_HEADER)
{
TINFL_SKIP_BITS(32, num_bits & 7); for (counter = 0; counter < 4; ++counter) { mz_uint s; if (num_bits) TINFL_GET_BITS(41, s, 8); else TINFL_GET_BYTE(42, s); r->m_z_adler32 = (r->m_z_adler32 << 8) | s; }
}
TINFL_CR_RETURN_FOREVER(34, TINFL_STATUS_DONE);
TINFL_CR_FINISH
common_exit:
r->m_num_bits = num_bits; r->m_bit_buf = bit_buf; r->m_dist = dist; r->m_counter = counter; r->m_num_extra = num_extra; r->m_dist_from_out_buf_start = dist_from_out_buf_start;
*pIn_buf_size = pIn_buf_cur - pIn_buf_next; *pOut_buf_size = pOut_buf_cur - pOut_buf_next;
if ((decomp_flags & (TINFL_FLAG_PARSE_ZLIB_HEADER | TINFL_FLAG_COMPUTE_ADLER32)) && (status >= 0))
{
const mz_uint8 *ptr = pOut_buf_next; size_t buf_len = *pOut_buf_size;
mz_uint32 i, s1 = r->m_check_adler32 & 0xffff, s2 = r->m_check_adler32 >> 16; size_t block_len = buf_len % 5552;
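// 5552 is the largest n for which 255*n*(n+1)/2 + (n+1)*65520 fits in 32 bits, i.e. the
// most bytes that can be accumulated before s1 and s2 must be reduced modulo 65521.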
while (buf_len)
{
for (i = 0; i + 7 < block_len; i += 8, ptr += 8)
{
s1 += ptr[0], s2 += s1; s1 += ptr[1], s2 += s1; s1 += ptr[2], s2 += s1; s1 += ptr[3], s2 += s1;
s1 += ptr[4], s2 += s1; s1 += ptr[5], s2 += s1; s1 += ptr[6], s2 += s1; s1 += ptr[7], s2 += s1;
}
for ( ; i < block_len; ++i) s1 += *ptr++, s2 += s1;
s1 %= 65521U, s2 %= 65521U; buf_len -= block_len; block_len = 5552;
}
r->m_check_adler32 = (s2 << 16) + s1; if ((status == TINFL_STATUS_DONE) && (decomp_flags & TINFL_FLAG_PARSE_ZLIB_HEADER) && (r->m_check_adler32 != r->m_z_adler32)) status = TINFL_STATUS_ADLER32_MISMATCH;
}
return status;
}
// Higher level helper functions.
void *tinfl_decompress_mem_to_heap(const void *pSrc_buf, size_t src_buf_len, size_t *pOut_len, int flags)
{
tinfl_decompressor decomp; void *pBuf = NULL, *pNew_buf; size_t src_buf_ofs = 0, out_buf_capacity = 0;
*pOut_len = 0;
tinfl_init(&decomp);
for ( ; ; )
{
size_t src_buf_size = src_buf_len - src_buf_ofs, dst_buf_size = out_buf_capacity - *pOut_len, new_out_buf_capacity;
tinfl_status status = tinfl_decompress(&decomp, (const mz_uint8*)pSrc_buf + src_buf_ofs, &src_buf_size, (mz_uint8*)pBuf, pBuf ? (mz_uint8*)pBuf + *pOut_len : NULL, &dst_buf_size,
(flags & ~TINFL_FLAG_HAS_MORE_INPUT) | TINFL_FLAG_USING_NON_WRAPPING_OUTPUT_BUF);
if ((status < 0) || (status == TINFL_STATUS_NEEDS_MORE_INPUT))
{
MZ_FREE(pBuf); *pOut_len = 0; return NULL;
}
src_buf_ofs += src_buf_size;
*pOut_len += dst_buf_size;
if (status == TINFL_STATUS_DONE) break;
new_out_buf_capacity = out_buf_capacity * 2; if (new_out_buf_capacity < 128) new_out_buf_capacity = 128;
pNew_buf = MZ_REALLOC(pBuf, new_out_buf_capacity);
if (!pNew_buf)
{
MZ_FREE(pBuf); *pOut_len = 0; return NULL;
}
pBuf = pNew_buf; out_buf_capacity = new_out_buf_capacity;
}
return pBuf;
}
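// Example (an illustrative sketch; 'src' and 'src_len' are hypothetical):
//
//   size_t out_len = 0;
//   void *p = tinfl_decompress_mem_to_heap(src, src_len, &out_len, TINFL_FLAG_PARSE_ZLIB_HEADER);
//   if (p) { /* out_len bytes of decompressed data */ free(p); }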
size_t tinfl_decompress_mem_to_mem(void *pOut_buf, size_t out_buf_len, const void *pSrc_buf, size_t src_buf_len, int flags)
{
tinfl_decompressor decomp; tinfl_status status; tinfl_init(&decomp);
status = tinfl_decompress(&decomp, (const mz_uint8*)pSrc_buf, &src_buf_len, (mz_uint8*)pOut_buf, (mz_uint8*)pOut_buf, &out_buf_len, (flags & ~TINFL_FLAG_HAS_MORE_INPUT) | TINFL_FLAG_USING_NON_WRAPPING_OUTPUT_BUF);
return (status != TINFL_STATUS_DONE) ? TINFL_DECOMPRESS_MEM_TO_MEM_FAILED : out_buf_len;
}
int tinfl_decompress_mem_to_callback(const void *pIn_buf, size_t *pIn_buf_size, tinfl_put_buf_func_ptr pPut_buf_func, void *pPut_buf_user, int flags)
{
int result = 0;
tinfl_decompressor decomp;
mz_uint8 *pDict = (mz_uint8*)MZ_MALLOC(TINFL_LZ_DICT_SIZE); size_t in_buf_ofs = 0, dict_ofs = 0;
if (!pDict)
return TINFL_STATUS_FAILED;
tinfl_init(&decomp);
for ( ; ; )
{
size_t in_buf_size = *pIn_buf_size - in_buf_ofs, dst_buf_size = TINFL_LZ_DICT_SIZE - dict_ofs;
tinfl_status status = tinfl_decompress(&decomp, (const mz_uint8*)pIn_buf + in_buf_ofs, &in_buf_size, pDict, pDict + dict_ofs, &dst_buf_size,
(flags & ~(TINFL_FLAG_HAS_MORE_INPUT | TINFL_FLAG_USING_NON_WRAPPING_OUTPUT_BUF)));
in_buf_ofs += in_buf_size;
if ((dst_buf_size) && (!(*pPut_buf_func)(pDict + dict_ofs, (int)dst_buf_size, pPut_buf_user)))
break;
if (status != TINFL_STATUS_HAS_MORE_OUTPUT)
{
result = (status == TINFL_STATUS_DONE);
break;
}
dict_ofs = (dict_ofs + dst_buf_size) & (TINFL_LZ_DICT_SIZE - 1);
}
MZ_FREE(pDict);
*pIn_buf_size = in_buf_ofs;
return result;
}
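// Example (an illustrative sketch): stream decompressed output to a FILE* without
// holding it all in memory; 'fwrite_cb', 'src', 'src_len', and 'fp' are hypothetical.
//
//   static int fwrite_cb(const void *pBuf, int len, void *pUser)
//   {
//       return fwrite(pBuf, 1, (size_t)len, (FILE *)pUser) == (size_t)len;
//   }
//   ...
//   size_t in_len = src_len;
//   int ok = tinfl_decompress_mem_to_callback(src, &in_len, fwrite_cb, fp, 0);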
#define BLOCK_SIZE (32*1024)
// Decompresses the deflated data in buf and returns a pointer to a heap-allocated
// buffer of decompressed data (the caller is responsible for freeing it), or NULL
// on failure. Note that the decompressed size is not returned to the caller, so this
// is only useful when the caller can otherwise infer the length of the returned data.
char *inflate_block(char *buf, size_t buf_size)
{
size_t out_size = BLOCK_SIZE;
return (char *) tinfl_decompress_mem_to_heap((const void *) buf, buf_size, &out_size, 0);
}
#define DATA_SIZE (32*1024)
// Performs a test decompression of the first block of data in buf to determine if it
// is a valid deflate stream. Returns 1 if the data appears to be deflated, else 0.
// Set includes_zlib_header to 1 if the data is expected to begin with a zlib header.
int is_deflated(char *buf, size_t buf_size, int includes_zlib_header)
{
tinfl_decompressor decomp_struct = { 0 };
char out_buf[DATA_SIZE] = { 0 };
size_t out_buf_size = DATA_SIZE;
size_t in_buf_size = buf_size;
int flags = TINFL_FLAG_HAS_MORE_INPUT;
if(includes_zlib_header)
{
flags |= TINFL_FLAG_PARSE_ZLIB_HEADER;
}
if(tinfl_decompress(&decomp_struct,
(const mz_uint8 *) buf,
&in_buf_size,
(mz_uint8 *) out_buf,
(mz_uint8 *) out_buf,
&out_buf_size,
flags) >= 0 && out_buf_size > 0)
{
//printf("%d => %d DATA: '%s'\n", in_buf_size, out_buf_size, out_buf);
return 1;
}
return 0;
}
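// Example (an illustrative sketch; 'data' and 'size' are hypothetical names for a
// candidate buffer found during a scan):
//
//   if (is_deflated(data, size, 0))
//   {
//       char *decompressed = inflate_block(data, size);
//       if (decompressed) { /* ... */ free(decompressed); }
//   }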
#endif // #ifndef TINFL_HEADER_FILE_ONLY
/*
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.
In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
of the public at large and to the detriment of our heirs and
successors. We intend this dedication to be an overt act of
relinquishment in perpetuity of all present and future rights to this
software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
For more information, please refer to <http://unlicense.org/>
*/
#!/usr/bin/env python
import sys
import os.path
import binwalk
from threading import Thread
from getopt import GetoptError, gnu_getopt as GetOpt
def display_status():
global bwalk
while bwalk is not None:
# Display the current scan progress when the enter key is pressed.
try:
raw_input()
print "Progress: %.2f%% (%d / %d)\n" % (((float(bwalk.total_scanned) / float(bwalk.scan_length)) * 100), bwalk.total_scanned, bwalk.scan_length)
except Exception, e:
pass
def examples():
name = os.path.basename(sys.argv[0])
print """
Scanning firmware for file signatures:
\t$ %s firmware.bin
Extracting files from firmware:
\t$ %s -Me firmware.bin
Heuristic compression/encryption analysis:
\t$ %s -H firmware.bin
Scanning firmware for executable code:
\t$ %s -A firmware.bin
Performing a firmware strings analysis:
\t$ %s -S firmware.bin
Performing a firmware entropy analysis:
\t$ %s -E firmware.bin
Display identified file signatures on entropy graph:
\t$ %s -EB firmware.bin
Diffing multiple files:
\t$ %s -W firmware1.bin firmware2.bin firmware3.bin
See http://code.google.com/p/binwalk/wiki/TableOfContents for more.
""" % (name, name, name, name, name, name, name, name)
sys.exit(0)
def usage(fd):
fd.write("\n")
fd.write("Binwalk v%s\n" % binwalk.Config.VERSION)
fd.write("Craig Heffner, http://www.devttys0.com\n")
fd.write("\n")
fd.write("Usage: %s [OPTIONS] [FILE1] [FILE2] [FILE3] ...\n" % os.path.basename(sys.argv[0]))
fd.write("\n")
fd.write("Signature Analysis:\n")
fd.write("\t-B, --binwalk Perform a file signature scan (default)\n")
fd.write("\t-R, --raw-bytes=<string> Search for a custom signature\n")
fd.write("\t-A, --opcodes Scan for executable code signatures\n")
fd.write("\t-C, --cast Cast file contents as various data types\n")
fd.write("\t-m, --magic=<file> Specify an alternate magic file to use\n")
fd.write("\t-x, --exclude=<filter> Exclude matches that have <filter> in their description\n")
fd.write("\t-y, --include=<filter> Only search for matches that have <filter> in their description\n")
fd.write("\t-I, --show-invalid Show results marked as invalid\n")
fd.write("\t-T, --ignore-time-skew Do not show results that have timestamps more than 1 year in the future\n")
fd.write("\t-k, --keep-going Show all matching results at a given offset, not just the first one\n")
fd.write("\t-b, --dumb Disable smart signature keywords\n")
fd.write("\n")
fd.write("Strings Analysis:\n")
fd.write("\t-S, --strings Scan for ASCII strings (may be combined with -B, -R, -A, or -E)\n")
fd.write("\t-s, --strlen=<n> Set the minimum string length to search for (default: 3)\n")
fd.write("\n")
fd.write("Entropy Analysis:\n")
fd.write("\t-E, --entropy Plot file entropy (may be combined with -B, -R, -A, or -S)\n")
fd.write("\t-H, --heuristic Identify unknown compression/encryption based on entropy heuristics (implies -E)\n")
fd.write("\t-K, --block=<int> Set the block size for entropy analysis (default: %d)\n" % binwalk.entropy.FileEntropy.DEFAULT_BLOCK_SIZE)
fd.write("\t-a, --gzip Use gzip compression ratios to measure entropy\n")
fd.write("\t-N, --no-plot Do not generate an entropy plot graph\n")
fd.write("\t-F, --marker=<offset:name> Add a marker to the entropy plot graph\n")
fd.write("\t-Q, --no-legend Omit the legend from the entropy plot graph\n")
fd.write("\t-J, --save-plot Save plot as an SVG (implied if multiple files are specified)\n")
fd.write("\n")
fd.write("Binary Diffing:\n")
fd.write("\t-W, --diff Hexdump / diff the specified files\n")
fd.write("\t-K, --block=<int> Number of bytes to display per line (default: %d)\n" % binwalk.hexdiff.HexDiff.DEFAULT_BLOCK_SIZE)
fd.write("\t-G, --green Only show hex dump lines that contain bytes which were the same in all files\n")
fd.write("\t-i, --red Only show hex dump lines that contain bytes which were different in all files\n")
fd.write("\t-U, --blue Only show hex dump lines that contain bytes which were different in some files\n")
fd.write("\t-w, --terse Diff all files, but only display a hex dump of the first file\n")
fd.write("\n")
fd.write("Extraction Options:\n")
fd.write("\t-D, --dd=<type:ext[:cmd]> Extract <type> signatures, give the files an extension of <ext>, and execute <cmd>\n")
fd.write("\t-e, --extract=[file] Automatically extract known file types; load rules from file, if specified\n")
fd.write("\t-M, --matryoshka=[n] Recursively scan extracted files, up to n levels deep (8 levels of recursion is the default)\n")
fd.write("\t-r, --rm Cleanup extracted files and zero-size files\n")
fd.write("\t-d, --delay Delay file extraction for files with known footers\n")
fd.write("\n")
fd.write("Plugin Options:\n")
fd.write("\t-X, --disable-plugin=<name> Disable a plugin by name\n")
fd.write("\t-Y, --enable-plugin=<name> Enable a plugin by name\n")
fd.write("\t-p, --disable-plugins Do not load any binwalk plugins\n")
fd.write("\t-L, --list-plugins List all user and system plugins by name\n")
fd.write("\n")
fd.write("General Options:\n")
fd.write("\t-o, --offset=<int> Start scan at this file offset\n")
fd.write("\t-l, --length=<int> Number of bytes to scan\n")
fd.write("\t-g, --grep=<text> Grep results for the specified text\n")
fd.write("\t-f, --file=<file> Log results to file\n")
fd.write("\t-c, --csv Log results to file in csv format\n")
fd.write("\t-O, --skip-unopened Ignore file open errors and process only the files that can be opened\n")
fd.write("\t-t, --term Format output to fit the terminal window\n")
fd.write("\t-q, --quiet Supress output to stdout\n")
fd.write("\t-v, --verbose Be verbose (specify twice for very verbose)\n")
fd.write("\t-u, --update Update magic signature files\n")
fd.write("\t-?, --examples Show example usage\n")
fd.write("\t-h, --help Show help output\n")
fd.write("\n")
if fd == sys.stderr:
sys.exit(1)
else:
sys.exit(0)
def main():
# The Binwalk class instance must be global so that the display_status thread can access it.
global bwalk
MIN_ARGC = 2
requested_scans = []
offset = 0
length = 0
strlen = 0
verbose = 0
matryoshka = 1
block_size = 0
failed_open_count = 0
quiet = False
do_comp = False
do_files = False
log_file = None
do_csv = False
save_plot = False
show_plot = True
show_legend = True
entropy_scan = False
enable_plugins = True
show_invalid = False
entropy_algorithm = None
format_to_terminal = False
custom_signature = None
delay_extraction = False
ignore_time_skew = True
extract_rules_file = None
ignore_failed_open = False
extract_from_config = False
show_single_hex_dump = False
cleanup_after_extract = False
explicit_signature_scan = False
ignore_signature_keywords = False
magic_flags = binwalk.magic.MAGIC_NONE
markers = []
magic_files = []
file_opt_list = []
target_files = []
greps = []
excludes = []
searches = []
extracts = []
options = []
arguments = []
plugin_whitelist = []
plugin_blacklist = []
config = binwalk.Config()
short_options = "AaBbCcdEeGHhIiJkLMNnOPpQqrSTtUuvWw?D:F:f:g:K:o:l:m:R:s:X:x:Y:y:"
long_options = [
"rm",
"help",
"green",
"red",
"blue",
"examples",
"quiet",
"csv",
"verbose",
"opcodes",
"cast",
"update",
"binwalk",
"keep-going",
"show-invalid",
"ignore-time-skew",
"profile",
"delay",
"skip-unopened",
"term",
"tim",
"terse",
"diff",
"dumb",
"entropy",
"heuristic",
"math",
"gzip",
"save-plot",
"no-plot",
"no-legend",
"matryoshka=",
"strings",
"list-plugins",
"disable-plugins",
"disable-plugin=",
"enable-plugin=",
"marker=",
"strlen=",
"file=",
"block=",
"offset=",
"length=",
"exclude=",
"include=",
"search=",
"extract=",
"dd=",
"grep=",
"magic=",
"raw-bytes=",
]
# Require at least one argument (the target file)
if len(sys.argv) < MIN_ARGC:
usage(sys.stderr)
try:
opts, args = GetOpt(sys.argv[1:], short_options, long_options)
except GetoptError, e:
sys.stderr.write("%s\n" % str(e))
usage(sys.stderr)
for opt, arg in opts:
if opt in ("-h", "--help"):
usage(sys.stdout)
elif opt in ("-?", "--examples"):
examples()
elif opt in ("-d", "--delay"):
delay_extraction = True
elif opt in ("-f", "--file"):
log_file = arg
elif opt in ("-c", "--csv"):
do_csv = True
elif opt in ("-q", "--quiet"):
quiet = True
elif opt in ("-s", "--strlen"):
strlen = binwalk.common.str2int(arg)
elif opt in ("-Q", "--no-legend"):
show_legend = False
elif opt in ("-J", "--save-plot"):
save_plot = True
elif opt in ("-N", "--no-plot"):
show_plot = False
elif opt in ("-E", "--entropy"):
requested_scans.append(binwalk.Binwalk.ENTROPY)
elif opt in ("-W", "--diff"):
requested_scans.append(binwalk.Binwalk.HEXDIFF)
elif opt in ("-w", "--terse"):
show_single_hex_dump = True
elif opt in ("-a", "--gzip"):
entropy_algorithm = 'gzip'
elif opt in("-t", "--term", "--tim"):
format_to_terminal = True
elif opt in("-p", "--disable-plugins"):
enable_plugins = False
elif opt in ("-b", "--dumb"):
ignore_signature_keywords = True
elif opt in ("-v", "--verbose"):
verbose += 1
elif opt in ("-S", "--strings"):
requested_scans.append(binwalk.Binwalk.STRINGS)
elif opt in ("-O", "--skip-unopened"):
ignore_failed_open = True
elif opt in ("-o", "--offset"):
offset = binwalk.common.str2int(arg)
elif opt in ("-l", "--length"):
length = binwalk.common.str2int(arg)
elif opt in ("-y", "--search", "--include"):
searches.append(arg)
elif opt in ("-x", "--exclude"):
excludes.append(arg)
elif opt in ("-D", "--dd"):
extracts.append(arg)
elif opt in ("-g", "--grep"):
greps.append(arg)
elif opt in ("-G", "--green"):
greps.append("32;")
elif opt in ("-i", "--red"):
greps.append("31;")
elif opt in ("-U", "--blue"):
greps.append("34;")
elif opt in ("-r", "--rm"):
cleanup_after_extract = True
elif opt in ("-m", "--magic"):
magic_files.append(arg)
elif opt in ("-k", "--keep-going"):
magic_flags |= binwalk.magic.MAGIC_CONTINUE
elif opt in ("-I", "--show-invalid"):
show_invalid = True
elif opt in ("-B", "--binwalk"):
requested_scans.append(binwalk.Binwalk.BINWALK)
elif opt in ("-K", "--block"):
block_size = binwalk.common.str2int(arg)
elif opt in ("-X", "--disable-plugin"):
plugin_blacklist.append(arg)
elif opt in ("-Y", "--enable-plugin"):
plugin_whitelist.append(arg)
elif opt in ("-T", "--ignore-time-skew"):
ignore_time_skew = False
elif opt in ("-H", "--heuristic", "--math"):
do_comp = True
if binwalk.Binwalk.ENTROPY not in requested_scans:
requested_scans.append(binwalk.Binwalk.ENTROPY)
elif opt in ("-F", "--marker"):
if ':' in arg:
(location, description) = arg.split(':', 1)
location = int(location)
markers.append((location, [{'description' : description, 'offset' : location}]))
elif opt in("-L", "--list-plugins"):
# List all user and system plugins, then exit
print ''
print 'NAME TYPE ENABLED DESCRIPTION'
print '-' * 115
with binwalk.Binwalk() as bw:
for (key, info) in binwalk.plugins.Plugins(bw).list_plugins().iteritems():
for module_name in info['modules']:
print '%-16s %-10s %-10s %s' % (module_name, key, info['enabled'][module_name], info['descriptions'][module_name])
print ''
sys.exit(1)
elif opt in ("-M", "--matryoshka"):
if arg:
matryoshka = binwalk.common.str2int(arg)
else:
# The original Zvyozdochkin matryoshka set had 8 dolls. This is a good number.
matryoshka = 8
elif opt in ("-e", "--extract"):
# If a file path was specified, use that as the extraction rules file
if arg:
extract_from_config = False
extract_rules_file = arg
# Else, use the default rules file
else:
extract_from_config = True
elif opt in ("-A", "--opcodes"):
requested_scans.append(binwalk.Binwalk.BINARCH)
# Load user file first so its signatures take precedence
magic_files.append(config.paths['user'][config.BINARCH_MAGIC_FILE])
magic_files.append(config.paths['system'][config.BINARCH_MAGIC_FILE])
elif opt in ("-C", "--cast"):
requested_scans.append(binwalk.Binwalk.BINCAST)
# Don't stop at the first match (everything matches everything in this scan)
magic_flags |= binwalk.magic.MAGIC_CONTINUE
# Load user file first so its signatures take precedence
magic_files.append(config.paths['user'][config.BINCAST_MAGIC_FILE])
magic_files.append(config.paths['system'][config.BINCAST_MAGIC_FILE])
elif opt in ("-R", "--raw-bytes"):
custom_signature = arg
requested_scans.append(binwalk.Binwalk.CUSTOM)
explicit_signature_scan = True
elif opt in ("-u", "--update"):
try:
sys.stdout.write("Updating signatures...")
sys.stdout.flush()
binwalk.Update().update()
sys.stdout.write("done.\n")
sys.exit(0)
except Exception, e:
if 'Permission denied' in str(e):
sys.stderr.write("failed (permission denied). Check your user permissions, or run the update as root.\n")
else:
sys.stderr.write('\n' + str(e) + '\n')
sys.exit(1)
# The --profile option is handled prior to calling main()
elif opt not in ('-P', '--profile'):
usage(sys.stderr)
# Keep track of the options and arguments.
# This is used later to determine which argv entries are file names.
options.append(opt)
options.append("%s%s" % (opt, arg))
options.append("%s=%s" % (opt, arg))
arguments.append(arg)
# Treat any command line options not processed by getopt as target file paths.
for opt in sys.argv[1:]:
if opt not in arguments and opt not in options and not opt.startswith('-'):
file_opt_list.append(opt)
# Validate the target files listed in target_files
for tfile in file_opt_list:
# Ignore directories.
if not os.path.isdir(tfile):
# Make sure we can open the target files
try:
fd = open(tfile, "rb")
fd.close()
target_files.append(tfile)
except Exception, e:
sys.stdout.write("Cannot open file : %s\n" % str(e))
failed_open_count += 1
# Unless -O was specified, don't run the scan unless we are able to scan all specified files
if failed_open_count > 0 and not ignore_failed_open:
if failed_open_count > 1:
plural = 's'
else:
plural = ''
sys.stdout.write("Failed to open %d file%s for scanning, quitting...\n" % (failed_open_count, plural))
sys.exit(1)
# If more than one target file was specified, or if extracted files will be scanned
# recursively, save any entropy plots to disk and enable verbose mode; else, there is
# nothing in the output to indicate which scan corresponds to which file.
if (matryoshka > 1 or len(target_files) > 1):
save_plot = True
if not verbose:
verbose = 1
elif len(target_files) == 0:
usage(sys.stderr)
# Instantiate the Binwalk class
bwalk = binwalk.Binwalk(magic_files=magic_files, flags=magic_flags, verbose=verbose, log=log_file, quiet=quiet, ignore_smart_keywords=ignore_signature_keywords, load_plugins=enable_plugins, ignore_time_skews=ignore_time_skew)
# If a custom signature was specified, create a temporary magic file containing the custom signature
# and ensure that it is the only magic file that will be loaded when Binwalk.scan() is called.
if custom_signature is not None:
bwalk.magic_files = [bwalk.parser.file_from_string(custom_signature)]
# Set any specified filters
bwalk.filter.exclude(excludes)
bwalk.filter.include(searches)
bwalk.filter.grep(filters=greps)
# Add any specified extract rules
bwalk.extractor.add_rule(extracts)
# If -e was specified, load the default extract rules
if extract_from_config:
bwalk.extractor.load_defaults()
# If --extract was specified, load the specified extraction rules file
if extract_rules_file is not None:
bwalk.extractor.load_from_file(extract_rules_file)
# Set the extractor cleanup value (True to clean up files, False to leave them on disk)
bwalk.extractor.cleanup_extracted_files(cleanup_after_extract)
# Enable delayed extraction, which will prevent supported file types from having trailing data when extracted
bwalk.extractor.enable_delayed_extract(delay_extraction)
# Load the magic file(s)
#bwalk.load_signatures(magic_files=magic_files)
# If --term was specified, enable output formatting to terminal
if format_to_terminal:
bwalk.display.enable_formatting(True)
# Enable log file CSV formatting, if specified
if do_csv:
bwalk.display.enable_csv()
# If no scan was explicitly requested, do a binwalk scan
if not requested_scans:
requested_scans.append(binwalk.Binwalk.BINWALK)
# Sort the scan types to ensure the entropy scan is performed last
requested_scans.sort()
# Everything is set up, let's do a scan
try:
results = {}
# Start the display_status function as a daemon thread.
t = Thread(target=display_status)
t.setDaemon(True)
t.start()
for scan_type in requested_scans:
if scan_type in [binwalk.Binwalk.BINWALK, binwalk.Binwalk.BINARCH, binwalk.Binwalk.BINCAST, binwalk.Binwalk.CUSTOM]:
# There's no generic way for the binwalk class to know what
# scan type is being run, since these are all signature scans,
# just with different magic files. Manually set the scan sub-type
# here to ensure that plugins can differentiate between the
# scans being performed.
bwalk.scan_type = scan_type
r = bwalk.scan(target_files,
offset=offset,
length=length,
show_invalid_results=show_invalid,
callback=bwalk.display.results,
start_callback=bwalk.display.header,
end_callback=bwalk.display.footer,
matryoshka=matryoshka,
plugins_whitelist=plugin_whitelist,
plugins_blacklist=plugin_blacklist)
bwalk.concatenate_results(results, r)
elif scan_type == binwalk.Binwalk.STRINGS:
r = bwalk.analyze_strings(target_files,
length=length,
offset=offset,
n=strlen,
block=block_size,
load_plugins=enable_plugins,
whitelist=plugin_whitelist,
blacklist=plugin_blacklist)
bwalk.concatenate_results(results, r)
elif scan_type == binwalk.Binwalk.COMPRESSION:
r = bwalk.analyze_compression(target_files, offset=offset, length=length)
bwalk.concatenate_results(results, r)
elif scan_type == binwalk.Binwalk.ENTROPY:
if not results:
for target_file in target_files:
results[target_file] = []
else:
bwalk.display.quiet = True
bwalk.display.cleanup()
for target_file in results.keys():
bwalk.concatenate_results(results, {target_file : markers})
bwalk.analyze_entropy(results,
offset,
length,
block_size,
show_plot,
show_legend,
save_plot,
algorithm=entropy_algorithm,
load_plugins=enable_plugins,
whitelist=plugin_whitelist,
blacklist=plugin_blacklist,
compcheck=do_comp)
elif scan_type == binwalk.Binwalk.HEXDIFF:
bwalk.hexdiff(target_files, offset=offset, length=length, block=block_size, first=show_single_hex_dump)
except KeyboardInterrupt:
pass
except IOError:
pass
# except Exception, e:
# print "Unexpected error:", str(e)
bwalk.cleanup()
try:
# Special options for profiling the code. For debug use only.
if '--profile' in sys.argv or '-P' in sys.argv:
import cProfile
cProfile.run('main()')
else:
main()
except KeyboardInterrupt:
pass
__all__ = ["Binwalk"]
import os
import re
import time
import magic
from config import *
from update import *
from filter import *
from parser import *
from plugins import *
from hexdiff import *
from entropy import *
from extractor import *
from prettyprint import *
from smartstrings import *
from smartsignature import *
from common import file_size, unique_file_name, BlockFile
class Binwalk(object):
'''
Primary Binwalk class.
Useful class objects:
self.filter - An instance of the MagicFilter class.
self.extractor - An instance of the Extractor class.
self.parser - An instance of the MagicParser class.
self.display - An instance of the PrettyPrint class.
self.magic_files - A list of magic file path strings to use whenever the scan() method is invoked.
self.scan_length - The total number of bytes to be scanned.
self.total_scanned - The number of bytes that have already been scanned.
self.scan_type - The type of scan being performed, one of: BINWALK, BINCAST, BINARCH, STRINGS, ENTROPY.
Performing a simple binwalk scan:
from binwalk import Binwalk
scan = Binwalk().scan(['firmware1.bin', 'firmware2.bin'])
for (filename, file_results) in scan.iteritems():
print "Results for %s:" % filename
for (offset, results) in file_results:
for result in results:
print offset, result['description']
'''
# Default libmagic flags. Basically disable anything we don't need in the name of speed.
DEFAULT_FLAGS = magic.MAGIC_NO_CHECK_TEXT | magic.MAGIC_NO_CHECK_ENCODING | magic.MAGIC_NO_CHECK_APPTYPE | magic.MAGIC_NO_CHECK_TOKENS
# Maximum magic bytes length
MAX_SIGNATURE_SIZE = 128
# Minimum verbosity level at which to enable extractor verbosity.
VERY_VERBOSE = 2
# Scan every byte by default.
DEFAULT_BYTE_ALIGNMENT = 1
# Valid scan_type values.
# ENTROPY must be the largest value to ensure it is performed last if multiple scans are performed.
BINWALK = 0x01
BINARCH = 0x02
BINCAST = 0x04
STRINGS = 0x08
COMPRESSION = 0x10
HEXDIFF = 0x20
CUSTOM = 0x40
ENTROPY = 0x80
def __init__(self, magic_files=[], flags=magic.MAGIC_NONE, log=None, quiet=False, verbose=0, ignore_smart_keywords=False, ignore_time_skews=False, load_extractor=False, load_plugins=True):
'''
Class constructor.
@magic_files - A list of magic files to use.
@flags - Flags to pass to magic_open. [TODO: Might this be more appropriate as an argument to load_signatures?]
@log - Output PrettyPrint data to log file as well as to stdout.
@quiet - If set to True, supress PrettyPrint output to stdout.
@verbose - Verbosity level.
@ignore_smart_keywords - Set to True to ignore smart signature keywords.
@ignore_time_skews - Set to True to ignore file results with timestamps in the future.
@load_extractor - Set to True to load the default extraction rules automatically.
@load_plugins - Set to False to disable plugin support.
Returns None.
'''
self.flags = self.DEFAULT_FLAGS | flags
self.last_extra_data_section = ''
self.load_plugins = load_plugins
self.magic_files = magic_files
self.verbose = verbose
self.total_scanned = 0
self.scan_length = 0
self.total_read = 0
self.matryoshka = 1
self.epoch = 0
self.year = 0
self.plugins = None
self.magic = None
self.mfile = None
self.entropy = None
self.strings = None
self.scan_type = self.BINWALK
if not ignore_time_skews:
# Consider timestamps up to 1 year in the future valid,
# to account for any minor time skew on the local system.
self.year = time.localtime().tm_year + 1
self.epoch = int(time.time()) + (60 * 60 * 24 * 365)
# Instantiate the config class so we can access file/directory paths
self.config = Config()
# Use the system default magic file if no other was specified
if not self.magic_files:
# Append the user's magic file first so that those signatures take precedence
self.magic_files = [
self.config.paths['user'][self.config.BINWALK_MAGIC_FILE],
self.config.paths['system'][self.config.BINWALK_MAGIC_FILE],
]
# Only set the extractor verbosity if told to be very verbose
if self.verbose >= self.VERY_VERBOSE:
extractor_verbose = True
else:
extractor_verbose = False
# Create an instance of the PrettyPrint class, which can be used to print results to screen/file.
self.display = PrettyPrint(self, log=log, quiet=quiet, verbose=verbose)
# Create MagicFilter and Extractor class instances. These can be used to:
#
# o Create include/exclude filters
# o Specify file extraction rules to be applied during a scan
#
self.filter = MagicFilter()
self.extractor = Extractor(verbose=extractor_verbose)
if load_extractor:
self.extractor.load_defaults()
# Create SmartSignature and MagicParser class instances. These are mostly for internal use.
self.smart = SmartSignature(self.filter, ignore_smart_signatures=ignore_smart_keywords)
self.parser = MagicParser(self.filter, self.smart)
def __del__(self):
self.cleanup()
def __enter__(self):
return self
def __exit__(self, t, v, traceback):
self.cleanup()
def cleanup(self):
'''
Close magic and cleanup any temporary files generated by the internal instance of MagicParser.
Returns None.
'''
try:
self.magic.close()
except:
pass
try:
self.parser.cleanup()
except:
pass
def load_signatures(self, magic_files=[]):
'''
Load signatures from magic file(s).
Called automatically by Binwalk.scan() with all defaults, if not already called manually.
@magic_files - A list of magic files to use (default: self.magic_files).
Returns None.
'''
# The magic files specified here override any already set
if magic_files:
self.magic_files = magic_files
# Parse the magic file(s) and initialize libmagic
self.mfile = self.parser.parse(self.magic_files)
self.magic = magic.open(self.flags)
self.magic.load(self.mfile)
# Once the temporary magic file is loaded into libmagic, we don't need it anymore; delete the temp file
self.parser.rm_magic_file()
def hexdiff(self, file_names, length=0x100, offset=0, block=16, first=False):
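'''
Hexdumps and diffs the specified files.
@file_names - A list of files to diff.
@length - The number of bytes to display (default: 0x100; if 0, the size of the first file is used).
@offset - The starting offset of the diff.
@block - The number of bytes to display per line (default: 16).
@first - Set to True to only display the hex dump of the first file.
Returns None.
'''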
if not length and len(file_names) > 0:
length = file_size(file_names[0])
if not block:
block = 16
HexDiff(self).display(file_names, offset=offset, size=length, block=block, show_first_only=first)
def analyze_strings(self, file_names, length=0, offset=0, n=0, block=0, load_plugins=True, whitelist=[], blacklist=[]):
'''
Performs a strings analysis on the specified file(s).
@file_names - A list of files to analyze.
@length - The number of bytes in the file to analyze.
@offset - The starting offset into the file to begin analysis.
@n - The minimum valid string length.
@block - The block size to use when performing entropy analysis.
@load_plugins - Set to False to disable plugin callbacks.
@whitelist - A list of whitelisted plugins.
@blacklist - A list of blacklisted plugins.
Returns a dictionary compatible with other classes and methods (Entropy, Binwalk, analyze_entropy, etc):
{
'file_name' : (offset, [{
'description' : 'Strings',
'string' : 'found_string'
}]
)
}
'''
data = {}
self.strings = Strings(file_names,
self,
length=length,
offset=offset,
n=n,
block=block,
algorithm='gzip', # Use gzip here as it is faster and we don't need the detail provided by shannon
load_plugins=load_plugins,
whitelist=whitelist,
blacklist=blacklist)
data = self.strings.strings()
del self.strings
self.strings = None
return data
def analyze_entropy(self, files, offset=0, length=0, block=0, plot=True, legend=True, save=False, algorithm=None, load_plugins=True, whitelist=[], blacklist=[], compcheck=False):
'''
Performs an entropy analysis on the specified file(s).
@files - A dictionary containing file names and results data, as returned by Binwalk.scan.
@offset - The offset into the data to begin analysis.
@length - The number of bytes to analyze.
@block - The size of the data blocks to analyze.
@plot - Set to False to disable plotting.
@legend - Set to False to exclude the legend and custom offset markers from the plot.
@save - Set to True to save plots to disk instead of displaying them.
@algorithm - Set to 'gzip' to use the gzip entropy "algorithm".
@load_plugins - Set to False to disable plugin callbacks.
@whitelist - A list of whitelisted plugins.
@blacklist - A list of blacklisted plugins.
@compcheck - Set to True to perform heuristic compression detection.
Returns a dictionary of:
{
'file_name' : ([list, of, offsets], [list, of, entropy], average_entropy)
}
'''
data = {}
self.entropy = Entropy(files,
self,
offset,
length,
block,
plot,
legend,
save,
algorithm=algorithm,
load_plugins=load_plugins,
whitelist=whitelist,
blacklist=blacklist,
compcheck=compcheck)
data = self.entropy.analyze()
del self.entropy
self.entropy = None
return data
def scan(self, target_files, offset=0, length=0, show_invalid_results=False, callback=None, start_callback=None, end_callback=None, base_dir=None, matryoshka=1, plugins_whitelist=[], plugins_blacklist=[]):
'''
Performs a binwalk scan on a file or list of files.
@target_files - File or list of files to scan.
@offset - Starting offset at which to start the scan.
@length - Number of bytes to scan. Specify -1 for streams.
@show_invalid_results - Set to True to display invalid results.
@callback - Callback function to be invoked when matches are found.
@start_callback - Callback function to be invoked prior to scanning each file.
@end_callback - Callback function to be invoked after scanning each file.
@base_dir - Base directory for output files.
@matryoshka - Number of levels to traverse into the rabbit hole.
@plugins_whitelist - A list of plugin names to load. If not empty, only these plugins will be loaded.
@plugins_blacklist - A list of plugin names to not load.
Returns a dictionary of :
{
'target file name' : [
(0, [{description : "LZMA compressed data..."}]),
(112, [{description : "gzip compressed data..."}])
]
}
'''
# Prefix all directory names with an underscore. This prevents accidental deletion of the original file(s)
# when the user is typing too fast and is trying to delete the extraction directory.
prefix = '_'
dir_extension = 'extracted'
i = 0
total_results = {}
self.matryoshka = matryoshka
# For backwards compatibility
if not isinstance(target_files, list):
target_files = [target_files]
if base_dir is None:
base_dir = ''
# Instantiate the Plugins class and load all plugins, if not disabled
self.plugins = Plugins(self, whitelist=plugins_whitelist, blacklist=plugins_blacklist)
if self.load_plugins:
self.plugins._load_plugins()
# Load the default signatures if self.load_signatures has not already been invoked
if self.magic is None:
self.load_signatures()
while i < self.matryoshka:
new_target_files = []
# Scan each target file
for target_file in target_files:
ignore_files = []
# On the first scan, add the base_dir value to dir_prefix. Subsequent target_file values will have this value prepended already.
if i == 0:
dir_prefix = os.path.join(base_dir, prefix + os.path.basename(target_file))
else:
dir_prefix = os.path.join(os.path.dirname(target_file), prefix + os.path.basename(target_file))
output_dir = unique_file_name(dir_prefix, dir_extension)
# Set the output directory for extracted files to go to
self.extractor.output_directory(output_dir)
if start_callback is not None:
start_callback(target_file)
results = self.single_scan(target_file,
offset=offset,
length=length,
show_invalid_results=show_invalid_results,
callback=callback)
if end_callback is not None:
end_callback(target_file)
# Get a list of extracted file names; don't scan them again.
for (index, results_list) in results:
for result in results_list:
if result['extract']:
ignore_files.append(result['extract'])
# Find all newly created files and add them to new_target_files / new_target_directories
for (dir_path, sub_dirs, files) in os.walk(output_dir):
for fname in files:
fname = os.path.join(dir_path, fname)
if fname not in ignore_files:
new_target_files.append(fname)
# Don't worry about sub-directories
break
total_results[target_file] = results
target_files = new_target_files
i += 1
# Be sure to delete the Plugins instance so that there isn't a lingering reference to
# this Binwalk class instance (lingering handles to this Binwalk instance cause the
# __del__ destructor to not be called).
if self.plugins is not None:
del self.plugins
self.plugins = None
return total_results
def single_scan(self, target_file='', fd=None, offset=0, length=0, show_invalid_results=False, callback=None, plugins_whitelist=[], plugins_blacklist=[]):
'''
Performs a binwalk scan on one target file or file descriptor.
@target_file - File to scan.
@fd - A common.BlockFile object.
@offset - Starting offset at which to start the scan.
@length - Number of bytes to scan. Specify -1 for streams.
@show_invalid_results - Set to True to display invalid results.
@callback - Callback function to be invoked when matches are found.
@plugins_whitelist - A list of plugin names to load. If not empty, only these plugins will be loaded.
@plugins_blacklist - A list of plugin names to not load.
The callback function is passed two arguments: a list of result dictionaries containing the scan results
(one result per dict), and the offset at which those results were identified. Example callback function:
def my_callback(offset, results):
print "Found %d results at offset %d:" % (len(results), offset)
for result in results:
print "\t%s" % result['description']
binwalk.Binwalk(callback=my_callback).scan("firmware.bin")
Upon completion, the scan method returns a sorted list of tuples containing a list of results dictionaries
and the offsets at which those results were identified:
scan_results = [
(0, [{description : "LZMA compressed data..."}]),
(112, [{description : "gzip compressed data..."}])
]
See SmartSignature.parse for a more detailed description of the results dictionary structure.
'''
scan_results = {}
fsize = 0
jump_offset = 0
i_opened_fd = False
i_loaded_plugins = False
plugret = PLUGIN_CONTINUE
plugret_start = PLUGIN_CONTINUE
self.total_read = 0
self.total_scanned = 0
self.scan_length = length
self.filter.show_invalid_results = show_invalid_results
self.start_offset = offset
# Check to make sure either a target file or a file descriptor was supplied
if not target_file and fd is None:
raise Exception("Must supply Binwalk.single_scan with a valid file path or BlockFile object")
# Need the total size of the target file, even if we aren't scanning the whole thing
if target_file:
fsize = file_size(target_file)
# If no length was specified, make the length the size of the target file minus the starting offset
if self.scan_length == 0:
self.scan_length = fsize - offset
# Open the target file and seek to the specified start offset
if fd is None:
fd = BlockFile(target_file, length=self.scan_length, offset=offset)
i_opened_fd = True
# Seek to the starting offset.
#fd.seek(offset)
# If the Plugins class has not already been instantiated, do that now.
if self.plugins is None:
self.plugins = Plugins(self, blacklist=plugins_blacklist, whitelist=plugins_whitelist)
i_loaded_plugins = True
if self.load_plugins:
self.plugins._load_plugins()
# Invoke any pre-scan plugins
plugret_start = self.plugins._pre_scan_callbacks(fd)
# Load the default signatures if self.load_signatures has not already been invoked
if self.magic is None:
self.load_signatures()
# Main loop, scan through all the data
while not ((plugret | plugret_start) & PLUGIN_TERMINATE):
i = 0
# Read in the next block of data from the target file and make sure it's valid
(data, dlen) = fd.read_block()
if not data or dlen == 0:
break
# The total number of bytes scanned could be bigger than the total number
# of bytes read from the file if the previous signature result specified a
# jump offset that was beyond the end of the then current data block.
#
# If this is the case, we need to index into this data block appropriately in order to
# resume the scan from the appropriate offset.
#
# Don't update dlen though, as it is the literal offset into the data block that we
# are to scan up to in this loop iteration. It is also appended to self.total_scanned,
# which is what we want (even if we have been told to skip part of the block, the skipped
# part is still considered part of the total bytes scanned).
if jump_offset > 0:
total_check = self.total_scanned + dlen
# Is the jump offset beyond the total amount of data that we've currently read in (i.e., in a future data block)?
if jump_offset >= total_check:
i = -1
# Try to seek to the jump offset; this won't work if fd == sys.stdin
try:
fd.seek(jump_offset)
self.total_read = jump_offset
self.total_scanned = jump_offset - dlen
except:
pass
# Is the jump offset inside this block of data?
elif jump_offset > self.total_scanned and jump_offset < total_check:
# Index into this block appropriately; jump_offset is the file offset that
# we need to jump to, and self.total_scanned is the file offset that starts
# the beginning of the current block
i = jump_offset - self.total_scanned
# We're done with jump_offset, zero it out for the next round
jump_offset = 0
# Scan through each block of data looking for signatures
if i >= 0 and i < dlen:
# Scan this data block for a list of offsets which are candidates for possible valid signatures.
# Signatures could be split across the block boundary; since data contains 1KB more than dlen,
# pass up to dlen+MAX_SIGNATURE_SIZE to find_signature_candidates, but don't accept signatures that
# start after the end of dlen.
for candidate in self.parser.find_signature_candidates(data[i:dlen+self.MAX_SIGNATURE_SIZE], (dlen-i)):
# If a signature specified a jump offset beyond this candidate signature offset, ignore it
if (i + candidate + self.total_scanned) < jump_offset:
continue
# Reset these values on each loop
smart = {}
results = []
results_offset = -1
# Pass the data to libmagic, and split out multiple results into a list
for magic_result in self.parser.split(self.magic.buffer(data[i+candidate:i+candidate+fd.MAX_TRAILING_SIZE])):
i_set_results_offset = False
# Some file names are not NULL byte terminated, but rather their length is
# specified in a size field. To ensure these are not marked as invalid due to
# non-printable characters existing in the file name, parse the filename(s) and
# trim them to the specified filename length, if one was specified.
magic_result = self.smart._parse_raw_strings(magic_result)
# Invoke any pre-parser callback plugin functions
if not (plugret_start & PLUGIN_STOP_PLUGINS):
raw_result = {'description' : magic_result}
plugret = self.plugins._scan_pre_parser_callbacks(raw_result)
magic_result = raw_result['description']
if (plugret & PLUGIN_TERMINATE):
break
# Make sure this is a valid result before further processing
if not self.filter.invalid(magic_result):
# The smart filter parser returns a dictionary of keyword values and the signature description.
smart = self.smart.parse(magic_result)
# Validate the jump value and check if the response description should be displayed
if smart['jump'] > -1 and self._should_display(smart):
# If multiple results are returned and one of them has smart['jump'] set to a non-zero value,
# the calculated results offset will be wrong since i will have been incremented. Only set the
# results_offset value when the first match is encountered.
if results_offset < 0:
results_offset = offset + i + candidate + smart['adjust'] + self.total_scanned
i_set_results_offset = True
# Double check to make sure the smart['adjust'] value is sane.
# If it makes results_offset negative, then it is not sane.
if results_offset >= 0:
smart['offset'] = results_offset
# Invoke any scan plugins
if not (plugret_start & PLUGIN_STOP_PLUGINS):
plugret = self.plugins._scan_callbacks(smart)
results_offset = smart['offset']
if (plugret & PLUGIN_TERMINATE):
break
# Extract the result, if it matches one of the extract rules and is not a delayed extract.
if self.extractor.enabled and not (self.extractor.delayed and smart['delay']) and not ((plugret | plugret_start) & PLUGIN_NO_EXTRACT):
# If the signature did not specify a size, extract to the end of the file.
if not smart['size']:
smart['size'] = fsize-results_offset
smart['extract'] = self.extractor.extract( results_offset,
smart['description'],
target_file,
smart['size'],
name=smart['name'])
if not ((plugret | plugret_start) & PLUGIN_NO_DISPLAY):
# This appears to be a valid result, so append it to the results list.
results.append(smart)
elif i_set_results_offset:
results_offset = -1
# Did we find any valid results?
if results_offset >= 0:
scan_results[results_offset] = results
if callback is not None:
callback(results_offset, results)
# If a relative jump offset was specified, update the absolute jump_offset variable
if smart.has_key('jump') and smart['jump'] > 0:
jump_offset = results_offset + smart['jump']
# Track the total number of bytes scanned
self.total_scanned += dlen
# The starting offset only affects the reported offset for results
# in the first block of data. Zero it out after the first block has
# been processed.
offset = 0
# Sort the results before returning them
scan_items = scan_results.items()
scan_items.sort()
# Do delayed extraction, if specified.
if self.extractor.enabled and self.extractor.delayed:
scan_items = self.extractor.delayed_extract(scan_items, target_file, fsize)
# Invoke any post-scan plugins
self.plugins._post_scan_callbacks(fd)
# Be sure to delete the Plugins instance so that there isn't a lingering reference to
# this Binwalk class instance (lingering handles to this Binwalk instance cause the
# __del__ destructor to not be called).
if i_loaded_plugins:
del self.plugins
self.plugins = None
if i_opened_fd:
fd.close()
return scan_items
def concatenate_results(self, results, new):
'''
Concatenate multiple Binwalk.scan results into one dictionary.
@results - Binwalk results to append new results to.
@new - New data to append to results.
Returns None.
'''
for (new_file_name, new_data) in new.iteritems():
if not results.has_key(new_file_name):
results[new_file_name] = new_data
else:
for i in range(0, len(new_data)):
found_offset = False
(new_offset, new_results_list) = new_data[i]
for j in range(0, len(results[new_file_name])):
(offset, results_list) = results[new_file_name][j]
if offset == new_offset:
results_list += new_results_list
results[new_file_name][j] = (offset, results_list)
found_offset = True
break
if not found_offset:
results[new_file_name].append((new_offset, new_results_list))
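# A minimal usage sketch for concatenate_results(), assuming both scans were
# invoked with a list argument so that their results are keyed by file name
# ('fw1.bin' and 'fw2.bin' are hypothetical):
#
#   results = bw.scan(['fw1.bin'])
#   bw.concatenate_results(results, bw.scan(['fw2.bin']))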
def _should_display(self, result):
'''
Determines if a result string should be displayed to the user or not.
@result - Result dictionary, as returned by self.smart.parse.
Returns True if the string should be displayed.
Returns False if the string should not be displayed.
'''
if result['invalid'] == True or (self.year and result['year'] > self.year) or (self.epoch and result['epoch'] > self.epoch):
return False
desc = result['description']
return bool(desc) and not self.filter.invalid(desc) and self.filter.filter(desc) != self.filter.FILTER_EXCLUDE
# Common functions.
import os
import re
def file_size(filename):
'''
Obtains the size of a given file.
@filename - Path to the file.
Returns the size of the file.
'''
# Using open/lseek works on both regular files and block devices
fd = os.open(filename, os.O_RDONLY)
try:
return os.lseek(fd, 0, os.SEEK_END)
except Exception, e:
raise Exception("file_size failed to obtain the size of '%s': %s" % (filename, str(e)))
finally:
os.close(fd)
def str2int(string):
'''
Attempts to convert string to a base 10 integer; if that fails, then base 16.
@string - String to convert to an integer.
Returns the integer value on success.
Throws an exception if the string cannot be converted into either a base 10 or base 16 integer value.
'''
try:
return int(string)
except:
return int(string, 16)
def strip_quoted_strings(string):
'''
Strips out data in between double quotes.
@string - String to strip.
Returns a sanitized string.
'''
# This regex removes all quoted data from string.
# Note that this removes everything in between the first and last double quote.
# This is intentional, as printed (and quoted) strings from a target file may contain
# double quotes, and this function should ignore those. However, it also means that any
# data between two quoted strings (ex: '"quote 1" you won't see me "quote 2"') will also be stripped.
return re.sub(r'\"(.*)\"', "", string)
def get_quoted_strings(string):
'''
Returns a string comprised of all data in between double quotes.
@string - String to get quoted data from.
Returns a string of quoted data on success.
Returns a blank string if no quoted data is present.
'''
try:
# This regex grabs all quoted data from string.
# Note that this gets everything in between the first and last double quote.
# This is intentional, as printed (and quoted) strings from a target file may contain
# double quotes, and this function should ignore those. However, it also means that any
# data between two quoted strings (ex: '"quote 1" non-quoted data "quote 2"') will also be included.
return re.findall(r'\"(.*)\"', string)[0]
except:
return ''
def unique_file_name(base_name, extension=''):
'''
Creates a unique file name based on the specified base name.
@base_name - The base name to use for the unique file name.
@extension - The file extension to use for the unique file name.
Returns a unique file string.
'''
idcount = 0
if extension and not extension.startswith('.'):
extension = '.%s' % extension
fname = base_name + extension
while os.path.exists(fname):
fname = "%s-%d%s" % (base_name, idcount, extension)
idcount += 1
return fname
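# A minimal sketch of the helper functions above; the file names used are hypothetical:
#
#   size = file_size('/tmp/firmware.bin')     # works on regular files and block devices
#   value = str2int('0x100')                  # base 10 fails, base 16 succeeds -> 256
#   fname = unique_file_name('output', 'bin') # 'output.bin', then 'output-0.bin', ...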
class BlockFile(file):
'''
Abstraction class to handle reading data from files in blocks.
Necessary for large files.
'''
# The MAX_TRAILING_SIZE limits the amount of data available to a signature.
# While most headers/signatures are far less than this value, some may reference
# pointers in the header structure which may point well beyond the header itself.
# Passing the entire remaining buffer to libmagic is resource intensive and will
# significantly slow the scan; this value represents a reasonable buffer size to
# pass to libmagic which will not drastically affect scan time.
MAX_TRAILING_SIZE = 8 * 1024
# Max number of bytes to process at one time. This needs to be large enough to
# limit disk I/O, but small enough to limit the size of processed data blocks.
READ_BLOCK_SIZE = 1 * 1024 * 1024
def __init__(self, fname, mode='rb', length=0, offset=0):
'''
Class constructor.
@fname - Path to the file to be opened.
@mode - Mode to open the file in.
@length - Maximum number of bytes to read from the file via self.read_block().
@offset - Offset into the file at which to begin reading.
Returns None.
'''
self.total_read = 0
self.offset = offset
if length:
self.length = length
else:
try:
self.length = file_size(fname)
except:
self.length = 0
file.__init__(self, fname, mode)
self.seek(self.offset)
def read_block(self):
'''
Reads in a block of data from the target file.
Returns a tuple of (file block data, block data length).
'''
dlen = 0
data = None
if self.total_read < self.length:
# Read in READ_BLOCK_SIZE plus MAX_TRAILING_SIZE bytes, but return a max dlen value
# of READ_BLOCK_SIZE. This ensures that there is a MAX_TRAILING_SIZE buffer at the
# end of the returned data in case a signature is found at or near data[dlen].
data = self.read(self.READ_BLOCK_SIZE + self.MAX_TRAILING_SIZE)
if data:
# Get the actual length of the read in data
dlen = len(data)
seek_offset = dlen - self.READ_BLOCK_SIZE
# If we've read in more data than the scan length, truncate the dlen value
if (self.total_read + self.READ_BLOCK_SIZE) > self.length:
dlen = self.length - self.total_read
# If a full READ_BLOCK_SIZE + MAX_TRAILING_SIZE read was returned, cap dlen at READ_BLOCK_SIZE
elif dlen == (self.READ_BLOCK_SIZE + self.MAX_TRAILING_SIZE):
dlen = self.READ_BLOCK_SIZE
# Increment self.total_read to reflect the amount of data that has been read
# for processing (actual read size is larger of course, due to the MAX_TRAILING_SIZE
# buffer of data at the end of each block).
self.total_read += dlen
# Seek to the self.total_read offset so the next read can pick up where this one left off.
if seek_offset > 0:
self.seek(self.tell() - seek_offset)
return (data, dlen)
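# A minimal sketch of reading a file in blocks with BlockFile. Each block returned
# by read_block() may contain up to MAX_TRAILING_SIZE bytes of trailing overlap
# beyond the reported dlen ('firmware.bin' and process() are hypothetical):
#
#   fd = BlockFile('firmware.bin')
#   while True:
#       (data, dlen) = fd.read_block()
#       if not data or dlen == 0:
#           break
#       process(data[:dlen])
#   fd.close()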
#!/usr/bin/env python
# Routines to perform Monte Carlo Pi approximation and Chi Squared tests.
# Used for fingerprinting unknown areas of high entropy (e.g., is this block of high entropy data compressed or encrypted?).
# Inspired by people who actually know what they're doing: http://www.fourmilab.ch/random/
import math
import common
class MonteCarloPi(object):
'''
Performs a Monte Carlo Pi approximation.
Currently unused.
'''
def __init__(self):
'''
Class constructor.
Returns None.
'''
self.reset()
def reset(self):
'''
Reset state to the beginning.
'''
self.pi = 0
self.error = 0
self.m = 0
self.n = 0
def update(self, data):
'''
Update the pi approximation with new data.
@data - A string of bytes to update (length must be >= 6).
Returns None.
'''
c = 0
dlen = len(data)
while (c+6) <= dlen:
# Treat 3 bytes as an x coordinate, the next 3 bytes as a y coordinate.
# Our box is 1x1, so divide by 2^24 to put the x y values inside the box.
x = ((ord(data[c]) << 16) + (ord(data[c+1]) << 8) + ord(data[c+2])) / 16777216.0
c += 3
y = ((ord(data[c]) << 16) + (ord(data[c+1]) << 8) + ord(data[c+2])) / 16777216.0
c += 3
# Does the x,y point lie inside the circle inscribed within our box, with diameter == 1?
if ((x**2) + (y**2)) <= 1:
self.m += 1
self.n += 1
def montecarlo(self):
'''
Approximates the value of Pi based on the provided data.
Returns a tuple of (approximated value of pi, percent deviation).
'''
if self.n:
self.pi = (float(self.m) / float(self.n) * 4.0)
if self.pi:
self.error = math.fabs(1.0 - (math.pi / self.pi)) * 100.0
return (self.pi, self.error)
else:
return (0.0, 0.0)
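# A minimal sketch of the MonteCarloPi class (currently unused by binwalk itself);
# for random input the approximation should converge toward math.pi:
#
#   mc = MonteCarloPi()
#   mc.update(open('/dev/urandom', 'rb').read(8192))
#   (pi, error) = mc.montecarlo()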
class ChiSquare(object):
'''
Performs a Chi Squared test against the provided data.
'''
IDEAL = 256.0
def __init__(self):
'''
Class constructor.
Returns None.
'''
self.bytes = {}
self.freedom = self.IDEAL - 1
# Initialize the self.bytes dictionary with keys for all possible byte values (0 - 255)
for i in range(0, int(self.IDEAL)):
self.bytes[chr(i)] = 0
self.reset()
def reset(self):
self.xc2 = 0.0
self.byte_count = 0
for key in self.bytes.keys():
self.bytes[key] = 0
def update(self, data):
'''
Updates the current byte counts with new data.
@data - String of bytes to update.
Returns None.
'''
# Count the number of occurrences of each byte value
for i in data:
self.bytes[i] += 1
self.byte_count += len(data)
def chisq(self):
'''
Calculate the Chi Square critical value.
Returns the critical value.
'''
expected = self.byte_count / self.IDEAL
if expected:
for byte in self.bytes.values():
self.xc2 += ((byte - expected) ** 2 ) / expected
return self.xc2
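# For intuition, a minimal ChiSquare sketch: random data keeps each of the 256
# byte counts near byte_count / 256, so the critical value stays low, while
# structured data drives it up quickly:
#
#   chi = ChiSquare()
#   chi.update('A' * 1024)
#   print chi.chisq()   # very large; a single byte value dominates the distribution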
class CompressionEntropyAnalyzer(object):
'''
Class wrapper around ChiSquare.
Performs analysis and attempts to interpret the results.
'''
BLOCK_SIZE = 32
CHI_CUTOFF = 512
DESCRIPTION = "Statistical Compression Analysis"
def __init__(self, fname, start, length, binwalk=None):
'''
Class constructor.
@fname - The file to scan.
@start - The start offset to begin analysis at.
@length - The number of bytes to analyze.
@binwalk - Binwalk class object.
Returns None.
'''
self.fp = common.BlockFile(fname, 'rb', offset=start, length=length)
# Read block size must be at least as large as our analysis block size
if self.fp.READ_BLOCK_SIZE < self.BLOCK_SIZE:
self.fp.READ_BLOCK_SIZE = self.BLOCK_SIZE
self.start = start
self.length = length
self.binwalk = binwalk
def __del__(self):
try:
self.fp.close()
except:
pass
def analyze(self):
'''
Perform analysis and interpretation.
Returns a descriptive string containing the results and attempted interpretation.
'''
i = 0
num_error = 0
analyzer_results = []
if self.binwalk:
self.binwalk.display.header(file_name=self.fp.name, description=self.DESCRIPTION)
chi = ChiSquare()
while i < self.length:
j = 0
(d, dlen) = self.fp.read_block()
while j < dlen:
chi.reset()
data = d[j:j+self.BLOCK_SIZE]
if len(data) < self.BLOCK_SIZE:
break
chi.update(data)
if chi.chisq() >= self.CHI_CUTOFF:
num_error += 1
j += self.BLOCK_SIZE
i += dlen
if num_error > 0:
verdict = 'Moderate entropy data, best guess: compressed'
else:
verdict = 'High entropy data, best guess: encrypted'
result = [{'offset' : self.start, 'description' : '%s, size: %d, %d low entropy blocks' % (verdict, self.length, num_error)}]
if self.binwalk:
self.binwalk.display.results(self.start, result)
self.binwalk.display.footer()
return result
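# A minimal sketch of running the analyzer standalone against a region of a file;
# the file name, offset and length below are hypothetical:
#
#   import compression
#   results = compression.CompressionEntropyAnalyzer('firmware.bin', 0x1000, 0x4000).analyze()
#   print results[0]['description']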
import os
import common
class Config:
'''
Binwalk configuration class, used for accessing user and system file paths.
After instantiating the class, file paths can be accessed via the self.paths dictionary.
System file paths are listed under the 'system' key, user file paths under the 'user' key.
For example, to get the path to both the user and system binwalk magic files:
from binwalk import Config
conf = Config()
user_binwalk_file = conf.paths['user'][conf.BINWALK_MAGIC_FILE]
system_binwalk_file = conf.paths['system'][conf.BINWALK_MAGIC_FILE]
There is also an instance of this class available via the Binwalk.config object:
import binwalk
bw = binwalk.Binwalk()
user_binwalk_file = bw.config.paths['user'][bw.config.BINWALK_MAGIC_FILE]
system_binwalk_file = bw.config.paths['system'][bw.config.BINWALK_MAGIC_FILE]
Valid file names under both the 'user' and 'system' keys are as follows:
o BINWALK_MAGIC_FILE - Path to the default binwalk magic file.
o BINCAST_MAGIC_FILE - Path to the bincast magic file (used when -C is specified with the command line binwalk script).
o BINARCH_MAGIC_FILE - Path to the binarch magic file (used when -A is specified with the command line binwalk script).
o EXTRACT_FILE - Path to the extract configuration file (used when -e is specified with the command line binwalk script).
o PLUGINS - Path to the plugins directory.
'''
# Release version
VERSION = "1.2.3"
# Sub directories
BINWALK_USER_DIR = ".binwalk"
BINWALK_MAGIC_DIR = "magic"
BINWALK_CONFIG_DIR = "config"
BINWALK_PLUGINS_DIR = "plugins"
# File names
PLUGINS = "plugins"
EXTRACT_FILE = "extract.conf"
BINWALK_MAGIC_FILE = "binwalk"
BINCAST_MAGIC_FILE = "bincast"
BINARCH_MAGIC_FILE = "binarch"
def __init__(self):
'''
Class constructor. Enumerates file paths and populates self.paths.
'''
# Path to the user binwalk directory
self.user_dir = self._get_user_dir()
# Path to the system wide binwalk directory
self.system_dir = self._get_system_dir()
# Dictionary of all absolute user/system file paths
self.paths = {
'user' : {},
'system' : {},
}
# Build the paths to all user-specific files
self.paths['user'][self.BINWALK_MAGIC_FILE] = self._user_path(self.BINWALK_MAGIC_DIR, self.BINWALK_MAGIC_FILE)
self.paths['user'][self.BINCAST_MAGIC_FILE] = self._user_path(self.BINWALK_MAGIC_DIR, self.BINCAST_MAGIC_FILE)
self.paths['user'][self.BINARCH_MAGIC_FILE] = self._user_path(self.BINWALK_MAGIC_DIR, self.BINARCH_MAGIC_FILE)
self.paths['user'][self.EXTRACT_FILE] = self._user_path(self.BINWALK_CONFIG_DIR, self.EXTRACT_FILE)
self.paths['user'][self.PLUGINS] = self._user_path(self.BINWALK_PLUGINS_DIR)
# Build the paths to all system-wide files
self.paths['system'][self.BINWALK_MAGIC_FILE] = self._system_path(self.BINWALK_MAGIC_DIR, self.BINWALK_MAGIC_FILE)
self.paths['system'][self.BINCAST_MAGIC_FILE] = self._system_path(self.BINWALK_MAGIC_DIR, self.BINCAST_MAGIC_FILE)
self.paths['system'][self.BINARCH_MAGIC_FILE] = self._system_path(self.BINWALK_MAGIC_DIR, self.BINARCH_MAGIC_FILE)
self.paths['system'][self.EXTRACT_FILE] = self._system_path(self.BINWALK_CONFIG_DIR, self.EXTRACT_FILE)
self.paths['system'][self.PLUGINS] = self._system_path(self.BINWALK_PLUGINS_DIR)
def find_magic_file(self, fname, system_only=False, user_only=False):
'''
Finds the specified magic file name in the system / user magic file directories.
@fname - The name of the magic file.
@system_only - If True, only the system magic file directory will be searched.
@user_only - If True, only the user magic file directory will be searched.
If system_only and user_only are not set, the user directory is always searched first.
Returns the path to the file on success; returns None on failure.
'''
loc = None
if not system_only:
fpath = self._user_path(self.BINWALK_MAGIC_DIR, fname)
if os.path.exists(fpath) and common.file_size(fpath) > 0:
loc = fpath
if loc is None and not user_only:
fpath = self._system_path(self.BINWALK_MAGIC_DIR, fname)
if os.path.exists(fpath) and common.file_size(fpath) > 0:
loc = fpath
return loc
def _get_system_dir(self):
'''
Find the directory where the binwalk module is installed on the system.
'''
try:
root = __file__
if os.path.islink(root):
root = os.path.realpath(root)
return os.path.dirname(os.path.abspath(root))
except:
return ''
def _get_user_dir(self):
'''
Get the user's home directory.
'''
try:
# This should work in both Windows and Unix environments
return os.getenv('USERPROFILE') or os.getenv('HOME')
except:
return ''
def _file_path(self, dirname, filename):
'''
Builds an absolute path and creates the directory and file if they don't already exist.
@dirname - Directory path.
@filename - File name.
Returns a full path of 'dirname/filename'.
'''
if not os.path.exists(dirname):
try:
os.makedirs(dirname)
except:
pass
fpath = os.path.join(dirname, filename)
if not os.path.exists(fpath):
try:
open(fpath, "w").close()
except:
pass
return fpath
def _user_path(self, subdir, basename=''):
'''
Gets the full path to the 'subdir/basename' file in the user binwalk directory.
@subdir - Subdirectory inside the user binwalk directory.
@basename - File name inside the subdirectory.
Returns the full path to the 'subdir/basename' file.
'''
return self._file_path(os.path.join(self.user_dir, self.BINWALK_USER_DIR, subdir), basename)
def _system_path(self, subdir, basename=''):
'''
Gets the full path to the 'subdir/basename' file in the system binwalk directory.
@subdir - Subdirectory inside the system binwalk directory.
@basename - File name inside the subdirectory.
Returns the full path to the 'subdir/basename' file.
'''
return self._file_path(os.path.join(self.system_dir, subdir), basename)
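# A minimal sketch of resolving a magic file with Config.find_magic_file(); the
# user directory is searched before the system directory:
#
#   from config import Config
#   path = Config().find_magic_file('binarch')
#   if path is not None:
#       print 'Using magic file:', path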
#################################################################################################################
# Default extract rules loaded when --extract is specified.
#
# <case-insensitive unique string from binwalk output text>:<desired file extension>:<command to execute>
#
# Note that %e is a placeholder for the extracted file name.
#################################################################################################################
# Assumes these utilities are installed in $PATH.
^gzip compressed data:gz:gzip -d -f '%e'
^lzma compressed data:7z:7zr e -y '%e'
^bzip2 compressed data:bz2:bzip2 -d -f '%e'
^compress'd data:Z:compress -d '%e'
^zip archive data:zip:jar xf '%e' # jar does a better job of unzipping than unzip does...
^posix tar archive:tar:tar xvf '%e'
^rar archive data:rar:unrar e '%e'
^arj archive data.*comment header:arj:arj e '%e'
^iso 9660:iso:7z x '%e' -oiso-root
# These assume the firmware-mod-kit is installed to /opt/firmware-mod-kit.
# If not, change the file paths appropriately.
^squashfs filesystem:squashfs:/opt/firmware-mod-kit/unsquashfs_all.sh '%e'
^jffs2 filesystem:jffs2:/opt/firmware-mod-kit/src/jffs2/unjffs2 '%e'
^ascii cpio archive:cpio:/opt/firmware-mod-kit/uncpio.sh '%e'
^cramfs filesystem:cramfs:/opt/firmware-mod-kit/uncramfs_all.sh '%e'
^bff volume entry:bff:/opt/firmware-mod-kit/src/bff/bffxtractor.py '%e'
^wdk file system:wdk:/opt/firmware-mod-kit/src/firmware-tools/unwdk.py '%e'
^zlib header:zlib:/opt/firmware-mod-kit/src/firmware-tools/unzlib.py '%e'
^ext2 filesystem:ext2:/opt/firmware-mod-kit/src/mountcp/mountcp '%e' ext2-root
^romfs filesystem:romfs:/opt/firmware-mod-kit/src/mountcp/mountcp '%e' romfs-root
# These rules use the deprecated firmware-mod-kit file paths, which included the 'trunk' directory.
# These will only be run if the above file paths don't exist.
^squashfs filesystem:squashfs:/opt/firmware-mod-kit/trunk/unsquashfs_all.sh '%e'
^jffs2 filesystem:jffs2:/opt/firmware-mod-kit/trunk/src/jffs2/unjffs2 '%e' # requires root
^ascii cpio archive:cpio:/opt/firmware-mod-kit/trunk/uncpio.sh '%e'
^cramfs filesystem:cramfs:/opt/firmware-mod-kit/trunk/uncramfs_all.sh '%e'
^bff volume entry:bff:/opt/firmware-mod-kit/trunk/src/bff/bffxtractor.py '%e'
# If FMK isn't installed, try the system's unsquashfs for SquashFS files
^squashfs filesystem:squashfs:unsquashfs '%e'
# Extract, but don't run anything
private key:key
certificate:crt
html document header:html
xml document:xml
import zlib
import math
import os.path
import plugins
import common
import compression
class PlotEntropy(object):
'''
Class to plot entropy data on a graph.
'''
YLIM_MIN = 0
YLIM_MAX = 1.5
XLABEL = 'Offset'
YLABEL = 'Entropy'
LINE_WIDTH = 1.5
COLORS = ['darkgreen', 'blueviolet', 'saddlebrown', 'deeppink', 'goldenrod', 'olive', 'black']
FILE_FORMAT = 'svg'
def __init__(self, x, y, title='Entropy', average=0, file_results={}, show_legend=True, save=False):
'''
Plots entropy data.
@x - List of graph x-coordinates (i.e., data offsets).
@y - List of graph y-coordinates (i.e., entropy for each offset).
@title - Graph title.
@average - The average entropy.
@file_results - Binwalk results, if any.
@show_legend - Set to False to omit the color-coded legend and the result-offset x-axis ticks from the graph.
@save - If set to True, graph will be saved to disk rather than displayed.
Returns None.
'''
import matplotlib.pyplot as plt
import numpy as np
i = 0
trigger = 0
new_ticks = []
color_mappings = {}
plt.clf()
if file_results:
for (offset, results) in file_results:
label = None
description = results[0]['description'].split(',')[0]
if not color_mappings.has_key(description):
if show_legend:
label = description
color_mappings[description] = self.COLORS[i]
i += 1
if i >= len(self.COLORS):
i = 0
plt.axvline(x=offset, label=label, color=color_mappings[description], linewidth=self.LINE_WIDTH)
new_ticks.append(offset)
if show_legend:
plt.legend()
if new_ticks:
new_ticks.sort()
plt.xticks(np.array(new_ticks), new_ticks)
plt.plot(x, y, linewidth=self.LINE_WIDTH)
if average:
plt.plot(x, [average] * len(x), linestyle='--', color='r')
plt.xlabel(self.XLABEL)
plt.ylabel(self.YLABEL)
plt.title(title)
plt.ylim(self.YLIM_MIN, self.YLIM_MAX)
if save:
plt.savefig(common.unique_file_name(title, self.FILE_FORMAT))
else:
plt.show()
class FileEntropy(object):
'''
Class for analyzing and plotting data entropy for a file.
Use of the Entropy class is preferred over calling FileEntropy directly.
'''
DEFAULT_BLOCK_SIZE = 1024
ENTROPY_TRIGGER = 0.9
ENTROPY_MAX = 0.95
def __init__(self, file_name=None, binwalk=None, offset=0, length=None, block=DEFAULT_BLOCK_SIZE, plugins=None, file_results=[], compcheck=False):
'''
Class constructor.
@file_name - The path to the file to analyze.
@binwalk - An instance of the Binwalk class.
@offset - The offset into the data to begin analysis.
@length - The number of bytes to analyze.
@block - The size of the data blocks to analyze.
@plugins - Instance of the Plugins class.
@file_results - Scan results to overlay on the entropy plot graph.
@compcheck - Set to True to enable entropy compression detection.
Returns None.
'''
self.start = offset
self.length = length
self.block = block
self.binwalk = binwalk
self.plugins = plugins
self.total_read = 0
self.current_data_block = ''
self.current_data_block_len = 0
self.current_data_block_offset = 0
self.file_results = file_results
self.do_chisq = compcheck
if file_name is None:
raise Exception("Entropy.__init__ requires at least the file_name option")
if not self.length:
self.length = 0
if not self.start:
self.start = 0
if not self.block:
self.block = self.DEFAULT_BLOCK_SIZE
self.fd = common.BlockFile(file_name, 'rb', offset=self.start, length=self.length)
self.fd.MAX_TRAILING_SIZE = 0
if self.fd.READ_BLOCK_SIZE < self.block:
self.fd.READ_BLOCK_SIZE = self.block
if self.binwalk:
# Set the total_scanned and scan_length values for plugins and status display messages
self.binwalk.total_scanned = 0
self.binwalk.scan_length = self.fd.length
def __enter__(self):
return self
def __del__(self):
self.cleanup()
def __exit__(self, t, v, traceback):
self.cleanup()
def cleanup(self):
'''
Clean up any open file objects.
Called internally by __del__ and __exit__.
Returns None.
'''
try:
self.fd.close()
except:
pass
def _read_block(self):
offset = self.total_read
if self.current_data_block_offset >= self.current_data_block_len:
self.current_data_block_offset = 0
(self.current_data_block, self.current_data_block_len) = self.fd.read_block()
if self.current_data_block and (self.current_data_block_len-self.current_data_block_offset) >= self.block:
data = self.current_data_block[self.current_data_block_offset:self.current_data_block_offset+self.block]
dlen = self.block
else:
data = ''
dlen = 0
self.current_data_block_offset += dlen
self.total_read += dlen
if self.binwalk:
self.binwalk.total_scanned = self.total_read
return (dlen, data, offset+self.start)
def gzip(self, offset, data, truncate=True):
'''
Performs an entropy analysis based on zlib compression ratio.
This is faster than the shannon entropy analysis, but not as accurate.
'''
# Entropy is a simple ratio of: <zlib compressed size> / <original size>
e = float(len(zlib.compress(data, 9))) / float(len(data))
if truncate and e > 1.0:
e = 1.0
return e
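# For intuition: zlib compresses repetitive data down to a small fraction of its
# original size, while incompressible data can produce a ratio slightly above 1.0
# (hence the truncation above):
#
#   self.gzip(0, 'A' * 1024)   # -> close to 0.0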
def shannon(self, offset, data):
'''
Performs a Shannon entropy analysis on a given block of data.
'''
entropy = 0
dlen = len(data)
if not data:
return 0
for x in range(256):
p_x = float(data.count(chr(x))) / dlen
if p_x > 0:
entropy += - p_x*math.log(p_x, 2)
return (entropy / 8)
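# For intuition: shannon() normalizes to the 0.0 - 1.0 range (entropy in bits
# per byte, divided by 8):
#
#   self.shannon(0, 'A' * 1024)                      # -> 0.0 (constant data)
#   self.shannon(0, ''.join(map(chr, range(256))))   # -> 1.0 (uniform data)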
def _do_analysis(self, algorithm):
'''
Performs an entropy analysis using the provided algorithm.
@algorithm - A function/method to call which returns an entropy value.
Returns a tuple of ([x-coordinates], [y-coordinates], average_entropy), where:
o x-coordinates = A list of offsets analyzed inside the data.
o y-coordinates = A corresponding list of entropy for each offset.
'''
offsets = []
entropy = []
average = 0
total = 0
self.total_read = 0
plug_ret = plugins.PLUGIN_CONTINUE
plug_pre_ret = plugins.PLUGIN_CONTINUE
if self.plugins:
plug_pre_ret = self.plugins._pre_scan_callbacks(self.fd)
while not ((plug_pre_ret | plug_ret) & plugins.PLUGIN_TERMINATE):
(dlen, data, offset) = self._read_block()
if not dlen or not data:
break
e = algorithm(offset, data)
results = {'description' : '%f' % e, 'offset' : offset}
if self.plugins:
plug_ret = self.plugins._scan_callbacks(results)
offset = results['offset']
e = float(results['description'])
if not ((plug_pre_ret | plug_ret) & (plugins.PLUGIN_TERMINATE | plugins.PLUGIN_NO_DISPLAY)):
if self.binwalk and not self.do_chisq:
self.binwalk.display.results(offset, [results])
entropy.append(e)
offsets.append(offset)
total += e
try:
# This results in a divide by zero if one/all plugins returns PLUGIN_TERMINATE or PLUGIN_NO_DISPLAY,
# or if the file being scanned is a zero-size file.
average = float(float(total) / float(len(offsets)))
except:
pass
if self.plugins:
self.plugins._post_scan_callbacks(self.fd)
return (offsets, entropy, average)
def _look_for_compression(self, x, y):
'''
Analyzes areas of high entropy for signs of compression or encryption and displays the results.
'''
trigger = self.ENTROPY_TRIGGER
pairs = []
scan_pairs = []
index = -1
total = 0
if not self.file_results:
for j in range(0, len(x)):
if y[j] >= trigger and (j == 0 or y[j-1] < trigger):
pairs.append([x[j]])
index = len(pairs) - 1
elif y[j] <= trigger and y[j-1] > trigger and index > -1 and len(pairs[index]) == 1:
pairs[index].append(x[j])
# Generate a list of tuples containing the starting offset to begin analysis plus a length
for pair in pairs:
start = pair[0]
if len(pair) == 2:
stop = pair[1]
else:
self.fd.seek(0, 2)
stop = self.fd.tell()
length = stop - start
total += length
scan_pairs.append((start, length))
# Update the binwalk scan length and total scanned values so that the percent complete
# isn't stuck at 100% after the initial entropy analysis (which has already finished).
if self.binwalk and total > 0:
self.binwalk.scan_length = total
self.binwalk.total_scanned = 0
# Analyze each scan pair and display the results
for (start, length) in scan_pairs:
# Ignore anything less than 4KB in size
if length > (self.DEFAULT_BLOCK_SIZE * 4):
# Ignore the first and last 1KB of data to prevent header/footer or extra data from skewing results
result = compression.CompressionEntropyAnalyzer(self.fd.name, start+self.DEFAULT_BLOCK_SIZE, length-(self.DEFAULT_BLOCK_SIZE*2)).analyze()
results = [{'description' : result[0]['description'], 'offset' : start}]
self.file_results.append((start, results))
if self.binwalk:
self.binwalk.display.results(start, results)
# Keep the total scanned length updated
if self.binwalk:
self.binwalk.total_scanned += length
def analyze(self, algorithm=None):
'''
Performs an entropy analysis of the data using the specified algorithm.
@algorithm - A method inside of the Entropy class to invoke for entropy analysis.
Default method: self.shannon.
Other available methods: self.gzip.
May also be a string: 'gzip'.
Returns the return value of algorithm.
'''
algo = self.shannon
if algorithm:
if callable(algorithm):
algo = algorithm
try:
if algorithm.lower() == 'gzip':
algo = self.gzip
except:
pass
return self._do_analysis(algo)
def plot(self, x, y, average=0, show_legend=True, save=False):
'''
Plots entropy data.
@x - List of graph x-coordinates (i.e., data offsets).
@y - List of graph y-coordinates (i.e., entropy for each offset).
@average - The average entropy.
@show_legend - Set to False to omit the color-coded legend and the result-offset x-axis ticks from the graph.
@save - If set to True, graph will be saved to disk rather than displayed.
Returns None.
'''
if self.do_chisq:
self._look_for_compression(x, y)
PlotEntropy(x, y, self.fd.name, average, self.file_results, show_legend, save)
class Entropy(object):
'''
Class for analyzing and plotting data entropy for multiple files.
A simple example of performing a binwalk scan and overlaying the binwalk scan results on the
resulting entropy analysis graph:
import sys
import binwalk
bwalk = binwalk.Binwalk()
scan_results = bwalk.scan(sys.argv[1])
with binwalk.entropy.Entropy(scan_results, bwalk) as e:
e.analyze()
bwalk.cleanup()
'''
DESCRIPTION = "ENTROPY ANALYSIS"
ALT_DESCRIPTION = "HEURISTIC ANALYSIS"
ENTROPY_SCAN = 'entropy'
def __init__(self, files, binwalk=None, offset=0, length=0, block=0, plot=True, legend=True, save=False, algorithm=None, load_plugins=True, whitelist=[], blacklist=[], compcheck=False):
'''
Class constructor.
@files - A dictionary containing file names and results data, as returned by Binwalk.scan.
@binwalk - An instance of the Binwalk class.
@offset - The offset into the data to begin analysis.
@length - The number of bytes to analyze.
@block - The size of the data blocks to analyze.
@plot - Set to False to disable plotting.
@legend - Set to False to exclude the legend and custom offset markers from the plot.
@save - Set to True to save plots to disk instead of displaying them.
@algorithm - Set to 'gzip' to use the gzip entropy "algorithm".
@load_plugins - Set to False to disable plugin callbacks.
@whitelist - A list of whitelisted plugins.
@blacklist - A list of blacklisted plugins.
@compcheck - Set to True to enable entropy compression detection.
Returns None.
'''
self.files = files
self.binwalk = binwalk
self.offset = offset
self.length = length
self.block = block
self.plot = plot
self.legend = legend
self.save = save
self.algorithm = algorithm
self.plugins = None
self.load_plugins = load_plugins
self.whitelist = whitelist
self.blacklist = blacklist
self.compcheck = compcheck
if len(self.files) > 1:
self.save = True
if self.binwalk:
self.binwalk.scan_type = self.binwalk.ENTROPY
def __enter__(self):
return self
def __exit__(self, t, v, traceback):
return None
def __del__(self):
return None
def set_entropy_algorithm(self, algorithm):
'''
Specify a function/method to call for determining data entropy.
@algorithm - The function/method to call. This will be passed two arguments:
the file offset of the data block, and a data block (type 'str').
It must return a single floating point entropy value between 0.0 and 1.0, inclusive.
Returns None.
'''
self.algorithm = algorithm
def analyze(self):
'''
Perform an entropy analysis on the target files.
Returns a dictionary of:
{
'file_name' : ([list, of, offsets], [list, of, entropy], average_entropy)
}
'''
results = {}
if self.binwalk and self.load_plugins:
self.plugins = plugins.Plugins(self.binwalk, whitelist=self.whitelist, blacklist=self.blacklist)
for (file_name, overlay) in self.files.iteritems():
if self.plugins:
self.plugins._load_plugins()
if self.binwalk:
if self.compcheck:
desc = self.ALT_DESCRIPTION
else:
desc = self.DESCRIPTION
self.binwalk.display.header(file_name=file_name, description=desc)
with FileEntropy(file_name=file_name, binwalk=self.binwalk, offset=self.offset, length=self.length, block=self.block, plugins=self.plugins, file_results=overlay, compcheck=self.compcheck) as e:
(x, y, average) = e.analyze(self.algorithm)
if self.plot or self.save:
e.plot(x, y, average, self.legend, self.save)
results[file_name] = (x, y, average)
if self.binwalk:
self.binwalk.display.footer()
if self.plugins:
del self.plugins
self.plugins = None
return results
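# A minimal sketch of supplying a custom entropy algorithm (see
# set_entropy_algorithm for the required signature). The constant score below is
# a hypothetical stand-in, not a real entropy measure, and 'firmware.bin' must exist:
#
#   def fixed_score(offset, data):
#       return 0.5
#
#   with Entropy({'firmware.bin' : []}, plot=False) as e:
#       e.set_entropy_algorithm(fixed_score)
#       results = e.analyze()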
import os
import re
import sys
import shlex
import tempfile
import subprocess
from config import *
from common import file_size, unique_file_name, BlockFile
class Extractor:
'''
Extractor class, responsible for extracting files from the target file and executing external applications, if requested.
An instance of this class is accessible via the Binwalk.extractor object.
Example usage:
import binwalk
bw = binwalk.Binwalk()
# Create extraction rules for scan results containing the string 'gzip compressed data' and 'filesystem'.
# The former will be saved to disk with a file extension of 'gz' and the command 'gunzip <file name on disk>' will be executed (note the %e placeholder).
# The latter will be saved to disk with a file extension of 'fs' and no command will be executed.
# These rules will be ignored if there were previous rules with the same match string.
bw.extractor.add_rule(['gzip compressed data:gz:gunzip %e', 'filesystem:fs'])
# Load the extraction rules from the default extract.conf file(s).
bw.extractor.load_defaults()
# Run the binwalk scan.
bw.scan('firmware.bin')
'''
# Extract rules are delimited with a colon.
# <case insensitive matching string>:<file extension>[:<command to run>]
RULE_DELIM = ':'
# Comments in the extract.conf files start with a pound
COMMENT_DELIM = '#'
# Placeholder for the extracted file name in the command
FILE_NAME_PLACEHOLDER = '%e'
# Max size of data to read/write at one time when extracting data
MAX_READ_SIZE = 10 * 1024 * 1024
def __init__(self, verbose=False):
'''
Class constructor.
@verbose - Set to True to display the output from any executed external applications.
Returns None.
'''
self.config = Config()
self.enabled = False
self.delayed = False
self.verbose = verbose
self.extract_rules = []
self.remove_after_execute = False
self.extract_path = os.getcwd()
def append_rule(self, r):
self.enabled = True
self.extract_rules.append(r.copy())
def add_rule(self, txtrule=None, regex=None, extension=None, cmd=None):
'''
Adds a set of rules to the extraction rule list.
@txtrule - Rule string, or list of rule strings, in the format <regular expression>:<file extension>[:<command to run>]
@regex - If rule string is not specified, this is the regular expression string to use.
@extension - If rule string is not specified, this is the file extension to use.
@cmd - If rule string is not specified, this is the command to run.
Alternatively a callable object may be specified, which will be passed one argument: the path to the file to extract.
Returns None.
'''
rules = []
match = False
r = {
'extension' : '',
'cmd' : '',
'regex' : None
}
# Process single explicitly specified rule
if not txtrule and regex and extension:
r['extension'] = extension
r['regex'] = re.compile(regex)
if cmd:
r['cmd'] = cmd
self.append_rule(r)
return
# Process rule string, or list of rule strings
if not isinstance(txtrule, type([])):
rules = [txtrule]
else:
rules = txtrule
for rule in rules:
r['cmd'] = ''
r['extension'] = ''
try:
values = self._parse_rule(rule)
match = values[0]
r['regex'] = re.compile(values[0])
r['extension'] = values[1]
r['cmd'] = values[2]
except:
pass
# Verify that the match string and file extension were retrieved.
if match and r['extension']:
self.append_rule(r)
def remove_rule(self, text):
'''
Remove all rules that match a specified text.
@text - The text to match against.
Returns the number of rules removed.
'''
rm = []
for i in range(0, len(self.extract_rules)):
if self.extract_rules[i]['regex'].match(text):
rm.append(i)
# Pop indices in reverse order so that earlier indices remain valid
for i in reversed(rm):
self.extract_rules.pop(i)
return len(rm)
def clear_rules(self):
'''
Deletes all extraction rules.
Returns None.
'''
self.extract_rules = []
self.enabled = False
def get_rules(self):
'''
Returns a list of all extraction rules.
'''
return self.extract_rules
def enable_delayed_extract(self, tf=None):
'''
Enables / disables the delayed extraction feature.
This feature ensures that certain supported file types will not contain extra data at the end of the
file when they are extracted, but also means that these files will not be extracted until the end of the scan.
@tf - Set to True to enable, False to disable.
Returns the current delayed extraction setting.
'''
if tf is not None:
self.delayed = tf
return self.delayed
def load_from_file(self, fname):
'''
Loads extraction rules from the specified file.
@fname - Path to the extraction rule file.
Returns None.
'''
try:
# Process each line from the extract file, ignoring comments
for rule in open(fname).readlines():
self.add_rule(rule.split(self.COMMENT_DELIM, 1)[0])
except Exception, e:
raise Exception("Extractor.load_from_file failed to load file '%s': %s" % (fname, str(e)))
def load_defaults(self):
'''
Loads default extraction rules from the user and system extract.conf files.
Returns None.
'''
# Load the user extract file first to ensure its rules take precedence.
extract_files = [
self.config.paths['user'][self.config.EXTRACT_FILE],
self.config.paths['system'][self.config.EXTRACT_FILE],
]
for extract_file in extract_files:
try:
self.load_from_file(extract_file)
except Exception, e:
if self.verbose:
raise Exception("Extractor.load_defaults failed to load file '%s': %s" % (extract_file, str(e)))
def output_directory(self, path):
'''
Set the output directory for extracted files.
@path - The extraction path.
Returns None.
'''
self.extract_path = path
def cleanup_extracted_files(self, tf=None):
'''
Set the action to take after a file is extracted.
@tf - If set to True, extracted files will be cleaned up after running a command against them.
If set to False, extracted files will not be cleaned up after running a command against them.
If set to None or not specified, the current setting will not be changed.
Returns the current cleanup status (True/False).
'''
if tf is not None:
self.remove_after_execute = tf
return self.remove_after_execute
def extract(self, offset, description, file_name, size, name=None):
'''
Extract an embedded file from the target file, if it matches an extract rule.
Called automatically by Binwalk.scan().
@offset - Offset inside the target file to begin the extraction.
@description - Description of the embedded file to extract, as returned by libmagic.
@file_name - Path to the target file.
@size - Number of bytes to extract.
@name - Name to save the file as.
Returns the name of the extracted file (blank string if nothing was extracted).
'''
fname = ''
cleanup_extracted_fname = True
original_dir = os.getcwd()
rules = self._match(description)
# No extraction rules for this file
if not rules:
return ''
if not os.path.exists(self.extract_path):
os.mkdir(self.extract_path)
file_path = os.path.realpath(file_name)
if os.path.isfile(file_path):
os.chdir(self.extract_path)
# Loop through each extraction rule until one succeeds
for i in range(0, len(rules)):
rule = rules[i]
# Copy out the data to disk, if we haven't already
fname = self._dd(file_path, offset, size, rule['extension'], output_file_name=name)
# If there was a command specified for this rule, try to execute it.
# If execution fails, the next rule will be attempted.
if rule['cmd']:
# Many extraction utilities will extract the file to a new file, just without
# the file extension (i.e., myfile.7z -> myfile). If the presumed resulting
# file name already exists before executing the extract command, do not attempt
# to clean it up even if its resulting file size is 0.
if self.remove_after_execute:
extracted_fname = os.path.splitext(fname)[0]
if os.path.exists(extracted_fname):
cleanup_extracted_fname = False
# Execute the specified command against the extracted file
extract_ok = self.execute(rule['cmd'], fname)
# Only clean up files if remove_after_execute was specified
if extract_ok and self.remove_after_execute:
# Remove the original file that we extracted
try:
os.unlink(fname)
except:
pass
# If the command worked, assume it removed the file extension from the extracted file
# If the extracted file name file exists and is empty, remove it
if cleanup_extracted_fname and os.path.exists(extracted_fname) and file_size(extracted_fname) == 0:
try:
os.unlink(extracted_fname)
except:
pass
# If the command executed OK, don't try any more rules
if extract_ok:
break
# Else, remove the extracted file if this isn't the last rule in the list.
# If it is the last rule, leave the file on disk for the user to examine.
elif i != (len(rules)-1):
try:
os.unlink(fname)
except:
pass
# If there was no command to execute, just use the first rule
else:
break
os.chdir(original_dir)
# If a file was extracted, return the full path to that file
if fname:
fname = os.path.join(self.extract_path, fname)
return fname
def delayed_extract(self, results, file_name, size):
'''
Performs a delayed extraction (see self.enable_delayed_extract).
Called internally by Binwalk.Scan().
@results - A list of dictionaries of all the scan results.
@file_name - The path to the scanned file.
@size - The size of the scanned file.
Returns an updated results list containing the names of the newly extracted files.
'''
index = 0
info_count = 0
nresults = results
for (offset, infos) in results:
info_count = 0
for info in infos:
ninfos = infos
if info['delay']:
end_offset = self._entry_offset(index, results, info['delay'])
if end_offset == -1:
extract_size = size
else:
extract_size = (end_offset - offset)
ninfos[info_count]['extract'] = self.extract(offset, info['description'], file_name, extract_size, info['name'])
nresults[index] = (offset, ninfos)
info_count += 1
index += 1
return nresults
def _entry_offset(self, index, entries, description):
'''
Gets the offset of the first entry that matches the description.
@index - Index into the entries list to begin searching.
@entries - Dictionary of result entries.
@description - Case insensitive description.
Returns the offset, if a matching description is found.
Returns -1 if a matching description is not found.
'''
description = description.lower()
for (offset, infos) in entries[index:]:
for info in infos:
if info['description'].lower().startswith(description):
return offset
return -1
def _match(self, description):
'''
Check to see if the provided description string matches an extract rule.
Called internally by self.extract().
@description - Description string to check.
Returns the associated rule dictionary if a match is found.
Returns None if no match is found.
'''
rules = []
description = description.lower()
for rule in self.extract_rules:
if rule['regex'].search(description):
rules.append(rule)
return rules
def _parse_rule(self, rule):
'''
Parses an extraction rule.
@rule - Rule string.
Returns an array of ['<case insensitive matching string>', '<file extension>', '<command to run>'].
'''
return rule.strip().split(self.RULE_DELIM, 2)
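# For example, the rule string is split on at most two delimiters, so the command
# portion may itself contain colons:
#
#   self._parse_rule('gzip:gz:gunzip %e')   # -> ['gzip', 'gz', 'gunzip %e']
#   self._parse_rule('filesystem:fs')       # -> ['filesystem', 'fs']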
def _dd(self, file_name, offset, size, extension, output_file_name=None):
'''
Extracts a file embedded inside the target file.
@file_name - Path to the target file.
@offset - Offset inside the target file where the embedded file begins.
@size - Number of bytes to extract.
@extension - The file extension to assign to the extracted file on disk.
@output_file_name - The requested name of the output file.
Returns the extracted file name.
'''
total_size = 0
# Default extracted file name is <hex offset>.<extension>
default_bname = "%X" % offset
if not output_file_name:
bname = default_bname
else:
# Strip the output file name of invalid/dangerous characters (like file paths)
bname = os.path.basename(output_file_name)
fname = unique_file_name(bname, extension)
try:
# Open the target file and seek to the offset
fdin = BlockFile(file_name, "rb", length=size)
fdin.seek(offset)
# Open the output file
try:
fdout = BlockFile(fname, "wb")
except Exception, e:
# Fall back to the default name if the requested name fails
fname = unique_file_name(default_bname, extension)
fdout = BlockFile(fname, "wb")
while total_size < size:
(data, dlen) = fdin.read_block()
fdout.write(data[:dlen])
total_size += dlen
# Cleanup
fdout.close()
fdin.close()
except Exception, e:
raise Exception("Extractor.dd failed to extract data from '%s' to '%s': %s" % (file_name, fname, str(e)))
return fname
def execute(self, cmd, fname):
'''
Execute a command against the specified file.
@cmd - Command to execute.
@fname - File to run command against.
Returns True on success, False on failure.
'''
tmp = None
retval = True
try:
if callable(cmd):
try:
cmd(fname)
except Exception, e:
sys.stderr.write("WARNING: Extractor.execute failed to run '%s': %s\n" % (str(cmd), str(e)))
else:
# If not in verbose mode, create a temporary file to redirect stdout and stderr to
if not self.verbose:
tmp = tempfile.TemporaryFile()
# Replace all instances of FILE_NAME_PLACEHOLDER in the command with fname
cmd = cmd.replace(self.FILE_NAME_PLACEHOLDER, fname)
# Execute.
if subprocess.call(shlex.split(cmd), stdout=tmp, stderr=tmp) != 0:
retval = False
except Exception, e:
# Silently ignore no such file or directory errors. Why? Because these will inevitably be raised when
# making the switch to the new firmware mod kit directory structure. We handle this elsewhere, but it's
# annoying to see this spammed out to the console every time.
if e.errno != 2:
sys.stderr.write("WARNING: Extractor.execute failed to run '%s': %s\n" % (str(cmd), str(e)))
retval = False
if tmp is not None:
tmp.close()
return retval
import re
import common
from smartsignature import SmartSignature
class MagicFilter:
'''
Class to filter libmagic results based on include/exclude rules and false positive detection.
An instance of this class is available via the Binwalk.filter object.
Note that all filter strings should be in lower case.
Example code which creates include, exclude, and grep filters before running a binwalk scan:
import binwalk
bw = binwalk.Binwalk()
# Include all signatures whose descriptions contain the string 'filesystem' in the first line of the signature, even if those signatures are normally excluded.
# Note that if exclusive=False was specified, this would merely add these signatures to the default signatures.
# Since exclusive=True (the default) has been specified, ONLY those matching signatures will be loaded; all others will be ignored.
bw.filter.include('filesystem')
# Exclude all signatures whose descriptions contain the string 'jffs2', even if those signatures are normally included.
# In this case, we are now searching for all filesystem signatures, except JFFS2.
bw.filter.exclude('jffs2')
# Add a grep filter. Unlike the include and exclude filters, it does not affect which results are returned by Binwalk.scan(), but it does affect which results
# are printed by Binwalk.display.results(). This is particularly useful for cases like the bincast scan, where multiple lines of results are returned per offset,
but you only want certain ones displayed. In this case, only file systems whose descriptions contain the string '2012' will be displayed.
bw.filter.grep(filters=['2012'])
bw.scan('firmware.bin')
'''
# Results that are simply "data", or that contain the text 'invalid' or a
# backslash, are known to be invalid/false positives.
DATA_RESULT = "data"
INVALID_RESULTS = ["invalid", "\\"]
INVALID_RESULT = "invalid"
NON_PRINTABLE_RESULT = "\\"
FILTER_INCLUDE = 0
FILTER_EXCLUDE = 1
def __init__(self, show_invalid_results=False):
'''
Class constructor.
@show_invalid_results - Set to True to display results marked as invalid.
Returns None.
'''
self.filters = []
self.grep_filters = []
self.show_invalid_results = show_invalid_results
self.exclusive_filter = False
self.smart = SmartSignature(self)
def include(self, match, exclusive=True):
'''
Adds a new filter which explicitly includes results that contain
the specified matching text.
@match - Regex, or list of regexs, to match.
@exclusive - If True, then results that do not explicitly contain
a FILTER_INCLUDE match will be excluded. If False,
signatures that contain the FILTER_INCLUDE match will
be included in the scan, but will not cause non-matching
results to be excluded.
Returns None.
'''
if not isinstance(match, type([])):
matches = [match]
else:
matches = match
for m in matches:
include_filter = {}
if m:
if exclusive and not self.exclusive_filter:
self.exclusive_filter = True
include_filter['type'] = self.FILTER_INCLUDE
include_filter['filter'] = m
include_filter['regex'] = re.compile(m)
self.filters.append(include_filter)
def exclude(self, match):
'''
Adds a new filter which explicitly excludes results that contain
the specified matching text.
@match - Regex, or list of regexs, to match.
Returns None.
'''
if not isinstance(match, type([])):
matches = [match]
else:
matches = match
for m in matches:
exclude_filter = {}
if m:
exclude_filter['type'] = self.FILTER_EXCLUDE
exclude_filter['filter'] = m
exclude_filter['regex'] = re.compile(m)
self.filters.append(exclude_filter)
def filter(self, data):
'''
Checks to see if a given string should be excluded from or included in the results.
Called internally by Binwalk.scan().
@data - String to check.
Returns FILTER_INCLUDE if the string should be included.
Returns FILTER_EXCLUDE if the string should be excluded.
'''
data = data.lower()
# Loop through the filters to see if any of them are a match.
# If so, return the registered type for the matching filter (FILTER_INCLUDE | FILTER_EXCLUDE).
for f in self.filters:
if f['regex'].search(data):
return f['type']
# If there was no explicit match and exclusive filtering is enabled, return FILTER_EXCLUDE.
if self.exclusive_filter:
return self.FILTER_EXCLUDE
return self.FILTER_INCLUDE
def invalid(self, data):
'''
Checks if the given string contains invalid data.
Called internally by Binwalk.scan().
@data - String to validate.
Returns True if data is invalid, False if valid.
'''
# A result of 'data' is never ever valid.
if data == self.DATA_RESULT:
return True
# If showing invalid results, just return False.
if self.show_invalid_results:
return False
# Don't include quoted strings or keyword arguments in this search, as
# strings from the target file may legitimately contain the INVALID_RESULT text.
if self.INVALID_RESULT in common.strip_quoted_strings(self.smart._strip_tags(data)):
return True
# There should be no non-printable characters in any of the data
if self.NON_PRINTABLE_RESULT in data:
return True
return False
def grep(self, data=None, filters=[]):
'''
Add or check case-insensitive grep filters against the supplied data string.
@data - Data string to check grep filters against. Not required if filters is specified.
@filters - Regex, or list of regexs, to add to the grep filters list. Not required if data is specified.
Returns None if data is not specified.
If data is specified, returns True if the data contains a grep filter, or if no grep filters exist.
If data is specified, returns False if the data does not contain any grep filters.
'''
# Add any specified filters to self.grep_filters
if filters:
if not isinstance(filters, type([])):
gfilters = [filters]
else:
gfilters = filters
for gfilter in gfilters:
# Filters are case insensitive
self.grep_filters.append(re.compile(gfilter))
# Check the data against all grep filters until one is found
if data is not None:
# If no grep filters have been created, always return True
if not self.grep_filters:
return True
# Filters are case insensitive
data = data.lower()
# If a filter exists in data, return True
for gfilter in self.grep_filters:
if gfilter.search(data):
return True
# Else, return False
return False
return None
def clear(self):
'''
Clears all include, exclude and grep filters.
Returns None.
'''
self.filters = []
self.grep_filters = []
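# A minimal sketch of using MagicFilter stand-alone, assuming this module is
# importable as 'filter'; the description strings are sample libmagic output:
#
#   from filter import MagicFilter
#   f = MagicFilter()
#   f.exclude('jffs2')
#   print f.filter('jffs2 filesystem')        # -> f.FILTER_EXCLUDE
#   print f.invalid('gzip compressed data')   # -> False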
#!/usr/bin/env python
import os
import sys
import string
import curses
import platform
import common
class HexDiff(object):
ALL_SAME = 0
ALL_DIFF = 1
SOME_DIFF = 2
DEFAULT_DIFF_SIZE = 0x100
DEFAULT_BLOCK_SIZE = 16
COLORS = {
'red' : '31',
'green' : '32',
'blue' : '34',
}
def __init__(self, binwalk=None):
self.block_hex = ""
self.printed_alt_text = False
if binwalk:
self._pprint = binwalk.display._pprint
self._show_header = binwalk.display.header
self._footer = binwalk.display.footer
self._display_result = binwalk.display.results
self._grep = binwalk.filter.grep
else:
self._pprint = sys.stdout.write
self._show_header = self._print
self._footer = self._simple_footer
self._display_result = self._print
self._grep = None
if hasattr(sys.stderr, 'isatty') and sys.stderr.isatty() and platform.system() != 'Windows':
curses.setupterm()
self.colorize = self._colorize
else:
self.colorize = self._no_colorize
def _no_colorize(self, c, color="red", bold=True):
return c
def _colorize(self, c, color="red", bold=True):
attr = []
attr.append(self.COLORS[color])
if bold:
attr.append('1')
return "\x1b[%sm%s\x1b[0m" % (';'.join(attr), c)
def _print_block_hex(self, alt_text="*"):
printed = False
if self._grep is None or self._grep(self.block_hex):
self._pprint(self.block_hex)
self.printed_alt_text = False
printed = True
elif not self.printed_alt_text:
self._pprint("%s\n" % alt_text)
self.printed_alt_text = True
printed = True
self.block_hex = ""
return printed
def _build_block(self, c, highlight=None):
if highlight == self.ALL_DIFF:
self.block_hex += self.colorize(c, color="red")
elif highlight == self.ALL_SAME:
self.block_hex += self.colorize(c, color="green")
elif highlight == self.SOME_DIFF:
self.block_hex += self.colorize(c, color="blue")
else:
self.block_hex += c
def _simple_footer(self):
print ""
def _header(self, files, block):
header = "OFFSET "
for i in range(0, len(files)):
f = files[i]
header += "%s" % os.path.basename(f)
if i != len(files)-1:
header += " " * ((block*4) + 10 - len(os.path.basename(f)))
self._show_header(header=header)
def display(self, files, offset=0, size=DEFAULT_DIFF_SIZE, block=DEFAULT_BLOCK_SIZE, show_first_only=False):
i = 0
total = 0
fps = []
data = {}
delim = '/'
if show_first_only:
self._header([files[0]], block)
else:
self._header(files, block)
if common.BlockFile.READ_BLOCK_SIZE < block:
read_block_size = block
else:
read_block_size = common.BlockFile.READ_BLOCK_SIZE
for f in files:
fp = common.BlockFile(f, 'rb', length=size, offset=offset)
fp.READ_BLOCK_SIZE = read_block_size
fp.MAX_TRAILING_SIZE = 0
fps.append(fp)
while total < size:
i = 0
for fp in fps:
(ddata, dlen) = fp.read_block()
data[fp.name] = ddata
while i < read_block_size and (total+i) < size:
diff_same = {}
alt_text = "*" + " " * 6
self._build_block("%.08X " % (total + i + offset))
# For each byte in this block, is the byte the same in all files, the same in some files, or different in all files?
for j in range(0, block):
byte_list = []
try:
c = data[files[0]][j+i]
except:
c = None
for f in files:
try:
c = data[f][j+i]
except Exception, e:
c = None
if c not in byte_list:
byte_list.append(c)
if len(byte_list) == 1:
diff_same[j] = self.ALL_SAME
elif len(byte_list) == len(files):
diff_same[j] = self.ALL_DIFF
else:
diff_same[j] = self.SOME_DIFF
for index in range(0, len(files)):
if show_first_only and index > 0:
break
f = files[index]
alt_text += " " * (3 + (3 * block) + 3 + block + 3)
alt_text += delim
for j in range(0, block):
try:
#print "%s[%d]" % (f, j+i)
self._build_block("%.2X " % ord(data[f][j+i]), highlight=diff_same[j])
except Exception, e:
#print str(e)
self._build_block(" ")
if (j+1) == block:
self._build_block(" |")
for k in range(0, block):
try:
if data[f][k+i] in string.printable and data[f][k+i] not in string.whitespace:
self._build_block(data[f][k+i], highlight=diff_same[k])
else:
self._build_block('.', highlight=diff_same[k])
except:
self._build_block(' ')
if index == len(files)-1 or (show_first_only and index == 0):
self._build_block("|\n")
else:
self._build_block('| %s ' % delim)
if self._print_block_hex(alt_text=alt_text[:-1].strip()):
if delim == '\\':
delim = '/'
else:
delim = '\\'
i += block
total += read_block_size
for fp in fps:
fp.close()
self._footer()
if __name__ == "__main__":
HexDiff().display(sys.argv[1:])
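# Usage sketch (editorial, illustrative only; the module name is assumed): the same
# diff can be produced programmatically rather than via the __main__ entry point above:
#
# from hexdiff import HexDiff
# HexDiff().display(['firmware1.bin', 'firmware2.bin'], offset=0, size=0x100)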
# MIPS prologue
# addiu $sp, -XX
# 27 BD FF XX
0 string \377\275\47 MIPSEL instructions, function prologue{offset-adjust:-1}
0 string \47\275\377 MIPS instructions, function prologue
# MIPS epilogue
# jr $ra
0 belong 0x03e00008 MIPS instructions, function epilogue
0 lelong 0x03e00008 MIPSEL instructions, function epilogue
# PowerPC prologue
# mflr r0
0 belong 0x7C0802A6 PowerPC big endian instructions, function prologue
0 lelong 0x7C0802A6 PowerPC little endian instructions, function prologue
# PowerPC epilogue
# blr
0 belong 0x4E800020 PowerPC big endian instructions, function epilogue
0 lelong 0x4E800020 PowerPC little endian instructions, function epilogue
# ARM prologue
# STMFD SP!, {XX}
0 beshort 0xE92D ARMEB instructions, function prologue
0 leshort 0xE92D ARM instructions, function prologue{offset-adjust:-2}
# ARM epilogue
# LDMFD SP!, {XX}
0 beshort 0xE8BD ARMEB instructions, function epilogue
0 leshort 0xE8BD ARM instructions, function epilogue{offset-adjust:-2}
# Ubicom32 prologue
# move.4 -4($sp)++, $ra
0 belong 0x02FF6125 Ubicom32 instructions, function prologue
# Ubicom32 epilogues
# calli $ra, 0($ra)
# ret ($sp)4++
0 belong 0xF0A000A0 Ubicom32 instructions, function epilogue
0 belong 0x000022E1 Ubicom32 instructions, function epilogue
# AVR8 prologue
# push r28
# push r29
0 belong 0x93CF93DF AVR8 instructions, function prologue
0 belong 0x93DF93CF AVR8 instructions, function prologue
# AVR32 prologue
# pushm r7,lr
# mov r7,sp
0 string \xEB\xCD\x40\x80\x1A\x97 AVR32 instructions, function prologue
# SPARC epilogue
# ret
# restore XX
0 string \x81\xC7\xE0\x08\x81\xE8 SPARC instructions, function epilogue
# x86 prologue
# push ebp
# mov ebp, esp
0 string \x55\x89\xE5 Intel x86 instructions, function prologue
0 belong x Hex: 0x%.8X
#0 string x String: %s
0 lequad x Little Endian Quad: %lld
0 bequad x Big Endian Quad: %lld
0 lelong x Little Endian Long: %d
0 belong x Big Endian Long: %d
0 leshort x Little Endian Short: %d
0 beshort x Big Endian Short: %d
0 ledate x Little Endian Date: %s
0 bedate x Big Endian Date: %s
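# Editorial note: 'x' in the condition column is a wildcard that matches any value, so
# the entries above match at offset 0 unconditionally and simply display the raw data
# in each format; they are display helpers rather than true signatures.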
0 string \x1f\x9d\x90 compress'd data, 16 bits
#0 beshort 0x7801 Zlib header, no compression
0 beshort 0x789c Zlib header, default compression
0 beshort 0x78da Zlib header, best compression
0 beshort 0x785e Zlib header, compressed
#!/usr/bin/env python
# Routines to perform Monte Carlo Pi approximation and Chi Squared tests.
# Used for fingerprinting unknown areas of high entropy (e.g., is this block of high entropy data compressed or encrypted?).
# Inspired by people who actually know what they're doing: http://www.fourmilab.ch/random/
import math
class MonteCarloPi(object):
'''
Performs a Monte Carlo Pi approximation.
'''
def __init__(self):
'''
Class constructor.
Returns None.
'''
self.reset()
def reset(self):
'''
Reset state to the beginning.
'''
self.pi = 0
self.error = 0
self.m = 0
self.n = 0
def update(self, data):
'''
Update the pi approximation with new data.
@data - A string of bytes to update (length must be >= 6).
Returns None.
'''
c = 0
dlen = len(data)
while (c+6) <= dlen:
# Treat 3 bytes as an x coordinate, the next 3 bytes as a y coordinate.
# Our box is 1x1, so divide by 2^24 to put the x y values inside the box.
x = ((ord(data[c]) << 16) + (ord(data[c+1]) << 8) + ord(data[c+2])) / 16777216.0
c += 3
y = ((ord(data[c]) << 16) + (ord(data[c+1]) << 8) + ord(data[c+2])) / 16777216.0
c += 3
# Does the x,y point lie inside the quarter circle of radius 1 centered at the origin?
# For uniformly random data, the fraction of points that do approaches pi/4.
if ((x**2) + (y**2)) <= 1:
self.m += 1
self.n += 1
def montecarlo(self):
'''
Approximates the value of Pi based on the provided data.
Returns a tuple of (approximated value of pi, percent deviation).
'''
if self.n:
self.pi = (float(self.m) / float(self.n) * 4.0)
if self.pi:
self.error = math.fabs(1.0 - (math.pi / self.pi)) * 100.0
return (self.pi, self.error)
else:
return (0.0, 0.0)
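# A minimal self-check sketch (editorial, not part of the original module): uniformly
# random input should drive the approximation toward math.pi, so the reported percent
# deviation should shrink as more data is fed in.
def _montecarlo_example(nbytes=60000):
import os
mc = MonteCarloPi()
mc.update(os.urandom(nbytes)) # os.urandom returns a raw byte string in Python 2
return mc.montecarlo() # -> (approximated pi, percent deviation)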
class ChiSquare(object):
'''
Performs a Chi Squared test against the provided data.
'''
IDEAL = 256.0
def __init__(self):
'''
Class constructor.
Returns None.
'''
self.bytes = {}
self.freedom = self.IDEAL - 1
# Initialize the self.bytes dictionary with keys for all possible byte values (0 - 255)
for i in range(0, int(self.IDEAL)):
self.bytes[chr(i)] = 0
self.reset()
def reset(self):
self.xc2 = 0.0
self.byte_count = 0
for key in self.bytes.keys():
self.bytes[key] = 0
def update(self, data):
'''
Updates the current byte counts with new data.
@data - String of bytes to update.
Returns None.
'''
# Count the number of occurrences of each byte value
for i in data:
self.bytes[i] += 1
self.byte_count += len(data)
def chisq(self):
'''
Calculate the Chi Square critical value.
Returns the critical value.
'''
expected = self.byte_count / self.IDEAL
if expected:
for byte in self.bytes.values():
self.xc2 += ((byte - expected) ** 2 ) / expected
return self.xc2
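# A minimal usage sketch (editorial): chisq() computes sum((observed - expected)**2 / expected)
# over all 256 byte values; uniformly random data should yield a low critical value, while
# skewed byte distributions push it well above MathAnalyzer.CHI_CUTOFF.
def _chisquare_example(nbytes=4096):
import os
cs = ChiSquare()
cs.update(os.urandom(nbytes))
return cs.chisq()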
class MathAnalyzer(object):
'''
Class wrapper around ChiSquare and MonteCarloPi.
Performs analysis and attempts to interpret the results.
'''
# Data blocks must be in multiples of 6 for the monte carlo pi approximation
BLOCK_SIZE = 32
CHI_CUTOFF = 512
def __init__(self, fp, start, length):
'''
Class constructor.
@fp - A seekable, readable, file object that will be the data source.
@start - The start offset to begin analysis at.
@length - The number of bytes to analyze.
Returns None.
'''
self.fp = fp
self.start = start
self.length = length
def analyze(self):
'''
Perform analysis and interpretation.
Returns a descriptive string containing the results and attempted interpretation.
'''
i = 0
num_error = 0
analyzer_results = []
chi = ChiSquare()
self.fp.seek(self.start)
while i < self.length:
rsize = self.length - i
if rsize > self.BLOCK_SIZE:
rsize = self.BLOCK_SIZE
chi.reset()
chi.update(self.fp.read(rsize))
if chi.chisq() >= self.CHI_CUTOFF:
num_error += 1
i += rsize
if num_error > 0:
verdict = 'Low/medium entropy data block'
else:
verdict = 'High entropy data block'
result = '%s, %d low entropy blocks' % (verdict, num_error)
return result
if __name__ == "__main__":
import sys
rsize = 0
largest = (0, 0)
num_error = 0
data = open(sys.argv[1], 'rb').read()
try:
block_size = int(sys.argv[2], 0)
except:
block_size = 32
chi = ChiSquare()
while rsize < len(data):
chi.reset()
d = data[rsize:rsize+block_size]
if len(d) < block_size:
break
chi.update(d)
if chi.chisq() >= 512:
sys.stderr.write("0x%X -> %d\n" % (rsize, chi.xc2))
num_error += 1
if chi.xc2 >= largest[1]:
largest = (rsize, chi.xc2)
rsize += block_size
sys.stderr.write("Number of deviations: %d\n" % num_error)
sys.stderr.write("Largest deviation: %d at offset 0x%X\n" % (largest[1], largest[0]))
print "Data:",
if num_error != 0:
print "Compressed"
else:
print "Encrypted"
print "Confidence:",
if num_error >= 5 or num_error == 0:
print "High"
elif num_error in [3,4]:
print "Medium"
else:
print "Low"
import re
import sys
import os.path
import tempfile
from common import str2int
class MagicParser:
'''
Class for loading, parsing and creating libmagic-compatible magic files.
This class is primarily used internally by the Binwalk class, and a class instance of it is available via the Binwalk.parser object.
One useful method however, is file_from_string(), which will generate a temporary magic file from a given signature string:
import binwalk
bw = binwalk.Binwalk()
# Create a temporary magic file that contains a single entry with a signature of '\\x00FOOBAR\\xFF', and append the resulting
# temporary file name to the list of magic files in the Binwalk class instance.
bw.magic_files.append(bw.parser.file_from_string('\\x00FOOBAR\\xFF', display_name='My custom signature'))
bw.scan('firmware.bin')
All magic files generated by this class will be deleted when the class destructor is called.
'''
BIG_ENDIAN = 'big'
LITTLE_ENDIAN = 'little'
MAGIC_STRING_FORMAT = "%d\tstring\t%s\t%s\n"
DEFAULT_DISPLAY_NAME = "Raw string signature"
WILDCARD = 'x'
# If libmagic returns multiple results, they are delimited with this string.
RESULT_SEPERATOR = "\\012- "
def __init__(self, filter=None, smart=None):
'''
Class constructor.
@filter - Instance of the MagicFilter class. May be None if the parse/parse_file methods are not used.
@smart - Instance of the SmartSignature class. May be None if the parse/parse_file methods are not used.
Returns None.
'''
self.matches = set([])
self.signatures = {}
self.filter = filter
self.smart = smart
self.raw_fd = None
self.signature_count = 0
self.fd = tempfile.NamedTemporaryFile()
def __del__(self):
try:
self.cleanup()
except:
pass
def rm_magic_file(self):
'''
Cleans up the temporary magic file generated by self.parse.
Returns None.
'''
try:
self.fd.close()
except:
pass
def cleanup(self):
'''
Cleans up any tempfiles created by the class instance.
Returns None.
'''
self.rm_magic_file()
try:
self.raw_fd.close()
except:
pass
def file_from_string(self, signature_string, offset=0, display_name=DEFAULT_DISPLAY_NAME):
'''
Generates a magic file from a signature string.
This method is intended to be used once per instance.
If invoked multiple times, any previously created magic files will be closed and deleted.
@signature_string - The string signature to search for.
@offset - The offset at which the signature should occur.
@display_name - The text to display when the signature is found.
Returns the name of the generated temporary magic file.
'''
self.raw_fd = tempfile.NamedTemporaryFile()
self.raw_fd.write(self.MAGIC_STRING_FORMAT % (offset, signature_string, display_name))
self.raw_fd.seek(0)
return self.raw_fd.name
def parse(self, file_name):
'''
Parses magic file(s) and concatenates them into a single temporary magic file
while simultaneously removing filtered signatures.
@file_name - Magic file, or list of magic files, to parse.
Returns the name of the generated temporary magic file, which will be automatically
deleted when the class destructor is called.
'''
if isinstance(file_name, type([])):
files = file_name
else:
files = [file_name]
for fname in files:
if os.path.exists(fname):
self.parse_file(fname)
else:
sys.stdout.write("WARNING: Magic file '%s' does not exist!\n" % fname)
self.fd.seek(0)
return self.fd.name
def parse_file(self, file_name):
'''
Parses a magic file and appends valid signatures to the temporary magic file, as allowed
by the existing filter rules.
@file_name - Magic file to parse.
Returns None.
'''
# Default to not including signature entries until we've
# found what looks like a valid entry.
include = False
line_count = 0
try:
for line in open(file_name).readlines():
line_count += 1
# Check if this is the first line of a signature entry
entry = self._parse_line(line)
if entry is not None:
# If this signature is marked for inclusion, include it.
if self.filter.filter(entry['description']) == self.filter.FILTER_INCLUDE:
include = True
self.signature_count += 1
if not self.signatures.has_key(entry['offset']):
self.signatures[entry['offset']] = []
if entry['condition'] not in self.signatures[entry['offset']]:
self.signatures[entry['offset']].append(entry['condition'])
else:
include = False
# Keep writing lines of the signature to the temporary magic file until
# we detect a signature that should not be included.
if include:
self.fd.write(line)
self.build_signature_set()
except Exception, e:
raise Exception("Error parsing magic file '%s' on line %d: %s" % (file_name, line_count, str(e)))
def _parse_line(self, line):
'''
Parses a signature line into its four parts (offset, type, condition and description),
looking for the first line of a given signature.
@line - The signature line to parse.
Returns a dictionary with the respective line parts populated if the line is the first of a signature.
Returns None if the line is not the first line of a signature.
'''
entry = {
'offset' : '',
'type' : '',
'condition' : '',
'description' : '',
'length' : 0
}
# Quick and dirty pre-filter. We are only concerned with the first line of a
# signature, which will always start with a number. Make sure the first byte of
# the line is a number; if not, don't process.
if line[:1] < '0' or line[:1] > '9':
return None
try:
# Split the line into white-space separated parts.
# For this to work properly, replace escaped spaces ('\ ') with '\x20'.
# This means the same thing, but doesn't confuse split().
line_parts = line.replace('\\ ', '\\x20').split()
entry['offset'] = line_parts[0]
entry['type'] = line_parts[1]
# The condition line may contain escaped sequences, so be sure to decode it properly.
entry['condition'] = line_parts[2].decode('string_escape')
entry['description'] = ' '.join(line_parts[3:])
except Exception, e:
raise Exception("%s :: %s", (str(e), line))
# We've already verified that the first character in this line is a number, so this *shouldn't*
# throw an exception, but let's catch it just in case...
try:
entry['offset'] = str2int(entry['offset'])
except Exception, e:
raise Exception("%s :: %s", (str(e), line))
# If this is a string, get the length of the string
if 'string' in entry['type'] or entry['condition'] == self.WILDCARD:
entry['length'] = len(entry['condition'])
# Else, we need to jump through a few more hoops...
else:
# Default to little endian, unless the type field starts with 'be'.
# This assumes that we're running on a little endian system...
if entry['type'].startswith('be'):
endianess = self.BIG_ENDIAN
else:
endianess = self.LITTLE_ENDIAN
# Try to convert the condition to an integer. This does not allow
# for more advanced conditions for the first line of a signature,
# but needing that is rare.
try:
intval = str2int(entry['condition'].strip('L'))
except Exception, e:
raise Exception("Failed to evaluate condition for '%s' type: '%s', condition: '%s', error: %s" % (entry['description'], entry['type'], entry['condition'], str(e)))
# How long is the field type?
if entry['type'] == 'byte':
entry['length'] = 1
elif 'short' in entry['type']:
entry['length'] = 2
elif 'long' in entry['type']:
entry['length'] = 4
elif 'quad' in entry['type']:
entry['length'] = 8
# Convert the integer value to a string of the appropriate endianess
entry['condition'] = self._to_string(intval, entry['length'], endianess)
return entry
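# Worked example (editorial comment): given the magic line
# 0 beshort 0x789c Zlib header, default compression
# _parse_line returns offset=0, type='beshort', length=2 ('short' implies two bytes),
# description='Zlib header, default compression', and condition='\x78\x9c' (the integer
# condition converted to a raw big endian string by _to_string below).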
def build_signature_set(self):
'''
Builds a set of compiled signature regexes from the parsed signatures.
Returns the set of compiled regular expressions (also stored in self.signature_set).
'''
signature_set = []
for (offset, sigs) in self.signatures.iteritems():
for sig in sigs:
if sig == self.WILDCARD:
sig = re.compile('.')
else:
sig = re.compile(re.escape(sig))
signature_set.append(sig)
self.signature_set = set(signature_set)
return self.signature_set
def find_signature_candidates(self, data, end):
'''
Finds candidate signatures inside of the data buffer.
Called internally by Binwalk.single_scan.
@data - Data to scan for candidate signatures.
@end - Don't look for signatures beyond this offset.
Returns an ordered list of offsets inside of data at which candidate signatures were found.
'''
candidate_offsets = []
for regex in self.signature_set:
candidate_offsets += [match.start() for match in regex.finditer(data) if match.start() < end]
candidate_offsets = list(set(candidate_offsets))
candidate_offsets.sort()
return candidate_offsets
def _to_string(self, value, size, endianess):
'''
Converts an integer value into a raw string.
@value - The integer value to convert.
@size - Size, in bytes, of the integer value.
@endianess - One of self.LITTLE_ENDIAN | self.BIG_ENDIAN.
Returns a raw string containing value.
'''
data = ""
for i in range(0, size):
data += chr((value >> (8*i)) & 0xFF)
if endianess != self.LITTLE_ENDIAN:
data = data[::-1]
return data
def split(self, data):
'''
Splits multiple libmagic results in the data string into a list of separate results.
@data - Data string returned from libmagic.
Returns a list of result strings.
'''
try:
return data.split(self.RESULT_SEPERATOR)
except:
return []
import os
import sys
import imp
# Valid return values for plugins
PLUGIN_CONTINUE = 0x00
PLUGIN_NO_EXTRACT = 0x01
PLUGIN_NO_DISPLAY = 0x02
PLUGIN_STOP_PLUGINS = 0x04
PLUGIN_TERMINATE = 0x08
class Plugins:
'''
Class to load and call plugin callback functions, handled automatically by Binwalk.scan / Binwalk.single_scan.
An instance of this class is available during a scan via the Binwalk.plugins object.
Each plugin must be placed in the user or system plugins directories, and must define a class named 'Plugin'.
The Plugin class constructor (__init__) is passed one argument, which is the current instance of the Binwalk class.
The Plugin class constructor is called once prior to scanning a file or set of files.
The Plugin class destructor (__del__) is called once after scanning all files.
The Plugin class can define one or all of the following callback methods:
o pre_scan(self, fd)
This method is called prior to running a scan against a file. It is passed the file object of
the file about to be scanned.
o pre_parser(self, result)
This method is called every time any result - valid or invalid - is found in the file being scanned.
It is passed a dictionary with one key ('description'), which contains the raw string returned by libmagic.
The contents of this dictionary key may be modified as necessary by the plugin.
o callback(self, results)
This method is called every time a valid result is found in the file being scanned. It is passed a
dictionary of results. This dictionary is identical to that passed to Binwalk.single_scan's callback
function, and its contents may be modified as necessary by the plugin.
o post_scan(self, fd)
This method is called after running a scan against a file, but before the file has been closed.
It is passed the file object of the scanned file.
Valid return values for all plugin callbacks are (PLUGIN_* values may be OR'd together):
PLUGIN_CONTINUE - Do nothing, continue the scan normally.
PLUGIN_NO_EXTRACT - Do not perform data extraction.
PLUGIN_NO_DISPLAY - Ignore the result(s); they will not be displayed or further processed.
PLUGIN_STOP_PLUGINS - Do not call any other plugins.
PLUGIN_TERMINATE - Terminate the scan.
None - The same as PLUGIN_CONTINUE.
Values returned by pre_scan affect all results during the scan of that particular file.
Values returned by callback affect only that specific scan result.
Values returned by post_scan are ignored since the scan of that file has already been completed.
By default, all plugins are loaded during binwalk signature scans. Plugins that wish to be disabled by
default may create a class variable named 'ENABLED' and set it to False. If ENABLED is set to False, the
plugin will only be loaded if it is explicitly named in the plugins whitelist.
Simple example plugin:
from binwalk.plugins import *
class Plugin:
# Set to False to have this plugin disabled by default.
ENABLED = True
def __init__(self, binwalk):
self.binwalk = binwalk
print 'Scanning initialized!'
def __del__(self):
print 'Scanning complete!'
def pre_scan(self, fd):
print 'About to scan', fd.name
return PLUGIN_CONTINUE
def callback(self, results):
print 'Got a result:', results['description']
return PLUGIN_CONTINUE
def post_scan(self, fd):
print 'Done scanning', fd.name
return PLUGIN_CONTINUE
'''
CALLBACK = 'callback'
PRESCAN = 'pre_scan'
POSTSCAN = 'post_scan'
PREPARSER = 'pre_parser'
PLUGIN = 'Plugin'
MODULE_EXTENSION = '.py'
def __init__(self, binwalk, whitelist=[], blacklist=[]):
self.binwalk = binwalk
self.callback = []
self.pre_scan = []
self.pre_parser = []
self.post_scan = []
self.whitelist = whitelist
self.blacklist = blacklist
def __del__(self):
self._cleanup()
def __exit__(self, t, v, traceback):
self._cleanup()
def _cleanup(self):
try:
del self.binwalk
except:
pass
def _call_plugins(self, callback_list, arg):
retval = PLUGIN_CONTINUE
for callback in callback_list:
if (retval & PLUGIN_STOP_PLUGINS):
break
try:
val = callback(arg)
if val is not None:
retval |= val
except Exception, e:
sys.stderr.write("WARNING: %s.%s failed: %s\n" % (str(callback.im_class), callback.__name__, str(e)))
return retval
def list_plugins(self):
'''
Obtain a list of all user and system plugin modules.
Returns a dictionary of:
{
'user' : {
'modules' : [list, of, module, names],
'descriptions' : {'module_name' : 'module pydoc string'},
'enabled' : {'module_name' : True},
'path' : "path/to/module/plugin/directory"
},
'system' : {
'modules' : [list, of, module, names],
'descriptions' : {'module_name' : 'module pydoc string'},
'enabled' : {'module_name' : True},
'path' : "path/to/module/plugin/directory"
}
}
'''
plugins = {
'user' : {
'modules' : [],
'descriptions' : {},
'enabled' : {},
'path' : None,
},
'system' : {
'modules' : [],
'descriptions' : {},
'enabled' : {},
'path' : None,
}
}
for key in plugins.keys():
plugins[key]['path'] = self.binwalk.config.paths[key][self.binwalk.config.PLUGINS]
for file_name in os.listdir(plugins[key]['path']):
if file_name.endswith(self.MODULE_EXTENSION):
module = file_name[:-len(self.MODULE_EXTENSION)]
if module in self.blacklist:
continue
else:
plugin = imp.load_source(module, os.path.join(plugins[key]['path'], file_name))
plugin_class = getattr(plugin, self.PLUGIN)
try:
enabled = plugin_class.ENABLED
except:
enabled = True
plugins[key]['enabled'][module] = enabled
plugins[key]['modules'].append(module)
try:
plugins[key]['descriptions'][module] = plugin_class.__doc__.strip().split('\n')[0]
except:
plugins[key]['descriptions'][module] = 'No description'
return plugins
def _load_plugins(self):
plugins = self.list_plugins()
self._load_plugin_modules(plugins['user'])
self._load_plugin_modules(plugins['system'])
def _load_plugin_modules(self, plugins):
for module in plugins['modules']:
file_path = os.path.join(plugins['path'], module + self.MODULE_EXTENSION)
try:
plugin = imp.load_source(module, file_path)
plugin_class = getattr(plugin, self.PLUGIN)
try:
# If this plugin is disabled by default and has not been explicitly whitelisted, ignore it
if plugin_class.ENABLED == False and module not in self.whitelist:
continue
except:
pass
class_instance = plugin_class(self.binwalk)
try:
self.callback.append(getattr(class_instance, self.CALLBACK))
except:
pass
try:
self.pre_scan.append(getattr(class_instance, self.PRESCAN))
except:
pass
try:
self.pre_parser.append(getattr(class_instance, self.PREPARSER))
except:
pass
try:
self.post_scan.append(getattr(class_instance, self.POSTSCAN))
except:
pass
except Exception, e:
sys.stderr.write("WARNING: Failed to load plugin module '%s': %s\n" % (module, str(e)))
def _pre_scan_callbacks(self, fd):
return self._call_plugins(self.pre_scan, fd)
def _post_scan_callbacks(self, fd):
return self._call_plugins(self.post_scan, fd)
def _scan_callbacks(self, results):
return self._call_plugins(self.callback, results)
def _scan_pre_parser_callbacks(self, results):
return self._call_plugins(self.pre_parser, results)
from binwalk.plugins import *
class Plugin:
'''
Validates ARM instructions during opcode scans.
'''
BITMASK = 0x83FF
BITMASK_SIZE = 2
def __init__(self, binwalk):
self.fd = None
if binwalk.scan_type == binwalk.BINARCH:
self.enabled = True
else:
self.enabled = False
def pre_scan(self, fd):
if self.enabled:
self.fd = open(fd.name, 'rb')
def callback(self, results):
if self.fd:
data = ''
try:
if results['description'].startswith('ARM instruction'):
self.fd.seek(results['offset'])
data = self.fd.read(self.BITMASK_SIZE)
data = data[1] + data[0]
elif results['description'].startswith('ARMEB instruction'):
self.fd.seek(results['offset']+self.BITMASK_SIZE)
data = self.fd.read(self.BITMASK_SIZE)
if data:
registers = int(data.encode('hex'), 16)
if (registers & self.BITMASK) != registers:
return PLUGIN_NO_DISPLAY
except:
pass
def post_scan(self, fd):
try:
self.fd.close()
except:
pass
import ctypes
import ctypes.util
from binwalk.plugins import *
class Plugin:
'''
Searches for and validates compress'd data.
'''
ENABLED = True
READ_SIZE = 64
def __init__(self, binwalk):
self.fd = None
self.comp = None
self.binwalk = binwalk
if binwalk.scan_type == binwalk.BINWALK:
self.comp = ctypes.cdll.LoadLibrary(ctypes.util.find_library("compress42"))
binwalk.magic_files.append(binwalk.config.find_magic_file('compressd'))
def __del__(self):
try:
self.fd.close()
except:
pass
def pre_scan(self, fd):
try:
if self.comp:
self.fd = open(fd.name, 'rb')
except:
pass
def callback(self, results):
if self.fd and results['description'].lower().startswith("compress'd data"):
self.fd.seek(results['offset'])
compressed_data = self.fd.read(self.READ_SIZE)
if not self.comp.is_compressed(compressed_data, len(compressed_data)):
return (PLUGIN_NO_DISPLAY | PLUGIN_NO_EXTRACT)
from binwalk.plugins import *
class Plugin:
'''
Ensures that ASCII CPIO archive entries only get extracted once.
'''
def __init__(self, binwalk):
self.binwalk = binwalk
self.found_archive = False
def pre_scan(self, fd):
# Be sure to re-set this at the beginning of every scan
self.found_archive = False
def callback(self, results):
if self.binwalk.extractor.enabled and self.binwalk.scan_type == self.binwalk.BINWALK:
# ASCII CPIO archives consist of multiple entries, ending with an entry named 'TRAILER!!!'.
# Displaying each entry is useful, as it shows what files are contained in the archive,
# but we only want to extract the archive when the first entry is found.
if results['description'].startswith('ASCII cpio archive'):
if not self.found_archive:
# This is the first entry. Set found_archive and allow the scan to continue normally.
self.found_archive = True
return PLUGIN_CONTINUE
elif 'TRAILER!!!' in results['description']:
# This is the last entry, un-set found_archive.
self.found_archive = False
# The first entry has already been found and this is the last entry, or the last entry
# has not yet been found. Don't extract.
return PLUGIN_NO_EXTRACT
# Allow all other results to continue normally.
return PLUGIN_CONTINUE
#!/usr/bin/env python
import ctypes
import ctypes.util
from binwalk.plugins import *
from binwalk.common import BlockFile
class Plugin:
'''
Searches for raw deflate compression streams.
'''
ENABLED = False
SIZE = 64*1024
DESCRIPTION = "Deflate compressed data stream"
def __init__(self, binwalk):
self.binwalk = binwalk
# The tinfl library is built and installed with binwalk
self.tinfl = ctypes.cdll.LoadLibrary(ctypes.util.find_library("tinfl"))
if self.binwalk.extractor.enabled:
# TODO: Add python extractor rule
pass
def pre_scan(self, fp):
self._deflate_scan(fp)
return PLUGIN_TERMINATE
def _extractor(self, file_name):
processed = 0
inflated_data = ''
fd = BlockFile(file_name, 'rb')
fd.READ_BLOCK_SIZE = self.SIZE
while processed < fd.length:
(data, dlen) = fd.read_block()
inflated_block = self.tinfl.inflate_block(data, dlen)
if inflated_block:
inflated_data += inflated_block
else:
break
processed += dlen
fd.close()
print "%s inflated to %d bytes" % (file_name, len(inflated_data))
def _deflate_scan(self, fp):
fp.MAX_TRAILING_SIZE = self.SIZE
# Set these so that the progress report reflects the current scan status
self.binwalk.scan_length = fp.length
self.binwalk.total_scanned = 0
while self.binwalk.total_scanned < self.binwalk.scan_length:
current_total = self.binwalk.total_scanned
(data, dlen) = fp.read_block()
if not data or dlen == 0:
break
for i in range(0, dlen):
if self.tinfl.is_deflated(data[i:], dlen-i, 0):
loc = fp.offset + current_total + i
# Update total_scanned here for immediate progress feedback
self.binwalk.total_scanned = current_total + i
self.binwalk.display.easy_results(loc, self.DESCRIPTION)
if (current_total + i) > self.binwalk.scan_length:
break
# Set total_scanned here in case no data streams were identified
self.binwalk.total_scanned = current_total + dlen
import os
import shutil
from binwalk.common import BlockFile
class Plugin:
'''
Finds and extracts modified LZMA files commonly found in cable modems.
Based on Bernardo Rodrigues' work: http://w00tsec.blogspot.com/2013/11/unpacking-firmware-images-from-cable.html
'''
ENABLED = True
FAKE_LZMA_SIZE = "\x00\x00\x00\x10\x00\x00\x00\x00"
SIGNATURE = "lzma compressed data"
def __init__(self, binwalk):
self.binwalk = binwalk
self.original_cmd = ''
if self.binwalk.extractor.enabled:
# Replace the existing LZMA extraction command with our own
rules = self.binwalk.extractor.get_rules()
for i in range(0, len(rules)):
if rules[i]['regex'].match(self.SIGNATURE):
self.original_cmd = rules[i]['cmd']
rules[i]['cmd'] = self.lzma_cable_extractor
break
def lzma_cable_extractor(self, fname):
# Try extracting the LZMA file without modification first
if not self.binwalk.extractor.execute(self.original_cmd, fname):
out_name = os.path.splitext(fname)[0] + '-patched' + os.path.splitext(fname)[1]
fp_out = open(out_name, 'wb')
fp_in = BlockFile(fname)
fp_in.MAX_TRAILING_SIZE = 0
i = 0
while i < fp_in.length:
(data, dlen) = fp_in.read_block()
if i == 0:
fp_out.write(data[0:5] + self.FAKE_LZMA_SIZE + data[5:])
else:
fp_out.write(data)
i += dlen
fp_in.close()
fp_out.close()
# Overwrite the original file so that it can be cleaned up if -r was specified
shutil.move(out_name, fname)
self.binwalk.extractor.execute(self.original_cmd, fname)
def pre_parser(self, result):
# The modified cable modem LZMA headers all have valid dictionary sizes and a properties byte of 0x5D.
if result['description'].lower().startswith(self.SIGNATURE) and "invalid uncompressed size" in result['description']:
if "properties: 0x5D" in result['description'] and "invalid dictionary size" not in result['description']:
result['invalid'] = False
result['description'] = result['description'].split("invalid uncompressed size")[0] + "missing uncompressed size"
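# Editorial note: a standard LZMA file header is 13 bytes (one properties byte, a four
# byte dictionary size, then an eight byte uncompressed size). The modified images omit
# the uncompressed size field, so lzma_cable_extractor above splices FAKE_LZMA_SIZE in
# after the first five header bytes (properties plus dictionary size) before retrying
# extraction.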
class Plugin:
'''
Modifies string analysis output to mimic that of the Unix strings utility.
'''
ENABLED = False
def __init__(self, binwalk):
self.modify_output = False
if binwalk.scan_type == binwalk.STRINGS:
binwalk.display.quiet = True
self.modify_output = True
def callback(self, results):
if self.modify_output:
try:
print results['description']
except Exception, e:
pass
import ctypes
import ctypes.util
from binwalk.plugins import *
class Plugin:
'''
Searches for and validates zlib compressed data.
'''
MAX_DATA_SIZE = 33 * 1024
def __init__(self, binwalk):
self.fd = None
self.tinfl = None
if binwalk.scan_type == binwalk.BINWALK:
# Add the zlib file to the list of magic files
binwalk.magic_files.append(binwalk.config.find_magic_file('zlib'))
# Load libtinfl.so
self.tinfl = ctypes.cdll.LoadLibrary(ctypes.util.find_library('tinfl'))
def pre_scan(self, fd):
if self.tinfl:
self.fd = open(fd.name, 'rb')
def callback(self, result):
# If this result is a zlib signature match, try to decompress the data
if self.fd and result['description'].lower().startswith('zlib'):
# Seek to and read the suspected zlib data
self.fd.seek(result['offset'])
data = self.fd.read(self.MAX_DATA_SIZE)
# Check if this is valid zlib data
if not self.tinfl.is_deflated(data, len(data), 1):
return (PLUGIN_NO_DISPLAY | PLUGIN_NO_EXTRACT)
return PLUGIN_CONTINUE
def post_scan(self, fd):
if self.fd:
self.fd.close()
import sys
import hashlib
import csv as pycsv
from datetime import datetime
class PrettyPrint:
'''
Class for printing binwalk results to screen/log files.
An instance of PrettyPrint is available via the Binwalk.display object.
The PrettyPrint.results() method is of particular interest, as it is suitable for use as a Binwalk.scan() callback function,
and can be used to print Binwalk.scan() results to stdout, a log file, or both.
Useful class objects:
self.fp - The log file's file object.
self.quiet - If set to True, all output to stdout is suppressed.
self.verbose - If set to True, verbose output is enabled.
self.csv - If set to True, data will be saved to the log file in CSV format.
self.format_to_screen - If set to True, output data will be formatted to fit into the current screen width.
Example usage:
import binwalk
bw = binwalk.Binwalk()
bw.display.header()
bw.single_scan('firmware.bin', callback=bw.display.results)
bw.display.footer()
'''
HEADER_WIDTH = 115
BUFFER_WIDTH = 32
MAX_LINE_LEN = 0
DEFAULT_DESCRIPTION_HEADER = "DESCRIPTION"
def __init__(self, binwalk, log=None, csv=False, quiet=False, verbose=0, format_to_screen=False):
'''
Class constructor.
@binwalk - An instance of the Binwalk class.
@log - Output log file.
@csv - If True, save data to log file in CSV format.
@quiet - If True, results will not be displayed to screen.
@verbose - If set to True, target file information will be displayed when file_info() is called.
@format_to_screen - If set to True, format the output data to fit into the current screen width.
Returns None.
'''
self.binwalk = binwalk
self.fp = None
self.log = log
self.csv = None
self.log_csv = csv
self.quiet = quiet
self.verbose = verbose
self.format_to_screen = format_to_screen
if self.format_to_screen:
self.enable_formatting(True)
if self.log is not None:
self.fp = open(log, "w")
if self.log_csv:
self.enable_csv()
def __del__(self):
'''
Class destructor.
'''
self.cleanup()
def __exit__(self, t, v, traceback):
self.cleanup()
def cleanup(self):
'''
Clean up any open file descriptors.
'''
try:
self.fp.close()
except:
pass
self.fp = None
def _log(self, data):
'''
Log data to the log file.
'''
if self.fp is not None:
if self.log_csv and self.csv:
data = data.replace('\n', ' ')
while ' ' in data:
data = data.replace(' ', ' ')
data_parts = data.split(None, 2)
if len(data_parts) == 3:
for i in range(0, len(data_parts)):
data_parts[i] = data_parts[i].strip()
self.csv.writerow(data_parts)
else:
self.fp.write(data)
def _pprint(self, data):
'''
Print data to stdout and the log file.
'''
if not self.quiet:
sys.stdout.write(data)
self._log(data)
def _file_md5(self, file_name):
'''
Generate an MD5 hash of the specified file.
'''
md5 = hashlib.md5()
with open(file_name, 'rb') as f:
for chunk in iter(lambda: f.read(128*md5.block_size), b''):
md5.update(chunk)
return md5.hexdigest()
def _append_to_data_parts(self, data, start, end):
'''
Intelligently appends data to self.string_parts.
For use by self._format.
'''
try:
while data[start] == ' ':
start += 1
if start == end:
end = len(data)
self.string_parts.append(data[start:end])
except:
try:
self.string_parts.append(data[start:])
except:
pass
return start
def _format(self, data):
'''
Formats a line of text to fit in the terminal window.
For Tim.
'''
offset = 0
space_offset = 0
self.string_parts = []
delim = '\n' + ' ' * self.BUFFER_WIDTH
if self.format_to_screen:
while len(data[offset:]) > self.MAX_LINE_LEN:
space_offset = data[offset:offset+self.MAX_LINE_LEN].rfind(' ')
if space_offset == -1 or space_offset == 0:
space_offset = self.MAX_LINE_LEN
self._append_to_data_parts(data, offset, offset+space_offset)
offset += space_offset
self._append_to_data_parts(data, offset, offset+len(data[offset:]))
return delim.join(self.string_parts)
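# e.g. (editorial comment): with MAX_LINE_LEN = 80, a 200 character description is broken
# at the last space before each 80 character boundary, and each continuation line is
# indented BUFFER_WIDTH columns so it lines up under the DESCRIPTION column.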
def enable_csv(self):
'''
Enables CSV formatting to log file.
'''
self.log_csv = True
self.csv = pycsv.writer(self.fp)
def enable_formatting(self, tf):
'''
Enables output formatting, which fits output to the current terminal width.
@tf - If True, enable formatting. If False, disable formatting.
Returns None.
'''
self.format_to_screen = tf
if self.format_to_screen:
try:
import fcntl
import struct
import termios
# Get the terminal window width
hw = struct.unpack('hh', fcntl.ioctl(1, termios.TIOCGWINSZ, '1234'))
self.HEADER_WIDTH = hw[1]
except Exception, e:
pass
self.MAX_LINE_LEN = self.HEADER_WIDTH - self.BUFFER_WIDTH
def file_info(self, file_name):
'''
Prints detailed info about the specified file, including file name, scan time and the file's MD5 sum.
Called internally by self.header if self.verbose is not 0.
@file_name - The path to the target file.
@binwalk - Binwalk class instance.
Returns None.
'''
self._pprint("\n")
self._pprint("Scan Time: %s\n" % datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
self._pprint("Signatures: %d\n" % self.binwalk.parser.signature_count)
self._pprint("Target File: %s\n" % file_name)
self._pprint("MD5 Checksum: %s\n" % self._file_md5(file_name))
def header(self, file_name=None, header=None, description=DEFAULT_DESCRIPTION_HEADER):
'''
Prints the binwalk header, typically used just before starting a scan.
@file_name - If specified, and if self.verbose > 0, then detailed file info will be included in the header.
@header - If specified, this is a custom header to display at the top of the output.
@description - The description header text to display (default: "DESCRIPTION")
Returns None.
'''
if self.verbose and file_name is not None:
self.file_info(file_name)
self._pprint("\n")
if not header:
self._pprint("DECIMAL \tHEX \t%s\n" % description)
else:
self._pprint(header + "\n")
self._pprint("-" * self.HEADER_WIDTH + "\n")
def footer(self, bwalk=None, file_name=None):
'''
Prints the binwalk footer, typically used just after completing a scan.
Returns None.
'''
self._pprint("\n")
def results(self, offset, results, formatted=False):
'''
Prints the results of a scan. Suitable for use as a callback function for Binwalk.scan().
@offset - The offset at which the results were found.
@results - A list of libmagic result strings.
@formatted - Set to True if the result description has already been formatted properly.
Returns None.
'''
offset_printed = False
for info in results:
# Check for any grep filters before printing
if self.binwalk.filter.grep(info['description']):
if not formatted:
# Only display the offset once per list of results
if not offset_printed:
self._pprint("%-10d\t0x%-8X\t%s\n" % (offset, offset, self._format(info['description'])))
offset_printed = True
else:
self._pprint("%s\t %s\t%s\n" % (' '*10, ' '*8, self._format(info['description'])))
else:
self._pprint(info['description'])
def easy_results(self, offset, description):
'''
A simpler wrapper around PrettyPrint.results.
@offset - The offset at which the result was found.
@description - Description string to display.
Returns None.
'''
results = {
'offset' : offset,
'description' : description,
}
return self.results(offset, [results])
import re
from common import str2int, get_quoted_strings
class SmartSignature:
'''
Class for parsing smart signature tags in libmagic result strings.
This class is intended for internal use only, but a list of supported 'smart keywords' that may be used
in magic files is available via the SmartSignature.KEYWORDS dictionary:
from binwalk import SmartSignature
for (i, keyword) in SmartSignature().KEYWORDS.iteritems():
print keyword
'''
KEYWORD_DELIM_START = "{"
KEYWORD_DELIM_END = "}"
KEYWORDS = {
'jump' : '%sjump-to-offset:' % KEYWORD_DELIM_START,
'filename' : '%sfile-name:' % KEYWORD_DELIM_START,
'filesize' : '%sfile-size:' % KEYWORD_DELIM_START,
'raw-string' : '%sraw-string:' % KEYWORD_DELIM_START, # This one is special and must come last in a signature block
'raw-size' : '%sraw-string-length:' % KEYWORD_DELIM_START,
'adjust' : '%soffset-adjust:' % KEYWORD_DELIM_START,
'delay' : '%sextract-delay:' % KEYWORD_DELIM_START,
'year' : '%syear:' % KEYWORD_DELIM_START,
'epoch' : '%sepoch:' % KEYWORD_DELIM_START,
'raw-replace' : '%sraw-replace%s' % (KEYWORD_DELIM_START, KEYWORD_DELIM_END),
'one-of-many' : '%sone-of-many%s' % (KEYWORD_DELIM_START, KEYWORD_DELIM_END),
}
def __init__(self, filter, ignore_smart_signatures=False):
'''
Class constructor.
@filter - Instance of the MagicFilter class.
@ignore_smart_signatures - Set to True to ignore smart signature keywords.
Returns None.
'''
self.filter = filter
self.invalid = False
self.last_one_of_many = None
self.ignore_smart_signatures = ignore_smart_signatures
def parse(self, data):
'''
Parse a given data string for smart signature keywords. If any are found, interpret them and strip them.
@data - String to parse, as returned by libmagic.
Returns a dictionary of parsed values.
'''
results = {
'offset' : '', # Offset where the match was found, filled in by Binwalk.single_scan.
'description' : '', # The libmagic data string, stripped of all keywords
'name' : '', # The original name of the file, if known
'delay' : '', # Extract delay description
'extract' : '', # Name of the extracted file, filled in by Binwalk.single_scan.
'jump' : 0, # The relative offset to resume the scan from
'size' : 0, # The size of the file, if known
'adjust' : 0, # The relative offset to add to the reported offset
'year' : 0, # The file's creation/modification year, if reported in the signature
'epoch' : 0, # The file's creation/modification epoch time, if reported in the signature
'invalid' : False, # Set to True if parsed numerical values appear invalid
}
self.invalid = False
# If smart signatures are disabled, or the result data is not valid (i.e., potentially malicious),
# don't parse anything, just return the raw data as the description.
if self.ignore_smart_signatures or not self._is_valid(data):
results['description'] = data
else:
# Parse the offset-adjust value. This is used to adjust the reported offset at which
# a signature was located due to the fact that MagicParser.match expects all signatures
to be located at offset 0, which some will not be.
results['adjust'] = self._get_math_arg(data, 'adjust')
# Parse the file-size value. This is used to determine how many bytes should be extracted
# when extraction is enabled. If not specified, everything to the end of the file will be
# extracted (see Binwalk.scan).
try:
results['size'] = str2int(self._get_keyword_arg(data, 'filesize'))
except:
pass
try:
results['year'] = str2int(self._get_keyword_arg(data, 'year'))
except:
pass
try:
results['epoch'] = str2int(self._get_keyword_arg(data, 'epoch'))
except:
pass
results['delay'] = self._get_keyword_arg(data, 'delay')
# Parse the string for the jump-to-offset keyword.
# This keyword is honored, even if this string result is one of many.
results['jump'] = self._get_math_arg(data, 'jump')
# If this is one of many, don't do anything and leave description as a blank string.
# Else, strip all keyword tags from the string and process additional keywords as necessary.
if not self._one_of_many(data):
results['name'] = self._get_keyword_arg(data, 'filename').strip('"')
results['description'] = self._strip_tags(data)
results['invalid'] = self.invalid
return results
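# Worked example (editorial comment): for the libmagic string
# 'gzip compressed data{offset-adjust:-4}{jump-to-offset:512}'
# parse() returns results['adjust'] = -4, results['jump'] = 512, and a description of
# 'gzip compressed data' with both keyword tags stripped.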
def _is_valid(self, data):
'''
Validates that result data does not contain smart keywords in file-supplied strings.
@data - Data string to validate.
Returns True if data is OK.
Returns False if data is not OK.
'''
# All strings printed from the target file should be placed in strings, else there is
# no way to distinguish between intended keywords and unintended keywords. Get all the
# quoted strings.
quoted_data = get_quoted_strings(data)
# Check to see if there was any quoted data, and if so, if it contained the keyword starting delimiter
if quoted_data and self.KEYWORD_DELIM_START in quoted_data:
# If so, check to see if the quoted data contains any of our keywords.
# If any keywords are found inside of quoted data, consider the keywords invalid.
for (name, keyword) in self.KEYWORDS.iteritems():
if keyword in quoted_data:
return False
return True
def _one_of_many(self, data):
'''
Determines if a given data string is one result of many.
@data - String result data.
Returns True if the string result is one of many.
Returns False if the string result is not one of many.
'''
if not self.filter.invalid(data):
if self.last_one_of_many is not None and data.startswith(self.last_one_of_many):
return True
if self.KEYWORDS['one-of-many'] in data:
# Only match on the data before the first comma, as that is typically unique and static
self.last_one_of_many = data.split(',')[0]
else:
self.last_one_of_many = None
return False
def _get_keyword_arg(self, data, keyword):
'''
Retrieves the argument for keywords that specify arguments.
@data - String result data, as returned by libmagic.
@keyword - Keyword index in KEYWORDS.
Returns the argument string value on success.
Returns a blank string on failure.
'''
arg = ''
if self.KEYWORDS.has_key(keyword) and self.KEYWORDS[keyword] in data:
arg = data.split(self.KEYWORDS[keyword])[1].split(self.KEYWORD_DELIM_END)[0]
return arg
def _get_math_arg(self, data, keyword):
'''
Retrieves the argument for keywords that specify mathematical expressions as arguments.
@data - String result data, as returned by libmagic.
@keyword - Keyword index in KEYWORDS.
Returns the resulting calculated value.
'''
value = 0
arg = self._get_keyword_arg(data, keyword)
if arg:
for string_int in arg.split('+'):
try:
value += str2int(string_int)
except:
self.invalid = True
return value
def _jump(self, data):
'''
Obtains the jump-to-offset value of a signature, if any.
@data - String result data.
Returns the offset to jump to.
'''
offset = 0
offset_str = self._get_keyword_arg(data, 'jump')
if offset_str:
try:
offset = str2int(offset_str)
except:
pass
return offset
def _parse_raw_strings(self, data):
'''
Process strings that aren't NULL byte terminated, but for which we know the string length.
This should be called prior to any other smart parsing functions.
@data - String to parse.
Returns a parsed string.
'''
if not self.ignore_smart_signatures and self._is_valid(data):
# Get the raw string keyword arg
raw_string = self._get_keyword_arg(data, 'raw-string')
# Was a raw string keyword specified?
if raw_string:
# Get the raw string length arg
raw_size = self._get_keyword_arg(data, 'raw-size')
# Is the raw string length arg a numeric value?
if re.match('^-?[0-9]+$', raw_size):
# Replace all instances of raw-replace in data with raw_string[:raw_size]
# Also strip out everything after the raw-string keyword, including the keyword itself.
# Failure to do so may (will) result in non-printable characters and this string will be
# marked as invalid when it shouldn't be.
data = data[:data.find(self.KEYWORDS['raw-string'])].replace(self.KEYWORDS['raw-replace'], '"' + raw_string[:str2int(raw_size)] + '"')
return data
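# Worked example (editorial comment): given
# data = 'Name: {raw-replace}{raw-string-length:4}{raw-string:ABCDEFGH}'
# this method returns 'Name: "ABCD"{raw-string-length:4}': the {raw-replace} tag is
# swapped for the quoted, length-limited raw string, everything from the raw-string
# keyword onward is dropped, and the leftover size tag is later removed by _strip_tags.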
def _strip_tags(self, data):
'''
Strips the smart tags from a result string.
@data - String result data.
Returns a sanitized string.
'''
if not self.ignore_smart_signatures:
for (name, keyword) in self.KEYWORDS.iteritems():
start = data.find(keyword)
if start != -1:
end = data[start:].find(self.KEYWORD_DELIM_END)
if end != -1:
data = data.replace(data[start:start+end+1], "")
return data
import string
import entropy
import plugins
import common
class FileStrings(object):
'''
Class for performing a "smart" strings analysis on a single file.
It is preferred to use the Strings class instead of this class directly.
'''
SUSPECT_STRING_LENGTH = 4
SUSPECT_SPECIAL_CHARS_RATIO = .25
MIN_STRING_LENGTH = 3
MAX_STRING_LENGTH = 20
MAX_SPECIAL_CHARS_RATIO = .4
MAX_ENTROPY = 0.9
LETTERS = [x for x in string.letters]
NUMBERS = [x for x in string.digits]
PRINTABLE = [x for x in string.printable]
WHITESPACE = [x for x in string.whitespace]
PUNCTUATION = [x for x in string.punctuation]
NEWLINES = ['\r', '\n', '\x0b', '\x0c']
VOWELS = ['A', 'E', 'I', 'O', 'U', 'a', 'e', 'i', 'o', 'u']
NON_ALPHA_EXCEPTIONS = ['%', '.', '/', '-', '_']
BRACKETED = {
'[' : ']',
'<' : '>',
'{' : '}',
'(' : ')',
}
def __init__(self, file_name, binwalk, length=0, offset=0, n=MIN_STRING_LENGTH, block=0, algorithm=None, plugins=None):
'''
Class constructor. Preferred to be invoked from the Strings class instead of directly.
@file_name - The file name to perform a strings analysis on.
@binwalk - An instance of the Binwalk class.
@length - The number of bytes in the file to analyze.
@offset - The starting offset into the file to begin analysis.
@n - The minimum valid string length.
@block - The block size to use when performing entropy analysis.
@algorithm - The entropy algorithm to use when performing entropy analysis.
@plugins - An instance of the Plugins class.
Returns None.
'''
self.n = n
self.binwalk = binwalk
self.length = length
self.start = offset
self.data = ''
self.dlen = 0
self.i = 0
self.total_read = 0
self.entropy = {}
self.valid_strings = []
self.external_validators = []
self.plugins = plugins
if not self.n:
self.n = self.MIN_STRING_LENGTH
# Perform an entropy analysis over the entire file (anything less may generate poor entropy data).
# Pass a fake file_results list to prevent FileEntropy from doing too much analysis.
with entropy.FileEntropy(file_name, block=block, file_results=['foo']) as e:
(self.x, self.y, self.average_entropy) = e.analyze()
for i in range(0, len(self.x)):
self.entropy[self.x[i]] = self.y[i]
# Make sure our block size matches the entropy analysis's block size
self.block = e.block
# Make sure the starting offset is a multiple of the block size; else, when later checking
# the entropy analysis, block offsets won't line up.
self.start -= (self.start % self.block)
self.fd = common.BlockFile(file_name, 'rb', length=length, offset=self.start)
# TODO: This is not optimal. We should read in larger chunks and process it into self.block chunks.
self.fd.READ_BLOCK_SIZE = self.block
self.fd.MAX_TRAILING_SIZE = 0
# Set the total_scanned and scan_length values for plugins and status display messages
self.binwalk.total_scanned = 0
self.binwalk.scan_length = self.fd.length
def __enter__(self):
return self
def __del__(self):
self.cleanup()
def __exit__(self, t, v, traceback):
self.cleanup()
def cleanup(self):
try:
self.fd.close()
except:
pass
def _read_block(self):
'''
Read one block of data from the target file.
Returns a tuple of (offset, data_length, data).
'''
offset = self.total_read + self.start
# Ignore blocks which have a higher than average or higher than MAX_ENTROPY entropy
while self.entropy.has_key(offset):
# Don't ignore blocks that border on an entropy rising/falling edge
try:
if self.entropy[offset-self.block] <= self.MAX_ENTROPY:
break
if self.entropy[offset+self.block] <= self.MAX_ENTROPY:
break
except KeyError:
break
if self.entropy[offset] > self.average_entropy or self.entropy[offset] > self.MAX_ENTROPY:
self.total_read += self.block
offset = self.total_read + self.start
self.fd.seek(offset)
else:
break
(data, dlen) = self.fd.read_block()
self.binwalk.total_scanned = self.total_read
self.total_read += dlen
return (self.start+self.total_read-dlen, dlen, data)
def _next_byte(self):
'''
Grab the next byte from the file.
Returns a tuple of (offset, byte).
'''
byte = ''
# If we've reached the end of the data buffer that we previously read in, read in the next block of data
if self.i == self.dlen:
(self.current_offset, self.dlen, self.data) = self._read_block()
self.i = 0
if self.i < self.dlen:
byte = self.data[self.i]
self.i += 1
return (self.current_offset+self.i-1, byte)
def _has_vowels(self, data):
'''
Returns True if data has a vowel in it, otherwise returns False.
'''
for i in self.VOWELS:
if i in data:
return True
return False
def _alpha_count(self, data):
'''
Returns the number of English letters in data.
'''
c = 0
for i in range(0, len(data)):
if data[i] in self.LETTERS:
c += 1
return c
def _is_bracketed(self, data):
'''
Checks if a string is bracketed by special characters.
@data - The data string to check.
Returns True if bracketed, False if not.
'''
return self.BRACKETED.has_key(data[0]) and data.endswith(self.BRACKETED[data[0]])
def _non_alpha_count(self, data):
'''
Returns the number of non-English letters in data.
'''
c = 0
dlen = len(data)
# No exceptions for very short strings
if dlen <= self.SUSPECT_STRING_LENGTH:
exceptions = []
else:
exceptions = self.NON_ALPHA_EXCEPTIONS
for i in range(0, len(data)):
if data[i] not in self.LETTERS and data[i] not in self.NUMBERS and data[i] not in exceptions:
c += 1
return c
def _too_many_special_chars(self, data):
'''
Returns True if the ratio of special characters in data is too high, otherwise returns False.
'''
# If an open bracket exists, we expect a close bracket as well
for (key, value) in self.BRACKETED.iteritems():
if key in data and not value in data:
return True
# For better filtering of false positives, require a lower ratio of special characters for very short strings
if len(data) <= self.SUSPECT_STRING_LENGTH:
return (float(self._non_alpha_count(data)) / len(data)) >= self.SUSPECT_SPECIAL_CHARS_RATIO
return (float(self._non_alpha_count(data)) / len(data)) >= self.MAX_SPECIAL_CHARS_RATIO
def _fails_grammar_rules(self, data):
'''
Returns True if data fails one of several general grammatical/logical rules.
'''
# Nothing here is going to be perfect and will likely result in both false positives and false negatives.
# The goal however is not to be perfect, but to filter out as many garbage strings while generating as
# few false negatives as possible.
# Generally, the first byte of a string is not a punctuation mark
if data[0] in self.PUNCTUATION:
return True
# Some punctuation may be generally considered OK if found at the end of a string; others are very unlikely
if data[-1] in self.PUNCTUATION and data[-1] not in ['.', '?', ',', '!', '>', '<', '|', '&']:
return True
for i in range(0, len(data)):
try:
# Q's must be followed by U's
if data[i] in ['q', 'Q'] and data[i+1] not in ['u', 'U']:
return True
except:
pass
try:
# Three characters in a row are the same? Unlikely.
if data[i] == data[i+1] == data[i+2]:
return True
except:
pass
try:
# Three punctuation marks in a row? Unlikely.
if data[i] in self.PUNCTUATION and data[i+1] in self.PUNCTUATION and data[i+2] in self.PUNCTUATION:
return True
except:
pass
return False
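# Illustrative examples (editorial comment): '.config' fails (leading punctuation),
# 'aaa' fails (three identical characters in a row), while 'version' passes all of
# the rules above.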
def _is_valid(self, offset, string):
'''
Determines if a particular string is "valid" or not.
@offset - The offset at which the string was found.
@string - The string in question.
Returns True if the string is valid, False if invalid.
'''
strlen = len(string)
for callback in self.external_validators:
r = callback(offset, string)
if r is not None:
return r
# Large strings are automatically considered valid/interesting
if strlen >= self.MAX_STRING_LENGTH:
return True
elif strlen >= self.n:
# The chances of a random string being bracketed is pretty low.
# If the string is bracketed, consider it valid.
if self._is_bracketed(string):
return True
# Else, do some basic sanity checks on the string
elif self._has_vowels(string):
if not self._too_many_special_chars(string):
if not self._fails_grammar_rules(string):
return True
return False
def _add_string(self, offset, string, plug_pre):
'''
Adds a string to the list of valid strings if it passes several rules.
Also responsible for calling plugin and display callback functions.
@offset - The offset at which the string was found.
@string - The string that was found.
@plug_pre - Return value from plugin pre-scan callback functions.
Returns the value from the plugin callback functions.
'''
plug_ret = plugins.PLUGIN_CONTINUE
string = string.strip()
if self._is_valid(offset, string):
results = {'description' : string, 'offset' : offset}
if self.plugins:
plug_ret = self.plugins._scan_callbacks(results)
offset = results['offset']
string = results['description']
if not ((plug_ret | plug_pre ) & plugins.PLUGIN_NO_DISPLAY):
self.binwalk.display.results(offset, [results])
self.valid_strings.append((offset, string))
return plug_ret
def strings(self):
'''
Perform a strings analysis on the target file.
Returns a list of tuples consisting of [(offset, string), (offset, string), ...].
'''
string = ''
string_start = 0
plugin_pre = plugins.PLUGIN_CONTINUE
plugin_ret = plugins.PLUGIN_CONTINUE
if self.plugins:
plugin_pre = self.plugins._pre_scan_callbacks(self.fd)
while not ((plugin_pre | plugin_ret) & plugins.PLUGIN_TERMINATE):
(byte_offset, byte) = self._next_byte()
# If the returned byte is NULL, try to add whatever string we have now and quit
if not byte:
self._add_string(string_start, string, plugin_pre)
break
# End of string is signified by a non-printable character or a new line
if byte in self.PRINTABLE and byte not in self.NEWLINES:
if not string:
string_start = byte_offset
string += byte
else:
plugin_ret = self._add_string(string_start, string, plugin_pre)
string = ''
if self.plugins:
self.plugins._post_scan_callbacks(self.fd)
return self.valid_strings
class Strings(object):
'''
Class for performing a strings analysis against a list of files.
'''
def __init__(self, file_names, binwalk, length=0, offset=0, n=0, block=0, algorithm=None, load_plugins=True, whitelist=[], blacklist=[]):
'''
Class constructor.
@file_names - A list of files to analyze.
@binwalk - An instance of the Binwalk class.
@length - The number of bytes in the file to analyze.
@offset - The starting offset into the file to begin analysis.
@n - The minimum valid string length.
@block - The block size to use when performing entropy analysis.
@algorithm - The entropy algorithm to use when performing entropy analysis.
@load_plugins - Set to False to disable plugin callbacks.
@whitelist - A list of whitelisted plugins.
@blacklist - A list of blacklisted plugins.
Returns None.
'''
self.file_names = file_names
self.binwalk = binwalk
self.length = length
self.offset = offset
self.n = n
self.block = block
self.algorithm = algorithm
self.binwalk.scan_type = self.binwalk.STRINGS
self.file_strings = None
if load_plugins:
self.plugins = plugins.Plugins(self.binwalk, whitelist=whitelist, blacklist=blacklist)
else:
self.plugins = None
def __enter__(self):
return self
def __exit__(self, t, v, traceback):
return None
def add_validator(self, callback):
'''
Add a validation function to be invoked when determining if a string is valid or not.
Validators are passed two arguments: the string offset and the string in question.
Validators may return:
o True - The string is valid, stop further analysis.
o False - The string is not valid, stop further analysis.
o None - Unknown, continue analysis.
@callback - The validation function.
Returns None.
'''
if self.file_strings:
self.file_strings.external_validators.append(callback)
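# Example validator (a hedged sketch; 'no_hex_blobs' is a hypothetical callback).
# Note that a validator is only registered while a scan is in progress, since
# self.file_strings is None until strings() runs:
#
#   def no_hex_blobs(offset, string):
#       # Reject strings made up entirely of hex digits; defer everything else
#       # to the built-in sanity checks by returning None.
#       if all(c in '0123456789abcdef' for c in string.lower()):
#           return False
#       return None
#
#   strings_instance.add_validator(no_hex_blobs)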
def strings(self):
'''
Perform a "smart" strings analysis against the target files.
Returns a dictionary compatible with other classes (Entropy, Binwalk, etc):
{
'file_name' : (offset, [{
'description' : 'Strings',
'string' : 'found_string'
}]
)
}
'''
results = {}
if self.plugins:
self.plugins._load_plugins()
for file_name in self.file_names:
self.binwalk.display.header(file_name=file_name, description='Strings')
results[file_name] = []
self.file_strings = FileStrings(file_name, self.binwalk, self.length, self.offset, self.n, block=self.block, algorithm=self.algorithm, plugins=self.plugins)
for (offset, string) in self.file_strings.strings():
results[file_name].append((offset, [{'description' : 'Strings', 'string' : string}]))
del self.file_strings
self.file_strings = None
self.binwalk.display.footer()
if self.plugins:
del self.plugins
return results
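# Example usage of the Strings class (a hedged sketch; assumes the Binwalk class
# from this package and a target file named 'firmware.bin'):
#
#   import binwalk
#
#   bw = binwalk.Binwalk()
#   with Strings(['firmware.bin'], bw, n=4) as s:
#       for (offset, matches) in s.strings()['firmware.bin']:
#           print offset, matches[0]['string']
#   bw.cleanup()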
import os
import urllib2
from config import *
class Update:
'''
Class for updating binwalk configuration and signature files from the subversion trunk.
Example usage:
from binwalk import Update
Update().update()
'''
BASE_URL = "http://binwalk.googlecode.com/svn/trunk/src/binwalk/"
MAGIC_PREFIX = "magic/"
CONFIG_PREFIX = "config/"
def __init__(self):
'''
Class constructor.
'''
self.config = Config()
def update(self):
'''
Updates all system-wide signature and config files.
Returns None.
'''
self.update_binwalk()
self.update_bincast()
self.update_binarch()
self.update_extract()
self.update_zlib()
def _do_update_from_svn(self, prefix, fname):
'''
Updates the specified file to the latest version of that file in SVN.
@prefix - The URL subdirectory where the file is located.
@fname - The name of the file to update.
Returns None.
'''
# Get the local http proxy, if any
# csoban.kesmarki
proxy_url = os.getenv('HTTP_PROXY')
if proxy_url:
proxy_support = urllib2.ProxyHandler({'http' : proxy_url})
opener = urllib2.build_opener(proxy_support)
urllib2.install_opener(opener)
url = self.BASE_URL + prefix + fname
try:
data = urllib2.urlopen(url).read()
open(self.config.paths['system'][fname], "wb").write(data)
except Exception, e:
raise Exception("Update._do_update_from_svn failed to update file '%s': %s" % (url, str(e)))
def update_binwalk(self):
'''
Updates the binwalk signature file.
Returns None.
'''
self._do_update_from_svn(self.MAGIC_PREFIX, self.config.BINWALK_MAGIC_FILE)
def update_bincast(self):
'''
Updates the bincast signature file.
Returns None.
'''
self._do_update_from_svn(self.MAGIC_PREFIX, self.config.BINCAST_MAGIC_FILE)
def update_binarch(self):
'''
Updates the binarch signature file.
Returns None.
'''
self._do_update_from_svn(self.MAGIC_PREFIX, self.config.BINARCH_MAGIC_FILE)
def update_zlib(self):
'''
Updates the zlib signature file.
Returns None.
'''
self._do_update_from_svn(self.MAGIC_PREFIX, self.config.ZLIB_MAGIC_FILE)
def update_extract(self):
'''
Updates the extract.conf file.
Returns None.
'''
self._do_update_from_svn(self.CONFIG_PREFIX, self.config.EXTRACT_FILE)
#!/bin/bash
# Easy installer script for Debian/RedHat/OSX systems.
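# Usage: $0 [debian | redhat | darwin] [--sumount]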
function debian
{
# The appropriate unrar package goes under different names in Debian vs Ubuntu
sudo apt-get -y install unrar-nonfree
if [ "$?" != "0" ]
then
echo "WARNING: Failed to install 'unrar-nonfree' package, trying 'unrar' instead..."
sudo apt-get -y install unrar
fi
# Install binwalk/fmk pre-requisites and extraction tools
sudo apt-get -y install git build-essential mtd-utils zlib1g-dev liblzma-dev ncompress gzip bzip2 tar arj p7zip p7zip-full openjdk-6-jdk python-magic python-matplotlib
}
function redhat
{
sudo yum groupinstall -y "Development Tools"
sudo yum install -y git mtd-utils unrar zlib-devel xz-devel ncompress gzip bzip2 tar arj p7zip p7zip-plugins java-1.6.0-openjdk python-magic python-matplotlib
}
function darwin
{
sudo port install git-core arj p7zip py-magic py-matplotlib
}
if [ "$1" == "" ] || [ "$1" == "--sumount" ]
then
PLATFORM=$(python -c 'import platform; print platform.system().lower()')
DISTRO=$(python -c 'import platform; print platform.linux_distribution()[0].lower()')
else
DISTRO="$1"
fi
if [ "$DISTRO" == "" ]
then
DISTRO="$PLATFORM"
fi
echo "Detected $DISTRO $PLATFORM"
case $DISTRO in
debian)
;&
ubuntu)
;&
linuxmint)
;&
knoppix)
;&
aptosid)
debian
;;
redhat)
;&
rhel)
;&
fedora)
;&
centos)
redhat
;;
darwin)
darwin
;;
*)
echo ""
echo "This system is not supported by easy install! You may need to install dependent packages manually."
echo ""
echo "If your system is a derivative of Debian, RedHat or OSX, you can try manually specifying your system type on the command line:"
echo ""
echo -e "\t$0 [debian | redhat | darwin] [--sumount]"
echo ""
exit 1
esac
if [ "$DISTRO" != "darwin" ]
then
# Get and build the firmware mod kit
sudo rm -rf /opt/firmware-mod-kit/
sudo mkdir -p /opt/firmware-mod-kit
sudo chmod a+rwx /opt/firmware-mod-kit
git clone https://code.google.com/p/firmware-mod-kit /opt/firmware-mod-kit/
cd /opt/firmware-mod-kit/src
./configure && sudo make
if [ "$1" == "--sumount" ] || [ "$2" == "--sumount" ]
then
# The following will allow you - and others - to mount/unmount file systems without root permissions.
# This may be problematic, especially on a multi-user system, so think about it first.
sudo chown root ./mountcp/mountsu
sudo chmod u+s ./mountcp/mountsu
sudo chmod o-w ./mountcp/mountsu
sudo chown root ./mountcp/umountsu
sudo chmod u+s ./mountcp/umountsu
sudo chmod o-w ./mountcp/umountsu
sudo chown root ./jffs2/sunjffs2
sudo chmod u+s ./jffs2/sunjffs2
sudo chmod o-w ./jffs2/sunjffs2
fi
cd -
fi
# Install binwalk
sudo python setup.py install
# ----------------------------Archive Formats--------------------------------------
# POSIX tar archives
0 string ustar\000 POSIX tar archive{offset-adjust:-257}
0 string ustar\040\040\000 POSIX tar archive (GNU){offset-adjust:-257}
# JAR archiver (.j): the successor to ARJ, not Java's JAR (which is essentially ZIP)
0 string \x1aJar\x1b JAR (ARJ Software, Inc.) archive data{offset-adjust:-14}
0 string JARCS JAR (ARJ Software, Inc.) archive data
# PKZIP multi-volume archive
0 string PK\x07\x08PK\x03\x04 Zip multi-volume archive data, at least PKZIP v2.50 to extract
# ZIP compression (Greg Roelofs, c/o zip-bugs@wkuvx1.wku.edu)
0 string PK\003\004 Zip
>6 leshort &0x01 encrypted
>0 byte x archive data,
>4 byte 0x00 v0.0
>4 byte 0x09 at least v0.9 to extract,
>4 byte 0x0a at least v1.0 to extract,
>4 byte 0x0b at least v1.1 to extract,
>0x161 string WINZIP WinZIP self-extracting,
>4 byte 0x14
>>30 ubelong !0x6d696d65 at least v2.0 to extract,
>18 lelong !0
>>18 lelong <0 invalid
>>18 lelong x compressed size: %d,
>>18 lelong x {jump-to-offset:%d}
>22 lelong !0
>>22 lelong <0 invalid
>>22 lelong x uncompressed size: %d,{extract-delay:End of Zip archive}
>30 string x {file-name:{raw-replace}}name: {raw-replace}
>26 leshort x {raw-string-length:%d}
>30 string x {raw-string:%s
>61 string x \b%s
>92 string x \b%s
>123 string x \b%s
>154 string x \b%s}
# ZIP footer
0 string PK\x05\x06 End of Zip archive
#>10 leshort x number of records: %d,
#>12 leshort x size of central directory: %d
#>20 leshort x {offset-adjust:22+%d}
>20 leshort >0
>>20 leshort x \b, comment: {raw-replace}
>>20 leshort x {raw-string-length:%d}
>>22 string x {raw-string:%s}
# ARJ archiver (jason@jarthur.Claremont.EDU)
0 leshort 0xea60 ARJ archive data,
>2 leshort x header size: %d,
>5 byte <1 invalid
>5 byte >16 invalid
>5 byte x version %d,
>6 byte <1 invalid
>6 byte >16 invalid
>6 byte x minimum version to extract: %d,
>8 byte <0 invalid flags,
>8 byte &0x04 multi-volume,
>8 byte &0x10 slash-switched,
>8 byte &0x20 backup,
>9 byte <0 invalid compression method,
>9 byte >4 invalid compression method,
>9 byte 0 compression method: stored,
>9 byte 1 compression method: compressed most,
>9 byte 2 compression method: compressed,
>9 byte 3 compression method: compressed faster,
>9 byte 4 compression method: compressed fastest,
>10 byte <0 invalid file type
>10 byte >4 invalid file type
>10 byte 0 file type: binary,
>10 byte 1 file type: 7-bit text,
>10 byte 2 file type: comment header,
>10 byte 3 file type: directory,
>10 byte 4 file type: volume label,
>34 byte !0
>>34 string x {file-name:%s}
>>34 string x original name: "%s",
>0xC ledate x original file date: %s,
>0x10 lelong <0 invalid
>0x10 lelong x compressed file size: %d,
>0x14 lelong <0 invalid
>0x14 lelong x uncompressed file size: %d,
>7 byte 0 os: MS-DOS
>7 byte 1 os: PRIMOS
>7 byte 2 os: Unix
>7 byte 3 os: Amiga
>7 byte 4 os: Macintosh
>7 byte 5 os: OS/2
>7 byte 6 os: Apple ][ GS
>7 byte 7 os: Atari ST
>7 byte 8 os: NeXT
>7 byte 9 os: VAX/VMS
>7 byte >9 invalid os
>7 byte <0 invalid os
# RAR archiver (Greg Roelofs, newt@uchicago.edu)
0 string Rar! RAR archive data
# HPACK archiver (Peter Gutmann, pgut1@cs.aukuni.ac.nz)
0 string HPAK HPACK archive data
# JAM Archive volume format, by Dmitry.Kohmanyuk@UA.net
0 string \351,\001JAM JAM archive
# LHARC/LHA archiver (Greg Roelofs, newt@uchicago.edu)
0 string -lzs- LHa 2.x? archive data [lzs] [NSRL|LHA2]{offset-adjust:-2}
0 string -lh\40- LHa 2.x? archive data [lh ] [NSRL|LHA2]{offset-adjust:-2}
0 string -lhd- LHa 2.x? archive data [lhd] [NSRL|LHA2]{offset-adjust:-2}
0 string -lh2- LHa 2.x? archive data [lh2] [NSRL|LHA2]{offset-adjust:-2}
0 string -lh3- LHa 2.x? archive data [lh3] [NSRL|LHA2]{offset-adjust:-2}
0 string -lh4- LHa (2.x) archive data [lh4] [NSRL|LHA2]{offset-adjust:-2}
0 string -lh5- LHa (2.x) archive data [lh5] [NSRL|LHA2]{offset-adjust:-2}
0 string -lh6- LHa (2.x) archive data [lh6] [NSRL|LHA2]{offset-adjust:-2}
0 string -lh7- LHa (2.x) archive data [lh7] [NSRL|LHA2]{offset-adjust:-2}
# cpio archives
#
# The SVR4 "cpio(4)" hints that there are additional formats, but they
# are defined as "short"s; I think all the new formats are
# character-header formats and thus are strings, not numbers.
#0 string 070707 ASCII cpio archive (pre-SVR4 or odc)
0 string 070701 ASCII cpio archive (SVR4 with no CRC),
>110 byte 0 invalid
#>110 byte !0x2F
#>>110 string !TRAILER!!! invalid
>110 string x file name: "%s"
>54 string x file size: "0x%.8s"
>54 string x {jump-to-offset:0x%.8s+112}
0 string 070702 ASCII cpio archive (SVR4 with CRC)
>110 byte 0 invalid
#>110 byte !0x2F
#>>110 string !TRAILER!!! invalid
>110 string x file name: "%s"
>54 string x file size: "0x%.8s"
>54 string x {jump-to-offset:0x%.8s+112}
# HP Printer Job Language
# The header found on Win95 HP plot files is the "Silliest Thing possible"
# (TM)
# Every driver puts the language at some random position, with random case
# (LANGUAGE and Language)
# For example the LaserJet 5L driver puts the "PJL ENTER LANGUAGE" in line 10
# From: Uwe Bonnes <bon@elektron.ikp.physik.th-darmstadt.de>
#
0 string \033%-12345X@PJL HP Printer Job Language data
>&0 string >\0 "%s"
>>&0 string >\0 "%s"
>>>&0 string >\0 "%s"
>>>>&0 string >\0 "%s"
#------------------------------------------------------------------------------
#
# RPM: file(1) magic for Red Hat Packages Erik Troan (ewt@redhat.com)
#
0 belong 0xedabeedb RPM
>4 byte x v%d
>6 beshort 0 bin
>6 beshort 1 src
>8 beshort 1 i386
>8 beshort 2 Alpha
>8 beshort 3 Sparc
>8 beshort 4 MIPS
>8 beshort 5 PowerPC
>8 beshort 6 68000
>8 beshort 7 SGI
>8 beshort 8 RS6000
>8 beshort 9 IA64
>8 beshort 10 Sparc64
>8 beshort 11 MIPSel
>8 beshort 12 ARM
>10 string x "%s"
# IBM AIX Backup File Format header and entry signatures
0 lelong 0xea6b0009 BFF volume header,
>4 leshort x checksum: 0x%.4X,
>6 leshort <0 invalid
>6 leshort 0 invalid
>6 leshort x volume number: %d,
>8 ledate x current date: %s,
>12 ledate x starting date: %s,
>20 string x disk name: "%s",
>36 string x file system name: "%s",
>52 string x user name: "%s"
0 leshort 0xea6b BFF volume entry,{offset-adjust:-2}
>22 lelong <0 invalid
>22 lelong 0 directory,
>22 lelong >0
>>22 lelong x file size: %d,
>>54 lelong <0 invalid
>>54 lelong 0 invalid
>>54 lelong x compressed size: %d,
>58 lelong !0 invalid
>62 byte 0 invalid
>62 byte !0x2e
>>62 byte !0x2f invalid
>62 string x file name: "%s
>92 string x \b%s"
0 leshort 0xea6c BFF volume entry, compressed,{offset-adjust:-2}
>22 lelong <0 invalid
>22 lelong 0 directory,
>22 lelong >0
>>22 lelong x file size: %d,
>>54 lelong <0 invalid
>>54 lelong 0 invalid
>>54 lelong x compressed size: %d,
>58 lelong !0 invalid
>62 byte 0 invalid
>62 byte !0x2e
>>62 byte !0x2f invalid
>62 string x file name: "%s
>92 string x \b%s"
0 leshort 0xea6d BFF volume entry, AIXv3,{offset-adjust:-2}
>22 lelong <0 invalid
>22 lelong 0 directory,
>22 lelong >0
>>22 lelong x file size: %d,
>>54 lelong <0 invalid
>>54 lelong 0 invalid
>>54 lelong x compressed size: %d,
>58 lelong !0 invalid
>62 byte 0 invalid
>62 byte !0x2e
>>62 byte !0x2f invalid
>62 string x file name: "%s
>92 string x \b%s"
#------------------------------------------------------------------------------
# From Stuart Caie <kyzer@4u.net> (developer of cabextract)
# Microsoft Cabinet files
0 string MSCF\0\0\0\0 Microsoft Cabinet archive data
# According to libmagic comments, CAB version number is always 1.3
>25 byte !1 \b, invalid major version
>24 byte !3 \b, invalid minor version
>8 lelong x \b, %u bytes
>28 leshort 0 \b, 0 files (invalid)
>28 leshort 1 \b, 1 file
>28 leshort >1 \b, %u files
# InstallShield Cabinet files
0 string ISc( InstallShield Cabinet archive data
# TODO: Version number checks should be made more specific for false positive filtering
>5 byte&0xf0 =0x60 version 6,
>5 byte&0xf0 !0x60 version 4/5,
>(12.l+40) lelong x %u files
# Windows CE package files
0 string MSCE\0\0\0\0 Microsoft WinCE install header
>20 lelong 0 \b, architecture-independent
>20 lelong 103 \b, Hitachi SH3
>20 lelong 104 \b, Hitachi SH4
>20 lelong 0xA11 \b, StrongARM
>20 lelong 4000 \b, MIPS R4000
>20 lelong 10003 \b, Hitachi SH3
>20 lelong 10004 \b, Hitachi SH3E
>20 lelong 10005 \b, Hitachi SH4
>20 lelong 70001 \b, ARM 7TDMI
>52 leshort 1 \b, 1 file
>52 leshort >1 \b, %u files
>56 leshort 1 \b, 1 registry entry
>56 leshort >1 \b, %u registry entries
0 string \0\ \ \ \ \ \ \ \ \ \ \ \0\0 LBR archive data
# Parity archive reconstruction file, the 'par' file format now used on Usenet.
0 string PAR\0 PARity archive data
>48 leshort =0 - Index file
>48 leshort >0 - file number %d
# Felix von Leitner <felix-file@fefe.de>
0 string d8:announce BitTorrent file
#---------------------------Bootloaders--------------------------------
# CFE bootloader
0 string CFE1CFE1 CFE boot loader
>40 string CFE1CFE1 invalid
# U-Boot boot loader
0 string U-Boot U-Boot boot loader reference{one-of-many}
0 string U-BOOT U-Boot boot loader reference{one-of-many}
0 string u-boot U-Boot boot loader reference{one-of-many}
#------------------Compression Formats-----------------------------
# AFX compressed files (Wolfram Kleff)
0 string -afx- AFX compressed file data{offset-adjust:-2}
# bzip2
0 string BZh91AY&SY bzip2 compressed data, block size = 900k
0 string BZh81AY&SY bzip2 compressed data, block size = 800k
0 string BZh71AY&SY bzip2 compressed data, block size = 700k
0 string BZh61AY&SY bzip2 compressed data, block size = 600k
0 string BZh51AY&SY bzip2 compressed data, block size = 500k
0 string BZh41AY&SY bzip2 compressed data, block size = 400k
0 string BZh31AY&SY bzip2 compressed data, block size = 300k
0 string BZh21AY&SY bzip2 compressed data, block size = 200k
0 string BZh11AY&SY bzip2 compressed data, block size = 100k
# lzop from <markus.oberhumer@jk.uni-linz.ac.at>
0 string \x89\x4c\x5a\x4f\x00\x0d\x0a\x1a\x0a lzop compressed data
>9 beshort <0x0940
>>9 byte&0xf0 =0x00 - version 0.
>>9 beshort&0x0fff x \b%03x,
>>13 byte 1 LZO1X-1,
>>13 byte 2 LZO1X-1(15),
>>13 byte 3 LZO1X-999,
## >>22 bedate >0 last modified: %s,
>>14 byte =0x00 os: MS-DOS
>>14 byte =0x01 os: Amiga
>>14 byte =0x02 os: VMS
>>14 byte =0x03 os: Unix
>>14 byte =0x05 os: Atari
>>14 byte =0x06 os: OS/2
>>14 byte =0x07 os: MacOS
>>14 byte =0x0A os: Tops/20
>>14 byte =0x0B os: WinNT
>>14 byte =0x0E os: Win32
>9 beshort >0x0939
>>9 byte&0xf0 =0x00 - version 0.
>>9 byte&0xf0 =0x10 - version 1.
>>9 byte&0xf0 =0x20 - version 2.
>>9 beshort&0x0fff x \b%03x,
>>15 byte 1 LZO1X-1,
>>15 byte 2 LZO1X-1(15),
>>15 byte 3 LZO1X-999,
## >>25 bedate >0 last modified: %s,
>>17 byte =0x00 os: MS-DOS
>>17 byte =0x01 os: Amiga
>>17 byte =0x02 os: VMS
>>17 byte =0x03 os: Unix
>>17 byte =0x05 os: Atari
>>17 byte =0x06 os: OS/2
>>17 byte =0x07 os: MacOS
>>17 byte =0x0A os: Tops/20
>>17 byte =0x0B os: WinNT
>>17 byte =0x0E os: Win32
# lzip
0 string LZIP lzip compressed data
>4 byte x \b, version: %d
# LZO
0 string \211LZO\000\015\012\032\012 LZO compressed data
# 7-zip archiver, from Thomas Klausner (wiz@danbala.tuwien.ac.at)
# http://www.7-zip.org or DOC/7zFormat.txt
#
0 string 7z\274\257\047\034 7-zip archive data,
>6 byte <0 invalid
>6 byte 0 invalid
>6 byte >20 invalid
>6 byte x version %d
>7 byte x \b.%d
# standard unix compress
# Implemented in the compress binwalk plugin.
#0 string \x1f\x9d\x90 compress'd data, 16 bits
# http://tukaani.org/xz/xz-file-format.txt
0 string \xFD\x37\x7a\x58\x5a\x00 xz compressed data
# gzip (GNU zip, not to be confused with Info-ZIP or PKWARE zip archiver)
# Edited by Chris Chittleborough <cchittleborough@yahoo.com.au>, March 2002
# * Original filename is only at offset 10 if "extra field" absent
# * Produce shorter output - notably, only report compression methods
# other than 8 ("deflate", the only method defined in RFC 1952).
#0 string \037\213\x08 gzip compressed data
0 string \x1f\x8b\x08 gzip compressed data
>3 byte &0x01 \b, ASCII
>3 byte&0xE0 !0x00 \b, invalid reserved flag bits
>8 byte 2 \b, maximum compression
>8 byte 4 \b, fastest compression
>8 byte 1 \b, invalid extra flags
>8 byte 3 \b, invalid extra flags
>8 byte >4 \b, invalid extra flags
>3 byte &0x02 \b, has header CRC
>3 byte&0x04 0x04
>>10 leshort x \b, has %d bytes of extra data
>3 byte&0xC =0x08 \b, has original file name
>>10 string x \b{file-name:%s}
>>10 string x \b: "%s"
>3 byte &0x10 \b, has comment
>>3 byte&0xC 0
>>>10 string x \b: "%s"
>9 byte =0x00 \b, from FAT filesystem (MS-DOS, OS/2, NT)
>9 byte =0x01 \b, from Amiga
>9 byte =0x02 \b, from VMS
>9 byte =0x03 \b, from Unix
>9 byte =0x04 \b, from VM/CMS
>9 byte =0x05 \b, from Atari
>9 byte =0x06 \b, from HPFS filesystem (OS/2, NT)
>9 byte =0x07 \b, from MacOS
>9 byte =0x08 \b, from Z-System
>9 byte =0x09 \b, from CP/M
>9 byte =0x0A \b, from TOPS/20
>9 byte =0x0B \b, from NTFS filesystem (NT)
>9 byte =0x0C \b, from QDOS
>9 byte =0x0D \b, from Acorn RISCOS
#>9 byte =0xFF \b, from ZyNOS
#>9 byte >0x0D \b, invalid
#>>9 byte x source: 0x%.2X
#>9 byte <0 \b, invalid
#>>9 byte x source: 0x%.2X
>3 byte &0x20 \b, encrypted (invalid)
# Dates before 1992 are invalid, unless of course you're DD-WRT in which
# case you don't know how to set a date in your gzip files. Brilliant.
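# (694224000 seconds == Jan 1 1992 00:00:00 UTC, the cutoff used below)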
>4 lelong =0 \b, NULL date:
>4 lelong <0 \b, invalid date:
>4 lelong >0
>>4 lelong <694224000 \b, invalid date:
>>4 lelong =694224000 \b, invalid date:
>>4 lelong >694224000 \b, last modified:
>4 ledate x %s
>4 lelong x \b{epoch:%d}
# Zlib signatures
# Too short to be useful on their own; see:
#
# o src/binwalk/magic/zlib
# o src/binwalk/plugins/zlib.py
#
#0 beshort 0x789C zlib compressed data
#0 beshort 0x78DA zlib compressed data
#0 beshort 0x7801 zlib compressed data
# Supplementary magic data for the file(1) command to support
# rzip(1). The format is described in magic(5).
#
# Copyright (C) 2003 by Andrew Tridgell. You may do whatever you want with
# this file.
#
0 string RZIP rzip compressed data
>4 byte x - version %d
>5 byte x \b.%d
>6 belong x (%d bytes)
# JAR
0 belong 0xcafed00d JAR compressed with pack200,
>5 byte x version %d.
>4 byte x \b%d
# New LZMA format signature
0 string \xFFLZMA\x00 LZMA compressed data (new),
>6 byte&0x10 0 single-block stream
>6 byte&0x10 0x10 multi-block stream
# See lzma file for LZMA signatures
# Type: OpenSSL certificates/key files
# From: Nicolas Collignon <tsointsoin@gmail.com>
0 string -----BEGIN\x20CERTIFICATE----- PEM certificate
0 string -----BEGIN\x20CERTIFICATE\x20REQ PEM certificate request
0 string -----BEGIN\x20RSA\x20PRIVATE PEM RSA private key
0 string -----BEGIN\x20DSA\x20PRIVATE PEM DSA private key
# Type: OpenSSH key files
# From: Nicolas Collignon <tsointsoin@gmail.com>
0 string SSH\x20PRIVATE\x20KEY OpenSSH RSA1 private key,
>28 string >\0 version "%s"
0 string ssh-dss\x20 OpenSSH DSA public key
0 string ssh-rsa\x20 OpenSSH RSA public key
# Type: Certificates/key files in DER format
# From: Gert Hulselmans <hulselmansgert@gmail.com>
0 string \x30\x82 Private key in DER format (PKCS#8),
>4 string !\x02\x01\x00 invalid,
>>2 beshort x header length: 4, sequence length: %d
0 string \x30\x82 Certificate in DER format (x509 v3),
>4 string !\x30\x82 invalid,
>>2 beshort x header length: 4, sequence length: %d
# GnuPG
# The format is very similar to pgp
0 string \001gpg GPG key trust database
>4 byte x version %d
# Not a very useful signature
#0 beshort 0x9901 GPG key public ring
# This magic is not particularly good, as the keyrings don't have true
# magic. Nevertheless, it covers many keyrings.
#------------------------------------------------------------------------------
# Mavroyanopoulos Nikos <nmav@hellug.gr>
# mcrypt: file(1) magic for mcrypt 2.2.x;
0 string \0m\3 mcrypt 2.5 encrypted data,
>4 byte 0 invalid
>4 string >\0 algorithm: "%s",
>>&1 leshort <1 invalid
>>&1 leshort >0 keysize: %d bytes,
>>>&0 byte 0 invalid
>>>&0 string >\0 mode: "%s",
0 string \0m\2 mcrypt 2.2 encrypted data,
>3 byte 0 algorithm: blowfish-448,
>3 byte 1 algorithm: DES,
>3 byte 2 algorithm: 3DES,
>3 byte 3 algorithm: 3-WAY,
>3 byte 4 algorithm: GOST,
>3 byte 6 algorithm: SAFER-SK64,
>3 byte 7 algorithm: SAFER-SK128,
>3 byte 8 algorithm: CAST-128,
>3 byte 9 algorithm: xTEA,
>3 byte 10 algorithm: TWOFISH-128,
>3 byte 11 algorithm: RC2,
>3 byte 12 algorithm: TWOFISH-192,
>3 byte 13 algorithm: TWOFISH-256,
>3 byte 14 algorithm: blowfish-128,
>3 byte 15 algorithm: blowfish-192,
>3 byte 16 algorithm: blowfish-256,
>3 byte 100 algorithm: RC6,
>3 byte 101 algorithm: IDEA,
>3 byte <0 invalid algorithm
>3 byte >101 invalid algorithm,
>3 byte >16
>>3 byte <100 invalid algorithm,
>4 byte 0 mode: CBC,
>4 byte 1 mode: ECB,
>4 byte 2 mode: CFB,
>4 byte 3 mode: OFB,
>4 byte 4 mode: nOFB,
>4 byte <0 invalid mode,
>4 byte >4 invalid mode,
>5 byte 0 keymode: 8bit
>5 byte 1 keymode: 4bit
>5 byte 2 keymode: SHA-1 hash
>5 byte 3 keymode: MD5 hash
>5 byte <0 invalid keymode
>5 byte >3 invalid keymode
#------------------------------------------------------------------------------
# pgp: file(1) magic for Pretty Good Privacy
#
#0 beshort 0x9900 PGP key public ring
#0 beshort 0x9501 PGP key security ring
#0 beshort 0x9500 PGP key security ring
#0 beshort 0xa600 PGP encrypted data
0 string -----BEGIN\040PGP PGP armored data,
>15 string PUBLIC\040KEY\040BLOCK- public key block
>15 string MESSAGE- message
>15 string SIGNED\040MESSAGE- signed message
>15 string PGP\040SIGNATURE- signature
0 string Salted__ OpenSSL encryption, salted,
>8 belong x salt: 0x%X
>12 belong x \b%X
#------------------Standard file formats------------------------------------
#------------------------------------------------------------------------------
# elf: file(1) magic for ELF executables
#
# We have to check the byte order flag to see what byte order all the
# other stuff in the header is in.
#
# What're the correct byte orders for the nCUBE and the Fujitsu VPP500?
#
# updated by Daniel Quinlan (quinlan@yggdrasil.com)
0 string \177ELF ELF
>4 byte 0 invalid class
>4 byte 1 32-bit
# only for MIPS - in the future, the ABI field of e_flags should be used.
>>18 leshort 8
>>>36 lelong &0x20 N32
>>18 leshort 10
>>>36 lelong &0x20 N32
>>18 beshort 8
>>>36 belong &0x20 N32
>>18 beshort 10
>>>36 belong &0x20 N32
>4 byte 2 64-bit
>5 byte 0 invalid byte order
>5 byte 1 LSB
# The official e_machine number for MIPS is now #8, regardless of endianness.
# The second number (#10) will be deprecated later. For now, we still
# say something if #10 is encountered, but only gory details for #8.
>>18 leshort 8
# only for 32-bit
>>>4 byte 1
>>>>36 lelong&0xf0000000 0x00000000 MIPS-I
>>>>36 lelong&0xf0000000 0x10000000 MIPS-II
>>>>36 lelong&0xf0000000 0x20000000 MIPS-III
>>>>36 lelong&0xf0000000 0x30000000 MIPS-IV
>>>>36 lelong&0xf0000000 0x40000000 MIPS-V
>>>>36 lelong&0xf0000000 0x60000000 MIPS32
>>>>36 lelong&0xf0000000 0x70000000 MIPS64
>>>>36 lelong&0xf0000000 0x80000000 MIPS32 rel2
>>>>36 lelong&0xf0000000 0x90000000 MIPS64 rel2
# only for 64-bit
>>>4 byte 2
>>>>48 lelong&0xf0000000 0x00000000 MIPS-I
>>>>48 lelong&0xf0000000 0x10000000 MIPS-II
>>>>48 lelong&0xf0000000 0x20000000 MIPS-III
>>>>48 lelong&0xf0000000 0x30000000 MIPS-IV
>>>>48 lelong&0xf0000000 0x40000000 MIPS-V
>>>>48 lelong&0xf0000000 0x60000000 MIPS32
>>>>48 lelong&0xf0000000 0x70000000 MIPS64
>>>>48 lelong&0xf0000000 0x80000000 MIPS32 rel2
>>>>48 lelong&0xf0000000 0x90000000 MIPS64 rel2
>>16 leshort 0 no file type,
>>16 leshort 1 relocatable,
>>16 leshort 2 executable,
>>16 leshort 3 shared object,
# Core handling from Peter Tobias <tobias@server.et-inf.fho-emden.de>
# corrections by Christian 'Dr. Disk' Hechelmann <drdisk@ds9.au.s.shuttle.de>
>>16 leshort 4 core file
# Core file detection is not reliable.
#>>>(0x38+0xcc) string >\0 of '%s'
#>>>(0x38+0x10) lelong >0 (signal %d),
>>16 leshort &0xff00 processor-specific,
>>18 leshort 0 no machine,
>>18 leshort 1 AT&T WE32100 - invalid byte order,
>>18 leshort 2 SPARC - invalid byte order,
>>18 leshort 3 Intel 80386,
>>18 leshort 4 Motorola
>>>36 lelong &0x01000000 68000 - invalid byte order,
>>>36 lelong &0x00810000 CPU32 - invalid byte order,
>>>36 lelong 0 68020 - invalid byte order,
>>18 leshort 5 Motorola 88000 - invalid byte order,
>>18 leshort 6 Intel 80486,
>>18 leshort 7 Intel 80860,
>>18 leshort 8 MIPS,
>>18 leshort 9 Amdahl - invalid byte order,
>>18 leshort 10 MIPS (deprecated),
>>18 leshort 11 RS6000 - invalid byte order,
>>18 leshort 15 PA-RISC - invalid byte order,
>>>50 leshort 0x0214 2.0
>>>48 leshort &0x0008 (LP64),
>>18 leshort 16 nCUBE,
>>18 leshort 17 Fujitsu VPP500,
>>18 leshort 18 SPARC32PLUS,
>>18 leshort 20 PowerPC,
>>18 leshort 22 IBM S/390,
>>18 leshort 36 NEC V800,
>>18 leshort 37 Fujitsu FR20,
>>18 leshort 38 TRW RH-32,
>>18 leshort 39 Motorola RCE,
>>18 leshort 40 ARM,
>>18 leshort 41 Alpha,
>>18 leshort 0xa390 IBM S/390 (obsolete),
>>18 leshort 42 Hitachi SH,
>>18 leshort 43 SPARC V9 - invalid byte order,
>>18 leshort 44 Siemens Tricore Embedded Processor,
>>18 leshort 45 Argonaut RISC Core, Argonaut Technologies Inc.,
>>18 leshort 46 Hitachi H8/300,
>>18 leshort 47 Hitachi H8/300H,
>>18 leshort 48 Hitachi H8S,
>>18 leshort 49 Hitachi H8/500,
>>18 leshort 50 IA-64 (Intel 64 bit architecture)
>>18 leshort 51 Stanford MIPS-X,
>>18 leshort 52 Motorola Coldfire,
>>18 leshort 53 Motorola M68HC12,
>>18 leshort 62 AMD x86-64,
>>18 leshort 75 Digital VAX,
>>18 leshort 97 NatSemi 32k,
>>18 leshort 0x9026 Alpha (unofficial),
>>20 lelong 0 invalid version
>>20 lelong 1 version 1
>>36 lelong 1 MathCoPro/FPU/MAU Required
>5 byte 2 MSB
# only for MIPS - see comment in little-endian section above.
>>18 beshort 8
# only for 32-bit
>>>4 byte 1
>>>>36 belong&0xf0000000 0x00000000 MIPS-I
>>>>36 belong&0xf0000000 0x10000000 MIPS-II
>>>>36 belong&0xf0000000 0x20000000 MIPS-III
>>>>36 belong&0xf0000000 0x30000000 MIPS-IV
>>>>36 belong&0xf0000000 0x40000000 MIPS-V
>>>>36 belong&0xf0000000 0x60000000 MIPS32
>>>>36 belong&0xf0000000 0x70000000 MIPS64
>>>>36 belong&0xf0000000 0x80000000 MIPS32 rel2
>>>>36 belong&0xf0000000 0x90000000 MIPS64 rel2
# only for 64-bit
>>>4 byte 2
>>>>48 belong&0xf0000000 0x00000000 MIPS-I
>>>>48 belong&0xf0000000 0x10000000 MIPS-II
>>>>48 belong&0xf0000000 0x20000000 MIPS-III
>>>>48 belong&0xf0000000 0x30000000 MIPS-IV
>>>>48 belong&0xf0000000 0x40000000 MIPS-V
>>>>48 belong&0xf0000000 0x60000000 MIPS32
>>>>48 belong&0xf0000000 0x70000000 MIPS64
>>>>48 belong&0xf0000000 0x80000000 MIPS32 rel2
>>>>48 belong&0xf0000000 0x90000000 MIPS64 rel2
>>16 beshort 0 no file type,
>>16 beshort 1 relocatable,
>>16 beshort 2 executable,
>>16 beshort 3 shared object,
>>16 beshort 4 core file,
#>>>(0x38+0xcc) string >\0 of '%s'
#>>>(0x38+0x10) belong >0 (signal %d),
>>16 beshort &0xff00 processor-specific,
>>18 beshort 0 no machine,
>>18 beshort 1 AT&T WE32100,
>>18 beshort 2 SPARC,
>>18 beshort 3 Intel 80386 - invalid byte order,
>>18 beshort 4 Motorola
>>>36 belong &0x01000000 68000,
>>>36 belong &0x00810000 CPU32,
>>>36 belong 0 68020,
>>18 beshort 5 Motorola 88000,
>>18 beshort 6 Intel 80486 - invalid byte order,
>>18 beshort 7 Intel 80860,
>>18 beshort 8 MIPS,
>>18 beshort 9 Amdahl,
>>18 beshort 10 MIPS (deprecated),
>>18 beshort 11 RS6000,
>>18 beshort 15 PA-RISC
>>>50 beshort 0x0214 2.0
>>>48 beshort &0x0008 (LP64)
>>18 beshort 16 nCUBE,
>>18 beshort 17 Fujitsu VPP500,
>>18 beshort 18 SPARC32PLUS,
>>>36 belong&0xffff00 &0x000100 V8+ Required,
>>>36 belong&0xffff00 &0x000200 Sun UltraSPARC1 Extensions Required,
>>>36 belong&0xffff00 &0x000400 HaL R1 Extensions Required,
>>>36 belong&0xffff00 &0x000800 Sun UltraSPARC3 Extensions Required,
>>18 beshort 20 PowerPC or cisco 4500,
>>18 beshort 21 cisco 7500,
>>18 beshort 22 IBM S/390,
>>18 beshort 24 cisco SVIP,
>>18 beshort 25 cisco 7200,
>>18 beshort 36 NEC V800 or cisco 12000,
>>18 beshort 37 Fujitsu FR20,
>>18 beshort 38 TRW RH-32,
>>18 beshort 39 Motorola RCE,
>>18 beshort 40 ARM,
>>18 beshort 41 Alpha,
>>18 beshort 42 Hitachi SH,
>>18 beshort 43 SPARC V9,
>>18 beshort 44 Siemens Tricore Embedded Processor,
>>18 beshort 45 Argonaut RISC Core, Argonaut Technologies Inc.,
>>18 beshort 46 Hitachi H8/300,
>>18 beshort 47 Hitachi H8/300H,
>>18 beshort 48 Hitachi H8S,
>>18 beshort 49 Hitachi H8/500,
>>18 beshort 50 Intel Merced Processor,
>>18 beshort 51 Stanford MIPS-X,
>>18 beshort 52 Motorola Coldfire,
>>18 beshort 53 Motorola M68HC12,
>>18 beshort 73 Cray NV1,
>>18 beshort 75 Digital VAX,
>>18 beshort 97 NatSemi 32k,
>>18 beshort 0x9026 Alpha (unofficial),
>>18 beshort 0xa390 IBM S/390 (obsolete),
>>18 beshort 0xde3d Ubicom32,
>>20 belong 0 invalid version
>>20 belong 1 version 1
>>36 belong 1 MathCoPro/FPU/MAU Required
# Up to now only 0, 1 and 2 are defined; I've seen a file with 0x83, it seemed
# like proper ELF, but extracting the string had bad results.
>4 byte <0x80
>>8 string >\0 ("%s")
>8 string \0
>>7 byte 0 (SYSV)
>>7 byte 1 (HP-UX)
>>7 byte 2 (NetBSD)
>>7 byte 3 (GNU/Linux)
>>7 byte 4 (GNU/Hurd)
>>7 byte 5 (86Open)
>>7 byte 6 (Solaris)
>>7 byte 7 (Monterey)
>>7 byte 8 (IRIX)
>>7 byte 9 (FreeBSD)
>>7 byte 10 (Tru64)
>>7 byte 11 (Novell Modesto)
>>7 byte 12 (OpenBSD)
>>7 byte 97 (ARM)
>>7 byte 255 (embedded)
# XXX - according to Microsoft's spec, at an offset of 0x3c in a
# PE-format executable is the offset in the file of the PE header;
# unfortunately, that's a little-endian offset, and there's no way
# to specify an indirect offset with a specified byte order.
# So, for now, we assume the standard MS-DOS stub, which puts the
# PE header at 0x80 = 128.
#
# Required OS version and subsystem version were 4.0 on some NT 3.51
# executables built with Visual C++ 4.0, so it's not clear that
# they're interesting. The user version was 0.0, but there's
# probably some linker directive to set it. The linker version was
# 3.0, except for one ".exe" which had it as 4.20 (same damn linker!).
#
# many of the compressed formats were extracted from IDARC 1.23 source code
#
# Not a very useful signature...
#0 string MZ Microsoft
#>0x18 leshort <0x40 MS-DOS executable
0 string MZ\0\0\0\0\0\0\0\0\0\0
>12 string PE\0\0 Microsoft PE
>0x18 leshort <0x40 MS-DOS executable
>>&18 leshort&0x2000 >0 (DLL)
>>&88 leshort 0 (unknown subsystem)
>>&88 leshort 1 (native)
>>&88 leshort 2 (GUI)
>>&88 leshort 3 (console)
>>&88 leshort 7 (POSIX)
>>&0 leshort 0x0 unknown processor
>>&0 leshort 0x14c Intel 80386
>>&0 leshort 0x166 MIPS R4000
>>&0 leshort 0x184 Alpha
>>&0 leshort 0x268 Motorola 68000
>>&0 leshort 0x1f0 PowerPC
>>&0 leshort 0x290 PA-RISC
>>&18 leshort&0x0100 >0 32-bit
>>&18 leshort&0x1000 >0 system file
>>&228 lelong >0 \b, Mono/.Net assembly
>>&0xf4 search/0x140 \x0\x40\x1\x0
>>>(&0.l+(4)) string MSCF \b, WinHKI CAB self-extracting archive
>30 string Copyright\x201989-1990\x20PKWARE\x20Inc. Self-extracting PKZIP archive
# Is next line correct? One might expect "Corp." not "Copr." If it is right, add a note to that effect.
>30 string PKLITE\x20Copr. Self-extracting PKZIP archive
>0x18 leshort >0x3f
>>(0x3c.l) string PE\0\0 PE
>>>(0x3c.l+25) byte 1 \b32 executable
>>>(0x3c.l+25) byte 2 \b32+ executable
# hooray, there's a DOS extender using the PE format, with a valid PE
# executable inside (which just prints a message and exits if run in win)
>>>(0x3c.l+92) leshort <10
>>>>(8.s*16) string 32STUB for MS-DOS, 32rtm DOS extender
>>>>(8.s*16) string !32STUB for MS Windows
>>>>>(0x3c.l+22) leshort&0x2000 >0 (DLL)
>>>>>(0x3c.l+92) leshort 0 (unknown subsystem)
>>>>>(0x3c.l+92) leshort 1 (native)
>>>>>(0x3c.l+92) leshort 2 (GUI)
>>>>>(0x3c.l+92) leshort 3 (console)
>>>>>(0x3c.l+92) leshort 7 (POSIX)
>>>(0x3c.l+92) leshort 10 (EFI application)
>>>(0x3c.l+92) leshort 11 (EFI boot service driver)
>>>(0x3c.l+92) leshort 12 (EFI runtime driver)
>>>(0x3c.l+92) leshort 13 (XBOX)
>>>(0x3c.l+4) leshort 0x0 unknown processor
>>>(0x3c.l+4) leshort 0x14c Intel 80386
>>>(0x3c.l+4) leshort 0x166 MIPS R4000
>>>(0x3c.l+4) leshort 0x184 Alpha
>>>(0x3c.l+4) leshort 0x268 Motorola 68000
>>>(0x3c.l+4) leshort 0x1f0 PowerPC
>>>(0x3c.l+4) leshort 0x290 PA-RISC
>>>(0x3c.l+4) leshort 0x200 Intel Itanium
>>>(0x3c.l+22) leshort&0x0100 >0 32-bit
>>>(0x3c.l+22) leshort&0x1000 >0 system file
>>>(0x3c.l+232) lelong >0 Mono/.Net assembly
>>>>(0x3c.l+0xf8) string UPX0 \b, UPX compressed
>>>>(0x3c.l+0xf8) search/0x140 PEC2 \b, PECompact2 compressed
>>>>(0x3c.l+0xf8) search/0x140 UPX2
>>>>>(&0x10.l+(-4)) string PK\3\4 \b, ZIP self-extracting archive (Info-Zip)
>>>>(0x3c.l+0xf8) search/0x140 .idata
>>>>>(&0xe.l+(-4)) string PK\3\4 \b, ZIP self-extracting archive (Info-Zip)
>>>>>(&0xe.l+(-4)) string ZZ0 \b, ZZip self-extracting archive
>>>>>(&0xe.l+(-4)) string ZZ1 \b, ZZip self-extracting archive
>>>>(0x3c.l+0xf8) search/0x140 .rsrc
>>>>>(&0x0f.l+(-4)) string a\\\4\5 \b, WinHKI self-extracting archive
>>>>>(&0x0f.l+(-4)) string Rar! \b, RAR self-extracting archive
>>>>>(&0x0f.l+(-4)) search/0x3000 MSCF \b, InstallShield self-extracting archive
>>>>>(&0x0f.l+(-4)) search/32 Nullsoft \b, Nullsoft Installer self-extracting archive
>>>>(0x3c.l+0xf8) search/0x140 .data
>>>>>(&0x0f.l) string WEXTRACT \b, MS CAB-Installer self-extracting archive
>>>>(0x3c.l+0xf8) search/0x140 .petite\0 \b, Petite compressed
>>>>>(0x3c.l+0xf7) byte x
>>>>>>(&0x104.l+(-4)) string =!sfx! \b, ACE self-extracting archive
>>>>(0x3c.l+0xf8) search/0x140 .WISE \b, WISE installer self-extracting archive
>>>>(0x3c.l+0xf8) search/0x140 .dz\0\0\0 \b, Dzip self-extracting archive
>>>>(0x3c.l+0xf8) search/0x140 .reloc
>>>>>(&0xe.l+(-4)) search/0x180 PK\3\4 \b, ZIP self-extracting archive (WinZip)
>>>>&(0x3c.l+0xf8) search/0x100 _winzip_ \b, ZIP self-extracting archive (WinZip)
>>>>&(0x3c.l+0xf8) search/0x100 SharedD \b, Microsoft Installer self-extracting archive
>>>>0x30 string Inno \b, InnoSetup self-extracting archive
>>(0x3c.l) string !PE\0\0 MS-DOS executable
>>(0x3c.l) string NE \b, NE
>>>(0x3c.l+0x36) byte 0 (unknown OS)
>>>(0x3c.l+0x36) byte 1 for OS/2 1.x
>>>(0x3c.l+0x36) byte 2 for MS Windows 3.x
>>>(0x3c.l+0x36) byte 3 for MS-DOS
>>>(0x3c.l+0x36) byte >3 (unknown OS)
>>>(0x3c.l+0x36) byte 0x81 for MS-DOS, Phar Lap DOS extender
>>>(0x3c.l+0x0c) leshort&0x8003 0x8002 (DLL)
>>>(0x3c.l+0x0c) leshort&0x8003 0x8001 (driver)
>>>&(&0x24.s-1) string ARJSFX \b, ARJ self-extracting archive
>>>(0x3c.l+0x70) search/0x80 WinZip(R)\x20Self-Extractor \b, ZIP self-extracting archive (WinZip)
>>(0x3c.l) string LX\0\0 \b, LX
>>>(0x3c.l+0x0a) leshort <1 (unknown OS)
>>>(0x3c.l+0x0a) leshort 1 for OS/2
>>>(0x3c.l+0x0a) leshort 2 for MS Windows
>>>(0x3c.l+0x0a) leshort 3 for DOS
>>>(0x3c.l+0x0a) leshort >3 (unknown OS)
>>>(0x3c.l+0x10) lelong&0x28000 =0x8000 (DLL)
>>>(0x3c.l+0x10) lelong&0x20000 >0 (device driver)
>>>(0x3c.l+0x10) lelong&0x300 0x300 (GUI)
>>>(0x3c.l+0x10) lelong&0x28300 <0x300 (console)
>>>(0x3c.l+0x08) leshort 1 i80286
>>>(0x3c.l+0x08) leshort 2 i80386
>>>(0x3c.l+0x08) leshort 3 i80486
>>>(8.s*16) string emx \b, emx
>>>>&1 string x "%s"
>>>&(&0x54.l-3) string arjsfx \b, ARJ self-extracting archive
#------------------------------------------------------------------------------
# bFLT: file(1) magic for BFLT uclinux binary files
#
# From Philippe De Muyter <phdm@macqel.be>
#
# Additional fields added by Craig Heffner
#
0 string bFLT BFLT executable
>4 belong <1 invalid
>4 belong >4 invalid
>4 belong x version %ld,
>4 belong 4
>8 belong x code offset: 0x%.8X,
>12 belong x data segment starts at: 0x%.8X,
>16 belong x bss segment starts at: 0x%.8X,
>20 belong x bss segment ends at: 0x%.8X,
>24 belong x stack size: %d bytes,
>28 belong x relocation records start at: 0x%.8X,
>32 belong x number of relocation records: %d,
>>36 belong&0x1 0x1 ram
>>36 belong&0x2 0x2 gotpic
>>36 belong&0x4 0x4 gzip
>>36 belong&0x8 0x8 gzdata
# Windows CE package files
0 string MSCE\0\0\0\0 Microsoft WinCE installer
>20 lelong 0 \b, architecture-independent
>20 lelong 103 \b, Hitachi SH3
>20 lelong 104 \b, Hitachi SH4
>20 lelong 0xA11 \b, StrongARM
>20 lelong 4000 \b, MIPS R4000
>20 lelong 10003 \b, Hitachi SH3
>20 lelong 10004 \b, Hitachi SH3E
>20 lelong 10005 \b, Hitachi SH4
>20 lelong 70001 \b, ARM 7TDMI
>52 leshort 1 \b, 1 file
>52 leshort >1 \b, %u files
>56 leshort 1 \b, 1 registry entry
>56 leshort >1 \b, %u registry entries
#------------------------------------------------------------------------------
# Microsoft Xbox executables .xbe (Esa Hyytiä <ehyytia@cc.hut.fi>)
0 string XBEH XBE, Microsoft Xbox executable
# probabilistic checks whether signed or not
>0x0004 ulelong =0x0
>>&2 ulelong =0x0
>>>&2 ulelong =0x0 \b, not signed
>0x0004 ulelong >0
>>&2 ulelong >0
>>>&2 ulelong >0 \b, signed
# expect base address of 0x10000
>0x0104 ulelong =0x10000
>>(0x0118-0x0FF60) ulelong&0x80000007 0x80000007 \b, all regions
>>(0x0118-0x0FF60) ulelong&0x80000007 !0x80000007
>>>(0x0118-0x0FF60) ulelong >0 (regions:
>>>>(0x0118-0x0FF60) ulelong &0x00000001 NA
>>>>(0x0118-0x0FF60) ulelong &0x00000002 Japan
>>>>(0x0118-0x0FF60) ulelong &0x00000004 Rest_of_World
>>>>(0x0118-0x0FF60) ulelong &0x80000000 Manufacturer
>>>(0x0118-0x0FF60) ulelong >0 \b)
#------------------------------------------------------------------------------
# motorola: file(1) magic for Motorola 68K and 88K binaries
#
# 68K
#
# These signatures are useless without further sanity checking. Disable them until
# that can be implemented.
#0 beshort 0x0208 mc68k COFF
#>18 beshort ^00000020 object
#>18 beshort &00000020 executable
#>12 belong >0 not stripped
#>168 string .lowmem Apple toolbox
#>20 beshort 0407 (impure)
#>20 beshort 0410 (pure)
#>20 beshort 0413 (demand paged)
#>20 beshort 0421 (standalone)
#0 beshort 0x0209 mc68k executable (shared)
#>12 belong >0 not stripped
#0 beshort 0x020A mc68k executable (shared demand paged)
#>12 belong >0 not stripped
#------------------------------------------------------------------------------
# Sony Playstation executables (Adam Sjoegren <asjo@diku.dk>) :
0 string PS-X\x20EXE Sony Playstation executable
# Area:
>113 string x ("%s")
#------------------------------------------------------------------------------
# cisco: file(1) magic for cisco Systems routers
#
# Most cisco file-formats are covered by the generic elf code
0 string \x85\x01\x14 Cisco IOS microcode
>7 string >\0
>>7 string x for "%s"
0 string \x85\x01\xcb Cisco IOS experimental microcode
>7 string >\0
>>7 string x for "%s"
# EST flat binary format (which isn't, but anyway)
# From: Mark Brown <broonie@sirena.org.uk>
0 string ESTFBINR EST flat binary
# These are not the binaries themselves, but string references to them
# are a strong indication that they exist elsewhere...
#0 string /bin/busybox Busybox string reference: "%s"{one-of-many}
#0 string /bin/sh Shell string reference: "%s"{one-of-many}
# Mach-O's
0 string \xca\xfe\xba\xbe\x00\x00\x00\x01 Mach-O universal binary with 1 architecture
0 string \xca\xfe\xba\xbe\x00\x00\x00\x02 Mach-O universal binary with 2 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x03 Mach-O universal binary with 3 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x04 Mach-O universal binary with 4 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x05 Mach-O universal binary with 5 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x06 Mach-O universal binary with 6 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x07 Mach-O universal binary with 7 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x08 Mach-O universal binary with 8 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x09 Mach-O universal binary with 9 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x0a Mach-O universal binary with 10 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x0b Mach-O universal binary with 11 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x0c Mach-O universal binary with 12 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x0d Mach-O universal binary with 13 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x0e Mach-O universal binary with 14 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x0f Mach-O universal binary with 15 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x10 Mach-O universal binary with 16 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x11 Mach-O universal binary with 17 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x12 Mach-O universal binary with 18 architectures
0 string \xca\xfe\xba\xbe\x00\x00\x00\x13 Mach-O universal binary with 19 architectures
# The magic bytes for Java .class files are 0xcafebabe, but AFAIK all major version numbers are less than 255
# and all minor version numbers are 0. This gives us three more bytes we can signature on.
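# Example: the bytes ca fe ba be 00 00 00 32 decode as minor=0, major=0x0032 (50), displayed as "version 50.0 (Java 1.6)".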
0 string \xca\xfe\xba\xbe\x00\x00\x00 Compiled Java class data,
>6 beshort x version %d.
>4 beshort x \b%d
# Which is which?
>4 belong 0x032d (Java 1.0/1.1)
#>4 belong 0x032d (Java 1.1)
>4 belong 0x002e (Java 1.2)
>4 belong 0x002f (Java 1.3)
>4 belong 0x0030 (Java 1.4)
>4 belong 0x0031 (Java 1.5)
>4 belong 0x0032 (Java 1.6)
>4 belong >0x0050 invalid
# Summary: HP-38/39 calculator
0 string HP38Bin HP 38 binary
>7 string A (Directory List)
>7 string B (Zaplet)
>7 string C (Note)
>7 string D (Program)
>7 string E (Variable)
>7 string F (List)
>7 string G (Matrix)
>7 string H (Library)
>7 string I (Target List)
>7 string J (ASCII Vector specification)
>7 string K (wildcard)
>7 byte <0x41 invalid
>7 byte >0x4B invalid
0 string HP39Bin HP 39 binary
>7 string A (Directory List)
>7 string B (Zaplet)
>7 string C (Note)
>7 string D (Program)
>7 string E (Variable)
>7 string F (List)
>7 string G (Matrix)
>7 string H (Library)
>7 string I (Target List)
>7 string J (ASCII Vector specification)
>7 string K (wildcard)
>7 byte <0x41 invalid
>7 byte >0x4B invalid
0 string HP38Asc HP 38 ASCII
>7 string A (Directory List)
>7 string B (Zaplet)
>7 string C (Note)
>7 string D (Program)
>7 string E (Variable)
>7 string F (List)
>7 string G (Matrix)
>7 string H (Library)
>7 string I (Target List)
>7 string J (ASCII Vector specification)
>7 string K (wildcard)
>7 byte <0x41 invalid
>7 byte >0x4B invalid
0 string HP39Asc HP 39 ASCII
>7 string A (Directory List)
>7 string B (Zaplet)
>7 string C (Note)
>7 string D (Program)
>7 string E (Variable)
>7 string F (List)
>7 string G (Matrix)
>7 string H (Library)
>7 string I (Target List)
>7 string J (ASCII Vector specification)
>7 string K (wildcard)
>7 byte <0x41 invalid
>7 byte >0x4B invalid
# Summary: HP-48/49 calculator
0 string HPHP48 HP 48 binary
>8 leshort 0x2911 (ADR)
>8 leshort 0x2933 (REAL)
>8 leshort 0x2955 (LREAL)
>8 leshort 0x2977 (COMPLX)
>8 leshort 0x299d (LCOMPLX)
>8 leshort 0x29bf (CHAR)
>8 leshort 0x29e8 (ARRAY)
>8 leshort 0x2a0a (LNKARRAY)
>8 leshort 0x2a2c (STRING)
>8 leshort 0x2a4e (HXS)
>8 leshort 0x2a74 (LIST)
>8 leshort 0x2a96 (DIR)
>8 leshort 0x2ab8 (ALG)
>8 leshort 0x2ada (UNIT)
>8 leshort 0x2afc (TAGGED)
>8 leshort 0x2b1e (GROB)
>8 leshort 0x2b40 (LIB)
>8 leshort 0x2b62 (BACKUP)
>8 leshort 0x2b88 (LIBDATA)
>8 leshort 0x2d9d (PROG)
>8 leshort 0x2dcc (CODE)
>8 leshort 0x2e48 (GNAME)
>8 leshort 0x2e6d (LNAME)
>8 leshort 0x2e92 (XLIB)
>8 leshort <0x2911 (invalid)
>8 leshort >0x2e92 (invalid)
0 string HPHP49 HP 49 binary
>8 leshort 0x2911 (ADR)
>8 leshort 0x2933 (REAL)
>8 leshort 0x2955 (LREAL)
>8 leshort 0x2977 (COMPLX)
>8 leshort 0x299d (LCOMPLX)
>8 leshort 0x29bf (CHAR)
>8 leshort 0x29e8 (ARRAY)
>8 leshort 0x2a0a (LNKARRAY)
>8 leshort 0x2a2c (STRING)
>8 leshort 0x2a4e (HXS)
>8 leshort 0x2a74 (LIST)
>8 leshort 0x2a96 (DIR)
>8 leshort 0x2ab8 (ALG)
>8 leshort 0x2ada (UNIT)
>8 leshort 0x2afc (TAGGED)
>8 leshort 0x2b1e (GROB)
>8 leshort 0x2b40 (LIB)
>8 leshort 0x2b62 (BACKUP)
>8 leshort 0x2b88 (LIBDATA)
>8 leshort 0x2d9d (PROG)
>8 leshort 0x2dcc (CODE)
>8 leshort 0x2e48 (GNAME)
>8 leshort 0x2e6d (LNAME)
>8 leshort 0x2e92 (XLIB)
>8 leshort <0x2911 (invalid)
>8 leshort >0x2e92 (invalid)
#--------------------File Systems---------------------
# Minix filesystems - Juan Cespedes <cespedes@debian.org>
# These signatures are useless until they can be improved.
#0x410 leshort 0x137f Minix filesystem
#>0x402 beshort !0 \b, %d zones
#>0x1e string minix \b, bootable
#0x410 leshort 0x138f Minix filesystem, 30 char names
#0x410 leshort 0x2468 Minix filesystem, version 2
#0x410 leshort 0x2478 Minix filesystem, version 2, 30 char names
#0x410 leshort 0x4d5a Minix filesystem, version 3
#0x410 leshort 0x4d6a Minix filesystem, version 3, 30 char names
#0x410 beshort 0x137f Minix filesystem (big endian)
#>0x402 beshort !0 \b, %d zones
#>0x1e string minix \b, bootable
#0x410 beshort 0x138f Minix filesystem (big endian), 30 char names
#0x410 beshort 0x2468 Minix filesystem (big endian), version 2
#0x410 beshort 0x2478 Minix filesystem (big endian), version 2, 30 char names
#0x410 beshort 0x4d5a Minix filesystem (big endian), version 3
#0x410 beshort 0x4d6a Minix filesystem (big endian), version 3, 30 char names
# YAFFS
0 string \x03\x00\x00\x00\x01\x00\x00\x00\xFF\xFF YAFFS filesystem
# EFS2 file system - jojo@utulsa.edu
0 lelong 0x53000000 EFS2 Qualcomm filesystem super block, little endian,
>8 string !EFSSuper invalid,
>4 leshort &1 NAND
>4 leshort ^1 NOR
>4 leshort x version 0x%x,
>24 lelong x %d blocks,
>16 lelong x 0x%x pages per block,
>20 lelong x 0x%x bytes per page
0 belong 0x53000000 EFS2 Qualcomm filesystem super block, big endian,
>8 string !SSFErepu invalid,
>4 beshort &1 NAND
>4 beshort ^1 NOR
>4 beshort x version 0x%x,
>24 belong x %d blocks,
>16 belong x 0x%x pages per block,
>20 belong x 0x%x bytes per page
# TROC file system
0 string TROC TROC filesystem,
>4 lelong x %d file entries
# PFS file system
0 string PFS/ PFS filesystem,
>4 string x version "%s",
>14 leshort x %d files
# MPFS file system
0 string MPFS MPFS (Microchip) filesystem,
>4 byte x version %d.
>5 byte x \b%d,
>6 leshort x %d file entries
# cramfs filesystem - russell@coker.com.au
0 lelong 0x28cd3d45 CramFS filesystem, little endian
>4 lelong <0 invalid
>4 lelong >1073741824 invalid
>4 lelong x size %lu
>8 lelong &1 version #2
>8 lelong &2 sorted_dirs
>8 lelong &4 hole_support
>32 lelong x CRC 0x%x,
>36 lelong x edition %lu,
>40 lelong <0 invalid
>40 lelong x %lu blocks,
>44 lelong <0 invalid
>44 lelong x %lu files
>4 lelong x {jump-to-offset:%lu}
>4 lelong x {file-size:%lu}
0 belong 0x28cd3d45 CramFS filesystem, big endian
>4 belong <0 invalid
>4 belong >1073741824 invalid
>4 belong x size %lu
>8 belong &1 version #2
>8 belong &2 sorted_dirs
>8 belong &4 hole_support
>32 belong x CRC 0x%x,
>36 belong x edition %lu,
>40 belong <0 invalid
>40 belong x %lu blocks,
>44 belong <0 invalid
>44 belong x %lu files
>4 belong x {jump-to-offset:%lu}
>4 belong x {file-size:%lu}
# JFFS2 file system
# If used with binwalk's smart signature feature (on by default, -S to disable)
# this signature can potentially lead to missing some JFFS2 file systems if there
# are multiple JFFS2 file systems in a target file and there are no other identified
# files in between the JFFS2 file systems. This is an unlikely scenario however, and
# the below signatures are much improved in terms of readability and accuracy in the
# vast majority of real world scenarios.
0 leshort 0x1985 JFFS2 filesystem, little endian
>2 leshort !0xE001
>>2 leshort !0xE002
>>>2 leshort !0x2003
>>>>2 leshort !0x2004
>>>>>2 leshort !0x2006
>>>>>>2 leshort !0xE008
>>>>>>>2 leshort !0xE009 \b, invalid
>(4.l) leshort !0x1985
>>(4.l+1) leshort !0x1985
>>>(4.l+2) leshort !0x1985
>>>>(4.l+3) leshort !0x1985
>>>>>(4.l) leshort !0xFFFF
>>>>>>(4.l+1) leshort !0xFFFF
>>>>>>>(4.l+2) leshort !0xFFFF
>>>>>>>>(4.l+3) leshort !0xFFFF \b, invalid
>4 lelong 0 invalid
>4 lelong <0 invalid
>4 lelong x {one-of-many}{jump-to-offset:%d}
0 beshort 0x1985 JFFS2 filesystem, big endian
>2 beshort !0xE001
>>2 beshort !0xE002
>>>2 beshort !0x2003
>>>>2 beshort !0x2004
>>>>>2 beshort !0x2006
>>>>>>2 beshort !0xE008
>>>>>>>2 beshort !0xE009 \b, invalid
>(4.L) beshort !0x1985
>>(4.L+1) beshort !0x1985
>>>(4.L+2) beshort !0x1985
>>>>(4.L+3) beshort !0x1985
>>>>>(4.L) beshort !0xFFFF
>>>>>>(4.L+1) beshort !0xFFFF
>>>>>>>(4.L+2) beshort !0xFFFF
>>>>>>>>(4.L+3) beshort !0xFFFF \b, invalid
>4 belong 0 invalid
>4 belong <0 invalid
>4 belong x {one-of-many}{jump-to-offset:%d}
# Squashfs, big endian
0 string sqsh Squashfs filesystem, big endian,
>28 beshort >10 invalid
>28 beshort <1 invalid
>30 beshort >10 invalid
>28 beshort x version %d.
>30 beshort x \b%d,
>28 beshort >3 compression:
>>20 beshort 1 \bgzip,
>>20 beshort 2 \blzma,
>>20 beshort 3 \bgzip (non-standard type definition),
>>20 beshort 4 \blzma (non-standard type definition),
>>20 beshort 0 \binvalid,
>>20 beshort >4 \binvalid,
>28 beshort <3
>>8 belong x size: %d bytes,
>28 beshort 3
>>63 bequad x size: %lld bytes,
>28 beshort >3
>>40 bequad x size: %lld bytes,
>4 belong x %d inodes,
>28 beshort <2
>>32 beshort x blocksize: %d bytes,
>28 beshort 2
>>51 belong x blocksize: %d bytes,
>28 beshort 3
>>51 belong x blocksize: %d bytes,
>28 beshort >3
>>12 belong x blocksize: %d bytes,
>28 beshort <4
>>39 bedate x created: %s
>28 beshort >3
>>8 bedate x created: %s
>28 beshort <3
>>8 belong x {jump-to-offset:%d}
>28 beshort 3
>>63 bequad x {jump-to-offset:%lld}
>28 beshort >3
>>40 bequad x {jump-to-offset:%lld}
# Squashfs, little endian
0 string hsqs Squashfs filesystem, little endian,
>28 leshort >10 invalid
>28 leshort <1 invalid
>30 leshort >10 invalid
>28 leshort x version %d.
>30 leshort x \b%d,
>28 leshort >3 compression:
>>20 leshort 1 \bgzip,
>>20 leshort 2 \blzma,
>>20 leshort 3 \bgzip (non-standard type definition),
>>20 leshort 4 \blzma (non-standard type definition),
>>20 leshort 0 \binvalid,
>>20 leshort >4 \binvalid,
>28 leshort <3
>>8 lelong x size: %d bytes,
>>8 lelong x {file-size:%d}
>28 leshort 3
>>63 lequad x size: %lld bytes,
>>63 lequad x {file-size:%lld}
>28 leshort >3
>>40 lequad x size: %lld bytes,
>>40 lequad x {file-size:%lld}
>4 lelong x %d inodes,
>28 leshort <2
>>32 leshort x blocksize: %d bytes,
>28 leshort 2
>>51 lelong x blocksize: %d bytes,
>28 leshort 3
>>51 lelong x blocksize: %d bytes,
>28 leshort >3
>>12 lelong x blocksize: %d bytes,
>28 leshort <4
>>39 ledate x created: %s
>28 leshort >3
>>8 ledate x created: %s
>28 leshort <3
>>8 lelong x {jump-to-offset:%d}
>28 leshort 3
>>63 lequad x {jump-to-offset:%lld}
>28 leshort >3
>>40 lequad x {jump-to-offset:%lld}
# Squashfs with LZMA compression
0 string sqlz Squashfs filesystem, big endian, lzma compression,
>28 beshort >10 invalid
>28 beshort <1 invalid
>30 beshort >10 invalid
>28 beshort x version %d.
>30 beshort x \b%d,
>28 beshort >3 compression:
>>20 beshort 1 \bgzip,
>>20 beshort 2 \blzma,
>>20 beshort 3 \bgzip (non-standard type definition),
>>20 beshort 4 \blzma (non-standard type definition),
>>20 beshort 0 \binvalid,
>>20 beshort >4 \binvalid,
>28 beshort <3
>>8 belong x size: %d bytes,
>>8 belong x {file-size:%d}
>28 beshort 3
>>63 bequad x size: %lld bytes,
>>63 bequad x {file-size:%lld}
>28 beshort >3
>>40 bequad x size: %lld bytes,
>>40 bequad x {file-size:%lld}
>4 belong x %d inodes,
>28 beshort <2
>>32 beshort x blocksize: %d bytes,
>28 beshort 2
>>51 belong x blocksize: %d bytes,
>28 beshort 3
>>51 belong x blocksize: %d bytes,
>28 beshort >3
>>12 belong x blocksize: %d bytes,
>28 beshort <4
>>39 bedate x created: %s
>28 beshort >3
>>8 bedate x created: %s
>28 beshort <3
>>8 belong x {jump-to-offset:%d}
>28 beshort 3
>>63 bequad x {jump-to-offset:%lld}
>28 beshort >3
>>40 bequad x {jump-to-offset:%lld}
# Squashfs 3.3 LZMA signature
0 string qshs Squashfs filesystem, big endian, lzma signature,
>28 beshort >10 invalid
>28 beshort <1 invalid
>30 beshort >10 invalid
>28 beshort x version %d.
>30 beshort x \b%d,
>28 beshort >3 compression:
>>20 beshort 1 \bgzip,
>>20 beshort 2 \blzma,
>>20 beshort 3 \bgzip (non-standard type definition),
>>20 beshort 4 \blzma (non-standard type definition),
>>20 beshort 0 \binvalid,
>>20 beshort >4 \binvalid,
>28 beshort <3
>>8 belong x size: %d bytes,
>>8 belong x {file-size:%d}
>28 beshort 3
>>63 bequad x size: %lld bytes,
>>63 bequad x {file-size:%lld}
>28 beshort >3
>>40 bequad x size: %lld bytes,
>>40 bequad x {file-size:%lld}
>4 belong x %d inodes,
>28 beshort <2
>>32 beshort x blocksize: %d bytes,
>28 beshort 2
>>51 belong x blocksize: %d bytes,
>28 beshort 3
>>51 belong x blocksize: %d bytes,
>28 beshort >3
>>12 belong x blocksize: %d bytes,
>28 beshort <4
>>39 bedate x created: %s
>28 beshort >3
>>8 bedate x created: %s
>28 beshort <3
>>8 belong x {jump-to-offset:%d}
>28 beshort 3
>>63 bequad x {jump-to-offset:%lld}
>28 beshort >3
>>40 bequad x {jump-to-offset:%lld}
# Squashfs for DD-WRT
0 string tqsh Squashfs filesystem, big endian, DD-WRT signature,
>28 beshort >10 invalid
>28 beshort <1 invalid
>30 beshort >10 invalid
>28 beshort x version %d.
>30 beshort x \b%d,
>28 beshort >3 compression:
>>20 beshort 1 \bgzip,
>>20 beshort 2 \blzma,
>>20 beshort 3 \bgzip (non-standard type definition),
>>20 beshort 4 \blzma (non-standard type definition),
>>20 beshort 0 \binvalid,
>>20 beshort >4 \binvalid,
>28 beshort <3
>>8 belong x size: %d bytes,
>>8 belong x {file-size:%d}
>28 beshort 3
>>63 bequad x size: %lld bytes,
>>63 bequad x {file-size:%lld}
>28 beshort >3
>>40 bequad x size: %lld bytes,
>>40 bequad x {file-size:%lld}
>4 belong x %d inodes,
>28 beshort >3
>>12		belong		x		blocksize: %d bytes,
>28 beshort <2
>>32 beshort x blocksize: %d bytes,
>28 beshort 2
>>51 belong x blocksize: %d bytes,
>28 beshort 3
>>51 belong x blocksize: %d bytes,
>28 beshort >3
>>12 belong x blocksize: %d bytes,
>28 beshort <4
>>39 bedate x created: %s
>28 beshort >3
>>8 bedate x created: %s
>28 beshort <3
>>8 belong x {jump-to-offset:%d}
>28 beshort 3
>>63 bequad x {jump-to-offset:%lld}
>28 beshort >3
>>40 bequad x {jump-to-offset:%lld}
# Squashfs for DD-WRT
0 string hsqt Squashfs filesystem, little endian, DD-WRT signature,
>28 leshort >10 invalid
>28 leshort <1 invalid
>30 leshort >10 invalid
>28 leshort x version %d.
>30 leshort x \b%d,
>28 leshort >3 compression:
>>20 leshort 1 \bgzip,
>>20 leshort 2 \blzma,
>>20 leshort 3 \bgzip (non-standard type definition),
>>20 leshort 4 \blzma (non-standard type definition),
>>20 leshort 0 \binvalid,
>>20 leshort >4 \binvalid,
>28 leshort <3
>>8 lelong x size: %d bytes,
>>8 lelong x {file-size:%d}
>28 leshort 3
>>63 lequad x size: %lld bytes,
>>63 lequad x {file-size:%lld}
>28 leshort >3
>>40 lequad x size: %lld bytes,
>>40 lequad x {file-size:%lld}
>4 lelong x %d inodes,
>28 leshort >3
>>12		lelong		x		blocksize: %d bytes,
>28 leshort <2
>>32 leshort x blocksize: %d bytes,
>28 leshort 2
>>51 lelong x blocksize: %d bytes,
>28 leshort 3
>>51 lelong x blocksize: %d bytes,
>28 leshort >3
>>12 lelong x blocksize: %d bytes,
>28 leshort <4
>>39 ledate x created: %s
>28 leshort >3
>>8 ledate x created: %s
>28 leshort <3
>>8 lelong x {jump-to-offset:%d}
>28 leshort 3
>>63 lequad x {jump-to-offset:%lld}
>28 leshort >3
>>40 lequad x {jump-to-offset:%lld}
# Non-standard Squashfs signature found on some D-Link routers
0 string shsq Squashfs filesystem, little endian, non-standard signature,
>28 leshort >10 invalid
>28 leshort <1 invalid
>30 leshort >10 invalid
>28 leshort x version %d.
>30 leshort x \b%d,
>28 leshort >3 compression:
>>20 leshort 1 \bgzip,
>>20 leshort 2 \blzma,
>>20 leshort 3 \bgzip (non-standard type definition),
>>20 leshort 4 \blzma (non-standard type definition),
>>20 leshort 0 \binvalid,
>>20 leshort >4 \binvalid,
>28 leshort <3
>>8 lelong x size: %d bytes,
>>8 lelong x {file-size:%d}
>28 leshort 3
>>63 lequad x size: %lld bytes,
>>63 lequad x {file-size:%lld}
>28 leshort >3
>>40 lequad x size: %lld bytes,
>>40 lequad x {file-size:%lld}
>4 lelong x %d inodes,
>28 leshort >3
>>12		lelong		x		blocksize: %d bytes,
>28 leshort <2
>>32 leshort x blocksize: %d bytes,
>28 leshort 2
>>51 lelong x blocksize: %d bytes,
>28 leshort 3
>>51 lelong x blocksize: %d bytes,
>28 leshort >3
>>12 lelong x blocksize: %d bytes,
>28 leshort <4
>>39 ledate x created: %s
>28 leshort >3
>>8 ledate x created: %s
>28 leshort <3
>>8 lelong x {jump-to-offset:%d}
>28 leshort 3
>>63 lequad x {jump-to-offset:%lld}
>28 leshort >3
>>40 lequad x {jump-to-offset:%lld}
# ext2/ext3 filesystems - Andreas Dilger <adilger@dilger.ca>
# ext4 filesystem - Eric Sandeen <sandeen@sandeen.net>
# volume label and UUID - Russell Coker
# http://etbe.coker.com.au/2008/07/08/label-vs-uuid-vs-device/
0 leshort 0xEF53 Linux EXT filesystem,{offset-adjust:-0x438}
>2 leshort >4 invalid state
>2 leshort 3 invalid state
>2 leshort <0 invalid state
>4 leshort >3 invalid error behavior
>4 leshort <0 invalid error behavior
>20		lelong		>1		invalid major revision
>20		lelong		<0		invalid major revision
>20		lelong		x		rev %d
>6 leshort x \b.%d
# No journal? ext2
>36 lelong ^0x0000004 ext2 filesystem data
>>2 leshort ^0x0000001 (mounted or unclean)
# Has a journal? ext3 or ext4
>36 lelong &0x0000004
# and small INCOMPAT?
>>40 lelong <0x0000040
# and small RO_COMPAT?
>>>44 lelong <0x0000008 ext3 filesystem data
# else large RO_COMPAT?
>>>44 lelong >0x0000007 ext4 filesystem data
# else large INCOMPAT?
>>40 lelong >0x000003f ext4 filesystem data
>48 belong x \b, UUID=%08x
>52 beshort x \b-%04x
>54 beshort x \b-%04x
>56 beshort x \b-%04x
>58		beshort		x		\b-%04x
>60		belong		x		\b%08x
>64 string >0 \b, volume name "%s"
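# For reference, the feature-flag tests above follow the standard ext
# superblock constants (a sketch; offsets relative to the magic):
#   s_feature_compat    at 36: bit 0x0004 = HAS_JOURNAL (journal present)
#   s_feature_incompat  at 40: bit 0x0040 = EXTENTS     (ext4-only)
#   s_feature_ro_compat at 44: bit 0x0008 = HUGE_FILE   (ext4-only)
# i.e. no journal => ext2; a journal with only small INCOMPAT/RO_COMPAT
# values => ext3; any of the larger, ext4-only bits => ext4.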
#romfs filesystems - Juan Cespedes <cespedes@debian.org>
0 string -rom1fs-\0 romfs filesystem, version 1
>8 belong >10000000 invalid
>8 belong x size: %d bytes,
>16 string x {file-name:%s}
>16 string x named "%s"
>8 belong x {file-size:%d}
>8 belong x {jump-to-offset:%d}
# Wind River MemFS file system, found in some VxWorks devices
0 string owowowowowowowowowowowowowowow Wind River management filesystem,
>30 string !ow invalid,
>32 belong 1 compressed,
>32 belong 2 plain text,
>36 belong x %d files
# netboot image - Juan Cespedes <cespedes@debian.org>
0 lelong 0x1b031336L Netboot image,
>4 lelong&0xFFFFFF00 0
>>4 lelong&0x100 0x000 mode 2
>>4 lelong&0x100 0x100 mode 3
>4 lelong&0xFFFFFF00 !0 unknown mode (invalid)
0 string WDK\x202.0\x00 WDK file system, version 2.0{offset-adjust:-18}
0 string CD001 ISO{offset-adjust:-32769}
>6144 string !NSR0 9660 CD-ROM filesystem data,
>6144 string NSR0 UDF filesystem data,
>6148 string 1 version 1.0,
>6148 string 2 version 2.0,
>6148 string 3 version 3.0
>6148 byte >0x33 invalid version,
>6148 byte <0x31 invalid version,
>38 string >\0 volume name: "%s",
>2047 string \000CD001\001EL\x20TORITO\x20SPECIFICATION bootable
# updated by Joerg Jenderek at Nov 2012
# DOS Emulator image is a 128-byte, null right-padded header followed by a hard disk image
0 string DOSEMU\0 DOS Emulator image
>0x27E leshort !0xAA55 \b, invalid
>0x27E leshort 0xAA55
#offset is 128
>>19 byte 128
>>>(19.b-1) byte 0x0
>>>>7 lelong >0 \b, %d heads
>>>>11 lelong >0 \b, %d sectors/track
>>>>15 lelong >0 \b, %d cylinders
# From: Alex Beregszaszi <alex@fsn.hu>
0 string COWD\x03 VMWare3 disk image,
>32 lelong x (%d/
>36 lelong x \b%d/
>40 lelong x \b%d)
0 string COWD\x02 VMWare3 undoable disk image,
>32 string >\0 "%s"
# TODO: Add header validation
0 string VMDK VMware4 disk image
0 string KDMV VMware4 disk image
#--------------------------------------------------------------------
# Qemu Emulator Images
# Lines written by Friedrich Schwittay (f.schwittay@yousable.de)
# Updated by Adam Buchbinder (adam.buchbinder@gmail.com)
# Made by reading sources, reading documentation, and doing trial and error
# on existing QCOW files
0 string QFI\xFB QEMU QCOW Image
#--------------------------Firmware Formats---------------------------
# uImage file
# From: Craig Heffner, U-Boot image.h header definitions file
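# For reference, a minimal C sketch of the 64-byte header parsed below,
# following the image_header_t definition in U-Boot's image.h:
#
#   typedef struct image_header {
#       uint32_t ih_magic;    /* 0x27051956                    */
#       uint32_t ih_hcrc;     /* header CRC (offset 4)         */
#       uint32_t ih_time;     /* creation timestamp (8)        */
#       uint32_t ih_size;     /* image data size (12)          */
#       uint32_t ih_load;     /* data load address (16)        */
#       uint32_t ih_ep;       /* entry point address (20)      */
#       uint32_t ih_dcrc;     /* data CRC (24)                 */
#       uint8_t  ih_os;       /* operating system (28)         */
#       uint8_t  ih_arch;     /* CPU architecture (29)         */
#       uint8_t  ih_type;     /* image type (30)               */
#       uint8_t  ih_comp;     /* compression type (31)         */
#       uint8_t  ih_name[32]; /* image name (32)               */
#   } image_header_t;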
0 belong 0x27051956 uImage header, header size: 64 bytes,
>4 belong x header CRC: 0x%X,
>8 bedate x created: %s,
>12 belong x image size: %d bytes,
>16 belong x Data Address: 0x%X,
>20 belong x Entry Point: 0x%X,
>24 belong x data CRC: 0x%X,
#>28 byte x OS type: %d,
>28 byte 0 OS: invalid OS,
>28 byte 1 OS: OpenBSD,
>28 byte 2 OS: NetBSD,
>28 byte 3 OS: FreeBSD,
>28 byte 4 OS: 4.4BSD,
>28 byte 5 OS: Linux,
>28 byte 6 OS: SVR4,
>28 byte 7 OS: Esix,
>28 byte 8 OS: Solaris,
>28 byte 9 OS: Irix,
>28 byte 10 OS: SCO,
>28 byte 11 OS: Dell,
>28 byte 12 OS: NCR,
>28 byte 13 OS: LynxOS,
>28 byte 14 OS: VxWorks,
>28 byte 15 OS: pSOS,
>28 byte 16 OS: QNX,
>28 byte 17 OS: Firmware,
>28 byte 18 OS: RTEMS,
>28 byte 19 OS: ARTOS,
>28 byte 20 OS: Unity OS,
#>29 byte x CPU arch: %d,
>29	byte	0	CPU: invalid CPU,
>29 byte 1 CPU: Alpha,
>29 byte 2 CPU: ARM,
>29 byte 3 CPU: Intel x86,
>29 byte 4 CPU: IA64,
>29 byte 5 CPU: MIPS,
>29 byte 6 CPU: MIPS 64 bit,
>29 byte 7 CPU: PowerPC,
>29 byte 8 CPU: IBM S390,
>29 byte 9 CPU: SuperH,
>29 byte 10 CPU: Sparc,
>29 byte 11 CPU: Sparc 64 bit,
>29 byte 12 CPU: M68K,
>29 byte 13 CPU: Nios-32,
>29 byte 14 CPU: MicroBlaze,
>29 byte 15 CPU: Nios-II,
>29 byte 16 CPU: Blackfin,
>29 byte 17 CPU: AVR,
>29 byte 18 CPU: STMicroelectronics ST200,
#>30 byte x image type: %d,
>30 byte 0 image type: invalid Image,
>30 byte 1 image type: Standalone Program,
>30 byte 2 image type: OS Kernel Image,
>30 byte 3 image type: RAMDisk Image,
>30 byte 4 image type: Multi-File Image,
>30 byte 5 image type: Firmware Image,
>30 byte 6 image type: Script file,
>30 byte 7 image type: Filesystem Image,
>30 byte 8 image type: Binary Flat Device Tree Blob
#>31 byte x compression type: %d,
>31 byte 0 compression type: none,
>31 byte 1 compression type: gzip,
>31 byte 2 compression type: bzip2,
>31 byte 3 compression type: lzma,
>32 string x image name: "%s"
#IMG0 header, found in VxWorks-based Mercury router firmware
0 string IMG0 IMG0 (VxWorks) header,
>4 belong x size: %d
#Mediatek bootloader signature
#From xp-dev.com
0 string BOOTLOADER! Mediatek bootloader
#CSYS header formats
0 string CSYS\x00 CSYS header, little endian,
>8 lelong x size: %d
0 string CSYS\x80 CSYS header, big endian,
>8 belong x size: %d
# wrgg firmware image
0 string wrgg02 WRGG firmware header,
>6 string x name: "%s",
>48 string x root device: "%s"
# trx image file
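# For reference, a C sketch of the 28-byte TRX v1 header matched below,
# as assumed from the offsets these signatures test (TRX v2 adds a
# fourth partition offset, giving a 32-byte header):
#
#   struct trx_header {
#       uint32_t magic;       /* "HDR0" in little endian             */
#       uint32_t len;         /* image size including header (4)     */
#       uint32_t crc32;       /* CRC32 of the data that follows (8)  */
#       uint16_t flags;       /* (12)                                */
#       uint16_t version;     /* (14)                                */
#       uint32_t offsets[3];  /* partition offsets from header start */
#   };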
0 string HDR0 TRX firmware header, little endian, header size: 28 bytes,
>4 lelong <1 invalid
>4 lelong x image size: %d bytes,
>8 lelong x CRC32: 0x%X
>12 leshort x flags: 0x%X,
>14 leshort x version: %d
0 string 0RDH TRX firmware header, big endian, header size: 28 bytes,
>4 belong <1 invalid
>4 belong x image size: %d bytes,
>8 belong x CRC32: 0x%X
>12 beshort x flags: 0x%X,
>14 beshort x version: %d
# Ubicom firmware image
0 belong 0xFA320080 Ubicom firmware header,
>12 belong x checksum: 0x%X,
>24 belong <0 invalid
>24 belong x image size: %d
# The ROME bootloader is used by several Realtek-based products.
# Unfortunately, the magic bytes are specific to each product, so
# separate signatures must be created for each one.
# Netgear KWGR614 ROME image
0 string G614 Realtek firmware header (ROME bootloader),
>4 beshort 0xd92f image type: KFS,
>4 beshort 0xb162 image type: RDIR,
>4 beshort 0xea43 image type: BOOT,
>4 beshort 0x8dc9 image type: RUN,
>4 beshort 0x2a05 image type: CCFG,
>4 beshort 0x6ce8 image type: DCFG,
>4 beshort 0xc371 image type: LOG,
>6 byte x header version: %d,
#month
>10 byte x created: %d/
#day
>12 byte x \b%d/
#year
>8 beshort x \b%d,
>16 belong x image size: %d bytes,
>22 byte x body checksum: 0x%X,
>23 byte x header checksum: 0x%X
# Linksys WRT54GX ROME image
0 belong 0x59a0e842 Realtek firmware header (ROME bootloader)
>4 beshort 0xd92f image type: KFS,
>4 beshort 0xb162 image type: RDIR,
>4 beshort 0xea43 image type: BOOT,
>4 beshort 0x8dc9 image type: RUN,
>4 beshort 0x2a05 image type: CCFG,
>4 beshort 0x6ce8 image type: DCFG,
>4 beshort 0xc371 image type: LOG,
>6 byte x header version: %d,
#month
>10 byte x created: %d/
#day
>12 byte x \b%d/
#year
>8 beshort x \b%d,
>16 belong x image size: %d bytes,
>22 byte x body checksum: 0x%X,
>23 byte x header checksum: 0x%X
# PackImg tag, sometimes used as a delimiter between the kernel and rootfs in firmware images.
0 string --PaCkImGs-- PackImg section delimiter tag,
# If the size interpreted as both big and little endian is greater than 512MB (0x20000000 bytes), consider this a false positive
>16 lelong >0x20000000
>>16 belong >0x20000000 invalid
>16 lelong <0
>>16 belong <0 invalid
>16 lelong >0
>>16 lelong x little endian size: %d bytes;
>16 belong >0
>>16 belong x big endian size: %d bytes
#------------------------------------------------------------------------------
# Broadcom header format
#
0 string BCRM Broadcom header,
>4 lelong <0 invalid
>4 lelong x number of sections: %d,
>>8 lelong 18 first section type: flash
>>8 lelong 19 first section type: disk
>>8 lelong 21 first section type: tag
# Berkeley Lab Checkpoint Restart (BLCR) checkpoint context files
# http://ftg.lbl.gov/checkpoint
0 string Ck0\0\0R\0\0\0 BLCR
>16 lelong 1 x86
>16 lelong 3 alpha
>16 lelong 5 x86-64
>16 lelong 7 ARM
>8 lelong x context data (little endian, version %d)
0 string \0\0\0C\0\0\0R BLCR
>16 belong 2 SPARC
>16 belong 4 ppc
>16 belong 6 ppc64
>16 belong 7 ARMEB
>16 belong 8 SPARC64
>8 belong x context data (big endian, version %d)
# Aculab VoIP firmware
# From: Mark Brown <broonie@sirena.org.uk>
0 string VoIP\x20Startup\x20and Aculab VoIP firmware
>35 string x format "%s"
#------------------------------------------------------------------------------
# HP LaserJet 1000 series downloadable firmware file
0 string \xbe\xefABCDEFGH HP LaserJet 1000 series downloadable firmware
# From Albert Cahalan <acahalan@gmail.com>
# really le32 operation, destination, payloadsize (but quite predictable)
# 01 00 00 00 00 00 00 c0 00 02 00 00
0 string \1\0\0\0\0\0\0\300\0\2\0\0 Marvell Libertas firmware
#---------------------------------------------------------------------------
# The following entries have been tested by Duncan Laurie <duncan@sun.com> (a
# lead Sun/Cobalt developer) who agrees that they are good and worthy of
# inclusion.
# Boot ROM images for Sun/Cobalt Linux server appliances
0 string Cobalt\x20Networks\x20Inc.\nFirmware\x20v Paged COBALT boot rom
>38 string x V%.4s
# The new format for Sun/Cobalt boot ROMs is annoying: it stores the version
# code at the very end, where file(1) can't get it.
0 string CRfs COBALT boot rom data (Flat boot rom or file system)
#
# Motorola S-Records, from Gerd Truschinski <gt@freebsd.first.gmd.de>
# Useless until further improvements can be made to the signature.
#0 string S0 Motorola S-Record; binary data in text format
# --------------------------------
# Microsoft Xbox data file formats
0 string XIP0 XIP, Microsoft Xbox data
0 string XTF0 XTF, Microsoft Xbox data
#Windows CE
0 string CECE Windows CE RTOS{offset-adjust:-64}
# --------------------------------
# ZynOS ROM header format
# From OpenWrt's zynos.h.
0 string SIG ZynOS header, header size: 48 bytes,{offset-adjust:-6}
#>0 belong x load address 0x%X,
>3 byte <0x7F rom image type:
>>3 byte <1 invalid,
>>3 byte >7 invalid,
>>3 byte 1 ROMIMG,
>>3 byte 2 ROMBOOT,
>>3 byte 3 BOOTEXT,
>>3 byte 4 ROMBIN,
>>3 byte 5 ROMDIR,
>>3 byte 6 6,
>>3 byte 7 ROMMAP,
>3 byte >0x7F ram image type:
>>3 byte >0x82 invalid,
>>3 byte 0x80 RAM,
>>3 byte 0x81 RAMCODE,
>>3 byte 0x82 RAMBOOT,
>4 belong >0x40000000 invalid
>4 belong <0 invalid
>4 belong 0 invalid
>4 belong x uncompressed size: %d,
>8 belong >0x40000000 invalid
>8 belong <0 invalid
>8 belong 0 invalid
>8 belong x compressed size: %d,
>14 beshort x uncompressed checksum: 0x%X,
>16 beshort x compressed checksum: 0x%X,
>12 byte x flags: 0x%X,
>12 byte &0x40 uncompressed checksum is valid,
>12 byte &0x80 the binary is compressed,
>>12 byte &0x20 compressed checksum is valid,
>35 belong x memory map table address: 0x%X
# Firmware header used by some VxWorks-based Cisco products
0 string CI032.00 Cisco VxWorks firmware header,
>8 lelong >1024 invalid
>8 lelong <0 invalid
>8 lelong x header size: %d bytes,
>32 lelong >1024 invalid
>32 lelong <0 invalid
>32 lelong x number of files: %d,
>48 lelong <0 invalid
>48 lelong x image size: %d,
>64 string x firmware version: "%s"
# Firmware header used by some TVs
0 string FNIB ZBOOT firmware header, header size: 32 bytes,
>8 lelong x load address: 0x%.8X,
>12 lelong x start address: 0x%.8X,
>16 lelong x checksum: 0x%.8X,
>20 lelong x version: 0x%.8X,
>24 lelong <1 invalid
>24 lelong x image size: %d bytes
# Firmware header used by several D-Link routers (and probably others)
0 string \x5e\xa3\xa4\x17 DLOB firmware header,
>(7.b+12) string !\x5e\xa3\xa4\x17 invalid,
#>>12 string x %s,
>(7.b+40) string x boot partition: "%s"
# TP-Link firmware header structure; thanks to Jonathan McGowan for reversing and documenting this format
0 string TP-LINK\x20Technologies TP-Link firmware header,{offset-adjust:-4}
#>-4 lelong x header version: %d,
>0x94 beshort x firmware version: %d.
>0x96 beshort x \b%d.
>0x98 beshort x \b%d,
>0x18 string x image version: "%s",
#>0x74 belong x image size: %d bytes,
>0x3C belong x product ID: 0x%X,
>0x40 belong x product version: %d,
>0x70 belong x kernel load address: 0x%X,
>0x74 belong x kernel entry point: 0x%X,
>0x7C belong x kernel offset: %d,
>0x80 belong x kernel length: %d,
>0x84 belong x rootfs offset: %d,
>0x88 belong x rootfs length: %d,
>0x8C belong x bootloader offset: %d,
>0x90 belong x bootloader length: %d
# Header format from: http://skaya.enix.org/wiki/FirmwareFormat
0 string \x36\x00\x00\x00 Broadcom 96345 firmware header, header size: 256,
>4 string !Broadcom
>>4 string !\x20\x20\x20\x20 invalid
>41 beshort !0x2020
>>41 beshort !0x0000
>>>41 string x firmware version: "%.4s",
>45 beshort !0x0202
>>45 beshort !0x0000
>>>45 string x board id: "%s",
>236 belong x ~CRC32 header checksum: 0x%X,
>216 belong x ~CRC32 data checksum: 0x%X
# Xerox MFP DLM signatures
0 string %%XRXbegin Xerox DLM firmware start of header
0 string %%OID_ATT_DLM_NAME Xerox DLM firmware name:
>19 string x "%s"
0 string %%OID_ATT_DLM_VERSION Xerox DLM firmware version:
>22 string x "%s"
0 string %%XRXend Xerox DLM firmware end of header
# Generic copyright signature
0 string Copyright Copyright string:
>9 byte 0 invalid
>9 string x "%s
>40 string x \b%s"
# Sercomm firmware header
0 string sErCoMm Sercomm firmware signature,
>7 leshort x version control: %d,
>9 leshort x download control: %d,
>11 string x hardware ID: "%s",
>44 leshort x hardware version: 0x%X,
>58 leshort x firmware version: 0x%X,
>60 leshort x starting code segment: 0x%X,
>62 leshort x code size: 0x%X
# NPK firmware header, used by Mikrotik
0 belong 0x1EF1D0BA NPK firmware header,
>4 lelong <0 invalid
>4 lelong x image size: %d,
>14 string x image name: "%s",
>(48.l+58) string x description: "%s
>(48.l+121) string x \b%s"
# Ubiquiti firmware signatures
0 string UBNT Ubiquiti firmware header,
>0x104 belong x ~CRC32: 0x%X,
>4 string x version: "%s"
0 string GEOS Ubiquiti firmware header,
>0x104 belong x ~CRC32: 0x%X,
>4 string x version: "%s"
# Too many false positives...
#0 string OPEN Ubiquiti firmware header, third party,
#>0x104 belong x ~CRC32: 0x%X,
#>4 string x version: "%s"
0 string PARTkernel Ubiquiti firmware kernel partition
0 string PARTcramfs Ubiquiti firmware CramFS partition
0 string PARTrootfs Ubiquiti firmware rootfs partition
# Found in DIR-100 firmware
0 string AIH0N AIH0N firmware header, header size: 48,
>12 belong x size: %d,
>8 belong !0 executable code,
>>8 belong x load address: 0x%X,
>32 string x version: "%s"
0 belong 0x5EA3A417 SEAMA firmware header, big endian,
>6 beshort x meta size: %d,
>8 belong x size: %d
0 lelong 0x5EA3A417 SEAMA firmware header, little endian,
>6 leshort x meta size: %d,
>8 lelong x size: %d
0 belong 0x4D544443 NSP firmware header, big endian,
>16 belong x header size: %d,
>20 belong x image size: %d,
>4 belong x kernel offset: %d,
>12		belong		x	header version: %d
0 lelong 0x4D544443 NSP firmware header, little endian,
>16 lelong x header size: %d,
>20 lelong x image size: %d,
>4 lelong x kernel offset: %d,
>12		lelong		x	header version: %d
# http://www.openwiz.org/wiki/Firmware_Layout#Beyonwiz_.wrp_header_structure
0 string WizFwPkgl Beyonwiz firmware header,
>20 string x version: "%s"
0	string	BLI223WJ0	Thomson/Alcatel encoded firmware,
>32 byte x version: %d.
>33 byte x \b%d.
>34 byte x \b%d.
>35 byte x \b%d,
>44 belong x size: %d,
>48 belong x crc: 0x%.8X,
>35 byte x try decryption tool from:
>35 byte x http://download.modem-help.co.uk/mfcs-A/Alcatel/Modems/Misc/
0 string \xd9\x54\x93\x7a\x68\x04\x4a\x44\x81\xce\x0b\xf6\x17\xd8\x90\xdf UEFI PI firmware volume{offset-adjust:-16}
# http://android.stackexchange.com/questions/23357/\
# is-there-a-way-to-look-inside-and-modify-an-adb-backup-created-file/\
# 23608#23608
0 string ANDROID\040BACKUP\n Android Backup
>15 string 1\n \b, version 1
>17 string 0\n \b, uncompressed
>17 string 1\n \b, compressed
>19 string none\n \b, unencrypted
>19 string AES-256\n \b, encrypted AES-256
# Tag Image File Format, from Daniel Quinlan (quinlan@yggdrasil.com)
# The second word of TIFF files is the TIFF version number, 42, which has
# never changed. The TIFF specification recommends testing for it.
0 string MM\x00\x2a TIFF image data, big-endian
0 string II\x2a\x00 TIFF image data, little-endian
# PNG [Portable Network Graphics, or "PNG's Not GIF"] images
# (Greg Roelofs, newt@uchicago.edu)
# (Albert Cahalan, acahalan@cs.uml.edu)
#
# 137 P N G \r \n ^Z \n [4-byte length] I H D R [IHDR data] [IHDR crc] ...
#
0 string \x89PNG\x0d\x0a\x1a\x0a PNG image
>16 belong x \b, %ld x
>20 belong x %ld,
>24 byte x %d-bit
>25 byte 0 grayscale,
>25 byte 2 \b/color RGB,
>25 byte 3 colormap,
>25 byte 4 gray+alpha,
>25 byte 6 \b/color RGBA,
#>26 byte 0 deflate/32K,
>28 byte 0 non-interlaced
>28 byte 1 interlaced
# GIF
0 string GIF8 GIF image data
>4 string 7a \b, version "8%s",
>4 string 9a \b, version "8%s",
>6 leshort >0 %hd x
>8 leshort >0 %hd
#>10 byte &0x80 color mapped,
#>10 byte&0x07 =0x00 2 colors
#>10 byte&0x07 =0x01 4 colors
#>10 byte&0x07 =0x02 8 colors
#>10 byte&0x07 =0x03 16 colors
#>10 byte&0x07 =0x04 32 colors
#>10 byte&0x07 =0x05 64 colors
#>10 byte&0x07 =0x06 128 colors
#>10 byte&0x07 =0x07 256 colors
# PC bitmaps (OS/2, Windows BMP files) (Greg Roelofs, newt@uchicago.edu)
0 string BM
>14 leshort 12 PC bitmap, OS/2 1.x format
>>18 lelong <1 invalid
>>18 lelong >1000000 invalid
>>18 leshort x \b, %d x
>>20 lelong <1 invalid
>>20 lelong >1000000 invalid
>>20 leshort x %d
>14 leshort 64 PC bitmap, OS/2 2.x format
>>18 lelong <1 invalid
>>18 lelong >1000000 invalid
>>18 leshort x \b, %d x
>>20 lelong <1 invalid
>>20 lelong >1000000 invalid
>>20 leshort x %d
>14 leshort 40 PC bitmap, Windows 3.x format
>>18 lelong <1 invalid
>>18 lelong >1000000 invalid
>>18 lelong x \b, %d x
>>22 lelong <1 invalid
>>22 lelong >1000000 invalid
>>22 lelong x %d x
>>28 lelong <1 invalid
>>28 lelong >1000000 invalid
>>28 leshort x %d
>14 leshort 128 PC bitmap, Windows NT/2000 format
>>18 lelong >1000000 invalid
>>18 lelong <1 invalid
>>18 lelong x \b, %d x
>>22 lelong <1 invalid
>>22 lelong >1000000 invalid
>>22 lelong x %d x
>>28 lelong <1 invalid
>>28 lelong >1000000 invalid
>>28 leshort x %d
#------------------------------------------------------------------------------
# JPEG images
# SunOS 5.5.1 had
#
# 0 string \377\330\377\340 JPEG file
# 0 string \377\330\377\356 JPG file
#
# both of which turn into "JPEG image data" here.
#
0 belong 0xffd8ffe0 JPEG image data, JFIF standard
>6 string !JFIF invalid
# The following added by Erik Rossen <rossen@freesurf.ch> 1999-09-06
# in a vain attempt to add image size reporting for JFIF. Note that these
# tests are not fool-proof since some perfectly valid JPEGs are currently
# impossible to specify in magic(4) format.
# First, a little JFIF version info:
>11 byte x \b %d.
>12 byte x \b%02d
# Next, the resolution or aspect ratio of the image:
#>>13 byte 0 \b, aspect ratio
#>>13 byte 1 \b, resolution (DPI)
#>>13 byte 2 \b, resolution (DPCM)
#>>4 beshort x \b, segment length %d
# Next, show thumbnail info, if it exists:
>18 byte !0 \b, thumbnail %dx
>>19 byte x \b%d
0 belong 0xffd8ffe1 JPEG image data, EXIF standard
# EXIF moved down here to avoid reporting a bogus version number,
# and EXIF version number printing added.
# - Patrik Rådman <patrik+file-magic@iki.fi>
>6 string !Exif invalid
# Look for EXIF IFD offset in IFD 0, and then look for EXIF version tag in EXIF IFD.
# All possible combinations of entries have to be enumerated, since no looping
# is possible. And both endians are possible...
# The combinations included below are from real-world JPEGs.
# Little-endian
>12 string II
# IFD 0 Entry #5:
>>70 leshort 0x8769
# EXIF IFD Entry #1:
>>>(78.l+14) leshort 0x9000
>>>>(78.l+23) byte x %c
>>>>(78.l+24) byte x \b.%c
>>>>(78.l+25) byte !0x30 \b%c
# IFD 0 Entry #9:
>>118 leshort 0x8769
# EXIF IFD Entry #3:
>>>(126.l+38) leshort 0x9000
>>>>(126.l+47) byte x %c
>>>>(126.l+48) byte x \b.%c
>>>>(126.l+49) byte !0x30 \b%c
# IFD 0 Entry #10
>>130 leshort 0x8769
# EXIF IFD Entry #3:
>>>(138.l+38) leshort 0x9000
>>>>(138.l+47) byte x %c
>>>>(138.l+48) byte x \b.%c
>>>>(138.l+49) byte !0x30 \b%c
# EXIF IFD Entry #4:
>>>(138.l+50) leshort 0x9000
>>>>(138.l+59) byte x %c
>>>>(138.l+60) byte x \b.%c
>>>>(138.l+61) byte !0x30 \b%c
# EXIF IFD Entry #5:
>>>(138.l+62) leshort 0x9000
>>>>(138.l+71) byte x %c
>>>>(138.l+72) byte x \b.%c
>>>>(138.l+73) byte !0x30 \b%c
# IFD 0 Entry #11
>>142 leshort 0x8769
# EXIF IFD Entry #3:
>>>(150.l+38) leshort 0x9000
>>>>(150.l+47) byte x %c
>>>>(150.l+48) byte x \b.%c
>>>>(150.l+49) byte !0x30 \b%c
# EXIF IFD Entry #4:
>>>(150.l+50) leshort 0x9000
>>>>(150.l+59) byte x %c
>>>>(150.l+60) byte x \b.%c
>>>>(150.l+61) byte !0x30 \b%c
# EXIF IFD Entry #5:
>>>(150.l+62) leshort 0x9000
>>>>(150.l+71) byte x %c
>>>>(150.l+72) byte x \b.%c
>>>>(150.l+73) byte !0x30 \b%c
# Big-endian
>12 string MM
# IFD 0 Entry #9:
>>118 beshort 0x8769
# EXIF IFD Entry #1:
>>>(126.L+14) beshort 0x9000
>>>>(126.L+23) byte x %c
>>>>(126.L+24) byte x \b.%c
>>>>(126.L+25) byte !0x30 \b%c
# EXIF IFD Entry #3:
>>>(126.L+38) beshort 0x9000
>>>>(126.L+47) byte x %c
>>>>(126.L+48) byte x \b.%c
>>>>(126.L+49) byte !0x30 \b%c
# IFD 0 Entry #10
>>130 beshort 0x8769
# EXIF IFD Entry #3:
>>>(138.L+38) beshort 0x9000
>>>>(138.L+47) byte x %c
>>>>(138.L+48) byte x \b.%c
>>>>(138.L+49) byte !0x30 \b%c
# EXIF IFD Entry #5:
>>>(138.L+62) beshort 0x9000
>>>>(138.L+71) byte x %c
>>>>(138.L+72) byte x \b.%c
>>>>(138.L+73) byte !0x30 \b%c
# IFD 0 Entry #11
>>142 beshort 0x8769
# EXIF IFD Entry #4:
>>>(150.L+50) beshort 0x9000
>>>>(150.L+59) byte x %c
>>>>(150.L+60) byte x \b.%c
>>>>(150.L+61) byte !0x30 \b%c
# Here things get sticky. We can do ONE MORE marker segment with
# indirect addressing, and that's all. It would be great if we could
# do pointer arithmetic like in an assembler language. Christos?
# And if there was some sort of looping construct to do searches, plus a few
# named accumulators, it would be even more effective...
# At least we can show a comment if no other segments got inserted before:
>(4.S+5) byte 0xFE
>>(4.S+8) string >\0 \b, comment: "%s"
# FIXME: When we can do non-byte counted strings, we can use that to get
# the string's count, and fix Debian bug #283760
#>(4.S+5) byte 0xFE \b, comment
#>>(4.S+6) beshort x \b length=%d
#>>(4.S+8) string >\0 \b, "%s"
# Or, we can show the encoding type (I've included only the three most common)
# and image dimensions if we are lucky and the SOFn (image segment) is here:
>(4.S+5) byte 0xC0 \b, baseline
>>(4.S+6) byte x \b, precision %d
>>(4.S+7) beshort x \b, %dx
>>(4.S+9) beshort x \b%d
>(4.S+5) byte 0xC1 \b, extended sequential
>>(4.S+6) byte x \b, precision %d
>>(4.S+7) beshort x \b, %dx
>>(4.S+9) beshort x \b%d
>(4.S+5) byte 0xC2 \b, progressive
>>(4.S+6) byte x \b, precision %d
>>(4.S+7) beshort x \b, %dx
>>(4.S+9) beshort x \b%d
# I've commented-out quantisation table reporting. I doubt anyone cares yet.
#>(4.S+5) byte 0xDB \b, quantisation table
#>>(4.S+6) beshort x \b length=%d
#>14 beshort x \b, %d x
#>16 beshort x \b %d
0 string M88888888888888888888888888 Binwalk logo, ASCII art (Toph){offset-adjust:-50}
>27 string !8888888888\n invalid
#-------------------------Kernels-------------------------------------
# Linux kernel boot images, from Albert Cahalan <acahalan@cs.uml.edu>
# and others such as Axel Kohlmeyer <akohlmey@rincewind.chemie.uni-ulm.de>
# and Nicolás Lichtmaier <nick@debian.org>
# All known images start with: b8 c0 07 8e d8 b8 00 90 8e c0 b9 00 01 29 f6 29
0 string \xb8\xc0\x07\x8e\xd8\xb8\x00\x90\x8e\xc0\xb9\x00\x01\x29\xf6\x29 Linux kernel boot image
>514 string !HdrS (invalid)
# Finds and prints Linux kernel strings in raw Linux kernels (output like uname -a).
# Commonly found in decompressed embedded kernel binaries.
0 string Linux\ version\ Linux kernel version
>14 byte 0 invalid
>14 byte !0
>>14 string x "%s
>>45 string x \b%s"
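# ------------------------------------------------------------------
# Note on the LZMA signatures below: an LZMA properties byte encodes
# three parameters as properties = (pb * 5 + lp) * 9 + lc, where lc is
# 0-8 and lp/pb are 0-4, so only certain byte values are valid. For
# example, 0x5D = 93 = (2*5 + 0)*9 + 3, i.e. pb=2, lp=0, lc=3 (the
# LZMA utils default). Each signature below matches one valid byte
# followed by a dictionary size, which LZMA utils restricts to the
# powers of two from 2^16 (65536) to 2^25 (33554432) bytes.
# ------------------------------------------------------------------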
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x40
# ------------------------------------------------------------------
0 string \x40\x00\x00 LZMA compressed data, properties: 0x40,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x41
# ------------------------------------------------------------------
0 string \x41\x00\x00 LZMA compressed data, properties: 0x41,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x48
# ------------------------------------------------------------------
0 string \x48\x00\x00 LZMA compressed data, properties: 0x48,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x49
# ------------------------------------------------------------------
0 string \x49\x00\x00 LZMA compressed data, properties: 0x49,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x51
# ------------------------------------------------------------------
0 string \x51\x00\x00 LZMA compressed data, properties: 0x51,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x5A
# ------------------------------------------------------------------
0 string \x5A\x00\x00 LZMA compressed data, properties: 0x5A,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x5B
# ------------------------------------------------------------------
0 string \x5B\x00\x00 LZMA compressed data, properties: 0x5B,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x5C
# ------------------------------------------------------------------
0 string \x5C\x00\x00 LZMA compressed data, properties: 0x5C,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x5D
# ------------------------------------------------------------------
0 string \x5D\x00\x00 LZMA compressed data, properties: 0x5D,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x5E
# ------------------------------------------------------------------
0 string \x5E\x00\x00 LZMA compressed data, properties: 0x5E,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x63
# ------------------------------------------------------------------
0 string \x63\x00\x00 LZMA compressed data, properties: 0x63,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x64
# ------------------------------------------------------------------
0 string \x64\x00\x00 LZMA compressed data, properties: 0x64,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x65
# ------------------------------------------------------------------
0 string \x65\x00\x00 LZMA compressed data, properties: 0x65,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x66
# ------------------------------------------------------------------
0 string \x66\x00\x00 LZMA compressed data, properties: 0x66,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x6C
# ------------------------------------------------------------------
0 string \x6C\x00\x00 LZMA compressed data, properties: 0x6C,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x6D
# ------------------------------------------------------------------
0 string \x6D\x00\x00 LZMA compressed data, properties: 0x6D,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x6E
# ------------------------------------------------------------------
0 string \x6E\x00\x00 LZMA compressed data, properties: 0x6E,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These checks are not 100% reliable. The uncompressed size could be exactly the same as the dictionary size, but that is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x75
# ------------------------------------------------------------------
0 string \x75\x00\x00 LZMA compressed data, properties: 0x75,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x76
# ------------------------------------------------------------------
0 string \x76\x00\x00 LZMA compressed data, properties: 0x76,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x7E
# ------------------------------------------------------------------
0 string \x7E\x00\x00 LZMA compressed data, properties: 0x7E,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x87
# ------------------------------------------------------------------
0 string \x87\x00\x00 LZMA compressed data, properties: 0x87,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x88
# ------------------------------------------------------------------
0 string \x88\x00\x00 LZMA compressed data, properties: 0x88,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x89
# ------------------------------------------------------------------
0 string \x89\x00\x00 LZMA compressed data, properties: 0x89,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x8A
# ------------------------------------------------------------------
0 string \x8A\x00\x00 LZMA compressed data, properties: 0x8A,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x8B
# ------------------------------------------------------------------
0 string \x8B\x00\x00 LZMA compressed data, properties: 0x8B,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x90
# ------------------------------------------------------------------
0 string \x90\x00\x00 LZMA compressed data, properties: 0x90,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x91
# ------------------------------------------------------------------
0 string \x91\x00\x00 LZMA compressed data, properties: 0x91,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x92
# ------------------------------------------------------------------
0 string \x92\x00\x00 LZMA compressed data, properties: 0x92,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x93
# ------------------------------------------------------------------
0 string \x93\x00\x00 LZMA compressed data, properties: 0x93,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x99
# ------------------------------------------------------------------
0 string \x99\x00\x00 LZMA compressed data, properties: 0x99,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x9A
# ------------------------------------------------------------------
0 string \x9A\x00\x00 LZMA compressed data, properties: 0x9A,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0x9B
# ------------------------------------------------------------------
0 string \x9B\x00\x00 LZMA compressed data, properties: 0x9B,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xA2
# ------------------------------------------------------------------
0 string \xA2\x00\x00 LZMA compressed data, properties: 0xA2,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xA3
# ------------------------------------------------------------------
0 string \xA3\x00\x00 LZMA compressed data, properties: 0xA3,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xAB
# ------------------------------------------------------------------
0 string \xAB\x00\x00 LZMA compressed data, properties: 0xAB,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xB4
# ------------------------------------------------------------------
0 string \xB4\x00\x00 LZMA compressed data, properties: 0xB4,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xB5
# ------------------------------------------------------------------
0 string \xB5\x00\x00 LZMA compressed data, properties: 0xB5,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xB6
# ------------------------------------------------------------------
0 string \xB6\x00\x00 LZMA compressed data, properties: 0xB6,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xB7
# ------------------------------------------------------------------
0 string \xB7\x00\x00 LZMA compressed data, properties: 0xB7,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xB8
# ------------------------------------------------------------------
0 string \xB8\x00\x00 LZMA compressed data, properties: 0xB8,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xBD
# ------------------------------------------------------------------
0 string \xBD\x00\x00 LZMA compressed data, properties: 0xBD,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xBE
# ------------------------------------------------------------------
0 string \xBE\x00\x00 LZMA compressed data, properties: 0xBE,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xBF
# ------------------------------------------------------------------
0 string \xBF\x00\x00 LZMA compressed data, properties: 0xBF,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xC0
# ------------------------------------------------------------------
0 string \xC0\x00\x00 LZMA compressed data, properties: 0xC0,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xC6
# ------------------------------------------------------------------
0 string \xC6\x00\x00 LZMA compressed data, properties: 0xC6,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xC7
# ------------------------------------------------------------------
0 string \xC7\x00\x00 LZMA compressed data, properties: 0xC7,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xC8
# ------------------------------------------------------------------
0 string \xC8\x00\x00 LZMA compressed data, properties: 0xC8,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xCF
# ------------------------------------------------------------------
0 string \xCF\x00\x00 LZMA compressed data, properties: 0xCF,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xD0
# ------------------------------------------------------------------
0 string \xD0\x00\x00 LZMA compressed data, properties: 0xD0,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
# ------------------------------------------------------------------
# Signature for LZMA compressed data with valid properties byte 0xD8
# ------------------------------------------------------------------
0 string \xD8\x00\x00 LZMA compressed data, properties: 0xD8,
# These are all the valid dictionary sizes supported by LZMA utils.
>1 lelong !65536
>>1 lelong !131072
>>>1 lelong !262144
>>>>1 lelong !524288
>>>>>1 lelong !1048576
>>>>>>1 lelong !2097152
>>>>>>>1 lelong !4194304
>>>>>>>>1 lelong !8388608
>>>>>>>>>1 lelong !16777216
>>>>>>>>>>1 lelong !33554432 invalid
>1 lelong x dictionary size: %d bytes,
# Assume that a valid size will be greater than 32 bytes and less than 1GB (a value of -1 IS valid).
# This could technically be valid, but is unlikely.
>5 lequad !-1
>>5 lequad <32 invalid
>>5 lequad >0x40000000 invalid
# These are not 100%. The uncompressed size could be exactly the same as the dictionary size, but it is unlikely.
# Since most false positives are the result of repeating sequences of bytes (such as executable instructions),
# marking matches with the same uncompressed and dictionary sizes as invalid eliminates many of these false positives.
>1 lelong 65536
>>5 lequad 65536 invalid
>1 lelong 131072
>>5 lequad 131072 invalid
>1 lelong 262144
>>5 lequad 262144 invalid
>1 lelong 524288
>>5 lequad 524288 invalid
>1 lelong 1048576
>>5 lequad 1048576 invalid
>1 lelong 2097152
>>5 lequad 2097152 invalid
>1 lelong 4194304
>>5 lequad 4194304 invalid
>1 lelong 8388608
>>5 lequad 8388608 invalid
>1 lelong 16777216
>>5 lequad 16777216 invalid
>1 lelong 33554432
>>5 lequad 33554432 invalid
>5 lequad x uncompressed size: %lld bytes
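# Taken together, each of the LZMA signatures above encodes the same three-part heuristic:
# the dictionary size must be one of the power-of-two sizes (64KB - 32MB) supported by LZMA
# utils, the uncompressed size must be either -1 (unknown) or fall between 32 bytes and 1GB,
# and a match whose uncompressed size exactly equals its dictionary size is rejected as a
# likely false positive. The following is a minimal Python sketch of that heuristic applied
# to a raw 13-byte LZMA header (the function name and constants are illustrative, not part
# of binwalk):

import struct

# The power-of-two dictionary sizes (64KB - 32MB) accepted by the signatures above.
VALID_DICT_SIZES = set(2**n for n in range(16, 26))

def plausible_lzma_header(header):
    # header holds the first 13 bytes of a suspected LZMA stream:
    # 1 properties byte, 4-byte LE dictionary size, 8-byte LE uncompressed size.
    if len(header) < 13:
        return False
    dict_size = struct.unpack("<L", header[1:5])[0]
    uncompressed_size = struct.unpack("<q", header[5:13])[0]
    if dict_size not in VALID_DICT_SIZES:
        return False
    # An uncompressed size of -1 (unknown) is valid; anything else must fall
    # between 32 bytes and 1GB, and must not equal the dictionary size.
    if uncompressed_size != -1:
        if uncompressed_size < 32 or uncompressed_size > 0x40000000:
            return False
        if uncompressed_size == dict_size:
            return False
    return True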
#------------------------------------------------------------------------------
# $File: pdf,v 1.6 2009/09/19 16:28:11 christos Exp $
# pdf: file(1) magic for Portable Document Format
#
0 string %PDF- PDF document,
>6 byte !0x2e invalid
>5 string x version: "%3s"
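# The PDF signature above anchors on the "%PDF-" magic, requires a '.' at offset 6, and
# prints the three-character version string starting at offset 5. A rough Python equivalent
# (the function name is illustrative):

def pdf_version(data):
    # "%PDF-" magic at offset 0, a '.' at offset 6 (e.g. "%PDF-1.4"),
    # and a three-character version string at offset 5.
    if data[:5] != "%PDF-" or data[6:7] != ".":
        return None
    return data[5:8]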
#------------------------------------------------------------------------------
# $File: zyxel,v 1.6 2009/09/19 16:28:13 christos Exp $
# zyxel: file(1) magic for ZyXEL modems
#
# From <rob@pe1chl.ampr.org>
# These are the /etc/magic entries to decode datafiles as used for the
# ZyXEL U-1496E DATA/FAX/VOICE modems. (This header conforms to a
# ZyXEL-defined standard)
0 string ZyXEL\002 ZyXEL voice data
>10 byte 0 \b, CELP encoding
>10 byte&0x0B 1 \b, ADPCM2 encoding
>10 byte&0x0B 2 \b, ADPCM3 encoding
>10 byte&0x0B 3 \b, ADPCM4 encoding
>10 byte&0x0B 8 \b, New ADPCM3 encoding
>10 byte&0x04 4 \b, with resync
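# In the entries above, "byte&0x0B" means the flag byte at offset 10 is AND-ed with the mask
# 0x0B before the comparison, so unrelated bits (such as the 0x04 resync bit, tested
# separately) do not affect the encoding match. A small Python sketch of the same masking
# (names are illustrative):

ENCODINGS = {0: "CELP", 1: "ADPCM2", 2: "ADPCM3", 3: "ADPCM4", 8: "New ADPCM3"}

def zyxel_encoding(flag_byte):
    # Mask off the resync bit (0x04) and other unrelated bits before the lookup.
    name = ENCODINGS.get(flag_byte & 0x0B)
    resync = bool(flag_byte & 0x04)
    return name, resync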
0 string LinuxGuestRecord Xen saved domain file
0 string \x3chtml HTML document header{extract-delay:HTML document footer}
>5 byte !0x20
>>5 byte !0x3e \b, invalid
0 string \x3cHTML HTML document header{extract-delay:HTML document footer}
>5 byte !0x20
>>5 byte !0x3e \b, invalid
0 string \x3c/html\x3e HTML document footer{offset-adjust:7}
0 string \x3c/HTML\x3e HTML document footer{offset-adjust:7}
0 string \x3c?xml\x20version XML document,
>15 string x version: "%.3s"
#------------------------------------------------------------------------------
# $File: sql,v 1.6 2009/09/19 16:28:12 christos Exp $
# sql: file(1) magic for SQL files
#
# From: "Marty Leisner" <mleisner@eng.mc.xerox.com>
# Recognize some MySQL files.
#
0 beshort 0xfe01 MySQL table definition file
>2 string <1 invalid
>2 string >\11 invalid
>2 byte x Version %d
0 string \xfe\xfe\x03 MySQL MISAM index file
>3 string <1 invalid
>3 string >\11 invalid
>3 byte x Version %d
0 string \xfe\xfe\x07 MySQL MISAM compressed data file
>3 string <1 invalid
>3 string >\11 invalid
>3 byte x Version %d
0 string \xfe\xfe\x05 MySQL ISAM index file
>3 string <1 invalid
>3 string >\11 invalid
>3 byte x Version %d
0 string \xfe\xfe\x06 MySQL ISAM compressed data file
>3 string <1 invalid
>3 string >\11 invalid
>3 byte x Version %d
0 string \376bin MySQL replication log
#------------------------------------------------------------------------------
# iRiver H Series database file
# From Ken Guest <ken@linux.ie>
# As observed from iRivNavi.iDB and unencoded firmware
#
0 string iRivDB iRiver Database file
>11 string >\0 Version "%s"
>39 string iHP-100 [H Series]
#------------------------------------------------------------------------------
# SQLite database files
# Ken Guest <ken@linux.ie>, Ty Sarna, Zack Weinberg
#
# Version 1 used GDBM internally; its files cannot be distinguished
# from other GDBM files.
#
# Version 2 used this format:
0 string **\x20This\x20file\x20contains\x20an\x20SQLite SQLite 2.x database
# Version 3 of SQLite allows applications to embed their own "user version"
# number in the database. Detect this and distinguish those files.
0 string SQLite\x20format\x203
>60 string _MTN Monotone source repository
>60 belong !0 SQLite 3.x database, user version %u
>60 belong 0 SQLite 3.x database
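# The last three entries above distinguish SQLite 3 files by the 4-byte big-endian
# "user version" field at offset 60 of the database header, with the special case of
# Monotone repositories, which store "_MTN" there. A minimal sketch of the same check,
# assuming a readable file path (the function name is illustrative):

import struct

def sqlite3_user_version(path):
    # Read the SQLite 3 header and return the user version at offset 60,
    # or None if the file is not an SQLite 3 database.
    with open(path, "rb") as f:
        header = f.read(64)
    if not header.startswith("SQLite format 3"):
        return None
    if header[60:64] == "_MTN":
        return "Monotone source repository"
    return struct.unpack(">L", header[60:64])[0]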
#!/usr/bin/env python
import os
import sys
from os import listdir, path
from distutils.core import setup

WIDTH = 115

# Check for pre-requisite modules only if --no-prereq-checks was not specified
if "--no-prereq-checks" not in sys.argv:
    print "checking pre-requisites"
    try:
        import magic
        try:
            magic.MAGIC_NO_CHECK_TEXT
        except Exception, e:
            print "\n", "*" * WIDTH
            print "Pre-requisite failure:", str(e)
            print "It looks like you have an old or incompatible magic module installed."
            print "Please install the official python-magic module, or download and install it from source: ftp://ftp.astron.com/pub/file/"
            print "*" * WIDTH, "\n"
            sys.exit(1)
    except Exception, e:
        print "\n", "*" * WIDTH
        print "Pre-requisite failure:", str(e)
        print "Please install the python-magic module, or download and install it from source: ftp://ftp.astron.com/pub/file/"
        print "*" * WIDTH, "\n"
        sys.exit(1)

    try:
        import matplotlib
        matplotlib.use('Agg')
        import matplotlib.pyplot
        import numpy
    except Exception, e:
        print "\n", "*" * WIDTH
        print "Pre-requisite check warning:", str(e)
        print "To take advantage of this tool's entropy plotting capabilities, please install the python-matplotlib module."
        print "*" * WIDTH, "\n"

        if raw_input('Continue installation without this module (Y/n)? ').lower().startswith('n'):
            print 'Quitting...\n'
            sys.exit(1)
else:
    # This is super hacky.
    sys.argv.pop(sys.argv.index("--no-prereq-checks"))

# Build / install C compression libraries
c_lib_dir = os.path.join(os.path.dirname(os.path.realpath(__file__)), "C")
c_lib_makefile = os.path.join(c_lib_dir, "Makefile")

working_directory = os.getcwd()
os.chdir(c_lib_dir)

status = 0
if not os.path.exists(c_lib_makefile):
    status |= os.system("./configure")

status |= os.system("make")
if status != 0:
    print "ERROR: Failed to build libtinfl.so! Do you have gcc installed?"
    sys.exit(1)

if "install" in sys.argv:
    os.system("make install")

os.chdir(working_directory)

# Generate a new magic file from the files in the magic directory
print "generating binwalk magic file"
magic_files = listdir("magic")
magic_files.sort()
fd = open("binwalk/magic/binwalk", "wb")
for magic in magic_files:
    fpath = path.join("magic", magic)
    if path.isfile(fpath):
        fd.write(open(fpath).read())
fd.close()

# The data files to install along with the binwalk module
install_data_files = ["magic/*", "config/*", "plugins/*"]

# Install the binwalk module, script and support files
setup(name = "binwalk",
      version = "1.2.3",
      description = "Firmware analysis tool",
      author = "Craig Heffner",
      url = "http://binwalk.googlecode.com",
      requires = ["magic", "matplotlib.pyplot"],
      packages = ["binwalk"],
      package_data = {"binwalk" : install_data_files},
      scripts = ["bin/binwalk"])