The Gz module contains functions to compress and uncompress strings using the same algorithm as the program gzip. Compressing can be done in streaming mode or all at once.
The Gz module consists of two classes: Gz.deflate and Gz.inflate. Gz.deflate is used to pack data and Gz.inflate is used to unpack data. (Think "inflatable boat".)
Note that this module is only available if the gzip library was available when Pike was compiled.
Note that although these functions use the same algorithm as gzip, they do not use the exact same format, so you cannot directly unzip gzipped files with these routines. Support for this will be added in the future.
constant Gz.DEFAULT_STRATEGY
The default strategy as selected in the zlib library.
constant Gz.FILTERED
This strategy is intended for data created by a filter or predictor and will put more emphasis on Huffman encoding and less on LZ string matching. This is between DEFAULT_STRATEGY and HUFFMAN_ONLY.
constant Gz.FIXED
In this mode dynamic Huffman codes are disabled, allowing for a simpler decoder for special applications. This mode is not available in all zlib versions.
constant Gz.HUFFMAN_ONLY
This strategy will turn off string matching completely, only doing Huffman encoding. Window size doesn't matter in this mode and the data can be decompressed with a zero size window.
constant Gz.RLE
This strategy is even closer to HUFFMAN_ONLY in that it only looks at the latest byte in the window, i.e. a window size of 1 byte is sufficient for decompression. This mode is not available in all zlib versions.
int adler32(string(8bit) data, void|int(0..) start_value)
This function calculates the Adler-32 checksum.
string(8bit) compress(string(8bit)|String.Buffer|System.Memory|Stdio.Buffer data, void|bool raw, void|int(0..9) level, void|int strategy, void|int(8..15) window_size)
Encodes and returns the input data according to the deflate format defined in RFC 1951.
data
The data to be encoded.
raw
If set, the data is encoded without the header and footer defined in RFC 1950. One use for this is the ZIP container format.
level
Indicates the level of effort spent to make the data compress well. Zero means no packing, 2-3 is considered 'fast', 8 is default and higher is considered 'slow' but gives better packing.
strategy
The strategy to be used when compressing the data. One of the following:
| DEFAULT_STRATEGY | The default strategy as selected in the zlib library. |
| FILTERED | This strategy is intended for data created by a filter or predictor and will put more emphasis on Huffman encoding and less on LZ string matching. This is between DEFAULT_STRATEGY and HUFFMAN_ONLY. |
| RLE | This strategy is even closer to HUFFMAN_ONLY in that it only looks at the latest byte in the window, i.e. a window size of 1 byte is sufficient for decompression. This mode is not available in all zlib versions. |
| HUFFMAN_ONLY | This strategy will turn off string matching completely, only doing Huffman encoding. Window size doesn't matter in this mode and the data can be decompressed with a zero size window. |
| FIXED | In this mode dynamic Huffman codes are disabled, allowing for a simpler decoder for special applications. This mode is not available in all zlib versions. |
window_size
Defines the size of the LZ77 window from 256 bytes to 32768 bytes, expressed as 2^x.
See also: deflate, inflate, uncompress
int crc32(string(8bit) data, void|int(0..) start_value)
This function calculates the standard ISO 3309 Cyclic Redundancy Check.
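The start_value argument suggests that a checksum can be built up piece by piece. The following is a minimal sketch of that idea, assuming start_value continues a running CRC the way zlib's own crc32() does; the helper name chained_crc32 is hypothetical.

```pike
// Sketch: compute a CRC-32 in two steps and compare it with the one-shot value.
// Assumption: start_value continues a running CRC, as in zlib's crc32().
int chained_crc32(string(8bit) head, string(8bit) tail)
{
  return Gz.crc32(tail, Gz.crc32(head));
}

int main()
{
  int whole = Gz.crc32("123456789");
  int chained = chained_crc32("1234", "56789");
  write("whole:   %08x\nchained: %08x\n", whole, chained);
  return whole != chained;   // exit status 0 when both values agree
}
```

If the assumption holds, both lines show the same value, which is useful when the data arrives in blocks and never exists as one string.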
string(8bit) uncompress(string(8bit)|String.Buffer|System.Memory|Stdio.Buffer data, void|bool raw)
Uncompresses the data and returns it. The raw parameter tells the decoder that the input data lacks the header and footer defined in RFC 1950.
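As an illustration of how compress and uncompress pair up, here is a small sketch that round-trips a string both with and without the RFC 1950 wrapper. The helper name round_trips and the sample data are made up for the example; the same raw flag has to be given on both sides.

```pike
// Returns 1 if data survives a compress/uncompress round trip.
// raw selects a bare RFC 1951 stream without the RFC 1950 header and footer.
int round_trips(string(8bit) data, int raw)
{
  string(8bit) packed = Gz.compress(data, raw, 9);
  return Gz.uncompress(packed, raw) == data;
}

int main()
{
  string(8bit) data = "hello hello hello " * 100;

  write("original: %d bytes, wrapped: %d bytes, raw: %d bytes\n",
        sizeof(data), sizeof(Gz.compress(data)), sizeof(Gz.compress(data, 1)));

  if (!round_trips(data, 0) || !round_trips(data, 1))
    error("Round trip failed.\n");
  return 0;
}
```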
inherit ._file : _file
Allows the user to open a gzip archive and read and write its contents in uncompressed form, emulating the Stdio.File interface.
An important limitation on this class is that it may only be used for reading or writing, not both at the same time. Please also note that if you want to reopen a file for reading after a write, you must close the file before calling open or strange effects might be the result.
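The write-then-read sequence described above might look like the following sketch. The path /tmp/gz_file_example.gz and the helper name gz_round_trip are hypothetical, and the code assumes the scratch directory is writable.

```pike
// Writes text to a gzip file, closes it, then reopens it for reading.
string gz_round_trip(string path, string text)
{
  Gz.File writer = Gz.File(path, "wb");
  writer->write(text);
  writer->close();              // close before reopening for reading

  Gz.File reader = Gz.File(path, "rb");
  string result = reader->read();   // no argument: read the whole file
  reader->close();
  rm(path);
  return result;
}

int main()
{
  // Hypothetical scratch path; adjust for your system.
  write(gz_round_trip("/tmp/gz_file_example.gz", "line one\nline two\n"));
  return 0;
}
```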
Gz.File Gz.File(void|string|int|Stdio.Stream file, void|string mode)
file
Filename or file descriptor of the gzip file to open, or an already open Stream.
mode
Mode for the file. Defaults to "rb".
See also: open, Stdio.File
String.SplitIterator|Stdio.LineIterator line_iterator(int|void trim)
Returns an iterator that will loop over the lines in this file. If trim is true, all '\r' characters will be removed from the input.
int open(string|int|Stdio.Stream file, void|string mode)
file
Filename or file descriptor of the gzip file to open, or an already open Stream.
mode
Mode for the file. Defaults to "rb". May be one of the following:
| "rb" | read mode |
| "wb" | write mode |
| "ab" | append mode |
For the "wb" and "ab" modes, additional parameters may be specified. Please see the zlib manual for more info.
Returns non-zero if successful.
int|string read(void|int length)
Reads data from the file. If no argument is given, the whole file is read.
function(:string) read_function(int nbytes)
Returns a function that when called will call read with nbytes as argument. Can be used to get various callback functions, e.g. for the fourth argument to String.SplitIterator.
Low-level implementation of read/write support for GZip files
int close()
Closes the file.
Returns 1 if successful.
Gz._file Gz._file(void|string|Stdio.Stream gzFile, void|string mode)
Opens a gzip file for reading.
bool eof()
Returns 1 if EOF has been reached.
int open(string|int|Stdio.Stream file, void|string mode)
Opens a file for I/O.
file
The filename or an open filedescriptor or Stream for the GZip file to use.
mode
Mode for the file operations. Defaults to read only. The following mode characters are unique to Gz.File:
| "0" to "9" | Values 0 to 9 set the compression level from no compression to maximum available compression. Defaults to 6. |
| "f" | Sets the compression strategy to FILTERED. |
| "h" | Sets the compression strategy to HUFFMAN_ONLY. |
If the object already has been opened, it will first be closed.
int|string read(int len)
Reads len (uncompressed) bytes from the file. If read is unsuccessful, 0 is returned.
int seek(int pos, void|int type)
Seeks within the file.
pos
Position relative to the search type.
type
SEEK_SET sets the current position in the file to pos. SEEK_CUR sets the new position to current+pos. SEEK_END is not supported.
Returns the new position, or a negative number if the seek failed.
int setparams(void|int(0..9) level, void|int strategy, void|int(8..15) window_size)
Sets the encoding level, strategy and window_size.
See also: Gz.deflate
int tell()
Returns the current position within the file.
int write(string data)
Writes the data to the file.
Returns the number of bytes written to the file.
This class interfaces with the compression routines in the libz library.
This class is only available if libz was available and found when Pike was compiled.
See also: Gz.inflate(), Gz.compress(), Gz.uncompress()
Gz.deflate clone()
Clones the deflate object. Typically used to test compression of new content using the same exact state.
Gz.deflate Gz.deflate(int(-9..9)|void level, int|void strategy, int(8..15)|void window_size)
Gz.deflate Gz.deflate(mapping options)
This function can also be used to re-initialize a Gz.deflate object so it can be re-used.
If a mapping is passed as the only argument, it will accept the parameters described below as indices, and additionally it accepts a string as dictionary.
level
Indicates the level of effort spent to make the data compress well. Zero means no packing, 2-3 is considered 'fast', 6 is default and higher is considered 'slow' but gives better packing.
If the argument is negative, no headers will be emitted. This is needed to produce ZIP-files, as an example. The negative value is then negated, and handled as a positive value.
strategy
The strategy to be used when compressing the data. One of the following:
| DEFAULT_STRATEGY | The default strategy as selected in the zlib library. |
| FILTERED | This strategy is intended for data created by a filter or predictor and will put more emphasis on Huffman encoding and less on LZ string matching. This is between DEFAULT_STRATEGY and HUFFMAN_ONLY. |
| RLE | This strategy is even closer to HUFFMAN_ONLY in that it only looks at the latest byte in the window, i.e. a window size of 1 byte is sufficient for decompression. This mode is not available in all zlib versions. |
| HUFFMAN_ONLY | This strategy will turn off string matching completely, only doing Huffman encoding. Window size doesn't matter in this mode and the data can be decompressed with a zero size window. |
| FIXED | In this mode dynamic Huffman codes are disabled, allowing for a simpler decoder for special applications. This mode is not available in all zlib versions. |
window_size
Defines the size of the LZ77 window from 256 bytes to 32768 bytes, expressed as 2^x.
string(8bit) deflate(string(8bit)|String.Buffer|System.Memory|Stdio.Buffer data, int|void flush)
This function performs gzip style compression on a string data and returns the packed data. Streaming can be done by calling this function several times and concatenating the returned data.
The optional argument flush should be one of the following:
| Gz.NO_FLUSH | Only data that doesn't fit in the internal buffers is returned. |
| Gz.PARTIAL_FLUSH | All input is packed and returned. |
| Gz.SYNC_FLUSH | All input is packed and returned. |
| Gz.FINISH | All input is packed and an 'end of data' marker is appended (default). |
See also: Gz.inflate->inflate()
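The streaming use of deflate can be sketched as follows. The helper names deflate_chunks and inflate_blocks are hypothetical, and the sketch assumes the flush constants Gz.NO_FLUSH and Gz.FINISH are available in your Pike build.

```pike
// Packs the chunks one at a time; only the last call ends the stream.
// Assumes chunks is non-empty.
string(8bit) deflate_chunks(array(string(8bit)) chunks)
{
  Gz.deflate packer = Gz.deflate(6);
  string(8bit) packed = "";
  foreach (chunks[..<1], string(8bit) chunk)
    packed += packer->deflate(chunk, Gz.NO_FLUSH);
  return packed + packer->deflate(chunks[-1], Gz.FINISH);
}

// Unpacks the stream in fixed-size blocks to mimic data arriving piecewise.
string(8bit) inflate_blocks(string(8bit) packed, int block_size)
{
  Gz.inflate unpacker = Gz.inflate();
  string(8bit) plain = "";
  for (int i = 0; i < sizeof(packed); i += block_size)
    plain += unpacker->inflate(packed[i..i + block_size - 1]);
  return plain;
}

int main()
{
  array(string(8bit)) chunks = ({ "first chunk, ", "second chunk, ", "last chunk" });
  write("%s\n", inflate_blocks(deflate_chunks(chunks), 16));
  return 0;
}
```

With Gz.NO_FLUSH an individual call may return an empty string, so the result is only complete once the Gz.FINISH call has been concatenated.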
This class interfaces with the uncompression routines in the libz library.
This program is only available if libz was available and found when Pike was compiled.
See also: deflate, compress, uncompress
Gz.inflate Gz.inflate(int|void window_size)
Gz.inflate Gz.inflate(mapping options)
If called with a mapping as only argument, create accepts the entries window_size (described below) and dictionary, which is a string to be set as dictionary.
The window_size value is passed down to inflateInit2 in zlib.
If the argument is negative, no header checks are done, and no verification of the data will be done either. This is needed for uncompressing ZIP-files, as an example. The negative value is then negated, and handled as a positive value.
Positive arguments set the maximum dictionary size to a power of 2, such that 8 (the minimum) will cause the window size to be 256 bytes, and 15 (the maximum, and default value) will cause it to be 32 KB. Setting this to anything except 15 is rather pointless in Pike.
It can be used to limit the amount of memory that is used to uncompress files, but 32 KB is not all that much in the great scheme of things.
To decompress files compressed with level 9 compression, a 32 KB window size is needed. Level 1 compression only requires a 256 byte window.
If the options version is used you can specify your own dictionary in addition to the window size.
string(8bit) end_of_stream()
This function returns 0 if the end of stream marker has not yet been encountered, or a string (possibly empty) containing any extra data received following the end of stream marker if the marker has been encountered. If the extra data is not needed, the result of this function can be treated as a logical value.
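A minimal sketch of how end_of_stream might be used to pick up data that follows a compressed stream. The helper name trailing_data and the "TRAILER" payload are invented for the example.

```pike
// Inflates a stream and returns whatever followed the end of stream marker,
// or 0 if the marker was never seen.
string trailing_data(string(8bit) stream)
{
  Gz.inflate unpacker = Gz.inflate();
  unpacker->inflate(stream);
  return unpacker->end_of_stream();
}

int main()
{
  string(8bit) stream = Gz.compress("payload") + "TRAILER";
  string extra = trailing_data(stream);
  if (extra)
    write("stream ended, %d extra bytes follow\n", sizeof(extra));
  else
    write("stream is not complete yet\n");
  return 0;
}
```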
string(8bit) inflate(string(8bit)|String.Buffer|System.Memory|Stdio.Buffer data)
This function performs gzip style decompression. It can inflate a whole file at once or in blocks.
// whole file
write(Gz.inflate()->inflate(Stdio.stdin->read(0x7fffffff)));

// streaming (blocks); read() returns "" at end of file, and "" is true in Pike
function inflate = Gz.inflate()->inflate;
string s;
while ((s = Stdio.stdin->read(8192)) && sizeof(s))
  write(inflate(s));
See also: Gz.deflate->deflate(), Gz.uncompress
The Bz2 module contains functions to compress and uncompress strings using the same algorithm as the program bzip2. Compressing and decompressing can be done in streaming mode feeding the compress and decompress objects with arbitrarily large pieces of data.
The Bz2 module consists of three classes: Bz2.Deflate, Bz2.Inflate and Bz2.File. Bz2.Deflate is used to compress data and Bz2.Inflate is used to uncompress data. Bz2.File is used to handle Bzip2 files.
Note that this module is only available if libbzip2 was available when Pike was compiled.
Note that although the functions in Inflate and Deflate use the same algorithm as bzip2, they do not use the exact same format, so you cannot directly create or unpack bzip2 files using those functions. That is why there is a third class for files.
inherit "___Bz2" : Bz2
Bz2.Deflate is a builtin program written in C. It interfaces the packing routines in the bzlib library.
This program is only available if bzlib was available and found when Pike was compiled.
See also: Bz2.Inflate()
Bz2.Deflate Bz2.Deflate(int(1..9)|void block_size)
If given, block_size should be a number from 1 to 9 indicating the block size used when doing compression. The actual block size will be 100000 times this number. Low numbers are considered 'fast', higher numbers are considered 'slow' but give better packing. The parameter is set to 9 if it is omitted.
This function can also be used to re-initialize a Bz2.Deflate object so it can be re-used.
string deflate(string data, int(0..2)|void flush_mode)
This function performs bzip2 style compression on a string data and returns the packed data. Streaming can be done by calling this function several times and concatenating the returned data.
The optional argument flush_mode should be one of the following:
| Bz2.BZ_RUN | Runs Bz2.Deflate->feed() |
| Bz2.BZ_FLUSH | Runs Bz2.Deflate->read() |
| Bz2.BZ_FINISH | Runs Bz2.Deflate->finish() |
See also: Bz2.Inflate->inflate()
void feed(string data)
This function feeds the data to the internal buffers of the Deflate object. All data is buffered until a read or a finish is done.
See also: Bz2.Deflate->read(), Bz2.Deflate->finish()
string finish(string data)
This method feeds the data to the internal buffers of the Deflate object. Then it compresses all buffered data, adds an end of data marker to it, returns the compressed data as a string, and reinitializes the deflate object.
See also: Bz2.Deflate->feed(), Bz2.Deflate->read()
string read(string data)
This function feeds the data to the internal buffers of the Deflate object. Then it compresses all buffered data and returns the compressed data as a string.
See also: Bz2.Deflate->feed(), Bz2.Deflate->finish()
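The feed and finish calls described above might be combined as in this sketch. The helper name bz2_round_trip is hypothetical, and the code assumes the Bz2 module was compiled into your Pike.

```pike
// Buffers all parts but the last with feed(), then lets finish() compress
// everything and append the end of data marker. Assumes parts is non-empty.
string bz2_round_trip(array(string) parts)
{
  Bz2.Deflate packer = Bz2.Deflate(9);
  foreach (parts[..<1], string part)
    packer->feed(part);
  string packed = packer->finish(parts[-1]);

  // A fresh Inflate object unpacks the complete stream in one call.
  return Bz2.Inflate()->inflate(packed);
}

int main()
{
  write("%s\n", bz2_round_trip(({ "first part, ", "second part, ", "last part" })));
  return 0;
}
```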
Low-level implementation of read/write support for Bzip2 files
This class is currently not available on Windows.
inherit Bz2::File : File
bool close()
Closes the file.
Bz2.File Bz2.File()
Bz2.File Bz2.File(string filename, void|string mode)
Creates a Bz2.File object.
bool eof()
Returns 1 if EOF has been reached, 0 otherwise.
String.SplitIterator|Stdio.LineIterator line_iterator(int|void trim)
Returns an iterator that will loop over the lines in this file. If trim is true, all '\r' characters will be removed from the input.
bool open(string file, void|string mode)
Opens a file for I/O.
file
The name of the file to be opened
mode
Mode for the file operations. Can be either "r" (read) or "w". Read is default.
string read(int len)
Reads len (uncompressed) bytes from the file. If len is omitted the whole file is read. If read is unsuccessful, 0 is returned.
function(:string) read_function(int nbytes)
Returns a function that when called will call read with nbytes as argument. Can be used to get various callback functions, e.g. for the fourth argument to String.SplitIterator.
bool read_open(string file)
Opens a file for reading.
file
The name of the file to be opened
int write(string data)
Writes the data to the file.
Returns the number of bytes written to the file.
bool write_open(string file)
Opens a file for writing.
file
The name of the file to be opened
Bz2.Inflate is a builtin program written in C. It interfaces the unpacking routines in the bzlib library.
This program is only available if bzlib was available and found when Pike was compiled.
See also: Deflate
Bz2.Inflate Bz2.Inflate()
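string inflate(string data)
This function performs bzip2 style decompression. It can do decompression with arbitrarily large pieces of data. When fed with data, it decompresses as much as it can and buffers the rest.
// compressed_data is assumed to hold a complete bzip2 stream
Bz2.Inflate inflate_object = Bz2.Inflate();
string uncompressed_concatenated_data = "";
// feed the compressed data ten bytes at a time
for (int i = 0; i < sizeof(compressed_data); i += 10)
  uncompressed_concatenated_data += inflate_object->inflate(compressed_data[i..i+9]);
See also: Bz2.Deflate->deflate()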
Implementation of the HPACK (RFC 7541) header packing standard.
This is the header packing system that is used in HTTP/2 (RFC 7540).
inherit "___HPack" : "___HPack"
constant int HPack.DEFAULT_HEADER_TABLE_SIZE
This is the default static maximum size of the dynamic header table.
This constant is taken from RFC 7540 section 6.5.2.
constant HPack.static_header_tab
Table of static headers. RFC 7541 appendix A, Table 1.
Note that this table is indexed starting on 0 (zero), while the corresponding table in RFC 7541 starts on 1 (one).
protected mapping(string(8bit):int|mapping(string(8bit):int)) HPack.static_header_index
Index for static_header_tab.
Note that the indices are offset by 1 (one).
This variable should be regarded as a constant.
This variable is used to initialize the header index in the Context.
See also: static_header_tab, Context()->header_index
protected mapping(string(8bit):int|mapping(string(8bit):int)) create_index(array(array(string(8bit))) tab)
Helper function used to create the static_header_index.
string(8bit) huffman_decode(string(8bit) str)
Decodes the string str encoded with the static Huffman code specified in RFC 7541 appendix B.
str
String to decode.
Returns the decoded string.
See also: huffman_encode()
string(8bit) huffman_encode(string(8bit) str)
Encodes the string str with the static Huffman code specified in RFC 7541 appendix B.
str
String to encode.
Returns the encoded string.
See also: huffman_decode()
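The two Huffman helpers are inverses of each other, which can be sketched as follows. The helper name huffman_round_trip is hypothetical, and the example assumes a Pike version that ships the HPack module.

```pike
// Encodes str with the static HPACK Huffman code and decodes it again.
string(8bit) huffman_round_trip(string(8bit) str)
{
  string(8bit) packed = HPack.huffman_encode(str);
  write("%d bytes verbatim, %d bytes Huffman coded\n",
        sizeof(str), sizeof(packed));
  return HPack.huffman_decode(packed);
}

int main()
{
  if (huffman_round_trip("www.example.com") != "www.example.com")
    error("Huffman round trip failed.\n");
  return 0;
}
```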
protected void update_index(mapping(string(8bit):int|mapping(string(8bit):int)) index, int i, array(string(8bit)) key)
Update the specified encoder lookup index.
index
Lookup index to add an entry to.
key
Lookup key to add.
i
Value to store in the index for the key.
Flags for Context()->encode_header() et al.
constant HPack.HEADER_INDEXED
Indexed header.
constant HPack.HEADER_INDEXED_MASK
Bitmask for indexing mode.
constant HPack.HEADER_NEVER_INDEXED
Never indexed header.
constant HPack.HEADER_NOT_INDEXED
Unindexed header.
Context for an HPack encoder or decoder.
This class implements the majority of RFC 7541.
Functions of interest are typically encode() and decode().
protected array(array(string(8bit))) HPack.Context.dynamic_headers
Table of currently available dynamically defined headers.
New entries are appended last, and the first dynamic_prefix elements are not used.
See also: header_index, add_header()
protected int HPack.Context.dynamic_max_size
Current upper size limit in bytes for dynamic_headers.
See also: set_dynamic_size()
protected int HPack.Context.dynamic_prefix
Index of first available header in dynamic_headers.
protected int HPack.Context.dynamic_size
Current size in bytes of dynamic_headers.
protected mapping(string(8bit):int|mapping(string(8bit):int)) HPack.Context.header_index
Index into dynamic_headers and static_headers.
| Indexed on the header name in lower-case. The value is one of:
|
The index values in turn are coded as follows:
| Index into |
| Not used. |
| Inverted ( |
See also: dynamic_headers, static_header_tab, add_header()
protected int HPack.Context.static_max_size
Static upper size limit in bytes for dynamic_headers.
See also: create(), set_dynamic_size()
int(0)|int(62) add_header(string(8bit) header, string(8bit) value)
Add a header to the table of known headers and to the header index.
header
Name of header to add.
value
Value of the header.
Returns 0 (zero) if the header was too large to store.
Returns the encoding key for the header on success (this is always sizeof(static_header_tab) + 1 (i.e. 62), as new headers are prepended to the dynamic header table).
Adding a header may cause old headers to be evicted from the table.
See also: get_indexed_header()
HPack.Context HPack.Context(int|void protocol_dynamic_max_size)
Create a new HPack Context.
static_max_size
This is the static maximum size in bytes (as calculated by RFC 7541 section 4.1) of the dynamic header table. It defaults to DEFAULT_HEADER_TABLE_SIZE, and is the upper limit for set_dynamic_size().
See also: set_dynamic_size()
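The size calculation of RFC 7541 section 4.1 counts each table entry as the length of its name plus the length of its value plus 32 bytes of overhead. A small sketch of that arithmetic, using the hypothetical helper names entry_size and table_size:

```pike
// Size of one dynamic table entry according to RFC 7541 section 4.1.
int entry_size(string(8bit) name, string(8bit) value)
{
  return sizeof(name) + sizeof(value) + 32;
}

// Total size of a set of entries, to compare against the table limit.
int table_size(array(array(string(8bit))) headers)
{
  int total = 0;
  foreach (headers, array(string(8bit)) h)
    total += entry_size(h[0], h[1]);
  return total;
}

int main()
{
  // 10 + 15 + 32 = 57 bytes
  write("%d\n", entry_size(":authority", "www.example.com"));
  write("%d\n", table_size(({ ({ ":authority", "www.example.com" }),
                              ({ "cache-control", "no-cache" }) })));
  return 0;
}
```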
array(array(string(8bit)|HPackFlags)) decode(Stdio.Buffer buf)
Decode a HPack header block.
buf
Input buffer.
Returns an array of headers. Cf. decode_header().
See also: decode_header(), encode()
array(string(8bit)|HPackFlags) decode_header(Stdio.Buffer buf)
Decode a single HPack header.
buf
Input buffer.
Returns UNDEFINED on empty buffer.
Returns an array with a header and value otherwise:
Array | |
| Name of the header. Under normal circumstances this is always lower-case, but no check is currently performed. |
| Value of the header. |
| Optional encoding flags. Only set for fields having
|
The elements in the array are in the same order and compatible with the arguments to encode_header().
Throws on encoding errors.
The returned array MUST NOT be modified.
In future implementations the result array may get extended with a flag field.
The in-band signalling of encoding table sizes is handled internally.
See also: decode(), encode_header()
void encode(array(array(string(8bit)|HPackFlags)) headers, Stdio.Buffer buf)
Encode a full set of headers.
headers
An array of ({ header, value })-tuples.
buf
Output buffer.
See also: encode_header(), decode()
variant string(8bit) encode(array(array(string(8bit))) headers)
Convenience variant of encode().
headers
An array of ({ header, value })-tuples.
Returns the corresponding HPack encoding.
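A sketch of how encode() and decode() might be paired. It uses one Context per direction, since the encoder and the decoder each maintain their own dynamic table; the helper name hpack_round_trip and the sample headers are invented for the example.

```pike
// Encodes the headers with one Context and decodes them with another.
// Only the name and value of each decoded header are returned.
array(array) hpack_round_trip(array(array(string(8bit))) headers)
{
  HPack.Context encoder = HPack.Context();
  HPack.Context decoder = HPack.Context();

  string(8bit) block = encoder->encode(headers);
  array decoded = decoder->decode(Stdio.Buffer(block));

  // Drop any optional flag element so the result is comparable to the input.
  return map(decoded, lambda(array h) { return h[..1]; });
}

int main()
{
  array(array(string(8bit))) headers = ({
    ({ ":method", "GET" }),
    ({ ":path", "/" }),
    ({ "user-agent", "example" }),
  });
  foreach (hpack_round_trip(headers), array h)
    write("%s: %s\n", h[0], h[1]);
  return 0;
}
```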
void encode_header(Stdio.Buffer buf, string(8bit) header, string(8bit) value, HPackFlags|void flags)
Encode a single HPack header.
buf
Output buffer.
header
Name of header. This should under normal circumstances be a lower-case string, but this is currently not checked.
value
Header value.
flags
Optional encoding flags.
See also: encode(), decode_header()
protected void evict_dynamic_headers()
Evict dynamic headers until dynamic_size goes below dynamic_max_size.
array(string(8bit)) get_indexed_header(int(1..) index)
Lookup a known header.
index
Encoding key for the header to retrieve.
Returns UNDEFINED on unknown header.
Returns an array with a header and value otherwise:
Array | |
| Name of the header. Under normal circumstances this is always lower-case, but no check is currently performed. |
| Value of the header. |
See also: add_header()
protected void put_int(Stdio.Buffer buf, int(8bit) bits, int(8bit) mask, int value)
Encode an integer with the HPack integer encoding.
buf
Output buffer.
bits
Bits that should always be set in the first byte of output.
mask
Bitmask for the value part of the first byte of output.
value
Integer value to encode.
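For reference, the integer encoding of RFC 7541 section 5.1 can be sketched as below. This is a standalone illustration of the algorithm, not the module's own implementation; the helper name hpack_int is hypothetical, and its bits and mask arguments follow the parameter descriptions above.

```pike
// Encodes value with a prefix whose value bits are given by mask.
// Returns the output bytes as an array of integers.
array(int) hpack_int(int bits, int mask, int value)
{
  // Small values fit in the prefix of the first byte.
  if (value < mask) return ({ bits | value });

  // Otherwise the prefix is filled with ones and the remainder follows
  // in groups of 7 bits, least significant group first.
  array(int) bytes = ({ bits | mask });
  value -= mask;
  while (value >= 128) {
    bytes += ({ (value & 127) | 128 });
    value >>= 7;
  }
  return bytes + ({ value });
}

int main()
{
  // 5-bit prefix examples from RFC 7541 appendix C.1
  write("%s\n", (array(string))hpack_int(0, 0x1f, 10) * ", ");    // 10
  write("%s\n", (array(string))hpack_int(0, 0x1f, 1337) * ", ");  // 31, 154, 10
  return 0;
}
```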
protected void put_string(Stdio.Buffer buf, string(8bit) str)
Encode a string with the HPack string encoding.
buf
Output buffer.
str
String to output.
The encoder will huffman_encode() the string if that renders a shorter encoding than the verbatim string.
void set_dynamic_size(Stdio.Buffer buf, int(0..) new_max_size)
Set the dynamic maximum size of the dynamic header lookup table.
buf
Output buffer.
new_max_size
New dynamic maximum size in bytes (as calculated by RFC 7541 section 4.1).
This function can be used to clear the dynamic header table by setting the size to zero.
Also note that the new_max_size has an upper bound that is limited by static_max_size.
See also: encode_header(), encode(), create()