Research · Reverse Engineering

IonCube PHP 8.1–8.4 — Static Opcode Extraction Without Execution

Loader family: ioncube_loader_win_8.x.dll  ·  Targets: PHP 8.1–8.4  ·  v15.5.0  ·  Tools: IDA Pro, Python

ioncube php8.1 php8.2 php8.3 php8.4 static-analysis HR+c MT4IC PRNG6 opcode-xor B180 ida-pro

§ Overview

IonCube is one of the most widely deployed PHP obfuscation systems. It replaces human-readable PHP source with a proprietary binary format; a closed-source loader extension re-materialises the Zend opcode array at runtime. The standard analysis approach — hooking the loader at runtime — requires a running PHP environment, the correct PHP version, and leaves traces the loader can detect.

This article documents a fully static approach developed from the PHP 8.1 loader and then ported to the PHP 8.2, 8.3, and 8.4 v15.5.0 loaders. The Python toolkit decodes the HR+c container, inflates the body, recovers supported main/function/class op_arrays, and emits a normalized icdump-ir-v1 plus a readable opcode listing — without executing the encoded sample.

💡
For the documented v15 record layouts, the outer function record remains structurally parseable even when the inner body blob is encrypted. This exposes names, arguments, variables, and the cipher identifier before body decryption.

Supported loader profiles

Target ABI   CLI profile        Zend opcode range   Serialized integer assignment rule
──────────   ────────────────   ─────────────────   ──────────────────────────────────────────────
PHP 8.1      --php-version 81   0–202               ASSIGN_OP: subtract encode-time bias 2
PHP 8.2      --php-version 82   0–202               ASSIGN and ASSIGN_OP: subtract 2
PHP 8.3      --php-version 83   0–203               ASSIGN and ASSIGN_OP: subtract 2
PHP 8.4      --php-version 84   0–209               No assignment bias; values are stored verbatim

Each profile selects its own handler-lane tables, opcode metadata, type-dimension table, feature word, interned-string table, and Zend opcode bound. The container constants remain shared; the PHP ABI is read from php_version_code in the decrypted header trailer, not inferred from the HR+c revision word.
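The profile table can be expressed as a small lookup keyed by php_version_code. The following is a minimal sketch, not the toolkit's actual data model: the Profile class, select_profile, check_opcode, and unbias are hypothetical names, and the "stored" value is treated as an opaque serialized integer because the table above does not say more about where it lives in the record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    php_version_code: int      # value read from the decrypted header trailer
    max_opcode: int            # highest valid Zend opcode for this ABI
    biased_opcodes: frozenset  # opcode names whose serialized integer carries a bias
    bias: int                  # amount to subtract when decoding


# Values copied from the "Supported loader profiles" table.
PROFILES = {
    81: Profile(81, 202, frozenset({"ASSIGN_OP"}), 2),
    82: Profile(82, 202, frozenset({"ASSIGN", "ASSIGN_OP"}), 2),
    83: Profile(83, 203, frozenset({"ASSIGN", "ASSIGN_OP"}), 2),
    84: Profile(84, 209, frozenset(), 0),
}


def select_profile(php_version_code: int) -> Profile:
    """Pick the profile from the header's php_version_code, never from the HR+c revision."""
    try:
        return PROFILES[php_version_code]
    except KeyError:
        raise ValueError(f"unsupported php_version_code {php_version_code}") from None


def check_opcode(profile: Profile, opcode: int) -> None:
    """Reject opcodes outside the profile's Zend opcode range."""
    if not 0 <= opcode <= profile.max_opcode:
        raise ValueError(f"opcode {opcode} outside 0..{profile.max_opcode}")


def unbias(profile: Profile, opcode_name: str, stored: int) -> int:
    """Remove the encode-time bias where the profile says one was applied."""
    return stored - profile.bias if opcode_name in profile.biased_opcodes else stored
```

Keeping the bias rule in data rather than in branches makes it harder to apply the 8.1 rule to an 8.2 file by accident.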

ℹ️
Function names and virtual addresses shown in the reverse-engineering walkthrough are from the PHP 8.1 DLL used as the reference case. The maintained profiles contain the relocated data addresses for 8.2–8.4; do not copy an 8.1 address into another DLL.

01 The HR+c Container Format

The tested PHP 8.1–8.4 v15 samples share the same outer HR+c layout. A standard PHP stub opens the file — if the loader is absent the stub executes die() with a human-readable error — and closes with a bare ?>. The four-byte ASCII sequence HR+c follows immediately with no whitespace. Every byte from the H through to end-of-file is a single custom Base64-encoded blob.

sample.php — file layout
<?php //0b9e ...
// IonCube Loader is required ...
die("... loader not installed ..."); ?>HR+c[base64-payload…to-EOF]
                                          ↑ no space here
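The layout above can be split with a plain byte search. This is a minimal sketch under one assumption: it looks for the stricter boundary "?>HR+c" (closing tag immediately followed by the marker), whereas a bare search for "HR+c" would also work on well-formed files. The function name split_stub_and_payload is hypothetical.

```python
MARKER = b"HR+c"


def split_stub_and_payload(source: bytes) -> tuple[bytes, bytes]:
    """Return (php_stub, base64_blob); the blob starts at the 'H' of the marker."""
    boundary = source.find(b"?>" + MARKER)
    if boundary < 0:
        raise ValueError("no '?>HR+c' boundary found")
    marker_offset = boundary + 2           # skip the closing '?>'
    return source[:marker_offset], source[marker_offset:]


# Synthetic file, read as data only; nothing here is executed by PHP.
demo = b'<?php //0b9e\ndie("loader not installed"); ?>HR+c0123abcd'
stub, blob = split_stub_and_payload(demo)
```

The marker is kept as part of the blob because the four characters H, R, +, c are themselves valid symbols of the custom alphabet and are decoded together with the rest.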

Finding the format handler in IDA

Two searches bring you directly to sub_1009E160 — the function that validates the marker and decodes the meta-header. All six format constants live inside it:

IDA entry-point searches
Alt+B → ASCII "HR+c"       # string literal compared in the preamble
Alt+I → 0x2853CEF2        # VERSION_XOR — first immediate in the dispatch path

Custom Base64 Alphabet

IonCube moves the digit characters to the front of the alphabet. A standard Base64 decoder accepts the input but silently produces garbage, because it maps every character through the standard table:

Alphabet comparison
ioncube:  0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz+/
standard: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
ioncube_b64decode Python reference
ALPHA = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz+/"
LUT   = {c: i for i, c in enumerate(ALPHA)}

def ioncube_b64decode(s: str) -> bytes:
    out = bytearray()
    for i in range(0, len(s), 4):
        a = LUT[s[i]];   b = LUT[s[i+1]]
        c = LUT[s[i+2]]; d = LUT[s[i+3]]
        n = (a << 18) | (b << 12) | (c << 6) | d
        out += bytes([(n >> 16) & 0xFF, (n >> 8) & 0xFF, n & 0xFF])
    return bytes(out)   # caller trims any trailing padding bytes

Decoded Payload Structure

Payload layout after Base64 decode
Offset  Size   Field
──────  ─────  ─────────────────────────────────────────────────────────
0x00    4      version_word        uint32 LE — XOR-verified first
0x04    24     meta_header_raw     escape-encoded  →  12 decoded bytes
0x1C    var    chunked_cipher      assembled up to header_size bytes
 +8     8      transition_block    [4 Adler-17 checksum] [4 reserved]
(var)   var    body                MT4IC-encrypted DEFLATE frames
Version word check VERSION_XOR = 0x2853CEF2 · accepted v15 container revisions
version = unpack32le(payload, 0) ^ 0x2853CEF2
if version not in HRC_VERSIONS:   # container revisions, not PHP ABI codes
    raise ValueError("wrong loader version or corrupt file")
🔍
The decoded word selects an accepted HR+c container revision. It does not identify PHP 8.1/8.2/8.3/8.4; that target code is stored in the decrypted header trailer.

02 Meta Header Decoding — Step by Step

Step 1 — Escape decode: 24 raw bytes → 12 decoded bytes

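The 24 bytes at offset 0x04 use a one-byte escape scheme in which 0xFF is the escape marker. An 0xFF followed by a byte with bit 7 set decodes to 0x3C; an 0xFF followed by any other byte decodes to a literal 0xFF; every other byte passes through unchanged. Walk the input one byte at a time and stop once 12 decoded bytes have been produced:

escape_decode
def escape_decode(raw: bytes, out_len: int | None = None) -> bytes:
    """Decode the 0xFF escape scheme; stop after out_len bytes when it is given."""
    out, i = bytearray(), 0
    while i < len(raw) and (out_len is None or len(out) < out_len):
        b = raw[i]; i += 1
        if b == 0xFF:
            # 0xFF is the escape marker: the following byte selects the real value
            nxt = raw[i]; i += 1
            out.append(0x3C if (nxt & 0x80) else 0xFF)
        else:
            out.append(b)
    return bytes(out)    # meta header: escape_decode(payload[4:28], 12)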

# Example — 6 raw bytes → 4 decoded bytes:
#   0x41        →  literal 'A'  → out: 0x41
#   0xFF 0x9F   →  0x9F & 0x80 ≠ 0  → out: 0x3C
#   0xFF 0x20   →  0x20 & 0x80 = 0  → out: 0xFF
#   0x42        →  literal 'B'  → out: 0x42

Step 2 — Parse the 12-byte meta header

Bytes      Type        Field             Role
────────   ─────────   ───────────────   ───────────────────────────────────────────────────
[0 : 4]    uint32 LE   raw_file_size     XOR-obfuscated logical file length
[4 : 8]    uint32 LE   raw_header_size   XOR-obfuscated encrypted header length
[8 : 12]   uint32 LE   seed              MT4IC PRNG seed; also mixed into both size formulas

Step 3 — Recover true sizes with XOR constants from sub_1009E160

Four constants unmask the real values. seed is intentionally folded into both formulas so the same raw bytes mean different things in every file — without seed you cannot decode either size:

Constant             Value
──────────────────   ────────────────────────
VERSION_XOR          0x2853CEF2
HRC_VERSIONS         4 accepted v15 revisions
FILE_SIZE_XOR        0x23958CDE
FILE_SIZE_OFFSET     12321 (0x3021)
HEADER_SIZE_XOR      0x184FF593
HEADER_SIZE_OFFSET   0x0C21672E
Size formulas sub_1009E160
# The fields are 32-bit unsigned; mask after each subtraction so Python wraps like uint32.
# file_size: XOR with seed and constant, then subtract offset
logical_file_size = ((raw_file_size ^ seed ^ FILE_SIZE_XOR) - FILE_SIZE_OFFSET) & 0xFFFFFFFF

# header_size: XOR constant first, subtract offset, then XOR seed
header_size       = (((raw_header_size ^ HEADER_SIZE_XOR) - HEADER_SIZE_OFFSET) & 0xFFFFFFFF) ^ seed
📷
[Figure: IDA Pro decompiler view of sub_1009E160 (G → 1009E160 → F5), showing the two XOR and subtract operations on raw_file_size and raw_header_size with the constants 0x23958CDE and 0x184FF593.]
sub_1009E160: PHP 8.1 reference meta-header decoder. The same size constants were verified in the 8.2–8.4 v15 loaders.
🔍
In IDA: Alt+I → 0x23958CDE. Both XOR constants and both subtraction offsets appear within ~10 decompiler lines of each other in sub_1009E160. Cross-referencing this function also reaches the chunk parser and the Adler-17 checksum reader.
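A quick way to validate a port of the size formulas is a round trip. The sketch below decodes the 12-byte meta header with the constants listed above and pairs it with encode_sizes, an inverse derived purely by algebra for testing. encode_sizes is an assumption of this example and is not taken from the encoder; both function names are hypothetical.

```python
import struct

M32 = 0xFFFFFFFF
FILE_SIZE_XOR = 0x23958CDE
FILE_SIZE_OFFSET = 12321
HEADER_SIZE_XOR = 0x184FF593
HEADER_SIZE_OFFSET = 0x0C21672E


def decode_sizes(meta12: bytes) -> tuple[int, int, int]:
    """12 escape-decoded bytes -> (logical_file_size, header_size, seed)."""
    raw_file_size, raw_header_size, seed = struct.unpack("<III", meta12)
    file_size = ((raw_file_size ^ seed ^ FILE_SIZE_XOR) - FILE_SIZE_OFFSET) & M32
    header_size = (((raw_header_size ^ HEADER_SIZE_XOR) - HEADER_SIZE_OFFSET) & M32) ^ seed
    return file_size, header_size, seed


def encode_sizes(file_size: int, header_size: int, seed: int) -> bytes:
    """Algebraic inverse of decode_sizes, used only to build test input."""
    raw_file_size = ((file_size + FILE_SIZE_OFFSET) & M32) ^ seed ^ FILE_SIZE_XOR
    raw_header_size = (((header_size ^ seed) + HEADER_SIZE_OFFSET) & M32) ^ HEADER_SIZE_XOR
    return struct.pack("<III", raw_file_size, raw_header_size, seed)


# The same sizes under two seeds produce different raw fields.
meta_a = encode_sizes(14832, 399, 0xA3F1C820)
meta_b = encode_sizes(14832, 399, 0x00000001)
```

Masking to 32 bits after the subtraction matters in Python, where integers do not wrap on their own.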

Step 4 — Chunk parser: assembling header_size bytes of ciphertext

Starting at payload offset 0x1C, variable-length chunks are consumed until exactly header_size bytes of ciphertext are collected. Each chunk begins with a 2-byte control word (flag, second):

Chunk parser — complete
def read_chunks(payload: bytes, pos: int, header_size: int):
    cipher = bytearray()
    while len(cipher) < header_size:
        flag   = payload[pos]
        second = payload[pos + 1]
        pos   += 2
        if flag & 0x80:
            # literal chunk: copy exactly `second` bytes verbatim
            chunk = payload[pos : pos + second]
            pos  += second
            if flag & 0x40:
                chunk = bytes(chunk) + b'\x3C'   # forced suffix byte
        else:
            # fixed chunk: always 0xE3 (227) bytes, `second` is unused
            chunk = payload[pos : pos + 0xE3]
            pos  += 0xE3
        cipher += chunk
    return bytes(cipher[:header_size]), pos   # trim any over-read
flag bits            Chunk type         Payload length   Suffix
──────────────────   ────────────────   ──────────────   ──────
bit7 = 0             Fixed              0xE3 bytes       none
bit7 = 1, bit6 = 0   Literal            second bytes     none
bit7 = 1, bit6 = 1   Literal + suffix   second bytes     0x3C
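The three chunk types can be exercised on a synthetic stream. In the sketch below, read_chunks repeats the parser from the listing above; literal_chunk and fixed_chunk are hypothetical builders written only for this test, and the 0x00 0x00 control word used for the fixed chunk is an assumption (the parser only requires bit 7 of the flag to be clear).

```python
def read_chunks(payload: bytes, pos: int, header_size: int):
    """Parser from the article: collect header_size bytes of ciphertext."""
    cipher = bytearray()
    while len(cipher) < header_size:
        flag, second = payload[pos], payload[pos + 1]
        pos += 2
        if flag & 0x80:                      # literal chunk of `second` bytes
            chunk = payload[pos:pos + second]
            pos += second
            if flag & 0x40:                  # forced 0x3C suffix
                chunk = bytes(chunk) + b"\x3C"
        else:                                # fixed chunk of 0xE3 bytes
            chunk = payload[pos:pos + 0xE3]
            pos += 0xE3
        cipher += chunk
    return bytes(cipher[:header_size]), pos


def literal_chunk(data: bytes, suffix: bool = False) -> bytes:
    """Test-only builder: control word plus verbatim bytes."""
    flag = 0x80 | (0x40 if suffix else 0x00)
    return bytes([flag, len(data)]) + data


def fixed_chunk(data: bytes) -> bytes:
    """Test-only builder: bit 7 clear, exactly 0xE3 payload bytes."""
    if len(data) != 0xE3:
        raise ValueError("fixed chunks carry exactly 0xE3 bytes")
    return bytes([0x00, 0x00]) + data


stream = (
    literal_chunk(b"AB")
    + literal_chunk(b"CD", suffix=True)
    + fixed_chunk(bytes(0xE3))
)
cipher, end = read_chunks(stream, 0, 2 + 3 + 0xE3)
```

The returned position is where the 8-byte transition block would begin in a real payload.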

Step 5 — Transition block and Adler-17 verification

Immediately after the last chunk an 8-byte transition block appears. The first 4 bytes are escape-decoded and compared against the Adler-style checksum of all assembled ciphertext. IonCube's variant seeds the running checksum with 17 instead of the standard Adler-32 value of 1, so the low 16-bit accumulator starts at 17 and the high accumulator starts at 0:

adler17 — sub_10074EF0 init = 17 (mov eax, 11h at function entry)
def adler17(data: bytes) -> int:
    low  = 17          # 17 & 0xFFFF = 17
    high = 0           # 17 >> 16    = 0
    for b in data:
        low  = (low  + b) % 0xFFF1
        high = (high + low) % 0xFFF1
    return low | (high << 16)

stored = unpack32le(escape_decode(transition_block[0:6])[:4])  # first 4 escape-decoded bytes
assert stored == adler17(ciphertext)
⚠️
Using init = 1 (standard Adler-32) fails on the tested v15 fixtures. The value 17 was read directly from the mov eax, 11h at the very top of sub_10074EF0 — it is the first instruction the function executes.
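Because only the starting value differs from Adler-32, the checksum can be cross-checked against Python's zlib.adler32, whose optional second argument is the starting checksum. This is a small sketch for validating a port, not a description of how the toolkit computes it.

```python
import zlib


def adler17(data: bytes) -> int:
    """Adler-style checksum with the running value seeded at 17 (low=17, high=0)."""
    low, high = 17, 0
    for b in data:
        low = (low + b) % 0xFFF1
        high = (high + low) % 0xFFF1
    return low | (high << 16)


sample = bytes(range(256)) * 40

# Same algorithm, same modulus: only the seed differs from standard Adler-32.
same_as_zlib = adler17(sample) == zlib.adler32(sample, 17)
differs_from_standard = adler17(sample) != zlib.adler32(sample)
```

If the byte-by-byte loop becomes a bottleneck on large inputs, zlib.adler32(data, 17) is a drop-in replacement.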

03 Header Encryption: MT4IC (PRNG Type 5)

The assembled ciphertext is decrypted with MT4IC, IonCube's "type 5" PRNG. A linear congruential generator (state t1) and an xorshift (state t2) together fill a 4096-entry ring buffer, which is then stepped as a Complementary Multiply-With-Carry (CMWC) generator.

Identifying MT4IC in IDA

Three constants uniquely fingerprint sub_10090F30:

MT4IC fingerprint constants
COUNT      = 0x1000   # ring size → alloc of 0x4000 bytes (4096 × uint32)
MULTIPLIER = 0x10DCD  # 69069: the classic LCG multiplier popularised by Marsaglia
CMWC_MULT  = 0x495E   # 18782 — CMWC carry multiplier

# IDA searches to locate the functions:
#   Alt+I → 0x10DCD   used by the initialiser (sub_10090F30) for t1 seeding and the ring fill
#   Alt+I → 0x495E    used by get() (sub_10091060) and by the carry init in sub_10090F30
#   malloc/VirtualAlloc site with argument 0x4000 → the ring buffer allocation

MT4IC Initialization — sub_10090F30

The initialiser sets up three independent state components, then fills the ring buffer. The parity of seed selects between two xorshift variants for the ring fill; missing this branch is the most common implementation mistake:

MT4IC.__init__ — complete sub_10090F30 · 0x10090F30
COUNT      = 0x1000    # 4096 ring entries
MULTIPLIER = 0x10DCD   # 69069 LCG multiplier
CMWC_MULT  = 0x495E    # 18782

def MT4IC_init(seed: int):
    # ── t1: LCG — one warm-up step before the ring fill ──────────────
    t1 = (seed * MULTIPLIER + 0x12D687) & 0xFFFFFFFF

    # ── t2: xorshift — (seed % 9) warm-up rounds ─────────────────────
    t2 = seed
    for _ in range(seed % 9):
        t2 ^= (t2 << 10) & 0xFFFFFFFF
        t2 ^=  t2 >> 15
        t2 ^= (t2 <<  4) & 0xFFFFFFFF
        t2 ^=  t2 >> 13

    # ── carry: initialised as seed modulo CMWC_MULT ──────────────────
    carry  = seed % CMWC_MULT
    parity = seed & 1         # 0 = even, 1 = odd
    ring   = [0] * COUNT

    # ── ring buffer fill — parity selects xorshift variant ───────────
    for i in range(COUNT):
        t1 = (t1 * MULTIPLIER + 0x7B) & 0xFFFFFFFF
        if parity:            # odd seed: 3-step xorshift
            t2 ^= (t2 << 13) & 0xFFFFFFFF
            t2 ^=  t2 >> 17
            t2 ^= (t2 <<  5) & 0xFFFFFFFF
        else:                 # even seed: 3-step xorshift (different constants)
            t2 ^=  t2 >> 9
            t2 ^= (t2 <<  1) & 0xFFFFFFFF
            t2 ^=  t2 >> 7
        ring[i] = (t1 + t2) & 0xFFFFFFFF

    return ring, carry        # index starts at 0
⚠️
The ring-fill xorshift is not the same as the warm-up xorshift. The warm-up uses <<10, >>15, <<4, >>13 (4-step); the ring fill uses <<13, >>17, <<5 (odd seed) or >>9, <<1, >>7 (even seed), both 3-step sequences with completely different constants. A clone that copies the warm-up into the ring fill silently decrypts to garbage for every seed value.
[Figure: IDA Pro decompiler view of sub_10090F30, the MT4IC (PRNG type 5) initialisation.]
sub_10090F30 · 0x10090F30: MT4IC PRNG type 5 initialization. Line 16: a2[2] = 69069 * a1 + 1234567 seeds t1 (MULTIPLIER = 0x10DCD, 1234567 = 0x12D687). Lines 20–26: xorshift warm-up for seed % 9 iterations. Line 29: carry = seed % 0x495E. The parity branch for the ring fill appears further down the function body.

MT4IC get() — CMWC step — sub_10091060

MT4IC.get() sub_10091060 · 0x10091060
def get(self) -> int:
    self.index  = (self.index + 1) % COUNT
    product     = self.ring[self.index] * CMWC_MULT + self.carry   # 64-bit product
    lo, hi      = product & 0xFFFFFFFF, (product >> 32) & 0xFFFFFFFF
    folded      = (lo + hi) & 0xFFFFFFFF
    if folded < hi:          folded = (folded + 1) & 0xFFFFFFFF; hi += 1
    if folded == 0xFFFFFFFF: folded = 0;                         hi += 1
    self.carry              = hi & 0xFFFFFFFF
    value                   = (0xFFFFFFFE - folded) & 0xFFFFFFFF
    self.ring[self.index]   = value
    return value
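The initialiser and get() above can be combined into one class, which is the shape the later MT4IC(seed) calls assume. This is a sketch assembled from the two listings in this section; the keystream helper is a hypothetical convenience, and no output values are asserted here because they can only be validated against real fixtures.

```python
COUNT = 0x1000         # 4096 ring entries
MULTIPLIER = 0x10DCD   # 69069
CMWC_MULT = 0x495E     # 18782
M32 = 0xFFFFFFFF


class MT4IC:
    def __init__(self, seed: int):
        seed &= M32
        t1 = (seed * MULTIPLIER + 0x12D687) & M32      # LCG warm-up step
        t2 = seed
        for _ in range(seed % 9):                      # 4-step xorshift warm-up
            t2 ^= (t2 << 10) & M32
            t2 ^= t2 >> 15
            t2 ^= (t2 << 4) & M32
            t2 ^= t2 >> 13
        self.carry = seed % CMWC_MULT
        self.index = 0
        self.ring = []
        odd = seed & 1
        for _ in range(COUNT):
            t1 = (t1 * MULTIPLIER + 0x7B) & M32
            if odd:                                    # odd seed variant
                t2 ^= (t2 << 13) & M32
                t2 ^= t2 >> 17
                t2 ^= (t2 << 5) & M32
            else:                                      # even seed variant
                t2 ^= t2 >> 9
                t2 ^= (t2 << 1) & M32
                t2 ^= t2 >> 7
            self.ring.append((t1 + t2) & M32)

    def get(self) -> int:
        self.index = (self.index + 1) % COUNT
        product = self.ring[self.index] * CMWC_MULT + self.carry
        lo, hi = product & M32, (product >> 32) & M32
        folded = (lo + hi) & M32
        if folded < hi:
            folded = (folded + 1) & M32
            hi += 1
        if folded == M32:
            folded = 0
            hi += 1
        self.carry = hi & M32
        value = (0xFFFFFFFE - folded) & M32
        self.ring[self.index] = value
        return value

    def keystream(self, n: int) -> bytes:
        """Hypothetical helper: the low byte of n successive get() calls."""
        return bytes(self.get() & 0xFF for _ in range(n))
```

Two instances built from the same seed must yield identical streams; that determinism is the property the header decryption relies on.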

Header decryption formula

The last 16 bytes of the ciphertext serve a dual purpose: they hold the MD4 digest of the plaintext with each byte rotated right by 3 bits (ROR3), and the recovered digest doubles as a 16-byte repeating XOR mask. Rotating each byte left by 3 bits (ROL3) recovers both at once:

Header decrypt — complete
def rol8(b: int, n: int = 3) -> int:
    """Rotate byte left by n bits."""
    return ((b << n) | (b >> (8 - n))) & 0xFF

# Step 1: derive XOR mask from last 16 ciphertext bytes
mask = bytes(rol8(b) for b in ciphertext[-16:])  # ROL3 each byte

# Step 2: decrypt all bytes except the last 16
prng      = MT4IC(seed)
plaintext = bytes(
    ciphertext[i] ^ mask[i & 0xF] ^ (prng.get() & 0xFF)
    for i in range(len(ciphertext) - 16)
)

# Step 3: MD4 integrity gate
# The mask IS the MD4 digest — stored as ROR3(digest) in the ciphertext tail,
# recovered here as ROL3. Wrong seed → wrong PRNG stream → wrong plaintext → wrong digest.
import hashlib
assert hashlib.new('md4', plaintext).digest() == mask
💡
MD4 is chosen for speed, not cryptographic strength — IonCube needs a fast tamper-detection gate, not collision resistance. A wrong seed is expected to fail the MD4 check, giving a strong binary pass/fail signal for any implementation to validate itself against.
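One practical caveat: hashlib.new('md4') depends on the OpenSSL build behind Python, and on builds that do not expose legacy digests it raises ValueError. The following is a minimal pure-Python MD4 sketch following RFC 1320 that can stand in for the integrity gate in that case. It is an addition for portability, not the toolkit's own implementation, and the function name md4 is chosen only to match the call in the pipeline listing.

```python
import struct

_M32 = 0xFFFFFFFF
_R2 = (0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15)   # round 2 word order
_R3 = (0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15)   # round 3 word order


def _rol32(x: int, s: int) -> int:
    x &= _M32
    return ((x << s) | (x >> (32 - s))) & _M32


def md4(data: bytes) -> bytes:
    """MD4 digest (RFC 1320) of data, returned as 16 raw bytes."""
    msg = bytearray(data)
    msg.append(0x80)                                   # padding: 1 bit, then zeros
    while len(msg) % 64 != 56:
        msg.append(0)
    msg += struct.pack("<Q", (len(data) * 8) & 0xFFFFFFFFFFFFFFFF)

    a, b, c, d = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    for off in range(0, len(msg), 64):
        x = struct.unpack_from("<16I", msg, off)
        aa, bb, cc, dd = a, b, c, d
        for i in range(16):                            # round 1: F = (b & c) | (~b & d)
            f = (b & c) | (~b & d)
            a = _rol32(a + f + x[i], (3, 7, 11, 19)[i % 4])
            a, b, c, d = d, a, b, c                    # rotate register roles
        for i in range(16):                            # round 2: majority function
            g = (b & c) | (b & d) | (c & d)
            a = _rol32(a + g + x[_R2[i]] + 0x5A827999, (3, 5, 9, 13)[i % 4])
            a, b, c, d = d, a, b, c
        for i in range(16):                            # round 3: parity function
            h = b ^ c ^ d
            a = _rol32(a + h + x[_R3[i]] + 0x6ED9EBA1, (3, 9, 11, 15)[i % 4])
            a, b, c, d = d, a, b, c
        a, b = (a + aa) & _M32, (b + bb) & _M32
        c, d = (c + cc) & _M32, (d + dd) & _M32
    return struct.pack("<4I", a, b, c, d)
```

The test vectors published in RFC 1320 are a convenient self-check before trusting the digest as a pass/fail gate.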

04 Decrypted Header: Keys and Flags

After MD4 verification, plaintext is the fully decoded header. It starts with a fixed 20-byte prefix of five uint32 fields, followed by a variable-length private blob (opaque, per-file), the key slot, and a fixed 40-byte trailer at the end:

Region            Offset            Size           Field               Notes
───────────────   ───────────────   ────────────   ─────────────────   ──────────────────────────────────────────────────────────────
Initial fields    0x00              4              version             uint32 LE; format version (e.g. 6 for v15.5.0)
                  0x04              4              min_loader_ver      uint32 LE; minimum loader version required
                  0x08              4              obfuscation_flags   uint32 LE; per-file obfuscation control bits
                  0x0C              4              private_tag         uint32 LE; opaque tag identifying the private blob
                  0x10              4              private_size        uint32 LE; byte length of the private blob that follows
Private blob      0x14              private_size   (private data)      Opaque; content varies per encoded file. private_start = 20, private_end = 20 + private_size
Key slot          private_end       4              bytecode_xor_key    uint32 LE; alias request_key; XOR'd into PRNG6 key stream (§9)
                  private_end + 4   4              owner_key           uint32 LE; per-installation owner identifier
Fixed trailer     end − 40          2              reserved
(last 40 bytes)   end − 38          2              php_version_code    uint16; decimal 81, 82, 83, or 84
                  end − 36          4              php_flags           uint32 bitmask; controls opcode-XOR, line suppression, etc.
                  end − 32          4              encoder_version     4 bytes: gen · major · minor · rev (e.g. 0F 05 00 00 = v15.5)
Field extraction
# Fixed prefix: 5 × uint32 = 20 bytes
version           = read_u32(plaintext,  0)
min_loader_ver    = read_u32(plaintext,  4)
obfuscation_flags = read_u32(plaintext,  8)
private_tag       = read_u32(plaintext, 12)
private_size      = read_u32(plaintext, 16)

private_start     = 20
private_end       = private_start + private_size   # dynamic — varies per file

bytecode_xor_key  = read_u32(plaintext, private_end)      # uint32 request_key
owner_key         = read_u32(plaintext, private_end + 4)

trailer           = plaintext[-40:]
php_version_code  = read_u16(trailer, 2)           # 81..84 = PHP 8.1..8.4
php_flags         = read_u32(trailer, 4)
encoder_version   = trailer[8:12]                  # e.g. b'\x0f\x05\x00\x00' → v15.5.0

php_flags bitmask

Mask     Effect when set
──────   ───────────────────────────────────────────────────────────────────────────────
0x2C80   Opcode-XOR layer active: PRNG6 + C3D0/B3C0 processing required for every method
0x0800   Line-number field suppressed in every opcode word
💡
private_end = 20 + private_size must be computed dynamically — its value varies per encoded file. The fixed prefix is always 20 bytes (5 × uint32); only the private blob length differs. Never hardcode an absolute offset to reach bytecode_xor_key. In the sample.php fixture the extracted value is 0x0036F936.
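To see why the key offset has to be computed per file, the sketch below parses two synthetic headers that differ only in private_size. The read_u16/read_u32 helpers are written out because the listing above uses them without defining them. synthetic_header and the overlap check are assumptions of this example, and the zero-filled blob is a stand-in for real private data.

```python
import struct


def read_u16(buf: bytes, off: int) -> int:
    return struct.unpack_from("<H", buf, off)[0]


def read_u32(buf: bytes, off: int) -> int:
    return struct.unpack_from("<I", buf, off)[0]


def parse_header(plaintext: bytes) -> dict:
    """Extract the key slot and trailer fields using the dynamic private_end."""
    private_size = read_u32(plaintext, 16)
    private_end = 20 + private_size
    # Sanity check (assumption): the 8-byte key slot must end before the trailer.
    if private_end + 8 > len(plaintext) - 40:
        raise ValueError("private blob or key slot overlaps the 40-byte trailer")
    trailer = plaintext[-40:]
    return {
        "version": read_u32(plaintext, 0),
        "private_end": private_end,
        "bytecode_xor_key": read_u32(plaintext, private_end),
        "owner_key": read_u32(plaintext, private_end + 4),
        "php_version_code": read_u16(trailer, 2),
        "php_flags": read_u32(trailer, 4),
        "encoder_version": tuple(trailer[8:12]),
    }


def synthetic_header(private_size: int, key: int, version_code: int = 81) -> bytes:
    """Test-only builder following the layout table; blob content is all zeros."""
    prefix = struct.pack("<5I", 6, 0, 0, 0, private_size)
    keys = struct.pack("<II", key, 0)
    trailer = struct.pack("<HHI4B", 0, version_code, 0x2C80, 15, 5, 0, 0) + bytes(28)
    return prefix + bytes(private_size) + keys + trailer


short = parse_header(synthetic_header(0, 0x0036F936))
longer = parse_header(synthetic_header(37, 0x0036F936))
```

Both parses return the same key even though it sits at offset 20 in one header and offset 57 in the other.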

T Toolchain: Source Code & Binary Artifact Extraction

The complete path from the HR+c header to normalized static opcode IR is implemented as a small Python 3.11+ toolchain. The input is always parsed as data; sample.php is never loaded or executed by PHP. Core container and body decoding use only the standard library. pefile is required when a loader DLL is supplied for handler-table and interned-string resolution.

Maintained source files

File                             Role
──────────────────────────────   ──────────────────────────────────────────────────────────────────────────────
decode_hrc_header.py             Container stage: HR+c decode, MT4IC, MD4, header parse, and body extraction
decode_php_body.py               PHP 8.1–8.4 B180 parser, PRNG6, C3D0/B3C0, handler profiles, interned strings, and dynamic-key records
loader_static_to_icdump_ir.py    Normalize reverse-engineering details into consumer-neutral icdump-ir-v1 JSON
dump_plain_file_static.py        Public entry point: orchestrates the complete file dump and writes JSON plus a readable opcode listing
ioncube_loader_extractor.py      Extract shared container constants and flag body-profile fields requiring loader-specific analysis
ida_qo9_materialize_strings.py   IDA Pro script: decodes all Qo9 encoded strings in the loader IDB and writes them into a navigable QO9STR segment with callsite comments and xrefs

Available loader DLLs

Target ABI   Loader artifact              Size       Use with
──────────   ──────────────────────────   ────────   ────────────────
PHP 8.1      ioncube_loader_win_8.1.dll   1.71 MiB   --php-version 81
PHP 8.2      ioncube_loader_win_8.2.dll   1.72 MiB   --php-version 82
PHP 8.3      ioncube_loader_win_8.3.dll   1.73 MiB   --php-version 83
PHP 8.4      ioncube_loader_win_8.4.dll   1.79 MiB   --php-version 84

Select the DLL whose minor version matches php_version_code in the protected file. The dumper infers this ABI from the header and rejects a recognizably mismatched ioncube_loader_win_8.x.dll filename.

Step 0 — Fingerprint the loader DLL

Before decoding files from a new loader build, run ioncube_loader_extractor.py directly against the DLL. It contains a minimal PE parser, uses only the Python standard library, and does not require IDA. The script locates the version-dispatch chain and the encoded file/header-size formulas, then prints constants ready to compare with decode_hrc_header.py.

Loader fingerprint ioncube_loader_extractor.py
python ioncube_loader_extractor.py ioncube_loader_win_8.x.dll

[1] VERSION DISPATCH
  VERSION_XOR  = 0x2853CEF2
  HRC_VERSIONS = { ... four accepted revisions ... }

[2] FILE SIZE CONSTANTS
  FILE_SIZE_XOR    = 0x23958CDE
  FILE_SIZE_OFFSET = 12321

[3] HEADER SIZE CONSTANTS
  HEADER_SIZE_XOR    = 0x184FF593
  HEADER_SIZE_OFFSET = 0x0C21672E

[4] BODY CONSTANTS
  VARIANT_TABLES       = TODO
  OPCODE_HANDLER_META  = TODO
  OPCODE_META_ID       = TODO
  TYPE_DIMENSION_TABLE = TODO
  GLOBAL_FEATURE_WORD  = TODO
  STRING_XOR_KEY       = TODO
  STRING_POINTER_TABLE = TODO
On the tested 8.1–8.4 v15.5.0 DLLs, the container constants above are identical. The TODO fields are intentionally not guessed: their addresses move between DLLs and must be recovered from the matching body-decoder, handler, and interned-string paths in IDA before adding a new profile.

Step 1 — Decode the file and extract binary artifacts

decode_hrc_header.py implements the complete container stage. Running it produces a JSON report on stdout and optionally writes the decrypted header bytes and inflated body to disk for inspection:

CLI usage — decode + dump decode_hrc_header.py
# Inspect the decoded header and inflated body independently
python decode_hrc_header.py sample.php \
    --header-out header.bin \
    --body-out   body.bin

# What you get:
#   header.bin  —  decrypted, MD4-verified header  (≈ 399 bytes for v15.5.0)
#   body.bin    —  inflated PHP function records (plain binary stream)
#   stdout      —  JSON report with every field decoded

Step 2 — Decode every op array and build the static IR

The public entry point repeats the verified header/body decode internally, parses the main script and function records, resolves operands and jump targets, then emits both the human-readable opcode dump and normalized, consumer-neutral IR.

Full static opcode dump dump_plain_file_static.py
python dump_plain_file_static.py sample.php \
    --loader-dll path/to/ioncube_loader_win_8.x.dll \
    --out-dir sample_dump

# Output:
#   sample_dump/sample.icdump.json  — icdump-ir-v1
#   sample_dump/sample.icdump.txt   — readable opcode listing
#   stdout must end with: done: N decoded, 0 failed

The target ABI is read from the header; an explicit --php-version must agree with it, and the supplied loader DLL must match. The maintained toolchain stops at static IR.

Step 3 — Reconstruct readable PHP

Source reconstruction lives in a separate companion project. It consumes the generated icdump-ir-v1, builds basic blocks and a control-flow graph, lowers the result to a PHP AST, and emits formatted PHP source.

Companion project on GitHub: dawwinci/ioncube-php8-decompiler (PHP 8.1–8.4 opcode IR → CFG → PHP AST → readable PHP source)
Static IR to PHP source php-reconstruct
php-reconstruct sample_dump/sample.icdump.json \
    -o sample_dump/sample.reconstructed.php

Key code snapshots from decode_hrc_header.py

① Custom Base64 decode

The IonCube alphabet reorders digits to the front. The implementation translates to the standard alphabet first, then delegates to Python's base64 module — no manual bit-twiddling needed:

decode_custom_base64 decode_hrc_header.py:46
import base64

CUSTOM_B64   = b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
STANDARD_B64 = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"

def decode_custom_base64(data: bytes) -> bytes:
    compact = b"".join(data.split())               # strip whitespace / newlines
    translation = bytes.maketrans(CUSTOM_B64, STANDARD_B64)
    return base64.b64decode(compact.translate(translation), validate=True)
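The translate-based decoder and the manual reference decoder from §01 should agree byte for byte. The sketch below checks that on generated input. encode_custom is a hypothetical inverse written only to produce test data, and the input length is kept a multiple of three because the manual decoder does not handle '=' padding.

```python
import base64

ALPHA = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz+/"
STD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
LUT = {c: i for i, c in enumerate(ALPHA)}


def decode_manual(s: str) -> bytes:
    """Reference decoder: four symbols -> three bytes, no padding support."""
    out = bytearray()
    for i in range(0, len(s), 4):
        n = (LUT[s[i]] << 18) | (LUT[s[i + 1]] << 12) | (LUT[s[i + 2]] << 6) | LUT[s[i + 3]]
        out += bytes([(n >> 16) & 0xFF, (n >> 8) & 0xFF, n & 0xFF])
    return bytes(out)


def decode_translated(data: bytes) -> bytes:
    """Translate to the standard alphabet, then let the base64 module decode."""
    table = bytes.maketrans(ALPHA.encode(), STD.encode())
    return base64.b64decode(data.translate(table), validate=True)


def encode_custom(raw: bytes) -> str:
    """Test-only inverse: standard Base64, then map into the custom alphabet."""
    return base64.b64encode(raw).decode().translate(str.maketrans(STD, ALPHA))


raw = bytes(range(48))                      # 48 is a multiple of 3, so no padding
text = encode_custom(raw)
marker_bytes = decode_manual("HR+c")        # the marker is part of the payload
```

Since the marker is decoded with everything else, the first three payload bytes are fixed by the four marker characters, which in turn fixes the low 24 bits of the little-endian version_word.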

② Escape decode: 24 raw → 12 header bytes

decode_escaped_block decode_hrc_header.py:52
def decode_escaped_block(data: bytes, output_size: int) -> bytes:
    output = bytearray()
    position = 0
    while len(output) < output_size:
        value = data[position]; position += 1
        if value == 0xFF:
            value = 0x3C if data[position] & 0x80 else 0xFF
            position += 1
        output.append(value)
    return bytes(output)

# Called as: decoded = decode_escaped_block(payload[4:28], 12)
# Result: struct.unpack("<III", decoded)  →  (raw_file_size, raw_header_size, seed)

③ Adler-17 checksum (init = 17)

loader_checksum / update_loader_checksum decode_hrc_header.py:69
def update_loader_checksum(checksum: int, data: bytes) -> int:
    low  = checksum & 0xFFFF
    high = checksum >> 16
    position = 0
    while position < len(data):
        end = min(position + 5552, len(data))
        for value in data[position:end]:
            low  += value
            high += low
        low  %= 0xFFF1
        high %= 0xFFF1
        position = end
    return low | (high << 16)

def loader_checksum(data: bytes) -> int:
    return update_loader_checksum(17, data)   # init = 17, NOT standard Adler-32's 1

④ MT4IC: the parity branch in _round_t2

This is the most critical implementation detail. The warm-up uses a different xorshift from the ring-fill, and the ring-fill itself has two variants:

MT4IC._round_t2 — parity-selected xorshift decode_hrc_header.py:196
def _round_t2(self, value: int) -> int:
    if self.odd_seed:           # seed & 1 == 1
        value = u32(value ^ u32(value << 13))
        value = u32(value ^ (value >> 17))
        return u32(value ^ u32(value << 5))
    # seed & 1 == 0
    value = u32(value ^ (value >> 9))
    value = u32(value ^ u32(value << 1))
    return u32(value ^ (value >> 7))

# Compare with the warm-up (seed % 9 rounds before ring fill):
#   t2 ^= (t2 << 10); t2 ^= (t2 >> 15); t2 ^= (t2 << 4); t2 ^= (t2 >> 13)
#   ← 4-step, completely different constants.
# A clone that copies the warm-up into the ring fill is wrong for every seed.

⑤ Full header decode pipeline — decode_file

The complete pipeline in one function — Base64 → version check → meta header → chunks → Adler-17 → MT4IC decrypt → MD4 verify → field parse:

decode_file — pipeline core decode_hrc_header.py:565
def decode_file(path: Path) -> tuple[dict, bytes]:
    source = path.read_bytes()
    marker_offset = source.find(b"HR+c")           # ① locate marker
    payload = decode_custom_base64(source[marker_offset:])  # ② custom b64

    raw_version = read_u32(payload, 0)
    version = raw_version ^ VERSION_XOR             # ③ version check
    if version not in HRC_VERSIONS:
        raise ValueError(f"unexpected HR+c version 0x{version:08X}")

    meta = decode_meta_header(payload[4:28])        # ④ escape-decode + sizes
    encrypted_header, consumed, chunks = unchunk_header(
        payload[28:], meta.header_size              # ⑤ chunk assembly
    )
    stored_ck = read_u32(decode_escaped_block(transition, 4), 0)
    assert stored_ck == loader_checksum(payload[4:transition_offset])  # ⑥ Adler-17

    checksum = bytes(rol8(v) for v in encrypted_header[-16:])  # ⑦ ROL3 mask
    prng = MT4IC(meta.seed)
    decrypted = bytes(                              # ⑧ MT4IC XOR decrypt
        encrypted_header[i] ^ checksum[i & 0xF] ^ (prng.get() & 0xFF)
        for i in range(meta.header_size - 16)
    )
    assert md4(decrypted) == checksum              # ⑨ MD4 integrity gate

    initial = parse_initial_header(decrypted)      # ⑩ extract key fields
    trailer  = parse_header_trailer(decrypted)
    return report, decrypted

⑥ Extracting bytecode_xor_key from the decrypted header

parse_initial_header decode_hrc_header.py:463
def parse_initial_header(data: bytes) -> InitialHeader:
    version           = read_u32(data, 0)
    private_tag       = read_u32(data, 12)
    private_size      = read_u32(data, 16)
    private_start     = 20
    private_end       = private_start + private_size   # priv_end is dynamic

    bytecode_xor_key  = read_u32(data, private_end)    # ← the request_key
    owner_key         = read_u32(data, private_end + 4)

    # In sample.php: bytecode_xor_key = 0x0036F936

⑦ Full trailer — 40 bytes at end of decrypted header

parse_header_trailer decode_hrc_header.py:522
def parse_header_trailer(data: bytes) -> HeaderTrailer:
    trailer = data[-40:]
    php_format_id, php_version_code = struct.unpack_from("<HH", trailer, 0)
    # php_version_code: 81..84 → "8.1".."8.4" (integer, not ASCII)

    return HeaderTrailer(
        php_format_id     = php_format_id,
        php_version_code  = php_version_code,          # target ABI profile
        php_flags         = read_u32(trailer, 4),       # 0x2C80 = opcode-XOR on
        encoder_generation= trailer[8],                 # 15
        encoder_major     = trailer[9],                 # 5
        encoder_minor     = trailer[10],                # 0
        ip_address        = ".".join(str(v) for v in trailer[20:24]),
        mac_address       = ":".join(f"{v:02x}" for v in trailer[24:30]),
        is_demo           = bool(trailer[30]),
        ...
    )

⑧ Sample JSON report (abbreviated)

stdout — JSON report excerpt python decode_hrc_header.py sample.php
{
  "raw_version":   "0x67A6BF45",
  "decoded_version": "0x4FF571B7",           // accepted HRC revision ✓
  "meta_header": {
    "seed":              "0xA3F1C820",
    "logical_file_size": 14832,
    "header_size":       399
  },
  "transition_checksum_matches": true,        // Adler-17 ✓
  "md4_matches": true,                        // MT4IC + MD4 ✓
  "initial_header": {
    "bytecode_xor_key": "0x0036F936",         // request_key used in §9 C3D0
    "version": 6
  },
  "header_trailer": {
    "php_version_code":  81,
    "inferred_php_version": "8.1",
    "php_flags": "0x00002C80",                // opcode-XOR layer active
    "encoder_generation": 15,
    "encoder_major":      5
  },
  "body": {
    "compressed_size":   9841,
    "decompressed_size": 41376,
    "all_checksums_match": true               // body Adler-17 frames ✓
  }
}
💡
All three correctness gates must pass together: transition_checksum_matches (Adler-17 on chunk stream), md4_matches (MT4IC decrypt integrity), and body.all_checksums_match (Adler-17 on DEFLATE frames). A wrong constant normally fails at the nearest gate, making these useful diagnostic markers during porting.

IDA Pro: ida_qo9_materialize_strings.py

The IonCube loader encodes every internal string constant using a proprietary length-prefixed XOR scheme called Qo9. Raw encoded bytes sitting in .rdata and .data are opaque to IDA's built-in string scanner — they never appear in the Strings window and carry no useful name. ida_qo9_materialize_strings.py reverses the encoding and writes the results directly into the open IDB so they become first-class IDA citizens.

What Qo9 encoding looks like

Two variants exist in the loader, distinguished by how the length byte is stored:

Codec          Length byte     Key size               Terminator
────────────   ─────────────   ────────────────────   ─────────────────────────────
qo9_16         plain raw[0]    16-byte rotating key   encoded NUL at raw[len+1]
qo9_32_xor48   raw[0] ^ 0x48   32-byte rotating key   none (no trailing terminator)

Both variants XOR each payload byte against a position-indexed key: out[i] = KEY[(offset + i) & mask] ^ raw[i+1]. The key bytes were recovered by reversing ic_qo9_decode_len16 (sub_44B747) and ic_qo9_decode_xor48_32 (sub_44B875) in IDA.
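The formula above translates almost directly into code. The sketch below is illustrative only: the key bytes are placeholders and NOT the loader's keys, the meaning of offset is left as a caller-supplied parameter because the text does not define it, and qo9_encode is a hypothetical inverse used only for round-trip testing (it does not append the qo9_16 terminator).

```python
PLACEHOLDER_KEY16 = bytes(range(16))       # placeholder, not the real qo9_16 key
PLACEHOLDER_KEY32 = bytes(range(32))       # placeholder, not the real qo9_32 key


def qo9_decode(raw: bytes, key: bytes, length_xor: int = 0, offset: int = 0) -> bytes:
    """out[i] = KEY[(offset + i) & mask] ^ raw[i + 1], with the length in raw[0]."""
    mask = len(key) - 1                    # key sizes 16 and 32 are powers of two
    length = raw[0] ^ length_xor           # 0 for qo9_16, 0x48 for qo9_32_xor48
    return bytes(key[(offset + i) & mask] ^ raw[i + 1] for i in range(length))


def qo9_encode(text: bytes, key: bytes, length_xor: int = 0, offset: int = 0) -> bytes:
    """Test-only inverse of qo9_decode."""
    mask = len(key) - 1
    body = bytes(key[(offset + i) & mask] ^ b for i, b in enumerate(text))
    return bytes([len(text) ^ length_xor]) + body


blob16 = qo9_encode(b"dynamic_key", PLACEHOLDER_KEY16)
blob32 = qo9_encode(b"rijndael", PLACEHOLDER_KEY32, length_xor=0x48)
```

With the real keys substituted, the same two calls cover both codecs in the table.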

What the script does to the IDB

  1. Scans .text, .rdata, and .data for valid Qo9 blobs (both codecs).
  2. Creates a new IDA segment named QO9STR at the end of the address space.
  3. Writes each decoded string there as a NUL-terminated C string and defines it with idc.create_strlit().
  4. Names each string: qo9_<orig_ea>_<codec>_<text>.
  5. Adds a repeatable comment on the original encoded location: DECODED_QO9 qo9_16 → QO9STR:0x…: "dynamic_key".
  6. Adds a comment on every callsite that had a data-ref to the encoded blob: DECODED_QO9: "dynamic_key".
  7. Adds data xrefs from each callsite to the decoded string in QO9STR.
  8. Calls ida_strlist.build_strlist() so all decoded strings appear immediately in the IDA Strings window.

How to run it

The script requires the loader DLL to already be open in IDA Pro. Paste the following line into the IDA Python console (File → Script command… or the bottom bar):

IDA Python console
# One-liner — run from IDA Python console
exec(open(r"path\to\ida_qo9_materialize_strings.py", encoding="utf-8").read()); main()
⚠️
A plain exec(open(...).read()) without calling main() only defines the functions — it does not run the scan.

Output — what you see in IDA after running

Where in IDA                              What appears
───────────────────────────────────────   ──────────────────────────────────────────────────────────────────────────
View → Open Subviews → Strings            All decoded strings listed under segment QO9STR, navigable by double-click
Disassembly: encoded blob address         Repeatable comment: DECODED_QO9 qo9_16 → QO9STR:0x…: "rijndael"
Disassembly: every callsite with a dref   Regular comment: DECODED_QO9: "rijndael"
Xrefs (X key) on any callsite             Cross-reference to the decoded string in QO9STR
JSON output file                          ida_qo9_materialized.json: full record per string (orig address, codec, text, xrefs)

Result on ioncube_loader_win_8.x.dll

Running against a matching PHP 8.x loader with both codecs enabled recovers encoded strings across .text, .rdata, and .data; the total varies between loader builds. Strings surfaced include cipher names (rijndael, blowfish, cast, des), key-mode identifiers (dynamic, basic, random), PHP extension entry-points, error message templates, and internal state labels — all previously invisible to IDA's string scanner.

💡
After running the script, searching the Strings window for dynamic, cipher, or seed immediately surfaces the functions that implement dynamic-key selection and body cipher dispatch — the same functions described in §10. The decoded strings collapse hours of manual xref tracing into a few double-clicks.

05 Body Decryption: Framed DEFLATE Stream

The body opens with two 32-bit seeds (primary_seed, secondary_seed), followed by a sequence of frames. A separate MT4IC instance, seeded from primary_seed, decrypts the frames:

Body frame loop
prng = MT4IC(primary_seed)
compressed = bytearray()

while pos < len(payload):
    flag, second = payload[pos], payload[pos + 1]
    pos += 2

    if flag < 0x80:
        # encrypted chunk: XOR `second` bytes with MT4IC byte stream
        for b in payload[pos : pos + second]:
            compressed.append(b ^ (prng.get() & 0xFF))
        pos += second

    elif (flag & 0xE0) == 0xA0:
        # Adler checksum marker — 4 escape-decoded bytes (verification point)
        stored = decode_escaped(payload[pos - 1:])
        assert stored == running_checksum

    elif (flag & 0xE0) == 0x80:
        compressed.append(second)   # literal byte pass-through

    elif (flag & 0xE0) == 0xC0:
        compressed.append(0x3C)     # literal '<'

body = zlib.decompress(bytes(compressed), wbits=-15)   # raw DEFLATE (no zlib header)
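The last line of the frame loop relies on a negative wbits value, which tells Python's zlib module to expect a headerless (raw) DEFLATE stream. The round trip below is a small sketch of that behaviour only; `deflate_raw` and `inflate_raw` are illustrative helper names and the sample bytes are made up.

```python
import zlib


def deflate_raw(data: bytes) -> bytes:
    """Compress to raw DEFLATE (no zlib header, no Adler-32 trailer)."""
    comp = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=-15)
    return comp.compress(data) + comp.flush()


def inflate_raw(data: bytes) -> bytes:
    """Inflate a raw DEFLATE stream, as the body decoder does."""
    return zlib.decompress(data, wbits=-15)


sample = b"function record " * 8
packed = deflate_raw(sample)
print(len(sample), "->", len(packed), "bytes")
print(inflate_raw(packed) == sample)
```

With the default wbits, zlib.decompress expects a two-byte zlib header and would reject a raw stream, so the explicit wbits=-15 matters.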

06 The Inflated Body: Function Record Layout

The inflated body is a flat byte array containing one serialized record per PHP function/method:

Function record structure
uint32   zero_sentinel       # always 0x00000000
uint32   blob_length         # size of inner body blob
uint32   outer_key_a         # PRNG6 seed_a for opcode-XOR
uint32   outer_key_b         # PRNG6 seed_b for opcode-XOR
[outer descriptor]           # variable-width → sub_10002FC0
uint16   name_length
char[]   table_name          # lowercase function name ("addtocart")
uint32   last_var, temp_var_count, outer_literal_count
uint8    num_args, required_num_args
uint32   fn_flags
[arg descriptors × num_args]
[variable name strings]
uint8    extra_flags
[optional filename string]
uint32   blob_tag            # "BLOB"
uint8[]  blob_data[blob_length]
💡
In the documented v15 layouts, fields before blob_data are parsable. Variable names, argument count, line range, and cipher id are available even when the blob remains encrypted — the basis of the static classification step.
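As a rough illustration of reading the fixed-width start of a record, the sketch below unpacks the first four uint32 fields from the layout above. Little-endian byte order is an assumption (the key stream code later in the article serializes dwords little-endian), and `RecordPrefix` / `read_record_prefix` are hypothetical names. The demo bytes are synthetic, built from the key values quoted for addToCart.

```python
import struct
from typing import NamedTuple


class RecordPrefix(NamedTuple):
    zero_sentinel: int
    blob_length: int
    outer_key_a: int
    outer_key_b: int


# Four consecutive uint32 fields; "<" (little-endian) is an assumption.
PREFIX = struct.Struct("<4I")


def read_record_prefix(body: bytes, offset: int = 0) -> RecordPrefix:
    """Unpack the fixed 16-byte start of a function record."""
    if offset + PREFIX.size > len(body):
        raise ValueError("truncated function record")
    prefix = RecordPrefix(*PREFIX.unpack_from(body, offset))
    if prefix.zero_sentinel != 0:
        raise ValueError(f"no zero sentinel at offset {offset:#x}")
    return prefix


# Synthetic 16-byte prefix followed by filler standing in for the descriptor.
demo = struct.pack("<4I", 0, 2193, 0x46984AFE, 0x7D36EC86) + b"\x00" * 8
prefix = read_record_prefix(demo)
print(prefix.blob_length, hex(prefix.outer_key_a), hex(prefix.outer_key_b))
```

The variable-width fields that follow (outer descriptor, names, argument descriptors) need a cursor-style reader rather than a single struct format.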

07 Outer Descriptor & Cipher Selector

The outer descriptor is a variable-width structure that the encoder appends to each record. It terminates with two 32-bit words that encode cipher selection. IDA decompiles sub_10002FC0 as a sequential reader:

Outer descriptor reader sub_10002FC0 · 0x10002FC0
tag         = read_u8()
payload_len = read_u32()
payload     = read_bytes(payload_len)       # opaque, skipped
item_count  = read_u32()
for _ in range(item_count):
    item_len = read_u32()
    item     = read_bytes(item_len)         # per-item blob, skipped
word10      = read_u32()                    # cipher_id  material
word11      = read_u32()                    # cipher_arg material
📷
sub_10002FC0 — outer descriptor reader. Clean sequential structure: tag → payload → item list → word10 → word11.
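The reader pseudocode above leaves read_u8, read_u32, and read_bytes undefined. A minimal runnable version, assuming little-endian integers and a simple bounds-checked cursor, might look like this. `Reader` and `read_outer_descriptor` are illustrative names, and the demo descriptor is synthetic.

```python
import struct


class Reader:
    """Forward-only, bounds-checked cursor over a bytes object."""

    def __init__(self, data: bytes, pos: int = 0) -> None:
        self.data = data
        self.pos = pos

    def read_bytes(self, n: int) -> bytes:
        if n < 0 or self.pos + n > len(self.data):
            raise ValueError(f"read of {n} bytes at {self.pos:#x} runs past the buffer")
        chunk = self.data[self.pos:self.pos + n]
        self.pos += n
        return chunk

    def read_u8(self) -> int:
        return self.read_bytes(1)[0]

    def read_u32(self) -> int:
        # Little-endian is an assumption, not confirmed by the listing.
        return struct.unpack("<I", self.read_bytes(4))[0]


def read_outer_descriptor(r: Reader) -> dict:
    """Follow the tag -> payload -> item list -> word10 -> word11 sequence."""
    tag = r.read_u8()
    payload = r.read_bytes(r.read_u32())      # opaque payload
    item_count = r.read_u32()
    items = []
    for _ in range(item_count):
        items.append(r.read_bytes(r.read_u32()))
    word10 = r.read_u32()                     # cipher_id material
    word11 = r.read_u32()                     # cipher_arg material
    return {"tag": tag, "payload": payload, "items": items,
            "word10": word10, "word11": word11}


# Synthetic descriptor: tag 1, 3-byte payload, two items, then two words.
demo = (bytes([0x01])
        + struct.pack("<I", 3) + b"abc"
        + struct.pack("<I", 2)
        + struct.pack("<I", 1) + b"x"
        + struct.pack("<I", 2) + b"yz"
        + struct.pack("<II", 0x0B, 0x0A))
reader = Reader(demo)
desc = read_outer_descriptor(reader)
print(desc["word10"], desc["word11"], reader.pos == len(demo))
```

Bounds-checking every read is worthwhile here because a wrong length field in a mis-parsed record would otherwise produce silently truncated slices.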

Cipher Selector Decode

Cipher id derivation
context    = len(table_name) + 1    # lowercase function name
cipher_id  = word10 ^ context
cipher_arg = word11 ^ context
cipher_id | Algorithm | Static resolution
0 | PRNG6 XOR (cipher0) | Seed finder applicable; fully static
1 | Rijndael-128 | Requires dynamic key material
2 | CAST-256 | Requires dynamic key material
3 | Blowfish | Requires dynamic key material
4 | CAST-128 | Requires dynamic key material
5 | Triple-DES | Requires dynamic key material
6 | Twofish | Requires dynamic key material
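A worked example of the selector decode: for table_name "addtocart" the context is 9 + 1 = 10, so both words are XORed with 10. The sketch below applies that rule and looks the result up in the table above. The word10/word11 inputs are made-up values chosen for illustration, and `decode_cipher_selector` is a hypothetical helper name.

```python
CIPHERS = {
    0: "PRNG6 XOR (cipher0)",
    1: "Rijndael-128",
    2: "CAST-256",
    3: "Blowfish",
    4: "CAST-128",
    5: "Triple-DES",
    6: "Twofish",
}


def decode_cipher_selector(table_name: str, word10: int, word11: int):
    """Undo the name-length XOR applied to the two selector words."""
    context = len(table_name) + 1          # lowercase function name length + 1
    cipher_id = word10 ^ context
    cipher_arg = word11 ^ context
    return cipher_id, cipher_arg, CIPHERS.get(cipher_id, "unknown")


# Illustrative inputs only: 0x0B ^ 10 = 1, 0x0A ^ 10 = 0.
print(decode_cipher_selector("addtocart", 0x0B, 0x0A))
```

An "unknown" result is a useful early warning that the record was mis-parsed or that table_name was not taken from the lowercase form.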

Static cipher-type detection — blob geometry

Even without decrypting the body blob, the cipher type is recoverable from two fields present in the documented outer function record: blob_tag (a u32 read immediately before the blob data) and blob_len (the byte length of the blob itself).

Block ciphers such as Rijndael-128 prepend a 16-byte initialisation vector to the ciphertext, making the stored blob exactly 16 bytes larger than blob_tag. Stream-cipher (cipher0 / basic) blobs carry no such overhead, so both values are equal.

blob geometry → cipher type
blob_tag = u32 read immediately before blob_data in the outer record
blob_len = len(blob_data)

if blob_len == blob_tag + 16:
    cipher_type = "random"   # block cipher (Rijndael class) — 16-byte IV overhead
elif blob_len == blob_tag:
    cipher_type = "basic"    # stream cipher (cipher0) — no block overhead

Verified on two distinct function types from the same encoded file:

Function | blob_tag | blob_len | Delta | Detected type
fn_A | 2600 | 2616 | +16 | random (Rijndael)
fn_B | 1835 | 1851 | +16 | random (Rijndael)
fn_C | 603 | 603 | 0 | basic (cipher0)
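The geometry rule can be checked against the three verified rows with a small helper. This is a sketch of the comparison only; `classify_blob` is an illustrative name, and the "unknown" fallback is an addition for inputs that match neither relationship.

```python
def classify_blob(blob_tag: int, blob_len: int) -> str:
    """Classify a body blob by the size relationship described above."""
    if blob_len == blob_tag + 16:
        return "random"      # block cipher: 16-byte IV overhead
    if blob_len == blob_tag:
        return "basic"       # stream cipher: no overhead
    return "unknown"         # neither relationship holds; do not guess


ROWS = {"fn_A": (2600, 2616), "fn_B": (1835, 1851), "fn_C": (603, 603)}
for name, (tag, length) in ROWS.items():
    print(name, length - tag, classify_blob(tag, length))
```

Returning an explicit "unknown" keeps a corrupted or unsupported record from being mislabelled as one of the two known modes.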

What the static analyser recovers from an encoded function

In the documented v15 layouts, the outer function record remains parsable regardless of cipher type. It carries the function name (table_name), argument count, parameter names, and the PRNG6 opcode-seed pair (outer_key_a / outer_key_b). The body blob and its literals remain opaque until the runtime key is known.

The static dump records a typed IR stub for a function whose body cannot be decrypted, preserving its signature and the reason recovery stopped:

Encoding type fn name arg names line range opcodes literals body
plain · cipher0
basic · stream key ✓ * stub
random · Rijndael ✓ * stub

* line range unavailable when php_flags & 0x800 (strip-line-numbers flag set by encoder)

sample.php — dynamic-key status in static IR
"static_dump_status": {
  "failed": true,
  "dynamic_key_stub": true,
  "cipher_type": "random",
  "blob_tag": 2600,
  "blob_len": 2616
}
💡
Parameter names and arity in the IR stub come from the outer record rather than the encrypted body blob. A basic-key function whose seed is subsequently found can be fully decrypted and the stub replaced with the real body.

08 The B180 Opcode Block (sub_1009B180)

B180 binary structure sub_1009B180 · 0x1009B180
struct B180Block:
    opcode_count  : uint32           // number of opcodes
    word_count    : uint32           // number of encoded words
    words         : uint32[word_count]
    aux_count     : uint32           // number of auxiliary records
    aux_records   : uint8[aux_count × 5]   // 5 bytes each

IDA Pro decompiled output (condensed):

sub_1009B180 — IDA pseudocode (condensed) 0x1009B180
read(&tmp, 4);  a2[1] = tmp;         // opcode_count
read(&tmp, 4);  a2[5] = tmp;         // word_count
if (tmp) a2[4] = alloc(4 * tmp);     // words[] buffer
read(&tmp, 4);  a2[7] = tmp;         // aux_count
if (tmp) a2[6] = alloc(5 * tmp);     // aux_records[] (5 bytes each)
📷
sub_1009B180 — B180 block reader. alloc(5 × aux_count) confirms 5-byte auxiliary records.

Encoded Word Bit Layout

Encoded word (32-bit) bit fields
bits [7:0]    encoded opcode byte      (XOR-obfuscated — recovered by B3C0)
bit  [8]      result operand present   (consume 1 aux record if set)
bit  [9]      op1    operand present   (consume 1 aux record if set)
bit  [10]     op2    operand present   (consume 1 aux record if set)
bits [12:11]  extended_value mode:     00→0   01→1   10→0x3C   11→next word
bits [31:16]  line number              (0xFFFF → line num is in next word)
💡
For serialization v6+, each encoded word is followed by a handler word — each opcode consumes two consecutive words, not one. This explains why word_count can be up to 2 × opcode_count.
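The bit layout above maps directly onto shifts and masks. The sketch below splits one 32-bit encoded word into its fields; it does not follow the "next word" cases (extended_value mode 3 or line 0xFFFF), which need access to the word stream. `EncodedWord`, `split_encoded_word`, and `EXT_FIXED` are hypothetical names, and the sample word is constructed by hand.

```python
from typing import NamedTuple


class EncodedWord(NamedTuple):
    opcode_byte: int      # still XOR-obfuscated at this stage
    has_result: bool
    has_op1: bool
    has_op2: bool
    ext_mode: int         # 0..3, see EXT_FIXED
    line: int             # 0xFFFF means "line number is in the next word"


# Fixed extended_value results for modes 0..2; mode 3 reads the next word.
EXT_FIXED = {0: 0, 1: 1, 2: 0x3C}


def split_encoded_word(word: int) -> EncodedWord:
    return EncodedWord(
        opcode_byte=word & 0xFF,
        has_result=bool((word >> 8) & 1),
        has_op1=bool((word >> 9) & 1),
        has_op2=bool((word >> 10) & 1),
        ext_mode=(word >> 11) & 0x3,
        line=(word >> 16) & 0xFFFF,
    )


# Hand-built word: line 7, op1 and op2 present, encoded opcode byte 0x14.
word = (7 << 16) | (1 << 10) | (1 << 9) | 0x14
fields = split_encoded_word(word)
aux_needed = fields.has_result + fields.has_op1 + fields.has_op2
print(hex(word), fields, aux_needed)
```

Counting the three presence bits gives the number of aux records the opcode consumes, which is how the word stream and the aux stream stay aligned.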

Auxiliary Record Format

Aux record layout + operand type constants
struct AuxRecord:
    type  : uint8     # IS_UNUSED=0  IS_CONST=1  IS_TMP_VAR=2  IS_VAR=4  IS_CV=8
    value : uint32

# Jump target normalisation (sub_1009B200)
# One-based (JMP, JMPZ, JMPNZ, JMPZNZ, FE_FETCH_R/RW):
#     zero_based = stored_value - 1
# Zero-based (FE_RESET_R, FE_RESET_RW):
#     zero_based = stored_value
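A compact way to express the 5-byte record and the jump rule is shown below. It assumes the type byte precedes the value, as in the struct listing, and little-endian order; `AUX`, `unpack_aux`, and `normalize_jump` are illustrative names rather than functions from the toolkit.

```python
import struct

AUX = struct.Struct("<BI")        # 1-byte type + 4-byte value, unpadded: 5 bytes

OPERAND_TYPES = {0: "IS_UNUSED", 1: "IS_CONST", 2: "IS_TMP_VAR", 4: "IS_VAR", 8: "IS_CV"}

ONE_BASED = {"JMP", "JMPZ", "JMPNZ", "JMPZNZ", "FE_FETCH_R", "FE_FETCH_RW"}
ZERO_BASED = {"FE_RESET_R", "FE_RESET_RW"}


def unpack_aux(buf: bytes, index: int):
    """Return (type_name, value) for the aux record at `index`."""
    kind, value = AUX.unpack_from(buf, index * AUX.size)
    return OPERAND_TYPES.get(kind, f"type_{kind}"), value


def normalize_jump(mnemonic: str, stored_value: int) -> int:
    """Convert a stored jump target to a zero-based opcode index."""
    if mnemonic in ONE_BASED:
        return stored_value - 1
    if mnemonic in ZERO_BASED:
        return stored_value
    raise ValueError(f"{mnemonic} is not listed as a jump opcode here")


records = AUX.pack(8, 3) + AUX.pack(1, 0x2A)
print(AUX.size, unpack_aux(records, 0), unpack_aux(records, 1))
print(normalize_jump("JMPZ", 7), normalize_jump("FE_RESET_R", 7))
```

Using the "<" prefix matters: without it, struct would insert alignment padding and the record size would no longer be 5 bytes.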

09 Opcode Obfuscation: PRNG6 + C3D0 / B3C0

When (php_flags & 0x2C80) != 0, every opcode byte in the B180 word stream is XOR-obfuscated. Key generation is in sub_1009C3D0 (C3D0); decoding is in sub_1009B3C0 (B3C0). The PRNG is PRNG6: two 32-bit states, each advanced by a 16-bit multiply-with-carry step (the half-word LCGs in the listing below), with the two results combined into one 32-bit output.

PRNG6 — sub_10091150

PRNG6.next() sub_10091150 · 0x10091150
class PRNG6:
    def __init__(self, seed_a, seed_b):
        self.seed_a = seed_a & 0xFFFFFFFF
        self.seed_b = seed_b & 0xFFFFFFFF

    def next(self):
        # two independent half-word LCGs
        next_b = ((self.seed_b & 0xFFFF) * 0x7689 + (self.seed_b >> 16)) & 0xFFFFFFFF
        next_a = ((self.seed_a & 0xFFFF) * 0x4650 + (self.seed_a >> 16)) & 0xFFFFFFFF
        self.seed_b, self.seed_a = next_b, next_a
        # combine: ROL32(next_b, 16) + next_a (rotate masked to 32 bits)
        rol_b = ((next_b << 16) | (next_b >> 16)) & 0xFFFFFFFF
        return (rol_b + next_a) & 0xFFFFFFFF
📷
sub_10091150 — PRNG6 next(). Two independent half-word LCGs combined via ROL32 + add.

C3D0 — Key Stream Generation (sub_1009C3D0)

C3D0 key stream sub_1009C3D0 · 0x1009C3D0
prng = PRNG6(outer_key_a, outer_key_b)   # seeds from outer record
key_bytes = bytearray()
for _ in range(opcode_count + 1):         # N+1 dwords
    dword = (prng.next() ^ request_key) & 0xFFFFFFFF
    key_bytes.extend(dword.to_bytes(4, 'little'))

# key_bytes[0 .. opcode_count-1]   → per-opcode XOR byte
# key_bytes[opcode_count ..]       → sentinel generation material (v6)
📷
sub_1009C3D0 — C3D0 key stream generation. PRNG6 XOR'd with request_key produces the per-opcode byte key.
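Putting the PRNG6 and C3D0 listings together gives a self-contained key stream generator. The sketch below restates PRNG6 so it runs on its own; `c3d0_key_stream` is an illustrative function name, and the inputs are the addToCart values quoted later in the article, used here only as sample numbers.

```python
MASK32 = 0xFFFFFFFF


class PRNG6:
    """Restatement of the PRNG6 listing with an explicitly masked rotate."""

    def __init__(self, seed_a: int, seed_b: int) -> None:
        self.seed_a = seed_a & MASK32
        self.seed_b = seed_b & MASK32

    def next(self) -> int:
        next_b = ((self.seed_b & 0xFFFF) * 0x7689 + (self.seed_b >> 16)) & MASK32
        next_a = ((self.seed_a & 0xFFFF) * 0x4650 + (self.seed_a >> 16)) & MASK32
        self.seed_b, self.seed_a = next_b, next_a
        rol_b = ((next_b << 16) | (next_b >> 16)) & MASK32
        return (rol_b + next_a) & MASK32


def c3d0_key_stream(outer_key_a: int, outer_key_b: int,
                    request_key: int, opcode_count: int) -> bytearray:
    """Generate opcode_count + 1 dwords, each XORed with request_key."""
    prng = PRNG6(outer_key_a, outer_key_b)
    key_bytes = bytearray()
    for _ in range(opcode_count + 1):
        dword = (prng.next() ^ request_key) & MASK32
        key_bytes += dword.to_bytes(4, "little")
    return key_bytes


keyed = c3d0_key_stream(0x46984AFE, 0x7D36EC86, 0x0036F936, 47)
plain = c3d0_key_stream(0x46984AFE, 0x7D36EC86, 0, 47)
print(len(keyed), keyed[:8].hex())
```

Since request_key is XORed into every dword, a stream generated with request_key = 0 differs from the keyed stream by exactly that constant, which gives a quick self-check when testing a reimplementation.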

B3C0 — Opcode Recovery with Dual Sentinel (sub_1009B3C0)

The sentinel mechanism is the most subtle part of the pipeline. It makes the XOR stream self-synchronising: any opcode that would collide with the sentinel forces the key to zero, ensuring the sentinel value passes through unmodified.

IDA decompiles the batch opcode path (the do-while in the LABEL_128 else-branch) with the sentinel checks in their clearest form:

sub_1009B3C0 — IDA pseudocode (sentinel do-while, LABEL_128 path) 0x1009B3C0
v59 = (_BYTE *)(v13 + 24);    // first opcode slot in output
do
{
    v60 = v91[4 * v58];        // v91 = encoded words buffer (4-byte stride); v60 = raw opcode byte
    *v59 = v60;                // default: store raw (overwritten below if XOR applies)
    v61 = v6[8];               // v6[8] = key_bytes[] generated by C3D0

    if ( v60 == -107 || v60 == -90 )    // -107=0x95, -90=0xA6 — STAGE 1 SENTINEL
    {
        *(_BYTE *)(v61 + v58) = 0;      // zero the key byte; result stays raw (v60)
    }
    else
    {
        v62 = (_BYTE *)(v61 + v58);
        v63 = v60 ^ *(_BYTE *)(v61 + v58);   // XOR raw with key byte

        if ( v63 == -107 || v63 == -90 )     // STAGE 2 SENTINEL: XOR result IS sentinel
            *v62 = 0;                          // zero key; *v59 still holds raw v60
        else
            *v59 = v63;                        // store decoded opcode
    }
    ++v58;
    v59 += 28;                 // advance to next opcode slot (28-byte opcode struct)
}
while ( v58 < v6[1] );        // v6[1] = opcode_count

The main while(1) loop in the same function handles the same logic per-opcode with the v5/v6 version split: v24 = -107 for version < 6 (constant sentinel), or key_bytes[N+delta+i] ^ 0x95 for v6 (position-dependent sentinel). The second check uses v26 with the same v5/v6 distinction.

IDA Pro decompiler view of sub_1009B3C0 sentinel do-while loop
sub_1009B3C0 · 0x1009B3C0 — dual sentinel check, batch opcode path. Line 173: if (v60 == -107 || v60 == -90) — stage 1, raw opcode IS sentinel, key byte zeroed. Line 180: v63 = v60 ^ *(_BYTE *)(v61 + v58) — XOR with key. Line 181: if (v63 == -107 || v63 == -90) — stage 2, XOR result WOULD BE sentinel, key zeroed and result falls back to raw. −107 = 0x95, −90 = 0xA6 as unsigned bytes.
💡
Without reversing the two-stage sentinel check, opcode decoding fails silently for any method whose raw or XOR-decoded opcode bytes collide with 0x95 or 0xA6. The failure shows up as seemingly random wrong opcodes, which is easy to mistake for a bad key or wrong seed.
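The batch-path loop translates to Python almost line for line. The sketch below covers only the constant-sentinel case shown in the pseudocode (0x95 and 0xA6), not the position-dependent v6 sentinel from the main loop. `b3c0_decode` is an illustrative name, and the input bytes are chosen to exercise all three branches.

```python
SENTINELS = frozenset({0x95, 0xA6})


def b3c0_decode(raw_opcodes: bytes, key_bytes: bytearray) -> bytes:
    """Dual-sentinel XOR decode; zeroes key bytes in place where a sentinel applies."""
    out = bytearray(raw_opcodes)              # default: keep the raw byte
    for i, raw in enumerate(raw_opcodes):
        if raw in SENTINELS:                  # stage 1: raw byte is a sentinel
            key_bytes[i] = 0
            continue
        decoded = raw ^ key_bytes[i]
        if decoded in SENTINELS:              # stage 2: XOR result is a sentinel
            key_bytes[i] = 0                  # fall back to the raw byte
        else:
            out[i] = decoded
    return bytes(out)


raw = bytes([0x95, 0x10, 0x3F])
key = bytearray([0x11, 0x85, 0x01])
decoded = b3c0_decode(raw, key)
print(decoded.hex(), key.hex())
```

In this example the first byte passes through because it is already a sentinel, the second falls back to raw because 0x10 ^ 0x85 would produce 0x95, and only the third is actually XOR-decoded.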

10 Dynamic-Key Methods

IonCube's encoder supports two dynamic-key modes that change what protection is applied to each method's encrypted body blob. Both add a runtime gate: the host application must pass the correct key to the loader before it will execute the file. But from a static analysis perspective the two modes are fundamentally different — one leaves the body completely recoverable from disk, the other does not. The full specification is in the IonCube Source Encoder User Guide (PDF).

Basic keys — body seeds live in the outer record

At encode time the encoder assigns a fixed key and derives two body-decrypt seeds from it. Those seeds are stored inside the outer function record (the CD30/CB50 prelude block that precedes the body blob). The MT4IC PRNG is seeded with those values and its output keystream is XOR-applied to the body blob to produce the plaintext opcode stream.

Because the seeds sit in the outer record — which is itself protected only by the MT4IC header layer we already broke in §04 — a static analyst can fully recover them from the encoded file on disk:

  1. Decode the HR+c header (Base64 strip + XOR pass).
  2. Break the MT4IC PRNG layer to reveal the file header and extract bytecode_xor_key.
  3. Parse the B180 body to reach the method's outer record (CD30/CB50); read outer_key_a, outer_key_b, and the body decrypt seeds.
  4. Seed a second MT4IC PRNG instance with those seeds → generate keystream → XOR with body blob → plaintext opcodes.

The runtime key gate (the host app calling ioncube_loader_iset_request_key() or ioncube_read_file()) is a separate mechanism: it controls whether the loader will hand execution to the script, but it does not hide the body decrypt seeds from a static reader. If the loader receives the wrong runtime key it will feed a bad value into its PRNG6 XOR pass and produce garbage opcodes — but that is entirely independent of our ability to decrypt the body blob directly from the file.

💡
For a basic-key file: the body blob is self-contained. Every seed needed to decrypt it is stored in the outer record. A static analyst does not need the loader binary, a memory dump, or the runtime key — just the encoded .php file on disk.

Random keys — body key is RSA-wrapped inside the file

For random-key functions the encoder does not store the body decrypt key in plaintext inside the outer record. Instead the body blob is encrypted under a session key that itself is wrapped with the loader's embedded RSA public key. The RSA-encrypted key blob is stored in the encoded file, but the only party that can unwrap it is the loader binary — because it holds the matching RSA private key.

The decryption chain the loader performs at runtime is:

  1. Host supplies the per-request random key to the loader via the standard API.
  2. Loader combines the runtime key with file-specific material to derive the Rijndael session key.
  3. Loader unwraps the RSA blob with its embedded private key to verify or reconstruct the body cipher key.
  4. Rijndael-decrypt the body blob (16-byte IV prepended → hence blob_len = blob_tag + 16) to recover opcodes.

From the encoded file alone a static analyst can still read the outer record metadata — parameter names, variable names, opcode counts, and the function signature are all present in the outer record's unencrypted fields. But the body blob itself is Rijndael-encrypted and the session key is inaccessible without the loader's RSA private key.

⚠️
For a random-key file: the body blob is not recoverable from disk alone. To read opcodes you need either the loader binary (reverse it to extract the embedded RSA private key) or a memory dump taken while the loader is actively executing the script.

Side-by-side: what each mode exposes statically

What you can recover Plain (cipher 0) Basic (stream) Random (Rijndael)
Parameter names
Local variable names
Opcode count / literal count
Body decrypt seeds (outer record) ✗ RSA-wrapped
Opcodes — from disk file alone
Opcodes — with loader binary ✓ (extract RSA private key)
Opcodes — from memory dump ✓ (post-decryption)

Blob geometry as a cipher detector

Before spending effort on seed-search you can tell the two modes apart by measuring the body blob geometry in the outer record:

Relationship | Cipher | Reason
blob_len == blob_tag | Basic (stream XOR) | Stream cipher produces no size overhead; ciphertext length equals plaintext length
blob_len == blob_tag + 16 | Random (Rijndael CBC) | A 16-byte IV is prepended to the ciphertext; body grows by exactly one block

What the static analyser sees: bytecode_xor_key

Regardless of which mode the encoder used, the decrypted file header contains a 32-bit field called bytecode_xor_key (alias request_key). The loader passes this value into sub_1009C3D0, which XORs it into the PRNG6 key stream before that stream is applied to the opcode byte of every encoded word.

Statically, we recover bytecode_xor_key directly from the header — it sits behind the MT4IC PRNG layer we already broke, not behind the runtime key gate. For Basic-mode files this value is stable across every execution of the protected file. For Random-mode files it reflects the key used in the specific captured copy; any other execution will carry a different value.

💡
Dynamic-key protection is a runtime gate, not a static obfuscation. For the documented format, bytecode_xor_key is recoverable from the header after decoding the MT4IC layer. For basic-key files the body is also fully recoverable from disk. For random-key files the gate is real: the body cipher key is not present in the file in any usable form without the loader's embedded RSA private key.

sample.php: a concrete basic-key example

The method below is a basic-key encoded function: its body decrypt seeds are stored in the outer record (CD30/CB50 prelude) and are fully recoverable statically. addToCart from sample.php is the verified example:

Field | Value
Record start in body | 0x186
Body blob offset / length | 0x285 / 2193 bytes
table_name | addtocart
outer_key_a / outer_key_b | 0x46984AFE / 0x7D36EC86
body decrypt seeds | 0x3DD1078F / 0x4996639D
request_key (from header) | 0x0036F936
Opcode count / Literal count | 47 / 32
Local variables | $productId, $quantity, $options, $product, $itemKey

blob_len == blob_tag (stream cipher, no overhead) confirms this is a basic-key function. The two body decrypt seeds feed directly into the MT4IC PRNG to produce the keystream that decrypts the blob.

Decoded Opcode Listing (first 10 of 47)

addToCart — static opcode dump 47 opcodes total
idx  line  op    mnemonic                operands
0000    6    63   ZEND_RECV               res=$productId
0001    6    64   ZEND_RECV_INIT          op2=lit[0]=1          res=$quantity
0002    6    64   ZEND_RECV_INIT          op2=lit[1]=[]         res=$options
0003    7    20   ZEND_IS_SMALLER         op1=$quantity  op2=lit[2]=1
0004    7    43   ZEND_JMPZ               op1=~0         op2→idx6
0005    7    62   ZEND_RETURN             op1=lit[3]=False
0006    9    89   ZEND_FETCH_IS           op1=lit[4]='_SESSION'
0007    9   115   ZEND_ISSET_ISEMPTY_DIM  op1=~1         op2=lit[5]='cart'
0008    9    14   ZEND_BOOL_NOT           op1=~2
0009    9    43   ZEND_JMPZ               op1=~3         op2→idx13
...    [37 more opcodes]

The full 47-opcode listing shows a function that:

  • Validates $quantity >= 1 (returns false otherwise)
  • Initialises $_SESSION['cart'] if absent
  • Calls getProductDetails($productId)
  • Builds a unique item key via md5(json_encode($options))
  • Assembles a cart-item array: id, name, price, quantity, subtotal
  • Calls calculateCartTotal() and returns true

11 Static Dump Validation

The end-to-end fixture is named sample.php. Validation is based on the generated structure rather than on source-code comparison or runtime instrumentation.

sample.php — validation gates
CONTAINER
  transition Adler checksum     PASS
  decrypted-header MD4          PASS
  body-frame Adler checksums    PASS

STATIC DUMP
  main op_array                 decoded
  function op_arrays            decoded
  literals / CVs / operands     resolved
  jump targets                  normalized
  failed op_arrays              0
  sample.icdump.json            valid icdump-ir-v1
  sample.icdump.txt             readable opcode listing

12 Version Profiles: PHP 8.1–8.4

The v15 container path is shared by the tested loaders, but body materialization is ABI-specific. Treating every target as “8.1 plus a different DLL” produced subtle errors: wrong handler candidates, unresolved interned strings, and integer literals shifted by two. The maintained decoder now selects an explicit profile:

Component | Shared or versioned? | Profile data
HR+c framing, Base64, MT4IC, checksum constants | Shared in tested v15 loaders | One container implementation; accept known HRC_VERSIONS
Handler-lane decode | Versioned | Seven table pairs, opcode metadata, type dimensions, feature word
Interned strings | Versioned addresses | XOR-key address and pointer-table address for each DLL
Zend opcode set | Versioned by PHP ABI | 8.1/8.2: 0–202; 8.3: 0–203; 8.4: 0–209
Serialized integer assignments | Versioned by PHP ABI | 8.1: ASSIGN_OP +2; 8.2/8.3: both assignment forms +2; 8.4: no bias
Verification strategy: the three built-in checksum assertions (transition Adler, body chunk Adlers, and MD4 header digest) validate the container layers. Handler-variant scoring, zero unknown opcode names, and cross-version fixture comparison validate the ABI layer independently.
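One way to keep the versioned edges explicit is a small profile table keyed by PHP ABI. The sketch below encodes only the opcode bounds and the integer-bias rule from the tables in this article; `Profile`, `PROFILES`, `unbias_integer`, and `opcode_in_range` are hypothetical names, and the real profiles also carry handler and interned-string tables that are not modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    max_opcode: int               # highest valid Zend opcode number
    biased_opcodes: frozenset     # mnemonics whose serialized integers carry +2


PROFILES = {
    "8.1": Profile(202, frozenset({"ASSIGN_OP"})),
    "8.2": Profile(202, frozenset({"ASSIGN", "ASSIGN_OP"})),
    "8.3": Profile(203, frozenset({"ASSIGN", "ASSIGN_OP"})),
    "8.4": Profile(209, frozenset()),
}


def unbias_integer(profile: Profile, mnemonic: str, stored: int) -> int:
    """Subtract the encode-time bias of 2 where the profile says it applies."""
    return stored - 2 if mnemonic in profile.biased_opcodes else stored


def opcode_in_range(profile: Profile, opcode: int) -> bool:
    return 0 <= opcode <= profile.max_opcode


for version, profile in PROFILES.items():
    print(version, unbias_integer(profile, "ASSIGN", 12), opcode_in_range(profile, 203))
```

Keeping the rule as data rather than scattered version checks makes the "integer literals shifted by two" class of error easier to spot when a new loader is added.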

Stable foundation, versioned edges

  • The custom Base64 alphabet and 0xFF escape codec are shared by the tested v15 files.
  • Container checksums remain strong gates, but they do not validate handler tables.
  • Zend operand kinds are stable enough to share parsing code; opcode bounds and names are not.
  • The target ABI comes from the header trailer and must match the supplied loader DLL.

Opcode bounds and names were checked against the official PHP 8.2, PHP 8.3, and PHP 8.4 Zend headers.

13 Pipeline Summary

End-to-end static decode pipeline
sample.php
  │
  ├─ strip PHP preamble → locate HR+c marker
  ├─ custom Base64 decode (digit-first alphabet)
  ├─ verify version_word ^ 0x2853CEF2 is an accepted HRC revision
  ├─ escape-decode meta header → seed, file_size, header_size
  ├─ unchunk → encrypted_header[header_size]
  ├─ verify Adler transition checksum (init=17)    ← gate #1
  │
  ├─ MT4IC(seed) decrypt header + ROL3 mask
  ├─ MD4 verify                                    ← gate #2
  ├─ parse header → request_key, php_flags
  │
  ├─ MT4IC(primary_seed) decrypt body frames
  ├─ body chunk Adler checksums                    ← gate #3
  ├─ zlib decompress (raw DEFLATE) → in-memory body stream
  │
  ├─ enumerate main + function records
  ├─ parse each outer record and decode its inner op_array
  ├─ read B180 blocks (opcode words + auxiliary operands)
  ├─ generate C3D0 key streams
  │     PRNG6(outer_key_a, outer_key_b) ^ request_key
  ├─ recover opcodes through B3C0 dual-sentinel + XOR
  ├─ apply handler/string tables from the matching PHP 8.1–8.4 DLL profile
  ├─ resolve literals, interned strings, variables and function calls
  ├─ normalize jump targets and safe values
  │
  ├─ write sample_dump/sample.icdump.txt
  └─ write sample_dump/sample.icdump.json (icdump-ir-v1)
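Gate #1 in the pipeline uses an Adler checksum with a starting value of 17. Assuming the loader otherwise follows the standard Adler-32 recurrence (an assumption, not something the walkthrough confirms), the checksum can be sketched and cross-checked against zlib.adler32, which accepts a starting value as its second argument. `adler32_seeded` is an illustrative name and the input bytes are arbitrary.

```python
import zlib

MOD_ADLER = 65521


def adler32_seeded(data: bytes, init: int = 17) -> int:
    """Standard Adler-32 recurrence, started from `init` instead of 1."""
    a = init & 0xFFFF
    b = (init >> 16) & 0xFFFF
    for byte in data:
        a = (a + byte) % MOD_ADLER
        b = (b + a) % MOD_ADLER
    return (b << 16) | a


sample = b"encrypted header bytes"
print(hex(adler32_seeded(sample)), hex(zlib.adler32(sample, 17)))
```

If the loader's variant differs from the standard recurrence, the stored checksum will not match and the gate fails immediately, which is exactly what makes it useful as a correctness signal.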

Encryption layer stack

Each protection layer wraps everything below it. The static tool breaks them from outside in, using the checksum gates at each boundary as correctness signals:

Layer 1  ·  Custom Base64 (HR+c container)
digit-first alphabet  ·  version_word XOR 0x2853CEF2  ·  Adler17 transition checksum
Layer 2  ·  MT4IC header cipher
CMWC(0x1000) + xorshift  ·  ROL3 bitmask  ·  MD4 header integrity digest
Layer 3  ·  MT4IC body cipher + zlib
frame-by-frame decrypt  ·  chunk Adler checks  ·  raw-DEFLATE inflate
Layer 4  ·  PRNG6 opcode XOR   (per function)
LCG(0x7689, 0x4650)  ·  outer_key_a / outer_key_b  ·  XOR request_key
plain functions (cipher0) — fully recoverable
opcodes  ·  literals  ·  variables  ·  jump targets
dynamic-key functions (cipher1–6) — body opaque
basic: stream key, seed-derivable  ·  random: RSA-wrapped Rijndael — requires loader private key

Decode pipeline — visual

sample.php — input
PHP die()-stub preamble HR+c marker byte sequence custom Base64 payload to EOF
01 HR+c Container Decode
digit-first Base64 decode VERSION_XOR 0x2853CEF2 escape-decode meta header unchunk encrypted_header[] Adler17 transition
02 Header — MT4IC Decrypt
CMWC(0x1000) + xorshift PRNG ROL3 bitmask MD4 header digest extract request_key extract php_flags
03 Body — MT4IC + zlib
frame-by-frame MT4IC decrypt chunk Adler checks raw-DEFLATE inflate function record stream
04 Opcode Recovery
B180 block (words + aux records) PRNG6 XOR key stream B3C0 dual-sentinel decode handler-table remapping resolve literals · CVs · jump targets
05 Output — icdump-ir-v1
write icdump.json (icdump-ir-v1) write readable icdump.txt record typed dynamic-key stubs

14 Conclusion

IonCube's protection rests on a layered stack of proprietary PRNGs, custom container formats, per-method key schedules, and opcode obfuscation. Each layer was individually identified through static analysis of the loader DLL in IDA Pro: no PHP runtime was involved, and no candidate source code was aligned against the output.

The key architectural insight is that container decoding, ABI-specific op_array recovery, and IR normalization remain separate, testable stages. The maintained entry point emits static opcode dumps for the supported plain-encoded PHP 8.1–8.4 profiles without executing the encoded sample. The companion ioncube-php8-decompiler project consumes that IR when readable PHP source is required.

Porting a future loader means validating the shared container first, then extracting the versioned handler and interned-string tables and checking the target Zend opcode set. The checksum gates validate the outer layers; opcode and fixture checks validate the ABI-specific layer.