Lua Virtualization Part 3: Taking a Look at Luraph
This is part 3 of a 5 part series about Lua Virtualization:
- Part 1: The Internals of the Lua VM
- Part 2: Obfuscation techniques
- Part 3: Taking a Look at Luraph
- Part 4: Devirtualizing Luraph
- Part 5: Actually Devirtualizing Luraph
The Target
Luraph has been on the market since 2017. It has maintained its dominance while remaining immune to public deobfuscation, with one exception 4 years ago: Luraph Deobfuscator. Furthermore, the Luraph team themselves have released a tool to deobfuscate competing obfuscators: LD Deobfuscator. This post examines version 14.4.1 of Luraph and investigates exactly why and how it has maintained such dominance over the obfuscation scene.
Why Luraph Remains Untouchable
I believe there are two main reasons why Luraph has become untouchable. First of all, it’s a well-known fact that Luraph uses virtualization; it is even advertised on their website. As mentioned in Part 2 of this series, this is the strongest obfuscation method, and reversing it isn’t as straightforward as it seems. There is no up-to-date public research on the internals of Luraph. The latest research dates back to 2022 and was done by Ferib. Unfortunately for us, the version Ferib looked at had already been deobfuscated. Additionally, within three months there have been 10 different DMCA takedowns targeting 31 different GitHub repositories related to the deobfuscation of Luraph.
Analyzing Luraph Statically
We can simplify the analysis process by controlling the input, allowing us to predict the bytecode the VM will execute. When analyzing VMs, I prefer to begin with minimal input to establish some kind of baseline. This is achieved by obfuscating simple code like:
local foo = "bar"
This compiles to a single meaningful instruction, LOADK, followed by the RETURN that Lua appends to every chunk.
Obfuscating this 20 byte Lua file produces a 65 KiB file.
Anti beautify measures
The first challenge we faced was anti-beautify measures. I’m not sure if these were intentional or just a bug in the beautifier I’m using. Either way, they’ve added \!, \:, and \# inside strings, which aren’t valid escape sequences. In Lua, the backslash is the escape character, used for sequences like \r or \n. Lua 5.1 tolerates an unknown escape and simply keeps the character that follows the backslash (Lua 5.2 and later reject it as an invalid escape sequence):
print("\!") -- "!"
This is easily fixed by removing the backslashes.
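A minimal sketch of that clean-up, assuming the only offending sequences are the three seen in this sample. The helper name strip_bad_escapes is mine, and the function works on raw source text, so it would also rewrite such sequences outside of string literals or after an escaped backslash. Treat it as a starting point rather than a robust fixer:

```lua
-- Remove the backslash in front of "!", ":" and "#" so that strict parsers
-- and beautifiers accept the source.
-- Assumption: these sequences only occur inside string literals and never
-- directly after an escaped backslash.
local function strip_bad_escapes(source)
  -- In a Lua pattern the backslash is an ordinary character; "%" is the escape.
  local cleaned = source:gsub("\\([!:#])", "%1")
  return cleaned
end

-- The long-bracket string keeps every backslash literally.
local sample = [[print("a\!b\:c\#d\n")]]
print(strip_bad_escapes(sample))
```

Valid escapes such as \n are left untouched because they are not in the character set of the pattern.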
But that wasn’t the only trick. Another clever anti-beautify tactic is to structure the code in a way that deliberately trips up beautifiers. Take this snippet:
(foo)[1] = bar;(baz)[1] = "foo";
At first glance, it looks innocent. But once beautified, the semicolon gets stripped away, and the code is reformatted into this:
(foo)[1] = bar
(baz)[1] = "foo"
Now it may look fine, but that’s where the trap is. Lua doesn’t treat a line break as a statement separator, so the interpreter reads it as:
(foo)[1] = bar(baz)[1] = "foo"
Which is nonsense. What’s happening here is that Lua tries to treat bar as a function and call it with baz as its argument, before attempting to assign "foo" to the result. Lua 5.1 rejects this with an "ambiguous syntax (function call x new statement)" error, so the whole file fails to load. Note that the trap needs the first statement to end in something callable, such as a name or a call; a bare string literal like "bar" cannot be called without parentheses, so it would not trigger it.
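The trap can be reproduced in isolation. The sketch below only compiles the two variants and never runs them, and it uses a plain name on the right-hand side of the first assignment, since that is what makes the following parenthesis look like a call. The variable names are placeholders:

```lua
-- loadstring exists in Lua 5.1; newer versions use load for strings.
local load_chunk = loadstring or load

local with_semicolon = '(foo)[1] = bar;(baz)[1] = "foo";'
local beautified     = '(foo)[1] = bar\n(baz)[1] = "foo"'

-- Compiling only parses the code; nothing is executed here.
print(load_chunk(with_semicolon)) -- a function: the chunk compiled
print(load_chunk(beautified))     -- nil plus a syntax error message
```

On Lua 5.1 the second message is the "ambiguous syntax" error. Newer versions accept the call across the line break but should still fail on the second assignment sign, so the exact message differs.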
Locating the entry point
With the file now properly beautified, the next step is to find the entry point. This is done by looking at the structure of the code. We quickly notice that the code returns a table from which the Mj member is accessed and invoked. The return value of Mj is then invoked with the arguments passed to the file:
return ({
Cf = function(t, t, U)
(t[1])[0XD], U = 0X45 - 186 - 0X1, t[1][0X2E]
if t[1][0X2e] then
t[0X1][0x2E], t[0X1][46] = -t[1][45], (t[1][0xF])
end
return U
end,
Y = function(t, t)
(t[1])[24], t[1][4] = -(108 == 0X65), t[1][0XE]
end,
qf = function(t, t)
t = 0X02E
return t
end,
... -- COMPRESSED
Mj = (function(t)
... -- COMPRESSED
end),
... -- COMPRESSED
}):Mj()(...)
We observe that Mj is called using the colon syntax :, which implicitly passes the table itself as the first argument like so:
("%.12f"):format(3.141592653589793238462643383279)
-- Is the same as:
string.format("%.12f", 3.141592653589793238462643383279)
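To see how that shape fits together, here is a hypothetical miniature of the loader pattern. The field greeting and the wrapper function run are invented for illustration; only the ({ ... }):Mj()(...) structure mirrors the obfuscated file:

```lua
-- The table is built, Mj is called as a method (so it receives the table
-- as its first argument), and the function Mj returns is then called with
-- the arguments of the enclosing chunk.
local function run(...)
  return ({
    greeting = "hello",
    Mj = function(self)
      return function(name)
        return self.greeting .. " " .. tostring(name)
      end
    end,
  }):Mj()(...)
end

print(run("world")) -- hello world
```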
Taking a closer look
Taking a closer look at the entry point Mj:
Mj = (function(Self)
local Y, L, h, X = {}
X, h = Self:x(Y, h, X)
local E;
E, X = Self:G(h, Y, E, X)
X = Self:T(h, X, Y)
local a;
a, X = Self:K(Y, X, h, a)
local c;
X, c, a = Self:f(c, a, Y, X, h)
X = Self:w(h, Y, X, E)
X = Self:W(a, X, h, Y)
X = Self:mU(a, c, X, h, Y)
X = Self:jU(Y, h, X)
X = Self:rU(h, X, Y)
X = Self:vU(h, X, Y)
E, c, a = (nil)
a, X, E, c = Self:rf(E, h, X, Y, c, a)
c, L, E, X, a = Self:Df(c, X, Y, E, h, a)
return Self.M(L)
end)
We see it takes a parameter Self, which points to the big table in which Mj itself is defined. The body consists mostly of function calls, which in turn contain more nested function calls.
Inlining Factorized Code
To make sense of what each of these function calls actually does, we must inline the factorized pieces of code. For example, we can inline the factorized function call:
_ = t:af(_, i)
We first look up the definition of af:
af = function(t, t, U)
t = U[30446]
return t
end
This af function is only called once and simply returns index 30446 of U, so we can replace the call with:
_ = i[30446]
By systematically inlining every function call, we peel off the abstraction layer little by little, revealing the true code.
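A quick way to convince yourself that an inlining step is safe is to run both forms side by side on stand-in data. In this sketch the table helpers, the payload value, and the parameter names are hypothetical; only the body of af follows the definition above:

```lua
-- Stand-in for the data table passed as the last argument.
local i = { [30446] = "payload" }

-- Stand-in for the big helper table. The second parameter is only used
-- as scratch space, exactly like the duplicated t in the original.
local helpers = {
  af = function(self, scratch, U)
    scratch = U[30446]
    return scratch
  end,
}

local factorized = helpers:af(nil, i) -- the original call shape
local inlined = i[30446]              -- the inlined replacement

print(factorized, inlined, factorized == inlined)
```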
Beating control flow
Factorization isn’t the only obfuscation technique in use. Most of these factorized functions also contain control flow obfuscation, which obscures the execution order like so:
p = function(Self, h_funcs, Y, L)
if L > 47 then
h_funcs[4] = 1
return 9036, L
else
if not (L < 66) then
else
h_funcs[22] = (Y)
L = 66
end
end;
return nil, L
end
Here, L is the variable that drives the control flow. When there are only a few if statements, it’s rather simple to follow statically, but in most cases doing it dynamically is easier. Which branch runs depends on the incoming value of L: when L is at most 47, the function executes h_funcs[22] = (Y) and sets L to 66; when it is called again with that new value, the L > 47 branch executes h_funcs[4] = 1 and returns 9036.
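Doing it dynamically can be as simple as driving the function in a loop and recording every write. In this sketch the driver loop, the starting state of 0, and the stop condition (a non-nil first return value) are my assumptions and are not taken from Luraph; the body of p is the snippet above:

```lua
local function p(Self, h_funcs, Y, L)
  if L > 47 then
    h_funcs[4] = 1
    return 9036, L
  else
    if not (L < 66) then
    else
      h_funcs[22] = (Y)
      L = 66
    end
  end
  return nil, L
end

-- Record the order in which new keys are written to h_funcs.
local order = {}
local h_funcs = setmetatable({}, {
  __newindex = function(tbl, key, value)
    order[#order + 1] = key
    rawset(tbl, key, value)
  end,
})

-- Assumed driver: keep calling with the returned state until p signals completion.
local state, result = 0, nil
repeat
  result, state = p(nil, h_funcs, "Y value", state)
until result ~= nil

print(table.concat(order, ", ")) -- 22, 4
```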
Removing Junk Code
Another common theme in Luraph is junk code. The easiest way to detect junk code is to remove a block of code and re-run the script to check whether it still behaves the same:
-- Junk code
if p[1][26] ~= 218 then
if not (p[1][27]) then
else
return p[1][13]
end;
if false then
else
(p[1])[34] = - p[1][15]
p[1][33] = 110 * p[1][26]
end
end;
-- Not junk
Stk[REG_C[VIP]] = Stk[REG_A[VIP]] % Stk[REG_B[VIP]]
Here, everything is junk code except the last line.
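The remove-and-re-run approach can be automated. The following is a rough sketch on a toy script, not on Luraph output; the helper names run and is_junk are mine, and it assumes the script's observable behavior is captured by its return value:

```lua
-- loadstring exists in Lua 5.1; newer versions use load for strings.
local load_chunk = loadstring or load

-- Run a chunk and return its result, or nil if it fails to compile or run.
local function run(source)
  local chunk = load_chunk(source)
  if not chunk then return nil end
  local ok, result = pcall(chunk)
  if ok then return result end
  return nil
end

-- True when deleting lines first..last leaves the result unchanged.
local function is_junk(lines, first, last, expected)
  local kept = {}
  for index, line in ipairs(lines) do
    if index < first or index > last then
      kept[#kept + 1] = line
    end
  end
  return run(table.concat(kept, "\n")) == expected
end

-- Toy script: line 2 never affects the returned value.
local lines = {
  "local a = 6",
  "if false then a = 0 end",
  "return a * 7",
}
local expected = run(table.concat(lines, "\n"))
print(is_junk(lines, 2, 2, expected)) -- true
print(is_junk(lines, 1, 1, expected)) -- false
```

A block that is only exercised by some inputs will look like junk under this test, so a positive result is a hint to verify rather than proof.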
Unpacking Main
As we saw earlier, main consists only of factorized function calls, which are nested within more factorized functions. To unpack these, we start in main and work our way through each nested factorization. Unpacking all of this reveals that the main function primarily sets up helper functions stored in h_funcs and then invokes the VM:
main = (function(Self)
local h_funcs = {}
h_funcs[1] = (4503599627370496);
h_funcs[2] = error;
h_funcs[3] = 9007199254740992;
h_funcs[4] = 1 -- blob idx
h_funcs[5] = Self._next;
h_funcs[6] = nil
h_funcs[6] = Self.string_match;
h_funcs[7] = {}
h_funcs[8] = Self._unpack;
h_funcs[9] = function(...)
...
end;
h_funcs[11] = nil
h_funcs[12] = function(tbl, start, finish)
...
end;
h_funcs[13] = (function(last_idx)
...
end)
h_funcs[14] = Self.string_sub;
h_funcs[15] = {}
h_funcs[16] = Self._select;
h_funcs[17] = nil;
h_funcs[18] = Self.string_byte;
-- COMPRESSED
h_funcs["VM"] = (function()
...
end)
h_funcs[51] = (function()
...
end)
-- COMPRESSED
-- Invoking of VM
return Self.__unpack({
h_funcs["VM"](deserialized_execution_data, h_funcs["upvalues"])
})
end)
Beautifying h_funcs
Now that we’ve unpacked the main function, the next step is to make it more readable. At this stage, the code is still heavily obfuscated. Functions and values are referenced through table lookups like h_funcs[12]. While these references are functional, they make the logic extremely difficult to follow for a human. To address this, we replace these indirect table accesses with descriptive identifiers. For example:
- h_funcs[6] → h_funcs["string.match"]
- h_funcs[12] → h_funcs["safe_tbl_unpack"]
- h_funcs[18] → h_funcs["string.byte"]
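Once a name is known, applying it everywhere is mechanical. Here is a small sketch that rewrites source text with a lookup table; the function name rename is mine, the map contains only the three examples above, and it handles decimal indexes only (hex forms like 0X12 would need normalizing first):

```lua
-- Names recovered so far, keyed by numeric index.
local names = {
  [6]  = "string.match",
  [12] = "safe_tbl_unpack",
  [18] = "string.byte",
}

-- Rewrite h_funcs[<number>] to h_funcs["<name>"], leaving unknown indexes alone.
local function rename(source)
  local renamed = source:gsub("h_funcs%[(%d+)%]", function(index)
    local name = names[tonumber(index)]
    if name then
      return ('h_funcs["%s"]'):format(name)
    end
    -- Returning nothing tells gsub to keep the original match.
  end)
  return renamed
end

print(rename("h_funcs[6](h_funcs[11], h_funcs[18])"))
```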
This renaming process is applied across all functions, and it is one of the most time-consuming tasks. We first trace what each function does, analyze it, and then assign a name that accurately reflects its role in the program. My work is still ongoing and certain parts of the code remain obfuscated, but this is the progress I’ve made so far. Many functions have been analyzed, renamed, and clarified, providing a clearer view of the program’s structure (the code shown below includes only the function and variable declarations):
main = (function(Self)
local _bit = (Self._bit or Self._bit32);
local XOR = _bit and _bit.bxor;
local h_funcs = {}
h_funcs[1] = (4503599627370496);
h_funcs["error"] = error;
h_funcs[3] = 9007199254740992;
h_funcs["DIP_BLOB_IDX"] = 1 -- blob idx
h_funcs["next"] = Self._next;
h_funcs["is_first_constant"] = nil
h_funcs["string.match"] = Self.string_match;
h_funcs["empty_tbl"] = {}
h_funcs["unpack"] = Self._unpack;
h_funcs["get_self_index"] = function(...) end;
h_funcs[11] = nil
h_funcs["safe_tbl_unpack"] = function(tbl, start, finish) end;
h_funcs["gen_empty_tbl"] = (function(last_idx) end)
h_funcs["string_sub"] = Self.string_sub;
h_funcs["calcualted_char_tbl"] = {}
h_funcs["select"] = Self._select;
h_funcs["temp_Instructions"] = nil;
h_funcs["string.byte"] = Self.string_byte;
h_funcs["setmetatable"] = Self._setmetatable;
h_funcs["constant_multiplier"] = 4.294967296E9;
h_funcs["string.gsub"] = Self._string.gsub;
h_funcs["format_blob"] = (function(blob) end)(string.sub([=[LPH}S-&cD!bYWf!ERhRCtJnu?X[JUfE,)U<r`4#!.^3R5hH3H?Yj<+!CYQ@9$.D>z!!(.Kz!!!#O!H$HiG1ZgafE"e#!G^6f<S.=ffE":j#]t!&F_tT!EnCD$fE+]JfE"=k!?fMiz!!!#O!b#3R#ljr*zfEZJ/FCo*%G#nYD!!!#WDa;GLz!!!#O!H(s=z!'*;Gz!!"i@fE#+q!6PBDz!!(XPGLuq1fEQS>@ps1ifE#(+!8%<pfEHA*@VfW.E$014FEMVA+EM+9An>k'-t[U>@ruF'DC@+i/h%o`ATW'8DBL6H-n[,).3N2>A1SjEATVd#FCB9"@VfU(HQZN:-$(89+?^i"/hS8p/0K9^?XIMbA7^!.4WnBKFCo*%Fsnak/hSS%+FP[f+P6j)?Y!kofE>Z/FCj)*;qM+hfE$*H!_lcuz!!$t'fEH,#FCT":"CGMIES(J-DfT]'F<(D7"*.slfEPi(DI[*sfE"t(!IEB!:"TMD=P*jc?YOCgAU%8Vz!!kjB!HQfnB%R,/fE#@3#@_UiCh7$m<r`4#!!"]u5hH9RD..NrBNG06zn3P\-:=oSEfEQD=EbTE(fE+`KfEc)3DI[d&Df42.=B>J&zn3>%sz!!!"#z!!)LRfEH%hC3FOJ$tj-nD.RftFCAWpA_7$/fE#L)"98E%zaqk$Xz!8-oY$6UH6+<VdL+>#0L>7(][+<VdL+<VdL+<VdL+<VdL+<VdL+<VdL/jL^20.JM*/hSb//hS7h+<VdL/hSb-/1N;$,:+[%5V<Bd+<VdL+<VdL+<VdL+<VdL+<VdL-n6>^+=o/o,:+W_-9sg]5UId*-nd5,0.84s,9nKZ,9nTb0.JG&/1r%f+<VdX0/"_#/d`^D+<VdL+<VdL+<VdL+<VdL+>52e/gWbJ5X7S"5X6VH+<W9b-9sg]-71&d-71uC5X7S"-6jog/1rP-/hSb//h//45X6_M+<W3[/d`^D+<VdL+<VdL+<VdL+<VdV0-Dko5X7S"5X7Ra+<W'Y/0H&X.OZVj5X7S"5UId*.P*1p+<VdL+<VdL+<VdL/hAJ#,:+`f5X6YG+<W-b$6UH6+<VdL+<VdL+<VdL+<rE[00hcf5X7Ra+=\]d+=nid0.ne/,:+Z`5X7R]-mh2E5X7S"5X7S"5X6PD/1rP-/hS\.-9sg]5X7S"5U[a-,mkb;+<VdL+<VdL+<VdL+<r!O,="LZ5X6eP5U@O*,:+rq-nHu%0.JM+0.JM*/2&D$5X7S"5X7S"5X7S",sX^\5X7S"5X6PH,="LZ5X7R]/g)GI+<VdL+<VdL+<VdL+<W<[+=9?=5X7S"5X6_D5U.C$-712h5X7S",;1B/5X7Rf,pb/p,sX^\5X7S",qhMK-7CDf+=o&p/hSb!+=\[&5X6P:.LI:@+<VdL+<VdL+<VmO+>,!+5X7S"5X7S"5X6kK-m_,D5X7RZ/g)8Z+=nj)5U/NZ-7U,j-9sg]5X6YI/gEVH5X6tL5X6VD5X7R]-nd,"-7g8m/.*LB+<VdL+<VdT0-DA[-pT++-7(!(5X6YL/0HK/,:GfB5X6kC+<VdL+<VdO5X6tR-9rn#00hcf5X6kH,:,T?5X7R_+<VdL+=]WA5X7R]/0uSp+>+!D+<VdL+<Vd[+<Vm^/0dDF5UI^(0/"P85X6tF,sX^\-9sg]-nZVb+<W3^5X6_M.PE7o+=09<.NfiV,sX^\5X7R\+<VdL+<VdT5X6YE.P<>+,pk5O+<VdL+<VdL+>5B$5X6YI+<W'Z5X6PF+<Vd[5VF62.OIDG5X6P@5X6V?,q(/f5UIs'00hcf5X7R]/g)B(5X6P@5X7R],pbfA5X7S"-7geu.R5X3$6UH6+<VdL+=/<d-9rdu/g`hK5U.C)5X7S",pklB5UJ-:+<VdX0.85%.P)\b/h\P:5X7S"5X7S"5V+B3-n[/!5X6PD-9sg]-mL,m/hSb]
=], 5))
h_funcs[23] = nil;
h_funcs["pre_calculated_xor_tbl"] = setmetatable({ })
h_funcs["meta_tbl_generate_xor"] = (function(a) end)
h_funcs["control_flow"] = 218
h_funcs["upvalues"] = {} -- upvalues maybe?
h_funcs["bit.lshift"] = (_bit and _bit.lshift);
h_funcs["metadata_stack"] = {} -- maybe alternative stack???
h_funcs["XOR"] = XOR or function(a, b) end;
h_funcs["bit.rshift"]= (_bit and _bit.rshift)
h_funcs["bit.rshift"]= h_funcs["bit.rshift"]or function(a, b) end;
h_funcs["unpack_crash_value"] = 2.147483648E9
h_funcs["2_power_tbl_calculated"] = { }
h_funcs["extract_bits"] = function(value, width, offset) end;
h_funcs["gBits8"] = (function() end)
h_funcs["gBits32"] = function() end;
h_funcs["gInt"] = function() end
h_funcs["pcall"] = Self._pcall;
h_funcs["setfenv"] = Self._setfenv;
h_funcs["gFloat"] = function() end;
h_funcs["decode_blob_128"] = function() end;
h_funcs["temp_Constants"] = nil;
h_funcs["type"] = Self._type;
h_funcs["decode_blob_128_wrapper"] = function() end
h_funcs["tostring"] = Self._tostring;
h_funcs["gString"] = function() end;
h_funcs["get_len_and_tbl"] = function(...) end;
h_funcs["getfenv"] = Self._getfenv;
h_funcs["temp_decoded_Instructions"] = nil;
h_funcs["VM"] = (function(EXECUTION_DATA, enclosing_func_upvalues)
call_vm = function(...)
end
return call_vm
end)
h_funcs["get_next_func_instructions"] = (function() end)
-- gen char tbl
for char = 0, 255 do
h_funcs["calcualted_char_tbl"][char] = string.char(char)
end;
for idx = 0, 15 do
setmetatable(h_funcs["pre_calculated_xor_tbl"][idx], h_funcs["meta_tbl_generate_xor"](idx))
end;
local Deserialize = (function() end)
local execute_first_arg = (function(...) end)
local deserialized_execution_data = h_funcs["VM"](Deserialize(), h_funcs["upvalues"])(Deserialize, Self.nil_self_index, h_funcs["get_self_index"], execute_first_arg, h_funcs["gFloat"], h_funcs["gBits8"], h_funcs["gBits32"], Self.control_flow_tbl_math, h_funcs[23], h_funcs["VM"])
return Self.__unpack({
h_funcs["VM"](deserialized_execution_data, h_funcs["upvalues"])
})
end)
As mentioned above, there is still some uncertainty regarding the code and the naming. I have assigned the names myself, and it is not guaranteed that all of them are correct.
Inspecting the Execution Flow
This is a superficial visual representation of the execution flow. We’ll dig further into the specifics in the next part:
First things first: the BLOB. This binary blob contains all the program data, everything from the instructions to the encrypted constants. It’s compressed, so the workflow begins by formatting and unpacking the blob. Next comes deserialization: the blob uses a custom data format, and the deserializer walks that format to extract the still-encrypted constants and a sequence of instructions intended for the first VM.
The first VM executes those deserialized instructions. Its initial actions are dynamic anti‑tamper checks, the details of which I’ll cover later in this post. If those checks pass, the first VM proceeds to decrypt the constants and the real instructions. The decrypted instructions are then passed to a second VM, the real VM. In short, the first VM is an extra security layer that validates and decrypts content for the second VM, which then executes the actual code.
Analyzing the VM
With a basic understanding of the pre-VM code, we can finally begin examining the VM itself. For this investigation, we want the obfuscated sample to be only marginally more complex than before, something that actually executes more than one instruction:
print("Hello World!")
You can find the obfuscated file here. This new code needs to perform 4 opcodes: first OP_GETGLOBAL to get a handle to the print function, then OP_LOADK to load the constant "Hello World!" onto the stack, then OP_CALL to call the function with the argument, and finally OP_RETURN.
The VM is easiest to analyze with a combination of static and dynamic analysis.
Defeating Dynamic Anti Beautify
When I attempted to run the beautified file, it immediately crashed with a Segmentation fault (core dumped). That felt oddly suspicious — I hadn’t changed anything, except the formatting. So this looks like an anti beautify trick that spits out fake errors to discourage an intruder.
Without knowing where the check is performed, a reliable way to find it is to hook standard library functions that an anti-beautify check might use to validate the code. The debug library would be an obvious candidate (it can inspect line numbers), but I didn’t enable Use Debug Library when obfuscating, and many runtime environments block debug anyway. A more likely target is string.gmatch, which is commonly used to scan text for patterns, for example to extract line numbers embedded in error messages.
Below is a minimal hook that preserves the original string.gmatch reference, replaces string.gmatch with a wrapper, and logs the inputs and matches. It is a straightforward way to see what the anti-beautify code is checking:
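-- Render a value as a quoted string of decimal escapes, e.g. "\72\105"
local function to_byte(str)
    str = tostring(str or "")
    return ("\"" .. str:gsub(".", function(b)
        return "\\" .. b:byte()
    end) .. "\"")
end

-- Keep a reference to the original before replacing it
local string_gmatch = string.gmatch
string.gmatch = function(str, pattern)
    print(to_byte(str), to_byte(pattern))
    -- Iterate once purely for logging
    for match in string_gmatch(str, pattern) do
        print(match)
    end
    print("\n")
    -- Hand a fresh iterator back to the caller
    return string_gmatch(str, pattern)
end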
Why this works: we first store the real string.gmatch in string_gmatch. Replacing string.gmatch with a new function would otherwise lose the original behavior; by keeping a saved reference, we can call through to the original implementation and still observe the inputs and outputs.
Running the hook revealed the anti‑beautify output strings being scanned. Example observed outputs:
'stdin:1: unexpected symbol near "file.lua:2690: attempt to index a nil value"'
'stdin:1: unexpected symbol near "file.lua:4256: attempt to index a nil value"'
'stdin:1: unexpected symbol near "file.lua:3496: attempt to call a nil value"'
The pattern the anti‑tamper check appears to use is:
':(%d+)[:\r\n]'
Which extracts the following matches from the strings:
2693
4259
3499
We now know that the code intentionally executes invalid instructions at lines 2693, 4259, and 3499, likely wrapped in pcall to prevent the program from crashing. These invalid instructions generate error messages, which leak the actual line numbers, allowing the anti‑beautify mechanism to check whether the code has been reformatted. It’s a clever way of getting the actual line numbers without using the debug library.
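The trick is easy to reproduce in a few lines. This sketch is my own reconstruction of the idea, not Luraph's code: it forces a runtime error inside pcall and pulls the line number out of the message with the same pattern that showed up in the hook output:

```lua
-- Deliberately fail inside pcall; the error message carries the line number.
local ok, message = pcall(function()
  local missing = nil
  return missing.field -- attempt to index a nil value
end)

-- Same pattern as the one observed in the hook output.
local line = tonumber(message:match(":(%d+)[:\r\n]"))

print(ok, message, line)
```

If the file is reformatted, the failing statement moves to a different line and the extracted number changes with it, which is all the check needs.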
At this point, we can confidently say we have identified the line checker. The behavior strongly suggests the use of pcall, since otherwise the program would have crashed on these invalid instructions. This can be confirmed by hooking pcall:
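local _pcall = pcall
pcall = function(func, ...)
    -- Capture the result count so nil return values are not dropped
    local function pack(...)
        return { n = select("#", ...), ... }
    end
    local ret = pack(_pcall(func, ...)) -- [1] = success, [2..n] = results or error
    for i = 1, ret.n do
        print(i, to_byte(ret[i]))
    end
    return unpack(ret, 1, ret.n)
end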
Examining the first return value from these pcalls reveals a massive 9 MiB blob:
"\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\32\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\
63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46\63\46"
This blob is the return of the last pcall, likely meant to overwrite the console output and hide prior traces. Looking at the pcall results corresponding to the earlier invalid instructions:
1 false
2 file.lua:2705: attempt to index a nil value
1 false
2 file.lua:4271: attempt to index a nil value
1 false
2 file.lua:3511: attempt to call a nil value
Here, the first value being false confirms that the code inside the pcall failed. Inspecting the actual lines in the code:
(...)[...] = nil -- line 2705
return (...)[...] -- line 4271
return (...)() -- line 3511
All of these are syntactically valid but guaranteed to fail at runtime when the value being indexed or called is nil, reinforcing that the errors are deliberate.
We can bypass this check by hooking pcall and sanitizing the error messages, replacing the actual line numbers with a small number like 1:
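local _pcall = pcall
pcall = function(func, ...)
    -- Capture the result count so nil return values are not dropped
    local function pack(...)
        return { n = select("#", ...), ... }
    end
    local ret = pack(_pcall(func, ...))
    if not ret[1] and type(ret[2]) == "string" then
        -- The extra parentheses discard gsub's second return value (the match count)
        ret[2] = (ret[2]:gsub(":(%d+)([:\r\n])", ":1:"))
    end
    return unpack(ret, 1, ret.n)
end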
After applying this bypass, the fake segmentation fault disappears, and the code can be executed without triggering the line checking anti beautify mechanism.
The Structure of the VM
All VMs share the same basic shape, simple but complete: a loop that fetches an instruction and executes its behavior. In Luraph’s case, the behavior is selected via nested if statements. In our sample, this is implemented in h_funcs["VM"]:
repeat
if Enum < 49 then
if not (Enum < 64) then
if Enum < 65 then
-- COMPRESSED
else
if Enum == 66 then
-- COMPRESSED
else
-- COMPRESSED
end
end
else
if not (Enum < 62) then
if Enum ~= 63 then
-- COMPRESSED
else
-- COMPRESSED
end
else
-- COMPRESSED
end
end
end
VIP = (VIP + 1)
until false
This is the loop that performs the logic. Enum is the opcode identifier: a number read from the instruction that tells the VM which opcode to perform. For example, an Enum value of 45 might correspond to OP_ADD, while 12 could mean OP_MOVE, etc.
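To make the structure concrete, here is a hypothetical three-opcode VM built the same way. The opcode numbers, the instruction layout, and the function name run_vm are invented for illustration and do not match Luraph's:

```lua
-- Invented opcode numbers for this sketch.
local OP_LOADK, OP_ADD, OP_RETURN = 1, 2, 3

local function run_vm(instructions, constants)
  local Stk, VIP = {}, 1
  repeat
    local inst = instructions[VIP]
    local Enum = inst[1]
    if Enum < 2 then
      -- OP_LOADK: Stk[A] = constants[B]
      Stk[inst[2]] = constants[inst[3]]
    else
      if Enum == 2 then
        -- OP_ADD: Stk[A] = Stk[B] + Stk[C]
        Stk[inst[2]] = Stk[inst[3]] + Stk[inst[4]]
      else
        -- OP_RETURN: hand Stk[A] back to the caller
        return Stk[inst[2]]
      end
    end
    VIP = VIP + 1
  until false
end

local result = run_vm({
  { OP_LOADK, 0, 1 },
  { OP_LOADK, 1, 2 },
  { OP_ADD, 2, 0, 1 },
  { OP_RETURN, 2 },
}, { 40, 2 })

print(result) -- 42
```

The nested comparisons form a binary search over the opcode number, which is why the real handler tree looks so deep.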
Finding out which Enum corresponds to which opcode requires some background knowledge of Lua’s internal VM. If you haven’t gone through Part 1 I recommend checking that out first. In that post, I cover where the opcode definitions can be found.
With that knowledge, we can start lifting the obfuscated Enum values back to their real opcode counterparts. Some OpCodes are easy to spot and translate directly back to the Lua OpCodes:
-- OP_LOADNIL
Stk[REG_B[VIP]] = nil
-- OP_NOT
Stk[REG_B[VIP]] = (not Stk[REG_A[VIP]])
-- OP_MUL
Stk[REG_B[VIP]] = Stk[REG_C[VIP]] * Stk[REG_A[VIP]]
-- OP_JMP
VIP = REG_B[VIP]
-- OP_CONCAT
Stk[REG_A[VIP]] = (Stk[REG_C[VIP]] .. Stk[REG_B[VIP]])
-- OP_LEN
Stk[REG_C[VIP]] = # Stk[REG_A[VIP]]
-- OP_DIV
Stk[REG_B[VIP]] = Stk[REG_C[VIP]] / constants[VIP]
-- OP_MOD
Stk[REG_A[VIP]] = Stk[REG_B[VIP]] % function_prototypes[VIP]
-- OP_CALL
Top = REG_C[VIP]
Stk[Top] = Stk[Top]()
-- OP_LT
if not (constants[VIP] < Stk[REG_B[VIP]]) then
VIP = REG_C[VIP]
end
And some are larger:
-- OP_CLOSURE
local closure_prototype = function_prototypes[VIP]
local closure_upvalue_data = closure_prototype[2]
local n_upvalues = #closure_upvalue_data;
local Upvalues = n_upvalues > 0 and {}
local new_closure_vm = h_funcs["VM"](closure_prototype, Upvalues);
setfenv(new_closure_vm, vm_env);
Stk[REG_A[VIP]] = new_closure_vm
if Upvalues then
for idx = 1, n_upvalues do
local upvalue = closure_upvalue_data[idx]
local upvalue_type = upvalue[2]
local upvalue_idx = upvalue[1]
if upvalue_type == 0 then
if not shared_upvalues then
shared_upvalues = {}
end;
local shared_upvalue = shared_upvalues[upvalue_idx]
if not shared_upvalue then
shared_upvalue = ({
[1] = upvalue_idx,
[2] = Stk
});
shared_upvalues[upvalue_idx] = shared_upvalue
end
Upvalues[idx - 1] = shared_upvalue
elseif upvalue_type == 1 then
Upvalues[idx - 1] = Stk[upvalue_idx]
else
Upvalues[idx - 1] = enclosing_func_upvalues[upvalue_idx]
end
end
end
And some OpCodes are custom, which don’t translate directly back to Lua:
--OP_XOR
Stk[REG_B[VIP]] = (h_funcs["XOR"](Stk[REG_C[VIP]], Stk[REG_A[VIP]]))
Therefore, lifting the OpCodes can take some time. It requires a mix of dynamic and static analysis that we’ll cover in the next part. A useful trick is controlling the input to the VM so you can predict which opcodes should appear and which shouldn’t. By feeding constrained or known inputs, you narrow the opcode space, making it much easier to spot standard Lua opcodes as well as custom ones.
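One way to exploit controlled inputs is to diff the opcode sets of two samples. The sketch below assumes you have already logged the Enum values executed by each sample into plain arrays; the traces shown are made up and the function name new_opcodes is mine:

```lua
-- Return the opcodes that occur in `trace` but not in `baseline`, sorted.
local function new_opcodes(baseline, trace)
  local seen = {}
  for _, enum in ipairs(baseline) do
    seen[enum] = true
  end
  local fresh, added = {}, {}
  for _, enum in ipairs(trace) do
    if not seen[enum] and not added[enum] then
      added[enum] = true
      fresh[#fresh + 1] = enum
    end
  end
  table.sort(fresh)
  return fresh
end

-- Made-up traces: the second sample adds a single arithmetic operation.
local baseline = { 59, 12, 59, 7 }
local with_add = { 59, 12, 45, 59, 7 }

print(table.concat(new_opcodes(baseline, with_add), ", ")) -- 45
```

If the only source change between the two samples is one addition, the handler that newly appears is a strong candidate for OP_ADD.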
Peeking into the First VM
By inspecting the constants of the first VM, simply dumping the contents of the constants table and then tracing which opcodes interact with them, we quickly notice that all constant accesses originate from the instruction where Enum == 59. Let’s examine that more closely:
Stk[REG_B[VIP]][function_prototypes[VIP]] = constants[VIP]
This behavior strongly resembles OP_LOADK, since it places a constant value onto the stack. That aligns perfectly with our expectation: before the VM can operate on constants, they must first be loaded into the stack frame.
Now, we can log every stack index being written during this instruction:
print(("Stk[%s][%s]"):format(REG_B[VIP], function_prototypes[VIP]))
This produces the following output:
Stk[11][KX]
Stk[11][Z]
Stk[11][W]
Stk[11][N]
Stk[11][A]
Stk[11][R]
Stk[11][x]
Stk[11][S]
Stk[11][B]
Stk[11][l]
Stk[11][E]
Stk[11][o]
Stk[11][_]
All writes target stack index 11, but with different sub-indexes. This behavior is odd because the opcode doesn’t fully match the standard OP_LOADK signature:
OP_LOADK, /* A Bx R(A) := Kst(Bx) */
The observed instruction writes into Stk[11][...] rather than a normal register slot, which suggests this is a custom opcode specific to the first VM. This is also validated by logging when the instruction fires. It always runs before the second VM is instantiated, confirming it’s part of the first VM runtime.
Putting the pieces together, index 11 appears to act like a hidden, embedded stack used during the first VM’s setup. My theory for now is that this slot is responsible for decrypting the real constants in preparation for handing off to the second, real VM, but at this point that is still unconfirmed.
My idea for investigating this was to log every read and write to Stk[11] so we could observe when and where the VM performs operations. I tried to wrap the slot with a proxy metatable and print on every access, but not without tripping yet another security feature, which triggered the Segmentation fault (core dumped).
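For reference, a logging proxy of this kind can be sketched as follows. The helper name make_proxy and the way events are collected are my own choices; the stand-in Stk table is just for demonstration:

```lua
-- Wrap a table so every read and write is logged before being forwarded.
local function make_proxy(target, label, log)
  return setmetatable({}, {
    __index = function(_, key)
      log(("read  %s[%s]"):format(label, tostring(key)))
      return target[key]
    end,
    __newindex = function(_, key, value)
      log(("write %s[%s] = %s"):format(label, tostring(key), tostring(value)))
      target[key] = value
    end,
  })
end

local events = {}
local Stk = { [11] = {} } -- stand-in for the VM stack
Stk[11] = make_proxy(Stk[11], "Stk[11]", function(line)
  events[#events + 1] = line
end)

Stk[11]["KX"] = 1
local _ = Stk[11]["KX"]
print(table.concat(events, "\n"))
```

The proxy itself stays empty, so it is not transparent: in Lua 5.1, next, pairs, rawget and the length operator all see an empty table. Whether Luraph's check relies on any of these differences is not established here.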
Hitting a Wall
There’s still an active dynamic protection aimed at the real VM. I tried to discover how reads from Stk[11] might be detected, but that line of investigation didn’t yield results. So I shifted focus: what is actually causing the crash?
Finding the crash
While tracing the first VM’s execution, I discovered what caused the crash:
pcall(unpack, {}, 0, 2147483647)
This was wrapped inside a VM, making it hard to find, but by hooking the unpack function and spotting the large 2147483647 argument, I knew I had found the crash. Invoking this on Lua 5.1 reliably produces a Segmentation fault (core dumped).
While digging deeper, dumping random stack indexes after various opcodes to see what values they hold, I stumbled across another layer of protection. Some opcodes themselves appear to be protected, meaning direct inspection can trigger a crash.
For example:
-- OP_CONCAT
Stk[REG_A[VIP]] = (Stk[REG_C[VIP]] .. Stk[REG_B[VIP]])
At first glance, this looks perfectly normal. But the moment I tried to do something as simple as:
print(Stk[REG_A[VIP]])
Immediately afterward, the VM would crash. This strongly suggests that certain opcodes carry built-in anti-tamper hooks. It appears to be applied only to sensitive opcodes like OP_CONCAT, which would otherwise likely leak the unencrypted constants.
Writing a Bypass
Creating a quick bypass for the segfault is straightforward:
local _unpack = unpack
unpack = function(...)
local args = {...}
-- calling unpack(..., 2147483647) triggers a crash; clamp it down by 1
if args[3] == 2147483647 then
return _unpack(args[1], args[2], 2147483646)
end
return _unpack(...)
end
What this does: the VM triggers the crash by calling unpack with an extreme third argument (2147483647). Our wrapper catches that pattern and replaces the end index with 2147483646, which is large but won’t crash, avoiding the segmentation fault. This fixes the immediate crash, but it does not make the VM run to completion.
Thinking like a Hacker
I’d been stuck for some time on this segmentation fault when reading values. Gotta think like a hacker. First idea:
First attempt
I started by logging the opcode streams for both VMs (the one that crashes and the one that runs). At a glance, the traces are identical, every opcode matches except for one: the crashing VM executes Enum == 39 where the non‑crashing VM executes Enum == 48. I then compared the entire sequence of opcodes leading up to that difference and found nothing. In short, a single opcode value differs, but the preceding context is identical.
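Comparing two long traces by eye is error-prone, so a small helper is useful. This is a sketch on made-up traces shaped like the observation above, identical except for one slot; the function name first_divergence is mine:

```lua
-- Return the first position where two opcode traces differ, plus both values.
-- Returns nil when the traces are identical.
local function first_divergence(a, b)
  local length = math.max(#a, #b)
  for index = 1, length do
    if a[index] ~= b[index] then
      return index, a[index], b[index]
    end
  end
  return nil
end

-- Made-up traces for illustration.
local crashing = { 59, 12, 39, 7 }
local working  = { 59, 12, 48, 7 }

print(first_divergence(crashing, working)) -- 3 39 48
```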
Second attempt
Knowing that print calls tostring in Lua 5.1, I tried to detect whether the VM hooks tostring to catch the printing of constants:
tostring = function(...)
return "modified"
end
print("original string") -- modified
The idea was that if the VM checks or sanitizes printed constants via tostring, this hook would reveal it. I saw no evidence that the VM was hooking tostring to detect our inspection.
Third attempt
Next, I modified all handlers that implement OP_EQ on the hypothesis that the VM might perform an equality test as a tamper-detection step. I logged the output of every OP_EQ and its arguments for both runs. Again: no meaningful difference between the crashing and non-crashing VM.
What’s Next?
After several weeks of trial and error, this particular sample still resists. My plan for the next part is to move to a different obfuscated script, one that includes floats, because that variant would let us peel away the large helper table. With that big table removed, we should be able to fully decompose the main function, finish unpacking the pre-VM logic, and make real progress on decrypting the BLOB.
This post is getting long, so I’m splitting the write-up into two. In the follow‑up, I’ll dig deeper into the new dynamic protections we’ve observed and show a manual devirtualization of the first VM that reveals its inner workings.
A Little Bonus
Here I’ll share some of the tips and tricks for analysis I’ve developed and refined over the years.
Dynamic Analysis of Functions
Sometimes it’s easier to understand a function by observing its runtime behavior rather than purely through static inspection. If we hadn’t already beautified the code and extracted a table of all helper functions, we could have dynamically discovered functions like this:
local functions = {}
setmetatable(functions, {
__index = function(tbl, key)
return key
end
})
local function dump_funcs(tbl_to_dump)
for k, v in pairs(tbl_to_dump) do
if type(v) == "function" then
functions[tostring(v)] = k
end
end
end
dump_funcs(_G)
dump_funcs(string)
dump_funcs(debug)
dump_funcs(math)
dump_funcs(table)
This builds a lookup table where a function reference points to its name. Normally, printing a function shows only a reference:
print(print) -- function: 0x5fae510b6fa0
With the lookup table, we can do:
for k, v in pairs(h_funcs) do
print(k, functions[tostring(v)])
end
Output example:
1 4.5035996273705e+15
2 error
3 9.007199254741e+15
4 1
5 next
7 match
8 table: 0x5be03abbd170
9 unpack
10 function: 0x5be03abbcb00
11 function: 0x5be03abb8410
12 function: 0x5be03abb8450
13 function: 0x5be03abb8490
14 sub
15 table: 0x5be03abb84d0
16 select
18 byte
19 setmetatable
20 4294967296
21 gsub
Dumping Non-Standard Functions
For functions that aren’t standard Lua library calls, like h_funcs[10], we can dump and decompile them:
local function dump_func(func)
local func_id = tostring(func):match("function: (.+)")
local bytecode = string.dump(func)
local filename = func_id .. ".luac"
local file = assert(io.open(filename, "wb"))
file:write(bytecode)
file:close()
print(filename)
return filename
end
dump_func(h_funcs[10])
Then, using Unluac:
$ java -jar unluac.jar 0x636a56e7aef0.luac
Output example:
return (...)[(...)]
Table Dumping Utility
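local function print_tbl(tbl, indent, seen)
    indent = indent or ""
    seen = seen or {}
    seen[tbl] = true -- remember visited tables so cycles (e.g. _G._G) terminate
    for k, v in pairs(tbl) do
        -- tostring() so boolean, table, or function keys don't break the concatenation
        print(indent .. tostring(k), v)
        if type(v) == "table" and not seen[v] then
            print_tbl(v, indent .. " ", seen)
        end
    end
end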
This is handy for recursively inspecting complex tables.
Hooking X to Track Calls
In this example, we’re hooking print:
local __print = print
print = function(...)
__print("Called from line", debug.getinfo(2, "l").currentline)
return __print(...)
end
This can help identify where print is called during execution.
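The same idea generalizes to any function stored in a table. Below is a small sketch of a reusable helper; the name hook and its callback signature are my own, and string.rep is used only as a harmless target:

```lua
-- Replace owner[name] with a wrapper that reports each call, then forwards it.
-- Returns a function that restores the original.
local function hook(owner, name, on_call)
  local original = owner[name]
  owner[name] = function(...)
    on_call(name, ...)
    return original(...)
  end
  return function()
    owner[name] = original
  end
end

local calls = {}
local unhook = hook(string, "rep", function(name, ...)
  calls[#calls + 1] = name .. "(" .. select("#", ...) .. " args)"
end)

print(string.rep("ab", 2)) -- abab
unhook()
print(calls[1]) -- rep(2 args)
```

Code that copied the function into a local or a table before the hook was installed keeps calling the original, so hooks like this need to be in place before the obfuscated script starts running.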