General IR - GIR
This page describes our self-developed universal intermediate language. Compared to previous languages, GIR focuses more on concisely expressing logical relationships between variable symbols, while semantics like type inference, control flow, etc., are delegated to a subsequent "all-in-one" semantic analysis engine.
GIR | Attributes | Description |
---|---|---|
program | name body | |
namespace_decl | name body | namespace name {body} |
comment_stmt | data | |
package_stmt | name | Represents a package declaration statement, formatted as package name |
import_stmt | attrs name alias | Represents an import statement, formatted as import module_path or import module_path as alias. Note: module_path is a directory or file path. |
from_import_stmt | attrs source name alias | Notes on attrs: unit means name must be a file name, not a folder; init means the target file must be initialized during import. |
export_stmt | attrs name alias | Represents an export statement: export <name> as <alias>. attrs indicates whether it is an export default (JS-only). |
from_export_stmt | attrs module_path name alias | |
require_stmt | target name | Represents a require statement: target = require(name) (PHP-only). |
class_decl | attrs name supers static_init init fields methods nested | Represents a class declaration. attrs: modifiers such as public, static, private; name: class name; supers: list of parent classes; fields: member variables (each a variable_decl); methods: member functions (each a method_decl); nested: list of other nested declarations; init / static_init: initialization blocks, where init initializes normal fields and static_init initializes static fields. Example: public class Name extends A implements B { int i = 1; } is represented as: {"class_decl": {"attrs": ["public"], "name": "Name", "supers": ["A", "B"], "fields": [{"variable_decl": {"data_type": "int", "name": "i"}}], "init": [this.i = 1]}} |
record_decl | attrs name supers type_parameters static_init init fields methods nested | type_parameters is a list of all type parameters; the other attributes are the same as in class_decl. |
interface_decl | attrs name supers type_parameters static_init init fields methods nested | Same as record_decl |
enum_decl | attrs name supers static_init init fields methods nested | Same as class_decl |
annotation_type_decl | attrs name static_init init fields methods nested | Same as class_decl |
annotation_type_elements_decl | attrs data_type name value | Same as class_decl |
struct_decl | attrs name fields | Same as class_decl |
parameter_decl | attrs data_type name default_value | Represents a parameter declaration. data_type: data type of the parameter; name: name of the parameter; default_value: the default value. Example: for int f(int a, int b = 4), the parameters are [{"parameter_decl": {"data_type": "int", "name": "a"}}, {...}] |
variable_decl | attrs data_type name | Represents a local variable or field declaration. Example: signed int i = 10 is split into: [{"variable_decl": {"attrs": "signed", "data_type": "int", "name": "i"}}, {"assign_stmt": {"target": "i", "operand": 10}}] |
method_decl | attrs data_type name parameters body | Represents a function declaration. attrs: modifiers such as public, static, private; data_type: data type of the return value; name: name of the method; parameters: list of parameters, each a parameter_decl; body: list of the statements inside the method. Example: public int f(int a) {} has attrs: "public", data_type: "int", name: "f". Anonymous functions (e.g., Python lambda x: x+1) are converted to named temporary methods: def tmp_method(x): return x+1. |
assign_stmt | data_type target operand operator operand2 | Assignment statement: target = operand [<operator> operand2]. If operand2 is missing, the statement is a unary operation (e.g., a = -b). |
call_stmt | target name positional_args packed_positional_args named_args packed_named_args data_type prototype | Function call, formatted as target = name(args). target: return value of the call, always a temporary variable; name: name of the called method; positional_args: list of positional arguments; packed_positional_args: unpacked positional arguments, mutually exclusive with positional_args; named_args: list of keyword arguments; packed_named_args: unpacked keyword arguments, mutually exclusive with named_args; data_type: data type of the return value; prototype: prototype of the called function, used for LLVM and Dalvik. Example for e = o.f(a, b, c + d): 1. %v1 = o.f 2. %v2 = c + d 3. %v3 = %v1(a, b, %v2) // positional_args: [a, b, %v2] 4. e = %v3. Example for f(a, b, c, d=3): call_stmt, name: f, positional_args: [a, b, c], named_args: {d: 3}. Example for f(a, b, c, *l, d, a = b, c = d), where the unpacked *l forces the use of packed_positional_args: 1. %v0 = [a, b, c] 2. %v1 = %v0.update(l) 3. %v2 = %v1.append(d), followed by call_stmt, name: f, packed_positional_args: %v2, named_args: {a: b, c: d}. |
echo_stmt | name | PHP echo statement. |
exit_stmt | name | PHP exit statement. |
return_stmt | name | Returns a variable: return name |
if_stmt | condition then_body else_body | Example: if (a + b > c) {} → %v1 = a + b; %v2 = %v1 > c; if (%v2) {...} |
dowhile_stmt | condition body | Similar to if_stmt |
while_stmt | condition body else_body | Similar to if_stmt |
for_stmt | init_body condition condition_prebody update_body body | Traditional for loop, formatted as for (init_body; condition_prebody; condition; update_body) {}. init_body: list of statements forming the initialization block; condition_prebody: list of statements that must run before the condition is evaluated; condition: a variable; update_body: list of statements executed at the end of every iteration. Example for for (int a = 1, b = 3; a + b < 10; a++, b++) {}: for_stmt: [ init_body: [ variable_decl int a; a = 1; variable_decl int b; b = 3 ], condition_prebody: [ %v1 = a + b; %v2 = %v1 < 10 ], condition: %v2, update_body: [ a = a + 1; b = b + 1 ], body: [] ] |
forin_stmt | attrs data_type name receiver body | Similar to for_stmt, formatted as for attrs data_type name in receiver {}. attrs: attributes of the iteration variable; data_type: data type of the iteration variable; name: the iteration variable; receiver: the variable being iterated over; body: list of statements. Example: for x in list becomes a forin_stmt with receiver: list, name: x. |
for_value_stmt | attrs data_type name receiver body | Designed for JS for...of and PHP foreach. |
switch_stmt | condition body | switch(condition) {body} |
case_stmt | condition body | case block inside switch. |
default_stmt | body | default block inside switch . |
break_stmt | name | break name |
continue_stmt | name | continue name |
goto_stmt | name | goto name |
yield_stmt | name | yield name |
throw_stmt | name | throw name |
try_stmt | body catch_body else_body final_body | try {body} catch {catch_body} else {else_body} finally {final_body} |
catch_stmt | exception body | catch block |
label_stmt | name | Label declaration |
asm_stmt | target data_type attrs data extra args | Inline assembly: target = attrs data(asm content) |
assert_stmt | condition | assert condition |
del_stmt | receiver name | Python del target |
unset_stmt | receiver name | PHP unset |
pass_stmt | | Empty statement (Python pass) |
global_stmt | name | Python global target |
nonlocal_stmt | name | Python nonlocal target |
type_cast_stmt | target data_type source error cast_action | Type casting: target = (data_type) source. If the cast fails, an error is reported. |
type_alias_decl | data_type name type_parameters | Typedef: typedef int a → name: a, data_type: int |
with_stmt | attrs with_init body | Context manager (e.g., Python async with ... as file). attrs: always async; with_init: the initialization of the context manager; body: statements inside the with_stmt. Example: for async with aiofiles.open(filepath, 'r') as file: content = await file.read(), the GIR is: {'with_stmt': {'attrs': ['async'], 'with_init': [{'field_read': {'target': '%v0', 'receiver_object': 'aiofiles', 'field': 'open'}}, {'call_stmt': {'target': '%v1', 'name': '%v0', 'args': ['filepath', "'r'"]}}, {'assign_stmt': {'target': 'file', 'operand': '%v1'}}], 'body': [{'field_read': {'target': '%v0', 'receiver_object': 'file', 'field': 'read'}}, {'call_stmt': {'target': '%v1', 'name': '%v0', 'args': []}}, {'await': {'target': '%v1'}}, {'variable_decl': {'data_type': None, 'name': 'content'}}, {'assign_stmt': {'target': 'content', 'operand': None}}]}} |
unsafe_block | body | Rust unsafe block |
block | body | Generic code block |
block_start | stmt_id parent_stmt_id | Internal marker for block start |
block_end | stmt_id parent_stmt_id | Internal marker for block end |
new_array | target attrs data_type | Array instantiation: target = attrs data_type[] |
new_object | target attrs data_type args | Class instantiation: target = attrs new data_type(args) |
new_record | target attrs data_type | Dictionary instantiation |
new_set | target attrs data_type | Set instantiation |
new_struct | target attrs data_type | Struct instantiation |
phi_stmt | target phi_values phi_labels | LLVM-style phi node: target = [phi_value, phi_label] |
mem_read | target address | Read from memory: target = *address |
mem_write | address source | Write to memory: *address = source |
array_write | array index source | Array write: array[index] = source |
array_read | target array index | Array read: target = array[index] (e.g., a0 = result[0]) |
array_insert | array source index | Insert source into array at index |
array_append | array source | Append to array: <array>.append(<source>) |
array_extend | array source | Extend array: <array>.extend(<source>) |
record_write | receiver_object key value | Map write: receiver_object[key] = value |
record_extend | record source | Map extend: <record>.update(<source>) |
field_write | receiver_object field source | Field write: receiver_object.field = source |
field_read | target receiver_object field | Field read: target = receiver_object.field |
slice_write | array source start end step | Python slice write: array[start:end:step] = source. start: the index at which the slice begins; end: the index at which the slice stops; step: the stride between selected elements. |
slice_read | target array start end step | Python slice read: target = array[start:end:step]. Example: a = list[x:y:3] becomes {'slice_read': {'target': '%v1', 'array': 'list', 'start': 'x', 'end': 'y', 'step': '3'}} {'assign_stmt': {'target': 'a', 'operand': '%v1'}} |
addr_of | target source | Address-of: target = &source |
await_stmt | target | await statement |
field_addr | target data_type name | Field offset calculation (e.g., offsetof(struct address, name)). Example: given struct address { char name[50]; char street[50]; int phone; }; the expression offsetof(struct address, name) is converted to a field_addr whose target holds the result, with data_type: address, name: name. |
switch_type_stmt | condition body | Type-based switch statement |
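
Several rows above (call_stmt, if_stmt, for_stmt) show nested expressions being flattened into temporaries such as %v1 before the main statement is emitted. The following is a minimal sketch of that flattening, not the project's actual front end: it uses Python's standard ast module as a stand-in parser, and the Lowerer class, its method names, and the lower_assignment helper are hypothetical names introduced only for illustration. Temporaries are numbered from %v1 to mirror the call_stmt example, and only names, constants, attribute reads, binary operations, and plain calls are handled.

```python
import ast
from itertools import count


class Lowerer:
    """Toy lowering of a Python expression into GIR-like dict statements.

    Hypothetical helper for illustration only; it covers a small subset
    of expressions and is not the real GIR generator.
    """

    # Illustrative subset of binary operators.
    OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

    def __init__(self):
        self.stmts = []        # emitted GIR-like statements, in order
        self._ids = count(1)   # temporaries are named %v1, %v2, ...

    def _tmp(self):
        return f"%v{next(self._ids)}"

    def lower_expr(self, node):
        """Return a variable name or literal holding the value of `node`,
        emitting one statement per nested sub-expression."""
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Attribute):
            receiver = self.lower_expr(node.value)
            target = self._tmp()
            self.stmts.append({"field_read": {
                "target": target, "receiver_object": receiver, "field": node.attr}})
            return target
        if isinstance(node, ast.BinOp):
            left = self.lower_expr(node.left)
            right = self.lower_expr(node.right)
            target = self._tmp()
            self.stmts.append({"assign_stmt": {
                "target": target, "operand": left,
                "operator": self.OPS[type(node.op)], "operand2": right}})
            return target
        if isinstance(node, ast.Call):
            # Unpacking (*args / **kwargs) would need the packed_* attributes;
            # this sketch deliberately leaves it out.
            if any(isinstance(a, ast.Starred) for a in node.args) or \
                    any(kw.arg is None for kw in node.keywords):
                raise NotImplementedError("argument unpacking is not covered")
            name = self.lower_expr(node.func)
            call = {"name": name,
                    "positional_args": [self.lower_expr(a) for a in node.args]}
            if node.keywords:
                call["named_args"] = {
                    kw.arg: self.lower_expr(kw.value) for kw in node.keywords}
            target = self._tmp()
            self.stmts.append({"call_stmt": {"target": target, **call}})
            return target
        raise NotImplementedError(type(node).__name__)


def lower_assignment(source):
    """Lower a single `name = expression` statement into a statement list."""
    stmt = ast.parse(source).body[0]
    if not (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)):
        raise ValueError("expected a simple assignment such as 'e = f(a)'")
    lowerer = Lowerer()
    value = lowerer.lower_expr(stmt.value)
    lowerer.stmts.append(
        {"assign_stmt": {"target": stmt.targets[0].id, "operand": value}})
    return lowerer.stmts


if __name__ == "__main__":
    for gir_stmt in lower_assignment("e = o.f(a, b, c + d)"):
        print(gir_stmt)
```

Running the sketch on e = o.f(a, b, c + d) emits a field_read for o.f, an assign_stmt for c + d, a call_stmt whose name is the first temporary, and a final assign_stmt into e, in the same order as the four numbered steps in the call_stmt row.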