General IR - GIR

This page describes GIR, our in-house general-purpose intermediate representation. Compared with earlier intermediate languages, GIR concentrates on concisely expressing the logical relationships between variable symbols; semantics such as type inference and control flow are delegated to a subsequent "all-in-one" semantic analysis engine.

GIR Attributes Description
program name
body
namespace_decl name
body
namespace name {body}
comment_stmt data
package_stmt name Represents a package declaration statement, formatted as package name
import_stmt attrs
name
alias
Represents an import statement, formatted as import module_path or import module_path as alias

Note: module_path is a directory or file path
from_import_stmt attrs
source
name
alias
Notes on attrs:
- unit: name must be a file name, not a folder
- init: the target file must be initialized during import
export_stmt attrs
name
alias
Represents an export command: export <name> as <alias>
attrs identifies if it is export default (JS-only)
from_export_stmt attrs
module_path
name
alias
require_stmt target
name
Represents a require statement: target = require(name)
(PHP-only)
class_decl attrs
name
supers
static_init
init
fields
methods
nested
Represents a class declaration:
- attrs: properties like public, static, private
- name: class name
- supers: list of parent classes
- fields: member variables (each as variable_decl)
- methods: member functions (each as method_decl)
- nested: list of other nested declarations
- init/static_init: initialization blocks; init initializes the normal (instance) fields and static_init initializes the static fields


Example:
public class Name extends A implements B { int i = 1; }
is represented as:
{"class_decl": {"attrs": ["public"], "name": "Name", "supers": ["A", "B"], "fields": [{"variable_decl": {"data_type": "int", "name": "i"}}], "init": [this.i = 1]}}
record_decl attrs
name
supers
type_parameters
static_init
init
fields
methods
nested
type_parameters is a list of all type parameters; the other attributes are the same as in class_decl
interface_decl attrs
name
supers
type_parameters
static_init
init
fields
methods
nested
Same as record_decl
enum_decl attrs
name
supers
static_init
init
fields
methods
nested
Same as class_decl
annotation_type_decl attrs
name
static_init
init
fields
methods
nested
Same as class_decl
annotation_type_elements_decl attrs
data_type
name
value
Same as class_decl
struct_decl attrs
name
fields
Same as class_decl
parameter_decl attrs
data_type
name
default_value
Represents parameter declarations.
- data_type: the data type of parameter
- name: name of the parameter
- default_value: the default value

Example for int f(int a, int b = 4): parameters are [{"parameter_decl": {"data_type": "int", "name": "a"}}, {...}]
variable_decl attrs
data_type
name
Represents local/field declarations.
Example: signed int i = 10 is split into:
[{"variable_decl": {"attrs": "signed", "data_type": "int", "name": "i"}}, {"assign_stmt": {"target": "i", "operand": 10}}]
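The split of a declaration and its initializer can be sketched as follows. This is a minimal illustration only; lower_declaration is a hypothetical helper name, not part of GIR, and the dict layout simply mirrors the example above.

```python
def lower_declaration(data_type, name, attrs=None, initializer=None):
    """Split a declaration with an optional initializer into GIR-style statements.

    Hypothetical helper for illustration; the real front end may differ.
    """
    decl = {"data_type": data_type, "name": name}
    if attrs:
        decl["attrs"] = attrs
    stmts = [{"variable_decl": decl}]
    # The initializer never lives inside variable_decl; it becomes a
    # separate assign_stmt that follows the declaration.
    if initializer is not None:
        stmts.append({"assign_stmt": {"target": name, "operand": initializer}})
    return stmts


# signed int i = 10
stmts = lower_declaration("int", "i", attrs="signed", initializer=10)
```

A declaration without an initializer yields only the variable_decl entry.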
method_decl attrs
data_type
name
parameters
body
Represents function declarations.
- attrs: properties like public, static, private
- data_type: the data type of return value
- name: name of the method
- parameters: list of parameters, each a parameter_decl
- body: list of the statements inside the method
Example: public int f(int a) {} has attrs: "public", data_type: "int", name: "f".
Anonymous functions (e.g., Python lambda x: x+1) are converted to named temporary methods def tmp_method(x): return x+1.
assign_stmt data_type
target
operand
operator
operand2
Assignment statement:
target = operand [<operator> operand2]
Unary operation if operand2 is missing (e.g., a = -b)
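To make the three shapes of assign_stmt concrete (plain copy, unary, binary), here is a small sketch that renders a statement back to text. render_assign is a hypothetical name, and the assumption that a missing operator means a plain copy is ours, inferred from the examples on this page.

```python
def render_assign(stmt):
    """Render a GIR-style assign_stmt dict as 'target = ...' text (illustrative only)."""
    a = stmt["assign_stmt"]
    target, operand = a["target"], a["operand"]
    operator = a.get("operator")
    operand2 = a.get("operand2")
    if operator is None:
        # No operator: assumed to be a plain copy, e.g. e = %v3
        return f"{target} = {operand}"
    if operand2 is None:
        # Operator but no second operand: unary operation, e.g. a = -b
        return f"{target} = {operator}{operand}"
    # Both operands present: binary operation, e.g. %v1 = a + b
    return f"{target} = {operand} {operator} {operand2}"


unary = render_assign({"assign_stmt": {"target": "a", "operand": "b", "operator": "-"}})
binary = render_assign(
    {"assign_stmt": {"target": "%v1", "operand": "a", "operator": "+", "operand2": "b"}}
)
copy = render_assign({"assign_stmt": {"target": "e", "operand": "%v3"}})
```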
call_stmt target
name
positional_args
packed_positional_args
named_args
packed_named_args
data_type
prototype
Function call logic, formatted as target = name(args)
- target: return value of the method, always a temporary variable
- name: name of the called method
- positional_args: list of positional arguments
- packed_positional_args: unpacked positional arguments; mutually exclusive with positional_args
- named_args: list of keyword arguments
- packed_named_args: unpacked keyword arguments; mutually exclusive with named_args
- data_type: data type of the return value
- prototype: prototype of the called function; used for LLVM and Dalvik

Example for e = o.f(a, b, c + d):
1. %v1 = o.f
2. %v2 = c + d
3. %v3 = %v1(a, b, %v2) // positional_args:[a, b, %v2]
4. e = %v3

Example for f(a,b,c, d=3):
call_stmt, name:f, positional_args:[a,b,c], named_args:{d:3}

Example for f(a, b, c, *l, d, a = b, c = d):
1. %v0 = [a, b, c]
2. %v1 = %v0.update(l)
3. %v2 = %v1.append(d)
call_stmt, name:f, packed_positional_args:%v2, named_args:{a:b, c:d}
Unpacking is handled through the packed attributes (e.g., *l ends up in packed_positional_args).
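The flattening of nested call expressions into temporaries can be sketched with Python's standard ast module. This is a rough illustration under our own assumptions, not the actual GIR front end: the Lowerer class and lower function are hypothetical, o.f is emitted as a field_read (as in the with_stmt example on this page), constants are stored as their source text, and packed arguments (*l, **kw) are deliberately not covered.

```python
import ast

BINARY_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


class Lowerer:
    """Hypothetical helper: flattens a small Python subset into GIR-like dicts."""

    def __init__(self):
        self.stmts = []
        self.counter = 0

    def new_temp(self):
        self.counter += 1
        return f"%v{self.counter}"

    def lower_expr(self, node):
        """Emit statements for node and return the name holding its value."""
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)  # assumption: constants kept as source text
        if isinstance(node, ast.Attribute):
            receiver = self.lower_expr(node.value)
            target = self.new_temp()
            self.stmts.append({"field_read": {
                "target": target, "receiver_object": receiver, "field": node.attr}})
            return target
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            left = self.lower_expr(node.left)
            right = self.lower_expr(node.right)
            target = self.new_temp()
            self.stmts.append({"assign_stmt": {
                "target": target, "operand": left,
                "operator": BINARY_OPS[type(node.op)], "operand2": right}})
            return target
        if isinstance(node, ast.Call):
            name = self.lower_expr(node.func)
            positional = [self.lower_expr(arg) for arg in node.args]
            named = {}
            for kw in node.keywords:
                if kw.arg is None:
                    raise NotImplementedError("**kwargs unpacking is not covered")
                named[kw.arg] = self.lower_expr(kw.value)
            target = self.new_temp()  # the call result is always a temporary
            call = {"target": target, "name": name, "positional_args": positional}
            if named:
                call["named_args"] = named
            self.stmts.append({"call_stmt": call})
            return target
        raise NotImplementedError(type(node).__name__)

    def lower_stmt(self, node):
        if isinstance(node, ast.Expr):
            self.lower_expr(node.value)
        elif (isinstance(node, ast.Assign) and len(node.targets) == 1
              and isinstance(node.targets[0], ast.Name)):
            value = self.lower_expr(node.value)
            self.stmts.append({"assign_stmt": {
                "target": node.targets[0].id, "operand": value}})
        else:
            raise NotImplementedError(type(node).__name__)


def lower(source):
    lowerer = Lowerer()
    for node in ast.parse(source).body:
        lowerer.lower_stmt(node)
    return lowerer.stmts


stmts = lower("e = o.f(a, b, c + d)")
```

Sub-expressions are lowered before the call itself, so the temporaries are numbered in the same order as in the first example above.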
echo_stmt name PHP echo statement.
exit_stmt name PHP exit statement.
return_stmt name Returns a variable: return name
if_stmt condition
then_body
else_body
Example:
if (a + b > c) {}
%v1 = a + b
%v2 = %v1 > c
if (%v2) {...}
dowhile_stmt condition
body
Similar to if_stmt
while_stmt condition
body
else_body
Similar to if_stmt
for_stmt init_body
condition
condition_prebody
update_body
body
Traditional for loop, formatted as for (init_body; condition_prebody; condition; update_body) {}
- init_body: list of statements forming the initialization block
- condition_prebody: list of statements that compute the condition before it is tested
- condition: a variable holding the result of the condition
- update_body: list of statements executed at the end of every iteration

Example for for (int a = 1, b = 3; a + b < 10; a ++, b++) {}
for_stmt: [
  init_body: [
    variable_decl int a
    a = 1
    variable_decl int b
    b = 3
  ]
  condition_prebody: [
    %v1 = a + b
    %v2 = %v1 < 10
  ]
  condition: %v2
  update_body: [
    a = a + 1
    b = b + 1
  ]
  body : []
]
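The evaluation order implied by these attributes can be sketched with a toy interpreter. This is only an illustration under our own assumptions: run_for, run_body, and value_of are hypothetical helpers, string operands are treated as variable names, and only the + and < operators are supported.

```python
import operator

OPS = {"+": operator.add, "<": operator.lt}


def value_of(env, operand):
    # Assumption: strings name variables or temporaries; anything else is a literal.
    return env[operand] if isinstance(operand, str) else operand


def run_body(env, stmts):
    for stmt in stmts:
        if "variable_decl" in stmt:
            continue  # declarations carry no runtime effect in this sketch
        a = stmt["assign_stmt"]
        result = value_of(env, a["operand"])
        if "operator" in a:
            result = OPS[a["operator"]](result, value_of(env, a["operand2"]))
        env[a["target"]] = result


def run_for(env, for_stmt):
    """Run init_body once, then prebody -> condition -> body -> update_body."""
    iterations = 0
    run_body(env, for_stmt["init_body"])
    while True:
        run_body(env, for_stmt["condition_prebody"])
        if not env[for_stmt["condition"]]:
            break
        run_body(env, for_stmt["body"])
        run_body(env, for_stmt["update_body"])
        iterations += 1
    return iterations


def assign(target, operand, op=None, operand2=None):
    stmt = {"target": target, "operand": operand}
    if op is not None:
        stmt.update({"operator": op, "operand2": operand2})
    return {"assign_stmt": stmt}


# for (int a = 1, b = 3; a + b < 10; a++, b++) {}
for_stmt = {
    "init_body": [
        {"variable_decl": {"data_type": "int", "name": "a"}}, assign("a", 1),
        {"variable_decl": {"data_type": "int", "name": "b"}}, assign("b", 3),
    ],
    "condition_prebody": [assign("%v1", "a", "+", "b"), assign("%v2", "%v1", "<", 10)],
    "condition": "%v2",
    "update_body": [assign("a", "a", "+", 1), assign("b", "b", "+", 1)],
    "body": [],
}
env = {}
iterations = run_for(env, for_stmt)
```

Note that condition_prebody is re-run before every test of condition, which is why it is kept separate from init_body.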
forin_stmt attrs
data_type
name
receiver
body
Similar to for_stmt
- attrs: attributes of the iteration variable
- data_type: data type of the iteration variable
- name: the iteration variable
- receiver: the target variable
- body: list of statements

Formatted as for attrs data_type name in receiver {}
Iteration statement (e.g., for x in list).
forin receiver:list name:x
for_value_stmt attrs
data_type
name
receiver
body
Designed for JS for of and PHP foreach.
switch_stmt condition
body
switch(condition) {body}
case_stmt condition
body
case block inside switch.
default_stmt body default block inside switch.
break_stmt name break name
continue_stmt name continue name
goto_stmt name goto name
yield_stmt name yield name
throw_stmt name throw name
try_stmt body
catch_body
else_body
final_body
try {body} catch {catch_body} else {else_body} finally {final_body}
catch_stmt exception
body
catch block
label_stmt name Label declaration
asm_stmt target
data_type
attrs
data
extra
args
Inline assembly: target = attrs data(asm content)
assert_stmt condition assert condition
del_stmt receiver
name
Python del target
unset_stmt receiver
name
PHP unset
pass_stmt Empty statement (Python pass)
global_stmt name Python global name
nonlocal_stmt name Python nonlocal name
type_cast_stmt target
data_type
source
error
cast_action
Type casting: target = (data_type) source
If the cast fails, the failure is reported through the error attribute.
type_alias_decl data_type
name
type_parameters
Typedef: typedef int a is represented as name: a, data_type: int
with_stmt attrs
with_init
body
Context manager (e.g., Python async with ... as file).
- attrs: async, when the statement is an async with
- with_init: the initialization of the context manager
- body: statements inside the with_stmt

Example for async with aiofiles.open(filepath, 'r') as file:
        content = await file.read()
the GIR is:
{'with_stmt': {'attrs': ['async'],
'with_init': [{'field_read': {'target': '%v0',
'receiver_object': 'aiofiles',
'field': 'open'}},
{'call_stmt': {'target': '%v1',
'name': '%v0',
'args': ['filepath', "'r'"]}},
{'assign_stmt': {'target': 'file',
'operand': '%v1'}}],
'body': [{'field_read': {'target': '%v0',
'receiver_object': 'file',
'field': 'read'}},
{'call_stmt': {'target': '%v1',
'name': '%v0',
'args': []}},
{'await': {'target': '%v1'}},
{'variable_decl': {'data_type': None,
'name': 'content'}},
{'assign_stmt': {'target': 'content',
'operand': None}}]}}
unsafe_block body Rust unsafe block
block body Generic code block
block_start stmt_id
parent_stmt_id
Internal marker for block start
block_end stmt_id
parent_stmt_id
Internal marker for block end
new_array target
attrs
data_type
Array instantiation: target = attrs data_type[]
new_object target
attrs
data_type
args
Class instantiation: target = attrs new data_type(args)
new_record target
attrs
data_type
Dictionary instantiation
new_set target
attrs
data_type
Set instantiation
new_struct target
attrs
data_type
Struct instantiation
phi_stmt target
phi_values
phi_labels
LLVM-style phi node: target = [phi_value, phi_label]
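As a rough illustration of how a phi node selects its value, the sketch below assumes that phi_values and phi_labels are parallel lists (entry i of one pairs with entry i of the other). That pairing, and the eval_phi helper itself, are our assumptions rather than documented GIR behavior.

```python
def eval_phi(stmt, env, predecessor):
    """Pick the incoming value whose label matches the block we came from."""
    phi = stmt["phi_stmt"]
    for value, label in zip(phi["phi_values"], phi["phi_labels"]):
        if label == predecessor:
            # Look the value up as a variable; fall back to treating it as a literal.
            env[phi["target"]] = env.get(value, value)
            return env[phi["target"]]
    raise ValueError(f"no incoming value for block {predecessor!r}")


phi = {"phi_stmt": {"target": "%x",
                    "phi_values": ["%a", "%b"],
                    "phi_labels": ["then", "else"]}}
from_then = eval_phi(phi, {"%a": 1, "%b": 2}, "then")
from_else = eval_phi(phi, {"%a": 1, "%b": 2}, "else")
```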
mem_read target
address
Read from memory: target = *address
mem_write address
source
Write to memory: *address = source
array_write array
index
source
Array write: array[index] = source
array_read target
array
index
Array read: target = array[index] (e.g., a0 = result[0])
array_insert array
source
index
Insert into array at index
array_append array
source
Append to array: <array>.append(<source>)
array_extend array
source
Extend array: <array>.extend(<source>)
record_write receiver_object
key
value
Map write: record[key] = value
record_extend record
source
Map extend: <record>.update(<source>)
field_write receiver_object
field
source
Field write: receiver_object.field = source
field_read target
receiver_object
field
Field read: target = receiver_object.field
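The array, record, and field operations above map naturally onto Python lists, dicts, and objects. The following toy executor is a sketch under our own assumptions: execute and val are hypothetical names, and a string operand is treated as a variable only if it is bound in the environment.

```python
from types import SimpleNamespace


def val(env, x):
    # Assumption: a string bound in env is a variable; anything else is a literal.
    return env[x] if isinstance(x, str) and x in env else x


def execute(stmt, env):
    """Execute one GIR-style data-access statement against env (illustrative)."""
    (op, a), = stmt.items()
    if op == "array_read":
        env[a["target"]] = env[a["array"]][val(env, a["index"])]
    elif op == "array_write":
        env[a["array"]][val(env, a["index"])] = val(env, a["source"])
    elif op == "array_append":
        env[a["array"]].append(val(env, a["source"]))
    elif op == "array_extend":
        env[a["array"]].extend(val(env, a["source"]))
    elif op == "record_write":
        env[a["receiver_object"]][val(env, a["key"])] = val(env, a["value"])
    elif op == "record_extend":
        env[a["record"]].update(val(env, a["source"]))
    elif op == "field_write":
        setattr(env[a["receiver_object"]], a["field"], val(env, a["source"]))
    elif op == "field_read":
        env[a["target"]] = getattr(env[a["receiver_object"]], a["field"])
    else:
        raise NotImplementedError(op)


env = {"result": [10, 20], "extra": [30], "record": {},
       "more": {"b": 2}, "obj": SimpleNamespace()}
program = [
    {"array_read": {"target": "a0", "array": "result", "index": 0}},
    {"array_write": {"array": "result", "index": 1, "source": 99}},
    {"array_append": {"array": "result", "source": "a0"}},
    {"array_extend": {"array": "result", "source": "extra"}},
    {"record_write": {"receiver_object": "record", "key": "a", "value": 1}},
    {"record_extend": {"record": "record", "source": "more"}},
    {"field_write": {"receiver_object": "obj", "field": "x", "source": "a0"}},
    {"field_read": {"target": "%v1", "receiver_object": "obj", "field": "x"}},
]
for stmt in program:
    execute(stmt, env)
```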
slice_write array
source
start
end
step
Python slice write: array[start:end:step] = source
- start: the index at which the slice begins
- end: the index at which the slice stops
- step: the stride between selected elements
slice_read target
array
start
end
step
Python slice read: target = array[start:end:step]

Example for a = list[x:y:3]
{'slice_read': {'target': '%v1', 'array': 'list', 'start': 'x', 'end': 'y', 'step': '3'}}
{'assign_stmt': {'target': 'a', 'operand': '%v1'}}
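A slice_read can be evaluated with Python's built-in slice object, as the sketch below shows. eval_slice_read is a hypothetical helper, and the rule that an operand not bound in the environment is an integer literal (such as '3') is our assumption.

```python
def eval_slice_read(stmt, env):
    """Evaluate a GIR-style slice_read against env (illustrative only)."""
    s = stmt["slice_read"]

    def resolve(operand):
        if operand is None:
            return None  # a missing bound behaves like an omitted slice part
        if operand in env:
            return env[operand]
        return int(operand)  # assumption: unbound operands are integer literals

    bounds = slice(resolve(s.get("start")), resolve(s.get("end")), resolve(s.get("step")))
    env[s["target"]] = env[s["array"]][bounds]
    return env[s["target"]]


env = {"list": list(range(10)), "x": 1, "y": 8}
stmt = {"slice_read": {"target": "%v1", "array": "list",
                       "start": "x", "end": "y", "step": "3"}}
result = eval_slice_read(stmt, env)
```

With x = 1 and y = 8, the slice selects indices 1, 4, and 7.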
addr_of target
source
Address-of: target = &source
await_stmt target await statement
field_addr target
data_type
name
Field offset calculation (e.g., offsetof(struct address, name))

Example:
struct address {
  char name[50];
  char street[50];
  int phone;
};
offsetof(struct address, name);

This is converted to a field_addr with data_type: address and name: name.
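For readers who want to check what offsetof yields for this struct, the layout can be reproduced with Python's standard ctypes module. This is a side illustration, not part of GIR: the Address class mirrors the C struct above, and the make_field_addr helper and the %v1 temporary are hypothetical.

```python
import ctypes


class Address(ctypes.Structure):
    # Mirrors: struct address { char name[50]; char street[50]; int phone; };
    _fields_ = [
        ("name", ctypes.c_char * 50),
        ("street", ctypes.c_char * 50),
        ("phone", ctypes.c_int),
    ]


def make_field_addr(target, data_type, name):
    """Build a GIR-style field_addr dict (hypothetical helper)."""
    return {"field_addr": {"target": target, "data_type": data_type, "name": name}}


stmt = make_field_addr("%v1", "address", "name")

# ctypes field descriptors expose the byte offset that offsetof would return.
name_offset = Address.name.offset
street_offset = Address.street.offset
```

Here name sits at offset 0 and street directly after the 50-byte name array.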
switch_type_stmt condition
body
Type-based switch statement