Added first batch of syntax for the pattern language

WerWolv 2021-01-07 23:15:14 +01:00
parent e7e3d50d82
commit ab5108ce06
1 changed files with 94 additions and 0 deletions

Pattern-Language-Guide.md Normal file

@@ -0,0 +1,94 @@
# Pattern Language
The Pattern Language is ImHex's custom-built programming language for creating binary patterns/templates. These patterns are applied to binary data in order to parse it and display the decoded values neatly in a tree hierarchy. The syntax follows the same style as other C-like languages and is therefore easy to read, understand, learn and use.
This document is meant as an overview of all the features the Pattern Language has.
## Table of Contents
## Built-in Types
Built-in types are the fundamental types used in the language. The language supports various unsigned, signed and floating point types as well as a few special types.
Unsigned Types: `u8`, `u16`, `u32`, `u64`, `u128`
Signed Types: `s8`, `s16`, `s32`, `s64`, `s128`
Floating point Types: `float`, `double`
Special Types: `char`, `bool`
Unsigned and signed types denote their size in bits in the name of the type. `s8` is 1 byte long, `u32` is 4 bytes long and so on.
Floating point types use the same sizes and encodings as the host system, which in most cases means 32 bit for `float` and 64 bit for `double`, both in IEEE 754 encoding.
The special types `char` and `bool` are both one byte long and for the most part behave the same as `s8` and `u8`. The only difference is that they produce more meaningful output in the Pattern Data View.
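The naming rule for integer types can be sketched in C++ as follows. This is only an illustration: `integerSizeInBytes` is a hypothetical helper, not part of ImHex, and it assumes that only the bit widths listed above are valid.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not an ImHex API): derives the size in bytes of an
// integer type from the bit width encoded in its name, e.g. "u32" -> 4.
// Returns 0 for any name that is not one of the listed integer types.
std::size_t integerSizeInBytes(const std::string& typeName) {
    // Integer type names start with 'u' (unsigned) or 's' (signed).
    if (typeName.size() < 2 || (typeName[0] != 'u' && typeName[0] != 's'))
        return 0;

    // Parse the decimal bit width that follows the prefix.
    std::size_t bits = 0;
    for (std::size_t i = 1; i < typeName.size(); ++i) {
        const char c = typeName[i];
        if (c < '0' || c > '9')
            return 0;
        bits = bits * 10 + static_cast<std::size_t>(c - '0');
    }

    // Only the widths listed in this guide are accepted.
    switch (bits) {
        case 8: case 16: case 32: case 64: case 128:
            return bits / 8;
        default:
            return 0;
    }
}
```

With this helper, `integerSizeInBytes("s8")` yields 1 and `integerSizeInBytes("u32")` yields 4, matching the sizes stated above.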
## Variable Placements
To extract values from binary data, variables need to be defined and placed at some offset within the data.
This is done using the following syntax:
```cpp
<type> <variableName> @ <expression>;
// Example
u32 headerMagic @ 0x00;
s8 type @ 0x1234;
```
Doing this will cause the 4 bytes at addresses `0x00` to `0x03` to be parsed as an unsigned 32 bit value and the 1 byte at offset `0x1234` to be parsed as a signed 8 bit value. These results will then be displayed in the Pattern Data View within ImHex.
![Variable Placement](https://puu.sh/H4LDd/7a505816c8.png)
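For readers who think in C++, a placement roughly corresponds to copying bytes out of a buffer at a given offset. The sketch below is an analogy rather than ImHex's implementation: `readAt` is a hypothetical helper, and it assumes the bytes are interpreted in host byte order, which this guide does not specify.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not an ImHex API): reads a value of type T that is
// "placed" at `offset` inside `data`.
// Assumption: bytes are interpreted in host byte order.
template <typename T>
T readAt(const std::vector<std::uint8_t>& data, std::size_t offset) {
    // Reject placements that would read past the end of the data.
    if (offset > data.size() || sizeof(T) > data.size() - offset)
        throw std::out_of_range("placement exceeds the data");

    T value;
    std::memcpy(&value, data.data() + offset, sizeof(T));
    return value;
}
```

Under these assumptions, `readAt<std::uint32_t>(data, 0x00)` plays the role of `u32 headerMagic @ 0x00;` and `readAt<std::int8_t>(data, 0x1234)` the role of `s8 type @ 0x1234;`.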
## Arrays
Arrays are used to parse a list of values that all share the same type and are placed contiguously in memory.
To place an array at a specific offset, the variable placement syntax is combined with the array syntax.
```cpp
<type> <variableName>[<expression>] @ <expression>;
// Example
u32 ids[0x100] @ 0x50;
```
This will cause a new branch node to appear which contains the decoded values of all entries within the array.
![Arrays](https://puu.sh/H4LJc/2702c5d94b.png)
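Because the entries are contiguous, entry `i` of an array placed at `offset` starts at `offset + i * sizeof(entry)`. A minimal C++ sketch of that idea follows; `readArray` is a hypothetical helper, not an ImHex API, and it again assumes host byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not an ImHex API): reads `count` contiguous values of
// type T starting at `offset`. Entry i lives at offset + i * sizeof(T).
// Assumption: bytes are interpreted in host byte order.
template <typename T>
std::vector<T> readArray(const std::vector<std::uint8_t>& data,
                         std::size_t offset, std::size_t count) {
    // Reject arrays that would extend past the end of the data.
    if (offset > data.size() || count > (data.size() - offset) / sizeof(T))
        throw std::out_of_range("array exceeds the data");

    std::vector<T> entries(count);
    if (count > 0)
        std::memcpy(entries.data(), data.data() + offset, count * sizeof(T));
    return entries;
}
```

In this analogy, `readArray<std::uint32_t>(data, 0x50, 0x100)` corresponds to `u32 ids[0x100] @ 0x50;` and covers `0x400` bytes.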
## Structs
Structs can be used to group multiple types together in order to form a new type. All members of the struct are placed right after each other in memory with no padding inserted between them. Therefore the size of the complete struct is the sum of the sizes of all its members.
```cpp
struct <typeName> {
    <variableDeclaration>
    ...
};
// Example
struct Header {
    u8 magic[4];
    u32 type;
    bool flag;
};
Header header @ 0x00;
```
This code creates a new type named `Header`, which can in turn be placed at any point in memory using the variable placement syntax. Structs can also be nested to create more complex types, each of which creates a new branch node in the Pattern Data View.
![Struct](https://puu.sh/H4LSb/071a6aacf4.png)
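The no-padding rule means each member's offset is simply the sum of the sizes of the members before it. The following C++ sketch works this out; `Layout` and `packedStructLayout` are hypothetical names used for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type: the offset of each member relative to the start
// of the struct, plus the total size of the struct.
struct Layout {
    std::vector<std::size_t> offsets;
    std::size_t size;
};

// Hypothetical helper (not an ImHex API): lays members out back to back with
// no padding, as described for structs in this guide.
Layout packedStructLayout(const std::vector<std::size_t>& memberSizes) {
    Layout layout{{}, 0};
    for (std::size_t memberSize : memberSizes) {
        layout.offsets.push_back(layout.size); // member starts where the previous one ended
        layout.size += memberSize;             // running total becomes the struct size
    }
    return layout;
}
```

For the `Header` example, the member sizes are 4 (`u8 magic[4]`), 4 (`u32 type`) and 1 (`bool flag`), so `packedStructLayout({4, 4, 1})` gives offsets 0, 4 and 8 and a total size of 9 bytes. A C or C++ compiler would typically pad the equivalent struct to 12 bytes unless told otherwise.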
## Unions
Syntactically, unions look and work exactly the same as structs. The difference is that all members are placed at the same address, on top of each other, whereas the members of a struct are placed one after another. This is the same behaviour as in C/C++. Therefore the size of the union is the size of the biggest member within the union.
```cpp
union <typeName> {
    <variableDeclaration>
    ...
};
// Example
union Color {
    u32 rgba;
    u8 components[4];
};
Color color @ 0x100;
```
![Union](https://puu.sh/H4LY2/0d40d2ac34.png)
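Since every member of a union starts at the same address, the size calculation reduces to taking the maximum of the member sizes. A small C++ sketch of that rule follows; `unionSize` is a hypothetical helper, not an ImHex API.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an ImHex API): all members overlap at offset 0,
// so the union is exactly as large as its biggest member.
std::size_t unionSize(const std::vector<std::size_t>& memberSizes) {
    std::size_t size = 0;
    for (std::size_t memberSize : memberSizes)
        size = std::max(size, memberSize);
    return size;
}
```

For the `Color` example, both `u32 rgba` and `u8 components[4]` are 4 bytes long, so `unionSize({4, 4})` is 4 and both members describe the same bytes at `0x100` to `0x103`.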