# Development information

This file aims to introduce developers to conventions for working on the code
base of radare2.

The GitHub issues page contains a list of all the bugs that have been reported,
with labels to classify them by difficulty, type, milestone, etc. It is a good
place to start if you are looking to contribute.

For information about the git process, see
[CONTRIBUTING.md](CONTRIBUTING.md#How_to_contribute).

## Documentation

Functions should have descriptive names and parameters. It should be clear what
the function and its arguments do from the declaration. Comments should be used
to explain purpose or to clarify something that is relatively complicated or
may not be immediately apparent.

```c
/* Find the min and max addresses in an RList of maps. Returns (max-min)/width. */
static int findMinMax(RList *maps, ut64 *min, ut64 *max, int skip, int width);
```

## Error diagnosis

There are several utilities that can be used to diagnose errors in r2, whether
they are related to memory (segfaults, uninitialized read, etc.) or problems
with features.

### Compilation options

* `sys/sanitize.sh`: Compile with ASan, the address sanitizer. Provides
  detailed backtraces for memory errors.
* `R2_DEBUG_ASSERT=1`: Provides a backtrace when a debug assert (typically a
  `r_return_` macro) fails.
* `R2_DEBUG=1`: Show error messages and crash signal. Used for debugging plugin
  loading issues.

### ABI stability and versioning

During ABI-stable seasons [x.y.0-x.y.8], breaking the ABI is not allowed. This
is checked in the CI using the `abidiff` tool on every commit. Sometimes keeping
the ABI/API stable implies doing ugly hacks. Those must be marked with a
reference to the next MAJOR.MINOR release of r2.

For example, during the development of 5.8.x we add a comment or use `#if R2_590`
code blocks to specify which lines need to be changed once 5.8.9 is in git.
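
As a minimal sketch of such a marker, the following shows an ABI-preserving workaround guarded by `#if R2_590`. The `DemoItem` struct, its fields, and the fallback definition of `R2_590` are hypothetical and exist only to make the example compile on its own; in the r2 tree the macro would come from the project headers.

```c
#include <stdbool.h>

/* Assumption: outside the r2 tree the macro is undefined, so default to 0
 * (meaning "still in the ABI-stable 5.8.x season"). */
#ifndef R2_590
#define R2_590 0
#endif

/* Hypothetical public struct whose layout must not change during 5.8.x. */
typedef struct demo_item_t {
	int id;
#if R2_590
	bool enabled; /* proper field, allowed once the ABI may change */
#else
	int flags; /* ABI-stable hack: bit 0 doubles as "enabled" until 5.9.0 */
#endif
} DemoItem;

static bool demo_item_is_enabled(const DemoItem *it) {
#if R2_590
	return it->enabled;
#else
	return (it->flags & 1) != 0;
#endif
}
```

Keeping both variants side by side makes the pending cleanup easy to find with `grep R2_590` when the next MAJOR.MINOR cycle opens.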

Only the even patch version numbers are considered a release. This means that if you have
an odd patch version of r2, it was built from git instead of the release tarball or binaries.

For more details, read [doc/abi.md](doc/abi.md).

### Useful macros from [r\_types.h](libr/include/r_types.h)

* `EPRINT_*`: Allows you to quickly add or remove a debug print without
  worrying about format specifiers.

#### Parameter marking

r2 provides several empty macros to make function signatures more informative.

* `R_OUT`: Parameter is output - written to instead of read.
* `R_INOUT`: Parameter is read/write.
* `R_OWN`: Pointer ownership is transferred from the caller.
* `R_BORROW`: The caller retains ownership of the pointer - the receiver must
  not free it.
* `R_NONNULL`: Pointer must not be null.
* `R_NULLABLE`: Pointer may be null.
* `R_DEPRECATE`: Do not use in new code; it will be removed in the future.
* `R_IFNULL(x)`: Default value for a pointer when null.
* `R_UNUSED`: Not used.
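
The sketch below illustrates how these markers might read in a signature. The macros are redefined locally as empty stand-ins so the example compiles outside the r2 tree, and `demo_byte_range` is a hypothetical helper, not an r2 function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins (assumption): in radare2 these come from r_types.h. */
#ifndef R_OUT
#define R_OUT
#define R_BORROW
#define R_NONNULL
#define R_NULLABLE
#endif

/* Hypothetical helper: finds the lowest and highest byte in buf.
 * buf is only borrowed, min must be valid, max is optional. */
static bool demo_byte_range(R_BORROW R_NONNULL const unsigned char *buf, size_t len,
		R_OUT R_NONNULL unsigned char *min, R_OUT R_NULLABLE unsigned char *max) {
	size_t i;
	unsigned char lo, hi;
	if (len == 0) {
		return false;
	}
	lo = hi = buf[0];
	for (i = 1; i < len; i++) {
		if (buf[i] < lo) {
			lo = buf[i];
		}
		if (buf[i] > hi) {
			hi = buf[i];
		}
	}
	*min = lo;
	if (max) {
		*max = hi;
	}
	return true;
}
```

Because the markers expand to nothing, they cost nothing at runtime; their value is that a reader can tell from the declaration alone which pointers are written to and which may be `NULL`.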

## Code style

### C

In order to contribute patches or plugins, we encourage you to use the same
coding style as the rest of the code base.

* Please use `./sys/clang-format-diff.py` before submitting a PR to be sure you
  are following the coding style, as described in
  [CONTRIBUTING.md](CONTRIBUTING.md#Getting Started). If you find a bug in this
  script, please submit a bug report issue. A detailed style guide can be found
  below.

* See `sys/indent.sh` for indenting your code automatically.

* A pre-commit hook to check coding style is located at
  `sys/pre-commit-indent.sh`. You can install it by copying it to
  `.git/hooks/pre-commit`. To preserve your existing pre-commit hook, use
  `cat sys/pre-commit-indent.sh >> .git/hooks/pre-commit` instead.

* For a premade `.vimrc`, see `doc/vim`.

* See `.clang-format` for work-in-progress support for automated indentation.

#### Guidelines

The following guidelines apply to code that we must maintain. Generally, they
will not apply to copy-paste external code that will not be touched.

* Tabs are used for indentation. In a switch statement, the cases are indented
  at the switch level.

* Switch-cases where local variables are needed should be refactored into
  separate functions instead of using braces. If braced scope syntax is used,
  put `break;` statements inside the scope.

```c
switch (n) {
case 1:
	break;
case 2: {
	break;
}
default:
	break;
}
```

* Lines should be at most 140 characters in length, counting tabs as 8
  characters. Originally this limit was 78, which is still considered good
  practice. Try to keep your functions short and readable, with a minimal
  number of function arguments and little nesting.

* Braces open on the same line as the for/while/if/else/function/etc. Closing
  braces are put on a line of their own, except in the else of an if statement
  or in the while of a do-while statement.

```c
if (a == b) {
	...
}

if (a == b) {
	...
} else if (a > b) {
	...
}

if (a == b) {
	...
} else {
	do_something_else ();
}

do {
	do_something ();
} while (cond);

if (a == b) {
	b = 3;
}
```

* Always use braces for if and while.

```diff
-if (a == b)
-	return;
+if (a == b) {
+	return;
+}
```

* In general, avoid `goto`. The `goto` statement only comes in handy when a
  function exits from multiple locations and some common work such as cleanup
  has to be done. If there is no cleanup needed, then just return directly.

  Choose label names which say what the `goto` does or why it exists. An
  example of a good name could be `out_buffer:` if the `goto` frees `buffer`.
  Avoid using GW-BASIC names like `err1:` and `err2:`.
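
A minimal sketch of this pattern, assuming a hypothetical helper `demo_compare_copies` that needs two scratch buffers: early exits that have nothing to release return directly, while the one failure path that must free memory jumps to a label named after what it frees.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns true if the first len bytes of a and b match
 * after both are copied into scratch buffers. */
static bool demo_compare_copies(const char *a, const char *b, size_t len) {
	bool ret = false;
	char *buffer_a, *buffer_b;

	if (len == 0) {
		/* nothing allocated yet, so return directly */
		return false;
	}
	buffer_a = malloc (len);
	if (!buffer_a) {
		return false;
	}
	buffer_b = malloc (len);
	if (!buffer_b) {
		/* only buffer_a exists at this point */
		goto out_buffer_a;
	}
	memcpy (buffer_a, a, len);
	memcpy (buffer_b, b, len);
	ret = memcmp (buffer_a, buffer_b, len) == 0;

	free (buffer_b);
out_buffer_a:
	free (buffer_a);
	return ret;
}
```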

* Use `r_return_*` macros to check for conditions that are caused by
  programming errors or bugs, i.e., conditions that should **never** happen. Do
  not use them when checking for runtime error conditions, such as a `NULL`
  value being returned from `malloc()`. Use a standard if statement for these
  cases.

```c
int check(RCore *c, int a, int b) {
	/* check for programming errors */
	r_return_val_if_fail (c, false);
	r_return_val_if_fail (a >= 0 && b >= 1, false);

	/* check for runtime errors */
	ut8 *buf = calloc (b, sizeof (a));
	if (!buf) {
		return -1;
	}

	/* continue... */
}
```

* Use spaces after keywords and around operators.

```c
a = b + 3;
a = (b << 3) * 5;
a = sizeof (b) * 4;
```

* Multiline ternary operator conditionals are indented in JavaScript style:

```diff
-ret = over ?
-	r_debug_step_over (dbg, 1) :
-	r_debug_step (dbg, 1);
+ret = over
+	? r_debug_step_over (dbg, 1)
+	: r_debug_step (dbg, 1);
```

* When breaking up a long line, use a single additional tab if the current and
  next lines are aligned. Do not align start of line using spaces.

```diff
-x = function_with_long_signature_and_many_args (arg1, arg2, arg3, arg4, arg5,
-                                                arg6, arg7, arg8);
-y = z;
+x = function_with_long_signature_and_many_args (arg1, arg2, arg3, arg4, arg5,
+	arg6, arg7, arg8);
+y = z;
```

* Use two additional tabs if the next line is indented to avoid confusion with
  control flow.

```diff
 if (function_with_long_signature_and_many_args (arg1, arg2, arg3, arg4, arg5,
-	arg6, arg7, arg8)) {
+		arg6, arg7, arg8)) {
 	do_stuff ();
 }
```

* When following the above guideline, if additional indentation is needed on
  consecutive lines, use a single tab for each nested level. Avoid heavy
  nesting in this manner.

```diff
 if (condition_1 && condition_2 && condition_3
 		&& (condition_4
-		|| condition_5)) {
+			|| condition_5)) {
 	do_stuff ();
 }
```

* Split long conditional expressions into small `static inline` functions to
  make them more readable.

```diff
+static inline bool inRange(RBreakpointItem *b, ut64 addr) {
+	return (addr >= b->addr && addr < (b->addr + b->size));
+}
+
+static inline bool matchProt(RBreakpointItem *b, int rwx) {
+	return (!rwx || (rwx & b->rwx));
+}
+
 R_API RBreakpointItem *r_bp_get_in(RBreakpoint *bp, ut64 addr, int rwx) {
 	RBreakpointItem *b;
 	RListIter *iter;
 	r_list_foreach (bp->bps, iter, b) {
-		if (addr >= b->addr && addr < (b->addr+b->size) && \
-			(!rwx || rwx&b->rwx)) {
+		if (inRange (b, addr) && matchProt (b, rwx)) {
 			return b;
 		}
 	}
 	return NULL;
 }
```

* Use `R_API` to mark exportable (public) methods for module APIs.

* Use `R_IPI` to mark functions internal to a library.

* Other functions should be `static` to avoid polluting the global namespace.

* The structure of C files in r2 should be as follows:

```c
/* Copyright ... */           // copyright
#include <r_core.h>           // includes
static int globals            // const, define, global variables
static void helper(void) {}   // static functions
R_IPI void internal(void) {}  // internal apis (used only inside the library)
R_API void public(void) {}    // public apis starting with constructor/destructor
```

* Why do we return `int` instead of `enum`?

  The reason why many r2 functions return int instead of an enum type is
  because enums can't be OR'ed; additionally, it breaks the usage within a
  switch statement and swig can't handle it.

```
r_core_wrap.cxx:28612:60: error: assigning to 'RRegisterType' from incompatible type 'long'
arg2 = static_cast< long >(val2); if (arg1) (arg1)->type = arg2; resultobj = SWIG_Py_Void(); return resultobj; fail:
^ ~~~~
r_core_wrap.cxx:32103:61: error: assigning to 'RDebugReasonType' from incompatible type 'int'
arg2 = static_cast< int >(val2); if (arg1) (arg1)->type = arg2; resultobj = SWIG_Py_Void(); return resultobj; fail:
^ ~~~~
```
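
To illustrate the OR'ing point, here is a small sketch in which flag bits are plain integer constants and the function returns `int`. The `DEMO_PERM_*` names and `demo_perm_from_string` are hypothetical and only stand in for the kind of bit flags r2 passes around.

```c
/* Hypothetical permission bits, kept as plain ints so they can be combined. */
#define DEMO_PERM_R 4
#define DEMO_PERM_W 2
#define DEMO_PERM_X 1

/* Parse a string such as "rx" into OR'ed permission bits. */
static int demo_perm_from_string(const char *s) {
	int perm = 0;
	for (; *s; s++) {
		switch (*s) {
		case 'r':
			perm |= DEMO_PERM_R;
			break;
		case 'w':
			perm |= DEMO_PERM_W;
			break;
		case 'x':
			perm |= DEMO_PERM_X;
			break;
		default:
			break;
		}
	}
	return perm;
}
```

A combined value such as `DEMO_PERM_R | DEMO_PERM_X` is not one of the named constants, so an enum return type would not describe it accurately, and in C++ (which the generated wrappers above are compiled as) assigning such an integer back to an enum requires an explicit cast.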

* Do not leave trailing whitespace at end-of-line.

* Do not use `<assert.h>`. Use `"r_util/r_assert.h"` instead.

* You can use `export R2_DEBUG_ASSERT=1` to set a breakpoint when hitting an assert.

* Declare variables at the beginning of code blocks - use C89 declaration
  instead of C99. In other words, do not mix declarations and code. This helps
  reduce the number of local variables per function and makes it easier to find
  which variables are used where.

* Always put a space before an opening parenthesis (function calls, conditionals,
  for loops, etc.) except when defining a function signature. This is useful
  for searching the code base with `grep`.

```diff
-if(a == b){
+if (a == b) {
```

```diff
-static int check (RCore *core, int a);
+static int check(RCore *core, int a);
```

* Where is `function_name()` defined?

```sh
grep -R 'function_name(' libr
```

* Where is `function_name()` used?

```sh
grep -R 'function_name (' libr
```

* Function names should be explicit enough to not require a comment explaining
  what they do when seen elsewhere in code.

* **Do not use global variables**. The only acceptable time to use them is for
  singletons and WIP code. Make a comment explaining why it is needed.

* Commenting out code should be avoided because it reduces readability. If you
  *really* need to comment out code, use `#if 0` and `#endif`.

* Avoid very long functions; split them into multiple sub-functions or simplify
  your approach.

* Use types from `<r_types.h>` instead of the ones in `<stdint.h>`, which are
  known to cause some portability issues. Replace `uint8_t` with `ut8`, etc.

* Never use `%lld` or `%llx`, which are not portable. Use the `PFMT64` macros
  from `<r_types.h>`.
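
As a sketch of the last two points, the snippet below prints 64-bit values through `PFMT64`-style macros instead of hard-coded `%llx`. So that it builds outside the r2 tree, `ut64` and the macros are defined locally as stand-ins on top of `<inttypes.h>`; that mapping is an assumption for illustration, and in-tree code should include `<r_types.h>` instead. `demo_format_map` is a hypothetical helper.

```c
#include <inttypes.h>
#include <stdio.h>

/* Stand-ins (assumption): in radare2 these come from <r_types.h>. */
#ifndef PFMT64x
typedef uint64_t ut64;
#define PFMT64x PRIx64
#define PFMT64u PRIu64
#endif

/* Hypothetical helper: formats "0x<addr> (<size> bytes)" into out and
 * returns the number of characters snprintf would have written. */
static int demo_format_map(char *out, size_t outsz, ut64 addr, ut64 size) {
	return snprintf (out, outsz, "0x%08" PFMT64x " (%" PFMT64u " bytes)", addr, size);
}
```

The macro expands to a string literal, so it is concatenated with the surrounding format string at compile time; note that the `%` stays in the literal and only the length and conversion part is supplied by the macro.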

### Shell scripts

* Use `#!/bin/sh`.

* Do not use BASH-only features; `[[`, `$'...'`, etc.

* Use `sys/shellcheck.sh` to check for problems and BASH-only features.

## Managing endianness

Endianness is a common stumbling block when processing buffers or streams and
storing intermediate values as integers larger than one byte.

### Problem

The following code may seem intuitively correct:

```c
ut8 opcode[4] = {0x10, 0x20, 0x30, 0x40};
ut32 value = *(ut32*)opcode;
```

However, when `opcode` is cast to `ut32`, the compiler interprets the memory
layout based on the host CPU's endianness. On little-endian architectures such
as x86, the least-significant byte comes first, so `value` contains
`0x40302010`. On a big-endian architecture, the most-significant byte comes
first, so `value` contains `0x10203040`. This host-dependent result makes the
code non-portable and should be avoided. The cast can also violate alignment
and strict-aliasing rules.

### Solution

To avoid dependency on endianness, use bit-shifting and bitwise OR
operations. Instead of casting streams of bytes to larger width integers, do
the following for little endian:

```c
ut8 opcode[4] = {0x10, 0x20, 0x30, 0x40};
ut32 value = (ut32)opcode[0] | (ut32)opcode[1] << 8 | (ut32)opcode[2] << 16 | (ut32)opcode[3] << 24;
```

And do the following for big endian:

```c
ut32 value = (ut32)opcode[3] | (ut32)opcode[2] << 8 | (ut32)opcode[1] << 16 | (ut32)opcode[0] << 24;
```

This behavior is not dependent on architecture, and will act consistently
between any standard compilers regardless of host endianness. The casts to
`ut32` keep a byte of `0x80` or more from being shifted into the sign bit of
a promoted `int`.

### Endian helper functions

The above is not very easy to read. Within radare2, use endianness helper
functions to interpret byte streams in a given endianness.

```c
val32 = r_read_be32 (buffer);         // reads 4 bytes from a stream in BE
val32 = r_read_le32 (buffer);         // reads 4 bytes from a stream in LE
val32 = r_read_ble32 (buffer, isbig); // reads 4 bytes from a stream:
                                      //   if isbig is true, reads in BE
                                      //   otherwise reads in LE
```

Such helper functions exist for 64, 32, 16, and 8 bit reads and writes.

* Note that 8 bit reads are equivalent to casting a single byte of the buffer
  to a `ut8` value, i.e., endianness is irrelevant.
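
For readers who want to see what such helpers boil down to, here is a self-contained sketch built from the shift-and-OR approach described earlier. The `demo_*` functions are hypothetical equivalents written for illustration and are not r2's own implementation; `ut8` and `ut32` are local stand-ins for the types that in-tree code gets from `<r_types.h>`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins (assumption) for the r2 integer types. */
typedef uint8_t ut8;
typedef uint32_t ut32;

/* Read 4 bytes as little endian: first byte is least significant. */
static ut32 demo_read_le32(const ut8 *b) {
	return (ut32)b[0] | (ut32)b[1] << 8 | (ut32)b[2] << 16 | (ut32)b[3] << 24;
}

/* Read 4 bytes as big endian: first byte is most significant. */
static ut32 demo_read_be32(const ut8 *b) {
	return (ut32)b[3] | (ut32)b[2] << 8 | (ut32)b[1] << 16 | (ut32)b[0] << 24;
}

/* Pick the byte order at runtime. */
static ut32 demo_read_ble32(const ut8 *b, bool isbig) {
	return isbig ? demo_read_be32 (b) : demo_read_le32 (b);
}

/* Writing mirrors reading: store one byte at a time. */
static void demo_write_le32(ut8 *b, ut32 v) {
	b[0] = v & 0xff;
	b[1] = (v >> 8) & 0xff;
	b[2] = (v >> 16) & 0xff;
	b[3] = (v >> 24) & 0xff;
}
```

None of these functions looks at the host byte order, which is why the result is the same on every architecture.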

## Editor configuration

Vim/Neovim:

```vim
setl cindent
setl tabstop=8
setl noexpandtab
setl cino=:0,+0,(2,J0,{1,}0,>8,)1,m1
```

Emacs:

```elisp
(c-add-style "radare2"
             '((c-basic-offset . 8)
               (tab-width . 8)
               (indent-tabs-mode . t)
               ;;;; You would need (put 'c-auto-align-backslashes 'safe-local-variable 'booleanp) to enable this
               ;; (c-auto-align-backslashes . nil)
               (c-offsets-alist
                (arglist-intro . ++)
                (arglist-cont . ++)
                (arglist-cont-nonempty . ++)
                (statement-cont . ++)
                )))
```

You may use directory-local variables by adding the following to
`.dir-locals.el`.

```elisp
((c-mode . ((c-file-style . "radare2"))))
```

## Packed structures

Due to standards differing between compilers, radare2 provides a portable
helper macro for packed structures: `R_PACKED()`, which will automatically
utilize the correct compiler-dependent macro. Do not use `#pragma pack` or
`__attribute__((packed))`. Place the packed structure inside `R_PACKED()` like
so:

```c
R_PACKED (struct mystruct {
	int a;
	char b;
});
```

If you are using `typedef`, do not encapsulate the type name.

```c
R_PACKED (typedef struct mystruct_t {
	int a;
	char b;
}) mystruct;
```
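
To show what packing changes in practice, the sketch below compares a packed and an unpadded-by-hand layout. `DEMO_PACKED` is a local stand-in written for this example, assuming only GCC/Clang and MSVC; it is not the definition of `R_PACKED`, which in-tree code should use instead. The struct names are hypothetical.

```c
#include <stddef.h>

/* Stand-in (assumption) for R_PACKED, covering GCC/Clang and MSVC only. */
#if defined(_MSC_VER)
#define DEMO_PACKED(decl) __pragma (pack (push, 1)) decl __pragma (pack (pop))
#else
#define DEMO_PACKED(decl) decl __attribute__((__packed__))
#endif

/* Hypothetical on-disk record: a 1-byte tag followed by an unsigned length. */
DEMO_PACKED (struct demo_record {
	unsigned char tag;
	unsigned int length;
});

/* The same fields without packing, for comparison. */
struct demo_record_padded {
	unsigned char tag;
	unsigned int length;
};
```

In the packed variant `length` sits directly after `tag`, so the struct matches a byte stream with no padding, whereas the compiler is free to insert padding before `length` in the second struct.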

## Modules

radare2 is split into modular libraries in the `libr/` directory. The `binr/`
directory contains programs which use these libraries.

The libraries can be built individually, PIC or non-PIC. You can also create a
single static library archive (`.a`) which you can link your own programs
against to use radare2's libraries without depending on an existing system
installation. See [doc/static.md](doc/static.md) for more info.

[This presentation](http://radare.org/get/lacon-radare-2009/) gives a good
overview of the libraries.

## API

The external API is maintained in a different repository. The API function
definitions in C header files are derived from and documented in the
`radare2-bindings` repository, found
[here](https://github.com/radareorg/radare2-bindings).

Currently, the process of updating the header files from changed API bindings
requires human intervention, to ensure that proper review occurs. Incorrect
definitions in the C header files will trigger a build failure in the bindings
repository.

If you are able to write a plugin for an IDE that can associate the
bindings with the header files, such a contribution would be very welcome.

## Dependencies and installation

radare2 does not require external dependencies. On \*nix-like systems, it
requires only a standard C compiler and GNU `make`. For compiling on Windows,
see [doc/windows.md](doc/windows.md). Browse the [doc/](doc/) folder for other
architectures. For cross-compilation, see
[doc/cross-compile.md](doc/cross-compile.md).

## Recompiling and Outdated Dependencies

When recompiling code, ensure that you recompile all dependent modules (or
simply recompile the entire project). If a module's dependency is not
recompiled and relinked, it may cause segmentation faults due to outdated
structures and libraries. Such errors are not handled automatically, so if you
are not sure, recompile all modules.

To speed up frequent recompilation, you can use `ccache` like so:

```sh
export CC="ccache gcc"
```

This will automatically detect when files do not need to be recompiled and
avoid unnecessary work.

## Repeated installation

There is an alternative installation method for radare2 to make it easier to
repeatedly install while making changes. The `symstall` target creates a single
system-wide installation using symlinks instead of copies, making repeated
builds faster.

```sh
sudo make symstall
```

## Source repository

The source for radare2 can be found in the following GitHub repository:

```sh
git clone https://github.com/radareorg/radare2
```

Other packages radare2 depends on, such as Capstone, are pulled from
their git repository as required.

To get an up-to-date copy of the repository, you should perform the
following while on the `master` branch:

```sh
git pull
```

If your local git repository is not tracking upstream, you may need to use the
following:

```sh
git pull https://github.com/radareorg/radare2 master
```

The installation scripts `sys/user.sh`, `sys/install.sh`, `sys/meson.py`, and
`sys/termux.sh` will automatically identify and update using an existing
upstream remote, if one exists. If not, they will pull using a direct URL.

If you have modified files on the `master` branch, you may encounter conflicts
that must be resolved manually. To save your changes, work on a different
branch as described in [CONTRIBUTING.md](CONTRIBUTING.md). If you wish to
discard your current work, use the following commands:

```sh
git clean -xdf
git reset --hard
```

## Regression testing

Use `r2r` to run the radare2 regression test suite, e.g.:

```sh
sys/install.sh
r2r
```

r2r's source can be found in the `test/` directory, while binaries used for
tests are located in the following GitHub repository:

```sh
git clone https://github.com/radareorg/radare2-testbins
```

These can be found in `test/bins/` after being downloaded by r2r.

For more information, see [r2r's
README](https://github.com/radareorg/radare2-testbins/blob/master/README).

The test files can be found in `test/db/`. Each test consists of a unique name,
an input file, a list of input commands, and the expected output. The test must
be terminated with a line consisting only of `RUN`.

Testing can always be improved. If you can contribute additional tests or fix
existing tests, it is greatly appreciated.

## Reporting bugs

If you encounter a broken feature, issue, error, problem, or it is unclear how
to do something that should be covered by radare2's functionality, report an
issue on the GitHub repository
[here](https://github.com/radareorg/radare2/issues).

If you are looking for feedback, check out the [Community section in the
README](README.md#Community) for places where you can contact other r2 devs.

# HOW TO RELEASE

- Set `RELEASE=1` in global.mk and r2-bindings/config.mk.acr.
- Use `bsdtar` from libarchive package. GNU tar is broken.

RADARE2
---
- bump revision
- `./configure`
- `make dist`

R2-BINDINGS
---
- `./configure --enable-devel`
- `make`
- `make dist`

- Update the [paths on the website](https://github.com/radareorg/radareorg/blob/master/source/download_paths.rst)

# Additional resources

* [CONTRIBUTING.md](CONTRIBUTING.md)
* [README.md](README.md)
* [USAGE.md](USAGE.md)