Development information
This file aims to introduce developers to conventions for working on the code base of radare2.
The GitHub issues page contains a list of all the bugs that have been reported, with labels to classify them by difficulty, type, milestone, etc. It is a good place to start if you are looking to contribute.
For information about the git process, see CONTRIBUTING.md.
Documentation
Functions should have descriptive names and parameters. It should be clear what the function and its arguments do from the declaration. Comments should be used to explain purpose or clarify something that may not be immediately apparent or relatively complicated.
/* Find the min and max addresses in an RList of maps. Returns (max-min)/width. */
static int findMinMax(RList *maps, ut64 *min, ut64 *max, int skip, int width);
Error diagnosis
There are several utilities that can be used to diagnose errors in r2, whether they are related to memory (segfaults, uninitialized read, etc.) or problems with features.
Compilation options
sys/sanitize.sh
: Compile with ASan, the address sanitizer. Provides detailed backtraces for memory errors.
R2_DEBUG_ASSERT=1
: Provides a backtrace when a debug assert (typically an r_return_* macro) fails.
R2_DEBUG=1
: Show error messages and crash signal. Used for debugging plugin loading issues.
Useful macros from r_types.h
EPRINT_*
: Allows you to quickly add or remove a debug print without worrying about format specifiers.
Parameter marking
r2 provides several empty macros to make function signatures more informative.
R_OUT
: Parameter is output - written to instead of read.
R_INOUT
: Parameter is read/write.
R_OWN
: Pointer ownership is transferred from the caller.
R_BORROW
: The caller retains ownership of the pointer - the receiver must not free it.
R_NONNULL
: Pointer must not be null.
R_NULLABLE
: Pointer may be null.
R_DEPRECATE
: Do not use in new code; it will be removed in the future.
R_IFNULL(x)
: Default value for a pointer when null.
R_UNUSED
: Not used.
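As a rough illustration of how these markers read in a signature, here is a minimal sketch. The macros are redefined as empty stand-ins so the snippet builds without r_types.h, and the NameBox type and both functions are hypothetical, not part of the r2 API.

```c
#include <stdlib.h>
#include <string.h>

/* Empty stand-ins so this sketch builds without r_types.h. */
#define R_OUT
#define R_OWN
#define R_NONNULL
#define R_NULLABLE

/* Hypothetical container that keeps one heap-allocated name. */
typedef struct {
	char *name;
} NameBox;

/* Takes ownership of name: the box frees it later, the caller must not. */
static void namebox_set(R_NONNULL NameBox *box, R_OWN R_NULLABLE char *name) {
	free (box->name);
	box->name = name;
}

/* Returns the stored name (still owned by the box) and optionally
 * writes its length through len. */
static const char *namebox_get(R_NONNULL NameBox *box, R_OUT R_NULLABLE size_t *len) {
	if (len) {
		*len = box->name ? strlen (box->name) : 0;
	}
	return box->name;
}
```

The markers compile to nothing; their only job is to tell the reader who frees what and which pointers may be NULL.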
Code style
C
In order to contribute patches or plugins, we encourage you to use the same coding style as the rest of the code base.
- Please use ./sys/clang-format-diff.py before submitting a PR to be sure you are following the coding style, as described in [CONTRIBUTING.md](CONTRIBUTING.md#Getting Started). If you find a bug in this script, please submit a bug report issue. A detailed style guide can be found below.
- See sys/indent.sh for indenting your code automatically.
- A pre-commit hook to check coding style is located at sys/pre-commit-indent.sh. You can install it by copying it to .git/hooks/pre-commit. To preserve your existing pre-commit hook, use cat sys/pre-commit-indent.sh >> .git/hooks/pre-commit instead.
- For a premade .vimrc, see doc/vim.
- See .clang-format for work-in-progress support for automated indentation.
Guidelines
The following guidelines apply to code that we must maintain. Generally, they will not apply to copy-paste external code that will not be touched.
- Tabs are used for indentation. In a switch statement, the cases are indented at the switch level.
- Switch-cases where local variables are needed should be refactored into separate functions instead of using braces. If braced scope syntax is used, put break; statements inside the scope.
switch (n) {
case 1:
	break;
case 2: {
	break;
}
default:
	break;
}
-
Lines should be at most 78 characters in length. A tab is considered 8 characters.
-
Braces open on the same line as the for/while/if/else/function/etc. Closing braces are put on a line of their own, except in the else of an if statement or in the while of a do-while statement.
if (a == b) {
...
}
if (a == b) {
...
} else if (a > b) {
...
}
if (a == b) {
...
} else {
do_something_else ();
}
do {
do_something ();
} while (cond);
if (a == b) {
b = 3;
}
- Always use braces for if and while.
-if (a == b)
- return;
+if (a == b) {
+ return;
+}
- In general, avoid goto. The goto statement only comes in handy when a function exits from multiple locations and some common work such as cleanup has to be done. If there is no cleanup needed, then just return directly.

  Choose label names which say what the goto does or why it exists. An example of a good name could be out_buffer: if the goto frees buffer. Avoid using GW-BASIC names like err1: and err2:.
- Use r_return_* macros to check for conditions that are caused by programming errors or bugs, i.e. conditions that should never happen. Do not use them when checking for runtime error conditions, such as a NULL value being returned from malloc(). Use a standard if statement for these cases.
int check(RCore *c, int a, int b) {
/* check for programming errors */
r_return_val_if_fail (c, false);
r_return_val_if_fail (a >= 0 && b >= 1, false);
/* check for runtime errors */
ut8 *buf = calloc (b, sizeof (a));
if (!buf) {
return -1;
}
/* continue... */
}
- Use spaces after keywords and around operators.
a = b + 3;
a = (b << 3) * 5;
a = sizeof (b) * 4;
- Multiline ternary operator conditionals are indented in JavaScript style:
-ret = over ?
- r_debug_step_over (dbg, 1) :
- r_debug_step (dbg, 1);
+ret = over
+ ? r_debug_step_over (dbg, 1)
+ : r_debug_step (dbg, 1);
- When breaking up a long line, use a single additional tab if the current and next lines are aligned. Do not align start of line using spaces.
-x = function_with_long_signature_and_many_args (arg1, arg2, arg3, arg4, arg5,
- arg6, arg7, arg8);
-y = z;
+x = function_with_long_signature_and_many_args (arg1, arg2, arg3, arg4, arg5,
+ arg6, arg7, arg8);
+y = z;
- Use two additional tabs if the next line is indented to avoid confusion with control flow.
if (function_with_long_signature_and_many_args (arg1, arg2, arg3, arg4, arg5,
- arg6, arg7, arg8)) {
+ arg6, arg7, arg8)) {
do_stuff ();
}
- When following the above guideline, if additional indentation is needed on consecutive lines, use a single tab for each nested level. Avoid heavy nesting in this manner.
if (condition_1 && condition_2 && condition_3
&& (condition_4
- || condition_5)) {
+ || condition_5)) {
do_stuff ();
}
- Split long conditional expressions into small static inline functions to make them more readable.
+static inline bool inRange(RBreakpointItem *b, ut64 addr) {
+ return (addr >= b->addr && addr < (b->addr + b->size));
+}
+
+static inline bool matchProt(RBreakpointItem *b, int rwx) {
+ return (!rwx || (rwx & b->rwx));
+}
+
R_API RBreakpointItem *r_bp_get_in(RBreakpoint *bp, ut64 addr, int rwx) {
RBreakpointItem *b;
RListIter *iter;
r_list_foreach (bp->bps, iter, b) {
- if (addr >= b->addr && addr < (b->addr+b->size) && \
- (!rwx || rwx&b->rwx)) {
+ if (inRange (b, addr) && matchProt (b, rwx)) {
return b;
}
}
return NULL;
}
- Use R_API to mark exportable (public) methods for module APIs.
- Use R_IPI to mark functions internal to a library.
- Other functions should be static to avoid polluting the global namespace.
- The structure of C files in r2 should be as follows:
/* Copyright ... */ // copyright
#include <r_core.h> // includes
static int globals // const, define, global variables
static void helper(void) {} // static functions
R_IPI void internal(void) {} // internal apis (used only inside the library)
R_API void public(void) {} // public apis starting with constructor/destructor
- Why do we return int instead of enum?

  Many r2 functions return int instead of an enum type because enums can't be OR'ed; additionally, an enum return type breaks usage within a switch statement, and swig can't handle it.
r_core_wrap.cxx:28612:60: error: assigning to 'RRegisterType' from incompatible type 'long'
arg2 = static_cast< long >(val2); if (arg1) (arg1)->type = arg2; resultobj = SWIG_Py_Void(); return resultobj; fail:
^ ~~~~
r_core_wrap.cxx:32103:61: error: assigning to 'RDebugReasonType' from incompatible type 'int'
arg2 = static_cast< int >(val2); if (arg1) (arg1)->type = arg2; resultobj = SWIG_Py_Void(); return resultobj; fail:
^ ~~~~
-
Do not leave trailing whitespaces at end-of-line.
- Do not use <assert.h>. Use "r_util/r_assert.h" instead.
- You can use export R2_DEBUG_ASSERT=1 to set a breakpoint when hitting an assert.
- Declare variables at the beginning of code blocks - use C89 declaration instead of C99. In other words, do not mix declarations and code. This helps reduce the number of local variables per function and makes it easier to find which variables are used where.
- Always put a space before an opening parenthesis (function calls, conditionals, for loops, etc.) except when defining a function signature. This is useful for searching the code base with grep.
-if(a == b){
+if (a == b) {
-static int check (RCore *core, int a);
+static int check(RCore *core, int a);
- Where is function_name() defined?
grep -R 'function_name(' libr
- Where is function_name() used?
grep -R 'function_name (' libr
-
Function names should be explicit enough to not require a comment explaining what it does when seen elsewhere in code.
-
Do not use global variables. The only acceptable time to use them is for singletons and WIP code. Make a comment explaining why it is needed.
- Commenting out code should be avoided because it reduces readability. If you really need to comment out code, use #if 0 and #endif.
- Avoid very long functions; split them into multiple sub-functions or simplify your approach.
- Use types from <r_types.h> instead of the ones in <stdint.h>, which are known to cause some portability issues. Replace uint8_t with ut8, etc.
- Never use %lld or %llx, which are not portable. Use the PFMT64 macros from <r_types.h>.
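To make the goto guideline from the list above concrete, here is a minimal sketch of the cleanup pattern with descriptive label names. It uses only the standard C library; the function first_byte and its behaviour are hypothetical and not taken from the r2 code base.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns the first byte of a file, or -1 on error. */
static int first_byte(const char *path) {
	int ret = -1;
	unsigned char *buffer = NULL;
	FILE *fd = fopen (path, "rb");
	if (!fd) {
		/* nothing to clean up yet, so return directly */
		return -1;
	}
	buffer = malloc (16);
	if (!buffer) {
		goto out_file;
	}
	if (fread (buffer, 1, 1, fd) != 1) {
		goto out_buffer;
	}
	ret = buffer[0];
out_buffer:
	free (buffer);
out_file:
	fclose (fd);
	return ret;
}
```

Each label names the resource it releases, so the order of cleanup stays readable, and the early failure that needs no cleanup simply returns.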
Shell scripts
- Use #!/bin/sh.
- Do not use BASH-only features: [[, $'...', etc.
- Use sys/shellcheck.sh to check for problems and BASH-only features.
Managing endianness
Endianness is a common stumbling block when processing buffers or streams and storing intermediate values as integers larger than one byte.
Problem
The following code may seem intuitively correct:
ut8 opcode[4] = {0x10, 0x20, 0x30, 0x40};
ut32 value = *(ut32*)opcode;
However, when opcode is cast to ut32, the compiler interprets the memory layout based on the host CPU's endianness. On little-endian architectures such as x86, the least-significant byte comes first, so value contains 0x40302010. On a big-endian architecture, the most-significant byte comes first, so value contains 0x10203040. This implementation-defined behavior is inherently unstable and should be avoided.
Solution
To avoid dependency on endianness, use bit-shifting and bitwise OR instructions. Instead of casting streams of bytes to larger width integers, do the following for little endian:
ut8 opcode[4] = {0x10, 0x20, 0x30, 0x40};
ut32 value = opcode[0] | opcode[1] << 8 | opcode[2] << 16 | (ut32)opcode[3] << 24;
And do the following for big endian:
ut32 value = opcode[3] | opcode[2] << 8 | opcode[1] << 16 | (ut32)opcode[0] << 24;
The cast on the top byte keeps the 24-bit shift in unsigned arithmetic, so bytes of 0x80 and above do not shift into the sign bit of an int. This behavior is not dependent on architecture, and will act consistently across standard-conforming compilers regardless of host endianness.
Endian helper functions
The above is not very easy to read. Within radare2, use endianness helper functions to interpret byte streams in a given endianness.
val32 = r_read_be32(buffer) // reads 4 bytes from a stream in BE
val32 = r_read_le32(buffer) // reads 4 bytes from a stream in LE
val32 = r_read_ble32(buffer, isbig) // reads 4 bytes from a stream:
// if isbig is true, reads in BE
// otherwise reads in LE
Such helper functions exist for 64, 32, 16, and 8 bit reads and writes.
- Note that 8 bit reads are equivalent to casting a single byte of the buffer to a ut8 value, i.e. endianness is irrelevant.
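To show what helpers of this kind do internally, here is a minimal self-contained sketch built on the shift-and-OR technique from the previous section. The sketch_* names are hypothetical stand-ins and this is not radare2's implementation; <stdint.h> types are used only so the snippet compiles without r2 headers.

```c
#include <stdint.h>

/* Hypothetical stand-in: read 4 bytes as a little-endian value. */
static uint32_t sketch_read_le32(const uint8_t *buf) {
	return (uint32_t)buf[0] | ((uint32_t)buf[1] << 8)
		| ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
}

/* Hypothetical stand-in: read 4 bytes as a big-endian value. */
static uint32_t sketch_read_be32(const uint8_t *buf) {
	return (uint32_t)buf[3] | ((uint32_t)buf[2] << 8)
		| ((uint32_t)buf[1] << 16) | ((uint32_t)buf[0] << 24);
}

/* Hypothetical stand-in: pick the byte order at run time. */
static uint32_t sketch_read_ble32(const uint8_t *buf, int isbig) {
	return isbig ? sketch_read_be32 (buf) : sketch_read_le32 (buf);
}

/* Hypothetical stand-in: write a value as 4 little-endian bytes. */
static void sketch_write_le32(uint8_t *buf, uint32_t val) {
	buf[0] = (uint8_t)(val & 0xff);
	buf[1] = (uint8_t)((val >> 8) & 0xff);
	buf[2] = (uint8_t)((val >> 16) & 0xff);
	buf[3] = (uint8_t)((val >> 24) & 0xff);
}
```

With the bytes {0x10, 0x20, 0x30, 0x40} from the example above, the little-endian read yields 0x40302010 and the big-endian read yields 0x10203040 on any host.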
Editor configuration
Vim/Neovim:
setl cindent
setl tabstop=8
setl noexpandtab
setl cino=:0,+0,(2,J0,{1,}0,>8,)1,m1
Emacs:
(c-add-style "radare2"
'((c-basic-offset . 8)
(tab-width . 8)
(indent-tabs-mode . t)
;;;; You would need (put 'c-auto-align-backslashes 'safe-local-variable 'booleanp) to enable this
;; (c-auto-align-backslashes . nil)
(c-offsets-alist
(arglist-intro . ++)
(arglist-cont . ++)
(arglist-cont-nonempty . ++)
(statement-cont . ++)
)))
You may use directory-local variables by adding the following to .dir-locals.el.
((c-mode . ((c-file-style . "radare2"))))
Packed structures
Due to standards differing between compilers, radare2 provides a portable helper macro for packed structures: R_PACKED(), which will automatically utilize the correct compiler-dependent macro. Do not use #pragma pack or __attribute__((packed)). Place the packed structure inside R_PACKED() like so:
R_PACKED (struct mystruct {
int a;
char b;
});
If you are using typedef, do not encapsulate the type name.
R_PACKED (typedef struct mystruct_t {
int a;
char b;
}) mystruct;
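The following sketch illustrates why the type name stays outside the macro. SKETCH_PACKED is a hypothetical macro written for this example under the assumption that a portable packing wrapper appends an attribute on GCC/Clang and wraps the declaration in pack pragmas on MSVC; it is not a copy of R_PACKED() from r_types.h.

```c
#include <stdint.h>

/* Hypothetical wrapper, assumed to resemble how a portable packing
 * macro can be built; not the actual R_PACKED() definition. */
#if defined(_MSC_VER)
#define SKETCH_PACKED(decl) __pragma (pack (push, 1)) decl __pragma (pack (pop))
#else
#define SKETCH_PACKED(decl) decl __attribute__ ((__packed__))
#endif

/* Unpacked reference layout: the compiler may add padding after b. */
struct plain_pair {
	uint32_t a;
	uint8_t b;
};

/* Packed layout: exactly 4 + 1 bytes. */
SKETCH_PACKED (struct packed_pair {
	uint32_t a;
	uint8_t b;
});

/* With typedef, the alias follows the macro, so on GCC/Clang the
 * attribute lands between the closing brace and the type name. */
SKETCH_PACKED (typedef struct packed_pair_t {
	uint32_t a;
	uint8_t b;
}) packed_pair;
```

Keeping the alias outside lets the wrapper attach the attribute to the struct definition itself, which is where GCC and Clang expect it.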
Modules
radare2 is split into modular libraries in the libr/ directory. The binr/ directory contains programs which use these libraries.
The libraries can be built individually, PIC or non-PIC. You can also create a single static library archive (.a) which you can link your own programs against to use radare2's libraries without depending on an existing system installation. See doc/static.md for more info.
This presentation gives a good overview of the libraries.
API
The external API is maintained in a different repository. The API function definitions in C header files are derived from and documented in the radare2-bindings repository.
Currently, the process of updating the header files from changed API bindings requires human intervention, to ensure that proper review occurs. Incorrect definitions in the C header files will trigger a build failure in the bindings repository.
If you are able to write a plugin for various IDEs that can associate the bindings with the header files, such a contribution would be very welcome.
Dependencies and installation
radare2 does not require external dependencies. On *nix-like systems, it requires only a standard C compiler and GNU make. For compiling on Windows, see doc/windows.md. Browse the doc/ folder for other architectures. For cross-compilation, see doc/cross-compile.md.
Recompiling and Outdated Dependencies
When recompiling code, ensure that you recompile all dependent modules (or simply recompile the entire project). If a module's dependency is not recompiled and relinked, it may cause segmentation faults due to outdated structures and libraries. Such errors are not handled automatically, so if you are not sure, recompile all modules.
To speed up frequent recompilation, you can use ccache like so:
export CC="ccache gcc"
This will automatically detect when files do not need to be recompiled and avoid unnecessary work.
Repeated installation
There is an alternative installation method for radare2 to make it easier to repeatedly install while making changes. The symstall target creates a single system-wide installation using symlinks instead of copies, making repeated builds faster.
sudo make symstall
Source repository
The source for radare2 can be found in the following GitHub repository:
git clone https://github.com/radareorg/radare2
Other packages radare2 depends on, such as Capstone, are pulled from their git repository as required.
To get an up-to-date copy of the repository, you should perform the following while on the master branch:
git pull
If your local git repository is not tracking upstream, you may need to use the following:
git pull https://github.com/radareorg/radare2 master
The installation scripts sys/user.sh, sys/install.sh, sys/meson.py, and sys/termux.sh will automatically identify and update using an existing upstream remote, if one exists. If not, they will pull using a direct URL.
If you have modified files on the master branch, you may encounter conflicts that must be resolved manually. To save your changes, work on a different branch as described in CONTRIBUTING.md. If you wish to discard your current work, use the following commands:
git clean -xdf
git reset --hard
Regression testing
Use r2r to run the radare2 regression test suite, e.g.:
sys/install.sh
r2r
r2r's source can be found in the test/ directory, while binaries used for tests are located in the following GitHub repository:
git clone https://github.com/radareorg/radare2-testbins
These can be found in test/bins/ after being downloaded by r2r.
For more information, see r2r's README.
The test files can be found in test/db/. Each test consists of a unique name, an input file, a list of input commands, and the expected output. The test must be terminated with a line consisting only of RUN.
Testing can always be improved. If you can contribute additional tests or fix existing tests, it is greatly appreciated.
Reporting bugs
If you encounter a broken feature, issue, error, problem, or it is unclear how to do something that should be covered by radare2's functionality, report an issue on the GitHub repository here.
If you are looking for feedback, check out the Community section in the README for places where you can contact other r2 devs.
HOW TO RELEASE
- Set RELEASE=1 in global.mk and r2-bindings/config.mk.acr.
- Use bsdtar from libarchive package. GNU tar is broken.
RADARE2
- bump revision
- ./configure
- make dist
R2-BINDINGS
- ./configure --enable-devel
- make
- make dist
- Update the paths on the website