Zig

August 25, 2024

Zig is a nice, small language that is well suited to low-level programming; I would say that it aims at replacing C.

I like that it is a simple language that can be learnt relatively quickly. It is very “in your face”: it has no hidden control flow (no implicit destructor calls, no exceptions) and no hidden memory allocations. This makes the code a bit more verbose, but you can work out what happens just by looking at the local code, which is very useful when writing and optimizing low-level code.

Furthermore, Zig can be used to incrementally improve a C/C++/Zig codebase, and it can easily cross-compile.

I started to really use Zig a year ago, when I decided to do the Advent of Code (2022). I later used it to implement a Slurm plugin (see the presentation I did), and I participated in Software You Can Love.

I enjoyed using it. It feels different from other languages; here are the main things that struck me.

Error handling

Zig has an effective way to handle errors: the return type of any function can be a union of an error code and the normal return type (for example i32). The error set can be either explicit, MyErrorCodes!i32 (where you defined something like const MyErrorCodes = error{ Error1, Error2 };), or inferred, !i32. When calling such a function you can receive the union and store it, but almost always you will want the result, and either handle the error explicitly (with catch) or return the error to your caller if one happened (try). This makes error handling explicit at the call site too (no “hidden” exception throwing). It might seem like a small thing, but it has several benefits:

Error handling is important to make a function reusable: how to handle the corner cases (ignore, report, fail and raise) depends on where the function is used, and a function that decides this by itself limits its own usefulness.
When you use a function that might fail, the language should make you aware that you have to choose the behaviour on failure rather than implicitly choosing for you; any default might be dangerous in some context.
Error handling should be concise and unobtrusive: in many cases you do not care much about errors and just want to fail and pass them on. In Zig the single keyword try means “fail this function and return the error when this operation fails”.

It is not perfect: errors can only be simple constants, and if you want to give more context you must do it through extra channels (for example a writer). Still, it covers well over 90% of the cases, is very unobtrusive and adds very little overhead.

Zig provides error return traces, which follow an error from the place where it first happened. A plain stack trace shows only where the last error surfaced, not the one that triggered everything. This is similar to exception stack traces as done in Python, but with an error return type that avoids the issues of exceptions (hidden execution flow and the cost of exceptions).
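
As a minimal sketch of the explicit and inferred error sets, try and catch described above: the function names parseDigit, sumDigits and sumDigitsOrZero are made up for this illustration, and MyErrorCodes simply reuses the name from the text.

```zig
const std = @import("std");

// Hypothetical error set, reusing the name from the text.
const MyErrorCodes = error{ Error1, Error2 };

// Explicit error set: the caller knows exactly which errors can come back.
fn parseDigit(c: u8) MyErrorCodes!i32 {
    if (c < '0' or c > '9') return error.Error1;
    return @as(i32, c - '0');
}

// Inferred error set (!i32): `try` returns any error from parseDigit
// to our own caller and otherwise yields the plain i32.
fn sumDigits(text: []const u8) !i32 {
    var total: i32 = 0;
    for (text) |c| {
        total += try parseDigit(c);
    }
    return total;
}

// `catch` handles the error at the call site, here with a fallback value.
fn sumDigitsOrZero(text: []const u8) i32 {
    return sumDigits(text) catch 0;
}

pub fn main() void {
    std.debug.print("{d} {d}\n", .{ sumDigitsOrZero("123"), sumDigitsOrZero("12x") });
}
```

The same sumDigits can be reused by a caller that prefers to propagate the error (with try) and by one that prefers a default (with catch); the function itself does not decide.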

Explicit handling

Exposed implementation

Zig makes it difficult to “hide” the implementation. This feels quite different from other languages, but it is refreshing if you care about efficiency: it makes you care about the detailed implementation, and it pushes users to be aware of it and improve it too.

Centralization of handling

Zig pushes toward centralized handling. At first this might seem complecting (i.e. leading to more complexity), but very often it is the opposite. There are two main reasons for this:
1. Some aspects are not really as independent as we would like, and forcing them apart leads to more complexity.
2. Having all control in one place makes it easier to change things or experiment with different approaches.

Memory handling

Zig cares quite a bit about memory handling. This starts from the use of explicit memory allocators: any allocation requires an allocator that you have to pass explicitly. It took me a while to get used to passing and storing memory allocators, but it was not difficult.
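
A small sketch of what passing an allocator explicitly looks like. The helper repeat is hypothetical; the allocators used (FixedBufferAllocator, ArenaAllocator, page_allocator) are from the standard library, and the caller decides which one backs the allocation.

```zig
const std = @import("std");

// Hypothetical helper: the caller chooses the allocator and owns the result.
fn repeat(allocator: std.mem.Allocator, value: i32, count: usize) ![]i32 {
    // The allocation is visible in the code and can fail visibly.
    const buf = try allocator.alloc(i32, count);
    @memset(buf, value);
    return buf;
}

pub fn main() !void {
    // Same function, two allocation strategies chosen by the caller.

    // 1. A fixed buffer on the stack: no heap involved, nothing to free.
    var stack_memory: [64]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&stack_memory);
    const on_stack = try repeat(fba.allocator(), 7, 4);

    // 2. An arena on top of the page allocator: everything allocated
    //    from it is released at once by arena.deinit().
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const on_heap = try repeat(arena.allocator(), 9, 1000);

    std.debug.print("{d} {d}\n", .{ on_stack[3], on_heap[999] });
}
```

Because repeat never picks an allocator itself, a test can hand it std.testing.allocator to detect leaks, without touching the function.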

Memory allocation and memory layout are crucial for the performance of a program. Zig supports integers of arbitrary bit width (u3, i17, ...) and packs them tightly in a packed struct. Zig also has data structures that make it easy to switch from an Array of Structures (AoS) to a Struct of Arrays (SoA) without affecting all the code that uses them, such as MultiArrayList, and there are many other interesting data structures.
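
To make this concrete, here is a rough sketch of a packed struct built from odd-width integers and of a MultiArrayList storing a struct field by field. The types Flags and Particle and the function sumX are invented for the example.

```zig
const std = @import("std");

// Hypothetical bit-packed record: u3 + u5 occupy exactly 8 bits.
const Flags = packed struct {
    kind: u3,
    level: u5,
};

// Hypothetical element type stored as a Struct of Arrays below.
const Particle = struct {
    x: f32,
    y: f32,
    kind: u3,
};

fn sumX(allocator: std.mem.Allocator) !f32 {
    // Each field of Particle is kept in its own contiguous array.
    var list = std.MultiArrayList(Particle){};
    defer list.deinit(allocator);

    try list.append(allocator, .{ .x = 1.0, .y = 2.0, .kind = 1 });
    try list.append(allocator, .{ .x = 3.0, .y = 4.0, .kind = 5 });

    // items(.x) is a []f32 holding only the x values.
    var total: f32 = 0;
    for (list.items(.x)) |x| total += x;
    return total;
}

pub fn main() !void {
    const total = try sumX(std.heap.page_allocator);
    std.debug.print("{d} bits, sum of x = {d}\n", .{ @bitSizeOf(Flags), total });
}
```

Code that only needs one field walks a dense array of that field, while list.get(i) still gives back a whole Particle when that is more convenient.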

Compile time

Zig has neither a preprocessor nor macros, but it has very nice comptime execution: almost any function can be called at compile time, emulating the target architecture, and compile-time functions can manipulate types. Compilation is lazy (unused functions are not compiled), and generic code is duck-typed at compile time: an anytype parameter accepts any value that supports the operations used on it, and an anonymous struct literal can be used wherever a struct with matching fields is expected (at runtime everything is still statically typed).
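
A short sketch of these three ideas, with names (Pair, factorial, lengthOf) invented for the illustration: a function that returns a type, an ordinary function evaluated at compile time, and an anytype parameter checked structurally.

```zig
const std = @import("std");

// A function from a type to a type, evaluated by the compiler.
fn Pair(comptime T: type) type {
    return struct {
        first: T,
        second: T,

        pub fn sum(self: @This()) T {
            return self.first + self.second;
        }
    };
}

// An ordinary function, usable both at runtime and at compile time.
fn factorial(n: u64) u64 {
    return if (n <= 1) 1 else n * factorial(n - 1);
}

// Duck typing at compile time: any argument with a `len` works,
// anything else is rejected with a compile error.
fn lengthOf(x: anytype) usize {
    return x.len;
}

pub fn main() void {
    // The literal is coerced field by field to Pair(i32).
    const p: Pair(i32) = .{ .first = 2, .second = 3 };

    // Forced compile-time evaluation: no call remains in the binary.
    const f = comptime factorial(5);

    const numbers = [_]i32{ 1, 2, 3, 4 };
    std.debug.print("{d} {d} {d} {d}\n", .{ p.sum(), f, lengthOf("abc"), lengthOf(numbers) });
}
```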

C interoperability

It is easy to import C header files:

pub usingnamespace @cImport({
    @cInclude("slurm/spank.h");
    @cInclude("sys/types.h");
    @cInclude("string.h");
    @cInclude("toml.h");
});

A Zig library can also generate C headers for its exported functions.

Slices and Pointers

[]i32: slice (range with .ptr and .len), should be preferred to bare pointers
[128]i32, [128:0]i32: fixed size arrays

Zig normally uses either fixed-size arrays or slices, which contain not just the pointer .ptr but also the length .len. The & operator on an array returns a pointer to the array, which coerces to a slice; allocators return slices too.

const array = [5]i32{ 1, 2, 3, 4, 5 };
// array is const, so the slice must point to const elements
const slice: []const i32 = &array;

There are also slices with a sentinel, i.e. a special value just past their end. Normally this sentinel is 0, and such slices are written with :0:

// a string literal is already a pointer to a 0-terminated array, no & needed
const cStr: [:0]const u8 = "aStr";
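
The following sketch puts arrays, slices and a sentinel-terminated slice side by side. The function sum and the variable names are only for illustration.

```zig
const std = @import("std");

// Takes any slice of i32, whatever array or allocation it came from.
fn sum(values: []const i32) i32 {
    var total: i32 = 0;
    for (values) |v| total += v;
    return total;
}

pub fn main() void {
    var array = [5]i32{ 1, 2, 3, 4, 5 };
    array[0] = 10;

    // A pointer to the mutable array coerces to a mutable slice.
    const all: []i32 = &array;
    // Slicing with compile-time known bounds gives a pointer to a
    // 3-element array, which coerces to a slice when passed to sum.
    const middle = array[1..4];

    // The sentinel is not counted in .len but may be read at index len.
    const c_str: [:0]const u8 = "aStr";

    std.debug.print("{d} {d} {d} {d}\n", .{
        sum(all),
        sum(middle),
        c_str.len,
        c_str[c_str.len],
    });
}
```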

Pointers in Zig are much better defined than in C. A C pointer, [*c]i32, can be null and can point to a single element or to an array of elements; in Zig code one normally uses one of:

[*]i32: a non-null pointer to an array of unknown length
[*:0]i32: a non-null pointer to a 0-terminated array
*i32: a non-null pointer to a single 32-bit integer

All the previous pointers can be made optional: ?T is a possibly null T. With them we can get things like

export fn slurm_spank_init(spnk: spank.spank_t,
   ac: c_int, argv: ?[*][*:0]const u8) spank.slurm_err_t {
    // ...
}

where argv is a possibly null pointer to an array of null terminated strings.
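
As a sketch of how such an argv can be consumed safely, the hypothetical helper countMatching below takes the same pointer type as in the signature above, unwraps the optional, and turns each 0-terminated string into a slice with std.mem.span. The argument strings are made up.

```zig
const std = @import("std");

// Hypothetical helper: counts the arguments equal to `wanted`.
fn countMatching(argc: usize, argv: ?[*][*:0]const u8, wanted: []const u8) usize {
    // The optional has to be unwrapped before use; null is handled here.
    const args = argv orelse return 0;

    var count: usize = 0;
    // A many-item pointer carries no length, so slice it with argc.
    for (args[0..argc]) |arg| {
        // std.mem.span scans for the 0 sentinel and returns a slice.
        if (std.mem.eql(u8, std.mem.span(arg), wanted)) count += 1;
    }
    return count;
}

pub fn main() void {
    var storage = [_][*:0]const u8{ "--debug", "--verbose", "--debug" };
    const argv: [*][*:0]const u8 = &storage;

    std.debug.print("{d} {d}\n", .{
        countMatching(storage.len, argv, "--debug"),
        countMatching(0, null, "--debug"),
    });
}
```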

Future

Zig is not yet at 1.0, which means that changes to the language are still possible. Indeed, I had to adapt my programs to language changes (the introduction of multi-object for loops) and to some changes in the standard library. These changes were always improvements, and they were caught by the compiler; still, an unstable language is definitely a drawback. For this reason it is worth looking in more detail at this aspect, and at Zig's future in general.
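
For reference, this is roughly what multi-object for loops look like; dot and indexOfMax are example functions written for this note, not code from my programs.

```zig
const std = @import("std");

// Two slices walked in lockstep. Their lengths must match, which is
// checked at runtime in safety-checked build modes.
fn dot(a: []const i32, b: []const i32) i32 {
    var total: i32 = 0;
    for (a, b) |x, y| total += x * y;
    return total;
}

// The index is now requested explicitly by iterating a range (0..)
// together with the slice.
fn indexOfMax(values: []const i32) ?usize {
    var best: ?usize = null;
    for (values, 0..) |v, i| {
        if (best == null or v > values[best.?]) best = i;
    }
    return best;
}

pub fn main() void {
    const a = [_]i32{ 1, 2, 3 };
    const b = [_]i32{ 4, 5, 6 };
    std.debug.print("{d} {any}\n", .{ dot(&a, &b), indexOfMax(&b) });
}
```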

A first important point to realize is that in the world “change is the only constant”: hardware changes, operating systems change, libraries change, programming languages change (C++20, Python 3, Fortran 2008, …), and your program changes too. Thus a language that is not yet 1.0 is not such a dealbreaker: its changes can normally be integrated along with the others, and they do not add too much to the burden of using the language.

Still there are two main cases in which language changes can be especially disruptive:

programs that are “done” and will seldom be updated, apart from porting them to newer systems
tricky code that one wants to write once and then mainly use, without touching it again

Both cases get more difficult when only a few tests are available.

Language changes are not evenly distributed. If you follow Zig you will realize that (as with any language) some things are stable, work well and are unlikely to change, while others are clunkier, or their implementors are not yet convinced of the solution. Knowing this lets you minimize the chance of future disruptions.

Async, for example, is a touchy subject in Zig. A pretty good idea for it emerged (connected with colorless async), but the realities of implementing it never let the async implementation leave the experimental stage, and it is unlikely to be part of Zig 1.0. The main obstacles are the need to know the stack size a call requires in order to cleanly instantiate stack frames, and the wish for a completely non-allocating implementation (when an LLVM intrinsic might call malloc).

The main source of uncertainty in the future of Zig is the will to make Zig incrementally compilable, meaning that the object code of a single function should be patchable and updatable safely, even with optimizations. It is unclear whether any language changes are needed to achieve this, so I will not discuss its impact other than to say that it requires Zig's own implementation of all layers: not just the frontend, but also the backend, object updating and the linker. Zig is on track to achieve this (making LLVM an optional dependency), and it is consistent with the general drive to make compilation faster.

I/O, though, is a place where some disruption is likely to happen. Zig makes allocation explicit and forces one to pass around explicit allocator objects. This is good, as allocation can be crucial for the performance and behaviour of a program. I/O, and event loop handling in general, is another place that can be crucial for performance: I/O calls can block and dominate the performance of some programs. So far Zig has not handled this in any way; the plan was for an optional version of the standard I/O to use async calls, but that is now on hold. This is discussed in issue 8224. The likely outcome is an explicit io object that somehow encapsulates I/O based on io_uring, select, poll, epoll or aio, and that has to be passed to the calls that perform I/O. This will give I/O the same level of explicitness as allocation, which I think is quite in line with the Zig philosophy, but it will be a bit disruptive to existing programs.

Zen

I think it is good to finish with the official Zig Zen:

Communicate intent precisely.
Edge cases matter.
Favor reading code over writing code.
Only one obvious way to do things.
Runtime crashes are better than bugs.
Compile errors are better than runtime crashes.
Incremental improvements.
Avoid local maximums.
Reduce the amount one must remember.
Focus on code rather than style.
Resource allocation may fail; resource deallocation must succeed.
Memory is a resource.
Together we serve the users.