
How safe is Zig?

235 points | 15 hours | scattered-thoughts.net
AndyKelley11 hours ago

I have one trick up my sleeve for memory safety of locals. I'm looking forward to experimenting with it during an upcoming release cycle of Zig. However, this release cycle (0.10.0) is all about polishing the self-hosted compiler and shipping it. I'll be sure to make a blog post about it exploring the tradeoffs - it won't be a silver bullet - and I'm sure it will be a lively discussion. The idea is (1) escape analysis and (2) in safe builds, secretly heap-allocate possibly-escaped locals with a hardened allocator and then free the locals at the end of their declared scope.

nine_k3 hours ago

I would prefer the compiler to tell me: "Hey, this stack-allocated variable is escaping the function's scope, I can't do that! Allocate it somewhere outside the stack."

Maybe the compiler could offer me a simple way to fix the declaration somehow. But being explicit and transparent here feels important to me; if I wanted to second-guess the compiler and meditate over disassembly, I could pick C++.

randyrand1 hour ago

Just another idea for use after free:

What if we combined the 'non-repeating' malloc idea with 128-bit uuids?

malloc would just return a 128-bit uuid, and to get to the data ptr you'd need to consult a hash table.

  dataPtrArr[hash(uuid)].dataPtr = dataPtr
We'd check if it's been freed by checking:

  dataPtrArr[hash(uuid)].uuid == uuid
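A minimal C sketch of this scheme, with a monotonically increasing 64-bit id standing in for the 128-bit uuid and a fixed-size direct-mapped table standing in for the hash table (the `handle_*` names are invented for illustration; a real version would also handle slot collisions):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TABLE_SIZE 1024

/* One slot per live allocation; ids are never reused, which is what
 * makes a stale handle detectable. */
typedef struct {
    uint64_t id;  /* 0 means the slot is free */
    void    *ptr;
} slot_t;

static slot_t table[TABLE_SIZE];
static uint64_t next_id = 1;

static size_t slot_of(uint64_t id) { return id % TABLE_SIZE; }

/* Returns an opaque handle rather than a raw pointer. */
uint64_t handle_malloc(size_t size) {
    uint64_t id = next_id++;
    slot_t *s = &table[slot_of(id)];
    s->id = id;  /* a real version would handle slot collisions here */
    s->ptr = malloc(size);
    return id;
}

/* NULL if the handle was never valid or has already been freed. */
void *handle_deref(uint64_t id) {
    slot_t *s = &table[slot_of(id)];
    return (s->id == id) ? s->ptr : NULL;
}

void handle_free(uint64_t id) {
    slot_t *s = &table[slot_of(id)];
    if (s->id == id) {
        free(s->ptr);
        s->id = 0;  /* stale handles now fail the id check */
    }
}
```

Because ids are never handed out twice, a use-after-free through a stale handle returns NULL from `handle_deref` instead of reaching recycled memory.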
pjmlp40 minutes ago

At which point it is much simpler to introduce automatic memory management in some form.

That solution is basically how platforms like Psion or Symbian used handles due to memory constraints.

randyrand24 minutes ago

IMO this is still simpler than automatic memory management, and the runtime costs are mostly fixed and predictable.

You also don't need to worry about ref cycles or GC pauses.

randyrand6 hours ago

I feel like the most user-friendly solution for Use After Free is to just use Reference Counting. Basically, just copy Objective-C's zeroing weak reference (`__weak`) design.

For every allocation, also store on the heap a "reference object", that keeps track of this new reference.

  struct reference_t {
    // the ptr returned by malloc
    void* ptr;
    // the stack/heap location that ptr was written to,
    // i.e. the reference location
    void* referenceAddr;
  };
Reference tracking: every time ptr is copied or overwritten, create and delete reference objects as needed.

If ptr is freed, visit every reference and set *referenceAddr = NULL, turning all of these references into NULL pointers.
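As a rough C sketch of this zeroing scheme, with a per-allocation linked list of reference records (the `tracked_*` names are invented; real reference tracking would hook pointer copies and overwrites automatically rather than via an explicit store call):

```c
#include <assert.h>
#include <stdlib.h>

/* A reference record: where a copy of the pointer was stored. */
typedef struct reference {
    void **referenceAddr;   /* the stack/heap location holding the ptr */
    struct reference *next;
} reference_t;

typedef struct {
    void *ptr;              /* the ptr returned by malloc */
    reference_t *refs;      /* every live reference to it */
} tracked_t;

tracked_t *tracked_malloc(size_t size) {
    tracked_t *t = malloc(sizeof *t);
    t->ptr = malloc(size);
    t->refs = NULL;
    return t;
}

/* Storing the pointer somewhere registers that location as a reference. */
void tracked_store(tracked_t *t, void **dest) {
    *dest = t->ptr;
    reference_t *r = malloc(sizeof *r);
    r->referenceAddr = dest;
    r->next = t->refs;
    t->refs = r;
}

/* Freeing visits every reference and turns it into a NULL pointer. */
void tracked_free(tracked_t *t) {
    reference_t *r = t->refs;
    while (r != NULL) {
        *r->referenceAddr = NULL;
        reference_t *next = r->next;
        free(r);
        r = next;
    }
    free(t->ptr);
    free(t);
}
```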

remexre8 hours ago

Could this be integrated into the LLVM SafeStack pass? (I don't know how related Zig still is to LLVM, or if your thing would be implemented there.)

skullt10 hours ago

Does that not contradict the Zig principle of no hidden allocations?

kristoff_it9 hours ago

I don't know the precise details of what Andrew has in mind but the compiler can know how much memory is required for this kind of operation at compile time. This is different from normal heap allocation where you only know how much memory is needed at the last minute.

At least in simple cases, this means that the memory for escaped variables could be allocated all at once at the beginning of the program not too differently to how the program allocates memory for the stack.

messe8 hours ago

Static allocation at the beginning of the program like that can only work for single threaded programs with non-recursive functions though, right?

I’d hazard a guess that the implementation will rely on use-after-free faulting, meaning that the use of any escaped variable will fault rather than corrupting the stack.

wavesquid3 hours ago

No need to limit to single-threaded: as long as we reserve enough space in the TLS area it's possible to work with multiple threads.

Zig has future plans to require recursive functions to declare their maximum stack memory usage up-front, so that will provide the rest.

adwn2 hours ago

No need for recursion or multi-threading: If you call the function in a loop and don't release the escaped local until after the loop, and if the number of loop iterations isn't statically known, then it's impossible to pre-allocate heap storage for that local variable.

anonymoushn13 hours ago

I would like Zig to do more to protect users from dangling stack pointers somehow. I am almost entirely done writing such bugs, but I catch them in code review frequently, and I recently moved these lines out of main() into some subroutine:

  var fba = std.heap.FixedBufferAllocator.init(slice_for_fba);
  gpa = fba.allocator();
slice_for_fba is a heap-allocated byte slice. gpa is a global. fba was local to main(), which coincidentally made it live as long as gpa, but then it was local to some setup subroutine called by main(). gpa contains an internal pointer to fba, so you run into trouble pretty quickly when you try allocating memory using a pointer to whatever is on that part of the stack later, instead of your FixedBufferAllocator.

Many of the dangling stack pointers I've caught in code review don't really look like the above. Instead, they're dangling pointers that are intended to be internal pointers, so they would be avoided if we had non-movable/non-copyable types. I'm not sure such types are worth the trouble otherwise though. Personally, I've just stopped making structs that use internal pointers. In a typical case, instead of having an internal array and a slice into the array, a struct can have an internal heap-allocated slice and another slice into that slice. Like I said, I'd like these thorns to be less thorny somehow.

10000truths13 hours ago

Alternatively, use offset values instead of internal pointers. Now your structs are trivially relocatable, and you can use smaller integer types instead of pointers, which allows you to more easily catch overflow errors.
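A small C sketch of the offset idea (illustrative names): the struct stores an offset into its own buffer rather than an interior pointer, so a byte-for-byte copy relocates cleanly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Offset-based "slice": view_off is an offset into buf, not a char*,
 * so copying or moving the struct can never leave a pointer dangling
 * into the old location. */
typedef struct {
    char     buf[64];
    uint32_t view_off;
    uint32_t view_len;
} message_t;

/* Resolve the view against wherever the struct lives right now. */
const char *message_view(const message_t *m) {
    return m->buf + m->view_off;
}
```

Relocating with memcpy keeps the view valid, which is exactly what breaks with an interior pointer; as a bonus, `uint32_t` offsets are smaller than pointers and easier to bounds-check.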

anonymoushn13 hours ago

This is a good idea, but native support for slices tempts one to stray from the path.

alphazino12 hours ago

> so they would be avoided if we had non-movable/non-copyable types.

There is a proposal for this that was accepted a while ago[0]. However, the devs have been focused on the self-hosted compiler recently, so they're behind on actually implementing accepted proposals.

[0] https://github.com/ziglang/zig/issues/7769

throwawaymaths13 hours ago

This. I believe it is in the works, but postponed to finish up self-hosted.

https://github.com/ziglang/zig/issues/2301

avgcorrection9 hours ago

A meta point to make here but I don’t quite understand the pushback that Rust has gotten. How often does a language come around that flat out eliminates certain errors statically, and at the same time manages to stay in that low-level-capable pocket? And doesn’t require a PhD (or heck, a scholarly stipend) to use? Honestly that might be a once in a lifetime kind of thing.

But not requiring a PhD (hyperbole) is not enough: it should be Simple as well.

But unfortunately Rust is (mamma mia) Complex and only pointy-haired Scala type architects are supposed to gravitate towards it.

But think of what the distinction between no-found-bugs (testing) and no-possible-bugs (a certain class of bugs) buys you; you don’t ever have to even think about those kinds of things as long as you trust the compiler and the Unsafe code that you rely on.

Again, I could understand if someone thought that this safety was not worth it if people had to prove their code safe in some esoteric metalanguage. And if the alternatives were fantastic. But what are people willing to give up this safety for? A whole bunch of new languages which range from improved-C to high-level languages with low-level capabilities. And none of them seem to give some alternative iron-clad guarantees. In fact, one of their selling point is mere optionality: you can have some safety and/or you can turn it off in release. So runtime checks which you might (culturally/technically) be encouraged to turn off when you actually want your code to run out in the wild, where users give all sorts of unexpected input (not just your “asdfg” input) and get your program into weird states that you didn’t have time to even think of. (Of course Rust does the same thing with certain non-memory-safety bug checks like integer overflow.)

nyanpasu647 hours ago

Unsafe Rust is an esoteric language without iron-clad guarantees, and type-level programming and async Rust is an esoteric metalanguage (https://hirrolot.github.io/posts/rust-is-hard-or-the-misery-...). For example, matklad made a recent blog post on "Caches In Rust" (https://matklad.github.io/2022/06/11/caches-in-rust.html). The cache is built around https://docs.rs/elsa, which is built around https://docs.rs/stable_deref_trait/latest/stable_deref_trait..., which is unsound for Box and violates stacked borrows in its current form: https://github.com/Storyyeller/stable_deref_trait/issues/15

There is a recurring trend of sound C programs turning into unsound Rust programs, because shared mutability is often necessary but it's difficult to avoid creating &mut, and Stacked Borrows places strict conditions on constructing &mut T (they invalidate some but not all aliasing *const T).

haberman2 hours ago

I invested a lot of time porting some parsing code I had written to Rust, with the vision that Rust is the memory-safe future. The code I was porting from used arenas, so I tried to use arenas in Rust also.

Using arenas required a bunch of lifetime annotations everywhere, but I was happy to do it if I could get provable memory safety.

I got everything working, but the moment I tried to wrap it in Python, it failed. The lifetime annotation on my struct was a problem. I tried to work around this by using ouroboros and the self-referencing struct pattern. But then I ran into another problem: the Rust arena I was using (Bumpalo) is not Sync, which means references to the arena are not Send. All of my arena-aware containers were storing references to the Arena and therefore were not Send, but wrapping in Python requires it to be Send. I wrote more about these challenges here: https://blog.reverberate.org/2021/12/19/arenas-and-rust.html

You might say "well don't use an arena, use Box, Rc, etc." But now you're telling me to write more complicated and less efficient code, just to make it work with Rust. That is a hard pill to swallow for what is supposed to be a systems language.

socialdemocrat2 hours ago

I think you are too dismissive of the importance of simplicity. Programming is hard. That Rust takes away certain problems doesn’t change that. A lot of coding is just reading and understanding some code. If you have problems understanding some code, then that is also code you are more likely to not catch bugs in.

A compiler cannot be a substitute for your brain. The ability to read code and think clearly about it is a massively important feature because humans at the end of the day are the ones who have to understand code and fix it.

It depends on the person. Programmers are different. Rust works great for some. To me it looks too much like C++ which is something I want to put behind. I know it is a different language but it has a lot of that same focus as C++ that leads to slow compilers and complex looking code.

If I was younger I might have put in the effort, but I am not willing to make the same wrong bet I did with C++. I sunk so much time into perfecting C++ skills and realizing afterwards when using other languages that it was a huge waste.

avgcorrection1 hour ago

Readable code is important? Then keep in mind the context: low-level programming, and writing Safe Rust versus writing in some not-quite-memory-safe language because one would rather be bitten by undefined behavior now and again than have to learn the borrow checker.

Knowing for sure that the code you write is at least memory safe is a certain kind of readability win and I don’t see how anyone can claim that it’s not.

gnuvince3 hours ago

Rust has been my primary language for the past 5 years, but it's moving in a direction that gets it farther away from my own values about what software ought to be like. As more features are added to the language, the ways they interact with each other increases the overall complexity of the language and it becomes hard to keep up.

I really like the safety guarantees that Rust provides and I want to keep enjoying them, but the language -- and more importantly, its ecosystem -- is moving from something that was relatively simple to a weird mish-mash of C++, JavaScript, and Haskell, and I'm keeping an eye out for a possible escape hatch.

Zig, Odin, or Hare are not on the same plane of existence as Rust when it comes to out-of-the-box safety (or, at the moment, out-of-the-box suitability for writing production-grade software), but they are simpler and intend to remain that way. That really jibes with my values. Yes, this means that some of the complexity of writing software is pushed back onto me, the programmer, but I feel that I have a better shot at writing good software with a simple language than with a complex language where I only superficially understand the features.

kristoff_it8 hours ago

> Of course Rust does the same thing with certain non-memory-safety bug checks like integer overflow.

The problem with getting lost too much in the ironclad certainties of Rust is that you start forgetting that simplicity (papa pia) protects you from other problems. You can get certain programs in pretty messed up states with an unwanted wrap around.

Programming is hard. Rust is cool, very cool, but it's not a universal silver bullet.

avgcorrection8 hours ago

Nothing Is Perfect is a common refrain and non-argument.

If option A has 20 defects and option B has the superset of 25 defects then option A is better—the fact that option A has defects at all is completely beside the point with regards to relative measurements.

Karrot_Kream8 hours ago

But if Option A has 20 defects and takes a lot of effort to go down to 15 defects, yet Option B has 25 defects and offers a quick path to go down to 10 defects, then which option is superior? You can't take this in isolation. The cognitive load of Rust takes a lot of defects out of the picture completely, but going off the beaten path in Rust takes a lot of design and patience.

People have been fighting this fight forever. Should we use static types which make it slower to iterate or dynamic types that help converge on error-free behavior with less programmer intervention? The tradeoffs have become clearer over the years but the decision remains as nuanced as ever. And as the decision space remains nuanced, I'm excited about languages exploring other areas of the design space like Zig or Nim.

coldtea7 hours ago

>If option A has 20 defects and option B has the superset of 25 defects then option A is better

Only if "defect count" is what you care for.

What if you don't give a fuck about defect count, but prefer simplicity to explore/experiment quickly, ease of use, time to market, and so on?

notriddle7 hours ago

Then you don't want Zig or Rust. Use a language with a GC. Exploratory programming is a lot more pleasant when you don't have to worry about calling free() at the right time. I've had success with PHP and Elixir for productive, exploratory programming, not just because of their GCs, but also because they both support REPL-driven development and hot code reloading.

qzw6 hours ago

Then just use C? Heck, if you really don't give a fuck about defects, you can just have all your code in main(). You really can't beat that in terms of simplicity to explore/experiment quickly, ease of use, and time to market.

kristoff_it7 hours ago

Zig keeps overflow checks in the main release mode (ReleaseSafe), Rust defines ints as naturally wrapping in release. This means that Rust is not a strict superset of Zig in terms of safety, if you want to go down that route.

I personally am not interested at all in abstract discussions about sets of errors. Reality is much more complicated, each error needs to be evaluated with regards to the probability of causing it and the associated cost. Both things vary wildly depending on the project at hand.

avgcorrection7 hours ago

> This means that Rust is not a strict superset of Zig in terms of safety, if you want to go down that route.

Fair.

> I personally am not interested at all in abstract discussions about sets of errors.

Abstract? Handwaving “no silver bullet” is even more abstract (non-specific).

Avshalom7 hours ago

ReleaseSafe is the main release mode because Zig has a small community that is largely ideologically aligned with it.

I have absolutely no faith that, in a future where Zig is popular, it will remain so. "Well, it passed our unit tests and it didn't fall down when we fuzzed it, so let's squeeze a couple free % of extra perf out of it and ship." Already in this post we have people talking about how safer mallocs or sanitizers are too much of a hit to expect people to use in the wild.

the__alchemist8 hours ago

This is a concise summary of why I'm betting on Rust as the future of performant and embedded computing. You or I could poke holes in it for quite some time. Yet, I imagine the holes would be smaller and less numerous than in any other language capable in these domains.

I think some of the pushback is from domains where Rust isn't uniquely suited. E.g., you see a lot of complexity in Rust for server backends, such as async and traits. So someone not used to Rust may see these and assume Rust is overly complex. In these domains, there are alternatives that can stand toe-to-toe with it. In lower-level domains, it's not clear there are.

cogman107 hours ago

> I think some of the pushback is from domains where Rust isn't uniquely suited. E.g., you see a lot of complexity in Rust for server backends, such as async and traits. So someone not used to Rust may see these and assume Rust is overly complex. In these domains, there are alternatives that can stand toe-to-toe with it. In lower-level domains, it's not clear there are.

The big win for Rust in these domains is startup time, memory usage, and distributable size.

It may be that these things outweigh the easier programming of Go or Java.

Now if you have a big long-running server with lots of hardware at your disposal, then Rust doesn't make a whole lot of sense. However, if you want something like an AWS Lambda or rapid up/down scaling based on load, Rust might start to look a lot more tempting.

voidhorse5 hours ago

The premise of eliminating entire classes of errors in the abstract is nice and all, and definitely something we should do, but it isn’t the sole deciding factor in choosing an implementation language:

- If the language is not well known, that’s bad. It will be harder to hire a proficient team. More time will be spent on learning the language.

- If the syntax is needlessly verbose, that’s bad. It increases the chance for typos and time spent fixing typos to get things to compile. Eventually it leads to IDE-completion-based programming, which results in the degradation of the skill set of the pool of programmers that know that language.

- If the concepts the language uses are difficult to manipulate and remember, it takes more time to engage with any given piece of code.

- If you often need to drop into unsafe modes, that’s also not great, because now you effectively are using two languages, not just one: the safe language and the unsafe language. They interact but play by totally different rules. Yikes.

I think rust is a great language and the borrow checker is an amazing innovation. However I think rust has a lot of warts that will harm its success in the long run. I think the next language that leverages the borrow checker idea but does so with a bit better ergonomics will really take off.

dilap7 hours ago

What Rust does is incredibly cool and impressive.

But as someone that's dabbled a bit in both Zig and Rust, I think there's a lot of incidental complexity in Rust.

For example, despite having used them and read the docs, I'm still not exactly sure how namespaces work in Rust. It takes 30s to understand exactly what is going on in Zig.

pitaj2 hours ago

Can you explain what you mean by your namespaces comment? AFAIK, Rust has modules and crates, not namespaces.

Klonoar7 hours ago

>A meta point to make here but I don’t quite understand the pushback that Rust has gotten.

The non-CS "human" answer to this is that so much of tech and programming is unfortunately tied to identity. There are developers who view their choices as bordering on religion (from editors to languages to operating systems and so on) and across the entire industry you can see where some will take the slightest hint that things could be better as an affront to their identity.

The more that Rust grows and winds up in the industry, the more this will continue to happen.

modeless7 hours ago

It's pretty simple. Rust's safety features (and other language choices) have a productivity cost. For me I found the cost surprisingly high, and I'm not alone (though I'm sure I'll get replies from people who say the borrow checker never bothers them anymore and made them a better programmer, let's just agree there's room to disagree).

Although I'm a big fan of safety, since experiencing Rust my opinion is that low-pause GC is a better direction for the future of safe but high performance programming. And there's also a niche for languages that aren't absolutely safe which I think Zig looks like a great fit for.

crabbygrabby3 hours ago

After having seen GC after GC fail to live up to expectations... I'm still voting for Rust. So much more control over whether you even want to allocate or not. I know where you are coming from, but I see it differently, I guess.

modeless2 hours ago

Whether you allocate or not is a property of the language, not the GC. A lot of GC'd languages encourage or even force allocations all over the place. But maybe we could do better in a new language.

dleslie14 hours ago

And here is the table with Nim added; though potentially many GC'd languages would be similar to Nim:

https://uploads.peterme.net/nimsafe.html

Edit: noteworthy addendum: the ARC/ORC features have been released, so the footnote is now moot.

3a2d2911 hours ago

Seeing Nim's danger mode made me think: shouldn't Rust's unsafe be added?

Seems inaccurate to display Rust as safe and not include what actually allows memory bugs to be found in public crates.

jewpfko13 hours ago

Thanks! I'd love to see a Dlang BetterC column too

Snarwin12 hours ago

Here's a version with D included:

https://gist.github.com/pbackus/0e9c9d0c83cd7d3a46365c054129...

The only difference in BetterC is that you lose access to the GC, so you have to use RC if you want safe heap allocation.

kzrdude5 hours ago

I guess Rust should say "wraps" for integer overflow as well, as that's what it does in a default release compile.

IshKebab10 hours ago

I don't know why Rust gets "runtime" and Nim gets "compile time" for type confusion?

shirleyquirk9 hours ago

Yes, for tagged unions specifically (which the linked post refers to for that row), Nim raises an exception at runtime when trying to access the wrong field (or trying to change the discriminant).

verdagon13 hours ago

A lot of embedded devices and safety-critical software don't even use a heap, and instead use pre-allocated chunks of memory whose size is calculated beforehand. It's memory safe, and has much more deterministic execution time.

This is also a popular approach in games, especially ones with entity-component-system architectures.

I'm excited about Zig for these use cases especially, it can be a much easier approach with much less complexity than using a borrow checker.

jorangreef12 hours ago

This is almost what we do for TigerBeetle, a new distributed database being written in Zig. All memory is statically allocated at startup [1]. Thereafter, there are zero calls to malloc() or free(). We run a single-threaded control plane for a simple concurrency model, and because we use io_uring—multithreaded I/O is less of a necessary evil than it used to be.

I find that the design is more memory efficient because of these constraints, for example, our new storage engine can address 100 TiB of storage using only 1 GiB of RAM. Latency is predictable and gloriously smooth, and the system overall is much simpler and fun to program.

[1] “Let's Remix Distributed Database Design” https://www.youtube.com/channel/UC3TlyQ3h6lC_jSWust2leGg

infamouscow8 hours ago

> Latency is predictable and gloriously smooth, and the system overall is much simpler and fun to program.

This has also been my experience building a database in Zig. It's such a joy.

pcwalton11 hours ago

Even in this environment, you can still have dangling pointers to freed stack frames. There's no way around having a proper lifetime system, or a GC, if you want memory safety.

verdagon11 hours ago

Yep, or generational references [0] which also protect against that kind of thing ;)

The array-centric approach is indeed more applicable at the high levels of the program.

Sometimes I wonder if a language could use an array-centric approach at the high levels, and then an arena-based approach for all temporary memory. Elucent experimented with something like this for Basil once [1] which was fascinating.

[0] https://verdagon.dev/blog/generational-references

[1] https://degaz.io/blog/632020/post.html

com2kid10 hours ago

> Yep, or generational references [0] which also protect against that kind of thing ;)

First off, thank you for posting all your great articles on Vale!

Second off, I just read the generational references blog post for the 3rd time and now it makes complete sense, like stupid obvious why did I have problems understanding this before sense. (PS: The link to the benchmarks is dead :( )

I hope some of the novel ideas in Vale make it out to the programming language world at large!

verdagon10 hours ago

Thank you! I just fixed the link, thanks for letting me know! And if any of my articles are ever confusing, feel welcome to swing by the discord or file an issue =)

I'm pretty excited about all the memory safety advances languages have made in the last few years. Zig is doing some really interesting things (see Andrew's thread above), D's new static analysis for zero-cost memory safety hit the front page yesterday, we're currently prototyping Vale's region borrow checker, and it feels like the space is really exploding. Good time to be alive!

yw34105 hours ago

It feels like it would work really well (you could even swap between arenas per frame). I've been wanting to try something similar but it's early days.

im3w1l10 hours ago

Well, if we get rid of not just the heap but the stack too... turn all variables into global ones, then it will be safe.

This means we lose thread safety and functions become non-reentrant (but easy to prove safe - make sure the graph of A-calls-B is acyclic).

infamouscow8 hours ago

> Even in this environment, you can still have dangling pointers to freed stack frames.

How frequently does this happen in real software? I learned not to return pointers to stack allocated variables when I was 12 years old.

> There's no way around having a proper lifetime system, or a GC, if you want memory safety.

If you're building an HTTP caching program where you know the expiration times of objects, a Rust-style borrow-checker or garbage collector is not helping anyone.

seoaeu4 hours ago

> > Even in this environment, you can still have dangling pointers to freed stack frames.

> How frequently does this happen in real software? I learned not to return pointers to stack allocated variables when I was 12 years old.

This happens rarely. However, the reason it isn't an issue is because C programmers are (and have to be) extremely paranoid about this kind of thing.

Rust, however, lets you recklessly pass around pointers to local variables while guaranteeing that you won't accidentally use one as a return value. One example is scoped thread pools which let you spawn a bunch of worker threads and then pass them pointers to stack allocated variables that get concurrently accessed by all the threads. The Rust type system/borrow checker ensures both thread safety and memory safety.

Would you trust a novice C programmer to use something like that?

Arnavion7 hours ago

>I learned not to return pointers to stack allocated variables when I was 12 years old.

So, if you slip while walking today, does that mean you didn't learn to walk when you were one year old?

infamouscow5 hours ago

Your analogy doesn't answer the question. How frequently does this happen in real software?

brundolf13 hours ago

Rust's borrow checker would be much calmer in these scenarios too, wouldn't it? If there are no lifetimes, there are no lifetime errors

thecatster12 hours ago

Rust is definitely different (and calmer imho) on bare metal. That said (as much of a Rust fanboy I am), I also enjoy Zig.

the__alchemist9 hours ago

Yep! We've entered a grey area, where some Rust embedded libs are expanding the definitions of memory safety and what the borrow checker should evaluate beyond what you might guess. E.g., structs that represent peripherals that are now checked for ownership, the intent being to prevent race conditions. And traits being used to enforce pin configuration.

snicker712 hours ago

How exactly is pre-allocation safer? If you would ever like to re-use chunks of memory, then wouldn’t you still encounter “use-after-free” bugs?

verdagon12 hours ago

The approach can reuse old elements for new instances of the same type, so to speak. Since the types are the same, any use-after-free becomes a plain ol' logic error. We use this approach in Rust a lot, with Vecs.
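A C sketch of this type-stable reuse, using an intrusive free list over a fixed pool (illustrative names): a stale pointer held after release still points at a valid `node_t`, so misuse becomes a logic bug rather than memory corruption.

```c
#include <assert.h>
#include <stddef.h>

/* A pool that only ever recycles nodes of one type. A "freed" node may
 * be handed out again, but any stale pointer still refers to a live,
 * correctly typed node_t. */
typedef struct node {
    int value;
    struct node *next_free; /* intrusive free-list link */
} node_t;

#define POOL_CAP 32
static node_t storage[POOL_CAP];
static node_t *free_head = NULL;
static size_t used = 0;

node_t *node_alloc(void) {
    if (free_head != NULL) {       /* reuse a released node first */
        node_t *n = free_head;
        free_head = n->next_free;
        return n;
    }
    return (used < POOL_CAP) ? &storage[used++] : NULL;
}

void node_release(node_t *n) {
    n->next_free = free_head;
    free_head = n;
}
```

This is essentially what reusing `Vec` slots by index gives you in Rust: the memory never changes type, so the worst case is reading someone else's logically fresh data.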

olig1512 hours ago

But if you have a structure that contains offsets into another buffer somewhere, or an index, whatever - the wrong value here could be just as bad as a use-after-free. I don’t see how this is any safer. If you use memory after free from a malloc, with any luck you’ll hit a page fault and your app will crash. If you have an index/pointer to another structure, you could still end up reading past the end of that structure into the unknown.

kaba011 hours ago

These are 1000 times worse than even a segfault. These are the bugs you won’t notice until they crop up at a wildly different place, and you will have a very hard time tracking them back to their origin (slightly easier in Rust, as you only have to revalidate the unsafe parts, but it will still suck).

nine_k12 hours ago

No; every chunk is for single, pre-determined use.

Imagine all variables in your program declared as static. This includes all buffers (with indexes instead of pointers), all nested structures, etc.

bsder9 hours ago

Normally you do this on embedded so that you know exactly what your memory consumption is. You never have to worry about Out of Memory and you never have to worry about Use After Free since there is no free. That memory is yours for eternity and what you do with it is up to you.

It doesn't, however, prevent you from accidentally scribbling over your own memory (buffer overflow, for example) or from scribbling over someone else's memory.

LAC-Tech7 hours ago

Safe enough. You can use `std.testing.allocator` and it will report leaks etc in your test cases.

What Rust does sounds like a good idea in theory. In practice it rejects too many valid programs, over-complicates the language, and makes me feel like a circus animal being trained to jump through hoops. Zig's solution is hands down better for actually getting work done; plus, it's so dead simple to use arena allocation and fixed buffers that you're likely allocating a lot less in the first place.

Rust tries to make allocation implicit, leaving you confused when it detects an error. Zig makes memory management explicit but gives you amazing tools to deal with it - I have a much clearer mental model in my head of what goes on.

Full disclaimer, I'm pretty bad at systems programming. Zig is the only one I've used where I didn't feel like memory management was a massive headache.
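The fixed-buffer style mentioned above can be sketched in a few lines (an illustration in Rust, loosely in the spirit of Zig's FixedBufferAllocator but not its actual API; a real allocator would also handle alignment):

```rust
// A fixed-buffer bump allocator sketch: all allocations come from one
// pre-sized buffer, and "free" is resetting the offset, which releases
// everything at once.
struct FixedBuffer {
    buf: Vec<u8>,
    offset: usize,
}

impl FixedBuffer {
    fn new(capacity: usize) -> Self {
        FixedBuffer { buf: vec![0; capacity], offset: 0 }
    }

    // Hand out the next `len` bytes, or None when the buffer is exhausted.
    // Allocation is just a pointer bump; there is no per-allocation free.
    fn alloc(&mut self, len: usize) -> Option<&mut [u8]> {
        if self.offset + len > self.buf.len() { return None; }
        let start = self.offset;
        self.offset += len;
        Some(&mut self.buf[start..start + len])
    }

    // Reset frees everything in O(1), with no bookkeeping to get wrong.
    fn reset(&mut self) { self.offset = 0; }
}
```

Allocation failure is an ordinary `None` you handle up front, which is part of why this style makes memory management feel explicit rather than mysterious.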

Klonoar7 hours ago

>Zig's solution is hands down better for actually getting work done

Rust has seen significant usage in large companies; they wouldn't be using it unless it was usable for "real work".

>Full disclaimer, I'm pretty bad at systems programming. Zig is the only one I've used where I didn't feel like memory management was a massive headache.

I'd say this about Rust, though. Rust's mental model is very straightforward if you accept the borrow-checker and stop fighting it. Can you list any examples of what you think is a headache...?

>In practice it rejects too many valid programs, over-complicates the language, and makes me feel like a circus animal being trained to jump through hoops.

I've found that jumping through those hoops leads to things running in production that don't make me get up in the middle of the night. Can you show me a "valid program" that Rust rejects?

voidhorse3 hours ago

I agree with the parent. Rust is hard because it fundamentally inverts the semantics of pretty much every other programming language on earth by making move semantics the default instead of copying.

Yet there's no syntax to indicate this. Worse, actual copies are hidden behind a trait, and you have no way of knowing whether a particular externally defined type implements it short of reading the documentation. A lot of Rust's important mechanics are underrepresented syntactically, which makes the language harder to get used to imo. I agree with the parent that in general it's better for things to be obvious as you're writing them. If rust had syntax that screamed "you're moving this thing" or "you're copying this thing because it implements Copy", that would be a lot easier to get used to than what beginners are currently stuck with, which is a cycle of "get used to the invisible semantics by having the compiler yell at you at build time until you've drilled it into your head past the years of accumulated contrary models". And as soon as you have to use another language this model becomes useless, so expertise in it does not translate to other domains (though that will hopefully change in the future)
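A minimal illustration of the invisible move/copy distinction being described (the types here are made up for the example):

```rust
// Meters is Copy: assignment duplicates the value.
#[derive(Clone, Copy)]
struct Meters(f64);

// Buffer owns heap data and is not Copy: assignment moves it.
struct Buffer(Vec<u8>);

fn main() {
    let m1 = Meters(3.0);
    let m2 = m1;                 // copy: m1 remains usable
    println!("{} {}", m1.0, m2.0);

    let b1 = Buffer(vec![1, 2, 3]);
    let b2 = b1;                 // move: identical syntax, but b1 is now invalid
    // println!("{}", b1.0.len()); // would not compile: use of moved value
    println!("{}", b2.0.len());
}
```

Nothing at the assignment site distinguishes the two cases; whether `=` copies or moves depends entirely on whether the type implements `Copy`, which you learn from documentation or from compiler errors.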

LAC-Tech7 hours ago

>Rust has seen significant usage in large companies; they wouldn't be using it unless it was usable for "real work".

I didn't say it wasn't usable. I said I found Zig more usable.

>I'd say this about Rust, though. Rust's mental model is very straightforward if you accept the borrow-checker and stop fighting it. Can you list any examples of what you think is a headache...?

Mate, I didn't start learning rust in order to wage war against the borrow checker. I had no idea what the hell it wanted a lot of the time. Each time I fixed an error I thought I got it, and each time I was wrong. The grind got boring.

As for specific examples no, I've tried to put rust out of my mind. I certainly can't remember specific issues from 3 months ago.

>I've found that jumping through those hoops leads to things running in production that don't make me get up in the middle of the night. Can you show me a "valid program" that Rust rejects?

Yeah, that's how rust was sold: the compiler is your friend, and once stuff compiles it will never fail.

In reality the compiler was so irritating I hardly got anything done at all. The output wasn't super reliable software, it was no software.

woodruffw14 hours ago

This was a great read, with an important point: there's always a tradeoff to be made, and we can make it (e.g. never freeing memory to obtain temporal memory safety without static lifetime checking).

One thought:

> Never calling free (practical for many embedded programs, some command-line utilities, compilers etc)

This works well for compilers and embedded systems, but please don't do it in command-line tools that are meant to be scripted against! It would be very frustrating (and a violation of the pipeline spirit) to have a tool that works well for `N` independent lines of input but not `N + 1` lines.

samatman13 hours ago

There are some old-hand approaches to this which work out fine.

An example would be a generous rolling buffer, with enough room for the data you're working on. Most tools which are working on a stream of data don't require much memory, they're either doing a peephole transformation or building up data with filtration and aggregation, or some combination.

You can't have a use-after-free bug if you never call free, treating the OS as your garbage collector for memory (not other resources please) is fine.
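The rolling-buffer pattern can be sketched like this (a hypothetical line-oriented filter, not from any real tool): one buffer is allocated up front and reused for every line, so memory stays bounded by the longest line rather than the number of lines, and nothing is freed mid-run.

```rust
use std::io::{self, BufRead, Write};

// Peephole-style stream transform: uppercase each line. One String is
// allocated once and reused, so memory use does not grow with input size.
fn transform<R: BufRead, W: Write>(mut input: R, mut out: W) -> io::Result<()> {
    let mut buf = String::new();
    loop {
        buf.clear(); // reuse existing capacity; no free, no realloc in steady state
        if input.read_line(&mut buf)? == 0 {
            break; // EOF
        }
        out.write_all(buf.to_uppercase().as_bytes())?;
    }
    Ok(())
}
```

This handles `N` or `N + 1` lines alike, which addresses the scriptability concern upthread about never-freeing tools.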

woodruffw13 hours ago

Yeah, those are the approaches that I've used (back when I wrote more user tools in C). I wonder how those techniques translate to a language like Zig, where I'd expect the naive approach to be to allocate a new string for each line/datum (which would then never truly be freed, under this model.)

anonymoushn13 hours ago

I've been writing a toy `wordcount` recently, and it seems like if I wanted to support inputs much larger than the ~5GB file I'm testing against, or inputs that contain a lot more unique strings per input file size, I would need to realloc, but I would not need to free.

woodruffw13 hours ago

Is that `wordcount` in Zig? My understanding (which could be wrong) is that reallocation in Zig would leave the old buffer "alive" (from the allocator's perspective) if it couldn't be expanded, meaning that you'd eventually OOM if a large enough contiguous region couldn't be found.

anonymoushn13 hours ago

It's in zig but I just call mmap twice at startup to get one slab of memory for the whole file plus all the space I'll need. I am not sure whether Zig's GeneralPurposeAllocator or PageAllocator currently use mremap or not, but I do know that when realloc is not implemented by a particular allocator, the Allocator interface provides it as alloc + memcpy + free. So I think I would not OOM. In safe builds when using GeneralPurposeAllocator, it might be possible to exhaust the address space by repeatedly allocating and freeing memory, but I wouldn't expect to run into this on accident.

dundarious10 hours ago

They don't (at least the GPA's defaulting backing allocator is the page_allocator, which doesn't). https://github.com/ziglang/zig/blob/master/lib/std/heap.zig

woodruffw13 hours ago

That's interesting, thanks for the explanation!

avgcorrection8 hours ago

> This was a great read, with an important point: there's always a tradeoff to be made, and we can make it (e.g. never freeing memory to obtain temporal memory safety without static lifetime checking).

I.e. we can choose to risk running out of memory? I don’t understand how this is a viable strategy unless you know you only will process a certain input size.

woodruffw7 hours ago

Yes. There are many domains where you know exactly how much memory you’ll need (even independent of input size), so just “leaking” everything is a perfectly valid technique.

longrod3 hours ago

I think Zig has a lot more footguns due to its explicit nature. When you don't hide the details away from the human, you increase the risk of writing bad code, and it becomes increasingly harder to make the compiler detect each potentially bad decision.

Rust did it but they had to rethink the whole problem from the ground up. Rust is safe but that safety had quite the learning cost as compared to, say, Zig or Go. The good thing about this is that you can't use many of the bad practices from other languages that have become habits.

But Zig is still in beta/alpha stage so let's see how they increase the overall safety in the coming months/years. My experience with Zig has left me quite satisfied, especially the comptime features but the explicitness sometimes gets in the way of readability.

galangalalgol3 hours ago

why does go always get brought up when talking about rust or nim or now zig? It's got GC, a large std lib, less latency than java but noticeably slower, and it's not suited for embedded or drivers. It's a completely different language for a completely different niche than zig or rust, though maybe I could see a comparison to nim... Maybe. I do like go for what it is, a batteries-included back end language with syntax nicer than java, but I mostly do embedded HPC, so I don't reach for go often.

I can believe that abstractions prevent errors, but can't a language allow footguns but have syntax that makes them easy to detect?

pjmlp26 minutes ago

I am not a big Go fan, yet whether it's suited for embedded or drivers is a mindset question, as proven by USB hardware keys being shipped by F-Secure with Go firmware, gVisor, the Android GPU debugger, ARM's official support for TinyGo...

And since you speak of Java, companies like PTC, Aicas, microEJ, ExelsiorJET (now gone) have been happily doing business on embedded, with bare metal or RTOS based deployments.

And then there is Meadow/Netduino on .NET world, and Astrobe for Oberon as commercial products as well. Astrobe has been in business for 20 years now.

longrod1 hour ago

Go is a compiled language with comparable performance to Zig/Rust/C in a lot of use cases. It's heavily ergonomic for network/server side programming, it's safe, easy to use with a tiny footprint.

Rust is not only, or even primarily, used in embedded or driver programming. Indeed, I see more user-space software built in rust than embedded software nowadays. As for Zig, it's so new and alpha that it hasn't even found a niche yet. Rust also hasn't yet settled into a specific niche; it's all over the place in GUI, networking, embedded etc., which isn't a bad thing, to be clear.

It might be harder to do embedded in Go but that's not the point here, is it? It's about safety. Go has a special place in that it is compiled, easy to learn, quite performant for most tasks, and safe to boot.

> I can believe that abstractions prevent errors, but can't a language allow footguns but have syntax that makes them easy to detect?

What would be the point though? If a language can detect footguns, it's time to prevent them which is what Rust does, essentially. Footguns are rarely useful and there's always an alternative way. For these rare cases, Rust includes the unsafe escape hatch but then all bets are off. Don't expect the compiler to help you if you are intent on going down that road.

pjmlp18 minutes ago

Just point people to F-Secure TamaGo unikernel for embedded firmware as one possible example,

https://www.withsecure.com/en/solutions/innovative-security-...

TinkersW6 hours ago

Falsely representing the state of C & C++ doesn't really lead to a convincing argument. All those safety checks Zig supports are easily enabled in C++, and widely used. Sometimes they are even on by default.

tptacek13 hours ago

"Temporal" and "spatial" is a good way to break this down, but it might be helpful to know the subtext that, among the temporal vulnerabilities, UAF and, to an extent, type confusion are the big scary ones.

Race conditions are a big ugly can of worms whose exploitability could probably be the basis for a long, tedious debate.

When people talk about Zig being unsafe, they're mostly reacting to the fact that UAFs are still viable in it.

jorangreef12 hours ago

I see your UAF and raise you a bleed!

As you know, buffer bleeds like Heartbleed and Cloudbleed can happen even in a memory safe language, they're hard to defend against (padding is everywhere in most formats!), easier to pull off than a UAF, often remotely accessible, difficult to detect, remain latent for a long time, and the impact is devastating. All your RAM are belong to us.

For me, this can of worms is the one that sits on top of the dusty shelf, it gets the least attention, and memory safe languages can be all the more vulnerable as they lull one into a false sense of safety.

tptacek12 hours ago

Has an exploitable buffer bleed (I'm happy with this coinage!) happened in any recent memory safe codebase?

jorangreef12 hours ago

I worked on a static analysis tool to detect bleeds in outgoing email attachments, looking for non-zero padding in the ZIP file format.

It caught different banking/investment systems written in memory safe languages leaking server RAM. You could sometimes see the whole intranet web page, that the teller or broker used to generate and send the statement, leaking through.

Bleeds terrify me, no matter the language. The thing with bleeds is that they're as simple as a buffer underflow, or forgetting to zero padding. Not even the borrow checker can provide safety against that.
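A toy version of such a bleed in a memory-safe language (names and sizes invented for the example): a fixed-size record buffer is reused across requests, and a serializer that forgets to zero the padding ships stale bytes from the previous request. There is no undefined behavior here and nothing for a borrow checker to object to.

```rust
const RECORD: usize = 16;

// BUGGY: copies the payload into a reused fixed-size record but never
// zeroes the tail, so whatever the previous call left in the buffer
// leaks out as "padding".
fn serialize_buggy(buf: &mut [u8; RECORD], payload: &[u8]) -> Vec<u8> {
    buf[..payload.len()].copy_from_slice(payload);
    // BUG: bytes payload.len()..RECORD still hold the previous payload.
    buf.to_vec()
}

// FIXED: explicitly zero the padding before shipping the record.
fn serialize_fixed(buf: &mut [u8; RECORD], payload: &[u8]) -> Vec<u8> {
    buf[..payload.len()].copy_from_slice(payload);
    buf[payload.len()..].fill(0);
    buf.to_vec()
}
```

The buggy version is exactly the kind of defect a ZIP-padding scanner like the one described above would catch: the record is well-formed, the program never crashes, and the leak only shows up if you inspect the padding bytes.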

kaba011 hours ago

Would that work in the case of Java, for example? It nulls every field as per the specification (observably, at least), so unless someone writes some byte mangling manually I don’t necessarily see it working out.

dkersten12 hours ago

I’m not sure I understand the value of an allocator that doesn’t reuse allocations, as a bug prevention thing. Is it just for performance? (Since it's never reused, allocation can simply be incrementing an offset by the size of the allocation.) Because beyond that, you can get the same benefit in C by simply never calling free on the memory you want to “protect” against use-after-free.

kaba011 hours ago

I believe it is only for performance, as malloc will have to find place for the allocation, while it is a pointer bump only for a certain kind of allocator.

anonymoushn8 hours ago

The allocations are freed and the addresses are never reused. So heap use-after-frees are segfaults.

lmh13 hours ago

Question for Zig experts:

Is it possible, in principle, to use comptime to obtain Rust-like safety? If this was a library, could it be extended to provide even stronger guarantees at compile time, as in a dependent type system used for formal verification?

Of course, this does not preclude a similar approach in Rust or C++ or other languages; but comptime's simplicity and generality seem like they might be beneficial here.

pron12 hours ago

Not as it is (it would require mutating the type's "state"), but hypothetically, comptime could be made to support even more programmable types. But could doesn't mean should. Zig values language simplicity and explicitness above many other things.

lmh6 hours ago

Thanks, that's informative. This was meant to clarify the bounds of Zig's design rather than as a research proposal. Otherwise, one might read it as an open invitation to just the sort of demonic meta-thinking that its users abhor.

kristoff_it9 hours ago

Somebody implemented part of it in the past, but it was based on the ability to observe the order of execution of comptime blocks, which is going to be removed from the language (probably already is).

https://github.com/DutchGhost/zorrow

It's not a complete solution, among other things, because it only works if you use it to access variables, as the language has no way of forcing you.

lmh6 hours ago

Thanks, that's interesting.

avgcorrection8 hours ago

Why would the mere existence of some static-eval capability give you that affordance?

Researchers have been working on these three things for decades. Yes, “comptime” isn’t some Zig invention but a somewhat limited (and anachronistic to a degree) version of what researchers have added to research versions of ML and Ocaml. So can it implement all the static language goodies of Rust and give you dependent types? Sure, why not? After all, computer scientists never had the idea that you can evaluate values and types at compile-time. Now all those research papers about static programming language design will wither on their roots now that people can just use the simplicity and generality of `comptime` to prove programs correct.

anonymoushn12 hours ago

It is possible in principle to write a Rust compiler in comptime Zig, but the real answer is "no."

ptato13 hours ago

Not an expert by any means, but my gut says that it would be very cumbersome and not practical for general use.

pjmlp13 hours ago

This is why for me, Zig is mostly a Modula-2 with C syntax in regards to safety.

All the runtime tooling it offers, already exists for C and C++ for at least 30 years, going back to stuff like Purify (1992).

belter13 hours ago

1 year ago, 274 comments.

"How Safe Is Zig?": https://news.ycombinator.com/item?id=26537693

baby5 hours ago

So zig has runtime integer overflow protection by default? That's interesting considering Rust lost that battle
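For comparison, Rust's compromise is overflow panics in debug builds and wrapping in release builds (unless `overflow-checks` is enabled in the release profile), with explicit checked/wrapping/saturating operations always available. A small sketch:

```rust
fn main() {
    let x: u8 = 250;

    // Explicit arithmetic variants are always well-defined,
    // independent of build mode:
    assert_eq!(x.checked_add(10), None);        // overflow detected
    assert_eq!(x.wrapping_add(10), 4);          // opt in to wrapping
    assert_eq!(x.saturating_add(10), u8::MAX);  // clamp at the maximum

    // Plain `x + 10` would panic here in a debug build and wrap in a
    // release build (unless overflow-checks is turned on).
    println!("overflow behavior demonstrated");
}
```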

afdbcreid13 hours ago

Can compilers really never call `free()`?

A simple compiler probably can get away with that. The most complex ones probably cannot (I don't want to imagine a Rust compiler that never frees memory: it has 7 layers of lowering (source code->tokens->AST->HIR->THIR->MIR->monomorphized MIR, excluding the final LLVM IR) and also allocates a lot while type-checking or borrow-checking).

What is most interesting to me is the average compiler. Does somebody have statistics on how much the average compiler allocates and frees?

com2kid13 hours ago

> Can compilers really never call `free()`?

I worked on, one of the many, Microsoft compiler teams, though as a software engineer in test not directly on the compiler itself, and I believe the lead dev told me they don't free any memory, though I could be misremembering since it was my first job out of college.

Remember C compilers are often one file at a time (and a LOLWTF # of includes), and the majority of work goes into making a single output file, and then you are done. Freeing memory would just take time, better to just hand it all back to the OS.

Also compilers are obsessed with correctness, generating incorrect code is to be avoided at all costs. Dealing with memory management is just one more place where things can go wrong. So why bother?

I do remember running out of memory using link time code gen though, back when everything was 32bit.

Related, I miss the insane dedication to quality that team had. Every single bug had a regression test created for it. We had regression tests 10-15 years old that would find a bug that would have otherwise slipped through. It was a great way to start my career off, just sad I haven't seen testing done at that level since then!

Arnavion5 hours ago

>Related, I miss the insane dedication to quality that team had. Every single bug had a regression test created for it.

Compilers in particular are usually easy to have rigorous regression testing for. Investigating any issue usually forces you to produce a minimal repro because of how complicated a compiler is. Also, all the inputs and outputs are known. Then it's just a one tiny extra step of putting that all together into a checked-in test.

kaba011 hours ago

Bootstrapping aside, a compiler written in a GCd language would make perfect sense. It really doesn’t have any reason to go lower level than that (other than of course, if one wants to bootstrap it in the same language that happens to be a low-level one)

com2kid11 hours ago

There is no reason to free memory. Your process is going to hard exit after a set workload.

If you wrote a compiler in a GCd language, you'd want to disable the collector because that just takes time, and compilers are slow enough as it is!

kaba010 hours ago

A good GC will not really increase the execution time at all -- they turn on only after a significant "headroom" of allocations. For short runs they will hardly do any work.

Also, most of the work will be done in parallel, and I really wouldn't put aside that a generational GC's improved cache effect (moving still used objects close) might even improve performance (all other things being equal, but they are never of course). All in all, do not assume that just because a runtime has a GC it will necessarily be slower, that's a myth.

MaxBarraclough10 hours ago

I imagine plenty of compilers do call free, but here's a 2013 article by Walter Bright on modifying the dmd compiler to never free, and to use a simple pointer-bump allocator (rather than a proper malloc) resulting in a tremendous improvement in performance. [0] (I can't speak to how many layers dmd has, or had at the time.)

The never-free pattern isn't just for compilers of course, it's also been used in missile-guidance code.

[0] https://web.archive.org/web/20190126213344/https://www.drdob...

TazeTSchnitzel11 hours ago

> Can compilers really never call `free()`?

If a compiler has to be run multiple times in the same process, it may use an arena allocator to track all memory, so you can free it all in one go once you're done with compilation. Delaying all freeing until the end effectively eliminates temporal memory issues.

woodruffw13 hours ago

LLVM uses a mixed strategy: there's both RAII and lots of globally allocated context that only gets destroyed at program cleanup. I believe GCC is the same.

Rustc is written entirely in Rust, so I would assume that it doesn't do that.

notriddle11 hours ago

The headline feature of rustc memory management is the use of arenas: https://github.com/rust-lang/rust/blob/10f4ce324baf7cfb7ce2b...

    //! The arena, a fast but limited type of allocator.
    //!
    //! Arenas are a type of allocator that destroy the objects within, all at
    //! once, once the arena itself is destroyed. They do not support deallocation
    //! of individual objects while the arena itself is still alive. The benefit
    //! of an arena is very fast allocation; just a pointer bump.
The other thing (not specifically mentioned in this comment, but mentioned elsewhere, and important to understanding why it work the way it does) is that if everything in the arena gets freed at once, it implies that you can soundly treat everything in the arena as having exactly the same lifetime.

You can see an example of how every ty::Ty<'tcx> in rustc winds up with the same lifetime, and an entry point for understanding it more, here in the dev guide: https://rustc-dev-guide.rust-lang.org/memory.html

However, arena allocation doesn't cover all of the dynamic allocation in rustc. Rustc uses a mixed strategy: there's both RAII and lots of arena allocated context that only gets destroyed at the end of a particular phase.
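A stripped-down sketch of the arena idea (an illustration, not rustc's actual arena): every reference the arena hands out borrows from the arena itself, so all allocations observably share one lifetime and are freed together when the arena is dropped.

```rust
use std::cell::RefCell;

// Minimal typed arena: objects live exactly as long as the arena, so every
// `&T` it hands out shares the arena's lifetime. There is no per-object free.
struct Arena<T> {
    items: RefCell<Vec<Box<T>>>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { items: RefCell::new(Vec::new()) }
    }

    fn alloc(&self, value: T) -> &T {
        let boxed = Box::new(value);
        let ptr: *const T = &*boxed;
        self.items.borrow_mut().push(boxed);
        // Sound because the Box's heap contents never move, and they are
        // only dropped when the arena itself is dropped.
        unsafe { &*ptr }
    }
}
```

This is the structure behind the `'tcx` pattern: because everything dies with the arena, the borrow checker can treat all arena-allocated values as having one shared lifetime.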

zRedShift11 hours ago

For further reading, I recommend Niko's latest blog, tangentially related to rustc internals (and arena allocation): https://smallcultfollowing.com/babysteps/blog/2022/06/15/wha...

woodruffw11 hours ago

Yep -- arenas compose very nicely with lifetimes, and basically accomplish the same thing as global allocation (in effect, a 'static arena) but with more control.

IshKebab13 hours ago

I presume he means that compilers could be written to never call `free()`. I'm sure that most of them are not written like that, though they do tend to be very leaky and just `exit()` at the end rather than clean everything up neatly (partly because it's faster).

nwellnhof13 hours ago

UBSan has a -fsanitize-minimal-runtime flag which is supposedly suitable for production:

https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#...

So it seems that null-pointer dereferences and integer overflows can be checked at runtime in C. Besides, there should be production-ready C compilers that offer bounds checking.

pjmlp13 hours ago

There should but there aren't, GCC had a couple of extensions on a branch like 20 years ago that never got merged.

The best is to use C++ instead, with bounds checked library types.

uecker12 hours ago

You can already get some bounds checking, although more work is needed:

https://godbolt.org/z/abx7KE44z

pjmlp11 hours ago

Yeah, indeed. Thanks for sharing it.

ajross12 hours ago

> In practice, it doesn't seem that any level of testing is sufficient to prevent vulnerabilities due to memory safety in large programs. So I'm not covering tools like AddressSanitizer that are intended for testing and are not recommended for production use.

I closed the window right there. Digs like this (the "not recommended" bit is a link to a now famous bomb thrown by Szabolcs on the oss-sec list, not to any kind of industry consensus piece) tell me that the author is grinding an axe and not taking the subject seriously.

Security is a spectrum. There are no silver bullets. It's OK to say something like "Rust is better than Zig+ASan because", it's quite another to refuse to even treat the comparison and pretend that hardening tools don't exist.

This is fundamentally a strawman, basically. The author wants to argue against a crippled toolchain that is easier to beat instead of one that gets used in practice.

klyrs12 hours ago

As a Zig fan, I disagree. I think it's really important to examine the toolchain that beginners are going to use.

> I'm also focusing on software as it is typically shipped, ignoring eg bounds checking compilers like tcc or quarantining allocators like hardened_malloc which are rarely used because of the performance overhead.

To advertize that Zig is perfectly safe because things like ASan exist would be misleading, because that's not what users get out of the box. Zig is up-front and honest about the tradeoffs between safety and performance, and this evaluation of Zig doesn't give any surprises if you're familiar with how Zig describes itself.

ajross12 hours ago

> To advertize that Zig is perfectly safe because things like ASan exist would be misleading

Exactly! And for the same reason. You frame your comparison within the bounds of techniques that are used in practice. You don't refuse to compare a tool ahead of time, especially when doing so reinforces your priors.

To be blunt: ASan is great. ASan finds bugs. Everyone should use ASan. Everyone should advocate for ASan. But doing that cuts against the point the author is making (which is basically the same maximalist Rust screed we've all heard again and again), so... he skipped it. That's not good faith comparison, it's spin.

KerrAvon12 hours ago

ASAN doesn’t add memory safety to the base language. It catches problems during testing, assuming those problems occur during the testing run (they don’t always! ASAN is not a panacea!). It’s perfectly fair to rule it out of bounds for this sort of comparison.

lmm7 hours ago

> You frame your comparison within the bounds of techniques that are used in practice.

Well, is ASan used in practice, by the relevant target audience (i.e. mainstream C++ developers)? My guess is that the vast majority of the people both Rust and Zig are aiming for are people who don't use ASan with C++ today and wouldn't use ASan with Rust or Zig if they switched to them.

klyrs6 hours ago

Wait, are you saying that because the author didn't push your personal agenda, that's spin? Hardly.

einpoklum9 hours ago

One-liner summary: Zig has run-time protection against out-of-bounds heap access and integer overflow, and partial run-time protection against null pointer dereferencing and type mixup (via optionals and tagged unions); and nothing else.

ArrayBoundCheck14 hours ago

I like zig but this is taking a page out of rust's book and exaggerating the problems of C and C++

clang and gcc will both tell you at runtime if you go out of bounds, have an integer overflow, use after free etc. You need to turn on the sanitizer. You can't have them all on at the same time because code will be unnecessarily slow (ex: having thread sanitizer on in a single threaded app is pointless)

hyperpape13 hours ago

Can you explain why, in spite of the fact that (according to you) C & C++ aren't that unsafe, critical projects like Chromium can't get this right? https://twitter.com/pcwalton/status/1539112080590217217

Is the Project Zero team just too lazy to remind Chromium to use sanitizers?

jerf12 hours ago

While I'm generally in favor of the proposition that C++ is an intrinsically dangerous language, pointing at one of the largest possible projects that uses it isn't the best argument. If I pushed a button and magically for free Chrome was suddenly in 100% pure immaculate Rust, I'm sure it would still have many issues and problems that few other projects would have, just due to its sheer scale. I would still consider it an open question/problem as to whether Rust can scale up to that size and still be something that humans can modify. I could make a solid case that the difficulty of working in Rust would very accurately reflect a true and essential difficulty of working at that scale in general, but it could still be a problem.

(Also Rust defenders please note I'm not saying Rust can't work at that scale. I'm just saying, it's a very big scale and I think it's an open problem. My personal opinion and gut say yes, it shouldn't be any worse than it has to be because of the sheer size (that is, the essential complexity is pretty significant no matter what you do), but I don't know that.)

hyperpape11 hours ago

You're right that Chromium* is a very difficult task, but I disagree with the conclusion you draw. I think Chromium is one of the best examples we can consider.

There would absolutely be issues, including security issues. But there is also very good evidence that the issues that are most exploited in browsers and operating systems relate to memory safety. Alex Gaynor's piece that the author linked is good on this point.

While securing Chromium is huge and a difficult task, it and consumer operating systems are crucial for individual security. Until browsers and consumer operating systems are secure, individuals ranging from persecuted political dissidents to Jeff Bezos won't be secure.

* Actually not sure why I said Chromium rather than Chrome. Nothing hangs on the distinction, afaict.

ArrayBoundCheck11 hours ago

Considering how much I got downvoted, no, I don't want to comment more about this. But I'll let you ponder why, even when using rust, you could still sometimes get a use after free: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-4572...

hyperpape10 hours ago

Here's the commit: https://github.com/jeromefroe/lru-rs/pull/121/commits/416a2d....

I don't think this does much for your initial claim. Take the most generous reading you can--Rust isn't any better at preventing UAF than C/C++. That doesn't make safe C/C++ a thing, it means that Rust isn't an appropriate solution.

alfiedotwtf8 hours ago

> Rust isn't any better at preventing UAF than C/C++

Maybe I'm missing something here?

uecker12 hours ago

I think the big question is, whether two teams writing software on a fixed budget using Rust or C using modern tools and best practices would end up with a safer product. I think this is not clear at all.

pcwalton11 hours ago

People have done just that with, for example, Firefox components and found that yes, Rust gives you a safer product.

uecker11 hours ago

Do you have a pointer? I know they rewrote Firefox components, but I am not aware of a real study with a 1:1 comparison.

lmm7 hours ago

I think it's very clear for anything other than a no-true-Scotsman definition of "modern tools and best practices" (which is sadly the only one that seems to exist).

uecker12 hours ago

(Ok, I should read the text before sending.)

woodruffw13 hours ago

Neither Clang nor GCC has perfect bounds or lifetime analysis, since the language semantics forbid it: it's perfectly legal at compile time to address at some offset into a supplied pointer, because the compiler has no way of knowing that the memory there isn't owned and initialized.

Sanitizers are great; I love sanitizers. But you can't run them in production without a significant performance hit, and that's where they're needed most. I don't believe this post blows that problem out of proportion, and is correct in noting that we can solve it without runtime instrumentation and overhead.

AlotOfReading13 hours ago

State of the art sanitizing is pretty consistently in the <50% overhead range (e.g. SANRAZOR), with things like UBSAN coming in under 10%. If you can't afford even that, tools like ASAP have been around for 7-ish years now to make overhead arbitrarily low by trading off increased false-negatives in hot codepaths.

Yes, the "just-enable-the-compiler-flags" approach can be expensive, but the tools exist to allow most people to be sanitizing most of the time. Devs simply don't know what's available to them.

woodruffw13 hours ago

I'd consider even 10% to be a significant performance hit. People scream bloody murder when CPU-level mitigations cause even 1-2% regressions. The marginal cost of mitigations when memory safe code can run without them is infinite.

But let's say, for the sake of argument, that I can tolerate programs that run twice as long in production. This doesn't improve much:

* I'm not going to be deploying SoTA sanitizers (SANRAZOR is currently a research artifact; it's not available in mainline LLVM as far as I can tell.)

* No sanitizer that I know of guarantees that execution corresponds to memory safety. ASan famously won't detect reads of uninitialized memory (MSan will, but you can't use both at the same time), and it similarly won't detect layout-adjacent overreads/writes.

That's a lot of words to say that I think sanitizers are great, but they're not a meaningful alternative to actual memory safety. Not when I can have my cake and eat it too.

ArrayBoundCheck10 hours ago

> I'd consider even 10% to be a significant performance hit. People scream bloody murder when CPU-level mitigations cause even 1-2% regressions. The marginal cost of mitigations when memory safe code can run without them is infinite.

What people? And in my experience, Rust has always cost much more than a 2% regression.

anonymoushn12 hours ago

> People scream bloody murder when CPU-level mitigations cause even 1-2% regressions

For a particular simulation on a particular Cascade Lake chip, mitigations collectively cause it to run about 30% slower. So I won't scream about 1%, but that's a lot of 1%s.

com2kid13 hours ago

> it's perfectly legal at compile time to address at some offset into a supplied pointer, because the compiler has no way of knowing that the memory there isn't owned and initialized.

Embedded land: everything is a flat memory map, odds are malloc isn't used at all, and memory is possibly zeroed on boot.

It is perfectly valid to just start walking all over memory. You have a bunch of #defines with known memory addresses in them and you can just index from there.

Fun fact: Microsoft Band writes crash dumps to a known location in SRAM and because SRAM doesn't instantly lose its contents on reboot, after a crash the runtime checks for crash dump data at that known address and if present would upload the crash dump to servers for analysis and then 0 out that memory.[1]

Embedded rocks!

[1] There is a bit more to it to ensure we aren't just reading random data after a long power-off, but I wasn't part of the design; I just benefited from a 256KB RAM wearable having crash dumps that we could download debugging symbols for.

masklinn13 hours ago

> clang and gcc will both tell you at runtime if you go out of bounds [...] You can't have them all on at the same time because code will be unnecessarily slow

Yeah, so clang and gcc don't actually tell you at runtime if you go out of bounds. How many programs ship production binaries with ASan or UBSan enabled, to say nothing of MSan or TSan?

Also, you can't have them all on at the same time because they're not necessarily compatible with one another[0]; you literally can't run with both ASan and MSan, or ASan and TSan.

[0] https://github.com/google/sanitizers/issues/1039

pjmlp10 hours ago

Quite a few subsystems on Android, but that is about it.

https://source.android.com/devices/tech/debug/hwasan

xedrac13 hours ago

> So I'm not covering tools like AddressSanitizer that are intended for testing and are not recommended for production use.

How is it an exaggeration when he explicitly called this out?

ArrayBoundCheck10 hours ago

ASAN isn't just "for testing". A lot of people went straight to the chart (like me) and it reeks of bullshit: double free is the same as use after free; null pointer dereference is essentially the same as type confusion, since a nullable pointer is confused with a non-null pointer; invalid stack read/write is the same as array out of bounds (or invalid pointers); etc.

I've also never heard of a data race existing without a race condition existing. That's a pointless metric, like many of the others I mentioned.

lijogdfljk13 hours ago

What is the cause of all those notorious C bugs then?

CodeSgt13 hours ago

> at runtime

wyldfire13 hours ago

One interesting distinction is that it sounds as if, for Zig, this is a language feature and not a toolchain feature. Although if there's only one toolchain for Zig, maybe that's a distinction without a difference. At least it's not opt-in, which is really nice. Believe it or not, there are lots of people who write and debug C/C++ code who either don't know about sanitizers or know about them and never decide to use them.

throwawaymaths13 hours ago

I think it would be interesting to see Zig move towards an annotation-based compile-time lifetime-checking plugin (ideally in-toolchain, but alternatively as a library). You could choose to turn it on selectively for security-critical pathways, turn it off for "trust me" functions, or run it only on some recompilations, as desired.

pjmlp10 hours ago

The irony being that lint has existed since 1979, and just using a static analyser would already be a big improvement in some code bases.

kubanczyk13 hours ago

Whoa the username checks out perfectly.

ArrayBoundCheck11 hours ago

Haha, yes. I love knowing I'm in bounds, but unfortunately saying anything about C++ (that isn't a criticism) is out of bounds, and my comment got downvoted enough that I don't feel like saying more.