There is some work on making unsafe Rust safer, or at least easier to get right, but those are significant undertakings that will take years to pay off.
On the other hand, safe Rust is so much easier than correct C that I don't particularly mind the tradeoff.
D isn't Rust, but my personal style has gravitated towards the borrow checker rules anyway. It wouldn't be hard at all to do it in C, either.
C's biggest security failing, however, is the lack of array bounds checking. It's always the #1 cause of security bugs in the field, by a wide margin. None of the workarounds in Standard C are attractive. I do not understand why the C committee adds all these bits and pieces of new features, and does not fix that problem. It's not like it's hard:
https://www.digitalmars.com/articles/C-biggest-mistake.html
The solution has been out there for 15 years now.
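For comparison, the fat-pointer fix the article proposes looks a lot like what Rust slices already do: the length travels with the pointer, so bounds are always checkable. A minimal Rust sketch:

    // A slice is a (pointer, length) pair, so the callee can always check bounds:
    fn checksum(data: &[u8]) -> u32 {
        data.iter().map(|&b| b as u32).sum()
    }

    fn main() {
        let buf = [1u8, 2, 3];
        println!("{}", checksum(&buf)); // 6
        // Out-of-range indexing into a slice panics deterministically
        // instead of silently reading past the end of the buffer.
    }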
I kinda cry every time I see a new API with foo(void *src, int sz) signatures.
Correct C is not actually hard to write. It can be somewhat hard to convince oneself that it's right.
That's an important difference. Not all C has to be correct. If it's experimental or exploratory code that'll be thrown away, or if it just works on one machine, on the particular inputs you care about, or only under certain specific environmental conditions, that can be fine.
Correctness is more than just memory safety and avoiding overflows. There are still bugs in programs and languages that avoid those problems.
They don't mean that you should write unsafe Rust for experimental or exploratory code that will be thrown away. You could do that, but it's not common.
More importantly: this approach of parsimoniously writing unsafe code in a larger-scale safe system does not work in C/C++. Memory safety issues can (and do) crop up all over your system, not just in the small "unsafe" parts you validate carefully.
https://securitycryptographywhatever.com/2024/10/15/a-little...
Even with careful rules and oversight and secure coding idioms and library exclusions and fuzzing, the rate of memory safety defects in C/C++ code is still pretty high.
You can write code which you think erases some security-sensitive information to zero bits after a calculation, but the compiler throws the zeroing away. The program is perfectly safe in that it doesn't crash on any bad input or miscalculate on any good input, but it doesn't meet a security requirement.
The crux is that what the language standard calls "observable behavior", and behavior that can actually be observed (with security implications) are two different things.
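For illustration, a hedged Rust sketch of one common workaround (crates like zeroize package this up properly): volatile writes count as observable, so the optimizer can't elide them the way it can a plain memset of soon-to-be-freed memory.

    use std::ptr;
    use std::sync::atomic::{compiler_fence, Ordering};

    // Zero a buffer in a way the optimizer is not allowed to remove.
    fn zero_secret(buf: &mut [u8]) {
        for b in buf.iter_mut() {
            unsafe { ptr::write_volatile(b, 0) };
        }
        compiler_fence(Ordering::SeqCst);
    }

    fn main() {
        let mut key = *b"super secret key";
        zero_secret(&mut key);
        assert!(key.iter().all(|&b| b == 0));
    }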
    /* no declarations for these functions are visible in this file */
    int main(void)
    {
        global_init();
        service_loop();
        global_cleanup();
    }
Some of these functions, defined in another translation unit, could require parameters, but their declarations may be incorrect or missing from this translation unit. So the program will link, but the behavior is undefined.

"Unsafe" means that "no diagnostic is required when a rule is broken". When correctness depends on the programmer, such that the tooling silently accepts incorrectness, that is unsafe.
Almost all C is unsafe, but not all coding situations carry the same risk of a problem actually occurring.
People working in non-Rust languages, writing just small amounts of low-level, unsafe parts in Rust, would be better off using C.
Rust offers tons of tools that are very useful for correctness outside of memory safety. I hope to never program in a language without sum types ever again, if possible.
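A minimal sketch of what sum types buy you (hypothetical types, just for illustration):

    // The compiler forces every case to be handled; adding a variant later
    // turns every non-exhaustive match into a compile error.
    enum Event {
        Connected { session: u64 },
        Payload(Vec<u8>),
        Disconnected,
    }

    fn describe(e: &Event) -> String {
        match e {
            Event::Connected { session } => format!("connected as {session}"),
            Event::Payload(bytes) => format!("{} bytes", bytes.len()),
            Event::Disconnected => "disconnected".into(),
        }
    }

    fn main() {
        println!("{}", describe(&Event::Payload(vec![1, 2, 3])));
    }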
The summary is that the research that you reference was not done in good faith.
Repeating his claims in text form would be much more useful than linking to a video. I am not anti-video, but for example, I am out in public right now, and cannot watch it, and therefore can't meaningfully respond.
Some bullets:
* Experience levels in C++ and Rust are not equivalent. The experience gap between C++ and Rust developers can skew results, as C++ has been around much longer, affecting developer proficiency and code complexity.
* Comparison of codebases: legacy C++ vs. greenfield Rust. Comparing a mature C++ codebase with a new Rust project is like comparing apples to oranges; age and complexity can impact productivity metrics significantly.
* Measurement metrics can be misleading. The metrics used to determine productivity often lack depth, failing to account for feature complexity or development duration.
* Organizations have asymmetric composition of resources. The management style and communication within teams can heavily influence productivity, indicating that not all teams function under the same conditions.
* The experience gap between C++ and Rust should give the edge to C++, as it has a larger pool of candidates to pick from.
* Even in greenfield Rust vs. greenfield C++ comparisons, like the Rust drivers in Android, Rust significantly reduced the defect rate.
* Measurements in this study were done on teams of the same size rewriting an existing service. Neither you nor Primagen demonstrates how that would be misleading.
* See the first point. While true, how can you be sure the C++ team isn't overall better than the Rust team? What if the C++ team is three senior engineers and the Rust team three mid-level developers?
Actual possible problems:
* There isn't yet enough data for language comparison. Too early to tell.
* Observer effect might have influenced the study.
Ok. Looks inside. First warning sign: Primagen. Ok. What did he talk about:
> There is lies, damn lies and statistics.
> Go is goated, let's go.
Truly dedunked.
I believe that's a good thing (in general, constructive constraints are good), but it is just an opinion.
Cyclic structures in memory are not bad just because the one-memory-management-trick pony language you're currently infatuated with doesn't handle them within its safety paradigm.
Graph structures are entirely legitimate in computer science, and are handled nicely by garbage collection which is the gold standard in memory management.
Fully general lexical closures lead to cycles even when no variables are mutated. And even if no circular definitions are supported in the language.
We can use the Y combinator to create a cycle: a situation when a function's captured environment contains a binding whose value is that function itself.
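Not the Y combinator itself, but a back-patched Rust sketch of the resulting shape: the closure's captured environment ends up holding the closure itself, a cycle that reference counting alone never reclaims (a tracing GC handles it fine).

    use std::cell::RefCell;
    use std::rc::Rc;

    type F = Rc<dyn Fn(u32) -> u32>;

    fn main() {
        // A slot that will eventually hold the closure that captures it.
        let slot: Rc<RefCell<Option<F>>> = Rc::new(RefCell::new(None));
        let fact: F = {
            let slot = Rc::clone(&slot);
            Rc::new(move |n| {
                if n == 0 {
                    1
                } else {
                    let f = slot.borrow().clone().unwrap();
                    n * f(n - 1)
                }
            })
        };
        *slot.borrow_mut() = Some(Rc::clone(&fact));
        println!("{}", fact(5)); // 120
        // fact -> slot -> fact is now a cycle: dropping both Rc handles
        // leaves the refcounts nonzero, so the memory is never reclaimed.
    }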
I was talking to a friend last night, and programming languages came up, including Rust. I remarked to him that I felt that Rust's biggest weakness was its learning curve. But once you're past that, you're golden.
It really isn't. It's quite nice to use. The big hangup is that Rust "wants" you to structure your programs according to a certain philosophy. Unfortunately the compiler can't teach you that approach directly, but can only complain when you write programs that don't abide by it.
In my experience, following this design philosophy has benefits even outside of memory safety, and I now write code following these principles in every language.
> Correct C is not actually hard to write.
Quite literally all of the available evidence strongly disagrees with this conclusion.
> Correctness is more than just memory safety and avoiding overflows. There are still bugs in programs and languages that avoid those problems.
This is true, but C lacks tools to help with any of them. Fixing memory safety and overflows is in and of itself an enormous win, but Rust also has other tools that assist in program correctness (sum types being a huge one).
Easy to write is not the same thing as easy to prove. It may be hard to show that some C program is correct, but once that is done, it doesn't change the fact that the program was easy to write (if it was).
C is actually "unreasonably effective". You can't just look at the problems; the overall picture is that a lot of large and complicated, yet well-working systems have been achieved in C. They may not be correct, but are "correct enough".
Still, C is unreasonably effective. A lot of large and complicated, yet mostly well-working (modulo regular CVEs) systems have been achieved in C. And that’s truly wonderful. I cut my teeth on C nearly 30 years ago and I still hold a soft spot in my heart for it.
But other languages these days are even more unreasonably effective. They need fewer lines of code to achieve the same results and they have fewer security vulnerabilities and logic bugs per line as well.
Rule of thumb: almost never fix some alleged problem without a repro test case.
The CVE database lists only entries related to situations gone wrong. It doesn't list any information about working software that has no issues.
Also nothing will appear in the CVE database in relation to a language that nobody uses.
The database also has some garbage entries.
Overall the database is actually paltry in size in relation to the vast amounts of stuff out there written in C.
Also, correct C programs can have security issues, because some security issues depend on behaviors that are observable in practice but not "observable" according to the language standard. ISO C is mum about memory being observed, side-channel information being monitored, or the timing of operations.
Ugh, not this again. Yes it is. It's hard. Veteran C programmers (I include myself in this category), even with compiler warnings, linters, and address sanitizers, still make mistakes all the time.
> Not all C has to be correct. If it's experimental or exploratory code that'll be thrown away, or if it just works on one machine, on the particular inputs you care about, or only under certain specific environmental conditions, that can be fine.
So what? That's not the kind of code we're talking about.
> Correctness is more than just memory safety and avoiding overflows. There are still bugs in programs and languages that avoid those problems.
That's either whataboutism or a straw man. No one is claiming that languages that help avoid common C memory-related bugs like out-of-bounds access, use-after-free, memory leaks, etc. will also somehow magically make all your logic errors go away too.
Even veteran eaters bite their tongue or lip once in a while. That doesn't mean eating is hard.
> No one is claiming
Only about 3 out of 4 Rust advocates.
> That's not the kind of code we're talking about.
We should talk about all code. Code that gets thrown away, rewritten, or improved from prototype to production. It is relevant to productivity. If I'm going to throw away some code to get to the code I want, it will be an extra waste of time if it is difficult to write.
Kinda. This would be more like if my mouth were stuffed with anti-tongue-bite technology and I'd trained for years using scientifically proven techniques to avoid biting my tongue, but inevitably I do anyway and then my identity and the identity of all of my friends get stolen.
I love C and actively dislike Rust, but minimizing the effort spent compensating for C's shortcomings, and the consequences of them, is very head-in-the-sand.
In my experience, it helps to have a stronger type system if you want to avoid logic errors, regardless of which languages you're comparing.
- Type safety
- "Memory safety"
We have to understand it: it is religion. No science.
In 10 years maybe somebody will start realising that even with those two, software can still be very bad. And those two are not silver bullets... until then... patience.
Integer safety, on the other hand, is pretty hard and non-trivial to get right, unfortunately.
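For what it's worth, this is one place where Rust makes the decision explicit at each call site; a small illustration:

    fn main() {
        let a: u8 = 200;
        let b: u8 = 100;
        assert_eq!(a.checked_add(b), None);       // overflow detected
        assert_eq!(a.wrapping_add(b), 44);        // wraparound, on purpose
        assert_eq!(a.saturating_add(b), u8::MAX); // clamp to 255
        // A bare `a + b` panics in debug builds and wraps in release builds.
    }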
I would advise you not to say such blatantly false things. You paint yourself either as a liar or as unqualified. C has been proven time and time again to be almost impossible to write correctly. Even top-tier programmers like djb cannot write safe C.
> C has been proven time and time again to be almost impossible to write correctly.
Silly hyperbole.
I've been sticking with the concept of unsafe language, whereas you're talking about an unsafe program.
The two are related. An unsafe program misbehaves when given bad inputs; its behavior is undefined.
An unsafe language is, in effect, an unsafe program whose bad inputs are certain kinds of incorrect programs: those which are not diagnosed.
It's easy to write unsafe programs in unsafe languages. If we only focus on the program's functional requirements and none of them speak about what to do with bad inputs or in bad situations, we can end up with a program which is perfectly correct but unsafe. It will behave as specified on the correct inputs.
If you happen to be advocating Rust due to misunderstanding this kind of material, you're in it for the wrong reasons.
You can't assume programming in unsafe Rust is like programming in C. Unsafe doesn't mean "do whatever you want"; it means you get to do things that the compiler won't check for you, but it's your responsibility to ensure you're still maintaining Rust's invariants.
So yes, unsafe Rust is harder than C. I think that's fine, though. Most people will be able to get by without writing any unsafe Rust, and those who still need to use it will end up writing very little of it. And tools like MIRI exist to help increase your confidence that your unsafe Rust is correct.
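A sketch of the usual shape (hypothetical function): the unsafe block is tiny, and the safe wrapper around it is what establishes the invariant the unsafe code needs.

    // Safe wrapper: callers can't misuse it, because the bounds reasoning
    // lives here rather than at every call site.
    fn first_half(data: &[u8]) -> &[u8] {
        let mid = data.len() / 2;
        // SAFETY: mid <= data.len(), so the range is in bounds.
        unsafe { data.get_unchecked(..mid) }
    }

    fn main() {
        assert_eq!(first_half(&[1, 2, 3, 4]), &[1, 2]);
    }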
You have behaviors for all four quadrants: there's stuff that's okay in Rust and not okay in C, stuff that's okay in both, stuff that's not okay in both, and stuff that's not okay in Rust and is okay in C.
Among the things you can try to do but will be prevented from doing: use-after-free, double-free, using uninitialized variables, null references, modifying constants, data races. See the example below.
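For example, here's the use-after-free case caught at compile time (the commented line is the part the compiler rejects):

    fn main() {
        let v = vec![1, 2, 3];
        let first = &v[0]; // borrows into v's heap buffer
        // drop(v); // compile error: cannot move out of `v` because it is borrowed
        println!("{first}");
    }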
Rust is more restrictive by default, but C's restrict keyword lets you opt in to no-alias semantics in exchange for unlocking compiler optimizations that would otherwise be unsound, so it's not like you're free from having to think about aliasing in high-performance C code.
Here’s the issue: waiting_for_elements is a Vec<Waker>. The channel cannot know how many tasks are blocked, so we can’t use a fixed-size array. Using a Vec means we allocate memory every time we queue a waker. And that allocation is taken and released every time we have to wake.
Why isn't a structure that does amortized allocation an option here? I appreciate the design goal was "no allocations in steady-state", but that's what you'd expect if you were using C++'s std::vector: After a while the reserved space for the vector gets "big enough".
And my response: https://www.reddit.com/r/rust/comments/1gbqy6c/comment/ltpv0...
One typical approach is double-buffering the allocation but it doesn't work here because you need to pull out the waker list to call `wake()` outside of the mutex. You could try to put the allocation back, but you have to acquire the lock again.
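Roughly, the pattern being described, as a sketch with hypothetical names:

    use std::sync::Mutex;
    use std::task::Waker;

    fn wake_all(shared: &Mutex<Vec<Waker>>) {
        // Pull the whole list out so wake() runs outside the lock.
        let mut list = std::mem::take(&mut *shared.lock().unwrap());
        for w in list.drain(..) {
            w.wake();
        }
        // `list` still owns its capacity, but handing it back for reuse
        // costs a second lock acquisition, and new wakers may have been
        // queued in the meantime, so we can only restore it if empty.
        let mut guard = shared.lock().unwrap();
        if guard.is_empty() {
            *guard = list;
        }
    }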
I had an implementation that kept a lock-free waker pool around https://docs.rs/wakerpool/latest/wakerpool/ but now you're paying for atomics too, and it felt like this was all a workaround for a deficiency in the language.
Intrusive lists are the "correct" data structure, so I kept pushing.
It almost feels like an engineering tool/language, by devs for devs, instead of by devs for “customers” where customer == “outside” devs, of which I am one.
However, I would much rather that more languages like Go and Zig simply take off in popularity instead, and that we just reject the eyesore and cognitive mess that is Rust syntax. It’s a language which has no regard for beauty.
Personally I don’t think it makes sense to write all software in Rust. A GC makes it much easier to write certain patterns that are frustrating to write in Rust.
It's a language that's definitely worth learning; it expands the mind in a way that languages like Go (or Java, or any GC language) will not. There is a great beauty and elegance to its design.
I agree that Rust has a lot of syntactic warts, but the common stuff I find beautiful. [1]
[1] https://smallcultfollowing.com/babysteps/blog/2024/10/14/ove...
1. This is perhaps just my preference, and so is more subjective, but I don't want to have to pick and choose among various options for doing linting or static analysis or address sanitizing or model checking after the fact. I want compilation to fail if any of these invariants don't hold. Rust can do that; C never will be able to. (Sure, if I write unsafe Rust, I'm going to want to run MIRI on it, but if I can stick to safe Rust, the compiler should be sufficient.)
2. I'm probably not as well-versed in the topic as you are, but my understanding is that model checking tools like this cannot prove that every single program that a compiler will accept will also be free of the issues that the model checker is looking for. Again, the Rust compiler can do that. Yes, that does mean that the Rust compiler might reject some programs that would turn out to be safe and sound, but I'm ok with that trade off.
2. You're not interested in proving that every program that the compiler accepts will be free of issues. You're interested in whether YOUR program that you write is free of those issues. The Rust compiler CAN'T do this, because the Rust compiler is only looking at a SUBSET of the possible things that you can build model checks against. This is why Kani -- a model checker for Rust -- exists. You can model check unsafe code in Rust, as well as safe code in Rust against user assertions and function contracts that are otherwise not possible to check in vanilla Rust.
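For concreteness, a minimal Kani harness might look like this (my example, not from the thread): the checker explores every input satisfying the assumption, not just the handful a unit test would pick.

    // Verified with `cargo kani`; kani::any() gives a fully symbolic value.
    #[cfg(kani)]
    #[kani::proof]
    fn midpoint_is_in_range() {
        let a: u32 = kani::any();
        let b: u32 = kani::any();
        kani::assume(a <= b);
        let mid = a + (b - a) / 2; // written this way to avoid overflow in a + b
        assert!(a <= mid && mid <= b);
    }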
Model checking isn't just for C, but as a practical form of formal methods it brings the same and even better safety to C. In fact, with Kani, you can get similar safety in Rust as well.
If you like Rust, use it. But, as was the point of my comment, it is possible and practical to get similar safety in C.
Why doesn’t the Linux kernel embrace model checking instead of experimenting with Rust?
The reason why some Rust enthusiasts have been experimenting with Rust in the Linux kernel is because they are passionate about Rust, and kernel maintainers are looking to find younger people. It's neither an endorsement of Rust nor an argument against model checking in C.
The reality is that this tooling isn't yet well known about. As it becomes better known, it will be adopted.
Also, model checking - at least as you’re portraying it - sounds too good to be true. If it was anywhere close to the realm of practicality for large C codebases (without maintaining a model separate from code), we would be hearing its praises being sung by C devs all over.
Not really. Linus Torvalds has been quite open about this topic, and it has been covered extensively on LWN.
> Also, model checking - at least as you’re portraying it - sounds too good to be true.
> If it was anywhere close to the realm of practicality... we would be hearing its praises being sung by C devs all over.
Like any technology, model checking takes effort to use and learn. But, it does work quite well. Again, you're conflating the popularity of something with its effectiveness, which is a poor argument.
Typically, if you’re advocating for a relatively unknown technology that you want others to adopt, the onus is on you to describe how it is better and to be upfront about its limitations. Good luck with your book!
CBMC's abstract machine can detect memory safety violations and integer-related UB. This includes things like use-after-free, buffer overruns, heap/stack corruption, thread races, fence-post errors, integer conversion and promotion errors, and signed integer overflow. When it detects a violation, it provides a counter-example demonstrating it. It also allows the user to build custom assertions, which allows function contracts to be built and enforced. Any function can be defined as the entry point for the model checker, which allows function-by-function analysis.

Shadow methods can be substituted, which provides the abstraction necessary to model check entire code bases. A shadow method uses non-determinism to cover all possible inputs, outputs, and side effects that the real function could perform. This abstraction also allows modeling of third-party libraries, user/kernel code, and hardware.

So far, I've model checked code bases of around half a million lines of code. It will easily scale to cover a code base the size of the Linux kernel, as long as you understand how to use it. It's an engineering problem at this point. Tracking that abstraction matches implementation is actually not that hard, and can be done by writing good function contracts which are verified by the model checker. If the shadow and the original code follow the same function contract, which includes all possible inputs, outputs, and side effects, the abstraction can be substituted.
The biggest limitation really comes down to large recursive data structures, which is also a pain point for Rust for that matter. There are ways to deal with this, but that's probably the most significant place where any code base customization is required. It's possible to refactor this code to be just as fast in modern C, but in a way that is easier for the model checker to verify quickly.
It's impossible to convert the trillions of lines of code written in C to Rust or any other language without blowing the entire budget of the tech sector for 30 years. Rewrites are prohibitively expensive. Tooling and automation for this tooling is not nearly that expensive.
Or is it some form of symbolic execution? This I doubt because I believe the performance is not there yet.
I will read up a bit more on CBMC.
There is definitely a performance impact here, which is why it is important to decompose the program and verify it function by function. This decomposition is sound as long as the model checking scenario covers the complete function contract. To improve performance further, I use abstraction in the form of shadow methods. These are sort of like mocks for the model checker. They provide the same function contract -- inputs, outputs, and side effects -- as the original function, but using built-in non-determinism provided by the model checker. This simplifies the overall SMT equation while maintaining an approximation of the overall program. By defining external function contracts, I can use the model checker to verify that both the original function and the shadow function follow the same contract, which keeps the two in sync. The shadow functions are used to replace functions called by the function under model check in order to isolate this function and simplify the overall SMT equation.
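In Kani terms (the Rust analogue mentioned elsewhere in the thread), a shadow might look something like this hypothetical stand-in for a parser: it uses non-determinism to cover everything the real function's contract allows, so the checker never has to look inside the real body.

    // Hypothetical shadow of `parse_version`: the contract says it returns
    // Ok(v) with 1 <= v <= 3, or Err(()). Non-determinism covers all cases.
    #[cfg(kani)]
    fn shadow_parse_version(_input: &[u8]) -> Result<u16, ()> {
        if kani::any() {
            let v: u16 = kani::any();
            kani::assume((1..=3).contains(&v));
            Ok(v)
        } else {
            Err(())
        }
    }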
The tool provides the mechanism, but it has taken me six years of work and research to develop a practical way to scale it. The book will cover the tool, but it is documenting this "cheat sheet" that is the real purpose for it.
For what it's worth, I'm also considering an edition that covers Rust and Kani.
Oh, also get rid of header files, they are archaic. And I want fearless concurrency... And sum types!
My comment was not to imply that somehow C is superior to X, Y, or Z, but rather to point out that the safety problem with C does have a practical solution.
In conclusion, programming is a land of contrasts.
Well, everything is easier than something that doesn't exist.
Since I actually started using C, I've realized how easy it is to be lazy and not handle memory right, which makes Rust and maybe C++ seem more appealing. But from trying to figure out random segfaults, it seems like Address Sanitizer and Valgrind catch more of the low-hanging fruit than I would have assumed.
I guess I should look more into how Rust manages that safety or understand what memory safety is trying to accomplish more formally. I've taken GC for granted for years until I needed to care about memory.
An example of low-hanging fruit is -fwrapv. This flag takes a behavior that is undefined, signed overflow, and converts it to defined behavior: two's complement wrapping. That improves safety, but it does not prevent all errors. There are many flags like this, but they all tackle individual aspects of the problem, and even if you turn them all on, there are situations which aren't caught.
If you want this level of safety, which is possible in C, then you need to use a model checker. Model checking C isn't as trivial as adding a flag to the compiler, but it can be done with about as much overhead as unit testing, if a reasonable idiomatic style is followed, and if the model checker is used well.
It is still a decision problem, and thus has similar limitations, but you can perform steps to ensure that you have some level of soundness with unwinding assertions and other techniques.
I've written both, albeit way more Rust.
For me it's header files, no package manager, doing pointer magic all the time (void pointers are... Brrr!), concurrency, ...
I could go on. I don't enjoy it, although I like simple and easy things.
C is simple, but not easy
I work on high-performance libraries. When I compare C and Rust code: void pointers are used in a lot of BLAS libraries to dispatch different functions with different type information, and they also include reimplementations of threading libraries that you'd get as a function from a crate in Rust. This, coupled with the lack of docs.rs, with header files, and with C not having namespaces/modules (especially for complex projects), means my cognitive load is higher for C than for Rust.
A non-negligible number of these libraries are also hard to set up in a non-Unix environment (mainly due to not having an OS-independent build system). One of my favorite things is looking at Rust CLI projects, checking them out, installing them with cargo install $(name_of_project), and having them just work.
Also, a lot of other stuff doesn't come by default (e.g. for testing, you just write #[test] in Rust, while for C you need third-party tooling; see the sketch at the end of this comment).
The more complex the project becomes, the more of these needs you notice.
Some of these you might need or not; it all depends on your case. But for myself (someone also used to other, more modern tooling systems), these do matter to a non-negligible extent.
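(For the testing point above, this is the whole setup in Rust; run it with `cargo test`, no third-party harness needed.)

    fn add(a: i32, b: i32) -> i32 {
        a + b
    }

    #[cfg(test)]
    mod tests {
        use super::add;

        #[test]
        fn adds() {
            assert_eq!(add(2, 2), 4);
        }
    }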
Either I write the code myself, or I pull in a dependency.
Compile times I agree with though.
https://internals.rust-lang.org/t/type-inference-breakage-in...
On the dependency issues, I would disagree. Firstly, a large number of dependencies (especially if you're working on a computational project, e.g. on a university cluster) is not really an issue: in the projects I worked on with C++/Python bindings, we already used a lot of Python bindings and dependencies from the Python ecosystem. It is just the nature of experimental/numerical projects. Limiting the number of dependencies for numerical projects (e.g. simulation of a physics system) is very rare, given how little academia cares about software development; they pull in whatever helps them (nothing wrong with that, since they have other things to care about than software quality).
Secondly, it just depends on the number of dependencies you pull in; you have the option not to include them and to write it yourself, which is what C projects tend to do. It is a trade-off. Given how easy it is to manage other things with rustup (e.g. tooling versioning), I prefer this one.
I think this is more of a philosophical difference: modern tooling (where you use tools like cargo/pip with a simple declarative config API) vs. make (where you do more of it yourself).
There can be some issues with dependencies, e.g. breaking changes, but these are not unsolvable problems. If you choose your dependencies carefully, I would prefer having to manage dependency versions over writing it myself (again, you can write it yourself if you want). Also, these issues are really rare.
I don't know about the instability of Rust itself or where you are getting this claim from. Rust promises backwards compatibility and uses tools like Crater to make sure of it: https://github.com/rust-lang/crater
Maybe you are talking about MSRV bumps as semver-breaking changes. There has been a lot of discussion about this; you can read up on why that choice was made.
Regarding compilation time, you have to be more careful about what features you use (not spreading traits throughout your code) and use incremental compilation; see: https://matklad.github.io/2021/09/04/fast-rust-builds.html
Another one is missing features, e.g. generics (that is why I'd prefer C++ over C). But that is arguably one of the features of C, not a disadvantage.
Safe Rust doesn't have this "feature".
This makes multi-threaded code in C very difficult to write correctly beyond the simplest cases. It's even harder to ensure it's reliable when 3rd-party code is involved. The C compiler has no idea what thread safety even is, so it can't help you (but it can backstab you with unexpected optimizations when you use regular types where atomics are required). It's up to you to understand the thread-safety documentation of all code involved, if such documentation exists at all. It's even more of a pain to debug data races, because they can be impossible to reproduce when a debugger or printf slows down or accidentally synchronises the code as a side effect.
OTOH thread-safety is part of Rust's type system. Use of non-thread-safe data in a multi-threaded context is reliably and precisely caught at compile time, even across boundaries of 3rd party libraries and callbacks.
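A small illustration: this compiles because Arc<Mutex<_>> is Send + Sync; swap Arc for Rc (which isn't Send) and the spawn line becomes a compile error rather than a latent data race.

    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        let counter = Arc::new(Mutex::new(0));
        let handles: Vec<_> = (0..4)
            .map(|_| {
                let counter = Arc::clone(&counter);
                thread::spawn(move || {
                    *counter.lock().unwrap() += 1;
                })
            })
            .collect();
        for h in handles {
            h.join().unwrap();
        }
        assert_eq!(*counter.lock().unwrap(), 4);
    }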
Is that so? On every push back? I’d expect it’d only do an allocation when the current array segment is almost full… as a vector you might write by hand or like the ones in the C++ standard libraries do.
"Vec does not guarantee any particular growth strategy when reallocating when full, nor when reserve is called. The current strategy is basic and it may prove desirable to use a non-constant growth factor. Whatever strategy is used will of course guarantee O(1) amortized push."
Seems it should be amortized just like in C++?
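Easy to check empirically; counting capacity changes shows the geometric growth:

    fn main() {
        let mut v: Vec<u32> = Vec::new();
        let mut reallocs = 0;
        let mut cap = v.capacity();
        for i in 0..1024 {
            v.push(i);
            if v.capacity() != cap {
                reallocs += 1;
                cap = v.capacity();
            }
        }
        // Geometric growth: ~11 capacity changes for 1024 pushes, not 1024.
        println!("{reallocs} reallocations for {} pushes", v.len());
    }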
Re: performance considerations. This is important, but for a performance critical application, any compiler, library etc version change can cause regressions, so it seems better to benchmark often and then tackle this, rather than make assumptions based on implicit (or even explicit) guarantees.