Do you actually audit open source projects you download?
from OhVenus_Baby@lemmy.ml to selfhosted@lemmy.world on 29 May 07:26
https://lemmy.ml/post/30846701
The question is simple. I wanted to get a general consensus on whether people actually audit the code they use from FOSS or open source software and apps.
Do you blindly trust the FOSS community? I'm trying to get a rough idea here. Do you sometimes audit the code? Only for mission-critical apps? Not at all?
Let’s hear it!
For personal use? I never do anything that would qualify as “auditing” the code. I might glance at it, but mostly out of curiosity. If I’m contributing then I’ll get to know the code as much as is needed for the thing I’m contributing, but still far from a proper audit. I think the idea that the open-source community is keeping a close eye on each other’s code is a bit of a myth. No one has the time, unless someone has the money to pay for an audit.
I don’t know whether corporations audit the open-source code they use, but in my experience it would be pretty hard to convince the typical executive that this is something worth investing in, like cybersecurity in general. They’d rather wait until disaster strikes and then pay more.
My company only allows downloads from official sources, verified publishers, signed where we can. This is enforced by only allowing the repo server to download stuff and only from places we’ve configured. In general those go through a process to reduce the chances of problems and mitigate them quickly.
We also feed everything through a scanner to flag known vulnerabilities and unacceptable licenses.
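For anyone curious, a minimal sketch of what that kind of gate can look like; the actual scanner isn't named here, so these tools are just stand-ins for a Python/Node setup:

# stand-in example of the vulnerability/licence gate described above (not the actual tooling in use)
pip-audit -r requirements.txt        # flag Python dependencies with known vulnerabilities
pip-licenses                         # list dependency licences to compare against the allowed list
npm audit --audit-level=high         # same idea for Node dependencies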
If it’s fully packaged installable software, we have security folks who take a look at it. I have no idea what they do or whether it counts as an audit.
I’m actually going round in circles with this one developer. He needs an open source package and we already cache it on the repo server in several form factors, from reputable sources… but he wants to run a random GitHub component which downloads an unsigned tar file from an untrusted source.
Packaged products ready to use? No.
Libraries which I use in my own projects? I at least have a quick look at the implementation, often a more detailed analysis if issues pop up.
I know Lemmy hates AI, but auditing open source code seems like something it could be pretty good at. Maybe that’s something that will start happening more.
This is one of the few things that AI could potentially actually be good at. Aside from the few people on Lemmy who are entirely anti-AI, most people just don’t want AI jammed willy-nilly into places where it doesn’t belong to do things poorly that it’s not equipped to do.
Those are silly folks lmao
Exactly, fuck corporate greed!
Eh, I kind of get it. OpenAI’s malfeasance with regard to energy usage, data theft, and the aforementioned rampant shoe-horning (maybe “misapplication” is a better word) of the technology has sort of poisoned the entire AI well for them, and it doesn’t feel (and honestly isn’t) necessary enough that it’s worth considering ways that it might be done ethically.
I don’t agree with them entirely, but I do get where they’re coming from. Personally, I think once the hype dies down enough and the corporate money (and VC money) gets out of it, it can finally settle into a more reasonable steady state and the money can actually go into truly useful implementations of it.
I mean, that’s why I call them silly folks; that’s all still attributable to the corporate greed we all hate. But I’ve also seen them shit on research work and papers just because “AI”. So yeah lol
I don’t hate AI, I hate how it was created, how it’s foisted on us, the promises it can do things it really can’t, and the corporate governance of it.
But I acknowledge these tools exist, and I do use them because they genuinely help and I can’t undo all the stuff I hate about them.
If I had millions of dollars to spend, sure I would try and improve things, but I don’t.
Daniel Stenberg claims that the curl bug reporting system is effectively DDOSed by AI wrongly reporting various issues. Doesn’t seem like a good feature in a code auditor.
I’ve been on the receiving end of these. It’s such a monumental time waster. All the reports look legit until you get into the details and realize it’s complete bullshit.
But if you don’t look into it maybe you ignored a real report…
Lots of things seem like they would work until you try them.
It wouldn’t be good at it; at most it would be a little patch for non-audited code.
In the end it would just be an AI-powered antivirus.
‘AI’ as we currently know it is terrible at this sort of task. It’s not capable of understanding the flow of the code in any meaningful way, and tends to raise entirely spurious issues (see the problems the curl author has with being overwhelmed, for example). It also won’t spot actually malicious code that’s been included with any sort of care, nor would it find intentional behaviour that would be harmful or counterproductive in the particular scenario you want to use the program in.
Having actually worked with AI in this context alongside github/azure devops advanced security, I can tell you that this is wrong. As much as we hate AI, and as much as people like to (validly) point out issues with hallucinations, overall it’s been very on-point.
Could you let me know what sort of models you’re using? Everything I’ve tried has basically been so bad it was quicker and more reliable to do the job myself. Most of the models can barely write boilerplate code accurately and securely, let alone anything even moderately complex.
I’ve tried to get them to analyse code too, and that’s hit and miss at best, even with small programs. I’d have no faith at all that they could handle anything larger; the answers they give would be confident and wrong, which is easy to spot with something small, but much harder to catch with a large, multi process system spread over a network. It’s hard enough for humans, who have actual context, understanding and domain knowledge, to do it well, and I’ve, personally, not seen any evidence that an LLM (which is what I’m assuming you’re referring to) could do anywhere near as well. I don’t doubt that they flag some issues, but without a comprehensive, human, review of the system architecture, implementation and code, you can’t be sure what they’ve missed, and if you’re going to do that anyway, you’ve done the job yourself!
Having said that, I’ve no doubt that things will improve, programming languages have well defined syntaxes and so they should be some of the easiest types of text for an LLM to parse and build a context from. If that can be combined with enough domain knowledge, a description of the deployment environment and a model that’s actually trained for and tuned for code analysis and security auditing, it might be possible to get similar results to humans.
It’s just whatever is built into Copilot.
You can do a quick and dirty test by opening copilot chat and asking it something like “outline the vulnerabilities found in the following code, with the vulnerabilities listed underneath it. Outline any other issues you notice that are not listed here.” and then paste the code and the discovered vulns.
I’m writing a paper on this, actually. Basically, it’s okay-ish at it, but has definite blind spots. The most promising route is to have AI use a traditional static analysis tool, rather than evaluate the code directly.
That seems to be the direction the industry is headed in. GHAzDO and competitors all seem to be converging on using AI as a force-multiplier on top of the existing solutions, and it works surprisingly well.
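A minimal sketch of that “analyzer first, model second” flow, assuming a Python codebase; bandit and semgrep are real scanners but are only examples here, since nobody above names specific tools:

# run conventional static analysis first, then hand the findings to a model for triage
bandit -r ./src -f json -o bandit-findings.json            # Python security linter
semgrep --config auto --json --output semgrep-findings.json .   # broader pattern-based scan
# the JSON findings (plus the relevant source) then go into a prompt like the one quoted a few
# comments up, asking the model to explain, deduplicate and rank them rather than hunt from scratch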
I’m actually planning to do an evaluation of an AI code review tool to see what it can do. I’m somewhat optimistic that it could do this better than it can write code.
I really want to sic it on this one junior programmer who doesn’t understand that you can’t just commit AI-generated slop and expect it to work. In the last code review, after over 60 pieces of feedback, I gave up on the rest and left it; he needs to understand when AI-generated slop needs help.
AI is usually pretty good at unit tests, but this was so bad. It randomly started using a different mocking framework, it mocked the very classes under test and somehow thought that was a valid way to test them, it wasted tests on non-existent constructors, wrote no negative tests, and produced tests that didn’t verify anything. Most of all, there were so many compile errors, yet he thought that was fine.
I generally look over the project repo and site to see if there’s any flags raised like those I talk about here.
Beyond that, I glance over the codebase, check that it’s maintained, and look for certain signs, like tests and (for apps with a web UI) whether care has been taken in the main template files not to include random analytics or external files by default. I get a feel for the quality of the code and maintenance while doing this. I generally wouldn’t do a full audit or anything, though. With modern software it’s hard to fully track and understand a project, especially when it relies on many other dependencies. There’s always an element of trust, and that’s the case regardless of whether it’s FOSS or not. It’s just that FOSS provides more opportunities for folks to see the code when needed or desired.
That’s something along the lines of what I do as well, but your methods are far more in-depth than mine. I just glance over the documentation and how active the development is, and get a rough idea of whether the thing is a single-person hobby project or something with a bit more momentum.
And it of course also depends on whether I’m looking for solutions just for myself or for others, and specifically whether it’s work related. But full audits? No. There’s no way my lifetime would be enough to audit everything I use, and even with infinite time I don’t have the skills to do that (which of course wouldn’t be an issue if I had infinite time, but I don’t see that happening).
Having gone through the approval process at a large company to add an open source project to its whitelist, it was surprisingly easy. They mostly wanted to know numbers: how long it had been around, when the last update was, the number of downloads, what it does, etc. They mostly just wanted to make sure it was still being maintained.
In their eyes, they also don’t audit closed source software. There might also have been an antivirus scan run against the code, but that seemed more like a checkbox than something that would actually help.
I trust the community, but not blindly. I trust those who have a proven track record, and I proxy that trust through them whenever possible. I trust the standards and quality of the Debian organization, and by extension I trust the packages they maintain and curate. If I have to install something from source that is outside a major distribution, then my trust might be reduced. I might do some cursory research on the history of the project and the people behind it, I might look closer at the code. Or I might not.

A lot of software doesn’t require much trust. A web app running as its own limited user on a well-secured and up-to-date VPS or VM, in the unlikely event it turned out to be a malicious backdoor, is simply an annoyance and will be purged. As its own limited user there’s not that much it can do, and it can’t really hide.

If I’m off the beaten track with something that requires a bit more trust, something security related, something that I’m going to run as root, or something that’s going to be a core part of my network, I’ll go further. Maybe I “audit” in the sense that I check the bug tracker and CVEs to understand how seriously they take potential security issues.
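A rough sketch of that “own limited user” setup; the names (webapp, webappuser) are made up:

# create a locked-down system account with no login shell, then run the app only as that user
sudo mkdir -p /opt/webapp
sudo useradd --system --shell /usr/sbin/nologin --home-dir /opt/webapp webappuser
sudo chown -R webappuser: /opt/webapp
sudo -u webappuser /opt/webapp/run.sh   # if it turns out to be malicious, the blast radius is that one account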
Yeah, if that malicious software I ran, which I didn’t think required a lot of trust, happens to have snuck in a way to use a bunch of 0-day exploits, gets root access, gets into the rest of my network and starts injecting itself into my hardware persistently, then I’m going to have a really bad day, probably followed by a really bad year. That’s a given. It’s a risk that is always present. I’m a single guy homelabbing a bunch of fun stuff; I’m no match for a sophisticated and likely targeted nation-state-level attack, and I’m never going to be.

On the other hand, if I get hacked and ransomwared along with 10,000 other people because of some compromised project that I trusted a little too much, at least I’ll consider myself in good company, give the hackers credit where credit is due, and try to learn from the experience. But I will say they’d better be really sneaky, do their attack quickly, and it had better be very sophisticated, because I’m not stupid either, and I pay pretty close attention to changes to my network and to any new software I’m running in particular.
I run projects inside Docker on a VM away from important data. It allows me to test and restrict access to specific things of my choosing.
It works well for me.
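For what it’s worth, a sketch of the kind of restrictions that setup allows; the image and volume names are placeholders:

# run the untrusted app in its own container, on its own network, with no access to host data
docker network create sandbox-net
docker run -d --name testapp \
  --network sandbox-net \
  --read-only --cap-drop=ALL --security-opt no-new-privileges \
  -v testapp-data:/data \
  some/untrusted-image:latest
# --read-only will break images that write outside their mounted volumes, so relax it as needed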
Oof, you are infected
With?
Malware. You downloaded something without checking whether it was maliciously altered in transit.
I do not. But then again, I don’t audit the code of the closed source software I use either.
Nah, not really… most of the time I’m at least doing a light metadata check: who the maintainer and main contributors are, whether any trusted folks have starred the repo, how active development is, the release frequency, searching issues for “vulnerability”/“CVE” to see how contributors communicate on those, and the previous CVE track record.
With real code audits… I could only ever use a handful of programs, let alone entertain the thought of fully auditing the whole Linux kernel before I trust it 😄
Focusing on “mission critical” apps feels pretty useless imho, because it doesn’t really matter which of the thousands of programs on your system executes malicious code, no? Like sure, the app you use for handling super sensitive data might be secure and audited…then you get fucked by some obscure compression library silently loaded by a bunch of your programs.
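Most of that metadata check can be scripted too; a quick sketch using the GitHub CLI, with a made-up repo name:

gh repo view someorg/someproject                                  # description, stars, last activity
gh issue list -R someorg/someproject --search "cve" --state all   # how vulnerability reports get handled
gh release list -R someorg/someproject                            # release frequency at a glance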
No, I pretty much only look at the number of contributors (more is better)
A full code audit is very time-consuming. It’s impossible to audit all the software someone uses. However, if I know nothing about a project, I take a short look at the code to see whether it follows best practices and to make some assumptions about the code quality. The problem is that I can’t do this if I’m unfamiliar with the programming language the project is written in, so in most cases I try to avoid such projects.
Well, my husband’s workplace does audit the code they deploy, but they have a big problem with contractors just downloading random shit and putting it on production systems without following proper review and in violation of policy.
The phrase fucking Deloitte is a daily occurrence.
Fucking Deloitte!
Lol. I download a library or program to do a task because I would not be able to code it myself (to that kind of production level, at least). Of course I’m not gonna be able to audit it! You need twice the IQ to debug software as you needed to write it in the first place.
I don’t because I don’t have the necessary depth of skill.
But I don’t say I “blindly” trust anyone who says they’re FOSS. I read reviews, I do what I can to understand who is behind the project. I try to use software (FOSS or otherwise) in a way that minimizes impact to my system as a whole if something goes south. While I can’t audit code meaningfully, I can setup unique credentials for everything and use good network management practices and other things to create firebreaks.
It’s not feasible. A project can have tens or hundreds of thousands of lines of code, and it takes months to really understand what’s going on. Sometimes you need domain-specific knowledge.
I read through those installers that do a curl github… | bash. Otherwise I do what amounts to a “vibe check”: how many forks and stars does it have? How many contributors? What is the release cycle like? Contributors is my favorite metric. It shows that there are lots of eyes on the code, and makes it less likely that a single bad actor is able to do bad things.
That said, the supply chain and sometimes packaging is very opaque. So it almost renders all of that moot.
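If you do want to read those installers before they run, the boring two-step version looks something like this (the URL is a placeholder):

# download the installer first, actually read it, then run it
curl -fsSL https://example.com/install.sh -o install.sh
less install.sh      # look for anything beyond what the README says it should do
bash install.sh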
I’m unlikely to do a full code audit, unless something about it doesn’t pass the ‘sniff test’. I will often go over the main code flows, the issue tracker, mailing lists and comments, positive or negative, from users on other forums.
I mean, if you’re not doing that, what are you doing, just installing it and using it??!? Where’s the fun in that? (I mean this at least semi seriously, you learn a lot about the software you’re running if you put in some effort to learn about it)
I don’t know enough about programming to do it myself, so I like to look at what the community says. This is one thing where AI could be very helpful, no?
nope.
About as much as I trust other drivers on the road.
As in I give it the benefit of the doubt but if something seems off I take precautions while monitoring and if it seems dangerous I do my best to avoid it.
In reality it means that I rarely check it but if anything seems off I remove it and if I have the time and energy I further check the actual code.
My general approach is minimalism, so I don’t use that many unknown/small projects to begin with.
that is such a good analogy
Let me put it this way: I audit open source software more than I audit closed source software.
I have also looked at the code of one project.
(Edit: Actually, I get paid for closed source software… So I can not say the same)
Of course I do, bro. Who doesn’t have six thousand years of spare time, every time they run dnf update, to go check on a million lines of code changed? Amateurs around here…
I do not, but I sleep soundly knowing there are people that do, and that FOSS lets them do it. I will read code on occasion, if I’m curious about technical solutions or whatnot, but that hardly qualifies as auditing.
I vet lesser-known projects, but yeah, I do end up just taking credibility for granted for larger projects. I assume that with those, the maintainer team with merge access is doing that vetting before they accept a pull request.
I do not audit code line by line, bit by bit. However, I do my due diligence in making sure that the code is from reputable sources: I see what other users report, I search for any unresolved issues, and so on. I can code on a very basic level, but I do not possess the intelligence to audit a particular app’s code. Beyond my due diligence, I rely on the generosity of others who are more intelligent than I am and who can spot problems. I have a lot of respect and admiration for dev teams. They produce software that is useful, fun, engaging, and it just works.
I don’t audit the code, but I do somewhat audit the project. I look at:
I think that catches the worst issues, but it’s far from an audit, which would require digging through the code and looking for code smells.
Same here, plus
I implicitly trust FOSS more than closed source, because that trust has been earned through millions of FOSS projects.
On occasion, I will dive deep into a codebase especially if I have a bug and I think I can fix it.
You can’t do this with closed source or even source available code because there is no guarantee that the code you have is the code that’s been compiled.
I do sometimes, when I know the tech stack. (I wonder if GitHub Copilot could help in other situations?)
For example, I’ve been learning more about FreshRSS and Wallabag (especially now that Pocket is shutting down).
In any case, with open source, I trust that someone looks at it.
If it’s a project with a couple hundred thousand downloads a week, then no; I trust that it’s been looked at by people savvier than myself.
If it’s a niche project that barely anyone uses, or it comes from a source I consider to be less reputable, then I will skim it.
This
If I can read it in around an afternoon, and it’s not a big enough project that I can safely assume many other people have already done so, then I will !
But I don’t think it qualifies as “auditing”, for now I only have a bachelor’s in CS and I don’t know as much as I’d like about cybersecurity yet.
It depends on the provenance of the code and who (if anyone) is downstream.
A project that’s packaged in multiple distros is more likely to be reliable than a project that only exists on github and provides its own binary builds.
Depends on the project and how long it has been around.
Some, yes. I’m currently using hyde for hyprland and I’ve been tinkering with almost every script that holds the project together.
Nah. My security is entirely based on vibes and gambling
Hell yeah brother!
Based
YOLO
Depends. For a known project like curl I won’t, because I know it’s fine, but if it’s a new project I’ve heard about, I do audit the source, and if I don’t know the language it’s in, I ask someone who does.
If it looks sketchy I’ll look at it and not trust the binaries. I’m not going to catch anything subtle, but if it sets up a reverse shell, I can notice that shit.
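A quick-and-dirty pass for the blunt indicators can be as simple as grepping the tree; this is only a sketch and, as said, it won’t catch anything subtle:

# look for the obvious stuff: raw network shells, piped installers, decoded blobs
grep -rnE '/dev/tcp/|nc -e|mkfifo|base64 -d' .
grep -rnE 'curl[^|]*\|[[:space:]]*(ba)?sh' .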
I don’t have the know-how to do so, so I go off what others have said about it. It’s at least got a better chance of being safe than closed source software, where people are FULLY guessing at whether it’s safe or not, rather than what we have: at least one person without ties to the creator having pored over it.
No, so I only use well known widely used open source programs. If I’m doing a code review I’m getting paid to do it.
I rely on Debian repo maintainers to do this for me
I usually just look for CVEs. The biggest red flag is if there are zero CVEs. A yellow flag is if CVEs exist but there’s no prominent notice about them on the project’s site.
Best case is they have a lot of CVEs, they have detailed notices on their site that were published very shortly after each CVE, and they have a bug bounty program set up.
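Beyond the project’s own site, the OSV database can be queried directly; a sketch with a made-up package name and version:

# ask OSV for known vulnerabilities affecting a specific package version
curl -s https://api.osv.dev/v1/query \
  -d '{"package": {"name": "somepackage", "ecosystem": "PyPI"}, "version": "1.2.3"}'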
What if the software is just so flawlessly written that there are no CVEs?
/s
I maintained an open-source app for many years. It leveraged a crypto library but allowed for different algos, or none at all for testing.
Some guy wrote a CVE about “when I disable all crypto it doesn’t use crypto”. So there’s that. It’s the only CVE we got before or during my time.
But even we got one.
Oh damn, haha.
Depends on what you mean by “audit”.
I look at the GitHub repo.
Do I read the whole code base? Of course not. But this is way more than I can do with closed source software.
Generally, no. In some cases where I’m extending the code or compiling it for some special case that I have, I will read the code. For example, I modified a web project to use LDAP instead of a local user file. In that case, I had to read the code to understand it. In cases where I’m recompiling the code, my pipeline runs some basic vulnerability scans automatically.
I would not consider either of these a comprehensive audit, but it’s something.
Additionally, on any of my server deployments, I have firewall rules which catch “calls home”. I’ve seen a few apps calling home and getting blocked with no adverse effects. The only one I can remember is Traefik, for which I flipped a config value to stop it.
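A minimal sketch of one way to catch those “calls home”, assuming iptables and a dedicated service user (appsvc is a made-up name; the actual rules aren’t described above):

# log and reject any outbound traffic from the app's user that isn't going to the local network
iptables -A OUTPUT -m owner --uid-owner appsvc -d 192.168.0.0/16 -j ACCEPT
iptables -A OUTPUT -m owner --uid-owner appsvc -j LOG --log-prefix "appsvc egress: "
iptables -A OUTPUT -m owner --uid-owner appsvc -j REJECT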
Yes. It's important to verify the dependencies and perform audits, like automated scans on the source code and on packages from repositories like PyPI and npm, which is something we do at my day job.
Also, before mirroring anything, I look at the source code to see whether there is anything suspicious, like phoning home, obfuscated code, or other red flags.
Even at home, working on 'hobby projects', I might not have the advantage of the advanced source code scanning tools, but I'm still suspicious, since I know there is a lot of sh*t out there.
Even for home projects I limit the number of packages I use. I tend to only use large (in terms of users), proven (lots of stars and already out for a long time) and well maintained packages (regular security updates, etc.). Then again, without any advanced code scanning tools it's impossible to fully scan it all, since you still have dependencies on dependencies with dependencies that might have a vulnerability. Or even things as simple as the OpenSSL Heartbleed bug or repository takeovers by evil maintainers. It's inevitable, but you can take precautions.
TL;DR: I try my best with the tools I have. I can't do more than that. Simple and small projects in C are easier to audit than, for example, a huge framework or packages with tons of new dependencies, especially in languages like Python, Go and JavaScript/TypeScript. You have been warned.
Edit: this also means you will need to update your packages often, not only on your distro, but also when using packages from npm, PyPI, Go modules or PHP Composer. Just writing your code once and deploying is not sufficient anymore. The chances that you are using some vulnerable packages are very high, and you will need to regularly update them. I think updating is just as important as auditing.
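The "update often" part is mostly a habit of re-running a few commands; a sketch for the ecosystems mentioned above:

# recurring checks rather than one-off audits
pip list --outdated        # Python: see which pinned packages have newer releases
npm outdated && npm audit  # Node: same idea, plus a known-vulnerability check
go list -u -m all          # Go: list available module updates
composer outdated          # PHP: same for Composer packages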
no… I do just blindly trust the code.
Nope! Not at all. I don’t think I could find anything even if I tried. I do generally trust OS more than other apps but I feel like I’m taking a risk either way. If it’s some niche thing I’m building from a git repo I’ll be wary enough to not put my credit card info but that’s about it
No. I’ve skimmed through maybe two things overall, but that’s about it. I use too many apps to be able to audit them all, I don’t have the proper skills to audit code anyway, and even if I did I would still have to re-audit after every update or every few years. It’s just not worth the effort.
You’re taking a chance whether you use closed or open source software; at least with open source there is the option to look through things yourself, and with a popular project there’s a bigger chance of others looking through it.
I look at whether someone has audited the code or not, and even then I simply find Libre stuff trustworthy anyway.
Yes, but with an explanation.
You don’t necessarily need coding skills to “audit”; you can get a sense of the general state of things by simply reading the docs.
The docs are a good starting point for understanding whether there will be any issues from weird licensing, whether the author cares enough to keep the project going, etc. Also, serious, repeated or chronic issues should be noted in the docs if it’s something the author cares about.
And remember, even if you do have a background in the coding language, the project might not be built in a style you like or agree with.
I’m pretty proficient at bash scripting, and I found the proxmox helper scripts a spaghetti mess of interdependent scripts that were simply a nightmare to follow for any particular install.
I think the overall message is do your best within your abilities.
All I do is look into the open issues, the community, docs etc. I don’t remember auditing the code.