A code backdoor in a package on the Python Package Index demonstrates the importance of verifying code brought in from code repositories.
The pace of modern software development requires code reuse, and effective code reuse requires code repositories. These collections of code fragments, functions, libraries, and modules allow developers to write applications without having to reinvent every small (or large) detail in their code. That makes repositories very valuable to developers – and very rich targets for malicious actors.
Researchers at ReversingLabs have discovered the most recent attack against a repository: a module carrying a backdoor in the Python Package Index (also known as PyPI or the Cheese Shop), the popular Python repository. This isn’t the first time PyPI has been attacked, but this incident is notable because it involves malicious code that was thought to have been removed already.
“Essentially, a backdoor that has been reported before but hasn’t been cleaned completely from the repository was still available and live on the Web page,” says Robert Perica, principal engineer at ReversingLabs. And while the package involved is not ubiquitous, it is being used. “What’s troubling about this package is that even though it’s not a popular package, it averages 82 installs per month,” Perica says.
The malware resides in a module named “libpeshnx,” which is similar to an earlier module named “libpeshna” and was contributed by the same author. According to ReversingLabs’ blog post on the discovery, the actual backdoor mechanism is very simple, involving a call to a command-and-control server followed by a wait to be activated.
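Because the malicious module's name differs from its predecessor by a single character, a simple name-similarity check is one way a team might screen a dependency list. The following is a minimal sketch, not ReversingLabs' method: the function name, the deny list, and the 0.85 threshold are all assumptions chosen for illustration, and only the two module names come from the report.

```python
from difflib import SequenceMatcher

# The two names reported in the article. A real deny list would be
# maintained elsewhere; this set is purely illustrative.
KNOWN_BAD = {"libpeshnx", "libpeshna"}


def similar_to_known_bad(name, known_bad=KNOWN_BAD, threshold=0.85):
    """Return the known-bad names that `name` equals or closely resembles.

    `threshold` is an assumed cutoff on difflib's similarity ratio
    (1.0 means identical); tune it for your own tolerance of false alarms.
    """
    candidate = name.lower()
    hits = []
    for bad in sorted(known_bad):
        ratio = SequenceMatcher(None, candidate, bad).ratio()
        if ratio >= threshold:
            hits.append(bad)
    return hits


if __name__ == "__main__":
    for package in ["requests", "libpeshnx"]:
        print(package, similar_to_known_bad(package))
```

A check like this only flags names for human review. It cannot tell whether a package is actually malicious, and a lookalike name on its own proves nothing.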
A Supply Chain Attack
Recent years have seen an increase in the number of attacks launched against companies’ supply chains. Most of these involve physical supply chains, but Perica says security professionals need to understand that code repositories such as PyPI, RubyGems, NuGet, and npm are critical pieces of their software supply chain. That understanding should lead to strong security procedures around code drawn from the repositories.
“Many of these software repositories don’t have such a thorough review process during user submissions,” Perica says. “Essentially, any user can more or less submit anything.”
He contrasts this with open source projects hosted on GitHub, where there is typically a review and approval process for new code added to the official release. Still, PyPI is trusted within the Python developer community. “PyPI is like the official package repository for the Python Software Foundation,” Perica notes.
He points out that PyPI hosts more than 188,000 projects, with almost 1.4 million releases and roughly 350,000 users. PyPI is almost certain to be the repository used by beginning developers, Perica adds, whether they’re working on individual projects or software for an employer.
Writing secure code is complicated by the fact that modules tend to contain other modules. The “dependencies,” or network of functions and modules brought together for a single library, can be many layers deep. Perica says the best solution for companies looking to minimize the risk from code repositories is to have a security team look at each library to be used and verify the contents.
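To give a sense of how deep those layers can go, the sketch below walks the declared dependencies of an installed package and records the depth at which each one first appears. It is an illustration under stated assumptions rather than a complete audit tool: the function names are hypothetical, requirement strings are parsed with a simplified regular expression, and optional "extras" are skipped. It relies only on the standard library's importlib.metadata (Python 3.8 or later).

```python
import re
from collections import deque
from importlib import metadata


def direct_requirements(dist_name):
    """Return the package names an installed distribution declares it requires.

    Simplified parsing: takes the leading name from each requirement string
    and skips requirements that apply only to optional extras.
    """
    try:
        requires = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []  # not installed locally, so nothing to inspect
    names = []
    for spec in requires:
        if re.search(r"extra\s*==", spec):
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", spec)
        if match:
            names.append(match.group(0).lower())
    return names


def transitive_dependencies(root, lookup=direct_requirements):
    """Breadth-first walk returning {package_name: depth_below_root}.

    `lookup` maps a package name to its direct requirements; it is
    injectable so the walk can be tried on a hand-written graph.
    """
    start = root.lower()
    depths = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for dep in lookup(current):
            dep = dep.lower()
            if dep not in depths:  # also guards against dependency cycles
                depths[dep] = depths[current] + 1
                queue.append(dep)
    return depths


if __name__ == "__main__":
    # "pip" is used only because it is usually installed; any name works.
    for name, depth in sorted(transitive_dependencies("pip").items()):
        print("  " * depth + name)
```

The resulting list is the set of packages a review team would need to examine. The walk reports only what each package declares, so it is a starting inventory, not a verification of the contents.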
It takes a lot of effort, he says, but that effort can still be dramatically less than what is required to recover from a major software vulnerability after it has been exploited.