A team of researchers at the Leiden Institute of Advanced Computer Science (Soufian El Yadmani, Robin The, Olga Gadyatskaya) discovered thousands of repositories on GitHub that offer fake proof-of-concept (PoC) exploits for multiple vulnerabilities.
The experts analyzed PoCs shared on GitHub for known vulnerabilities discovered between 2017 and 2021. Some of these repositories were used by threat actors to spread malware.
The experts pointed out that public code repositories do not provide any guarantees that any given PoC comes from a trustworthy source.
“We discovered that not all PoCs are trustworthy. Some proof-of-concepts are fake (i.e., they do not actually offer PoC functionality), or even malicious: e.g., they attempt to exfiltrate data from the system they are being run on, or they try to install malware on this system.” reads the research paper published by the experts.
The team focused on a set of symptoms observed in the collected dataset, such as calls to malicious IP addresses, encoded malicious code, or bundled trojanized binaries. The researchers analyzed 47,313 repositories and flagged 4,893 of them as malicious (i.e., 10.3% of the studied repositories showed symptoms of malicious intent).
“This figure shows a worrying prevalence of dangerous malicious PoCs among the exploit code distributed on GitHub.” continues the paper.
The researchers extracted a total of 358,277 IP addresses, of which 150,734 were unique. Of the unique IPs, 2,864 matched blacklist entries. 1,522 were detected as malicious in AV scans on VirusTotal, and 1,069 were present in the AbuseIPDB database.
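The article does not describe the researchers' tooling, but the general idea of pulling IP addresses out of PoC source code and comparing them with a blocklist can be sketched as follows. This is a minimal illustration only: the function names `extract_ips` and `flag_blocklisted` are hypothetical, the blocklist is a plain in-memory set rather than a query to VirusTotal or AbuseIPDB, and the addresses come from the reserved documentation ranges.

```python
import ipaddress
import re

# Loose pattern for dotted quads; the ipaddress module does the real validation.
# Limitation: four-part version strings such as "1.2.3.4" also match.
IPV4_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?!\d|\.\d)")


def extract_ips(text):
    """Return the set of unique, syntactically valid IPv4 addresses in text."""
    found = set()
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            ipaddress.ip_address(candidate)
        except ValueError:
            continue  # e.g. an octet above 255
        found.add(candidate)
    return found


def flag_blocklisted(text, blocklist):
    """Return the extracted addresses that also appear in the blocklist."""
    return extract_ips(text) & set(blocklist)


# Hypothetical PoC fragment using documentation-range addresses.
sample = (
    's.connect(("203.0.113.5", 4444))\n'
    'url = "http://198.51.100.7/x"\n'
    'bad = "999.1.1.1"\n'
    '# retry 203.0.113.5 later\n'
)
print(sorted(extract_ips(sample)))
print(flag_blocklisted(sample, {"198.51.100.7"}))
```

Deduplicating before the lookup mirrors the gap between the total and unique counts reported above: the same address often appears many times across repositories.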
Most of the malicious detections are related to vulnerabilities from 2020.
During their research, the experts found multiple examples of malicious PoCs developed for CVEs and shared some case studies.
One of the examples is related to a PoC developed for CVE-2019-0708, also known as BlueKeep.
“This repository was created by a user under the name Elkhazrajy. The source code contains a base64 line that once decoded will be running. It contains another Python script with a link to Pastebin that will be saved as a VBScript, then run by the first exec command. After investigating the VBScript we discovered that it contains the Houdini malware.” continues the paper.
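A pattern like the one in this case study, where a decoded base64 string is handed straight to exec, can be spotted statically without running the PoC. The sketch below is an assumption about how such a check might look, not the researchers' detector: the function name `exec_of_b64` is hypothetical, it only handles code that parses as Python 3, and it misses any obfuscation that avoids the literal name `b64decode`.

```python
import ast


def exec_of_b64(source):
    """Return line numbers where exec()/eval() wraps a b64decode call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Only direct calls such as exec(...) or eval(...).
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            continue
        if node.func.id not in {"exec", "eval"}:
            continue
        # Look anywhere inside the call for base64.b64decode or a bare b64decode.
        for inner in ast.walk(node):
            name = getattr(inner, "attr", None) or getattr(inner, "id", None)
            if name == "b64decode":
                hits.append(node.lineno)
                break
    return hits


# Harmless stand-in: the encoded payload is just "print(1)".
sample = 'import base64\nexec(base64.b64decode("cHJpbnQoMSk="))\nprint("ok")\n'
print(exec_of_b64(sample))
```

Because the source is parsed and never executed, the check is safe to run on untrusted repositories.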
Another example detailed by the experts is related to a malicious PoC designed to gather info about the target. In this case the URL to the server used for data exfiltration was base64-encoded.
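To illustrate why a base64-encoded URL is easy to overlook in a quick review yet straightforward to recover, the following sketch scans source text for base64-looking tokens, decodes them, and reports any URLs found inside. It is a hedged example rather than the paper's method: `decoded_urls` is a hypothetical helper, the 16-character minimum is an arbitrary threshold, and the sample URL points at a documentation-range address.

```python
import base64
import binascii
import re

# Runs of base64 alphabet characters, optionally padded; 16+ chars is an arbitrary cutoff.
B64_TOKEN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
URL = re.compile(rb"https?://[^\s'\"]+")


def decoded_urls(source):
    """Decode base64-looking tokens in source and return any URLs they contain."""
    urls = []
    for token in B64_TOKEN.findall(source):
        try:
            raw = base64.b64decode(token, validate=True)
        except (binascii.Error, ValueError):
            continue  # not valid base64, e.g. a long identifier
        urls.extend(m.decode("ascii", "replace") for m in URL.findall(raw))
    return urls


# Build a hypothetical PoC line that hides its exfiltration endpoint.
encoded = base64.b64encode(b"http://203.0.113.5/collect").decode()
sample = f'endpoint = base64.b64decode("{encoded}")'
print(decoded_urls(sample))
```

A single layer of encoding is assumed here; nested or custom encodings would need additional decoding passes.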
The researchers explained that their study has several limitations. For example, the GitHub API proved unreliable, and not all repositories corresponding to the CVE IDs used were collected.
Another limitation is related to the use of heuristics for detecting malicious PoCs. Experts explained that the approach can miss some malicious PoCs in their dataset.
“However, this approach cannot detect every malicious PoC based on source code, since it is always possible to find more creative ways to obfuscate it. We have investigated code similarity as a feature to help identifying new malicious repositories. Our results show that indeed malicious repositories are on average more similar to each other than non-malicious one.” conclude the experts. “This result is the first step to develop more robust detection techniques.”
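The article does not say which similarity measure the researchers used. As one common way to compare two pieces of code, the sketch below computes Jaccard similarity over overlapping token triples (shingles). Both the measure and the helper names `shingles` and `jaccard` are assumptions chosen for illustration.

```python
import re


def shingles(code, k=3):
    """Split code into rough tokens and return the set of k-token windows."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", code)
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two code strings, from 0.0 (disjoint) to 1.0 (identical)."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


# Two near-copies that differ only in the command, and one unrelated snippet.
poc_a = "import os\nos.system('whoami')"
poc_b = "import os\nos.system('id')"
other = "def add(x, y):\n    return x + y"

print(jaccard(poc_a, poc_b) > jaccard(poc_a, other))
```

Token windows tolerate small edits such as a changed string or variable value, which is the kind of reuse that would make malicious repositories resemble each other.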
The researchers have shared their findings with GitHub, but some of the malicious repositories have yet to be removed.
(SecurityAffairs – hacking, malicious GitHub)