GitHub - Pinperepette/snakebite: Detect malicious PyPI packages using heuristic analysis and LLM-powered filtering to uncover credentials.

March 25, 2026·1083 words·6 mins

GitHub LLM Python Open Source AI

#### Source

Type: GitHub Repository
Original link: https://github.com/Pinperepette/snakebite?trk=feed-detail_main-feed-card_feed-article-content
Publication date: 2026-03-27

Summary

Introduction

Imagine you are a developer on a critical project. You install a new Python library from PyPI, and the package turns out to carry a malicious payload that steals your credentials. This is not a hypothetical scenario: supply chain attacks are on the rise, and a single compromised dependency can expose an entire project. Snakebite addresses this problem by combining heuristic analysis with large language model (LLM) filtering to detect malicious PyPI packages.

Snakebite targets credential theft, obfuscated code, and persistence mechanisms. Because it analyzes Python packages in context, it can distinguish legitimate behavior from suspicious activity, which reduces false positives. It is useful to anyone who works with Python and wants to vet their dependencies.

What It Does

Snakebite scans Python packages for malicious patterns in two stages. First, it applies 14 heuristic rules that target real attack patterns. Then it passes the findings to an LLM, which filters out false positives. In other words, Snakebite does not just look for keywords or suspicious functions; it also analyzes the context in which they are used.
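The two-stage idea can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the three rules, the `Finding` type, and `toy_judge` (a stand-in for the LLM call) are hypothetical and are not taken from Snakebite's source or its 14 rules.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Finding:
    rule: str      # hypothetical rule name
    line_no: int   # 1-based line number in the scanned source
    snippet: str   # the matching line, stripped


# Hypothetical stage-one rules; Snakebite's real rules are not reproduced here.
RULES = {
    "env-access": re.compile(r"\bos\.environ\b"),
    "dynamic-exec": re.compile(r"\b(exec|eval)\s*\("),
    "base64-decode": re.compile(r"base64\.b64decode"),
}


def stage_one(source: str) -> List[Finding]:
    """Cheap heuristic pass: flag every line that matches any rule."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for name, pattern in RULES.items():
            if pattern.search(line):
                findings.append(Finding(name, line_no, line.strip()))
    return findings


def stage_two(findings: List[Finding], source: str,
              judge: Callable[[Finding, str], bool]) -> List[Finding]:
    """Contextual pass: keep only findings the judge confirms as suspicious."""
    return [f for f in findings if judge(f, source)]


def toy_judge(finding: Finding, source: str) -> bool:
    """Stand-in for an LLM: env access alone is fine unless the file also executes code."""
    if finding.rule == "env-access":
        return "exec(" in source or "eval(" in source
    return True


BENIGN = 'import os\nhome = os.environ.get("HOME")\n'
SUSPICIOUS = 'import os, base64\nexec(base64.b64decode(os.environ["P"]))\n'

print(stage_two(stage_one(BENIGN), BENIGN, toy_judge))
print(stage_two(stage_one(SUSPICIOUS), SUSPICIOUS, toy_judge))
```

The point of the split is cost: the pattern pass is cheap enough to run on every file, and the slower contextual judgment only sees the lines that were flagged.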

Think of Snakebite as a detective who not only collects clues but interprets them in the context of the crime scene. For example, if a package reads an ordinary environment variable through os.environ, Snakebite recognizes that this is not an attempt to steal credentials. This contextual approach is what makes Snakebite useful for supply chain security.
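To make the os.environ example concrete, here is a small sketch of one kind of context an analyzer could extract: which environment variable names a file actually reads. It is an illustration only, not Snakebite's method. The `SENSITIVE_HINTS` list and both function names are hypothetical, and the code assumes Python 3.9 or later, where subscript keys appear directly as constants in the syntax tree.

```python
import ast
from typing import List

# Hypothetical substrings that suggest a variable holds a secret.
SENSITIVE_HINTS = ("SECRET", "TOKEN", "PASSWORD", "API_KEY")


def _is_os_environ(node: ast.AST) -> bool:
    """True for the expression `os.environ`."""
    return (
        isinstance(node, ast.Attribute)
        and node.attr == "environ"
        and isinstance(node.value, ast.Name)
        and node.value.id == "os"
    )


def env_keys_read(source: str) -> List[str]:
    """Collect literal keys used with os.environ[...], os.environ.get(...), os.getenv(...)."""
    keys = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Subscript) and _is_os_environ(node.value):
            key = node.slice
        elif (isinstance(node, ast.Call)
              and isinstance(node.func, ast.Attribute)
              and node.args):
            func = node.func
            is_get = func.attr == "get" and _is_os_environ(func.value)
            is_getenv = (func.attr == "getenv"
                         and isinstance(func.value, ast.Name)
                         and func.value.id == "os")
            if not (is_get or is_getenv):
                continue
            key = node.args[0]
        else:
            continue
        # Only literal string keys are recorded; computed keys are skipped.
        if isinstance(key, ast.Constant) and isinstance(key.value, str):
            keys.append(key.value)
    return keys


def sensitive_keys(source: str) -> List[str]:
    """Subset of keys whose names hint at credentials, sorted for stable output."""
    return sorted(
        k for k in env_keys_read(source)
        if any(hint in k.upper() for hint in SENSITIVE_HINTS)
    )


SAMPLE = (
    "import os\n"
    'home = os.environ.get("HOME")\n'
    'key = os.environ["AWS_SECRET_ACCESS_KEY"]\n'
    'tok = os.getenv("GITHUB_TOKEN")\n'
)
print(sensitive_keys(SAMPLE))
```

Reading `HOME` and reading `AWS_SECRET_ACCESS_KEY` both touch os.environ, so a keyword match cannot tell them apart. The variable name, and what the code then does with the value, is the context that matters.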

Why It’s Amazing

What sets Snakebite apart is that it combines heuristic analysis with an LLM instead of scanning for generic patterns alone. Its main features are:

Dynamic and contextual: Snakebite analyzes the context in which keywords and suspicious functions appear, rather than flagging them on sight. For example, a call to subprocess.call([editor]) inside an editor package is recognized as legitimate use.

Real-time reasoning: With the LLM integrated into the scan, Snakebite filters out false positives as it analyzes the code, so you spend less time verifying each alert by hand.

Concrete examples: A finding might read: “The package litellm version 1.82.7 contains a malicious .pth file that performs obfuscated credential theft at Python startup.” This is the type of notification that Snakebite can generate, with specific and actionable details.

Case study: On March 24, 2026, the litellm package was attacked. Snakebite detected a credential theft payload in versions 1.82.7 and 1.82.8, which shows how it can help against supply chain attacks.
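The .pth file in the litellm example matters because of how Python starts up: the standard `site` module reads `.pth` files in site-packages and executes any line that begins with `import` followed by a space or tab. The sketch below lists such lines in a directory. It is an illustration of that mechanism, not Snakebite's detection logic, and the function name `executable_pth_lines` is hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def executable_pth_lines(site_dir: Path) -> Dict[str, List[str]]:
    """Map each .pth file name to the lines Python would execute at startup.

    The site module runs lines that start with "import " or "import\t";
    all other non-comment lines are treated as paths to add to sys.path.
    """
    hits: Dict[str, List[str]] = {}
    for pth in sorted(site_dir.glob("*.pth")):
        text = pth.read_text(encoding="utf-8", errors="replace")
        lines = [
            line for line in text.splitlines()
            if line.startswith(("import ", "import\t"))
        ]
        if lines:
            hits[pth.name] = lines
    return hits
```

To inspect a real environment, pass the site-packages directory, for example `Path(sysconfig.get_paths()["purelib"])`. A hit is a candidate for review rather than a verdict, since some legitimate tools also ship executable lines in their .pth files.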

How to Try It

Trying out Snakebite is straightforward. Here’s how to get started:

Clone the repository: Start by cloning the repository from GitHub. Open your terminal and type:
```
git clone https://github.com/pinperepette/snakebite.git
cd snakebite
```
Prerequisites: Snakebite has no external dependencies and uses only the Python standard library. Make sure you have Python 3.8 or higher installed.
Setup: Once you have cloned the repository, you can start using Snakebite in two main modes:
- local mode: Scan packages installed on your machine. To scan all packages, use:
```
python3 snakebite.py local
```
  To scan specific packages, such as flask, requests, and litellm, use:
```
python3 snakebite.py local flask requests litellm
```
- feed mode: Monitor PyPI in real-time. For a single scan of the latest packages, use:
```
python3 snakebite.py feed
```
  For continuous monitoring every 60 seconds, use:
```
python3 snakebite.py feed --loop 60
```
Documentation: For more details, consult the documentation in the repository. There is no one-click demo, but the steps above are all the setup required.
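If you want to call these commands from a script, such as a scheduled job, a thin wrapper is enough. The sketch below is an assumption-laden illustration: `build_command` and `run_scan` are hypothetical helpers, only the `local`, `feed`, and `--loop` arguments shown above are used, and since the output format and exit codes are not described here, the wrapper simply returns whatever text the tool prints.

```python
import subprocess
import sys
from typing import List, Optional, Sequence


def build_command(mode: str,
                  packages: Sequence[str] = (),
                  loop_seconds: Optional[int] = None,
                  script: str = "snakebite.py") -> List[str]:
    """Build the argument list for one of the two modes shown above."""
    if mode not in ("local", "feed"):
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = [sys.executable, script, mode]
    if mode == "local":
        cmd += list(packages)                   # no packages means scan everything
    elif loop_seconds is not None:
        cmd += ["--loop", str(loop_seconds)]    # continuous feed monitoring
    return cmd


def run_scan(cmd: List[str], repo_dir: str) -> str:
    """Run the command inside the cloned repository and return its stdout."""
    result = subprocess.run(
        cmd, cwd=repo_dir, capture_output=True, text=True, check=False
    )
    return result.stdout


# Example: the argument list for scanning three installed packages.
print(build_command("local", ["flask", "requests", "litellm"]))
```

Passing the arguments as a list, rather than a shell string, avoids quoting problems when package names come from another source such as a requirements file.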

Final Thoughts

Snakebite is a significant step forward in supply chain security for Python projects. As attacks grow more sophisticated, a tool like Snakebite can make the difference between a secure project and a compromised one. It protects individual developers and contributes to a safer, more reliable ecosystem.

Imagine a future where every Python package you install is automatically checked for security. Snakebite brings that future closer with a practical approach to supply chain protection. If you are a developer or a tech enthusiast, it is worth trying on your own projects.

Use Cases

Private AI Stack: Integration into proprietary pipelines
Client Solutions: Implementation for client projects
Development Acceleration: Reduction of time-to-market for projects

Resources

Original Links

GitHub - Pinperepette/snakebite: Detect malicious PyPI packages using heuristic analysis and LLM-powered filtering to uncover credent - Original link

Article recommended and selected by the Human Technology eXcellence team, processed through artificial intelligence (in this case with LLM HTX-EU-Mistral3.1Small) on 2026-03-28 09:26 Original source: https://github.com/Pinperepette/snakebite?trk=feed-detail_main-feed-card_feed-article-content

The HTX Point of View

This topic is at the heart of what we build at HTX. The technology presented here, whether AI agents, language models, or document processing, is exactly the kind of capability European companies need, but deployed on their own terms.

The challenge is not whether this technology works. It does. The challenge is deploying it without sending your company's data to American servers, without violating the GDPR, and without creating vendor dependencies you cannot get out of.

That is why we created ORCA, a private enterprise chatbot that brings these capabilities to your infrastructure. The same power as ChatGPT, but your data never leaves your perimeter.

Want to know whether your company is ready for AI? Take our free assessment: 5 minutes, a personalized report, and an actionable roadmap.

FAQ

Can large language models run on private infrastructure?

Yes. Open source models such as LLaMA, Mistral, DeepSeek, and Qwen can run on-premise or on a European cloud. These models reach performance comparable to GPT-4 for most business tasks, with the advantage of full data sovereignty.

Which LLM is best for professional use?

The best model depends on your use case. For document analysis and chat, Mistral and LLaMA excel. For data analysis, DeepSeek offers solid reasoning. HTX's approach is model-agnostic: ORCA supports multiple models.
