A vulnerability in the DocugamiReader class of the run-llama/llama_index repository, up to version 0.12.28, involves the use of MD5 hashing to generate IDs for document chunks. This approach leads to hash collisions when structurally distinct chunks contain identical text, resulting in one chunk overwriting another. This can cause loss of semantically or legally important document content, breakage of parent-child chunk hierarchies, and inaccurate or hallucinated responses in AI outputs. The issue is resolved in version 0.3.1.
A path traversal vulnerability exists in run-llama/llama_index versions 0.12.27 through 0.12.40, specifically within the `encode_image` function in `generic_utils.py`. This vulnerability allows an attacker to manipulate the `image_path` input to read arbitrary files on the server, including sensitive system files. The issue arises due to improper validation or sanitization of the file path, enabling path traversal sequences to access files outside the intended directory. The vulnerability is fixed in version 0.12.41.
A vulnerability in the ObsidianReader class of the run-llama/llama_index repository, specifically in version 0.12.27, allows for hardlink-based path traversal. This flaw permits attackers to bypass path restrictions and access sensitive system files, such as /etc/passwd, by exploiting hardlinks. The vulnerability arises from inadequate handling of hardlinks in the load_data() method, where the security checks fail to differentiate between real files and hardlinks. This issue is resolved in version 0.5.2.
The JSONReader in run-llama/llama_index versions 0.12.28 is vulnerable to a stack overflow due to uncontrolled recursive JSON parsing. This vulnerability allows attackers to trigger a Denial of Service (DoS) by submitting deeply nested JSON structures, leading to a RecursionError and crashing applications. The root cause is the unsafe recursive traversal design and lack of depth validation, which makes the JSONReader susceptible to stack overflow when processing deeply nested JSON. This impacts the availability of services, making them unreliable and disrupting workflows. The issue is resolved in version 0.12.38.
An XML Entity Expansion vulnerability, also known as a 'billion laughs' attack, exists in the sitemap parser of the run-llama/llama_index repository, specifically affecting version v0.12.21. This vulnerability allows an attacker to supply a malicious Sitemap XML, leading to a Denial of Service (DoS) by exhausting system memory and potentially causing a system crash. The issue is resolved in version v0.12.29.
A vulnerability in the ArxivReader class of the run-llama/llama_index repository, versions up to v0.12.22.post1, allows for MD5 hash collisions when generating filenames for downloaded papers. This can lead to data loss as papers with identical titles but different contents may overwrite each other, preventing some papers from being processed for AI model training. The issue is resolved in version 0.12.28.
A vulnerability in the `ObsidianReader` class of the run-llama/llama_index repository, versions 0.12.23 to 0.12.28, allows for arbitrary file read through symbolic links. The `ObsidianReader` fails to resolve symlinks to their real paths and does not validate whether the resolved paths lie within the intended directory. This flaw enables attackers to place symlinks pointing to files outside the vault directory, which are then processed as valid Markdown files, potentially exposing sensitive information.
A critical deserialization vulnerability exists in the run-llama/llama_index library's JsonPickleSerializer component, affecting versions v0.12.27 through v0.12.40. This vulnerability allows remote code execution due to an insecure fallback to Python's pickle module. JsonPickleSerializer prioritizes deserialization using pickle.loads(), which can execute arbitrary code when processing untrusted data. Attackers can exploit this by crafting malicious payloads to achieve full system compromise. The root cause includes an insecure fallback mechanism, lack of validation or safeguards, misleading design, and violation of Python security guidelines.
Multiple vector store integrations in run-llama/llama_index version v0.12.21 have SQL injection vulnerabilities. These vulnerabilities allow an attacker to read and write data using SQL, potentially leading to unauthorized access to data of other users depending on the usage of the llama-index library in a web application.
An SQL injection vulnerability exists in the delete function of DuckDBVectorStore in run-llama/llama_index version v0.12.19. This vulnerability allows an attacker to manipulate the ref_doc_id parameter, enabling them to read and write arbitrary files on the server, potentially leading to remote code execution (RCE).