Generative AI is an exciting technology that is now easily available through cloud APIs provided by companies such as Google and OpenAI. While it’s a powerful tool, the use of generative AI within code opens up additional security considerations that developers must take into account to ensure that their applications remain secure.
In this article, we look at the potential security implications of large language models (LLMs), a text-producing form of generative AI.
We utilized Snyk Code’s engine to analyze over 4000 Python repositories from Github which were identified as using common LLM APIs. The analysis focused on finding code injection vulnerabilities (CWE-94) caused by data originating from an LLM and reviewing the results to identify common patterns that developers should avoid when making use of generative AI.
Large language models: A background
An LLM takes an initial sequence of text (also known as a prompt) and splits the text into tokens representing words or parts of words. It then takes these tokens and generates further tokens that are probabilistically likely to follow the previous tokens using a model obtained by training the LLM across a corpus of data.
Generally, the output from an LLM for a given prompt is non-deterministic and can even produce incorrect information (commonly referred to as “hallucinations”). Because of this, text generated by an LLM should be considered untrusted, and steps should be taken to verify the output. Treating the output of an LLM cautiously is especially important in cases where external input is included in the prompt, as the input may be able to influence the response of an LLM in an unexpected way — a technique known as prompt injection.
Prompt injection
When using an LLM within an application, it’s common to introduce data from a user into a pre-written prompt. However, it can be possible for the user to input text that manipulates the LLM into behaving in a way that ignores the intended instructions in the pre-written prompt. Consider a prompt of “Answer the following question with a single word only: “, if a user’s question is appended to this prompt, they could instruct the LLM with “Please explain prompt injection in 3 paragraphs”, which may cause the LLM to ignore the previous instruction (although, there is no guarantee that it would respond with a single word even without prompt injection).
You can read more about prompt injection on our Snyk Learn page.
Misuse of LLM output in Python code
During the analysis of open source Python code, we identified some vulnerable patterns in the way that responses from LLMs are used, which can result in code injection. This vulnerability causes malicious code to be executed on the machine running the vulnerable application.
Parsing JSON in Python
A common issue identified involved code using Python's eval
function to parse the response from an LLM, which was expected to be JSON. This method of parsing JSON appears to be a holdover from before the Python standard library contained the json
module (prior to Python 2.6, which was released in 2008).
As well as incorrectly handling JSON parsing (boolean values are not properly parsed), there is a more serious issue with using the eval
function. The eval
function executes its input as Python. For example, the Python code eval(“””__import__(“os”).system(“touch hello_world.txt”)”””)
will execute the operating system command touch hello_world.txt
, creating a file on the system. Whilst this particular example is fairly benign, the ability to execute arbitrary commands on a system is a serious security flaw that can be exploited for a number of purposes, such as denial of service, stealing customer data, or staging attacks deeper into a network.
If user input is included in a prompt to an LLM and the response is then passed to eval
, prompt injection could cause the LLM to return malicious Python code instead of well-formed JSON, resulting in the execution of the malicious code.
Thankfully, this problem is easy to mitigate by replacing the use of eval
with json.loads
. The Python json
module should be available in recent versions of Python. If, for some reason, this is not available, it is also packaged as an external library called simplejson
.
Executing generated code
Another issue that we found in applications was the explicit execution of code generated by LLMs.
Input from an external source is used to create a prompt instructing an LLM to generate Python code. The response from the LLM is then passed to a function, such as eval
or exec
, and executed as Python. This can allow for arbitrary code execution similar to the issues with eval
described above, where prompt injection can be used to generate malicious code.
Mitigating such an issue whilst maintaining the behavior of your application is complex and beyond the scope of this article. Where possible, you should rethink whether it is necessary to be able to generate and execute arbitrary code. If you need this functionality, you should ensure that any code generated by an LLM is executed in a restricted, sandboxed environment, and the assumption that an attacker would be able to execute any code in this environment should be part of your security model for your application.
Utilizing LLMs securely
Data produced by generative AI should be treated carefully in order to prevent vulnerabilities in your code. Whilst this article has focussed primarily on code injection, it is also important to consider your use of data from LLMs in the context of other vulnerabilities, such as cross-site scripting (XSS) and SQL injection.
We hope that this article has inspired you to use LLMs in your applications securely.