GraphQL is an API query language and server-side runtime that allows clients to request the exact data they need with a single request to a single endpoint, rather than making multiple requests to different endpoints as is often required with REST APIs. It can be used with any backend framework or programming language, allowing for data sharing between different applications.
We've been playing with it at Arcjet because we've seen common integrations with Apollo + NestJS and Yoga + Next.js. Arcjet brings security closer to your application by analyzing requests within the context of your route handlers or middleware, so it can be used to secure your GraphQL endpoints.
To understand the security risks associated with GraphQL, it helps to first understand how it works at a high level.
Understanding GraphQL
GraphQL utilizes the following main components:
- Schema : The schema defines how your available data is structured and how its parts relate to each other. You can think of a schema as a menu from which a client orders data, based on what is offered. Data objects are called types and their properties are called fields. Each field has a type of its own: one of the built-in scalars (Int, Float, String, Boolean, or ID, which is serialized as a string but not intended to be human-readable), a custom scalar, or another type defined in the schema.
- Queries : Read operations that a client sends to the GraphQL endpoint, typically in the body of an HTTP POST request.
- Resolvers : Every type and field in a schema can have a defined resolver function that handles the queries received and returns the requested data. Resolvers can be asynchronous to interact with other APIs and databases to fetch data.
- Mutations : While queries read data, mutations write it. A mutation can create, update, or delete data on the GraphQL server.
GraphiQL
The GraphiQL interface is an integrated development environment (IDE) that can be used to build queries using a graphical user interface. This IDE can be located at endpoints such as /graphql, /playground, or /console.
Because this page is designed to help users construct queries with the correct syntax, it includes autocomplete suggestions and very detailed error descriptions. These features, while convenient, can give attackers hints about the syntax required to interact with the data.
For example, a malformed query may return the following response:
{
"errors": [
{
"message": "Fields \"userInfo\" conflict because they have differing arguments. Use different aliases on the fields to fetch both if this was intentional.",
"locations": [
{
"line": 2,
"column": 3
},
{
"line": 6,
"column": 3
}
],
...
}
]
}
As seen above, even if you did not notice the autocomplete suggestion for the type of username, the error message also tells you exactly what to fix, helping attackers construct valid queries.
GraphQL Introspection
Many GraphQL servers ship with introspection enabled by default. An introspection query reveals the entire schema. An example of an introspection query is:
query IntrospectionQuery {
__schema {
queryType {
name
}
mutationType {
name
}
subscriptionType {
name
}
types {
...FullType
}
directives {
name
description
args {
...InputValue
}
}
}
}
fragment FullType on __Type {
kind
name
description
fields(includeDeprecated: true) {
name
description
args {
...InputValue
}
type {
...TypeRef
}
isDeprecated
deprecationReason
}
inputFields {
...InputValue
}
interfaces {
...TypeRef
}
enumValues(includeDeprecated: true) {
name
description
isDeprecated
deprecationReason
}
possibleTypes {
...TypeRef
}
}
fragment InputValue on __InputValue {
name
description
type {
...TypeRef
}
defaultValue
}
fragment TypeRef on __Type {
kind
name
ofType {
kind
name
ofType {
kind
name
ofType {
kind
name
}
}
}
}
The response to such a query contains the entire schema, which a threat actor could parse to map the attack surface of the API.
Denial of Service Attacks
There are various attack techniques that all aim to overload the backend with queries to achieve a Denial of Service (DoS) outage at the application level.
Note: An ellipsis (...) in the examples below indicates that the query continues.
Query Batch DoS Attack
Requests sent to the GraphQL endpoint are not limited to a single query. GraphQL supports what is known as query batching, which enables multiple queries to be included in just one HTTP request.
If this capability is not disabled, or the maximum number of queries allowed in a batch is not set to a reasonable value, severe security vulnerabilities can arise. The body of a malicious batched request could resemble the following:
[
  { "query": "query { userInfo(id: \"1\") { name email } }" },
  { "query": "query { userInfo(id: \"2\") { name email } }" },
  { "query": "query { userInfo(id: \"3\") { name email } }" },
  ...
]
Additionally, an attacker who can batch queries can bypass rate limiting implementations that count HTTP requests, because only a single request is sent.
In REST APIs, rate limiting can be an effective proactive security measure against brute-force and enumeration attacks, as each request performs a single operation. However, since a single GraphQL request can contain multiple queries, request-based rate limiting alone offers little protection against bulk calls for data.
An attacker can leverage this to brute force valid credentials of an account if GraphQL is used for authentication via a mutation that returns a session token.
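One way to blunt batching abuse is to cap the number of operations accepted per HTTP request. The following is a minimal sketch, assuming the server receives a batched request as a JSON array of operation objects; the helper name, the limit of 5, and the error text are illustrative and not part of any library.

```typescript
// Hypothetical limit; tune it to what legitimate clients actually need.
const MAX_BATCH_SIZE = 5;

interface GraphQLRequestBody {
  query: string;
  variables?: Record<string, unknown>;
}

// Returns the operations contained in the body, or throws if there are too many.
function enforceBatchLimit(
  body: GraphQLRequestBody | GraphQLRequestBody[],
  maxBatchSize: number = MAX_BATCH_SIZE,
): GraphQLRequestBody[] {
  // A batched request is an array; a regular request is a single object.
  const operations = Array.isArray(body) ? body : [body];
  if (operations.length > maxBatchSize) {
    throw new Error(
      `Batch of ${operations.length} operations exceeds the limit of ${maxBatchSize}`,
    );
  }
  return operations;
}
```

The length of the returned array could also be used as the cost of the request when rate limiting, so that a batch of five operations consumes five tokens rather than one.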
Alias DoS Attacks
Even if query batching is disabled, a server can still be overloaded if a malicious attacker uses different aliases for the same data in the request to the GraphQL endpoint.
Under normal circumstances, aliases are used to avoid naming conflicts when fetching the same field multiple times with different arguments. An alias attack query could look like the following:
query {
user1: userInfo(id: "1") {
name
email
}
user2: userInfo(id: "2") {
name
email
}
user3: userInfo(id: "3") {
name
email
}
...
}
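To make the idea of an alias limit concrete, here is a simplified, dependency-free sketch that counts aliases in a query string. It is an assumption-laden illustration rather than how any particular library works: it scans characters by hand, skips comments, string literals, and anything inside parentheses, and treats a name followed by a colon as an alias. A production check should rely on a real GraphQL parser.

```typescript
// Counts `alias: field` pairs in a query string (illustrative sketch only).
function countAliases(query: string): number {
  let count = 0;
  let parenDepth = 0;
  let i = 0;
  while (i < query.length) {
    const ch = query[i];
    if (ch === "#") {
      // Skip a comment through the end of the line.
      while (i < query.length && query[i] !== "\n") i++;
    } else if (ch === '"') {
      // Skip a string literal, honoring backslash escapes.
      i++;
      while (i < query.length && query[i] !== '"') {
        if (query[i] === "\\") i++;
        i++;
      }
      i++;
    } else if (ch === "(") {
      parenDepth++;
      i++;
    } else if (ch === ")") {
      parenDepth--;
      i++;
    } else if (parenDepth === 0 && /[A-Za-z_]/.test(ch)) {
      // Read a full name, then check whether a colon follows it.
      const match = /^[A-Za-z_][A-Za-z0-9_]*/.exec(query.slice(i))!;
      i += match[0].length;
      if (/^\s*:/.test(query.slice(i))) count++;
    } else {
      i++;
    }
  }
  return count;
}
```

A server could reject any query for which this count exceeds a small threshold before executing it.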
Duplication DoS Attacks
Another technique utilized by threat actors to carry out DoS attacks against GraphQL APIs is to include duplicates of fields within queries. Again, this type of attack aims to exceed the processing power of the backend. An example of a duplication attack query is:
query {
  userInfo(id: "1") {
    name
    email
    name
    email
    name
    email
    ...
  }
}
Circular Query DoS Attacks
A circular query attack occurs when a query causes data to be fetched in a loop using nested fields.
Without limits in place, these deeply nested queries can lead to excessive resource consumption on the server.
query {
userInfo(id: "1") {
id
name
email
bio
friends {
id
name
email
bio
friends {
id
name
email
bio
...
}
}
}
}
In the above example, the query fetches a user's id, name, email, bio, and friends fields. It then requests the same fields for each friend, and again for each friend of those friends. If clients are able to abuse query depth in this manner, the server could easily be overwhelmed.
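A depth limit is the usual countermeasure. As a rough sketch of what such a check measures, the function below computes how deeply selection sets are nested in a query string. It is a hand-rolled scanner written for illustration (the function name is hypothetical), it counts the outermost braces as level 1, and real depth limiters built on a GraphQL parser may count differently.

```typescript
// Returns the deepest nesting level of selection sets in a query string.
function maxSelectionDepth(query: string): number {
  let depth = 0;
  let maxDepth = 0;
  let parenDepth = 0;
  let inString = false;
  let inComment = false;
  for (let i = 0; i < query.length; i++) {
    const ch = query[i];
    if (inComment) {
      if (ch === "\n") inComment = false;
    } else if (inString) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === "#") {
      inComment = true;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "(") {
      parenDepth++;
    } else if (ch === ")") {
      parenDepth--;
    } else if (parenDepth === 0 && ch === "{") {
      // Braces inside parentheses are input objects, not selection sets.
      depth++;
      maxDepth = Math.max(maxDepth, depth);
    } else if (parenDepth === 0 && ch === "}") {
      depth--;
    }
  }
  return maxDepth;
}
```

Rejecting queries whose depth exceeds a fixed threshold stops recursive friends-of-friends queries before any resolver runs.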
Injection Attacks
As with any other user input, queries and mutations can be vulnerable to injection attacks such as Cross-Site Scripting (XSS), Structured Query Language Injection (SQLi), Server-Side Request Forgery (SSRF), and Command Injection.
GraphQL XSS
Without sufficient sanitization of user input, a GraphQL server that returns user-generated content such as comments or posts lets threat actors inject code of their own that will be served and executed. There are three major classes of XSS attacks:
- Reflected XSS : Applications are vulnerable to this variant of XSS if a malicious script is included within the immediate response to a request and executed in the victim’s browser.
- Stored XSS : In this type of XSS, the injected payload is stored by the web application and delivered to anyone who subsequently visits the vulnerable page.
- DOM XSS : These vulnerabilities occur when JavaScript takes data from a user-controlled source, such as a URL query parameter, and passes it to a sink that can execute or render it, such as innerHTML.
For example, imagine a web application that uses the input value of a biography field in a form used to customize a user’s profile page. The supplied value is then directly displayed in the user’s profile. An XSS payload delivered via a GraphQL mutation could resemble something like:
mutation {
changeUserInfo(
id: "2"
name: "John Doe",
bio: "'<script>alert();</script>//"
) {
id
name
email
bio
}
}
As the value of the biography is rendered to the webpage, the injected JavaScript would execute in the browser, displaying an alert pop up.
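One common defense is to escape user-supplied text before it is placed into HTML. The function below is a minimal sketch of that idea; the name escapeHtml is illustrative, and in practice a templating layer or framework that escapes output by default is usually preferable to hand-rolled escaping.

```typescript
// Characters that have special meaning in HTML, mapped to their entities.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replaces HTML-significant characters so the text renders literally.
function escapeHtml(input: string): string {
  return input.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}
```

With this in place, the payload from the mutation above would be displayed as text instead of being interpreted as a script tag.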
GraphQL SQLi
Structured Query Language (SQL) is the language used to interact with a relational database. In an SQL injection (SQLi) attack, a malicious statement is sent to the database to extract, update, add, or delete information. Vulnerabilities can arise if GraphQL arguments are directly translated into SQL queries without appropriate protection measures in place. An example of an SQLi payload is:
query {
userInfo(id: "admin' OR '1'='1'") {
id
name
email
bio
}
}
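Because '1'='1' always evaluates to true, the injected condition matches every row. If the web application does not use parameterized queries, sanitize input, or restrict clients to preset allowed queries, sensitive information such as the administrator's record could be compromised.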
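The difference between a vulnerable and a safer resolver can be sketched without a database. The example below assumes a hypothetical users table and the $1 placeholder style used by PostgreSQL drivers; both function names are made up for illustration.

```typescript
interface ParameterizedQuery {
  text: string;
  values: string[];
}

// Vulnerable: the argument becomes part of the SQL text itself.
function unsafeUserQuery(id: string): string {
  return `SELECT id, name, email, bio FROM users WHERE id = '${id}'`;
}

// Safer: the SQL text is constant and the argument travels separately,
// so the database driver never interprets it as SQL.
function safeUserQuery(id: string): ParameterizedQuery {
  return {
    text: "SELECT id, name, email, bio FROM users WHERE id = $1",
    values: [id],
  };
}
```

The object returned by safeUserQuery has the shape that many drivers accept as a prepared statement, with the user-supplied id kept out of the statement text.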
GraphQL SSRF
If a query or mutation makes the server issue an HTTP request, under certain conditions, an attacker can supply a URL to an arbitrary host. For example, an attacker could point the server at an internal host and use the server as a proxy to scan the internal network:
mutation {
getUrlData(url: "http://172.16.1.222/private_data") {
id
name
email
bio
}
}
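A typical mitigation is to validate user-supplied URLs against an allowlist before the server fetches them. This is a minimal sketch under stated assumptions: the host images.example.com and the function name are hypothetical, and the check does not cover redirects or DNS tricks, which need separate handling.

```typescript
// Hypothetical allowlist of hosts the server is permitted to fetch from.
const ALLOWED_HOSTS = new Set<string>(["images.example.com"]);

// Returns true only for well-formed HTTPS URLs on an allowed host.
function isAllowedUrl(
  raw: string,
  allowedHosts: Set<string> = ALLOWED_HOSTS,
): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return false;
  return allowedHosts.has(url.hostname);
}
```

An allowlist is generally safer than trying to block private address ranges, because anything not explicitly permitted is refused.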
GraphQL Command Injection
If a mutation or query uses unsanitized user-supplied input directly in a terminal command on the server, Command Injection attacks could be carried out.
For example, if the backend fetches an image from an external source using a user-supplied URL and the wget command, a command injection payload could be:
mutation {
changeUserInfo(
id: "2"
name: "John Doe",
wget: "https://example.com/img.png&ls"
) {
id
name
email
bio
}
}
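The safest fix is usually to avoid the shell entirely and pass user input as a separate argument. The sketch below only builds the argument list; buildWgetArgs is a hypothetical helper, and the comment shows how the result might be handed to Node's execFile, which runs a program without a shell.

```typescript
// Validates a user-supplied URL and returns arguments for a wget call.
function buildWgetArgs(userUrl: string): string[] {
  const url = new URL(userUrl); // throws on malformed input
  if (url.protocol !== "https:" && url.protocol !== "http:") {
    throw new Error(`Unsupported protocol: ${url.protocol}`);
  }
  // "--" marks the end of options, so the URL is never parsed as a flag.
  return ["--", url.href];
}

// Assumed usage with Node's child_process module:
//   execFile("wget", buildWgetArgs(input), callback);
// Because no shell is involved, "&ls" stays part of the URL string
// instead of being interpreted as a second command.
```

Interpolating the same input into a string passed to a shell is what makes the payload above dangerous.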
Protecting Your GraphQL API
Although Arcjet provides robust protection for REST APIs through various defensive measures like rate limiting and bot protection, GraphQL APIs require additional security implementations due to their unique structure and potential vulnerabilities.
Using a single protection, such as rate limiting, only limits the number of requests and does not help with nested or batched queries. The philosophy to apply here is defense in depth: if one protection fails, the request still hits another security layer further down the stack.
Adhering to the actionable suggestions laid out in the GraphQL Cheat Sheet provided by OWASP can close the attack vectors that threat actors can exploit.
To add additional defensive layers to your GraphQL API, plugins such as GraphQL Armor can be used. View how it is accomplished with Yoga, Next.js & Arcjet.
Disable Schema Enumeration Features
Disabling schema enumeration features like the GraphiQL interface and introspection capabilities is crucial for security. These features can expose your entire schema, allowing attackers to understand the structure of your API and identify potential vulnerabilities. By disabling them, you reduce the surface area for reconnaissance attacks, making it harder for attackers to formulate targeted exploits.
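In addition to turning introspection off in the server configuration, a gateway or middleware can refuse introspection queries outright. The check below is a deliberately simple sketch: it looks for the __schema and __type meta-fields with a regular expression, leaves __typename alone because ordinary clients rely on it, and may flag a query that merely contains those words inside a string argument.

```typescript
// Returns true if the query text references the introspection meta-fields.
function requestsIntrospection(query: string): boolean {
  // \b after "type" prevents a match on the harmless __typename field.
  return /\b__(schema|type)\b/.test(query);
}
```

A request for which this returns true could be rejected before it reaches the GraphQL server.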
Implement a Timeout
Setting a timeout for query processing is a straightforward but vital security measure. By enforcing a time limit on how long the server will attempt to process a query, you can prevent long-running queries from consuming excessive server resources.
This is particularly important for mitigating denial-of-service attacks, where attackers may try to overwhelm your server with complex or resource-intensive queries. Timeouts help ensure that your server remains responsive and can handle legitimate requests without being bogged down.
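As a rough illustration of the idea, the helper below races a unit of work against a timer and rejects if the timer wins. The name withTimeout and the error text are illustrative. Note the assumption baked into this sketch: it only stops waiting for the result and does not cancel the underlying work, so expensive resolvers still need their own cancellation or database-level timeouts.

```typescript
// Resolves with the work's result, or rejects if it takes longer than `ms`.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever promise settles first decides the outcome.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

Wrapping query execution this way bounds how long a single request can hold a connection open.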
Limit Query Complexity
Implementing limits on query complexity is essential for reducing the risk of resource exhaustion attacks. By setting parameters such as character limits, cost limits, maximum query depth, and limits on aliases and directives, you can effectively control how complex a query can be.
This measure not only protects against excessive resource consumption but also helps your API stay online under heavy load. Limiting query complexity can deter attackers who may exploit overly complex queries to extract sensitive information or disrupt service.
Node.js + Apollo GraphQL Server + Arcjet + GraphQL Armor + Validation
To demonstrate how to implement GraphQL specific protections, we will create a simple Node.js application with an Apollo API endpoint, and then integrate functionality provided by security libraries.
- Create a directory in which you would like to store the application:
mkdir arcjet-graphql
- Enter the newly created directory:
cd arcjet-graphql
- Initialize a new project with:
npm init --yes && npm pkg set type="module"
- Install the necessary dependencies:
npm install @apollo/server graphql @arcjet/node @escape.tech/graphql-armor @graphql-tools/schema graphql-constraint-directive graphql-tag
- Install the TypeScript and Node packages:
npm install --save-dev typescript @types/node
- Create a src directory:
mkdir src
tsconfig.json
Create a tsconfig.json file in the project's root with the following content:
{
"compilerOptions": {
"target": "es2020",
"module": "es2022",
"lib": ["es2020"],
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true,
"outDir": "./dist",
"rootDir": "./src",
"moduleResolution": "node"
},
"include": ["src/**/*"],
"exclude": ["node_modules"]
}
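package.json
Update your package.json file in the project's root to match the following, keeping the @graphql-tools/schema entry that npm install added to dependencies: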
{
"name": "arcjet-graphql",
"version": "1.0.0",
"main": "index.js",
"type": "module",
"scripts": {
"compile": "tsc",
"start": "npm run compile && node --env-file .env.local dist/index.js"
},
"keywords": [],
"author": "",
"license": "ISC",
"description": "",
"dependencies": {
"@apollo/server": "^4.11.0",
"@arcjet/node": "^1.0.0-alpha.28",
"@escape.tech/graphql-armor": "^3.1.1",
"graphql": "^16.9.0",
"graphql-constraint-directive": "^5.4.3",
"graphql-tag": "^2.12.6"
},
"devDependencies": {
"@types/node": "^22.8.1",
"typescript": "^5.6.3"
}
}
.env.local
- Create a .env.local file in the project's root. Add the following to this file:
ARCJET_ENV=development
ARCJET_KEY=ajkey_YOUR-KEY-VALUE
To obtain a key, create an Arcjet account or sign in. The key is shown on your account dashboard page. This Arcjet API key lets your application use Arcjet, and the resulting activity can then be viewed on the Arcjet dashboard.
src/index.ts
Now, let's create the src/index.ts file using an Apollo Server with Arcjet and GraphQL Armor protections implemented.
Import the required dependencies:
import { ApolloServer } from '@apollo/server';
import { startStandaloneServer } from '@apollo/server/standalone';
import arcjet, { tokenBucket, detectBot } from "@arcjet/node";
import { ApolloArmor } from '@escape.tech/graphql-armor';
import { ApolloServerPluginLandingPageDisabled } from '@apollo/server/plugin/disabled';
import { GraphQLArmorConfig } from '@escape.tech/graphql-armor-types';
import { makeExecutableSchema } from '@graphql-tools/schema';
import { createApollo4QueryValidationPlugin, constraintDirectiveTypeDefs } from 'graphql-constraint-directive/apollo4.js';
import gql from 'graphql-tag';
Configure Arcjet with a token bucket rate limit, bot detection, and your API key:
const aj = arcjet({
key: process.env.ARCJET_KEY!,
rules: [
tokenBucket({
mode: "LIVE",
refillRate: 1,
interval: "10s",
capacity: 5,
}),
detectBot({
mode: "LIVE",
allow: [],
}),
],
});
Configure GraphQL Armor settings and apply the rules to your server with the ApolloArmor wrapper via armor.protect():
const armorConfig: GraphQLArmorConfig = {
maxAliases: {
n: 3, // Maximum number of aliases allowed.
},
maxDepth: {
n: 5, // Maximum query depth.
},
maxTokens: {
n: 1000, // Maximum number of tokens allowed in a query.
},
blockFieldSuggestion: {
enabled: true, // Block field suggestions.
},
costLimit: {
maxCost: 100, // Maximum allowed query cost.
},
};
// Initialize ApolloArmor with custom config.
const armor = new ApolloArmor(armorConfig);
const protection = armor.protect();
Define the GraphQL schema. Included is an object type for user data, a query to retrieve the data, and a mutation to change the name and biography of a user based on their id.
const typeDefs = gql`
${constraintDirectiveTypeDefs}
type User {
id: ID!
name: String! @constraint(pattern: "^[a-zA-Z0-9.]+$")
email: String! @constraint(format: "email")
bio: String @constraint(maxLength: 100, pattern: "^[a-zA-Z0-9.]+$")
friends: [User]
}
type Query {
userInfo(id: ID!): User
}
type Mutation {
changeUserInfo(id: ID!, name: String! @constraint(pattern: "^[a-zA-Z0-9.]+$"), bio: String @constraint(maxLength: 500, pattern: "^[a-zA-Z0-9.]+$")): User
}
input UserInput {
name: String! @constraint(minLength: 2, pattern: "^[a-zA-Z0-9.]+$")
email: String! @constraint(format: "email")
bio: String @constraint(maxLength: 100, pattern: "^[a-zA-Z0-9.]+$")
}
`;
Create an interface for the User type object to provide a TypeScript representation for use in your resolver functions.
interface User {
id: string;
name: string;
email: string;
bio?: string;
friends: User[];
}
For demonstration purposes, instead of connecting a database, hardcode user data to be used in testing:
const users: User[] = [
{
id: "1",
name: "John Doe",
email: "john@example.com",
bio: "",
friends: [
{
id: "2",
name: "Jane Doe",
email: "jane@example.com",
bio: "",
friends: []
}
]
},
{
id: "2",
name: "Jane Doe",
email: "jane@example.com",
bio: "",
friends: [
{
id: "1",
name: "John Doe",
email: "john@example.com",
bio: "",
friends: []
}
]
}
];
Define the resolver functions:
- userInfo : Returns user data based on the supplied id when a userInfo query is made.
- changeUserInfo : Allows the name and optionally the bio of a specified user to be changed.
- friends : A resolver for the friends field which takes the current user (parent object) and returns the data on each friend.
const resolvers = {
Query: {
userInfo: (_: any, { id }: { id: string }) => users.find(user => user.id === id)
},
Mutation: {
changeUserInfo: (_: any, { id, name, bio }: { id: string, name: string, bio?: string }) => {
const user = users.find(user => user.id === id);
if (!user) {
throw new Error('User not found.');
}
user.name = name;
if (bio !== undefined) user.bio = bio;
return user;
}
},
User: {
friends: (parent: User) => parent.friends.map(friend => users.find(user => user.id === friend.id))
}
};
Create a new Apollo server. Initially, leave all the protections commented out to use the GraphQL API in its default state. When testing, uncomment them one at a time to view the protection provided.
const schema = makeExecutableSchema({
  typeDefs,
  resolvers,
});
const server = new ApolloServer({
  schema,
  plugins: [
    // ...protection.plugins, // GraphQL Armor plugins.
    // createApollo4QueryValidationPlugin(),
    // ApolloServerPluginLandingPageDisabled(), // Disables landing page.
  ],
  // validationRules: protection.validationRules,
  // introspection: false, // Disables introspection.
});
Your server will run on http://localhost:4000
// This import can also live with the others at the top of the file.
import { GraphQLError } from 'graphql';

async function startServer() {
  const { url } = await startStandaloneServer(server, {
    listen: { port: 4000 },
    context: async ({ req, res }) => {
      // Apply Arcjet rate limiting and bot protection.
      const decision = await aj.protect(req, { requested: 1 });
      if (decision.isDenied()) {
        console.log("Request denied:", decision);
        // Throwing here stops the operation from executing. Apollo Server
        // uses the http.status extension as the response status code.
        throw new GraphQLError("BLOCKED", {
          extensions: { code: "BLOCKED", http: { status: 429 } },
        });
      }
      return { req, res };
    },
  });
  console.log(`🚀 Server ready at: ${url}`);
}
startServer().catch(console.error);
- Run:
npm start
- Navigate to http://localhost:4000
- To prevent output spam, click the gear icon in the ‘Sandbox’ input field and set ‘Auto Update’ to ‘Off’.
- With all of the protections commented out, you are free to use the GraphQL API and have no restrictions. Try sending queries such as:
query {
userInfo(id: "1") {
id
name
email
bio
friends {
id
name
email
}
}
}
query {
userInfo(id: "1") {
id
name
email
bio
friends {
id
name
email
bio
friends {
id
name
email
bio
friends {
id
name
email
bio
friends {
id
name
email
bio
friends {
id
name
email
bio
friends {
id
name
email
bio
friends {
id
name
email
bio
friends {
id
name
email
bio
friends {
id
name
email
bio
friends {
id
name
email
bio
}
}
}
}
}
}
}
}
}
}
}
}
mutation {
changeUserInfo(id: "1", name: "<script>alert()</script>", bio: "Updated bio") {
id
name
email
bio
}
}
Testing
Test the protections by uncommenting the associated lines and adjusting the thresholds in the GraphQL Armor configuration. Send calls to the endpoint using the attack techniques discussed in this article.
Implement Authentication and Authorization
In addition to all of these protections, proper authentication and authorization are crucial for protecting sensitive data and operations in your GraphQL API. By requiring users to prove they are who they claim to be and have the correct permission levels, you ensure that no unauthorized access to data occurs. For further information, view the Apollo documentation.
Calculating Query Cost
Another defense you could implement is limiting queries based on their complexity, using the costLimit setting in the GraphQL Armor configuration block:
costLimit: {
maxCost: 100, // Maximum allowed query cost.
},
For more background, see the GraphQL Inspector documentation and Shopify's implementation of this protection.
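To give a feel for how a cost figure might be derived, here is a small sketch that scores a query represented as a tree of fields. Everything in it is an assumption made for illustration: the FieldNode shape, the constants, and the formula are not taken from GraphQL Armor or any other library, whose actual calculations may differ.

```typescript
// A simplified view of a parsed query: a field and its nested selections.
interface FieldNode {
  name: string;
  selections?: FieldNode[];
}

// Illustrative weights, not defaults from any library.
const SCALAR_COST = 1;
const OBJECT_COST = 2;
const DEPTH_COST_FACTOR = 1.5;

// Leaf fields cost SCALAR_COST. Object fields cost OBJECT_COST plus the
// cost of their children scaled by DEPTH_COST_FACTOR, so nesting compounds.
function queryCost(field: FieldNode): number {
  if (!field.selections || field.selections.length === 0) {
    return SCALAR_COST;
  }
  const childCost = field.selections.reduce(
    (sum, child) => sum + queryCost(child),
    0,
  );
  return OBJECT_COST + childCost * DEPTH_COST_FACTOR;
}
```

Under these weights, each extra level of friends multiplies the cost of everything beneath it, which is why a cost ceiling catches both wide and deep queries.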
Conclusion
As you can see, due to how GraphQL APIs operate, they require a much more intricate defense strategy than REST APIs do. However, by layering protection, you can prevent unauthorized access to data and abuse of your endpoint.