Express.js XML External Entity (XXE) via Expat Parser

Severity: High Risk
Category: XML External Entity (XXE)
Tags: express, xxe, xml, expat, javascript, external-entities

What it is

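The Express.js application passes untrusted XML to an Expat-based parser (node-expat) without any hardening around DTDs and entity declarations, which exposes it to XML External Entity (XXE) attacks. When a parser processes attacker-supplied entity declarations, the attacker may be able to read local files, make the server send requests to internal services (SSRF), or exhaust memory and CPU through entity expansion. Whether a given payload is exploitable depends on the parser, its version, and which entity handlers are registered, so treat any parser that accepts DTDs from untrusted input as exposed until verified otherwise.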

// Vulnerable: Expat with default settings
const express = require('express');
const expat = require('node-expat');

const app = express();
app.use(express.json()); // req.body.xml arrives inside a JSON body

app.post('/parse-xml', (req, res) => {
  const parser = new expat.Parser('UTF-8');

  parser.on('startElement', (name, attrs) => {
    console.log('Element:', name);
  });

  // Dangerous: untrusted XML, including any DTD and entity declarations,
  // is handed to the parser with no checks
  parser.write(req.body.xml);
  parser.end();

  res.json({ success: true });
});

// Secure: Disable external entity processing
// (assumes the same Express `app` as in the vulnerable example)
const libxml = require('libxmljs');

app.post('/parse-xml', (req, res) => {
  try {
    // Parse with security options
    const doc = libxml.parseXml(req.body.xml, {
      noent: false,    // Do not substitute entities
      nonet: true,     // Disable network access
      nowarning: true, // Suppress warning reports
      noerror: true    // Suppress error reports
    });

    res.json({ success: true, data: doc.toString() });
  } catch (error) {
    // Return a generic message rather than parser internals
    res.status(400).json({ error: 'Invalid XML format' });
  }
});

💡 Why This Fix Works

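The fix replaces the unguarded node-expat parser with libxmljs.parseXml called with explicit security options: noent: false leaves entity references unsubstituted, and nonet: true prevents the parser from fetching anything over the network. The try/catch turns any parse failure that throws into a generic 400 response, so parser details are not echoed back to the client.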

Why it happens

The parser receives attacker-controlled XML while DTD and entity handling are left at library defaults. This usually happens because parser instances are created ad hoc, with no hardening options and no input checks in front of them. The root causes below break this down.

Root causes

Expat Parser Default Settings Enable External Entities

Node.js applications reach Expat through the node-expat binding, or use pure-JavaScript SAX parsers such as sax and saxes, and instantiate them with default configurations: new expat.Parser() or new saxes.SaxesParser() with no hardening around DTDs or entity declarations. When the input contains a declaration such as <!ENTITY xxe SYSTEM "file:///etc/passwd">, a parser that resolves external entities reads the file or fetches the URL, enabling file disclosure or SSRF to internal services. A parser that only expands internal entities is still exposed to billion laughs style denial of service. Exact behavior varies by library and version, so verify it with a test payload rather than assuming it.

Processing Untrusted XML Without Disabling Entity Resolution

Express applications accept XML from user uploads, API requests (Content-Type: application/xml), webhook callbacks, or external integrations without disabling entity resolution in the parser configuration. Applications read XML request bodies with body-parser's text parser, an XML body-parsing middleware, or custom middleware, then pass them to the parser without security options. Developers often treat XML as a data format comparable to JSON and overlook the security implications of DTD processing and external entity resolution. As a result, user-supplied XML from SOAP requests, RSS feeds, SVG uploads, or configuration files gets parsed with full entity processing enabled.

Missing Pre-Parse XML Validation and Sanitization

Applications fail to validate or sanitize XML content before passing it to the parser. They do not check payloads for DTD markers such as <!DOCTYPE and <!ENTITY, or for the SYSTEM and PUBLIC keywords inside them. They do not enforce payload size limits that would bound entity-expansion DoS, do not validate Content-Type headers properly, and do not inspect XML structure before parsing. XML passes directly from req.body to application logic without intermediate security controls, so a crafted XXE payload reaches the parser before application code has any chance to inspect or filter it.

Explicitly Enabling DOCTYPE Processing

Applications or XML parsing libraries explicitly enable DOCTYPE processing on the assumption that it is required for valid XML parsing or schema validation. Code configures parsers with options that turn on DTD features, for example noent: true or dtdload: true in libxmljs, or equivalent flags in other wrappers. Developers enable DOCTYPE processing to support legacy XML formats, validation via DTDs, or entity substitution for templating. Even teams that understand XXE risks may enable DOCTYPE processing on the assumption that other controls (input validation, network restrictions) are sufficient, which leaves a gap in defense in depth when one of those controls fails.

Insufficient Security Configuration Across XML Stack

Applications use multiple XML processing libraries (sax, saxes, node-expat, fast-xml-parser) with inconsistent security configurations. Some parsers have XXE protections while others don't, creating vulnerabilities where different code paths process XML differently. Developers configure one parser securely but miss configuration in secondary parsers used for different XML processing tasks. XML processing libraries get updated with new security options, but applications don't update configurations to enable new protections. No centralized XML parser factory ensures consistent security configuration across the application, leading to ad-hoc parser instantiation with varying security postures.

Fixes

1. Disable External Entity Processing in Expat Configuration

Configure every parser to explicitly disable external entity resolution and DTD processing before any XML is handled. For the sax parser, use strict mode with strict entities so that only the predefined XML entities are recognized: const parser = require('sax').parser(true, {strictEntities: true, xmlns: false}). For saxes, create the parser with explicit options such as {defaultXMLVersion: '1.0'} and do not resolve entity references yourself inside event handlers. For node-expat, avoid using Parser directly; wrap it in a security layer that rejects documents carrying DOCTYPE or entity declarations, for example by listening for its entityDecl event. Test the configuration by parsing XML that contains external entities and confirming that it is rejected or that the entities are left unresolved.
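One way to build the security layer described above is sketched here. It assumes the parser is an event emitter that fires an entityDecl event for each entity declaration and exposes write(), end(), and optionally stop(), which matches node-expat's documented interface as far as I am aware. The function name parseRejectingEntities is hypothetical.

```javascript
// Sketch: run an event-based parser and reject any document that declares
// entities. The parser is created by the caller, so application handlers
// (startElement, text, ...) can be attached before this function runs.
function parseRejectingEntities(parser, xml) {
  let rejected = false;

  const onEntityDecl = () => {
    rejected = true;
    // Halt parsing as early as the parser allows (assumed optional method).
    if (typeof parser.stop === 'function') {
      parser.stop();
    }
  };

  parser.on('entityDecl', onEntityDecl);
  try {
    parser.write(xml);
    // Only finish the document if nothing suspicious was seen.
    if (!rejected) {
      parser.end();
    }
  } finally {
    parser.removeListener('entityDecl', onEntityDecl);
  }

  if (rejected) {
    throw new Error('XML with entity declarations is not accepted');
  }
}

module.exports = { parseRejectingEntities };
```

With node-expat this would be called as parseRejectingEntities(new expat.Parser('UTF-8'), req.body.xml) inside a try/catch that maps the error to a 400 response. Because DTD declarations come before the root element, the guard fires before any element handlers see document content.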

2. Use Modern XML Parsers with Secure Defaults

Replace unguarded parsers with alternatives that make entity handling an explicit, documented option. With fast-xml-parser, pass {ignoreAttributes: false, parseAttributeValue: false, allowBooleanAttributes: true, processEntities: false} to disable entity processing. The DOMParser from @xmldom/xmldom (the maintained successor of xmldom) does not resolve external entities. For new projects, prefer XML libraries whose documentation states how they handle DTDs and external entities. Create a secure wrapper function that encapsulates the configuration and use it consistently: const { XMLParser } = require('fast-xml-parser'); function secureXMLParse(xmlString) { return new XMLParser({processEntities: false}).parse(xmlString); }. Keep library versions current and monitor security advisories for XML parsing dependencies.

3. Implement Pre-Parse XML Validation and Sanitization

Validate XML input before parsing to detect and reject malicious payloads. Check for DTD markers: if (/<!DOCTYPE|<!ENTITY/i.test(xmlString)) throw new Error('Invalid XML'). Matching the bare words SYSTEM or PUBLIC anywhere in the document also catches legitimate text content, so anchor the check on the declarations themselves. Enforce payload size limits that match business requirements (e.g., max 1MB for API requests). Validate that the Content-Type header is the expected application/xml or text/xml. Prefer rejecting documents with a DOCTYPE over stripping it: a replacement such as xmlString.replace(/<!DOCTYPE[^>]*>/gi, '') stops at the first > and leaves fragments behind when the DOCTYPE has an internal subset. Use allowlists for expected XML root elements and reject unexpected structures. Log rejected XML payloads to security monitoring systems for attack detection. Treat these checks as an additional layer in front of a hardened parser, not as a replacement for one.
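The checks above can be combined into one guard that runs before any parser sees the input. This is a minimal sketch under stated assumptions: the 1 MB limit mirrors the example figure in the text, and the names MAX_XML_BYTES, isXmlContentType, and assertSafeXml are hypothetical.

```javascript
// Sketch of a pre-parse guard using only Node.js built-ins.
const MAX_XML_BYTES = 1024 * 1024; // 1 MB, adjust to business requirements

// True when the media type (ignoring parameters such as charset) is XML.
function isXmlContentType(header) {
  if (typeof header !== 'string') return false;
  const mediaType = header.split(';')[0].trim().toLowerCase();
  return mediaType === 'application/xml' || mediaType === 'text/xml';
}

// Returns the input unchanged when it passes every check, throws otherwise.
function assertSafeXml(xmlString, contentType) {
  if (!isXmlContentType(contentType)) {
    throw new Error('Unsupported content type');
  }
  if (typeof xmlString !== 'string') {
    throw new Error('XML payload must be a string');
  }
  if (Buffer.byteLength(xmlString, 'utf8') > MAX_XML_BYTES) {
    throw new Error('XML payload too large');
  }
  // Reject rather than strip: any DTD or entity declaration is refused.
  if (/<!DOCTYPE|<!ENTITY/i.test(xmlString)) {
    throw new Error('DOCTYPE and ENTITY declarations are not accepted');
  }
  return xmlString;
}

module.exports = { MAX_XML_BYTES, isXmlContentType, assertSafeXml };
```

In an Express handler this would be called as assertSafeXml(req.body.xml, req.headers['content-type']) before parsing. The pattern check can also trip on a CDATA section that merely quotes a DOCTYPE, which is an acceptable trade-off for most APIs but worth knowing about.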

4. Enforce XML Schema Validation with XSD

Define and enforce strict XML Schema (XSD) validation for all accepted XML formats using a library such as libxmljs2. Create XSD schemas that define exact allowlists of permitted elements, attributes, and data types. Validate XML against the schema before any business logic runs: const xsdDoc = libxmljs.parseXml(xsdString); const xmlDoc = libxmljs.parseXml(xmlString); if (!xmlDoc.validate(xsdDoc)) reject. Validation runs on a document that has already been parsed, so it does not replace hardened parser options; it adds defense in depth by rejecting unexpected structures before application code consumes them. Maintain schemas in version control alongside application code. For third-party XML integrations, schemas can be drafted from sample data and then tightened by hand.
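The validation step can be wrapped so that parsing and schema checking always use the same hardened options. The sketch below assumes libxmljs2's parseXml(source, options), Document.validate(xsdDoc), and the validationErrors array. The function name validateAgainstXsd and the optional third parameter (which exists so the library can be swapped in tests) are hypothetical.

```javascript
// Sketch: parse with hardened options, then validate against an XSD.
// The library is loaded lazily so a substitute can be injected in tests.
function validateAgainstXsd(xmlString, xsdString, libxml = require('libxmljs2')) {
  // Same options for schema and document: no entity substitution, no network.
  const parseOptions = { noent: false, nonet: true };

  const xsdDoc = libxml.parseXml(xsdString, parseOptions);
  const xmlDoc = libxml.parseXml(xmlString, parseOptions);

  // validate() is called on the document and receives the schema.
  if (!xmlDoc.validate(xsdDoc)) {
    const err = new Error('XML failed schema validation');
    // Keep details for server-side logs; do not echo them to clients.
    err.details = (xmlDoc.validationErrors || []).map((e) =>
      String(e.message).trim()
    );
    throw err;
  }
  return xmlDoc;
}

module.exports = { validateAgainstXsd };
```

The XSD itself should come from a trusted location such as the application's own repository, never from the request.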

5. Migrate to JSON for Data Exchange

Evaluate whether XML is truly required or if JSON can replace it for data exchange. Modern APIs overwhelmingly prefer JSON due to simpler parsing, better performance, and absence of XXE vulnerabilities. Migrate internal APIs from XML to JSON: accept application/json instead of application/xml, convert XML responses to JSON responses. For external integrations requiring XML (SOAP, legacy systems), isolate XML processing to specific API gateway layers with enhanced security controls and monitoring. Document business justification for any XML usage and security mitigations applied. JSON eliminates XXE attack surface entirely while providing better developer experience and ecosystem support.
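For routes that have been migrated, a small Express-style middleware can make the JSON-only policy explicit by refusing any other content type. This is a sketch: the name requireJson is hypothetical, and it is intended for routes that carry a request body.

```javascript
// Sketch: Express-style middleware that accepts only application/json.
function requireJson(req, res, next) {
  const header = req.headers['content-type'] || '';
  // Ignore parameters such as "; charset=utf-8".
  const mediaType = header.split(';')[0].trim().toLowerCase();

  if (mediaType !== 'application/json') {
    // 415 Unsupported Media Type
    res.status(415).json({ error: 'Only application/json is accepted' });
    return;
  }
  next();
}

module.exports = { requireJson };
```

It would be mounted per route, for example app.post('/orders', requireJson, express.json(), handler), so that XML bodies never reach application code on migrated endpoints.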

6. Implement Centralized Secure XML Parser Factory

Create a centralized XML parser factory that enforces one security configuration across the entire application: const { XMLParser } = require('fast-xml-parser'); module.exports.createSecureXMLParser = function() { return new XMLParser({processEntities: false, allowBooleanAttributes: true, ignoreAttributes: false}); }. Replace all ad-hoc parser instantiation with factory calls so that every code path has the same security posture. Use dependency injection to provide parsers to application components. Implement a parser wrapper that logs usage, monitors for XXE attack attempts, and enforces security policies. Add automated tests verifying that parsers from the factory reject or neutralize XXE payloads. Keeping the parser security configuration in a single location makes updates easier and reduces the risk of misconfiguration.
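A factory along these lines might look like the sketch below. It assumes fast-xml-parser v4, which exports an XMLParser class that takes an options object. The optional ParserClass parameter and the SECURE_XML_OPTIONS constant are hypothetical names, added so that tests or other components can inject a different parser.

```javascript
// Sketch: one module owns the XML parser configuration.
// Frozen so no caller can loosen the settings at runtime.
const SECURE_XML_OPTIONS = Object.freeze({
  processEntities: false,      // do not expand entities
  allowBooleanAttributes: true,
  ignoreAttributes: false,
});

// ParserClass defaults to fast-xml-parser's XMLParser and is loaded lazily,
// so another implementation can be injected.
function createSecureXMLParser(
  ParserClass = require('fast-xml-parser').XMLParser
) {
  // Pass a copy so the parser cannot mutate the shared configuration.
  return new ParserClass({ ...SECURE_XML_OPTIONS });
}

module.exports = { createSecureXMLParser, SECURE_XML_OPTIONS };
```

Application code would then call createSecureXMLParser().parse(xmlString) instead of constructing parsers directly, and a lint rule or code review check can flag any direct import of the parsing library outside this module.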
