Remote Code Execution via Unsafe YAML Deserialization in PyYAML

Critical Risk | Deserialization & Object Security
Tags: python, yaml, pyyaml, deserialization, rce, unsafe-loader

What it is

A remote code execution (RCE) vulnerability that arises when yaml.unsafe_load(), or yaml.load() with an unsafe loader such as yaml.Loader or yaml.UnsafeLoader, parses untrusted YAML. PyYAML can represent arbitrary Python objects through tags such as !!python/object/apply, so an attacker can craft a document whose tags call Python functions during deserialization, before the application has a chance to inspect the result.

import yaml
from flask import Flask, request

app = Flask(__name__)

# VULNERABLE: unsafe_load allows code execution
@app.route('/config', methods=['POST'])
def upload_config():
    yaml_content = request.data.decode('utf-8')
    
    # DANGEROUS: can execute arbitrary code
    config = yaml.unsafe_load(yaml_content)
    # Also dangerous:
    # config = yaml.load(yaml_content, Loader=yaml.Loader)
    
    return {'status': 'uploaded'}

# Example malicious YAML that runs a shell command when parsed:
# !!python/object/apply:os.system
# - 'rm -rf /'

import yaml
from flask import Flask, request

app = Flask(__name__)

# SECURE: safe_load prevents code execution
@app.route('/config', methods=['POST'])
def upload_config():
    yaml_content = request.data.decode('utf-8')

    # SAFE: only allows standard YAML tags
    try:
        config = yaml.safe_load(yaml_content)
    except yaml.YAMLError:
        # Malformed YAML, or a tag safe_load refuses (e.g. !!python/object/apply)
        return {'error': 'Invalid YAML'}, 400

    # Validate structure
    if not isinstance(config, dict):
        return {'error': 'Invalid format'}, 400

    return {'status': 'uploaded'}

💡 Why This Fix Works

The vulnerable code uses yaml.unsafe_load(), which lets a YAML document instantiate arbitrary Python objects and call arbitrary functions, enabling remote code execution. The secure version uses yaml.safe_load(), which only constructs standard YAML types (such as strings, numbers, lists, and dicts) and raises an error on Python-specific tags instead of executing them.
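The difference can be observed directly with a harmless stand-in for an attacker payload. The following is a minimal sketch, assuming PyYAML 5.1 or later (where yaml.unsafe_load exists); the payload calls builtins.len rather than anything destructive, and the helper name try_safe_load is made up for illustration.

```python
import yaml

# Harmless stand-in for an attacker payload: the tag asks the loader to
# call builtins.len([1, 2, 3]) while the document is being parsed.
PAYLOAD = "!!python/object/apply:builtins.len [[1, 2, 3]]"


def try_safe_load(text):
    """Parse text with safe_load; return (ok, value_or_error_name)."""
    try:
        return True, yaml.safe_load(text)
    except yaml.YAMLError as exc:
        # safe_load has no constructor registered for python/* tags,
        # so it raises instead of calling anything.
        return False, type(exc).__name__


# The unsafe loader resolves the tag and runs the call during parsing.
unsafe_result = yaml.unsafe_load(PAYLOAD)

# The safe loader refuses the same document.
safe_ok, safe_info = try_safe_load(PAYLOAD)

# Plain data still parses normally with the safe loader.
plain_ok, plain_value = try_safe_load("name: demo\nport: 8080")
```

Swapping builtins.len for os.system is all an attacker needs to turn the same mechanism into command execution, which is why the unsafe call must never see untrusted input.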

Why it happens

The application passes YAML from an untrusted source to a PyYAML loader that is able to construct arbitrary Python objects.

Root causes

Using yaml.unsafe_load() with Untrusted Data

Calling yaml.unsafe_load() directly on untrusted input. This function explicitly allows arbitrary Python object construction.

Using yaml.load() Without SafeLoader

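Calling yaml.load() without Loader=yaml.SafeLoader. Before PyYAML 5.1, an omitted Loader defaulted to the unsafe yaml.Loader. Releases 5.1 through 5.4 defaulted to FullLoader and emitted a warning, and FullLoader itself had code-execution bypasses in some of those releases. Since PyYAML 6.0 the Loader argument is required, so the risk there comes from explicitly passing yaml.Loader, yaml.UnsafeLoader, or yaml.FullLoader.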
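To make the loader choice concrete, the sketch below compares loader classes passed to yaml.load(). It assumes PyYAML 5.1 or later (for yaml.UnsafeLoader); the helper rejects_python_tags and the sample documents are hypothetical, and the tagged document only calls builtins.len.

```python
import yaml

DOC = "service: api\nreplicas: 3\n"
TAGGED = "!!python/object/apply:builtins.len [[1, 2, 3]]"

# Explicit SafeLoader is the long form of yaml.safe_load().
explicit = yaml.load(DOC, Loader=yaml.SafeLoader)
shorthand = yaml.safe_load(DOC)


def rejects_python_tags(loader):
    """Return True if the given loader class refuses the python/* tag."""
    try:
        yaml.load(TAGGED, Loader=loader)
    except yaml.constructor.ConstructorError:
        return True
    return False


safe_rejects = rejects_python_tags(yaml.SafeLoader)      # refuses the tag
unsafe_rejects = rejects_python_tags(yaml.UnsafeLoader)  # executes the tag
```

The loader class is the only thing that changes between the two calls, so a code review can focus on what is passed as Loader.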

Processing External YAML Without Validation

Loading YAML from user uploads, APIs, or external sources without safe parsing.

Fixes

1. Replace with yaml.safe_load()

Use yaml.safe_load(), which only allows standard YAML tags and prevents code execution.

2. Use SafeLoader Explicitly

If using yaml.load(), always specify Loader=yaml.SafeLoader.

3. Validate Parsed Data

Safe loading stops code execution but does not guarantee well-formed data. Check the structure, types, and value ranges of the parsed YAML before using it.
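The validation step might look like the following. This is a minimal sketch under stated assumptions: the schema (name, port, debug) and the function validate_config are hypothetical and stand in for whatever structure the application actually expects; the input is assumed to be the value returned by yaml.safe_load().

```python
# Hypothetical schema: expected keys and their types.
ALLOWED_KEYS = {"name": str, "port": int, "debug": bool}


def validate_config(config):
    """Return config if it matches the schema, else raise ValueError."""
    if not isinstance(config, dict):
        raise ValueError("top-level YAML value must be a mapping")

    unknown = set(config) - set(ALLOWED_KEYS)
    if unknown:
        # YAML keys may be non-strings, so stringify before sorting.
        raise ValueError(f"unexpected keys: {sorted(str(k) for k in unknown)}")

    for key, expected in ALLOWED_KEYS.items():
        if key not in config:
            raise ValueError(f"missing key: {key}")
        value = config[key]
        # bool is a subclass of int, so reject True/False where int is expected.
        wrong_type = not isinstance(value, expected)
        bool_as_int = expected is int and isinstance(value, bool)
        if wrong_type or bool_as_int:
            raise ValueError(f"{key} must be {expected.__name__}")

    if not 1 <= config["port"] <= 65535:
        raise ValueError("port out of range")
    return config
```

A handler would call validate_config(yaml.safe_load(body)) and map ValueError to a 400 response. Rejecting unknown keys is a design choice that keeps the accepted surface small; a more permissive application could ignore them instead.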

Detect This Vulnerability in Your Code

Sourcery automatically identifies remote code execution via unsafe YAML deserialization in PyYAML, along with many other security issues in your codebase.
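For a rough local check, the call patterns described in this article can also be flagged with Python's standard ast module. The sketch below is illustrative only and is not how Sourcery or any other tool works: the function find_unsafe_yaml_calls is hypothetical, it only matches the literal form yaml.<name>(...), and it will miss aliased imports such as "from yaml import load".

```python
import ast

UNSAFE_CALLS = {"unsafe_load", "unsafe_load_all"}
SAFE_LOADERS = {"SafeLoader", "CSafeLoader"}


def _loader_name(call):
    """Return the name passed as Loader (keyword or 2nd positional), or None."""
    candidates = [kw.value for kw in call.keywords if kw.arg == "Loader"]
    if len(call.args) >= 2:
        candidates.append(call.args[1])
    for node in candidates:
        if isinstance(node, ast.Attribute):
            return node.attr
        if isinstance(node, ast.Name):
            return node.id
    return None


def find_unsafe_yaml_calls(source):
    """Return sorted (line, reason) pairs for yaml.* calls that look unsafe."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only match yaml.<name>(...); aliases are deliberately out of scope.
        if not (isinstance(func, ast.Attribute)
                and isinstance(func.value, ast.Name)
                and func.value.id == "yaml"):
            continue
        if func.attr in UNSAFE_CALLS:
            findings.append((node.lineno, f"yaml.{func.attr}"))
        elif func.attr in {"load", "load_all"}:
            loader = _loader_name(node)
            if loader not in SAFE_LOADERS:
                findings.append(
                    (node.lineno, f"yaml.{func.attr} with Loader={loader}")
                )
    return sorted(findings)
```

Running it over a file's text, for example find_unsafe_yaml_calls(open(path).read()), returns the line numbers worth reviewing; an empty list means no match for these specific patterns, not that the file is safe.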