Python urllib Insecure HTTP urlopen Vulnerability

Medium Risk | Insecure Transport
Tags: Python, urllib, HTTP, Insecure Transport, urlopen, Network Security

What it is

The application calls urllib.request.urlopen() or urllib2.urlopen() with plain http:// URLs, so requests and responses travel unencrypted and are exposed to eavesdropping and man-in-the-middle attacks.

# Vulnerable example
import urllib.request
import urllib2  # Python 2.x only; this import fails on Python 3
from flask import Flask, request

app = Flask(__name__)

@app.route('/fetch_url')
def fetch_url():
    # Vulnerable: HTTP URL with urlopen
    url = 'http://api.example.com/data'
    response = urllib.request.urlopen(url)
    return response.read().decode()

@app.route('/proxy')
def proxy_request():
    # Vulnerable: User-controlled URL without validation
    target_url = request.args.get('url')
    try:
        # Dangerous: No scheme validation
        response = urllib.request.urlopen(target_url)
        return response.read()
    except:
        return 'Error fetching URL', 500

# Vulnerable: Python 2.x style
def legacy_fetch(url):
    # Insecure: urllib2 with HTTP
    response = urllib2.urlopen(url)  # No HTTPS enforcement
    return response.read()
# Secure example
import ssl
import urllib.error
import urllib.parse
import urllib.request
from flask import Flask, request

app = Flask(__name__)

def validate_https_url(url):
    """Validate that URL uses HTTPS scheme."""
    parsed = urllib.parse.urlparse(url)
    if parsed.scheme != 'https':
        raise ValueError(f'Only HTTPS URLs allowed, got: {parsed.scheme}')
    return parsed

def create_secure_context():
    """Create secure SSL context."""
    context = ssl.create_default_context()
    # Both settings are already the defaults; set explicitly for clarity
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context

@app.route('/fetch_url')
def fetch_url():
    """Securely fetch URL with HTTPS."""
    try:
        # Secure: HTTPS URL
        url = 'https://api.example.com/data'
        validate_https_url(url)
        # Secure context with certificate verification
        context = create_secure_context()
        response = urllib.request.urlopen(
            url,
            timeout=10,
            context=context
        )
        return response.read().decode('utf-8')
    except (ValueError, urllib.error.URLError) as e:
        return f'Error: {str(e)}', 500

@app.route('/proxy')
def proxy_request():
    """Secure proxy with URL validation."""
    target_url = request.args.get('url', '')
    if not target_url:
        return 'URL parameter required', 400
    try:
        # Validate HTTPS scheme
        parsed_url = validate_https_url(target_url)
        # Allowlist of permitted domains
        allowed_domains = [
            'api.trusted.com',
            'secure.partner.com',
            'data.example.com'
        ]
        # Exact match on netloc, so URLs with a port or userinfo are rejected
        if parsed_url.netloc not in allowed_domains:
            return 'Domain not allowed', 403
        # Secure request with SSL verification
        context = create_secure_context()
        response = urllib.request.urlopen(
            target_url,
            timeout=15,
            context=context
        )
        # Limit response size
        content = response.read(1024 * 1024)  # 1MB limit
        return content.decode('utf-8', errors='ignore')
    except ValueError as e:
        return f'Invalid URL: {str(e)}', 400
    except urllib.error.URLError as e:
        return f'Request failed: {str(e)}', 500

# Secure helper function for HTTP requests
def secure_urlopen(url, timeout=10, max_size=1024*1024):
    """Secure wrapper for urllib.request.urlopen."""
    # Validate URL
    validate_https_url(url)
    # Create secure SSL context
    context = create_secure_context()
    try:
        response = urllib.request.urlopen(
            url,
            timeout=timeout,
            context=context
        )
        # Read with size limit
        content = response.read(max_size)
        return content
    except urllib.error.URLError as e:
        raise RuntimeError(f'Secure request failed: {str(e)}')

@app.route('/secure_fetch')
def secure_fetch():
    """Example using secure urlopen wrapper."""
    try:
        content = secure_urlopen('https://api.example.com/secure-data')
        return content.decode('utf-8')
    except (ValueError, RuntimeError) as e:
        return f'Error: {str(e)}', 500

# Additional security: Custom URL opener
def create_secure_opener():
    """Create URL opener with security configurations."""
    # Create HTTPS handler with SSL context
    context = create_secure_context()
    https_handler = urllib.request.HTTPSHandler(context=context)
    # Note: build_opener() still installs the default HTTPHandler, so this
    # opener does not block http:// URLs; validate the scheme before open()
    opener = urllib.request.build_opener(https_handler)
    # Set request headers
    opener.addheaders = [
        ('User-Agent', 'SecureApp/1.0'),
        ('Accept', 'application/json, text/plain')
    ]
    return opener

@app.route('/opener_example')
def opener_example():
    """Example using secure opener."""
    try:
        opener = create_secure_opener()
        response = opener.open('https://api.example.com/data', timeout=10)
        return response.read().decode('utf-8')
    except urllib.error.URLError as e:
        return f'Request failed: {str(e)}', 500

💡 Why This Fix Works

See the Fixes section below for a detailed explanation.

Why it happens

urlopen() accepts http:// URLs without any warning, so any URL that is hard-coded, configured, or user-supplied with the http scheme is fetched in cleartext. The root causes below describe the most common ways this ends up in production code.

Root causes

Using urllib.request.urlopen() with HTTP URLs

The code opens HTTP URLs directly: urllib.request.urlopen('http://api.example.com'). HTTP transmits data unencrypted, so API responses, credentials, and other sensitive information are exposed to anyone who can observe or modify the network traffic. urlopen() supports HTTP without warning, and legacy urllib code often still points at http:// endpoints.

Not Validating URL Protocols Before urlopen() Calls

The code opens URLs without checking the scheme: url = config['endpoint']; urlopen(url). URLs taken from configuration, environment variables, or user input may use HTTP, and nothing verifies that they use HTTPS. Because the application trusts these external sources, a single http:// value is enough to downgrade the connection.

Using urlopen() for API Calls Without TLS Verification

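The code makes API requests with default settings only: response = urlopen(api_url). On current Python versions urlopen() verifies certificates and hostnames by default, but the minimum protocol version and cipher selection come from the default SSL context unless the caller passes its own ssl.SSLContext. Interpreters older than Python 2.7.9 and 3.4.3 did not verify certificates by default, and code that passes a context with verification disabled turns the check off entirely. API calls that carry credentials should use an explicitly configured context.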

Legacy Code Using urllib Instead of Modern requests Library

Older codebases often call urllib directly: import urllib.request. urllib.request is lower level than requests, so timeouts, retries, connection reuse, and error handling must be written by hand, and security-relevant settings are easy to omit. A higher-level client such as requests reduces that surface, although it does not enforce HTTPS by itself either.

HTTP URLs in Documentation or Example Code

Examples show HTTP usage: # Example: urlopen('http://example.com'). Documentation often uses HTTP for simplicity, developers copy and paste those snippets, and the insecure URL ends up in production code. Documentation should demonstrate HTTPS even in throwaway examples.

Fixes

1. Always Use HTTPS URLs with urllib.request.urlopen()

Use HTTPS for every request: urllib.request.urlopen('https://api.example.com'). Replace each http:// URL with its https:// equivalent and confirm that the server actually serves HTTPS. Treat HTTPS as mandatory for external services. On current Python versions urlopen() validates the server certificate and hostname by default for HTTPS URLs.
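Finding every http:// URL by hand is error-prone, so a small scan can help. The following is a minimal sketch, not a complete linter: find_http_literals is a hypothetical helper name, and it only reports string literals that start with http://, so URLs assembled at runtime or loaded from configuration are not detected.

```python
import ast


def find_http_literals(source: str, filename: str = "<string>"):
    """Return sorted (line, value) pairs for string literals starting with 'http://'."""
    tree = ast.parse(source, filename=filename)
    hits = []
    for node in ast.walk(tree):
        # ast.Constant covers plain string literals (Python 3.8+)
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            if node.value.lower().startswith("http://"):
                hits.append((node.lineno, node.value))
    return sorted(hits)


if __name__ == "__main__":
    import sys

    # Usage: python scan_http.py file1.py file2.py ...
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            for line, value in find_http_literals(handle.read(), path):
                print(f"{path}:{line}: insecure URL literal {value!r}")
```

Each reported line is a candidate for switching to https://; docstrings and comments-as-strings that mention http:// are reported too and need manual review.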

2. Validate URL Schemes Before urlopen(), Reject HTTP

Check the scheme before opening: from urllib.parse import urlparse; if urlparse(url).scheme != 'https': raise ValueError('HTTPS required'); urlopen(url). Validate every URL that comes from configuration, the environment, or user input. Allowlisting the https scheme also rejects file:// and ftp:// URLs, which urlopen() would otherwise open. Fail fast on any other scheme.
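A scheme check on the initial URL does not cover redirects, because urlopen() follows them by default and an HTTPS endpoint can redirect to HTTP. The sketch below shows one possible way to cover both cases; require_https, HTTPSOnlyRedirectHandler, and open_https_only are hypothetical names introduced for illustration, and the approach assumes that refusing the redirect outright is acceptable for the application.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def require_https(url: str) -> str:
    """Return url unchanged if it is an https URL with a host, else raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError(f"HTTPS URL required, got scheme {parts.scheme!r}")
    return url


class HTTPSOnlyRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow redirects to non-HTTPS URLs."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        if urlsplit(newurl).scheme != "https":
            raise urllib.error.URLError(
                f"Refusing redirect to non-HTTPS URL: {newurl}"
            )
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def open_https_only(url: str, timeout: float = 10):
    """Open url only if it, and every redirect target, uses HTTPS."""
    # Passing a subclass replaces the default HTTPRedirectHandler
    opener = urllib.request.build_opener(HTTPSOnlyRedirectHandler())
    return opener.open(require_https(url), timeout=timeout)
```

Callers would use open_https_only(url) in place of urlopen(url) and handle ValueError for rejected input and urllib.error.URLError for rejected redirects or network failures.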

3. Migrate from urllib to requests Library for Better Security

Replace urllib with requests: import requests; response = requests.get('https://api.example.com', verify=True). requests verifies certificates by default (verify=True is the default value), offers a simpler API and clearer error handling, and provides connection pooling through Session objects. It does not enforce HTTPS by itself, so the scheme check from fix 2 still applies after migrating.

4. Configure TLS Settings Explicitly if Using urllib

If urllib is required, configure TLS explicitly: import ssl; context = ssl.create_default_context(); context.minimum_version = ssl.TLSVersion.TLSv1_2; urlopen(url, context=context). Set a minimum TLS version, keep certificate and hostname verification enabled, and restrict weak ciphers where policy requires it. An explicit context makes the connection's security properties visible in the code instead of depending on interpreter defaults.
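The settings above can be gathered in one helper so every call site shares the same context. This is a minimal sketch: make_strict_context and fetch are hypothetical names, the cipher string is an example policy rather than a recommendation, and set_ciphers() only affects TLS 1.2 and earlier cipher suites.

```python
import ssl
import urllib.request


def make_strict_context() -> ssl.SSLContext:
    """Build a client context with verification on and TLS 1.2 as the floor."""
    # create_default_context() enables CERT_REQUIRED and hostname checking
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # Example policy (assumption): forward-secret AEAD suites only
    context.set_ciphers("ECDHE+AESGCM:ECDHE+CHACHA20")
    return context


def fetch(url: str, timeout: float = 10) -> bytes:
    """Fetch url over HTTPS using the strict context."""
    context = make_strict_context()
    with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
        return response.read()
```

Note that the context only governs HTTPS connections; fetch() would still open an http:// URL, so it should be combined with a scheme check.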

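5. Use Certificate Pinning for Critical Services

Pin trust for sensitive APIs: import ssl; context = ssl.create_default_context(cafile='/path/to/cert.pem'); urlopen(url, context=context). Passing cafile makes the context trust only the certificates in that file instead of the system CA store, which limits exposure to a compromised public CA. Strictly speaking, this pins a CA (or a self-signed server certificate) rather than one specific leaf certificate. Consider it for authentication or payment APIs, and plan for certificate rotation.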
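Where pinning a single server certificate is wanted instead of a CA, one common approach is to compare the SHA-256 fingerprint of the certificate presented on the connection with a known value. The sketch below is an assumption-laden illustration, not part of the original guidance: matches_pin and pinned_get are hypothetical names, it uses http.client instead of urlopen() so the check runs on the same connection that carries the request, and the expected fingerprint must be obtained out of band.

```python
import hashlib
import hmac
import http.client
import ssl


def matches_pin(der_cert: bytes, expected_sha256_hex: str) -> bool:
    """Compare a DER certificate's SHA-256 fingerprint with the pinned hex value."""
    actual = hashlib.sha256(der_cert).hexdigest()
    # Accept 'AA:BB:...' or 'aabb...' formatting for the pinned value
    expected = expected_sha256_hex.replace(":", "").lower()
    return hmac.compare_digest(actual, expected)


def pinned_get(host: str, path: str, expected_sha256_hex: str,
               timeout: float = 10.0) -> bytes:
    """GET https://host/path only if the server certificate matches the pin."""
    context = ssl.create_default_context()  # normal CA validation still applies
    conn = http.client.HTTPSConnection(host, timeout=timeout, context=context)
    try:
        conn.connect()
        # Certificate presented on this exact connection, in DER form
        der_cert = conn.sock.getpeercert(binary_form=True)
        if not matches_pin(der_cert, expected_sha256_hex):
            raise ssl.SSLError("Certificate fingerprint does not match the pin")
        conn.request("GET", path)
        return conn.getresponse().read()
    finally:
        conn.close()
```

A fingerprint pin breaks whenever the server renews its certificate, so deployments that use this pattern typically ship a backup pin and a rotation procedure.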

6. Update Documentation and Examples to Use HTTPS

Change all examples to HTTPS: # Example: urlopen('https://api.example.com'). Documentation and README files should never show HTTP for network requests, and code review should reject HTTP URLs in examples. Secure examples prevent copy-paste vulnerabilities.
