Security Write-Up · Project #01
PhishGuard
Email & SMS Threat Analyzer

A browser-based cybersecurity tool that analyzes phishing emails and SMS smishing messages in real time — detecting threats, scoring risk, and generating downloadable investigation reports.

Live on GitHub Pages
Email + SMS Analysis
Mobile Responsive
No Server Required
01 · Overview

Problem Statement

Phishing and smishing are among the most damaging forms of cyberattack. Most users cannot read email headers, recognize domain spoofing, or spot psychological manipulation in text messages. PhishGuard automates the investigation and presents its findings in plain language, and it runs in any browser, including on mobile.

7 Detection Stages
5 Keyword Categories
0 Server Dependencies
2 Analyzers (Email & SMS)
02 · Detection Engine

How It Works

PhishGuard runs a seven-stage detection pipeline on each email. Structural indicators are weighted more heavily than keyword matches — they require deliberate construction and are far harder to accidentally trigger in legitimate email.

Stage 01 (Parsing): Email Header Extraction

Raw email source is parsed in the browser. Key fields are extracted: From, To, Subject, Reply-To, Date, Received headers, and message body. The Received chain, which each relaying mail server adds to, is usually the most reliable record of an email's origin and often differs from what the From field displays.
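The extraction step can be sketched roughly as follows. This is a minimal illustration, not PhishGuard's actual parser: `parseRawEmail` is a hypothetical name, and the sketch assumes the raw source uses the usual layout of a header block, a blank line, then the body, with long headers folded onto continuation lines that start with whitespace.

```javascript
// Hypothetical helper (not from the PhishGuard source): split raw email
// text into a header map and a body string.
function parseRawEmail(raw) {
  const normalized = raw.replace(/\r\n/g, "\n");
  const split = normalized.indexOf("\n\n");
  const headerBlock = split === -1 ? normalized : normalized.slice(0, split);
  const body = split === -1 ? "" : normalized.slice(split + 2);

  // Unfold continuation lines: a line that starts with a space or tab
  // belongs to the header on the previous line.
  const unfolded = headerBlock.replace(/\n[ \t]+/g, " ");

  // Header names repeat (Received appears once per hop), so each name
  // maps to an array of values. Names are lowercased for lookup.
  const headers = Object.create(null);
  for (const line of unfolded.split("\n")) {
    const colon = line.indexOf(":");
    if (colon <= 0) continue; // not a "Name: value" line
    const name = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();
    if (!headers[name]) headers[name] = [];
    headers[name].push(value);
  }
  return { headers, body };
}
```

Keeping every Received value in an array preserves the hop order, which the later IP extraction stage relies on.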

Stage 02 (Structural): Display Name Alarm Detection

The sender's display name is checked against psychological alarm vocabulary: hacked, suspended, urgent, warning, bitcoin, payment required. Legitimate organizations rarely use panic language in display names, so this is treated as a high-confidence structural signal weighted at +15 per finding.
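A minimal sketch of this check, using the alarm vocabulary and the +15 weight stated above. The function names are hypothetical, and plain substring matching is an assumption; the real tool may match differently.

```javascript
// Vocabulary and weight taken from the write-up; everything else is illustrative.
const ALARM_WORDS = ["hacked", "suspended", "urgent", "warning", "bitcoin", "payment required"];
const ALARM_POINTS = 15;

// Pull the display name out of a From value such as `"Name" <user@host>`.
// Returns "" when the header holds only a bare address.
function extractDisplayName(fromHeader) {
  const m = fromHeader.match(/^\s*"?([^"<]*)"?\s*</);
  return m ? m[1].trim() : "";
}

// Score the display name: +15 for every alarm word it contains.
function scoreDisplayName(fromHeader) {
  const name = extractDisplayName(fromHeader).toLowerCase();
  const hits = ALARM_WORDS.filter((word) => name.includes(word));
  return { hits, points: hits.length * ALARM_POINTS };
}
```

Because the match is a case-insensitive substring test, a display name like "You've been HACKED" produces one hit for "hacked".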

Stage 03 (Structural): Brand Impersonation Check

The display name is cross-referenced against major brands — PayPal, Amazon, Google, Apple, Microsoft. If a brand name appears in the display name but the sending domain does not match the official domain, the discrepancy is flagged as spoofing.
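One way to express the brand cross-reference is sketched below. The brand list comes from the paragraph above, but the mapping to official domains, the helper names, and the rule that subdomains of an official domain count as legitimate are all assumptions made for illustration.

```javascript
// Assumed brand-to-domain mapping (illustrative, not exhaustive).
const BRAND_DOMAINS = {
  paypal: ["paypal.com"],
  amazon: ["amazon.com"],
  google: ["google.com"],
  apple: ["apple.com"],
  microsoft: ["microsoft.com"],
};

// Return the lowercased domain part of a bare address like user@host.
function domainOf(address) {
  const at = address.lastIndexOf("@");
  return at === -1 ? "" : address.slice(at + 1).trim().toLowerCase();
}

// A domain is official if it equals, or is a subdomain of, a listed domain.
// The leading dot stops "evilpaypal.com" from passing as "paypal.com".
function isOfficial(domain, officialDomains) {
  return officialDomains.some((d) => domain === d || domain.endsWith("." + d));
}

// List every brand named in the display name whose official domain
// does not match the sending domain.
function findImpersonatedBrands(displayName, senderAddress) {
  const name = displayName.toLowerCase();
  const domain = domainOf(senderAddress);
  return Object.keys(BRAND_DOMAINS).filter(
    (brand) => name.includes(brand) && !isOfficial(domain, BRAND_DOMAINS[brand])
  );
}
```

Substring matching keeps the sketch short but would also flag a name such as "Pineapple Farms" for "apple", so a production check would likely match on word boundaries.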

Stage 04 (Structural): Reply-To Mismatch Detection

A common phishing technique sets a legitimate-looking From address while routing replies to an attacker-controlled address. Domain mismatch between From and Reply-To is a strong structural indicator (+15 points). Example: From: support@paypal.com / Reply-To: attacker@gmail.com.
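The comparison can be sketched as below, under the assumption that domains are compared as exact lowercase strings. The helper names are hypothetical; only the +15 weight comes from the write-up.

```javascript
// Extract the domain from a header value such as `Name <user@host>` or `user@host`.
function addressDomain(headerValue) {
  const m = headerValue.match(/<([^>]+)>/);
  const addr = (m ? m[1] : headerValue).trim();
  const at = addr.lastIndexOf("@");
  return at === -1 ? "" : addr.slice(at + 1).toLowerCase();
}

// Flag a mismatch only when both headers are present and both yield a domain.
function replyToMismatch(from, replyTo) {
  if (!replyTo) return { mismatch: false, points: 0 };
  const fromDomain = addressDomain(from);
  const replyDomain = addressDomain(replyTo);
  const mismatch =
    fromDomain !== "" && replyDomain !== "" && fromDomain !== replyDomain;
  return { mismatch, points: mismatch ? 15 : 0 };
}
```

With the example from the text, `replyToMismatch("support@paypal.com", "attacker@gmail.com")` reports a mismatch worth 15 points. Exact comparison would also flag `mail.example.com` against `example.com`, so comparing registrable domains may be a better fit for real traffic.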

Stage 05 (Forensic): IP Address Extraction

Regex extracts IPv4 addresses from Received headers with two validation layers — rejecting octets with leading zeros (preventing false positives from date strings like 04.04.01.02) and deduplicating across all routing hops. Private IPs in public routing chains indicate possible header forgery.
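The two validation layers might look like the sketch below. The regex and function names are illustrative rather than PhishGuard's own, and "private" is assumed here to mean the RFC 1918 ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).

```javascript
// Candidate pattern: four dot-separated groups of 1 to 3 digits.
const IPV4_CANDIDATE = /\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\b/g;

// Layer 1: reject octets with a leading zero (e.g. "04") or a value above 255.
function isValidOctet(s) {
  if (s.length > 1 && s[0] === "0") return false;
  return Number(s) <= 255;
}

// RFC 1918 private ranges (assumed definition of "private").
function isPrivate(ip) {
  const [a, b] = ip.split(".").map(Number);
  return a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168);
}

// Layer 2: deduplicate across every Received header, keeping first-seen order.
function extractIps(receivedHeaders) {
  const seen = new Set();
  for (const header of receivedHeaders) {
    for (const m of header.matchAll(IPV4_CANDIDATE)) {
      const octets = m.slice(1, 5);
      if (octets.every(isValidOctet)) seen.add(m[0]);
    }
  }
  return [...seen].map((ip) => ({ ip, private: isPrivate(ip) }));
}
```

A string like `04.04.01.02` matches the candidate pattern but fails the leading-zero check, so it never reaches the result list.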

Stage 06 (Content): Five-Category Keyword Scanner

The subject and body are scanned across five psychological manipulation categories: Urgency, Financial Lure, Threat, Credential Harvesting, and Extortion/Sextortion. Bitcoin wallet addresses are detected via regex. The extortion category auto-escalates any email with 2+ indicators to a minimum score of 75.
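A rough sketch of the scanner and the escalation rule follows. The five category names and the 75-point floor come from the text; the sample vocabulary, the wallet pattern (an approximation that does not validate checksums), and the choice to count a wallet address as one extortion indicator are assumptions.

```javascript
// Sample vocabulary only; the real lists are not given in the write-up.
const CATEGORIES = {
  urgency: ["immediately", "48 hours", "act now"],
  financialLure: ["you have won", "refund", "prize"],
  threat: ["account will be closed", "legal action"],
  credentialHarvesting: ["verify your account", "confirm your password"],
  extortion: ["recorded you", "your webcam", "hacking group"],
};

// Approximate Bitcoin address shapes: bech32 ("bc1...") or base58 starting with 1 or 3.
const BTC_ADDRESS = /\b(?:bc1[a-z0-9]{25,62}|[13][a-km-zA-HJ-NP-Z1-9]{25,34})\b/;

function scanKeywords(subject, body) {
  const original = subject + "\n" + body;
  const text = original.toLowerCase();
  const hits = {};
  for (const [category, words] of Object.entries(CATEGORIES)) {
    const found = words.filter((w) => text.includes(w));
    if (found.length > 0) hits[category] = found;
  }
  // Wallet addresses are case-sensitive, so test the original text.
  const hasWallet = BTC_ADDRESS.test(original);
  const extortionIndicators =
    (hits.extortion ? hits.extortion.length : 0) + (hasWallet ? 1 : 0);
  return { hits, hasWallet, extortionIndicators };
}

// Auto-escalation: two or more extortion indicators force a minimum score of 75.
function applyExtortionFloor(score, extortionIndicators) {
  return extortionIndicators >= 2 ? Math.max(score, 75) : score;
}
```

The floor only ever raises a score, so an email that already scores above 75 on other signals is left unchanged.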

Stage 07 (Content): URL Analysis

URLs extracted from the email body are checked for: suspicious TLDs (.xyz, .ru, .tk, .ml), URL shorteners hiding the true destination, IP addresses used as domains, excessive subdomains, redirect parameters, and brand name spoofing within non-official domains.
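The URL checks can be sketched with the standard `URL` constructor. Only the four TLDs come from the write-up; the shortener hosts, redirect parameter names, brand-to-domain mapping, the "more than four labels" threshold for excessive subdomains, and all function names are assumptions for illustration.

```javascript
const SUSPICIOUS_TLDS = [".xyz", ".ru", ".tk", ".ml"];
const SHORTENER_HOSTS = ["bit.ly", "tinyurl.com", "t.co"]; // illustrative
const REDIRECT_PARAMS = ["url", "redirect", "next", "goto"]; // illustrative
const OFFICIAL_DOMAINS = {
  paypal: "paypal.com",
  amazon: "amazon.com",
  google: "google.com",
  apple: "apple.com",
  microsoft: "microsoft.com",
};

// Pull http(s) URLs out of a message body.
function extractUrls(body) {
  return body.match(/https?:\/\/[^\s<>"')]+/g) || [];
}

// Return a list of flag names for one URL; an empty list means nothing suspicious.
function analyzeUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return ["unparseable"];
  }
  const host = url.hostname.toLowerCase();
  const flags = [];
  if (SUSPICIOUS_TLDS.some((tld) => host.endsWith(tld))) flags.push("suspicious-tld");
  if (SHORTENER_HOSTS.includes(host)) flags.push("shortener");
  if (/^\d{1,3}(\.\d{1,3}){3}$/.test(host)) flags.push("ip-host");
  if (host.split(".").length > 4) flags.push("excessive-subdomains");
  if (REDIRECT_PARAMS.some((p) => url.searchParams.has(p))) flags.push("redirect-param");
  for (const [brand, official] of Object.entries(OFFICIAL_DOMAINS)) {
    const isOfficialHost = host === official || host.endsWith("." + official);
    if (host.includes(brand) && !isOfficialHost) flags.push("brand-spoof:" + brand);
  }
  return flags;
}
```

Under the +8 per flag weighting listed in the scoring breakdown, a URL's contribution would be `analyzeUrl(u).length * 8`.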

Scoring Breakdown
Alarm words in display name: +15 each
Reply-To domain mismatch: +15
Brand impersonation: +15
Private IP in routing: +8 each
Extortion keywords (2+): auto ≥75
Credential harvesting: +10 per category
Suspicious URL flags: +8 per flag

Risk Verdict
70 – 100: HIGH RISK
40 – 69: MEDIUM RISK
15 – 39: LOW RISK
0 – 14: CLEAN
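The weights and verdict bands above can be combined as in the sketch below. The shape of the input object is hypothetical, the +10 weight is assumed to apply per matched keyword category, and capping the total at 100 is an assumption implied by the verdict range rather than something the write-up states.

```javascript
// `signals` is a hypothetical summary of what the earlier stages found.
function computeScore(signals) {
  let score =
    15 * signals.alarmWords +                  // alarm words in display name
    (signals.replyToMismatch ? 15 : 0) +       // Reply-To domain mismatch
    (signals.brandImpersonation ? 15 : 0) +    // brand impersonation
    8 * signals.privateIps +                   // private IPs in routing
    10 * signals.keywordCategories +           // matched keyword categories (assumed)
    8 * signals.urlFlags;                      // suspicious URL flags
  // Auto-escalation: two or more extortion indicators guarantee at least 75.
  if (signals.extortionIndicators >= 2) score = Math.max(score, 75);
  return Math.min(score, 100);
}

// Map a 0 to 100 score onto the four verdict bands.
function verdict(score) {
  if (score >= 70) return "HIGH RISK";
  if (score >= 40) return "MEDIUM RISK";
  if (score >= 15) return "LOW RISK";
  return "CLEAN";
}
```

For instance, one alarm word plus one keyword category sums to 25, which is LOW RISK on its own; with two or more extortion indicators the floor lifts it to 75 and the verdict becomes HIGH RISK.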
03 · Real-World Test

Tested on a Real Phishing Email

The tool was tested against a real sextortion/extortion scam received in a personal inbox — a sophisticated attack using psychological panic tactics and a Bitcoin payment demand.

📧 Information about your online security.eml
Score: 85 / 100 (HIGH RISK)
From: "You've been HACKED" <kfixc@kawachi.zaq.ne.jp>
Subject: Information about your online security
IP Found: 222.227.81.164 (Public)
Reply-To: Not found
Indicators Detected
Alarming display name Extortion keywords Bitcoin wallet address recorded you your webcam hacking group 48 hours payment immediately
Outcome

The extortion category auto-escalated the score when two or more extortion-specific indicators were found, correctly classifying this sextortion scam as HIGH RISK. Before the dedicated extortion category was added, this email scored only 8 out of 100, a critical detection gap that is now resolved.

04 · Feature Set

Full Capabilities

Email Header Forensics

Extracts and analyzes From, To, Subject, Reply-To, Date, and the full Received routing chain from .eml files or pasted raw source.

SMS Smishing Analyzer

Dedicated analyzer for text message phishing with 5 detection categories, URL scanning, phone UI preview, and actionable advice.

Extortion Detection

Dedicated sextortion/extortion category with auto-escalation scoring. Bitcoin wallet address detection via regex pattern matching.

IP Forensics

Extracts IPs from routing headers with false-positive filtering. Detects private IPs in public routing chains, a possible indicator of header forgery.

URL Scanner

Checks for suspicious TLDs, URL shorteners, IP-based domains, excessive subdomains, redirect parameters, and brand name spoofing.

Investigation Report

Generates a downloadable HTML investigation report with full findings, scoring breakdown, and risk verdict. Print-to-PDF supported.

Full Mobile Support

Bottom navigation bar, slide-up drawers, floating action button (FAB), and Email/SMS mode toggle, designed for full usability on any mobile device.

Keyword Detection

Five-category phishing language scanner: Urgency, Financial Lure, Threat, Credential Harvesting, and Extortion across both email and SMS.

No Backend Required

Runs entirely client-side in the browser. No server, no API, no data uploaded anywhere. Hosted free on GitHub Pages.
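As an illustration of how the Investigation Report feature above can work without a backend, the sketch below builds the report as an HTML string and offers it as a download through a Blob URL. The function names and the shape of the `report` object are hypothetical, and `downloadReport` only runs in a browser because it needs `document`.

```javascript
// Escape text before placing it in HTML, since email content is untrusted.
function escapeHtml(value) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(value).replace(/[&<>"']/g, (c) => map[c]);
}

// Build a standalone HTML report from a hypothetical result object:
// { subject: string, score: number, verdict: string, findings: string[] }
function buildReportHtml(report) {
  const items = report.findings.map((f) => `<li>${escapeHtml(f)}</li>`).join("");
  return (
    `<!DOCTYPE html><html><head><meta charset="utf-8">` +
    `<title>PhishGuard Report</title></head><body>` +
    `<h1>${escapeHtml(report.subject)}</h1>` +
    `<p>Score: ${Number(report.score)} / 100 (${escapeHtml(report.verdict)})</p>` +
    `<ul>${items}</ul></body></html>`
  );
}

// Browser-only: trigger a client-side download of the generated report.
function downloadReport(report, filename = "phishguard-report.html") {
  const blob = new Blob([buildReportHtml(report)], { type: "text/html" });
  const href = URL.createObjectURL(blob);
  const link = document.createElement("a");
  link.href = href;
  link.download = filename;
  document.body.appendChild(link);
  link.click();
  link.remove();
  URL.revokeObjectURL(href);
}
```

Escaping matters here because the subject and findings come straight from a suspicious message, and the report is itself an HTML page that the user will open.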

05 · Stack

Technology

Built entirely with vanilla web technologies — no frameworks, no build tools, no dependencies. The entire application ships as a single HTML file.

HTML5 / CSS3 Structure & Design
Vanilla JavaScript Detection Engine
SVG Icons No icon libraries
GitHub Pages Free Hosting
CSS Flexbox / Grid Responsive Layout
Regex IP & Wallet Detection
Bebas Neue + Syne + JetBrains Mono Typography
06 · Engineering

Challenges Solved

C1 · Sextortion Detection Gap

Standard keyword lists failed to detect sextortion emails, which scored only 8 out of 100 despite being clear threats. Solved by adding a dedicated extortion category with targeted vocabulary and an auto-escalation rule that guarantees a score of at least 75 when two or more extortion-specific indicators are found.

C2 · False Positive IP Extraction

Timestamps in email headers (e.g. Sat, 4 Apr 2026 01:02:27) can produce digit sequences that match a naive IPv4 pattern; the initial regex extracted 04.04.01.02 as a valid IP. Solved by rejecting octets with leading zeros and deduplicating across the full Received header chain.

C3 · Mobile Accessibility

The original tool required Google Colab, which was unusable on mobile. Rebuilt as a pure HTML/CSS/JS application with a dedicated mobile layout: bottom navigation bar, slide-up drawers for input, and a floating action button (FAB). The Email/SMS mode toggle is accessible from both the sidebar and the mobile drawer.

C4 · Scoring Calibration

Simple keyword matching produces too many false positives — a legitimate urgent business email would score high. Solved by weighting structural indicators (display name, Reply-To, private IPs) more heavily than keyword matches, as these require deliberate construction.

C5 · Base64 Browser Parser Crash

An attempt to embed the logo as a 41,000-character base64 JPEG caused Safari's HTML parser to choke, rendering the entire script block as visible page text. Solved by replacing all base64 assets with inline SVGs — smaller, faster, and parser-safe across all browsers.

07 · Learnings

What I Learned

How email headers expose the true origin of a message, independent of what the From field displays

How to write robust regex patterns handling real-world edge cases like date-string false positives

How weighted scoring systems balance sensitivity and specificity in automated threat detection

How sextortion scams differ structurally from standard phishing, requiring dedicated detection categories

How to design mobile-first web applications with slide-up drawers, floating action buttons, and bottom navigation

How browser HTML parsers fail under malformed inline data, and how to build defensively with SVGs

How to take a Python/Colab prototype and rebuild it as a zero-dependency production web application

How smishing (SMS phishing) differs from email phishing in structure, vocabulary, and detection approach

Try It Live

See It In Action

Upload a .eml file or paste a suspicious SMS — PhishGuard analyzes it instantly in your browser.
