Building a Mini SIEM in Python – Detecting Brute Force & Web Scanning Attacks

In modern cybersecurity operations, log analysis is one of the most important defensive skills. Security Information and Event Management (SIEM) systems collect logs from multiple sources and detect suspicious patterns.

To better understand how SIEM tools work, I built a Mini SIEM simulation in Python that analyzes SSH and Apache logs to detect potential attacks.

🎯 Objective

The goal of this project was to:

Detect brute-force login attempts from SSH logs
Identify suspicious successful logins after repeated failures
Detect web scanning activity from Apache logs
Generate structured alerts based on attack patterns

This project focuses on blue team detection logic, not just log parsing.

🔎 Understanding the Threats

1️⃣ SSH Brute Force Attacks

Attackers often make many login attempts within a short time window to guess credentials.

Indicators:

Multiple failed login attempts from the same IP
Attempts occurring rapidly
A successful login after many failures (possible compromise)
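
Before any of these indicators can be counted, each log line has to be turned into a timestamp, an IP, and an outcome. Here is a minimal sketch of that step. It assumes the common OpenSSH syslog line format, and the names `SSH_EVENT` and `parse_ssh_line` are made up for illustration rather than taken from the repository.

```python
import re
from datetime import datetime

# Assumed OpenSSH syslog format, e.g.
# "Jan 10 12:00:01 host sshd[1234]: Failed password for root from 203.0.113.5 port 22 ssh2"
SSH_EVENT = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ sshd\[\d+\]: "
    r"(?P<result>Failed|Accepted) password for (?:invalid user )?"
    r"(?P<user>\S+) from (?P<ip>\d{1,3}(?:\.\d{1,3}){3})"
)

def parse_ssh_line(line, year):
    """Return (timestamp, ip, user, result), or None for unrelated lines."""
    m = SSH_EVENT.search(line)
    if m is None:
        return None
    # Syslog timestamps carry no year, so the caller has to supply one.
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return ts, m["ip"], m["user"], m["result"].lower()
```

Lines that are not password successes or failures (key-based logins, disconnects, and so on) simply return `None` in this sketch, so a real parser would likely need more patterns.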

2️⃣ Web Scanning Attacks

Attackers frequently probe web servers for:

/wp-login.php
/.env
/phpmyadmin
/wp-admin

These paths are common targets in automated scanning tools.

Indicators:

Multiple suspicious path requests
A high number of 4xx (client error) responses
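
Both indicators boil down to small predicates on a single request. The sketch below shows one way to write them. The helper names are hypothetical, and matching "the path itself or anything under it" is my assumption about what should count as a hit.

```python
# The four probe targets listed above.
SUSPICIOUS_PATHS = ("/wp-login.php", "/.env", "/phpmyadmin", "/wp-admin")

def is_suspicious_path(path):
    """True if the request targets a listed path or something beneath it."""
    # Drop the query string and normalise case before comparing.
    path = path.split("?", 1)[0].lower()
    return any(path == p or path.startswith(p + "/") for p in SUSPICIOUS_PATHS)

def is_client_error(status):
    """True for any 4xx HTTP status code."""
    return 400 <= status <= 499
```

For example, `is_suspicious_path("/wp-admin/install.php")` is true, while `/index.html` is not flagged.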

🛠 Implementation Overview

The project was built using:

Python 3
Regular Expressions (regex)
datetime for time-window detection
collections.deque for rolling window tracking
CLI-based alert output

🔐 Brute Force Detection Logic

The key idea was implementing a rolling 60-second time window.

For each failed login attempt:

Store the timestamp per IP
Remove timestamps older than 60 seconds
If ≥ 5 failures occur within that window → trigger alert

Example logic:

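from collections import defaultdict, deque
from datetime import timedelta

recent = defaultdict(deque)  # IP -> timestamps of its recent failed logins

# Run for every failed login from `ip` at `timestamp` (a datetime)
q = recent[ip]
q.append(timestamp)

# Drop failures that fall outside the 60-second window
cutoff = timestamp - timedelta(seconds=60)
while q and q[0] < cutoff:
    q.popleft()

if len(q) >= 5:
    alert("Possible brute force detected")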

This simulates how real SIEM systems correlate repeated events within a time frame.
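
To show the window logic end to end, here is a self-contained sketch that also covers the "successful login after repeated failures" case. It is not the repository's code: `detect_ssh_attacks`, the event tuple layout, and the choice to alert only once per IP are all assumptions made for the example.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(seconds=60)
THRESHOLD = 5

def detect_ssh_attacks(events):
    """events: (timestamp, ip, result) tuples in time order,
    where result is 'failed' or 'accepted'. Returns a list of alerts."""
    recent = defaultdict(deque)  # IP -> failure timestamps inside the window
    flagged = set()              # IPs that already raised a brute-force alert
    alerts = []
    for ts, ip, result in events:
        q = recent[ip]
        # Expire failures older than the rolling window.
        cutoff = ts - WINDOW
        while q and q[0] < cutoff:
            q.popleft()
        if result == "failed":
            q.append(ts)
            if len(q) >= THRESHOLD and ip not in flagged:
                flagged.add(ip)
                alerts.append((ts, ip, "brute_force"))
        elif result == "accepted" and ip in flagged:
            # A success from an IP that was just brute-forcing is high priority.
            alerts.append((ts, ip, "success_after_failures"))
    return alerts
```

Five failures spaced 5 seconds apart trigger a `brute_force` alert on the fifth one, whereas the same five failures spaced 20 seconds apart never fit inside one 60-second window and stay silent. Because flagged IPs are never cleared here, a longer-running version would want to expire them too.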

🌐 Web Scanning Detection Logic

For Apache logs:

Extract IP address
Extract requested path
Check against suspicious endpoints
Count 4xx responses per IP

If the number of suspicious-path requests from a single IP exceeds a threshold, an alert is triggered.
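
The steps above could be sketched roughly as follows. This assumes Apache's common/combined access log format. The function name, the threshold values, and the inclusive (`>=`) comparison are illustrative choices, not values from the project.

```python
import re
from collections import Counter

SUSPICIOUS_PATHS = ("/wp-login.php", "/.env", "/phpmyadmin", "/wp-admin")

# Assumed format: IP ident user [time] "METHOD path protocol" status ...
ACCESS_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"[A-Z]+ (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)

def detect_web_scans(lines, path_threshold=3, error_threshold=10):
    """Return one alert dict per IP that reaches either threshold."""
    suspicious_hits = Counter()
    client_errors = Counter()
    for line in lines:
        m = ACCESS_LINE.match(line)
        if m is None:
            continue  # skip lines that do not look like access log entries
        ip, path, status = m["ip"], m["path"].lower(), int(m["status"])
        if path.startswith(SUSPICIOUS_PATHS):  # plain prefix match
            suspicious_hits[ip] += 1
        if 400 <= status < 500:
            client_errors[ip] += 1
    alerts = []
    for ip in sorted(set(suspicious_hits) | set(client_errors)):
        if (suspicious_hits[ip] >= path_threshold
                or client_errors[ip] >= error_threshold):
            alerts.append({
                "type": "web_scan",
                "ip": ip,
                "suspicious_requests": suspicious_hits[ip],
                "client_errors": client_errors[ip],
            })
    return alerts
```

Unlike the SSH rule, this sketch counts over the whole input rather than a rolling window, which is fine for a single log file but would need a time window for continuous monitoring.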

📸 Sample Output

[Screenshot: example SSH detection output]

[Screenshot: example Apache detection output]

🧠 Key Learnings

Through this project, I strengthened my understanding of:

Log parsing techniques
Event correlation logic
Time-based detection mechanisms
Blue team analytical thinking
Practical SOC alert simulation

It highlighted how powerful simple detection logic can be when structured correctly.

🔮 Future Improvements

Planned upgrades include:

Export alerts to JSON
Generate CSV summary reports
Configurable detection thresholds
Integration with IP reputation APIs
Web-based dashboard using Flask
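
As a rough idea of what the JSON export might look like, here is a small sketch. The `Alert` fields and the `alerts_to_json` helper are hypothetical, since the current project prints alerts to the CLI and does not define this structure.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime

@dataclass
class Alert:
    # Hypothetical alert schema, chosen only for this example.
    timestamp: datetime
    rule: str
    source_ip: str
    detail: str

def alerts_to_json(alerts):
    """Serialise alerts to a JSON array with ISO 8601 timestamps."""
    rows = []
    for a in alerts:
        row = asdict(a)
        # datetime is not JSON-serialisable, so store it as a string.
        row["timestamp"] = a.timestamp.isoformat()
        rows.append(row)
    return json.dumps(rows, indent=2)
```

Keeping alerts as structured objects rather than formatted strings would also make the CSV report and the Flask dashboard easier to build on the same data.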

🔗 GitHub Repository

Source Code:
https://github.com/Sahilcyber-code/log-analyzer-blue-team

🛡️ Final Thoughts

Building a mini SIEM from scratch gave me hands-on experience with how detection engines work behind the scenes. While enterprise SIEM platforms are far more advanced, the core principle remains the same:

Correlate events → Detect patterns → Generate alerts.

Understanding this foundation is essential for any aspiring blue team professional.
