CS2 Professional Match Analytics Pipeline

Posted on 2025-09-2025

Tags: Python, PostgreSQL, Web Scraping, Streamlit, Data Engineering, ETL

Problem:

Counter-Strike 2 players and esports bettors lack structured, up-to-date data on player performance and match history. HLTV provides rich insights, but its data is fragmented and hard to access or analyze at scale. This project centralizes CS2 performance data and makes it accessible for analysis, betting strategies, and personal improvement.

Approach:

Engineered a modular scraping and parsing pipeline using SeleniumBase to extract match-, player-, and map-level data from HLTV
Structured the data into a PostgreSQL database with:
- Matches, players, maps, and historical team data
- Snapshot tracking for team rosters, player transfers, and aliases
- Composite keys, foreign key constraints, and indexes for query performance
Built a queue-based scraping architecture to support scalable data ingestion
Developed a Streamlit frontend for displaying match stats, trends, and insights
Designed the system to accommodate future demo file parsing and advanced analytics (e.g. positioning, utility usage)
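
To make the queue-based ingestion idea concrete, here is a minimal sketch using only the Python standard library. It is not the project's actual code: `ScrapeTask`, `fetch_stub`, and `run_workers` are hypothetical names, the URLs are placeholders, and `fetch_stub` stands in for the real SeleniumBase fetch-and-parse step so the example runs without a browser or network access.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrapeTask:
    kind: str  # hypothetical task type: "match", "player", or "map"
    url: str   # page the worker would fetch


def fetch_stub(task: ScrapeTask) -> dict:
    # Placeholder for the real fetch-and-parse step (assumption: the actual
    # pipeline would drive SeleniumBase here and return parsed fields).
    return {"kind": task.kind, "url": task.url, "status": "parsed"}


def run_workers(tasks, fetch=fetch_stub, num_workers=2):
    """Drain a task queue with a small pool of worker threads."""
    pending = queue.Queue()
    results = []
    lock = threading.Lock()

    for task in tasks:
        pending.put(task)

    def worker():
        while True:
            try:
                task = pending.get_nowait()
            except queue.Empty:
                return  # queue drained, so this worker exits
            try:
                record = fetch(task)
                with lock:  # protect the shared results list
                    results.append(record)
            finally:
                pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    demo_tasks = [
        ScrapeTask("match", "https://example.invalid/match/1"),
        ScrapeTask("player", "https://example.invalid/player/2"),
    ]
    for record in run_workers(demo_tasks):
        print(record)
```

Decoupling task production from fetching this way means ingestion can be scaled by changing `num_workers`, and a persistent queue could replace the in-memory one later without touching the parsing code.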

Outcome:

Centralized historical CS2 match and player data into a queryable, auditable PostgreSQL schema
Delivered an early-stage Streamlit dashboard for:
- Team and player performance trends
- Map win/loss rates and event metadata
- Scouting insights for players, bettors, or analysts
Built a scalable architecture to support future ML modeling (e.g. win prediction, rating inflation detection)
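
As an illustration of how a schema with composite keys, foreign key constraints, and indexes can back a map win/loss query, here is a small sketch. It makes several assumptions: it uses an in-memory SQLite database as a stand-in for PostgreSQL so it runs without a server, and the table names, column names, and sample rows (`teams`, `matches`, `match_maps`, "Team A") are hypothetical rather than the project's real schema or data.

```python
import sqlite3

# Hypothetical, simplified schema (not the project's actual tables).
SCHEMA = """
CREATE TABLE teams (
    team_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE matches (
    match_id  INTEGER PRIMARY KEY,
    played_on TEXT NOT NULL
);
CREATE TABLE match_maps (
    match_id INTEGER NOT NULL REFERENCES matches(match_id),
    map_name TEXT    NOT NULL,
    team_id  INTEGER NOT NULL REFERENCES teams(team_id),
    won      INTEGER NOT NULL CHECK (won IN (0, 1)),
    PRIMARY KEY (match_id, map_name, team_id)  -- composite key
);
CREATE INDEX idx_match_maps_team_map ON match_maps (team_id, map_name);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO teams VALUES (1, 'Team A')")
    conn.executemany(
        "INSERT INTO matches VALUES (?, ?)",
        [(10, "2025-01-01"), (11, "2025-01-02"), (12, "2025-01-03")],
    )
    conn.executemany(
        "INSERT INTO match_maps VALUES (?, ?, ?, ?)",
        [(10, "Mirage", 1, 1), (11, "Mirage", 1, 0),
         (12, "Mirage", 1, 1), (12, "Nuke", 1, 0)],
    )
    conn.commit()
    return conn


def map_win_rates(conn, team_id):
    """Return {map_name: (maps_played, maps_won, win_rate)} for one team."""
    rows = conn.execute(
        """
        SELECT map_name, COUNT(*) AS played, SUM(won) AS wins
        FROM match_maps
        WHERE team_id = ?
        GROUP BY map_name
        ORDER BY map_name
        """,
        (team_id,),
    ).fetchall()
    return {name: (played, wins, wins / played) for name, played, wins in rows}


if __name__ == "__main__":
    for map_name, stats in map_win_rates(build_demo_db(), team_id=1).items():
        print(map_name, stats)
```

The composite primary key prevents the same team and map from being recorded twice for one match, and the index on `(team_id, map_name)` covers the filter and grouping used by the win-rate query. In PostgreSQL the same DDL would need only minor type adjustments.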

Repo: GitHub - CS2 Analytics

Categories: Machine Learning, Real Estate, Python

Kyle Darden
