Overview
This project is an AI-powered healthcare insurance fraud detection platform for small and mid-sized health insurers, Medicaid administrators, and third-party claims processors. It combines fraud probability scoring, operational decision routing, business impact modeling, and responsible AI controls in a multi-page Streamlit dashboard.
Business Problem
Healthcare claims teams need more than a fraud score. They need a workflow that can prioritize risk, route borderline cases intelligently, support investigators with explainable outputs, and avoid unsafe automation in critical financial decisions.
Datasets
- Primary dataset: Kaggle Healthcare Provider Fraud Detection Analysis
- Secondary dataset: synthetic Health Insurance Claims Data for Fraud Detection
These datasets were used to model provider and claim risk patterns, generate reusable scoring logic, and stress-test operational routing decisions.
Modeling Approach
- Built provider-level fraud features from claims and beneficiary data
- Compared logistic regression, random forest, and boosting models
- Tuned thresholds for imbalanced fraud detection
- Created reusable scoring logic for provider and claim fraud probability outputs
- Focused on explainability and business-facing interpretation rather than raw model complexity alone
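The threshold-tuning step in the list above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's code: the helper name `best_f1_threshold`, the choice of F1 as the objective, and the toy scores and labels are all hypothetical.

```python
from typing import Sequence, Tuple


def best_f1_threshold(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (threshold, f1) that maximizes F1 when flagging score >= threshold.

    Hypothetical helper: F1 is an assumed objective, picked because plain
    accuracy is misleading when fraud cases are rare.
    """
    best_threshold, best_f1 = 1.0, 0.0
    # Only observed scores can change which cases get flagged.
    for threshold in sorted(set(scores)):
        pairs = list(zip(scores, labels))
        tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
        fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
        fn = sum(1 for s, y in pairs if s < threshold and y == 1)
        if tp == 0:
            continue  # F1 is zero (or undefined) without true positives
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_threshold, best_f1 = threshold, f1
    return best_threshold, best_f1


# Toy data (made up): 2 fraud cases among 10 providers.
scores = [0.05, 0.10, 0.12, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.80]
labels = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
threshold, f1 = best_f1_threshold(scores, labels)
```

On this toy data the selected threshold is 0.40, below the common 0.5 default, which would flag only one of the two fraud cases.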
Decision Automation Engine
The platform includes a decision automation layer that translates fraud probability into operational routing actions:
- Auto Approve
- Secondary Rules Check
- Human Review Queue
- Priority Investigation
- Payment Hold + Mandatory Investigator Review
This makes the system useful for day-to-day fraud operations, not just model experimentation.
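A minimal sketch of how a fraud probability might map to the five routing actions listed above. The action names come from the list; the cut-off values and the `route_claim` function name are assumptions for illustration, since the write-up does not state the actual thresholds.

```python
from bisect import bisect_right

# Assumed cut-offs, for illustration only; the real thresholds are not given.
CUTOFFS = [0.20, 0.40, 0.60, 0.80]

# Routing actions in ascending order of risk, as listed in the write-up.
ACTIONS = [
    "Auto Approve",
    "Secondary Rules Check",
    "Human Review Queue",
    "Priority Investigation",
    "Payment Hold + Mandatory Investigator Review",
]


def route_claim(fraud_probability: float) -> str:
    """Map a fraud probability in [0, 1] to a routing action (hypothetical)."""
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must be between 0 and 1")
    # bisect_right counts the cut-offs at or below the probability, so a
    # score exactly on a cut-off is routed to the stricter action.
    return ACTIONS[bisect_right(CUTOFFS, fraud_probability)]
```

Keeping the cut-offs in one sorted list means a threshold change is a one-line edit, and a probability can never fall between two bands.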
Dashboard Experience
The Streamlit dashboard includes dedicated pages for:
- Overview
- Batch Scoring
- Decision Engine
- Case Review
- Model Insights
- Business Impact
- Responsible AI
- About
The product framing emphasizes investigator support, operational clarity, and executive readability.
Business Impact
The system includes business impact modeling to estimate:
- fraud prevented
- review cost
- net savings
This makes the platform easier to evaluate in terms of operational value, not only model metrics.
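The three estimates can be related with simple arithmetic. The sketch below assumes that net savings equals fraud prevented minus review cost, that every flagged claim is reviewed at a flat cost, and that a configurable share of confirmed fraud is actually prevented. The names `ImpactEstimate`, `estimate_impact`, `cost_per_review`, and `recovery_rate` are hypothetical, and the sample figures are made up.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ImpactEstimate:
    fraud_prevented: float
    review_cost: float
    net_savings: float


def estimate_impact(
    flagged_amounts: Sequence[float],
    confirmed_fraud: Sequence[bool],
    cost_per_review: float,
    recovery_rate: float = 1.0,
) -> ImpactEstimate:
    """Estimate business impact for a batch of flagged claims (assumed model)."""
    if len(flagged_amounts) != len(confirmed_fraud):
        raise ValueError("flagged_amounts and confirmed_fraud must align")
    # Fraud prevented: dollar value of flagged claims confirmed as fraud,
    # scaled by the assumed share that is actually stopped.
    fraud_prevented = recovery_rate * sum(
        amount for amount, is_fraud in zip(flagged_amounts, confirmed_fraud) if is_fraud
    )
    # Review cost: every flagged claim is reviewed, fraudulent or not.
    review_cost = cost_per_review * len(flagged_amounts)
    return ImpactEstimate(fraud_prevented, review_cost, fraud_prevented - review_cost)


# Made-up example: three flagged claims, two confirmed as fraud.
impact = estimate_impact([12000.0, 3000.0, 8000.0], [True, False, True], cost_per_review=150.0)
```

With these made-up inputs, fraud prevented is 20,000, review cost is 450, and net savings is 19,550.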
Responsible AI Controls
- No automatic denial language: no routing action denies a claim outright
- Human review required for critical cases
- Secondary rule checks for borderline cases
- Claims above $10,000 require human review
These controls keep the workflow aligned with responsible AI governance and reduce the risk of unsafe automated decisioning.
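One way to enforce controls like these is a guardrail step applied after the model proposes a routing action. The sketch below is an assumption about how that could look, not the project's implementation: `apply_guardrails` and its constants are hypothetical names, while the action names and the $10,000 limit come from the write-up.

```python
HUMAN_REVIEW_AMOUNT = 10_000  # from the control list: claims above $10,000

# Actions that do not yet involve a human reviewer (names from the write-up).
AUTOMATED_ACTIONS = {"Auto Approve", "Secondary Rules Check"}

# Wording that must never appear in an automated outcome.
DENIAL_WORDS = ("deny", "denial", "denied")


def apply_guardrails(proposed_action: str, claim_amount: float) -> str:
    """Return the action to take after applying the controls (hypothetical)."""
    lowered = proposed_action.lower()
    if any(word in lowered for word in DENIAL_WORDS):
        # No automatic denial: reject any action that reads as a denial.
        raise ValueError(f"automatic denial is not allowed: {proposed_action!r}")
    if claim_amount > HUMAN_REVIEW_AMOUNT and proposed_action in AUTOMATED_ACTIONS:
        # High-value claims are escalated to a person regardless of score.
        return "Human Review Queue"
    return proposed_action
```

Running the guardrail as a separate step keeps the policy rules auditable on their own, independent of how the fraud score was produced.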
Tools and Technologies
- Python
- pandas
- scikit-learn
- Streamlit
- Plotly
- joblib
- provider-level feature engineering
- reusable scoring and decision-routing logic
Why It Stands Out
This project stands out because it connects machine learning with the operational, business, and governance work around it. In one system, it combines:
- applied machine learning
- fraud operations workflow design
- business decision support
- executive dashboard design
- explainability
- responsible AI governance
Future Hybrid Roadmap
Future versions may combine the streaming event pipeline with the machine learning fraud scoring dashboard to create a unified real-time healthcare risk platform.