Census Income Prediction Models

A comparative machine learning project exploring classification, clustering, and neural network approaches on the U.S. Census income dataset.

Overview

This project compares different machine learning approaches for predicting whether an adult earns more than $50,000 annually, using the U.S. Census income dataset. It focuses on both model performance and the reasoning behind model selection.

Problem

A strong ML portfolio should show comparison, tuning, and interpretation rather than stopping at a single score. I used this project to work through feature handling, optimization, and how different model families behave on the same dataset.

What I built

exploratory analysis across demographic and economic features
preprocessing and feature engineering workflows
multiple models including logistic regression, random forest, clustering methods, and a neural network
evaluation and optimization steps for comparing approaches
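
As a rough illustration of the comparison step, the sketch below scores several models' predictions against the same held-out labels and ranks them by F1 score. It is not the project's actual code: it uses only the Python standard library, and the names `binary_metrics` and `compare_models`, the choice of F1 as the ranking metric, and the toy labels are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred):
    """Compute metrics for the positive class (1 = income above $50,000)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    # Guard every ratio against an empty denominator.
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return Metrics(accuracy, precision, recall, f1)


def compare_models(y_true, predictions):
    """Score every model on the same labels and rank by F1, best first."""
    scored = {name: binary_metrics(y_true, y_pred)
              for name, y_pred in predictions.items()}
    return sorted(scored.items(), key=lambda item: item[1].f1, reverse=True)


# Toy labels and predictions, invented for illustration (not project results).
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
toy_predictions = {
    "logistic_regression": [1, 0, 0, 0, 0, 0, 1, 0],
    "random_forest": [1, 0, 1, 1, 0, 0, 1, 0],
    "always_under_50k": [0, 0, 0, 0, 0, 0, 0, 0],  # majority-class baseline
}

ranking = compare_models(y_true, toy_predictions)
for name, m in ranking:
    print(f"{name:<20} acc={m.accuracy:.3f} prec={m.precision:.3f} "
          f"rec={m.recall:.3f} f1={m.f1:.3f}")
```

In this toy data the two real models tie on accuracy but differ on precision and recall, and the majority-class baseline reaches 62.5% accuracy while finding no high earners at all. That is the kind of gap a single score hides, and it is why the comparison reports several metrics side by side.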

Why this project matters

The project demonstrates breadth: data cleaning, visualization, model training, hyperparameter tuning, and a side-by-side comparison of results written for a stakeholder audience. It is useful as a case study in analytical thinking, not just code output.

Outcome

The result is a practical benchmark project that shows how I reason about trade-offs between different ML techniques and communicate performance in a structured way.