Applied ML Analysis

Census Income Prediction Models

A comparative machine learning project exploring classification, clustering, and neural network approaches on the U.S. Census income dataset.

2024 · Completed study · Public documentation
  • Python
  • Scikit-learn
  • TensorFlow
  • Pandas
  • Seaborn
  • Matplotlib

Overview

This project compares several machine learning approaches for predicting whether an adult earns more than $50,000 per year, using the U.S. Census income dataset. It focuses on both model performance and the reasoning behind model selection.

Problem

A strong ML portfolio should show comparison, tuning, and interpretation rather than stopping at a single score. I used this project to work through feature handling, optimization, and how different model families behave on the same dataset.

What I built

  • exploratory analysis across demographic and economic features
  • preprocessing and feature engineering workflows
  • multiple models including logistic regression, random forest, clustering methods, and a neural network
  • evaluation and optimization steps for comparing approaches
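
As an illustration only, the sketch below shows one way the preprocessing, model comparison, and tuning steps listed above could be wired together in scikit-learn. It is not the project's actual code: the data is synthetic, and the column names (age, hours_per_week, education), the label rule, and the parameter grid are all assumptions made for this example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the census data (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "hours_per_week": rng.integers(10, 60, size=n),
    "education": rng.choice(["HS-grad", "Bachelors", "Masters"], size=n),
})

# Hypothetical label rule with noise, standing in for "income > $50K".
score = (
    0.04 * df["age"]
    + 0.05 * df["hours_per_week"]
    + (df["education"] != "HS-grad") * 1.0
    + rng.normal(0, 1, size=n)
)
y = (score > score.median()).astype(int)

numeric = ["age", "hours_per_week"]
categorical = ["education"]


def make_pipeline(model):
    """Scale numeric columns, one-hot encode categorical ones, then fit the model."""
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), numeric),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ])
    return Pipeline([("pre", preprocess), ("model", model)])


models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Compare model families on identical folds using 5-fold cross-validation.
results = {
    name: cross_val_score(make_pipeline(model), df, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}
for name, acc in results.items():
    print(f"{name}: mean CV accuracy = {acc:.3f}")

# Small grid search over the logistic regression regularization strength.
search = GridSearchCV(
    make_pipeline(LogisticRegression(max_iter=1000)),
    param_grid={"model__C": [0.1, 1.0, 10.0]},
    cv=5,
    scoring="accuracy",
)
search.fit(df, y)
print("best C:", search.best_params_["model__C"])
```

Keeping preprocessing inside the pipeline means the scaler and encoder are refit on each training fold, so the cross-validated scores are not inflated by information leaking from the held-out fold.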

Why this project matters

The project demonstrates breadth: data cleaning, visualization, model training, hyperparameter tuning, and a comparison of results written for a stakeholder audience. It is useful as a case study in analytical thinking, not just code output.

Outcome

The result is a practical benchmark project that shows how I reason about trade-offs between different ML techniques and communicate performance in a structured way.