Graduate Capstone Project

Population & Economic Dynamics in Puerto Rico

A graduate research project examining demographic shifts, out-migration, and sustainable development strategies through linked data analysis and policy synthesis.

Synthesis & Strategic Solutions: Puerto Rico’s demographic and economic challenges reinforce one another.

 

Proposed Predictive & Analytical Framework

To address the convergence of population decline and economic fragility, this project implements a municipal-level predictive data science framework. The approach moves beyond descriptive dashboards to forecast trajectories and quantify the relative importance of fertility, age structure, and hazard exposure.

1. Population Forecasting

Supervised Learning

We employ regularized regression (Ridge/Lasso) and tree-based models (Random Forest, Gradient Boosting) to forecast percent population change. Unlike standard cohort-component projections, this approach combines demographic momentum features (median age, birth rates) with economic covariates and hazard data to separate structural decline from shock-related migration.
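As a rough illustration of how the Lasso penalty can act as a feature selector in this setting, the sketch below implements coordinate descent in plain Python on a tiny toy design. The feature names, the four hypothetical municipalities, and all numbers are assumptions made for illustration; a real pipeline would more likely rely on an established library implementation.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the Lasso proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def fit_lasso(X, y, alpha, n_iter=50):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1.

    Assumes the columns of X and the target y are already centered.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # inner product of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n)
    return w


# Hypothetical standardized features for four municipalities:
# [median_age_z, birth_rate_z, unrelated_z]
X = [
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
# Made-up, centered percent population change (depends on the first two columns only)
y = [-1.0, -3.0, 3.0, 1.0]

weights = fit_lasso(X, y, alpha=0.5)
print(weights)
```

In this toy design the unrelated column receives a weight of exactly zero, while the two informative columns keep shrunken, nonzero weights; that is the selection behavior the penalty is meant to provide.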

2. Driver Analysis & Explainability

Feature Importance

To open the "black box" of prediction, we apply SHAP values and permutation importance. This allows us to rank drivers of change, explicitly comparing the influence of structural factors (low fertility) against policy-amenable factors (employment density, housing vacancy).
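Permutation importance can be sketched in a few lines: shuffle one feature column at a time and measure how much the prediction error grows. The example below is a minimal plain-Python version; `toy_model`, the feature names, and the data are hypothetical stand-ins for a fitted forecaster, and SHAP values are not shown here.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of model predictions against targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, feature_names,
                           n_repeats=10, seed=0):
    """Average increase in MSE after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    importances = {}
    for j, name in enumerate(feature_names):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mse(model, shuffled, targets) - baseline)
        importances[name] = sum(increases) / n_repeats
    return importances


FEATURES = ["employment_density", "housing_vacancy", "elevation"]


def toy_model(row):
    """Hypothetical fitted model; it ignores the elevation feature."""
    employment_density, housing_vacancy, _elevation = row
    return 0.5 * employment_density - 0.8 * housing_vacancy


# Made-up rows, one per municipality, in the order given by FEATURES
rows = [
    [2.0, 1.0, 5.0],
    [4.0, 3.0, 1.0],
    [6.0, 2.0, 4.0],
    [8.0, 6.0, 2.0],
    [1.0, 5.0, 3.0],
    [3.0, 4.0, 6.0],
]
targets = [toy_model(r) for r in rows]  # toy targets the model fits exactly

importance = permutation_importance(toy_model, rows, targets, FEATURES)
for name, value in importance.items():
    print(f"{name}: {value:.3f}")
```

Because the toy model never reads `elevation`, shuffling that column leaves the error unchanged and its importance is zero, while the two features the model uses show a positive error increase.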

3. Municipal Typologies

Unsupervised Learning

Using K-means clustering, we identify distinct municipal profiles that share trajectories. This helps classify municipalities into actionable groups, such as "Low-Fertility/High-Migration" or "Hazard-Exposed/Aging," facilitating targeted rather than blanket policy interventions.
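The clustering step might look like the following minimal K-means sketch in plain Python. The two features, their 0 to 10 index scale, and the six data points are assumptions for illustration; the naive "first k points" initialization is chosen only to keep the example deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=20):
    """Plain K-means with a naive deterministic start (first k points)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical (fertility_index, out_migration_index) pairs on a 0-10 scale
points = [
    (1.0, 9.0), (1.2, 8.5), (0.9, 9.5),   # low fertility, high out-migration
    (8.0, 2.0), (8.5, 1.5), (7.8, 2.2),   # higher fertility, low out-migration
]

labels, centroids = kmeans(points, k=2)
print(labels)
```

With real municipal data the features would need to be scaled to comparable ranges first, since K-means relies on Euclidean distance and is sensitive to units.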

4. Risk Scoring & Scenario Analysis

Policy Simulation

The model outputs feed into a Municipal Demographic Vulnerability Score. We simulate simplified scenarios (e.g., improved employment vs. stagnation) to assess potential resilience even under the constraints of low fertility and demographic momentum.
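One possible shape for such a score is a weighted average of min-max normalized indicators, recomputed under each scenario. The sketch below makes several assumptions that are not in the project description: the three indicators, their bounds, the equal weights, and the scenario values are all hypothetical, and it scores raw indicators directly rather than model outputs.

```python
# Assumed indicator bounds (min, max) and weights; all values are illustrative.
BOUNDS = {
    "pct_65_plus": (10.0, 35.0),
    "unemployment_rate": (2.0, 20.0),
    "hazard_index": (0.0, 1.0),
}
WEIGHTS = {
    "pct_65_plus": 1 / 3,
    "unemployment_rate": 1 / 3,
    "hazard_index": 1 / 3,
}


def normalize(value, low, high):
    """Min-max scale to [0, 1], clipping values outside the assumed bounds."""
    scaled = (value - low) / (high - low)
    return min(1.0, max(0.0, scaled))


def vulnerability_score(indicators):
    """Weighted average of normalized indicators; higher means more vulnerable."""
    return sum(
        WEIGHTS[name] * normalize(indicators[name], *BOUNDS[name])
        for name in WEIGHTS
    )


# Hypothetical municipality under two simplified scenarios
baseline = {"pct_65_plus": 25.0, "unemployment_rate": 11.0, "hazard_index": 0.4}
improved_employment = dict(baseline, unemployment_rate=6.5)

baseline_score = vulnerability_score(baseline)
improved_score = vulnerability_score(improved_employment)
print(f"stagnation: {baseline_score:.3f}")
print(f"improved employment: {improved_score:.3f}")
```

In this setup the employment scenario lowers the score only through the unemployment term; the age-structure term stays fixed, which mirrors the idea that demographic momentum constrains how far policy alone can move the result.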

Framework: Predictive Population Modeling Pipeline

Inputs (Xm,t)

  • Structural: Birth rates, Age structure (% <15, % >65)
  • Resilience: Unemployment, Payroll, Establishment counts
  • Vulnerability: CDC SVI (Poverty, Transportation, Housing)
  • Shocks: Hurricane strikes, Earthquake intensity

Methods

Phase I (Supervised): Lasso regression for feature selection and Random Forest to capture nonlinear “shock cliffs” (abrupt population drops following a hazard event)

Phase II (Unsupervised): K-Means/Hierarchical Clustering for typology profiling

Outputs (Ym,t)

Primary: Population change / Net migration forecast

Secondary: Composite Municipal Demographic Vulnerability Score (0–1)

Goal

Distinguish between “natural” demographic momentum and shock-driven emptying to inform resource allocation for the Puerto Rico Planning Board.

Data Structure (The Input Vector)

For a specific Municipality m at a specific Year t, the data point Xm,t is a structured feature vector containing demographic, economic, social, and hazard variables.

X_{m,t} = [ D_{birth}, D_{age65+}, E_{unemp}, E_{payroll}, S_{pov}, S_{transportation}, H_{hazard} ]
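In code, one municipality-year observation could be represented as a small typed record. The sketch below is one possible layout, assuming the seven components listed above; the class name, field names, units, and sample values are hypothetical and not taken from the project's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MunicipalFeatures:
    """One observation X_{m,t}: municipality m in year t."""
    municipality: str
    year: int
    d_birth: float           # births per 1,000 residents (assumed unit)
    d_age65_plus: float      # percent of population aged 65+
    e_unemp: float           # unemployment rate, percent
    e_payroll: float         # total payroll (assumed: millions of USD)
    s_pov: float             # poverty indicator
    s_transportation: float  # transportation vulnerability indicator
    h_hazard: float          # hazard exposure indicator

    def as_vector(self):
        """Return the numeric components in the order used in X_{m,t}."""
        return [
            self.d_birth, self.d_age65_plus, self.e_unemp, self.e_payroll,
            self.s_pov, self.s_transportation, self.h_hazard,
        ]


# Made-up values for a placeholder municipality
example = MunicipalFeatures(
    municipality="Example Municipio", year=2020,
    d_birth=6.5, d_age65_plus=22.0, e_unemp=9.0, e_payroll=150.0,
    s_pov=0.45, s_transportation=0.30, h_hazard=0.60,
)
print(example.as_vector())
```

Keeping the municipality and year as explicit keys alongside the numeric vector makes it straightforward to stack observations into a panel later.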

Inputs (Xm,t)

  • Structural: Birth rates, Age structure (% <15, % >65)
  • Resilience: Unemployment, Payroll, Establishments
  • Vulnerability: Poverty, Transportation, Housing
  • Shocks: Hurricane strikes, Earthquake intensity

Target (Ym,t)

The population change (or net migration) the model aims to predict — capturing both structural demographic momentum and shock-driven displacement.
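The two candidate targets can be computed from a pair of population counts plus vital statistics. The sketch below uses the standard demographic balancing equation (population change equals natural increase plus net migration) to back out net migration as a residual; the function names and the sample figures are hypothetical.

```python
def percent_population_change(pop_start, pop_end):
    """Percent change in population between two time points."""
    return 100.0 * (pop_end - pop_start) / pop_start


def net_migration_residual(pop_start, pop_end, births, deaths):
    """Net migration implied by the balancing equation:
    (pop_end - pop_start) = (births - deaths) + net_migration.
    """
    return (pop_end - pop_start) - (births - deaths)


# Made-up figures for one municipality over one period
pop_start, pop_end = 100_000, 95_000
births, deaths = 700, 1_100

pct_change = percent_population_change(pop_start, pop_end)
net_migration = net_migration_residual(pop_start, pop_end, births, deaths)
print(pct_change, net_migration)
```

In this example the population falls by 5,000, of which 400 is natural decrease (deaths exceeding births) and the remaining 4,600 is attributed to net out-migration, which is the kind of split the framework is meant to surface.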

Expected Outcomes & Impact

This project aligns with the Puerto Rico Planning Board’s need for evidence-based tools. By explicitly incorporating demographic momentum into a predictive pipeline, we aim to produce actionable intelligence for workforce development, housing, and resilience planning.

Reproducible Pipeline

A fully reproducible, end-to-end data science framework that harmonizes public datasets (Census, NOAA, BLS) into a standardized municipal panel, ready for extension to future years or other regions.
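The harmonization step could be as simple as an outer join of per-source tables on a shared (municipality, year) key. The sketch below illustrates that idea in plain Python; the identifiers, field names, and values are made up, and the three source dictionaries merely stand in for extracts from the public datasets.

```python
def build_panel(*sources):
    """Outer-join several sources keyed by (municipality_id, year)."""
    panel = {}
    for source in sources:
        for key, fields in source.items():
            panel.setdefault(key, {}).update(fields)
    return panel


def missing_fields(panel, required):
    """Report which required fields are absent for each panel key."""
    return {
        key: [name for name in required if name not in row]
        for key, row in panel.items()
        if any(name not in row for name in required)
    }


# Hypothetical extracts, each keyed by (municipality_id, year)
population = {("m01", 2020): {"population": 95_000},
              ("m02", 2020): {"population": 40_000}}
labor = {("m01", 2020): {"unemployment_rate": 9.0},
         ("m02", 2020): {"unemployment_rate": 12.5}}
hazards = {("m01", 2020): {"hurricane_strikes": 1}}

panel = build_panel(population, labor, hazards)
gaps = missing_fields(
    panel, ["population", "unemployment_rate", "hurricane_strikes"]
)
print(panel[("m01", 2020)])
print(gaps)
```

Reporting gaps explicitly, rather than silently dropping incomplete rows, keeps the panel extensible to new years or regions where some sources may lag.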

Structural vs. Shock

Quantitative evidence that helps planners distinguish structural decline (driven by aging and sub-replacement fertility) from migration-driven loss caused by economic shocks or disasters.

Actionable Intelligence

Risk scores and a spatial typology that allow agencies to prioritize place-based strategies by identifying where policy intervention (e.g., job creation) can effectively counter demographic drag.

Resources & Further Reading

  1. Migration is the driving force of rapid aging in Puerto Rico
  2. BLS: Population loss amid economic decline
  3. GAO: Puerto Rico & Section 936 Tax Credit
  4. Senate Finance: The Effect of §936
  5. BEA: Prototype economic statistics for PR, 2012–2017
  6. FRED: Unemployment Rate in Puerto Rico
  7. BLS: LAUS – July 2025
  8. BLS: QCEW – PR & U.S., Q4 2024
  9. Population decline & school closures in PR
  10. PNAS: Migration dynamics after Hurricane Maria
  11. Centro: Post-Maria Exodus estimates
  12. DataUSA: Puerto Rico profile
  13. arXiv: Bayesian APC model for fertility in PR
  14. Census: PR profile
  15. Puerto Rico exodus & long-term economic headwinds
  16. Census: PR profile (alt link)
  17. Center for Puerto Rican Studies: Pervasive Poverty in PR (2023)
  18. NHC/Colorado: Steep Risks—Landslide vulnerability in rural PR
Metrics Overview