Exploring the Impact of Supply-Demand Factors on US Home Prices: A 20-Year Analysis

Utkarsh Singh
16 min read · May 26, 2023

Unraveling the Influence of Supply and Demand Factors on US Home Prices: A 20-Year Data Science Study Utilizing Machine Learning

Key Supply-Demand factors that influence US home prices.

Introduction:

The goal of the study was to examine the main supply-demand factors that affect US home prices on a national level. Data on factors affecting the housing market’s supply and demand were gathered from publicly available sources. To understand how these factors have affected home prices over the past 20 years, it was necessary to develop a data science model.

The dataset consisted of two main files: Supply and Demand. The Supply file included variables such as new privately-owned housing units authorized in permit-issuing places, monthly supply of new houses, total construction spending on residential projects, and housing inventory estimates of vacant housing units. On the other hand, the Demand file included variables like interest rates, consumer sentiment, GDP, mortgage rates, and median sales price of houses sold.

To measure home prices, the S&P Case-Shiller U.S. National Home Price Index (CSUSHPISA) was used as a proxy. This index provides a comprehensive and reliable measure of home prices in the United States. It is seasonally adjusted and is used here at quarterly frequency to match the other series.

By analyzing these datasets and building a data science model, we aim to gain insights into the relationship between the supply-demand factors and home prices. Understanding these factors’ impact on home prices over the last two decades will provide valuable information for individuals and organizations involved in the real estate market, including buyers, sellers, investors, and policymakers.

Factors that influence US home prices:

The data collected for this assignment consist of two main files: Supply and Demand. These datasets contain quarterly data on key supply-demand factors that influence US home prices nationally.

Supply File:

DATE: The date of the observation. (2003–2023)

PERMIT: New Privately-Owned Housing Units Authorized in Permit-Issuing Places: Total Units (Thousands of Units, Seasonally Adjusted Annual Rate). This variable represents the number of new housing units authorized for construction in permit-issuing places.

MSACSR: Monthly Supply of New Houses in the United States (Seasonally Adjusted). It indicates the monthly supply of new houses available in the United States.

TLRESCONS: Total Construction Spending: Residential in the United States (Millions of Dollars, Seasonally Adjusted Annual Rate). This variable represents the total construction spending on residential projects.

EVACANTUSQ176N: Housing Inventory Estimate: Vacant Housing Units in the United States (Thousands of Units, Not Seasonally Adjusted). It provides an estimate of the number of vacant housing units in the United States.

CSUSHPISA: S&P/Case-Shiller U.S. National Home Price Index (Index Jan 2000=100, Seasonally Adjusted). This variable serves as a proxy for home prices and represents the home price index for the United States.

Demand File:

INTDSRUSM193N: Interest Rates, Discount Rate for United States (Percent per Annum, Not Seasonally Adjusted). This variable represents the discount rate for the United States.

UMCSENT: University of Michigan: Consumer Sentiment. It measures the consumer sentiment index based on surveys conducted by the University of Michigan.

GDP: Gross Domestic Product (Billions of Dollars, Seasonally Adjusted Annual Rate).

MORTGAGE30US: 30-Year Fixed Rate Mortgage Average in the United States (Percent, Not Seasonally Adjusted). It indicates the average interest rate for a 30-year fixed-rate mortgage.

MSPUS: Median Sales Price of Houses Sold for the United States (Not Seasonally Adjusted)

CSUSHPISA: S&P/Case-Shiller U.S. National Home Price Index (Index Jan 2000=100, Seasonally Adjusted). This variable serves as a proxy for home prices and represents the home price index for the United States.

Both the Supply and Demand files are at quarterly frequency, with one observation per quarter. The S&P Case-Shiller U.S. National Home Price Index (CSUSHPISA) is used as a proxy for home prices and serves as the dependent variable in the analysis. The index is seasonally adjusted and provides a comprehensive measure of home prices nationally.
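As a rough sketch of how the two files could be combined into one modelling table, the snippet below joins them on DATE with pandas. The inline CSV text stands in for the real files, the values are illustrative placeholders rather than actual observations, and only a few of the columns are shown; the article does not state the file names or the exact merge procedure, so all of that is an assumption.

```python
import io
import pandas as pd

# In-memory stand-ins for the Supply and Demand files.
# Values are illustrative placeholders, NOT real observations.
supply_csv = io.StringIO(
    "DATE,PERMIT,MSACSR,CSUSHPISA\n"
    "2003-01-01,1800,4.0,130.0\n"
    "2003-04-01,1850,3.8,132.0\n"
)
demand_csv = io.StringIO(
    "DATE,MORTGAGE30US,UMCSENT,CSUSHPISA\n"
    "2003-01-01,5.8,80.0,130.0\n"
    "2003-04-01,5.5,89.0,132.0\n"
)

supply = pd.read_csv(supply_csv, parse_dates=["DATE"])
demand = pd.read_csv(demand_csv, parse_dates=["DATE"])

# CSUSHPISA appears in both files, so drop one copy before joining on DATE.
merged = supply.merge(demand.drop(columns=["CSUSHPISA"]), on="DATE", how="inner")
merged = merged.sort_values("DATE").set_index("DATE")

print(merged.columns.tolist())
```

An inner join keeps only quarters present in both files, which avoids introducing missing values at the edges of the sample.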

These datasets provide valuable information on various supply-demand factors and their potential influence on home prices in the United States. Analyzing this data and building a data science model will help uncover insights into the relationship between these factors and home price movements over the past two decades.

Exploratory Data Analysis (patterns or correlations):

To gain insights into the relationship between the supply-demand factors and home prices, exploratory data analysis was conducted on the collected datasets. Here are the key findings and visualizations that illustrate the trends and relationships of each factor with U.S. National Home Price Index:

Correlation Matrix: This shows how each factor correlates with the HPI (direction, positive or negative, and strength).

Correlation With HPI

To explain the relationship between each column in the data and CSUSHPISA (S&P/Case-Shiller U.S. National Home Price Index), we will analyze the correlation coefficients provided.

MSACSR (Monthly Supply of New Houses):

Positive correlation (0.121048): There is a weak positive relationship between the monthly supply of new houses and CSUSHPISA. This suggests that as the supply of new houses increases, it may have a slight positive impact on home prices.

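Note: Economic theory suggests this relationship should be negative. Pearson correlation does not depend on the scale of the variables, so the weak positive value more likely reflects shared time trends or lagged effects than a scaling issue.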

PERMIT (New Privately-Owned Housing Units Authorized):

Positive correlation (0.382217): There is a moderate positive relationship between the number of authorized housing units and CSUSHPISA. It indicates that a higher number of authorized housing units may have a positive influence on home prices.

TLRESCONS (Total Construction Spending: Residential):

Strong positive correlation (0.861225): There is a strong positive relationship between total construction spending on residential projects and CSUSHPISA. This suggests that higher construction spending is strongly associated with higher home prices.

EVACANTUSQ176N (Housing Inventory Estimate: Vacant Housing Units):

Negative correlation (-0.584710): There is a moderate negative relationship between the estimated number of vacant housing units and CSUSHPISA. This indicates that a higher number of vacant housing units may exert downward pressure on home prices.

MORTGAGE30US (30-Year Fixed Rate Mortgage Average):

Negative correlation (-0.215379): There is a weak negative relationship between mortgage interest rates and CSUSHPISA. It suggests that higher mortgage rates are associated with slightly lower home prices.

UMCSENT (University of Michigan: Consumer Sentiment):

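Negative correlation (-0.096213): There is a very weak negative relationship between consumer sentiment and CSUSHPISA. Taken at face value, higher consumer sentiment is associated with slightly lower home prices, although a coefficient this close to zero indicates almost no linear relationship.

Note: Economic theory suggests this relationship should be positive. Pearson correlation does not depend on the scale of the variables, so the sign more likely reflects shared time trends or lagged effects than a scaling issue.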

INTDSRUSM193N (Interest Rates, Discount Rate):

Positive correlation (0.102608): There is a weak positive relationship between interest rates or discount rates and CSUSHPISA. Higher rates are associated with slightly higher home prices.

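Note: Economic theory suggests this relationship should be negative. Pearson correlation does not depend on the scale of the variables, so the weak positive value more likely reflects shared time trends or lagged effects than a scaling issue.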

MSPUS (Median Sales Price of Houses Sold):

Strong positive correlation (0.907924): There is a strong positive relationship between the median sales price of houses sold and CSUSHPISA. Higher median sales prices are strongly associated with higher home prices.

GDP (Gross Domestic Product):

Strong positive correlation (0.823877): There is a strong positive relationship between GDP and CSUSHPISA. Higher GDP is strongly associated with higher home prices.

These correlation coefficients provide insights into the relationships between each variable and CSUSHPISA, indicating their influence on home prices.
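Coefficients like the ones listed above can be obtained from a single pandas call. The sketch below uses synthetic data (not the article's dataset) with two of the column names, simply to show the mechanics of correlating every feature with CSUSHPISA; the data-generating numbers are arbitrary assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 80  # roughly 20 years of quarterly observations

# Synthetic stand-in series (NOT the article's data).
gdp = np.linspace(11000, 26000, n) + rng.normal(0, 200, n)
vacant = np.linspace(18000, 15000, n) + rng.normal(0, 300, n)
hpi = 0.01 * gdp - 0.002 * vacant + rng.normal(0, 5, n)

df = pd.DataFrame({"GDP": gdp, "EVACANTUSQ176N": vacant, "CSUSHPISA": hpi})

# Pearson correlation of every feature with the target, strongest positive first.
corr_with_hpi = (
    df.corr()["CSUSHPISA"]
    .drop("CSUSHPISA")
    .sort_values(ascending=False)
)
print(corr_with_hpi)
```

Because both series here trend over time, the correlations come out large in magnitude; the same effect can inflate correlations between any two trending economic series.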

Visualization Analysis:

Supply Factors:

Total Construction Spending: Residential in the United States (Millions of Dollars, Seasonally Adjusted Annual Rate):

Total Construction Spending: Residential in the United States represents the total construction spending on residential projects.

According to Investopedia, residential construction spending represents nearly 50% of total construction spending in the U.S. and housing market strength can be measured by tracking new home construction, which tends to rise when consumers feel optimistic about their jobs and economic conditions.

Total construction spending on residential projects has a strong positive correlation with home prices. This indicates that when there is higher spending on residential construction, it tends to push home prices upward.

Housing Inventory Estimate: Vacant Housing Units in the United States (Thousands of Units, Not Seasonally Adjusted):

The housing inventory estimate of vacant housing units in the United States can affect the S&P/Case-Shiller U.S. National Home Price Index. A low supply or housing inventory may drive prices up, which is what tends to result in bidding wars. A specific property may be in demand by multiple parties who all try to outbid each other by increasing their purchase price offer.

Simply expressed, this suggests that property values may be under pressure to decline if there is an increase in the number of vacant housing units.

New Privately-Owned Housing Units Authorized in Permit-Issuing Places:

The New Privately-Owned Housing Units Authorized (PERMIT) is an economic factor that measures the number of new privately-owned housing units authorized by building permits in permit-issuing places. It is used to gauge the strength of the housing market and the overall economy. The issuance of residential building permits can be a barometer for consumer confidence and solvency.

The number of new privately-owned housing units authorized has a moderate positive correlation with home prices. This suggests that periods with more authorized construction tend to coincide with higher home prices. One possible explanation is that more building activity tightens the supply of workers and materials, which raises construction costs and, in turn, prices.

Monthly Supply of New Houses in the United States (Seasonally Adjusted):

The Monthly Supply of New Houses in the United States (MSACSR) is a factor of the size of the new for-sale inventory in relation to the number of new houses currently being sold. The months’ supply indicates how long the current new for-sale inventory would last given the current sales rate if no additional new houses were built.
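The definition above amounts to a simple ratio of for-sale inventory to the monthly sales pace. A minimal sketch, with a hypothetical helper name and made-up numbers:

```python
def months_supply(for_sale_inventory: float, monthly_sales: float) -> float:
    """Months the current for-sale inventory would last at the current sales pace."""
    if monthly_sales <= 0:
        raise ValueError("monthly_sales must be positive")
    return for_sale_inventory / monthly_sales

# Illustrative numbers only (not taken from the dataset):
# 400 thousand new houses for sale, selling at 50 thousand per month.
print(months_supply(400, 50))  # 8.0
```

A higher value means inventory is large relative to sales, which is usually read as a sign of a softer market.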

Economic theory suggests that the monthly supply of new houses should be negatively related to home prices: when the for-sale inventory grows relative to the pace of sales, supply outstrips demand and prices tend to fall. In this dataset, however, the measured correlation with the S&P/Case-Shiller U.S. National Home Price Index is weakly positive (0.121048).

Demand Factors:

Interest Rates, Discount Rate for United States (Percent per Annum, Not Seasonally Adjusted):

The Interest Rates and Discount Rate for the United States are monetary policy tools used by the Federal Reserve to influence the supply of money and credit in the economy. When the Fed buys or sells U.S. government securities, it increases or decreases the level (or supply) of reserves in the banking system.

Lower interest rates make it cheaper for people to borrow money to buy homes, which can increase demand and drive up prices.

University of Michigan: Consumer Sentiment:

The University of Michigan Consumer Sentiment Index rates the relative level of current and future economic conditions. It is a monthly survey of consumer confidence levels in the United States conducted by the University of Michigan.

The University of Michigan Consumer Sentiment Index can affect the U.S. National Home Price Index because it measures consumer confidence levels which can impact consumer spending and saving habits. When consumers are confident about the economy, they are more likely to spend money on big-ticket items such as homes which can drive up home prices.

Gross Domestic Product (Billions of Dollars, Seasonally Adjusted Annual Rate):

The Gross Domestic Product (GDP) is a measure of the total value of goods and services produced in the United States, including those exported to other countries. The GDP tends to increase when the total value of goods and services that domestic producers sell to foreign countries exceeds the total value of foreign goods and services that domestic consumers buy. When this situation occurs, a country is said to have a trade surplus. The housing market and the overall economy are interlocked in many ways. When real estate prices go up, homeowners often feel more secure in their investment and confident to spend. Developers invest more in building new houses and this overall activity boosts gross domestic product.

This indicates that when the overall economic output of the country, as represented by GDP, increases, it tends to be associated with higher home prices.

30-Year Fixed Rate Mortgage Average in the United States:

The 30-Year Fixed Rate Mortgage Average in the United States is an index that tracks the average interest rate on 30-year fixed-rate mortgages in the United States.

When mortgage rates are low, it is easier for people to buy homes, which can increase demand and drive up home prices. Conversely, when mortgage rates are high, it is more difficult for people to buy homes, which can decrease demand and drive down home prices.

Note: Because home prices respond to mortgage rates with a lag, the negative relationship is not obvious in the chart for this series.
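One way to probe a lagged relationship like this is to correlate the price index with the rate shifted back by a few quarters. The sketch below is not part of the original analysis: `lagged_corr` is a hypothetical helper, and the data are synthetic, built so that prices respond to the rate from two quarters earlier.

```python
import numpy as np
import pandas as pd

def lagged_corr(df, feature, target, max_lag=4):
    """Correlation between target at time t and feature at time t - lag."""
    return pd.Series(
        {lag: df[target].corr(df[feature].shift(lag)) for lag in range(max_lag + 1)}
    )

rng = np.random.default_rng(0)
n = 80

# Synthetic data: the index reacts to the mortgage rate two quarters later.
rate = pd.Series(rng.normal(5.0, 1.0, n))
hpi = 200 - 10 * rate.shift(2) + rng.normal(0, 1.0, n)
df = pd.DataFrame({"MORTGAGE30US": rate, "CSUSHPISA": hpi})

result = lagged_corr(df, "MORTGAGE30US", "CSUSHPISA")
print(result)           # correlation at each lag, in quarters
print(result.idxmin())  # lag with the most negative correlation
```

On real data the pattern is rarely this clean, since mortgage rates are strongly autocorrelated and neighbouring lags will look similar.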

Median Sales Price of Houses Sold for the United States (Not Seasonally Adjusted):

The Median Sales Price of Houses Sold for the United States is the price at which half of the houses sold for more and half sold for less. Higher median sales prices are strongly associated with higher home prices.

Data Science Model:

Model Building:

To build the data science model, we used a regression modelling technique called Linear Regression. Linear Regression is a widely used technique for predicting continuous numerical values based on the relationship between independent variables (features) and the dependent variable (target).

The first step in building the model was to prepare the data. We selected the relevant features, including ‘MSACSR’ (monthly supply of new houses), ‘PERMIT’ (number of new housing units authorized), ‘TLRESCONS’ (total construction spending on residential projects), ‘EVACANTUSQ176N’ (estimate of vacant housing units), ‘MORTGAGE30US’ (average interest rate for a 30-year fixed-rate mortgage), ‘GDP’ (Gross Domestic Product), ‘UMCSENT’ (consumer sentiment index), ‘INTDSRUSM193N’ (interest rates or discount rates), and ‘MSPUS’ (median sales price of houses sold). The target variable we aimed to predict was ‘CSUSHPISA’ (S&P Case-Shiller U.S. National Home Price Index).

Once the data was prepared, we split it into training and testing sets using an 80:20 ratio, where 80% of the data was used for training the model, and 20% was reserved for evaluating its performance.
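An 80:20 split of this kind is commonly done with scikit-learn's `train_test_split`. The sketch below uses random placeholder arrays in place of the real feature matrix, and the `random_state` value is an assumption, since the article does not give one.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: 80 quarters x 9 features (synthetic, not the real dataset).
X = rng.normal(size=(80, 9))
y = rng.normal(size=80)

# 80% for training, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)  # (64, 9) (16, 9)
```

By default the rows are shuffled before splitting. For time-ordered data, passing `shuffle=False` keeps the most recent quarters as the test set, which is a stricter check of forecasting ability.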

Next, we defined a dictionary of candidate models, including Linear Regression, Decision Tree, Random Forest, Support Vector Regression, and Neural Network. These models represent different algorithms with varying complexities and learning capabilities.

To select the best performing model, we used cross-validation with five folds. This technique helps assess the models’ performance on different subsets of the training data. We used the mean squared error (MSE) as the evaluation metric, where lower values indicate better performance.

Based on the cross-validation results, we identified Linear Regression as the best model with the lowest MSE. We then trained the Linear Regression model on the entire training data.
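The selection procedure described above might look roughly like the following. The data are synthetic, and the hyperparameters (number of trees, hidden layer size, iteration limit, random seeds) are assumptions made for the sake of a runnable example, not the settings used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic training data with a linear signal (NOT the article's dataset).
X_train = rng.normal(size=(64, 3))
y_train = X_train @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=64)

# Candidate models; hyperparameters here are illustrative assumptions.
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "Support Vector Regression": SVR(),
    "Neural Network": MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
}

# Five-fold cross-validation; scikit-learn reports negated MSE, so flip the sign.
cv_mse = {}
for name, model in models.items():
    scores = cross_val_score(
        model, X_train, y_train, cv=5, scoring="neg_mean_squared_error"
    )
    cv_mse[name] = -scores.mean()

best_name = min(cv_mse, key=cv_mse.get)
best_model = models[best_name].fit(X_train, y_train)  # refit on all training data
print(best_name, round(cv_mse[best_name], 4))
```

With only about 80 quarterly observations, simpler models such as linear regression often hold up well in this kind of comparison, because the more flexible models have little data to learn from.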

Finally, we evaluated the model’s performance on the testing set by making predictions and calculating the MSE. Additionally, we computed the R-squared score, which measures the proportion of variance in the target variable explained by the model.

In summary, our approach involved selecting relevant features, splitting the data, trying multiple regression models, performing cross-validation for model selection, and evaluating the chosen model on the testing set. The Linear Regression model showed the best performance, and we used its coefficients to understand the impact of each feature on the predicted target variable.

Model Evaluation:

To evaluate the performance of our model, we used two key metrics: mean squared error (MSE) and R-squared score. These metrics provide insights into the accuracy and reliability of the model’s predictions.

The mean squared error (MSE) measures the average squared difference between the actual target values and the predicted values. A lower MSE indicates better performance, as it reflects smaller prediction errors. We calculated the MSE on the testing set to assess how well our model generalized to unseen data.

Additionally, we used the R-squared score, which measures the proportion of variance in the target variable that can be explained by the model. It typically ranges from 0 to 1, with higher values indicating a better fit, although it can be negative on held-out data when the model predicts worse than the mean of the target. The R-squared score helps us understand how well the independent variables (features) explain the variation in the dependent variable (target).
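To make the two metrics concrete, here is a small worked example that computes them directly from their definitions with NumPy. The four target and prediction values are made up for illustration.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared differences between actual and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1 - ss_res / ss_tot)

# Made-up index values and predictions.
y_true = [100, 110, 120, 130]
y_pred = [102, 108, 123, 129]

print(mse(y_true, y_pred))        # (4 + 4 + 9 + 1) / 4 = 4.5
print(r_squared(y_true, y_pred))  # 1 - 18 / 500 = 0.964
```

Taking the square root of the MSE gives an error in the same units as the index, which is often easier to interpret.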

Based on our evaluation, the Linear Regression model performed well. The MSE on the testing set was 33.18 (a root mean squared error of about 5.76 index points), indicating relatively low prediction errors. Furthermore, the R-squared score was 0.9723, suggesting that approximately 97.23% of the variation in the target variable can be explained by the model.

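Analyzing the coefficients of the Linear Regression model provides insights into the importance and impact of each feature on the predicted target variable. Here are some key observations. Note that the features are measured in very different units, so raw coefficient magnitudes are not directly comparable, and some coefficients have the opposite sign to the one expected, which can happen when features are strongly correlated with one another: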

· ‘PERMIT’ (number of new housing units authorized) had a small positive coefficient of 0.0197, suggesting a weak positive relationship between the number of authorized housing units and the predicted home price index.

· ‘MSACSR’ (monthly supply of new houses) had a positive coefficient of 8.17, indicating that an increase in the monthly supply of new houses is associated with a higher predicted home price index.

· ‘TLRESCONS’ (total construction spending on residential projects) had a positive coefficient of 5.693, implying a minimal impact on the predicted home price index.

· ‘EVACANTUSQ176N’ (estimate of vacant housing units) had a negative coefficient of -0.00133, indicating that an increase in the estimated number of vacant housing units is associated with a slightly lower predicted home price index.

· ‘MORTGAGE30US’ (average interest rate for a 30-year fixed-rate mortgage) had a negative coefficient of -14.994, suggesting that higher mortgage interest rates are associated with a lower predicted home price index.

· ‘GDP’ (Gross Domestic Product) had a very small negative coefficient of -0.00303, suggesting that, with the other features held fixed, higher GDP is associated with a slightly lower predicted home price index.

· ‘UMCSENT’ (consumer sentiment index) had a negative coefficient of -0.18699, implying that, with the other features held fixed, higher consumer sentiment is associated with a lower predicted home price index.

· ‘INTDSRUSM193N’ (interest rates or discount rates) had a positive coefficient of 3.97, suggesting that higher interest or discount rates are associated with a higher predicted home price index.

· ‘MSPUS’ (median sales price of houses sold) had a small positive coefficient of 0.000455, indicating a weak positive relationship between the median sales price and the predicted home price index.

These coefficients provide insights into the direction and magnitude of the relationships between the features and the target variable. They can help understand which features have the most significant impact on the predicted home price index.
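Since the features are measured in very different units, one common way to make coefficient magnitudes comparable is to standardize the features before fitting. The sketch below is an illustration on synthetic data, not a step from the original study, and the numbers used to generate the data are arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 80

# Synthetic features on very different scales (NOT the article's data).
X = pd.DataFrame({
    "GDP": rng.normal(18000, 4000, n),     # billions of dollars
    "MORTGAGE30US": rng.normal(5, 1, n),   # percent
})
y = 0.005 * X["GDP"] - 10 * X["MORTGAGE30US"] + rng.normal(0, 1, n)

raw = LinearRegression().fit(X, y)
scaled = make_pipeline(StandardScaler(), LinearRegression()).fit(X, y)

coef = pd.DataFrame(
    {
        "raw": raw.coef_,
        "standardized": scaled.named_steps["linearregression"].coef_,
    },
    index=X.columns,
)
print(coef)
```

In this toy example the raw GDP coefficient looks negligible next to the mortgage-rate coefficient, yet after standardization GDP has the larger effect per standard deviation. Standardizing changes the magnitudes but never the signs of the coefficients.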

In summary, our model evaluation showed that the Linear Regression model performed well, with low prediction errors (MSE) and a high proportion of variance explained (R-squared score). The coefficients provided insights into the importance of each feature and their impact on the predicted home price index.

Model Accuracy

Interpretation and Insights:

Supply Factors:

a. The S&P/Case-Shiller U.S. National Home Price Index (CSUSHPISA) shows a very modest positive association with the monthly supply of new homes (MSACSR). A correlation this weak provides little evidence in either direction. Economic theory suggests the underlying relationship is negative, since a larger supply of new homes relative to sales tends to put mild downward pressure on housing prices.

b. The number of authorized housing units (PERMIT) has a moderately positive link with home values. This means that a greater number of authorized housing units may coincide with higher property prices. One possible reason is that when more houses are authorized for building, demand for materials and labor rises, which pushes up costs.

c. Total construction spending on residential projects (TLRESCONS) has a strong positive correlation with home prices. This indicates that higher construction spending is strongly associated with higher home prices. The reason for this is simple: construction costs include building materials, labor, and other charges. This raises overall house costs.

d. The estimated number of vacant housing units (EVACANTUSQ176N) has a moderate negative correlation with home prices. This suggests that a higher number of vacant housing units may exert downward pressure on home prices, because an increase in housing supply tends to lower prices.

Demand Factors:

a. The average interest rate for a 30-year fixed-rate mortgage (MORTGAGE30US) shows a weak negative correlation with home prices. This implies that higher mortgage rates are associated with slightly lower home prices. A rise in the federal funds rate can cause mortgage rates to rise, and higher mortgage rates can reduce home buying demand, causing home prices to fall.

b. Consumer sentiment (UMCSENT) has a very weak negative correlation with home prices, so higher sentiment is associated with slightly lower home prices in this sample. This runs against the usual expectation: when consumers are confident about the economy, they are more likely to spend money on big-ticket items such as homes, which can drive up home prices.

c. The data show that interest rates or discount rates (INTDSRUSM193N) have a weak positive association with home prices, which may reflect a lag between rate changes and price changes. In theory, interest rates are negatively related to house prices: higher rates lead to lower home prices.

d. Gross Domestic Product (GDP) has a strong positive correlation with home prices. Higher GDP is strongly associated with higher home prices.

e. The median sales price of houses sold (MSPUS) has a strong positive correlation with home prices. Higher median sales prices are strongly associated with higher home prices.

Conclusion:

Based on the correlation analysis and the coefficients from the Linear Regression model, several key insights can be derived:

· Among supply factors, the number of authorized housing units and construction spending on residential projects are positively associated with home prices, with construction spending showing the strongest link. Vacant housing inventory is negatively associated with home prices.

· Demand factors such as mortgage interest rates have a negative association with home prices: higher mortgage rates are associated with slightly lower home prices. Consumer sentiment shows only a very weak correlation with home prices in this sample.

· Economic factors, including GDP and interest rates, play a crucial role in determining home prices. A strong economy with higher GDP and slightly lower interest rates tends to support higher home prices.

· The median sales price of houses sold is strongly correlated with home prices, reflecting the importance of market dynamics and buyer behaviour in determining home price movements.

· These insights can be valuable for various stakeholders in the real estate market, including home buyers, sellers, developers, and policymakers. Understanding the factors that influence home prices can help make informed decisions related to investments, financing, and economic policies.

References:

https://fred.stlouisfed.org/

https://www.kaggle.com/

https://in.tradingview.com/chart/NnsSwdTz/

https://www.investopedia.com/mortgage/mortgage-rates/housing-market/

https://www.economicshelp.org/blog/377/housing/factors-that-affect-the-housing-market/
