This project is no longer accepting applications. Subscribe to our newsletter to be notified of new projects!
Design an end-to-end machine learning system for housing price prediction and deploy the final model artifact using AWS SageMaker.
Predictive modeling is a critical skill for data scientists, widely used across various industries. In real estate, predictive analytics and valuation modeling are gaining momentum, with both startups and established companies increasingly adopting data-driven approaches to forecast housing prices.
In this Build Project, you will retrieve, clean, and preprocess real-world housing data, transforming it into a format suitable for machine learning. You will explore and visualize the data to gain insights, then apply supervised learning algorithms to predict the sales price for each house. Throughout the project, you'll focus on optimizing model performance through feature selection and hyperparameter tuning. Additionally, you'll deploy your final model using AWS SageMaker, making it accessible via cloud infrastructure. This hands-on project will not only enhance your predictive modeling skills but also give you practical experience in cloud-based machine learning deployment.
Kick off the project with an overview of the housing price prediction system, setting up the Python IDE, configuring AWS Cloud and Git, and retrieving the dataset. You will also be introduced to the project’s workflow and expectations.
Dive into data exploration and visualization techniques. Learn how to uncover key insights and patterns in the dataset using Pandas, Matplotlib, and Seaborn, setting the foundation for effective modeling.
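As a rough illustration of this exploration step, the sketch below builds a small synthetic table and inspects it with Pandas. The data and the column names (`living_area`, `bedrooms`, `year_built`, `sale_price`) are assumptions made up for the example, not the project's actual dataset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the housing data; column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "living_area": rng.normal(1800, 400, n),
    "bedrooms": rng.integers(1, 6, n),
    "year_built": rng.integers(1950, 2021, n),
})
# The price depends on living area plus noise, so a clear pattern exists to find.
df["sale_price"] = 50_000 + 120 * df["living_area"] + rng.normal(0, 20_000, n)

# Summary statistics and missing-value counts per column.
summary = df.describe()
missing = df.isna().sum()

# Pearson correlation of each feature with the target, strongest first.
corr_with_price = (
    df.corr()["sale_price"].drop("sale_price").sort_values(ascending=False)
)
print(corr_with_price)
```

The same correlation matrix can be passed to a Seaborn heatmap (`sns.heatmap(df.corr(), annot=True)`) to get a visual version of this ranking.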
Clean and preprocess the dataset by handling missing values, encoding categorical variables, and scaling numeric features. Engineer new features that can improve model performance.
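One possible way to combine these preprocessing steps in Scikit-learn is shown below. It is a minimal sketch on a four-row hypothetical sample: the column names, the `house_age` feature, and the choice of median and most-frequent imputation are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny hypothetical sample with one missing numeric and one missing categorical value.
raw = pd.DataFrame({
    "living_area": [1500.0, 2100.0, np.nan, 1750.0],
    "year_built": [1995, 2008, 1972, 2015],
    "year_sold": [2020, 2021, 2019, 2021],
    "neighborhood": ["north", "south", np.nan, "north"],
})

# Engineered feature: age of the house at the time of sale.
features = raw.assign(house_age=raw["year_sold"] - raw["year_built"])

numeric_cols = ["living_area", "house_age"]
categorical_cols = ["neighborhood"]

# Numeric columns: fill gaps with the median, then standardize.
numeric_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
# Categorical columns: fill gaps with the most frequent value, then one-hot encode.
categorical_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

# sparse_threshold=0 forces a dense NumPy array as output; unlisted columns are dropped.
preprocess = ColumnTransformer(
    [("num", numeric_pipe, numeric_cols), ("cat", categorical_pipe, categorical_cols)],
    sparse_threshold=0,
)
X = preprocess.fit_transform(features)
print(X.shape)
```

Wrapping the steps in a `ColumnTransformer` keeps the imputation and scaling statistics tied to the data they were fitted on, which helps avoid leaking information from the test set later.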
Learn how to develop and train supervised learning models such as linear regression and decision trees, using Scikit-learn. Understand how to split data into training and testing sets to evaluate your models.
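The train/test workflow described here might look like the following sketch. The single feature and the linear price relationship are synthetic assumptions chosen so the example runs without the project dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data: one hypothetical feature (living area) and a noisy linear price.
rng = np.random.default_rng(42)
X = rng.uniform(500, 4000, size=(300, 1))
y = 40_000 + 150 * X[:, 0] + rng.normal(0, 25_000, 300)

# Hold out 20% of the rows so the models are scored on data they never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(max_depth=4, random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # R-squared on the held-out set

print(scores)
```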
Optimize your machine learning models through hyperparameter tuning using techniques like grid search and random search. Explore how tuning can enhance model performance.
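To make the difference between the two search strategies concrete, here is a hedged sketch using Scikit-learn's `GridSearchCV` and `RandomizedSearchCV` on generated data. The parameter ranges and the decision tree model are illustrative assumptions, not recommended settings.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.tree import DecisionTreeRegressor

# Generated regression data stands in for the housing features and prices.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Grid search: tries every combination (3 x 3 = 9 candidates).
param_grid = {"max_depth": [2, 4, 8], "min_samples_leaf": [1, 5, 10]}
grid = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid,
    cv=5,
    scoring="neg_root_mean_squared_error",
)
grid.fit(X, y)

# Random search: samples a fixed number of candidates (n_iter) from wider ranges.
param_distributions = {
    "max_depth": list(range(2, 12)),
    "min_samples_leaf": list(range(1, 20)),
}
random_search = RandomizedSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=10,
    cv=5,
    scoring="neg_root_mean_squared_error",
    random_state=0,
)
random_search.fit(X, y)

# Scores are negated RMSE values, so a score closer to zero is better.
print(grid.best_params_, -grid.best_score_)
print(random_search.best_params_, -random_search.best_score_)
```

Grid search is exhaustive and its cost grows multiplicatively with each added parameter, while random search keeps the budget fixed at `n_iter` regardless of how large the ranges are.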
Evaluate your machine learning models using metrics such as RMSE, R-squared, and cross-validation. Compare models and select the best-performing one for deployment.
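A comparison along these lines could be sketched as follows, again on generated data. Selecting the model by cross-validated RMSE is one reasonable convention assumed here; the project may use a different selection rule.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Generated data with a linear signal; a stand-in for the prepared housing data.
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

candidates = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(max_depth=5, random_state=0),
}

results = {}
for name, model in candidates.items():
    # 5-fold cross-validated RMSE on the training split (scorer returns negatives).
    cv_rmse = -cross_val_score(
        model, X_train, y_train, cv=5, scoring="neg_root_mean_squared_error"
    ).mean()
    # Refit on the full training split and score once on the held-out test set.
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "cv_rmse": float(cv_rmse),
        "test_rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        "test_r2": float(r2_score(y_test, pred)),
    }

# Pick the candidate with the lowest cross-validated RMSE.
best_name = min(results, key=lambda k: results[k]["cv_rmse"])
print(best_name, results[best_name])
```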
Learn how to deploy your trained model to the cloud using AWS SageMaker.
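Before a model can be hosted, its artifact has to be packaged. The sketch below covers only that local step, under the assumption that the model is served with SageMaker's scikit-learn container, which expects a `model.tar.gz` archive and an inference script defining `model_fn`. The toy model, file names, and directories are hypothetical, and the upload to S3 and endpoint creation are not shown.

```python
import os
import tarfile
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy model standing in for the final housing price model (hypothetical data).
model = LinearRegression().fit(
    np.array([[1000.0], [2000.0], [3000.0]]),
    np.array([150_000.0, 250_000.0, 350_000.0]),
)

# Serialize the fitted model to disk.
work_dir = tempfile.mkdtemp()
model_path = os.path.join(work_dir, "model.joblib")
joblib.dump(model, model_path)

# Bundle the serialized model into a gzipped tar archive.
archive_path = os.path.join(work_dir, "model.tar.gz")
with tarfile.open(archive_path, "w:gz") as tar:
    tar.add(model_path, arcname="model.joblib")


def model_fn(model_dir):
    """Load the model from the directory where the archive was unpacked."""
    return joblib.load(os.path.join(model_dir, "model.joblib"))


# Local round trip: unpack the archive and load the model the way a server would.
extract_dir = tempfile.mkdtemp()
with tarfile.open(archive_path, "r:gz") as tar:
    tar.extractall(extract_dir)

restored = model_fn(extract_dir)
prediction = restored.predict(np.array([[2500.0]]))
print(prediction)
```

Checking the round trip locally is a cheap way to catch packaging mistakes before paying for cloud resources.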
Wrap up the project by documenting your findings, methodologies, and results. Prepare a presentation to showcase your work, including the deployed model and key insights from the project. Documentation will take the form of a detailed Jupyter notebook along with a project description in the README of your GitHub repository. If time permits, you can also publish a Substack or Medium post.
Nischal Subedi is a Data Science Fellow at Open Avenues, where he mentors students and leads projects in data science, machine learning, and AI. Additionally, as a Data Scientist at Home Partners of America, a company within the Blackstone Inc. portfolio, he develops pricing strategies to enhance leasing operations across the company.
He brings over five years of experience to the data science field and holds a master's degree in Statistics from the University of Delaware.
Fun Fact: Nepal, known for Mount Everest, is not just Nischal's birthplace but also the source of his affinity for hiking.