This project is no longer accepting applications. Subscribe to our newsletter to be notified of new projects!
Use Python and Linear Regression to predict stock prices in a quantitative way
Ready to solve real-world finance problems using computer science and math? Quantitative finance might sound unfamiliar, but in this Build Project you’ll experience the quantitative investment process end to end: using Python for data mining and data cleaning, applying statistical keyword-detection algorithms to earnings call data, implementing a linear regression model to generate stock price predictions, and building your own portfolio with risk control. By the end of the 8-week project, you will have developed a published online notebook, with code and text, that covers the full pipeline for building a portfolio quantitatively.
Get to know the Build Fellow and the other students, learn the definition of quantitative finance, set up a Python IDE, and run a function that reads a CSV file.
Understand why data needs cleaning and what types of dirty data exist. Use Python and the Feature-engine package to clean a dataset provided by the project leader.
Understand what an earnings call is and why companies need to disclose their earnings. Use Python to parse earnings call data and extract keywords that affect the company’s stock price.
Understand what a finance Term is and why Terms are important in quantitative finance. Use Python to transform the data collected in workshops 2 and 3 to create Terms.
Understand what linear regression is and why it is powerful in the quantitative finance industry. Learn how a linear regression model is trained.
Use Python to run inference with your linear regression model. Looking at your price predictions, why are they accurate or inaccurate? How do you measure accuracy? How can you improve the model?
In the quantitative finance industry, presentation skills are important. Learn how to use Matplotlib in Python to generate plots that present your prediction results. Prepare the PowerPoint for your final presentation.
Present your model, your predictions, and what you’ve learned to the whole class and the project leader. You’ll get feedback from the project leader and some advice on how to catch stakeholders' interest.
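The middle workshops (keyword extraction, model training, and accuracy measurement) can be sketched in a few lines. The example below is only an illustration under stated assumptions: the keyword lists, the transcripts, and the return figures are made-up toy values, the helper names (`keyword_score`, `fit_line`, `r_squared`) are hypothetical, and it uses the Python standard library alone so it runs anywhere. The project itself may well use SciPy or another library for the regression step.

```python
import re

# Hypothetical keyword lists; the phrases echo the examples used in this project description.
POSITIVE = ("beat expectation",)
NEGATIVE = ("below expectation",)


def keyword_score(transcript):
    """Count positive phrases minus negative phrases in a transcript."""
    text = re.sub(r"\s+", " ", transcript.lower())
    positives = sum(text.count(phrase) for phrase in POSITIVE)
    negatives = sum(text.count(phrase) for phrase in NEGATIVE)
    return positives - negatives


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Toy data, invented for illustration only: four transcripts and the
# stock return that followed each call.
transcripts = [
    "Revenue beat expectations and margins beat expectations.",
    "Earnings beat expectations this quarter.",
    "Results were in line with guidance.",
    "Sales came in below expectations.",
]
returns = [0.05, 0.01, 0.00, -0.02]

scores = [keyword_score(t) for t in transcripts]   # the regression feature
slope, intercept = fit_line(scores, returns)       # training
fit_quality = r_squared(scores, returns, slope, intercept)

# Inference: predicted return for a new call with one positive phrase.
predicted = intercept + slope * keyword_score("Guidance beat expectations.")
print(f"slope={slope:.4f} intercept={intercept:.4f} R^2={fit_quality:.3f}")
print(f"predicted return: {predicted:.4f}")
```

R² is one common way to answer the "how to measure accuracy" question; with real data it should be computed on calls the model was not trained on.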
Proficiency in Python programming, including using NumPy for data processing, using data structures such as dictionaries and DataFrames to store processed regression features, using SciPy to build a linear regression framework, and generating risk plots and reports with Matplotlib.
Deep understanding of financial concepts such as volume and earnings calls. This requires attention to detail: understanding the difference between minute-level volume and daily volume, what the intraday volume distribution looks like for stocks of different market capitalization, and how volume is correlated with price. For earnings calls, you need to know the keywords (e.g., beat expectation, below expectation), understand their financial meaning, and understand why they affect the stock price.
Full understanding of the logic and implementation of feature engineering and linear regression.
Ability to use the Python Matplotlib package to build visualizations for your portfolio. This involves drawing multiple line plots in Python, adding labels and legends, and structuring the graphs so that your research results can be easily digested by senior management at financial firms who might not have context on the research process.
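As a rough illustration of the plotting skills listed above, the sketch below draws two lines on one chart with axis labels, a title, and a legend. The price series are made-up toy numbers, and the output file name is an arbitrary choice for this example.

```python
import matplotlib
matplotlib.use("Agg")  # render without a display, e.g. on a server
import matplotlib.pyplot as plt

# Toy numbers invented for illustration only.
days = [1, 2, 3, 4, 5]
actual_price = [100.0, 101.5, 101.0, 103.2, 104.0]
predicted_price = [100.0, 101.0, 101.8, 102.5, 104.6]

fig, ax = plt.subplots(figsize=(8, 4))

# One call to plot() per line; the label text is what the legend shows.
ax.plot(days, actual_price, marker="o", label="Actual price")
ax.plot(days, predicted_price, marker="x", linestyle="--", label="Predicted price")

# Labels and a title give readers context without the research background.
ax.set_xlabel("Trading day")
ax.set_ylabel("Price (USD)")
ax.set_title("Actual vs. predicted price (toy data)")
ax.legend(loc="best")
ax.grid(True, alpha=0.3)

fig.tight_layout()
fig.savefig("prediction_vs_actual.png", dpi=150)  # image file for a slide deck
```

Keeping units in the axis labels and plain-language names in the legend is a small design choice that helps an audience who did not follow the modeling work.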
Mike is a Quantitative Finance Build Fellow at Open Avenues, where he works with students leading projects in quantitative finance.
Mike is a quantitative research analyst at Citadel, where he focuses on portfolio construction, portfolio optimization and data visualization.
Mike has over 4 years of experience in the quantitative finance field. He interned at Bank of America in 2018 as a software engineer and interned at Citadel in 2019 as a quantitative research engineer.
He holds a Bachelor of Science degree in computer science.
A fun fact about Mike is that he can play 4 different kinds of musical instruments.