This project is no longer accepting applications. Subscribe to our newsletter to be notified of new projects!
Using current state-of-the-art techniques in natural language processing (NLP), you’ll develop a text-generation language model built on the attention mechanism, using the GPT and transformer architecture.
Transformers dominate the machine learning world, setting new benchmarks across NLP and vision tasks. This Build Project will help you learn about transformers, specifically GPT models, which power AI chat assistants such as ChatGPT.
We will delve into the intricacies of transformer architecture using PyTorch. You will learn to build, train, and deploy GPT models from scratch, understanding their superior capabilities compared to prior NLP baselines. This knowledge is valuable in today’s job market where the demand for AI expertise is rapidly growing.
Who are we? What are our career interests? Get an overview of the AI/ML ecosystem and the tools required for this project, and understand the deliverables.
Learn how to collect and preprocess text data for use with a language model. Explore the text dataset used for the final deliverable.
Overview of traditional NLP methods: n-gram models and basic neural network models.
Overview of transformer architecture and key components: self-attention, multi-head attention, and positional encoding.
Train the implemented transformer model on the data collected during previous sessions.
Explore techniques for evaluating the quality of the generated text and practical steps for deploying the trained model.
Understanding and adjusting hyperparameters to optimize the model. Techniques for fine-tuning the transformer model for better performance on specific tasks.
Present the findings to a general audience in the form of a presentation and live demo, then gather feedback.
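To give a feel for the self-attention component named in the session outline, here is a minimal sketch of causal (GPT-style) scaled dot-product attention in plain Python. It is an illustration only, not the project's actual code: the function names `softmax` and `causal_self_attention` are hypothetical, the inputs are used directly as queries, keys, and values with no learned projection matrices, and it covers a single head. The project itself works in PyTorch, which this sketch deliberately avoids so that it runs with the standard library alone.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(q, k, v):
    """Single-head causal attention (hypothetical helper, for illustration).

    q, k, v: lists of T vectors (lists of floats). q and k share dimension d.
    Returns (outputs, weights), where weights[t] covers positions 0..t only.
    """
    d = len(q[0])
    outputs, weights = [], []
    for t in range(len(q)):
        # Causal mask: position t may only attend to positions 0..t.
        scores = [
            sum(a * b for a, b in zip(q[t], k[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        w = softmax(scores)
        # Output at t is the attention-weighted average of the visible values.
        out = [
            sum(w[s] * v[s][j] for s in range(t + 1))
            for j in range(len(v[0]))
        ]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Toy sequence of three 2-dimensional token vectors.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_self_attention(x, x, x)
for t, w in enumerate(weights):
    print(t, [round(p, 3) for p in w])
```

Each row of weights sums to 1, and the first token can attend only to itself, which is what lets a GPT-style model generate text one token at a time. Multi-head attention runs several such heads in parallel on different learned projections of the input and concatenates their outputs.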
Kacper is a Data Science Build Fellow at Open Avenues, where he works with students, leading projects in Data Science.
Kacper is a Research Engineer at comma.ai where he focuses on driving evaluation infrastructure and driving-related metrics.
Kacper has over 5 years of experience in the Data Science and Software Engineering fields. He started in the mobile app space as an iOS engineer, then pivoted to machine learning to make driving chill.
He holds an M.S. in Computer Science.
A fun fact about Kacper: in his free time, he enjoys piloting aircraft and surfing.