This project is no longer accepting applications. Subscribe to our newsletter to be notified of new projects!

Uncover Product Reviews Using Fine-Tuned AI-Powered Classifier
Laura Tociu

Using a suitable training set, train a Large Language Model classifier to find product reviews among user comments and determine their sentiment (positive, negative or neutral).

Fridays at 3:00 P.M. ET / 12:00 P.M. PT
8 weeks, 2-3 hours per week
Intermediate

Description

User comments are the most valuable asset of the data fellow's company website; they provide more genuine and honest opinions on products than almost anywhere else on the internet. However, not all of the comments are useful, as there is also banter, trolling and off-topic discussion. While ChatGPT can identify and classify product reviews extremely well, it is costly and time-consuming to use it for every new comment posted on the website. Instead, a Large Language Model (LLM) fine-tuned for this specific task can do the job quickly, offline and at a fraction of the cost of ChatGPT.

In this Build Project, imagine you are part of Company X’s team and it’s your responsibility to solve this challenge. You will train a Large Language Model (LLM) classifier to surface product reviews from among the large body of comments on Company X’s deals and perform sentiment analysis on them.

Session timeline

  • Applications open
    May 27, 2024
  • Application deadline
    June 23, 2024
  • Project start date
    Week of July 8, 2024
  • Project end date
    Week of

What you will learn

  • Use SQL to get data from a relational database
  • Use Python to process, clean and transform text data
  • Use Databricks and/or Google Cloud to run data science projects in the cloud
  • Use the open-source platform Hugging Face to source ML models and fine-tune them for your own use case
  • Apply statistical methods to assess performance of ML techniques
  • Build an API endpoint to easily call ML models
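
As a rough illustration of the first two skills in the list above (pulling comments with SQL, then cleaning the text in Python), here is a minimal sketch. It is not the project's actual pipeline: the `comments` table, its columns, the sample rows and the `clean_comment` helper are all hypothetical, and an in-memory SQLite database stands in for whatever relational database the project uses.

```python
import html
import re
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, deal_id INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO comments (deal_id, body) VALUES (?, ?)",
    [
        (1, "<p>Bought this last year &amp; it still works great!</p>"),
        (1, "   first!!!   "),
        (2, "Battery died after two weeks.\n\nWould not recommend."),
    ],
)


def clean_comment(text: str) -> str:
    """Strip HTML tags, decode HTML entities, and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)  # replace HTML tags with a space
    text = html.unescape(text)            # e.g. "&amp;" becomes "&"
    return re.sub(r"\s+", " ", text).strip()


# A basic SELECT statement with a parameter, as in the prerequisites.
rows = conn.execute(
    "SELECT id, body FROM comments WHERE deal_id = ? ORDER BY id", (1,)
).fetchall()

cleaned = [(comment_id, clean_comment(body)) for comment_id, body in rows]
for comment_id, body in cleaned:
    print(comment_id, body)
```

Cleaned text like this would then be the input to labeling and fine-tuning; what counts as "clean" (for example, whether to keep emoji or punctuation) is a design choice that depends on the model being fine-tuned.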
Build Projects are 8-week experiences that operate on a rolling basis. Selected participants engage in weekly live workshops with a Build Fellow and 2-15 other students.

Project workshops

1. Introductions
2. Hmmm... What Movie to Watch Tonight?
3. What Artist Should I Listen to Next?
4. Data Cleaning and Exploration
5. Large Language Models
6. Fine-tuning a Hugging Face Model
7. Accuracy Metrics and API Endpoint
8. LLMs in the Spotlight!

Prerequisites

  • Basic knowledge of Python (for loops, installing and importing libraries, using Pandas to handle data frames)
  • Basic knowledge of SQL (select statements)
  • Basic familiarity with statistics, algebra and calculus: probability distributions, sampling, evaluation metrics (most importantly, confusion matrices), plotting functions and graphs, and gradient descent.
  • Analytical and problem-solving skills for approaching data challenges.
  • Inquisitive mindset to explore and understand data patterns and trends.
  • Ability to work effectively in a team, as data science projects often involve collaboration.
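
For readers unsure what the confusion-matrix prerequisite above involves, the following is a small sketch for the three sentiment classes named in the project summary. The label lists are made-up toy data, and the function names are illustrative rather than taken from any library.

```python
from collections import Counter

LABELS = ["positive", "negative", "neutral"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def precision_recall(matrix, labels=LABELS):
    """Return {label: (precision, recall)} computed from the matrix."""
    scores = {}
    for i, label in enumerate(labels):
        true_positives = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                    # row total
        precision = true_positives / predicted if predicted else 0.0
        recall = true_positives / actual if actual else 0.0
        scores[label] = (precision, recall)
    return scores


# Toy data, invented for this example.
y_true = ["positive", "positive", "negative", "neutral", "negative", "positive"]
y_pred = ["positive", "neutral", "negative", "neutral", "positive", "positive"]

matrix = confusion_matrix(y_true, y_pred)
scores = precision_recall(matrix)

for label, row in zip(LABELS, matrix):
    print(f"{label:>8}: {row}")
for label, (precision, recall) in scores.items():
    print(f"{label:>8}: precision={precision:.2f} recall={recall:.2f}")
```

Reading the matrix row by row shows where a classifier confuses one class with another, which a single accuracy number hides.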

About the expert
Laura Tociu

I'm a full-stack data scientist with a Ph.D. in computational sciences. My journey into data science grew out of a deep curiosity about how the world works. Initially driven by a thirst for scientific understanding, I transitioned into applying my skills to real-world challenges. Along the way, I taught myself to code and took a range of data science courses to build the expertise needed to work with complex data.

Currently, I thrive in the dynamic environment of an ecommerce/deals website, where I apply my knowledge to classify and perform sentiment analysis on user comments. My role extends to ensuring data integrity through rigorous validation processes. Delving deep into data problems is not just a profession for me; it's a relentless pursuit that fuels my drive to make meaningful impacts through insightful analysis and innovative solutions.