This project develops a machine learning model to forecast a film's box office revenue. Using parameters such as budget, cast, genre, and past performance, the goal is to model box office dynamics and provide actionable insights for studios and filmmakers.
The TMDB_5000 dataset from Kaggle has been used to build numerous recommendation systems, but much of its potential remains untapped. This project uses the dataset to predict a film's expected revenue from a range of parameters and engineered features, helping stakeholders in the entertainment industry make better-informed decisions.
This section contains detailed information about the approach, experimentation results, and inferences derived from the project. I have created a blog explaining the approach and execution. Please visit my blog:
Frontend | Backend | ML Library | MLOps Tools | Deployment | Version Control |
---|---|---|---|---|---|
To predict expected revenue, we introduced a novel approach by considering footfall (number of tickets sold) as a target metric. While revenue is subject to various external factors such as ticket prices and distribution deals, footfall provides a more consistent and direct measure of a movie's popularity and audience engagement.
expected revenue = predicted footfall * current avg_ticket_price
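The formula above can be sketched in a few lines of Python. This is a minimal illustration rather than the project's code: the function name `expected_revenue` and the sample numbers are hypothetical.

```python
def expected_revenue(predicted_footfall: float, avg_ticket_price: float) -> float:
    """Convert a predicted footfall (tickets sold) into expected revenue."""
    if predicted_footfall < 0 or avg_ticket_price < 0:
        raise ValueError("footfall and ticket price must be non-negative")
    return predicted_footfall * avg_ticket_price


# Hypothetical inputs: 2 million tickets at an average ticket price of 10.5
revenue = expected_revenue(2_000_000, 10.5)
print(revenue)
```

Because the ticket price is applied after the prediction, the same footfall estimate can be repriced whenever the current average ticket price changes, without retraining the model.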
Model | Best Model |
---|---|
RandomForestRegressor | |
DecisionTreeRegressor | |
GradientBoostingRegressor | |
LinearRegression | |
XGBRegressor | XGBRegressor |
CatBoostRegressor | |
AdaBoostRegressor | |
Metric | Value |
---|---|
RMSE | 0.012 |
neg_mean_squared_error | -0.00024 |
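For readers unfamiliar with the two metrics: scikit-learn's `neg_mean_squared_error` scorer reports the mean squared error with its sign flipped so that higher is better, and RMSE is the square root of the mean squared error. The sketch below computes both with the standard library only. The helper name and the sample values are made up for illustration and are not the project's data.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up targets and predictions, for illustration only
y_true = [3.0, 5.0, 7.0]
y_pred = [3.0, 5.0, 10.0]

mse = mean_squared_error(y_true, y_pred)
neg_mse = -mse          # what a "neg_mean_squared_error" scorer would report
rmse = math.sqrt(mse)   # same units as the target

print(mse, neg_mse, rmse)
```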
Parameter | Value |
---|---|
colsample_bytree | 0.3 |
learning_rate | 0.11 |
max_depth | 4 |
n_estimators | 444 |
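The tuned values above map directly onto keyword arguments of `xgboost.XGBRegressor`. The following is a minimal sketch, assuming the `xgboost` package is installed. The names `BEST_PARAMS` and `build_best_model` are hypothetical, `colsample_bytree` is written as 0.3, and the training data and fitting step are not shown.

```python
# Tuned hyperparameters as reported in the table above.
BEST_PARAMS = {
    "colsample_bytree": 0.3,
    "learning_rate": 0.11,
    "max_depth": 4,
    "n_estimators": 444,
}


def build_best_model():
    """Return an unfitted XGBRegressor configured with BEST_PARAMS."""
    # Imported lazily so the parameters can be inspected without xgboost installed.
    from xgboost import XGBRegressor

    return XGBRegressor(**BEST_PARAMS)
```

Calling `build_best_model().fit(X_train, y_train)` would then train the model, where `X_train` and `y_train` stand for whatever feature matrix and target the pipeline produces.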
All the experiment results and models are logged in MLflow for a clearer understanding and detailed inference: View here
Home Page | Form Page | Result |
git clone https://github.com/uvaishnav/BoxOfficePrediction.git
conda create -n boxoffice python=3.9 -y
conda activate boxoffice
pip install -r requirements.txt
python app.py
Open your local host and port in a browser.
git clone https://github.com/uvaishnav/BoxOfficePrediction.git
conda create -n boxoffice python=3.9 -y
conda activate boxoffice
pip install -r requirements.txt
For the model evaluation pipeline, set your MLflow credentials first:
export MLFLOW_TRACKING_URI="your mlflow uri"
export MLFLOW_TRACKING_USERNAME="your username"
export MLFLOW_TRACKING_PASSWORD="your password"
dvc init
dvc repro
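Before running `dvc repro`, it can help to confirm that the three MLflow variables are actually set in the current shell. The check below is a small optional sketch using only the standard library. The helper name `missing_mlflow_vars` is hypothetical and not part of the repository.

```python
import os

# The three variables exported in the steps above.
REQUIRED_VARS = (
    "MLFLOW_TRACKING_URI",
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
)


def missing_mlflow_vars(env=None):
    """Return the names of required MLflow variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_mlflow_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("MLflow environment variables are set.")
```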
Update the Dockerfile as needed and build the Docker image. You need to install Docker Desktop first.
docker build -t boxoffice .
In your GitHub repository, go to Settings -> Secrets and Variables -> Actions.
Add the secret keys according to your main.yaml workflow file:
HEROKU_API_KEY
HEROKU_APP_NAME
HEROKU_EMAIL
Our current model predicts expected revenue based on factors like budget, cast, release month, and genres.
We can enhance its utility by optimizing cast selection and release timing. By analyzing historical data, we can identify optimal combinations of actors and crew members that synergize well, thereby maximizing revenue potential. Additionally, refining our model to recommend the best release windows can help avoid high competition periods and leverage seasonal trends, further boosting a film’s success.
This project is licensed under the GPL-3.0 License - see the LICENSE file for details.