Duvvuri Durga Prasad

About Me

Hello! I'm Duvvuri Durga Prasad, an aspiring Data Scientist with a strong foundation in Python, Machine Learning, and MLOps. I recently completed my PG Diploma in Data Science & Artificial Intelligence from the University of Liverpool and hold a Bachelor's degree in Mechanical Engineering.

I am passionate about transforming raw data into actionable insights that drive impactful business solutions. I have independently built and deployed end-to-end ML applications using Flask, Streamlit, Docker, and AWS, with projects spanning finance, healthcare, education, and food tech.

My journey from Mechanical Engineering to Data Science reflects my adaptability and drive to solve real-world problems through data-driven approaches. I value clean code, reproducibility, and data ethics, and thrive in collaborative, fast-paced environments where I can learn and contribute as a team player.

As a fresher, I bring a new perspective, a strong willingness to learn, and hands-on skills in modern ML and MLOps tools such as DVC, MLflow, and GitHub Actions. I am eager to contribute, grow, and make a meaningful impact in a forward-thinking organization.

Education & Certifications

Post Graduate Diploma in Data Science and Artificial Intelligence

University of Liverpool, UK | Dec 2024

Focus: Machine Learning, Deep Learning, AI, NLP, Data Visualization

Dissertation: Developed a Python-based 2D strategy game using graph theory.

Bachelor of Technology in Mechanical Engineering

Avanthi Institute, Hyderabad | May 2017

Project: Solar-powered car prototype focusing on sustainable energy design.

Courses & Certifications

Machine Learning with Python: Zero to GBMs (Aug 2023 – Feb 2024)
Data Analysis with Python: Zero to Pandas (May 2022 – Aug 2022)

Personal Projects

Web Development Projects

Portfolio Website

Deployed HTML CSS JavaScript

Developed a fully responsive personal portfolio website using HTML, CSS, and JavaScript. The website includes dark/light theme toggle, smooth scrolling, and a dynamic showcase of projects, skills, and contact form integration. The design is clean, professional, and mobile-friendly to ensure an optimal experience across devices.

Technologies: HTML, CSS, JavaScript

View Code Try Now

I began by designing the layout with semantic HTML5 and styled it using modern CSS3, focusing on a clean and responsive layout. JavaScript was used to handle the dark/light theme switch, smooth navigation, and dynamic content rendering for projects.

The website was version-controlled with Git and deployed using GitHub Pages for free and fast hosting. All project sections link directly to live demos and source code for transparency and easy access.

Key highlights:

Dark/light theme toggle
Fully responsive design
Integrated project showcase with live links
Deployed using GitHub Pages

NLP Projects

Movie Recommender System

Deployed Streamlit App NLP

This project is a content-based movie recommender system designed to suggest movies similar to a selected title. It uses metadata features like genres, cast, keywords, and overview to calculate similarity between movies using cosine similarity. To enhance the user experience, the app fetches live movie posters dynamically from The Movie Database (TMDB) API and displays them alongside recommendations.

Technologies: Python, NLP, Cosine Similarity, TMDB API, Streamlit

View Code Try Now

I started by collecting and cleaning movie metadata to create meaningful features for similarity comparison. Using TF-IDF vectorization, I converted textual features into numerical vectors. Cosine similarity was then used to compute similarity scores between movies.
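
As a rough illustration of the TF-IDF and cosine similarity steps described above, here is a minimal, dependency-free sketch. It is not the project's actual code: the movie titles, the `MOVIES` dictionary, and the function names are hypothetical, and a library vectorizer would be the usual choice in practice.

```python
import math
from collections import Counter

# Hypothetical toy metadata for illustration only. The real project combines
# genres, cast, keywords, and overview into one text field per movie.
MOVIES = {
    "Space Quest": "space adventure alien crew",
    "Galaxy Run": "space alien chase crew",
    "Love in Paris": "romance paris love story",
}


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using smoothed TF-IDF."""
    tokenized = [text.lower().split() for text in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(title, movies=MOVIES, top_n=2):
    """Return the top_n most similar titles as (title, score) pairs."""
    titles = list(movies)
    vectors = tfidf_vectors([movies[t] for t in titles])
    query = vectors[titles.index(title)]
    scored = [
        (other, cosine(query, vec))
        for other, vec in zip(titles, vectors)
        if other != title
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

Calling `recommend("Space Quest")` ranks "Galaxy Run" first because the two descriptions share several terms, while a title with no shared terms scores zero.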

The TMDB API was integrated to dynamically fetch high-quality movie posters to display with recommendations, improving visual appeal and user engagement.

The entire system was wrapped into an interactive web app using Streamlit, providing a simple and responsive UI that allows users to select movies and instantly view recommendations with posters.

Key highlights:

Advanced feature engineering combining multiple metadata features
Real-time API integration for poster retrieval
End-to-end deployment with Streamlit for an interactive user experience

MLOps & Deployment Projects

Credit Score Classification

Deployed MLOps CI/CD

This project implements a robust machine learning pipeline to classify credit scores, aiming to predict creditworthiness of loan applicants. It incorporates modern MLOps practices using DVC for dataset and pipeline versioning, MLflow for experiment tracking and model registry, and Flask for serving the trained model through a REST API. The application is fully containerized with Docker and deployed on AWS EC2, with automated CI/CD workflows via GitHub Actions.

Technologies: Python, DVC, MLflow, Flask, Docker, AWS EC2, GitHub Actions

View GitHub

I structured the project as an end-to-end ML pipeline, starting with data ingestion and preprocessing, followed by feature engineering, model training, and evaluation.

Using DVC, I version-controlled the datasets and pipeline stages, enabling reproducibility and easy collaboration. MLflow tracked experiments including model parameters and evaluation metrics, facilitating model comparison and selection.

The best model was wrapped in a Flask API for serving predictions, while the entire setup was containerized using Docker to ensure environment consistency. Deployment was automated on AWS EC2, and GitHub Actions CI/CD pipelines ensured code quality and smooth updates.

Key highlights:

Modular pipeline components enabling flexible experimentation.
Seamless experiment tracking and model registry with MLflow.
Robust deployment strategy with Docker and AWS.
CI/CD automation via GitHub Actions for testing and deployment.

Data Analysis & Insights Projects

NYC Restaurants Recommendation System

EDA Predictive Modeling Visualization

This project explores customer order data from NYC restaurants to uncover insights about cuisine popularity, customer satisfaction, and operational efficiency. By analyzing order counts, delivery times, and customer ratings, the project provides actionable recommendations to restaurant owners and food delivery platforms.

Technologies: Python, Pandas, Seaborn, Plotly, Matplotlib, Jupyter Notebook

View GitHub

I began by cleaning and preprocessing a large dataset of restaurant orders. Using exploratory data analysis and visualization techniques (Seaborn, Plotly), I identified key trends such as the popularity of American and Japanese cuisines, and how delivery efficiency correlates with customer satisfaction.
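
The kind of analysis described above can be sketched with the standard library alone. This is only an illustration: the rows in `ORDERS` are made up (they are not taken from the project's dataset, and their perfect correlation is an artifact of the toy values), and the field and function names are hypothetical.

```python
from collections import Counter
from math import sqrt

# Made-up rows for illustration only; not the project's data.
ORDERS = [
    {"cuisine": "American", "delivery_minutes": 20, "rating": 5},
    {"cuisine": "Japanese", "delivery_minutes": 25, "rating": 4},
    {"cuisine": "American", "delivery_minutes": 30, "rating": 3},
    {"cuisine": "Italian", "delivery_minutes": 35, "rating": 2},
]


def cuisine_popularity(orders):
    """Count orders per cuisine, most popular first."""
    return Counter(order["cuisine"] for order in orders).most_common()


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def delivery_rating_correlation(orders):
    """Correlation between delivery time and customer rating."""
    minutes = [order["delivery_minutes"] for order in orders]
    ratings = [order["rating"] for order in orders]
    return pearson(minutes, ratings)
```

A negative coefficient from `delivery_rating_correlation` would indicate that longer deliveries tend to go with lower ratings; on real data the relationship would be noisier than in this toy sample.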

I then built predictive models for delivery time, order cost, and customer ratings using relevant features such as cuisine type, preparation time, and day of the week.

The insights help stakeholders improve operational efficiency, predict delivery delays, and enhance customer experience through data-driven decisions.

Key highlights:

Comprehensive exploratory data analysis revealing customer behavior patterns.
Predictive modeling for critical operational metrics.
Interactive visualizations supporting decision-making.

Predictive Modeling Projects

Student Performance Prediction

Deployed Streamlit App ML Model

This project predicts students' exam performance based on factors such as study time, past grades, and other academic-related features. The model helps educators identify students who may need additional support.

Technologies: Python, Scikit-learn, Streamlit

View Code

Data preprocessing and feature selection were performed to identify the most relevant predictors of exam outcomes. I trained multiple machine learning algorithms, tuning hyperparameters to maximize predictive accuracy.

An interactive web application was created using Streamlit to allow users to input student data and instantly receive performance predictions.

Key highlights:

Feature engineering for improved model accuracy.
Model selection and tuning to identify optimal predictors.
Easy-to-use Streamlit app for real-time predictions.

Loan Prediction Web Application

Deployed Flask App SQLite database

This Flask web application predicts loan approval status based on applicant information. It leverages a trained Random Forest classifier to provide real-time decisions, helping users understand their loan eligibility.

Technologies: Flask, Random Forest, SQLite, Bootstrap, HTML/CSS

View GitHub Try Now

I trained a Random Forest classifier using cleaned loan application data and stored the model using pickle. The Flask app collects user inputs via Bootstrap-powered responsive forms and returns approval predictions.

Predictions and user contact messages are saved to an SQLite database for persistence. Front-end design was enhanced with custom CSS and JavaScript for usability.
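
To show roughly how predictions could be persisted to SQLite, here is a small sketch using Python's built-in `sqlite3` module. The table layout, column names, and helper functions are assumptions made for illustration; the project's actual schema.sql may differ.

```python
import sqlite3

# Hypothetical schema for illustration; not the project's schema.sql.
SCHEMA = """
CREATE TABLE IF NOT EXISTS predictions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    applicant_income REAL NOT NULL,
    loan_amount REAL NOT NULL,
    approved INTEGER NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def init_db(path=":memory:"):
    """Open a connection and create the table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_prediction(conn, applicant_income, loan_amount, approved):
    """Insert one prediction and return its row id."""
    # The connection context manager commits on success and rolls back on error.
    with conn:
        cursor = conn.execute(
            "INSERT INTO predictions (applicant_income, loan_amount, approved) "
            "VALUES (?, ?, ?)",
            (applicant_income, loan_amount, int(approved)),
        )
    return cursor.lastrowid


def recent_predictions(conn, limit=5):
    """Return the most recent predictions, newest first."""
    rows = conn.execute(
        "SELECT applicant_income, loan_amount, approved "
        "FROM predictions ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()
```

Parameterized queries (the `?` placeholders) keep user-supplied form values out of the SQL string, which matters for a public web form.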

Project Structure Highlights:

app.py: Flask backend application
templates/: HTML pages including forms and result views
static/: CSS, images, and JavaScript files
random_forest_model.pkl: Serialized model file
schema.sql: SQLite database schema

Heart Disease Prediction

ML Models Data Exploration Visualization

Using the Heart Disease UCI dataset, this project predicts heart disease presence based on clinical features like age, cholesterol, and blood pressure. Multiple models including Logistic Regression, Random Forest, and XGBoost were evaluated for accuracy and robustness.

Technologies: Python, Scikit-learn, XGBoost, Data Visualization

View GitHub

I performed extensive exploratory data analysis using boxplots, KDE plots, and scatter matrices to understand feature distributions and relationships. Feature engineering improved model inputs, while missing data was carefully handled.

I trained and tuned multiple classifiers, ultimately selecting the Random Forest model based on performance metrics. The final model was saved for future use and deployment.

Key highlights:

Thorough data exploration and visualization.
Comparative evaluation of different machine learning models.
Robust feature engineering and model selection.

Breast Cancer Prediction

Deployed Flask App Random Forest

This Flask web application uses a Random Forest classifier to predict whether a tumor is malignant or benign based on its characteristics. It provides instant predictions through an easy-to-use web interface.

Technologies: Flask, Random Forest, Scikit-learn, NumPy, Pickle

View GitHub

I developed the machine learning pipeline by training a Random Forest classifier on a labeled breast cancer dataset, tuning it for high accuracy and reliability.

The model was serialized using pickle and integrated into a Flask web app that allows users to input tumor features via a clean and responsive HTML form.

Static assets like CSS and JavaScript were used to enhance the UI/UX. The app immediately provides a prediction on submission, allowing for quick cancer risk assessments.

Project Structure Highlights:

app.py: Main Flask application handling routes and prediction logic
model.py: Contains the model training and loading functions
classifier_rf.pkl: Serialized Random Forest model
templates/index.html: User input form and result display
static/: CSS and JavaScript files for front-end styling and interaction

Prerequisites: Python 3.x, Flask, scikit-learn, NumPy. The pickle module ships with the Python standard library and needs no separate installation.

Prudential Life Insurance Assessment

Logistic Regression Decision Trees Feature Engineering

A predictive model for evaluating life insurance applications using decision trees and logistic regression on customer demographic and financial data.

Technologies: Logistic Regression, Decision Trees, Feature Engineering

View GitHub

I explored structured customer datasets and engineered features such as income-to-age ratio and employment status. I then trained classification models with hyperparameter tuning and evaluated them using ROC AUC and confusion matrix analysis.

Implementation Process:

Processed structured demographic and financial data
Engineered key features like income-to-age ratio
Trained Decision Tree and Logistic Regression models
Evaluated with ROC AUC and confusion matrix
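
The feature engineering and evaluation steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's code: the function names are hypothetical, labels are assumed to be encoded as 0 and 1, and in practice a library such as scikit-learn would normally provide these metrics.

```python
def income_to_age_ratio(income, age):
    """Engineered feature: income divided by age (age must be positive)."""
    if age <= 0:
        raise ValueError("age must be positive")
    return income / age


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels encoded as 0 and 1."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a random positive example
    receives a higher score than a random negative one (ties count 0.5)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

The pairwise definition of ROC AUC used here is quadratic in the number of examples, which is fine for a small illustration but slower than the rank-based computation used by standard libraries.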

Featured Projects

Movie Recommender System

Entertainment AI Streamlit App MLOps NLP

Content-based recommendation engine that vectorizes movie metadata (genres, cast, keywords, and overview) with TF-IDF and ranks titles by cosine similarity. It fetches movie posters live from the TMDB API and presents recommendations in an interactive Streamlit interface.

Real-time Recommendations

Dynamic Poster Fetching

View GitHub

Credit Score Classification

FinTech MLOps Docker AWS EC2

End-to-end ML pipeline built on MLOps practices: DVC for data and pipeline versioning, MLflow for experiment tracking and model registry, and GitHub Actions for automated CI/CD. The application is containerized with Docker and deployed on AWS EC2.

82% Accuracy

Automated Pipeline

View GitHub

Loan Approval App

FinTech Flask App ML Model Bootstrap UI

Comprehensive Flask web application designed for intelligent loan approval prediction using machine learning algorithms. Features a responsive Bootstrap UI with form validation, real-time prediction capabilities, and SQLite database integration for persistent data storage and prediction history tracking.

82% Accuracy

Real-time Predictions

View GitHub

Heart Disease Prediction

HealthTech ML Models

Machine learning web app to predict the likelihood of heart disease using patient health metrics. Built with XGBoost model, includes visual insights and real-time Flask interface.

86.8% Accuracy

Real-time Results

View GitHub

Breast Cancer Detection

Health AI Flask App

Flask-based prediction app for early breast cancer detection using health metrics. Processes data from the Wisconsin dataset and delivers fast, reliable results with high accuracy.

95.6% Accuracy

Fast Inference

View GitHub

Resume

Download my latest resume below.

Download Resume View Resume (PDF)

Technical Skills

Languages & Scripting: Python (Advanced), SQL, HTML, CSS, Excel
Machine Learning Frameworks: Scikit-learn, XGBoost, Pipelines (Sklearn, Joblib)
Machine Learning Techniques: Regression, Classification, Clustering, Model Evaluation (MAE, RMSE, F1-score, ROC-AUC), Feature Selection, Hyperparameter Tuning
Deep Learning Tools: TensorFlow, Keras
Deep Learning Techniques: Neural Networks, CNNs, RNNs, LSTMs, Transfer Learning, Attention Mechanisms
Data Handling & Analysis: Pandas, NumPy, SciPy
Web & Deployment Frameworks: Flask, Streamlit, FastAPI (Basic), Render
MLOps & DevOps Tooling: MLflow, DVC, Docker, Git, GitHub Actions, Linux CLI, CI/CD Automation
Cloud Platforms: AWS EC2 (Deployed Projects), Azure (Learning via Labs), GCP (Learning via Tutorials)
Visualization & BI Tools: Matplotlib, Seaborn, Plotly, Power BI
Databases: MySQL, SQLite
Version Control: Git, GitHub
Specialized Domains: NLP, LLMs, EDA, Feature Engineering, Data Preprocessing
Mathematical & Theoretical Foundations: Statistics, Linear Algebra, Algorithms, Applied Mathematics (Model Optimization & Feature Selection)

Soft Skills

Effective Communicator – Skilled at conveying complex technical concepts clearly to both technical and non-technical audiences.
Self-Motivated – Proactively built and deployed end-to-end machine learning applications independently.
Analytical Problem Solver – Applies critical thinking and data-driven approaches to overcome real-world challenges.
Team Player – Collaborated successfully on academic projects and cross-functional team initiatives.
Continuous Learner – Actively expanding expertise in MLOps, Cloud Computing, Natural Language Processing, and Generative AI.

Contact Me

Get in Touch