About Me
Hello! I'm Duvvuri Durga Prasad, an aspiring Data Scientist with a strong foundation in Python, Machine Learning, and MLOps. I recently completed my PG Diploma in Data Science & Artificial Intelligence from the University of Liverpool and hold a Bachelor's degree in Mechanical Engineering.
I am passionate about transforming raw data into actionable insights that drive impactful business solutions. I have independently built and deployed end-to-end ML applications using Flask, Streamlit, Docker, and AWS, with projects spanning finance, healthcare, education, and food tech.
My journey from Mechanical Engineering to Data Science reflects my adaptability and drive to solve real-world problems through data-driven approaches. I value clean code, reproducibility, and data ethics, and thrive in collaborative, fast-paced environments where I can learn and contribute as a team player.
As a fresher, I bring a new perspective, a strong willingness to learn, and proven hands-on skills in modern ML and MLOps tools such as DVC, MLflow, and GitHub Actions. I am eager to contribute, grow, and make a meaningful impact in a forward-thinking organization.
Education & Certifications
Post Graduate Diploma in Data Science and Artificial Intelligence
University of Liverpool, UK | Dec 2024
Focus: Machine Learning, Deep Learning, AI, NLP, Data Visualization
Dissertation: Developed a Python-based 2D strategy game using graph theory.
Bachelor of Technology in Mechanical Engineering
Avanthi Institute, Hyderabad | May 2017
Project: Solar-powered car prototype focusing on sustainable energy design.
Courses & Certifications
- Machine Learning with Python: Zero to GBMs (Aug 2023 – Feb 2024)
- Data Analysis with Python: Zero to Pandas (May 2022 – Aug 2022)
Personal Projects
Web Development Projects
Portfolio Website

I developed a fully responsive personal portfolio website using HTML, CSS, and JavaScript. The website includes a dark/light theme toggle, smooth scrolling, a dynamic showcase of projects and skills, and an integrated contact form. The design is clean, professional, and mobile-friendly to ensure a consistent experience across devices.
I began by structuring the pages with semantic HTML5 and styled them with modern CSS3, focusing on a clean, responsive layout. JavaScript handles the dark/light theme switch, smooth navigation, and dynamic rendering of the project content.
The website was version-controlled with Git and deployed using GitHub Pages for free and fast hosting. All project sections link directly to live demos and source code for transparency and easy access.
Key highlights:
- Dark/light theme toggle
- Fully responsive design
- Integrated project showcase with live links
- Deployed using GitHub Pages
NLP Projects
Movie Recommender System

This project is a content-based movie recommender system designed to suggest movies similar to a selected title. It uses metadata features like genres, cast, keywords, and overview to calculate similarity between movies using cosine similarity. To enhance the user experience, the app fetches live movie posters dynamically from The Movie Database (TMDB) API and displays them alongside recommendations.
I started by collecting and cleaning movie metadata to create meaningful features for similarity comparison. Using TF-IDF vectorization, I converted textual features into numerical vectors. Cosine similarity was then used to compute similarity scores between movies.
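The vectorization and similarity steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the toy titles and tag strings are hypothetical, and the function names are invented for the example. The smoothed IDF formula follows the one scikit-learn's TfidfVectorizer uses by default, on the assumption that a comparable vectorizer was used in the project.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using smoothed TF-IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(title, titles, tags, top_n=2):
    """Return the top_n titles whose tag text is most similar to `title`."""
    vectors = tfidf_vectors(tags)
    idx = titles.index(title)
    scored = [
        (cosine(vectors[idx], vec), other)
        for i, (vec, other) in enumerate(zip(vectors, titles))
        if i != idx
    ]
    return [other for _, other in sorted(scored, reverse=True)[:top_n]]


# Hypothetical catalogue: each string merges genres, cast, and keywords.
titles = ["Star Voyage", "Orbit Rescue", "Kitchen Romance"]
tags = [
    "scifi space adventure alien pilot",
    "scifi space comedy alien crew",
    "romance comedy chef restaurant",
]

print(recommend("Star Voyage", titles, tags, top_n=1))
```

Because cosine similarity normalizes vector length, a movie with a long overview is not favored over one with a short overview.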
The TMDB API was integrated to dynamically fetch high-quality movie posters to display with recommendations, improving visual appeal and user engagement.
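A rough sketch of how the poster lookup might work, using only the standard library. It assumes the TMDB v3 movie details endpoint and the `w500` image size; the function names are hypothetical, and the project's real request code is not shown here. A valid API key and network access are needed for `fetch_poster_url`.

```python
import json
import urllib.parse
import urllib.request

TMDB_API_BASE = "https://api.themoviedb.org/3"
TMDB_IMAGE_BASE = "https://image.tmdb.org/t/p/w500"  # assumed poster size


def poster_url_from_details(details):
    """Build a full poster URL from a movie-details payload, or return None."""
    poster_path = details.get("poster_path")
    if not poster_path:
        return None  # some titles have no poster
    return TMDB_IMAGE_BASE + poster_path


def fetch_poster_url(movie_id, api_key, timeout=10):
    """Request movie details from TMDB and return the poster URL."""
    query = urllib.parse.urlencode({"api_key": api_key})
    url = f"{TMDB_API_BASE}/movie/{movie_id}?{query}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        details = json.load(response)
    return poster_url_from_details(details)
```

Keeping the URL-building step separate from the network call makes it easy to test without hitting the API, and lets the app fall back to a placeholder image when `None` is returned.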
The entire system was wrapped into an interactive web app using Streamlit, providing a simple and responsive UI that allows users to select movies and instantly view recommendations with posters.
Key highlights:
- Advanced feature engineering combining multiple metadata features
- Real-time API integration for poster retrieval
- End-to-end deployment with Streamlit for an interactive user experience
MLOps & Deployment Projects
Credit Score Classification

This project implements a machine learning pipeline that classifies credit scores in order to predict the creditworthiness of loan applicants. It incorporates modern MLOps practices, using DVC for dataset and pipeline versioning, MLflow for experiment tracking and model registry, and Flask for serving the trained model through a REST API. The application is fully containerized with Docker and deployed on AWS EC2, with automated CI/CD workflows via GitHub Actions.
I structured the project as an end-to-end ML pipeline, starting with data ingestion and preprocessing, followed by feature engineering, model training, and evaluation.
Using DVC, I version-controlled the datasets and pipeline stages, enabling reproducibility and easy collaboration. MLflow tracked experiments including model parameters and evaluation metrics, facilitating model comparison and selection.
The best model was wrapped in a Flask API for serving predictions, while the entire setup was containerized using Docker to ensure environment consistency. Deployment was automated on AWS EC2, and GitHub Actions CI/CD pipelines ensured code quality and smooth updates.
Key highlights:
- Modular pipeline components enabling flexible experimentation.
- Seamless experiment tracking and model registry with MLflow.
- Robust deployment strategy with Docker and AWS.
- CI/CD automation via GitHub Actions for testing and deployment.
Data Analysis & Insights Projects
NYC Restaurants Recommendation System

This project explores customer order data from NYC restaurants to uncover insights about cuisine popularity, customer satisfaction, and operational efficiency. By analyzing order counts, delivery times, and customer ratings, the project provides actionable recommendations to restaurant owners and food delivery platforms.
I began by cleaning and preprocessing a large dataset of restaurant orders. Using exploratory data analysis and visualization techniques (Seaborn, Plotly), I identified key trends such as the popularity of American and Japanese cuisines, and how delivery efficiency correlates with customer satisfaction.
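The kind of check behind these trends can be sketched as follows. The order records below are hypothetical toy data, not figures from the project, and the analysis itself was presumably done with a data-frame library rather than plain Python.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical orders: (cuisine, delivery time in minutes, rating out of 5)
orders = [
    ("American", 20, 5), ("American", 25, 4), ("Japanese", 22, 5),
    ("Japanese", 30, 4), ("Italian", 35, 3), ("American", 28, 4),
]

# Cuisine popularity: number of orders per cuisine.
cuisine_counts = Counter(cuisine for cuisine, _, _ in orders)

# Relationship between delivery time and customer rating.
r = pearson([minutes for _, minutes, _ in orders],
            [rating for _, _, rating in orders])

print(cuisine_counts.most_common(2))
print(round(r, 2))
```

A negative coefficient on real data would support the observation that slower deliveries go with lower ratings, although correlation alone does not show that one causes the other.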
I built predictive models for delivery time, order cost, and customer ratings using features such as cuisine type, preparation time, and day of the week.
The insights help stakeholders improve operational efficiency, predict delivery delays, and enhance customer experience through data-driven decisions.
Key highlights:
- Comprehensive exploratory data analysis revealing customer behavior patterns.
- Predictive modeling for critical operational metrics.
- Interactive visualizations supporting decision-making.
Predictive Modeling Projects
Student Performance Prediction

This project predicts students' exam performance based on factors such as study time, past grades, and other academic-related features. The model helps educators identify students who may need additional support.
Data preprocessing and feature selection were performed to identify the most relevant predictors of exam outcomes. I trained multiple machine learning algorithms, tuning hyperparameters to maximize predictive accuracy.
An interactive web application was created using Streamlit to allow users to input student data and instantly receive performance predictions.
Key highlights:
- Feature engineering for improved model accuracy.
- Model selection and tuning to identify optimal predictors.
- Easy-to-use Streamlit app for real-time predictions.
Loan Prediction Web Application

This Flask web application predicts loan approval status based on applicant information. It leverages a trained Random Forest classifier to provide real-time decisions, helping users understand their loan eligibility.
I trained a Random Forest classifier using cleaned loan application data and stored the model using pickle. The Flask app collects user inputs via Bootstrap-powered responsive forms and returns approval predictions.
Predictions and user contact messages are saved to an SQLite database for persistence. Front-end design was enhanced with custom CSS and JavaScript for usability.
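The persistence step can be sketched with the standard-library `sqlite3` module. The table and column names here are hypothetical, since the contents of the project's schema file are not shown, and an in-memory database stands in for the real file.

```python
import sqlite3

# Hypothetical schema; the project's real schema may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS predictions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    applicant_income REAL NOT NULL,
    loan_amount REAL NOT NULL,
    approved INTEGER NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def init_db(path=":memory:"):
    """Open the database and create the table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_prediction(conn, applicant_income, loan_amount, approved):
    """Insert one prediction and return its row id."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO predictions (applicant_income, loan_amount, approved) "
            "VALUES (?, ?, ?)",
            (applicant_income, loan_amount, int(approved)),
        )
    return cursor.lastrowid


def prediction_history(conn):
    """Return all stored predictions in insertion order."""
    rows = conn.execute(
        "SELECT applicant_income, loan_amount, approved "
        "FROM predictions ORDER BY id"
    ).fetchall()
    return [(income, amount, bool(flag)) for income, amount, flag in rows]


conn = init_db()
save_prediction(conn, 5000.0, 120.0, True)
save_prediction(conn, 1800.0, 300.0, False)
print(prediction_history(conn))
```

Parameterized queries (the `?` placeholders) keep user-submitted form values from being interpreted as SQL.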
Project Structure Highlights:
- app.py: Flask backend application
- templates/: HTML pages including forms and result views
- static/: CSS, images, and JavaScript files
- random_forest_model.pkl: Serialized model file
- schema.sql: SQLite database schema
Heart Disease Prediction

Using the Heart Disease UCI dataset, this project predicts heart disease presence based on clinical features like age, cholesterol, and blood pressure. Multiple models including Logistic Regression, Random Forest, and XGBoost were evaluated for accuracy and robustness.
I performed extensive exploratory data analysis using boxplots, KDE plots, and scatter matrices to understand feature distributions and relationships. Feature engineering improved model inputs, while missing data was carefully handled.
I trained and tuned multiple classifiers, ultimately selecting the Random Forest model based on performance metrics. The final model was saved for future use and deployment.
Key highlights:
- Thorough data exploration and visualization.
- Comparative evaluation of different machine learning models.
- Robust feature engineering and model selection.
Breast Cancer Prediction

This Flask web application uses a Random Forest classifier to predict whether a tumor is malignant or benign based on its characteristics. It provides users with instant predictions through an easy-to-use web interface.
I developed the machine learning pipeline by training a Random Forest classifier on a labeled breast cancer dataset, tuning it for high accuracy and reliability.
The model was serialized using pickle and integrated into a Flask web app that allows users to input tumor features via a clean and responsive HTML form.
Static assets like CSS and JavaScript were used to enhance the UI/UX. The app immediately provides a prediction on submission, allowing for quick cancer risk assessments.
Project Structure Highlights:
- app.py: Main Flask application handling routes and prediction logic
- model.py: Contains the model training and loading functions
- classifier_rf.pkl: Serialized Random Forest model
- templates/index.html: User input form and result display
- static/: CSS and JavaScript files for front-end styling and interaction
Prerequisites: Python 3.x, Flask, scikit-learn, NumPy, and pickle (included in the Python standard library).
Prudential Life Insurance Assessment

A predictive model for evaluating life insurance applications using decision trees and logistic regression on customer demographic and financial data.
Implementation Process:
- Processed structured demographic and financial data
- Engineered key features like income-to-age ratio
- Trained Decision Tree and Logistic Regression models
- Evaluated with ROC AUC and confusion matrix
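The feature engineering and evaluation steps listed above can be illustrated with a small sketch. It assumes a binary outcome and uses hypothetical labels and scores; the function names are invented for the example, and the project itself presumably relied on library implementations of these metrics.

```python
def income_to_age_ratio(income, age):
    """Engineered feature: annual income divided by age in years."""
    if age <= 0:
        raise ValueError("age must be positive")
    return income / age


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels encoded as 0 and 1."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def roc_auc(y_true, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores for six applications.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(income_to_age_ratio(60000, 30))
print(confusion_matrix(y_true, y_pred))
print(round(roc_auc(y_true, scores), 3))
```

The confusion matrix depends on the 0.5 threshold chosen here, while ROC AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.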
Featured Projects
Movie Recommender System | Entertainment AI | Streamlit App | MLOps | NLP
Intelligent content-based recommendation engine leveraging advanced NLP techniques and cosine similarity algorithms. Features real-time movie poster fetching via TMDB API integration, interactive Streamlit interface, and comprehensive metadata processing for enhanced user experience.
Credit Score Classification | FinTech | MLOps | Docker | AWS EC2
Enterprise-grade end-to-end ML pipeline featuring comprehensive MLOps practices with DVC for data versioning, MLflow for experiment tracking, and automated CI/CD deployment. Containerized with Docker and deployed on AWS EC2 with scalable infrastructure management.
Loan Approval App | FinTech | Flask App | ML Model | Bootstrap UI
Comprehensive Flask web application designed for intelligent loan approval prediction using machine learning algorithms. Features a responsive Bootstrap UI with form validation, real-time prediction capabilities, and SQLite database integration for persistent data storage and prediction history tracking.
Heart Disease Prediction | HealthTech | ML Models
Machine learning web app to predict the likelihood of heart disease using patient health metrics. Built with an XGBoost model, it includes visual insights and a real-time Flask interface.
Breast Cancer Detection | Health AI | Flask App
Flask-based prediction app for early breast cancer detection using health metrics. Processes data from the Wisconsin dataset and delivers fast, reliable results with high accuracy.
Technical Skills
- Languages & Scripting: Python (Advanced), SQL, HTML, CSS, Excel
- Machine Learning Frameworks: Scikit-learn, XGBoost, Pipelines (Sklearn, Joblib)
- Machine Learning Techniques: Regression, Classification, Clustering, Model Evaluation (MAE, RMSE, F1-score, ROC-AUC), Feature Selection, Hyperparameter Tuning
- Deep Learning Tools: TensorFlow, Keras
- Deep Learning Techniques: Neural Networks, CNNs, RNNs, LSTMs, Transfer Learning, Attention Mechanisms
- Data Handling & Analysis: Pandas, NumPy, SciPy
- Web & Deployment Frameworks: Flask, Streamlit, FastAPI (Basic), Render
- MLOps & DevOps Tooling: MLflow, DVC, Docker, Git, GitHub Actions, Linux CLI, CI/CD Automation
- Cloud Platforms: AWS EC2 (Deployed Projects), Azure (Learning via Labs), GCP (Learning via Tutorials)
- Visualization & BI Tools: Matplotlib, Seaborn, Plotly, Power BI
- Databases: MySQL, SQLite
- Version Control: Git, GitHub
- Specialized Domains: NLP, LLMs, EDA, Feature Engineering, Data Preprocessing
- Mathematical & Theoretical Foundations: Statistics, Linear Algebra, Algorithms, Applied Mathematics (Model Optimization & Feature Selection)
Soft Skills
- Effective Communicator – Skilled at conveying complex technical concepts clearly to both technical and non-technical audiences.
- Self-Motivated – Proactively built and deployed end-to-end machine learning applications independently.
- Analytical Problem Solver – Applies critical thinking and data-driven approaches to overcome real-world challenges.
- Team Player – Collaborated successfully on academic projects and cross-functional team initiatives.
- Continuous Learner – Actively expanding expertise in MLOps, Cloud Computing, Natural Language Processing, and GenAI.