I’m an Applied AI and Data Systems professional focused on turning complex data into robust, production-ready intelligence across machine learning, MLOps, and analytics engineering. I’ve led projects in the banking, airline, and healthcare industries, designing forecasting models, fraud and risk solutions, and BI experiences that directly influence decisions at scale. I enjoy owning problems end to end, from SQL pipelines, cloud-based ML, and model monitoring to dashboards and storytelling that make insights clear, trustworthy, and actionable. My work sits at the intersection of AI engineering, data platforms, and user-centric design, and I’m always looking to build systems that are not just accurate but also reliable, interpretable, and aligned with real-world impact.
Built an XGBoost engagement model (ROC-AUC 0.76) with SHAP insights and KPIs to guide targeted outreach and quantify business impact for Medicare Advantage members.
Developed a multiclass XGBoost model to predict heavy-duty truck warranty cost segments from configuration and claims data, improving macro ROC-AUC to 0.80 and surfacing high-risk option bundles for design and pricing decisions.
Developed ML-based features and prototypes to analyze customer travel behavior and surface actionable insights for destination recommendation and operational planning.
Built a model in R to streamline the teaching assistant selection process, evaluating over 1,200 applications and assigning probability scores to identify top candidates.
Built an end-to-end classification pipeline on SpaceX launch data, including API/web scraping, SQL and Python-based EDA, feature engineering, and model evaluation to predict Falcon 9 landing success and support reusable-rocket cost optimization.
Led an end-to-end project management capstone to design and plan the AHI Marketing Data App, replacing fragmented, manual market-tracking processes with a single real-time decision support application.
Designed an interactive Tableau dashboard on Pantawid Indigenous Peoples MCCT data to profile beneficiaries by region, age, and gender, highlight underserved provinces, and surface insights for program coverage, education, and healthcare planning.
University of North Texas, United States
Python (Pandas, NumPy, Scikit-learn, Statsmodels, Matplotlib, Seaborn), R (dplyr, ggplot2), SAS (SAS EG, SAS DI, SAS Viya), Excel (Pivot Tables, Power Query, Functions), SQL (CTE, window functions, joins, performance tuning)
Supervised Learning, Unsupervised Learning, Ensemble Models (XGBoost, LightGBM, CatBoost), Random Forests, Gradient Boosting, Time Series Forecasting (SARIMA, ARIMA), Clustering (K-Means, Hierarchical), Recommendation Systems, Anomaly Detection, Root Cause Analysis, Feature Engineering, Model Evaluation (ROC-AUC, Precision, Recall, F1-score, Accuracy), Bias & Fairness Testing
Hypothesis Testing, A/B Testing, Experimental Design, Regression Modeling, Churn Analysis, Demand Forecasting, Segmentation, Confidence Intervals, p-value Interpretation, t-tests, Chi-square Tests, Correlation Analysis, Causality Analysis, Significance Testing
Apache Spark, PySpark, Airflow, Delta Lake, dbt (models, tests, docs), Data Quality Validation, ETL/ELT Pipeline Development
Microsoft Azure (Data & AI stack), AWS (SageMaker, Redshift, EC2, S3), GCP (BigQuery, Vertex AI – exposure), Cloud-based ML & Analytics Workloads
PostgreSQL, Snowflake, Amazon Redshift, SQL Server, BigQuery, Delta Lake, Databricks, Data Modeling for Analytics & Reporting
Power BI, Tableau, Tableau Prep, Looker Studio (Google Data Studio), Streamlit, Plotly, SAS Viya, Executive Dashboards, Interactive Dashboards, Drill-down Reporting
Credit Portfolio Analytics, Fraud Detection, Compliance Reporting, Regulatory Reporting (CCPA 2020), Customer Retention, Customer Lifetime Value, Operational Performance Monitoring, Dynamic Pricing, Logistics Cost Optimization
Agile & Scrum Methodologies, Work Breakdown Structure (WBS), Risk Identification & Mitigation Planning, Scope & Timeline Management, 30–70% Rules, Stakeholder Alignment & Status Reporting
Git, GitHub, Bitbucket, Confluence, JIRA, Agile, Scrum, Stakeholder Communication, Data-Driven Storytelling
Excellent Written and Verbal Communication Skills, Strong Problem-Solving Skills, Ability To Collaborate, Attention To Detail, Team Player, Self-Motivated
Senior Associate – Data Governance
“I had the pleasure of working with Rakesh and consistently found him to be dependable, detail-oriented, and proactive in his approach. He communicates clearly, collaborates well, and takes ownership of his responsibilities. Rakesh would be a strong asset to any team he joins.”
Recommendation, January 30, 2026
Data Scientist Principal – Lockheed Martin
“I had the privilege of collaborating with Rakesh Sharma during his position as my teaching assistant for two graduate-level courses in the Department of Advanced Data Analytics at the University of North Texas. Rakesh was selected from a highly competitive pool of candidates, which underscores both his extensive knowledge in data analytics and his exceptional interpersonal skills.
His contributions were invaluable; he played a crucial role in researching course content, organizing materials, developing assessments, and providing insightful feedback. Rakesh approached his responsibilities with enthusiasm and consistently surpassed my expectations. His positive attitude and effective communication skills—both verbal and written—greatly enhanced the learning environment.
Rakesh has shown kindness and encouragement to students who seek his assistance, and he is always willing to share his experiences with fellow teaching assistants. Given his outstanding performance as my TA, his proven track record in previous roles, and his dedication to continuous learning, I am confident that he will be a significant asset to your organization.”
Recommendation, November 8, 2024