Rakeshsarma Karra

AI/ML Engineer | Data Scientist

I build production-ready AI and data systems that are accurate, interpretable, and built to support decisions at scale.

My experience spans banking, airlines, and healthcare, where I have designed forecasting models, fraud and risk solutions, and BI experiences.

What I Do

Machine Learning & Intelligent Analytics

  • Design and deploy supervised, unsupervised, and time series models to solve real business problems across banking, airlines, healthcare, and nonprofit domains.
  • Focus on robust feature engineering, model evaluation, and interpretability techniques (e.g., SHAP, KPI analysis) to make model behavior transparent and actionable.
  • Translate model outputs into clear recommendations that help stakeholders make confident, data-driven decisions.

Data Engineering & Scalable Pipelines

  • Build ETL and ELT pipelines using SQL, Apache Spark, PySpark, Airflow, and dbt to move data reliably from source systems into analytics‑ready models.
  • Engineer data workflows that emphasize data quality, validation, and reproducibility across development and production environments.
  • Optimize performance and scalability so models, reports, and dashboards continue to work as data volume and complexity grow.

Business Intelligence & Decision Dashboards

  • Develop interactive dashboards in Power BI, Tableau, and Looker Studio that turn complex data into clear stories for operations, risk, and leadership teams.
  • Design drill‑down views, filters, and KPIs that support day‑to‑day monitoring and deeper “why” analysis across credit risk, marketing, and operations.
  • Align visualizations with how stakeholders think about the business, making insights intuitive, trustworthy, and easy to act on.

MLOps, Cloud & Responsible AI

  • Run ML and analytics workloads on Azure, AWS, and GCP, from experimentation through deployment, using Git‑based workflows and CI/CD practices.
  • Integrate monitoring, logging, and version control to make models easier to maintain, iterate on, and roll back when needed.
  • Incorporate data governance, regulatory reporting, and bias/fairness considerations so systems are accurate, compliant, and auditable.
Experience

Data Scientist – Community Dreams

April 2025 – Present United States
  • Developed and evaluated machine learning models in Python to analyze community program data (e.g., participation, outcomes, engagement), generating predictive insights that supported data-driven decision-making for nonprofit initiatives.
  • Built end-to-end analytical workflows using Python, SQL, and Excel to clean raw community datasets, engineer features, train baseline models, and visualize key findings for stakeholders, documenting the work in GitHub for version control and collaboration.

Machine Learning Engineer (Volunteer) – Murphy Charitable Foundation

Nov 2025 – Present United States
  • Contributing as an ML Engineer volunteer to design and prototype machine learning models in Python for Murphy Charitable Foundation’s new application supporting vulnerable communities (e.g., child sponsorship and donor engagement use cases).
  • Building and iterating on data pipelines using Python and SQL (data cleaning, feature engineering, basic model training), with experiments and notebooks version-controlled through Git and regularly pushed to GitHub.

Research Data Scientist – University of North Texas

January 2024 – December 2024 United States
  • Designed and delivered Python-based lab sessions covering advanced machine learning and big data analytics.
  • Conducted AI bias research, identifying and analyzing system, developer, and statistical biases in ML models.
  • Designed an AI reliability survey, which revealed that 68% of participants believed AI enhanced performance and provided insights into user trust and adoption.
  • Performed a comparative analysis of bias, hate speech detection, and sentiment classification across AI models from ChatGPT, Gemini, Meta AI, and Claude AI.

Data Scientist Intern – Humana Inc.

September 2024 – October 2024 United States
  • Built XGBoost Medicare Advantage engagement risk models on member-level claims and demographics to classify engaged vs unengaged beneficiaries, improving baseline accuracy by 70% and achieving ROC-AUC 0.76 to support impact-focused outreach and cost of care reduction.
  • Developed and deployed end-to-end ML workflows in Vertex AI (XGBoost) using BigQuery-backed training data, operationalizing scored engagement-risk outputs back into GCP BigQuery views for self-service analytics and monitoring.
  • Used SHAP interpretability and KPI analysis to build a measurement framework quantifying key drivers of low engagement, informing prescriptive outreach scenarios that simulated up to 40% uplift in engagement for high-risk members.

Data Scientist Intern – American Airlines

September 2023 – December 2023 United States
  • Improved flight demand forecasting accuracy 78% using Random Forest with advanced feature engineering and time-series trend analysis.
  • Automated Python/SQL data pipelines, reducing processing time by 60% and uncovering operational behavior patterns.
  • Built a congestion analysis framework using statistical metrics to support infrastructure capacity planning.

Data Scientist – Citi Bank

November 2021 – January 2023 India
  • Developed an automated feature engineering and fraud detection pipeline using Featuretools and XGBoost, improving model accuracy for high-risk transaction detection and data imputation by 15%.
  • Designed and orchestrated ETL workflows in Apache Airflow to migrate data from CMR to MDM, increasing data stewardship reliability to 76% and stabilizing downstream analytics.
  • Streamlined ingestion of large unstructured zip files into SAS by combining UNIX utilities (unzip, grep, awk) with SAS scripts, reducing data loading and preprocessing time by 40% compared to legacy workflows.
  • Enhanced data quality audits and compliance analytics using advanced SQL (CTEs, joins, subqueries, CASE WHEN) under CCPA 2020, accelerating validation and regulatory reporting.
  • Collaborated with engineering and product development teams to automate profiling and compliance reports using Python (Pandas) and Autosys/Bitbucket, reducing manual validation by ~30% and strengthening data integrity monitoring.

Data Scientist – ICICI Bank

March 2019 – October 2021 India
  • Built and validated predictive risk models in SAS Enterprise Miner and Python (Scikit-learn) for loan default prediction across products such as personal loans, gold loans, and fixed deposits, improving underwriting accuracy by 12%.
  • Implemented K-Means clustering in SAS and Python on 4M+ customer records to segment portfolios by behavior and holdings, enabling targeted cross-sell campaigns that increased conversion by 22%.
  • Automated portfolio performance reporting using PROC REPORT, dynamic SAS macros, and PROC SQL, cutting manual reporting time by 60% and providing near real-time visibility into loans, transactions, and customer behavior.
  • Optimized data pipelines with PROC SORT and macro-driven workflows, reducing query latency by 35% for fraud and compliance reports across 20+ banking products.
  • Prototyped VB portfolio dashboards and delivered executive-ready views for BI, Risk, and senior leadership stakeholders.
Tools & Technologies

Programming & Scripting

Python
R
SQL
SAS
Jupyter

Machine Learning & AI

Scikit-learn
PyTorch
TensorFlow
OpenAI
Hugging Face
MLflow

Statistical & Experimental

Hypothesis Testing
A/B Testing
Segmentation
Experimental Design
Causality Analysis

Data Engineering & MLOps

Apache Spark
PySpark
Airflow
Databricks
dbt
DagsHub
Docker
Kubernetes
Heroku

Cloud & Data Platforms

AWS
Azure
GCP
Snowflake
PostgreSQL

Data Visualization & BI

Power BI
Tableau
Plotly
Streamlit
Looker Studio
Executive Dashboards

Tools & Collaboration

Git
GitHub
Bitbucket
Jira
Confluence
Projects
Predicting Unengaged Medicare Advantage Members

Humana - Texas A&M Healthcare Analytics Case Competition

Built an XGBoost engagement model (ROC-AUC 0.76) with SHAP insights and KPIs to guide targeted outreach and quantify business impact for Medicare Advantage members.

Python Google Cloud Platform SQL Machine Learning
View on GitHub →
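
For readers unfamiliar with the metric quoted above, ROC-AUC can be read as the probability that a randomly chosen positive case (here, an unengaged member) receives a higher score than a randomly chosen negative one. The snippet below is only an illustrative sketch of that definition in plain Python; the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the project's code or data.

```python
from itertools import product


def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of ROC-AUC.

    labels: iterable of 0/1 ground-truth values (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs at least one positive and one negative")

    # Count positive/negative pairs that are ranked correctly; ties count as half.
    wins = 0.0
    for p, n in product(positives, negatives):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, for illustration only (not project data).
toy_labels = [1, 0, 1, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.6, 0.7, 0.8, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so 0.76 sits between the two. The quadratic pairwise loop is written for readability; on real data a library routine such as scikit-learn's `roc_auc_score` would be the practical choice.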
Attribute Based Truck Warranty Modeling

Peterbilt – UNT Business Analytics Hackathon

Developed a multiclass XGBoost model to predict heavy-duty truck warranty cost segments from configuration and claims data, improving macro ROC-AUC to 0.80 and surfacing high-risk option bundles for design and pricing decisions.

Python Jupyter Notebook SQL Machine Learning
View on GitHub →
Destination Recommendation Engine – Customer Travel Behavior Modeling

American Airlines Hackathon

Developed ML-based features and prototypes to analyze customer travel behavior and surface actionable insights for destination recommendation and operational planning.

Python Snowflake SQL Machine Learning
View on GitHub →
Teaching Assistant Eligibility Predictor

Teaching Assistant Eligibility Predictor using Deep Learning Models

Built a model in R to streamline the teaching assistant selection process, evaluating over 1,200 applications and assigning probability scores to identify top candidates.

R RShiny RStudio Neural Networks
View on GitHub →
SpaceX Falcon 9 Landing Success Prediction for Reusable Rocket Cost Optimization

Built an end-to-end classification pipeline on SpaceX launch data, including API/web scraping, SQL and Python-based EDA, feature engineering, and model evaluation to predict Falcon 9 landing success and support reusable-rocket cost optimization.

Python Web Scraping Predictive Analytics Data Visualization
View on GitHub →
AHI App Development - Project Management Capstone

Led an end-to-end project management capstone to design and plan the AHI Marketing Data App, replacing fragmented, manual market-tracking processes with a single real-time decision support application.

Project Management Stakeholder Management Work Breakdown Structure Agile Planning
View on GitHub →
Philippines MCCT Program Dashboard

Designed an interactive Tableau dashboard on Pantawid Indigenous Peoples MCCT data to profile beneficiaries by region, age, and gender, highlight underserved provinces, and surface insights for program coverage, education, and healthcare planning.

Tableau Tableau Prep
View on GitHub →
Artificial Intelligence Ethics, Bias & Education Research

Education

University of North Texas, TX, USA

M.S. in Data Science & Advanced Analytics

Built a strong foundation in statistical learning, machine learning, data systems, and applied analytics through industry-oriented coursework and project-based problem solving.

Relevant Coursework

Machine Learning Statistical Methods Predictive & Prescriptive Analytics Big Data Analytics Data Visualization Analytics Communication

Certifications

AWS Certified AI Practitioner

Amazon Web Services

Validates foundational knowledge in AI, machine learning concepts, and responsible AI on AWS.

View Credential
Databricks Certification

Databricks

Demonstrates applied capability in modern data engineering, analytics workflows, and scalable data platforms.

AI Facilitated Learning Network - THECB

Professional Development Program

Focused on applied AI learning, modern AI workflows, and practical integration of intelligent systems.

View Credential
UNT AI Fundamentals

University of North Texas

Strengthened core understanding of AI principles, foundational concepts, and responsible use of AI technologies.

IBM Data Science Professional Certificate

IBM · Coursera

Built hands-on foundations in Python, SQL, data analysis, visualization, machine learning, and real-world data science workflows.

Cloud Computing Professional Certificate

Coursera

Built practical skills in cloud concepts, virtual machines, containers, storage, networking, and application deployment.

Honors & Recognitions
AI In Action 2025 – Exploring User Bias and Hallucinations in Generative AI Systems

Poster Presentation – 2025
2024 Humana-Mays Healthcare Analytics Case Competition

National Healthcare Analytics Case Competition – 2024
American Airlines Machine Learning Competition

Machine Learning Hackathon – 2024
UNT – Tuition Benefit Program (Spring 2024)

Jan 2024 – May 2024
UNT – Tuition Benefit Program (Fall 2024)

Aug 2024 – Dec 2024
Silver Award – Citi Bank

California Consumer Privacy Act 2020 (CCPA) – CMR project
Bronze Award – Citi Bank

CMR to MDM migration project
Work Excellence Award – ICICI Bank

Recognition for outstanding delivery and performance
Event & Pictures
AI In Action 2025
Humana Presentation
American Airlines Skyview #7, Fort Worth
Organizations & Workshops

UNT AI in Action – Research Presenter & Graduate Participant

Sep 2024 – Apr 2025 United States
  • Completed the Texas Higher Education Coordinating Board’s AI Professional Development Program, gaining hands-on experience in prompt engineering, custom GPTs, AI-enhanced content creation, ethical AI, and trustworthy generative AI frameworks.
  • Co-authored and presented two research posters at the UNT AI in Action Workshop as: Karra, R.S., Kota, M., Mahadasu, M.P., Rajidi, S., “Ethical Considerations in AI Adoption for Education and Research” (GitHub) and “Exploring User Bias and Response Variability in Generative AI Systems” (GitHub), under the mentorship of Dr. Zeynep Orhan.
  • Collaborated with a cross-functional team of four graduate students to investigate bias, hallucination patterns, and fairness concerns in LLM-based systems using systematic prompt–response experiments and statistical analysis.
  • Contributed to responsible AI adoption frameworks by synthesizing empirical findings on ethical risks, user trust, and governance into posters and technical documentation for an interdisciplinary Human–AI Collaboration workshop.

UNT Business Analytics Club – Member

Aug 2024 – Dec 2024 United States
  • Participated in speaker sessions and case-based workshops on business analytics, data visualization, and predictive modeling, gaining exposure to real-world use cases and tools.
  • Collaborated with peers in analytics challenges and networking events, strengthening problem-solving, presentation skills, and industry connections.

UNT Data Science Talk Series – Graduate Student Attendee

Jan 2024 – Dec 2024 United States
  • Attended weekly talks by academic and industry experts on machine learning, AI, data visualization, big data analytics, and ethical data practices, broadening exposure to emerging trends and best practices.
  • Engaged with speakers on real-world applications of advanced analytics and AI-driven decision-making in sectors such as retail, supply chain, healthcare, and technology.
  • Participated in collaborative learning sessions on deep learning frameworks, cloud-based ML tools, NLP, and computational data science techniques to strengthen technical depth.
  • Networked with data science professionals, researchers, and graduate peers during Thursday sessions, exploring career pathways in analytics, ML engineering, and AI research.

UNT Society for Student AI Innovation – Member

Jan 2024 – Dec 2024 United States
  • Participated in workshops on Hugging Face models, including practical tokenization sessions emphasizing LLMs for real-world AI applications.
  • The society promotes collaborative projects, research, and learning in AI/ML, is open to all majors, and helps members build skills in model deployment and innovation.
  • It engages members through events such as online workshops that explore AI usability, including fine-tuning LLMs for natural language processing tasks and ethical AI use.
  • Awarded a Participation Certificate for the online workshop “How to Use Hugging Face AI Models?” in Feb 2026, recognizing active engagement in LLM and AI usability sessions (Certificate Link)
Professional Development
Tableau Public

Interactive dashboards and visual analytics projects showcasing storytelling with data.

View Profile
Credly

Verified badges and credential achievements across cloud, analytics, and AI learning.

View Profile
Coursera

Completed courses and specializations in machine learning, data science, and cloud topics.

View Profile
DataCamp

Hands-on learning, projects, and portfolio-based practice in Python, SQL, and analytics.

View Profile

Recommendations

Akhil Jha

Assistant Vice President Product Management - RBL Bank

View on LinkedIn

“I highly recommend Rakesh Karra for his strong SQL skills and superb understanding of data analysis. He is very proficient in handling large datasets, writing efficient queries, and converting raw data into meaningful insights that support business decisions. We have worked together and collaborated on various projects.”

Recommendation, March 14, 2026

Vinay MS

Senior Business Analyst - Citi Bank

View on LinkedIn

“I had the privilege of working with Rakesh on AML and financial crime analytics initiatives, where he consistently demonstrated strong expertise in transaction monitoring, KYC controls, and regulatory expectations aligned with global AML regulatory standards. He has a strong understanding of risk-based AML frameworks, effectively connecting KYC quality, transaction monitoring, and investigative outcomes to strengthen overall financial crime controls.

Rakesh combines analytical depth with strong technical capability, working confidently with large datasets to refine monitoring scenarios and enhance the detection of suspicious activity. His proactive mindset and focus on improving monitoring effectiveness make him a valuable asset to any financial crime or compliance team.”

Recommendation, March 07, 2026

Rahul K

Senior Associate Data Governance - JPMorgan Chase

View on LinkedIn

“I had the pleasure of working with Rakesh and consistently found him to be dependable, detail-oriented, and proactive in his approach. He communicates clearly, collaborates well, and takes ownership of his responsibilities. Rakesh would be a strong asset to any team he joins.”

Recommendation, January 30, 2026

John Schroeder

Data Scientist Principal – Lockheed Martin

View on LinkedIn

“I had the privilege of collaborating with Rakesh Sarma during his position as my teaching assistant for two graduate-level courses in the Department of Advanced Data Analytics at the University of North Texas. Rakesh was selected from a highly competitive pool of candidates, which underscores both his extensive knowledge in data analytics and his exceptional interpersonal skills.

His contributions were invaluable; he played a crucial role in researching course content, organizing materials, developing assessments, and providing insightful feedback. Rakesh approached his responsibilities with enthusiasm and consistently surpassed my expectations. His positive attitude and effective communication skills—both verbal and written—greatly enhanced the learning environment.

Rakesh has shown kindness and encouragement to students who seek his assistance, and he is always willing to share his experiences with fellow teaching assistants. Given his outstanding performance as my TA, his proven track record in previous roles, and his dedication to continuous learning, I am confident that he will be a significant asset to your organization.”

Recommendation, November 8, 2024

Let's Connect.

I am open to opportunities in Data Science, Machine Learning, and Analytics Engineering.