
Mehdi

AI/ML Associate Architect


Cell: (304) 449-4658 ▪ E-mail: mehdidatas1@gmail.com

SUMMARY
 Data Science Engineer with 10+ years of experience in designing, developing, and
implementing data-driven solutions. Proficient in various data analysis and machine learning
tools and frameworks such as Python, R, TensorFlow, and scikit-learn.
 Developed and deployed data pipelines using Apache Spark and Apache Kafka to process
large-scale datasets, ensuring high throughput and low latency.
 Created and maintained ETL (Extract, Transform, Load) processes to collect and preprocess
raw data from various sources, including databases, REST APIs, and streaming platforms.
 Leveraged Hadoop Distributed File System (HDFS) to store and manage petabytes of
structured and unstructured data, enhancing data accessibility and scalability.
 Implemented machine learning models using Python and libraries such as Scikit-Learn,
TensorFlow, and PyTorch for predictive analytics and anomaly detection.
 Utilized Jupyter Notebook for interactive data exploration, model development, and
visualization, facilitating rapid experimentation and model fine-tuning.
 Employed version control systems, such as Git, to track changes in code, collaborate with
team members, and manage codebase history effectively.
 Conducted data preprocessing tasks, including data cleaning, feature engineering, and data
imputation, to enhance data quality and enable more accurate model training.
 Collaborated with domain experts and stakeholders to define project objectives, KPIs, and
success criteria, ensuring alignment with business goals.
 Designed and developed RESTful APIs using Flask and FastAPI to serve machine learning
models and enable seamless integration with external systems (a minimal serving sketch follows this summary).
 Built and maintained cloud-based data storage solutions, including Amazon S3 and Google
Cloud Storage, for efficient data archiving and retrieval.
 Conducted A/B testing and hypothesis testing using statistical tools like SciPy and performed
rigorous statistical analysis to evaluate model performance and validate hypotheses.
 Implemented real-time monitoring and alerting systems using Grafana and Prometheus to
track data pipeline health, model accuracy, and system performance.
 Leveraged containerization technologies, such as Docker and Kubernetes, to ensure
consistent deployment and scalability of data science applications.
 Designed and maintained data warehouses using technologies like Amazon Redshift and
Google BigQuery for business intelligence and reporting.
 Automated recurring tasks and job scheduling using Apache Airflow to ensure data pipelines
run efficiently and reliably.
 Employed time series analysis techniques, such as ARIMA and Prophet, to forecast trends
and make data-driven decisions, particularly for demand forecasting and capacity planning.
 Collaborated with data engineers to optimize SQL queries and database performance,
ensuring efficient data retrieval and processing.
 Conducted feature selection and dimensionality reduction using techniques like PCA
(Principal Component Analysis) and LDA (Linear Discriminant Analysis) to improve model
efficiency and interpretability.
 Deployed models to production environments using cloud-based services like AWS
SageMaker and Google AI Platform for online inference and real-time decision support.
 Designed data visualizations and dashboards using tools like Tableau and Power BI to
communicate insights and findings to non-technical stakeholders.
 Collaborated with DevOps teams to ensure seamless integration of data science solutions
into CI/CD pipelines, ensuring robust, automated testing and deployment.
 Implemented natural language processing (NLP) techniques, including sentiment analysis,
entity recognition, and topic modeling, for text data analysis and content recommendation
systems.
 Conducted data privacy and security assessments to ensure compliance with GDPR, HIPAA,
and other regulatory requirements, implementing encryption and access control measures.
 Stayed updated with the latest advancements in data science and machine learning,
regularly attending conferences and participating in online courses.
 Conducted code reviews and provided mentorship to junior data scientists, fostering a
culture of knowledge sharing and code quality.
 Collaborated with data architects to design data models and schema for efficient data
storage and retrieval.
 Conducted performance tuning of machine learning models, optimizing hyperparameters
and model architecture for better accuracy and efficiency.
 Supported data-driven decision-making by creating interactive and user-friendly data
exploration tools and dashboards for non-technical stakeholders.
 Performed root cause analysis and debugging of data-related issues, taking a systematic and
data-driven approach to problem-solving.
 Collaborated with data engineers to implement and optimize data indexing and search
capabilities, enabling faster and more accurate data retrieval.
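
The model-serving pattern above can be sketched as follows; this is a minimal illustration assuming a pre-trained scikit-learn estimator saved as model.joblib (the file name, feature layout, and endpoint are hypothetical, not drawn from a specific project):

    # Minimal FastAPI model-serving sketch; names marked hypothetical above.
    from typing import List

    import joblib
    import numpy as np
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()
    model = joblib.load("model.joblib")  # hypothetical pre-trained estimator

    class Features(BaseModel):
        values: List[float]  # flat feature vector in the order the model expects

    @app.post("/predict")
    def predict(features: Features):
        X = np.asarray(features.values).reshape(1, -1)
        return {"prediction": model.predict(X).tolist()}

Served with an ASGI server such as uvicorn (e.g., uvicorn app:app), this exposes a single JSON prediction endpoint, matching the integration pattern described above.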

TECHNICAL SKILLS
Data Analysis: Python (v3.7 - v3.9), Pandas, NumPy, SciPy, Matplotlib, Seaborn, Jupyter Notebook, RStudio
Machine Learning Algorithms and Models: Scikit-Learn, TensorFlow, Keras, PyTorch, XGBoost, LightGBM
Data Visualization: Tableau, Power BI, Plotly, D3.js, ggplot2
Data Analysis and Visualization: Python, R
Time Series Analysis and Forecasting: ARIMA, Prophet
Data Engineering: Apache Kafka, Airflow, ETL Processes
Data Warehousing: Snowflake, Redshift, Teradata
Database Management Systems: MySQL, PostgreSQL
Cloud Computing: AWS, Azure, GCP
Data Wrangling: Pandas, dplyr, data.table
Version Control: Git (v2.x), SVN
Scripting Languages: Shell Scripting, Perl, Ruby


EMPLOYMENT EXPERIENCE
AbbVie
Remote (Dec’2021 – Present)
AI/ML Associate Architect

 Actively involved in the design and architectural planning of AI/ML solutions. This includes
creating detailed technical specifications for AI models, defining data pipelines, and ensuring
compatibility with TensorFlow, emphasizing the use of neural networks, recurrent neural
networks (RNNs), and convolutional neural networks (CNNs).
 Development and training of AI/ML models using Python, leveraging libraries like NumPy,
Pandas, and Scikit-Learn. This involved optimizing hyperparameters, choosing appropriate
loss functions, and implementing early stopping strategies to improve model accuracy and
convergence.
 Utilizing Keras, implementation of deep learning techniques for various applications,
including natural language processing (NLP), image recognition, and time series forecasting.
Deep learning models such as LSTM, GRU, and CNN achieved state-of-the-art results on
complex tasks (a minimal LSTM forecasting sketch follows this section).
 Assessment of the suitability of PyTorch for specific AI projects, comparing its performance
and ease of use against other frameworks. This involved conducting benchmark tests and
identifying scenarios where PyTorch proved advantageous.
 Design and implementation of containerized solutions for deploying AI models to various
environments. These containers ensure consistency in model deployment, enabling
seamless integration into production systems.
 Expertise in deploying AI/ML models in various production environments, including cloud
platforms (AWS, Azure, GCP) and on-premises setups, ensuring optimal performance and
scalability.
 Proficient in leveraging AWS services for scalable and efficient data processing, including but
not limited to Amazon S3 for storage and AWS Glue for ETL workflows.
 Integration of AI/ML projects into CI/CD pipelines for automated testing, building, and
deployment. This ensures that AI models are continuously validated and deployed
efficiently.
 Work on scalability and performance optimization for AI applications, utilizing Apache Spark
for distributed data processing. This enables handling large datasets and improving model
training speed.
 Proficient in designing and managing Oracle databases for efficient data storage and
retrieval.
 Utilization of NLTK for text preprocessing, tokenization, and sentiment analysis.
Implementation of custom NLP models for tasks like named entity recognition and
document classification.
 In computer vision applications, employment of OpenCV for image processing, object
detection, and feature extraction. This includes implementing techniques like Haar cascades
and SIFT for various image-related tasks.
 Setup of monitoring systems to track the performance and health of AI/ML models in
production. This proactive approach allows for quick identification and resolution of issues.
 Maintenance of detailed technical documentation using Confluence, providing
comprehensive information on model architectures, data pipelines, and best practices. This
documentation facilitates knowledge sharing within the team.
 Use of JIRA for project management, task tracking, and collaboration with cross-functional
teams. This ensures that AI/ML projects align with organizational goals and timelines.
 Playing a role in ensuring AI/ML solutions comply with ethical guidelines and regulations,
including GDPR and fairness principles. This involved developing data anonymization
techniques and model bias reduction strategies.
 Employing TensorBoard for model visualization and performance tuning. Analyzing training
metrics and visualizing model architectures helps in fine-tuning AI/ML models for better
results.
 Contributed to the development of data access layers, employing SQL and MongoDB, to
ensure the smooth integration of data storage and retrieval functionalities within web
applications and data science pipelines.
 Utilized Snowflake's Data Sharing feature to securely share live data across business units
without needing to move or copy it, enhancing inter-departmental data sharing.
 Identification and resolution of issues using Bugzilla for efficient bug tracking and
management throughout the AI/ML development lifecycle.

Environment: TensorFlow, Python, Keras, PyTorch, Docker, Jenkins, Snowflake, Apache Spark, NLTK,
OpenCV, Prometheus, Confluence, JIRA, and Bugzilla.
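
As a minimal illustration of the LSTM forecasting work referenced above, the sketch below trains a small Keras model on a synthetic series; the window length, layer sizes, and data are assumptions made for the example, not project settings:

    import numpy as np
    from tensorflow import keras

    WINDOW = 24  # hypothetical look-back window

    def make_windows(series, window=WINDOW):
        # Slice a 1-D series into (samples, window, 1) inputs and next-step targets.
        X = np.stack([series[i:i + window] for i in range(len(series) - window)])
        return X[..., np.newaxis], series[window:]

    model = keras.Sequential([
        keras.Input(shape=(WINDOW, 1)),
        keras.layers.LSTM(32),
        keras.layers.Dense(1),  # predicts the next value in the series
    ])
    model.compile(optimizer="adam", loss="mse")

    series = np.sin(np.linspace(0, 20, 500))  # toy data standing in for real signals
    X, y = make_windows(series)
    model.fit(X, y, validation_split=0.2, epochs=5, batch_size=32,
              callbacks=[keras.callbacks.EarlyStopping(patience=2)], verbose=0)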

Amdocs
New York City, NY (Apr’2019 – Nov’2021)
Sr. Data Scientist

 Developed predictive models using Python and scikit-learn to improve demand forecasting
accuracy.
 Employed deep learning techniques with TensorFlow to create a recommendation engine,
boosting user engagement on the e-commerce platform.
 Collaborated with the DevOps team to deploy machine learning models using Kubernetes and
Docker, ensuring scalability and reliability.
 Conducted exploratory data analysis (EDA) using Pandas and NumPy to identify data patterns
and anomalies, enhancing data quality and insights.
 Developed and maintained data pipelines using Apache Airflow to automate data ingestion,
transformation, and loading processes.
 Effectively utilized Azure Databricks for big data analytics and large-scale machine learning
workloads, significantly improving data processing speed.
 Implemented natural language processing (NLP) algorithms with NLTK and spaCy for
sentiment analysis, improving sentiment scoring accuracy.
 Collaborated with cross-functional teams to define key performance indicators (KPIs) and
designed interactive Tableau dashboards for real-time monitoring and decision-making.
 Optimized model hyperparameters using grid search and random search techniques,
improving model performance by 10% and reducing training time.
 Conducted A/B tests using Apache Spark for evaluating new features, leading to data-driven
product improvements and increased user engagement.
 Utilized Apache Hadoop and Hive for big data processing and analysis, handling large datasets
efficiently.
 Built and maintained Docker container clusters managed by Kubernetes on GCP, working with
Linux, Bash, Python, and Git; used Kubernetes and Docker as the runtime environment of the
CI/CD system to build, test, and deploy.
 Demonstrated expertise in AWS Big Data services, such as Amazon EMR for distributed data
processing and Amazon Redshift for high-performance data warehousing.
 Developed data models and schemas suitable for MongoDB's document-oriented
architecture.
 Integrated machine learning models into production systems through RESTful APIs using Flask
and Django, ensuring seamless deployment.
 Collaborated with data engineers to maintain data lakes and data warehouses, utilizing HDFS
and Amazon Redshift for data storage and retrieval.
 Employed version control with Git and GitFlow for managing code repositories, promoting
collaboration and code consistency within the team.
 Conducted feature engineering and selection techniques, including Principal Component
Analysis (PCA) and Recursive Feature Elimination (RFE), to improve model accuracy.
 Led a team of data scientists in cross-validation and time-series analysis, incorporating
models like XGBoost and LightGBM for better predictive accuracy.
 Collaborated on the development of anomaly detection algorithms using Isolation Forest and
One-Class SVM, detecting anomalies in real-time sensor data with 95% accuracy (a minimal Isolation Forest sketch follows this section).

Environment: Python, scikit-learn, TensorFlow, Kubernetes, Docker, Pandas, Tableau, Apache Spark,
Airflow, and Hadoop.
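
A short sketch of the Isolation Forest anomaly-detection approach noted above; the synthetic sensor readings and contamination rate are illustrative assumptions, not production values:

    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.default_rng(42)
    normal = rng.normal(0.0, 1.0, size=(1000, 3))  # typical sensor readings
    spikes = rng.uniform(6.0, 9.0, size=(10, 3))   # injected anomalies
    X = np.vstack([normal, spikes])

    clf = IsolationForest(contamination=0.01, random_state=42).fit(X)
    labels = clf.predict(X)  # +1 = normal, -1 = anomaly
    print(f"flagged {(labels == -1).sum()} of {len(X)} readings as anomalous")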

Breezline
Bellevue, WA (Feb’2016 – Mar’2019)
Data Scientist

 Utilized Python to conduct data analysis and modeling, employing libraries such as Pandas
and NumPy for data manipulation and computation. Employed Jupyter Notebook for
interactive data exploration and algorithm development.
 Employed Scikit-learn for machine learning tasks, such as classification and regression.
Applied statistical techniques and feature engineering to build predictive models and fine-
tuned hyperparameters to enhance model performance.
 Managed and transformed large datasets using Apache Hadoop and HDFS for distributed
storage, as well as Apache Spark for distributed data processing. Wrote PySpark code to
efficiently extract insights from big data.
 Designed and developed deep learning models using TensorFlow and Keras for tasks like
image recognition and natural language processing. Tuned neural network architectures and
conducted training on GPUs to accelerate processing.
 Utilized SQL (Structured Query Language) for data extraction, transformation, and loading
(ETL) processes. Developed complex queries in PostgreSQL to retrieve and manipulate data
from relational databases.
 Developed and executed big data analytics workflows using tools like Apache Spark within the
Cloudera distribution.
 Constructed interactive and dynamic data visualizations using D3.js and Matplotlib for web-
based dashboards and reports. Employed Plotly for creating interactive graphs within Jupyter
notebooks.
 Employed Git for version control, collaborating with cross-functional teams to manage code
repositories and track changes to data science projects.
 Developed automated data pipelines using Apache Airflow to schedule and orchestrate data
extraction, transformation, and loading processes, creating custom Python operators for
specific tasks (a minimal DAG sketch follows this section).
 Performed natural language processing (NLP) tasks using NLTK and spaCy libraries. Leveraged
pre-trained word embeddings like Word2Vec and GloVe for sentiment analysis and text
classification.
 Implemented dimensionality reduction techniques such as Principal Component Analysis
(PCA) and t-SNE (t-Distributed Stochastic Neighbor Embedding) for feature engineering and
data visualization.
 Conducted A/B testing (split testing) using Python and tools like SciPy to assess the impact of
data-driven decisions on product or process improvements.
 Collaborated with the DevOps team to deploy machine learning models using Docker
containers. Worked on model versioning and model serving through RESTful APIs.
 Employed Linux as the primary operating system for development and production
environments, leveraging shell scripting for automation and server maintenance.
 Employed Elasticsearch for real-time data indexing and search functionality, enhancing data
retrieval capabilities within applications.
 Managed and secured data with Apache Kafka for real-time event streaming, processing, and
analytics.
 Proficient in designing and implementing robust data solutions, with expertise in MongoDB
for handling unstructured and semi-structured data, and SQL for managing structured
relational databases.
 Conducted data cleaning and preprocessing using PyTorch and TensorFlow Transform for
building robust machine learning pipelines.
 Experience in deploying and managing data storage solutions such as Apache HBase within
the Cloudera ecosystem.
 Maintained documentation and knowledge sharing within the team using Confluence and
integrated automated reporting with Slack and email notifications for project updates.
 Leveraged Tableau for creating interactive data visualizations and dashboards for business
stakeholders.

Environment: Python, Pandas, NumPy, Scikit-learn, Apache Hadoop, Apache Spark, TensorFlow,
Matplotlib, Apache Kafka, TensorFlow Transform, Confluence, Slack, and Tableau.
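
A minimal Airflow DAG sketch of the extract/transform/load orchestration described above; imports follow Airflow 2.x conventions, and the DAG id, schedule, and task bodies are placeholders:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        print("pull raw data from source systems")     # placeholder

    def transform():
        print("clean and reshape the extracted data")  # placeholder

    def load():
        print("write results to the warehouse")        # placeholder

    with DAG(
        dag_id="etl_pipeline_sketch",     # hypothetical DAG name
        start_date=datetime(2018, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        t1 = PythonOperator(task_id="extract", python_callable=extract)
        t2 = PythonOperator(task_id="transform", python_callable=transform)
        t3 = PythonOperator(task_id="load", python_callable=load)
        t1 >> t2 >> t3  # run the three stages in sequence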

MITRE Corporation
McLean, VA (Dec’2014 – Jan’2016)
AI/ML Engineer

 Developed machine learning models using TensorFlow for tasks such as image classification,
natural language processing, and recommendation systems. Implemented deep neural
networks, including convolutional and recurrent networks, to improve model accuracy.
 Cleaned and prepared large datasets for model training, ensuring data quality and
consistency. Utilized Python and Pandas for data wrangling, feature engineering, and data
augmentation.
 Implemented machine learning algorithms for supervised and unsupervised learning tasks.
Fine-tuned hyperparameters and optimized model performance through cross-validation.
 Conducted feature selection to improve model efficiency and interpretability. Employed
techniques like Recursive Feature Elimination (RFE) and feature importance analysis with
decision trees.
 Assessed model performance using Python, creating visualizations of metrics such as
accuracy, precision, recall, and F1-score. Conducted A/B testing to evaluate model
improvements.
 Integrated machine learning models into production systems using Python and Flask, creating
RESTful APIs for real-time predictions. Ensured seamless model deployment and maintained
version control.
 Leveraged NLTK to preprocess and analyze text data, including tokenization, part-of-speech
tagging, and sentiment analysis. Developed chatbots and text classification models.
 Utilized OpenCV for computer vision tasks, including image processing, object detection, and
optical character recognition (OCR). Designed solutions for image-based automation.
 Worked with Hadoop and Spark to handle and process large-scale datasets. Developed
distributed data pipelines for parallel processing and feature extraction.
 Implemented monitoring solutions in Python to track model performance in real-time. Set up
alerts and triggers for model retraining.
 Collaborated with cross-functional teams using JIRA for project management and Confluence
for documentation. Documented model architectures, data flows, and best practices.
 Automated model testing and deployment pipelines. Ensured consistent and reliable model
updates and releases.
 Managed codebase and model versions using Git. Collaborated with team members to
resolve code conflicts and maintain a clean code repository.
 Optimized deep learning algorithms for efficiency, leveraging NumPy for array manipulation.
 Orchestrated and containerized machine learning workloads using Docker and managed
container clusters for scalability and reliability.
 Employed SHAP for model interpretability, explaining model predictions and ensuring
compliance with regulatory requirements (a minimal SHAP sketch follows this section).
 Addressed privacy concerns by implementing differential privacy techniques.
 Regularly communicated project progress and results to stakeholders, translating technical
findings into actionable business insights.

Environment: Scikit-Learn, Matplotlib, Flask, NLTK, OpenCV, JIRA, Confluence, Jenkins, Git, NumPy,
Docker and SHAP
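
A sketch of the SHAP interpretability step referenced above; the toy dataset and random-forest model are stand-ins for the production system:

    import shap
    from sklearn.datasets import load_diabetes
    from sklearn.ensemble import RandomForestRegressor

    data = load_diabetes()
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(data.data, data.target)

    # Per-feature contributions for the first 50 predictions; the summary
    # plot shows which features drive the model's output overall.
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(data.data[:50])
    shap.summary_plot(shap_values, data.data[:50],
                      feature_names=data.feature_names, show=False)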

SoftSol
Hyderabad, India (Apr’2013 – Aug’2014)
Machine Learning Associate

 Developed and maintained Java applications, ensuring compatibility and performance
optimization.
 Collaborated with the development team to design and implement software solutions,
following Agile methodologies and version control with Git.
 Worked with JavaEE for building robust and scalable web applications, implementing RESTful
web services using JAX-RS.
 Utilized the Spring Framework for building the application's core components, configuring
Spring beans, and handling data access using Spring Data JPA.
 Designed and implemented complex database schemas and queries using Oracle Database,
optimizing performance through SQL tuning.
 Integrated Hibernate for Object-Relational Mapping (ORM) to simplify data access and ensure
data integrity.
 Developed and maintained front-end components using JavaServer Pages (JSP) and
HTML/CSS, enhancing the user interface functionality and user experience.
 Implemented security measures using Spring Security to control access and protect sensitive
data, including authentication and authorization.
 Utilized Apache Maven for project build automation, managing dependencies, and creating
executable JAR files.
 Performed code reviews and collaborated with the Quality Assurance (QA) team to identify
and fix bugs, ensuring high-quality software.
 Created and maintained documentation for the codebase, APIs, and system architecture,
enhancing team collaboration and knowledge sharing.
 Utilized JUnit and Mockito for unit testing, ensuring the reliability and correctness of the
codebase.
 Investigated and resolved production issues, providing timely support and troubleshooting
using logs and monitoring tools.
 Optimized application performance using profiling tools like VisualVM and YourKit, identifying
bottlenecks and memory leaks.
 Deployed applications on Apache Tomcat, configuring the server for optimal performance
and scalability.
 Collaborated with system administrators to manage the operating system servers, ensuring
seamless application deployment and maintenance.
 Implemented Continuous Integration (CI) to automate the build, test, and deployment
process, increasing development efficiency.
 Managed application configurations, enabling dynamic configuration updates.
 Integrated and optimized third-party libraries and APIs, such as Apache POI for Excel file
manipulation and Log4j for logging.
 Worked with Java Message Service (JMS) for asynchronous messaging and ensured reliability
using Apache ActiveMQ.

Environment: Hibernate, Oracle Database, Java Server Pages, HTML/CSS, Spring Security, Apache
Maven, JUnit, Mockito and Apache Tomcat.

EDUCATION AND PROFESSIONAL DEVELOPMENT

 B.Sc. in Computer Science, JNTUH University, India (2013)
