SDET
Professional Summary:
Data Analyst with over 6 years of experience in Data Extraction, Data Screening, Data Cleaning, Data
Exploration, Data Visualization and Statistical Modelling of varied structured and unstructured datasets,
as well as in implementing large-scale Machine Learning and Deep Learning algorithms to deliver
insights and inferences that significantly impact business revenue and user experience.
Experienced in facilitating the entire lifecycle of a data science project: Data Extraction, Data Pre-
Processing, Feature Engineering, Algorithm Implementation & Selection, Back Testing and Validation.
Skilled in using Python libraries NumPy, Pandas for performing Exploratory Data Analysis.
Proficient in Data transformations using log, square-root, reciprocal, cube-root, square and Box-Cox
transformations, depending upon the dataset.
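The transformations above can be sketched in a few lines; this is a minimal illustration on hypothetical skewed data, not code from the projects described:

```python
import numpy as np
from scipy import stats

# Hypothetical positive-valued, right-skewed sample
data = np.random.default_rng(0).lognormal(mean=0.0, sigma=1.0, size=500)

log_t = np.log(data)       # log transform
sqrt_t = np.sqrt(data)     # square-root transform
recip_t = 1.0 / data       # reciprocal transform
cbrt_t = np.cbrt(data)     # cube-root transform
sq_t = np.square(data)     # square transform
boxcox_t, lam = stats.boxcox(data)  # Box-Cox fits lambda by maximum likelihood

# Box-Cox should substantially reduce the skew of lognormal data
print(stats.skew(data), stats.skew(boxcox_t))
```

Box-Cox requires strictly positive inputs; the others shown have similar domain caveats (e.g. reciprocal is undefined at zero).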
Adept at handling Missing Data by exploring causes such as MCAR, MAR and MNAR (Missing Completely
At Random, Missing At Random, Missing Not At Random), analyzing correlations and similarities,
introducing dummy variables and applying various Imputation methods.
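A minimal sketch of the dummy-variable-plus-imputation pattern mentioned above, using a hypothetical pandas frame (column names are illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with missing values
df = pd.DataFrame({"age": [25, np.nan, 40, 31, np.nan],
                   "income": [50_000, 62_000, np.nan, 58_000, 49_000]})

# Indicator (dummy) variables record which rows were imputed,
# preserving the missingness signal for downstream models
df["age_missing"] = df["age"].isna().astype(int)
df["income_missing"] = df["income"].isna().astype(int)

# Simple imputation: median for age, mean for income
df["age"] = df["age"].fillna(df["age"].median())
df["income"] = df["income"].fillna(df["income"].mean())

print(df)
```

Median vs. mean here is a per-column judgment call; the indicator columns matter most when the data are MAR or MNAR rather than MCAR.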
Experienced in Machine Learning techniques such as Regression and Classification models like Linear
Regression, Logistic Regression, Decision Trees and Support Vector Machines using scikit-learn in Python.
In-depth Knowledge of Dimensionality Reduction (PCA, LDA), Hyper-parameter tuning, Model
Regularization (Ridge, Lasso, Elastic net) and Grid Search techniques to optimize model performance.
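Combining regularization with grid search, as described above, can be sketched like this (synthetic data and grid values are illustrative, not from the resume's projects):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic regression problem standing in for a real dataset
X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)

# Grid search over the Ridge regularization strength with 5-fold CV
grid = GridSearchCV(Ridge(),
                    param_grid={"alpha": [0.01, 0.1, 1.0, 10.0]},
                    cv=5)
grid.fit(X, y)
print(grid.best_params_)
```

Swapping `Ridge()` for `Lasso()` or `ElasticNet()` (with an added `l1_ratio` grid) follows the same pattern.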
Skilled at Python, SQL, R and Object-Oriented Programming (OOP) concepts such as Inheritance,
Polymorphism, Abstraction, Encapsulation.
Working knowledge of Database Creation and maintenance of Physical data models with Oracle, DB2
and SQL Server databases, as well as normalizing databases up to third normal form using SQL functions.
Experience in Web Data Mining with Python’s Scrapy and Beautiful Soup packages, along with working
knowledge of Natural Language Processing (NLP) to analyze text patterns.
Proficient in Natural Language Processing (NLP) concepts like Tokenization, Stemming, Lemmatization,
Stop Words, Phrase Matching and libraries like SpaCy and NLTK.
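A library-free sketch of the tokenization, stop-word removal and stemming steps listed above (the stemmer is a toy suffix-stripper; real work would use NLTK's PorterStemmer or spaCy's lemmatizer):

```python
import re

# Illustrative stop-word list, not NLTK's full set
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to"}

def tokenize(text: str) -> list:
    # Lowercase and split on runs of non-alphabetic characters
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]

def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]

def stem(token: str) -> str:
    # Toy suffix-stripping stemmer for illustration only
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

tokens = remove_stop_words(tokenize("The cats are running to the matched phrases"))
stems = [stem(t) for t in tokens]
print(stems)
```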
Experienced in developing Supervised Deep Learning algorithms, including Artificial Neural
Networks, Convolutional Neural Networks, Recurrent Neural Networks, LSTM and GRU, and Unsupervised
Deep Learning techniques like Self-Organizing Maps (SOMs), in Keras and TensorFlow.
Skilled at Data Visualization with Tableau, Power BI, Seaborn, Matplotlib, ggplot2, Bokeh and
interactive graphs using Plotly & Cufflinks.
Experience in the analysis, modeling, design and development of Tableau reports and dashboards
for analytics and reporting applications.
Knowledge of Cloud services like Amazon Web Services (AWS) and Microsoft Azure ML for building,
training and deploying scalable models.
Highly proficient in using T-SQL to develop complex Stored Procedures, Triggers, Tables, Views,
User Functions and User Profiles, as well as Relational Database Models, data integrity, SQL joins
and query writing.
Proficient in using PostgreSQL, Microsoft SQL Server and MySQL to extract data with multiple types of
SQL queries, including CREATE, SELECT, JOIN, DROP, CASE and conditional expressions.
Education:
Texas A&M University, Kingsville GPA – 4.0 Master's in Mechanical Engineering (2016)
JNT University, Hyderabad GPA – 4.0 Bachelor's in Mechanical Engineering (2013)
Technical Skills:
Professional Experience:
Responsibilities:
Utilized Python's data visualization libraries like Matplotlib and Seaborn to communicate findings to the
data science, marketing and engineering teams.
Conducted Data blending, Data Preparation using Python for Tableau consumption and published data
sources to Tableau server.
Performed univariate, bivariate and multivariate analysis on BMI, age and employment to check
how the features related to one another and to the risk factor.
Trained several machine learning models like Logistic Regression, Random Forest and Support vector
machines (SVM) on selected features to predict Customer churn.
Worked on statistical methods like data-driven Hypothesis Testing and A/B Testing to draw inferences,
determine significance levels, derive p-values, and evaluate the impact of various risk factors.
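The hypothesis-testing workflow above reduces to a few lines with SciPy; this sketch uses fabricated control/variant samples, not the project's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical A/B test: same metric measured for control vs. variant
control = rng.normal(loc=10.0, scale=2.0, size=500)
variant = rng.normal(loc=10.5, scale=2.0, size=500)

# Two-sample t-test: is the difference in means statistically significant?
t_stat, p_value = stats.ttest_ind(control, variant)
alpha = 0.05  # chosen significance level
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, reject H0: {p_value < alpha}")
```

In practice one would also check the t-test's assumptions (or use `stats.mannwhitneyu` as a nonparametric alternative) and decide `alpha` before looking at the data.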
Worked on Data Cleaning and ensured Data Quality, consistency and integrity using Pandas and
NumPy.
Created database designs through data mapping using ER diagrams and normalization up to the 3rd
normal form, and extracted relevant data whenever required using joins in PostgreSQL and Microsoft
SQL Server.
Created multiple custom SQL queries in MySQL Workbench to prepare datasets for Tableau
dashboards and retrieved data from multiple tables using join conditions to efficiently extract data for
Tableau workbooks.
Implemented and tested the model on AWS EC2 and collaborated with development team to get the
best algorithms and parameters.
Designed data-visualization dashboards with Tableau and generated complex reports, including
summaries and graphs, to interpret the findings for the team.
Environment: Python (NumPy, Pandas, Matplotlib, scikit-learn), AWS, Jupyter Notebook, Tableau, SQL
Citibank, Irving, TX Feb ’17 – Mar ‘18
Senior Data Analyst
Responsibilities:
Involved in Data Profiling to learn about user behavior and merge data from multiple data sources.
Designed and developed various machine learning frameworks using Python and R.
Tackled a highly imbalanced fraud dataset using under-sampling, over-sampling with SMOTE and
cost-sensitive algorithms in Python.
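The project presumably used imbalanced-learn's `SMOTE`; the interpolation idea behind it can be sketched with NumPy alone (all data and parameters here are illustrative):

```python
import numpy as np

def smote_like_oversample(X_minority, n_new, k=5, seed=0):
    """Generate synthetic minority-class samples by interpolating each
    sampled point toward one of its k nearest neighbours (SMOTE-style sketch)."""
    rng = np.random.default_rng(seed)
    n = len(X_minority)
    synthetic = np.empty((n_new, X_minority.shape[1]))
    for i in range(n_new):
        point = X_minority[rng.integers(n)]
        # k nearest neighbours by Euclidean distance, excluding the point itself
        dists = np.linalg.norm(X_minority - point, axis=1)
        neighbours = np.argsort(dists)[1 : k + 1]
        neighbour = X_minority[rng.choice(neighbours)]
        # New sample lies on the segment between the point and its neighbour
        synthetic[i] = point + rng.random() * (neighbour - point)
    return synthetic

minority = np.random.default_rng(1).normal(size=(20, 3))
new_samples = smote_like_oversample(minority, n_new=40)
print(new_samples.shape)
```

Real SMOTE also handles categorical features and borderline cases; under-sampling the majority class is the complementary half of the strategy.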
Collaborated with data engineers to implement the ETL process; wrote and optimized SQL queries to
perform data extraction from the cloud and merging from Oracle 12c.
Collected unstructured data from PostgreSQL and MongoDB and completed data aggregation.
Conducted analysis assessing customer consumption behavior and discovered the value of customers
with RFM (Recency, Frequency, Monetary) analysis; applied customer segmentation with clustering
algorithms such as K-Means Clustering and Hierarchical Clustering.
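The RFM-plus-K-Means segmentation above can be sketched as follows, on a fabricated RFM table (column names and values are illustrative only):

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical RFM table: Recency (days since last order),
# Frequency (order count), Monetary (total spend)
rfm = pd.DataFrame({"recency": [5, 40, 2, 90, 7, 60],
                    "frequency": [12, 2, 15, 1, 10, 3],
                    "monetary": [900, 120, 1100, 60, 800, 150]})

# Scale features so no single RFM dimension dominates the distance metric
scaled = StandardScaler().fit_transform(rfm)
rfm["segment"] = KMeans(n_clusters=2, n_init=10,
                        random_state=0).fit_predict(scaled)
print(rfm)
```

In practice the number of clusters would be chosen with the elbow method or silhouette scores rather than fixed at 2.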
Used Pandas, NumPy, Seaborn, SciPy, Matplotlib, scikit-learn and NLTK (Natural Language Toolkit)
in Python for developing various machine learning algorithms.
Utilized machine learning algorithms such as Decision Trees, Linear Regression, Multivariate
Regression, Naive Bayes, Random Forests, K-Means and KNN.
Performed data integrity checks, data cleaning, exploratory analysis and feature engineering using R 3.4.
Worked on different data formats such as JSON and XML, and performed machine learning algorithms in R.
Performed data visualizations with Tableau and generated dashboards to present the findings.
Worked on Text Analytics, Naïve Bayes, Sentiment Analysis, creating word clouds, and retrieving data
from Twitter and other social networking platforms.
Used Git 2.6 to apply version control. Tracked changes in files and coordinated work on the files among
multiple team members.
Environment: Python, Tableau, R, MySQL, MS SQL Server, AWS (S3, EC2), RNN, ANN
Responsibilities:
Responsible for analyzing large data sets to develop multiple custom models and algorithms to drive
innovative business solutions.
Performed preliminary data analysis, handled anomalies such as missing values, duplicates and
outliers, and imputed or removed irrelevant data.
Removed outliers using proximity/distance-based and density-based techniques.
Involved in Analysis, Design and Implementation/translation of Business User requirements.
Experienced in using supervised, unsupervised and regression techniques in building models.
Performed Market Basket Analysis to identify groups of assets moving together and advised the
client on the associated risks.
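The core of Market Basket Analysis is computing support and confidence for item pairs; a minimal stdlib sketch on fabricated transactions (the original work likely used a dedicated apriori implementation):

```python
from itertools import combinations
from collections import Counter

# Hypothetical transactions, each a set of co-held assets/items
transactions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"},
                {"B", "C"}, {"A", "B", "C"}, {"B"}]

pair_counts = Counter()
item_counts = Counter()
for t in transactions:
    for item in t:
        item_counts[item] += 1
    for pair in combinations(sorted(t), 2):
        pair_counts[pair] += 1

n = len(transactions)
for (x, y), c in pair_counts.items():
    support = c / n                   # P(x and y together)
    confidence = c / item_counts[x]   # P(y | x)
    print(f"{x} -> {y}: support={support:.2f}, confidence={confidence:.2f}")
```

Rules with high support and confidence identify items (here, assets) that tend to move together.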
Implemented techniques like forward selection, backward elimination and the stepwise approach for
selecting the most significant independent variables.
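Forward selection and backward elimination, as described above, map directly onto scikit-learn's `SequentialFeatureSelector`; this sketch uses a synthetic dataset in place of the project's data:

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Synthetic problem: 8 features, only 3 of which are informative
X, y = make_regression(n_samples=120, n_features=8, n_informative=3,
                       noise=5.0, random_state=0)

# Forward selection: greedily add the feature that most improves CV score
forward = SequentialFeatureSelector(LinearRegression(), n_features_to_select=3,
                                    direction="forward", cv=5).fit(X, y)
# Backward elimination: start with all features and greedily drop the weakest
backward = SequentialFeatureSelector(LinearRegression(), n_features_to_select=3,
                                     direction="backward", cv=5).fit(X, y)
print(forward.get_support(), backward.get_support())
```

The classical stepwise approach alternates the two directions using p-value thresholds; `statsmodels` is the usual tool when p-values are required.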
Performed dimensionality reduction via Feature Selection and Feature Extraction methods to identify
the most significant variables.
Performed Exploratory Data Analysis using R. Also involved in generating various graphs and charts for
analyzing the data using Python Libraries.
Involved in the execution of multiple business plans and projects; ensured business needs were being
met; interpreted data to identify trends applicable across future data sets.
Developed interactive dashboards, created various Ad Hoc reports for users in Tableau by connecting
various data sources.
Environment: Python, SQL Server, Sqoop, Mahout, MLlib, MongoDB, Tableau, ETL.
Cloudpolitian Technologies India Pvt Limited Hyderabad, India Aug ’13 – Jun ‘15
Data Analyst
Responsibilities:
Collaborated with data engineers to implement the ETL process; wrote and optimized SQL queries to
perform data extraction and merging from Oracle.
Performed data integrity checks, data cleaning, exploratory analysis and feature engineering using R
and Python.
Developed personalized product recommendations with Machine Learning algorithms, including
Gradient Boosting Trees and Collaborative Filtering, to better meet the needs of existing customers
and acquire new customers.
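The collaborative-filtering idea behind such recommendations can be sketched with an item-item cosine-similarity model; the rating matrix below is fabricated, and treating 0 as "unrated" is a deliberate simplification:

```python
import numpy as np

# Hypothetical user-item rating matrix (rows: users, cols: items; 0 = unrated)
ratings = np.array([[5, 4, 0, 1],
                    [4, 5, 1, 0],
                    [1, 0, 5, 4],
                    [0, 1, 4, 5]], dtype=float)

# Item-item cosine similarity from the rating columns
norms = np.linalg.norm(ratings, axis=0)
sim = (ratings.T @ ratings) / np.outer(norms, norms)

def predict(user, item):
    """Score an unrated item as the similarity-weighted average of the
    user's existing ratings (item-based collaborative filtering sketch)."""
    rated = ratings[user] > 0
    weights = sim[item, rated]
    return float(weights @ ratings[user, rated] / weights.sum())

print(predict(0, 2))  # predicted rating of item 2 for user 0
```

Users 0 and 1 favor items 0-1 while users 2-3 favor items 2-3, so the prediction for user 0 on item 2 lands low, as expected.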
Used Python to implement different machine learning algorithms, including Generalized Linear Model,
Random Forest, SVM, Boosting and Neural Networks.
Worked on data cleaning, data preparation and feature engineering with Python, including NumPy,
SciPy, Matplotlib, Seaborn, Pandas and Scikit-learn.
Recommended and evaluated marketing approaches based on quality analytics of customer
consumption behavior.
Performed data visualization and designed dashboards with Tableau, and provided complex reports,
including charts, summaries and graphs, to interpret the findings for the team and stakeholders.
Identified process improvements that significantly reduced workloads and improved quality.
Environment: R Studio, Python, Tableau, SQL Server and Oracle.