Automated Diamond Price Prediction Using Machine Learning

Abirami R and Agniswar P


SRM University AP
ABSTRACT

The price of diamonds has been extremely volatile over the last century. Investing in diamonds has been fruitful for some; for others, it seems like a gamble. In this paper, we present a machine learning-based method to predict the price of diamonds and reduce human error. With an accuracy of 98% using Random Forest Regression, the chance of losing the investment is much lower. The proposed prediction model uses Linear Regression, Lasso Regression, Support Vector Regression, and Random Forest, and reports the most accurately predicted value. We have also added a feature to automate the process using Crontab: it retrains the model before the diamond market opens so that the prediction is as accurate as possible.

INTRODUCTION

Diamonds are one of the most valued gems in the world. They are also among the most expensive gems and have an extremely volatile price. The value of a diamond depends on its structure, cut, inclusions (impurities), carats, and many other features. Diamonds have many uses, for example in industry, where they are effective in cutting, polishing, and drilling. Since diamonds are extremely valuable, they have been traded across different countries for centuries, and this trade only increases with time. They are graded and certified based on the "four Cs": Color, Clarity, Cut, and Carat weight. These are the only metrics used to assess the quality of a diamond and set its price. This standard gives people across the world a uniform understanding when buying diamonds, which allows ease of trade and value for what is purchased. Diamond prices are usually set for the day and are traded in US dollars.

To better predict the price of diamonds, the Kaggle diamond dataset is used, and a scatterplot of metrics such as carat, price, and color is used to understand the nature of their relationships. The obvious assumption is that there is a strong relationship between carat and price, but it is observed that this trend no longer seems to hold, which leads to higher volatility in the price of diamonds. This machine learning model analyses more than four features and can thus produce a more accurate result. When the price is plotted on a chart, it forms various patterns such as pennants, wedges, flags, and double bottoms and tops. These formations are often used in the currency markets, as well as in many other trading markets, like the diamond market.

The software used here is Jupyter Notebook (Anaconda Navigator); NumPy, an array-processing package; Pandas, a data manipulation tool; Scikit-learn, a machine learning library; Matplotlib and Seaborn, data visualization tools; and Crontab, a scheduling tool.

METHODOLOGY

The tools used here are JupyterLab, Matplotlib, NumPy, Pandas, Seaborn, and Sklearn. JupyterLab is the interactive development platform we used to write code and visualize data. It is a flexible environment for arranging and configuring data for machine learning and data science projects. Its main advantage is the ability to write plugins and new components and integrate them with Crontab. Matplotlib is a Python library used to visualize and animate data in order to analyze it. NumPy is a Python library for working with arrays, which are faster than lists. Pandas is a powerful tool used for data analysis. It is also a flexible manipulation tool built on Python for data cleaning, merging, reshaping, selecting, and wrangling. Seaborn is a library built on Matplotlib and is used to make statistical graphs and high-level visualizations. It integrates closely with Pandas data structures and has built-in features that help visualize data clearly with minimal code. Sklearn is a machine learning library built using Python. It provides classification, regression, and clustering algorithms.

Crontab, short for cron table, is a job scheduler that runs a list of commands on a regular schedule. In this case, we use it to retrain the model every day for more accurate predictions. The @daily cron keyword is used; this creates a log file every day, and the logs are purged using the cleanup-logs shell script at 08:00 every day, before the diamond exchange opens.

RESULTS

The regression scatter plot in Fig. 2 shows the relationship between color and price with respect to carat and x. Color is grouped into seven categories, where 1 is the best and 7 is the worst color. Price and carat are linearly related: as the number of carats increases, the price also increases, and vice versa.

The four algorithms used here are Linear Regression, Lasso Regression, Support Vector Machine, and Random Forest. The results are tabulated below with their respective accuracy scores. From the table, we can see that Random Forest has the most accurate results, with 98% accuracy, since it makes use of multiple trees. Lasso Regression comes next, with a result almost equal to that of Linear Regression. Support Vector Regression (SVR) has the worst result, with 68% accuracy. The R² score is 90%, which indicates a good fit and accurate predictions. The mean squared error is 0.151, which we take as a very small margin of error, since 0 means no error and values near 1 indicate a high margin of error. The mean absolute error is around $805, which is small given that diamond values are in the hundreds and thousands of dollars. The explained variance score is 0.905, which is close to 1, the best possible score.

CONCLUSION

Predicting the price of a diamond is extremely important when investing. It reduces dependency on people and hearsay. In this paper, a detailed study has been made of how to build a better prediction model to determine diamond price, based on factors other than the well-known 4Cs.

ACKNOWLEDGEMENTS

The authors would like to thank SRM University AP for this unique opportunity to conduct research on this topic and present a fruitful solution.
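
The results are reported as four regression metrics: R² score, mean squared error, mean absolute error, and explained variance score. As a minimal sketch of how these are commonly defined, the block below computes them in plain Python. The function name `regression_metrics` and the sample prices are hypothetical and chosen only for illustration; they are not taken from the poster or from the Kaggle dataset, and the authors presumably used the equivalent Sklearn functions.

```python
from statistics import fmean, pvariance


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, R^2 and explained variance for paired values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    # Residual = actual price minus predicted price.
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    mse = fmean(r * r for r in residuals)   # mean squared error
    mae = fmean(abs(r) for r in residuals)  # mean absolute error

    var_true = pvariance(y_true)
    if var_true == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")

    # R^2 = 1 - SS_res / SS_tot, which equals 1 - MSE / Var(y_true).
    r2 = 1 - mse / var_true
    # Explained variance ignores a constant bias in the predictions.
    explained_variance = 1 - pvariance(residuals) / var_true

    return {
        "mse": mse,
        "mae": mae,
        "r2": r2,
        "explained_variance": explained_variance,
    }


# Hypothetical diamond prices in US dollars (illustrative only).
actual = [326.0, 2757.0, 5000.0, 12000.0]
predicted = [400.0, 2600.0, 5300.0, 11500.0]

for name, value in regression_metrics(actual, predicted).items():
    print(f"{name}: {value:.3f}")
```

Note that under these definitions the mean squared error is expressed in squared units of the target and has no fixed upper bound, while R² and explained variance are at most 1. A mean squared error well below 1 alongside a mean absolute error in dollars would therefore suggest the two were computed on differently scaled prices, though the poster does not say so.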
