Data that exceeds the storage space and processing power of a single server is called Big Data. It cannot be managed by a traditional RDBMS or conventional statistical tools. Big Data increases the demands on storage capacity as well as processing power. Horizontal scaling, or sharding, is needed to divide the data set and distribute it over multiple servers; horizontal scaling also provides redundancy and fault tolerance. Optimizing horizontal scaling is therefore an important aspect of Big Data technology. Instead of vertical scaling, that is, upgrading to more powerful computers when the current system becomes inadequate, we add more nodes (computers) to a cluster. This increases parallelism rather than the performance of any single node. This paper presents the fundamentals of big data analytics and then analyzes various optimization techniques used in the big data environment.
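The sharding idea described above can be sketched in a few lines: records are assigned to nodes by hashing their key, so storage and processing load are split across the cluster. The node names and record fields here are illustrative assumptions, not from the paper.

```python
# Minimal sketch of hash-based horizontal scaling (sharding).
# Node names and the record layout are invented for illustration.
import hashlib

NODES = ["node-0", "node-1", "node-2", "node-3"]

def shard_for(key: str, nodes=NODES) -> str:
    """Deterministically map a record key to one node."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

# Distribute a batch of records over the cluster.
records = {f"user-{i}": {"id": i} for i in range(1000)}
placement = {}
for key in records:
    placement.setdefault(shard_for(key), []).append(key)

for node in NODES:
    print(node, len(placement.get(node, [])))
```

Because the mapping is a pure function of the key, any client can locate a record without consulting a central index, which is what makes adding nodes (rather than upgrading one node) the natural way to scale.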
2018 3rd International Conference for Convergence in Technology (I2CT), 2018
Proceedings of International Conference on Computational Intelligence and Data Engineering, 2019
Data mining is the practice of extracting valuable information from huge data sets. It allows users to gain insight into the data and make sound decisions based on the information extracted from databases. The purpose of engineering colleges is to offer better opportunities to their students. Educational data mining (EDM) is a process for analyzing a student's performance against numerous attributes to predict and evaluate whether the student will be placed during campus placement. Predicting the performance of higher-education students can help institutions improve the quality of education, identify at-risk pupils, raise overall achievement, and thereby refine education resource management toward better placement opportunities for students. This research proposes a placement prediction model that predicts the chance of an undergraduate student getting a job in the placement drive. This self-analysis will assist in identifying…
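To make the prediction idea concrete, here is a toy sketch that classifies placement likelihood with a 1-nearest-neighbour rule over two invented features (CGPA and internship count). The features, data, and model choice are illustrative assumptions; the paper's actual model and dataset are not specified here.

```python
# Toy placement-prediction sketch: 1-nearest-neighbour over a
# tiny hand-made training set. All data is invented.
import math

# (cgpa, internships) -> placed (1) / not placed (0)
train = [
    ((9.1, 2), 1), ((8.4, 1), 1), ((7.9, 2), 1),
    ((6.2, 0), 0), ((5.8, 1), 0), ((6.9, 0), 0),
]

def predict_placement(cgpa: float, internships: int) -> int:
    """Return the label of the closest training student."""
    def dist(features):
        return math.hypot(features[0] - cgpa, features[1] - internships)
    return min(train, key=lambda pair: dist(pair[0]))[1]

print(predict_placement(8.8, 2))  # resembles the placed students
print(predict_placement(6.0, 0))  # resembles the non-placed students
```

A real model would use many more attributes (grades, backlogs, aptitude scores) and a trained classifier, but the shape of the task, features in and a placed/not-placed label out, is the same.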
Recent years have seen increasing attention and interest in various countries toward the problem of student dropout in schools and toward finding its chief contributing factors. In our model, we attempt to demonstrate how a specific factor can affect a student's academic life and subsequently produce dropout among school students. In this paper, we propose a methodology and a specific clustering algorithm to identify the factors that result in dropout among students at different educational levels (primary, secondary, and higher secondary), along with each factor's percentage of impact. This research will guide teachers and school administrations in improving the dropout scenario at their schools. The problem can be addressed with the use of educational data mining (EDM).
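The clustering approach can be sketched as follows: group students by a single numeric indicator with a plain 1-D k-means, then read each cluster's share of the cohort as that factor's percentage of impact. The indicator (attendance percentage), the data, and the k=2 split are illustrative assumptions, not the paper's actual algorithm or dataset.

```python
# 1-D k-means sketch for grouping students by one indicator.
# The attendance values below are invented for illustration.
def kmeans_1d(values, k=2, iters=20):
    centers = sorted(values)[:: max(1, len(values) // k)][:k]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

attendance = [95, 92, 88, 40, 35, 90, 30, 85, 45, 91]
centers, clusters = kmeans_1d(attendance)
at_risk = min(clusters, key=lambda c: sum(c) / len(c))
print("at-risk share: %.0f%%" % (100 * len(at_risk) / len(attendance)))
```

In practice one would cluster over several factors at once and compare cluster profiles, but even this 1-D version shows how a cluster's size translates into a "percentage of impact" figure.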
Big Data, as we all know, is becoming a new technological trend in industry, science, and business. Indefinite data scalability allows organizations to process huge amounts of data in parallel, dramatically decreasing the time it takes to handle large workloads, optimizing hardware resource usage, and permitting an extreme quantity of data per node to be handled. Optimization is done to attain the best strategy relative to a set of selected constraints, which includes maximizing factors such as efficiency, productivity, reliability, strength, and utilization. When the current system becomes insufficient, instead of upgrading it by adding more components to the existing structure, you simply add more computers to a cluster. This research discusses the hierarchical architecture of Hadoop nodes, namely Name Nodes and Data Nodes, and mainly focuses on optimizing the Data Node by distributing some of its workload to the Name Node.
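The Name Node / Data Node hierarchy the abstract refers to can be sketched conceptually: the Name Node keeps only metadata mapping blocks to Data Nodes, while Data Nodes hold the actual block bytes. This is not Hadoop's real API; class and method names here are illustrative assumptions, and real HDFS also replicates each block.

```python
# Conceptual sketch of the HDFS-style split between metadata
# (Name Node) and block storage (Data Nodes). Not Hadoop's API.
class DataNode:
    def __init__(self, name):
        self.name = name
        self.blocks = {}          # block_id -> bytes

    def store(self, block_id, data):
        self.blocks[block_id] = data

class NameNode:
    def __init__(self, datanodes):
        self.datanodes = datanodes
        self.block_map = {}       # block_id -> DataNode name

    def write_block(self, block_id, data):
        # Round-robin placement; real HDFS also replicates blocks.
        target = self.datanodes[len(self.block_map) % len(self.datanodes)]
        target.store(block_id, data)
        self.block_map[block_id] = target.name

    def locate(self, block_id):
        return self.block_map[block_id]

nodes = [DataNode(f"dn{i}") for i in range(3)]
nn = NameNode(nodes)
for i in range(6):
    nn.write_block(f"blk_{i}", b"...")
print(nn.locate("blk_4"))  # → dn1
```

The sketch makes the paper's optimization target visible: every read and write touches the Name Node's metadata, so shifting some Data Node bookkeeping onto it trades metadata-server load against per-node storage work.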
Papers by Chandrima Roy