M Tech Dissertations
Permanent URI for this collection: http://drsr.daiict.ac.in/handle/123456789/3
7 results
Search Results
Item Open Access Explanations by Counterfactual Argument in Recommendation Systems (Dhirubhai Ambani Institute of Information and Communication Technology, 2023) Pathak, Yash; Rana, Arpit
Recent advances in Artificial Intelligence (AI) and Machine Learning (ML) rely on complex models. Because of their complexity, these models behave as black boxes and raise the question of whether their decision process can be trusted, especially in high-cost decision scenarios. To overcome this problem, users of these systems can ask for an explanation of a decision, which the system can provide in various ways. One way of generating these explanations is with the help of Counterfactual (CF) arguments. Although there is a debate on how AI can generate these explanations, either by Correlation or Causal Inference, in Recommendation Systems (RecSys) the aim is to generate these explanations with a minimum number of Oracle calls and a near-optimal length (e.g., in terms of interactions) of the provided explanations. In this study we analyze the nature of CFs and different methods (e.g., Model Agnostic approach, Genetic Algorithms (GA)) to generate them, along with the quality measures. Extensive experiments show that CFs can be generated through multiple approaches and that selecting optimal CFs improves the explanations.

Item Open Access Anomalies Detection in Radon Time Series for Earthquake Prediction Using Machine Learning Techniques (Dhirubhai Ambani Institute of Information and Communication Technology, 2023) Gorasiya, Raghav; Chaudhury, Bhaskar
Emission of radioactive radon gas from soil and water is a significant precursor to earthquakes. Meteorological parameters such as temperature, pressure, humidity, rainfall, and wind speed influence radon gas emission from media such as soil and water. In this study, radioactive soil radon gas has been investigated for earthquake prediction. Before seismic events, radon gas emission is also affected by seismic energies.
These seismic energies are responsible for the changes inside the earth's crust that cause earthquakes. Our focus in this work is first to predict the radon gas concentration using Machine Learning algorithms and then to identify anomalies before and after seismic events using standard confidence interval methods. We experimented with different machine learning models for a detailed comparative study of radon concentration predictions. The dataset is divided into different settings of training and testing data. Testing data includes seismic samples only. The models are trained on non-seismic day samples and some of the seismic day samples, and tested on seismic day samples. After acceptable predictions, anomaly detection can be done on the test data. A simple test of the mean plus two standard deviations has been used to identify the original measured radon values that fall outside this prediction confidence interval. These values are then considered anomalies.

Item Open Access Performance and power prediction on disparate computer systems (2020) Amrutiya, Aditya
Performance and power prediction is an active area of research due to its applications in the advancement of hardware-software co-development. Several empirical machine-learning models, such as linear models, tree-based models, and neural networks, are used for evaluating the performance of machine learning models. Furthermore, a prediction model's accuracy may differ depending on the performance data collected for different software types (compute-bound, memory-bound) and different hardware (simulation-based or physical systems). Our results for performance prediction show that the tree-based machine-learning models, consisting of bagging and boosting models that help to improve weak learners, outperform all other models with a median absolute percentage error (MedAPE) of less than 5%.
We have also observed that in physical systems, the prediction accuracy of memory-bound applications is higher than that of compute-bound algorithms due to manufacturer variability in processors. Moreover, the prediction accuracy is higher on simulation-based hardware than on physical systems due to its deterministic nature. We have used transfer learning to solve two problems: cross-platform prediction and cross-systems prediction. Our results show a prediction error of 15% for cross-systems prediction and of 17% for cross-platform prediction from a simulation-based X86 to an ARM system, using the best-performing tree-based machine-learning model. For the prediction of power consumption along with that of performance, we have employed several univariate or multivariate machine learning models in our experiments. Our results show that runtime and power prediction accuracies of more than 80% and 90%, respectively, are achieved by the multivariate deep neural network model in cross-platform prediction. Similarly, for cross-system prediction, a runtime accuracy of 90% and a power accuracy of 75% are achieved by the multivariate deep neural network.

Item Open Access Machine learning in financial data EPS estimates (2020) Sharma, Rohan; Joshi, M.V.
The project "EPS Estimates" is, as the name suggests, a work on the Earnings Per Share figures released by companies annually and quarterly. The project is intended to come up with a better consensus methodology for the EPS estimates given by different brokers and to give clients a better idea of what the EPS figures will be. Various statistical methods and machine learning models are used for this purpose, and they are compared in this report.
The intuition behind the models, their shortcomings, and some insights into them are also included in this report.

Item Open Access ML-based clients prioritization and ranking algorithm (2020) Sharma, Rajat; Sasidhar, P S Kalyan
Kristal.AI is an AI-powered Digital Wealth Management Platform. It is one of the leading firms in the fintech industry and provides its customers with a platform for wealth investments. It has a very experienced committee for handling customer queries and an AI-driven advisory algorithm that recommends portfolios to customers according to their profiles. As the company has stepped into the AI-driven world, it wants to implement an AI-driven algorithm for client prioritization and ranking, so that its Relationship Management team can focus on the more promising users of the platform rather than on users who may not be worth the time, since some users sign up only out of curiosity and do not intend to enroll as authenticated clients of the company. To tackle this problem, an automated AI-based algorithm is needed that filters the more promising users from the data and ranks them according to their likelihood of becoming authenticated, registered, KYC-approved clients of the company. Together with the company's Data Science team, I tackled this problem by creating a Machine Learning based client prioritization and ranking algorithm that takes the company's raw data as input on a daily basis and generates a list of clients with the ranks in which they are to be followed up. For this, weeks of Exploratory Data Analysis were carried out to select the crucial features, and one regression model (Gaussian Process Regression) was created and optimized to give the desired output.
This model gave an accuracy of about 82% and a precision of about 84% on the test set.

Item Open Access Performance and power modeling on disparate computer systems using machine learning (2020) Kumar, Rajat; Mankodi, Amit
Performance and power prediction is an active area of research due to its applications in the advancement of hardware-software co-development. We have performed experiments to evaluate the performance of several machine learning models. Our results for performance prediction show that the tree-based machine-learning models outperform all other models with a median absolute percentage error (MedAPE) of less than 5%, followed by bagging and boosting models that help to improve weak learners. We have collected performance data from both simulation-based hardware and physical systems and observed that prediction accuracy is higher on simulation-based hardware than on physical systems due to its deterministic nature. Moreover, in physical systems, the prediction accuracy of memory-bound applications is higher than that of compute-bound algorithms due to manufacturer variability in processors. Furthermore, our results show a prediction error of 15% for cross-systems prediction, whereas the cross-platform prediction error is 17% for simulation-based X86 to ARM prediction and 23% for physical Intel Core to Intel Xeon prediction, using the best-performing tree-based machine-learning model. We have employed several univariate or multivariate machine learning models for our experiments. Our results show that runtime and power prediction accuracies of more than 80% and 90%, respectively, are achieved by the multivariate deep neural network model in cross-platform prediction.
Similarly, for cross-system prediction, a runtime accuracy of 90% and a power accuracy of 75% are achieved by the multivariate deep neural network.

Item Open Access VIU content access layer intelligent & flexible content selection (2020) Marakana, Meet; Banerjee, Asim
For OTT media streaming products like VIU, it is important to increase the consumption of media content as much as possible. To get the highest benefit, the user must stay on the platform and consume a large amount of content. Solving this problem is essential for surviving in markets with many competitors, such as the Indian market. The problem is to increase the engagement time between the customer and the platform, which can be addressed by improving content selection. To do so, the company should customize its homepage in favour of content that appeals to the user. The system must also behave dynamically, since every user has different preferences. With this approach, we can improve the engagement time of the users and hence solve our problem. CAL is the solution to our problem, and it handles all the issues that we had in the past. Users now get their preferred content from a combination of various content selectors, which select content based on user preference. Trending APIs, recommendation APIs, and BecauseYouHaveWatched APIs are the content selectors used for generating intelligent content selection for the user. We are trying to build a system that gives intelligent and flexible content selection. It aims for flexible consumption patterns. It supports a plug-and-play model for additional content selection algorithms, which means the system does not need to be updated when a new content selector service joins it in the future. Providing the plug-and-play feature requires a discovery service. I have developed the content selector registry, which is a discovery service API.
It manages the availability of the content selectors that reside inside the Kubernetes cluster. I have also written a Google Cloud Function that stores the data in BigQuery by initiating Dataflow. The BigQuery data will later be used to generate insights and KPI metrics.
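
The plug-and-play idea described in this abstract can be illustrated with a minimal in-memory sketch. It is not the dissertation's implementation: the class and function names (`ContentSelectorRegistry`, `SelectorEntry`, `build_homepage`), the heartbeat-based availability check, and the 30-second timeout are all assumptions made for illustration, and nothing here talks to Kubernetes or any VIU service. It only shows how a registry can let new content selectors join without changes to the code that combines their results.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class SelectorEntry:
    """One registered content selector (hypothetical structure)."""
    name: str
    select: Callable[[str], List[str]]  # maps a user id to content ids
    last_heartbeat: float = field(default_factory=time.monotonic)


class ContentSelectorRegistry:
    """Tracks which content selectors are registered and still alive."""

    def __init__(self, ttl_seconds: float = 30.0) -> None:
        # Assumed rule: a selector is unavailable after ttl_seconds of silence.
        self._ttl = ttl_seconds
        self._entries: Dict[str, SelectorEntry] = {}

    def register(self, name: str, select: Callable[[str], List[str]]) -> None:
        self._entries[name] = SelectorEntry(name, select)

    def heartbeat(self, name: str) -> None:
        self._entries[name].last_heartbeat = time.monotonic()

    def available(self, now: Optional[float] = None) -> List[SelectorEntry]:
        now = time.monotonic() if now is None else now
        return [e for e in self._entries.values()
                if now - e.last_heartbeat <= self._ttl]


def build_homepage(registry: ContentSelectorRegistry, user_id: str,
                   limit: int = 10) -> List[str]:
    """Combine content from every available selector, dropping duplicates."""
    seen = set()
    combined: List[str] = []
    for entry in registry.available():
        for content_id in entry.select(user_id):
            if content_id not in seen:
                seen.add(content_id)
                combined.append(content_id)
    return combined[:limit]


if __name__ == "__main__":
    registry = ContentSelectorRegistry()
    registry.register("trending", lambda user_id: ["a", "b"])
    registry.register("recommendation", lambda user_id: ["b", "c"])
    print(build_homepage(registry, "user-1"))  # ['a', 'b', 'c']
```

Registering a further selector later (for example a hypothetical "because-you-watched" one) changes the output of `build_homepage` without any change to that function, which is the property the abstract calls plug and play.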