A Comprehensive Analysis of NFHS-5 data for TB in India

Thakkar, Abhishek Mukeshbhai

Please use this identifier to cite or link to this item: http://drsr.daiict.ac.in//handle/123456789/1168

Title:	A Comprehensive Analysis of NFHS-5 data for TB in India
Authors:	Rana, Arpit; Bandyopadhyay, Tathagata; Thakkar, Abhishek Mukeshbhai
Keywords:	Tuberculosis disease; Human Development Index; Public awareness; TB prevention
Issue Date:	2023
Publisher:	Dhirubhai Ambani Institute of Information and Communication Technology
Citation:	Thakkar, Abhishek Mukeshbhai (2023). A Comprehensive Analysis of NFHS-5 data for TB in India. Dhirubhai Ambani Institute of Information and Communication Technology. viii, 61 p. (Acc. # T01109).
Abstract:	This study presents a comprehensive analysis of tuberculosis (TB) in India using data from the NFHS-5 (National Family Health Survey) program. The research begins by providing a thorough understanding of the DHS (Demographic and Health Survey) and NFHS programs, followed by an extensive literature review of TB-related studies and NFHS-related papers.
The findings from the literature review indicate that directing tuberculosis control initiatives toward the poorest 20% of the population may yield more successful outcomes compared to targeting the general population or the wealthiest 20%. Additionally, an examination of trends in TB incidence and mortality in India from 1990 to 2019, based on data from the Global Burden of Disease Study 2019, reveals significant insights into the country's TB burden.
One notable observation from the literature review is that a substantial proportion of TB patients, over 60%, have at least one comorbidity, with diabetes emerging as a prominent comorbidity. Furthermore, the study highlights a concerning lack of awareness regarding TB among Indian adults, with only 49.7% of participants reporting prior knowledge of the disease.
The research extensively utilizes complex and large-scale NFHS-5 data. A considerable amount of work is devoted to analyzing household-level data, which is categorized into three groups based on the Human Development Index, with each category representing five states. Python's CSV file processing capabilities are employed to handle and process a vast amount of data.
To identify the factors that most significantly affect TB, the study compares TB variables with 402 other variables. However, due to the limited number of TB cases within each category, the researchers calculate the number of TB and non-TB patients per 100,000 people for all variables. This approach provides a better understanding of the relationships between variables and TB incidence.
The study goes beyond analysis, developing a model to predict an individual's likelihood of contracting TB. Notably, the data are heavily imbalanced, as 99.7% of the cases are non-TB patients. To address this imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) is applied to generate synthetic data for the minority class. The researchers then focus on the most influential features associated with TB, resulting in a prediction model that achieves an accuracy of over 70% in identifying individuals at risk of TB.
In summary, this comprehensive analysis of NFHS-5 data sheds light on the tuberculosis landscape in India. The findings emphasize the importance of targeting the most marginalized populations, highlight the prevalence of comorbidities such as diabetes among TB patients, underscore the need for increased public awareness, and showcase the potential of data-driven prediction models in improving TB control and prevention efforts.
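The oversampling step named in the abstract can be illustrated with a minimal, standard-library sketch of the core SMOTE idea: each synthetic point is placed at a random position on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. This is not the dissertation's implementation. The function name smote, its parameters, and the toy feature vectors are hypothetical, and the sketch assumes purely numeric features and picks base samples at random rather than cycling through them.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples and
    their k nearest minority-class neighbours (illustrative SMOTE sketch)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    # For every minority sample, find the indices of its k nearest
    # minority neighbours by Euclidean distance, excluding itself.
    neighbours = []
    for i, x in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))   # random base sample
        j = rng.choice(neighbours[i])      # one of its nearest neighbours
        gap = rng.random()                 # position along the segment, in [0, 1)
        base, nb = minority[i], minority[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(base, nb)])
    return synthetic


# Hypothetical toy data: 4 minority (TB) rows against 12 majority rows.
tb_rows = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.2], [0.9, 1.9]]
majority_count = 12
new_rows = smote(tb_rows, n_synthetic=majority_count - len(tb_rows), k=2)
balanced_minority = tb_rows + new_rows
```

In practice, oversampling of this kind is usually applied to the training split only, so that synthetic rows do not leak into the evaluation data. Whether the dissertation did so is not stated in the abstract.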
URI:	http://drsr.daiict.ac.in//handle/123456789/1168
Appears in Collections:	M Tech Dissertations

Files in This Item:

File	Size	Format
202111023.pdf	1.02 MB	Adobe PDF
