M Tech Dissertations
Permanent URI for this collection: http://drsr.daiict.ac.in/handle/123456789/3
20 results
Search Results
Item Open Access
Learn Graph Laplacian for Sparse Frequency Domain Representation
(Dhirubhai Ambani Institute of Information and Communication Technology, 2017) Batavia, Darshan; Tatu, Aditya
Learning the graph topology plays an important role in processing structured and unstructured data that can be represented as graph signals. The graph topology for a given graph signal is not always readily available from the data and is also not unique. It is desirable to learn a graph topology for the graph signals such that the data admits the desired property or application. In this thesis we address the problem of estimating the graph Laplacian matrix (graph topology) for signals under the prior assumption that the signals have a sparse representation in the frequency domain. This is done by first finding an optimal graph signal basis (the eigenvectors of the graph Laplacian matrix) and then, given the eigenvectors, estimating a sparse representation of the graph signals and the corresponding graph Laplacian matrix. We then discuss the results of learning the graph for synthetic data. As an application, we propose a modification to an existing algorithm for image denoising and demonstrate the results for all three classes of images: natural, piece-wise smooth (depth map) and texture images. The performance is compared with that of other image denoising methods for various images and quality measures.

Item Open Access
Vocal tract length normalization for automatic speech recognition
(Dhirubhai Ambani Institute of Information and Communication Technology, 2014) Sharma, Shubham; Patil, Hemant A.
Various factors affect the performance of Automatic Speech Recognition (ASR) systems. In this thesis, speaker differences due to variations in vocal tract length (VTL) are taken into account. Vocal Tract Length Normalization (VTLN) has become an integral part of ASR systems these days. Different methods have been studied to compensate for these differences in the spectral domain.
In this thesis, various state-of-the-art methods have been implemented and discussed in detail. For example, the method of Lee and Rose uses a maximum likelihood-based approach: it performs a grid search over a range of warping factors to obtain the optimal warping factor for each speaker. The method of Umesh et al., on the other hand, uses the scale transform to obtain VTL-normalized features. Frequency warping is the basis of such normalizing techniques. Mel scale warping is the most widely accepted for compensating speaker differences, as it is inspired by the hearing process of the human ear. The use of Bark scale-based warping is proposed in this thesis. The Bark scale is based on the perception of loudness by the human ear, in contrast with the Mel scale, which is based on pitch perception. Bark scale-based warping provides improved recognition accuracy under mismatched conditions (i.e., training on male (or female) speakers and testing on female (or male) speakers). The performance of the different methods has been tested on different ASR tasks in English, Gujarati and Marathi. The TIMIT database is used for English, and details of the database collection for Gujarati and Marathi are discussed. VTLN has shown improved performance over state-of-the-art MFCC features alone for almost all applications considered in this thesis. One of the major tasks of this thesis is the development of Phonetic Engines (PEs) using VTLN in three different modes of speech, viz., read, spontaneous and lecture mode, in Gujarati and Marathi. The Lee-Rose method is used for the design of the PEs. Improved accuracy is achieved using the VTLN-based method as compared with MFCCs. In addition, a template matching experiment is performed using the various VTL-normalized features under study and MFCCs for the application of spoken keyword spotting.
Better precision and lower equal error rates (EER) are obtained using VTL-normalized Scale Transform Cepstral Coefficients (STCC). This suggests that VTLN-based features can be useful for bigger applications such as audio search and spoken term detection (STD).

Item Open Access
Phonetic segmentation: unsupervised approach
(Dhirubhai Ambani Institute of Information and Communication Technology, 2013) Vachhani, Bhavikkumar Bhagvanbhai; Patil, Hemant A.
Phonetic segmentation has potential applications in Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) synthesis systems. In this thesis, we propose the use of different spectral features, viz., Mel Frequency Cepstral Coefficients (MFCC), Cochlear Filter Cepstral Coefficients (CFCC) and Perceptual Linear Prediction Cepstral Coefficients (PLPCC), to derive the spectral transition measure (STM) for automatic phonetic boundary detection. We propose a new unsupervised algorithm that combines evidence from state-of-the-art MFCC and the proposed CFCC to improve the accuracy of automatic phonetic boundary detection. Using the proposed fusion-based approach, we achieve 90% accuracy (i.e., 8% better than MFCC-based STM alone, for a 20 ms tolerance interval) for automatic boundary detection on the entire TIMIT database. Using the proposed PLPCC-based STM approach, we achieve 85% accuracy (i.e., 3% better than state-of-the-art MFCC-based STM, for a 20 ms tolerance interval) and a 15% over-segmentation rate (i.e., 8% less than MFCC-based STM) for automatic boundary detection of the 2,34,925 phone boundaries corresponding to the 630 speakers of the entire TIMIT database.
The second part of the thesis focuses on the development of various applications using the automatically segmented and labeled boundaries.

Item Open Access
FIR filter for high speed 60 GHz wireless communications.
(Dhirubhai Ambani Institute of Information and Communication Technology, 2013) Dholariya, Tarun; Sen, Subhajit
This thesis report presents the design and implementation of an FIR filter operating at 5 GS/s. The IEEE 802.15.3c standard covers the unlicensed band from 57 to 66 GHz. A 60 GHz wireless communication transmitter uses an interpolation filter, part of which works at 5 GS/s. This report describes the implementation of a 6th-order 5 GS/s FIR filter. The speed requirement cannot be achieved in 180 nm technology, so the FIR filter is implemented and tested in 45 nm technology. The FIR filter uses a bit-pipelined structure. The key parts are the CPL adder and the TSPC flip-flop. This adder and flip-flop enabled very high frequency operation at 7.35 GHz, which led to the implementation of a 5 GS/s FIR filter using power-of-2 coefficients.

Item Open Access
Design & layout of a low voltage folding & interpolation ADC for high speed applications
(Dhirubhai Ambani Institute of Information and Communication Technology, 2012) Tiwari, Sandeep Kumar; Sen, Subhajit
Analog to Digital Converters (ADC) and Digital to Analog Converters (DAC) play a vital role in mixed-signal systems, communication and digital signal processing. Nowadays, the demand for high-speed, low-power and low-voltage ADCs is increasing tremendously in high-speed data processing applications. In folding and interpolation ADCs, the folding amplifiers suffer from a serious bandwidth limitation because of the large parasitic capacitance and resistance at the output node. In this thesis, a low-voltage, high-speed folding and interpolation ADC is implemented using a current steering CMOS folding amplifier followed by a transresistance amplifier (TRA) in UMC 180 nm CMOS technology.
The current steering folding amplifier significantly reduces power as well as the number of tail current sources compared with the conventional folding amplifier. The transresistance amplifier, which is connected at the output of the folding amplifier, avoids the analog bandwidth limitation problem. The MSB and LSB bits are generated simultaneously at the output; therefore, a sample-and-hold circuit is not required in this architecture. The proposed circuit operates from a 1.8 V power supply at 85 MSamples/s and consumes 70 mW. Simulation and layout of the folding and interpolation ADC were done using UMC CMOS 180 nm technology in the Cadence Analog Design Environment.

Item Open Access
FPGA implementation of multiband and multimode modem for software defined radio (SDR)
(Dhirubhai Ambani Institute of Information and Communication Technology, 2012) Timbadiya, Jaykant; Dubey, Rahul
Nowadays, Software Defined Radio (SDR) is becoming popular for wireless communication because of its flexibility to change as required through software. The work presented here describes different methods of designing a multiband and multimode modem, its implementation on a programmable device such as an FPGA, and its verification against different functionality and specifications. The design presented here is able to switch between different modulation schemes and different data rates. The multiband and multimode modem includes BPSK and QPSK modulators and demodulators with Forward Error Correction and other baseband processing.

Item Open Access
Application of compressive sensing to two-way relay channel estimation
(Dhirubhai Ambani Institute of Information and Communication Technology, 2012) Nair, Rahit R.; Chakka, Vijaykumar
An Amplify and Forward Two-Way Relay Network is one in which two nodes transmit data to each other via an intermediate relay. The relay amplifies the superimposed data from both nodes before sending it to both nodes.
A channel estimation method is proposed for the Amplify and Forward Two-Way Relay Network (AF-TWRN). The proposed method exploits the fact that the AF-TWRN channel is sparse. The sparse multipath AF-TWRN channel is estimated using a compressive sensing (CS) reconstruction algorithm, namely Iterative Hard Thresholding (IHT). The MSE performance of this method in estimating the composite AF-TWRN channel was calculated and compared with that of Compressive Sampling Matching Pursuit (CoSaMP) and Orthogonal Matching Pursuit (OMP). IHT and CoSaMP are seen to perform slightly better than OMP, with lower computational complexity than OMP. It was also shown that all three CS-based estimation methods perform better than the traditional Least Squares (LS) method in estimating the sparse AF-TWRN channel. A low-complexity detection strategy was proposed.

Item Open Access
Low-power multi-ported register file for digital signal processors
(Dhirubhai Ambani Institute of Information and Communication Technology, 2011) Aguduri, Nagamanoj Kumar; Nagchoudhuri, Dipankar
Digital Signal Processors (DSPs) are among the processors in which multi-ported register files are widely applicable. Most DSP applications do not benefit from further speed-up once a certain speed has been achieved. This thesis builds a multi-ported register file that takes advantage of the relaxed speed requirements of DSPs to reduce power consumption. It can, however, also be used in any processor that requires multiple data accesses without requiring high-end performance. A 64-entry, 64-bit register file with 10 read ports and 6 write ports is designed using a combination of techniques proposed in various earlier research works. We propose some improvements to this design to further reduce the power consumption. The designed register file operates at a frequency of 250 MHz from a 1 V power supply.
The circuits are simulated using 90 nm technology. The simulation results show that this design consumes 0.00226 mW/MHz-port.

Item Open Access
Hardware implementation of multiband and multimode modem for software defined radio/ cognitive radio
(Dhirubhai Ambani Institute of Information and Communication Technology, 2011) Buddhbhatti, Dixit K.; Dubey, Rahul
Present-day programmable hardware and SDR (Software Defined Radio) have enabled radio processing to switch from analog to digital. The work presented here describes a method of designing a multiband and multimode modulator for SDR/CR (Cognitive Radio) and its implementation on programmable hardware such as an FPGA (Field Programmable Gate Array). SDR has become one of the most important topics of research in the field of satellite communication. The prototype presented here shows the ability to dynamically alter the modulation and demodulation scheme in a Satellite Communication (Satcom) terminal to suit the requirements of the given conditions. The cognitive module decides which modulation scheme and frequency band are to be used for signal transmission, and provides a control bit to select the modulation scheme and operating frequency. The work presented here demonstrates a practical design and implementation procedure for a modem used in an SDR/CR platform and gives a detailed description of the baseband signal processing logic design in the FPGA.

Item Open Access
Oblique projection operator
(Dhirubhai Ambani Institute of Information and Communication Technology, 2010) Shivakrishana, D; Chakka, Vijaykumar
This thesis presents an understanding of the usefulness of the oblique projector in signal processing tasks such as signal recovery, signal representation and reconstruction. It also presents the different recursive oblique projector methods available in the literature, along with their usefulness under time-varying channel conditions and in additive correlated noise environments.