Research Interests

Database management systems (with a focus on scientific data management and energy-aware databases), application of control theory, high-performance computing, and peer-to-peer networks.

 

On-Going Projects

1. Efficient Data Processing in Molecular Dynamics Simulation

    Atomistic-level molecular dynamics (MD) can provide insights into the structure, dynamics, and thermodynamic characteristics of biological structures. Studies on molecular dynamics have thus attracted a great deal of attention from the biology, physics, and chemistry research communities. Computer simulation is an important method for molecular dynamics research. The outcome of an MD simulation is a "pseudo trajectory" of positions and velocities of all of the atoms in the simulation system. Due to the large number of atoms/molecules and time instances to simulate, such simulations generate tremendous amounts of data. How to reason about such data effectively and efficiently is a critical problem that directly affects the success of any molecular simulation study. Current strategies to store, access, and query such data are all based on computer files. This causes serious efficiency problems in querying and sharing the simulation data. In this project, we propose storing simulation data in databases and develop novel indexing strategies to help optimize the processing of a wide range of queries.
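
    As a rough illustration of the database-backed idea described above, the following minimal sketch loads a toy trajectory into an in-memory SQLite database and answers two typical queries: which atoms lie inside a box at a given frame, and what path one atom follows over time. The table name `atom_state`, its columns, and the helper functions are hypothetical, and the ordinary B-tree indexes used here stand in for the project's own indexing strategies, which this page does not describe.

```python
import sqlite3


def build_trajectory_db(frames):
    """Load a toy trajectory into an in-memory SQLite database.

    `frames` is a list of frames; each frame is a list of
    (atom_id, x, y, z, vx, vy, vz) tuples. The schema is an
    assumption made for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE atom_state ("
        " frame INTEGER, atom_id INTEGER,"
        " x REAL, y REAL, z REAL,"
        " vx REAL, vy REAL, vz REAL,"
        " PRIMARY KEY (frame, atom_id))"
    )
    rows = [(f, *atom) for f, frame in enumerate(frames) for atom in frame]
    conn.executemany(
        "INSERT INTO atom_state VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows
    )
    # Secondary index so per-atom trajectory queries avoid a full scan.
    conn.execute("CREATE INDEX idx_atom ON atom_state (atom_id, frame)")
    conn.commit()
    return conn


def atoms_in_box(conn, frame, lo, hi):
    """Return ids of atoms inside an axis-aligned box at one frame."""
    cur = conn.execute(
        "SELECT atom_id FROM atom_state WHERE frame = ?"
        " AND x BETWEEN ? AND ?"
        " AND y BETWEEN ? AND ?"
        " AND z BETWEEN ? AND ?"
        " ORDER BY atom_id",
        (frame, lo[0], hi[0], lo[1], hi[1], lo[2], hi[2]),
    )
    return [row[0] for row in cur]


def atom_path(conn, atom_id):
    """Return the (frame, x, y, z) positions of one atom, ordered by frame."""
    cur = conn.execute(
        "SELECT frame, x, y, z FROM atom_state"
        " WHERE atom_id = ? ORDER BY frame",
        (atom_id,),
    )
    return cur.fetchall()
```

    With file-based storage, either query would typically require scanning whole trajectory files; here the primary key and the secondary index let the database engine narrow the search to the relevant frame or atom.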

    This project is supported by a National Institutes of Health (NIH) R01 grant (R01-GM086707). Publications: [ICDE09][VLDBJ10][Code download]

2. Power-Aware Database Systems [link]

    Maintaining a sustainable society via technological innovations has been a major challenge for scientists and engineers in the 21st century. With the total energy consumption of computing equipment increasing at a steep rate, much attention has been paid to the design of energy-efficient computing systems. Power-aware computing research at the application level has been found to be synergistic with that at the system level (e.g., hardware and OS) because it can provide more opportunities for energy reduction in the underlying systems. In this project, we focus on database management systems (DBMS), which often consume a majority of the computing resources (and power) in modern data centers. We aim to design and implement a power-aware DBMS that can significantly reduce energy use with graceful performance degradation. Our preliminary results have demonstrated the great potential of this line of research.

    This project is supported by an NSF grant via its CISE core program (IIS-001117699) and a Florida Energy Systems Consortium (FESC) seed grant. Publications: [ICDE10][SSDBM11]

 

Accomplished Projects

1. Load Shedding in Data Stream Management Systems.

    The continuous nature of both data and queries in data stream management systems (DSMSs) places great demand on system resources. However, queries can be processed at different levels of quality, measured by factors such as timeliness, reliability, and uncertainty. In this project, we study the problem of how to bound tuple processing delays during query processing in DSMSs. A widely used approach to maintaining QoS (especially tuple delays) in DSMS query processing is load shedding, i.e., dropping data. Current load shedding solutions rely on simple, intuitive ideas to determine when and how much load to discard, and do not work well in the presence of system/environmental disturbances. We propose a solution based on feedback control theory with significantly improved long-term performance. Currently, we are exploring the strategy of semantic load shedding: selectively shedding data according to their importance to the query results. Publications: [ICDE07][VLDB06][DEXA05a][Code download]
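
    To make the feedback-control idea above concrete, here is a minimal sketch under stated assumptions: a proportional-integral (PI) controller measures the current tuple delay, compares it with a target, and sets the fraction of incoming tuples to drop. The class `PIShedder`, the gains `kp` and `ki`, and the fluid queue model in `simulate` are all hypothetical and chosen for illustration; they are not the controller design or system model from the cited publications.

```python
class PIShedder:
    """Toy PI controller that maps measured delay to a drop fraction."""

    def __init__(self, target_delay, kp=0.5, ki=0.1):
        self.target = target_delay
        self.kp = kp  # proportional gain (assumed value)
        self.ki = ki  # integral gain (assumed value)
        self.integral = 0.0

    def drop_fraction(self, measured_delay):
        error = measured_delay - self.target
        # Anti-windup: keep the integral term's contribution within [0, 1].
        self.integral = min(max(self.integral + error, 0.0), 1.0 / self.ki)
        u = self.kp * error + self.ki * self.integral
        # A drop fraction must lie between 0 (keep all) and 1 (drop all).
        return min(1.0, max(0.0, u))


def simulate(arrivals, capacity, shedder=None):
    """Simulate a single queue and return the delay after each period.

    `arrivals` holds tuples arriving per period; `capacity` is tuples
    processed per period. Delay is approximated as queue / capacity.
    """
    queue = 0.0
    delays = []
    for arriving in arrivals:
        measured = queue / capacity
        drop = shedder.drop_fraction(measured) if shedder else 0.0
        queue = max(0.0, queue + arriving * (1.0 - drop) - capacity)
        delays.append(queue / capacity)
    return delays


# Sustained 50% overload: 150 tuples arrive per period, 100 are processed.
overload = [150.0] * 50
without_shedding = simulate(overload, capacity=100.0)
with_shedding = simulate(overload, capacity=100.0, shedder=PIShedder(1.0))
```

    In this toy setting the delay without shedding grows without bound, while the controlled run settles near the target delay. The integral term is what lets the controller find the steady drop rate needed under sustained overload, which a fixed-threshold rule would not adapt to.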

2. QuaSAQ: Enabling End-to-End QoS by A Database-Centric Approach.

    Support for user-specified QoS requirements in multimedia databases cannot be accomplished by simply deploying the multimedia DBMS on top of a QoS-provisioning OS or middleware. We propose an integrated QoS-control framework (QuaSAQ) as part of the DBMS, as well as cost models to evaluate QoS-aware queries. QuaSAQ is implemented on the basis of our video DBMS - VDBMS. Publications: [EDBT04][DEXA05b][TKDE07]

3. Hybrid Peer-to-Peer Media Streaming System. [link]

    Peer-to-peer networks provide a method to solve the scalability problem of traditional server-based media streaming services. We studied the capacity (bandwidth) growth of such systems by performing quantitative analysis of a generic P2P streaming model.  Publications: [TOMCCAP05][MMCN04]

4. VDBMS - A video DBMS. [link]

    A multimedia DBMS with full-featured video processing and content-based retrieval capabilities. I investigated the use of tertiary storage as a medium for active data. One of the most serious problems in using tertiary storage is the long response time to requests. I also studied relevant caching and pre-fetching algorithms for the purpose of minimizing latency. Publications: [MMSJ04][DMS03][ICDE02]

5. HPKB. [link]

    An interactive knowledge-based system for efficient and cost-effective maintenance of naval ships. A description of HPKB infrastructure and methodology: