Ruiz Rohena, Kristalys
Publication Search Results
ArcaDB: A container-based disaggregated query engine for heterogenous computational environments (2023-05-11)
Ruiz Rohena, Kristalys; Rodríguez Martínez, Manuel; College of Engineering; Rivera Gallego, Wilson; Arzuaga Cruz, Emmanuel; Rodríguez Rodríguez, Domingo; Department of Electrical and Computer Engineering; Cruzado Vélez, Ivette

Modern enterprises rely on data management systems to collect, store, and analyze vast amounts of data related to their operations. Clusters and hardware accelerators (e.g., GPUs, TPUs) have become necessary to keep pace with the data processing demands of applications in social media, bioinformatics, surveillance systems, remote sensing, and medical informatics. Given this new scenario, the architecture of data analytics engines must evolve to take advantage of these technological trends. In this thesis, I present ArcaDB: a disaggregated query engine that leverages container technology to place operators at compute nodes that fit their performance profile. In ArcaDB, a query plan is dispatched to worker nodes that have different computing characteristics. Each operator is annotated with the preferred type of compute node for execution, and ArcaDB ensures that the operator gets picked up by the appropriate workers. I have implemented a prototype version of ArcaDB using Java, Python, Docker containers, and other supporting open-source tools. I have also completed a preliminary performance study of this prototype, using images and business data. This study shows that ArcaDB can speed up query performance by a factor of 5 in comparison with a shared-nothing, symmetric arrangement. ArcaDB can help users better meet the performance requirements of their applications.
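
The operator-placement idea in the abstract (each operator carries a preferred node type, and only matching workers pick it up) can be illustrated with a minimal Python sketch. This is not ArcaDB's implementation: the `Operator` and `Dispatcher` classes, the `preferred_node` field, the node-type labels, and the operator names are all hypothetical, and the sketch ignores data dependencies between operators, containers, and networking.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A query-plan operator with a hypothetical placement annotation."""
    name: str
    preferred_node: str  # assumed labels such as "cpu" or "gpu"


class Dispatcher:
    """Keeps one FIFO queue of pending operators per node type."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def submit(self, plan):
        # Route each operator to the queue for its preferred node type.
        for op in plan:
            self._queues[op.preferred_node].append(op)

    def pick_up(self, node_type):
        # A worker only sees operators annotated for its own node type.
        queue = self._queues[node_type]
        return queue.popleft() if queue else None


# Hypothetical plan mixing image processing and business data.
plan = [
    Operator("scan_images", "cpu"),
    Operator("classify_images", "gpu"),
    Operator("join_with_sales", "cpu"),
]

dispatcher = Dispatcher()
dispatcher.submit(plan)

gpu_task = dispatcher.pick_up("gpu")  # taken by a GPU worker
cpu_task = dispatcher.pick_up("cpu")  # taken by a CPU worker
```

Keeping a separate queue per node type is one simple way to guarantee that a worker never receives an operator annotated for different hardware; a real engine would also need to schedule operators in dependency order.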