Sharma, Arun


Publication Search Results

  • Publication
    Graphical development interface and stream analyzer for Apache Spark
    (2018) Sharma, Arun; Rodríguez Martínez, Manuel; College of Engineering; Arzuaga, Emmanuel; Rivera, Pedro; Department of Electrical and Computer Engineering; Zapata, Rocío
    Computing technology advances at a rapid pace. According to Moore’s law [1], the number of transistors on an integrated circuit doubles roughly every two years, and the processing power of devices increases with it. The hurdle is that we have not yet found apt solutions to fully utilize this number-crunching power. To solve a problem this big, the solution needs to be big as well. Throughout the history of computing, researchers and programmers worldwide have tried to address this issue by making computing devices work together over a network, an approach known as distributed processing. The Apache Software Foundation, a non-profit corporation that develops and distributes free and open source frameworks, tools, and SDKs, has constantly sought better ways to achieve maximum processing speed over a cluster of networked computing devices. Many tools are widely used, such as Apache Spark, Apache Storm, Apache Flink, Twitter’s Heron, and the Alluxio Open Foundation’s “Alluxio” framework. All of these tools help solve big data processing problems. With them, an analyst who already knows how to program can analyze huge data sets consisting of millions or billions of rows. However, doing so requires hands-on programming knowledge and, most of the time, a very deep understanding of big data processing algorithms. In this work, we developed a graphical development interface (GDI) that aims to minimize the effort required to use these tools when analyzing big data. We call the system “Stream Analyzer”. Beyond minimizing that effort for the typical user, the system also allows these analysis tools to be used by individuals with little or no experience in software development. Stream Analyzer is a collection of tools that allows a novice user to quickly set up a stream processing environment.
It is a system based on plugins/components that can be dragged and dropped onto the topology design area and interconnected. These components serve as the basis for the main processing work done behind the scenes. No prior knowledge of programming is required, as the plugins/components can be installed from the plugin management area. It is a multiuser system that provides topology and project management. Furthermore, at the processing end points, the user can attach a report generation plugin, provide configuration parameters, and run the topology to see live report generation in the report view area of the GDI. The core idea behind this GDI is to let any user quickly learn the tools and use them to analyze the big data available on the internet without any need to code.