ReProVide
Query Optimisation and Near-Data Processing on Reconfigurable SoCs for Big Data Analysis
This project is funded by the German Research Foundation (DFG) within the Priority Program SPP 2037 "Scalable Data Management for Future Hardware".
The goal of this project is to provide novel hardware and optimisation techniques for scalable, high-performance processing of Big Data. We particularly target huge data sets with flexible schemata (row-oriented, column-oriented, document-oriented, irregular, and/or non-indexed) as well as data streams such as those found in click-stream analysis, drawn from enterprise sources like emails, software logs, and discussion-forum archives, or produced by sensors in IoT and Industrie 4.0. In this realm, the project investigates the potential of hardware-reconfigurable, FPGA-based SoCs for near-data processing, where computations are pushed towards such heterogeneous data sources. Based on FPGA technology, and in particular on its dynamic reconfiguration, we propose a generic architecture called ReProVide for low-cost processing of database queries.
The concepts are intended to enable the integration of FPGA-based acceleration support into available SQL, NoSQL, and in-memory database management systems (DBMSs) as well as stream-processing frameworks. Our intention is to attach volatile and non-volatile storage directly to ReProVide nodes, which will not only contain cleansed and integrated data sets, but can also be used for temporarily or persistently storing uncleaned data from new data sources and data streams.
Our FPGA-based SoC is pushed forward by the Chair for Computer Science 12. It
- makes use of hardware reconfiguration to adapt datapaths and accelerators so that they can process different OLAP and data-mining operators on data from such heterogeneous data sources,
- provides management techniques to generate local meta-data, indexes, and statistics of such data sources for optimised data processing, as well as
- offers schema-on-read capabilities for the DBMS accessing the SoC.
While support for irregular data (e.g., graph processing) is not the main focus of our research, we provide a design methodology that is generic and extensible by user-defined functions and data schemata.
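To illustrate the schema-on-read capability listed above, here is a minimal software sketch, not the ReProVide implementation: raw records are stored untouched, and a schema is applied only when the data are read. The function name `read_with_schema`, the JSON-lines input format, and the schema representation (field name mapped to a cast function) are assumptions made for this example.

```python
import json


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time (hypothetical helper, for illustration only).

    raw_lines: iterable of strings, each expected to hold one JSON object.
    schema:    dict mapping field name -> cast function (e.g. int, str).
    Yields one dict per parseable record, containing only the schema fields.
    Missing or non-convertible fields are returned as None.
    """
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # uncleaned input: skip lines that are not valid JSON
        if not isinstance(record, dict):
            continue  # only JSON objects can be mapped onto the schema
        row = {}
        for field, cast in schema.items():
            value = record.get(field)
            try:
                row[field] = cast(value) if value is not None else None
            except (TypeError, ValueError):
                row[field] = None
        yield row


raw = [
    '{"user": "a", "clicks": "3", "extra": 1}',
    'not json',
    '{"user": "b"}',
]
rows = list(read_with_schema(raw, {"user": str, "clicks": int}))
```

The same raw data can be read again with a different schema, without reloading or restructuring the stored records.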
Integrating such architectures, which come with their own local optimiser, into a DBMS requires novel global query-optimisation techniques based on concepts known from multi-database research. This is the task of the Chair for Computer Science 6. While the local optimiser builds statistics of its local data, the global optimiser has to access these statistics and further information about the near-data processor. Based on this information, global query optimisation decides which operations are worthwhile to assign to ReProVide SoCs and which are not. It is vital that the optimiser has enough knowledge to engage ReProVide in query processing whenever there is a benefit. This requires functional knowledge (which data and which operators can be offered) as well as non-functional knowledge (e.g., cost estimates). In this project, we provide an extensible interface over which the global optimiser can hand over the query execution plan (QEP) to be processed on the ReProVide system and over which ReProVide can transmit the query result. The interface will also allow both optimisers to exchange hints bidirectionally to improve their respective optimisation.
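The placement decision described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's optimiser: the class names, the three cost fields, and the rule "push down if supported and cheaper including result transfer" are all hypothetical. A real optimiser would also have to respect dependencies between the operations of a plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceleratorCapabilities:
    """Functional knowledge: which data and operators the node offers (hypothetical)."""
    tables: frozenset
    operators: frozenset


@dataclass(frozen=True)
class Operation:
    """One operation of a query plan with non-functional knowledge (hypothetical costs)."""
    name: str             # e.g. "filter", "join"
    table: str            # data source the operation reads
    host_cost: float      # estimated cost when executed by the DBMS host
    accel_cost: float     # estimated cost when executed near the data
    transfer_cost: float  # estimated cost of shipping the result to the host


def assign_operations(ops, caps):
    """Split operations into (pushed_down, kept_on_host).

    An operation is pushed down only if the accelerator supports it
    (functional knowledge) and the estimated accelerator cost plus
    result transfer is lower than the host cost (non-functional knowledge).
    """
    pushed, kept = [], []
    for op in ops:
        supported = op.name in caps.operators and op.table in caps.tables
        beneficial = op.accel_cost + op.transfer_cost < op.host_cost
        (pushed if supported and beneficial else kept).append(op)
    return pushed, kept


caps = AcceleratorCapabilities(
    tables=frozenset({"logs"}),
    operators=frozenset({"filter", "project"}),
)
ops = [
    Operation("filter", "logs", host_cost=10.0, accel_cost=2.0, transfer_cost=1.0),
    Operation("join", "logs", host_cost=5.0, accel_cost=1.0, transfer_cost=1.0),
    Operation("filter", "logs", host_cost=1.0, accel_cost=2.0, transfer_cost=1.0),
]
pushed, kept = assign_operations(ops, caps)
```

In this example only the first filter is pushed down: the join is not among the offered operators, and the second filter is cheaper on the host.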
Recent Publications
2024
- Hahn T., Schüll D., Wildermann S., Teich J.:
ABACUS: ASIP-based Avro Schema-customizable Parser Acceleration on FPGAs
International Symposium on Design and Diagnostics of Electronic Circuits and Systems (DDECS) (Kielce, 3. April 2024 - 5. April 2024)
In: IEEE Proceedings of the 27th International Symposium on Design and Diagnostics of Electronic Circuits & Systems 2024
DOI: 10.1109/DDECS60919.2024.10508904
- Hahn T., Wildermann S., Teich J.:
JSON-CooP: A JSON Decompression/Parsing Co-Design for FPGAs
International Conference on Field-Programmable Logic and Applications (FPL) (Turin, 2. September 2024 - 6. September 2024)
In: IEEE Proceedings of the 34th International Conference on Field-Programmable Logic and Applications 2024
DOI: 10.1109/FPL64840.2024.00012
2023
- Hahn T., Schüll D., Wildermann S., Teich J.:
An FPGA Avro Parser Generator for Accelerated Data Stream Processing
2nd Workshop on Novel Data Management Ideas on Heterogeneous (Co-)Processors (NoDMC) (Dresden, 6. March 2023 - 10. March 2023)
In: Proceedings of the 2nd Workshop on Novel Data Management Ideas on Heterogeneous (Co-)Processors (NoDMC) 2023
DOI: 10.18420/BTW2023-46
- Hahn T., Wildermann S., Teich J.:
SPEAR-JSON: Selective parsing of JSON to enable accelerated stream processing on FPGAs
International Conference on Field-Programmable Logic and Applications (FPL) (Göteborg, 4. September 2023 - 8. September 2023)
In: IEEE Proceedings of the 33rd International Conference on Field-Programmable Logic and Applications 2023
DOI: 10.1109/FPL60245.2023.00034
2022
- Becher A.:
Near-Data Query Processing on Heterogeneous FPGA-based Systems (Dissertation, 2022)
URL: https://nbn-resolving.org/urn:nbn:de:bvb:29-opus4-189289
- Hahn T., Becher A., Wildermann S., Teich J.:
Raw Filtering of JSON data on FPGAs
Design, Automation and Test in Europe Conference (DATE) (Antwerpen, 14. March 2022 - 23. March 2022)
In: Proceedings of the 2022 Conference & Exhibition on Design, Automation & Test in Europe 2022
DOI: 10.23919/DATE54114.2022.9774696
- Hahn T., Wildermann S., Teich J.:
Auto-Tuning of Raw Filters for FPGAs
International Conference on Field-Programmable Logic and Applications (FPL) (Belfast, United Kingdom, 29. August 2022 - 2. September 2022)
In: IEEE Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications 2022
DOI: 10.1109/FPL57034.2022.00036
2020
- Beena Gopalakrishnan Nair L., Becher A., Meyer-Wegener K.:
The ReProVide Query-Sequence Optimization in a Hardware-Accelerated DBMS
16th International Workshop on Data Management on New Hardware Held with ACM SIGMOD/PODS 2020 (Portland, Oregon USA, 15. June 2020 - 15. June 2020)
In: DaMoN '20: Proceedings of the 16th International Workshop on Data Management on New Hardware 2020
DOI: 10.1145/3399666.3399926
- Beena Gopalakrishnan Nair L., Becher A., Meyer-Wegener K., Wildermann S., Teich J.:
SQL Query Processing Using an Integrated FPGA-based Near-Data Accelerator in ReProVide
23rd International Conference on Extending Database Technology (Copenhagen, 30. March 2020 - 2. April 2020)
In: Proceedings of EDBT 2020
2019
- Becher A., Herrmann A., Wildermann S., Teich J.:
ReProVide: Towards Utilizing Heterogeneous Partially Reconfigurable Architectures for Near-Memory Data Processing
1st Workshop on Novel Data Management Ideas on Heterogeneous (Co-)Processors (NoDMC) at 18. Fachtagung für "Datenbanksysteme für Business, Technologie und Web" (Universität Rostock, 4. March 2019 - 8. March 2019)
In: Gesellschaft für Informatik, Bonn (ed.): Proceedings of the 1st Workshop on Novel Data Management Ideas on Heterogeneous (Co-)Processors (NoDMC), Bonn: 2019
DOI: 10.18420/btw2019-ws-04
URL: https://dl.gi.de/handle/20.500.12116/21825
- Becher A., Teich J.:
In situ Statistics Generation within partially reconfigurable Hardware Accelerators for Query Processing
15th International Workshop on Data Management on New Hardware (DaMoN) Held with ACM SIGMOD/PODS 2019 (Amsterdam, 1. July 2019 - 1. July 2019)
DOI: 10.1145/3329785.3329936
- Plagwitz P., Streit FJ., Becher A., Wildermann S., Teich J.:
Compiler-Based High-Level Synthesis of Application-Specific Processors on FPGAs
International Conference on ReConFigurable Computing and FPGAs (ReConFig) (Cancún, Mexico, 9. December 2019 - 11. December 2019)
In: IEEE Proceedings of the 14th International Conference on ReConFigurable Computing and FPGAs 2019
DOI: 10.1109/ReConFig48160.2019.8994778
2018
- Becher A., Beena Gopalakrishnan Nair L., Broneske D., Drewes T., Gurumurthy B., Meyer-Wegener K., Pionteck T., Saake G., Teich J., Wildermann S.:
Integration of FPGAs in Database Management Systems: Challenges and Opportunities
In: Datenbank-Spektrum (2018)
ISSN: 1618-2162
DOI: 10.1007/s13222-018-0294-9
- Becher A., Wildermann S., Teich J.:
Optimistic Regular Expression Matching on FPGAs for Near-Data Processing
Data Management on New Hardware (DaMoN) (Houston, Texas, 11. June 2018 - 11. June 2018)
DOI: 10.1145/3211922.3211926
- Echavarria Gutiérrez JA., Schütz K., Becher A., Wildermann S., Teich J.:
Can Approximate Computing Reduce Power Consumption on FPGAs?
25th IEEE International Conference on Electronics Circuits and Systems (Bordeaux, 9. December 2018 - 12. December 2018)
In: Proceedings of IEEE International Conference on Electronics Circuits and Systems 2018
DOI: 10.1109/icecs.2018.8618062
2016
- Becher A., Wildermann S., Mühlenthaler M., Teich J.:
ReOrder: Runtime Datapath Generation for High-Throughput Multi-Stream Processing
International Conference on Reconfigurable Computing and FPGAs (ReConFig) (Cancún)
In: Proceedings of the International Conference on Reconfigurable Computing and FPGAs 2016
DOI: 10.1109/ReConFig.2016.7857185