FPGA-Driven Machine Learning Models: Trends, Challenges, and Implementations

Sattenapalli Kalyani, Vydeki D

Citation:

Sattenapalli Kalyani, Vydeki D, "FPGA-Driven Machine Learning Models: Trends, Challenges, and Implementations," International Journal of Electronics and Communication Engineering, vol. 12, no. 10, pp. 196-211, 2025. Crossref, https://doi.org/10.14445/23488549/IJECE-V12I10P116

Abstract

Field Programmable Gate Arrays (FPGAs) are a robust platform for accelerating machine learning (ML) algorithms because they enable parallel computation and low latency while minimizing power usage. Computing now reaches nearly every domain, and many everyday objects have become smart. The Internet of Things (IoT) connects such objects to networks through IoT platforms, forming a system of linked devices that exchange data automatically, without human input. IoT systems therefore require flexible platforms. FPGA technology supports the connection of IoT devices to external environments, offering easy user access, low power consumption, minimal delay, and high precision. The scalability of FPGAs also enables System-on-Chip (SoC) implementation, since designers can place various hardware clocks onto a single chip. An FPGA acts as a programmable device that receives signals at its input pins and transforms them into outputs at its output pins. This review surveys recent FPGA implementations of ML algorithms, with a specific focus on Support Vector Machines (SVMs) and their classification precision. It compares hardware system designs and their performance tradeoffs and identifies notable research areas for improvement. The final part outlines directions for enhancing FPGA-based ML implementations.

Keywords

SVM, Embedded Systems, FPGA, Hardware Architecture, System on Chip.
