Energy-Efficient Deep Learning Architectures for Edge AI Applications

Felix Hart

Authors

Felix Hart, Department of Computer Science, University of Central Florida, Orlando, FL, USA.

Keywords:

edge AI, energy efficiency, deep learning, model compression, neuromorphic computing, federated learning, system architecture, sustainability, socio-technical systems

Abstract

The proliferation of edge computing, driven by the exponential growth of Internet of Things devices and the demand for low-latency inference, has necessitated a fundamental rethinking of deep learning architectures. Traditional models, optimized for cloud-based infrastructure with abundant computational resources, are ill-suited for the constrained environments of edge devices, which are characterized by limited energy budgets, memory capacity, and processing power. This paper presents a comprehensive systems-level analysis of energy-efficient deep learning architectures designed specifically for edge artificial intelligence applications. Moving beyond a narrow focus on algorithmic optimization, the discussion situates architectural design within a broader socio-technical framework, examining the structural trade-offs between model accuracy, energy consumption, inference latency, and operational robustness. The paper systematically evaluates key architectural strategies, including model compression techniques such as pruning and quantization, the design of lightweight neural networks like MobileNets and EfficientNet, and the deployment of neuromorphic computing paradigms. Furthermore, it explores the governance and infrastructural implications of deploying these architectures across heterogeneous edge environments, addressing issues of fairness, sustainability, and policy compliance. A critical analysis of federated learning as a privacy-preserving framework for distributed edge training is provided, with particular attention to its energy overhead and convergence challenges. The paper concludes by outlining forward-looking research directions, emphasizing the need for holistic co-design approaches that integrate hardware, software, and regulatory considerations to achieve truly sustainable and equitable edge AI systems.

