DOI: 10.1145/3445814.3446724
research-article
Open access

PacketMill: Toward per-core 100-Gbps Networking

Published: 17 April 2021

Abstract

We present PacketMill, a system for optimizing software packet processing, which (i) introduces a new model to efficiently manage packet metadata and (ii) employs code-optimization techniques to better utilize commodity hardware. PacketMill grinds the whole packet processing stack, from the high-level network function configuration file to the low-level userspace network (specifically DPDK) drivers, to mitigate inefficiencies and produce a customized binary for a given network function. Our evaluation results show that PacketMill increases throughput (by up to 36.4 Gbps, i.e., 70%), reduces latency (by up to 101 μs, i.e., 28%), and enables nontrivial packet processing (e.g., a router) at approximately 100 Gbps, when new packets arrive more than 10× faster than main memory access times, while using only one processing core.
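One way to read the "more than 10× faster than main memory access times" claim is as a back-of-the-envelope comparison between the packet inter-arrival gap on a 100-Gbps link and the latency of one DRAM access. The sketch below works through that arithmetic. The packet size (minimum-size 64-byte Ethernet frames), the 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter, and inter-frame gap), and the 70 ns DRAM latency are illustrative assumptions; the abstract states none of them, and the function names are hypothetical.

```python
# Back-of-the-envelope sketch; every constant below is an assumption,
# not a figure taken from the abstract.

LINK_RATE_BPS = 100e9        # 100-Gbps link
WIRE_OVERHEAD_BYTES = 20     # preamble (7) + start-of-frame delimiter (1) + inter-frame gap (12)


def packet_interarrival_ns(frame_bytes, link_rate_bps=LINK_RATE_BPS):
    """Time between back-to-back frames of the given size at line rate, in ns."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return bits_on_wire / link_rate_bps * 1e9


def packets_per_memory_access(frame_bytes, dram_latency_ns):
    """How many frames arrive during one main-memory access of the given latency."""
    return dram_latency_ns / packet_interarrival_ns(frame_bytes)


if __name__ == "__main__":
    gap_ns = packet_interarrival_ns(64)
    ratio = packets_per_memory_access(64, dram_latency_ns=70.0)  # assumed latency
    print(f"inter-arrival gap for 64-byte frames: {gap_ns:.2f} ns")
    print(f"frames arriving per assumed 70 ns DRAM access: {ratio:.1f}")
```

Under these assumptions a new frame arrives roughly every 6.72 ns, so a single core has a per-packet budget far smaller than one trip to main memory. Larger frames widen the gap proportionally.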

Cited By

  • (2024) Incremental Specialization of Network Programs. Proceedings of the 23rd ACM Workshop on Hot Topics in Networks, 264-272. DOI: 10.1145/3696348.3696870. Online publication date: 18-Nov-2024.
  • (2024) FAJITA: Stateful Packet Processing at 100 Million pps. Proceedings of the ACM on Networking 2:CoNEXT3, 1-22. DOI: 10.1145/3676861. Online publication date: 21-Aug-2024.
  • (2024) Triton: A Flexible Hardware Offloading Architecture for Accelerating Apsara vSwitch in Alibaba Cloud. Proceedings of the ACM SIGCOMM 2024 Conference, 750-763. DOI: 10.1145/3651890.3672224. Online publication date: 4-Aug-2024.

Published In

ASPLOS '21: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
April 2021
1090 pages
ISBN:9781450383172
DOI:10.1145/3445814
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 April 2021

Author Tags

  1. 100-Gbps Networking
  2. Commodity Hardware
  3. Compiler Optimizations
  4. DPDK
  5. FastClick
  6. Full-Stack Optimization
  7. LLVM
  8. Metadata Management
  9. Middleboxes
  10. Packet Processing
  11. PacketMill
  12. X-Change

Qualifiers

  • Research-article

Conference

ASPLOS '21
Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%

Article Metrics

  • Downloads (last 12 months): 1,058
  • Downloads (last 6 weeks): 94
Reflects downloads up to 20 Nov 2024.

Cited By

  • (2024) Incremental Specialization of Network Programs. Proceedings of the 23rd ACM Workshop on Hot Topics in Networks, 264-272. https://doi.org/10.1145/3696348.3696870 Online publication date: 18-Nov-2024
  • (2024) FAJITA: Stateful Packet Processing at 100 Million pps. Proceedings of the ACM on Networking 2:CoNEXT3, 1-22. https://doi.org/10.1145/3676861 Online publication date: 21-Aug-2024
  • (2024) Triton: A Flexible Hardware Offloading Architecture for Accelerating Apsara vSwitch in Alibaba Cloud. Proceedings of the ACM SIGCOMM 2024 Conference, 750-763. https://doi.org/10.1145/3651890.3672224 Online publication date: 4-Aug-2024
  • (2024) Morpheus: A Run Time Compiler and Optimizer for Software Data Planes. IEEE/ACM Transactions on Networking 32:3, 2269-2284. https://doi.org/10.1109/TNET.2023.3346286 Online publication date: 1-Jun-2024
  • (2024) Un-IOV: Achieving Bare-Metal Level I/O Virtualization Performance for Cloud Usage With Migratability, Scalability and Transparency. IEEE Transactions on Computers 73:7, 1655-1668. https://doi.org/10.1109/TC.2024.3375589 Online publication date: Jul-2024
  • (2024) Morphable Networks For Cross-Layer And Cross-Domain Programmability: A Novel Network Paradigm. IEEE Vehicular Technology Magazine 19:3, 68-77. https://doi.org/10.1109/MVT.2024.3433670 Online publication date: Sep-2024
  • (2023) High-Speed Network DDoS Attack Detection: A Survey. Sensors 23:15, article 6850. https://doi.org/10.3390/s23156850 Online publication date: 1-Aug-2023
  • (2023) State Disaggregation for Dynamic Scaling of Network Functions. IEEE/ACM Transactions on Networking 32:1, 81-95. https://doi.org/10.1109/TNET.2023.3282562 Online publication date: 12-Jun-2023
  • (2023) Understanding Roadblocks in Virtual Network I/O: A Comprehensive Analysis of CPU Cache Usage. 2023 IEEE 9th International Conference on Network Softwarization (NetSoft), 450-455. https://doi.org/10.1109/NetSoft57336.2023.10175477 Online publication date: 19-Jun-2023
  • (2022) A Survey of NFV Network Acceleration from ETSI Perspective. Electronics 11:9, article 1457. https://doi.org/10.3390/electronics11091457 Online publication date: 2-May-2022
