Multi-Processor System-on-Chip 2 covers application-specific MPSoC design, including compilers and architecture exploration. This second volume describes optimization methods and tools for optimizing and porting specific applications to MPSoC architectures. Details on compilation, power consumption and wireless communication are also presented, as well as examples of modeling frameworks and CAD tools. Explanations of specific platforms for automotive and real-time computing are also included.
Table of Contents
Foreword xi
Ahmed JERRAYA
Acknowledgments xiii
Liliana ANDRADE and Frédéric ROUSSEAU
Part 1. MPSoC for Telecom 1
Chapter 1. From Challenges to Hardware Requirements for Wireless Communications Reaching 6G 3
Stefan A. DAMJANCEVIC, Emil MATUS, Dmitry UTYANSKY, Pieter VAN DER WOLF and Gerhard P. FETTWEIS
1.1. Introduction 4
1.2. Breadth of workloads 6
1.2.1. Vision, trends and applications 6
1.2.2. Standard specifications 8
1.2.3. Outcome of workloads 13
1.3. GFDM algorithm breakdown 14
1.3.1. Equation 15
1.3.2. Dataflow processing graph and matrix representation 15
1.3.3. Pseudo-code 16
1.4. Algorithm precision requirements and considerations 18
1.5. Implementation 21
1.5.1. Implementation considerations 23
1.5.2. Design space exploration 23
1.5.3. Measurements for low-end and high-end use cases 26
1.6. Conclusion 28
1.7. Acknowledgments 29
1.8. References 29
Chapter 2. Towards Tbit/s Wireless Communication Baseband Processing: When Shannon meets Moore 33
Matthias HERRMANN and Norbert WEHN
2.1. Introduction 34
2.2. Role of microelectronics 36
2.3. Towards 1 Tbit/s throughput decoders 37
2.3.1. Turbo decoder 39
2.3.2. LDPC decoder 41
2.3.3. Polar decoder 41
2.4. Conclusion 43
2.5. Acknowledgments 43
2.6. References 43
Part 2. Application-specific MPSoC Architectures 47
Chapter 3. Automation for Industry 4.0 by using Secure LoRaWAN Edge Gateways 49
Marcello COPPOLA and George KORNAROS
3.1. Introduction 50
3.2. Security in IIoT 52
3.3. LoRaWAN security in IIoT 53
3.4. Threat model 55
3.4.1. LoRaWAN attack model 55
3.4.2. IIoT node attack model 56
3.5. Trusted boot chain with STM32MP1 57
3.5.1. Trust base of node 57
3.5.2. Trusted firmware in STM32MP1 57
3.5.3. Trusted execution environments and OP-TEE 58
3.5.4. OP-TEE scheduling considerations 60
3.5.5. OP-TEE memory management 60
3.5.6. OP-TEE client API 61
3.5.7. TEE internal core API 62
3.5.8. Root and chain of trust 62
3.5.9. Hardware unique key 62
3.5.10. Secure clock 63
3.5.11. Cryptographic operations 63
3.6. LoRaWAN gateway with STM32MP1 64
3.7. Discussion and future scope 65
3.8. Acknowledgments 66
3.9. References 66
Chapter 4. Accelerating Virtualized Distributed NVMe Storage in Hardware 69
Julian CHESTERFIELD and Michail FLOURIS
4.1. Introduction 70
4.1.1. Virtualization and traditional hypervisors 71
4.1.2. Hyperconverged versus disaggregated cloud architectures 72
4.1.3. NVMe flash storage 74
4.2. Motivation: NVMe storage for the cloud 75
4.2.1. Motivation for a new hypervisor 75
4.2.2. Motivation for accelerating disaggregated storage 76
4.3. Design 77
4.3.1. Optimizing the hypervisor I/O operations 77
4.3.2. Design of accelerated disaggregated storage 80
4.4. Implementation 86
4.4.1. The NexVisor platform 87
4.4.2. Accelerated disaggregated storage 87
4.5. Results 90
4.5.1. Sequential reads 90
4.5.2. Sequential writes 90
4.5.3. Sequential reads on one NVMe drive 92
4.5.4. Network performance 92
4.6. Conclusion 93
4.7. References 93
Chapter 5. Modular and Open Platform for Future Automotive Computing Environment 95
Raphaël DAVID, Etienne HAMELIN, Paul DUBRULLE, Shuai LI, Philippe DORE, Alexis OLIVEREAU, Maroun OJAIL, Alexandre CARBON and Laurent LE GARFF
5.1. Introduction 96
5.2. Outline of this approach 98
5.2.1. Centralized computation, distributed data 98
5.2.2. Modularity and heterogeneity 99
5.2.3. Tools for specification, configuration and integration 101
5.3. Results 102
5.3.1. Hardware platform 103
5.3.2. FACE SW architecture 108
5.3.3. FACE Tool Suite 112
5.4. Use case 116
5.4.1. Adaptive braking system 116
5.5. Conclusion 118
5.6. References 119
Chapter 6. Post-Moore Datacenter Server Architecture 123
Babak FALSAFI
6.1. Introduction 124
6.2. Background: today’s blades are from the desktops of the 1980s 125
6.3. Memory-centric server design 127
6.4. Data management accelerators 129
6.5. Integrated network controllers 130
6.6. References 131
Part 3. Architecture Examples and Tools for MPSoC 135
Chapter 7. SESAM: A Comprehensive Framework for Cyber-Physical System Prototyping 137
Amir CHARIF, Arief WICAKSANA, Salah-Eddine SAIDI, Tanguy SASSOLAS, Caaliph ANDRIAMISAINA and Nicolas VENTROUX
7.1. Introduction 138
7.2. An overview of the SESAM platform 138
7.2.1. Multi-abstraction system prototyping 139
7.2.2. Assessing extra-functional system properties 140
7.3. VPSim: fast and easy virtual prototyping 140
7.3.1. Writing peripherals in Python 141
7.3.2. The Model Provider interface 142
7.3.3. QEMU support 144
7.3.4. Online simulation monitoring 146
7.3.5. Acceleration methods 146
7.4. Hybrid prototyping 147
7.4.1. Co-simulation mode 148
7.4.2. Co-emulation mode 149
7.4.3. Runtime performance analysis and debugging features 149
7.5. FMI for co-simulation 150
7.5.1. Functional mock-up interface 151
7.5.2. VPSim integration in FMI co-simulation 152
7.6. Conclusion 155
7.7. References 155
Chapter 8. StaccatoLab: A Programming and Execution Model for Large-scale Dataflow Computing 157
Kees VAN BERKEL
8.1. Introduction 158
8.2. Static dataflow 161
8.2.1. Synchronous dataflow 162
8.2.2. Cyclo-static dataflow 166
8.2.3. Dataflow graph transformations 167
8.3. Dynamic dataflow 168
8.3.1. Data-dependent dataflow 168
8.3.2. Non-determinate dataflow 172
8.4. Dataflow execution models 175
8.4.1. A brief review of dataflow theory 175
8.4.2. The StaccatoLab execution model 177
8.5. StaccatoLab 180
8.5.1. Dataflow graph description and analysis 180
8.5.2. Verilog synthesis 180
8.6. Large-scale dataflow computing? 182
8.6.1. What kind of applications? 182
8.6.2. Why effective? 183
8.6.3. Why efficient? 184
8.7. Acknowledgments 185
8.8. References 185
Chapter 9. Smart Cameras and MPSoCs 189
Marilyn WOLF
9.1. Introduction 189
9.2. Early VLSI video processors 190
9.3. Video signal processors 191
9.4. Accelerators 193
9.5. From VSP to MPSoC 195
9.6. Graphics processing units 197
9.7. Neural networks and tensor processing units 197
9.8. Conclusion 199
9.9. References 199
Chapter 10. Software Compilation and Optimization Techniques for Heterogeneous Multi-core Platforms 203
Weihua SHENG, Jeronimo CASTRILLON and Rainer LEUPERS
10.1. Introduction 204
10.2. Dataflow modeling 207
10.2.1. General concepts 207
10.2.2. Process networks 208
10.2.3. C for process networks 209
10.3. Source-to-source-based compiler infrastructure 214
10.3.1. Design rationale 214
10.3.2. Implementation strategy 216
10.4. Software distribution 218
10.4.1. KPN analysis 219
10.4.2. Static KPN mapping 220
10.4.3. Hybrid KPN mapping 221
10.5. Results 222
10.5.1. Applications and experiences 222
10.5.2. Retargetability 229
10.6. Conclusion 230
10.7. References 231
List of Authors 237
Author Biographies 241
Index 251