Shakti (Micro Processor) – The Advanced Micro Processor By India
- by Shanu Raj
Shakti Micro Processor
Basic Introduction : SHAKTI is an open-source initiative by the RISE group at IIT-Madras, which is not only building open source, production grade processors, but also associated components like interconnect fabrics, verification tools, storage controllers, peripheral IPs and SOC tools.
The SHAKTI project is building a family of 6 processors, based on the RISC-V ISA. We will also develop reference SoCs for each class of processors, which will serve as an exemplar for that family. While the primary focus of the team is architecture research, these SoCs will be competitive with commercial offerings in the market with respect to area, power and performance.
Apart from front-end design, SHAKTI is also actively working with partners to develop a base VLSI flow (front and back-end) for a large part of the ecosystem. While not all of the tools may be open source, the scripts and environment needed to plug in SHAKTI components will be released as open source.
The source code of all components of the SHAKTI project is released under the 3-clause BSD license and will be royalty and patent free (as far as IIT-Madras is concerned, we will not assert any patents). In practice, this means you can use, modify and distribute the code as long as you meet the license terms.
Three types of base processors :
1 ) E Class : This is our embedded class processor, built around a 3-stage in-order core. It is aimed at low-power and low compute applications and is capable of running basic RTOSs like FreeRTOS, Zephyr and eChronos. Market segments include: smart-cards, IoT sensors, motor-controls and robotic platforms.
2 ) C Class : The C Class is a controller class of processors, aimed at mid-range application workloads. The core is a highly optimized, 5-stage in-order design with MMU support and the capability to run operating systems such as Linux and seL4. These processors are targeted at compute/control applications in the 0.5-1.5 GHz range. The C Class will support the full RISC-V ISA. The C Class is also the basis for our Tagged-ISA and fault-tolerant cores.
3 ) I Class : Equipped with performance-oriented features like out-of-order execution, multi-threading, aggressive branch prediction, non-blocking caches and deep pipeline stages, the I Class processors are targeted at the compute, mobile, storage and networking segments. Target operating range: 1.5-2.5 GHz.
Three Types Of Multicore Processors.
1 ) M Class : This is a mobile class processor with a maximum of 8 cores, the cores being a combination of C and I class cores. TileLink is used as the cache-coherent interconnect, along with transaction adapters/bridges to AXI4/AHB to connect to fast and/or slow peripherals. The TileLink topology is customizable to allow optimizations for various power/performance targets. In typical configurations, it is expected that a core complex of 2 or 4 cores will share an L2 cache. L3 caches are optional and are typically expected to be used in desktop-type applications.
2 ) S Class : The S Class is aimed at workstation and enterprise server workloads. The base core is an enhanced version of the I Class, with quad-issue and multi-threading support. A TileLink-based cache-coherent mesh fabric is the interconnect of choice. Cores are expected to use dedicated L2 caches and segmented L3 caches. A maximum core count of 32 will be supported. The external interconnect is expected to be Gen-Z, and we are considering supporting multi-socket cache coherency based on a MOESIF-style protocol running on top of Gen-Z.
3 ) H Class : SoC configuration aimed at highly parallel enterprise, HPC and analytics workloads. The cores can be a combination of C or I class, with single-thread performance driving the core choice. Optional L4 caches and an optimized memory hierarchy are key to achieving high memory bandwidth. The architecture thrust is on accelerators, VPU and AI/ML, and a mesh SoC fabric optimized for up to 128 cores with multiple accelerators per core. Close integration with an external Gen-Z fabric is a key part of the design, as is support for storage class memory. This aspect of the design is crucial since I/O and memory bandwidth are often the bottleneck for these classes of processors.
Two Types Of Experimental Processors.
1 ) T Class : A variant of the C Class that explores tag-based ISAs for object-level security. We plan to support coarse and fine grain tags. Coarse grain tags will be used to realize micro-VM-like functionality, to mitigate software attacks like buffer overflow.
2 ) F Class : F Class processors are fault-tolerant versions of the base processors. Features include redundant compute blocks (like DMR and TMR), temporal redundancy modules to detect permanent faults, lock-step core configurations, fault localization circuits, ECC for critical memory blocks and redundant bus fabrics. These are also a key component of our ASIL-D solutions and autonomous vehicle compute blocks.
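To make the DMR and TMR terms above concrete, here is a minimal software sketch of the general idea in Python. It is not SHAKTI's hardware design: the function names `dmr_check`, `tmr_vote` and `tmr_disagreeing` are hypothetical and chosen only for illustration. DMR (dual modular redundancy) runs a computation twice and can only detect a mismatch, while TMR (triple modular redundancy) runs it three times and can mask a fault by majority vote.

```python
def dmr_check(a: int, b: int) -> bool:
    """DMR: two redundant results can detect a fault but not correct it."""
    return a == b


def tmr_vote(a: int, b: int, c: int) -> int:
    """TMR: bitwise majority of three redundant results.

    Each output bit takes the value held by at least two of the inputs,
    so a fault is masked as long as no bit position is wrong in more
    than one replica.
    """
    return (a & b) | (b & c) | (a & c)


def tmr_disagreeing(a: int, b: int, c: int) -> list[int]:
    """Return the indices of replicas that differ from the voted value
    (a crude stand-in for fault localization)."""
    voted = tmr_vote(a, b, c)
    return [i for i, value in enumerate((a, b, c)) if value != voted]


# Illustrative values only: one replica suffers a single bit flip.
golden = 0b1011_0010
faulty = golden ^ 0b0000_1000

assert not dmr_check(golden, faulty)                  # DMR detects the mismatch
assert tmr_vote(golden, faulty, golden) == golden     # TMR masks the fault
assert tmr_disagreeing(golden, faulty, golden) == [1] # replica 1 is the odd one out
```

In real silicon the voter is a small combinational circuit sitting on the outputs of the replicated blocks. The same majority function applies, which is why the bitwise form is used here.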