
Zynq UltraScale+ board supports new Xilinx AI Platform

iWave has unveiled a development kit for its Linux-driven, Zynq UltraScale+ based iW-Rainbow G30M module with support for a new Xilinx AI Platform. Xilinx is baking related AI technology into its soon-to-ship, Linux-powered 7nm Versal processors.

iWave Systems has launched an “iW-Rainbow G30D Zynq Ultrascale+ MPSoC Development Kit” for its iW-Rainbow G30M compute module, which runs Linux on the Arm Cortex-A53/FPGA based Xilinx Zynq UltraScale+ MPSoC. In announcing the kit, iWave focused on the platform’s ability to test the new Xilinx AI Platform, which it calls the Xilinx/Deephi core. The Xilinx AI Platform, which spans from the edge to the datacenter, is based largely on Xilinx’s acquisition of edge AI firm DeePhi.

iW-Rainbow G30M

Farther below, we’ll take a closer look at the Xilinx AI Platform and how Xilinx is using some of this technology in its new 7nm, dual -A72/FPGA Versal ACAP chips. Xilinx showcased the Versal earlier this week at the Xilinx Developer Forum.

Also this week, Xilinx introduced a Vitis development platform for its FPGAs that is beyond the scope of this article. Based on open source libraries, Vitis is billed as an easier alternative to its Vivado Design Suite. The platform includes a Vitis AI component that appears to target the Versal.

iW-Rainbow G30D

The new Zynq Ultrascale+ MPSoC Development Kit with iW-Rainbow G30D carrier board extends the Linux 4.14-driven iW-Rainbow G30M module. The G30M module runs on a quad-core Arm Cortex-A53 Zynq UltraScale+ MPSoC with 192K to 504K FPGA logic cells. The module ships with 4GB DDR4, 1GB DDR4 for the FPGA, and 8GB of expandable eMMC. There’s also -40 to 85°C support among other features detailed in our Sep. 2018 iW-Rainbow G30M report.

iW-Rainbow G30D and block diagram
(click images to enlarge)

The 140 x 130mm iW-Rainbow G30D carrier board features 2x GbE ports and an SFP+ cage. You also get single DisplayPort, USB 2.0 host, USB Type-C, and debug console ports. Internal I/O includes SD, CAN, JTAG, and 20-pin I/O headers.

Dual FMC HPC connectors provide FPGA-related I/O including LVDS, 14 high-speed transceivers, dual 12-pin PMOD, SATA, PCIe x4, and more. The board has an RTC with battery holder plus a 12V input.

Xilinx AI Platform

The iW-Rainbow G30D announcement links to a web page for the Xilinx AI Platform, which it calls “Xilinx/Deephi Core.” The Xilinx AI Platform was developed largely from Xilinx’s acquisition of DeePhi Technology Co. in July 2018. DeePhi was a Beijing-based startup with expertise in machine learning, deep compression, pruning, and system-level optimization for neural networks.

Xilinx AI Platform (left) and Xilinx Edge AI Platform architecture diagrams
(click images to enlarge)

The DeePhi core algorithms can execute critical real-time tasks directly on the Zynq UltraScale+ FPGA, says iWave. The iW-Rainbow G30M supports “a huge portfolio of Deephi cores” for edge AI applications, and the new dev kit now enables easier prototyping with the technology, says the company.

Baidu EdgeBoard

The Zynq Ultrascale+ has previously been featured as an AI processor on Baidu’s EdgeBoard, which was announced in January. However, the recently launched EdgeBoard uses Baidu’s own Baidu Brain AI algorithms.

The DeePhi “sparse neural network” core technology comprises a Convolutional Neural Network (CNN) pruning technology and a deep compression algorithm to reduce the size of AI algorithms for edge applications. The “ultra-low latency real-time inference” DeePhi algorithms support AI/ML acceleration in face recognition and image/pose detection for smart surveillance, says iWave. Other applications include intuitive ADAS for automotive assistance, predictive maintenance for industrial automation, and smart healthcare for real-time monitoring and analysis.
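DeePhi’s pruning algorithms are proprietary, but the basic idea behind magnitude-based CNN pruning can be sketched in a few lines: weights whose magnitudes fall below a threshold are zeroed out, leaving a sparse network that needs less storage and compute. The following is a generic, hypothetical illustration in plain Python, not DeePhi’s actual algorithm, which prunes at the channel/filter level and retrains to recover accuracy:

```python
# Generic magnitude-based weight pruning, for illustration only.
# Production pruning tools operate on whole channels/filters and
# fine-tune afterward; this sketch just zeroes the smallest weights.

def prune_weights(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of weights."""
    if sparsity <= 0:
        return list(weights)
    ranked = sorted(weights, key=abs)            # smallest magnitudes first
    cutoff = abs(ranked[int(len(ranked) * sparsity) - 1])
    return [0.0 if abs(w) <= cutoff else w for w in weights]

weights = [0.9, -0.02, 0.5, 0.01, -0.7, 0.03, -0.4, 0.05]
pruned = prune_weights(weights, 0.5)   # target 50% sparsity
print(pruned)
```

A sparse weight tensor like `pruned` can then be stored in a compressed format and, on FPGA fabric, the zeroed multiply-accumulates can be skipped entirely, which is where the latency win comes from.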

Early Xilinx slide deck showing planned integration of DeePhi technology
(click image to enlarge)

As explained in this EE Journal analysis of the acquisition, DeePhi optimized its algorithms for the Zynq-7000 before moving on to the Zynq UltraScale+ MPSoC. As suggested in the chart above from a 2018 Xilinx slide deck (PDF), the DeePhi technology, including pruning, quantizer, compiler, runtime, models, and FPGA IP, went on to form the bulk of what would later be marketed as the Xilinx AI Platform. It forms almost all of the edge/embedded side, which is known as the Xilinx Edge AI Platform.

Xilinx Edge AI Platform DPU architecture (left) and available Xilinx AI Platform models
(click images to enlarge)

The FPGA IP component in the Xilinx Edge AI Platform is known as the Deep-learning Processing Unit (DPU). The hardware block is optimized to work with Xilinx FPGAs to accelerate AI algorithms with low latency.

The Xilinx Edge AI Platform supports AI frameworks including TensorFlow, Caffe, and Darknet, among others. Xilinx lists 18 available models for object, face, pedestrian, and ADAS-related recognition, classification, detection, estimation, and localization (see chart above).

The Xilinx Edge AI Platform includes a Linux-ready DNNDK (Deep Neural Network Development Kit) for deploying AI inference on Xilinx Edge AI platforms with a lightweight C/C++ API. DNNDK’s DEep ComprEssioN Tool (DECENT) “can reduce model complexity by 5x to 50x with minimal accuracy impact,” says Xilinx. There’s also a Deep Neural Network Compiler (DNNC), a Neural Network Runtime (N2Cube), and a profiler.
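DECENT is a closed toolchain, but one of the standard compression steps it represents can be illustrated simply: quantization, i.e. mapping 32-bit floating-point weights to 8-bit integers with a shared scale factor, an immediate 4x size reduction before any pruning is applied. The snippet below is a generic, hypothetical sketch of symmetric int8 quantization, not the DECENT algorithm itself:

```python
# Generic symmetric int8 quantization, for illustration only;
# DECENT's actual quantization/pruning pipeline is more sophisticated
# (per-layer calibration, fine-tuning, etc.).

def quantize_int8(weights):
    """Map float weights to int8 codes in [-127, 127] with one shared scale."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(codes, scale):
    """Recover approximate float weights from int8 codes."""
    return [c * scale for c in codes]

weights = [0.50, -1.27, 0.02, 0.889]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q)        # each code fits in 1 byte instead of 4
print(max(abs(a - b) for a, b in zip(weights, restored)))
```

The inference engine then runs the cheap integer math and only needs the single `scale` constant to interpret results, which is why quantization maps so well onto FPGA DSP blocks.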

The datacenter version of the Xilinx AI Platform lacks the DPU, instead providing Xilinx’s xDNN (Xilinx Deep Neural Network Inference) FPGA architecture at the base FPGA IP level. Supported by a related xfDNN compiler and runtime, xDNN maps a variety of neural network frameworks onto the high-end VU9P Virtex UltraScale+ FPGA for datacenters.

Versal ACAP

Last October, Xilinx announced a major new Versal ACAP (adaptive compute acceleration platform) processor family. The heterogeneous accelerated Versal “is the first platform to combine software programmability with domain-specific hardware acceleration” and built-in adaptability via the ACAP architecture, says Xilinx.

Xilinx Versal

Built with a 7nm FinFET process, compared to 16nm for the Zynq UltraScale+, Versal will comprise six separate processor series, two of which will begin rolling out before the end of the year. The initial Versal Prime and Versal AI Core models, which are primarily aimed at datacenter and high-end edge AI devices, respectively, began sampling in June.

The Versal Prime, Premium, and HBM series processors target high-end datacenter and networking applications. The AI Core, AI Edge, and AI RF series target AI-enabled networking and edge devices and add an AI Engine block designed for low-latency AI inference.

The AI Engine appears to be based in part on the DeePhi and Xilinx Edge AI Platform technology. The AI Engine features 1.3GHz VLIW/SIMD vector processors deployable in a tile architecture. The cores communicate at “terabytes/sec” bandwidth with the other engines.

As detailed in this Versal slide deck (PDF), all the Versal processors feature dual 1.7GHz Cortex-A72 cores supported by an embedded Linux runtime and dual 750MHz Cortex-R5 cores supported by FreeRTOS.

Versal block diagram
(click image to enlarge)

The programmable logic component is referred to not as an FPGA, but as Versal Adaptable Engines. The logic supports “fine-grained parallel processing, data aggregation, and sensor fusion.” It also provides a programmable memory hierarchy with “high bandwidth, low latency data movement between the engines and I/O,” says Xilinx.

The Adaptable Engines offer 4x greater density per logic block, presumably compared to the UltraScale+. Separate from the programmable logic is a DSP Engines block with up to 1GHz performance designed for accelerating wireless, machine learning, and HPC workloads. As noted, select models also provide the AI Engine.

Tying all these pieces together is a multi-terabit-per-second network-on-chip (NoC) that memory-maps access to all resources for easier programmability. It also allows easy swapping of kernels and connectivity between different kernels.

The NoC works with a “Shell” component that includes a Platform Management Controller providing security and boot features. It also includes a scalable memory subsystem and host and I/O interfaces. The Versal works with the new Vitis unified software platform and is backward compatible with the Zynq UltraScale+.

The first two Versal versions are the currently documented Versal Prime and Versal AI Core. The AI Engine-enabled AI Core is equipped with 256KB of on-chip RAM with ECC and more than 1.9 million system logic cells. There are also more than 1,900 DSP engines optimized for high-precision floating point with low latency.

Xilinx Versal architecture (left) and Versal AI Core VCK190 eval kit
(click images to enlarge)

There’s already a Linux-powered Versal AI Core VCK190 eval kit. The AI Core is aimed at very high-end systems such as 5G infrastructure, automotive, and datacenter. We imagine, however, that most LinuxGizmos readers will be more interested in the upcoming, and currently undocumented, embedded AI Edge and AI RF platforms.

Further information

iWave’s iW-Rainbow G30D Zynq Ultrascale+ MPSoC Development Kit is available now at an undisclosed price. More information may be found on its product page. More on the Xilinx AI Platform may be found here, and more on the Versal processors may be found here.
