Optimizing DFT Test Coverage for AI Accelerators and Compute Chips


Vikas Nagaraj

Abstract

The density of transistors, heterogeneous cores, and chiplet-based integrations packed into AI accelerators and compute chips at advanced nodes imposes steep test and reliability requirements. This paper presents a practical approach to Design for Test (DFT) coverage optimisation when resources are constrained in cost and time. It begins with the principles of scan, traditional fault models, and access standards (IEEE 1149.1/1500/1687), then proceeds to current developments, including hierarchical DFT, pattern compression, cell-aware fault models, and power/thermal-aware testing. Challenges specific to AI silicon, such as massively parallel operation, large memory hierarchies, multiple clock/power domains, and 3D packaging, are mapped to concrete techniques: full/partial scan with at-speed capture, MBIST/LBIST, boundary-scan reuse, TSV and micro-bump testing, and stress-controlled scheduling. An overview of the key steps covers initial RTL/floorplan planning, coverage targets (e.g., DPPM), fault-model selection, ML-assisted ATPG, hierarchical IP wrappers, chiplet/3D integration, coverage grading, and production correlation. Case studies show improvements in yield and test time for high-volume accelerator and chiplet processors. Looking ahead, multi-stage access networks will be needed for heterogeneous integration and wafer-to-wafer stacking; in-field/online DFT will transition to continuous health monitoring and self-repair; and adaptive pattern generation will become possible through AI-driven automation of defect analytics. The result is a practical roadmap for high-yield, reliable, and cost-effective testing of next-generation AI compute silicon.
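To make the link between coverage targets and DPPM concrete, the following minimal Python sketch (illustrative only, not taken from the paper; the yield and coverage numbers are hypothetical) estimates the shipped-defect escape rate from process yield and fault coverage using the widely cited Williams-Brown model, DL = 1 - Y^(1 - T).

```python
# Illustrative sketch: relating fault coverage to expected escapes (DPPM)
# via the Williams-Brown defect-level model: DL = 1 - Y**(1 - T),
# where Y is process yield and T is fault coverage, both in [0, 1].

def defect_level(process_yield: float, fault_coverage: float) -> float:
    """Expected fraction of shipped parts that are defective."""
    return 1.0 - process_yield ** (1.0 - fault_coverage)

def dppm(process_yield: float, fault_coverage: float) -> float:
    """Defective parts per million shipped."""
    return 1e6 * defect_level(process_yield, fault_coverage)

if __name__ == "__main__":
    # Hypothetical example: 80% yield, comparing two coverage targets.
    for cov in (0.95, 0.995):
        print(f"coverage={cov:.3f} -> ~{dppm(0.80, cov):,.0f} DPPM")
```

Under these assumed numbers, raising coverage from 95% to 99.5% cuts the estimated escape rate by roughly an order of magnitude (about 11,100 DPPM down to about 1,100 DPPM), which is the kind of trade-off the coverage-target step must quantify against pattern count and test time.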
