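Guosheng Securities, Communications Industry In-Depth Report: The ASIC Road for AI Compute, Starting from Ethereum Mining Rigs (2024-03-20, 26 pages)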
2024-03-20

AI ASIC AI GPU ASIC ASIC GPU GPU ASIC CPU GPU ASIC ASIC ASIC GPT Transformer GPU Transformer ASIC TPU Groq ASIC GPU Transformer ASIC ASIC 3 ASIC POS ETH ASIC, ASIC ASIC CUDA ASIC AGI AI AI CUDA ASIC ASIC AI ASIC ASIC, AGI ASIC ASIC TSMC INTC ASIC MRVL AVGO ASIC CAN AI AI ASIC

S0680519010002
S0680519050002
S0680522120003

1. 2024-03-17
2. GTC, 2024-03-10
3. Claude 3, 2024-03-05

1
2 ASIC
2.1 ASIC
2.2
3 ASIC
3.1 GPU ASIC
3.2 TPU
3.3 Groq
3.4 Wafer-Scaling+
4 ASIC
4.1 CPU
4.2
4.3 ASIC
5
5.1
5.2
6 ASIC GPU
7
8

Figure 1: CPU GPU FPGA ASIC
Figure 2: GPU CPU
Figure 3: GPU
Figure 4: NVIDIA H100 Tensor Core FP8
Figure 5: A100
Figure 6
Figure 7: TPU V1
Figure 8: TPU V1-V2
Figure 9: TPU V4
Figure 10: Groq
Figure 11: Groq
Figure 12: TSP Superlane
Figure 13: 1366I
Figure 14: NVIDIA CMP HX GPU
Figure 15: CUDA
Figure 16: TSP Superlane
Figure 17: Groq
Figure 18
Figure 19: ASIC AI
Figure 20

1

CPU GPU ASIC ASIC POW POS ASIC, ASIC A100 H100 AI Transformer ASIC Transformer ASIC Tensor Core CUDA AI CPU, Transformer ASIC Wafer-Scaling, LPU Groq ASIC
1. Transformer ASIC
2. AI ASIC TSMC INTC ASIC MRVL AVGO ASIC CAN

2 ASIC

2.1 ASIC

ASIC Application Specific Integrated Circuit ASIC TPU Ascend 910B AI ASIC ASIC ASIC TPU DPU NPU TPU AI DPU NPU AI CNN SoC CPU GPU FPGA ASIC CPU GPU ASIC FPGA ASIC

Figure 1: CPU GPU FPGA ASIC
CPU GPU FPGA ASIC CPU GPU CPU GPU
Source: D-central

2.2

CPU GPU ASIC, 2023 CNN GPT OpenAI CNN Transformer 2023 2w ASIC ASIC ASIC

3 ASIC

3.1 GPU ASIC

GPU CPU GPU ASIC CPU CPU AI GPU ASIC

Figure 2: GPU CPU

GPU CPU GPU GPU GPU GPU Tensor Core 2017-05 Volta Tensor Core Pascal 12 TFLOPS 6 Volta Pascal 3

Figure 3: GPU

Tensor Core

Figure 4: NVIDIA H100 Tensor Core FP8

Tensor Core GPU Transformer CUDA, GPU CUDA

3.2 TPU

Transformer ASIC TPU TPU Systolic Array 1982 CPU 2016 TPU 2019 CU+TU TensorCore 3D
TensorCore TensorCore NxN N TensorCore TensorCore TensorCore GPU A100 TensorCore TU 16 INT32 16 FP32 8 FP64 CU Cycle INT32 INT32 ALU TU 1 16 ALU cycle TU 16

Figure 5: A100

TU 16x16 TU 2*16^3 16 INT32 ALU 512 cycle TU

Figure 6

CU TU TU 16x16 TU cycle A B TU, B cycle A B TU TU(1,1) a11 b11 TU cycle A B TU A B TU TU(1,1) a12 b21 TU(1,1) TU(2,1) TU(1,2) TU 16 cycle A B TU TU(1,1) 16 AxB (1,1) 17 TU(1,1) TU(1,2) TU(2,1) cycle 16 cycle 16x16 TU 16 cycle FPU ALU TU TU GPU TU cycle CPU TU DFF

TPU TPU V5e TPU Transformer TPU V1 V4 ASIC TPU V1 TPU V2 MXU DDR HBM MXU

Figure 7: TPU V1
Source: Jouppi, Norman P., et al. In-datacenter performance analysis of a tensor processing unit.

TPU V2 V1 ASIC

Figure 8: TPU V1-V2
Source: Thomas Norrie, Nishant Patil, Doe Hyun Yoon, et al. The design process for Google's training chips: TPUv2 and TPUv3.

TPU V2 V3, V4 Wafer-Scaling MXU TPU V4 Tensor Core 8 MXU V2 4

Figure 9: TPU V4
Source: Google Cloud

TPU ASIC NV Tensor Core TPU 0, ASIC ASIC

3.3 Groq

TPU, ASIC MXU ASIC LLM ASIC Groq Groq TPU 2024-02 Groq Meta Llama 2 500 Token/s GPT-4 40 Token/s

Figure 10: Groq

Groq 14 nm 4 nm H100 ASIC Groq Groq SRAM MXU SRAM Token Token

Figure 11: Groq

Groq Groq, SRAM Groq CNN Groq

Figure 12: TSP Superlane

Groq Groq ASIC MXU SRAM, DDR7 MXU HBM Token IO IO

3.4 Wafer-Scaling+

ASIC GPU TPU Streaming Processing TPU Groq ASIC Wafer-Scaling Sam Altman Cerebras MXU Groq MXU SRAM SRAM HBM ASIC

4 ASIC

ASIC GPU VS ASIC

4.1 CPU

ASIC CPU 2009-01 CPU CPU 2010 SHA256 GPU 2010-2012 GPU CPU 2011 FPGA FPGA 2012 SHA256 ASIC 2013 ASIC ASIC, ASIC GPU ASIC

Figure 13: 1366I

CPU GPU, GPU ASIC, 2019 GPU CPU AI. ASIC,

4.2

2014 2022-09 POS 8 POW ICO 2014 2015-07-30 Frontier BTC POW POS ASIC ASIC Ethash PoW ASIC AI

Figure 14: NVIDIA CMP HX GPU

ASIC ASIC, ETH POS ASIC

4.3 ASIC

ASIC POS ETH ASIC, CNN ASIC ASIC. ASIC E H100 P ASIC, AI ASIC Transformer Transformer ASIC POW Transformer ASIC

5

ASIC AI CUDA AI CUDA,

Figure 15: CUDA

Transformer ASIC

5.1

ASIC TPU V2 LLM ASIC Groq
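The cycle arithmetic in section 3.2 (a 16x16 matrix multiply costing 2*16^3 operations, which takes 512 cycles on 16 INT32 ALUs, versus roughly 16 cycles on a 16x16 TU) can be checked with a minimal Python sketch. The function names `scalar_alu_cycles` and `tensor_unit_matmul` are hypothetical, and the TU model is a deliberate simplification: it assumes every cell of an NxN grid performs one multiply-accumulate per cycle and ignores the pipeline fill and input skew that a real systolic array would add.

```python
def scalar_alu_cycles(n, num_alus):
    """Cycles for an n x n matmul on scalar ALUs.

    Assumes one operation per ALU per cycle and counts each multiply and
    each add separately, giving 2 * n**3 operations in total.
    """
    total_ops = 2 * n ** 3
    return total_ops // num_alus


def tensor_unit_matmul(a, b):
    """Simplified model of an n x n tensor unit (TU).

    Each cell (i, j) holds one accumulator. In cycle k, every cell adds
    a[i][k] * b[k][j] to its accumulator, so all n*n cells work in
    parallel and the product is complete after n cycles.
    Returns (result_matrix, cycle_count).
    """
    n = len(a)
    acc = [[0] * n for _ in range(n)]
    cycles = 0
    for k in range(n):            # one loop iteration models one cycle
        for i in range(n):        # the two inner loops model parallel cells
            for j in range(n):
                acc[i][j] += a[i][k] * b[k][j]
        cycles += 1
    return acc, cycles


if __name__ == "__main__":
    n = 16
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    _, tu_cycles = tensor_unit_matmul(identity, identity)
    print("scalar ALU cycles:", scalar_alu_cycles(n, 16))
    print("tensor unit cycles:", tu_cycles)
```

Under these assumptions the gap between the two cycle counts comes entirely from parallelism: the TU model has 256 multiply-accumulate cells active per cycle, while the scalar model has 16 ALUs each doing a single operation.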
Figure 16: TSP Superlane
Source: Groq

Transformer Transformer ASIC

5.2 Groq 2020

Figure 17: Groq
2016: Google TPU, Jonathan Ross, TPU, Groq
2020-01: GROQ ROCKS NEURAL NETWORKS
2020-07: Think Fast: A Tensor Streaming Processor (TSP) for Accelerating Deep Learning Workloads
2022-06: A Software-defined Tensor Streaming Multiprocessor for Large-scale Machine Learning
2023-08: The Groq Software-defined Scale-out Tensor Streaming Multiprocessor
2024-02: Mistral-MoE 8x7B, 500 tokens/s
Source: Groq

Groq

Figure 18

Groq AI CUDA CUDA CUDA Tom's Hardware CUDA

6 ASIC GPU

ASIC, ASIC ASIC ASIC FY24Q1 FY24Q1 33 46% 2 AI FY24 25% 35% 100 70% AI 20% 10% PAM4 DSP FY24Q4 cloud optimized silicon 2024 10 AI ASIC ASIC AI ASIC AI CUDA CUDA Transformer. AI ASIC ASIC ASIC AI AI AGI GPU ASIC, AGI

Figure 19: ASIC AI

7

ASIC. ASIC. ASIC ASIC ASIC ASIC ASIC AI ASIC TSMC INTC ASIC ASIC ASIC, IP ASIC MRVL AVGO ASIC ASIC GPU ASIC ASIC ASIC CAN ASIC ASIC

Figure 20

Ticker     Market cap   2024 net profit   2024 PE
ASIC
MRVL            646.0             12.13      53.3
ASIC
AVGO          5,726.0            228.34      25.1
TSM           7,104.0            313.75      22.4
Fab
INTC          1,802.8             55.82      34.0
PC
688981        3,812.1             67.02      56.0
002837          201.2              5.24      38.4
002179          769.2             42.32      18.2
300499           43.7              1.76      24.8
ASIC
CAN               3.1              0.87       3.6
688256          812.5                 /
688521          214.0              0.53     403.8
601138        4,968.0            259.42      19.2
000628          306.5                 /
603019          780.1             24.39      32.0

Source: Wind, BBC NEWS
2024-03-18; non-GAAP; A; Wind

8

AI AI ASIC
1. AI AI AI ASIC
2. AI Transformer ASIC ASIC
3. ASIC, ASIC