RTL code annotations by GPT
To construct the RTL-language dataset, we organize the data into four distinct levels: repository, file, module, and block. A detailed example is shown in the figure.
We employ a Chain-of-Thought (CoT) approach for RTL code annotation, leveraging GPT-4 and Claude to generate detailed comments, descriptions, and question-answer pairs.
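| Category | Module-level | Block-level | Repository-level |
|----------|--------------|-------------|------------------|
| Chip     | 5,471        | 36,955      | 84               |
| IP       | 12,863       | 20,101      | 183              |
| Module   | 28,901       | -           | 1,389            |
| RISC-V   | 2,116        | -           | 560              |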
The table illustrates the number of annotations at the module, block, and repository levels for various RTL categories.
The annotation download URL:
If you run into problems with this link, such as 'cannot unzip', 'cannot download', or 'the link is not valid', try this new link instead: https://huggingface.co/datasets/zeju-0727/DeepCirCuitX_Dataset
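As a hedged sketch of how the Hugging Face link above could be fetched programmatically: the helper names `repo_id_from_url` and `download_dataset` are hypothetical, and the example assumes the `huggingface_hub` package is installed and that the dataset is publicly readable. It uses `snapshot_download`, which is not necessarily how the authors intend the data to be retrieved.

```python
from urllib.parse import urlparse

# URL taken from the note above.
DATASET_URL = "https://huggingface.co/datasets/zeju-0727/DeepCirCuitX_Dataset"


def repo_id_from_url(url):
    """Extract the '<owner>/<name>' repo id from a Hugging Face dataset URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hugging Face dataset URL: {url}")
    return f"{parts[1]}/{parts[2]}"


def download_dataset(url=DATASET_URL, local_dir=None):
    """Download the whole dataset repository and return the local path.

    Hypothetical helper: requires network access and the huggingface_hub package.
    """
    # Imported lazily so repo_id_from_url works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id_from_url(url),
        repo_type="dataset",
        local_dir=local_dir,
    )
```

Calling `download_dataset(local_dir="DeepCircuitX")` would place the files under a local folder of that (arbitrary) name.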
The annotation test case download URL:
An example of our data structure:
chip/Communications_Processor/Design-of-reduced-latency-and-increased-throughput-Polar-Decoder
design_files
├── design_files/pe_1         // pe_1 is a Verilog file in the original code
│   ├── design_files/pe_1/intermediate_comment
│   │   ├── design_files/pe_1/intermediate_comment/pe_1_QA.json
│   │   ├── design_files/pe_1/intermediate_comment/pe_1_module.json
│   │   └── design_files/pe_1/intermediate_comment/pe_1_spec.json
│   ├── design_files/pe_1/pe_1.txt  // Module-level comment
│   ├── design_files/pe_1/spec
│   │   └── design_files/pe_1/spec/spec.txt   // File-level specification annotation
│   └── design_files/pe_1/pe_1.v  // File-level code
├── design_files/pe_2   ...
├── design_files/t_to_s_or_s_to_t   ...
├── design_files/sign_processing_unit   ...
├── design_files/half_adder_subtractor   ...
├── design_files/pe_1_modified_merge   ...
├── design_files/comparator_module   ...
├── design_files/full_adder_subtractor   ...
├── design_files/merged_pe_2   ...
TestBench_Files ...
Design-of-reduced-latency-and-increased-throughput-Polar-Decoder.txt  // Repo-level comment
The tree above shows the structure of the annotated Verilog code in the 'Design-of-reduced-latency-and-increased-throughput-Polar-Decoder' project.
The design_files folder contains individual Verilog files, with pe_1 serving as an example. Each module's source code (e.g., pe_1.v) is accompanied by various annotation files, such as intermediate comments, specifications, and a textual description (pe_1.txt). 
These annotations are organized into subdirectories like intermediate_comment and spec. This structure enables detailed documentation and analysis of the Verilog code for various modules across the project.
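The layout described above can be traversed with a short script. The following is a minimal sketch that assumes every module folder follows the `pe_1` naming pattern shown in the tree; the function names and the role labels (`code`, `module_comment`, and so on) are hypothetical and not part of the dataset. Nothing is assumed about the contents of the JSON files, so the sketch only locates them.

```python
from pathlib import Path


def collect_module_annotations(module_dir):
    """Map annotation roles to file paths for one module folder, e.g. design_files/pe_1."""
    module_dir = Path(module_dir)
    name = module_dir.name
    candidates = {
        "code": module_dir / f"{name}.v",
        "module_comment": module_dir / f"{name}.txt",
        "spec": module_dir / "spec" / "spec.txt",
        "qa_json": module_dir / "intermediate_comment" / f"{name}_QA.json",
        "module_json": module_dir / "intermediate_comment" / f"{name}_module.json",
        "spec_json": module_dir / "intermediate_comment" / f"{name}_spec.json",
    }
    # Keep only files that are actually present; a module may lack some annotations.
    return {role: path for role, path in candidates.items() if path.is_file()}


def index_repository(repo_dir):
    """Return {module_name: {role: path}} for every folder under <repo_dir>/design_files."""
    design_root = Path(repo_dir) / "design_files"
    index = {}
    if design_root.is_dir():
        for module_dir in sorted(p for p in design_root.iterdir() if p.is_dir()):
            index[module_dir.name] = collect_module_annotations(module_dir)
    return index
```

For example, `index_repository("chip/Communications_Processor/Design-of-reduced-latency-and-increased-throughput-Polar-Decoder")` would return one entry per module folder, each listing whichever of the six files exist on disk.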

