# RTL code annotations by GPT

To construct the RTL-language dataset, we organize the data into four distinct levels: repository, file, module, and block. A detailed example is shown in the figure at the end of this page.

We employ a Chain of Thought (CoT) approach for RTL code annotation, leveraging GPT-4 and Claude to generate detailed comments, descriptions, and question-answer pairs.

<table><thead><tr><th width="172">RTL Category</th><th width="135">Module-Level Annotations</th><th width="207">Block-Level Annotations</th><th>Repository-Level Annotations</th></tr></thead><tbody><tr><td><strong>Chip</strong></td><td>5,471</td><td>36,955</td><td>84</td></tr><tr><td><strong>IP</strong></td><td>12,863</td><td>20,101</td><td>183</td></tr><tr><td><strong>Module</strong></td><td>28,901</td><td>-</td><td>1,389</td></tr><tr><td><strong>RISC-V</strong></td><td>2,116</td><td>-</td><td>560</td></tr></tbody></table>

The table illustrates the number of annotations at the module, block, and repository levels for various RTL categories.
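As a quick sanity check on the table, the per-level totals can be computed with a minimal sketch like the one below. The dictionary layout and the `level_totals` helper are illustrative assumptions, not part of the dataset tooling, and a `-` cell is treated here as "no count reported" (`None`).

```python
# Counts copied from the table above; None stands for a "-" cell
# (assumed to mean that no count is reported for that level).
annotation_counts = {
    "Chip": {"module": 5471, "block": 36955, "repository": 84},
    "IP": {"module": 12863, "block": 20101, "repository": 183},
    "Module": {"module": 28901, "block": None, "repository": 1389},
    "RISC-V": {"module": 2116, "block": None, "repository": 560},
}


def level_totals(counts):
    """Sum the reported annotation counts per level across all RTL categories."""
    totals = {}
    for per_level in counts.values():
        for level, n in per_level.items():
            if n is not None:
                totals[level] = totals.get(level, 0) + n
    return totals


print(level_totals(annotation_counts))
# {'module': 49351, 'block': 57056, 'repository': 2216}
```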

*<mark style="color:red;">**The annotation download URL:**</mark>*

{% embed url="<https://drive.google.com/file/d/14tntC-N-kCGbYkTGD3mZe6H18Mf449PV/view?usp=sharing>" %}
All RTL code with the corresponding annotations at each level
{% endembed %}

<sup>*<mark style="color:$info;">**If you run into problems with this link (for example, the file cannot be downloaded or unzipped, or the link is no longer valid), try this alternative link:**</mark>*</sup>  [<sup>*<mark style="color:$primary;">**https://huggingface.co/datasets/zeju-0727/DeepCirCuitX\_Dataset**</mark>*</sup>](https://huggingface.co/datasets/zeju-0727/DeepCirCuitX_Dataset)
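For the Hugging Face mirror, one possible way to fetch the files programmatically is sketched below. It assumes the `huggingface_hub` package is installed; the `download_dataset` helper and the default `local_dir` name are hypothetical choices for illustration, and only the repository ID comes from the link above.

```python
# Repository ID taken from the Hugging Face link above.
REPO_ID = "zeju-0727/DeepCirCuitX_Dataset"


def download_dataset(local_dir="DeepCircuitX_Dataset"):
    """Download a snapshot of the dataset repository and return its local path.

    Assumes `pip install huggingface_hub`; the import is deferred so that this
    module can be loaded even when the package is not installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the link points at a dataset repo, not a model
        local_dir=local_dir,
    )
```

Calling `download_dataset()` requires network access and enough disk space for the full dataset.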

*<mark style="color:red;">**The annotation test case download URL:**</mark>*

{% embed url="<https://drive.google.com/file/d/1eBEwESWOCGFDfqgKqDouChRGvqrrsy24/view?usp=sharing>" %}
One complete example of our annotation data (with RTL code)
{% endembed %}

*<mark style="color:red;">**One example of our data structure:**</mark>*

chip/Communications\_Processor/Design-of-reduced-latency-and-increased-throughput-Polar-Decoder

```
design_files
├── design_files/pe_1         // pe_1 is a Verilog file (pe_1.v) in the original code
│   ├── design_files/pe_1/intermediate_comment
│   │   ├── design_files/pe_1/intermediate_comment/pe_1_QA.json
│   │   ├── design_files/pe_1/intermediate_comment/pe_1_module.json
│   │   └── design_files/pe_1/intermediate_comment/pe_1_spec.json
│   ├── design_files/pe_1/pe_1.txt  // Module-level comment
│   ├── design_files/pe_1/spec
│   │   └── design_files/pe_1/spec/spec.txt   // File-level specification annotation
│   └── design_files/pe_1/pe_1.v  // File-level code
├── design_files/pe_2   ...
├── design_files/t_to_s_or_s_to_t   ...
├── design_files/sign_processing_unit   ...
├── design_files/half_adder_subtractor   ...
├── design_files/pe_1_modified_merge   ...
├── design_files/comparator_module   ...
├── design_files/full_adder_subtractor   ...
├── design_files/merged_pe_2   ...
TestBench_Files ...

Design-of-reduced-latency-and-increased-throughput-Polar-Decoder.txt  // Repo-level comment
```

**The tree above shows the structure of the annotated Verilog code in the 'Design-of-reduced-latency-and-increased-throughput-Polar-Decoder' project.**

**The `design_files` folder contains one subfolder per Verilog file, with `pe_1` serving as an example. Each file's source code (e.g., `pe_1.v`) is accompanied by annotation files: intermediate comments, a specification, and a textual description (`pe_1.txt`).**

**The intermediate comments and the specification are stored in the `intermediate_comment` and `spec` subdirectories. This structure enables detailed documentation and analysis of the Verilog code for every module in the project.**
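The layout described above can be traversed with a short script. The following is a minimal sketch that assumes every module folder follows the same naming pattern as `pe_1` in the tree; the helper names `collect_module_annotations` and `load_json` are hypothetical, and nothing is assumed about the schema inside the JSON files.

```python
import json
from pathlib import Path


def collect_module_annotations(design_files_dir):
    """Map each module folder under design_files to the annotation files found in it.

    Assumes the naming pattern shown for pe_1: <name>/<name>.v, <name>/<name>.txt,
    <name>/spec/spec.txt and <name>/intermediate_comment/<name>_{QA,module,spec}.json.
    """
    root = Path(design_files_dir)
    modules = {}
    for module_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        name = module_dir.name
        comment_dir = module_dir / "intermediate_comment"
        candidates = {
            "code": module_dir / f"{name}.v",
            "module_comment": module_dir / f"{name}.txt",
            "spec": module_dir / "spec" / "spec.txt",
            "qa_json": comment_dir / f"{name}_QA.json",
            "module_json": comment_dir / f"{name}_module.json",
            "spec_json": comment_dir / f"{name}_spec.json",
        }
        # Keep only the files that actually exist on disk.
        modules[name] = {key: path for key, path in candidates.items() if path.is_file()}
    return modules


def load_json(path):
    """Load one annotation JSON file without assuming anything about its schema."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

For example, `collect_module_annotations("design_files")["pe_1"]["code"]` would point at `design_files/pe_1/pe_1.v` if that file is present; missing files are simply left out of the result.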

<figure><img src="https://204291402-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqpjfvyQt0RAeOzMVWp4g%2Fuploads%2FjK7Qhc2yeh8drXFI79h3%2Fimage.png?alt=media&#x26;token=3dd341fb-2cf0-4fbf-9b3e-c89028c9334d" alt=""><figcaption><p>Illustration of the dataset repository structure with multi-level annotations</p></figcaption></figure>

