FPGA Development Study Notes

September 18: never forget the national humiliation!

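First, some notes on a blog post I read about writing C++ with a hardware mindset so that it fits the rules of HLS (High-Level Synthesis).

I had been misled by what HLS promises. I used to think that once the C/C++ was written, HLS would simply convert it into an RTL circuit. The real question is whether the C/C++ can be converted into RTL that meets actual engineering requirements (area, speed), so there is a gap in the conversion. In terms of language features, C/C++ and Verilog are fundamentally different.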

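C++ executes sequentially, while an HDL (Hardware Description Language) executes in parallel. An HDL describes a hardware circuit, and once the circuit is powered on, all of its units work at the same time; the parallel nature of HDL reflects exactly this.

C++ is static, while HDL is dynamic. "Static" means that in C++ we only need to think about the algorithm itself. When describing an algorithm in HDL, we have to think about how to map it onto a circuit and what the circuit should do in every clock cycle. The circuit works under a clock and data moves with the clock, so the clock period is the basic unit of time, or the basic time unit for performing one operation.

C++ has no notion of timing, whereas timing is a defining feature of HDL (sequential logic, for example). So we cannot only consider how the algorithm is described in C++; we also have to consider pipelining so that data flows between processing units, and the designer has to manage how many clock cycles each processing unit needs. That in turn means considering the gate delay along the critical (longest) path.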

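C++ and HDL do have similarities, though. C++ executes sequentially, and inside a finite state machine HDL also proceeds step by step. A C++ for loop, for instance, can be translated into a state machine that enters different states and executes different logic in each. Loops are expensive in time. Recall that when describing a state machine in HDL we have to consider the state transition conditions and the number of clock cycles each state lasts. For a for loop, entering and exiting the loop each take one clock cycle, and the number of cycles per iteration depends on the operations in the loop body. The total cycle count therefore depends on the iteration count, which is why the maximum number of iterations should be a constant whenever possible.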
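As a rough illustration of the loop-to-state-machine mapping described above, here is a minimal hand-written Verilog sketch. It is not the output of any HLS tool, and the module name loop_fsm, its ports, and the parameter N are all hypothetical. It sums the integers 0 to N-1 with one iteration per clock cycle, plus one cycle to enter the loop and one to leave it.

```verilog
// Hypothetical sketch: a counted loop (sum of 0..N-1) written as a state machine.
module loop_fsm #(parameter N = 8) (
    input  wire        clk,
    input  wire        rst,     // synchronous reset
    input  wire        start,
    output reg  [15:0] sum,
    output reg         done
);
    localparam IDLE = 2'd0, RUN = 2'd1, FINISH = 2'd2;
    reg [1:0]  state;
    reg [15:0] i;               // loop counter

    always @(posedge clk) begin
        if (rst) begin
            state <= IDLE;
            sum   <= 16'd0;
            i     <= 16'd0;
            done  <= 1'b0;
        end else begin
            case (state)
                IDLE: begin
                    done <= 1'b0;
                    if (start) begin        // entering the loop costs one cycle
                        i     <= 16'd0;
                        sum   <= 16'd0;
                        state <= RUN;
                    end
                end
                RUN: begin                  // one loop iteration per clock cycle
                    sum <= sum + i;
                    i   <= i + 16'd1;
                    if (i == N - 1) state <= FINISH;
                end
                FINISH: begin               // leaving the loop costs one cycle
                    done  <= 1'b1;
                    state <= IDLE;
                end
                default: state <= IDLE;
            endcase
        end
    end
endmodule
```

Because N is a constant, the number of cycles the loop takes is known before the circuit runs, which is the reason for preferring constant loop bounds.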

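Arrays are the most commonly used data type in C++, and they occupy storage. The memory elements in an FPGA are registers and RAM, and an array ends up mapped to one of the two. When describing a RAM in HDL, we specify its width and depth so that the tool can allocate fixed storage in the FPGA.

C++ supports dynamic arrays, but HLS does not. HLS requires arrays in C++ to have a fixed size, which matches the HDL requirement.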
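To make the "fixed width and depth" point concrete, here is a small sketch of a single-port synchronous RAM in Verilog. The module name simple_ram and the parameter names are assumptions chosen for illustration; whether a tool maps this to block RAM or to registers depends on the tool and the device.

```verilog
// Hypothetical sketch: a RAM whose width and depth are fixed at compile time.
module simple_ram #(
    parameter WIDTH     = 8,    // bits per word
    parameter DEPTH     = 256,  // number of words
    parameter ADDR_BITS = 8     // must satisfy 2**ADDR_BITS >= DEPTH
) (
    input  wire                 clk,
    input  wire                 we,     // write enable
    input  wire [ADDR_BITS-1:0] addr,
    input  wire [WIDTH-1:0]     din,
    output reg  [WIDTH-1:0]     dout
);
    // The storage itself: DEPTH words of WIDTH bits each.
    reg [WIDTH-1:0] mem [0:DEPTH-1];

    always @(posedge clk) begin
        if (we)
            mem[addr] <= din;
        dout <= mem[addr];      // registered read (returns the old word during a write)
    end
endmodule
```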

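Vitis HLS converts C++ into RTL in three main steps: scheduling, binding, and state extraction. Scheduling decides when each operation happens, that is, which operations execute in each clock cycle. Binding decides which hardware resources implement each operation. State extraction is what the name says: it extracts a finite state machine from the C++ code.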

Getting started

Build a circuit with no inputs and one output. That output should always drive 1 (or logic high).

module top_module( output one );

// Insert your code here
    assign one = 1'b1;

endmodule

Zero

Build a circuit with no inputs and one output that outputs a constant 0.

module top_module(
    output zero
);// Module body starts after semicolon
    assign zero = 1'b0;  // drive the output explicitly instead of leaving it undriven
endmodule

Verilog Language

simple wire

Create a module with one input and one output that behaves like a wire.

Unlike physical wires, wires (and other signals) in Verilog are directional. This means information flows in only one direction, from (usually one) source to the sinks (The source is also often called a driver that drives a value onto a wire). In a Verilog “continuous assignment” (assign left_side = right_side;), the value of the signal on the right side is driven onto the wire on the left side. The assignment is “continuous” because the assignment continues all the time even if the right side’s value changes. A continuous assignment is not a one-time event.

The ports on a module also have a direction (usually input or output). An input port is driven by something from outside the module, while an output port drives something outside. When viewed from inside the module, an input port is a driver or source, while an output port is a sink.

The diagram below illustrates how each part of the circuit corresponds to each bit of Verilog code. The module and port declarations create the black portions of the circuit. Your task is to create a wire (in green) by adding an assign statement to connect in to out. The parts outside the box are not your concern, but you should know that your circuit is tested by connecting signals from our test harness to the ports on your top_module.

module top_module( input in, output out );
    assign out = in;
endmodule

wire 4

Create a module with 3 inputs and 4 outputs that behaves like wires that makes these connections:

a -> w
b -> x
b -> y
c -> z

The diagram below illustrates how each part of the circuit corresponds to each bit of Verilog code. From outside the module, there are three input ports and four output ports.

When you have multiple assign statements, the order in which they appear in the code does not matter. Unlike a programming language, assign statements (“continuous assignments”) describe connections between things, not the action of copying a value from one thing to another.

One potential source of confusion that should perhaps be clarified now: The green arrows here represent connections between wires, but are not wires in themselves. The module itself already has 7 wires declared (named a, b, c, w, x, y, and z). This is because input and output declarations actually declare a wire unless otherwise specified. Writing input wire a is the same as input a. Thus, the assign statements are not creating wires, they are creating the connections between the 7 wires that already exist.

module top_module( 
    input a,b,c,
    output w,x,y,z );
    assign w = a;
    assign x = b;
    assign y = b;
    assign z = c;
endmodule

Not gate

Create a module that implements a NOT gate.

This circuit is similar to wire, but with a slight difference. When making the connection from the wire in to the wire out we’re going to implement an inverter (or “NOT-gate”) instead of a plain wire.

Use an assign statement. The assign statement will continuously drive the inverse of in onto wire out.

module top_module( input in, output out );
    assign out = ~in;
endmodule

Andgate

Create a module that implements an AND gate.

This circuit now has three wires (a, b, and out). Wires a and b already have values driven onto them by the input ports. But wire out currently is not driven by anything. Write an assign statement that drives out with the AND of signals a and b.

Note that this circuit is very similar to the NOT gate, just with one more input. If it sounds different, it’s because I’ve started describing signals as being driven (has a known value determined by something attached to it) or not driven by something. Input wires are driven by something outside the module. assign statements will drive a logic level onto a wire. As you might expect, a wire cannot have more than one driver (what is its logic level if there is?), and a wire that has no drivers will have an undefined value (often treated as 0 when synthesizing hardware).

module top_module( 
    input a, 
    input b, 
    output out );
    assign out = a&b;
endmodule

Norgate

Create a module that implements a NOR gate. A NOR gate is an OR gate with its output inverted. A NOR function needs two operators when written in Verilog.

An assign statement drives a wire (or “net”, as it’s more formally called) with a value. This value can be as complex a function as you want, as long as it’s a combinational (i.e., memory-less, with no hidden state) function. An assign statement is a continuous assignment because the output is “recomputed” whenever any of its inputs change, forever, much like a simple logic gate.

module top_module( 
    input a, 
    input b, 
    output out );
    assign out = ~(a | b);  // bitwise OR; gives the same result as || for 1-bit signals
endmodule

Xnorgate

Create a module that implements an XNOR gate.

module top_module( 
    input a, 
    input b, 
    output out );
    assign out = ~(a^b);
endmodule

Declaring wires

The circuits so far have been simple enough that the outputs are simple functions of the inputs. As circuits become more complex, you will need wires to connect internal components together. When you need to use a wire, you should declare it in the body of the module, somewhere before it is first used. (In the future, you will encounter more types of signals and variables that are also declared the same way, but for now, we’ll start with a signal of type wire).

Example:

module top_module (
    input in,              // Declare an input wire named "in"
    output out             // Declare an output wire named "out"
);

    wire not_in;           // Declare a wire named "not_in"

    assign out = ~not_in;  // Assign a value to out (create a NOT gate).
    assign not_in = ~in;   // Assign a value to not_in (create another NOT gate).

endmodule   // End of module "top_module"

Two NOT gates are created using two assign statements. Note that it doesn't matter which of the NOT gates you create first: you still end up with the same circuit. In other words, the order of the statements does not matter.

practice

Implement the following circuit. Create two intermediate wires (named anything you want) to connect the AND and OR gates together. Note that the wire that feeds the NOT gate is really wire out, so you do not necessarily need to declare a third wire here. Notice how wires are driven by exactly one source (output of a gate), but can feed multiple inputs.

If you’re following the circuit structure in the diagram, you should end up with four assign statements, as there are four signals that need a value assigned.

`default_nettype none
module top_module(
    input a,
    input b,
    input c,
    input d,
    output out,
    output out_n   ); 
    wire one_one;
    wire one_two;// the two wires on the left (AND gate outputs)
    wire two;// the single wire on the right (OR gate output)
    
    assign one_one = a&b;
    assign one_two = c&d;
    assign two = one_one || one_two;
    assign out = two;
    assign out_n = ~two;
endmodule

7458 chip

The 7458 is a chip with four AND gates and two OR gates. This problem is slightly more complex than 7420.

Create a module with the same functionality as the 7458 chip. It has 10 inputs and 2 outputs. You may choose to use an assign statement to drive each of the output wires, or you may choose to declare (four) wires for use as intermediate signals, where each internal wire is driven by the output of one of the AND gates. For extra practice, try it both ways.

module top_module ( 
    input p1a, p1b, p1c, p1d, p1e, p1f,
    output p1y,
    input p2a, p2b, p2c, p2d,
    output p2y );
    
    wire a2_b2;// wire driven by p2a AND p2b
    wire c2_d2;
    wire c1_b1_a1;
    wire d1_e1_f1;
    
    assign a2_b2 = p2a & p2b;
    assign c2_d2 = p2c & p2d;
    assign c1_b1_a1 = p1a & p1b & p1c;
    assign d1_e1_f1 = p1d & p1e & p1f;
    
    assign p1y = c1_b1_a1 | d1_e1_f1;
    assign p2y = a2_b2 | c2_d2;  // the two AND outputs feed an OR gate, not another AND

endmodule

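Vector0 (a vector is essentially an array that bundles several wires together)

Vectors are used to group related signals under one name so that they are easier to manipulate. For example, wire [7:0] w; declares an 8-bit vector named w that is functionally equivalent to having 8 separate wires.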

Notice that the declaration of a vector places the dimensions before the name of the vector, which is unusual compared to C syntax. However, the part select has the dimensions after the vector name as you would expect.

wire [99:0] my_vector;      // Declare a 100-element vector
assign out = my_vector[10]; // Part-select one bit out of the vector

Build a circuit that has one 3-bit input, then outputs the same vector, and also splits it into three separate 1-bit outputs. Connect output o0 to the input vector’s position 0, o1 to position 1, etc.

In a diagram, a tick mark with a number next to it indicates the width of the vector (or “bus”), rather than drawing a separate line for each bit in the vector.

module top_module ( 
    input wire [2:0] vec,
    output wire [2:0] outv,
    output wire o2,
    output wire o1,
    output wire o0  ); // Module body starts after module declaration
    assign outv[0] = o0;
    assign outv[1] = o1;
    assign outv[2] = o2;
    
    assign o0 = vec[0];
    assign o1 = vec[1];
    assign o2 = vec[2];
endmodule

Vector1

Declaring Vectors

Vectors must be declared:

type [upper:lower] vector_name;

type specifies the datatype of the vector. This is usually wire or reg. If you are declaring an input or output port, the type can additionally include the port type (e.g., input or output) as well. Some examples:

wire [7:0] w;         // 8-bit wire
reg  [4:1] x;         // 4-bit reg
output reg [0:0] y;   // 1-bit reg that is also an output port (this is still a vector)
input wire [3:-2] z;  // 6-bit wire input (negative ranges are allowed)
output [3:0] a;       // 4-bit output wire. Type is 'wire' unless specified otherwise.
wire [0:7] b;         // 8-bit wire where b[0] is the most-significant bit.
The examples above show the data type being declared explicitly (except a, which defaults to wire).

The endianness (or, informally, "direction") of a vector is whether the least significant bit has a lower index (little-endian, e.g., [3:0]) or a higher index (big-endian, e.g., [0:3]). In Verilog, once a vector is declared with a particular endianness, it must always be used the same way. e.g., writing vec[0:3] when vec is declared wire [3:0] vec; is illegal. In other words, if a vector is declared little-endian, every later use must follow the same order: write vec[3:0], not vec[0:3]. Being consistent with endianness is good practice, as weird bugs occur if vectors of different endianness are assigned or used together. Note that little-endian here means [3:0], where index 0 is the least significant bit (the lowest bit gets the lowest index).
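A small simulation-only sketch can make the index direction concrete. The module name endian_demo and the signal names le and be are hypothetical. The same constant is loaded into a [3:0] vector and a [0:3] vector, and the least significant bit ends up at a different index in each.

```verilog
// Hypothetical sketch: where the least significant bit lives for each direction.
module endian_demo;
    wire [3:0] le = 4'b0001;  // little-endian: le[0] is the least significant bit
    wire [0:3] be = 4'b0001;  // big-endian:    be[3] is the least significant bit

    initial begin
        #1 $display("le[0]=%b be[3]=%b be[0]=%b", le[0], be[3], be[0]);
        // prints: le[0]=1 be[3]=1 be[0]=0
    end
endmodule
```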

Implicit nets

Implicit nets are often a source of hard-to-detect bugs. In Verilog, net-type signals can be implicitly created by an assign statement or by attaching something undeclared to a module port. Implicit nets are always one-bit wires and cause bugs if you had intended to use a vector. Disabling creation of implicit nets can be done using the `default_nettype none directive.

wire [2:0] a, c;   // Two vectors
assign a = 3'b101;  // a = 101
assign b = a;       // b =   1  implicitly-created wire
assign c = b;       // c = 001  <-- bug
my_module i1 (d,e); // d and e are implicitly one-bit wide if not declared.
                    // This could be a bug if the port was intended to be a vector.

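In the example, the vector a is assigned to b, which was never declared. Verilog implicitly creates a one-bit wire for b. Since a is 3 bits wide, only its lowest bit a[0] reaches b, so b is a single bit with the value 1.

When b is then assigned to c, which is a 3-bit vector, the width mismatch produces a silent bug: c becomes 001 instead of 101.

The last line instantiates a module named my_module and connects d and e to its ports. If d and e are not declared in the enclosing module, they are also implicitly created as one-bit wires. This is a problem if the ports of my_module were meant to be vectors rather than single bits.

`default_nettype none is a compiler directive that disables the implicit creation of undeclared nets, so every signal has to be declared explicitly.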
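Here is a short sketch of the same a, b, c chain with the directive enabled and b declared explicitly. The module name implicit_net_fixed is hypothetical. With the directive active, removing the declaration of b should turn the silent truncation into a compile-time error in most tools.

```verilog
`default_nettype none           // undeclared identifiers are now errors

// Hypothetical sketch: the a -> b -> c chain with every net declared.
module implicit_net_fixed (
    input  wire [2:0] a,
    output wire [2:0] c
);
    wire [2:0] b;               // explicit 3-bit declaration, so no truncation
    assign b = a;
    assign c = b;               // c now equals a in all three bits
endmodule

`default_nettype wire           // restore the default for files compiled afterwards
```

Note that with `default_nettype none, ports also need an explicit net type (input wire, output wire), as written above.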

Unpacked vs. Packed Arrays

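You may have noticed that in declarations the vector indices are written before the vector name. This declares the "packed" dimensions of the array, where the bits are "packed" together into a blob (this is relevant in a simulator, but not in hardware). The unpacked dimensions are declared after the name. They are generally used to declare memory arrays. Since ECE253 didn't cover memory arrays, we have not used packed arrays in this course.

reg [7:0] mem [255:0];   // 256 unpacked elements, each of which is an 8-bit packed vector of reg.
reg mem2 [28:0];         // 29 unpacked elements, each of which is a 1-bit reg.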

Accessing Vector Elements: Part-Select

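A whole vector is accessed by its name. For example,

assign w = a;

takes the entire 4-bit vector a and assigns it to the entire 8-bit vector w (using the declarations above). If the widths of the left and right sides do not match, the value is zero-extended or truncated as appropriate.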
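The zero-extension and truncation rule can be checked with a tiny simulation. This is only a sketch: the module name width_demo and the signal n are hypothetical, and a and w here are local to the demo rather than the ports declared earlier.

```verilog
// Hypothetical sketch: assigning a 4-bit value to wider and narrower nets.
module width_demo;
    wire [3:0] a = 4'b1011;
    wire [7:0] w;
    wire [1:0] n;

    assign w = a;   // wider target: upper bits are filled with zeros
    assign n = a;   // narrower target: upper bits of a are dropped

    initial begin
        #1 $display("w=%b n=%b", w, n);
        // prints: w=00001011 n=11
    end
endmodule
```

Many tools emit a width-mismatch warning for the truncating assignment, which is worth paying attention to.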

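The part-select operator can be used to access a portion of a vector:

w[3:0]      // Only the lower 4 bits of w
x[1]        // The lowest bit of x
x[1:1]      // ...also the lowest bit of x
z[-1:-2]    // Two lowest bits of z
b[3:0]      // Illegal. b was declared big-endian above, and a part-select must match the direction of the declaration.
b[0:3]      // The *upper* 4 bits of b.
assign w[3:0] = b[0:3];    // Legal: assigns the upper 4 bits of b to the lower 4 bits of w. w[3]=b[0], w[2]=b[1], etc.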

A Bit of Practice

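Build a combinational circuit that splits an input half-word (16 bits, [15:0]) into lower [7:0] and upper [15:8] bytes.

From the problem statement, this is clearly little-endian bit ordering.

The code: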

`default_nettype none     // Disable implicit nets. Reduces some types of bugs.
module top_module( 
    input wire [15:0] in,
    output wire [7:0] out_hi,
    output wire [7:0] out_lo );
    assign out_hi[7:0] = in[15:8];
    assign out_lo[7:0] = in[7:0];// plain part-select assignments
endmodule

Vector2

A 32-bit vector can be viewed as containing 4 bytes (bits [31:24], [23:16], etc.). Build a circuit that will reverse the byte ordering of the 4-byte word.

AaaaaaaaBbbbbbbbCcccccccDddddddd => DdddddddCcccccccBbbbbbbbAaaaaaaa

We only need to assign one byte at a time.

module top_module( 
    input [31:0] in,
    output [31:0] out );//

    assign out[31:24] = in[7:0];
    assign out[23:16] = in[15:8];
    assign out[15:8] = in[23:16];
    assign out[7:0] = in[31:24];

endmodule

Vectorgates

The main point here is the difference between OR-ing two vectors as whole values (logical OR) and OR-ing them bit by bit (bitwise OR).

Build a circuit that has two 3-bit inputs that computes the bitwise-OR of the two vectors, the logical-OR of the two vectors, and the inverse (NOT) of both vectors. Place the inverse of b in the upper half of out_not (i.e., bits [5:3]), and the inverse of a in the lower half.

Bitwise vs. Logical Operators

Earlier, we mentioned that there are bitwise and logical versions of the various boolean operators (e.g., norgate). When using vectors, the distinction between the two operator types becomes important. A bitwise operation between two N-bit vectors replicates the operation for each bit of the vector and produces a N-bit output, while a logical operation treats the entire vector as a boolean value (true = non-zero, false = zero) and produces a 1-bit output.

Look at the simulation waveforms at how the bitwise-OR and logical-OR differ.

The code:

module top_module( 
    input [2:0] a,
    input [2:0] b,
    output [2:0] out_or_bitwise,
    output out_or_logical,
    output [5:0] out_not
);
    assign out_or_bitwise[2] = a[2] || b[2];
    assign out_or_bitwise[1] = a[1] || b[1];
    assign out_or_bitwise[0] = a[0] || b[0];
    assign out_or_logical = a || b;
    assign out_not[5:3] = ~b[2:0];
    assign out_not[2:0] = ~a[2:0];
endmodule

My version is needlessly verbose. The reference solution is:

module top_module(
	input [2:0] a, 
	input [2:0] b, 
	output [2:0] out_or_bitwise,
	output out_or_logical,
	output [5:0] out_not
);
	
	assign out_or_bitwise = a | b;
	assign out_or_logical = a || b;

	assign out_not[2:0] = ~a;	// Part-select on left side is o.k.
	assign out_not[5:3] = ~b;	// Assigning to [5:3] does not conflict with [2:0]
	
endmodule

Note that bitwise OR uses a single vertical bar (|), and that ~ inverts every bit of a vector at once.
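A quick simulation sketch shows how the two operator families differ on the same operands. The module name op_demo and the chosen constants are hypothetical.

```verilog
// Hypothetical sketch: bitwise vs. logical operators on 3-bit vectors.
module op_demo;
    wire [2:0] a = 3'b100;
    wire [2:0] b = 3'b011;

    initial begin
        #1 $display("%b %b %b %b", a | b, a || b, a & b, a && b);
        // prints: 111 1 000 1
        //   a | b  = 111  (bit-by-bit OR, 3-bit result)
        //   a || b = 1    (either vector non-zero, 1-bit result)
        //   a & b  = 000  (no bit position is 1 in both)
        //   a && b = 1    (both vectors non-zero)
    end
endmodule
```

The last two results are the ones that usually surprise people: a & b is zero even though a && b is true.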

Gates4

Build a combinational circuit with four inputs, in[3:0]

There are 3 outputs:

out_and: output of a 4-input AND gate.
out_or: output of a 4-input OR gate.
out_xor: output of a 4-input XOR gate.

The code:

module top_module( 
    input [3:0] in,
    output out_and,
    output out_or,
    output out_xor
);
    assign out_and = in[0]&&in[1]&&in[2]&&in[3];
    assign out_or = in[0] || in[1] || in[2] || in[3];
    assign out_xor = in[0] ^ in[1] ^ in[2] ^ in[3];

endmodule
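The same three outputs can also be written with Verilog's reduction operators, which apply one gate across all bits of a vector. This is an alternative sketch, not the exercise's reference answer, and the module name gates4_reduction is hypothetical.

```verilog
// Hypothetical sketch: 4-input AND / OR / XOR using reduction operators.
module gates4_reduction (
    input  wire [3:0] in,
    output wire       out_and,
    output wire       out_or,
    output wire       out_xor
);
    assign out_and = &in;   // in[3] & in[2] & in[1] & in[0]
    assign out_or  = |in;   // in[3] | in[2] | in[1] | in[0]
    assign out_xor = ^in;   // in[3] ^ in[2] ^ in[1] ^ in[0]
endmodule
```

The reduction form does not need to change when the vector width changes, which helps for wide inputs.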

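Multiplexer (data selector)

// Multiplexer
// 1. Behavioral description
module mux_2_to_1(a, b, out, outbar, sel);
    input a, b, sel;
    output out, outbar;// declare the input and output ports
    assign out = sel ? a : b;// select a when sel is 1, otherwise b
    assign outbar = ~out;// outbar is the complement of out
endmodule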

// 2. Structural description
module muxgate(a, b, out, outbar, sel);
    input a, b, sel;
    output out, outbar;
    wire out1, out2, selb;// three internal nodes
        and a1 (out1, a, sel);
        not i1 (selb, sel);
        and a2 (out2, b, selb);
        or o1 (out, out1, out2);
        assign outbar = ~out;
endmodule

Vector3

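Part selection is used to select portions of a vector. The concatenation operator {a, b, c} is used to create larger vectors by concatenating smaller portions of a vector together.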

{3'b111, 3'b000} => 6'b111000
{1'b1, 1'b0, 3'b101} => 5'b10101
{4'ha, 4'd10} => 8'b10101010     // 4'ha and 4'd10 are both 4'b1010 in binary
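// Note: 4'd10 is a 4-bit value written in decimal

Concatenation needs to know the width of every component (otherwise, how would the length of the result be known?). Thus, {1, 2, 3} is illegal and results in the error message: unsized constants are not allowed in concatenations.

The concatenation operator can be used on both the left and right sides of assignments.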

input [15:0] in;
output [23:0] out;
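assign {out[7:0], out[15:8]} = in;         // Swap two bytes. Concatenation on the left side.
assign out[15:0] = {in[7:0], in[15:8]};    // The same thing, with concatenation on the right side.
assign out = {in[7:0], in[15:8]};          // This is different: out is 24 bits wide, so the 16-bit right side is
                                           // zero-extended and out[23:16] becomes 0. In the first two lines,
                                           // out[23:16] is not assigned at all.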

A bit of practice

给定一些输入向量，连接他们并且把他们分成一些输出向量。有6个5bit输入向量：a,b,c,d,e,f所以一共30bit的输入。有四个8bit的输出向量w,x,y,z一共是32bit。输出将会是输入向量加上比特11的连接。显然，30bit+2bit = 32bit正好与输出位数相同。

显然代码如下：

module top_module (
    input [4:0] a, b, c, d, e, f,
    output [7:0] w, x, y, z );//

    // Internal wires are declared in the module body, not in the port list.
    wire [31:0] input_concat;

    // assign { ... } = { ... };
    assign input_concat[31:0] = {a[4:0], b[4:0], c[4:0], d[4:0], e[4:0], f[4:0], 2'b11};
    assign w[7:0] = input_concat[31:24];
    assign x[7:0] = input_concat[23:16];
    assign y[7:0] = input_concat[15:8];
    assign z[7:0] = input_concat[7:0];
    
endmodule

There is also a simpler version that skips the intermediate wire and assigns directly:

assign {w[7:0], x[7:0], y[7:0], z[7:0]} = {a[4:0], b[4:0], c[4:0], d[4:0], e[4:0], f[4:0], 2'b11};

Vectorr

Given an 8-bit input vector [7:0], reverse its bit ordering.

module top_module (
	input [7:0] in,
	output [7:0] out
);
	
  assign {out[0],out[1],out[2],out[3],out[4],out[5],out[6],out[7]} = in[7:0];
	// This could equally well be done with a for loop.
endmodule

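A for loop describes circuit behavior rather than structure, so it can only be used inside a procedural block (for example, an always block). I do not fully understand this statement yet; in practice, just write the loop inside an always block.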

always @(*)begin
  for(int i = 0; i < 8; i++)// int and i++ are SystemVerilog; plain Verilog needs an integer declared outside the loop and i = i + 1
    out[i] = in[8-1-i];
end

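A generate-for loop is a mechanism for producing hardware at elaboration (compile) time. It differs from an ordinary procedural for loop in that it does not execute anything; it replicates hardware structure. So a generate-for can also be used here: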

generate 
  genvar i;
  for(i = 0;i < 8;i = i + 1) begin: my_block_name
    assign out[i] = in[8-i-1];
  end
endgenerate

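Generate-for loop:
- Purpose: replicates hardware structure (module instances, assign statements, always blocks).
- When it runs: it is expanded at elaboration time and never executes during simulation.
- Typical use: creating several module instances, wiring up signals, and so on.
Procedural for loop:
- Purpose: describes behavior inside a procedural block such as always or initial.
- When it runs: the simulator executes it at run time; for synthesis, a loop with constant bounds is unrolled into combinational logic, as in the always @(*) example above.
- Typical use: repetitive bit manipulation in always blocks, and generating test vectors in testbenches.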

Next is another exercise: given a 100-bit input vector [99:0], reverse its bit ordering.

module top_module( 
    input [99:0] in,
    output reg [99:0] out   // reg, because out is assigned inside an always block
);
    always @(*)begin
        for(integer i = 0;i < 100; i++)
            out[i] = in[100-1-i];
    end
endmodule

This can be tidied up a little:

module top_module (
	input [99:0] in,
	output reg [99:0] out
);
	
	always @(*) begin
		for (int i=0;i<$bits(out);i++)		// $bits() is a system function that returns the width of a signal.
			out[i] = in[$bits(out)-i-1];	// $bits(out) is 100 because out is 100 bits wide.
	end
	
endmodule

Replication operator

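The concatenation operator allows concatenating vectors together to form a larger vector. Sometimes you want the same thing concatenated many times, and typing it out repeatedly is tedious. The replication operator exists for exactly this.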

{5{1'b1}}           // 5'b11111 (or 5'd31 or 5'h1f)
{2{a,b,c}}          // The same as {a,b,c,a,b,c}
{3'd5, {2{3'd6}}}   // 9'b101_110_110: a concatenation of 101 with two copies of 110.

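Now for an exercise.

One common place to see a replication operator is when sign-extending a smaller number to a larger one while preserving its signed value. This is done by replicating the sign bit (the most significant bit) of the smaller number to the left. For example: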

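sign-extending 4'b0101 (5) to 8 bits gives 8'b00000101 (5),
while
sign-extending 4'b1101 (-3) to 8 bits gives 8'b11111101 (-3).

Build a circuit that sign-extends an 8-bit number to 32 bits. This requires a concatenation of 24 copies of the sign bit (i.e., bit[7] replicated 24 times) followed by the 8-bit number itself.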

module top_module (
    input [7:0] in,
    output [31:0] out );//

    // assign out = { replicate-sign-bit , the-input };
    assign out = {{24{in[7]}}, in[7:0]};

endmodule

more replication

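Given five 1-bit signals (a, b, c, d, and e), compute all 25 pairwise one-bit comparisons in the 25-bit output vector. The output should be 1 if the two bits being compared are equal.

out[24] = ~a ^ a;   // a == a, so out[24] is always 1. ~x ^ y is XNOR: it is 1 when x and y are equal and 0 otherwise.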
out[23] = ~a ^ b;
...
out[0] = ~e ^ e;

module top_module (
    input a, b, c, d, e,
    output [24:0] out );//

    // The output is XNOR of two vectors created by 
    // concatenating and replicating the five inputs.
    // assign out = ~{ ... } ^ { ... };
    assign out[24:20] = ~{a, a, a ,a, a} ^ {a, b, c, d, e};
    assign out[19:15] = ~{b, b, b, b, b} ^ {a, b, c ,d, e};
    assign out[14:10] = ~{c, c, c, c, c} ^ {a, b, c, d, e};
    assign out[9:5] = ~{d, d, d, d, d} ^ {a, b, c, d, e};
    assign out[4:0] = ~{e, e, e, e, e} ^ {a, b, c, d, e};

endmodule

That was a tedious, clumsy answer (mine, of course). Here is the simpler one.

module top_module (
	input a, b, c, d, e,
	output [24:0] out
);
  wire[24:0] top, bottom;
  assign top = { {5{a}}, {5{b}}, {5{c}}, {5{d}}, {5{e}}};
  assign bottom = { 5{a, b, c, d, e} };
  assign out = ~top ^ bottom;// the operators act bitwise across all 25 bits
endmodule

Module (that is, designing a chip-like building block)

module mod_a ( input in1, input in2, output out );
    // Module body
endmodule

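Its three arguments are the three ports: in1 and in2 are inputs and out is an output.

The hierarchy of modules is created by instantiating one module inside another, as long as all of the modules used belong to the same project (so the compiler knows where to find them).

The code for one module is not written inside another module's body; module definitions are not nested.

There are two commonly used methods to connect a wire to a port.

1. By position

The syntax should be familiar: it resembles a C function call.

mod_a instance1 ( wa, wb, wc );

This instantiates a module of type mod_a and gives it the instance name instance1, then connects signal wa (outside of the new module) to the first port (in1) of the new module, wb to the second port (in2), and wc to the third port (out), as shown in the figure.

One drawback of this syntax is that if the module's port list changes, all instantiations of the module also need to be found and changed to match the new module.

2. By name

Connecting signals to a module's ports by name allows wires to remain correctly connected even if the port list changes.

mod_a instance2 ( .out(wc), .in1(wa), .in2(wb) );

The above line instantiates a module of type mod_a named instance2. Notice that the ordering of ports is irrelevant here, because each connection is made to the correct name regardless of its position in the sub-module's port list. This method is clearly the more robust of the two.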

module top_module (
	input a,
	input b,
	output out
);

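	// Create an instance of mod_a named inst1 in the top-level module and connect ports by name:
	mod_a inst1 ( 
    .in1(a), 	// Port in1 of mod_a connects to wire a of the parent
    .in2(b),	// Port in2 connects to wire b of the parent
    .out(out)	// Port out of mod_a connects to out of top_module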
	);

/*
	// Create an instance of "mod_a" named "inst2", and connect ports by position:
	mod_a inst2 ( a, b, out );	// The three wires are connected to ports in1, in2, and out, respectively.
*/
endmodule

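In effect, a small module is instantiated inside a larger one. Here is an exercise.

You are given a module named mod_a that has 2 outputs and 4 inputs. You must connect the 6 ports, by position or by name, to the ports of the outer module top_module: out1, out2, a, b, c, and d.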

Module pos.png

// Wrong:
module top_module(input a, input b, input c, input d, output out1, output out2);
  mod_a instance1(
    .in(a),
    .in(b),
    .in(c),
    .in(d),
    .out(out1),
    .out(out2)
  );
endmodule
// This fails: the port names of mod_a are not given in this exercise, and .in / .out cannot identify which port is meant, so the ports have to be connected by position.
module top_module(input a, input b, input c, input d, output out1, output out2);
  mod_a instance1(out1, out2, a, b, c, d);
endmodule// Note the order: it follows the declaration order of mod_a, which is module mod_a ( output, output, input, input, input, input );

If the circuit diagram is given like this instead,

Module name.png

then the ports of the two modules can be connected by name, as described above.

module top_module(input a, input b, input c, input d, output out1, output out2);
  mod_a instance1(
    .in1(a),
    .in2(b),
    .in3(c),
    .in4(d),
    .out1(out1),
    .out2(out2)
  );
endmodule

Module shift

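This exercise builds a shift register: instantiate three D flip-flops, chain them together, and connect the clk port of all three to the same clock.

Module shift.png

The flip-flop module is provided as module my_dff ( input clk, input d, output q );

To make the internal connections, some wires need to be declared. Be careful when naming wires and module instances: the names must be unique.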

module top_module ( input clk, input d, output q );
    wire dff_1_out;
    wire dff_2_out;
    my_dff instant1(clk, d, dff_1_out);
    my_dff instant2(clk, dff_1_out, dff_2_out);
    my_dff instant3(clk, dff_2_out, q);
endmodule
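// Think this through carefully. Three wires could be declared, with q assigned from the third, but two wires are enough because the last flip-flop can drive q directly.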

The next problem extends this one. Above, the wires and ports were plain single-bit wires; now they become vectors.

Module shift8.png

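Here we use my_dff8, a module with two inputs and one output (an 8-bit D flip-flop). Instantiate three of them and chain them together, then create a 4-to-1 multiplexer (data selector) that chooses what to output depending on the two bits sel[1:0]. Essentially, sel selects how many cycles to delay the input, from zero to three clock cycles.

my_dff8 is declared as module my_dff8 ( input clk, input [7:0] d, output [7:0] q );

The multiplexer is not provided, so first practice by creating a 16-bit wide, 9-to-1 multiplexer. sel=0 chooses a, sel=1 chooses b, etc. For the unused cases (sel=9 to 15), set all output bits to 1.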

module top_module( 
    input [15:0] a, b, c, d, e, f, g, h, i,
    input [3:0] sel,
    output reg [15:0] out );  // reg, because out is assigned inside an always block

    always @(*) begin
        case(sel)
            4'b0000: out = a;  // When sel is 0, choose input a
            4'b0001: out = b;  // When sel is 1, choose input b
            4'b0010: out = c;  // When sel is 2, choose input c
            4'b0011: out = d;  // When sel is 3, choose input d
            4'b0100: out = e;  // When sel is 4, choose input e
            4'b0101: out = f;  // When sel is 5, choose input f
            4'b0110: out = g;  // When sel is 6, choose input g
            4'b0111: out = h;  // When sel is 7, choose input h
            4'b1000: out = i;  // When sel is 8, choose input i
            default: out = 16'b1111111111111111;  // sel = 9 to 15: set all bits to 1
        endcase
    end

endmodule

The reference solution instead assigns a default value first and omits the default case. Substitute this for the always block above:

always @(*) begin
		out = '1;		// '1 is a special literal syntax for a number with all bits set to 1.
						// '0, 'x, and 'z are also valid.
						// I prefer to assign a default value to 'out' instead of using a
						// default case.
		case (sel)
			4'h0: out = a;
			4'h1: out = b;
			4'h2: out = c;
			4'h3: out = d;
			4'h4: out = e;
			4'h5: out = f;
			4'h6: out = g;
			4'h7: out = h;
			4'h8: out = i;
		endcase
	end

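Back to the original problem. It is a 4-to-1 selection, so two select bits are enough, and the given sel is indeed two bits wide ([1:0]).

This time the internal wires have to be declared as vectors.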

module top_module ( 
    input clk, 
    input [7:0] d, 
    input [1:0] sel, 
    output [7:0] q 
);
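    wire [7:0] dff_1_output;// 8 bits wide, because the my_dff8 ports are 8 bits wide
    wire [7:0] dff_2_output;
    wire [7:0] dff_3_output;
    reg [7:0] select_out;// reg, because it is assigned inside the always block below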
    my_dff8 instant1(clk, d[7:0], dff_1_output[7:0]);
    my_dff8 instant2(clk, dff_1_output[7:0], dff_2_output[7:0]);
    my_dff8 instant3(clk, dff_2_output[7:0], dff_3_output[7:0]);
    
    always @(*)begin
        case(sel)
            2'b00: select_out[7:0] = d[7:0];
            2'b01: select_out = dff_1_output;
            2'b10: select_out = dff_2_output;
            2'b11: select_out[7:0] = dff_3_output[7:0];
            default: select_out[7:0] = 8'b0;
        endcase
    end
    assign q = select_out;

endmodule

Always watch the order when declaring a vector: the range comes first, then the name. I have made this mistake several times.

Module add

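We are given a module add16 that performs a 16-bit addition. Instantiate two of them to create a 32-bit adder. One add16 computes the lower 16 bits of the result, and the second computes the upper 16 bits after receiving the carry-out of the first. The 32-bit adder does not need to handle a carry-in (the first add16 gets a carry-in of 0) or a carry-out (ignored), but the internal modules do in order to work correctly; the second add16, for example, must take the first one's carry-out as its carry-in.

The 16-bit adder is declared as follows:

module add16 ( input[15:0] a, input[15:0] b, input cin, output[15:0] sum, output cout );

Module add.png

The code: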

module top_module(
    input [31:0] a,
    input [31:0] b,
    output [31:0] sum
);
    wire [15:0] out_1;
    wire [15:0] out_2;
    wire c_out_1;
    wire c_out_2;
    add16 instant1(a[15:0], b[15:0], 0, out_1[15:0], c_out_1);
    add16 instant2(a[31:16], b[31:16], c_out_1, out_2[15:0], c_out_2);
    assign sum[15:0] = out_1[15:0];
    assign sum[31:16] = out_2[15:0];
endmodule

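Exercise

In this exercise, we create a circuit with two levels of hierarchy. top_module instantiates two copies of add16 (provided), each of which instantiates 16 copies of add1 (which we must write). So two modules have to be written: top_module and add1.

As in the previous exercise, we are given an add16 module that performs a 16-bit addition, and two of them must be instantiated to create a 32-bit adder.

Connect the two 16-bit adders as shown in the diagram. add16 is declared as follows:

module add16 ( input[15:0] a, input[15:0] b, input cin, output[15:0] sum, output cout );

Within each add16, 16 full adders (module add1) are instantiated to actually perform the addition. add1 must have the following declaration:

module add1 ( input a, input b, input cin, output sum, output cout );

In summary, there are three modules in this design:

top_module, the top-level module, which contains two 16-bit adders.

add16, a 16-bit adder that contains 16 1-bit adders.

add1, a 1-bit full adder.

Module fadd.png

The code: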

module top_module (
    input [31:0] a,
    input [31:0] b,
    output [31:0] sum
);//
    // wires for the intermediate sum and carry signals
    wire [15:0] out_1;
    wire [15:0] out_2;
    wire [15:0] c_out_1;
    wire [15:0] c_out_2;
    //add16 instant1(a[15:0], b[15:0], 0, out_1[15:0], c_out_1[15:0]);
    //add16 instant2(a[31:16], b[31:16], c_out_1, out_2[15:0], c_out_2[15:0]);
    
    add1 instant(a[0], b[0], 0, out_1[0], c_out_1[0]);
    genvar i;// generate-for loop instantiating the remaining full adders of the lower half
    generate 
        for(i = 1; i < 16; i=i+1) begin: gen_add1
            add1 instant(a[i], b[i], c_out_1[i-1], out_1[i], c_out_1[i]);// bits 1 to 15 of the lower half
        end
    endgenerate
    
    add1 instant12(a[16], b[16], c_out_1[15], out_2[0], c_out_2[0]);
    genvar j;
    generate
        for(j = 1; j < 16; j = j + 1) begin: gen_add2
            add1 instant(a[j + 16], b[j + 16], c_out_2[j-1], out_2[j], c_out_2[j]);
        end
    endgenerate
    assign sum[31:16] = out_2[15:0];  // sum bits come from the sum outputs, not from the carry chain
    assign sum[15:0] = out_1[15:0];
    
endmodule

module add1 ( input a, input b, input cin,   output sum, output cout );
    wire x1, x2, x3;
    xor (x1, a, b);
    xor (sum, x1, cin);// sum bit
    // compute the carry
    and (x2, a, b);
    or (x3, x2, (a & cin));
    or (cout, x3, (b & cin));// OR together the pairwise ANDs to get the carry
// Full adder module here

endmodule

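After all that work, the effort was wasted. There is no need to nest things this way, because the design is already modular.

The correct answer: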

module top_module (
    input [31:0] a,
    input [31:0] b,
    output [31:0] sum
);

    wire c_out_1;
    wire c_out_2;

    // Instantiate two 16-bit adders
    add16 instant1(a[15:0], b[15:0], 0, sum[15:0], c_out_1);
    add16 instant2(a[31:16], b[31:16], c_out_1, sum[31:16], c_out_2);


endmodule

module add1 (
    input a,
    input b,
    input cin,
    output sum,
    output cout
);

    wire x1, x2, x3;

    // Compute the sum
    xor (x1, a, b);
    xor (sum, x1, cin);

    // Compute the carry
    and (x2, a, b);
    or (x3, x2, (a & cin));
    or (cout, x3, (b & cin));

endmodule

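Note that this exercise only asks us to write the add1 module at the bottom. The key point of add1 is the carry: AND the three inputs pairwise, then OR the results.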

carry-select adder

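One drawback of the adder above, known as a ripple-carry adder, is that the delay for computing the carry-out is fairly slow, and the second-stage adder cannot begin computing its carry-out until the first-stage adder has finished. This makes the whole adder slow. One improvement is shown below:

Module cseladd.png

The first-stage adder stays the same, but the second-stage adder is duplicated: one copy assumes carry-in = 0 and the other assumes carry-in = 1, and a fast 2-to-1 multiplexer then selects whichever result turns out to be correct.

In this exercise, we are provided with the same add16 module as before. We must instantiate three of them and use a 16-bit 2-to-1 multiplexer to build the carry-select adder.

module add16 ( input[15:0] a, input[15:0] b, input cin, output[15:0] sum, output cout );

Connect the modules together as shown in the diagram above.

This is not hard; it reuses the multiplexer from earlier.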

module top_module(
    input [31:0] a,
    input [31:0] b,
    output [31:0] sum
);
    wire c_out_1, c_out_2, c_out_3;// c_out_1 serves as the sel signal
    wire [15:0] out_1, out_2;
    add16 instant1(a[15:0], b[15:0], 0, sum[15:0], c_out_1);
    add16 instant2(a[31:16], b[31:16], 0, out_1[15:0], c_out_2);
    add16 instant3(a[31:16], b[31:16], 1, out_2[15:0], c_out_3);
    
    // sum[15:0] is a net driven by instant1, so sum cannot also be assigned in an
    // always block; use a continuous assignment for the 2-to-1 multiplexer instead.
    assign sum[31:16] = c_out_1 ? out_2 : out_1;
    
endmodule

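Module add-sub (adder-subtractor)

An adder-subtractor can be built from an adder by optionally negating one of the inputs, which is equivalent to inverting the input and then adding 1. The net result is a circuit that can do two operations: (a + b + 0) and (a + ~b + 1), i.e., subtraction in two's complement.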
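As a quick sanity check of the identity a - b = a + ~b + 1, here is a simulation-only sketch with 8-bit values. The module name twos_demo and the chosen constants are hypothetical.

```verilog
// Hypothetical sketch: subtraction as "invert and add one" in two's complement.
module twos_demo;
    reg  [7:0] a = 8'd5;
    reg  [7:0] b = 8'd3;
    wire [7:0] diff = a + ~b + 8'd1;   // 5 + 252 + 1 = 258, which wraps to 2 in 8 bits

    initial begin
        #1 $display("%0d", diff);
        // prints: 2
    end
endmodule
```

All operands and the result are deliberately the same width here. If the result were wider than b, Verilog would extend b before inverting it, and the upper bits of the sum would no longer be zero.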

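Again we are provided with the same 16-bit adder as before.

module add16 ( input[15:0] a, input[15:0] b, input cin, output[15:0] sum, output cout );

Use a 32-bit wide XOR gate to invert the b input whenever sub is 1 (subtraction). This can also be viewed as b[31:0] XORed with sub replicated 32 times.

Module addsub.png

Also connect the sub input to the carry-in of the adder.

The code: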

module top_module(
    input [31:0] a,
    input [31:0] b,
    input sub,
    output [31:0] sum
);
    wire c_out_1, c_out_2;
    wire [31:0] sub_32 = {32{sub}};
    wire [31:0] reversed_b;
    assign reversed_b[31:0] = b[31:0] ^ sub_32[31:0];
    add16 instant1(a[15:0], reversed_b[15:0], sub, sum[15:0], c_out_1);
    add16 instant2(a[31:16], reversed_b[31:16], c_out_1, sum[31:16], c_out_2);   

endmodule

One small question remains. I originally assigned reversed_b with

xor(reversed_b[31:0], b[31:0], sub_32[31:0]);

and it produced an error. Why?
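A likely explanation (an assumption, since the exact error message is not shown): gate primitives such as xor have 1-bit terminals, so a single primitive instance cannot be connected to 32-bit vectors. How a tool reacts to that varies; some reject it and some use only one bit. Two forms that do work are sketched below. The module name xor_vector_demo and the output names are hypothetical.

```verilog
// Hypothetical sketch: two ways to XOR a 32-bit vector with a replicated bit.
module xor_vector_demo (
    input  wire [31:0] b,
    input  wire        sub,
    output wire [31:0] b_gate,
    output wire [31:0] b_op
);
    wire [31:0] sub_32 = {32{sub}};

    // Array of 32 xor primitives: instance i is connected to bit i of each vector.
    xor x_bits [31:0] (b_gate, b, sub_32);

    // Equivalent and usually simpler: the bitwise XOR operator.
    assign b_op = b ^ sub_32;
endmodule
```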

Always block1

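Since digital circuits are composed of logic gates connected with wires, any circuit can be expressed as some combination of modules and assign statements. However, sometimes this is not the most convenient way to describe the circuit. Procedures (of which always blocks are one example, along with initial, task, and function) provide an alternative syntax for describing circuits.

For synthesizing hardware, two types of always blocks are relevant:

combinational: always @(*)
clocked: always @(posedge clk)

Combinational always blocks are equivalent to assign statements, so there is always a way to express a combinational circuit both ways. The choice between the two is mainly a matter of which syntax is more convenient. The syntax for code inside a procedural block is different from code outside, and procedural blocks have a richer set of statements (e.g., if-then, case).

For example, the following two statements describe the same circuit:

assign out1 = a & b | c ^ d;
always @(*) out2 = a & b | c ^ d;

Alwayscomb.png

As the figure shows, the left-hand side of an assign statement must be a net type (e.g., wire), while the left-hand side of a procedural assignment (in an always block) must be a variable type (e.g., reg). These types have nothing to do with what hardware is synthesized.

Here is the exercise: build an AND gate using both an assign statement and a combinational always block.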

module top_module(
    input a, 
    input b,
    output wire out_assign,
    output reg out_alwaysblock
);
    assign out_assign = a & b;
    always @(*) out_alwaysblock = a & b;

endmodule

Always block2

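For synthesizing hardware, two types of always blocks are relevant:

combinational: always @(*)

clocked: always @(posedge clk)

A clocked always block creates a blob of combinational logic just like a combinational always block, but it also creates a set of flip-flops (registers) at the output of that logic. The outputs of the logic are not visible immediately; they become visible only right after the next (posedge clk).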

blocking vs non-blocking assignment

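Continuous assignment (assign x = y;): can only be used outside a procedure (always block).

Procedural blocking assignment (x = y;): can only be used inside a procedure.

Procedural non-blocking assignment (x <= y;): can only be used inside a procedure.

In a combinational always block, use blocking assignments. In a clocked always block, use non-blocking assignments.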
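The rule matters most when one clocked block assigns several registers that depend on each other. The sketch below puts the same two statements in both styles; the module name assign_styles and all signal names are hypothetical.

```verilog
// Sketch: identical statements with non-blocking and blocking assignments.
module assign_styles (
    input      clk,
    input      d,
    output reg q1_nb, q2_nb,   // non-blocking version
    output reg q1_b,  q2_b     // blocking version
);
    // Non-blocking: all right-hand sides are sampled before any update,
    // so q2_nb receives the OLD q1_nb. This is a two-stage shift register.
    always @(posedge clk) begin
        q1_nb <= d;
        q2_nb <= q1_nb;
    end

    // Blocking: q1_b is updated first, then q2_b reads the NEW q1_b,
    // so both registers load d on the same clock edge.
    always @(posedge clk) begin
        q1_b = d;
        q2_b = q1_b;
    end
endmodule
```

The blocking version is included only to show the difference; in a clocked block it silently removes the intended pipeline stage.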

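Here is an exercise.

Build an XOR gate three ways: with an assign statement, a combinational always block, and a clocked always block. Note that the clocked always block produces a different circuit from the other two: it contains a flip-flop, so its output is delayed.

Always ff.png

The code is as follows: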

module top_module(
    input clk,
    input a,
    input b,
    output wire out_assign,
    output reg out_always_comb,
    output reg out_always_ff   );
    
    assign out_assign = a ^ b;
    always @(*) out_always_comb = a ^ b;
    always @(posedge clk) out_always_ff <= a ^ b;  // non-blocking in a clocked block

endmodule

if-statement

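An if statement usually creates a 2-to-1 multiplexer: it selects one input if the condition is true and the other input if the condition is false.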

always @(*) begin
    if (condition) begin
        out = x;
    end
    else begin
        out = y;
    end
end

This is equivalent to a continuous assignment with the ternary (conditional) operator:

assign out = (condition) ? x : y;

Write the circuit described by the truth table below.

sel_b1	sel_b2	out_assign, out_always
0	0	a
0	1	a
1	0	a
1	1	b

Implement it two ways: once with an assign statement, and once with a procedural if statement.

module top_module(
    input a,
    input b,
    input sel_b1,
    input sel_b2,
    output wire out_assign,
    output reg out_always   ); 
    assign out_assign = (sel_b1 & sel_b2) ? b:a;
    always @(*)begin
        if(sel_b1 & sel_b2) begin
            out_always = b;
        end
        else begin
            out_always = a;
        end
    end
endmodule

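Very simple!

A common source of errors: How to avoid making latches

The following code contains incorrect behavior that creates a latch. Fix the bugs so that the computer is shut off only when it is really overheated, and so that you stop driving once you have arrived at your destination or need to refuel.

This is the circuit the code actually describes, which is not the logic you want.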

always @(*)begin 
  if(cpu_overheated)
    shut_off_computer = 1;
end

always @(*) begin
  if(~arrived)
    keep_driving = ~gas_tank_empty;
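end

In the first block, nothing is assigned when the CPU is not overheated, so shut_off_computer has to keep its previous value. Synthesis infers a latch to hold that value, and once the output has been set to 1 nothing ever clears it.
In the second block, keep_driving is not assigned once arrived is 1, so it also keeps its old value and the car may keep driving after arriving.

In other words, each if needs an else branch (or a default assignment). Without one, the output holds its previous value whenever the condition is false.

The corrected version is: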

module top_module (
    input      cpu_overheated,
    output reg shut_off_computer,
    input      arrived,
    input      gas_tank_empty,
    output reg keep_driving  ); //

    always @(*) begin
        if (cpu_overheated) begin
           shut_off_computer = 1;
        end
        else
            shut_off_computer = 0;
    end

    always @(*) begin
        if (~arrived) begin
           keep_driving = ~gas_tank_empty;
        end
        else
            keep_driving = 0;
    end

endmodule

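case statement

We are already fairly familiar with case, since it appeared in earlier examples, so all we need here is some practice.

Create a 6-to-1 multiplexer. When sel is between 0 and 5, choose the corresponding data input; otherwise output 0. The data inputs and the output are all 4 bits wide.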

module top_module ( 
    input [2:0] sel, 
    input [3:0] data0,
    input [3:0] data1,
    input [3:0] data2,
    input [3:0] data3,
    input [3:0] data4,
    input [3:0] data5,
    output reg [3:0] out   );//

    always@(*) begin  // This is a combinational circuit
        case(sel)
            3'b000: out = data0;
            3'b001: out = data1;
            3'b010: out = data2;
            3'b011: out = data3;
            3'b100: out = data4;
            3'b101: out = data5;
            default: out = 4'b0000;
        endcase
    end

endmodule

case 2

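A priority encoder is a combinational circuit that, given an input bit vector, outputs the position of the first 1 bit in the vector (counting from the least significant bit). For example, an 8-bit priority encoder given the input 8'b10010000 outputs 3'd4, because bit[4] is the first bit that is high.

Build a 4-bit priority encoder. If none of the input bits are high, output 0.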

module top_module (
    input [3:0] in,
    output reg [1:0] pos  );
    always @(*) begin
        case(in)
            4'b0000: pos = 2'd0;
            4'b0001: pos = 2'd0;
            4'b0010: pos = 2'd1;
            4'b0011: pos = 2'd0;
            4'b0100: pos = 2'd2;
            4'b0101: pos = 2'd0;
            4'b0110: pos = 2'd1;
            4'b0111: pos = 2'd0;
            4'b1000: pos = 2'd3;
            4'b1001: pos = 2'd0;
            4'b1010: pos = 2'd1;
            4'b1011: pos = 2'd0;
            4'b1100: pos = 2'd2;
            4'b1101: pos = 2'd0;
            4'b1110: pos = 2'd1;
            4'b1111: pos = 2'd0;
            default: pos = 2'd0;
        endcase
    end

endmodule

Writing out every binary pattern like this is clearly tedious. Hexadecimal case items are much more compact.
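As a sketch of the more compact form, the same 4-bit encoder can be written with hexadecimal case items, grouping the values that share an output and letting default cover zero and all odd inputs. The module name priority4_hex is hypothetical.

```verilog
// Sketch: 4-bit priority encoder with hexadecimal, comma-grouped case items.
module priority4_hex (
    input      [3:0] in,
    output reg [1:0] pos
);
    always @(*) begin
        case (in)
            4'h2, 4'h6, 4'ha, 4'he: pos = 2'd1;  // bit 0 low, bit 1 high
            4'h4, 4'hc:             pos = 2'd2;  // bits 1:0 low, bit 2 high
            4'h8:                   pos = 2'd3;  // only bit 3 high
            default:                pos = 2'd0;  // bit 0 high, or input is zero
        endcase
    end
endmodule
```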

Always CaseZ

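Build a priority encoder for 8-bit inputs. Given an 8-bit vector, the output should report the position of the first (least significant) bit that is 1. Report 0 if the input vector has no bits that are high.

With an 8-bit input, a plain case statement would need 256 case items, which is far too many. The 256 cases can be reduced to 9 if the case items are allowed to contain don't-care bits. This is what casez is for: it treats bits that have the value z as don't-care in the comparison.

For example, the 4-bit encoder from the previous exercise can be shortened to: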

always @(*) begin
    casez (in[3:0])
        4'bzzz1: out = 0;   // in[3:1] can be anything
        4'bzz1z: out = 1;
        4'bz1zz: out = 2;
        4'b1zzz: out = 3;
        default: out = 0;
    endcase
end

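Each of the 16 possible inputs is compared against the case items in order, and the output comes from the item that matches. Some inputs match more than one item: 4'b1111, for instance, matches every item above. In that case the first matching item wins, so out = 0, and the later items are not considered.

Note: either ? or z can be used as the don't-care symbol.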

module top_module (
    input [7:0] in,
    output reg [2:0] pos
);

always @(*) begin
    casez(in)
        8'b???????1: pos = 3'd0;
        8'b??????10: pos = 3'd1;
        8'b?????100: pos = 3'd2;
        8'b????1000: pos = 3'd3;
        8'b???10000: pos = 3'd4;
        8'b??100000: pos = 3'd5; // either z or ? works here
        8'b?1000000: pos = 3'd6;
        8'b10000000: pos = 3'd7;
        default: pos = 3'd0;
    endcase
end

endmodule

Note: the case keyword must be changed to casez here.

Always no latches

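Suppose you are building a circuit to process scancodes from a PS/2 keyboard for a game. Given the last two bytes of scancodes received, you need to indicate whether one of the arrow keys on the keyboard has been pressed. This is a fairly simple mapping, which can be implemented as a case statement (or if-elseif) with four cases.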

Scancode [15:0]	Arrow key
`16'he06b`	left arrow
`16'he072`	down arrow
`16'he074`	right arrow
`16'he075`	up arrow
Anything else	none

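Your circuit has one 16-bit input and four outputs. Build the circuit so that it recognizes these four scancodes and asserts the correct output.

To avoid creating latches, every output must be assigned a value under every possible condition. A simple way to do this is to assign a default value to the outputs before the case statement.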

always @(*) begin
    up = 1'b0; down = 1'b0; left = 1'b0; right = 1'b0;
    case (scancode)
        ... // Set to 1 as necessary.
    endcase
end

The solution code is as follows:

// synthesis verilog_input_version verilog_2001
module top_module (
    input [15:0] scancode,
    output reg left,
    output reg down,
    output reg right,
    output reg up  ); 
    always @(*)begin
        up = 1'b0;down = 1'b0;left = 1'b0;right = 1'b0;
        case(scancode)
            16'he06b: left = 1'b1;
            16'he072: down = 1'b1;
            16'he074: right = 1'b1;
            16'he075: up = 1'b1;
        endcase
    end

endmodule

This one is not hard either! Earlier examples already assigned values up front in the same way.