MF8A18 Microcode

MF8A18: MicroFPGA 8-bit engine, rev A, year 2018

Instr	Function	RISC-V code
ADD	Rd = Rd + Rs	ADD rd, rd, rs
ADC	Rd = Rd + Rs + Carry	*
SUB	Rd = Rd - Rs	SUB rd, rd, rs
SUBI	Rd = Rd - imm	ADDI t0, r0, imm; SUB rd, rd, t0
SBC	Rd = Rd - Rs - Carry	*
AND	Rd = Rd & Rs	AND rd, rd, rs
ANDI	Rd = Rd & imm	ANDI rd, rd, imm
OR	Rd = Rd | Rs	OR rd, rd, rs
ORI	Rd = Rd | imm	ORI rd, rd, imm
XOR	Rd = Rd ^ Rs	XOR rd, rd, rs
ROR	Rd = Rd >> 1; Rd[7] = Carry	*
SHR	Rd = Rd >> 1; Rd[7] = 0	SRLI rd, rd, 1
ASR	Rd = Rd >> 1; Rd[7] = Rd[7]	SRAI rd, rd, 1
SWAP	Rd = Rd[3:0,7:4]	*
JMP	Jump (relative)	JAL r0, label
SKP0	Skip if Rd bit is 0	*
SKP1	Skip if Rd bit is 1	*
BEQ	Branch if Zero set	*
BNE	Branch if Zero clear	*
BLT		*
BGE		*
STORE	RAM[R31:R30] = Rd	SB
LOAD	Rd = RAM[R31:R30]	LB
IN	Rd = input
OUT	output = Rd
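
The carry-based and nibble instructions, which the table marks with * instead of a RISC-V equivalent, can be sketched as a small behavioral model. This is only an illustration in Python, not the MF8A18 RTL. The function names are made up for this sketch, and it assumes that SBC treats Carry as a borrow (Rd - Rs - Carry) and that ROR moves the old bit 0 into Carry; the table states neither of these.

```python
MASK = 0xFF  # 8-bit datapath


def adc(rd, rs, carry):
    """Rd = Rd + Rs + Carry. Returns (result, carry_out)."""
    total = rd + rs + carry
    return total & MASK, total >> 8


def sbc(rd, rs, carry):
    """Rd = Rd - Rs - Carry, Carry used as borrow (assumption).
    Returns (result, borrow_out)."""
    total = rd - rs - carry
    return total & MASK, 1 if total < 0 else 0


def ror(rd, carry):
    """Rd = Rd >> 1 with Rd[7] = Carry.
    Returning the old bit 0 as the new carry is an assumption."""
    return ((rd >> 1) | (carry << 7)) & MASK, rd & 1


def swap(rd):
    """Exchange the high and low nibbles: Rd = Rd[3:0,7:4]."""
    return ((rd << 4) | (rd >> 4)) & MASK
```

On a 32-bit RISC-V register file each of these needs several instructions, because there is no carry flag and no 8-bit wraparound.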

Emulated instructions

Instr	Emulated as
Compare	WREG=const; WREG=WREG-value
ROL	Rd = Rd + Rd
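
A minimal Python sketch of the two emulations above, assuming ADD and SUB update a carry/borrow flag and a zero flag (the table does not list flag behavior, so that part is an assumption). The function names are hypothetical.

```python
MASK = 0xFF  # 8-bit datapath


def add(rd, rs):
    """Rd = Rd + Rs. Returns (result, carry, zero)."""
    total = rd + rs
    result = total & MASK
    return result, total >> 8, result == 0


def sub(rd, rs):
    """Rd = Rd - Rs. Returns (result, borrow, zero)."""
    total = rd - rs
    result = total & MASK
    return result, 1 if total < 0 else 0, result == 0


def rol_emulated(rd):
    """ROL emulated as Rd = Rd + Rd: the old bit 7 ends up in carry."""
    return add(rd, rd)


def compare_emulated(const, value):
    """Compare emulated as WREG = const; WREG = WREG - value.
    Only the flags matter, the difference in WREG is discarded."""
    wreg = const
    wreg, borrow, zero = sub(wreg, value)
    return zero, borrow
```

Note that Rd + Rd always shifts a 0 into bit 0, so the old carry is not rotated in.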

Optimization Xilinx LUT6

Optimization step	LUT	Slice	Comment
Initial	187	52	?
PC unit # 1	181	55	PC Unit from 25/10 to 10/3
PC unit # 2a	182	52	PC Unit from 25/10 to 14/5
PC unit # 2b	179	52	PC Unit from 25/10 to 12/5
PC unit # 2c	179	52	PC Unit from 25/10 to 11/4
misc #1	165	48	cleaned up a bit
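
To compare the steps, the table can be turned into differences against the initial build. The Python snippet below only re-computes numbers already listed above; the function name is made up for this sketch.

```python
# (step, LUT, Slice) copied from the table above
STEPS = [
    ("Initial", 187, 52),
    ("PC unit # 1", 181, 55),
    ("PC unit # 2a", 182, 52),
    ("PC unit # 2b", 179, 52),
    ("PC unit # 2c", 179, 52),
    ("misc #1", 165, 48),
]


def deltas(steps):
    """Return (step, LUT change, Slice change) relative to the first row."""
    _, base_lut, base_slice = steps[0]
    return [(name, lut - base_lut, sl - base_slice) for name, lut, sl in steps]


for name, d_lut, d_slice in deltas(STEPS):
    print(f"{name:14s} LUT {d_lut:+4d}  Slice {d_slice:+3d}")
```

Step 1 is the only one that costs slices (+3) while saving LUT (-6); after misc #1 the design is 22 LUT and 4 slices below the initial build.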

PC unit optimization step 1

The PC unit should need 10 LUT5, 3 x CARRY4 and 10 FF and fit into 2 slices, but it initially used 25 LUT and 5 slices. The first optimization attempt reduced the PC unit's own resources, but the overall design used more LUT and more slices than before. Forcing the PC unit into 2 slices and 10 LUT5 reduced the overall LUT count from 187 to 181, but still used more slices than the initial design.

One possible further optimization was found: the redesigned PC unit uses only 4 inputs of each LUT5, so one input is free in all 10 LUT. Because "keep hierarchy" was set, logic from the higher level could not be merged into the LUT inside the PC unit. A bigger issue is merging the offset select multiplexer into the PC unit, which keep hierarchy prevents as well.

Note: 6 CARRY4 are used because one extra CARRY4 serves as a route-through for the addsub unit; the Xilinx optimization does not use transparent latches as route-through. This CARRY4 could also be optimized away with a manual ALU/addsub.

PC unit optimization step 2 a/b/c

The offset selector was manually merged into the PC unit, and the main PC logic now seems to be optimized. There are 3 extra LUT: one inverts the active-low reset, and the other two generate the freeze and carry-in logic values. The next step would be to use an active-high reset.
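
A rough behavioral sketch of a PC unit with a merged offset selector, written in Python for illustration only. It is not taken from the MF8A18 RTL. The 10-bit width is inferred from the 10 LUT5 / 10 FF figure, and the signal names (`take_branch`, `branch_offset`, `carry_in`, `freeze`, `reset_n`) as well as the priority of reset over freeze are assumptions.

```python
PC_BITS = 10                    # assumption: inferred from 10 LUT5 / 10 FF
PC_MASK = (1 << PC_BITS) - 1


def next_pc(pc, branch_offset=0, take_branch=False,
            carry_in=1, freeze=False, reset_n=True):
    """Compute the next PC value for one clock cycle."""
    if not reset_n:             # active-low reset clears the PC
        return 0
    if freeze:                  # freeze holds the current value
        return pc
    # Offset select: relative branch offset, or 0 for sequential flow
    offset = branch_offset if take_branch else 0
    # A single adder (the carry chain) adds the offset plus carry-in
    return (pc + offset + carry_in) & PC_MASK
```

With `carry_in` set to 1 and no branch taken, the model simply increments. Because offset select and adder are one expression here, the model mirrors the idea of one LUT per bit in front of the carry chain.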

Misc optimization step 1

Various cleanups of the RTL. Several times, reducing logic complexity added new LUT to the final design. Some further cleanup is still possible to save a few more LUT/FF.
