Jzsimd - Mxu Instructions List


7 replies to this topic

#1 hlide

hlide

    GP32 Hardcore

  • GP32 Hardcore
  • 225 posts

Posted 29 October 2009 - 02:49 PM

WORK IN PROGRESS

Analysing the mxu_as AWK script, I count 58 basic SIMD instructions, each with its own opcode name. Among them, 12 can be subdivided into 4 variants each (48 instructions), 3 into 16 variants each (48 instructions) and 1 into 8 variants (8 instructions), because they take fixed arguments (WW, LW, HW, XW, AA, AS, SA, SS, etc.) which partly change the behaviour of the basic SIMD instruction. The remaining 42 have a single form, so we may consider there are 42 + 48 + 48 + 8 = 146 SIMD instructions.
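
The count can be double-checked with a few lines of Python. This is only a sketch: the grouping into 42/12/3/1 opcodes is read off the encoding list in this post (one 2-bit selector gives 4 variants, two give 16, a 1-bit plus a 2-bit selector give 8), and the variable names are made up for the illustration.

```python
# Each tuple is (number of basic opcodes, variants per opcode).
groups = [
    (42, 1),   # no selector field: a single form
    (12, 4),   # one 2-bit selector (e.g. AA/AS/SA/SS or WW/LW/HW/XW)
    (3, 16),   # two 2-bit selectors combined (4 x 4)
    (1, 8),    # S16MAD: 1-bit A/S selector times 2-bit HH/LL/HL/LH
]

basic_total = sum(count for count, _ in groups)
expanded_total = sum(count * variants for count, variants in groups)

print(basic_total)     # 58
print(expanded_total)  # 146
```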

Here is the list of basic SIMD instructions, preceded by the bit fields they use (NAME:width in bits):
WW_LW_HW_XW:2   = { WW:00, LW:01, HW:10, XW:11 }
HH_LL_HL_LH:2   = { HH:00, LL:01, HL:10, LH:11 }
AA_AS_SA_SS:2   = { AA:00, AS:01, SA:10, SS:11 }
A_S:1           = { A:0, S:1 }
SFL_PTN:2       = { PTN0:00, PTN1:01, PTN2:10, PTN3:11 }

XA:4      = { XR0:0000, XR1:0001, ..., XR15:1111 }
XB:4      = { XR0:0000, XR1:0001, ..., XR15:1111 }
XC:4      = { XR0:0000, XR1:0001, ..., XR15:1111 }
XD:4      = { XR0:0000, XR1:0001, ..., XR15:1111 }
XA_XR16:5 = { XR0:00000, XR1:00001, ..., XR15:01111, XR16:10000 }

RS:5 = { GR0:00000, GR1:00001, ..., GR31:11111 }
RT:5 = { GR0:00000, GR1:00001, ..., GR31:11111 }
RD:5 = { GR0:00000, GR1:00001, ..., GR31:11111 }

SA:5 = { 0, 1, ..., 31 }
SS:2 = { 0, 1, 2, 4 }

SPECIAL2:6 = 011100

Q8MUL       (011100:00          :00         :XD   :XC  :XB   :XA   :111000)
Q8MAC       (011100:AA_AS_SA_SS :00         :XD   :XC  :XB   :XA   :111010)
Q8MADL      (011100:AA_AS_SA_SS :00         :XD   :XC  :XB   :XA   :111100)
Q8SAD       (011100:00          :00         :XD   :XC  :XB   :XA   :111110)
Q8ADDE      (011100:AA_AS_SA_SS :00         :XD   :XC  :XB   :XA   :011100)
Q8ACCE      (011100:AA_AS_SA_SS :00         :XD   :XC  :XB   :XA   :011101)
Q8ABD       (011100:00          :00         :0:100:XC  :XB   :XA   :000111)
Q8AVG       (011100:00          :00         :0:100:XC  :XB   :XA   :000110)
Q8AVGR      (011100:00          :00         :0:101:XC  :XB   :XA   :000110)
Q8ADD       (011100:AA_AS_SA_SS :00         :0:111:XC  :XB   :XA   :000110)
Q8MAX       (011100:00          :00         :0:100:XC  :XB   :XA   :000011)
Q8MIN       (011100:00          :00         :0:101:XC  :XB   :XA   :000011)
Q8SLT       (011100:00          :00         :0:110:XC  :XB   :XA   :000011)
D16MUL      (011100:00          :WW_LW_HW_XW:XD   :XC  :XB   :XA   :001000)
D16MULF     (011100:00          :WW_LW_HW_XW:0:000:XC  :XB   :XA   :001001)
D16MAC      (011100:AA_AS_SA_SS :WW_LW_HW_XW:XD   :XC  :XB   :XA   :001010)
D16MACF     (011100:00          :WW_LW_HW_XW:0:000:XC  :XB   :XA   :001011)
D16MADL     (011100:00          :WW_LW_HW_XW:0:000:XC  :XB   :XA   :001100)
S16MAD      (011100:0:A_S       :HH_LL_HL_LH:XD   :XC  :XB   :XA   :001101)
Q16ADD      (011100:AA_AS_SA_SS :WW_LW_HW_XW:XD   :XC  :XB   :XA   :001110)
Q16ACC      (011100:AA_AS_SA_SS :WW_LW_HW_XW:XD   :XC  :XB   :XA   :011011)
D16CPS      (011100:00          :00         :0:010:XC  :XB   :XA   :000111)
Q16SAT      (011100:00          :00         :0:110:XC  :XB   :XA   :000111)
D16AVG      (011100:00          :00         :0:010:XC  :XB   :XA   :000110)
D16AVGR     (011100:00          :00         :0:011:XC  :XB   :XA   :000110)
D16MAX      (011100:00          :00         :0:010:XC  :XB   :XA   :000011)
D16MIN      (011100:00          :00         :0:011:XC  :XB   :XA   :000011)
Q16SLL      (011100:SA                      :XD   :XC  :XB   :0000 :110100)
Q16SLR      (011100:SA                      :XD   :XC  :XB   :0000 :110101)
Q16SAR      (011100:SA                      :XD   :XC  :XB   :0000 :110111)
Q16SLLV     (011100:RS                        :100:XC  :XB   :XA   :110110)
Q16SLRV     (011100:RS                        :101:XC  :XB   :XA   :110110)
Q16SARV     (011100:RS                        :111:XC  :XB   :XA   :110110)
S32SFL      (011100:SFL_PTN     :00         :XD   :XC  :XB   :XA   :111101)
D32ADD      (011100:AA_AS_SA_SS :00         :XD   :XC  :XB   :XA   :011000)
D32ACC      (011100:AA_AS_SA_SS :00         :XD   :XC  :XB   :XA   :011001)
S32CPS      (011100:00          :00         :0:000:XC  :XB   :XA   :000111)
S32MAX      (011100:00          :00         :0:000:XC  :XB   :XA   :000011)
S32MIN      (011100:00          :00         :0:001:XC  :XB   :XA   :000011)
D32SLL      (011100:SA                      :XD   :XC  :XB   :XA   :110000)
D32SLR      (011100:SA                      :XD   :XC  :XB   :XA   :110001)
D32SAR      (011100:SA                      :XD   :XC  :XB   :XA   :110011)
D32SARL     (011100:SA                      :0:000:XC  :XB   :XA   :110010)
D32SARV     (011100:RS                        :011:XC  :XB   :0000 :110110)
D32SARW     (011100:RS                        :000:XC  :XB   :XA   :100111)
D32SLLV     (011100:RS                        :000:XC  :XB   :0000 :110110)
D32SLRV     (011100:RS                        :001:XC  :XB   :0000 :110110)
S32ALN      (011100:RS                        :001:XC  :XB   :XA   :100111)
S32M2I      (011100:00          :00         :0:RT   :00:000:XA_XR16:101110)
S32I2M      (011100:00          :00         :0:RT   :00:000:XA_XR16:101111)
S32LDD      (011100:RS                        :0:ADDR10      :XA   :010000)
S32STD      (011100:RS                        :0:ADDR10      :XA   :010001)
S32LDI      (011100:RS                        :0:ADDR10      :XA   :010100)
S32SDI      (011100:RS                        :0:ADDR10      :XA   :010101)
S32LDDV     (011100:RS                        :RT   :SS:000  :XA   :010010)
S32STDV     (011100:RS                        :RT   :SS:000  :XA   :010011)
S32LDIV     (011100:RS                        :RT   :SS:000  :XA   :010110)
S32SDIV     (011100:RS                        :RT   :SS:000  :XA   :010111)
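
To show how a row of the list turns into a 32-bit word, here is a minimal Python sketch that packs the Q8MUL row field by field, most significant field first. The helper names `pack` and `encode_q8mul` are invented for this example and the register numbers are arbitrary; it is not how mxu_as itself is written.

```python
def pack(fields):
    """Concatenate (value, width) pairs, most significant field first."""
    word = 0
    total = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
        total += width
    if total != 32:
        raise ValueError(f"fields add up to {total} bits, expected 32")
    return word

SPECIAL2 = 0b011100

def encode_q8mul(xa, xb, xc, xd):
    # Layout taken from the Q8MUL row: 011100:00:00:XD:XC:XB:XA:111000
    return pack([
        (SPECIAL2, 6), (0, 2), (0, 2),
        (xd, 4), (xc, 4), (xb, 4), (xa, 4),
        (0b111000, 6),
    ])

word = encode_q8mul(xa=1, xb=2, xc=3, xd=4)
print(f"0x{word:08X}")  # 0x7010C878
print(f"{word:032b}")   # 01110000000100001100100001111000
```

The width check in `pack` is also a cheap way to spot rows of the list whose fields do not add up to 32 bits.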

Edited by hlide, 29 October 2009 - 03:52 PM.


#2 Exophase

Exophase

    Exophase is bad. Nothing good will ever come of him.

  • GP Guru
  • 5464 posts
  • Location:Cleveland OH

Posted 29 October 2009 - 03:29 PM

Maybe this will be useful to someone:

http://pastebin.com/...hp?dl=f442c9e6e

I'm sure it's taken from somewhere but I don't know where.

Edited by Exophase, 29 October 2009 - 03:32 PM.


#3 hlide

Posted 29 October 2009 - 03:44 PM

Maybe this will be useful to someone:

http://pastebin.com/...hp?dl=f442c9e6e

I'm sure it's taken from somewhere but I don't know where.

What I find interesting is that they've added not only SIMD instructions but a lot of extensions to improve the rather poor memory address modes MIPS has. The SIMD instructions themselves appear to be 32-bit wide only so not extremely useful outside of a few applications.


Indeed, most of the SIMD instructions operate on 8 x 8-bit, 4 x 16-bit or 2 x 32-bit values, using the two registers XD and XA together as input/output.

Regarding the pastebin: I already have jz_mxu.h, but the C version is incomplete (only 34 instructions have a C counterpart). So I guess some reverse-engineering is necessary to find out what the other 112 instructions do.
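
To make the lane counts concrete, here is a small Python sketch. It assumes each media register is 32 bits wide, as mentioned in the quote, so one register holds 4 x 8-bit, 2 x 16-bit or 1 x 32-bit lanes and the XD/XA pair doubles that. The helper `split_lanes` and its lowest-lane-first ordering are choices made for the example, not something taken from the hardware documentation.

```python
def split_lanes(reg, lane_bits):
    """Split a 32-bit register value into unsigned lanes, lowest lane first."""
    if lane_bits not in (8, 16, 32):
        raise ValueError("lane width must be 8, 16 or 32")
    mask = (1 << lane_bits) - 1
    return [(reg >> shift) & mask for shift in range(0, 32, lane_bits)]

xa = 0x11223344
xd = 0xAABBCCDD

print([hex(v) for v in split_lanes(xa, 8)])   # ['0x44', '0x33', '0x22', '0x11']
print([hex(v) for v in split_lanes(xa, 16)])  # ['0x3344', '0x1122']

# Number of lanes available across the XD/XA pair for each element width.
pair_lanes = [len(split_lanes(xa, w)) + len(split_lanes(xd, w)) for w in (8, 16, 32)]
print(pair_lanes)  # [8, 4, 2]
```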

Edited by hlide, 29 October 2009 - 03:58 PM.


#4 Exophase

Posted 29 October 2009 - 03:59 PM

Indeed, most of the SIMD instructions operate on 8 x 8-bit, 4 x 16-bit or 2 x 32-bit values, using the two registers XD and XA together as input/output.


Damn, you're a hawk, responding before my edit went through. Anyway, there are some 2 x 32-bit ones, so I retracted what I said; also, the extended address modes appear to apply only to loads and stores on the new register set. Of course, not everything can be made clear from your information and that C source alone.

#5 slaanesh

slaanesh

    Mega GP Mania

  • GP Guru
  • 1918 posts
  • Gender:Male
  • Location:Melbourne, Australia
  • Interests:GP32, GP2X, Zodiac, PSP, Dingoo, Pandora.

Posted 30 October 2009 - 12:22 AM

How do we use these extra instructions in GCC, for example? Does GCC even support using these extensions?

Guessing from what you've said, I imagine there is no datasheet available.

Very curious as I'd like to make use of some of these if appropriate.

#6 hlide

Posted 30 October 2009 - 01:45 AM

How do we use these extra instructions in GCC, for example? Does GCC even support using these extensions?

Guessing from what you've said, I imagine there is no datasheet available.

Very curious as I'd like to make use of some of these if appropriate.


Nope, neither gcc nor as is aware of them. They use a hack: the AWK script mxu_as is run on a .s file as a preprocessor and transforms each of those instructions into a .word 0bXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.

Of course, we could imagine adding them to binutils and gcc. I think the minimum would be to expose them through __asm, define those media registers as v8qi, v4hi, v2hi and v2si types, and let gcc allocate them for us. Intrinsics are a little more complicated, I think.
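
The preprocessing idea can be sketched in Python for a single instruction. This is not the mxu_as script, only an illustration of the same trick: the function name `preprocess_line`, the regular expression, and the assumption that the operands are written in the order XA, XB, XC, XD, pattern are all mine; the bit layout is the D32ADD row from the first post.

```python
import re

PATTERNS = {"AA": 0b00, "AS": 0b01, "SA": 0b10, "SS": 0b11}

# D32ADD <XA>, <XB>, <XC>, <XD>, <pattern>  (operand order is an assumption)
D32ADD_RE = re.compile(
    r"^\s*D32ADD\s+XR(\d+)\s*,\s*XR(\d+)\s*,\s*XR(\d+)\s*,\s*XR(\d+)"
    r"\s*,\s*(AA|AS|SA|SS)\s*$",
    re.IGNORECASE,
)

def preprocess_line(line):
    """Rewrite a D32ADD line as a .word directive; pass other lines through."""
    m = D32ADD_RE.match(line)
    if m is None:
        return line
    xa, xb, xc, xd = (int(g) for g in m.group(1, 2, 3, 4))
    if max(xa, xb, xc, xd) > 15:
        raise ValueError("the XA..XD fields only encode XR0..XR15")
    ptn = PATTERNS[m.group(5).upper()]
    # Layout: 011100:AA_AS_SA_SS:00:XD:XC:XB:XA:011000
    word = ((0b011100 << 26) | (ptn << 24) | (xd << 18) | (xc << 14)
            | (xb << 10) | (xa << 6) | 0b011000)
    return f"\t.word 0b{word:032b}"

print(preprocess_line("D32ADD XR2, XR0, XR1, XR0, SS"))
# prints a tab followed by: .word 0b01110011000000000100000010011000
print(preprocess_line("addu $2, $3, $4"))  # unchanged
```

Lines that are not recognised are passed through untouched, so the output can still be fed to the normal assembler.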

#7 hlide

Posted 30 October 2009 - 09:54 AM

Like the standard MIPS general purpose registers, the media registers include a special register, XR0, hardwired to the value 0. So it should be possible to do scalar operations: for example, "D32ADD XR2, XR0, XR1, XR0, SS" would be equivalent to "XR2 = -XR1". Likewise, if you need to work with 3D integer vectors, the fourth output register can be XR0.
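
A tiny Python model makes the XR0 trick easier to follow. It rests on assumptions: D32ADD is taken to compute XA = XB +/- XC and XD = XB +/- XC, with the first pattern letter driving XA and the second driving XD (by analogy with the D32ACC example later in this post), and writes to XR0 are simply dropped. The function names are invented for the sketch.

```python
MASK32 = 0xFFFFFFFF

def to_signed(v):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return v - (1 << 32) if v & 0x80000000 else v

def d32add(regs, xa, xb, xc, xd, pattern):
    """Assumed model: XA = XB +/- XC and XD = XB +/- XC, chosen by pattern."""
    b, c = regs[xb], regs[xc]
    results = []
    for op in pattern:          # first letter drives XA, second drives XD
        results.append((b + c if op == "A" else b - c) & MASK32)
    for dest, value in zip((xa, xd), results):
        if dest != 0:           # XR0 stays hardwired to zero
            regs[dest] = value

regs = [0] * 16
regs[1] = 5
d32add(regs, xa=2, xb=0, xc=1, xd=0, pattern="SS")
print(to_signed(regs[2]))  # -5
print(regs[0])             # 0
```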

I was wondering about two things:

- XR16 is only accessible through S32M2I and S32I2M, that is, you need a GPR to access the content of XR16. What is the purpose of that register?
- Some instructions like MAD and ACC accumulate the result of an operation into the output registers. For instance, "D32ACC XR3, XR1, XR2, XR4, AS" means "XR3 += XR1 + XR2; XR4 += XR1 - XR2". What happens if we use the same register for both outputs? Something like "XR3 += XR1 + XR2; XR3 += XR1 - XR2" is surely impossible, as both updates should be done in parallel.
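
The second question can be illustrated with a Python model of D32ACC as described above. It is only a sketch: both updates are computed from the register values read beforehand, and when the two outputs are the same register the model arbitrarily lets the XD write win. What the real hardware does in that case is exactly the open question; nothing here answers it.

```python
MASK32 = 0xFFFFFFFF

def d32acc(regs, xa, xb, xc, xd, pattern):
    """Model of the description above: XA += XB +/- XC; XD += XB +/- XC."""
    b, c = regs[xb], regs[xc]
    first = b + c if pattern[0] == "A" else b - c
    second = b + c if pattern[1] == "A" else b - c
    # Both sums are computed from the values read before any write.
    new_a = (regs[xa] + first) & MASK32
    new_d = (regs[xd] + second) & MASK32
    regs[xa] = new_a
    regs[xd] = new_d   # if xd == xa, this write replaces the previous one

regs = [0] * 16
regs[1], regs[2], regs[3], regs[4] = 10, 3, 100, 200
d32acc(regs, xa=3, xb=1, xc=2, xd=4, pattern="AS")
print(regs[3], regs[4])  # 113 207

# Same register for both outputs: the two accumulations cannot both land.
regs = [0] * 16
regs[1], regs[2], regs[3] = 10, 3, 100
d32acc(regs, xa=3, xb=1, xc=2, xd=3, pattern="AS")
print(regs[3])  # 107 in this model, not 100 + 13 + 7 = 120
```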

#8 vimrc

vimrc

    Newbie

  • Member
  • 1 posts

Posted 30 October 2009 - 04:02 PM

Have you seen this file, which documents the 60 SIMD instructions of the Jz47xx MIPS core in its comments?
http://gitorious.org/~jz4740/linux_jz4740/jz_mxu_doc/blobs/master/jz_mxu_doc.c
The comments are in Chinese, but they can be translated with Google.

Edited by vimrc, 31 October 2009 - 01:00 PM.