CLIDR = 0a200023
CCSIDR, L1-D = 700fe01a
supports Write back
supports Read allocate
supports Write allocate
128 sets as 4 way
line size = 16 words (64 bytes)
CCSIDR, L1-I = 203fe009
supports Read allocate
512 sets as 2 way
line size = 8 words (32 bytes)
CCSIDR, L2-D = 707fe03a
supports Write back
supports Read allocate
supports Write allocate
1024 sets as 8 way
line size = 16 words (64 bytes)
CCSIDR, L2-I = 707fe03a
supports Write back
supports Read allocate
supports Write allocate
1024 sets as 8 way
line size = 16 words (64 bytes)

Note that we get the same result for L2 whether we ask for I or D, which makes sense, I suppose, given that it is a unified cache. Also notice:
for L1 D: 128 sets * 4 way * 64 byte lines = 32768 bytes
for L1 I: 512 sets * 2 way * 32 byte lines = 32768 bytes
for L2: 1024 sets * 8 way * 64 byte lines = 524288 bytes

This is all as advertised. Code I had written before got a 64 byte line size, but that could be called coincidence or luck: I never set the CSSELR register, so the CCSIDR was describing whichever cache happened to be selected.
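The arithmetic above can be checked on a host machine. What follows is a minimal Python sketch, not the code that produced the output above. The function name `decode_ccsidr` and the dictionary keys are my own inventions; the field positions are the ARMv7 CCSIDR layout as I understand it (LineSize in bits [2:0], Associativity in [12:3], NumSets in [27:13], and the four attribute flags in [31:28]).

```python
def decode_ccsidr(value):
    """Split a 32-bit ARMv7 CCSIDR value into its fields (hypothetical helper)."""
    line_words = 1 << ((value & 0x7) + 2)      # LineSize holds log2(words) - 2
    ways = ((value >> 3) & 0x3FF) + 1          # Associativity holds ways - 1
    sets = ((value >> 13) & 0x7FFF) + 1        # NumSets holds sets - 1
    line_bytes = line_words * 4
    return {
        "write_through": bool(value & (1 << 31)),
        "write_back": bool(value & (1 << 30)),
        "read_allocate": bool(value & (1 << 29)),
        "write_allocate": bool(value & (1 << 28)),
        "sets": sets,
        "ways": ways,
        "line_bytes": line_bytes,
        "size_bytes": sets * ways * line_bytes,
    }


if __name__ == "__main__":
    # Raw values copied from the H3 output above.
    for name, raw in [("L1-D", 0x700FE01A), ("L1-I", 0x203FE009), ("L2", 0x707FE03A)]:
        c = decode_ccsidr(raw)
        print(f"{name}: {c['sets']} sets as {c['ways']} way, "
              f"{c['line_bytes']} byte lines, {c['size_bytes']} bytes total")
```

Feeding in the three raw values from the H3 should reproduce the set, way and line size figures listed above.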
Now look at the CTR register:
CTR = 84448003
CTR - minimum line in I cache = 32
CTR - minimum line in D cache = 64
CTR - CWG = 64
CTR - ERG = 64

This register gives line sizes, each encoded as log2 of the number of words (I display the values in bytes above). The manual "strongly recommends" using the DMIN and IMIN values (the DminLine and IminLine fields) as the step size for loops in cache maintenance operations.
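As a rough illustration of how these fields might be used, here is a Python sketch that decodes the CTR value above and then steps over an address range one minimum-sized line at a time, the way a clean-by-address loop would. The names `decode_ctr` and `line_addresses` are hypothetical, and the sketch does not treat a field value of 0 specially.

```python
def decode_ctr(value):
    """Decode the line-size fields of an ARMv7 CTR value into bytes (hypothetical helper)."""
    def to_bytes(field):
        return 4 << field                      # field is log2(words); a word is 4 bytes
    return {
        "imin_line_bytes": to_bytes(value & 0xF),          # IminLine, bits [3:0]
        "dmin_line_bytes": to_bytes((value >> 16) & 0xF),  # DminLine, bits [19:16]
        "erg_bytes": to_bytes((value >> 20) & 0xF),        # ERG, bits [23:20]
        "cwg_bytes": to_bytes((value >> 24) & 0xF),        # CWG, bits [27:24]
    }


def line_addresses(start, length, line_bytes):
    """Yield the line-aligned addresses covering [start, start + length)."""
    addr = start & ~(line_bytes - 1)           # round down to a line boundary
    end = start + length
    while addr < end:
        yield addr
        addr += line_bytes


if __name__ == "__main__":
    ctr = decode_ctr(0x84448003)
    print(ctr)
    # Addresses a D-cache maintenance loop would visit for a 0x90 byte buffer.
    for a in line_addresses(0x1010, 0x90, ctr["dmin_line_bytes"]):
        print(hex(a))
```

Rounding the start address down matters: a buffer that begins mid-line still needs its first line visited.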
And there is the CLIDR register:
CLIDR = 0a200023
CLIDR - LoUU = 1
CLIDR - Loc = 2
CLIDR - LoUIS = 1
CLIDR - L1 type = 3 I/D
CLIDR - L2 type = 4 unified

It gives the "type" for up to 7 levels of cache, with the results for the Cortex-A7 in the Allwinner H3 shown. This register allows you to discover how many levels of cache your device has. You can then interrogate each level by selecting it in the CSSELR register and reading the CCSIDR register, as shown above.
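The discovery step can be sketched in Python as follows. This is only an illustration: `decode_clidr` and `csselr_selectors` are names I made up, and the CSSELR layout assumed here (level minus one in bits [3:1], InD in bit 0 with 1 meaning the instruction cache) is my reading of the ARMv7 manual. The loop stops at the first level of type 0, since the manual says no caches exist beyond it.

```python
CTYPE_NAMES = {1: "I only", 2: "D only", 3: "separate I/D", 4: "unified"}


def decode_clidr(value):
    """Decode an ARMv7 CLIDR value (hypothetical helper)."""
    levels = []
    for n in range(7):
        ctype = (value >> (3 * n)) & 0x7       # Ctype<n+1>, three bits per level
        if ctype == 0:
            break                              # no cache here or further out
        levels.append((n + 1, ctype, CTYPE_NAMES.get(ctype, "reserved")))
    return {
        "louis": (value >> 21) & 0x7,
        "loc": (value >> 24) & 0x7,
        "louu": (value >> 27) & 0x7,
        "levels": levels,
    }


def csselr_selectors(clidr_value):
    """List the CSSELR values needed to read a CCSIDR for every cache present."""
    selectors = []
    for level, ctype, _name in decode_clidr(clidr_value)["levels"]:
        base = (level - 1) << 1                # level field lives in bits [3:1]
        if ctype in (2, 3, 4):
            selectors.append(base)             # InD = 0: data or unified cache
        if ctype in (1, 3):
            selectors.append(base | 1)         # InD = 1: instruction cache
    return selectors


if __name__ == "__main__":
    print(decode_clidr(0x0A200023))
    print(csselr_selectors(0x0A200023))
```

On real hardware each selector would be written to CSSELR, followed by an isb, before reading CCSIDR.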
I am not yet working with any devices with more than 2 levels of cache. Someday I might get my hands on the Rockchip RK3588. It is an 8 core device set up with 3 levels:
L1 = 4 cores with 64K/64K (I/D) and 4 cores with 32K/32K
L2 = 2M and 512K
L3 = 3M
Pages B2-1286 to 1287 give example code for cache maintenance (cleaning in this case).
Line 3 shifts by 23 rather than 24 when extracting the LoC value. That looks like a bug/typo at first, but it is deliberate: the result is LoC * 2, which lines up with r10, the cache number held in CSSELR format (level in bits [3:1]) and stepped by 2 each pass.
Note that LSR is "logical shift right" and LSL is "logical shift left".
Also note the "isb" between setting the CSSELR and reading the CCSIDR.
Here is my (possibly error ridden) translation into pseudocode.

r0 = CLIDR (mrc)
r3 = (r0 & 0x07000000) >> 23        /* LoC * 2 */
if ( r3 == 0 ) finished
r10 = 0                             /* cache number * 2 */
for ( ;; ) {
    r2 = r10 + (r10 >> 1)           /* 3 * cache level */
    r1 = r0 >> r2
    r1 &= 7
    /* skip if no cache or I cache only */
    if ( r1 >= 2 ) {
        CSSELR = r10
        isb()
        r1 = CCSIDR
        r2 = r1 & 0x7               /* line length */
        r2 += 4
        r4 = (r1>>3) & 0x3ff        /* max way number */
        r5 = clz(r4)
        r9 = r4                     /* way number */
        for ( ;; ) {
            r7 = (r1>>13) & 0x7fff  /* index */
            for ( ;; ) {
                r11 = r10 | r9 << r5    /* way number and cache number */
                r11 |= r7 << r2         /* factor in index number */
                DCCSW = r11             /* clean by set/way */
                r7--                    /* decrement index */
                if ( r7 < 0 ) break     /* index 0 must be cleaned too */
            }
            r9--                    /* decrement way number */
            if ( r9 < 0 ) break     /* way 0 must be cleaned too */
        }
    }
    r10 += 2
    if ( r10 >= r3 ) break
}

Note the "clz" instruction. This is an ARM instruction that counts leading zeros in a word. The heart of all this is DCCSW. This is "Data Cache Clean by Set/Way" and is one of the cache maintenance operations, performed by writing an operand to a coprocessor register. A number of these operations share a common data format for that operand:
S = log2 of number of sets
L = log2 of the line length
B = L + S

All this is described here:
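To make the operand format concrete, the Python sketch below packs a way, set and level into one 32-bit word, then walks a whole cache in the same order as the pseudocode. It rests on assumptions from my reading of the manual: the way number sits in the top A bits, where A is log2 of the associativity rounded up; the set number sits in bits [B-1:L]; the level sits in bits [3:1]; and L counts bytes. All function names are hypothetical.

```python
def ceil_log2(n):
    """log2 of n, rounded up, for n >= 1."""
    return (n - 1).bit_length()


def clz32(x):
    """Count leading zeros in a 32-bit word, like the ARM clz instruction."""
    return 32 - x.bit_length()


def setway_operand(level, way, set_index, ways, line_bytes):
    """Build a set/way operand; level is 0 for L1, 1 for L2."""
    a = ceil_log2(ways)
    l = ceil_log2(line_bytes)
    word = (set_index << l) | (level << 1)
    if a > 0:                                  # a direct-mapped cache has no way field
        word |= way << (32 - a)
    return word & 0xFFFFFFFF


def walk_set_way(level, ccsidr):
    """Yield every operand for one cache, highest way and set first."""
    line_shift = (ccsidr & 0x7) + 4            # log2 of the line length in bytes
    max_way = (ccsidr >> 3) & 0x3FF
    max_set = (ccsidr >> 13) & 0x7FFF
    way_shift = clz32(max_way)                 # same trick as the clz in the pseudocode
    for way in range(max_way, -1, -1):
        for set_index in range(max_set, -1, -1):
            yield ((way << way_shift) | (set_index << line_shift) | (level << 1)) & 0xFFFFFFFF


if __name__ == "__main__":
    ops = list(walk_set_way(0, 0x700FE01A))    # H3 L1 D cache
    print(len(ops), hex(ops[0]), hex(ops[-1]))
```

For a power-of-two number of ways, clz of the maximum way number equals 32 - A, so the clz shortcut and the formal definition give the same operand.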
Kyu / tom@mmto.org