April 8, 2025

Black Pill boards - F411 USB -- Giving the code the Kondo treatment

I have two other essays on "tidying up the source code". Rather than making this number 3, I decided to use the entertaining title of "Kondo treatment". This comes from this book by a Japanese woman who writes about the virtue of "tidying and decluttering". I have already done a lot of "code tidying", but here are some other things I intend to attack: This is a big enough list for now. Long overly verbose function and variable names just yield useless clutter. I know I am working on the USB OTG project by virtue of the directory I am in -- there is no reason to prefix each and every function name with USB_OTG and such. I believe that every aspect of a function name should communicate something. Clutter is bad and distracts the mind. Some consideration must be given to name collisions. We can't just name a function "init" without worry, but everything that can be static should be, then there is no issue.

The code base we are working with has support for the USB controller in host mode. This is what OTG is all about after all -- the controller can act either as a host or as a device. I have no intention of doing anything at this time with host mode. So host code is just more clutter and in the way. I want to move it to its own files which I can ignore and not compile. The use of "ifdef" sections is an attempt to do this, but just yields clutter and confusion.

Structures within structures

The all important "pdev" structure pointer gets used like this:
pdev->cfg.phy_itface
&pdev->hw->GREGS->GCCFG
Why not get rid of "hw" and "cfg" and just flatten out the structure? The code would be less verbose, which would be an immediate benefit. Would the compiler be able to generate better code? Probably not, but it is worth investigating.
I write this bit of test code:
int
pickle ( USB_OTG_CORE_HANDLE *pdev )
{
    if (pdev->cfg.coreID == 0)
        return 0;
    else
        return 1;
}
Then I look at the generated ARM assembly (I always generate hydra.dump):
08001e5c :
 8001e5c:   7ac0        ldrb    r0, [r0, #11]
 8001e5e:   3800        subs    r0, #0
 8001e60:   bf18        it  ne
 8001e62:   2001        movne   r0, #1
 8001e64:   4770        bx  lr
This code is as good as I can hope for (I do compile with -O), the "coreID" element is accessed by the offset #11 relative to the structure pointer in r0. We do see a curious and interesting instruction "it" that I have never encountered before -- and this looks like thumb code with 16 bit instructions, but those topics are aside from the current question. My conclusion is that flattening the structure will not benefit the compiler, but it will benefit us by yielding more concise code.

Unions to access hardware registers

The bottom line here is that the -O switch on the compiler does amazing things. Consider the following code:
USB_OTG_STS
USB_OTG_DisableGlobalInt(USB_OTG_CORE_HANDLE *pdev)
{
  USB_OTG_STS status = USB_OTG_OK;

  USB_OTG_GAHBCFG_TypeDef  ahbcfg;
  ahbcfg.d32 = 0;
  ahbcfg.b.glblintrmsk = 1; /* Enable interrupts */

  USB_OTG_MODIFY_REG32(&pdev->hw->GREGS->GAHBCFG, ahbcfg.d32, 0);

  return status;
}
I would certainly rewrite this as follows, getting rid of the "status" variable, and using something simple (like "val") rather than the needless clutter of a name like "ahbcfg". We can do far better (see below).
USB_OTG_STS
USB_OTG_DisableGlobalInt(USB_OTG_CORE_HANDLE *pdev)
{
  USB_OTG_GAHBCFG_TypeDef  val;
  val.d32 = 0;
  val.b.glblintrmsk = 1; /* Enable interrupts */

  USB_OTG_MODIFY_REG32(&pdev->hw->GREGS->GAHBCFG, val.d32, 0);

  return USB_OTG_OK
}
But the resulting code is pretty good:
0800204e :
 800204e:   68c3        ldr r3, [r0, #12]	; fetch hw
 8002050:   681a        ldr r2, [r3, #0]	; fetch GREGS
 8002052:   6893        ldr r3, [r2, #8]	; access GAHBCFG
 8002054:   f023 0301   bic.w   r3, r3, #1
 8002058:   6093        str r3, [r2, #8]	; access GAHBCFG
 800205a:   2000        movs    r0, #0
 800205c:   4770        bx  lr
All evidence of the union for GAHBCFG has vanished, and the modify macro has been handled as well as could be imagined. The need to step along from hw to GREGS is due to my own changes where I introduced "hw" as a pointer rather than embedding the structure. I did this in order to avoid having to pull in the entire deviceregs structure which would require code in "library" to include header files in "driver". I may rethink that in some way that avoids (or minimizes) having dependence between "library" and "driver". The idea is that someday we can substitute an entirely different driver without any changes to the library code.

As a final note, I would probably rewrite this routine as:

static void
DisableGlobalInt ( CORE_HANDLE *pdev)
{
  pdev->hw->GREGS->GAHBCFG &= ~GLBLINTRMSK;
}

One last thing -- hardware access macros

Note my code just above. I cleared a bit in a hardware register with one line of C and no macro for hardware access. Some people would have a fit upon seeing this. I have written vast amounts of code for many years in this way, and there is no problem whatsoever. But some people insist on something like this:
dctl.d32 = USB_OTG_READ_REG32(&pdev->hw->DREGS->DCTL);
dctl.b.sftdiscon  = 0
USB_OTG_WRITE_REG32(&pdev->hw->DREGS->DCTL, dctl.d32);
Here we have the needless "accessor macros" along with overly verbose names for them. I could nicely convert this to one line of C as shown above. The only reason to have accessor macros like this is for architectures (like the x86) where there are explicit ioread and iowrite hardware instructions. At least that was true of the x86 in the very old days. Today even on the x86, most hardware registers are memory mapped and all of this clutter and verbosity is pointless, useless, and stupid.

On the ARM it is even more senseless, silly, and ridiculous.

Using uint8_t to be frugal

I understand what is aimed for here, but it is not as useful as you might think. Especially for function arguments and return values. In these cases, you may as well use "int" as entire 32 bit registers will be passed as arguments and function return values get returned in r0.

Conclusion

The things I would like to do to declutter the code would not have significant effects on the quality of the generated code, at least not in the simple cases I have examined.
This is not to say it isn't worth decluttering the code just to make it more understandable and readable. I have learned over many years that writing readable and understandable code is by far the most important thing. In very very rare cases, performance can demand something less than straightforward -- but far less often than people imagine.
Feedback? Questions? Drop me a line!

Tom's Computer Info / tom@mmto.org