[Novalug] Programming hardware

Dennis Zarger denniszarger@gmail.com
Wed Aug 26 10:23:07 EDT 2020


On 8/25/20 8:11 PM, Peter Larsen via Novalug wrote:
> This is going to be a slight tangent and a bit deeper run based on
> having to say this:
>
> On 8/24/20 12:06 PM, Peter Larsen wrote:
>> The "problem" with parallel comes when you have more than one line of
>> code. Trust me when I say, that saying you know it's parallel and then
> coding based on it, are two very different things - particularly when
>> you've spent decades using sequential statements.
You hit the nail on the head here--my monkey brain is still parsing most 
of this code as sequential and thinking "oh, yeah that works." I mention 
that a bit more below.
> If you have no interest in programming IC/FPGAs - this thread and this
> post aren't for you. "These aren't the drones you're looking for".

That's not it at all, in fact I'll lead with some questions: where does 
a humble guy like myself start tinkering with this? I saw you had an 
Arty A7, but that seems like it would be a bit overkill for me, since 
I'm just trying to wrap my head around the different issues that arise 
with parallel processing. Is there another dev board you might recommend? 
What is the workflow like after you've written your program? What 
software are you using to push these schematics?

> New subject: Double Dabble and the magic of parallel event programming
> You've all seen the digital number that those 7 segments can display.
>
> To display a symbol we take a numeric value and convert it into a
> combination of these LED lines to turn on. The numeric value is held as
> BCD - Binary-coded Decimal: we use 4 bits to represent a number from 0
> to 9 (or 0-F if we do HEX), which we then encode into 7 bit combinations.
> This is easily done:
>
> entity bcd_output is
>      Port ( bcd : in STD_LOGIC_VECTOR (3 downto 0);
>             LED_out : out STD_LOGIC_VECTOR (6 downto 0));
> end bcd_output;
>
> architecture display of bcd_output is
> begin
>      -- Common anode - 0 means the LED turns on
>      LED_out <=
>         "0000001" when ( bcd = "0000" ) else -- 0
>         "1001111" when ( bcd = "0001" ) else -- 1
>         "0010010" when ( bcd = "0010" ) else -- 2
>         "0000110" when ( bcd = "0011" ) else -- 3
>         "1001100" when ( bcd = "0100" ) else -- 4
>         "0100100" when ( bcd = "0101" ) else -- 5
>         "0100000" when ( bcd = "0110" ) else -- 6
>         "0001111" when ( bcd = "0111" ) else -- 7
>         "0000000" when ( bcd = "1000" ) else -- 8
>         "0000100" when ( bcd = "1001" ) else -- 9
>         "1111111"; -- No display otherwise
> end display;
> === EOF ===
Got it.
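
For anyone who reads tables more easily in software, here is a rough
Python rendering of the same lookup. The patterns are copied from the
quoted VHDL; the a..g segment order and the names (PATTERNS,
lit_segments) are assumptions made for this sketch, not part of the
original code.

```python
# Patterns copied from the quoted VHDL table; '0' means the segment is lit
# (common anode). The a..g bit order is an assumption inferred from the
# patterns, not something stated in the original code.
SEGMENTS = "abcdefg"
PATTERNS = {
    0: "0000001", 1: "1001111", 2: "0010010", 3: "0000110", 4: "1001100",
    5: "0100100", 6: "0100000", 7: "0001111", 8: "0000000", 9: "0000100",
}
BLANK = "1111111"  # the "no display otherwise" case


def lit_segments(digit):
    """Return the letters of the segments that are switched on for a digit."""
    pattern = PATTERNS.get(digit, BLANK)
    return "".join(seg for seg, bit in zip(SEGMENTS, pattern) if bit == "0")


print(lit_segments(1))  # bc
print(lit_segments(7))  # abc
```

Anything outside 0-9 falls through to the blank pattern, which mirrors
the final "else" branch in the VHDL.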
> https://en.wikipedia.org/wiki/Double_dabble
For the purposes of my message, I'll leave the double-dabble-grok for a 
later time. Binary algos are not exactly something I deal with regularly 
so it would take me a bit longer than I have right now to fully 
understand the reasoning behind what I'm sure is an ingenious mechanic.
> This code is definitely not my invention but I do like how it forces you
> to think "in parallel":
>
> library ieee;
> use ieee.std_logic_1164.all;
> use ieee.std_logic_unsigned.all;
>
> entity binary_bcd is
>      generic(N: positive := 16);
>      port(
>          clk, reset: in std_logic;
>          binary_in: in std_logic_vector(N-1 downto 0);
>          bcd0, bcd1, bcd2, bcd3, bcd4: out std_logic_vector(3 downto 0)
>      );
> end binary_bcd ;
>
> === temporary EOF ====
>
> This is declaring the interface. There's a clock, a reset line, the
> number to be displayed (in binary of course) and 5 output BCDs.
Just a bit strange, but understandable--we set up our I/O here.
> === continue code  ====
>
> architecture behaviour of binary_bcd is
>      type states is (start, shift, done);
>      signal state, state_next: states;
>
>      signal binary, binary_next: std_logic_vector(N-1 downto 0);
>      signal bcds, bcds_reg, bcds_next: std_logic_vector(19 downto 0);
>      -- output register keeps output constant during conversion
>      signal bcds_out_reg, bcds_out_reg_next: std_logic_vector(19 downto 0);
>      -- need to keep track of shifts
>      signal shift_counter, shift_counter_next: natural range 0 to N;
>
> === temporary EOF ====
Setting up variables--so far this isn't far off a "normal" program.
> === continue code - the main routine ===
>
> begin
>      process(clk, reset)
>      begin
>          if reset = '1' then
>              binary <= (others => '0');
>              bcds <= (others => '0');
>              state <= start;
>              bcds_out_reg <= (others => '0');
>              shift_counter <= 0;
>          elsif falling_edge(clk) then
>              binary <= binary_next;
>              bcds <= bcds_next;
>              state <= state_next;
>              bcds_out_reg <= bcds_out_reg_next;
>              shift_counter <= shift_counter_next;
>          end if;
>      end process;
>
> === temporary EOF ===
>
> This is the straightforward part. If the reset signal is set, reset all
> values. Otherwise, on every falling clock edge, load the registers so
> the value of each _next signal goes into the real signal. This is
> EXTREMELY IMPORTANT when trying to understand the next section.
Like you said--EZ.
> === continue code - the fun part ===
>
>      convert:
>      process(state, binary, binary_in, bcds, bcds_reg, shift_counter)
>      begin
>          state_next <= state;
>          bcds_next <= bcds;
>          binary_next <= binary;
>          shift_counter_next <= shift_counter;
>
>          case state is
>              when start =>
>                  state_next <= shift;
>                  binary_next <= binary_in;
>                  bcds_next <= (others => '0');
>                  shift_counter_next <= 0;
>              when shift =>
>                  if shift_counter = N then
>                      state_next <= done;
>                  else
>                      binary_next <= binary(N-2 downto 0) & 'L';
>                      bcds_next <= bcds_reg(18 downto 0) & binary(N-1);
>                      shift_counter_next <= shift_counter + 1;
>                  end if;
>              when done =>
>                  state_next <= start;
>          end case;
>      end process;
I feel like I'm perceiving this incorrectly; even this part reads like 
it would work if "executed" sequentially. I think that's why I would 
need to get my hands dirty and actually stumble over this misconception 
a few times before I fully understood what you mean.
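
A toy Python model of where the sequential reading stops working may
help here. This is only a sketch of the "compute every _next value,
then update everything on the clock" idea, not real VHDL semantics,
and the function names are made up for illustration.

```python
def step_sequential(a, b):
    # Read as ordinary software: the second line sees the NEW value of a.
    a = b
    b = a
    return a, b


def step_clocked(a, b):
    # Read as hardware: both right-hand sides are sampled from the current
    # values first (the "_next" signals), then both registers change
    # together on the clock edge.
    a_next = b
    b_next = a
    return a_next, b_next


print(step_sequential(1, 2))  # (2, 2)
print(step_clocked(1, 2))     # (2, 1)
```

The quoted state machine happens to read fine either way because each
branch only writes _next signals and only reads current ones; the
difference shows up as soon as two assignments depend on each other.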
>      bcds_reg(19 downto 16) <= bcds(19 downto 16) + 3 when bcds(19 downto 16) > 4 else
>                                bcds(19 downto 16);
>      bcds_reg(15 downto 12) <= bcds(15 downto 12) + 3 when bcds(15 downto 12) > 4 else
>                                bcds(15 downto 12);
>      bcds_reg(11 downto 8) <= bcds(11 downto 8) + 3 when bcds(11 downto 8) > 4 else
>                               bcds(11 downto 8);
>      bcds_reg(7 downto 4) <= bcds(7 downto 4) + 3 when bcds(7 downto 4) > 4 else
>                              bcds(7 downto 4);
>      bcds_reg(3 downto 0) <= bcds(3 downto 0) + 3 when bcds(3 downto 0) > 4 else
>                              bcds(3 downto 0);
>
>      bcds_out_reg_next <= bcds when state = done else
>                           bcds_out_reg;
>
>      bcd4 <= bcds_out_reg(19 downto 16);
>      bcd3 <= bcds_out_reg(15 downto 12);
>      bcd2 <= bcds_out_reg(11 downto 8);
>      bcd1 <= bcds_out_reg(7 downto 4);
>      bcd0 <= bcds_out_reg(3 downto 0);
> end behaviour;
>
> ==== EOF ====
This bit (+3 when >4) I imagine is down to the double-dabble. Otherwise 
I feel like I fully understand this--which I /know/ is wrong based on 
all you've said.
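
For reference, the sequential version of double dabble (the loop-based
form the Wikipedia page describes) can be sketched in Python roughly as
below. The function name and the bits/digits parameters are
assumptions for this sketch; the defaults match the 16-bit input and
5 BCD outputs of the quoted entity.

```python
def double_dabble(value, bits=16, digits=5):
    """Convert an unsigned integer to BCD digits, most significant first."""
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given number of bits")
    bcd = [0] * digits  # bcd[0] is the most significant digit
    for i in range(bits - 1, -1, -1):
        # "Dabble": any digit greater than 4 gets +3 before the shift, so
        # that doubling it carries into the next decimal digit correctly.
        bcd = [d + 3 if d > 4 else d for d in bcd]
        # "Double": shift the whole BCD register left by one bit, pulling
        # in the next bit of the binary input at the bottom.
        carry = (value >> i) & 1
        for j in range(digits - 1, -1, -1):
            shifted = (bcd[j] << 1) | carry
            carry = shifted >> 4
            bcd[j] = shifted & 0xF
    return bcd


print(double_dabble(243))    # [0, 0, 2, 4, 3]
print(double_dabble(65535))  # [6, 5, 5, 3, 5]
```

The "+3 when > 4" lines in the VHDL correspond to the list
comprehension here, except that in hardware all five digits are
corrected by dedicated wiring rather than by a loop.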
> Ok - this is a lot. First try to follow it - if you read it sequentially it
> will absolutely not make any sense.  And I left the whole section intact
> instead of breaking it up - because it all happens at once!  This means
> the part outside the process changes when the process changes a value it
> depends on - without the process having to "exit". <= means "electric
> connection" so it just is.
>
> For instance:
> bcds_out_reg_next <= bcds when state = done else
>                       bcds_out_reg;
>
> The bcds_out_reg_next is automatically changed when the state is changed
> to done or bcds_out_reg changes. It's sorta like declaring a rule - this
> happens; you don't define when and where. So when you see references to
> those two variables in the process part just keep in mind, that this
> part executes too when those values change.
I see here a pretty clear example of parallelism, and it's clever to 
work it to your advantage here.
> Inside the state machine, all operations are done on _next variables.
> Remember, at the end of the clock those values are copied to the correct
> values. We cannot have electrical rules that over-rule one another.
> Well, we can in a process portion - as you can see just following the
> state_next variable. If you try in the behavior section you get an
> error. Also, if we change "state" we have other stuff going on right
> away - this way, that doesn't happen until we get to the clock. So it
> synchronizes all changes to the clock. That's pretty nifty and something
> an old procedural programmer like me finds quite powerful.
>
> In the case statement (the implementation of the state machine) you see
> the initialization and the processing. And the fact that it looked like
> it was implemented by 3 simple lines was what attracted me to this.
> Until I realized I had totally forgotten about the many parallel
> statements below.

Which I'm failing to fully understand due to a stubborn desire to forego 
fully understanding the algo at hand.
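
One way to poke at this without hardware is a clock-by-clock software
model of the quoted design. The Python below is a hedged sketch: the
dictionary keys mirror the VHDL signal names, but clock_tick, add3 and
convert are hypothetical helpers, and a real simulator handles timing
in far more detail than this does.

```python
N = 16  # width of binary_in, matching the generic in the quoted entity


def add3(bcds):
    """Model of the five concurrent bcds_reg assignments: each 4-bit digit
    of the 20-bit value gets +3 if it is greater than 4."""
    out = 0
    for k in range(5):
        digit = (bcds >> (4 * k)) & 0xF
        if digit > 4:
            digit = (digit + 3) & 0xF
        out |= digit << (4 * k)
    return out


def clock_tick(regs, binary_in):
    """One clock edge: compute every *_next value from the CURRENT
    registers, then hand them all back at once as the new registers."""
    state, binary, bcds = regs["state"], regs["binary"], regs["bcds"]
    count = regs["shift_counter"]
    bcds_reg = add3(bcds)  # always "wired up", never explicitly called in VHDL
    nxt = dict(regs)       # default: every register holds its value
    if state == "done":
        nxt["bcds_out_reg"] = bcds
    if state == "start":
        nxt.update(state="shift", binary=binary_in, bcds=0, shift_counter=0)
    elif state == "shift":
        if count == N:
            nxt["state"] = "done"
        else:
            nxt["binary"] = (binary << 1) & ((1 << N) - 1)
            nxt["bcds"] = ((bcds_reg << 1) & 0xFFFFF) | (binary >> (N - 1))
            nxt["shift_counter"] = count + 1
    else:  # done
        nxt["state"] = "start"
    return nxt


def convert(value):
    """Run from reset until the output register holds the result."""
    regs = dict(state="start", binary=0, bcds=0,
                shift_counter=0, bcds_out_reg=0)
    # 1 tick to load, N ticks to shift, 1 to reach done, 1 to latch output.
    for _ in range(N + 3):
        regs = clock_tick(regs, value)
    out = regs["bcds_out_reg"]
    return [(out >> (4 * k)) & 0xF for k in (4, 3, 2, 1, 0)]  # bcd4..bcd0


print(convert(243))  # [0, 0, 2, 4, 3]
```

Note that the shift branch never mentions the +3 correction; it just
reads bcds_reg, which in this model is recomputed from bcds on every
tick, much as the concurrent statements keep it up to date in the
hardware.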

> So notice in the Wiki page that they use a sequential loop and talk
> about when to test a BCD value, how many they have etc.  All of that is
> not needed here. It just happens, the code doesn't have to explicitly
> call out each BCD segment - they're pre-wired to just work.  The state
> machine simply shifts the value of the BCDs and the binary number as
> described in the algorithm, and the +3 stuff happens elsewhere. That's
> just brilliant in my opinion - it not only simplifies code but I do
> believe it eliminates a lot of room for errors. There are other areas
> with plenty of room for errors, but this isn't one of them.
>
> So when you grab the VHDL simulator and follow along "step by step" it
> jumps all over the place. It's hypnotic how the code doesn't flow from
> the top to bottom but it's all over the place. The debugger pretends
> each line is executed in isolation - I'm pretty sure if it showed it all
> happen at once, it wouldn't be a good debugger.
Haha--yeah, "it happened" is not exactly good feedback. You mention a 
VHDL simulator; is that open-source or at least free-as-in-beer? I 
really wouldn't mind toying with this a bit.
> This is an example of programming in parallel. I've only seen something
> remotely close to this with traditional programming languages when I was
> using LISP/PROLOG and Drools. But since they are implemented using
> traditional procedural languages there is a sequence - it's just not
> supposed to matter. Here, there is no sequence at all.
LISP is amazing and I also hate it. Haven't done much with it except 
tweak my emacs, and even then I tend to use packages instead of 
implementing "novel" features myself.
> Hope you're still awake - I had a eureka moment and wanted to share!
I was still awake, I was just preoccupied :(. Thank you for sharing. I 
feel sort of bad because you wrote this book and most of what I have 
to say is "oh, cool!" But it's really excellent learning material, 
Peter. I look forward to finding some way to toy with this language.


