[This program computes decimal digits of the transcendental number e (https://oeis.org/A001113). As usual, it's a nonterminating program because it's a nonterminating sequence.

e.b is my monsterpiece; it was on my to-do list for years, then I got the algorithm figured out, and then spent a long time tinkering with individual components. (Mostly with pen and paper.) Naturally it's undergone much further revision since then. I wasn't sure I was going to be able to get it under a kilobyte (which is my usual aim for my programs), but I managed it.

The formula for e we use is "sum of the reciprocals of the factorials of the natural numbers". To store fractions exactly in brainfuck, we'll need to store the numerator and denominator separately.

Let's notice that 1+(1/1)+(1/(1*2))+(1/(1*2*3))...+(1/n!) has a least common denominator of n!. This implies that after we have the series up to 1/n! added up and stored as a single fraction, we can add the next term of the series just by multiplying numerator and denominator by (n+1) and then adding 1 to the numerator. That's easy.

For concision we'll call the numerator and denominator x and y, so our current approximation at any given time is x/y. As we add terms we'll get better and better accuracy. Obviously we can't sum the whole infinite series and then output the infinitely long answer; instead, we'll output digits as soon as we're sure they're correct. Two things to note about this.

1. To get the first digit, we just do integer division of x/y; the quotient from that is the first digit and the remainder is part of the rest of e. To get the second digit we multiply that remainder by 10 (since we're outputting decimal digits) and integer-divide THAT by y, keeping the remainder, and so on. After outputting a digit, we don't need to store it anymore, so we can just store the remainder in place of the previous x. Naturally, we will add future terms of the series to the remainder, and not to the part that we've already output and discarded from memory.

(One thing that thus needs to change is that 1 we keep adding to the numerator in order to add terms to our approximation; it needs to be multiplied by 10 whenever x is, to stay commensurate with it. So effectively, we'll store 10^(number of digits output so far) and add that to the numerator each time we're adding another term. Call it "t" for "tens". We don't actually need to keep track of how many digits we've output so far.)

2. How do we know when a digit is correct? We'll use an upper bound as well as a lower bound, and when they match, up to a given digit, we'll output that digit. Here's an easy upper bound: (x+t)/y. Notice that each term 1/(1*2*...*n) (where n>2) is less than half of the previous term. Comparing with the geometric sequence 1+1/2+1/4...=2, I notice then that each term (starting with the second one) is greater than the sum of all subsequent terms, which means just doubling the last term gets us a loose upper bound for e, but tight enough for our purposes; it tightens as we add more terms to our approximation.

So we're needing to store four natural numbers: x, y, n, and t. We'll store them in binary for ease of calculation, since we won't be outputting any of these numbers directly. And to do integer division with remainder, where the quotients are always 0-9, division by repeated subtraction works fine.

Our algorithm then is:

x=2, y=1, t=1, n=1
while (true){
    n = n + 1
    x = x * n, y = y * n
    x = x + t // add next term
    flag = true
    while (flag){
        while (x >= y){ // division by repeated subtraction
            x = x - y
            digit = digit + 1
        }
        x = x + t // check upper bound
        if (x >= y){ // gives different value; add term to improve approximation
            x = x - t
            flag=false
        } else { // gives same value; output this digit and check the next one
            x = x - t
            output digit
            digit = 0
            x = x * 10
            t = t * 10
        }
    }
}

For the brainfuck program, I decided to keep the four binary numbers as strings of nonzero cells interleaved with each other. x, y, and t, I represent a 0-bit with a brainfuck cell value of 1, and a 1-bit with a cell value of 2. A cell value of 0 means the number doesn't extend that far. For brevity and to try to avoid confusion between layers of representation, I'll abbreviate "0-bit" of the binary numbers as "0b", and "cell value of 1" as "cv1". So for x, y, and t, 0b=cv1, 1b=cv2. For n, I chose 0b=cv2, 1b=cv1 because it made the code slightly shorter. This program has some other quirks for concision; I'll try to note most of them in passing.

Also: to avoid duplicating the (expensive) code for x = x + t, x = x - t, and x >= t, which occur twice each in the algorithm above, I rejiggered the algorithm to put it under the control of a little state machine. The transformed algorithm is thus something like:

x=2, y=1, t=1, n=1
while (true)
    n = n + 1
    x = x * n, y = y * n
    tflag = true
    status = -1
    while (status != 0){
        if (tflag){
            x = x + t
            tflag=false
        } 
        if (y > x){
            status = status + 1
        }
        status = status + 1
        case (status){
            0:{ //can subtract y again
                x = x - y
                digit = digit + 1
                status = -1
            }
            1:{ //can't subtract y any more; division done; add t again to make upper bound to check if it matches
                tflag=true
                status = 1
            }
            2 or 3:{ //just checked upper bound
                x = x - t //subtract to restore to lower bound
                2:{ //upper bound didn't match; add another term
                    status = 0
                }
                3:{ //upper bound matched lower; output digit and get ready to compute next
                    output digit
                    digit = 0
                    t = t * 10
                    x = x * 10
                    status = -1
                }
            }
        }
    }
}

So it subtracts y from x as many times as it can (state 0, usually several times in a row), counting the times to get the integer x/y; when y>x so it can't subtract without going negative, it goes into state 1, causing it to add another t to x (to bring it to the upper bound) and compare again. if y>x still, then (state 3) it outputs the digit and goes back to state 0 (or maybe 1) to see if we can get another digit. But if (state 2) adding that second t made x>=y, so y COULD be subtracted again, then the two bounds disagree about that digit and we need to go clear back to the beginning to multiply again and add another term for a better approximation. (Each new term gets us a haphazard number of decimal digits. Looking at the first 6000 terms as a sample, it's usually 3 or 4 digits but can be anything from 0 to 7.)

The memory layout is something like

0 0 digit status y x t n y x t n y x t n ...

Also, notice that since y=n! y is an even number from the time it first gets multiplied by 2. The leftmost bit of y can then be assumed to be 0b=cv1. So I gave it a second use as a place to store tflag (cv2 if true), and also end up zeroing it for practical reasons at several places in the program. I'll abbreviate this cell as "y0".

>>++>>++>+>++>+>+[
Initialize digit=2 y=01 x=0 t=1 n=1 and start main loop (getting a jump on the calculation; shorter than adding another bit to x)

  This whole next chunk is to increment n and multiply y and x by n. To save memory (and the extra code that extra memory use costs) I replace x and y in place, gradually from the right: for each bit of x, starting at the right, remove that bit and add a copy of n at that point if the bit was 1, doing all appropriate carries wherever the successive copies of n overlap each other. And the same thing with y.

  [>[>>[>>>>]<<<<[[>>>>+<<<<-]<<<<]>>>>>>]+<]>-
  This is convoluted: this line moves all of n to the right repeatedly until it's been moved past x into the clear space to the right; but this also checks y, and if y was longer than x it'll add leading zeroes (cv1) to x to make it as long as y is, and move n past that. So this moves n past both x and y. Note that this code relies on status being 0 at this point.

  >>--[+[+++<<<<--]++>>>>--]+[>>>>]<<<<[<<+<+<]<<[
  This increments n (remember, for n, 0b=cv2, 1b=cv1). Slight trickiness to handle the case where it runs off the end into cv0; it sets to 0b=cv2 and then backtracks before coming forward again. After incrementing n, this pads x and y with MORE leading zeroes the whole length of n, to make it easier to drop copies of n onto that space without worrying about any of it being cv0. Then we start a loop at what was the right end of x before we did that.

    >>>>>>[[<<<<+>>>>-]>>>>]<<<<<<<<[<<<<]
    This moves all bits of n left one bit.

    >>-[<<+>>-]+<<[->>>>[-[+>>>>-]-<<-[>>>>-]++>>+[-<<<<+]+>>>>]<<<<[<<<<]]
    Check the current bit of x; set it to 0b=cv1; if it was 1b, add n onto x. (This addition code scans n looking for 1b bits; when it finds one it sets that cell to -1 as a marker before adding that bit into x, carrying it past any 1b=cv2 bits of x and setting them to cv1 before setting the first cv1 bit it finds to cv2=1b and then going back to the -1 marker. Note that the scan back for that -1 marker could be shortened to [-<+] if concision were the only priority here, but for me speed is also a priority. Also note that this code relies on the rightmost bit of n being 1b=cv1; otherwise the scan for 1b bits could go on forever.

    >[-[<+>-]]+<[->>>>[-[+>>>>-]-<<<-[>>>>-]++>>>+[-<<<<+]+>>>>]<<<<[<<<<]]<<
    Do the same thing for y, with the added wrinkle that y could be cv0 (in which case it becomes cv1). That added case costs one extra bracket pair near the left end.

  ]>>>+[>>>>]-[+<<<<--]++[<<<<]>>>-[
  End multiplication loop at second cell. Now we have n restored to its position and both x and y multiplied by n. Either x or y (or both) can have leading 0b cells now. Set y0=tflag=cv2. Remove any leading zeroes from y (will stay removed until next time y is multiplied). Set status to -1. (Statuses entering this loop are -1 or +1; -1 if we're still subtracting to divide x/y, +1 if we're checking whether our upper bound matches; these are then changed to -1/0/+1/+2 by the comparison, then incremented to 0/1/2/3 just before the case statement.) Begin loop that will output digits until this approximation fails, and we need to go back and add another term.

    >-[
    check tflag (y0); if it's cv2 we enter this loop to set x=x+t.

      >>[--[++>>+>>--]-<[-[-[+++<<<<-]+>>>>-]]++>+[-<<<<+]++>>+>>]
      Add t to x; this uses a combination of the "addition" and "increment" code from before since it must handle x values of cv0 as well as cv1 or cv2. Also, this increments y cells in passing, the entire length of t (doesn't increment y0). This is because we will be coasting back left along y and not t; if we went left along t we'd run into "digit" rather than a terminating 0.
      Note also that this code only touches x in response to a t cell of 1b (may touch several x cells finding a place it can put that 1b due to carries). So it doesn't necessarily touch every x cell. If t is much longer than x, it may leave swathes of the middle of the newly lengthened x as cv0, where t has 0b=cv1, and add its bits to the right, leaving x divided in pieces. The next line will fix this.

      <<[>[<-<<<]+<]>->>>
      This scans left along a combination of x and y, decrementing each y-cell by 1 (including zeroing y0), and setting any x cells that might be cv0 to cv1.

    ]+>[>>>>]-[+<<<<--]++<[
    This restores y0=0b=cv1, and strips any leading zeroes from x. Then it checks the y-bit matching the rightmost 1-bit of x. We're about to check whether y>x; this main case covers cases where y is at least as long as x.

      [>>>>]<<<<[
      If y was at least as long as x, we go to the right end of y, and scan left, comparing x and y bit-by-bit. We don't use any memory other than the same cells that are being compared, AND we restore all those to their initial values. This is nuts...these next two lines were easily the least straightforward part of this to write. I'd suggest tracing through the code manually for all combinations of values if you'd like to understand it.

        -[+>[<->-]++<[[>-<-]++[<<<<]+>>>+>-]++<<<<-]
        This line handles the case where y=cv2. The outer loop here continues as long as y-bits are cv2 (and x-bits match them). It must handle the cases where x=cv2, x=cv1, x=cv0, and restore values in all those cases. If x!=cv2, we conclude y>x and enter the longer, second inner loop, which restores values and scans all the way left to the left end. Now we have an intricate little dance to get out of the outer loops. Also we increment status, to record that we found y>x, which is the whole point of the comparison. Notice that this code depends on y0=cv1 to get out of loops, both in the case of y>x and in the case where y=x (similar to the way those addition loops before depend on rightmost bits being 1b).

        >-[+[<+[<<<<]>]<+>]+<[->->>>[-]]+<<<<
        This line handles the cases where y=cv1, or where we already concluded y>x and are just trying to get out. If y=cv1 and x=cv1 we skip both inner loops, restore y=cv1 and x=cv1, and go on to the next y-bit (could be cv1 or cv2). If y=cv1 and x=cv2, we conclude y<x, go through all the loops, and scan left but do not increment status. If we already concluded y>x and incremented status, we go one layer into the left loop but skip the next, and do the second inner loop. In both these last two cases, the second inner loop is mostly meant to pre-negate the left movement from the final +<<<< for the "normal" case.

      ]
    ]>[<<<<]>>+[
    Now we wrap up the "if y was at least as long" case; if y was NOT at least as long, this drops us back at the rightmost x-bit, and we scan back left and naturally don't increment status since x>y. Next we DO increment status, to bring the possible values into the range 0-3 (some of this control structure probably isn't the best possible; rethinking it is still on my to-do list), and go into our case statement. This first bracket rules out case 0.

      -[
      This one rules out case 1.

        >>>[--[++>>>>--]-<--[+++>>>>--]+>+[-<<<<+]++>>>>]<<<<<[<<<<]
        In case 2 or 3 we subtract t from x (the extra t that changed lower bound to upper bound). We cruise back along x to avoid hitting status. 

        >>-[
        Rules out case 2. Now we know we're in case 3, which means both bounds match for this digit, so it's known correct.

          +++++[<++++++++>-]<.>>>>--[<]<<[<<----.<]>[-]>>->>++[
          In case 3 we increase digit by 48 (ASCII '0') and output it. Also, we want to output a period (ASCII 46) as a decimal point ONLY after the first digit. I used to keep a dedicated flag for this, before noticing that t (10^number of digits already output) is 1 (odd) the first time, and even every subsequent time. So we use t's leftmost bit to decide whether to output that decimal point, and then restore it. Also, another thing that should have been really obvious: that first digit is always going to be "2", i.e. ASCII 50, so we can just subtract 4 from it to get the decimal point, we don't need to code it in some more general way that would still print "." for a different first digit. Which I initially did. Just before restoring t, I zero y0 yet again, because I'm about to use another loop to avoid duplicating my *=10 code.

            [>>>>]+[-[->>>>+>>>>>>>>-[-[+++<<<<[-]]+>>>>-]++[<<<<]]+<<<<]>>>
            So this multiplies the number by 10 (binary 1010), in place, gradually from the right; it does that for t, then for x, then it hits y0=cv0 and stops. This has yet another small tweak from my previous increment code (the brackets of that "[-]") in order to neatly handle that initial cv0->0b two cells right of the start point. Notice this relies on digit being 0=cv0 right now, or it'd hit it.

          ]+<-<<< restore y0 and flag1
          We restore y0=cv1, and set status=-1 again; we'll compute another digit, and later check if it's right. We go back to leftmost cell of the array and skip later cases.

        ]>[<<<]<
        This is case 2. It does nothing but reposition the pointer; status has already been set to 0 by the outer parts of the case structure. We're here because the upper and lower bound didn't match, so we need to go back to line 2 of the program and add another term to our approximation.

        (Here's another mistake: for the longest time, I had case 2 do x=x+y in a loop executed (digit) times to restore x to what it was before this division, setting digit back to 0, before going back to the start and multiplying and adding the next term and then doing the division all over again. That made no sense; since digit is the lower bound, that's a wholly unnecessary detour on the way back to the same place. It made the program longer and slower, I just didn't see it. This is very characteristic of early drafts in my experience.)

      ]>[+<+<<]<
      This is case 1; y>x, so we can't subtract y again, done with the division, now need to add t again to get the upper bound and check it. So we set y0=tflag=cv2, and status = 1 (which will be 2 or 3 by the time we get back to this case statement again).

    ]>[<<+>->[--[++>>>>--]->--[+++>>>>--]+<+[-<<<<+]++>>>>]<<<[<<<<]]>>
    Case 0: still working on division by repeated subtraction. increment digit, set status = -1, do x = x - y, cruising back on x. (This subtraction, much like the addition, relies on the fact that the rightmost bit of y is cv2. Also it relies on the fact that x>=y.) Then we step back to status to end our "while status != 0" loop.

  ]>
  Again, this loop continues computing digits and outputting them until the upper and lower bounds fail to agree (case 2).

]
And here we must have hit case 2, so we go back to the beginning to increment n and add a new term for a better approximation. This program doesn't terminate because e doesn't terminate.

The uncommented version is:]

>>++>>++>+>++>+>+[
  [>[>>[>>>>]<<<<[[>>>>+<<<<-]<<<<]>>>>>>]+<]>-
  >>--[+[+++<<<<--]++>>>>--]+[>>>>]<<<<[<<+<+<]<<[
    >>>>>>[[<<<<+>>>>-]>>>>]<<<<<<<<[<<<<]
    >>-[<<+>>-]+<<[->>>>[-[+>>>>-]-<<-[>>>>-]++>>+[-<<<<+]+>>>>]<<<<[<<<<]]
    >[-[<+>-]]+<[->>>>[-[+>>>>-]-<<<-[>>>>-]++>>>+[-<<<<+]+>>>>]<<<<[<<<<]]<<
  ]>>>+[>>>>]-[+<<<<--]++[<<<<]>>>-[
    >-[
      >>[--[++>>+>>--]-<[-[-[+++<<<<-]+>>>>-]]++>+[-<<<<+]++>>+>>]
      <<[>[<-<<<]+<]>->>>
    ]+>[>>>>]-[+<<<<--]++<[
      [>>>>]<<<<[
        -[+>[<->-]++<[[>-<-]++[<<<<]+>>>+>-]++<<<<-]
        >-[+[<+[<<<<]>]<+>]+<[->->>>[-]]+<<<<
      ]
    ]>[<<<<]>>+[
      -[
        >>>[--[++>>>>--]-<--[+++>>>>--]+>+[-<<<<+]++>>>>]<<<<<[<<<<]>>-[
          +++++[<++++++++>-]<brainfuck.org>>>>--[<]<<[<<----.<]>[-]>>->>++[
            [>>>>]+[-[->>>>+>>>>>>>>-[-[+++<<<<[-]]+>>>>-]++[<<<<]]+<<<<]>>>
          ]+<-<<<
        ]>[<<<]<
      ]>[+<+<<]<
    ]>[<<+>->[--[++>>>>--]->--[+++>>>>--]+<+[-<<<<+]++>>>>]<<<[<<<<]]>>
  ]>
]