Start with:
   a list of type 'a list,
   an initial value of type 'b,
   a function f that takes a pair ('a * 'b) as input and returns a 'b.
For example:
   [x0, x1, x2, x3, x4] : 'a list
   z : 'b
   f : 'a * 'b -> 'b

We want to understand how the following applications of
foldr and foldl work.

   foldr f z [x0, x1, x2, x3, x4];
   foldl f z [x0, x1, x2, x3, x4];


Here is how foldr uses the function f to "fold" the list.

     f(__, f(__, f(__, f(__, f(__, z)))))

The empty slots are where the elements from the list are placed.

     f(x0, f(x1, f(x2, f(x3, f(x4, z)))))

Here is the code that implements foldr.

(*
   fn : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
*)
fun foldr _ z [] = z
  | foldr f z (x::xs) = f (x, (foldr f z xs));
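
One way to see this shape for yourself is to fold with a function
that records each call as text. This is only a sketch. The names
myFoldr and paren are made up for this example (myFoldr is the
definition above, renamed so that it does not hide the built-in foldr).

```sml
(* Same definition as above, under a hypothetical name. *)
fun myFoldr _ z [] = z
  | myFoldr f z (x::xs) = f (x, myFoldr f z xs);

(* Hypothetical helper: writes out one application of f as a string. *)
fun paren (x, y) = "f(" ^ x ^ ", " ^ y ^ ")";

val shape = myFoldr paren "z" ["x0", "x1", "x2"];
(* shape = "f(x0, f(x1, f(x2, z)))" *)
```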


If the function f is thought of as an infix operator @
(that is, we replace every f(x, y) with x @ y),
then here is what foldr looks like in operator notation.
(Here @ is only a placeholder symbol for f. It does not
mean SML's list-append operator.)

      __ @ (__ @ (__ @ (__ @ (__ @ z))))

      x0 @ (x1 @ (x2 @ (x3 @ (x4 @ z))))

Notice that the @ operators are grouped from right to left.

We can also represent foldr using a tree. Notice how this
tree grows to the right. (This tree is evaluated from the
bottom up.)
              f
             / \
           x0   f
               / \
             x1   f
                 / \
               x2   f
                   / \
                 x3   f
                     / \
                   x4    z

IMPORTANT: In the operator notation for foldr, use cons
as the operator and the empty list as the initial value.

      x0 :: (x1 :: (x2 :: (x3 :: (x4 :: []))))

Notice how, in this case, foldr just rebuilds the original list.
This shows that foldr is a way of generalizing the building, or
walking, of a list.

              ::
             /  \
           x0    ::
                /  \
              x1    ::
                   /  \
                 x2    ::
                      /  \
                    x3    ::
                         /  \
                       x4    []
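
Here is a small sketch of this idea using the built-in foldr.
The names xs, same, doubled, and count are chosen only for this
illustration. Replacing cons with a slightly different function
turns the "rebuild" into other list-walking functions.

```sml
val xs = [1, 2, 3, 4, 5];

(* cons and the empty list: rebuilds the original list *)
val same = foldr (op ::) [] xs;

(* cons a modified element instead: behaves like a map *)
val doubled = foldr (fn (x, acc) => 2 * x :: acc) [] xs;

(* ignore the element and count: behaves like length *)
val count = foldr (fn (_, n) => n + 1) 0 xs;
```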



Here is how foldl uses the function f to "fold" the list.

     f(__, f(__, f(__, f(__, f(__, z)))))

The empty slots are where the elements from the list are placed,
but notice the order: the elements appear in reverse.

     f(x4, f(x3, f(x2, f(x1, f(x0, z)))))

Here is the code that implements foldl.

(*
   fn : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
*)
fun foldl _ z [] = z
  | foldl f z (x::xs) = foldl f (f (x, z)) xs;
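
To check the order of the slots, we can fold with a function that
records each call as text. This is only a sketch. The names myFoldl
and paren are made up for this example (myFoldl is the definition
above, renamed so that it does not hide the built-in foldl).

```sml
(* Same definition as above, under a hypothetical name. *)
fun myFoldl _ z [] = z
  | myFoldl f z (x::xs) = myFoldl f (f (x, z)) xs;

(* Hypothetical helper: writes out one application of f as a string. *)
fun paren (x, y) = "f(" ^ x ^ ", " ^ y ^ ")";

val shape = myFoldl paren "z" ["x0", "x1", "x2"];
(* shape = "f(x2, f(x1, f(x0, z)))" *)
```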


If the function f is thought of as an infix operator @,
then here is what foldl looks like in operator notation.

      __ @ (__ @ (__ @ (__ @ (__ @ z))))

      x4 @ (x3 @ (x2 @ (x1 @ (x0 @ z))))

(Notice that the @ operators are still grouped from right
to left. This is a quirk of ML's implementation of foldl.
There is another way to implement foldl that groups the @
operators from left to right; see below. But also notice
that, since the elements of the list appear "backwards",
the innermost application uses x0 first, so the elements
of the list are consumed from left to right.)

We can also represent foldl using a tree. This tree is 
evaluated from the bottom up, but since the first element
from the list is at the bottom of the tree, the list is
processed from left to right.

                 f
                / \
              x4   f
                  / \
                x3   f
                    / \
                  x2   f
                      / \
                    x1   f
                        / \
                      x0   z
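
Since the first element of the list ends up at the bottom of the
tree, using cons as the operator and the empty list as the initial
value should build the list backwards. Here is a small sketch with
the built-in foldl (the names xs, backwards, and sameAsRev are
chosen only for this illustration).

```sml
val xs = [1, 2, 3, 4, 5];

(* 5 :: (4 :: (3 :: (2 :: (1 :: [])))) *)
val backwards = foldl (op ::) [] xs;

(* rev is the Basis Library's list-reversal function *)
val sameAsRev = (backwards = rev xs);
```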


Look carefully at the operator notation for foldr and foldl.
      x0 @ (x1 @ (x2 @ (x3 @ (x4 @ z))))
      x4 @ (x3 @ (x2 @ (x1 @ (x0 @ z))))
They are the same except for the reversal of the elements from
the list. This means that we can express foldl the following way,

   fun foldl f z xs = foldr f z (reverse xs)

where "reverse" is a function that reverses a list (SML's Basis
Library calls it rev). But we wouldn't want to define foldl this
way because it would force foldl to walk the list three times
(once to reverse it and twice more inside foldr) instead of just
one time (see below).
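
Here is a small sketch that checks this equivalence on one input.
The name foldlViaRev is made up for this example, and string
concatenation (^) is used as f because it is not commutative, so a
wrong order would show up in the result.

```sml
(* Hypothetical name: foldl expressed through foldr and rev. *)
fun foldlViaRev f z xs = foldr f z (rev xs);

val direct = foldl (op ^) "z" ["a", "b", "c"];        (* "c" ^ ("b" ^ ("a" ^ "z")) *)
val viaRev = foldlViaRev (op ^) "z" ["a", "b", "c"];  (* same grouping *)
```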


NOTE: The definition for foldl given above is not the only way
that foldl can be defined. The definition given above is not really
symmetric with the definition for foldr. Here are foldr and foldl
on the list [x0,x1,x2,x3,x4].

foldr f z [x0,x1,x2,x3,x4] = f(x0, f(x1, f(x2, f(x3, f(x4, z)))))
foldl f z [x0,x1,x2,x3,x4] = f(x4, f(x3, f(x2, f(x1, f(x0, z)))))

Here they are in operator notation.
      x0 @ (x1 @ (x2 @ (x3 @ (x4 @ z))))
      x4 @ (x3 @ (x2 @ (x1 @ (x0 @ z))))

Here is foldr compared with a different version of foldl.

foldr f z [x0,x1,x2,x3,x4] = f(x0, f(x1, f(x2, f(x3, f(x4, z)))))
foldl f z [x0,x1,x2,x3,x4] = f(f(f(f(f(z, x0), x1), x2), x3), x4)

Here is the operator notation for foldr and the new version of foldl.
      x0 @ (x1 @ (x2 @ (x3 @ (x4 @ z))))
      ((((z @ x0) @ x1) @ x2) @ x3) @ x4

This new definition of foldl is more symmetric with the definition
of foldr. Notice how foldr has z on the right end of x0, x1, x2, x3, 
and x4, and the @ operators are grouped from right to left. The new 
foldl has z on the left end of x0, x1, x2, x3, and x4, and the @
operators are grouped from left to right. (The previous definition
of foldl has z on the right end of x4, x3, x2, x1 and x0, and the
@ operators are grouped from right to left.)

Here is the tree for foldr compared with the tree for this other
version of foldl. The trees are almost mirror images of each other.

      foldr                       foldl
        f                           f
       / \                         / \
     x0   f                       f   x4
         / \                     / \
       x1   f                   f   x3
           / \                 / \
         x2   f               f   x2
             / \             / \
           x3   f           f   x1
               / \         / \
             x4    z      z   x0

This other version of foldl would have this definition in ML.

(*
   fn : ('a * 'b -> 'a) -> 'a -> 'b list -> 'a
*)
fun foldl _ z [] = z
  | foldl f z (x::xs) = foldl f (f (z, x)) xs;
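
To see the left-to-right grouping, we can fold with a function that
records each call as text. This is only a sketch. The names foldlAlt
and paren are made up for this example (foldlAlt is the definition
above, renamed so that it does not hide the built-in foldl).

```sml
(* The alternative definition above, under a hypothetical name. *)
fun foldlAlt _ z [] = z
  | foldlAlt f z (x::xs) = foldlAlt f (f (z, x)) xs;

(* Hypothetical helper: writes out one application of f as a string. *)
fun paren (x, y) = "f(" ^ x ^ ", " ^ y ^ ")";

val shape = foldlAlt paren "z" ["x0", "x1", "x2"];
(* shape = "f(f(f(z, x0), x1), x2)" *)
```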
  
One surprising thing is how small the change in the definition
of foldl is. Notice that the previous definition has exactly
the same type as foldr, while the slightly different definition
above has a slightly different type from foldr. Here is the
previous definition again.

(*
   fn : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
*)
fun foldl _ z [] = z
  | foldl f z (x::xs) = foldl f (f (x, z)) xs;

An important thing to notice is that BOTH definitions of foldl
process the given list from left to right. That is, they both
start with the parameter z and the first element of the list
and then they work their way to the end of the list. This is
very different from foldr, which needs to traverse the list two
times: first from beginning to end, to get to the end where the
last element of the list gets paired with the input parameter z,
and then back to the beginning of the list, applying the
function f along the way.
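
One way to observe this order is to fold with a function that, as
a side effect, records which element it was handed. The sketch below
uses the built-in foldr and foldl. The names log, record, orderR,
and orderL are made up for this example.

```sml
(* Hypothetical log of the elements, most recently seen first. *)
val log : string list ref = ref [];

(* Record the element, then pass the accumulator through unchanged. *)
fun record (x, acc) = (log := x :: !log; acc);

val _ = foldr record () ["x0", "x1", "x2"];
val orderR = rev (!log);   (* f is applied to the last element first *)

val () = log := [];
val _ = foldl record () ["x0", "x1", "x2"];
val orderL = rev (!log);   (* f is applied to the first element first *)
```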

Which version of foldl is "better"? One definition is more
symmetric with foldr, but the other definition has the same
type signature as foldr. It is hard to say which of these
aspects of foldl is more important. When we use foldl, would
it help us that foldl has the same type signature as foldr?
On the other hand, would foldl being more symmetric with foldr
help us more than it having the same type signature as foldr?
Different languages have made different choices for foldl.
SML uses the first definition of foldl we gave above. Haskell
uses the second definition of foldl.
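
The choice matters as soon as f is not commutative. Here is a
small sketch using subtraction. The name foldlAlt is made up for
this example. It is the second definition of foldl from above,
written in SML, and stands in for the Haskell-style foldl.

```sml
(* Second definition of foldl, under a hypothetical name. *)
fun foldlAlt _ z [] = z
  | foldlAlt f z (x::xs) = foldlAlt f (f (z, x)) xs;

(* Built-in SML foldl: 4 - (3 - (2 - (1 - 0))) *)
val smlStyle = foldl (op -) 0 [1, 2, 3, 4];

(* Second definition: (((0 - 1) - 2) - 3) - 4 *)
val haskellStyle = foldlAlt (op -) 0 [1, 2, 3, 4];
```

With these inputs smlStyle is 2 and haskellStyle is ~10 (SML's
notation for negative ten).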