Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

concatMapM to foldMapA and foldMapM #171

Open
effectfully opened this issue May 19, 2018 · 4 comments
Open

concatMapM to foldMapA and foldMapM #171

effectfully opened this issue May 19, 2018 · 4 comments
Labels
documentation feature request have to think type:breaking Breaking change (removal, renaming, semantic change, etc.)
Milestone

Comments

@effectfully
Copy link
Contributor

concatMapM has multiple problems. For one, it has a rather confusing name for a function defined like this:

concatMapM
    :: ( Applicative f
       , Monoid m
       , Container (l m)
       , Element (l m) ~ m
       , Traversable l
       )
    => (a -> f m) -> l a -> f m
concatMapM f = fmap fold . traverse f

There is nothing related to M here. Moreover, the user might think concatMapM can be used with a strict monoid like Sum, but that would result in a space leak, because an entire container is traversed first by traverse and only then flattened by fold. And the type signature is more restrictive than it should be: there is no need for Traversable -- Foldable is enough as it's just applicative foldMap.

So it should be defined something like this (modulo that Container thing):

foldMapA :: (Applicative f, Monoid m, Foldable t) => (a -> f m) -> t a -> f m
foldMapA f = foldr (\a fm -> mappend <$> f a <*> fm) (pure mempty)

But there indeed exists a notion of monadic folding which is useful for folding into a strict monoid:

foldMapM :: (Foldable f, Monad m, Monoid b) => (a -> m b) -> f a -> m b
foldMapM f xs = foldr step return xs mempty where
  step x r z = f x >>= \y -> r $! z `mappend` y

You can define all those andM, orM, allM, anyM in terms of foldMapM just like and, or, all, any are defined in terms of foldMap.

@chshersh
Copy link
Contributor

@effectfully This looks like and interesting idea. I would like to see benchmarks and profiling first just to be 💯 sure that this implementation is faster or at least not slower (and doesn't contain space leaks). Type of concatMapM always was confusing to me as well, it's just a result of generalization. But I'm okay with making things simpler. Personally I just want to be able to call multiple actions on some collection, where which action returns some monoid and monoid concatenation is performed automatically. Having examples like this to work is also a good feature:

concatMapM readFile files >>= putTextLn

@effectfully
Copy link
Contributor Author

effectfully commented Jun 2, 2018

I would like to see benchmarks and profiling first just to be 100 sure that this implementation is faster or at least not slower

That's a lot of things to do just to prove an obvious fact that fmap fold . traverse f is way too inefficient when run in a strict Monoid (compared to foldMapM) and is too restrictive (compared to foldMapA).

(and doesn't contain space leaks)

foldMapA eats stack of course. I checked foldMapM some time ago and it worked alright, but yes, a proper test is needed here. If I was to write such a test, I'd use Control.Monad.Writer.CPS and weigh, but right now I have many other things to do.

@gromakovsky gromakovsky added the type:breaking Breaking change (removal, renaming, semantic change, etc.) label Aug 12, 2018
@dcastro
Copy link
Member

dcastro commented Jul 11, 2022

I had a look at foldMapM to see how evaluation would look like.

If we expand the definition a bit, and add a type sig to step for clarity:

foldMapM 
  :: forall c a m b. (Container c, Element c ~ a, Monad m, Monoid b)
  => (a -> m b) -> c -> m b
foldMapM f xs = foldr step return xs mempty
  where
    step :: a -> (b -> m b) -> (b -> m b)
    step a b_mb = \b -> do
      b2 <- f a
      let !res = b <> b2
      b_mb res

Let's specialize to lists, where foldr is defined as:

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f acc []     = acc
foldr f acc (x:xs) = f x (foldr f acc xs)

And now try to evaluate this expression:

a :: IO (Sum Integer)
a = foldMapM (\x -> pure (Sum x)) [1, 2]
foldMapM (\x -> pure (Sum x)) [1, 2]

-- substitute `foldMapM`
foldr step return [1,2] mempty
  where
    step :: Int -> (Sum Int -> IO (Sum Int)) -> (Sum Int -> IO (Sum Int))
    step a b_mb = \b -> do
      b2 <- pure $ Sum a
      let !res = b <> b2
      b_mb res

-- substitute `foldr` with its 2nd equation
step 1 (foldr step return [2]) mempty

-- substitute `step`
do
  b2 <- pure $ Sum 1
  let !res = mempty <> b2
  foldr step return [2] res


-- evaluate the first 2 statements
foldr step return [2] (Sum 1)

-- substitute `foldr` with its 2nd equation
step 2 (foldr step return []) (Sum 1)

-- substitute `step`
do
  b2 <- pure $ Sum 2
  let !res = (Sum 1) <> b2
  foldr step return [] res

-- evaluate the first 2 statements
foldr step return [] (Sum 3)

-- substitute `foldr` with its 1st equation
return (Sum 3)

In conclusion: constant stack, doesn't create thunks, so no space leaks. It seems to have similar properties to foldl'.

@dcastro dcastro added this to the v.1.9.0 milestone Jul 11, 2022
@s-and-witch
Copy link

s-and-witch commented Jul 11, 2022

Let's see foldMapA behavior on strict and lasy monoids.

The first example will be:

foo :: (String, (Sum Int))
foo = foldMapA (\x -> pure (Sum x)) [1, 2, 3] 

-- expand our definition, replacing foldMapA with it's body
foo = foldr (\a fm -> mappend <$> ((\x -> pure (Sum x)) a) <*> fm) (pure mempty) [1, 2, 3] 

-- reduce (String,) applicative and Monoid functions
foo = foldr (\a fm -> (\(s1, x) (s2, y) -> (s1 <> s2, x + y)) ((\x -> ([], x)) a) fm) ([], 0) [1, 2, 3] 

-- simplify lambdas
foo = foldr (\a (s2, y) -> let (s1, x) = ((\x -> ([], x)) a) in (s1 <> s2, x + y)) ([], 0) [1, 2, 3] 

foo = foldr (\a (s2, y) -> ([] <> s2, a + y)) ([], 0) [1, 2, 3] 

-- move lamba into `step` function
foo = foldr step ([], 0) [1, 2] 
  where
    step a (s2, y) = ([] <> s2, a + y)

Now we can clearly see, that our step function strict on it's second argument and computation will be:

> step 1 (foldr step ([], 0) [2, 3])
> step 1 (step 2 (foldr step ([], 0) [3]))
> step 1 (step 2 (step 3 (foldr step ([], 0) [])))
> step 1 (step 2 (step 3 ([], 0)))
> step 1 (step 2 ([], 3))
> step 1 ([], 5)
> ([], 6)

What about lazy applicative, such as (-> e)?

foo :: Int -> (Sum Int)
foo = foldMapA (\x -> pure (Sum x)) [1, 2, 3] 

-- definition from the first attempt
foo = foldr (\a fm -> mappend <$> ((\x -> pure (Sum x)) a) <*> fm) (pure mempty) [1, 2, 3]

-- inline Applicative definitions
foo = foldr (\a fm -> \e -> ((\x -> const x) a e) + fm e) (const 0) [1, 2, 3]

-- simplify
foo = foldr (\a fm -> \e -> ((const a) e) + fm e) (const 0) [1, 2, 3]

-- introduce `step`
foo = foldr step (const 0) [1, 2, 3]
  where
    step a fm = \e -> ((const a) e) + fm e

Now step is'n strict on Applicative, but sum yet not tail recursive

step 1 (foldr step (const 0) [2, 3])
\e -> const 1 e + (foldr step (const 0) [2, 3]) e

-- now add argument
(\e -> const 1 e + (foldr step (const 0) [2, 3]) e) 4
const 1 4 + (foldr step (const 0) [2, 3]) 4
1 + (step 2 (foldr step (const 0) [3])) 4
1 + (\e -> const 2 e + (foldr step (const 0) [3]) e) 4
1 + (const 2 4 + (foldr step (const 0) [3]) 4) 
1 + (2 + (foldr step (const 0) [3]) 4) 
1 + (2 + (step 3 (foldr step (const 0) [])) 4) 
1 + (2 + (\e -> const 3 e + (foldr step (const 0) [] e)) 4) 
1 + (2 + (const 3 4 + (foldr step (const 0) [] 4))) 
1 + (2 + (3 + (foldr step (const 0) [] 4))) 
1 + (2 + (3 + (const 0 4))) 
1 + (2 + (3 + 0)) 
1 + (2 + 3) 
1 + 5
6 

There is no reason why strict Applicative can start work with lazy monoid, so let's move to both lazy Applicative and Monoid

foo :: Int -> [Int]
foo = foldMapA (\x -> pure (Sum x)) [1, 2, 3] 

foo = foldr (\a fm -> \e -> ((\x -> const [x]) a e) <> fm e) (const []) [1, 2, 3]

foo = foldr (\a fm -> \e -> ((const [a]) e) <> fm e) (const []) [1, 2, 3]

foo = foldr step (const []) [1, 2, 3]
  where
    step a fm = \e -> ((const [a]) e) <> fm e

Seems good to me:

step 1 (foldr step (const []) [2, 3])
\e -> const [1] e <> (foldr step (const []) [2, 3]) e

-- now add argument
(\e -> const [1] e <> (foldr step (const []) [2, 3]) e) 4
const [1] 4 <> (foldr step (const []) [2, 3]) 4
[1] <> (foldr step (const []) [2, 3]) 4

-- now guarded recursion works, next steps will not compute untill case matching/seq function
1 : (foldr step (const []) [2, 3]) 4

-- but let's immagine, that we `force` list
1 : step 2 (foldr step (const []) [3]) 4
1 : (\e -> const [2] e <> (foldr step (const []) [3]) e) 4
1 : const [2] 4 <> (foldr step (const []) [3]) 4
1 : [2] <> (foldr step (const []) [3]) 4
1 : 2 : (foldr step (const []) [3]) 4
1 : 2 : step 3 (foldr step (const []) []) 4
1 : 2 : (\e -> const [3] e <> (foldr step (const []) []) e) 4
1 : 2 : const [3] 4 <> (foldr step (const []) []) 4
1 : 2 : [3] <> (foldr step (const []) []) 4
1 : 2 : 3 : (foldr step (const []) []) 4
1 : 2 : 3 : const [] 4
1 : 2 : 3 : []

And some tests in GHCi:

-- ([Int], Sum Int)
ghci> foldr (\a fs -> mappend <$> (pure $ Sum a) <*> fs) ([], mempty) (replicate 100000000 1)
*** Exception: stack overflow

-- Int -> Sum Int
ghci> foldr (\a fs -> mappend <$> (const $ Sum a) <*> fs) (const mempty) (replicate 100000000 1) 4
Sum {getSum = *** Exception: stack overflow

-- ([Int], [Int])
ghci> foldr (\a fs -> mappend <$> (pure $ [a]) <*> fs) ([], mempty) (replicate 100000000 1)
*** Exception: stack overflow

-- Int -> [Int]
ghci> foldr (\a fs -> mappend <$> (const $ [a]) <*> fs) (const mempty) (replicate 100000000 1) 4
[1,1,1,1,1,1,1...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation feature request have to think type:breaking Breaking change (removal, renaming, semantic change, etc.)
Projects
None yet
Development

No branches or pull requests

5 participants