concatMapM to foldMapA and foldMapM #171
@effectfully This looks like an interesting idea. I would like to see benchmarks and profiling first, just to be 💯 sure that this implementation is faster, or at least not slower, and doesn't contain space leaks. Type of concatMapM readFile files >>= putTextLn
That's a lot of things to do just to prove an obvious fact that
I had a look at the definition. If we expand it a bit and add a type signature:

foldMapM
  :: forall c a m b. (Container c, Element c ~ a, Monad m, Monoid b)
  => (a -> m b) -> c -> m b
foldMapM f xs = foldr step return xs mempty
  where
    step :: a -> (b -> m b) -> (b -> m b)
    step a b_mb = \b -> do
      b2 <- f a
      let !res = b <> b2
      b_mb res

Let's specialize to lists, where

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f acc [] = acc
foldr f acc (x:xs) = f x (foldr f acc xs)

And now try to evaluate this expression:

a :: IO (Sum Integer)
a = foldMapM (\x -> pure (Sum x)) [1, 2]

foldMapM (\x -> pure (Sum x)) [1, 2]
-- substitute `foldMapM`
foldr step return [1,2] mempty
  where
    step :: Integer -> (Sum Integer -> IO (Sum Integer)) -> (Sum Integer -> IO (Sum Integer))
    step a b_mb = \b -> do
      b2 <- pure $ Sum a
      let !res = b <> b2
      b_mb res
-- substitute `foldr` with its 2nd equation
step 1 (foldr step return [2]) mempty
-- substitute `step`
do
  b2 <- pure $ Sum 1
  let !res = mempty <> b2
  foldr step return [2] res
-- evaluate the first 2 statements
foldr step return [2] (Sum 1)
-- substitute `foldr` with its 2nd equation
step 2 (foldr step return []) (Sum 1)
-- substitute `step`
do
  b2 <- pure $ Sum 2
  let !res = (Sum 1) <> b2
  foldr step return [] res
-- evaluate the first 2 statements
foldr step return [] (Sum 3)
-- substitute `foldr` with its 1st equation
return (Sum 3)

In conclusion: the stack stays constant and no thunks are created, so there are no space leaks. It seems to have similar properties to
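The strict accumulator traced above can be tried out directly. What follows is only a minimal sketch: it assumes plain `Foldable` in place of the library's `Container` class, and `sumIO` is a hypothetical helper added for illustration, not something from the thread.

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.Monoid (Sum (..))

-- | Monadic fold into a monoid with a strict accumulator.
-- Assumption: 'Foldable' stands in for the library's 'Container' class.
foldMapM :: (Foldable t, Monad m, Monoid b) => (a -> m b) -> t a -> m b
foldMapM f xs = foldr step return xs mempty
  where
    -- 'k' is the continuation for the rest of the container;
    -- 'acc' is the monoidal value accumulated so far.
    step a k = \acc -> do
      b <- f a
      let !acc' = acc <> b -- force the accumulator before continuing
      k acc'

-- | Hypothetical example: sum a list inside IO.
sumIO :: [Integer] -> IO (Sum Integer)
sumIO = foldMapM (\x -> pure (Sum x))
```

Running `sumIO [1, 2]` follows the same steps as the trace: each element is combined into the accumulator before the fold moves on to the next one.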
Let's see how foldMapA behaves on strict and lazy monoids. The first example will be:

foo :: (String, Sum Int)
foo = foldMapA (\x -> pure (Sum x)) [1, 2, 3]
-- expand our definition, replacing foldMapA with its body
foo = foldr (\a fm -> mappend <$> ((\x -> pure (Sum x)) a) <*> fm) (pure mempty) [1, 2, 3]
-- reduce (String,) applicative and Monoid functions
foo = foldr (\a fm -> (\(s1, x) (s2, y) -> (s1 <> s2, x + y)) ((\x -> ([], x)) a) fm) ([], 0) [1, 2, 3]
-- simplify lambdas
foo = foldr (\a (s2, y) -> let (s1, x) = ((\x -> ([], x)) a) in (s1 <> s2, x + y)) ([], 0) [1, 2, 3]
foo = foldr (\a (s2, y) -> ([] <> s2, a + y)) ([], 0) [1, 2, 3]
-- move lambda into `step` function
foo = foldr step ([], 0) [1, 2, 3]
  where
    step a (s2, y) = ([] <> s2, a + y)

Now we can clearly see that our step function is strict in its second argument, and the computation will be:

> step 1 (foldr step ([], 0) [2, 3])
> step 1 (step 2 (foldr step ([], 0) [3]))
> step 1 (step 2 (step 3 (foldr step ([], 0) [])))
> step 1 (step 2 (step 3 ([], 0)))
> step 1 (step 2 ([], 3))
> step 1 ([], 5)
> ([], 6)

What about a lazy applicative, such as ((->) e)?

foo :: Int -> Sum Int
foo = foldMapA (\x -> pure (Sum x)) [1, 2, 3]
-- definition from the first attempt
foo = foldr (\a fm -> mappend <$> ((\x -> pure (Sum x)) a) <*> fm) (pure mempty) [1, 2, 3]
-- inline Applicative definitions
foo = foldr (\a fm -> \e -> ((\x -> const x) a e) + fm e) (const 0) [1, 2, 3]
-- simplify
foo = foldr (\a fm -> \e -> ((const a) e) + fm e) (const 0) [1, 2, 3]
-- introduce `step`
foo = foldr step (const 0) [1, 2, 3]
  where
    step a fm = \e -> ((const a) e) + fm e

Now:

step 1 (foldr step (const 0) [2, 3])
\e -> const 1 e + (foldr step (const 0) [2, 3]) e
-- now add argument
(\e -> const 1 e + (foldr step (const 0) [2, 3]) e) 4
const 1 4 + (foldr step (const 0) [2, 3]) 4
1 + (step 2 (foldr step (const 0) [3])) 4
1 + (\e -> const 2 e + (foldr step (const 0) [3]) e) 4
1 + (const 2 4 + (foldr step (const 0) [3]) 4)
1 + (2 + (foldr step (const 0) [3]) 4)
1 + (2 + (step 3 (foldr step (const 0) [])) 4)
1 + (2 + (\e -> const 3 e + (foldr step (const 0) [] e)) 4)
1 + (2 + (const 3 4 + (foldr step (const 0) [] 4)))
1 + (2 + (3 + (foldr step (const 0) [] 4)))
1 + (2 + (3 + (const 0 4)))
1 + (2 + (3 + 0))
1 + (2 + 3)
1 + 5
6

A strict Applicative gains nothing from a lazy monoid, so let's move on to the case where both the Applicative and the Monoid are lazy:

foo :: Int -> [Int]
foo = foldMapA (\x -> pure [x]) [1, 2, 3]
foo = foldr (\a fm -> \e -> ((\x -> const [x]) a e) <> fm e) (const []) [1, 2, 3]
foo = foldr (\a fm -> \e -> ((const [a]) e) <> fm e) (const []) [1, 2, 3]
foo = foldr step (const []) [1, 2, 3]
  where
    step a fm = \e -> ((const [a]) e) <> fm e

Seems good to me:

step 1 (foldr step (const []) [2, 3])
\e -> const [1] e <> (foldr step (const []) [2, 3]) e
-- now add argument
(\e -> const [1] e <> (foldr step (const []) [2, 3]) e) 4
const [1] 4 <> (foldr step (const []) [2, 3]) 4
[1] <> (foldr step (const []) [2, 3]) 4
-- guarded recursion kicks in here: the next steps are not computed until something pattern matches on the result or forces it with seq
1 : (foldr step (const []) [2, 3]) 4
-- but let's imagine that we `force` the list
1 : step 2 (foldr step (const []) [3]) 4
1 : (\e -> const [2] e <> (foldr step (const []) [3]) e) 4
1 : const [2] 4 <> (foldr step (const []) [3]) 4
1 : [2] <> (foldr step (const []) [3]) 4
1 : 2 : (foldr step (const []) [3]) 4
1 : 2 : step 3 (foldr step (const []) []) 4
1 : 2 : (\e -> const [3] e <> (foldr step (const []) []) e) 4
1 : 2 : const [3] 4 <> (foldr step (const []) []) 4
1 : 2 : [3] <> (foldr step (const []) []) 4
1 : 2 : 3 : (foldr step (const []) []) 4
1 : 2 : 3 : const [] 4
1 : 2 : 3 : []

And some tests in GHCi:

-- ([Int], Sum Int)
ghci> foldr (\a fs -> mappend <$> (pure $ Sum a) <*> fs) ([], mempty) (replicate 100000000 1)
*** Exception: stack overflow
-- Int -> Sum Int
ghci> foldr (\a fs -> mappend <$> (const $ Sum a) <*> fs) (const mempty) (replicate 100000000 1) 4
Sum {getSum = *** Exception: stack overflow
-- ([Int], [Int])
ghci> foldr (\a fs -> mappend <$> (pure $ [a]) <*> fs) ([], mempty) (replicate 100000000 1)
*** Exception: stack overflow
-- Int -> [Int]
ghci> foldr (\a fs -> mappend <$> (const $ [a]) <*> fs) (const mempty) (replicate 100000000 1) 4
[1,1,1,1,1,1,1...
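The GHCi experiments above can be packaged as a small module. This is a hedged sketch, assuming `foldMapA` is the `foldr`-based definition expanded in the traces and using `Foldable` instead of `Container`. The names `pairExample` and `lazyExample` are hypothetical and exist only for illustration.

```haskell
import Data.Monoid (Sum (..))

-- | Applicative fold into a monoid, written with 'foldr' as in the traces.
-- Assumption: 'Foldable' stands in for the library's 'Container' class.
foldMapA :: (Foldable t, Applicative f, Monoid b) => (a -> f b) -> t a -> f b
foldMapA f = foldr (\a fb -> mappend <$> f a <*> fb) (pure mempty)

-- | Pair applicative with a strict monoid: the whole list is consumed
-- before a result is available.
pairExample :: (String, Sum Int)
pairExample = foldMapA (\x -> (show x, Sum x)) [1, 2, 3]

-- | Reader applicative with the list monoid: the result is produced
-- lazily, so even an infinite input can be consumed partially.
lazyExample :: [Int]
lazyExample = take 3 (foldMapA (\x -> const [x]) [1 ..] (4 :: Int))
```

The second example is the `Int -> [Int]` case from the tests: only as much of the input is traversed as the consumer demands, which is why it is the one combination that does not overflow the stack.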
concatMapM has multiple problems. For one, it has a rather confusing name for a function defined like this:

There is nothing related to M here. Moreover, the user might think concatMapM can be used with a strict monoid like Sum, but that would result in a space leak, because the entire container is traversed first by traverse and only then flattened by fold. And the type signature is more restrictive than it should be: there is no need for Traversable; Foldable is enough, as it's just applicative foldMap.

So it should be defined something like this (modulo that Container thing):

But there indeed exists a notion of monadic folding which is useful for folding into a strict monoid:

You can define all of andM, orM, allM, anyM in terms of foldMapM, just like and, or, all, any are defined in terms of foldMap.
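To illustrate the last point, here is a minimal sketch of `andM`, `orM`, `allM` and `anyM` written in terms of `foldMapM`. It assumes `Foldable` in place of `Container` and the strict `foldr`-based `foldMapM` discussed in this thread; the exact signatures are assumptions rather than the library's actual API. Note that this version runs every effect and does not short-circuit.

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.Monoid (All (..), Any (..))

-- | Monadic fold into a monoid with a strict accumulator.
-- Assumption: 'Foldable' stands in for the library's 'Container' class.
foldMapM :: (Foldable t, Monad m, Monoid b) => (a -> m b) -> t a -> m b
foldMapM f xs = foldr step return xs mempty
  where
    step a k = \acc -> do
      b <- f a
      let !acc' = acc <> b
      k acc'

-- | Monadic 'all', mirroring how 'all' can be defined via 'foldMap' and 'All'.
allM :: (Foldable t, Monad m) => (a -> m Bool) -> t a -> m Bool
allM p = fmap getAll . foldMapM (fmap All . p)

-- | Monadic 'any', mirroring how 'any' can be defined via 'foldMap' and 'Any'.
anyM :: (Foldable t, Monad m) => (a -> m Bool) -> t a -> m Bool
anyM p = fmap getAny . foldMapM (fmap Any . p)

-- | Monadic 'and' over a container of actions.
andM :: (Foldable t, Monad m) => t (m Bool) -> m Bool
andM = allM id

-- | Monadic 'or' over a container of actions.
orM :: (Foldable t, Monad m) => t (m Bool) -> m Bool
orM = anyM id
```

The design mirrors the pure versions: wrap each `Bool` in `All` or `Any`, fold monoidally, then unwrap. A version that stops at the first decisive result would need a different definition.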