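I am trying to run multiple downloads in parallel in Haskell, for which I would normally just use the Control.Concurrent.Async.mapConcurrently function. However, doing so opens ~3000 connections, which causes the web server to reject them. Is it possible to accomplish the same task as mapConcurrently, but with only a limited number of connections open at a time (e.g. only 2 or 4 at once)?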
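A quick solution is to use a semaphore to limit the number of concurrent actions. It is not optimal (all threads are created at once and then wait for the semaphore), but it works: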
    import Control.Concurrent.MSem
    import Control.Concurrent.Async
    import Control.Concurrent (threadDelay)
    import qualified Data.Traversable as T

    mapPool :: T.Traversable t => Int -> (a -> IO b) -> t a -> IO (t b)
    mapPool max f xs = do
        sem <- new max
        mapConcurrently (with sem . f) xs

    -- A little test:
    main = mapPool 10 (\x -> threadDelay 1000000 >> print x) [1..100]
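If creating one thread per item up front is a concern, one alternative is a fixed set of worker threads that pull jobs from a shared queue. The following is only a minimal sketch using nothing but `base`. The names `mapWorkers` and `trySome` are hypothetical helpers made up for this illustration, not part of async or any other library. It assumes a plain list as input and that workers are not killed by asynchronous exceptions (a killed worker would leave its result slot empty and the caller blocked).

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (modifyMVar, newEmptyMVar, newMVar, putMVar, takeMVar)
import Control.Exception (SomeException, throwIO, try)
import Control.Monad (forM, replicateM_)

-- Hypothetical helper: 'try' fixed to SomeException so no type annotation is needed later.
trySome :: IO b -> IO (Either SomeException b)
trySome = try

-- Hypothetical helper: apply f to every element using at most n worker threads.
-- Results are returned in the same order as the inputs.
mapWorkers :: Int -> (a -> IO b) -> [a] -> IO [b]
mapWorkers n f xs = do
  -- Pair every input with an empty slot that will receive its result.
  jobs <- forM xs $ \x -> do
    slot <- newEmptyMVar
    return (x, slot)
  -- Shared queue of jobs that have not been started yet.
  queue <- newMVar jobs
  let worker = do
        -- Atomically take the next job off the queue, if there is one.
        next <- modifyMVar queue $ \q -> case q of
          []         -> return ([], Nothing)
          (j : rest) -> return (rest, Just j)
        case next of
          Nothing        -> return ()          -- queue is empty: this worker stops
          Just (x, slot) -> do
            r <- trySome (f x)                 -- capture exceptions thrown by f
            putMVar slot r
            worker
  -- Start the workers; only this many threads ever exist.
  replicateM_ (max 1 n) (forkIO worker)
  -- Collect results in input order, rethrowing the first stored exception.
  forM jobs $ \(_, slot) -> takeMVar slot >>= either throwIO return

-- A little test: at most 4 actions run at the same time.
main :: IO ()
main = do
  rs <- mapWorkers 4 (\x -> threadDelay 10000 >> return (x * 2)) [1 .. 20 :: Int]
  print rs
```

Unlike mapConcurrently, this sketch does not cancel the remaining jobs when one of them fails; the other workers keep draining the queue in the background.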
You could also try the pooled-io package, with which you can write:
    import qualified Control.Concurrent.PooledIO.Final as Pool
    import Control.DeepSeq (NFData)
    import Data.Traversable (Traversable, traverse)

    mapPool :: (Traversable t, NFData b) => Int -> (a -> IO b) -> t a -> IO (t b)
    mapPool n f = Pool.runLimited n . traverse (Pool.fork . f)