You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Jul 6, 2023. It is now read-only.
RLP represents tasks as a set of largely independent user requests. Computation is easily partitioned across different requests and is load balanced at the DNS level.
\section{Data-Level Parallelism}
DLP distributes data across different nodes, which operate on the data in parallel. DLP on WSC supports parallelism across multiple machines.
\subsection{MapReduce}
Simple data-parallel programming model designed for scalability and fault-tolerance. Users specify the computation in terms of a \emph{map} function and a \emph{reduce} function.
\medskip
Underlying runtime system:
\begin{itemize}
\item Automatically parallelize the computation across large scale clusters of machines
\item Handles machine failure
\item Schedule inter-machine communication to make efficient use of the networks
\end{itemize}
\subsection{Spark}
Apache Spark is a fast and general engine for large-scale data processing.