Updates to PDFs (#1036)
Fixes #1034
See #1035
rowanc1 authored Oct 30, 2024
1 parent f792cb1 commit 05f9298
Showing 101 changed files with 446 additions and 379 deletions.
Binary file added papers/Alireza_Vaezi/full_text.pdf
Binary file not shown.
Binary file added papers/Alireza_Vaezi/meca.zip
Binary file not shown.
Binary file added papers/Arushi_Nath/full_text.pdf
Binary file not shown.
1 change: 1 addition & 0 deletions papers/Arushi_Nath/main.md
@@ -1,6 +1,7 @@
---
# Ensure that this title is the same as the one in `myst.yml`
title: Algorithms to Determine Asteroid’s Physical Properties using Sparse and Dense Photometry, Robotic Telescopes and Open Data
short_title: Algorithms to Determine Asteroid’s Physical Properties
abstract: |
The rapid pace of discovering asteroids due to advancements in detection techniques outpaces current abilities to analyze them comprehensively. Understanding an asteroid's physical properties is crucial for effective deflection strategies and improves our understanding of the solar system's formation and evolution. Dense photometry provides continuous time-series measurements valuable for determining an asteroid's rotation period, yet is limited to a singular phase angle. Conversely, sparse photometry offers non-continuous measurements across multiple phase angles, essential for determining an asteroid's absolute magnitude, albedo (reflectivity), and size. This paper presents open-source algorithms that integrate dense photometry from citizen scientists with sparse photometry from space and ground-based all-sky surveys to determine asteroids' albedo, size, rotation, strength, and composition.
Applying the algorithms to the Didymos binary asteroid, combined with data from GAIA, the Zwicky Transient Facility, and ATLAS photometric sky surveys, revealed Didymos to be 840 meters wide, with a 0.14 albedo, an 18.14 absolute magnitude, a 2.26-hour rotation period, rubble-pile strength, and an S-type composition. Didymos was the target of the 2022 NASA Double Asteroid Redirection Test (DART) mission. The algorithm successfully measured a 35-minute decrease in the mutual orbital period following the DART mission, equating to a 40-meter reduction in the mutual orbital radius, proving a successful deflection. Analysis of the broader asteroid population highlighted significant compositional diversity, with a predominance of carbonaceous (C-type) asteroids in the outer regions of the asteroid belt and siliceous (S-type) and metallic (M-type) asteroids more common in the inner regions. These findings provide insights into the diversity and distribution of asteroid compositions, reflecting the conditions and processes of the early solar system.
Binary file added papers/Arushi_Nath/meca.zip
Binary file not shown.
1 change: 1 addition & 0 deletions papers/Arushi_Nath/myst.yml
@@ -6,6 +6,7 @@ project:
id: scipy-2024-Arushi_Nath
# Ensure your title is the same as in your `main.md`
title: Algorithms to Determine Asteroid’s Physical Properties using Sparse and Dense Photometry, Robotic Telescopes and Open Data
short_title: Algorithms to Determine Asteroid’s Physical Properties
# Authors should have affiliations, emails and ORCIDs if available
authors:
- name: Arushi Nath
Binary file added papers/Gagnon_Kebe_Tahiri/full_text.pdf
Binary file not shown.
2 changes: 2 additions & 0 deletions papers/Gagnon_Kebe_Tahiri/main.tex
@@ -1,3 +1,5 @@
\title{Ecological and Spatial Influences on the Genetics of Cumacea (Crustacea: Peracarida) in the Northern North Atlantic}

\begin{abstract}
The peracarid taxon Cumacea is an essential indicator of benthic quality in marine ecosystems. This study investigated the influence of environmental (i.e., biological or ecosystemic), climatic (i.e., meteorological or atmospheric), and spatial (i.e., geographic or regional) variables on their genetic variability and adaptability in the Northern North Atlantic, focusing on Icelandic waters. We analyzed partial sequences of the 16S rRNA mitochondrial gene from 62 Cumacea specimens. Using the \textit{aPhyloGeo} software, we compared these sequences with relevant variables such as latitude (decimal degree) at the end of sampling, wind speed (m/s) at the start of sampling, O\textsubscript{2} concentration (mg/L), and depth (m) at the start of sampling.

Binary file added papers/Gagnon_Kebe_Tahiri/meca.zip
Binary file not shown.
6 changes: 6 additions & 0 deletions papers/Gagnon_Kebe_Tahiri/myst.yml
@@ -40,5 +40,11 @@ project:
- rule: doi-exists
severity: ignore
keys:
exports:
- id: pdf
format: typst
template: /Users/rowan/git/typst/scipy
article: main.tex
output: full_text.pdf
site:
template: article-theme
Binary file added papers/Suvrakamal_Das/full_text.pdf
Binary file not shown.
Binary file added papers/Suvrakamal_Das/meca.zip
Binary file not shown.
Binary file added papers/Valeria_Martin/full_text.pdf
Binary file not shown.
103 changes: 52 additions & 51 deletions papers/Valeria_Martin/main.tex

Large diffs are not rendered by default.

Binary file added papers/Valeria_Martin/meca.zip
Binary file not shown.
6 changes: 6 additions & 0 deletions papers/Valeria_Martin/myst.yml
@@ -89,5 +89,11 @@ project:
- '8113128'
- DBLP:RonnebergerFB15
- lecun
exports:
- id: pdf
format: typst
template: /Users/rowan/git/typst/scipy
article: main.tex
output: full_text.pdf
site:
template: article-theme
Binary file added papers/alan_lujan/full_text.pdf
Binary file not shown.
Binary file added papers/alan_lujan/meca.zip
Binary file not shown.
7 changes: 7 additions & 0 deletions papers/alan_lujan/myst.yml
@@ -51,3 +51,10 @@ project:
- Bradbury2018
- Pedregosa2011
- Paszke2019
toc:
- file: main.md
- file: notebooks/Curvilinear_Interpolation.ipynb
- file: notebooks/Multivalued_Interpolation.ipynb
- file: notebooks/Multivariate_Interpolation.ipynb
- file: notebooks/Multivariate_Interpolation_with_Derivatives.ipynb
- file: notebooks/Unstructured_Interpolation.ipynb
10 changes: 0 additions & 10 deletions papers/alan_lujan/notebooks/Curvilinear_Interpolation.ipynb

Large diffs are not rendered by default.

10 changes: 0 additions & 10 deletions papers/alan_lujan/notebooks/Unstructured_Interpolation.ipynb

Large diffs are not rendered by default.

Binary file added papers/aleksandar_makelov/full_text.pdf
Binary file not shown.
196 changes: 127 additions & 69 deletions papers/aleksandar_makelov/main.tex
@@ -1,3 +1,5 @@
\title[Mandala]{Mandala: Compositional Memoization for Simple \& Powerful Scientific Data Management}

\begin{abstract}
We present
\href{https://github.com/amakelov/mandala}{\texttt{mandala}}, a Python
@@ -56,17 +58,56 @@ \section{Introduction}
\centering
\begin{subfigure}{0.23\textwidth}
\centering
\includegraphics[width=\textwidth]{img/fig1.pdf}
% \includegraphics[width=0.23\textwidth]{img/fig1.pdf}
\begin{lstlisting}[language=python]
# decorate any
# Python funcs
@op
def f(x):
return x**2

@op
def g(x, y):
return x + y

...
\end{lstlisting}
\caption{}
\end{subfigure}
\begin{subfigure}{0.35\textwidth}
\centering
\includegraphics[width=\textwidth]{img/fig2.pdf}
% \includegraphics[width=0.35\textwidth]{img/fig2.pdf}
\begin{lstlisting}[language=python]
storage = Storage()

# memoizing context
with storage:
for x in range(3):
y = f(x)

In [1]: y # wrapped value
Out[1]: AtomRef(4,
hid='628...',
cid='a82...')
\end{lstlisting}
\caption{}
\end{subfigure}
\begin{subfigure}{0.4\textwidth}
\centering
\includegraphics[width=\textwidth]{img/fig3.pdf}
% \includegraphics[width=0.4\textwidth]{img/fig3.pdf}
\begin{lstlisting}[language=python]
with storage:
# just add more calls
# & reuse old results
for x in range(5):
y = f(x)
# unwrap for control flow
if storage.unwrap(y) > 5:
z = g(x, y)

# the "program" is now end-to-end
# memoized & retraceable
\end{lstlisting}
\caption{}
\end{subfigure}
\caption{Basic imperative usage of \texttt{mandala}. \textbf{(a)}: add the \texttt{@op}
@@ -99,19 +140,19 @@ \section{Introduction}
The rest of this paper presents the design and main functionalities of
\texttt{mandala}, and is organized as follows:
\begin{itemize}
\item In Section \ref{section:core-concepts}, we describe how memoization is
\item In \autoref{section:core-concepts}, we describe how memoization is
designed, how this allows memoized calls to be composed and memoized results to
be reused without storage duplication, and how this enables the \emph{retracing}
pattern of interacting with computational artifacts.
\item In Section \ref{section:cf}, we introduce the concept of a
\item In \autoref{section:cf}, we introduce the concept of a
\emph{computation frame}, which generalizes a dataframe by replacing columns
with a computational graph, and rows with individual computations that
(partially) follow this graph. Computation frames allow high-level exploration
and manipulation of the stored computation graph, such as adding the calls that
produced/used given values to the graph, deleting all computations that depend
on the calls captured in the frame, and restricting the frame to a particular
subgraph or subset of values with given properties.
\item In Section \ref{section:extra-features}, we describe some other features of
\item In \autoref{section:extra-features}, we describe some other features of
\texttt{mandala} necessary to make it a practical tool, such as:
\begin{itemize}
\item Representing Python collections in a way transparent to the storage, so
@@ -122,7 +163,7 @@ \section{Introduction}
\end{itemize}
\end{itemize}

Finally, we give an overview of related work in Section \ref{section:related-work}.
Finally, we give an overview of related work in \autoref{section:related-work}.

\section{Core Concepts}
\label{section:core-concepts}
@@ -133,7 +174,7 @@ \subsection{Memoization and the Computational Graph}
to avoid redundant computation. \texttt{mandala} uses \emph{automatic}
memoization \citep{norvig1991techniques} which is applied via the combination of
a decorator (\texttt{@op}) and a context manager which specifies the
\texttt{Storage} object to use (Figure \ref{fig:basic-usage}). The memoization
\texttt{Storage} object to use (\autoref{fig:basic-usage}). The memoization
can optionally be made persistent to disk, which is what you would typically
want in a long-running project. Any Python function can be memoized (as long as
its inputs and outputs are serializable by the \texttt{joblib} library; see the
@@ -227,8 +268,7 @@ \subsection{Motivation for the Design of Memoization}
composition of \texttt{@op}s, it \textbf{automatically builds up a computational
graph of the project}. Most data management tasks --- e.g., a frequent use case
is getting a table of relationships between some variables --- are naturally
expressed as queries over this graph, as we will see in Section
\ref{section:cf}.
expressed as queries over this graph, as we will see in \autoref{section:cf}.
\item It \textbf{organizes storage functionality around a familiar and flexible
interface: the function call}. This automatically enforces the good practice
of partitioning code into functions, and eliminates extra `accidental' code to
@@ -242,9 +282,9 @@ \subsection{Motivation for the Design of Memoization}
\item Referring to values without reference to the code that produced or used
them becomes difficult, because from the point of view of storage the `identity'
of a value is its place in the computational graph. We discuss practical ways to
overcome this in Section \ref{section:cf}.
overcome this in \autoref{section:cf}.
\item Modifying \texttt{@op} functions requires care, as changes may invalidate
the stored computational graph. We discuss a versioning system that automates this process in Section \ref{subsection:versioning}.
the stored computational graph. We discuss a versioning system that automates this process in \autoref{subsection:versioning}.
\end{itemize}

\subsection{Retracing as a Versatile Imperative Interface to the Stored Computation Graph}
@@ -256,8 +296,7 @@ \subsection{Retracing as a Versatile Imperative Interface to the Stored Computat
way to interact with such a persisted computation is through \textbf{retracing},
which means stepping through memoized code with the purpose of resuming from a
failure, loading intermediate values, or continuing from a particular point with
new computations. A small example of retracing is shown in Figure
\ref{fig:basic-usage} (c).
new computations. A small example of retracing is shown in \autoref{fig:basic-usage} (c).
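Retracing can be illustrated with a toy cache, a hypothetical stand-in for `@op` plus a `Storage` context rather than mandala's API: re-running the same code walks through memoized calls without re-executing their bodies, so intermediate values can be reloaded cheaply.

```python
# Toy stand-in for @op + Storage: count real executions so we can
# observe that a second pass over the same code recomputes nothing.
calls = {"f": 0}
_cache = {}

def f(x):
    # simulate an expensive memoized step
    if x not in _cache:
        calls["f"] += 1
        _cache[x] = x ** 2
    return _cache[x]

# first pass: computes and stores every result
first = [f(x) for x in range(4)]

# "retracing": step through the identical code again to reload
# intermediates or resume from a chosen point -- cache hits only
second = [f(x) for x in range(4)]
```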

This pattern is simple yet powerful, as it allows the user to interact with the
stored computation graph in a way that is adapted to their use case, and to
@@ -270,52 +309,58 @@ \section{Computation Frames}

\begin{figure}[htbp]
\centering
\begin{minipage}[b]{0.48\textwidth}
\begin{subfigure}[b]{\textwidth}
\centering
\includegraphics[width=\textwidth]{img/fig4.pdf}
\caption{Continuing from Figure \ref{fig:basic-usage}, we first
create a computation frame from a single function \texttt{f}, then
expand it to include all calls that can be reached from the memoized
calls to \texttt{f} via their inputs/outputs, and finally convert
the computation frame into a dataframe. We see that this
automatically produces a computation graph corresponding to the
computations found.}
\label{fig:figure1}
\end{subfigure}

\vspace{1em}

\begin{subfigure}[b]{\textwidth}
\centering
\includegraphics[width=\textwidth]{img/fig5.pdf}
\caption{The output of the call to \texttt{.eval()} from the left
subfigure used to turn the computation frame into a dataframe. The
resulting table has columns for all variables and functions
appearing in the captured computation graph, and each row corresponds
to a partial computation following this graph. The variable columns
contain values these variables take, whereas function columns
contain call objects representing the memoized calls to the
respective functions. We see that, because we call \texttt{g}
conditional on the output of \texttt{f}, some rows have nulls in the
\texttt{g} column.}
\label{fig:figure2}
\end{subfigure}
\end{minipage}
\hfill
\begin{minipage}[b]{0.45\textwidth}
\begin{subfigure}[b]{\textwidth}
\centering
\includegraphics[width=\textwidth]{img/cf.pdf}
\caption{A visualization of the computation frame from the previous
two subfigures. The red nodes indicate functions, and the blue nodes
indicate variables in the computation graph. Each edge is labeled
with the input/output name of the adjacent function. Nodes and edges
also show the number of \texttt{Ref}s and \texttt{Call}s they
represent.}
\label{fig:figure3}
\end{subfigure}
\end{minipage}
\begin{subfigure}[b]{\textwidth}
\centering
% \includegraphics[width=0.45\textwidth]{img/fig4.pdf}
\begin{lstlisting}[language=python]
In [1]:
# get the computation frame for f
storage.cf(f).\
# add all computations reachable
# from calls to f
expand().\
# extract as a dataframe
eval()
Out[1]: Extracting tuples from the
computation graph:
output_0 = f(x=x)
output_1 = g(y=output_0, x=x)
\end{lstlisting}
\caption{Continuing from \autoref{fig:basic-usage}, we first
create a computation frame from a single function \texttt{f}, then
expand it to include all calls that can be reached from the memoized
calls to \texttt{f} via their inputs/outputs, and finally convert
the computation frame into a dataframe. We see that this
automatically produces a computation graph corresponding to the
computations found.}
\label{fig:figure1}
\end{subfigure}
\begin{subfigure}[b]{\textwidth}
\centering
\includegraphics[width=0.80\linewidth]{img/fig5.pdf}
\caption{The output of the call to \texttt{.eval()} from the previous
subfigure, used to turn the computation frame into a dataframe. The
resulting table has columns for all variables and functions
appearing in the captured computation graph, and each row corresponds
to a partial computation following this graph. The variable columns
contain values these variables take, whereas function columns
contain call objects representing the memoized calls to the
respective functions. We see that, because we call \texttt{g}
conditional on the output of \texttt{f}, some rows have nulls in the
\texttt{g} column.}
\label{fig:figure2}
\end{subfigure}
\begin{subfigure}[b]{\textwidth}
\centering
\includegraphics[width=0.45\textwidth]{img/cf.pdf}
\caption{A visualization of the computation frame from the previous
two subfigures. The red nodes indicate functions, and the blue nodes
indicate variables in the computation graph. Each edge is labeled
with the input/output name of the adjacent function. Nodes and edges
also show the number of \texttt{Ref}s and \texttt{Call}s they
represent.}
\label{fig:figure3}
\end{subfigure}
\caption{Basic declarative usage of \texttt{mandala} and an example of
computation frames.}
\label{fig:cf}
@@ -334,9 +379,8 @@ \subsection{Motivation and Intuition}
\texttt{@op} calls into groups, where the calls in each group have an analogous
role in the computation, and the groups form a high-level computational graph of
variables (which represent groups of \texttt{Ref}s) and functions (groups of
\texttt{Call}s). The illustration in Figure \ref{fig:cf} (c) shows a
visualization of a computation frame extracted from the computations in Figure
\ref{fig:basic-usage}.
\texttt{Call}s). The illustration in \autoref{fig:cf} (c) shows a
visualization of a computation frame extracted from the computations in \autoref{fig:basic-usage}.

This kind of organization is useful because it reflects how the user thinks
about the computation, and allows them to tailor the exploration of the
@@ -355,12 +399,12 @@ \subsection{Formal Definition}
\label{subsection:cf-definition}


A computation frame (Figure \ref{fig:cf}) consists of the following data:
A computation frame (\autoref{fig:cf}) consists of the following data:
\begin{itemize}
\item \textbf{Computation graph}: a directed graph $G=(V,F,E)$ where $V$ are
named variables and $F$ are named instances of \texttt{@op}-decorated functions.
The edges $E$ are labeled with the input/output names of the adjacent functions.
An example is shown in Figure \ref{fig:cf} (c);
An example is shown in \autoref{fig:cf} (c);
\item \textbf{Groups of \texttt{Ref}s and \texttt{Call}s}: for each variable
$v\in V$, a set of (history IDs of) \texttt{Ref}s $R_v$, and for each function
$f\in F$ with underlying \texttt{@op} $o_f$, a set of (history IDs of) \texttt{Call}s $C_f$;
@@ -384,7 +428,7 @@ \subsection{Basic Usage}
\item \textbf{Iteratively expanding the frame with functions that generated or
used existing variables}: this is useful for exploring the computation graph in
a particular direction, or for adding more context to a particular computation.
For example, in Figure \ref{fig:cf} (a), we start with a computation frame
For example, in \autoref{fig:cf} (a), we start with a computation frame
containing only the calls to \texttt{f}, and then expand it to include all calls
that can be reached from the memoized calls to \texttt{f} via their
inputs/outputs, which adds the calls to \texttt{g} to the frame.
@@ -394,7 +438,7 @@ \subsection{Basic Usage}
\texttt{Ref}s in the frame's computational graph (i.e., those that are not
inputs to any function in the frame), computing their computational history in
the frame (grouped by variable), and joining the resulting tables over the
variables. This is shown in \autoref{fig:cf} (b). In particular, as shown
variables. This is shown in \autoref{fig:cf} (right). In particular, as shown
in the example, this step may produce nulls, as the computation frame can
contain computations that only partially follow the graph.
\item \textbf{Performing high-level storage manipulations}: such as deleting all
@@ -421,7 +465,21 @@ \subsection{Data Structures}

\begin{wrapfigure}[18]{l}{0.45\textwidth}
\centering
\includegraphics[width=\linewidth]{img/list.pdf}
% \includegraphics[width=0.45\linewidth]{img/list.pdf}
\begin{lstlisting}[language=python]
@op
def avg_items(xs: MList[int]) -> float:
return sum(xs) / len(xs)

@op
def get_xs(n) -> MList[int]:
return list(range(n))

with storage:
xs = get_xs(10)
for i in range(2, 10, 2):
avg = avg_items(xs[:i])
\end{lstlisting}
\caption{Illustration of native collection memoization in \texttt{mandala}.
The custom type annotation \texttt{MList[int]} is used to memoize a list of
integers as a list of pointers to element \texttt{Ref}s.}
Expand All @@ -435,7 +493,7 @@ \subsection{Data Structures}
collections are naturally incorporated in the computation graph. These internal
\texttt{@op}s are applied automatically when a collection is passed as an
argument to a memoized function, or when a collection is returned from a
memoized function (Figure \ref{fig:list}).
memoized function (\autoref{fig:list}).
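The pointer-based storage of collections can be sketched as follows. The helper names (`store_atom`, `store_list`) are hypothetical and only illustrate the idea that a stored list holds content ids of element `Ref`s rather than copies of the values.

```python
import hashlib
import pickle

# Sketch: a list is persisted as a list of pointers (content ids)
# into an atom store, so a shared element value is stored exactly once
# no matter how many stored lists contain it.
atoms = {}  # content id -> stored value

def store_atom(value):
    cid = hashlib.sha256(pickle.dumps(value)).hexdigest()
    atoms[cid] = value
    return cid

def store_list(values):
    # no element values are duplicated: only pointers are kept
    return [store_atom(v) for v in values]

xs = store_list(range(10))
prefix = store_list(range(5))  # re-uses the first 5 atoms of xs
```

Storing the five-element prefix adds no new atoms: its pointers are a subset of those already held for `xs`, which is the deduplication the transparent collection `@op`s provide.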

% \subsection{Caching}
% To speed up retracing and memoization, it is necessary to avoid frequent reads
Binary file added papers/aleksandar_makelov/meca.zip
Binary file not shown.
6 changes: 6 additions & 0 deletions papers/aleksandar_makelov/myst.yml
@@ -34,5 +34,11 @@ project:
- maymounkov2018koji
- semver
- lozano2017unison
exports:
- id: pdf
format: typst
template: /Users/rowan/git/typst/scipy
article: main.tex
output: full_text.pdf
site:
template: article-theme
Binary file added papers/amadi_udu/full_text.pdf
Binary file not shown.