diff --git a/doc/basic_usage.rst b/doc/basic_usage.rst
index fba7e5d7..05436966 100644
--- a/doc/basic_usage.rst
+++ b/doc/basic_usage.rst
@@ -142,7 +142,7 @@ can a dimension reduction technique like UMAP do for us?
 By reducing the dimension in a way that preserves as much of the structure
 of the data as possible we can get a visualisable representation of the
 data allowing us to "see" the data and its structure and begin to get some
-inuitions about the data itself.
+intuition about the data itself.

 To use UMAP for this task we need to first construct a UMAP object that
 will do the job for us. That is as simple as instantiating the class. So
@@ -198,7 +198,7 @@ the original).
 This does a useful job of capturing the structure of the data, and as
 can be seen from the matrix of scatterplots this is relatively
 accurate. Of course we learned at least this much just from that matrix of
-scatterplots -- which we could do since we only had four differnt
+scatterplots -- which we could do since we only had four different
 dimensions to analyse. If we had data with a larger number of dimensions
 the scatterplot matrix would quickly become unwieldy to plot, and far
 harder to interpret. So moving on from the Iris dataset, let's consider
@@ -362,7 +362,7 @@ of the reducer object, or call transform on the original data.
 We now have a dataset with 1797 rows (one for each hand-written digit
 sample), but only 2 columns. As with the Iris example we can now plot
 the resulting embedding, coloring the data points by the class that
-theyr belong to (i.e. the digit they represent).
+they belong to (i.e. the digit they represent).

 .. code:: python3

diff --git a/doc/clustering.rst b/doc/clustering.rst
index f3ae97ac..e0bbbca7 100644
--- a/doc/clustering.rst
+++ b/doc/clustering.rst
@@ -4,14 +4,14 @@ Using UMAP for Clustering
 UMAP can be used as an effective preprocessing step to boost the
 performance of density based clustering. This is somewhat controversial,
 and should be attempted with care. For a good discussion of some of the
-issues involved in this please see the various answers `in this
+issues involved in this, please see the various answers `in this
 stackoverflow thread `__
 on clustering the results of t-SNE. Many of the points of concern
 raised there are salient for clustering the results of UMAP. The
 most notable is that UMAP, like t-SNE, does not completely preserve
 density. UMAP,
-like t-SNE, can also create tears in clusters that are not actually
-present, resulting in a finer clustering than is necessarily present in
+like t-SNE, can also create false tears in clusters, resulting in a
+finer clustering than is necessarily present in
 the data. Despite these concerns there are still valid reasons to use
 UMAP as a preprocessing step for clustering. As with any clustering
 approach one will want to do some exploration and evaluation of the
@@ -136,7 +136,7 @@ of largely spherical clusters -- this is responsible for some of the
 sharp divides that K-Means puts across digit classes. We can potentially
 improve on this by using a smarter density based algorithm. In this case
 we've chosen to try HDBSCAN, which we believe to be among the most
-advanced density based tehcniques. For the sake of performance we'll
+advanced density based techniques. For the sake of performance we'll
 reduce the dimensionality of the data down to 50 dimensions via PCA
 (this recovers most of the variance), since HDBSCAN scales somewhat
 poorly with the dimensionality of the data it will work on.
diff --git a/doc/exploratory_analysis.rst b/doc/exploratory_analysis.rst
index f44c79eb..b0c8f7e9 100644
--- a/doc/exploratory_analysis.rst
+++ b/doc/exploratory_analysis.rst
@@ -18,7 +18,7 @@ exactly this, and the results are fascinating. While they may not actually tell
 anything new about number theory they do highlight interesting structures in prime
 factorizations, and demonstrate how UMAP can aid in interesting explorations of
 datasets that we might think we know well. It's worth visiting the linked article
-below as Dr. Williamson provides a rich and detiled exploration of UMAP as
+below as Dr. Williamson provides a rich and detailed exploration of UMAP as
 applied to prime factorizations of integers.

 .. image:: images/umap_primes.png
@@ -50,11 +50,11 @@ Language, Context, and Geometry in Neural Networks
 Among recent developments in natural language processing is the BERT neural network
 based technique for analysis of language. Among many things that BERT can do one is
 context sensitive embeddings of words -- providing numeric vector representations of words
-that are sentive to the context of how the word is used. Exactly what goes on inside
+that are sensitive to the context of how the word is used. Exactly what goes on inside
 the neural network to do this is a little mysterious (since the network is very complex
 with many many parameters). A tram of researchers from Google set out to explore the
 word embedding space generated by BERT, and among the tools used was UMAP. The linked
-blog post provides a detailed and inspirign analysis of what BERT's word embeddings
+blog post provides a detailed and inspiring analysis of what BERT's word embeddings
 look like, and how the different layers of BERT represent different aspects of language.

 .. image:: images/bert_embedding.png
@@ -91,7 +91,7 @@ gives you over 150,000 texts to consider. Since the texts are open you can actua
 the text content involved. With some NLP and neural network wizardry David McClure build a
 network of such texts and then used node2vec and UMAP to generate a map of them. The result
 is a galaxy of textbooks showing inter-relationships between subjects, similar and related texts,
-and genrally just a an interesting ladscape of science to be explored. As with some
+and generally just an interesting landscape of science to be explored. As with some
 of the other projects here David made a great interactive viewer allowing for rich
 exploration of the results.

diff --git a/doc/supervised.rst b/doc/supervised.rst
index d91eaf51..0b670100 100644
--- a/doc/supervised.rst
+++ b/doc/supervised.rst
@@ -24,7 +24,7 @@ seaborn for plotting.

 Our example dataset for this exploration will be the `Fashion-MNIST
 dataset from Zalando Research `__. It is
-desgined to be a drop-in replacement for the classic MNIST digits
+designed to be a drop-in replacement for the classic MNIST digits
 dataset, but uses images of fashion items (dresses, coats, shoes, bags,
 etc.) instead of handwritten digits. Since the images are more complex
 it provides a greater challenge than MNIST digits. We can load it in
@@ -86,7 +86,7 @@ a scatterplot.
 That took a little time, but not all that long considering it is 70,000
 data points in 784 dimensional space. We can simply plot the results as
 a scatterplot, colored by the class of the fashion item. We can use
-matplotlibs colorbar with suitable tick-labels to give us the color key.
+matplotlib's colorbar with suitable tick-labels to give us the color key.

 .. code:: python3

@@ -109,7 +109,7 @@ separate quite so cleanly. In particular T-shirts, shirts, dresses,
 pullovers, and coats are all a little mixed. At the very least the
 dresses are largely separated, and the T-shirts are mostly in one large
 clump, but they are not well distinguished from the others. Worse still
-are the coats, shirts, and pullovers (somewhat unsruprisingly as these
+are the coats, shirts, and pullovers (somewhat unsurprisingly as these
 can certainly look very similar) which all have significant overlap with
 one another. Ideally we would like much better class separation. Since
 we have the label information we can actually give that to UMAP to use!
@@ -169,7 +169,7 @@ distinct banding pattern that was visible in the original unsupervised
 case; the pants, t-shirts and bags both retained their shape and
 internal structure; etc. The second point to note is that we have also
 retained the global structure. While the individual classes have been
-cleanly seprated from one another, the inter-relationships among the
+cleanly separated from one another, the inter-relationships among the
 classes have been preserved: footwear classes are all near one another;
 trousers and bags are at opposite sides of the plot; and the arc of
 pullover, shirts, t-shirts and dresses is still in place.
@@ -177,7 +177,7 @@ pullover, shirts, t-shirts and dresses is still in place.
 The key point is this: the important structural properties of the data
 have been retained while the known classes have been cleanly pulled
 apart and isolated. If you have data with known classes and want to
-seprate them while still having a meaningful embedding of individual
+separate them while still having a meaningful embedding of individual
 points then supervised UMAP can provide exactly what you need.

 Using Partial Labelling (Semi-Supervised UMAP)
@@ -198,7 +198,7 @@ the noise points from a DBSCAN clustering).

 Now that we have randomly masked some of the labels we can try to
 perform supervised learning again. Everything works as before, but UMAP
-will interpret the -1 label as beingan unlabelled point and learn
+will interpret the -1 label as being an unlabelled point and learn
 accordingly.

 .. code:: python3
@@ -338,7 +338,7 @@ including much of the internal structure of the classes. For the most
 part assignment of new points follows the classes well. The greatest
 source of confusion in some t-shirts that ended up in mixed with the
 shirts, and some pullovers which are confused with the coats. Given the
-difficulty of the problemn this is a good result, particularly when
+difficulty of the problem this is a good result, particularly when
 compared with current state-of-the-art approaches such as `siamese and
 triplet networks `__.