forked from floswald/ScPoProgramming
-
Notifications
You must be signed in to change notification settings - Fork 0
/
05-git.qmd
724 lines (457 loc) · 13.1 KB
/
05-git.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
---
title: "Version Control with `Git`"
format:
revealjs:
theme: _extensions/metropolis-theme/metropolis.scss
chalkboard: true
logo: /images/ScPo-logo.png
footer: ScPo Intro To Programming 2023
incremental: false
code-line-numbers: false
highlight-style: github
author: Florian Oswald and The Software Carpentry
---
## Version What?
::::: {.callout-note}
# Question
* What is Version Control and Why Should I Care?
:::::
::::: {.callout-tip}
# Objectives
- Understand the benefits of an automated version control system.
- Understand the basics of how automated version control systems work.
::::::
---
## Final.doc
!["Piled Higher and Deeper" by Jorge Cham, http://www.phdcomics.com](/images/phd101212s.png)
---
## Undo
* The *latest version* is often best for text documents.
* However, sometimes our view of *best* evolves. Then, we want to *undo*.
* *Undo* means going back in history.
. . .
* MS Word etc have *track changes* features.
* Once you accepted the proposed changes of a collaborator, can you go back?
* What about Dropbox-like solutions? (What *is* dropbox actually?)
---
## Which Version: 20210611_draft.tex
:::: {.columns}
::: {.column width="50%"}
* Hey, fixed that thing last week.
* In `20220629-paper.tex`?
* Erm. Yes. No. I think `20211203-paper.tex` - messed up the file name.
* Ok, can you copy it into the latest version?
* Sure. Damn, can't find it anymore. I'll just write it again. All in my head. 🤯
:::
::: {.column width="40%"}
!["True Story" by Florian Oswald](/images/which-version.png)
:::
::::
---
## Which Version 2: **Why is the sample size so small suddenly**?
::::: {.columns}
:::: {.column width="40%"}
* We had 800 observations, now 733. Why?
* Erm...😱 No clue!
* Well you must have changed the code.
* Yes, I *improved* the code in several parts.
* Well you have to find out what happened.
* But that was weeks ago - I don't remember! 😢
::::
:::: {.column width="50%"}
### Hard Bugs
* The hard bugs 🐛 are the ones you see only after a while.
* See result today, error was introduced long ago.
* You can rewind dropbox 30 days. What if... ?
* Also, throw away 30 days of work?
* 😱 😱 😱 😱
::::
:::::
---
## {background-image="./images/removed-that.png" background-size=100%}
---
## Setting Up Git
* We all installed `git`.
* Let's setup our name
```bash
$ git config --global user.name "Your Name"
$ git config --global user.email "[email protected]"
```
* Line Endings on Windows:
```bash
git config --global core.autocrlf false
```
---
## Creating a Git **Repository**
::::: {.callout-note}
# Question
* Where does Git store information?
:::::
::::: {.callout-tip}
# Objectives
- Create a local repository
- Describe purpose of `.git` directory
::::::
---
## Gas Prices Project
* Let's create a project folder in our home to look at the gas prices from last week.
```bash
$ cd # going to home dir
$ mkdir gasprices # create directory
$ cd gasprices
$ git init
```
* Now the directory `~/gasprices` is endowed with `git` version control.
* What does that look like?
---
## Where is Git?
* Remember *hidden files* and folders?
```bash
$ ls -a
./ ../ .git/
```
* Git for this repository resides in `.git`
::: {.callout-warning}
# Danger Zone
* If you _delete_ that folder, the entire version control is GONE.
* Be very careful that you really want to do that.
:::
---
## Tracking Changes with Git
::::: {.callout-note}
# Question
* How do I record changes in Git?
* How do I check the status of my version control repository?
* How do I record notes about what changes I made and why?
:::::
::::: {.callout-tip}
# Objectives
- Understand the benefits of an automated version control system.
- Understand the basics of how automated version control systems work.
::::::
---
## Adding Code and Text
::: {.callout-note}
* Notice: The code we produce **is** text.
* Remember what we learned about **file endings**.
:::
* Let's add a shell script where we add our pipeline from last week.
1. run to get the raw data again:
```bash
wget https://www.data.gouv.fr/fr/datasets/r/64e02cff-9e53-4cb2-adfd-5fcc88b2dc09 -O ~/gasprices/carburants.csv
```
---
## Adding Code and Text
2. create a script
```bash
nano maketable.sh # open nano
# type this:
cd ~/gasprices # make sure we are in the right place
cut -d ';' -f 5 carburants.csv | tr [:lower:] [:upper:] | sort | uniq -c | sort
# save and exit
```
3. (Does it work?)
```bash
ls . # check the new file is there
./maketable.sh # run it!
```
. . .
4. No, it doesn't. 😖
```bash
chmod +x ./maketable.sh # add executable mode
ls -a
./maketable.sh
```
---
## Viewing Changes
* Ok, now let's see what `git` makes of our additions to this directory.
```tcl
floswald@PTL11077 ~/gasprices (main)> git status
On branch main
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
carburants.csv
maketable.sh
```
* It is actually helpful **not** to use `bash` as a shell...
* Customizing your shell is an extremely effective procrastination device.
* You must know what [*shaving a Yak*](https://projects.csail.mit.edu/gsb/old-archive/gsb-archive/gsb2000-02-11.html) means before you walk out of my class.
---
## Seeing the Difference
* the command `git diff` shows you what changed between versions.
* lets see what it shows now:
```bash
$ git diff
```
* It shows nothing, i.e. an _empty_ diff, because _there are no commits yet_ to compare with.
* Ok, let's change that.
---
## Modify-Add-Commit 1
* git reports about _untracked files_. We need to decide *what to track*.
1. Move files to *staging area*:
```bash
git add maketable.sh
git status
```
* Notice that I did *not* want to track the `csv` file.
```bash
On branch main
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: maketable.sh
Untracked files:
(use "git add <file>..." to include in what will be committed)
carburants.csv
```
---
## Modify-Add-Commit 2
* Now, let's *record* what is in the staging area.
```bash
$ git commit -m 'added the maketable script'
[main (root-commit) 9956506] added the maketable script
1 file changed, 2 insertions(+)
create mode 100644 maketable.sh
```
* check status:
```bash
$ git status
On branch main
Untracked files:
(use "git add <file>..." to include in what will be committed)
carburants.csv
nothing added to commit but untracked files present (use "git add" to track)
➜ gasprices git:(main) ✗
```
---
## Modify-Add-Commit 3
* Let's check what's in the log.
```bash
$ git log
commit 9956506dc3159403b87aea3b04654c293e82c680 (HEAD -> main)
Author: Florian Oswald <[email protected]>
Date: Tue Feb 7 10:50:51 2023 +0100
added the maketable script
```
---
## Modify-Add-Commit 4
* Now let's _modify_ the script finally.
```bash
$ nano maketable.sh
# add this line on top
echo hello user, will make a contigency table now.
# save and exit
```
* now - what's the difference in the repo?
---
## Diffing
* there are still the same files here:
```bash
$ ls
carburants.csv maketable.sh
```
* But we can now _compare_ versions:
```bash
$ git diff
diff --git a/maketable.sh b/maketable.sh
index 7e01058..3b7007e 100644
--- a/maketable.sh
+++ b/maketable.sh
@@ -1,2 +1,3 @@
+echo hello user, will make a contigency table now.
cd ~/gasprices # make sure we are in the right place
cut -d ';' -f 5 carburants.csv | tr [:lower:] [:upper:] | sort | uniq -c | sort
```
---
## Commiting Changes Again
* let's first check everything runs
```bash
$ ./maketable.sh
```
* good. commit!
```bash
$ git add maketable.sh
$ git commit -m 'added message to user'
```
---
## Adding a README
* Good. Now let's add a `README` file.
* It's customary to write this in [markdown](https://carpentries-incubator.github.io/markdown-intro/)
```bash
$ nano README.md
```
write this in nano and save when done.
```md
# Gas Prices
This repo contains code to analyse gas prices at French gas stations.
```
* add to staging area, so we can take a snapshot
```bash
$ git add README.md
$ git commit -m 'added readme'
```
---
## What is this Staging Area
* git is like a foto camera.
* before you take a picture of your friends, you need to arrange them somehow, so that all fit, and all 😁.
* You put them _on stage_. Same for files in your repo.
![figure from [software carpentry]()](/images/git-staging-area.svg)
---
## Looking at History
::::: {.callout-note}
# Question
* How can I identify old versions of files?
* How do I review my changes?
* How can I recover old versions of files?
:::::
::::: {.callout-tip}
# Objectives
* Explain what the HEAD of a repository is and how to use it.
* Identify and use Git commit numbers.
* Compare various versions of tracked files.
* Restore old versions of files.
::::::
---
## The most recent version: HEAD
* Let's change the `maketable.sh` script again:
```bash
$ nano maketable.sh
echo program run successfully
# save exit
$ git add maketable.sh
```
* The most recent version of our repo is called `HEAD`.
```bash
$ git diff # compares entire repo to HEAD
$ git diff HEAD maketable.sh
```
---
## Whoops, typo
* Oh no, we wrote _program run successfully_. That should be _ran_ not _run_.
* What now?
. . .
* we have not committed this yet!
* we can just get back the version in HEAD, and edit again:
```bash
$ git restore maketable.sh
$ git checkout maketable.sh # also works
```
* edit the script, add and commit.
---
## How to get a *specific* version
* What if you want something else than `HEAD`?
* like, the first version of `maketable.sh`?
* look at history:
```bash
$ git log --oneline --graph
* a6f023b (HEAD -> main) added readme
* 9956506 added the maketable script
```
* The `9956506` is the unique identifier of that version.
* We can go back to that version:
```bash
$ git checkout 9956506 maketable.sh
```
---
::: {.callout-tip}
# Key Points
* `git diff` displays differences between commits.
* `git checkout` recovers old versions of files.
:::
---
## So, how does this thing work?
![software carpentry image.](/images/git_staging.svg)
---
## Version Control with VScode
* Download [Visual Studio Code](https://code.visualstudio.com/)
* Start
* Open folder `~/gasprices`
* check version control tab on the left.
---
## Version Control with RStudio
* top right click on *new project*
* Select *existing directory*
* Select `~/gasprices`
* checkout out the `git` tab in Rstudio!
---
## Collaborating with Git on GitHub
* Create repo
* copy ssh remote URL
* connect local to remote repo
---
## SSH connections
* Secure Shell Protocol
* Private-Public key pair. It's like a lock, and you have the only key.
* Let's check if you have one already!
```bash
ls -la ~/.ssh
```
if error, create one:
```bash
ssh-keygen -t ed25519 -C "[email protected]"
```
press enter (no passphrase)
check
```bash
ls -la ~/.ssh
```
---
## Communicate with GitHub Remote
* Let's ping the remote server at GitHub now.
```bash
ssh -T [email protected]
```
* right, of course Github doesn't have our public key yet (the _lock_ for our key!)
* copy from your terminal
```bash
cat ~/.ssh/id_ed25519.pub # or your *.pub
```
* Go to github.com, click top right corner, settings, SSH keys.
---
## Adding a Remote to your local Repo
* Now that we can talk to Github.com, let's add the remote to our local repo.
* We `add` a remote by getting the `SSH` url from the repository (green button) online.
```bash
$ git remote add origin [email protected]:YOUR_USER/YOUR_REPO.git
```
* `origin` is the _name_ of the remote server. your choice, but _origin_ is common.
* this should set that remote both for sending and retrieving stuff from the repo. _pull_ and _push_, in git language:
```bash
$ git remote --v
```
---
## Pushing It
* Now we can _push_ our local repository to the remote repo.
* There will be a full copy of what is in `.git` (i.e., the entire history of the repo) on that remote machine.
* You will be able to use it like a central backup location for your work.
```bash
$ git push -u origin main
```
* the `-u` flag sets the _main_ branch as default _upstream_ branch to track.
---
## Branching It
* Next to different _versions_ of a file/directory _over time_, we can have versions evolving in parallel.
* Imagine development history _branching_ off into 2 separate directions at one point.
* They may converge at some point again, but maybe one of them will turn out a failure and we drop it.
* Branches are hugely useful to organize team work.
```bash
$ git checkout -b testing # checkout repo on new branch `testing`
Switched to a new branch 'testing'
```
* Now can develop stuff on the `testing` branch.
* Later on, we can `merge` it back into `main` if we like it.
---
## The Full Picture(s)
![picture from [@MarkLodato](https://marklodato.github.io/visual-git-guide/index-en.html) - click for more!](/images/git-images/full.png)
---
## Pushing Branches to GitHub
* Once you created a local branch you can of course copy (_push_) it to your remote to share with others.
* you would amend the push command:
```bash
# make sure you are on the desired branch
$ git branch
main
* testing
$ git push origin testing
```