Commit 9050ac92 authored by Caleb C. Sander

Clarifications in MiniVC spec

parent 83d0ac85
Showing with 14 additions and 12 deletions
......@@ -8,6 +8,8 @@ Like most real asynchronous projects, this one mostly consists of tasks that nee
`async`-`await` can greatly simplify this sort of asynchronous code.
In `async` functions, `try`-`catch` statements are used to handle errors in `await`ed `Promise`s, so you may want to read about [error handling in JavaScript](../../notes/js/js.md#error-handling).
Since you'll be writing an HTTP server, remember that the [HTTP notes](../../notes/http/http.md) have information on how HTTP servers work and examples implementing them in Node.js.
# Mini Version Control
## Goals
......@@ -22,12 +24,11 @@ You've been using Git for all the projects in this class, and possibly for other
Git is one of many tools called "version control software", which track the history of changes to a set of files over time.
VCS tools differ in their implementations, but they share several core concepts.
In this project, you will implement a version control application with the core features of Git or CVS.
(Git actually stores its history quite differently than MiniVC since it's optimized for different use cases.
If you take CS 24 this fall, you will learn how Git really works by implementing parts of it!)
(Git actually stores its history quite differently than MiniVC since it's optimized for different operations.)
The application has two parts which communicate over HTTP:
- A command line program allows the user to manage the repositories that have been downloaded locally and to communicate with the server
- A server stores the definitive copy of all repositories and mediates the changes uploaded by different clients
- A server stores the authoritative copy of all repositories and mediates the changes uploaded by different clients
You are provided an implementation of the client, so you only need to write the server.
......@@ -41,7 +42,7 @@ This is also convenient because it makes it easy to identify what changes were m
### Diffs
A "diff" between two versions of a file is the set of changes needed to turn the first version into the second.
Again, there are different granularities of changes that you can track; each has tradeoffs.
Again, there are different granularities of changes that you can track (e.g. character-by-character or line-by-line); each has tradeoffs.
Git considers a change to be inserting or deleting a line of text.
This works pretty well, so we'll use the same approach.
......@@ -117,7 +118,7 @@ We will represent this by omitting the `parentID` field for the initial commit (
### Merges
Version control histories are mostly linear, i.e. there is one chain of commits from the initial commit to the current commit.
In this case, the commits' `parentID`s form an implicit linked list of commits.
In this case, the commits' `parentID`s organize them into a linked list.
However, a good version control system allows multiple users to work on the codebase at the same time.
It is possible that two users are working off the same commit and both of them try to push new commits before they have seen the other's changes.
In this case, whichever commit is pushed first will be added to the history, but the second commit will need to be "merged" with the first commit.
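
For the linear case described above, where each commit's `parentID` links it to the previous commit, a minimal sketch of walking the history might look like the following. The `commits` map and the `historyFrom()` helper are hypothetical names used only for illustration; the sketch assumes the initial commit is the one with no `parentID`.

```javascript
// Hypothetical in-memory store mapping commit ID -> commit object.
// Only the `parentID` field comes from the spec; everything else is illustrative.
function historyFrom(commits, headID) {
  const chain = [];
  let id = headID;
  // Follow parentID links until reaching the initial commit, which has no parentID
  while (id !== undefined) {
    const commit = commits.get(id);
    if (!commit) throw new Error(`Unknown commit: ${id}`);
    chain.push(id);
    id = commit.parentID;
  }
  // The walk goes newest to oldest, so reverse to get oldest first
  return chain.reverse();
}

const commits = new Map([
  ['a', {diffs: {}}],
  ['b', {parentID: 'a', diffs: {}}],
  ['c', {parentID: 'b', diffs: {}}]
]);
console.log(historyFrom(commits, 'c')); // [ 'a', 'b', 'c' ]
```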
......@@ -136,20 +137,21 @@ Changing B to have E as its parent instead of A is called "rebasing" in Git.
### Repositories
A repository (what GitLab calls a "project") is a set of files under version control.
The history of the repository is represented by storing all its commits by ID, plus the commit ID of the current commit.
A server can store multiple repositories, each with a different name.
The history of a repository is represented by storing all its commits by ID, plus the commit ID of the current commit.
This pointer to the current commit is called `HEAD` in Git.
Note that `HEAD` won't point to a commit until the first commit is pushed to the repository.
You can choose how to store this information.
I recommend the way the client stores it: a `commits` directory with a file for each commit (where the filename is the commit ID), and a `head` file that stores the ID of the current commit.
Your server should save all files inside the current working directory (`.`): for example, if the server is started in the directory `server-repos`, all files and directories created by the server should be inside `server-repos`.
I recommend the way the client stores it: a file storing each commit in JSON (where the filename is the commit ID), and a `head` file that stores the ID of the current commit.
Your server should save all files under the current working directory (`.`): for example, if the server is started in the directory `server-repos`, all files belonging to the repository named `repo` should be inside `server-repos/repo`.
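
One possible way to lay out this storage is sketched below, assuming each repository directory holds a `commits` subdirectory (one JSON file per commit, named by commit ID) and a `head` file. The function names and the `root` parameter are hypothetical, and the sketch does not validate repository names or commit IDs.

```javascript
const fs = require('fs').promises;
const path = require('path');

// Assumed layout: <root>/<repo>/commits/<commitID> and <root>/<repo>/head
async function saveCommit(repo, id, commit, root = '.') {
  const dir = path.join(root, repo, 'commits');
  await fs.mkdir(dir, {recursive: true}); // no error if it already exists
  await fs.writeFile(path.join(dir, id), JSON.stringify(commit));
}

async function loadCommit(repo, id, root = '.') {
  const json = await fs.readFile(path.join(root, repo, 'commits', id), 'utf8');
  return JSON.parse(json);
}

async function setHead(repo, id, root = '.') {
  await fs.mkdir(path.join(root, repo), {recursive: true});
  await fs.writeFile(path.join(root, repo, 'head'), id);
}

// Resolves to undefined if no commit has been pushed yet (no head file)
async function getHead(repo, root = '.') {
  try {
    return await fs.readFile(path.join(root, repo, 'head'), 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') return undefined;
    throw err;
  }
}
```

Leaving `root` at its default of `.` keeps everything under the directory the server was started in.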
## Server API
The command-line client communicates with the server via an HTTP API.
Requests and responses are sent as JSON.
You can parse the request JSON by concatenating all the chunks in the request stream into a string, and then calling `JSON.parse()` on it.
To respond with JSON, you should set the `Content-Type` header to `application/json`.
To respond with JSON, you should set the `Content-Type` header to `application/json` and use `JSON.stringify()` to convert the response object to a JSON string.
(JSON is not the most space-efficient way to store commit diffs, but it is very convenient to read and write from JavaScript.)
All responses have the form `{success: true, ...data}` if successful and `{success: false, message: string}` if an error occurs.
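
As a rough illustration of the JSON handling described above, the sketch below reads a request body chunk by chunk, parses it, and wraps the result in the `{success: ...}` envelope. The helper names `readJSON()`, `sendJSON()`, and `createServer()` and the `handle` callback are hypothetical; routing and HTTP status codes are left out.

```javascript
const http = require('http');

// Collect every chunk of the stream, then parse the concatenated text as JSON
async function readJSON(stream) {
  const chunks = [];
  for await (const chunk of stream) chunks.push(chunk);
  return JSON.parse(Buffer.concat(chunks).toString('utf8'));
}

// Set the Content-Type header and send the object as a JSON string
function sendJSON(res, body) {
  res.setHeader('Content-Type', 'application/json');
  res.end(JSON.stringify(body));
}

// `handle` is a hypothetical async function mapping a request object
// to the data that should be returned on success
function createServer(handle) {
  return http.createServer(async (req, res) => {
    try {
      const data = await handle(await readJSON(req));
      sendJSON(res, {success: true, ...data});
    } catch (err) {
      // Covers both malformed request JSON and errors thrown by the handler
      sendJSON(res, {success: false, message: err.message});
    }
  });
}
```

The `try`-`catch` around the `await`s means a rejected `Promise` anywhere in the handler turns into an error response instead of an unhandled rejection.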
......@@ -207,7 +209,7 @@ type Commit = {
diffs: FileDiffs
}
type FileDiffs = {
// Map each filename to a diff for that file
// An object mapping each filename to a diff for that file
[file: string]: Diff
}
// A diff is an array of "same" and "change" elements
......@@ -222,7 +224,7 @@ type DiffElement
Pushes the given commit to the given repository.
It is an error if the repository does not exist or the parent commit does not exist in that repository.
You can use whatever ID for the new commit you want, but it should not be likely to conflict with an existing ID.
(I used [`require('crypto').randomBytes(16).toString('hex')`](https://nodejs.org/api/crypto.html#crypto_crypto_randombytes_size_callback).)
(I used [`util.promisify(crypto.randomBytes)(16)`](https://nodejs.org/api/crypto.html#crypto_crypto_randombytes_size_callback) to generate 16 random bytes and then converted it to a string using `.toString('hex')`.)
The commit needs to be merged/rebased off the current `HEAD` if `HEAD` is different from `parentID`.
If there is a merge conflict, `mergeFileDiffs()` will throw an error, which you should report to the client.
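
The ID generation mentioned above could be written roughly as follows; `newCommitID()` is a hypothetical helper name, while `util.promisify()` and `crypto.randomBytes()` are the Node.js functions named in the text.

```javascript
const crypto = require('crypto');
const util = require('util');

// Promise-returning version of the callback-based crypto.randomBytes()
const randomBytes = util.promisify(crypto.randomBytes);

// Generate 16 random bytes and encode them as 32 hexadecimal characters
async function newCommitID() {
  const bytes = await randomBytes(16);
  return bytes.toString('hex');
}

newCommitID().then(id => console.log(id));
```

With 128 random bits per ID, two commits are very unlikely to receive the same ID.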
......@@ -250,7 +252,7 @@ The commits sent back should include the newly added commit.
The ones you will likely find useful when implementing the server are:
- `addFileDiffs()`: this function takes an array of diffs to apply in sequence and concatenates them into a single diff.
The result of each diff should be the source of the next.
For example, if we have a diff from `A` to `B`, `B` to `C`, and `C` to `D`, adding them gives the diff from `A` to `D`.
For example, if we have diffs from `A` to `B`, `B` to `C`, and `C` to `D`, adding them gives the diff from `A` to `D`.
- `mergeFileDiffs()`: this function takes two diffs to apply in parallel.
Both diffs should have the same source.
For example, if we have a diff from `A` to `B` and from `A` to `C`, merging them gives a diff from `A` that combines both sets of changes.
......