Commit 6094b74d authored by Caleb C. Sander

Update make spec

parent 40f238c6
Showing with 53 additions and 45 deletions
@@ -6,41 +6,47 @@ You will be writing a Node.js program, so I recommend reading about the basics o
- [An example Node.js program](../../notes/js/js.md#nodejs-example)
- [Node.js documentation](../../notes/js/js.md#nodejs-documentation)
You will also likely find the builtin JS type [`Map`](../../notes/js/js.md#map) very useful for this project.
This project also uses the builtin JS type [`Map`](../../notes/js/js.md#map) heavily, so make sure you understand how to work with it.
# `make`
## Goals
- See how `Promise`s can be used to compose simple asynchronous tasks into complex asynchronous tasks
- Learn how to execute other programs (asynchronously) in Node.js
- Learn how to run other programs (asynchronously) from a Node.js program
## The GNU `make` utility
If you've ever worked with large C or C++ projects, you have probably used `make` or a similar build program.
`make` solves a couple of problems that would cause large projects to be hard to work with:
`make` solves a couple of problems with compiling large projects:
- Decomposing the build.
Rather than provide the full list of commands needed to build the project, `make` lets us specify how *each file* can be built from other files.
Running `make some-file` automatically runs the necessary commands to create `some-file`, including any files that `some-file` depends on.
Rather than provide a full sequence of commands to build the project, `make` lets us specify how *each file can be built from other files.*
Running `make some-file` automatically runs the necessary commands to create `some-file`, including building any files that `some-file` depends on.
- Performing incremental builds.
In a large project, it can be very expensive to rebuild the entire project when a few files change.
`make` will only rebuild the files affected by the change, and continue to use existing versions of the other files.
## How (synchronous) `make` works
Consider a sample C project with the following structure:
- There are two modules:
We will use a C project as an example, since this is what `make` is often used for.
Here's a quick refresher on how C code is compiled:
- Code is written in `.c` files
- `.c` files can declare some of their functions in a corresponding `.h` ("header") file.
Other `.c` files can use this `.h` file to call the declared functions.
- Each `.c` file is compiled individually into a `.o` ("object") file
- Finally, the `.o` files are linked together into a single executable file
Consider a sample C project with two `.c` files:
- `util.c` implements various useful functions declared in `util.h`
- `main.c` implements some program that uses the functions declared in `util.h`
- We can compile each C file into machine code separately.
`util.c` is compiled into `util.o` and `main.c` is compiled into `main.o`.
- To produce the final executable `run`, we need to link `util.o` and `main.o` together.
Note that both `util.c` and `main.c` include `util.h`, so changes to `util.h` should cause both `util.o` and `main.o` to be rebuilt.
Since each `.c` file is compiled into a corresponding `.o` file, `util.o` depends on `util.c` and `main.o` depends on `main.c`.
Note that both `util.c` and `main.c` include `util.h`, so both `util.o` and `main.o` depend on `util.h`.
`program`, the final executable, depends on both `main.o` and `util.o`.
We can represent these relationships with a "dependency graph", where each file points to the files it depends on:
```
run
program
/ \
/ \
/ \
@@ -50,11 +56,15 @@ We can represent these relationships with a "dependency graph", where each file
/ \ / \
main.c util.h util.c
```
The dependency relationships restrict the orders in which we can build the files: we can build `main.o` and `util.o` in either order, but both need to be built *before* building `run` because `run` depends on them.
The pseudo-code looks like this:
You can also read this from the bottom up: if `util.h` changes, `main.o` and `util.o` need to be rebuilt, and therefore `program` needs to be rebuilt too.
However, if `main.c` is modified, only `main.o` and `program` need to be rebuilt; `util.o` is unaffected.
The dependency relationships restrict the orders in which we can build the files: we can build `main.o` and `util.o` in either order, but both need to be built *before* building `program` because `program` depends on them.
Pseudo-code for synchronous `make` looks like this:
```python
# Builds the given target. Returns when done.
def build(target):
# Skip the build if target is already up to date (or was already rebuilt)
# Skip the build if target is up to date (or was already rebuilt)
if needs_rebuild(target):
# Recursively build target's dependencies one-by-one
for dependency in get_dependencies(target):
@@ -63,7 +73,7 @@ def build(target):
run(get_build_command(target))
```
Since we have written `build()` synchronously, only one build command will run at a time!
Since `build()` waits to return until `target` is built, only one build command will run at a time!
In our example above, `main.o` and `util.o` don't depend on each other, so the build could run faster if we built them simultaneously.
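To make the difference concrete, here is a minimal sketch that compares the two orderings. `fakeBuild()` is a hypothetical stand-in for a real build command (it just waits on a timer), and the durations are made up for illustration.

```javascript
// Hypothetical stand-in for a build command: resolves after `ms` milliseconds.
// Every start and finish is recorded in `log` so we can inspect the order.
function fakeBuild(name, ms, log) {
  log.push(`start ${name}`)
  return new Promise(resolve => {
    setTimeout(() => {
      log.push(`finish ${name}`)
      resolve(name)
    }, ms)
  })
}

// Synchronous-style build: util.o doesn't start until main.o has finished
async function buildSequentially(log) {
  await fakeBuild('main.o', 20, log)
  await fakeBuild('util.o', 10, log)
}

// Concurrent build: both are started before either one finishes
function buildConcurrently(log) {
  return Promise.all([
    fakeBuild('main.o', 20, log),
    fakeBuild('util.o', 10, log)
  ])
}
```

In the sequential version the log alternates start/finish for each file, while in the concurrent version both `start` entries appear before any `finish` entry.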
## Your task
@@ -72,26 +82,28 @@ Implement `make` **asynchronously** in Node.js.
### Rules
Your program will be invoked with `node make.js makefile.js target1 ... targetN`.
`makefile.js` will be a Node.js file that exports an array of rules that have the following structure:
Your program will be run as `node make.js makefile.js target1 ... targetN`.
`makefile.js` is a Node.js file that exports an array of all available rules to build files.
Each rule has the following structure:
```ts
interface Rule {
// The name of the file being built
{
// The name of the file that this rule builds
target: string
// The file's dependencies
dependencies: string[]
// The command to run to build `target`.
// This can be omitted if no command needs to be run.
command?: string[]
// The file's dependencies.
// These can be existing files or ones built according to other rules.
dependencies: Array<string>
// The command to run to build the file.
// This can be omitted (`undefined`) if no command needs to be run.
command?: Array<string>
}
```
For example, in the sample project above, the rules would look like:
For example, the rules for the sample project above would look like:
```js
[
{
target: 'run',
target: 'program',
dependencies: ['main.o', 'util.o'],
command: ['clang', 'main.o', 'util.o', '-o', 'run']
command: ['clang', 'main.o', 'util.o', '-o', 'program']
},
{
target: 'main.o',
@@ -105,19 +117,21 @@ For example, in the sample project above, the rules would look like:
}
]
```
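Since rules are looked up by target name, one option is to index the array in a `Map`. The sketch below is only an illustration: `indexRules()` and `isSourceFile()` are hypothetical helper names, and the rules in the test data are made up.

```javascript
// Builds a Map from each target name to the rule that builds it
function indexRules(rules) {
  const rulesByTarget = new Map()
  for (const rule of rules) {
    rulesByTarget.set(rule.target, rule)
  }
  return rulesByTarget
}

// A name with no rule is treated as a source file that should already exist
function isSourceFile(rulesByTarget, name) {
  return !rulesByTarget.has(name)
}
```

For example, with rules for `A`, `B` and `C` where `D` has no rule, `indexRules(rules).get('A')` returns the rule for `A` and `isSourceFile(rulesByTarget, 'D')` is `true`.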
The program should build all of the requested targets, including all the files they depend on.
If there is no rule to build a target, then it is interpreted as a source file and it should already exist.
`make.js` should build all of the requested targets, including the files they depend on (and their dependencies, etc.).
If there is no rule to build a target, then it is interpreted as a source file, so it should already exist.
Additional requirements:
- Files should be built at most once.
For example, if `A` depends on `B` and `C`, and both `B` and `C` depend on `D`, building `A` should only build `D` once.
- Files should only be rebuilt if necessary.
Specifically, a file needs to be rebuilt if it doesn't exist, or if any of its dependencies have a later modified time than it does.
Specifically, a file needs to be built if it doesn't exist, or if any of its dependencies was changed since the target was last built.
(You can use the "last modified time" that all files have to tell whether any dependencies are newer than the target.)
Note that a file also needs to be rebuilt if an indirect dependency is modified.
For example, if `A` depends on `B` and `B` depends on `C`, then if `C` is updated, both `B` and `A` need to be rebuilt.
- Files should be built as soon as possible (i.e. as soon as all their dependencies are up-to-date).
For example, if `A` depends on `B` and `B` depends on `C`, then if `C` is updated, `B` needs to be rebuilt and then `A` does too.
- Files should be built as soon as possible (i.e. as soon as all their dependencies are up to date).
- Like `make`, print out each command before executing it.
Also print out anything that the subprocess writes to `stdout` or `stderr`.
  This is already handled by `runCommand()` in the starter code.
- If any build command fails, everything that depends on it should fail and `make.js` should exit with an error code.
You must use `Promise`s to represent the dependencies being built!
@@ -140,33 +154,27 @@ During the build, there are three types of relationships between files which `Pr
```
By combining these `Promise` operations, we can implicitly represent the dependency graph; we don't need any additional data structure to store the graph!
Note that if a `Promise` rejects, it skips all the `Promise`s that depend on it and makes them reject too; this is exactly the behavior we want.
Note that if a command fails, its `Promise` rejects, which makes all the `Promise`s that depend on it immediately reject too; this is exactly the behavior we want.
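Here is a small sketch of that rejection behavior. `fakeCommand()` is a hypothetical stand-in for running a real build command, and the `linked` array only exists so we can observe whether the final step ran.

```javascript
// Hypothetical stand-in for a build command that either succeeds or fails
function fakeCommand(name, shouldFail) {
  return shouldFail
    ? Promise.reject(new Error(`${name} failed`))
    : Promise.resolve(name)
}

// "Builds" program from main.o and util.o.
// `linked` records whether the final link step actually ran.
function buildProgram(mainFails, linked) {
  const mainO = fakeCommand('main.o', mainFails)
  const utilO = fakeCommand('util.o', false)
  // The callback only runs if *both* dependencies were built successfully.
  // If either rejects, the returned Promise rejects with the same error.
  return Promise.all([mainO, utilO]).then(() => {
    linked.push('program')
    return 'program'
  })
}
```

If `main.o` fails, the `Promise` returned by `buildProgram()` rejects with the `main.o failed` error and the link step is never run.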
## Options if you want to go further
- Dependency graphs are technically "directed acyclic graphs".
- Dependency graphs are technically "directed *acyclic* graphs".
It is impossible to perform a build if there is a "dependency cycle", e.g. `A` depends on `B`, `B` depends on `C`, and `C` depends on `A`.
If the build encounters a dependency cycle, print out a useful error message so the user can fix the set of rules.
- Our implementation builds each file as soon as possible.
If we are building lots of files that don't depend on each other, this spawns many simultaneous processes.
A system with `n` CPUs can run up to `n` processes at the same time, but if we run more, the system will waste a lot of time switching back and forth between them.
A system with `n` CPUs can run up to `n` processes at the same time, but if we run more, the system wastes a lot of time switching back and forth between them.
Limit the number of commands that run simultaneously to the number of CPUs.
You can use [`os.cpus()`](https://nodejs.org/api/os.html#os_os_cpus) to find out how many CPUs the computer has.
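For the dependency cycle option, one possible approach (not the only one) is a depth-first search that tracks which targets are currently being visited. The sketch below assumes the rules have been stored in a `Map` from target name to rule; `findCycle()` is a hypothetical helper name.

```javascript
// Returns an array like ['A', 'B', 'C', 'A'] describing a dependency cycle,
// or null if the rules contain no cycle.
// `rulesByTarget` is assumed to be a Map from target name to its rule.
function findCycle(rulesByTarget) {
  const state = new Map() // target -> 'visiting' | 'done'
  const path = [] // the chain of targets currently being visited

  function visit(target) {
    if (state.get(target) === 'done') return null
    if (state.get(target) === 'visiting') {
      // We reached a target that is already on the current path: a cycle
      return path.slice(path.indexOf(target)).concat(target)
    }
    state.set(target, 'visiting')
    path.push(target)
    const rule = rulesByTarget.get(target)
    // Targets without a rule are source files, which have no dependencies
    for (const dependency of rule ? rule.dependencies : []) {
      const cycle = visit(dependency)
      if (cycle) return cycle
    }
    path.pop()
    state.set(target, 'done')
    return null
  }

  for (const target of rulesByTarget.keys()) {
    const cycle = visit(target)
    if (cycle) return cycle
  }
  return null
}
```

The returned array can be joined with `' -> '` to produce an error message that shows the user exactly which rules form the cycle.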
## Useful functions from the Node.js standard library
To get the rules exported by `makefile.js`, you should use `require()`.
You will have to provide the full path to `makefile.js`, e.g. with `require(path.resolve(makeFile))`.
`path.resolve()` is documented [here](https://nodejs.org/api/path.html#path_path_resolve_paths).
`require()` is blocking, but this is okay because nothing else can be done until `makefile.js` is processed.
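As a rough sketch, the command-line arguments could be split up like this. `parseArguments()` is a hypothetical helper name, and the usage message is only an illustration.

```javascript
const path = require('path')

// `argv` has the same layout as process.argv:
// [node executable, make.js, makefile.js, target1, ..., targetN]
function parseArguments(argv) {
  const [makeFile, ...targets] = argv.slice(2)
  if (makeFile === undefined) {
    throw new Error('usage: node make.js makefile.js target1 ... targetN')
  }
  // require() needs the full path to the makefile
  return { makeFilePath: path.resolve(makeFile), targets }
}
```

The rules can then be loaded with `require(makeFilePath)`, where `makeFilePath` comes from `parseArguments(process.argv)`.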
To get metadata (e.g. modified time) about files, you can use [`fsPromises.stat(filename)`](https://nodejs.org/api/fs.html#fs_fspromises_stat_path_options), where `fsPromises = require('fs').promises`.
To get file metadata (e.g. last modified time), you can use [`fsPromises.stat(filename)`](https://nodejs.org/api/fs.html#fs_fspromises_stat_path_options), where `fsPromises = require('fs').promises`.
The modified time is stored in the `mtimeMs` field of the returned [`fs.Stats`](https://nodejs.org/api/fs.html#fs_class_fs_stats) object.
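A minimal sketch of the timestamp comparison might look like the following. The helper names `getModifiedTime()` and `isOutOfDate()` are hypothetical, and this only compares a target against the timestamps you give it; deciding when to check (e.g. after the dependencies have been rebuilt) is up to you.

```javascript
const fsPromises = require('fs').promises

// Resolves to the file's last modified time in milliseconds,
// or undefined if the file doesn't exist
async function getModifiedTime(filename) {
  try {
    const stats = await fsPromises.stat(filename)
    return stats.mtimeMs
  } catch (error) {
    if (error.code === 'ENOENT') return undefined
    throw error
  }
}

// A target is out of date if it doesn't exist
// or if any dependency was modified after it
function isOutOfDate(targetTime, dependencyTimes) {
  if (targetTime === undefined) return true
  return dependencyTimes.some(time => time > targetTime)
}
```

Since `getModifiedTime()` returns a `Promise`, the modified times of several files can be requested at once with `Promise.all()`.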
Child processes can be created using [`child_process.execFile()`](https://nodejs.org/api/child_process.html#child_process_child_process_execfile_file_args_options_callback).
For example, to run the command `clang -c main.c`, you would call `child_process.execFile('clang', ['-c', 'main.c'], ...)`.
Note that the `child_process` module doesn't have a built-in `Promise` interface.
You can use [`util.promisify()`](https://nodejs.org/api/util.html#util_util_promisify_original) to get a version of `execFile()` that returns a `Promise`; there is an example of this in the `execFile()` documentation.
We use [`util.promisify()`](https://nodejs.org/api/util.html#util_util_promisify_original) to make `execFile()` return a `Promise` instead of using a callback function.
This, along with printing out the command and its output, is already done by `runCommand()` in the starter code.
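The starter code's `runCommand()` isn't shown here, so the following is only a sketch of what such a helper could look like, under the assumption that it prints the command and then forwards the subprocess's output. `runCommandSketch()` is a hypothetical name.

```javascript
const childProcess = require('child_process')
const util = require('util')

// A version of execFile() that returns a Promise instead of taking a callback
const execFile = util.promisify(childProcess.execFile)

// Runs a command given as an array, e.g. ['clang', '-c', 'main.c'].
// The returned Promise rejects if the command exits with a non-zero code.
async function runCommandSketch([file, ...args]) {
  // Like `make`, print the command before executing it
  console.log([file, ...args].join(' '))
  const { stdout, stderr } = await execFile(file, args)
  process.stdout.write(stdout)
  process.stderr.write(stderr)
  return { stdout, stderr }
}
```

Note that this sketch only prints the output once the command has finished, and prints nothing if the command fails; the actual starter code may behave differently.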
You can use `process.exit(statusCode)` to exit the process with the given code.
If the code is non-zero, it indicates that `make.js` failed.