diff --git a/specs/make/make.md b/specs/make/make.md index 9b010a31b365aea02b5b8096b1bdd0b234686da5..e160f5dea59db74cc3c38f875831aae1adb6defd 100644 --- a/specs/make/make.md +++ b/specs/make/make.md @@ -6,41 +6,47 @@ You will be writing a Node.js program, so I recommend reading about the basics o - [An example Node.js program](../../notes/js/js.md#nodejs-example) - [Node.js documentation](../../notes/js/js.md#nodejs-documentation) -You will also likely find the builtin JS type [`Map`](../../notes/js/js.md#map) very useful for this project. +This project also uses the builtin JS type [`Map`](../../notes/js/js.md#map) heavily, so make sure you understand how to work with it. # `make` ## Goals - See how `Promise`s can be used to compose simple asynchronous tasks into complex asynchronous tasks -- Learn how to execute other programs (asynchronously) in Node.js +- Learn how to run other programs (asynchronously) from a Node.js program ## The GNU `make` utility If you've ever worked with large C or C++ projects, you have probably used `make` or a similar build program. -`make` solves a couple of problems that would cause large projects to be hard to work with: +`make` solves a couple of problems with compiling large projects: - Decomposing the build. - Rather than provide the full list of commands needed to build the project, `make` lets us specify how *each file* can be built from other files. - Running `make some-file` automatically runs the necessary commands to create `some-file`, including any files that `some-file` depends on. + Rather than provide a full sequence of commands to build the project, `make` lets us specify how *each file can be built from other files.* + Running `make some-file` automatically runs the necessary commands to create `some-file`, including building any files that `some-file` depends on. - Performing incremental builds. In a large project, it can be very expensive to rebuild the entire project when a few files change. 
`make` will only rebuild the files affected by the change, and continue to use existing versions of the other files. ## How (synchronous) `make` works -Consider a sample C project with the following structure: -- There are two modules: +We will use a C project as an example, since this is what `make` is often used for. +Here's a quick refresher of how C code is compiled: +- Code is written in `.c` files +- `.c` files can declare some of their functions in a corresponding `.h` ("header") file. + Other `.c` files can use this `.h` file to call the declared functions. +- Each `.c` file is compiled individually into a `.o` ("object") file +- Finally, the `.o` files are linked together into a single executable file + +Consider a sample C project with two `.c` files: - `util.c` implements various useful functions declared in `util.h` - `main.c` implements some program that uses the functions declared in `util.h` -- We can compile each C file into machine code separately. - `util.c` is compiled into `util.o` and `main.c` is compiled into `main.o`. -- To produce the final executable `run`, we need to link `util.o` and `main.o` together. -Note that both `util.c` and `main.c` include `util.h`, so changes to `util.h` should cause both `util.o` and `main.o` to be rebuilt. +Since each `.c` file is compiled into a corresponding `.o` file, `util.o` depends on `util.c` and `main.o` depends on `main.c`. +Note that both `util.c` and `main.c` include `util.h`, so both `util.o` and `main.o` depend on `util.h`. 
+`program`, the final We can represent these relationships with a "dependency graph", where each file points to the files it depends on: ``` - run + program / \ / \ / \ @@ -50,11 +56,15 @@ We can represent these relationships with a "dependency graph", where each file / \ / \ main.c util.h util.c ``` -The dependency relationships restrict the orders in which we can build the files: we can build `main.o` and `util.o` in either order, but both need to be built *before* building `run` because `run` depends on them. -The pseudo-code looks like this: +You can also read this from the bottom up: if `util.h` changes, `main.o` and `util.o` need to be rebuilt, and therefore `program` needs to be rebuilt too. +However, if `main.c` is modified, only `main.o` and `program` need to be rebuilt; `util.o` is unaffected. + +The dependency relationships restrict the orders in which we can build the files: we can build `main.o` and `util.o` in either order, but both need to be built *before* building `program` because `program` depends on them. +Pseudo-code for synchronous `make` looks like this: ```python +# Builds the given target. Returns when done. def build(target): - # Skip the build if target is already up to date (or was already rebuilt) + # Skip the build if target is up to date (or was already rebuilt) if needs_rebuild(target): # Recursively build target's dependencies one-by-one for dependency in get_dependencies(target): @@ -63,7 +73,7 @@ def build(target): run(get_build_command(target)) ``` -Since we have written `build()` synchronously, only one build command will run at a time! +Since `build()` waits to return until `target` is built, only one build command will run at a time! In our example above, `main.o` and `util.o` don't depend on each other, so the build could run faster if we built them simultaneously. ## Your task @@ -72,26 +82,28 @@ Implement `make` **asynchronously** in Node.js. ### Rules -Your program will be invoked with `node make.js makefile.js target1 ... 
targetN`. -`makefile.js` will be a Node.js file that exports an array of rules that have the following structure: +Your program will be run as `node make.js makefile.js target1 ... targetN`. +`makefile.js` is a Node.js file that exports an array of all available rules to build files. +Each rule has the following structure: ```ts -interface Rule { - // The name of the file being built +{ + // The name of the file that this rule builds target: string - // The file's dependencies - dependencies: string[] - // The command to run to built `target`. - // This can be omitted if no command needs to be run. - command?: string[] + // The file's dependencies. + // These can be existing files or ones built according to other rules. + dependencies: Array<string> + // The command to run to build the file. + // This can be omitted (`undefined`) if no command needs to be run. + command?: Array<string> } ``` -For example, in the sample project above, the rules would look like: +For example, the rules for the sample project above would look like: ```js [ { - target: 'run', + target: 'program', dependencies: ['main.o', 'util.o'], - command: ['clang', 'main.o', 'util.o', '-o', 'run'] + command: ['clang', 'main.o', 'util.o', '-o', 'program'] }, { target: 'main.o', @@ -105,19 +117,21 @@ For example, in the sample project above, the rules would look like: } ] ``` -The program should build all of the requested targets, including all the files they depend on. -If there is no rule to build a target, then it is interpreted as a source file and it should already exist. +`make.js` should build all of the requested targets, including the files they depend on (and their dependencies, etc.). +If there is no rule to build a target, then it is interpreted as a source file, so it should already exist. Additional requirements: - Files should be built at most once. For example, if `A` depends on `B` and `C`, and both `B` and `C` depend on `D`, building `A` should only build `D` once. 
- Files should only be rebuilt if necessary. - Specifically, a file needs to be rebuilt if it doesn't exist, or if any of its dependencies have a later modified time than it does. + Specifically, a file needs to be built if it doesn't exist, or if any of its dependencies was changed since the target was last built. + (You can use the "last modified time" that all files have to tell whether any dependencies are newer than the target.) Note that a file also needs to be rebuilt if an indirect dependency is modified. - For example, if `A` depends on `B` and `B` depends on `C`, then if `C` is updated, both `B` and `A` need to be rebuilt. -- Files should be built as soon as possible (i.e. as soon as all their dependencies are up-to-date). + For example, if `A` depends on `B` and `B` depends on `C`, then if `C` is updated, `B` needs to be rebuilt and then `A` does too. +- Files should be built as soon as possible (i.e. as soon as all their dependencies are up to date). - Like `make`, print out each command before executing it. Also print out anything that the subprocess writes to `stdout` or `stderr`. + This is already handled by `runCommand()` in the starter code. - If any build command fails, everything that depends on it should fail and `make.js` should exit with an error code. You must use `Promise`s to represent the dependencies being built! @@ -140,33 +154,27 @@ During the build, there are three types of relationships between files which `Pr ``` By combining these `Promise` operations, we can implicitly represent the dependency graph; we don't need any additional data structure to store the graph! -Note that if a `Promise` rejects, it skips all the `Promise`s that depend on it and makes them reject too; this is exactly the behavior we want. +Note that if a command fails, its `Promise` rejects, which makes all the `Promise`s that depend on it immediately reject too; this is exactly the behavior we want. 
## Options if you want to go further -- Dependency graphs are technically "directed acyclic graphs". +- Dependency graphs are technically "directed *acyclic* graphs". It is impossible to perform a build if there is a "dependency cycle", e.g. `A` depends on `B`, `B` depends on `C`, and `C` depends on `A`. If the build encounters a dependency cycle, print out a useful error message so the user can fix the set of rules. - Our implementation builds each file as soon as possible. If we are building lots of files that don't depend on each other, this spawns many simultaneous processes. - A system with `n` CPUs can run up to `n` processes at the same time, but if we run more, the system will waste a lot of time switching back and forth between them. + A system with `n` CPUs can run up to `n` processes at the same time, but if we run more, the system wastes a lot of time switching back and forth between them. Limit the number of commands that run simultaneously to the number of CPUs. You can use [`os.cpus()`](https://nodejs.org/api/os.html#os_os_cpus) to find out how many CPUs the computer has. ## Useful functions from the Node.js standard library -To get the rules exported by `makefile.js`, you should use `require()`. -You will have to provide the full path to `makefile.js`, e.g. with `require(path.resolve(makeFile))`. -`path.resolve()` is documented [here](https://nodejs.org/api/path.html#path_path_resolve_paths). -`require()` is blocking, but this is okay because nothing else can be done until `makefile.js` is processed. - -To get metadata (e.g. modified time) about files, you can use [`fsPromises.stat(filename)`](https://nodejs.org/api/fs.html#fs_fspromises_stat_path_options), where `fsPromises = require('fs').promises`. +To get file metadata (e.g. last modified time), you can use [`fsPromises.stat(filename)`](https://nodejs.org/api/fs.html#fs_fspromises_stat_path_options), where `fsPromises = require('fs').promises`. 
The modified time is stored in the `mtimeMs` field of the returned [`fs.Stats`](https://nodejs.org/api/fs.html#fs_class_fs_stats) object. Child processes can be created using [`child_process.execFile()`](https://nodejs.org/api/child_process.html#child_process_child_process_execfile_file_args_options_callback). -For example, to run the command `clang -c main.c`, you would call `child_process.execFile('clang', ['-c', 'main.c'], ...)`. -Note that the `child_process` module doesn't have a built-in `Promise` interface. -You can use [`util.promisify()`](https://nodejs.org/api/util.html#util_util_promisify_original) to get a version of `execFile()` that returns a `Promise`; there is an example of this in the `execFile()` documentation. +We use [`util.promisify()`](https://nodejs.org/api/util.html#util_util_promisify_original) to make `execFile()` return a `Promise` instead of using a callback function. +This, along with printing out the command and its output, is already done by `runCommand()` in the starter code. You can use `process.exit(statusCode)` to exit the process with the given code. If the code is non-zero, it indicates that `make.js` failed.
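
The standard-library calls described above could be combined roughly as follows. This is only a sketch and not the starter code: `modifiedTime()`, `isOutOfDate()`, and `execFileAsync` are hypothetical names chosen for illustration, and treating a missing file as infinitely old is an assumption. It compares a target against its direct dependencies only, so it presumes those dependencies have already been brought up to date.

```javascript
const fsPromises = require('fs').promises;
const { execFile } = require('child_process');
const { promisify } = require('util');

// Promise-returning version of execFile(); resolves to { stdout, stderr }
// and rejects if the command exits with a non-zero status.
const execFileAsync = promisify(execFile);

// Hypothetical helper: resolves to the file's last modified time in ms,
// or -Infinity if the file does not exist (an assumption of this sketch).
async function modifiedTime(filename) {
  try {
    const stats = await fsPromises.stat(filename);
    return stats.mtimeMs;
  } catch (err) {
    if (err.code === 'ENOENT') return -Infinity;
    throw err;
  }
}

// Hypothetical helper: resolves to true if `target` is missing or if any
// of its direct dependencies was modified more recently than `target`.
async function isOutOfDate(target, dependencies) {
  // Stat all the files concurrently rather than one at a time.
  const [targetTime, ...dependencyTimes] = await Promise.all(
    [target, ...dependencies].map((filename) => modifiedTime(filename))
  );
  return (
    targetTime === -Infinity ||
    dependencyTimes.some((time) => time > targetTime)
  );
}
```

For example, `isOutOfDate('main.o', ['main.c', 'util.h'])` would resolve to `true` when `main.o` is missing or older than either input, and `execFileAsync('clang', ['-c', 'main.c'])` returns a `Promise` that rejects if `clang` fails.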