diff --git a/doc/lab4design.txt b/doc/lab4design.txt new file mode 100644 index 0000000000000000000000000000000000000000..1af54d5a77d74a542e72852bcf875b332397b0e2 --- /dev/null +++ b/doc/lab4design.txt @@ -0,0 +1,106 @@ +CS122 Assignment 4 - Join Optimization - Design Document +======================================================== + +Fill in answers for all questions based on your team's work on Assignment 4. + +A: Refactoring Your Planner +---------------------------- + +A1. What did you name your planner base-class? Briefly describe what + functionality you migrated into this class. + +A2. Was there any functionality that is common to both your SimplePlanner + and your CostBasedJoinPlanner that you left duplicated in the + subclasses? If so, explain your rationale for each part of the + duplicated functionality. + +B: Generating Optimal Joins +---------------------------- + +B1. Briefly describe how you generate an "optimal" access to a base table. + +B2. Briefly describe how you decide when it is acceptable to push + conjuncts down through an outer join. + +B3. The planner in this assignment is still somewhat limited; for example, + we can't push conjuncts down into subqueries. Using the stores schema, + write an example SQL query that includes a subquery, where it would be + beneficial to push a conjunct down into the subquery. (Your planner + obviously doesn't need to perform this optimization.) + +B4. Enumerate the situations where you call prepare() on plans being + generated. Since this operation is somewhat expensive, do you + see any ways to reduce the number of times you call prepare() in + your implementation? + +B5. Given a schema t1(a, c1), t2(a, c2), t3(a, c3), and this SQL query: + SELECT * FROM t1 JOIN t2 ON t1.a = t2.a JOIN t3 ON t2.a = t3.a; + + Assume that we are using the simple cost-based join planning strategy + described in this assignment. What issue arises when the planner + considers joining the pair of tables t1 and t3 first? What can be done + to resolve this problem? (Your planner does not have to implement + this functionality!) + +B6. Is it possible to end up with unused conjuncts (from the WHERE and ON + clauses) after planning joins? Briefly explain why or why not. + +C: Costing SQL Queries +----------------------- + +After you have loaded the stores-28K.sql data and have analyzed all of +the tables in that schema, run the following explain operations and paste +the output from your planner (excluding debug output!). + +If there is anything unusual or non-obvious about your output, feel free +to write explanatory notes after your output. + +C1. EXPLAIN SELECT * FROM cities WHERE population > 5000000; + +C2. EXPLAIN SELECT store_id FROM stores, cities WHERE stores.city_id = cities.city_id AND + cities.population > 1000000; + +C3. EXPLAIN SELECT store_id FROM stores JOIN (SELECT city_id FROM cities WHERE population > 1000000) AS big_cities ON stores.city_id = big_cities.city_id; + +C4. EXPLAIN SELECT store_id, property_costs FROM stores, cities, states WHERE stores.city_id = cities.city_id AND + cities.state_id = states.state_id AND state_name = 'Oregon' AND property_costs > 500000; + +D: Extra Credit [OPTIONAL] +--------------------------- + +If you implemented any extra-credit tasks for this assignment, describe +them here. The description should be like this, with stuff in "<>" replaced. +(The value i starts at 1 and increments...) + +D<i>: <one-line description> + + <brief summary of what you did, including the specific classes that + we should look at for your implementation> + + <brief summary of test-cases that demonstrate/exercise your extra work> + +E: Feedback [OPTIONAL] +----------------------- + +WE NEED YOUR FEEDBACK! Thoughtful and constructive input will help us to +improve future versions of the course. These questions are OPTIONAL, and +your answers will not affect your grade in any way (including if you hate +everything about the assignment and databases in general, or Donnie and/or +the TAs in particular). Feel free to answer as many or as few of them as +you wish. + +E1. What parts of the assignment were most time-consuming? Why? + +E2. Did you find any parts of the assignment particularly instructive? + Correspondingly, did any parts feel like unnecessary busy-work? + +E3. Did you particularly enjoy any parts of the assignment? Were there + any parts that you particularly disliked? + +E4. Were there any critical details that you wish had been provided with the + assignment, that we should consider including in subsequent versions of + the assignment? + +E5. Do you have any other suggestions for how future versions of the + assignment can be improved? + diff --git a/doc/lab4info.txt b/doc/lab4info.txt new file mode 100644 index 0000000000000000000000000000000000000000..4b30df5052b351bdbc1dadaa29e444dbf98f0c89 --- /dev/null +++ b/doc/lab4info.txt @@ -0,0 +1,33 @@ +CS122 Assignment 4 - Join Optimization +====================================== + +Please completely fill out this document so that we know who participated on +the assignment, any late extensions received, and how much time the assignment +took for your team. Thank you! + +L1. List your team name and the people who worked on this assignment. + + <team name> + + <name> + <name> + ... + +L2. Specify the tag and commit-hash of the Git commit you are submitting for + your assignment. (You can list the hashes of all tags with the command + "git show-ref --tags".) + + Tag: <tag> + Commit hash: <hash> + +L3. Specify how many late tokens you are applying to this assignment, if any. + Similarly, if your team received an extension from Donnie then please + indicate how many days extension you received. You may leave this blank + if it is not relevant to this submission. + + <tokens / extension> + +L4. For each teammate, briefly describe what parts of the assignment each + teammate focused on, along with the total hours spent on the assignment. + +