Design document and supporting code for lab 5

Supporting code includes SQL tests for scalar subqueries, some correlated and some not. It also includes a buggy materialize-node implementation for those who want to tinker with making IN/NOT IN subqueries more efficient; the bugs will need to be fixed.
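
The idea behind a materialize node (cache a child's results as they are fetched, so that they can be replayed without re-running the child) can be sketched in isolation. This is only a minimal illustration, not the NanoDB class: `LazyMaterializer`, its method names, and the use of a plain `Iterator` in place of a child plan-node are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a materialize node; not part of NanoDB. */
class LazyMaterializer<T> {
    // Assumption: a plain Iterator plays the role of the child plan-node.
    private final Iterator<T> source;

    // Items pulled from the source so far, kept for later replay.
    private final List<T> cache = new ArrayList<>();

    private int current = -1;  // index of the item most recently returned
    private int marked = -1;   // index saved by mark()

    LazyMaterializer(Iterator<T> source) {
        this.source = source;
    }

    /** Returns the next item, or null once the source is exhausted. */
    T next() {
        if (current + 1 < cache.size()) {
            // Replay an item that was already fetched and cached.
            current++;
            return cache.get(current);
        }
        if (source.hasNext()) {
            // Fetch on demand, and remember the item for later replays.
            T item = source.next();
            cache.add(item);
            current++;
            return item;
        }
        return null;
    }

    /** Remembers the current position. */
    void mark() {
        marked = current;
    }

    /** Moves back to the position saved by the last call to mark(). */
    void reset() {
        current = marked;
    }

    /** Number of items fetched from the source so far. */
    int cachedCount() {
        return cache.size();
    }

    public static void main(String[] args) {
        LazyMaterializer<Integer> m =
            new LazyMaterializer<>(List.of(10, 20, 30).iterator());
        m.next();      // fetches 10 from the source
        m.mark();      // remember the position at 10
        m.next();      // fetches 20
        m.next();      // fetches 30
        m.reset();     // go back to the marked position
        System.out.println(m.next());  // replays 20 from the cache
    }
}
```

For an uncorrelated IN/NOT IN subquery, a cache of this kind would let the subquery's rows be scanned repeatedly while the subquery itself runs only once; whether that is a net win depends on how large the result is, since everything is held in memory.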

Commit 391659b2 authored 6 years ago by Donald H. (Donnie) Pinkston, III
271 additions and 3 deletions
--- a/doc/lab5design.txt
+++ b/doc/lab5design.txt
+CS122 Assignment 5 - Advanced Subqueries - Design Document
+==========================================================
+
+A:  Subquery Planning
+---------------------
+
+A1.  Which planner did your team add subquery planning to?  How easy or
+     difficult was it to incorporate subquery-planning into the planner?
+
+A2.  Briefly outline the process by which you perform subquery planning in
+     your query planner.  Also explain briefly how your subquery-planner works.
+
+A3.  Briefly describe how you detect whether there are subqueries in the
+     GROUP BY and ORDER BY clauses.  How do you make sure this doesn't
+     interfere with subqueries in other clauses?
+
+B:  Correlated Evaluation
+-------------------------
+
+B1.  How easy or difficult was it to incorporate support for correlated
+     evaluation into your query planner?  Did it affect the sequence of
+     steps in any substantial ways?
+
+D:  Extra Credit [OPTIONAL]
+---------------------------
+
+If you implemented any extra-credit tasks for this assignment, describe
+them here.  The description should be like this, with stuff in "<>" replaced.
+(The value i starts at 1 and increments...)
+
+D<i>:  <one-line description>
+
+     <brief summary of what you did, including the specific classes that
+     we should look at for your implementation>
+
+     <brief summary of test-cases that demonstrate/exercise your extra work>
+
+E:  Feedback [OPTIONAL]
+-----------------------
+
+WE NEED YOUR FEEDBACK!  Thoughtful and constructive input will help us to
+improve future versions of the course.  These questions are OPTIONAL, and
+your answers will not affect your grade in any way (including if you hate
+everything about the assignment and databases in general, or Donnie and/or
+the TAs in particular).  Feel free to answer as many or as few of them as
+you wish.
+
+E1.  What parts of the assignment were most time-consuming?  Why?
+
+E2.  Did you find any parts of the assignment particularly instructive?
+     Correspondingly, did any parts feel like unnecessary busy-work?
+
+E3.  Did you particularly enjoy any parts of the assignment?  Were there
+     any parts that you particularly disliked?
+
+E4.  Were there any critical details that you wish had been provided with the
+     assignment, that we should consider including in subsequent versions of
+     the assignment?
+
+E5.  Do you have any other suggestions for how future versions of the
+     assignment can be improved?
+
--- a/doc/lab5info.txt
+++ b/doc/lab5info.txt
+CS122 Assignment 5 - Advanced Subqueries
+========================================
+
+Please completely fill out this document so that we know who participated on
+the assignment, any late extensions received, and how much time the assignment
+took for your team.  Thank you!
+
+L1.  List your team name and the people who worked on this assignment.
+
+     <team name>
+
+     <name>
+     <name>
+     ...
+
+L2.  Specify the tag and commit-hash of the Git commit you are submitting for
+     your assignment.  (You can list the hashes of all tags with the command
+     "git show-ref --tags".)
+
+     Tag:  <tag>
+     Commit hash:  <hash>
+
+L3.  Specify how many late tokens you are applying to this assignment, if any.
+     Similarly, if your team received an extension from Donnie then please
+     indicate how many days extension you received.  You may leave this blank
+     if it is not relevant to this submission.
+
+     <tokens / extension>
+
+L4.  For each teammate, briefly describe which parts of the assignment they
+     focused on, along with the total hours they spent on the assignment.
+
+
--- a/src/main/java/edu/caltech/nanodb/plannodes/MaterializeNode.java
+++ b/src/main/java/edu/caltech/nanodb/plannodes/MaterializeNode.java
+package edu.caltech.nanodb.plannodes;
+
+
+import java.util.ArrayList;
+import java.util.List;
+
+import edu.caltech.nanodb.expressions.OrderByExpression;
+import edu.caltech.nanodb.expressions.TupleLiteral;
+import edu.caltech.nanodb.relations.Tuple;
+
+
+/**
+ * <p>
+ * This plan-node materializes the results of a child plan-node in memory.
+ * The tuples of the child plan-node are fetched on-demand (not all at once
+ * at the start of plan-node execution), and are cached within this plan-node.
+ * If the child plan produces disk-backed tuples, this plan-node caches a copy
+ * of the tuple so that the disk-backed tuple can be unpinned.
+ * </p>
+ * <p>
+ * Note that a more typical implementation of a "materialize" node would have
+ * a maximum in-memory footprint, and would store tuples to a temporary disk
+ * file when this maximum memory size is reached.  However, the memory
+ * allocation for in-memory tuples is not managed by the Buffer Manager, so it
+ * really isn't possible for the materialize node to provide this kind of
+ * functionality.
+ * </p>
+ */
+public class MaterializeNode extends PlanNode {
+
+    /**
+     * This collection holds the tuples that have been generated by the child
+     * plan so far.
+     */
+    private ArrayList<Tuple> tuples;
+
+
+    /**
+     * This field stores the index of the "current tuple" as the materialized
+     * results are traversed.
+     */
+    private int currentTupleIndex;
+
+
+    /** This field stores the index of the tuple when a position is marked. */
+    private int markedTupleIndex;
+
+
+
+    /**
+     * When this flag is true, the child plan-node has finished generating
+     * all of its results.
+     */
+    private boolean childNodeFinished;
+
+
+
+    public MaterializeNode(PlanNode leftChild) {
+        super(leftChild);
+    }
+
+
+    @Override
+    public List<OrderByExpression> resultsOrderedBy() {
+        return leftChild.resultsOrderedBy();
+    }
+
+
+    @Override
+    public boolean supportsMarking() {
+        return true;
+    }
+
+
+    @Override
+    public void prepare() {
+        leftChild.prepare();
+
+        schema = leftChild.getSchema();
+        stats = leftChild.getStats();
+        cost = leftChild.getCost();
+
+        tuples = new ArrayList<>();
+        currentTupleIndex = -1;
+        markedTupleIndex = -1;
+        childNodeFinished = false;
+    }
+
+
+    @Override
+    public Tuple getNextTuple() {
+        Tuple tup = null;
+
+        assert currentTupleIndex >= -1;
+        assert currentTupleIndex <= tuples.size();
+
+        if (currentTupleIndex + 1 < tuples.size()) {
+            // Moving forward the "current tuple" index will stay within the
+            // tuples we have so far.
+            currentTupleIndex++;
+            tup = tuples.get(currentTupleIndex);
+        }
+        else {
+            // Moving forward the "current tuple" index will move beyond the
+            // tuples we have so far.  Need to see if we have consumed all
+            // child tuples yet.
+
+            if (!childNodeFinished) {
+                // Try to fetch another tuple from the child plan-node, if
+                // there is one.
+                tup = leftChild.getNextTuple();
+                if (tup != null) {
+                    if (tup.isDiskBacked()) {
+                        // Make an in-memory version of the tuple we can cache.
+                        Tuple copy = new TupleLiteral(tup);
+                        tup.unpin();
+                        tup = copy;
+                    }
+
+                    tuples.add(tup);
+                    currentTupleIndex++;
+                }
+                else {
+                    // The child has no more tuples.
+                    childNodeFinished = true;
+                }
+
+                assert currentTupleIndex <= tuples.size();
+            }
+        }
+
+        return tup;
+    }
+
+
+    @Override
+    public void markCurrentPosition() {
+        markedTupleIndex = currentTupleIndex;
+    }
+
+    @Override
+    public void resetToLastMark() {
+        currentTupleIndex = markedTupleIndex;
+    }
+
+    @Override
+    public void cleanUp() {
+        leftChild.cleanUp();
+
+        tuples = null;
+        currentTupleIndex = -1;
+    }
+
+    @Override
+    public String toString() {
+        return "Materialize";
+    }
+
+    @Override
+    public boolean equals(Object obj) {
+        if (obj instanceof MaterializeNode) {
+            MaterializeNode other = (MaterializeNode) obj;
+            return leftChild.equals(other.leftChild);
+        }
+
+        return false;
+    }
+
+    @Override
+    public int hashCode() {
+        return leftChild.hashCode();
+    }
+}
--- a/src/test/java/edu/caltech/test/nanodb/sql/TestExists.java
+++ b/src/test/java/edu/caltech/test/nanodb/sql/TestExists.java
@@ -11,7 +11,7 @@ import edu.caltech.nanodb.server.CommandResult;
 * This class exercises the database with some simple EXISTS operations, to
 * verify that the most basic functionality works.
 **/
-@Test(groups={"sql"})
+@Test(groups={"sql", "hw5"})
 public class TestExists extends SqlTestCase {
    public TestExists() {
        super("setup_testExists");

--- a/src/test/java/edu/caltech/test/nanodb/sql/TestInPredicates.java
+++ b/src/test/java/edu/caltech/test/nanodb/sql/TestInPredicates.java
@@ -67,7 +67,7 @@ public class TestInPredicates extends SqlTestCase {
     *
     * @throws Exception if any query parsing or execution issues occur.
     */
-    @Test(groups={"sql"})
+    @Test(groups={"sql", "hw5"})
    public void testInSubquery() throws Throwable {
        CommandResult result;
        TupleLiteral[] expected1 = {

--- a/src/test/java/edu/caltech/test/nanodb/sql/TestScalarSubquery.java
+++ b/src/test/java/edu/caltech/test/nanodb/sql/TestScalarSubquery.java
@@ -13,7 +13,7 @@ import edu.caltech.nanodb.server.CommandResult;
 * This class exercises the database with some simple scalar subqueries, to
 * verify that the most basic functionality works.
 **/
-@Test(groups={"sql"})
+@Test(groups={"sql", "hw5"})
 public class TestScalarSubquery extends SqlTestCase {
    public TestScalarSubquery() {
        super("setup_testExists");