This first assignment will basically test your ability to write an enhanced linked data structure. In this case I want you to write a balanced order statistic tree. You can pick whether your tree is an AVL tree or a red-black tree. As with all the assignments, you can write this in whatever language you want, but I have to be able to compile and run it under Linux on the lab machines. What you turn in will be graded for correctness and speed. It might also be used in a future assignment so keep that in mind and try to build flexible and robust data structures. You will use this data structure to solve the following problem. You might also use some other data structures to go along with it, but you don't have to write those if you can find a good library for them.
We want to track the selling prices on houses in a certain market. Each house has a certain lot ID that is unique. When a house sells, you will be told the lot ID, the selling price, and the date of sale. The sales will be given to you in sorted order by date of sale and you can assume the "current time" is the date of the most recent sale. There are two types of information that we are interested in for this. We care about the statistics (mean, median, 1st quartile, 3rd quartile) for all houses and for houses sold in the last 6 months. You will also need to be able to look up the selling prices of houses given lot IDs and provide the rank of the selling price of that house overall and, if it was sold in the last 180 days, in the last 180 days.
The interface for this program will be all text. You will read from standard input and write to standard output. Each line of input will have a command that you have to process. The meaning of each command and what you must print for it, if anything, is given below.
SELL date lotID price
When you get this command, don't print anything, just update your data in your program to reflect that this has been added. Dates will be given as an integer in days since 1/1/2000.
STAT
When you get this command you will print out the current stats both for all lots and for all lots sold in the last 180 days with the following format.
(MP - 1st MED 3rd)(MP - 1st MED 3rd)
The values in the first parentheses are for all lots and the values in the second parentheses are for lots sold in the last 180 days.
MP is the mean price, 1st is the price of the lot at the top of the first quartile, MED is the median price, 3rd is the price of the lot at the top of the 3rd quartile. If there is an ambiguity of which house is median, take the lower, don't average two values. This will happen naturally if you do integer arithmetic.
LAST lotID
When you get this command simply print out the date of sale and price of the given lotID.
RANK lotID
When you get this command simply print the rank of the last sale for all sales and if it was sold in the last 180 days also print out the rank among lots sold during that time. The ranks should be printed in the format x/y where x is the position in sorted order and y is the total number in the set.
PRANK price
When you get this command print the rank that price would be. If it ties other selling prices assume it is the lowest (use a lotID of " "). Print the rank the same way as above.
If two properties sold for the same price during a timeframe we will disambiguate by sorting them by lotID after price. So if lot ABC and lot DEF were sold at the same price we would say that lot ABC comes before lot DEF in our price ranking. Keep inmind that you might want to augment your tree beyond just making it an order statistic tree. Also note that all values are going to be integers so use integer arithmetic to do it.
Here are some sample inputs. I'll put up the sample outputs as soon as I have the solution written.