Rebate Center Update #5

Well, time for another update, although I have yet to receive a single comment about this exciting script. That’s OK, I guess. The users of the rebate center will probably get a kick out of reading the development blog once the site has opened.

Development on the rebate center has slowed down a bit, not because I have run out of things to do, but because I have hit a wall.

This seems to be a regular feature for me when coding a big project. I imagine that other developers experience similar stumbling blocks.

When I write code, it is as close to a visual art as I get. I can’t draw. I can’t paint. I can’t do any of that. So when I write code, I try to make it the most beautiful, most efficient piece of work that I possibly can. Unfortunately, that sometimes leads to a slow development process.

Before I get into the stumbling block, I’ll talk about one success I have had. I added a feature that I hadn’t planned, but that was clearly necessary.

When you add a rebate to the system, there is some basic information you must provide. In the interest of keeping the form from being an overgrown monster, I built form fields for the absolute-minimum information I need to run the system.

That is all well and good, but not always practical for a user. Let’s say a user tracked down the corporate directory for a company that has stiffed them on a rebate. Until now, the user would have had to save that information in a general note on the rebate. Then, to retrieve it, they would have to go back into the rebate and search through their notes to find what they entered.

Now, users can add “custom fields” to their rebates. The fields appear directly below the rebate summary, so they are easy to find quickly. To add a field, a user simply clicks “Add More Information” for a specific rebate, and then provides a field “title” and a field “value” for their custom information. The system allows users to add as many custom fields as they would like.
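As a rough illustration of the custom-field idea, here is a minimal sketch in Python (not the PHP the site is written in). The `Rebate` class, its attribute names, and the sample values are all hypothetical; the only things taken from the description above are the title/value pairs, the lack of a limit, and the placement directly below the summary.

```python
from dataclasses import dataclass, field


@dataclass
class Rebate:
    """Hypothetical rebate record, reduced to what custom fields need."""
    summary: str
    custom_fields: list = field(default_factory=list)  # (title, value) pairs

    def add_custom_field(self, title, value):
        # No cap on the number of fields a user may add.
        self.custom_fields.append((title.strip(), value.strip()))

    def render(self):
        # Custom fields are listed directly below the rebate summary.
        lines = [self.summary]
        lines += [f"{title}: {value}" for title, value in self.custom_fields]
        return "\n".join(lines)


rebate = Rebate("$20 rebate, wireless router")
rebate.add_custom_field("Corporate contact", "customer relations desk")
print(rebate.render())
```

Keeping the fields as an ordered list of pairs means they display in the order the user added them.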

And now the stumbling block: Statistics.

These statistics are giving me the toughest time. It seems incredibly difficult to build real-time statistics without either a lot of processing in PHP or a lot of reading from MySQL.

The statistics I’m trying to build are simple. Members will be able to see where they rank compared to other members in four categories:

* Success Rank: The percentage of a user’s rebates that were successfully completed.
* Total Value Rank: The cumulative value of all rebate checks a user has received.
* Average Value Rank: The average value of all rebate checks a user has received.
* On-time Rank: The percentage of a user’s rebates that were received on-time.

Users can view a site-wide list of rankings (everyone on the site), can view just their statistics (from their rebate center homepage), and can use a dynamically-generated signature image that lists their ranking in all four categories.

I’ve thought and thought and thought about how to pull these stats, and can’t come up with anything good. I’ve settled on a method that seems to be a problem waiting to happen, but there is not much I can do for right now. Once the system is underway and I have better sample data for testing, I will have to rewrite the statistics functions.

For now, I read every single rebate in the system with one huge SELECT query. I then step through the rows, and build an array that looks like this:

    [6] (member id) => Array
        (
            [past_due_rebates] => 3
            [open_past_due_rebates] => 1
            [closed_rebates] => 2
            [successful_rebates] => 2
            [total_cash_received] => 128
        )
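The stepping-through pass can be sketched roughly as follows. This is Python rather than PHP, and the row layout (`member_id`, `past_due`, `closed`, `successful`, `amount`) is an assumption made for illustration, as are the rules for when each counter is incremented; only the five counter names come from the array above.

```python
from collections import defaultdict


def build_member_totals(rows):
    """Fold a flat list of rebate rows into one counter dict per member."""
    totals = defaultdict(lambda: {
        "past_due_rebates": 0,
        "open_past_due_rebates": 0,
        "closed_rebates": 0,
        "successful_rebates": 0,
        "total_cash_received": 0,
    })
    for row in rows:
        counters = totals[row["member_id"]]
        if row["past_due"]:
            counters["past_due_rebates"] += 1
            if not row["closed"]:
                counters["open_past_due_rebates"] += 1
        if row["closed"]:
            counters["closed_rebates"] += 1
        if row["successful"]:
            counters["successful_rebates"] += 1
            counters["total_cash_received"] += row["amount"]
    return dict(totals)


# Made-up rows standing in for the result of the one big SELECT.
rows = [
    {"member_id": 6, "past_due": True, "closed": True, "successful": True, "amount": 100},
    {"member_id": 6, "past_due": True, "closed": True, "successful": True, "amount": 28},
    {"member_id": 6, "past_due": True, "closed": False, "successful": False, "amount": 0},
]
totals = build_member_totals(rows)
print(totals[6])
```

The pass touches every row once, so its cost grows with the total number of rebates in the system, which is the scaling worry described above.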

From there, I loop through the array and create four new arrays of the actual statistics. I perform an arsort() on each of them, and end up with a nicely formatted array like this one:

success_rank after arsort():

    Array
    (
        [6] => 0.666666666667
        [5] => 0.25
        [4] => 0
        [2] => 0
    )
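A sketch of this second pass, again in Python as a stand-in for the PHP. The helper named `arsort` here only imitates what PHP's arsort() does (sort by value, highest first, keeping keys attached). The formulas are guesses: in particular, the success-rate denominator (closed plus open past-due rebates) is an assumption, and the on-time statistic is left out because the counters shown do not obviously determine it. The sample totals are invented.

```python
def arsort(stat):
    """Return a new dict ordered by value, highest first, keys preserved."""
    return dict(sorted(stat.items(), key=lambda item: item[1], reverse=True))


def build_rankings(totals):
    success, total_value, average_value = {}, {}, {}
    for member_id, t in totals.items():
        # Assumed denominator: every rebate that is closed or overdue.
        attempts = t["closed_rebates"] + t["open_past_due_rebates"]
        wins = t["successful_rebates"]
        success[member_id] = wins / attempts if attempts else 0
        total_value[member_id] = t["total_cash_received"]
        average_value[member_id] = t["total_cash_received"] / wins if wins else 0
    return {
        "success_rank": arsort(success),
        "total_value_rank": arsort(total_value),
        "average_value_rank": arsort(average_value),
    }


# Invented per-member totals in the shape shown earlier.
totals = {
    5: {"past_due_rebates": 2, "open_past_due_rebates": 1, "closed_rebates": 3,
        "successful_rebates": 1, "total_cash_received": 40},
    6: {"past_due_rebates": 3, "open_past_due_rebates": 1, "closed_rebates": 2,
        "successful_rebates": 2, "total_cash_received": 128},
}
rankings = build_rankings(totals)
print(rankings["success_rank"])
```

Guarding each division matters because a brand-new member has no closed or successful rebates yet.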

This allows me to access any of the statistical arrays whenever I need them. I can pull a specific element of an array, or I can pull a specific user’s position within the array.
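Both kinds of lookup are short once the array is sorted. The sketch below is Python, with hypothetical helper names (`member_position`, `top_n`); it reuses the success_rank values shown above.

```python
def member_position(ranking, member_id):
    """1-based position of a member in an already-sorted ranking, or None."""
    for position, ranked_id in enumerate(ranking, start=1):
        if ranked_id == member_id:
            return position
    return None


def top_n(ranking, n):
    """First n (member_id, value) pairs, e.g. for a site-wide leaderboard."""
    return list(ranking.items())[:n]


success_rank = {6: 0.666666666667, 5: 0.25, 4: 0, 2: 0}

print(member_position(success_rank, 5))
print(top_n(success_rank, 2))
```

Note that this simple version gives tied members (4 and 2 here) different positions based purely on their order in the array; whether ties should share a rank is a design decision the sketch does not settle.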

I’ll take these four arrays, put them into one big multi-dimensional array, serialize it, and stuff it into a cache table I’ve created in the database.

I’ll be writing a stats_write() function to serialize the array and stuff it into the cache table, and a stats_read() function to pull it back out and rebuild it into a usable format.
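Here is one way the pair could look, sketched in Python with an in-memory SQLite database standing in for MySQL and JSON standing in for PHP's serialize(). The table name `stats_cache`, its columns, and the `rankings` cache key are all made up for the example; only the function names stats_write() and stats_read() come from the plan above.

```python
import json
import sqlite3


def stats_write(conn, rankings):
    """Serialize all ranking arrays into one cached row."""
    # Store each ranking as [member_id, value] pairs so the sort order
    # survives and member ids are not turned into strings by JSON.
    payload = json.dumps({name: list(r.items()) for name, r in rankings.items()})
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stats_cache "
        "(cache_key TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO stats_cache (cache_key, payload) VALUES (?, ?)",
        ("rankings", payload),
    )
    conn.commit()


def stats_read(conn):
    """Load the cached row and rebuild the dict-of-dicts, or None if empty."""
    row = conn.execute(
        "SELECT payload FROM stats_cache WHERE cache_key = ?", ("rankings",)
    ).fetchone()
    if row is None:
        return None
    return {
        name: {member_id: value for member_id, value in pairs}
        for name, pairs in json.loads(row[0]).items()
    }


conn = sqlite3.connect(":memory:")
rankings = {
    "success_rank": {6: 0.666666666667, 5: 0.25, 4: 0, 2: 0},
    "total_value_rank": {6: 128, 5: 40, 4: 0, 2: 0},
}
stats_write(conn, rankings)
restored = stats_read(conn)
print(restored["success_rank"])
```

With this layout, every page view costs a single-row read, and the expensive full scan only happens when the cache is rebuilt.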

I’ll run the script that rebuilds the stats as a cron job, which will hopefully keep CPU usage under control.

It shouldn’t be long before I finish the stats section. Then, I just have to perform a complete rewrite on all of the code to make sure it is as efficient as possible, register a domain, and we’re set for a beta launch.

I can’t wait!
