Moaning about +/-

26Apr09

Game 4 between Utah and L.A. last night provides an interesting data point in considering the usefulness of +/- as a basketball stat. We’ve all seen instances where a player who contributes very little statistically ends up with a phenomenally high +/- rating, while the team’s stars are rated as having played poorly. Occasionally, this even happens for a player on the losing team. The reason, of course, is that a player’s +/- rating is, by definition, the point differential between two teams accrued while that player is on the court. So, if you’ve got a scrub on the court for precisely the five minutes that his superstar teammate reels off 15 consecutive points while the other team is held scoreless, he ends up with a +15 rating; the superstar gets a +15 for that five-minute stretch too, but since he probably plays the rest of the game too, his score is likely diluted by game-end.
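To make that definition concrete, here is a minimal Python sketch of how a box-score +/- could be tallied. The event format (scoring team, points, set of players on the floor) and the team and player names are all made up for illustration; this is not how any official scorer does it.

```python
from collections import defaultdict

def plus_minus(events, team):
    """Final +/- for each player of `team`.

    Each event is (scoring_team, points, on_court), where on_court is the
    set of `team`'s players on the floor when the points were scored.
    This event format is an assumption made for the example.
    """
    totals = defaultdict(int)
    for scoring_team, points, on_court in events:
        # Points for our team count positive, points against count negative.
        delta = points if scoring_team == team else -points
        for player in on_court:
            totals[player] += delta
    return dict(totals)

# Hypothetical game: the star plays throughout, the scrub only during a 15-0 run.
events = [
    ("AWAY", 10, {"star"}),           # opponents score while the scrub sits
    ("HOME", 15, {"star", "scrub"}),  # the 15-0 run
    ("AWAY", 8, {"star"}),
    ("HOME", 6, {"star"}),
]
ratings = plus_minus(events, "HOME")
print(ratings)
```

In this toy game the scrub finishes at +15 while the star, who was also on the floor for the opponents' baskets, finishes at +3.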

With that in mind, it’s interesting to read J.A. Adande’s recap of Game 4 with a scorecard on hand. What we see is that Kobe Bryant, who scored 38 points on 66% shooting in 39 minutes, ended the game with a +4 rating, the same as Luke Walton, who chipped in 9 points on 50% shooting in 19 minutes. Sasha Vujacic had a +9 for his contributions of 9 points on 33% shooting in 17 minutes. The guys who fare the best are Lamar Odom (10-15-6 line on 40% shooting in 41 minutes, for +20) and Pau Gasol (13-10-1 in 41 minutes, for +15). Clearly, those numbers are all over the place, but what’s weird is how Kobe ended up with just a +4, given how much everyone insists he dominated the game. The following excerpt from Adande’s article helps to explain this:

Bryant came out firing. … He scored the Lakers’ first 11 points and 13 of their first 15, hitting 6-of-8 shots in the first quarter. … But the rest of the Lakers went 2 for 10 and Utah led, 25-20. The turning point actually came when he was on the bench at the start of the second quarter and the Lakers fielded a lineup of Shannon Brown, Sasha Vujacic, Luke Walton and starters Lamar Odom and Pau Gasol. It was the reserve players who hit three consecutive 3-pointers to take the Lakers from a seven-point deficit to a two-point lead.

A-ha! Kobe may have scored a full one-third of the team’s points over the course of the game, but he was sitting out during a key stretch. Worse, while he was lighting it up in the first quarter, everybody else’s ineptitude was costing him in the +/- stakes.

This situation captures almost perfectly the beauty and the shortcomings of +/-: on the one hand, it zooms in on what matters most — winning — without ascribing arbitrary weightings to individual statistics like points, rebounds and assists while still incorporating the value of intangible contributions. On the other hand, the value of the individual is so tightly tied to who is on the court with him that +/- as a comparative tool becomes worthless, since it reflects more the overall quality of the team and the ability of the coach to construct effective rotations. It’s sort of the classic curse of single-number metrics.

One is tempted to abandon +/- as the ‘holy grail’ statistic at this point, but it’s really a very good idea in as far as cutting out the pseudo-science of sports statistics and focussing on what actually leads to wins. One wonders if the problem is ’snapshotting’ +/- ratings at the end of the game, which necessarily throws out all the information that a real-time +/- score would have. Behold:

Utah-L.A. Game 4, Playoffs 2009 - Selected Plus-Minus Ratings

What you’re seeing is a graph of realtime +/- ratings for the teams (ie, overall) as well as for various players over the course of the entire game. The overall team rating is just the point differential between L.A. and Utah (I’m a Lakers fan); for the individual players, you can see how their +/- rating was related to that of the team through time. The flat regions (eg, Luke Walton from the beginning of the game until about 7:30 in) are periods when the player’s +/- didn’t change — ie, he was either off the court, or he was on, but neither team scored. You’ll also see some periods when everyone’s rating moves together — eg, the plotted players’ +/- ratings all fell between 10:30 and 14:30 — which implies that they were all on the court (in the 10:30 - 14:30 period, the falling ratings mean Utah made a run).
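For anyone who wants to build this sort of series themselves, here is a rough Python sketch. It assumes you have the scoring events with timestamps and each player's stints on the court; the function name, the data layout and the numbers are invented for illustration and are not taken from my spreadsheet.

```python
def running_plus_minus(scores, stints, team):
    """Running +/- after each scoring event.

    scores: list of (minute, scoring_team, points), sorted by minute.
    stints: list of (start, end) minutes during which the player was on
            court; use a single full-game stint for the team as a whole.
    Both layouts are assumptions made for this example.
    """
    series, total = [], 0
    for minute, scoring_team, points in scores:
        on_court = any(start <= minute < end for start, end in stints)
        if on_court:
            total += points if scoring_team == team else -points
        # Off the court the rating is simply carried forward (a flat region).
        series.append((minute, total))
    return series

# Made-up scoring events, not the real Game 4 play-by-play.
scores = [(1.0, "LAL", 2), (3.5, "UTA", 3), (8.0, "LAL", 3), (11.0, "UTA", 2)]
team_series = running_plus_minus(scores, [(0, 48)], "LAL")
bench_series = running_plus_minus(scores, [(7.5, 12)], "LAL")
print(team_series)
print(bench_series)
```

The last value of each series is the ordinary box-score +/-. A basket scored at the exact moment of a substitution is credited according to the half-open stint interval here, which is a simplification.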

The plot makes it immediately apparent how, for example, Luke Walton ended up with the same +/- rating as Kobe: he simply sat out the last 10 minutes of the game, when Utah was cutting L.A.’s lead. This, of course, raises some interesting questions, such as how starting and ending games affects players’ +/- ratings; from this example, it looks like garbage time can hurt the winning team’s ratings (though conversely, Utah’s players’ ratings benefited from garbage time), while starting can be good or bad for one’s score depending on who starts the game stronger (ie, there’s no bias there, whereas there is for garbage time).

Of course, since one now has a nice time-series describing the players’ contributions (as encapsulated by +/-), one can try computing a single-number metric of performance (though, obviously, YMMV*) by simply computing the correlation between the team’s overall +/- and the players’ individual ratings. Note that this number really only makes sense in the context of the team’s performance — ie, a player with a perfect score on the Kings is clearly worse than a player with a perfect score on the Lakers, since correlation doesn’t differentiate between covariance that leads to team success or failure. That said, let’s take a look at what this ‘+/- correlation’ metric looks like for Game 4:
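As a sketch of the calculation, assuming the team and player series are sampled at the same moments, the metric is just a Pearson correlation. The numbers below are invented, and returning 0 for a player whose rating never moves is my own convention for the example rather than anything principled.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A flat series has no defined correlation; 0.0 is an assumed convention.
        return 0.0
    return cov / sqrt(var_x * var_y)

# Hypothetical running +/- values, sampled at the same six moments.
team = [2, -1, 2, 0, 4, 7]
starter = [2, -1, 2, 0, 4, 7]   # on the court the whole time
bench = [0, 0, 3, 1, 1, 1]      # played only a short stretch

starter_r = pearson(team, starter)
bench_r = pearson(team, bench)
print(round(starter_r, 3), round(bench_r, 3))
```

A player who never leaves the floor tracks the team exactly and scores 1, while the short-stint player lands much closer to 0, which is the playing-time bias in action.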

Utah-L.A. Game 4, Playoffs 2009 - +/- correlations

I’ve shown both teams’ players on this one chart, so what you’re seeing is who contributed to what ultimately proved to be a Lakers win as a normalised score based on players’ full real-time contributions. So if you were to say a dude is absolutely indispensable to a team’s success (ie, the MVP), you would expect him to have, over the course of the season, a +/- correlation of almost +1. We see that, for example, for Pau Gasol, implying he was on the court for almost every significant period of the game (even if he didn’t contribute as much as Kobe in raw numbers). The implications for Andrew Bynum are also interesting, since a correlation that low means he basically didn’t matter much as far as the team’s ultimate success is concerned. I’m really itching to see what it says about Shane Battier ;)

Unfortunately, it’s not the end-all, be-all of metrics because by construction the amount of playing time a player receives will bias it, and because its basis in the overall winning ability of the five players on the court means it doesn’t break down how much of that success is attributable to any individual player. However, it does roughly put players where we expect them in the pecking order, and gives a rudimentary sense of how much responsibility for a team’s results individual players should be ascribed. And there are ways to improve this metric, a topic which really deserves another post (although here’s an obvious extension: define the overall team +/- to always be in favour of the team that wins).

In summary, what complaints we have against +/- ratings might be resolvable by analysing players’ +/- ratings as they evolve through time. In abstract terms, +/- seems a much more objective metric than the weighted-average metrics in more common use (eg, PER, Wages-of-Wins), which make assumptions that are borne out only by a couple of decades’ history.

I’m attaching the spreadsheet containing my data as well in case anyone’s interested in taking a look at it. If I get time — and if there’s interest — I might compile this sort of data for all the games next year and put it on the web. Let me know what you think.

Utah-L.A. Game 4, Playoffs 2009 - Data

Utah-L.A. Game 4, Playoffs 2009 - Scorecard

* I have trouble using this phrase in polite conversation now, but hopefully we’re all on the same page as to why I’m using it!

Statistics and basketball

28Mar09

I hate how people are trying to apply statistics to basketball, like it’s some voodoo that could potentially unlock all sorts of answers and reduce the sport to a science. It’s a ridiculous, depressing and self-defeating (for sports fans, anyway) thing to do. The pompous proponents of this mentality are correct in arguing that there is value to applying predictive analysis techniques to the reams of numbers basketball statisticians have been keeping for years, but the purportedly sophisticated methods that people like John Hollinger are pushing in order to allow, for example, comparisons between players of different eras, are so desperately arbitrary and leave so much information out that you really wonder what they’re smoking (and yes, wish they’d pass it around).

All I want right now is for someone to look at Tim Thomas’ line in the March 28 Bulls-Pacers game and restate the value of +/-, basketball’s statistic of the moment:

                MIN  FGM-A  3PM-A  FTM-A  OREB  DREB  REB  AST  STL  BLK  TO  PF  +/-  PTS
Tim Thomas, PF    6    0-3    0-1    0-0     0     1    1    0    0    0   0   2   +8    0

Yes, for the 6 minutes in which he netted 1 rebound, 3 misses and 2 fouls, he was rewarded with a +8 rating.

The cynic in me…

18Mar09

…smirks that while we celebrate the birth of real democracy in Pakistan, the Pakistani people are already so thoroughly drenched in the gooey residue of their success that they don’t recognise the CJP is already installed in their minds as the next in the long line of saviours to whom we periodically hand over our hopes and, always terrifyingly, destinies. I’m thinking the Quaid, BB, Musharraf, of whom only the Quaid actually succeeded outright in securing our goals.

The Pakistani people have doubtless gained ground but the struggle to hold our overlords (and, indeed, ourselves) accountable must continue apace.

Yo Dawg, blogging is hard work…

01Feb09

Dang yo, I haven’t been blogging regularly.

I might have mentioned this before, but the biggest reason I don’t blog is because I decide I want to edit something before I publish it; a few days later, I’ve lost my train of thought and the draft never makes it to the blog. There’s certainly a lot of stuff to blog about (including: I joined the Tehreek-e-Insaaf recently, and I’ve almost finished my petition to start a library in Islamabad), but I get too caught up in trying to get my writing just right…

That said, I’ve noticed that it doesn’t really matter how much editing I put into a post: I inevitably read it several times after publishing and tinker with it then. So, I’m trying something new right now: I’ve posted an 80%-complete draft on the Nairaing Foundation blog about shadow cabinets. It’s not quite as well-digested as I would like and I think I want to simplify the wording a little and add a proper conclusion, but now that it’s out there, I think I’ll have more of an incentive to fix it.

Let’s see.

Dean Kamen on Innovators

13Jan09

Something worth hanging on to for rainy days:

Kamen … said every entrepreneurial innovator he’s ever seen shares a few characteristics.

“It’s not that they’re brilliant or well-educated,” Kamen said. “They work all the time. They don’t let failure demoralize or destroy them. They pick themselves up and keep going and eventually, every once in a while, one of your ideas actually breaks through and works, and it makes all that stuff seem worthwhile.”

In other words, Mr. Kamen agrees with Ammi :)

Corruption and Development

23Nov08

I was quite excited to find this blog post but now that I’ve read it, I’m a little disappointed. It’s a bit wishy washy in the way academic articles can be and doesn’t really get much further than defining corruption and categorising it as high-level and low-level.

The reason for that is probably the extreme complexity of the issue under examination — simply taking a handful of countries and establishing some relationship between their respective levels of corruption and growth doesn’t make sense because there are too many other factors at play. Economists always try to take a ‘partial derivative’, linearised view of the world (ie, all else equal, find a single driving variable), since that’s the most obvious way to decompose a complex issue into aspects that can be studied individually, but there are inevitable limitations to how effective this approach can be at the macro level, where there are dozens of significant variables, with unknown cross-effects that are likely deserving of separate study.

It’s clear that corruption impedes growth and development, indirectly through economically suboptimal allocation of resources and directly through the increased costs it imposes. Beyond that, it’s pretty hard to make any substantial, justifiable statement. Perhaps rather than drilling until some tenuous relationship is established, the focus should be to establish how corruption becomes systemic and what measures can be taken to prevent this from happening. In places like Pakistan where corruption is an everyday occurrence, people simply assume it will continue unimpeded and price it in as a cost of business. We know that there it is exacerbated by extreme social inequality, exceptionally poor pay for officials, and a bizarre mindset (particularly in communities like the Memons) that considers bribes a mark of respect and an essential element in building necessary relationships with officials. In short, there’s a significant cultural component to corruption that must not be obscured by the economic aspect.

Of course, once corruption becomes so commonplace that society’s perception of it vacillates between necessary evil and competitive advantage, obvious steps like making examples of a few chosen offenders become useless, and even attempts to tackle the root issues — such as the double-salary schemes offered by the Federal Board of Revenue under reforms agreed with the World Bank — are only marginally effective. Certainly, gimmicky (the National Accountability Bureau is a massive sham) and/or highly focussed measures (the double-salary scheme was only offered to key officials) are a waste of time and not deserving of discussion. Bold steps, applied with commitment and consistency and monitored actively, are needed in order to remove the root causes and steadily (if slowly) rub out corruption. And of course, these steps would need to address each facet of the problem.

One bold step might be to draw the problem out into daylight by legitimising and regularising the bribes officials are already taking in the form of some minimal extra fee; this would be additional remuneration for providing a service. This immediately circumvents the problem of burdening the government with higher payroll and actually encourages efficiency in directing the remuneration to those who have earned it. In an indirect way, it might also go some way in addressing the cultural issues (continuing to pick on Memons — assuming they don’t pay doctors more than their quoted fee, I don’t see why they would feel obliged to push further bribes on officials). Doing this would likely create issues of nepotism in the allocation of official duties and potentially harmful competition between colleagues, but these are straightforward problems with more obvious solutions. The government would simultaneously have to empower citizens to report negligence or incompetence or — more tricky — officials charging in excess of the permitted fee. With sufficient political will and clout to aggressively prosecute citizens and officials who continue to flout the law, maybe we would have a workable system.

(Holy never-ending-sentence-alert, Batman! My brain isn’t working these days…)

Un Ke Dekhe Se - Mirza Ghalib

13Nov08

Un ke dekhe se jo aa jati hai mun par ronaq,
Woh samajhtey hain keh beemar ka haal accha hai.

Dekhiye paate hain usshaq buton se kya faiz,
Ek brahman ne kaha hai keh yeh saal accha hai.

Hum ko maaloom hai jannat ki haqeeqat laikin,
Dil kay khush rakhnay ko Ghalib yeh khayal accha hai.

And here’s the wonderful Jagjit Singh version from the Mirza Ghalib soundtrack. His voice is superlative: mellifluous like no other and with an emotional depth that draws the listener in. Absolutely delightful.

The second misra is confusing to me; the literal translation is ‘what benefit is to be gained from gazing upon a loved one’, but there’s an undertone there that I don’t understand, since ‘buton’ (statues) suggests coldness or aloofness. Or maybe Ghalib is just pointing to the impotence of the situation, since that’s the general direction of the third misra too.

It’s interesting that each shair stands very well on its own, almost inviting other interpretations.

Programmatically inserting and running VBA in a password-protected spreadsheet

02Oct08

That heading describes a problem I hope you never have to tackle. We’ve got a full-fledged exotics trading system (call it XX) built in Excel with all sorts of fancy bells and whistles. For some reason that escapes me, XX has never been split into a thin front-end sheet backed by an XLA containing all the business logic.

The system is used by a lot of users all over the place for all sorts of things, including, notably, for running risk batches, which of course are crucial for trading. Recently we’d been trying to increase the complexity of batches, so that users could basically tweak all sorts of configuration settings and perform any transformations of market data they may wish to do before running the valuations. There are various ways of doing this, but the most powerful way would be to offer a way to load or inject custom batch logic written in VBA, so that anyone could whip up a custom batch and debug it easily.

In this scenario, the user would simply specify a text file containing the VBA logic when launching XX in Excel. XX would then run the custom logic and then run the batch. Here’s an example of something we might want to do ('XX'-prefixed functions are calls to the trading system):

Sub BatchLogic()
    Dim transformations As New Collection
    Call transformations.Add(0.01, "SpotBump")

    Call XXLoadMarketData
    Call XXTransformMarketData(transformations)
End Sub

It sounds a bit hairy given that VBA doesn’t have an evaluate function to execute arbitrary VBA code (you can evaluate worksheet functions, but you don’t have first-class functions like Lisp), but I’m learning you can do practically anything with the VBE if you’re willing to mess around for a bit. The key references, as always, are the Pearson Consulting website and the Erlandsen Consulting website. It took me three tries to get something that worked like I wanted.

My first try was to add a module called ‘Batches’ in the XX VBA project and try to dynamically add a procedure to it and then call it. The code in Batches looked something like this:

Sub InjectAndRunBatchLogic(filename As String)
    Dim text As Collection

    ' Load VBA from file (Set is required when assigning an object)
    Set text = ReadFileAsCollection(filename)

    ' Add to XX
    Call AddProcedureToModule("Batches", text)

    'Call BatchLogic
    Application.Run "BatchLogic"
End Sub

The call to BatchLogic is commented out in favour of Application.Run since the function will not exist in XX except when running a custom batch (and will therefore cause compilation problems during normal development).

Unfortunately, it turns out that VBA compiles the module when it is used, so that dynamically modifying the module will either result in BatchLogic not being found (because it didn’t exist when InjectAndRunBatchLogic was called) or Excel crashing (if you try to put a stub BatchLogic in Batches and then delete it before creating it from the text file). Of course, once InjectAndRunBatchLogic has finished running and BatchLogic exists in the module, one can call it. That’s not much use though, because I don’t have the option of restarting the VBA project flow.

The next thing I tried was to dynamically add a new module to XX and then add the new batch procedure to this new module. The idea was that the new module wouldn’t be compiled until BatchLogic was called. Here’s what the code looked like:

Sub InjectAndRunBatchLogic(filename As String)
    Dim text As Collection

    ' Load VBA from file (Set is required when assigning an object)
    Set text = ReadFileAsCollection(filename)

    ' Add a temporary module to XX
    Call AddModuleToProject("Temporary")

    ' Add to XX
    Call AddProcedureToModule("Temporary", text)

    Application.Run "BatchLogic"
End Sub

This ended up working decently well, until I password-protected XX’s VBA project so that I could test it the way users use it. Turns out you can’t add modules to password-protected VBA projects. They can’t even add modules to themselves! There’s a workaround using SendKeys that lets you unlock the project, but every reference to the method that I’ve seen warns that it’s flaky at best.

So, back to the drawing board. Since the problem was XX being password-protected, how about creating a new workbook that isn’t password-protected and adding the module there? Worked like a charm. Here’s the code (bit more explicit this time):

Sub InjectAndRunBatchLogic(filename As String)
    Dim wbk As Workbook

    Application.ScreenUpdating = False

    ' Create new workbook
    Set wbk = Workbooks.Add

    With wbk.VBProject
        ' Add module to new workbook (default name: 'Module1')
        ' vbext_ct_StdModule needs the VBA Extensibility reference; its value is 1
        .VBComponents.Add vbext_ct_StdModule

        ' Add the procedure from the batch file
        .VBComponents("Module1").CodeModule.AddFromFile filename

        ' Add reference that points to XX
        .References.AddFromFile ThisWorkbook.VBProject.Filename
    End With

    ThisWorkbook.Activate
    Application.ScreenUpdating = True

    ' Quote the workbook name in case it contains spaces
    Application.Run "'" & wbk.Name & "'!BatchLogic"

    ' Get rid of the new workbook without saving it
    wbk.Close SaveChanges:=False
End Sub

You can ignore the ScreenUpdating and workbook activation code — that’s just there to make sure that XX is what remains visible to the user. Since the batch logic will now end up in a separate workbook, you can see I am adding a reference to XX so that it is able to call XX functions. This is actually a fairly nice solution because one can add arbitrary references to the new workbook without having to add them to XX.

So, boys and girls, that’s how that worked out. It’s always fascinating to me how powerful and flexible an environment VBA provides…but of course, that’s also the number one reason for abusing VBA to do things that really should be done in other places.

Wow Day

03Aug08

I have 3 Wows to give out today.

  • WhyNot.net: I was Googling to see if anyone else was curious about using active noise-cancelling techniques for turning down the volume (in some sense) on babies. There are serious technical challenges (such as that current noise-cancelling devices target a point sink), but it’s a cool idea and even if you could get a 50% effective system, it would be immensely useful. And profitable. Anyway, Google turned this up. It’s a bit disappointing that this is another idea that someone’s already thought of, but on the other hand it’s cool to be able to read the comments and see useful extensions (eg, this would be a godsend for snorers’ spouses). The website’s like a catalogue of people’s if-onlys and what-ifs — a goldmine for potential entrepreneurs.

  • BookMooch: This one has me seriously excited. I have a bunch of books that I have to lug every time I change flats. I really like most of them but haven’t really picked them up since I first read them. BookMooch is a community of people just like that who are happy to exchange books with each other. Last night, for instance, I got online and listed the 20 or so books I’m willing to give away, and requested a math book that someone in Illinois has. This morning, my inbox had an ‘accepted’ message from the Illinois dude and 4 requests for my books (one from New Zealand!). You might argue sending the book to NZ will pretty much offset whatever money I saved by getting the math book for free, but there’s a larger point here: this is far more efficient than stockpiling books that one never reads. And in return, I get access to a pool of books that might not be available in local bookstores, or even be out of print. Of course, countries like Pakistan have a very active second-hand book market where one can often exchange books with minimal friction/costs so this might prove less useful, but here in the UK, the second-hand book scene is much more limited. There’s an analogy to zakat (and more specifically, keeping money circulating in the economy) taking shape in my head that I think is appropriate; also, it appeases my burgeoning sense of eco- and anti-consumerist responsibility.

  • The final Wow will be shared between coLinux and andLinux. coLinux compiled the Linux kernel to run on Windows, where it’s hosted as a guest operating system with virtualised access to your hardware. It’s a really cool idea and its performance is pretty impressive. I haven’t run anything really heavy (ie, OpenOffice), but the things I have run (Perl, Octave, Vim) run superbly well on my Centrino 1.6/1.25GB. The caveat in that last sentence isn’t even that much of a problem for me, since I already have MS Office and am not interested in OO. andLinux built on the coLinux base to provide an easily installable package (2 packages actually: KDE (!) and XFCE). Installation was light and problem-free, and now I can install any additional packages I want with Synaptic. Unlike VMWare, I don’t start the VM manually, and while idle the resource consumption is completely unnoticeable. I have a nice auto-hiding XFCE bar from which I can bring up a terminal or launch the file explorer and that’s that. Linux the way I always wanted it: on my terms, and as I need it.

BTW, my laptop is falling apart. Seriously: it’s cracked in two places, scratched all over, the palmrests are a little worn down, and now the USB and power ports are starting to get loose. It’s lasted 4 years without a single hitch though, which I think is serious testament to how solid Fujitsu machines are.

Whoa.

16Jul08

Check this out. I need more time to process it (and read up on most of them)…but whoa. And there’s our man Aitzaz Ahsan at #5!

(Obligatory disclaimer: I don’t trust Aitzaz Ahsan for a minute.)

Thoughts on Kobe

05Jul08

A certain someone isn’t going to stop badgering me until I say something about the Lakers’ recent loss in the NBA finals, so here goes.

I’ve been defending Kobe more or less since 2000 now. I’ve never been a full-on fanboy, but in a sport populated by egotistical megalomaniacs, I admired his rabid desire to win, and most importantly the pressure he put on himself to excel. His detractors argued this was just a manifestation of his ego, that he was only interested in bolstering his legacy; I didn’t care, as long as he won.

And that’s why this hurts. It’s not the first time Kobe’s lost in the Finals (2004, when he, along with Shaq, Payton and Malone managed to lose to the Pistons), but it is the first time that he’s the sole leader of a team that has been bounced out of the Finals. He was frequently doubled by the best defensive team in the league, but that’s not a good enough excuse. His teammates absolutely disappeared, but after doing enough to get Kobe to the Finals, they can’t be blamed. The Finals are when the Jordans of the game take over and deliver victories, no matter how or what or why or who. It’s only fair that Kobe is the face of the team when they’re winning and when they’re losing. We demand that of our Jordans just as well.

The thing is, Jordans, plural, don’t exist. There was one Jordan, an icon who fortuitously came along at that period between eras where destiny offers individuals the opportunity to immortalise themselves. Magic and Bird were winding down, and the league had no one to rival the sheer athleticism and bloody-mindedness of Jordan. The two ingredients did exist, but separately, in the Drexlers and Isiahs of the world. True, the rules were tighter and players were more skilled, but a little thought reveals these facts cut both ways.

Maybe that’s why, in retrospect, I’m not all that surprised Kobe lost. He’s never been Jordan, not since he was drafted at the unlucky thirteen spot by a team he refused to play for. Not when he was labelled a copycat rather than an iconoclast as Jordan was. Not when forced to be second banana for several years, and having to learn leadership at 26 when Jordan started learning at 21. Not in being reviled by the public for an accusation that was later dropped where Jordan’s indiscretions (which many claim equalled Kobe’s) were carefully handled by Stern. Not in playing before people were willing to replace their basketball god, Jordan. And not in playing in a league where the level of athleticism has caught up to the point that other than Dwight Howard and Lebron, no one has an outright athletic advantage over others.

No, Kobe is not Jordan. But Kobe is Kobe, a guy who has conquered all the odds and the roadblocks fortune has thrown his way, willing himself into a position where the comparisons with Jordan, that most untouchable combination of talent and destiny, have at times not been unreasonable.

That’s saying something. That’s saying there’s good reason to keep watching as Kobe plots and attacks this latest roadblock to his only goal: winning.

Protected: Staying Awake

12Jun08


About

Uzair is.