Dealing with decadent data

As threatened in the forum, I’ve finally tackled DataView integration for Databinder. Now you can page through those 100,000-row tables all day long. Sweet!

One reason for the delay was my reluctance to link to wicket-extensions; we already get enough complaints about dependency downloads. But fortunately it was possible to implement the methods of IDataProvider without actually declaring it. So Databinder won’t link to extensions, but if your project does, there’s a mighty fine data provider waiting for you to extend and tag as an IDataProvider.
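For the curious, the trick is plain Java: a subclass can declare an interface and let inherited public methods satisfy it. Here is a rough sketch of the idea, not Databinder's actual code. `PagedProvider` is a made-up, cut-down stand-in for IDataProvider so the example compiles without Wicket, and `ListBackedProvider` and `PlayerProvider` are hypothetical names too.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for wicket-extensions' IDataProvider, reduced to two
// methods so this sketch compiles without Wicket on the classpath.
interface PagedProvider<T> {
    Iterator<T> iterator(int first, int count);
    int size();
}

// "Library" side: exposes the right public methods but never names the
// interface, so it needs no dependency on the jar that defines it.
abstract class ListBackedProvider<T> {
    private final List<T> rows;

    ListBackedProvider(List<T> rows) {
        this.rows = rows;
    }

    // Returns an iterator over at most `count` rows starting at `first`,
    // clamped to the bounds of the backing list.
    public Iterator<T> iterator(int first, int count) {
        int from = Math.min(Math.max(first, 0), rows.size());
        int to = from + Math.min(Math.max(count, 0), rows.size() - from);
        return rows.subList(from, to).iterator();
    }

    public int size() {
        return rows.size();
    }
}

// "Application" side: extend the base class and tag it with the interface.
// The inherited public methods satisfy it, so nothing else is needed here.
class PlayerProvider extends ListBackedProvider<String> implements PagedProvider<String> {
    PlayerProvider(List<String> rows) {
        super(rows);
    }
}
```

Anything that expects a `PagedProvider` will now accept a `PlayerProvider`, even though the base class has never heard of the interface.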

The other reason I was so slow in getting to this is that I don’t care for pagination. Efficient pagination of large datasets is the main purpose of DataView; when using a large paginated ListView, you’re just wasting memory and cycles on objects you don’t read. The thing is, just as processing efficiency becomes critical in paging, the interface becomes useless.

What are you supposed to do, I wonder, with one page out of 663? Will page 451 ever see the light of day? Users filter down the results, or order them by some criterion and skim through the first few pages. That’s great, but you could do the same thing with a ListView limited to the first 200 results; users can refine the query if they’re interested in the rest of the data. Or better yet, come up with an interface that doesn’t present such problems.
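To put numbers on it, here is a small sketch of the arithmetic behind a page request. `PageWindow` is a hypothetical helper, not part of Databinder or Wicket, and the 25 rows per page is my assumption; it happens to give 663 pages for the example app's 16,566 players.

```java
// Hypothetical helper, not part of Databinder or Wicket: turns a 1-based page
// number into the (first, count) window a data provider would be asked for.
final class PageWindow {
    final int first; // zero-based index of the first row on the page
    final int count; // rows actually on the page (the last page may be short)

    private PageWindow(int first, int count) {
        this.first = first;
        this.count = count;
    }

    // Number of pages needed for totalRows, rounding up.
    static int pageCount(int totalRows, int rowsPerPage) {
        return (totalRows + rowsPerPage - 1) / rowsPerPage;
    }

    static PageWindow of(int page, int rowsPerPage, int totalRows) {
        if (rowsPerPage <= 0 || totalRows < 0) {
            throw new IllegalArgumentException("bad page size or row count");
        }
        if (page < 1 || page > Math.max(1, pageCount(totalRows, rowsPerPage))) {
            throw new IllegalArgumentException("page out of range: " + page);
        }
        int first = (page - 1) * rowsPerPage;
        int count = Math.min(rowsPerPage, totalRows - first);
        return new PageWindow(first, count);
    }
}
```

With these assumed numbers, page 451 is the window starting at row 11,250 and holding 25 rows, and the last page holds only 16. An efficient provider hands just that window to the database (in Hibernate, `setFirstResult` and `setMaxResults` on the query) instead of loading all 16,566 rows to show 25.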

But back in the real world, large-scale pagination interfaces are expected by both users and programmers. Everyone is impressed, or at least relieved, when Rails scaffolding coughs up a (presumably efficient) paginated list. Not having an efficient one is like showing up for work impeccably dressed, but barefoot.

Conceding this point, Databinder has at last slipped on some Pumas. Now it includes a straightforward and tight method for paginating huge result sets with DataView. As always, there’s an appropriate example app: it’s time to look up baseball players, everybody!

As you can see in the Pager class, the imaginary connection between Databinder and DataView is not painful to solidify. You manage sorting and filtering in your subclass, and that’s about it. (Please do help me test it, particularly if you were already using your own data provider. It’s in the latest snapshot.)

So people looking to check efficient pagination off their lists when evaluating Databinder can finally exhale. It’s there, it’s “easy.” DataTable works the way I imagine expensive JSF components to work. You throw an empty tag into your template, close your eyes and start the server… then suddenly all these features your boss thinks he wants are sparkling on the screen.

Well, anyway. Have fun with those 16,566 baseball players. (Thanks, baseball archive.)


On repeaters:
Even if you don’t want pagination but just a lot of rows, kind of a long and plain report, I think ListView can’t help you, can it?
Don’t you need something that can render as the data arrives from the database?

Hibernate always returns a List, so even with DataView you’re not streaming perfectly. (No rows can be garbage collected until the iterator is discarded.)

On the other hand, DataView does avoid having a ListItem-like child for each row. That’s a real advantage for this case. Good point!

I take that back. There is an Item class for DataView. How its memory consumption compares to ListView isn’t something I can answer from skimming through the code for the first time. It’s reasonable to expect, though, that since pains have been taken to optimize DataView for large data sets, it’s probably superior at handling them in a number of subtle ways.

Good thing Databinder supports it now.
