Data virtualization

Sep 21, 2009 at 10:32 AM

Hello,

I have very large sets of data which are impractical to keep entirely in memory.  It would be better to load them as required when the chart is panned or zoomed.  Has anyone here implemented a virtualized data source for the D3 line chart?  I am thinking of something similar to the techniques described here: http://bea.stollnitz.com/blog/?p=344

 

Thanks

Dan

 

Sep 21, 2009 at 1:31 PM
Edited Sep 21, 2009 at 1:32 PM
I am also very interested in the answer to this question.
 
I am using SQL Server 2008 FileStreaming.
 
I am working with Forex real-time tick data - which means there is a lot of data and the chart must be updated on a tick-by-tick basis.
 
David Roh
Editor
Sep 25, 2009 at 2:07 PM

Hi,

 

Currently there is no such data source, but it will definitely be implemented, as it is on our internal roadmap. If you are going to try to implement it yourself, I advise you to take a look at the data source used in the new markers, which are available in the D3.markers project on the 'source code' page.

I am ready to answer any specific questions you have.

Mikhail.

Editor
Sep 25, 2009 at 6:49 PM

The new data sources in DynamicDataDisplay.Markers.dll were designed (and continue to be designed, as they are not yet of final quality) with such cases in mind.

They have access to information about the currently visible area and similar properties, so descendants of this data source can take that information into account and load only the necessary items. For example, if the data is stored as a sequence of typed records in a large file, such a data source can seek within the file and read only those records that will be displayed.
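To make the file-seeking idea concrete, here is a minimal, language-neutral sketch in Python (not D3 code). It assumes a hypothetical record layout of two little-endian doubles (x, y) per record, sorted by x; the function names are invented for illustration. A binary search over record positions finds the first visible record, and reading stops as soon as x leaves the visible range.

```python
import io
import struct

# Assumed layout (not from the thread): each record is (x, y) as two
# little-endian doubles, and records are sorted by x.
RECORD = struct.Struct("<dd")

def read_x(f, index):
    """Read only the x value of the record at the given index."""
    f.seek(index * RECORD.size)
    return RECORD.unpack(f.read(RECORD.size))[0]

def first_index_at_or_after(f, count, x):
    """Binary search for the first record whose x is >= the given x."""
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        if read_x(f, mid) < x:
            lo = mid + 1
        else:
            hi = mid
    return lo

def read_visible(f, count, x_min, x_max):
    """Return only the records with x_min <= x <= x_max."""
    start = first_index_at_or_after(f, count, x_min)
    f.seek(start * RECORD.size)
    points = []
    for _ in range(start, count):
        x, y = RECORD.unpack(f.read(RECORD.size))
        if x > x_max:
            break
        points.append((x, y))
    return points

# Usage with an in-memory stand-in for the large file.
data = io.BytesIO(b"".join(RECORD.pack(float(i), float(i * i)) for i in range(1000)))
visible = read_visible(data, 1000, 10.0, 12.5)
```

If the x values are evenly spaced, the binary search can be replaced by a direct index calculation.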

I'm not familiar with SQL Server in general or FileStreaming in particular, but I can suggest using, for example, LINQ-to-SQL queries like sqlDataSequence.Skip(number of records to skip).Take(number of records to load).
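The Skip/Take pattern corresponds to offset-based paging in SQL. The sketch below is only an analogy, written in Python against an in-memory SQLite database rather than SQL Server; the table name `points` and the helper `load_page` are assumptions made for illustration. Note the explicit ordering: without it, the database is free to return rows in any order, so pages would not be stable.

```python
import sqlite3

def load_page(conn, skip, take):
    """Load `take` rows after skipping `skip` rows, ordered by x."""
    cursor = conn.execute(
        "SELECT x, y FROM points ORDER BY x LIMIT ? OFFSET ?",
        (take, skip),
    )
    return cursor.fetchall()

# Usage with a small hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (x REAL PRIMARY KEY, y REAL)")
conn.executemany(
    "INSERT INTO points VALUES (?, ?)",
    [(float(i), float(2 * i)) for i in range(100)],
)
page = load_page(conn, skip=20, take=3)
```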

When I have some extra time, I'll try to implement some of these data sources.

If you share some additional information about the way you are storing your large data, I can use it in designing such data sources.

Best regards,

Mikhail.

Sep 29, 2009 at 5:29 PM

My bulk point data will be stored in binary format in a file.  Each time a new page of data is requested, the source will need to asynchronously seek to the correct file location and read back the data points.  I intend to implement a caching system that predicts and fetches pages of data points from the file before they are requested, to prevent delays in the GUI during graph scrolling.
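A rough sketch of such a prefetching page cache, in Python for brevity, might look like the following. Everything here is hypothetical: the class name `PageCache`, the `load_page` callback, and the simple "prefetch both neighbours" prediction are illustrative assumptions, not part of D3. Pages are loaded on a single background thread and evicted in least-recently-used order.

```python
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor

class PageCache:
    """LRU cache of data pages that prefetches the neighbouring pages."""

    def __init__(self, load_page, capacity=8):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._load_page = load_page   # callable: page index -> list of points
        self._capacity = capacity
        self._pages = OrderedDict()   # page index -> Future, oldest first
        # One worker keeps file access sequential on a background thread.
        self._pool = ThreadPoolExecutor(max_workers=1)

    def _request(self, index):
        """Return the Future for a page, scheduling a load if needed."""
        if index in self._pages:
            self._pages.move_to_end(index)
        else:
            self._pages[index] = self._pool.submit(self._load_page, index)
            while len(self._pages) > self._capacity:
                self._pages.popitem(last=False)  # evict least recently used
        return self._pages[index]

    def get(self, index):
        """Return a page, and start loading its neighbours in the background."""
        future = self._request(index)
        for neighbour in (index - 1, index + 1):
            if neighbour >= 0:
                self._request(neighbour)
        return future.result()  # blocks only if the page is not loaded yet

# Usage with a fake loader that records which pages were read.
loads = []

def fake_load(index):
    loads.append(index)
    return [(index * 4 + i, 0.0) for i in range(4)]

cache = PageCache(fake_load)
first = cache.get(5)
```

In a real GUI, `get` would not block the UI thread; the Future could instead notify the chart when the page arrives.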

A possible problem I can foresee is working out how to allow someone to zoom fully out on a very large set of data points.  Ideally, a decimated subset of the full collection would be shown to save memory and processing time.  Working out a system that caches this data in from the file during zoom-in and zoom-out will be very hard, I think.
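One common way to decimate for a zoomed-out view is min/max bucketing: split the points into as many buckets as there are horizontal pixels and keep only the lowest and highest point of each, so spikes stay visible. The Python sketch below illustrates that technique under the assumption that points are (x, y) tuples sorted by x; it is one possible approach, not something the library provides.

```python
def decimate_min_max(points, bucket_count):
    """Keep the min-y and max-y point of each bucket, in x order."""
    if bucket_count < 1:
        raise ValueError("bucket_count must be at least 1")
    if len(points) <= 2 * bucket_count:
        return list(points)  # already small enough, nothing to drop
    result = []
    for b in range(bucket_count):
        start = b * len(points) // bucket_count
        end = (b + 1) * len(points) // bucket_count
        bucket = points[start:end]
        lowest = min(bucket, key=lambda p: p[1])
        highest = max(bucket, key=lambda p: p[1])
        if lowest == highest:
            result.append(lowest)  # flat bucket: one point is enough
        else:
            result.extend(sorted((lowest, highest)))  # preserve x order
    return result

# Usage: 1000 points reduced to at most 2 points per bucket.
points = [(i, i % 10) for i in range(1000)]
reduced = decimate_min_max(points, 50)
```

The output size depends only on the bucket count, so memory use stays bounded no matter how far the view is zoomed out.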

Many thanks for the excellent library!

Dan