A Short Review of Dataframes in JavaScript

June 5, 2018

No clear leading dataframe implementation has emerged in JavaScript. This article reviews a selection of the current options and asks: what would an ideal JavaScript dataframe look like?

Dataframes are an increasingly common data structure. Dataframe implementations provide an API to access and manipulate 2-dimensional “tabular” data. In general, the rows represent individual “cases”, each of which consists of a number of observations or measurements (columns). The columns can be of differing data types. A time series is a special case of a dataframe in which each row represents a set of observations with a specific time attached. Time series data is extremely common in financial analysis: prices, quotes, volumes, exchange rates, etc.

There are mature dataframe implementations in many languages. In R, dataframes are built in; Python has the extensive pandas library; and Julia also has an implementation.

Common across these are abilities for:

  • Referencing rows and columns by name or label, with various indexing methods.
  • Filtering and selecting subsets of the data.
  • Numerical Analysis - mean, standard deviation etc.
  • Reshaping - groupby, pivot etc.
  • Handling Missing Data - gracefully handling gaps in data.
  • Joining multiple dataframes.
  • Reporting/Plotting - outputting textual and graphical views of the data.
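
To make a few of these abilities concrete, here is a minimal sketch in plain JavaScript, using no library at all. The sample data and the helper names (`column`, `mean`, `groupBy`) are assumptions made for illustration, and an array of row objects is only one possible representation of tabular data.

```javascript
// Hypothetical sample data: each row is one "case", each key is a column.
const rows = [
  { symbol: 'AAA', price: 10.0, volume: 100 },
  { symbol: 'BBB', price: 20.0, volume: null }, // missing volume
  { symbol: 'AAA', price: 12.0, volume: 300 },
];

// Reference a column by name.
const column = (data, name) => data.map((row) => row[name]);

// Numerical analysis that skips missing values (null, undefined, NaN).
const mean = (values) => {
  const present = values.filter(
    (v) => v !== null && v !== undefined && !Number.isNaN(v)
  );
  if (present.length === 0) return NaN;
  return present.reduce((sum, v) => sum + v, 0) / present.length;
};

// Filtering: keep the rows that satisfy a predicate.
const expensive = rows.filter((row) => row.price > 11);

// Reshaping: group rows by the value of one column.
const groupBy = (data, name) => {
  const groups = new Map();
  for (const row of data) {
    const key = row[name];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  return groups;
};

// Combine groupBy and mean: average price per symbol.
const meanPriceBySymbol = new Map(
  [...groupBy(rows, 'symbol')].map(([key, group]) => [
    key,
    mean(column(group, 'price')),
  ])
);

console.log(mean(column(rows, 'volume'))); // 200
console.log(expensive.length); // 2
console.log(meanPriceBySymbol.get('AAA')); // 11
```

Even this small sketch shows why a shared implementation is attractive: every project that handles tabular data ends up rewriting helpers like these.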

Let’s look at some options for dataframe-like operations in JavaScript.

pandas-js

[https://github.com/StratoDem/pandas-js]

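import Immutable from 'immutable';
import { DataFrame, Series } from 'pandas-js';

const df = new DataFrame(Immutable.Map({x: new Series([1, 2]), y: new Series([2, 3])}));

// Filter with a boolean Series
// Returns DataFrame(Immutable.Map({x: Series([2]), y: Series([3])}))
df.filter(df.get('x').gt(1));

// Filter with a plain array of booleans
// Returns DataFrame(Immutable.Map({x: Series([2]), y: Series([3])}))
df.filter([false, true]);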

// Filter with an Immutable.List of booleans
// Returns DataFrame(Immutable.Map({x: Series([2]), y: Series([3])}))
df.filter(Immutable.List([false, true]));

pandas-js is an experimental library that mimics the Python pandas API in JavaScript. The Python pandas library is built on top of NumPy for its data storage; pandas-js mirrors this structure by building on top of Immutable.js. It is well documented and aims to implement a large subset of pandas functionality. The documentation even references the equivalent pandas functions, and the internals resemble the internal structure of pandas, e.g. a DataFrame is a subclass of an NDFrame (n-dimensional data structure). It has been open source since late 2016 and still receives regular commits. The code quality of the project looks good and it includes an extensive test suite. It has just over 150 stars on GitHub (as of May 2018), so it has yet to gain widespread popularity.

It implements many of pandas' numerical analysis methods and reshaping operations. Its joining and index matching abilities are currently limited, and it has only limited support for missing values.
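
For readers unfamiliar with index matching, the sketch below shows the idea in plain JavaScript. It does not use pandas-js and does not reflect its API; the `outerJoin` helper and the sample series are hypothetical. Two labelled series are aligned on their index labels, and a label missing from one side produces `null` in that column.

```javascript
// Two hypothetical labelled series: Map of index label -> value.
const prices = new Map([
  ['2018-05-01', 10],
  ['2018-05-02', 11],
  ['2018-05-03', 12],
]);
const volumes = new Map([
  ['2018-05-02', 500],
  ['2018-05-03', 700],
  ['2018-05-04', 900],
]);

// Outer join on the index: every label from either side appears once,
// and a side with no value for that label contributes null.
function outerJoin(left, right, leftName, rightName) {
  const labels = [...new Set([...left.keys(), ...right.keys()])].sort();
  return labels.map((label) => ({
    index: label,
    [leftName]: left.has(label) ? left.get(label) : null,
    [rightName]: right.has(label) ? right.get(label) : null,
  }));
}

const joined = outerJoin(prices, volumes, 'price', 'volume');
console.log(joined.length); // 4
console.log(joined[0]); // { index: '2018-05-01', price: 10, volume: null }
```

A full dataframe library would also offer inner, left and right joins, and would need a missing-value convention that its numeric methods understand.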

In my opinion, attempting to reproduce the pandas API is both a pro and a con. Leveraging the pandas API and name gives the project name recognition and a clear focus. However, JavaScript is not Python, and following the Python API may end up being a limiting factor. It could also be argued that the pandas dataframe API is somewhat bloated and should not be replicated.

Ubique

[https://github.com/maxto/ubique]

// Load the library (npm package "ubique")
var ubique = require('ubique');

// Set variables
var x = [0.003,0.026,0.015,-0.009,0.014,0.024,0.015,0.066,-0.014,0.039];
var y = [-0.005,0.081,0.04,-0.037,-0.061,0.058,-0.049,-0.021,0.062,0.058];
var z = [0.04,-0.022,0.043,0.028,-0.078,-0.011,0.033,-0.049,0.09,0.087];

// Concatenate x, y and z along columns, returns a matrix W with size 10x3
var W = ubique.cat(1,x,y,z);
// [ [ 0.003, -0.005, 0.04 ],
//   [ 0.026, 0.081, -0.022 ],
//   [ 0.015, 0.04, 0.043 ],
//   [ -0.009, -0.037, 0.028 ],
//   [ 0.014, -0.061, -0.078 ],
//   [ 0.024, 0.058, -0.011 ],
//   [ 0.015, -0.049, 0.033 ],
//   [ 0.066, -0.021, -0.049 ],
//   [ -0.014, 0.062, 0.09 ],
//   [ 0.039, 0.058, 0.087 ] ]

// Get statistics for matrix W
ubique.size(W) // size of the matrix
ubique.nrows(W) // number of rows
ubique.ncols(W) // number of columns
ubique.mean(W) // average value for columns
ubique.std(W) // standard deviation (sample)

Ubique is closer to a NumPy implementation than to a full dataframe implementation. It supports vectors and matrices but, crucially, does not support named columns. Ubique’s real strength is its implementations of many useful numeric and financial functions. Everything from kurtosis to the Sortino ratio is included. It is well tested but unfortunately no longer developed. One interesting design decision is that functions are not methods on the Matrix object itself, which allows a Matrix to have a simpler API. This may lend itself to picking and choosing only the functions that are needed, keeping the code size down.
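
The design choice of free functions over methods can be sketched as follows. This is an illustration of the pattern only, not Ubique's code: the `ncols` and `columnMeans` functions here are hypothetical stand-ins written from scratch.

```javascript
// A matrix is just an array of rows; there is no Matrix class to extend.
const W = [
  [1, 10],
  [2, 20],
  [3, 30],
];

// Each operation is a free function. If each lived in its own module,
// an application could import only the ones it uses.
function ncols(matrix) {
  return matrix.length === 0 ? 0 : matrix[0].length;
}

function columnMeans(matrix) {
  const sums = new Array(ncols(matrix)).fill(0);
  for (const row of matrix) {
    row.forEach((value, j) => {
      sums[j] += value;
    });
  }
  return sums.map((sum) => sum / matrix.length);
}

console.log(ncols(W)); // 2
console.log(columnMeans(W)); // [ 2, 20 ]
```

Because the data stays a plain array, new functions can be added without touching a class definition, and unused ones cost nothing at download time.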

Gauss

[https://github.com/fredrick/gauss]

var gauss = require('gauss');
var Collection = gauss.Collection;
var things = new Collection(
    { type: 1, age: 1 },
    { type: 2, age: 2 },
    { type: 1, age: 3 },
    { type: 2, age: 4 });
things
    .find({ type: 2 })
    .map(function(thing) { return thing.age; })
    .toVector() // Scope chained converter, converting mapped collection of ages to Vector
    .sum();

var numbers = new gauss.Vector([8, 6, 7, 5, 3, 0, 9]);
numbers.min();
// As above, but with a callback parameter
numbers.min(function(result) {
    console.log(result / 2);
    /* Do more things with the minimum */
});

Gauss has over 400 stars on GitHub but is no longer under active development. Its main utility is the Vector, a kind of enhanced JavaScript Array with added numeric methods. The Vector object provides a set of basic statistical methods that can be run on its values. Vector instances can be passed to various binary operations to allow multiplication, addition, etc. One feature worth noting is that each function accepts a callback for handling the result, which is a nice adaptation to JavaScript norms. The API also lends itself particularly well to method chaining, which can lead to nice, terse, readable code.
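
The two ideas in that paragraph, optional callbacks and method chaining, can be sketched with a small Array subclass. This is not Gauss's implementation; the `Vector` class and its `deliver` helper below are hypothetical, and returning the callback's result is an assumption of this sketch.

```javascript
// Hypothetical Vector: an Array subclass with a few numeric methods.
class Vector extends Array {
  // Pass the result to the callback when one is given, otherwise return it.
  static deliver(result, callback) {
    return typeof callback === 'function' ? callback(result) : result;
  }

  sum(callback) {
    const total = this.reduce((acc, value) => acc + value, 0);
    return Vector.deliver(total, callback);
  }

  min(callback) {
    return Vector.deliver(Math.min(...this), callback);
  }
}

// Vector.from avoids the Array constructor's single-number length quirk.
const numbers = Vector.from([8, 6, 7, 5, 3, 0, 9]);

// filter and map return Vector instances, so numeric methods chain naturally.
const doubledTotal = numbers
  .filter((n) => n > 4)
  .map((n) => n * 2)
  .sum();
console.log(doubledTotal); // 70

// The same method with a callback that receives the result.
numbers.min((result) => console.log('minimum:', result)); // prints "minimum: 0"
```

Subclassing Array keeps the familiar `filter` and `map` methods while adding the numeric ones, which is what makes the chained form read so tersely.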

Others worth mentioning

  • MathJS implements Matrix operations but has more of a linear algebra focus.
  • Crossfilter is not really a dataframe implementation but offers very fast filtering and reduction functionality on datasets. This can give a really nice interactive experience. Crossfilter uses sorted indexes for fast filtering, histograms and top N lists.
  • D3.js and Vega datalib. These might seem odd inclusions here but, despite their focus on visualising data, they can perform efficient filtering, reshaping and reduction operations. They are not dataframe implementations but have some overlapping use cases. The fact that these visualisation projects have needed to create their own data wrangling implementations shows the clear requirement for a solution in this space.
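
To illustrate why a sorted index makes filtering fast, here is a minimal sketch in plain JavaScript. It is not Crossfilter's code or API; the `buildIndex`, `lowerBound` and `filterRange` helpers and the sample records are hypothetical. The index is built once, after which a range filter needs only two binary searches rather than a scan of every row.

```javascript
// Hypothetical records to index.
const trades = [
  { id: 'a', quantity: 50 },
  { id: 'b', quantity: 5 },
  { id: 'c', quantity: 20 },
  { id: 'd', quantity: 35 },
];

// Build the index once: row positions sorted by the column's value.
function buildIndex(data, name) {
  return data
    .map((row, position) => position)
    .sort((i, j) => data[i][name] - data[j][name]);
}

// First slot in the index whose value is >= target (binary search).
function lowerBound(data, index, name, target) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (data[index[mid]][name] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Range filter: rows with min <= value < max, found without a full scan.
function filterRange(data, index, name, min, max) {
  const start = lowerBound(data, index, name, min);
  const end = lowerBound(data, index, name, max);
  return index.slice(start, end).map((position) => data[position]);
}

const byQuantity = buildIndex(trades, 'quantity');
const midSized = filterRange(trades, byQuantity, 'quantity', 10, 40);
console.log(midSized.map((row) => row.id)); // [ 'c', 'd' ]
```

The same index also yields top N lists directly, since the last N slots already hold the largest values.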

What next?

Clearly the implementation of dataframes in JavaScript is a relatively immature space. Given this immaturity and by contrast the maturity of dataframes in other languages, you may ask why even bother with JavaScript dataframes?

In my opinion, the most persuasive use case is a browser-based JavaScript dataframe API. With a JavaScript dataframe, the browser can load the data once and then reshape and analyse it on demand. This removes the need for superfluous round trips to the backend, allowing low-latency interactive GUIs to be built. JavaScript performance, even in mobile browsers, is now fast enough to support this kind of analysis on moderately sized datasets.

For modern component-based front-end frameworks, a commonly understood browser-based JavaScript dataframe would be a huge boon. We could create web component libraries to display, plot and explore tabular data without having to reinvent the data manipulation wheel each time. In Python, an ecosystem of libraries has grown up around the pandas and NumPy libraries. An ecosystem of display components could grow around a JavaScript dataframe API in a similar way.

For example, component-based JavaScript libraries such as Vue.js, React and Polymer could implement components with dataframe support:

<rangefilter data="myDataFrame" column="quantity"></rangefilter>
<datatable data="myDataFrame"></datatable>
<histogram data="myDataFrame" show-quartiles></histogram>

Given the above component examples and by taking the best bits from the discussed libraries we can collect some features that an ideal JavaScript dataframe would provide.

  • It would provide reshaping & groupby abilities - like pandas & pandas-js?
  • The ability to do fast filtering and indexing would be very useful - using the clever Crossfilter indexing?
  • Provide well tested numeric method implementations with support for missing data. This is a must have. Ubique provides a leading implementation at the moment.
  • The ideal solution would have a simple small API. Most methods should not be implemented as part of the dataframe itself, but rather as separate optional modules to help keep required download size small. This also helps avoid the API explosion that pandas suffers from. This should take the best bits from Ubique and Gauss?
  • A JavaScript dataframe should have a JavaScript-first API. It would use the best of JavaScript: a functional style, callbacks, promises, the option to use web workers, typed binary arrays, etc.
  • The data needs to get to the browser quickly. Fast from_json and to_json serialisation methods would help with this. In addition, the ability to deserialise binary data, avoiding the overhead of JSON encoding and decoding, would be beneficial.
  • Native Arrow support would be a possibility, as would support for the format used by our open source tool Arctic.
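
A few of these wishes can be sketched together in plain JavaScript: columnar storage in typed arrays, JSON round-tripping, and a numeric function that tolerates missing data while living outside the frame itself. Everything here is an assumption made for illustration, including the `fromJSON`, `toJSON` and `nanMean` names, the frame layout, and the use of NaN as the missing-value marker for numeric columns.

```javascript
// Hypothetical columnar frame: one Float64Array per numeric column,
// with NaN standing in for a missing value.
function fromJSON(text) {
  const records = JSON.parse(text);
  const names = records.length === 0 ? [] : Object.keys(records[0]);
  const columns = {};
  for (const name of names) {
    columns[name] = Float64Array.from(records, (record) =>
      typeof record[name] === 'number' ? record[name] : NaN
    );
  }
  return { length: records.length, columns };
}

// Serialise back to an array of records, writing missing values as null.
function toJSON(frame) {
  const records = [];
  for (let i = 0; i < frame.length; i += 1) {
    const record = {};
    for (const [name, values] of Object.entries(frame.columns)) {
      record[name] = Number.isNaN(values[i]) ? null : values[i];
    }
    records.push(record);
  }
  return JSON.stringify(records);
}

// A standalone function rather than a method, so it could ship as an
// optional module. Missing values (NaN) are skipped.
function nanMean(values) {
  let sum = 0;
  let count = 0;
  for (const value of values) {
    if (!Number.isNaN(value)) {
      sum += value;
      count += 1;
    }
  }
  return count === 0 ? NaN : sum / count;
}

const frame = fromJSON('[{"price":10,"quantity":1},{"price":null,"quantity":3}]');
console.log(nanMean(frame.columns.price)); // 10
console.log(nanMean(frame.columns.quantity)); // 2
console.log(toJSON(frame)); // [{"price":10,"quantity":1},{"price":null,"quantity":3}]
```

Typed arrays are also what make a binary transfer format plausible: a column stored this way can be sent as raw bytes and wrapped on arrival without any JSON parsing.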

This is quite a wishlist! I think pandas-js with a slight change in focus might get there. But a clean slate project using the best of each existing solution might be a better bet in the long run.
