Monthly Archives: August 2013

Transforming JavaScript JSON

Colt McAnlis posted a very interesting blog post (http://mainroach.blogspot.com/2013/08/json-compression-transpose-binary.html) this evening on using transposition to reduce JSON data size; his post was right on the money.

We have been using a similar technique for a couple of years now (although we use a different compression method over websocket, as gzip is too expensive in pure JavaScript).

However, one thing I commented on is that he stopped at step one, and taking a second step gives better results: it actually improves the compression.

I created my own "original dataset" to show this example.  The dataset is shown here in the blog with spaces and line breaks purely for formatting, to make it easier to read; all my numbers exclude those spaces and returns, as raw JSON wouldn't have them in it.

The original Data (265 Characters):

[{Id: 1, Name: 'Nathan', Address: 'Somewhere', Country: 'USA', City:'Here', State:'OK',Zip:'55555'},
 {Id: 2, Name: 'Colt', Address: 'Elsewhere', Country: 'USA', City: 'There', State: 'CA',Zip:'44444'},
 {Id: 3, Name: 'You', Address: 'Not Sure', Country: 'USA', City: 'Where', State: 'AZ', Zip:'33333'}]

Colt's Transposing (211 Characters):
{'Id':[1,2,3],
'Name':['Nathan','Colt','You'],
'Address':['Somewhere','Elsewhere','Not Sure'],
'Country':['USA','USA','USA'],
'City':['Here','There','Where'],
'State':['OK','CA','AZ'],
'Zip': ['55555','44444','33333']}
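The transposition above can be sketched in a few lines. This is only an illustration, not Colt's actual code: the function names toColumns and fromColumns are made up here, and the sketch assumes every record carries the same keys as the first one.

```javascript
// Turn an array of records into one object holding an array per column.
// Assumes every record has the same keys as the first record.
function toColumns(records) {
  const columns = {};
  if (records.length === 0) return columns;
  for (const key of Object.keys(records[0])) {
    columns[key] = records.map((record) => record[key]);
  }
  return columns;
}

// Inverse: rebuild the records from the column arrays.
function fromColumns(columns) {
  const keys = Object.keys(columns);
  const rowCount = keys.length ? columns[keys[0]].length : 0;
  const records = [];
  for (let i = 0; i < rowCount; i++) {
    const record = {};
    for (const key of keys) record[key] = columns[key][i];
    records.push(record);
  }
  return records;
}

const records = [
  {Id: 1, Name: 'Nathan', Address: 'Somewhere', Country: 'USA', City: 'Here', State: 'OK', Zip: '55555'},
  {Id: 2, Name: 'Colt', Address: 'Elsewhere', Country: 'USA', City: 'There', State: 'CA', Zip: '44444'},
  {Id: 3, Name: 'You', Address: 'Not Sure', Country: 'USA', City: 'Where', State: 'AZ', Zip: '33333'},
];

const columns = toColumns(records);
console.log(JSON.stringify(columns));
```

The receiving side calls fromColumns to get the familiar array of objects back.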

We transpose it into basically a JSON CSV (206 Characters):
[['Id','Name','Address','Country','City','State','Zip'],
 [1,'Nathan','Somewhere','USA','Here','OK','55555'],
 [2,'Colt','Elsewhere','USA','There','CA','44444'],
 [3,'You','Not Sure','USA','Where','AZ','33333']]
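A minimal sketch of this header-plus-rows transform follows. The names toTable and fromTable are invented for illustration and are not our production code; the sketch assumes the first record defines the column order for all the others.

```javascript
// Convert an array of records into [header, row, row, ...].
// Assumes the first record's keys define the column order.
function toTable(records) {
  if (records.length === 0) return [];
  const header = Object.keys(records[0]);
  const rows = records.map((record) => header.map((key) => record[key]));
  return [header, ...rows];
}

// Inverse: use the header row to turn each value row back into an object.
function fromTable(table) {
  const [header, ...rows] = table;
  return rows.map((row) => {
    const record = {};
    header.forEach((key, i) => { record[key] = row[i]; });
    return record;
  });
}

const sample = [
  {Id: 1, Name: 'Nathan', State: 'OK'},
  {Id: 2, Name: 'Colt', State: 'CA'},
];

console.log(JSON.stringify(toTable(sample)));
// prints [["Id","Name","State"],[1,"Nathan","OK"],[2,"Colt","CA"]]
```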

Now, every additional row of data in this dataset adds:

Original: 48 Characters of Static unchanging field definitions. (Ouch!)
Colt's: 7 Characters
Ours: 9 Characters
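Those per-row numbers can be worked out from the field names alone. The sketch below uses made-up helper names and counts unquoted keys, the way the original dataset is written above.

```javascript
const fields = ['Id', 'Name', 'Address', 'Country', 'City', 'State', 'Zip'];

// Array of objects: every row repeats "Key:" for each field, the commas
// between fields, the {} pair, and one comma separating it from the next row.
function objectRowOverhead(fields) {
  const keyChars = fields.reduce((sum, name) => sum + name.length + 1, 0);
  return keyChars + (fields.length - 1) + 2 + 1;
}

// Column arrays: a new row costs one extra comma in each column.
function columnRowOverhead(fields) {
  return fields.length;
}

// Header plus rows: commas between the values, the [] pair,
// and one comma separating the row from its neighbor.
function tableRowOverhead(fields) {
  return (fields.length - 1) + 2 + 1;
}

console.log(objectRowOverhead(fields)); // 48
console.log(columnRowOverhead(fields)); // 7
console.log(tableRowOverhead(fields));  // 9
```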

So how do we end up with better compression when, after a dozen or so records, our raw size is actually larger than Colt's?  Well, we only use [] and commas.  His data stream has {} and colons in addition to the [] and commas.  Because our stream has more redundancy, it compresses better.

Wait, there is another easy savings if you think about the data...  Why send the header row?  If you already know the layout of what you are requesting, you can eliminate the header row entirely, which shrinks your "raw" data by another 55 characters.  That means we start out at a small 151 characters.
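As a rough sketch of the headerless variant: both sides agree on the column layout ahead of time, so only the value rows travel over the wire. LAYOUT, packRows and unpackRows are hypothetical names used only for this example.

```javascript
// Hypothetical layout that sender and receiver both know in advance.
const LAYOUT = ['Id', 'Name', 'Address', 'Country', 'City', 'State', 'Zip'];

// Sender: emit only the values, in layout order. No header row is sent.
function packRows(records, layout) {
  return records.map((record) => layout.map((key) => record[key]));
}

// Receiver: reattach the field names using the agreed layout.
function unpackRows(rows, layout) {
  return rows.map((row) => {
    const record = {};
    layout.forEach((key, i) => { record[key] = row[i]; });
    return record;
  });
}

const records = [
  {Id: 1, Name: 'Nathan', Address: 'Somewhere', Country: 'USA', City: 'Here', State: 'OK', Zip: '55555'},
  {Id: 2, Name: 'Colt', Address: 'Elsewhere', Country: 'USA', City: 'There', State: 'CA', Zip: '44444'},
];

const wire = JSON.stringify(packRows(records, LAYOUT));
const restored = unpackRows(JSON.parse(wire), LAYOUT);
console.log(wire);
```

The trade-off is that the layout becomes part of the contract between client and server: if a column is added or reordered on one side, the other side has to change with it.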

So if you are dealing with straight raw characters, Colt's method is actually smaller (after about 30 rows).  However, if you are going to compress the stream, the additional redundancy in our transformation appears to be better suited to producing smaller compressed output.
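One way to check this against your own data is to measure both the raw and the compressed size of each layout. The sketch below uses gzip from Node's built-in zlib module purely as a stand-in, since our real compression method is a different one; makeRecords builds made-up test data, and all of the function names are assumptions for this example. The numbers it prints depend entirely on the data you feed it.

```javascript
const zlib = require('zlib');

// Build n made-up records shaped like the example dataset.
function makeRecords(n) {
  const records = [];
  for (let i = 1; i <= n; i++) {
    records.push({
      Id: i, Name: 'Name' + i, Address: i + ' Main St', Country: 'USA',
      City: 'City' + (i % 10), State: 'OK', Zip: String(10000 + i),
    });
  }
  return records;
}

// Column arrays keyed by field name.
function toColumns(records) {
  const columns = {};
  for (const key of Object.keys(records[0])) {
    columns[key] = records.map((record) => record[key]);
  }
  return columns;
}

// Header row followed by one array of values per record.
function toTable(records) {
  const header = Object.keys(records[0]);
  return [header, ...records.map((record) => header.map((key) => record[key]))];
}

// Raw character count and gzipped byte count of the serialized value.
function measure(value) {
  const raw = JSON.stringify(value);
  return {raw: raw.length, gzip: zlib.gzipSync(Buffer.from(raw)).length};
}

function compare(n) {
  const records = makeRecords(n);
  return {
    objects: measure(records),
    columns: measure(toColumns(records)),
    table: measure(toTable(records)),
  };
}

console.log(compare(100));
```

Note that JSON.stringify quotes the keys, so the raw size of the array-of-objects layout comes out a little larger than a count based on unquoted keys.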

Measure everything and think about how you actually use your data.  How you send your data can make all the difference in how fast your app actually responds to requests, because Performance Matters.

Announcing fluentReports

https://github.com/Nathanaela/fluentreports

Fluent Reports is a reporting engine written for a project that should see widespread public use toward the end of the year.  But beyond that, mum is the word.  The Kellpro management has given me permission to discuss certain technologies we are using and to open source some of the modules we have developed, to give back to the community, just as we have used several open source libraries for our project.  In our github account you can find several modules we are using that we have enhanced and/or submitted bug reports for.  But this is our first module that is completely 100% home grown by the developers at Kellpro, Inc.  Internally it is called the "REPORTAPI".  Since that is just so, well, lame, I am giving it a new name for the world at large: fluentReports (fR)!

There are a couple of "minor" things in fR that are very specific to our project.  They will remain in this code base for now, as it is easier for us to maintain our project and keep this easily synced if I can do a diff/copy/paste from our internal system to this github repository.  So if you see weird things that seem out of place, like the function "lowerprototypes", well, they are; and maybe, just maybe, someone will create a minor build script that removes them from the minified version.

Features:

  • Completely Data Driven.  You pass in the data; you tell it easily how to print the data, and it generates the PDF report.
  • Headers, Footers, Title Headers, Summary Footers
  • Grouping, nested grouping, and even more nested grouping, ...
  • Auto-Summing (and other automatic totals like max/min/count)
  • Sane defaults, and the ability to easily override not only the defaults but pretty much every aspect of the report generation.
  • Images, Gradients, Text, Fonts, Lines, and many other PDF features supported.
  • Page-able data loading
  • Sub-Reports, Sub-Sub-Reports, etc...
  • Bands (Tables) & Suppressed Bands (w/ column wrapping or column clipping)
  • Free Flow Text
  • Ability to override each part of the report for total customization of your report
  • Fluent API
  • Ability to put data over images, gradients, etc.
  • Quickly generate complex reports with minimal lines of code.

We are using PDFKit as the PDF generation library.  There is currently only one bug that we know about and can't work around; it should be rare, and an open bug ticket has been submitted to PDFKit with the fix, so hopefully it will be fixed before you even get to play with the library.

Please note the examples are very simple.  I've had the report engine working for about a year now and kept meaning to release it.  Finally I got some "spare" time to polish up the examples a bit, get the domain name running, and get it actually committed to github.