Friday, December 08, 2006

Mac OSX Matrex start

On the Matrex SourceForge pages, Kusashi Nameta (JPN) reports a bug: the Mac OSX version of Matrex does not start correctly.

The problem is that I can test the system only on Windows and Linux, so things like that can happen. There should be someone testing the Mac version before it is published.

The solution, as you can see in my comments, is to add the -XstartOnFirstThread option to the java command.
To do that, edit the matrex_macosx.sh file in the Matrex directory and replace:

java

with:

java -XstartOnFirstThread

This should make it work correctly.
I'll publish a patched version of the Mac OSX setup file myself this weekend.

Sunday, November 26, 2006

What now?

Plans

After the release of Matrex version 1.0, it is time to make some plans for the future.

For now, I have started writing the specification for the client/server architecture. This should start the process that leads to version 2.0 of Matrex.

After that:
  • some adapters must be updated: I'm thinking about the adapter to JRuby, which must at least be tested with the latest version of JRuby, and the adapter to Scilab, which will soon reach version 4.1.
  • as usual I will try to test the adapter to Matlab, which has been ready for one year, but was never tested because I don't have Matlab. If someone wants to help, please contact me.
  • I will try to produce an adapter for the R statistical language.
  • start of the work to produce version 1.1.

Next versions

First of all, this is how I will number the next versions of Matrex:


  • A change in the first number means a major change: 2.0 will introduce the client/server architecture.
  • A change in the second number means new important (visible) features: 1.1 will introduce the package to interact with Excel, the help framework and changes in the charts.
  • A change in the third number means bug fixes and minor feature changes: if a version 1.0.1 is released, it will fix bugs in version 1.0.
I think these are the features to add in the next versions:
  • Package for the interaction with Excel: should contain import/export of Excel spreadsheet files, communication Excel->Matrex and Matrex->Excel via Java/COM.
  • Package for the interaction with OpenOffice.org: communication OpenOffice.org->Matrex and Matrex->OpenOffice.org (import/export can be done through .xls files).
  • Charts new features: scatter plot chart, possibility to have bars and lines in the same chart, chart attributes like line or bar type, vertical/horizontal bars.
  • Context help. Convert the "How To Use" document to a context help in the Matrex desktop application.
  • Add a first phase to the expression parser, in which expressions of the type "a + b" (+,-,*,/,^) are converted to expressions of the type "plus(a,b)". This makes the expression parser simpler to understand.
  • Zip/unzip of projects, so that they can be easily sent to other users.
  • Custom dialogs to enter function parameters. They are needed for complex functions like query, csv and, in the future, link.
The idea is to publish these features in a set of 1.x versions (1.1, 1.2 ...).
These versions should be released every 2 to 3 months to keep up interest in the product.
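
The planned first parser phase from the feature list above, which rewrites infix operators as function calls, could look roughly like the following sketch. It is not Matrex code: only the name plus comes from the text, while minus, times, divide and power, the class name InfixToCalls and the precedence rules (^ binds tightest and is right-associative) are assumptions made for illustration. Function calls inside the input, such as sin(a), are not handled.

```java
// Hypothetical sketch: rewrites infix expressions into nested function calls.
final class InfixToCalls {
    private final String src;
    private int pos;

    private InfixToCalls(String src) {
        this.src = src.replaceAll("\\s+", ""); // whitespace carries no meaning here
    }

    /** Converts e.g. "a + b" into "plus(a,b)". */
    static String convert(String expression) {
        InfixToCalls parser = new InfixToCalls(expression);
        String result = parser.sum();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return result;
    }

    // sum := product (('+' | '-') product)*
    private String sum() {
        String left = product();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            String right = product();
            left = (op == '+' ? "plus" : "minus") + "(" + left + "," + right + ")";
        }
        return left;
    }

    // product := power (('*' | '/') power)*
    private String product() {
        String left = power();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            String right = power();
            left = (op == '*' ? "times" : "divide") + "(" + left + "," + right + ")";
        }
        return left;
    }

    // power := atom ('^' power)?   (right-associative)
    private String power() {
        String base = atom();
        if (peek() == '^') {
            pos++;
            return "power(" + base + "," + power() + ")";
        }
        return base;
    }

    // atom := '(' sum ')' | name or number
    private String atom() {
        if (peek() == '(') {
            pos++;
            String inner = sum();
            if (peek() != ')') {
                throw new IllegalArgumentException("Missing ')' at position " + pos);
            }
            pos++;
            return inner;
        }
        int start = pos;
        while (pos < src.length() && isOperandChar(src.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Operand expected at position " + pos);
        }
        return src.substring(start, pos);
    }

    private static boolean isOperandChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '.';
    }

    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    public static void main(String[] args) {
        System.out.println(convert("a + b"));
        System.out.println(convert("a + b * c ^ 2"));
    }
}
```

After this phase the rest of the parser only has to deal with one syntax, nested function calls, which is the simplification the feature aims for.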

The first major version will be Matrex 2.0, which will introduce the client/server Matrex.

Version 2.1 will (probably) introduce links between projects (also between projects running in different machines).

No more M1, M2 versions will be published; an RC1 version (or more if needed) will be released before the final version to get some feedback.

Friday, November 17, 2006

Matrex 1.0 final released!

Finally, after weeks of testing, Matrex 1.0 final has been released!
The most important changes are the following:

  • Expression parser: the previous version was complex to use, especially in the last step, where you had to choose the names of all the matrices and functions generated.
    With this version the last step requires only the names of the primary (root) matrix and function. The intermediate matrices and functions get a default name and are stored in a special package.

    Also, in the first step, two buttons have been added to select templates and matrices and add them to the input expression; so if you don't remember a template name or its parameters or a matrix name you get help.



  • Matrix editor: removed side effects when editing cells using the mouse; the editor now also saves the last edited cell content; better number and date parsing.
  • Added several function templates, among them:
    • sort, to sort several matrices using the first matrix, called the key, as index (like the key/value pairs of a map).
    • sumby and related templates, similar to the select sum/group by functionality in SQL.
    • tail, to tail a matrix with the content of another one.
    • queue, to use a matrix as a queue, to which you queue values of other matrices.

  • Removed every dependency of the Matrex API on the Matrex GUI, which finally allows Matrex to be used as a library. In this way a Matrex project can be opened and used by any (Java) application.
  • Presentation editor: better and easier interface. The format/position editor has been moved to a separate dialog and replaced by a format/position viewer.

  • Presentation viewer: added vertical header.




    Also, it uses virtual tables. It is possible to view/edit large presentations without performance losses.

  • Matrix viewer: added vertical header.

  • Added a template full-text search dialog to find templates by their description.







Thursday, November 16, 2006

Matrex 1.0 final

I will probably release Matrex 1.0 final tonight.
After very thorough testing (GUI, events, templates) I think Matrex is ready for production.
This version has a few but important changes from RC1: an easier-to-use and more professional GUI, more function templates and the Matrex API.

Wednesday, October 18, 2006

RMI for Client/Server?

Matrex 1.0 final is almost ready, so I'm thinking about the future of the system.
As you know, there will be at least two main versions after 1.0:
  • 1.1, with improvements in various areas of the desktop application.
  • 2.0, which will introduce the client/server architecture.
With version 2.0, it will be possible for a Matrex desktop (or client) to connect to a Matrex server.
The server will host Matrex projects shared among several clients.

Matrex was designed from the start to be configurable as a client/server architecture, so introducing the new architecture should not be very difficult.

Still, some problems remained unresolved for a while, for example which protocol to use to connect client and server.
To choose it, I considered these facts:
  • The clients and servers can reside on a LAN; it does not make sense to require internet connections.
  • At least in a first stage, only Matrex Java clients will connect to the servers. If other programs need to use Matrex to calculate, they can use the Matrex API directly.
  • Performance is very important. Users expect the system to calculate fast even when it works over a network.
Here are the candidates:
  • SOAP
  • CORBA
  • Jini
  • RMI
The advantage of CORBA and SOAP is their compatibility with different platforms and languages, but this is not needed.
SOAP messages can pass through a firewall, but since Matrex just needs to work in a LAN, this is not needed either.
On the other hand, SOAP is the slowest protocol and CORBA is losing popularity.

Jini gives high reliability and performance, but it is very complex. Using Jini for Matrex is like hunting birds with tanks.

RMI is the right choice for Matrex. It is certainly not the most recent (it was already available in Java 1.1), but it works fine, is fast and requires less coding, because it is made exclusively for Java.
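
To give an idea of how little code RMI needs, here is a minimal sketch with a remote interface, its implementation, and a client that looks it up through a registry. Everything specific in it is an assumption for illustration, not the planned Matrex 2.0 design: the interface MatrixService, its plus method, the registry name "matrex" and the use of port 1099 on localhost. For brevity, server and client run in the same process.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical sketch of an RMI service; none of these names exist in Matrex.
final class RmiSketch {

    /** Remote interface: every method must declare RemoteException. */
    public interface MatrixService extends Remote {
        double[] plus(double[] a, double[] b) throws RemoteException;
    }

    /** Server-side implementation; runs in the server JVM. */
    public static final class MatrixServiceImpl implements MatrixService {
        @Override
        public double[] plus(double[] a, double[] b) {
            if (a.length != b.length) {
                throw new IllegalArgumentException("Vectors must have the same length");
            }
            double[] result = new double[a.length];
            for (int i = 0; i < a.length; i++) {
                result[i] = a[i] + b[i];
            }
            return result;
        }
    }

    public static void main(String[] args) throws Exception {
        // Server side: export the object and publish its stub in a registry.
        MatrixServiceImpl impl = new MatrixServiceImpl();
        MatrixService stub = (MatrixService) UnicastRemoteObject.exportObject(impl, 0);
        Registry registry = LocateRegistry.createRegistry(1099);
        registry.rebind("matrex", stub);

        // Client side: look the service up and call it like a local object.
        Registry clientRegistry = LocateRegistry.getRegistry("localhost", 1099);
        MatrixService remote = (MatrixService) clientRegistry.lookup("matrex");
        double[] sum = remote.plus(new double[] {1, 2}, new double[] {3, 4});
        System.out.println(sum[0] + " " + sum[1]);

        // Unexport both objects so the JVM can exit.
        UnicastRemoteObject.unexportObject(impl, true);
        UnicastRemoteObject.unexportObject(registry, true);
    }
}
```

The arrays are passed by value (serialized), so for big matrices the network transfer, not the RMI call itself, is likely to dominate the cost.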

Monday, October 02, 2006

SWT Wizard Component

I extracted one class and two interfaces from the Matrex code; together they implement a generic wizard with the SWT GUI library.
It is used to implement the expression parser in Matrex.



You can find the code under the wizard directory in the Matrex CVS repository.
Matrex is licensed under the GPL, but for the wizard code the license could perhaps be changed to LGPL.

Tuesday, September 26, 2006

Function Templates list

I added to the Matrex site a list of all function templates with:
  • name
  • description
  • input matrices
  • output matrices
  • parameters
I hope this will help to:
  • quickly find which template you have to use to write your function
  • find templates that are needed but missing
  • find errors or imprecisions in the template descriptions

SWT: table vertical header

Matrex RC1 introduced a vertical header column in the matrix editor table, to show the row index or, if needed, to show a vector as a vertical ruler.

The SWT table does not support a vertical header column of the kind Matrex needs, i.e. one that:
  • is a fixed column in the table (it always remains visible when scrolling horizontally)
  • has cells with a raised border to distinguish it from the other columns.

In version RC1 I decided not to lose too much time checking whether there was a workaround, so I simply used the first column of the table as vertical header and gave it a gray color to distinguish it from the others.

In version 1.0 final I want to have a vertical header in two other tables, the matrix viewer and the presentation viewer, just to show the row index.
Therefore I started to look for a better solution than the one implemented in version RC1.

So I tested SWT snippet 2, which should solve the problem by using a separate table with only one column, synchronized with the original one, as vertical header.
This is a good idea, but it has these problems:
  • the vertical header table appears with a vertical scrollbar, which makes it look disconnected from the main table. The scrollbar can be hidden, but I'm not sure this can be done on every platform.
  • the cells have no raised border.
  • sometimes the rows in the two tables appear slightly unsynchronized. This happens because the main table shows its first row only partially.
These problems (especially the last one) make this solution not very usable and definitely do not give a professional look to the application.
Therefore, for now I am keeping the solution used in version RC1. I'll see whether other possibilities come up in the future.

Monday, September 18, 2006

Again the Matrex database project

It looks like in the currently published version of matrexdb (in SourceForge) there is a logical error: the result column of the position presentation contains wrong values.
I will publish a new version of matrexdb this week with (hopefully) correct values.
This version will also contain a readme file with more details about the structure of the testdb project.

Sunday, September 10, 2006

Matrex 1.0

I'm working on the first production version of Matrex 1.0.
The changes from version 1.0 RC1 will probably be:
  • Virtual tables on presentations, to be able to show presentations based on very big matrices/vectors.
  • Row header for matrix viewer and presentation viewer.
  • Simplification of the Expression Parser when saving the produced matrices and functions. It will ask for the names of the final matrix but not of the intermediate ones.
  • New function templates (e.g. sumby, sort, asinh...). I'm trying to cover the most common needs.
  • Possibility to create matrices from the clipboard content, so you build a matrix directly from the copied block of a spreadsheet.
  • Better function debugging.
  • Better terminology (less X,Y more columns,rows).
If you have other ideas or needs please let me know!

Thursday, September 07, 2006

The Matrex database project

Yesterday I published a Matrex sample project, matrexdb. You can find it under the Matrex download page on SourceForge.
I published the project zip file separately because it is big (around 2.5 MB), since it contains an entire database and the embedded McKoi library to access it from Matrex.

The database contains the data of a fake chocolate market (oh yeah), with products, spot and closing prices and more than 4000 contracts.
The database has been generated automatically by a Groovy script (also in the zip file), since it was impossible to find free data to build it. Even if fake, the data looks realistic.

The project queries the database using JDBC and contains a presentation (something like a read-only spreadsheet) showing the positions built with the contracts in the database (average prices, results).
The project contains also time charts showing the spot prices and the closing prices of two products.

The project uses the function templates:
  • query to query the database
  • sumby to build the positions by product
  • lookup and at to connect the contracts with their products hours.
I think it is a good demonstration of the features of Matrex.
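
For readers who have not used sumby, the idea of summing values grouped by a key (as in SQL's select sum ... group by) can be sketched as follows. This is only an illustration of the concept: the real template works on Matrex matrices and its inputs and outputs may differ; the class SumBySketch, the method signature and the sample data are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a "sum by key" operation; not the Matrex template itself.
final class SumBySketch {

    /** Sums values[i] per keys[i], keeping keys in order of first appearance. */
    static Map<String, Double> sumBy(String[] keys, double[] values) {
        if (keys.length != values.length) {
            throw new IllegalArgumentException("keys and values must have the same length");
        }
        Map<String, Double> totals = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            totals.merge(keys[i], values[i], Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up contract quantities per product.
        String[] product = {"dark", "milk", "dark", "white"};
        double[] quantity = {10.0, 5.0, 2.5, 1.0};
        System.out.println(sumBy(product, quantity));
    }
}
```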

Wednesday, September 06, 2006

FindBugs

Since I need to find as many bugs as possible in Matrex before version 1.0, which is supposed to be stable, I tried FindBugs (http://findbugs.sourceforge.net/), with success.

FindBugs finds possible bugs in your code.
Or better, as the FindBugs site says: "FindBugs looks for bugs in Java programs. It is based on the concept of bug patterns. A bug pattern is a code idiom that is often an error."

FindBugs can be used as a standalone application or as an Eclipse plugin. Since Matrex is an Eclipse project I tried both installations. The Eclipse plugin is easier for me to use, because I don't need to configure anything, just install it and use it on my project.
This is how it looks:



You can see that the potential bugs are listed in the Problems view at the bottom; clicking on a bug line gives you a good explanation of the problem in the Bug Details view on the left.


FindBugs does not claim to be infallible. The idea is to find areas in your code that can potentially produce problems. So it can happen that some of the potential bugs FindBugs found in your project are false positives, i.e. not bugs. When you can see that one of the bug patterns is a false positive in all cases, you can remove it from the list of patterns FindBugs checks by changing the configuration:



As I said, my experience with FindBugs in my project Matrex is positive.
I should say that the current version of Matrex is a beta, but it has been tested several times, so I did not expect to find too many errors.
FindBugs found around 200 possible bugs, which is a considerable number, but not too many to check one by one.
These are the bugs it found:

  • EI2: public method exposed internal representation: FindBugs found this bug several times in the code of Matrex.
    One of the examples is in the Matrix class.
    The Matrix class is a light wrapper around a double array that adds some features to it, among them thread synchronization.
    The array can be directly accessed from outside a Matrix object using the getValues method.
    Another method, setValues, has to be used to change the array.
    But some code could potentially access the array using getValues and change its content, causing inconsistencies.
    FindBugs proposes that getValues return a copy of the array instead of the array itself. That's generally a good solution, but it cannot be applied to Matrex, which needs high performance, especially when accessing matrices.
    So in this case all I could do was add a comment to the getValues method explaining how to use it.
    Since none of the warnings of this kind are bugs in Matrex and the code cannot be changed to remove them, I removed EI2 from the list of bug patterns in the configuration.

  • IS2: inconsistent synchronization:
    In some Matrex classes some of the methods were synchronized, some not.
    That is actually very dangerous and I fixed it.

  • MS: is not final but it should be:
    I forgot to declare some static constants as final. I don't find that dangerous because I know what is supposed to be constant in the Matrex code, but if more developers work on the same code this could cause problems.
    I fixed it.

  • RCN: null check of previously referenced value:
    I was calling a method of an object, then I was checking if the object was null (which is totally meaningless). In this case, I looked at the code and I could see that the object could not possibly be null. So, I removed the check. Nothing dangerous but confusing for someone reading the code.


  • SIC: should be a static inner class (was not static):
    There were some cases in Matrex in which a class had an inner class that was not declared static, even though it could be. No danger, but better to have clean code.
    I fixed it.

  • URF: unread field:
    I use the java.util.logging API to trace Matrex. This requires declaring a static log variable in the class that needs to be traced. It happens that sometimes the variable is declared but never used. Again no danger, but better to have clean code.
    I removed the redundant declaration.
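
The exposed-representation warning from the first item of this list is easier to see in code. The sketch below is not the Matrex Matrix class: the class GuardedMatrix, the one-dimensional array and the getValuesCopy method are assumptions used to contrast the fast accessor that returns the internal array with the defensive copy that FindBugs suggests.

```java
// Hypothetical sketch of the trade-off between speed and encapsulation.
final class ExposedArraySketch {

    static final class GuardedMatrix {
        private double[] values;

        GuardedMatrix(double[] values) {
            this.values = values.clone();
        }

        /** Fast: returns the internal array. Callers must treat it as read-only. */
        synchronized double[] getValues() {
            return values;
        }

        /** Safe: returns a copy, at the cost of one allocation and copy per call. */
        synchronized double[] getValuesCopy() {
            return values.clone();
        }

        /** The intended way to change the content. */
        synchronized void setValues(double[] newValues) {
            this.values = newValues.clone();
        }
    }

    public static void main(String[] args) {
        GuardedMatrix m = new GuardedMatrix(new double[] {1.0, 2.0});

        // Writing through the exposed array bypasses setValues and its lock.
        m.getValues()[0] = 99.0;
        System.out.println(m.getValues()[0]);

        // Writing to the copy leaves the matrix untouched.
        m.getValuesCopy()[1] = -1.0;
        System.out.println(m.getValues()[1]);
    }
}
```

With large matrices that are read very often, the copy in getValuesCopy is the cost that makes the suggested fix unattractive, which is why documenting the contract can be a reasonable compromise.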

These potential bugs, especially IS2 and RCN, are dangerous. So, I think FindBugs worked well, and I will re-examine the code of Matrex every time I release a new version of it.

The only thing I can ask for is the possibility to exclude the checking of a potential bug category (like EI2) only for a certain set of classes: it can be that I know that these classes are correct even if FindBugs finds the bug pattern in them, but I still want to check the rest of the code for it.

Sunday, August 27, 2006

Virtual tables

In version 1.0 RC1 I used a feature of the SWT library, the virtual tables (SWT.VIRTUAL), in the matrix viewer.
This feature allows the matrix viewer to show a matrix with thousands of rows and columns with very good performance.
The idea is simple: since a table shows only a few rows at a time, you don't need to load the whole matrix into the table in one shot; you need to load only the rows that are shown at that moment.

My plan is to use the same feature also in the presentation viewer in a future version of the system (1.0 RC2?).

In this entry, I want to explain how I used the virtual tables in the matrix viewer.
For the implementation you can look in the CVS tree for the classes:

  • matrex.gui.viewer.matrix.MatrixViewer, the viewer

  • matrex.gui.viewer.matrix.MatrixViewerCache, the cache for the viewer, containing the current version of the matrix (in other words, the viewer's model)



I based my code on the article SWT - Virtual Tables Tutorial.
As in the tutorial, I use the SWT.SetData event to fill the rows of the table. When the table shows a new row, it triggers a SWT.SetData event; in this event the code fills the row with data.
Unlike in the tutorial, the matrix shown in the table can change its content at any time, and when this happens the table content must be updated.

Each time the matrix is updated, the code checks how big the change is by examining a small sample of items of the matrix:

  1. If only a few items of the matrix have changed, only those items need to be updated in the viewer.

    While testing, I discovered that when you retrieve a row that the table has not shown yet with table.getItem, the row is not null, but each of its items is empty, and the table does not try to populate it by triggering an SWT.SetData event.
    Therefore it does not make sense to update a row that was not shown before.

    So, the viewer cache keeps an array that tells, for each row, whether the row was shown before (i.e. whether an SWT.SetData event has been triggered for it).
    When an item of a row needs to be updated:


    • If the row was shown before, it is updated immediately with a table.getItem(...).setText(...) call.

    • Otherwise nothing is done: if the table later shows the row, it triggers the SWT.SetData event for that row, which reads its current, updated content from the cache.



  2. Otherwise the table is cleared and filled again with the new matrix; as we saw, performance is good in this case too, because the only rows that are loaded are the ones that are shown.
    To clear the table I used the following statements:

    table.clearAll();
    table.redraw();
    table.setItemCount([number of rows of the matrix]);

    The last statement triggers the SWT.SetData events for the shown rows.
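
The bookkeeping described in the two cases above can be sketched without any SWT dependency. The class below is not MatrixViewerCache; its name, its methods and the use of strings for cell content are assumptions. In a real viewer, onRowRequested would be called from the SWT.SetData listener, a true result from onCellChanged would be followed by table.getItem(row).setText(column, text), and onTableCleared would accompany the clearAll/setItemCount sequence.

```java
import java.util.Arrays;

// Hypothetical sketch of a cache that tracks which virtual-table rows were shown.
final class VirtualRowCacheSketch {
    private final String[][] cells;  // current matrix content (the viewer's model)
    private final boolean[] shown;   // true once the table has asked for the row

    VirtualRowCacheSketch(int rows, int columns) {
        cells = new String[rows][columns];
        for (String[] row : cells) {
            Arrays.fill(row, "");
        }
        shown = new boolean[rows];
    }

    /** The table is about to display a row: remember that and return its content. */
    String[] onRowRequested(int row) {
        shown[row] = true;
        return cells[row].clone();
    }

    /**
     * A matrix item changed. The cache is always updated; the return value tells
     * whether the table item must be refreshed now (row already shown) or whether
     * the change will be picked up later, when the row is requested.
     */
    boolean onCellChanged(int row, int column, String text) {
        cells[row][column] = text;
        return shown[row];
    }

    /** The table was cleared for a full reload: no row counts as shown anymore. */
    void onTableCleared() {
        Arrays.fill(shown, false);
    }

    public static void main(String[] args) {
        VirtualRowCacheSketch cache = new VirtualRowCacheSketch(1000, 3);
        cache.onRowRequested(0);                                  // row 0 becomes visible
        System.out.println(cache.onCellChanged(0, 1, "3.14"));    // shown row: refresh now
        System.out.println(cache.onCellChanged(500, 1, "2.71"));  // hidden row: nothing to do
        System.out.println(cache.onRowRequested(500)[1]);         // later display sees the update
    }
}
```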



I tested this solution with small and big matrices, small and big changes, and it seems to work correctly and with good performance.

The Matrex API

From version RC1, the Matrex executable code is divided into 3 jar files:

  • matrex_api.jar containing the Matrex underlying structure, engine and file system interaction

  • matrex_fun.jar containing the java executable code for the function templates

  • matrex_gui.jar containing the code for the GUI



matrex_fun.jar already existed in version M2 (it was called function.jar), while matrex_api.jar and matrex_gui.jar were merged in a single jar file, matrex.jar.

So what's the big deal?

Now you can use matrex_api.jar (probably together with matrex_fun.jar) as a library in your own application.

For example if you have an application that needs to calculate a special function you can:

  1. start Matrex and build a project that calculates the function (with several matrix and function items)

  2. let the application use that project (through matrex_api.jar and matrex_fun.jar) to calculate the function



In the future I will produce:

  • Javadoc documentation of matrex_api.jar

  • a guide for building an application that uses it

Matrix editor: custom headers

Often in a spreadsheet you need to work at the same time with two (or more) vectors that are connected by a key-value relationship.
For example:

  • days in a period and the quantity of rain in these days

  • categories of people with the average number of books they read in a year

  • cities with their population

  • bank accounts and the amount of money in them



In a spreadsheet you normally write them as two consecutive columns or rows.
In this way you can use the static vector (days, categories, cities, bank accounts) as a ruler for the value vector (rain, books, population, money).

In Matrex that was not possible before version RC1. Matrex uses a "spreadsheet" concept only in the data presentation; to edit the vectors/matrices, it uses a matrix editor:



The editor can only edit one single vector/matrix, not two at the same time.

In version RC1 I solved the problem using customized headers.

In the editor popup menu, I added the Set header menu. It has the sub-menus:

  • Set vertical header

  • Set horizontal header



With the first sub-menu (vertical) you select a matrix in the project. The first, static, gray column will show the content of the first column of the selected matrix:



With the second sub-menu (horizontal) you change the content of the first, static, gray row so that it shows the content of the first row of another matrix selected in the project.

In this way you can use the header data as a ruler to write the content of the edited matrix.

Navigation

Matrex RC1 has a new feature, named navigation.



Navigation is based on the info window, which appears when you click the info menu (Ctrl-I) on a tree item (matrix, function, chart ...).

The info windows are not new: they show some textual information about the item and the lists of items to which it is connected.
For example, the info window related to a matrix shows the function that calculates the matrix, the functions calculated with that matrix, the chart showing that matrix and the presentations containing that matrix.

What is new is that each connected item displayed in the info window has a popup menu (view, edit, info).
Using the info menu you can navigate the connections of the project items: for example you can open the info window of a function, from it open the info window of one of the matrices that it calculates, from it open the info window of a chart that shows that matrix.

In this way you can, for example, know which items in the project change if you change the content of a matrix.

It is a powerful feature to understand the structure of a project.

The Matrex idea

I work with Excel in my job.
Generally we use Excel to calculate formulas on data extracted from a database and present everything as sheets and charts.
This is how we do it:
- write the SQL query
- get the result set in a sheet
- write the formulas for the cells of the result set's first row
- copy the formulas for all the other rows of the result set

It is easy to copy formulas from one cell to another in Excel, but why do we need to do it at all?
Would it not be easier to get the result set as one vector for each column, and calculate the formulas on these vectors, not on sheet cells?

So I wrote Matrex.



Matrex is equivalent to a spreadsheet, but works with vectors, not cells.

So, to do the same in Matrex, we need to:
- write a function that produces vectors/columns from an SQL query
- apply the functions/formulas on the resulting vectors (only once).
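
The difference from copying a cell formula down a column can be sketched in a few lines: the formula is written once and applied to whole column vectors. The class VectorFormulaSketch, the times function and the sample price and quantity columns are assumptions for illustration, not part of Matrex.

```java
// Hypothetical sketch: one formula applied to whole columns instead of cell by cell.
final class VectorFormulaSketch {

    /** Element-wise product of two column vectors of equal length. */
    static double[] times(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double[] result = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            result[i] = a[i] * b[i];
        }
        return result;
    }

    public static void main(String[] args) {
        // Two columns as they might come back from an SQL query (made-up data).
        double[] price = {2.0, 3.0, 4.5};
        double[] quantity = {10.0, 20.0, 2.0};

        // The formula is written once, regardless of the number of rows.
        double[] amount = times(price, quantity);
        for (double value : amount) {
            System.out.println(value);
        }
    }
}
```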

Working with Matrex you also have these advantages:
- you can name vectors/matrices and functions (formulas)
- multithreading
- in the future, client/server and distributed calculation