The first part of the study focuses on product features. We downloaded every system, using both the binary distribution and the source code distribution where possible. We installed and configured each system and tried to set up equivalent configurations in each product. Identical configurations were neither possible nor desirable because the products differ significantly in design and philosophy. Our primary goal was therefore not to force each product to do the same thing in the same way. Instead we tried to use the approach that is natural for each project.
Both functional and non-functional features and aspects of the systems were evaluated. The majority of the features are functional because functionality matters most for a practical IDM deployment. We also evaluated non-functional features such as performance, user interface response times and scalability.
Where possible, all the evaluated systems were tested on the same hardware to ensure consistency. Where that was not possible, we did our best to use equivalent environments for the testing.
We used a point system for the evaluation and awarded 0-5 points for each feature. We chose this system because a simple binary (yes/no) system does not make sense for evaluating identity management systems. All the evaluated IDM systems are flexible and programmable to some degree. Even if a specific feature is not supported out-of-the-box, it can usually be supported at least partially with custom code. The products are also open source, so it is possible to modify the source code of the product itself and add support for the feature. The points awarded to each feature therefore reflect both the quality of the feature and the ease of its configuration or implementation.
Although we tried to be as objective as possible, this part of the study is inherently subjective. To reduce the subjectivity we tried to award the points for each feature as consistently as possible. We used the following guidelines:
|0||The feature is not supported at all. No support is present out-of-the-box or the support is almost entirely useless. Support for the feature cannot practically be developed because the product architecture prohibits it or because the development would be extremely difficult and very expensive.|
|1||The feature is not supported out-of-the-box or the support is mostly useless. However, support for the feature can be developed using custom code. The custom development may still be quite demanding and/or expensive but it is practically feasible.|
|2||The feature is supported out-of-the-box but the support is very weak. Or the feature is not supported out-of-the-box but it is very easy to develop using custom code (not more than 1-2 man-days, including testing). Even with the custom development the support is still below average.|
|3||The feature is present out-of-the-box and is of average quality. The idea or the implementation is mostly copied from older systems without adding any significant unique aspect. If it is a complex and/or flexible feature that requires heavy configuration and/or scripting then there is a sample or documentation that can be used as a starting point. The implementation must be very easy, which mostly means that it only requires writing a couple of lines of scripting code or modifying a couple of lines taken from the sample.|
|4||The feature is present out-of-the-box. The quality of the implementation is above average. It has some aspects that make it unique or outstanding among other evaluated systems.|
|5||The feature is present out-of-the-box. The quality is excellent. The feature has aspects that are unique not only among all the evaluated systems but also among all the IDM systems that we know about. The feature implementation is truly best-of-breed quality. We also award five points for features that are a crucial part of almost any IDM solution and that are "commoditized" in the sense that any IDM solution supports them at a more or less standardized level of quality. We chose to do this to widen the gap between the systems that support crucial IDM features and the systems that do not.|
The points awarded to the individual features are simply summed to produce the totals.
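The summation can be illustrated with a minimal sketch. The product names, feature names and point values below are made up for illustration and are not taken from the study. The range check only mirrors the 0-5 scale described above.

```python
# Hypothetical feature scores (0-5) for two made-up products.
scores = {
    "Product A": {"provisioning": 5, "password reset": 3, "workflow": 2},
    "Product B": {"provisioning": 5, "password reset": 1, "workflow": 3},
}


def total_score(feature_points):
    """Sum the per-feature points, rejecting values outside the 0-5 scale."""
    for feature, points in feature_points.items():
        if not 0 <= points <= 5:
            raise ValueError(f"{feature}: {points} is outside the 0-5 scale")
    return sum(feature_points.values())


# One total per product; every feature has the same weight.
totals = {product: total_score(points) for product, points in scores.items()}
```

Because the totals are plain sums, every feature carries the same weight. The only weighting comes from the points awarded to each feature.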
The suitability scores were not computed from the feature scores. They are provided for informational purposes only. The goal is to demonstrate that the products are not the same. A product that is well suited for one environment can be completely unsuitable for another.
The source code analysis is automated. The source code for each project is taken from public source code repositories. The static source code analysis is performed by the cloc tool. The dynamic analysis is performed by processing individual commits to the source code base. Git is used as the primary tool for this part of the analysis. Repositories of projects that do not publish their source code in git are converted to the git format (e.g. by using the git-svn module). Only commits that directly deliver source changes are considered (i.e. merge commits and empty commits are ignored).
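The commit filtering described above could be approximated as follows. This is only a sketch and not the evaluation script used in the study. The `--format` string, the `@` marker and the function names are assumptions chosen for this example. It relies on the standard `git log` options `--no-merges` and `--numstat` and treats a commit with no `--numstat` lines as empty.

```python
import subprocess

# Assumed output layout for this sketch: a line starting with '@' opens a commit
# and carries hash, author e-mail and author timestamp separated by '|'.
GIT_LOG_ARGS = ["log", "--no-merges", "--numstat", "--format=@%H|%ae|%at"]


def parse_commits(log_output):
    """Return (hash, author, timestamp) for each commit that changes at least one file."""
    commits = []
    current = None
    changed_files = 0
    for line in log_output.splitlines():
        if line.startswith("@"):
            # A new commit starts: keep the previous one only if it was not empty.
            if current is not None and changed_files > 0:
                commits.append(current)
            commit_hash, rest = line[1:].split("|", 1)
            author, timestamp = rest.rsplit("|", 1)
            current = (commit_hash, author, int(timestamp))
            changed_files = 0
        elif line.strip():
            # Every non-blank line after the header is one --numstat file entry.
            changed_files += 1
    if current is not None and changed_files > 0:
        commits.append(current)
    return commits


def read_commits(repo_path):
    """Run git in repo_path and return its non-merge, non-empty commits."""
    result = subprocess.run(
        ["git", "-C", repo_path] + GIT_LOG_ARGS,
        capture_output=True, text=True, check=True,
    )
    return parse_commits(result.stdout)
```

Merge commits are dropped by git itself through `--no-merges`. Empty commits are dropped by the parser, which keeps the two rules separate and makes the parsing step testable without a repository.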
We understand that the absolute values of source code metrics do not reveal much information. For example, is it good or bad if a system has 300k lines of source code? But all the products use an equivalent platform (Java) and are designed to satisfy a very similar set of requirements. Therefore a relative comparison of the source code metrics reveals valuable information. For example, a product that has 300k lines of source code is likely to be a much more comprehensive solution than a product that has 50k lines of source code. Such relative comparison is the primary goal of this part of the study.
The vertical axis of the quadrant chart shows project maturity. Maturity shows how good the project is today. It reflects how stable, usable and complete the project is. It is meant to be proportional to the ability of users to take the product, deploy it and use it successfully right now.
The maturity score is determined by a formula:
The individual parts of the maturity score have the following meaning:
|Part||Meaning||Reasons for inclusion|
|age||Project age in calendar weeks||Age is one of the primary maturity metrics. Older projects tend to be more mature, provided that they are continuously developed. All the evaluated projects appear to be more or less continuously developed, therefore a simple project age provides a good baseline for maturity evaluation.|
|developerstotal||Total number of people that have worked on the project (all time)||More people means more ideas. If a large team can agree to work on a single project towards common goals, the result is likely to be better.|
|contributorstotal||Total number of people that have worked on the project and are not part of the core development team (all time)||It is usually easy enough to understand why the core development team works on the project. There may be financial interests (e.g. a salary or company shares), contracts, grants or similar "hard" motivators. It is much more interesting why people who are not part of the core development team get involved with the project. There is seldom a direct financial interest. Most of these contributors are volunteers or have motivations that are independent of the motivations of the core team. Such contributors are a very important indication of project maturity. Early and unstable projects are unlikely to attract independent contributors. Mature and stable projects are much more attractive. This is the primary reason why we put so much weight on project contributors. It is not because of the work that contributors do (which is usually only a very small part of the project) but because the simple act of choosing to become a project contributor indicates that the project has reached a certain maturity stage.|
|commitstotal||Total number of all source code commits.||More commits usually means more development cycles, which usually means that more work has been done. Projects with a high number of commits are likely to rework their code in a smooth, continuous way without big "revolutions". This part of the formula also acts as a penalty for projects that are getting messy, that have long periods of inactivity with no commits or that deliver the code in big chunks "thrown over the fence". Such projects will have a lower number of total commits than projects with a sustained development rate. This also penalizes projects that are not yet mature in "the open source way" and do not cooperate with their communities.|
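The inputs listed in the table can be derived from a plain list of commits, as the following sketch suggests. It computes only the inputs and does not attempt to reproduce the maturity formula. The function name, the `(author, datetime)` commit representation and the explicit `core_team` set are assumptions of this example. Age is approximated as whole weeks since the first commit.

```python
from datetime import datetime, timezone


def maturity_inputs(commits, core_team, now):
    """Derive the maturity inputs from commits given as (author, datetime) pairs.

    core_team is a set of author identifiers considered core developers;
    how that set is determined is outside the scope of this sketch.
    """
    commits = list(commits)
    authors = {author for author, _ in commits}
    first_commit = min(when for _, when in commits)
    return {
        # Whole weeks elapsed since the first commit (an approximation of age).
        "age": (now - first_commit).days // 7,
        "developers_total": len(authors),
        # Everyone who committed but is not in the core team.
        "contributors_total": len(authors - set(core_team)),
        "commits_total": len(commits),
    }


# Made-up sample data for illustration only.
sample_commits = [
    ("alice", datetime(2024, 1, 1, tzinfo=timezone.utc)),
    ("bob", datetime(2024, 1, 10, tzinfo=timezone.utc)),
    ("alice", datetime(2024, 1, 20, tzinfo=timezone.utc)),
    ("carol", datetime(2024, 1, 25, tzinfo=timezone.utc)),
]
inputs = maturity_inputs(
    sample_commits,
    core_team={"alice", "bob"},
    now=datetime(2024, 1, 29, tzinfo=timezone.utc),
)
```

In the sample data `carol` is the only author outside the assumed core team, so she is the single independent contributor.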
The horizontal axis of the quadrant chart shows project progress. Progress indicates how quickly the project is improving. It suggests how many new features and how much quality improvement the users can expect in the near future. Projects that have high progress scores are likely to become leaders during the next couple of years.
The progress score is determined by a formula:
The individual parts of the progress score have the following meaning:
|Part||Meaning||Reasons for inclusion|
|commits1 year||Total number of all source code commits during the last year.||The number of commits during the last year of development is the most important part of the progress score. It shows how alive the project is.|
|commits30 days||Total number of all source code commits during the last 30 days.||The commits during the last month provide a better indication of project dynamics than the commits over a longer period. Therefore we add these commits to the score again. They are already counted in the yearly sum of commits, so adding them again simply increases the weight of this particular aspect in the resulting score.|
|developers30 days||Total number of developers that contributed to the code during the last 30 days.||This metric indicates the current size of the development team. Any developer who does not commit at least once per month can hardly be considered a regular and efficient member of the development team.|
|lastcommit||The time since the very last commit to the source code (in days)||This factor is used as a penalty for projects that are not being developed or that have non-sustained development rates. A well-managed project with a sustained development rate should never have a last commit more than a week or two old. Therefore this penalty is negligible for active projects. For discontinued or non-sustainable projects, however, this penalty is likely to be significant.|
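The progress inputs from the table can be computed in a similar way. The sketch below again covers only the inputs, not the progress formula. The function name and the `(author, datetime)` commit representation are assumptions of this example, and "last year" is taken to mean the last 365 days.

```python
from datetime import datetime, timedelta, timezone


def progress_inputs(commits, now):
    """Derive the progress inputs from commits given as (author, datetime) pairs."""
    commits = list(commits)
    year_ago = now - timedelta(days=365)
    month_ago = now - timedelta(days=30)
    recent = [(author, when) for author, when in commits if when >= month_ago]
    last_commit = max(when for _, when in commits)
    return {
        "commits_1_year": sum(1 for _, when in commits if when >= year_ago),
        # Commits of the last 30 days are also part of the yearly count.
        "commits_30_days": len(recent),
        "developers_30_days": len({author for author, _ in recent}),
        # Whole days elapsed since the most recent commit.
        "last_commit": (now - last_commit).days,
    }


# Made-up sample data for illustration only.
sample_commits = [
    ("alice", datetime(2024, 1, 30, tzinfo=timezone.utc)),
    ("bob", datetime(2024, 1, 10, tzinfo=timezone.utc)),
    ("alice", datetime(2023, 6, 1, tzinfo=timezone.utc)),
    ("carol", datetime(2022, 1, 1, tzinfo=timezone.utc)),
]
inputs = progress_inputs(
    sample_commits, now=datetime(2024, 1, 31, tzinfo=timezone.utc)
)
```

The commit from 2022 falls outside both windows and therefore influences none of the progress inputs, which matches the intent that progress reflects recent activity only.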
Each product has a slightly different team culture and development process. Therefore comparing the bulk of the source code in the form in which it is distributed does not make much sense. To compare apples to apples we have excluded the following sections from the source code distribution of every system:
We also considered excluding the test code, but we finally decided to include it. Although test code is, strictly speaking, not a deliverable, it plays a very important role in project development. A system without proper test code cannot be efficiently refactored and evolved. Therefore, although test code does not add much to the current product features, it is a significant enabler of future features. The test code also adds to the overall product quality. Therefore we consider it fair to include it in the evaluation.
The evaluation script with a complete configuration is available for download. The script is provided to support the transparency of this study. Anybody can have a look at the scripts and configuration and make sure that we have not cheated. Anybody can re-run the scripts and check our results.
Download the scripts here
Note: This distribution is not open source. The scripts and all the associated data are copyrighted. You can look at the scripts, check them, run them (in the original unmodified form) and check the results. But you do not have the right to redistribute the scripts, modify them or modify the results that are produced by these scripts. Most importantly, we are not giving permission to use the evaluation data that we have gathered. There are good reasons for this. If you want to make your own study of IAM systems then do the same thing as we did: start from scratch. Download, install, configure and evaluate all the systems yourself. That is the only way to make a fair comparison. Reusing part of our data is useless for this kind of fair comparison, and giving away permission to use the data would only be a temptation. Therefore we are not giving that away. If you just want to reuse the code (not the data) then we will be happy to allow it as long as you make the data and the scripts publicly available and give us proper credit. But please contact us first, and be prepared to present your results for review before publishing so we can make sure that you have not reused any of our data.