In the next branch of the Haskell Platform, we’ll be adding and removing packages from the specification for the first time. The Haskell Platform steering committee will make recommendations for additions and removals based on individual proposals to add and remove packages from the list.
It is hard to come up with “notability” criteria for why a package should be added or removed. People use the Haskell Platform for many competing reasons, and they need different packages.
The goal, though, should be an almost fully automated set of criteria for determining when a package should be added, based on objective data. Then, combined with strategic and other concerns, packages will be added or, sometimes, removed.
Possible Criteria for Notability
A quick list of possible criteria by which to evaluate whether a package is “blessed”:
- How popular is the package in Hackage downloads?
- How many packages depend on it?
- Do any applications of note depend on it?
- Does it meet a stated end-user need?
- Do similar systems include such a library (e.g. Python)?
- Is it portable?
- Does it add additional C libraries?
- Does it follow the package versioning system?
- Is the code of good quality?
- Does it have a good development history?
- Is it on Hackage?
- Does it provide Haddock documentation?
- Does it come with examples?
- Does it have a test suite?
- Does it have a maintainer?
- Does it in turn require new Haskell dependencies?
- Does it have a simple/configure-based Cabal build?
- Does it conflict/compete with existing functionality?
- Does it reuse existing types?
- Does it follow the hierarchical naming conventions?
- Is it -Wall clean?
- Does it have declared correctness or performance statements?
- Is it BSD licensed?
- Is it thread-safe?
A Point System
One way of determining notability for a package would be to use a points system against an agreed-upon set of such criteria.
Does anyone know of similar examples, or would anyone like to code up some programs to experiment with these ratings?
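As a starting point for such experiments, here is a minimal sketch of a points system in Haskell. It assumes a hypothetical subset of the yes/no criteria listed above and made-up weights; the names `Criterion`, `weight`, `score` and `rank`, and the packages `pkg-a` and `pkg-b`, are illustrative only and not an agreed-upon scheme.

```haskell
import Data.List (nub, sortBy)
import Data.Ord (comparing)

-- A hypothetical subset of the yes/no criteria from the list above.
data Criterion
  = OnHackage
  | HasHaddock
  | HasTestSuite
  | HasMaintainer
  | BSDLicensed
  | WallClean
  | FollowsPVP
  deriving (Show, Eq)

-- Made-up weights; real values would have to be agreed upon.
weight :: Criterion -> Int
weight OnHackage     = 1
weight HasHaddock    = 2
weight HasTestSuite  = 3
weight HasMaintainer = 2
weight BSDLicensed   = 1
weight WallClean     = 1
weight FollowsPVP    = 2

-- A package together with the criteria it is known to meet.
data Package = Package
  { pkgName   :: String
  , satisfies :: [Criterion]
  }

-- Total points for a package; each criterion counts at most once.
score :: Package -> Int
score = sum . map weight . nub . satisfies

-- Packages with their scores, highest first (ties keep input order).
rank :: [Package] -> [(String, Int)]
rank pkgs = sortBy (flip (comparing snd)) [ (pkgName p, score p) | p <- pkgs ]

-- Invented example data, not real assessments of any package.
examplePackages :: [Package]
examplePackages =
  [ Package "pkg-a" [OnHackage, HasHaddock, BSDLicensed]
  , Package "pkg-b" [OnHackage, HasHaddock, HasTestSuite, HasMaintainer]
  ]
```

Loading this in GHCi and evaluating `rank examplePackages` lists `pkg-b` ahead of `pkg-a`. Subjective criteria such as code quality would still need a human judgement before they could be turned into points.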
Distro Page Rank
Another source of raw data may well be a sort of “Page Rank” across Unix distros for how often a package is used. On the Arch Linux distribution, we have three levels of support for Haskell. In the core system, some Haskell apps and tools are provided in binary form. In the “community” binary repo there are yet more packages. Finally, in the user-contributed repository are around 1300 other packages (~90% of Hackage).
Does your distro have popularity statistics? Could you determine the top 100 Haskell packages by vote?
Most Popular Packages in Arch Linux
Some users install packages with the ‘yaourt’ tool, and some of those users opt in to voting when they install. Here are the top 100 packages sorted by votes in Arch Linux, with those already in the Haskell Platform marked in the HP column:
| HP | Repository | Category | Library/Program | Votes | Synopsis | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| | Extra | | darcs | | Decentralized replacement for CVS with roots in quantum mechanics | |
| | Extra | | haskell-extensible-exceptions | | Extensible exceptions | darcs dep |
| | Extra | | haskell-hashed-storage | | Hashed file storage support code. | darcs dep |
| | Extra | | haskell-haskeline | | A command-line interface for user input, written in Haskell. | darcs dep |
| | Extra | | haskell-mmap | | Memory mapped files for POSIX and Windows | darcs dep |
| | Extra | | haskell-terminfo | | Haskell bindings to the terminfo library. | darcs dep |
| | Extra | | haskell-utf8-string | | Support for reading and writing UTF8 Strings | darcs dep |
| YES | Extra | | ghc | | The Glasgow Haskell Compiler | |
| | Extra | | hugs98 | | Haskell 98 interpreter | |
| YES | Extra | | happy | | The Parser Generator for Haskell | |
| YES | Community | | alex | | a lexical analyser generator for Haskell | |
| | Community | | gtk2hs | | A GTK+2 binding for Haskell | |
| YES | Community | | haskell-http | | A library for client-side HTTP | cabal dep |
| YES | Community | | cabal-install | | The command-line interface for Cabal and Hackage. | |
| | Community | | haskell-x11 | | A Haskell binding to the X11 graphics library. | xmonad dep |
| | Community | | haskell-x11-xft | | Bindings to the Xft, X Free Type interface library, and some Xrender parts | xmonad dep |
| YES | Community | | haskell-zlib | | Compression and decompression in the gzip and zlib formats | cabal dep |
| | Community | | pandoc | | Haskell library and program to convert one markup format to another | |
| | Community | | xmonad | | A lightweight X11 tiled window manager written in Haskell | |
| | Community | | xmonad-contrib | | Add-ons for xmonad | xmonad dep |
| | | lib | haskell-binary 0.5.0.1-1 | 98 | Binary serialisation for Haskell values using lazy ByteStrings | |
| YES | | lib | haskell-opengl 2.2.1.1-1 | 56 | A binding for the OpenGL graphics system | |
| | | lib | haskell-hslogger 1.0.7-2 | 51 | Versatile logging framework | |
| | | lib | haskell-puremd5 1.0.0.0-1 | 48 | MD5 implementations that should become part of a ByteString Crypto package. | |
| YES | | lib | haskell-syb 0.1.0.0-1 | 48 | Scrap Your Boilerplate | |
| YES | | devel | haddock 2.4.2-1 | 46 | A documentation-generation tool for Haskell libraries | |
| | | devel | haskell-xft 0.2-2 | 46 | Bindings to the Xft library, and some Xrender parts | |
| | | lib | haskell-ghc-paths 0.1.0.5-1 | 45 | Knowledge of GHC’s installation directories | |
| | | lib | haskell-haxml 1.13.3-1 | 42 | Utilities for manipulating XML documents | |
| | | lib | haskell-missingh 1.1.0-1 | 40 | Large utility library | |
| | | lib | haskell-testpack 1.0.2-1 | 36 | Test Utililty Pack for HUnit and QuickCheck | |
| YES | | lib | haskell-time 1.1.2.4-1 | 36 | A time library | |
| | | lib | haskell-uniplate 1.2.0.3-1 | 36 | Uniform type generic traversals. | |
| | | lib | haskell-diff 0.1.2-1 | 35 | O(ND) diff algorithm in haskell. | |
| YES | | lib | haskell-mtl 1.1.0.2-1 | 35 | Monad transformer library | |
| YES | | lib | haskell-regex-base 0.93.1-1 | 33 | Replaces/Enhances Text.Regex | |
| YES | | lib | haskell-parsec 3.0.0-1 | 32 | Monadic parser combinators | |
| | | devel | cpphs 1.7-1 | 31 | A liberalised re-implementation of cpp, the C pre-processor. | |
| | | lib | haskell-curl 1.3.5-1 | 31 | Haskell binding to libcurl | |
| | | lib | haskell-hinotify 0.2-1 | 31 | Haskell binding to INotify | |
| | | lib | haskell-transformers 0.1.4.0-1 | 31 | Concrete monad transformers | |
| | | lib | haskell-unix-compat 0.1.2.1-1 | 31 | Portable POSIX-compatibility layer. | |
| | | devel | cabal2arch 0.5.3-1 | 30 | Create Arch Linux packages from Cabal packages | |
| | | lib | haskell-fingertree 0.0.1.0-1 | 30 | Generic finger-tree structure, with example instances | |
| | | lib | haskell-haskell-src-exts 1.0.1-1 | 30 | Manipulating Haskell source: abstract syntax, lexer, parser, and pretty-printer | |
| YES | | lib | haskell-glut 2.1.1.2-1 | 29 | A binding for the OpenGL Utility Toolkit | |
| | | lib | haskell-pcre-light 0.3.1-2 | 29 | A small, efficient and portable regex library for Perl 5 compatible regular expressions | |
| | | lib | haskell-rosezipper 0.1-1 | 29 | Generic zipper implementation for Data.Tree | |
| | | devel | hscolour 1.13-1 | 28 | Colourise Haskell code. | |
| | | lib | haskell-data-accessor 0.2.0.2-1 | 26 | Utilities for accessing and manipulating fields of records | |
| | | lib | haskell-data-accessor-template 0.2.1.1-1 | 26 | Utilities for accessing and manipulating fields of records | |
| | | lib | haskell-regex-tdfa 1.1.2-2 | 26 | Replaces/Enhances Text.Regex | |
| | | lib | haskell-xml 1.3.4-1 | 26 | A simple XML library. | |
| | | lib | haskell-hsh 2.0.2-1 | 25 | Library to mix shell scripting with Haskell programs | |
| | | lib | haskell-split 0.1.1-1 | 25 | Combinator library for splitting lists. | |
| | | lib | haskell-utility-ht 0.0.5.1-1 | 25 | Various small helper functions for Lists, Maybes, Tuples, Functions | |
| | | lib | haskell-vty 3.1.8.4-1 | 25 | A simple terminal access library | |
| | | lib | haskell-syb-with-class 0.5.1-1 | 24 | Scrap Your Boilerplate With Class | |
| YES | | lib | haskell-cgi 3001.1.7.1-1 | 23 | A library for writing CGI programs | |
| YES | | lib | haskell-fgl 5.4.2.2-1 | 23 | Martin Erwig’s Functional Graph Library | |
| | | devel | derive 0.1.4-1 | 22 | A program and library to derive instances for data types | |
| | | lib | haskell-monads-fd 0.0.0.1-1 | 21 | Monad classes, using functional dependencies | |
| | | devel | haskell-pandoc 1.2.1-1 | 21 | Conversion between markup formats | |
| | | lib | haskell-safe 0.2-1 | 21 | Library for safe (pattern match free) functions | |
| | | lib | haskell-zip-archive 0.1.1.3-1 | 21 | Library for creating and modifying zip archives. | |
| YES | | lib | haskell-bytestring 0.9.1.4-1 | 20 | Fast, packed, strict and lazy byte arrays with a list interface | |
| | | lib | haskell-configfile 1.0.4-2 | 20 | Configuration file reading & writing | |
| | | lib | haskell-data-accessor-monads-fd 0.2-1 | 20 | Use Accessor to access state in monads-fd State monad class | |
| | | lib | haskell-hstringtemplate 0.6-1 | 20 | StringTemplate implementation in Haskell. | |
| | | lib | haskell-pointedlist 0.3.5-1 | 20 | A zipper-like comonad which works as a list, tracking a position. | |
| YES | | lib | haskell-quickcheck 2.1.0.1-2 | 20 | Automatic testing of Haskell programs | |
| | | lib | haskell-convertible 1.0.5-1 | 19 | Typeclasses and instances for converting between types | |
| | | lib | haskell-digest 0.0.0.6-1 | 19 | Various cryptographic hashes for bytestrings; CRC32 and Adler32 for now. | |
| | | lib | haskell-hdbc 2.1.1-1 | 19 | Haskell Database Connectivity | |
| | | network | twidge 0.99.3-1 | 19 | Unix Command-Line Twitter and Identica Client | |
| | | lib | haskell-hspread 0.3.3-1 | 18 | A client library for the spread toolkit | |
| | | lib | haskell-readline 1.0.1.0-1 | 17 | An interface to the GNU readline library | |
| | | lib | haskell-strict 0.3.2-2 | 17 | Strict data types and String IO. | |
| | | lib | haskell-happs-util 0.9.3-1 | 16 | Web framework | |
| | | devel | hoogle 4.0.7-1 | 16 | Haskell API Search | |
| | | editors | yi 0.6.1-1 | 16 | The Haskell-Scriptable Editor | |
| | | lib | haskell-findbin 0.0.2-1 | 15 | Locate directory of original program | |
| | | lib | haskell-glfw 0.3-1 | 15 | A binding for GLFW, An OpenGL Framework | |
| | | lib | haskell-json 0.4.3-1 | 15 | Support for serialising Haskell to and from JSON | |
| YES | | lib | haskell-network 2.2.1.4-1 | 15 | Networking-related facilities | |
| | | lib | haskell-stream 0.3.2-1 | 15 | A library for manipulating infinite lists. | |
| | | lib | haskell-tagsoup 0.6-2 | 15 | Parsing and extracting information from (possibly malformed) HTML documents | |
| YES | | lib | haskell-editline 0.2.1.0-2 | 14 | Bindings to the editline library (libedit). | |
| | | lib | haskell-sdl 0.5.5-1 | 14 | Binding to libSDL | |
| | | editors | leksah 0.6.1-1 | 14 | Haskell IDE written in Haskell | |
| | | devel | c2hs 0.16.0-1 | 13 | C->Haskell FFI tool that gives some cross-language type safety | |
| | | lib | haskell-hsx 0.5.6-1 | 13 | HSX (Haskell Source with XML) allows literal XML syntax to be used in Haskell source code. | |
| | | devel | hlint 1.6.4-1 | 13 | Source code suggestions | |
| | | lib | haskell-crypto 4.2.0-1 | 12 | Collects together existing Haskell cryptographic functions into a package | |
| | | lib | haskell-hdbc-sqlite3 2.1.0.2-1 | 12 | Sqlite v3 driver for HDBC | |
| | | lib | haskell-highlighting-kate 0.2.4-1 | 12 | Syntax highlighting | |
| | | lib | haskell-hjavascript 0.4.4-1 | 12 | HJavaScript is an abstract syntax for a typed subset of JavaScript. | |
| | | lib | haskell-hjscript 0.4.4-1 | 12 | HJScript is a Haskell EDSL for writing JavaScript programs. | |
| | | devel | mkcabal 0.4.2-2 | 12 | Generate cabal files for a Haskell project | |
| | | lib | haskell-arrows 0.4.1.1-1 | 11 | Arrow classes and transformers | |
| | | lib | haskell-filemanip 0.3.2-1 | 11 | Expressive file and directory manipulation for Haskell. | |
| | | lib | haskell-happs-data 0.9.3-1 | 11 | HAppS data manipulation libraries | |
| | | lib | haskell-happs-ixset 0.9.3-1 | 11 | | |
| | | lib | haskell-happs-state 0.9.3-1 | 11 | Event-based distributed state. | |
| | | lib | haskell-harp 0.4-1 | 11 | HaRP allows pattern-matching with regular expressions | |
| | | lib | haskell-lazysmallcheck 0.3-2 | 11 | A library for demand-driven testing of Haskell programs | |
| | | lib | haskell-typecompose 0.6.4-1 | 11 | Type composition classes & instances | |
| | | lib | haskell-dataenc 0.13.0.0-1 | 10 | Data encoding library | |
| | | lib | haskell-happstack-util 0.3.2-1 | 10 | Web framework | |
| | | lib | haskell-hxt 8.3.1-1 | 10 | A collection of tools for processing XML with Haskell. | |
| | | lib | haskell-maybet 0.1.2-1 | 10 | MaybeT monad transformer | |
| | | lib | haskell-platform 2009.2.0.2-1 | 10 | The Haskell Platform | |
| | | office | pdf2line 0.0.1-1 | 10 | Simple command-line utility to convert PDF into text | |
| | | lib | haskell-category-extras 0.53.5-1 | 9 | Various modules and constructs inspired by category theory | |
| | | lib | haskell-colour 2.2.1-1 | 9 | A model for human colour/color perception | |
| | | lib | haskell-datetime 0.1-1 | 9 | Utilities to make Data.Time.* easier to use. | |
| | | lib | haskell-happs-server 0.9.3-1 | 9 | Web related tools and services. | |
Now, one of the other constraints on the Haskell Platform is sustainable growth. We can’t add 1000 packages tomorrow and hope to maintain quality. Instead, something like 10-20% growth per release cycle seems plausible. This would mean adding 4 to 9 new packages.
If we were to judge only on download popularity, our first 5 new packages would be:
- haskell-extensible-exceptions
- haskell-hashed-storage
- haskell-haskeline
- haskell-mmap
- haskell-terminfo
These rank highly merely because one killer app, darcs, depends on them, and so they are widely built (they may also fail to satisfy many of the other criteria noted above).
If we ignore those packages popular for being dependencies, we get a different top 5:
Now we’re getting there. pandoc is both a library and a popular app, so we might treat it specially. gtk2hs is very popular, but not cabalised, so we might also set that aside, leaving (and I’ll ignore ghc-paths as it is used by ghc):
Which is starting to look like a plausible list. In turn, however, you can find fault with all these packages in various dimensions (utf8-string may be obsoleted by Data.Text; haxml is LGPL licensed).
Coming up with an obvious list is non-trivial!
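The filtering described here (skip what is already in the platform, skip packages that are popular only as dependencies, then sort by votes) is easy to mechanise. Below is a minimal sketch in Haskell, assuming a hand-copied sample of rows from the Arch Linux table; the `Row` type and the functions `candidates` and `topN` are hypothetical names. Because the sample is tiny and rows without vote counts are dropped, its output is an illustration of the method and not a reconstruction of the lists discussed in this post.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- One row of the popularity table, reduced to the fields we filter on.
data Row = Row
  { inPlatform :: Bool       -- marked YES in the HP column
  , rowName    :: String
  , rowVotes   :: Maybe Int  -- Nothing where the table shows no votes
  , rowNote    :: String     -- e.g. "darcs dep"; empty if no note
  }

-- A small sample copied from the table above, not the full data set.
sampleRows :: [Row]
sampleRows =
  [ Row False "haskell-mmap"     Nothing   "darcs dep"
  , Row True  "haskell-zlib"     Nothing   "cabal dep"
  , Row False "haskell-binary"   (Just 98) ""
  , Row True  "haskell-opengl"   (Just 56) ""
  , Row False "haskell-hslogger" (Just 51) ""
  , Row False "haskell-puremd5"  (Just 48) ""
  , Row True  "haskell-syb"      (Just 48) ""
  , Row False "haskell-haxml"    (Just 42) ""
  ]

-- Packages not yet in the platform, not flagged as a dependency,
-- and with a known vote count, ordered by votes (highest first).
candidates :: [Row] -> [(String, Int)]
candidates rows =
  sortBy (flip (comparing snd))
    [ (rowName r, v)
    | r <- rows
    , not (inPlatform r)
    , null (rowNote r)
    , Just v <- [rowVotes r]
    ]

-- The n most-voted candidates.
topN :: Int -> [Row] -> [(String, Int)]
topN n = take n . candidates
```

In GHCi, `topN 3 sampleRows` returns the three most-voted rows of the sample that pass the filter. Further exclusions, such as licence or build-system concerns, would be extra guards in the same comprehension.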
Finally, this is clearly only one very small data set, which should only have a small influence. If we step over and look at the Hackage download statistics, sorted by popularity, our top 5 new packages would be:
Popularity by Category
If instead we thought that having a comprehensive library set was the key goal, we might choose to include libraries by category, no matter how popular they are in the global list. This would yield, according to Hackage:
- Database: haskell-hdbc
- XML: haskell-haxml
- Binary: haskell-binary
- Text: haskell-utf8-string
- 2D Graphics: haskell-sdl
- Numerics: haskell-hmatrix
For example.
What Is The Decision Model?
So how do we decide what goes in? One model would be:
- Have people propose packages
- Sort them by category need
- Identify the top-ranked package in each category using a points system or page rank
- Add or remove packages based on this?
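The middle two steps of this model can be sketched in a few lines of Haskell. This is only an illustration under stated assumptions: the `Proposal` record, its integer `propScore` (which could come from a points system or from votes), and the packages `db-a`, `db-b` and `xml-a` are all hypothetical. It uses `Data.Map` from the `containers` package.

```haskell
import qualified Data.Map as Map

-- A proposed package with its category and some numeric score.
data Proposal = Proposal
  { propName     :: String
  , propCategory :: String
  , propScore    :: Int
  }

-- Group proposals by category and keep the highest-scoring one in each.
-- On a tie, the proposal that appears earlier in the list is kept.
topPerCategory :: [Proposal] -> Map.Map String Proposal
topPerCategory = Map.fromListWith better . map (\p -> (propCategory p, p))
  where
    -- fromListWith passes the newer value first and the stored one second.
    better new old
      | propScore new > propScore old = new
      | otherwise                     = old

-- Category/package pairs of the winners, in category order.
selected :: [Proposal] -> [(String, String)]
selected ps = [ (c, propName p) | (c, p) <- Map.toList (topPerCategory ps) ]

-- Invented proposals and scores, for illustration only.
exampleProposals :: [Proposal]
exampleProposals =
  [ Proposal "db-a"  "Database" 5
  , Proposal "db-b"  "Database" 8
  , Proposal "xml-a" "XML"      3
  ]
```

Evaluating `selected exampleProposals` in GHCi picks `db-b` for Database and `xml-a` for XML. The final step, actually adding or removing packages, is left to people, since strategic concerns are not captured by a single score.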
What do you think? What is a good way to decide when a package is sufficiently notable to add to the Haskell Platform?
What criteria would you use to determine when a package is blessed?