Playing with the new Haskell epoll event library

Bryan and Johan have been working hard on replacing the GHC runtime’s select-based concurrency mechanism with a new one based on epoll. This should improve the scalability of Haskell server code in a number of ways — more connections per second, more concurrent connections, and so on.

You can read about their work in a series of posts:

And you can find the current code for haskell-event on github:

After reading this post on how epoll-based userland callbacks can outperform general-purpose preemptive threads for non-compute-bound IO work, I wanted to see what improvements we could hope for from an epoll-based version in Haskell.

The full code, which shows how to set up simple socket event notification using the event lib, is on the Haskell wiki, and will be useful for getting a flavor of the event lib use.

The results:

So while a simple forkIO/accept version doing Handle and String IO peaked at around 7k req/sec, the epoll-based version reached over 15k req/sec.

So that’s useful: good epoll-based event code for Haskell. The next step will be to see the epoll-based redesign of Control.Concurrent merged into GHC.

Daily Haskell: Download and analyse logs, then generate sparklines

This is the first post in a series of “Daily Haskell” posts, about getting those everyday tasks done in Haskell by gluing together libraries on Hackage. It is a series about Haskell as glue, with a tour of the libraries thrown in. Today, a quick reflection on what brought Haskell into this “glue” phase of its existence, and the first “daily Haskell” program: hackage-sparks, a script to produce sparklines. First, an overview of how we got here, for new Haskellers.

The practice of Hackage programming has shifted dramatically in the past 12 months. It used to be that you had to roll your own HTTP client, or log file parser, or graph generator, using libraries like Parsec or Arrows to make things clean and elegant. But by 2008 the daily practice of Haskell programming is dominated by glue: combining existing libraries in new ways, using cabal-install to gather the components, and letting you simply and quickly get your work done.

Three key developments brought about this shift, and I want to quickly go over them (all aspiring Haskellers should know these tools!).

A single, common build system: Cabal

The first key development started in 2004, when Isaac Potoczny-Jones (now my boss here at Galois) saw a chronic lack of a single Haskell build system. The few and the brave were rolling their own library archives with Makefiles and autoconf, but there was no way to check whether other Haskell dependencies were around, and no agreed-upon way to construct the build system. Everybody did their own thing, and all were broken in some way.

Isaac started hacking, quickly getting code out to the community. Things built from there, and now, 4 years later, we’ve a grand solution: Cabal, the common platform for building Haskell applications and libraries. By writing a simple, declarative specification for what your Haskell code provides, Cabal is able to abstract out all the nitty gritty of actually preprocessing, compiling, optimising, linking and installing your apps and tools. Now even Haskell newbies can construct perfect libraries and apps, and crucially, they can simply and correctly reuse the libraries of others, with all dependencies checked and satisfied. The type system enforces that the glue is of the appropriate strength for what we’re combining, and purity ensures libraries we import don’t monkey around in code we’ve already written. This is how simple, modular, scalable programming should be.

Centralised library repository: Hackage

The second big step really took off in March last year, after the Oxford Haskell Hackathon, when Hackage went live. Like CPAN before it, this provides a single, centralised repository of all the Haskell code fit to package. Dependencies are described, a standard interface is presented, and, key for developers, online cross-referenced documentation of all the library APIs is provided. In this way, developers can contribute their libraries to the central API pool, and have them integrated and documented with others’ work.

The online documentation is crucial for spreading knowledge of library APIs, because, as this is a purely functional, strongly typed language, once you see the type signature for a library function, you’ve all the information you need to integrate it safely in your code, and start using it. Just check the function types — they tell you everything you need to know about how to use it.
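As a toy illustration of reading types, here are two functions from base (not from any of the libraries above, just an example of the idea): the signatures of `sortBy` and `comparing` alone tell you how they fit together, before you read a word of prose documentation.

```haskell
import Data.List (sortBy)
import Data.Ord  (comparing)

-- sortBy    :: (a -> a -> Ordering) -> [a] -> [a]
-- comparing :: Ord b => (a -> b) -> a -> a -> Ordering
--
-- The types alone show the fit: 'comparing fst' is exactly the
-- comparison function that 'sortBy' expects.
byFirst :: Ord a => [(a, b)] -> [(a, b)]
byFirst = sortBy (comparing fst)
```

So `byFirst [(3,"c"),(1,"a"),(2,"b")]` gives `[(1,"a"),(2,"b"),(3,"c")]`, and we knew how to wire the pieces together from the signatures alone.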

Since it went live we’ve had over 550 libraries and tools uploaded to Hackage, with everything from XML parsers, GUIs and databases, to 3D shooter games, bioinformatics software, MIDI synthesisers and Perl 6 implementations now easily available. Almost everything you might need for your daily work is now there in one form or another, and if it’s not, it is a small matter of rolling an FFI binding and uploading your cabalised package to add to the pile. Please do so!

One stop Haskell installation: cabal install

The final piece of the build, host and distribution puzzle fell into place in June, with the release of cabal-install, an apt- or pacman-like tool that automates all dependency resolution, downloading and building of libraries and applications. For example, from a single command:

    $ cabal install hackage-sparks

The tool will chase down, build and install everything needed for your application.

The centre of the Haskell universe is now Hackage, with projects adding to the code base, combining libraries into new forms, packaging up, and downloading from, this wealth of code. If you’re not yet using it, run, don’t walk, to Hackage, get cabal-install, and get coding! If you’re a user of a distro with good Haskell support, like, say, Gentoo or Arch Linux, you’ll already have cabal-install in native packaged form.

Daily Haskell: Sparklines, Log files and Tagsoup

Down to work. With all these uploads happening on Hackage, late Friday I wanted to summarise how active Hackage is, month by month and day by day, in a concise format, suitable for presentation on the Hackage front page.

One nice way to do this is via sparklines: simple, concise, dense graphics that fit inside a sentence. I’d like to condense the Hackage log files into such graphs, and have them available on Hackage, thanks to Hitesh Jasani’s hsparklines library.

To do all this we’ll need to do three things:

  1. Download the log file
  2. Analyse and group the logs by date into months and days
  3. Spit out .png files containing rendered graphs

The libraries we’ll use for this are:

  1. tagsoup
  2. parsedate
  3. hsparklines

The upload logs live on the Hackage server, and have the form:

    Mon Jun 23 09:03:05 PDT 2008 AudreyTang Pugs
    Mon Jun 23 09:34:07 PDT 2008 UweSchmidt hxt 8.1.0
    Mon Jun 23 11:50:45 PDT 2008 JeremyShaw AGI 1.1.1

The code is straightforward, and shouldn’t take more than 10 minutes to write. Just a quick script. First, import the libs we want.

    -- Some basics
    import Data.List
    import Data.Maybe

    -- Time and locale handling
    import System.Time
    import System.Locale

    -- Directory and filepaths
    import System.Directory
    import System.FilePath

    -- Parsers for time strings
    import System.Time.Parse

    -- Easy HTTP downloads
    import Text.HTML.Download

    -- Sparkline graphics
    import Graphics.Rendering.HSparklines

This is how Haskell as glue works. Pull in everything, and roll some list glue between components. Now, some constants:

    -- Where our log files live
    url = ""

    -- Filenames of our generated graphs
    png1 = "hackage-monthly.png"
    png2 = "hackage-daily.png"

Yeah, no type declarations. Type inference for just getting the job done.

Visiting Hackage and looking at the API for hsparklines, we see it is possible to define a custom graph style for our sparklines, so let’s define a bar graph with a grey background:

    graph = barSpark { bgColor = rgb 0xEE 0xEE 0xEE }
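That one-liner uses Haskell’s record update syntax: take barSpark’s defaults, and change only the background. Here is the same idiom on a hypothetical style type of our own (the field names are made up for illustration, not hsparklines’ actual ones):

```haskell
-- A hypothetical style record, standing in for hsparklines' own.
data Style = Style { styleWidth :: Int, styleBg :: (Int, Int, Int) }
  deriving (Eq, Show)

defaultStyle :: Style
defaultStyle = Style { styleWidth = 100, styleBg = (0xFF, 0xFF, 0xFF) }

-- Record update: a copy of defaultStyle with only the background
-- changed; every other field keeps its default value.
greyStyle :: Style
greyStyle = defaultStyle { styleBg = (0xEE, 0xEE, 0xEE) }
```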

Now, our script proper. Grab the pwd, and download the log file:

    main = do
        pwd <- getCurrentDirectory
        src <- openURL url

Yeah, that’s how you download a page of the internets. Easy.

Now, we start the glue logic. Break the log file into lines, parse them into proper dates using the parsedate API, and sort them by date, ignoring any parse failures:

        let dates = sort . catMaybes . map parse . lines $ src
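This parse-then-drop-failures-then-sort shape works for any Maybe-returning parser. Here is a self-contained sketch of the same pipeline, with a toy integer parser standing in for parsedate:

```haskell
import Data.List  (sort)
import Data.Maybe (mapMaybe)

-- Toy stand-in for the date parser: Nothing on a failed parse.
parseInt :: String -> Maybe Int
parseInt s = case reads s of
               [(n, "")] -> Just n
               _         -> Nothing

-- Parse every line, silently drop the failures, then sort.
sortedInts :: [String] -> [Int]
sortedInts = sort . mapMaybe parseInt
```

So `sortedInts ["3", "oops", "1", "2"]` is `[1,2,3]`, with the unparseable "oops" quietly dropped, just as malformed log lines are.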

Cool, we got a lot done there. Now, find today’s date, and use groupBy from Data.List to cluster our individual uploads into groups of days and months:

        let today     = last dates
            permonth  = groupBy month dates
            thismonth = groupBy day . filter (month today) $ dates

We define some helper functions, in a where clause at the bottom of ‘main’, to compare dates by year and month, or by year, month and day:

      where
        parse = parseCalendarTime defaultTimeLocale "%c"

        month a b = ctYear a == ctYear b && ctMonth a == ctMonth b
        day   a b = month a b && ctDay a == ctDay b

‘month’ is nice, as it can be passed to both groupBy and filter, letting us group on months, or filter things that match today.
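Note that groupBy only merges *adjacent* elements its predicate relates, which is why the dates were sorted first. A small self-contained sketch of the grouping-and-counting step, with a toy equivalence standing in for ‘month’:

```haskell
import Data.List (genericLength, groupBy, sort)

-- Toy equivalence standing in for 'month': same tens digit.
sameTens :: Int -> Int -> Bool
sameTens a b = a `div` 10 == b `div` 10

-- Sort, cluster adjacent equivalent elements, then count each
-- cluster, with genericLength giving Floats ready for graphing.
countsByTens :: [Int] -> [Float]
countsByTens = map genericLength . groupBy sameTens . sort
```

For example, `countsByTens [12,3,15,7,21]` sorts to `[3,7,12,15,21]`, clusters to `[[3,7],[12,15],[21]]`, and yields `[2.0,2.0,1.0]` — exactly the shape of the per-month and per-day counts below.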

Almost done. Now count the number of uploads in each month or day group, converting the lengths into Floats ready for graphing:

        let monthlies = map genericLength permonth
            dailies   = map genericLength thismonth

Now, we just use the ‘make’ function from hsparklines, which takes a graph style and a list of data points. The result is an Image value, which can be immediately written to disk:

        graph1 <- make graph monthlies
        graph2 <- make (graph { limits = (0,20) }) dailies

        savePngFile png1 graph1
        savePngFile png2 graph2

Done! Now tell the user what we wrote:

        putStrLn $ "Wrote: " ++ pwd </> png1
        putStrLn $ "Wrote: " ++ pwd </> png2

And that’s it. Haskell logic gluing together network code pulling in online logs, analysing them, and graphing the results. Simple, and all strongly typed, pure glue.

Running this in ghci, or from the command line:

    $ hackagesparks
    Wrote: /home/dons/dons/src/hackage-sparks/hackage-monthly.png
    Wrote: /home/dons/dons/src/hackage-sparks/hackage-daily.png

The last step is to construct a .cabal file for the program, and upload it to hackage, specifying the program and its deps:

    name:                hackage-sparks
    version:             0.1
    license:             BSD3
    license-file:        LICENSE
    author:              Don Stewart
    category:            Graphics
    synopsis:            Generate sparkline graphs of hackage statistics
    description:         Generate sparkline graphs of hackage statistics
    cabal-version:       >= 1.2
    build-type:          Simple

    executable hackagesparks
        main-is:         Main.hs
        build-depends:   base >= 3, old-locale, old-time, directory,
                         hsparklines, tagsoup, parsedate, filepath

Bundling up the .cabal file and the source, I uploaded this script to Hackage, so you can get it via cabal-install. You can generate such .cabal files using mkcabal.

And that’s it. Job done. Sparklines for the uploads are visible on the Hackage front page.

I hope you get a sense for how, with the build and distribution infrastructure of Cabal, and the wealth of libs on Hackage, it’s cheap to roll Haskell solutions to your everyday scripting problems, yielding rock-solid, native-code-compiled, strongly typed scripts that just work. No fuss, no mess, just getting the job done. In the coming weeks I hope to post more of these daily Haskell scripts, covering more and more of Hackage, and giving an insight into what Haskell for the working programmer is like.