While at ZuriHac, Johan Tibell, David Anderson, Duncan Coutts and I discussed what the highest priority projects for the Haskell community are, in the context of the Google Summer of Code, for which Haskell.org is a mentoring organization for the 5th year.
Here are our top 8 projects, the ones we would really like to see good applications for. Some of these have tickets already, but some don’t. If you apply to work on projects like those below, you can expect strong support from the mentors, which ultimately determines whether you’ll be funded.
For details on what we think you need to consider when applying to execute a project, see this earlier post.
A Package Versioning Policy Checker
Cabal relies on package version ranges to determine what Haskell software to install on your system. Version numbers are essentially “hashes” of the API of the package, and should be computed according to the package versioning policy. However, package authors have no tool to determine automatically how the version number should change when they release a new version, which leads to mistakes and needless dependency breakages.
This project would construct a tool that would be able to compute the correct package version number, given a package and an API change. As an extension, it would warn about errors in version ranges in .cabal files.
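To make the idea concrete, here is a minimal sketch of the version-bump step only, assuming the API difference has already been computed somehow. The `Change` and `Bump` types and the functions `requiredBump` and `nextVersion` are hypothetical names invented for illustration; the rules follow our reading of the package versioning policy (removals and type changes force a new A.B, pure additions force a new C), and a real tool would need to cover instances, data type definitions and more.

```haskell
import Data.List (intercalate)

-- Hypothetical classification of one difference between two releases.
data Change
  = Removed String      -- an exported entity disappeared
  | TypeChanged String  -- an exported entity changed its type
  | Added String        -- a new entity was exported
  deriving (Eq, Show)

-- Ordered from least to most severe, so 'max' picks the strongest requirement.
data Bump = Patch | Minor | Major
  deriving (Eq, Ord, Show)

-- The bump a single change requires (assumed reading of the PVP).
required :: Change -> Bump
required (Removed _)     = Major
required (TypeChanged _) = Major
required (Added _)       = Minor

-- The bump a whole set of changes requires; no API change means Patch.
requiredBump :: [Change] -> Bump
requiredBump = foldr (max . required) Patch

-- Apply a bump to a version A.B.C.D, padding missing components with 0.
nextVersion :: [Int] -> Bump -> [Int]
nextVersion v bump =
  case take 4 (v ++ repeat 0) of
    [a, b, c, d] -> case bump of
      Major -> [a, b + 1, 0]
      Minor -> [a, b, c + 1]
      Patch -> [a, b, c, d + 1]
    _ -> v  -- unreachable: the padded list always has four elements

showVersion :: [Int] -> String
showVersion = intercalate "." . map show

main :: IO ()
main = do
  let changes = [Added "newHelper", Removed "oldHelper"]
      bump    = requiredBump changes
  putStrLn (show bump ++ " -> " ++ showVersion (nextVersion [1, 2, 3] bump))
  -- prints "Major -> 1.3.0"
```

The hard part of the project is computing the list of changes from two versions of a package; the arithmetic above is the easy last step.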
Cabal test support
Proper test support is essential for good software quality. By improving Cabal’s test support we can test all Cabal packages on continuous build machines, which should help us detect breakages earlier. Making it easier to run the tests means that more people will run them, and those who already do will run them more often.
Fast text/bytestring HTML combinators
We have Data.Binary for fast serialization of data structures to byte strings to be sent over the wire. High performance web servers need fast HTML generation too, and an approach based on Text.PrettyPrint combinators for filling unicode-friendly Data.Text buffers would be a killer app for web content generation in Haskell. This might mean working on BlazeHTML.
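As a rough illustration of what combinators over a text buffer could look like, here is a small sketch built on the `Builder` type from the text package. The names `Html`, `text`, `element` and `render` are made up for this example and are not the BlazeHTML API; it also ignores attributes and says nothing about performance, which is the actual point of the project.

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Builder (Builder, fromString, singleton, toLazyText)

-- Hypothetical representation: an HTML fragment is just a text builder.
type Html = Builder

-- Escape the characters that are special in HTML text content.
escapeChar :: Char -> Builder
escapeChar '<' = fromString "&lt;"
escapeChar '>' = fromString "&gt;"
escapeChar '&' = fromString "&amp;"
escapeChar c   = singleton c

-- Plain text content, escaped character by character.
text :: T.Text -> Html
text = T.foldr (\c rest -> escapeChar c <> rest) mempty

-- An element with the given tag name wrapped around its children.
element :: String -> [Html] -> Html
element name children =
  fromString ("<" ++ name ++ ">")
    <> mconcat children
    <> fromString ("</" ++ name ++ ">")

-- Run the builder once, at the end, to produce lazy Text.
render :: Html -> TL.Text
render = toLazyText

main :: IO ()
main = putStrLn (TL.unpack (render page))
  where
    page = element "p" [text (T.pack "1 < 2 & 3"), element "b" [text (T.pack "bold")]]
    -- prints "<p>1 &lt; 2 &amp; 3<b>bold</b></p>"
```

Because fragments are combined with `<>` and only rendered once, the output buffer is filled in a single pass, which is the property a fast library would want to exploit.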
ThreadScope with custom probes
ThreadScope is an amazing new tool in the Haskell universe for monitoring executing Haskell processes. It reveals detailed information about thread and GC performance. We’d like to extend the tool with support for new kinds of event hooks. Examples would be watching for MVar locks, STM contention, IO events, and more.
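For a flavour of what an MVar probe might record, here is a user-level approximation, assuming a GHC whose base library provides `Debug.Trace.traceEventIO` for writing custom messages to the event log. The wrapper `takeMVarTraced` is a hypothetical helper, not an existing API, and the project itself is about hooks in the runtime and in ThreadScope rather than hand-written wrappers like this one.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, putMVar, takeMVar)
import Debug.Trace (traceEventIO)

-- Hypothetical wrapper: record an event before and after blocking on an MVar.
-- The messages only reach the event log when the program is built with
-- event logging support and run with it enabled (e.g. +RTS -l).
takeMVarTraced :: String -> MVar a -> IO a
takeMVarTraced label var = do
  traceEventIO ("MVar wait: " ++ label)
  x <- takeMVar var
  traceEventIO ("MVar acquired: " ++ label)
  return x

main :: IO ()
main = do
  lock <- newMVar (0 :: Int)
  n <- takeMVarTraced "counter" lock
  putMVar lock (n + 1)
  print =<< takeMVar lock
  -- prints 1
```

Built-in probes would make this unnecessary: the runtime would emit the events itself, and the tool would know how to display them.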
Combine ThreadScope with Heap Profiling Tools
ThreadScope lets us monitor thread execution. The Haskell Heap Profiler lets us monitor the Haskell heap live. HPC lets us monitor which code is executing and when. These should all be in an integrated tool for monitoring executing Haskell processes.
LLVM Performance Study
GHC has an LLVM backend. The next step is to look closely at the kind of code we’re generating to LLVM, and the optimizations LLVM performs on GHC’s code, in order to further improve performance of Haskell code.
LLVM Cross Compiler
LLVM has support for many new backends, such as ARM. The challenge is to use this ability to generate native code for other architectures to turn GHC into a cross-compiler (so we could produce, e.g. ARM executables on an x86/Linux box). This will involve linker and build system hacking.
Hackage 2.0 Web Services
Hackage is the central repository for Haskell code. It hosts around 2000 libraries, and is growing rapidly. It can be hard to determine which packages to use. We believe social mechanisms (comments, voting, …) can be very successful in helping to both improve the quality of Hackage, and make it easier for developers to know which library to use. This project would bring Hackage 2.0 to a deployable state, and then consider better interfaces to search and sort packages.
These are the 8 projects we felt were the most important to the community. What do you think? Are there other key projects that need to be done, that will benefit large parts of the community, or enable the use of Haskell in new areas of importance?
14 thoughts on “The 8 Most Important Haskell.org GSoC Projects”
Err, we *need* a robust Haskell debugger. We have done for many years, but this is *the* biggest gateway to productivity we have here.
At KU, we are working on a debugger that will focus on helping with Kansas Lava programs. But there is lots to do.
The debugger project might be too unconstrained at this point for a summer of code task. We did get the GHCi debugger via a GSoC project, but this is probably too ambitious for GSoC unless you already have a very clear design.
I think an LLVM cross compiler is one step (admittedly, a small one) in the direction of Haskell on mobile devices. The market for PLs on mobile devices is very fragmented: every platform has its own language. Haskell could be the way to compile apps for every mobile device.
We could also use a GUI toolkit binding that’s easy to set up on all major platforms. I think the current situation could be considerably improved in three person months.
Thanks Don. When debugging imperative programs written in Haskell, we’ve got a good handle on that. But for real functional programs, debugging is just hard.
This month, I’ve hit <<loop>>, and ‘head []’ many times. I needed to pull out hpc-strobe to help debug it, which is just wrong!
Are the proposals still open? I could propose a small tracing debugger, that just tells you where you’ve just been? That would help.
Yes, proposals are open. Log in to the trac and open a new proposal ticket here: http://hackage.haskell.org/trac/summer-of-code/newticket
At least one student who has more than passing familiarity with LLVM, (Csaba Hruska, IIRC), has proposed tackling writing LLVM optimization passes in Haskell itself.
And another, Alp Mestanogullari, has been talking to David Terei about ways to speed up the execution of the LLVM code gen.
Overall, I think there are a lot of good candidate projects for this year’s summer of code.
Andy: For now I just use ndm’s Safe religiously and import prelude without the unsafe functions entirely. 90% of the time I’ll get pushed into handling the failing case correctly, and when I’m convinced that it should “never” happen, then the custom error from headMsg points me where I want to go.
On the main topic, Cabal test would be awesome. Especially since it would let us solve the notorious quickcheck dependency problem and also maybe the hiding vs. exposure dilemma in a uniform way.
Hackage 2.0 would similarly be super important. There’s a host of issues with the growing ecosystem, and we need to sort them out. The ticket seems fairly modest, but I guess until we get that far, then all the more blue-sky features like tagging, community feedback, etc. aren’t even on the table.
I’ve actually been considering porting pretty (or another pretty-printing library) to using Text for use with graphviz, so that I can control which encoding is being used, etc.
As for the test project: I doubt this will magically solve the QuickCheck dependency issue: if developers can’t use a compile-time flag to bring in optional usage of QC for testing, then why will they do so just for a specified Cabal option?
I miss the Buddha debugger. With it, you just called the buggy function on inputs known to cause the bad behavior. It then refined the input to determine what functions your original function would call, and with what arguments. For each of these, it would ask you if the input/output pair was valid. If not, it would explore that child function further.
It was sort of a tree search on your function call graph. It no longer works on GHC, and hasn’t for ages.
It really showed how referential transparency made coding easier. It was, in fact, the best debugger I’ve used in any development environment for any language.
If people aren’t finding the GHCi debugger is up to the job, maybe a SoC project to address the most important issues there would be a good idea?
On the topic of ThreadScope, there are some more ideas in the ThreadScope TODO list:
A student has proposed marrying Haddock with Pandoc to create something similar to Python’s documentation system Sphinx. The idea is to be able to write general documentation, and not just API documentation, and to be able to use e.g. Markdown.
This could help us write friendlier, more example- and tutorial-driven documentation tightly integrated with API documentation.
If someone would like to mentor this, the student’s email address is firstname.lastname@example.org.
As the maintainer of Haddock I’d be able to help out, but not be the primary mentor since I may be too busy this summer.
Hackage 2.0 is a brilliant idea. I want to be able to search Hackage with Hoogle’s interface, including searching for packages that export a function with the type signature I search for.