Feb
2013

Scalatra 2.2 Released

Posted by Ivan Porto Carrero

Last week we released Scalatra 2.2. This is our biggest release so far and introduces a bunch of exciting features, like commands for handling input and atmosphere for websockets and comet. It has a much deeper swagger integration now, and that API has been completely upgraded. In short, this scalatra version fixes most of the big problems we were aware of. Probably one of the nastiest of those problems was the fact that we were using thread-locals to store the request and response: when you then use a future or something similar, the request is no longer available by the time it executes. Let’s walk through some of these changes.

Working around thread-locals.

In a previous version we had migrated all our internal state to either the servlet context attributes or the request attributes, depending on their scope. In this release we made everything that accesses the request or response take them as implicit parameters. For people overriding our methods this is a breaking change, but it is easily fixed by adding the parameters to your method override. We also added an AsyncResult construct whose only purpose is to help you not close over thread-locals.

So what exactly is the problem?

// All of these will fail
get("/things/:id") {
  Future {
    params("id")  // throws NPE because request is not available when the future executes
  }
}

post("/things") {
  myThingsActor ? Post(parsedBody.extract[Things]) map { things =>
    if (things.isEmpty) status = 404 // throws NPE because response is not available when the future executes
    ()
  }
}

// assuming scentry is mixed in and user is something stored on the request or in cookies or something
get("/stuff/:id") {
  val stuff: Future[Stuff] = getStuff(params("id"))
  // everything is still fine
  stuff map { allTheThings =>
    getTrinketsForUser(allTheThings, user)  // throws NPE because request is not available when the future executes
  }
}

And since this is something we absolutely had to fix, we had to introduce some breaking changes but they really were for the better. Currently there are 2 ways to get around it: bring request/response into your action in implicit vals or use the AsyncResult trait to do this for you.

Let’s rewrite the broken examples in terms of the first workaround:

// using plain futures
get("/things/:id") {
  implicit val request = this.request
  Future {
    params("id")  // no more NPE
  }
}

post("/things") {
  implicit val response = this.response
  myThingsActor ? Post(parsedBody.extract[Things]) map { things =>
    if (things.isEmpty) status = 404 // no more NPE
    ()
  }
}

// assuming scentry is mixed in and user is something stored on the request or in cookies or something
get("/stuff/:id") {
  implicit val request = this.request
  implicit val response = this.response
  val stuff: Future[Stuff] = getStuff(params("id"))

  stuff map { allTheThings =>
    getTrinketsForUser(allTheThings, user)  // no more NPE
  }
}
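Why copying the request into a val helps can be reproduced outside of Scalatra. The following is a minimal sketch, not Scalatra’s code: `ThreadLocalSketch` and `currentRequest` are hypothetical names, and a plain `java.lang.ThreadLocal` stands in for the per-request state. A value read inside the future body is looked up on a pool thread that never saw it, while a val captured beforehand travels with the closure.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ThreadLocalSketch {
  // Hypothetical stand-in for the request; it is null on any thread that never called set.
  private val currentRequest = new ThreadLocal[String]

  // Reads the thread-local inside the future body, i.e. on a pool thread.
  def readInsideFuture(): Option[String] = {
    currentRequest.set("GET /things/1")
    val result = Future { Option(currentRequest.get) }
    Await.result(result, 5.seconds)
  }

  // Copies the value into a val on the calling thread; the closure captures that val.
  def captureBeforeFuture(): Option[String] = {
    currentRequest.set("GET /things/1")
    val request = currentRequest.get
    val result = Future { Option(request) }
    Await.result(result, 5.seconds)
  }

  def main(args: Array[String]): Unit = {
    println(readInsideFuture())    // None
    println(captureBeforeFuture()) // Some(GET /things/1)
  }
}
```

The `implicit val request = this.request` line in the examples above plays the role of `captureBeforeFuture` here: the lookup happens while the request thread is still in charge.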

With the AsyncResult you get another chance to add some default context to your async operations, but other than that it works very similarly.

// Using async result
get("/things/:id") {
  new AsyncResult { val is =
    Future {
      params("id")  // no more NPE
    }
  }
}

post("/things") {
  new AsyncResult { val is =
    myThingsActor ? Post(parsedBody.extract[Things]) map { things =>
      if (things.isEmpty) status = 404 // no more NPE
      ()
    }
  }
}

// assuming scentry is mixed in and user is something stored on the request or in cookies or something
get("/stuff/:id") {
  new AsyncResult { val is = {
    val stuff: Future[Stuff] = getStuff(params("id"))

    stuff map { allTheThings =>
      getTrinketsForUser(allTheThings, user)  // no more NPE
    }
  } }
}

The AsyncResult has an implicit parameter of ScalatraContext and every ScalatraBase has an implicit conversion to a ScalatraContext, so the request and response are now stable values and no longer stuck in thread-locals. With that bug out of the way, if you’re a swagger user then the next examples are for you.

New swagger API

In the previous version of scalatra we introduced swagger support. While the API we introduced then worked, it ended up being very messy and was error-prone since most of it used strings. At wordnik we started using scalatra and one of my co-workers, who had just started learning scalatra, remarked: “Swagger makes Scalatra ugly.” Clearly something had to be done about this! This release tries to fix some of that by using as much information from the context as it possibly can and by defining a fluent api for describing swagger operations.

There are no more strings except for things that are notes, descriptions, names etc. It integrates with scalatra’s commands so you only define the parameters for a request once. It automatically registers models when you provide them and it converts the scalatra route matcher to a swagger path string. Let’s take a look at a before and after:

This is how it used to be:

// declare the models
models = Map("Pet" -> classOf[Pet])

// declare the route
get("/findByStatus",
  summary("Finds Pets by status"),
  nickname("findPetsByStatus"),
  responseClass("List[Pet]"),
  endpoint("findByStatus"),
  notes("Multiple status values can be provided with comma separated strings"),
  parameters(
    Parameter("status",
      "Status values that need to be considered for filter",
      DataType.String,
      paramType = ParamType.Query,
      defaultValue = Some("available"),
      allowableValues = AllowableValues("available", "pending", "sold")))) {
  data.findPetsByStatus(params("status"))
}

This is what it is now:

// declare the swagger operation description
val findByStatus =
  (apiOperation[List[Pet]]("findPetsByStatus")
    summary "Finds Pets by status"
    notes "Multiple status values can be provided with comma separated strings"
    parameter (queryParam[String]("status").required  // required is the default value so not strictly necessary
                description "Status values that need to be considered for filter"
                defaultValue "available"
                allowableValues ("available", "pending", "sold")))

// declare the route with the swagger annotation
get("/findByStatus", operation(findByStatus)) {
  data.findPetsByStatus(params("status"))
}

So there is no more endpoint declaration necessary, you work with actual types and you don’t have to remember to register models and all their referenced models anymore.

Let me know what you think.

Oct
2012

Json4s: One AST to Rule Them All.

Posted by Ivan Porto Carrero

It seems to me that every web framework in Scala, of which we have plenty, also insists on writing its own JSON library. But various tools rely on the AST from lift-json. This is both good and bad, as there were a number of gaps in the lift-json version. The most notable is that it does not support any type other than Double for representing decimal numbers. A second problem is that because of the large number of dependencies in the lift project it typically is a bottleneck for upgrading to scala-2.10. And thirdly it just seems an odd place for a nice library like that to live. I’m hopeful that more libraries like Play, Spray etc will contribute to this project so that instead of a fragmented json landscape we get a homogeneous one. All of them have a json ast with the same types defined in it.

So I’ve set out to set lift-json free from the lift project and tried to add some improvements along the way. The first improvement I’ve made is that you now have the choice between using BigDecimal or Double for representing decimal numbers, so you can use the library also to represent invoices etc. The second change I made is to add several backends for parsing. The original lift-json parser is still available in the native package but you can now also use jackson as a parsing backend to the json library. It’s really easy to add more backends so smarterjson, spray-json etc are all in the cards.
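The reason the BigDecimal option matters for things like invoices can be shown with the standard library alone. This is a small sketch that does not use json4s at all; `DecimalSketch` is a hypothetical name, and the point is only that Double arithmetic on decimal fractions drifts where BigDecimal does not.

```scala
object DecimalSketch {
  // Binary floating point cannot represent 0.1 or 0.2 exactly, so the sum is slightly off.
  val doubleTotal: Double = 0.1 + 0.2

  // BigDecimal built from strings keeps the decimal digits exactly as written.
  val decimalTotal: BigDecimal = BigDecimal("0.1") + BigDecimal("0.2")

  def main(args: Array[String]): Unit = {
    println(doubleTotal == 0.3)                // false
    println(decimalTotal == BigDecimal("0.3")) // true
  }
}
```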

I took a look at what play2 has in their json support and it looks like their main thing is a type class based reader/writer story instead of the formats based one from lift-json, so for good measure I also added that system to this library. In general I like typeclasses for this type of stuff but in this case I actually think that lift-json has a nicer approach by assembling all the context into a single formats object instead of requiring many type classes to be in scope at any given time when you want to parse or write json.

There are a few more convenience methods added on the JValue type that allow you to camelize or underscore keys, remove the nulls etc. I spoke with Joni Freeman about the general idea of this library and he showed me what he did on a branch for lift-json 3.0, so I incorporated his work into json4s too. It basically means that a key/value pair used to represent a JObject is no longer a valid json AST node and there are some extra methods to work with those fields. All of this is explained in the README of the json4s project.
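To make “camelize or underscore keys” concrete, here is a rough sketch of the two key transformations on plain strings. It is not the json4s implementation: `KeyCaseSketch`, `camelize` and `underscore` are hypothetical helpers written for illustration, and the real methods work on a whole JValue rather than on a single key.

```scala
object KeyCaseSketch {
  // first_name -> firstName: drop the underscores and capitalize every part but the first.
  def camelize(key: String): String = {
    val parts = key.split("_").filter(_.nonEmpty)
    if (parts.isEmpty) key
    else parts.head + parts.tail.map(_.capitalize).mkString
  }

  // firstName -> first_name: put an underscore before each inner capital, then lowercase.
  def underscore(key: String): String =
    key.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase

  def main(args: Array[String]): Unit = {
    println(camelize("first_name"))  // firstName
    println(underscore("firstName")) // first_name
  }
}
```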

I’ve also added support for json4s to the dispatch reboot project so you can use it there just like you can with lift-json. Furthermore Rose Toomey let me know that salat is now using json4s as json library instead of lift-json.

One of the improvements I still want to make is for scala 2.10: I want to use the Mirror api to be able to reflect over more than just case classes. For some of the use cases we have where I work it makes sense to be able to use a few annotations (yes, annotations unfortunately) to have certain keys be ignored and so on. I’ll probably steal some of that from the Salat project so that there is still some degree of consistency between our libraries.

I also want to figure out how we can possibly make the AST based approach useful with huge data structures, as it’s not inconceivable to want to send 100MB or 10GB json docs over the wire. At that point a lazy approach actually makes a lot of sense. I’m open to suggestions on how this could be achieved efficiently without breaking the AST model.

So if you’re using json in scala consider using or contributing to the json4s project.

Sep
2012

Typeclass Based Databinding for Scalatra

Posted by Ivan Porto Carrero

One of the big new features for scalatra in the next release will be databinding. We chose to use a command approach which includes validation. It is built on top of scalaz 6.0.4 at the moment; as soon as scalaz7 is released we’ll probably migrate to that.

One of the tenets of scalatra is that it’s a mutable interface, the request and response of servlets are both mutable classes and we depend on those. I also wanted for the commands to be defined with vals that you can address later. The core functionality of the bindings is actually immutable but to the consumer of the library it pretends to be a mutable interface. Perhaps those properties will make people frown, perhaps not.

What does it do?

It provides a way to represent input forms with validation attached to the fields, maybe a code sample will be a better explanation.

import org.json4s._
import org.json4s.jackson._
import org.scalatra.databinding._

class LoginForm extends JacksonCommand {
  val login: Field[String] = asString("login").notBlank
  val password: Field[String] = asString("password").notBlank
}

This interface relies on a bunch of implicit conversions being in scope. To be able to present this mutable interface you need to put the Field[T] type annotation on the val declaration, otherwise things won’t work as you’d expect them to.

Then in the scalatra application you can do

import scalaz._
import Scalaz._

class LoginApp extends ScalatraServlet with JacksonJsonSupport with JacksonParsing {
  post("/login") {
    val cmd = command[LoginForm]  // the command class declared above
    if (cmd.isValid) {
      users.login(~cmd.login.value, ~cmd.password.value)
    } else {
      halt(401, "invalid username/password")
    }
  }
}

This will work with data provided as form params in a http post, as json or as xml. Of course the example above is a simple one and there are many other properties you can interrogate. At this moment the databinding is a bit light on documentation, as in there is none. If you’d like to take it for a spin feel free to ping us in the #scalatra irc room.

What’s next?

At some point in the future I’d like to turn a command into a monoid so that we have a map method on it, but for now I need to move on to another part of scalatra.

Apr
2012

Searchable Sbt Wiki

Posted by Ivan Porto Carrero

Searchable sbt wiki published

Last week I went to Scala Days and had a blast. I got to meet many of the people whose software I use every day.
Most Scala developers I know of use sbt to build their scala projects. Sbt has a great wiki with loads of information, but it is hosted on github and they, unfortunately, have no search box on their wikis that will search just the wiki. This leads to long searches on the sbt wiki, grepping the checked out git repo, or worse. However, the gollum software which runs the github wikis does have a search box, so I decided to host the sbt wiki as a read-only version somewhere else.

The same server that hosts our jenkins frontend is now also hosting the sbt wiki and the main feature is a search box. It’s running a unicorn server for rack fronted by nginx for ssl termination.

Enjoy the new sbt wiki.

Aug
2011

Work for Backchat

Posted by Ivan Porto Carrero

Backchat.IO is hiring!

BackChat.io is easiest to describe as grep for real-time data. We’re busy developing a platform that makes sense of the data you send it (in real-time).

For the social web that means that you can start following certain users on a network or define a search term and you get those in a unified activity streams format.

But for something more arcane, like say, log data, that means you can send it lines of text and have it extract out fields on which you can define filters. All of this happens in real-time and changes are reflected virtually immediately in open connections.

Sep
2010

Radical Language and Platform Shift

Posted by Ivan Porto Carrero

I realize it’s been a really long time since I put something on this blog. And for those of you expecting more (iron)ruby posts, I’m going to have to disappoint you. This is mostly a braindump of what I’ve been working out over the last months.

A few months ago Adam Burmister, who I met at Xero, and I got incubated by O2 to do a project that allowed me to push the boundaries, and it required me to look outside what I already knew. I had to go look for a new way of approaching problems. That isn’t to say the problems couldn’t be solved with a language like C# or Ruby, but the solution would have been pretty much sub-par. In this quest for the best way to approach that problem it turned out that Scala was the language that hit a sweet spot for me.

We needed something that resembled Erlang; and while I did my best to really get into Erlang I never could (this could possibly be because of the eye bleeds because the language is just so friggin’ ugly). So it turns out there is a design pattern called ActiveObject which is at the core the same as an erlang node (it’s not, I know, a node has many other properties). We also needed to be able to process humongous volumes of data (terabytes worth), and at this point Ruby is out of the picture. I’m sure I will upset many fanboys but face it: ruby is slooooooooooooooow and advances slowly. By the time you can properly program distributed systems in ruby the way I had in mind I’ll be a great grand dad and have a long and pointy grey beard. .NET is lacking libraries and the ruby libraries often are good enough, but good enough never is. And since I got tired of patching every single bloody library I touched I vowed to steer clear of ruby. We still use ruby but we use it for what it’s great at: system automation scripts and sometimes quick prototyping.

We basically needed hadoop, and hadoop is a java project (I’ll return to why not java and C# a little bit later). So once I entered the Java domain a new world opened up for me (old for most other people, I realise that). Java has what can only be described as a SHITLOAD of great quality libraries. It’s just a pity that Java, like C#, suffers from what I call programmer masturbation. I’ve certainly been guilty of that and during my time at Xero they suffered the brunt of it (sorry guys). So let’s return to those problems.

You read a book, nicknamed Gang of Four, which then becomes “the bible”. It has this thing called design patterns and they need to be applied wherever you can. I’ll let you in on a little secret: they do next to nothing to make your code more maintainable (quite the opposite in fact) and definitely don’t make it more understandable unless the next guy also knows “the bible”. If he doesn’t he’s a fucking idiot, everybody knows those design patterns. The thing that doesn’t jive is: how is writing more code making your code more maintainable as you have to maintain more code (did I mention more code in this sentence?). The one thing ubiquitous use of design patterns from “the bible” does do is give you some job security. Pythonistas shun design patterns and if not you should apparently. Having programmed in many languages I tend to agree with the conclusion that having to use crutches like design patterns (I should really make this into a factory or manager of some sort) actually means your language is flawed.

I still need to meet the person that can actually prove to me that your code is more maintainable than code that follows the following simple rules:

* Don’t Repeat Yourself
* Don’t write what you don’t need right now
* Write a couple of tests
* Generalize as if there was no tomorrow.
* Write as little and as simple code as humanly possible (this kills double dispatch and the visitor pattern)
* Remember that you (as do I) have a bird brain, you will have forgotten what you did 2 weeks ago, let alone 6 months from now, so it needs to be understandable by the biggest idiot on the planet, namely the author of the code (in my case me).

I don’t want to write a post on why I left .NET but it’s inevitable to mention it. I used to think .NET was the greatest thing since sliced bread and I still think it’s a really cool piece of technology. However there is only a small minority of .NET programmers I actually get along with so some of the remarks I’m going to make are not directed at those people. I have felt unhappy about the way .NET was evolving around the same time microsoft introduced the class designer tool. Don’t even get me started on people advocating UML because that belongs in the same classification, a vertical one. Once Oslo got introduced or is it M I wanted to get out as quickly as possible. I happen to like writing code, if I wanted to drag and drop boxes and connect them with fancy lines I would have gone for a designer career.

.NET also suffers from another problem, whatever the all-knowing company produces is innovation created by them (never mind if some of those things have been around for more than 20 years). And most developers on .NET suffer from that phenomenon that can only be called Stockholm Syndrome. It is mind boggling to me that you want to use tools you know suck, they don’t make you do a better job faster in fact once you move past the hello world example they fall apart really quickly, not to mention having to debug a problem and submitting a bug (which will then be bounced back as by design).

Enough slamming on .NET, let’s return to Java. Stephen Colebourne goes “the next big JVM language is Java, but this time done right”??? One of his arguments is that 10.000.000 programmers world wide can’t be wrong. I happen to think that 9,9 million of those programmers mostly like to get paid; it has little or nothing to do with whether it’s a great language or not. It’s certainly easier than C and definitely C++, ask Bjarne. Most of the java code I read makes me sick to my stomach; the boilerplate needed (the next example is in C#, the java one would most def be longer: fizzbuzz enterprise edition) is atrocious. Java date arithmetic (I know about joda-time) is an absolute nightmare. The fact that you need to write at least 6 lines of code (not counting import statements and the main definition) to be able to read input from the keyboard and print it out just amplifies my point.

So no ruby, no .NET, no java. What’s next? There is this cool thing people keep talking about: node.js. It’s crazy fast (if you compare it with languages like ruby and run the correct hello world benchmarks). However, the libraries are subpar at best and generally feel like they’ve been written by very young programmers (with the odd exception of course) who have little or no clue about what’s going on outside of their blog or what their “gods” are saying. I’m sure it has a place and I’ve given it more than an honest chance, but at the end of the day it would have required a big investment to write all kinds of things that just aren’t there (yet).

But you know it’s event driven and asynchronous and that’s why it’s fast and only non-blocking IO is the right way to go because using blocking IO is slow. Ok now you got me, you’re right but also wrong. It depends on your use case and how you work with blocking IO. We’ve come to go by this simple rule: if you need many short-lived connections (like in say HTTP) then non blocking io is indeed better, however long lasting connections may benefit from blocking IO, because the throughput is a lot higher (although it’s not quite as black and white as that).

So back to “we want erlang but without the bleeding eyes”: enter scala + akka. Boy was I a happy camper when I started reading their docs. An open-source project, written in this language called scala, that solves the same problems as Erlang, only this language is beautiful, yes I’ll repeat that, beauuuutifuuuulll. Scala gives me what ruby was never able to give me: a fast, pretty language that supports multiple paradigms with a strong nudge towards functional programming. It can be run on .NET as well as on the JVM, meaning we didn’t have to forego the much needed libraries. And the libraries that are available are in a totally different league than those dinky toys node.js and ruby have to offer. It’s like comparing the majors to the minors I guess.

The downside is that we do need core i7 machines to get any decent compile times out of the thing, and IDE support (while it gets better steadily) is still behind other languages. If you’re wondering about LOC count vs ruby I think they’re about even once you know what you’re doing. Scala is not an easy language but it’s heaps of fun to work with and I’m glad I get to use it the next couple of years. If you’re looking for an acceptable alternative on .NET that is supported by the all-knowing hugely innovative company you should look at F#.

As an aside the next time somebody mentions enterprise ready as baked in to me; they will get a rope, chair and nail it’s quicker and less painful.

There the rant is over, I feel a lot better now. I already know I’m an idiot so tell me something new.

Mar
2010

Video of FOSDEM IronRuby Presentation

Posted by Ivan Porto Carrero

At FOSDEM 2010 I got the chance to talk in the mono dev room about ironruby.

In this talk I extended the banshee application to work with IronRuby based plugins.

So without further ado here’s the link: IronRuby: The .NET and Ruby love child

Dec
2009

Adding a Console to Your IronRuby Application

Posted by Ivan Porto Carrero

When building an application it might be very handy to have a REPL console that knows about the libraries of your application, but you don’t necessarily want to start your application to interact with it. In Rails they have a script/console command. Here’s how you create one that knows about ironruby. The example I’m going to use is taken from an IronRubyMVC application.

I started out by creating a folder script.

Then I created a file called console (on a unix system I would chmod +x this file). I also like to have completion in my console so I’ve added the irb/completion library. Then I’ll require the routes.rb file so that the libraries of my application get loaded.

console script

#!/usr/bin/env ir
# File: script/console
irb = ENV['OS'] =~ /^Windows/iu ? 'CALL iirb.bat' : 'iirb'

libs =  " -r irb/completion"
# Perhaps use a console_lib to store any extra methods I may want available in the console
libs <<  " -r #{File.dirname(__FILE__) + '/../routes'}"
puts "Loading Poll chat"
exec "#{irb} #{libs} --simple-prompt"

The 3rd line has CALL iirb.bat as a command on a windows system. The CALL command is needed for the next step because we’re going to execute a batch file from another batch file on windows. Otherwise it wouldn’t work for me. CALL is very similar to exec in ruby and gives control to another executable until its task is done.

For windows to be able to use script/console (script\console) instead of ir script/console you also need to create a batch file called console.bat in the script folder.

console.bat script

@ECHO OFF
IF NOT "%~f0" == "~f0" GOTO :WinNT
@"ir.exe" "script/console" %1 %2 %3 %4 %5 %6 %7 %8 %9
GOTO :EOF
:WinNT
@"ir.exe" "%~dpn0" %*

This is all there is to it to get rails like scripting abilities.

IronRuby has another really cool feature built in: the ability to provide REPLs for your application at run-time. All you need to do is use Repl.new and give it an output and input stream.

Nov
2009

A Good Url Regular Expression? (Repost)

Posted by Ivan Porto Carrero

I’m moving this post from http://geekswithblogs.net/casualjim/archive/2005/12/01/61722.aspx

I started out blogging on geeks with blogs but I can’t allow comments there anymore or I get too much spam, so I’m moving the post from there to this place. Various people have contributed through the comments in the other blog post. So here I have better control over the spam and can open the comments again.

I have been looking for a good first layer of validating an url to see if it is valid.

For checking the format of the url it seems to me that the most logical approach is to use regular expressions. Up until now I always discarded them as being too “geeky”, meaning I don’t consider it my life’s biggest goal to be typing (/?[]\w) all day long (so why did I become a programmer, aaaah yes, to make life easier for other people).

Anyway.. I wanted to find a good regular expression that validates urls, not url domains: one that doesn’t allow spaces in the domain name and where the domain can be suffixed with the port number. Also I need support for the ~/ paths.

This is what I came up with.. if somebody has a better idea… or finds a mistake please let me know.. Always happy to learn something new.

^(((ht|f)tps?\:\/\/)|~/|/)?([a-zA-Z]{1}([\w\-]+\.)+([\w]{2,5})(:[\d]{1,5})?)/?(\w+\.[\w]{3,4})?((\?\w+=\w+)?(&\w+=\w+)*)?

I was a bit quick in using this regex. Simeon Pilgrim indicated that the ftp urls won’t validate when you add a username and a password.

I don’t really need to validate ftp so I should have removed the ftp protocol from the list of choices. I need this just to validate urls for weblinks and the link element in an rss feed. When I need them for ftp I will post the ftp version.. but for now I don’t have time to spend on elaborating the regex.

Anyway here is the right one :

^(http(s?)\:\/\/|~/|/)?([a-zA-Z]{1}([\w\-]+\.)+([\w]{2,5}))(:[\d]{1,5})?/?(\w+\.[\w]{3,4})?((\?\w+=\w+)?(&\w+=\w+)*)?

A full url validation would include resolving names through dns or making a webrequest to the provided url to see if we get a 200 response. The only way to be sure is to test if it is there in my opinion.

Thanks Simeon.

And for those who really want the ftp validation :

^((ht|f)tp(s?)\:\/\/|~/|/)?([\w]+:\w+@)?([a-zA-Z]{1}([\w\-]+\.)+([\w]{2,5}))(:[\d]{1,5})?/?(\w+\.[\w]{3,4})?((\?\w+=\w+)?(&\w+=\w+)*)?

I am not sure about numbers in the username but I believe you can have a username of numbers alone.

Comments don’t seem to work on this blog engine.. so just send me a mail through the contact form. thanks

Two days later …

I discovered there is still a problem with my regular expressions… folders don’t get parsed.

I’ve solved the path issue, so now it should be finding all urls.

Expression:

^((ht|f)tp(s?)\:\/\/|~/|/)?([\w]+:\w+@)?([a-zA-Z]{1}([\w\-]+\.)+([\w]{2,5}))(:[\d]{1,5})?((/?\w+/)+|/?)(\w+\.[\w]{3,4})?((\?\w+=\w+)?(&\w+=\w+)*)?

Should parse the url below

http://hh-1hallo.msn.blabla.com:80800/test/test/test.aspx?dd=dd&id=dki

But not :

http://hh-1hallo. msn.blablabla.com:80800/test/test.aspx?dd=dd&id=dki
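The two sample urls above can be checked against the expression with a few lines of Scala. This is only a sketch: `UrlRegexSketch` and `isValid` are hypothetical names, and it uses `matches()`, which requires the whole input to match, because the expression itself has no trailing `$` anchor.

```scala
object UrlRegexSketch {
  // The expression from this post, copied verbatim into a raw string.
  private val UrlPattern =
    """^((ht|f)tp(s?)\:\/\/|~/|/)?([\w]+:\w+@)?([a-zA-Z]{1}([\w\-]+\.)+([\w]{2,5}))(:[\d]{1,5})?((/?\w+/)+|/?)(\w+\.[\w]{3,4})?((\?\w+=\w+)?(&\w+=\w+)*)?""".r

  // matches() succeeds only if the entire url is consumed by the pattern.
  def isValid(url: String): Boolean =
    UrlPattern.pattern.matcher(url).matches()

  def main(args: Array[String]): Unit = {
    // The url that should parse.
    println(isValid("http://hh-1hallo.msn.blabla.com:80800/test/test/test.aspx?dd=dd&id=dki")) // true
    // The url with a space in the domain name.
    println(isValid("http://hh-1hallo. msn.blablabla.com:80800/test/test.aspx?dd=dd&id=dki")) // false
  }
}
```

Note that the port group only counts digits, so a value like 80800 passes even though it is not a usable port number.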

Update 29/11/2008:

Joe posted what seems to be a great regular expression in the comments

he tested it with the following urls:

http://www.google.com/search?q=good+url+regex&rls=com.microsoft:*&ie=UTF-8&oe=UTF-8&startIndex=&startPage=1

ftp://joe:password@ftp.filetransferprotocal.com

google.ru

https://some-url.com?query=&name=joe?filter=.#some_anchor

Expression:

^(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~/|/)?(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|/)+|\?|#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?$

Update 8/11/2009:

Expression:

^(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)?(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?$

There is a wave for this regex:

https://wave.google.com/wave/?pli=1#restored:wave:googlewave.com!w%252BsFbGJUukA

Update 29/09/2010

So people, if you don’t like it, don’t use it. This regex is troubled and has a bunch of issues, but it works most of the time. If you want a more liberal regular expression to just capture urls from text, there is a really good one on the blog of John Gruber. Improved regex for matching urls @ daring fireball

Oct
2009

Creating Launcher Scripts for IronRuby

Posted by Ivan Porto Carrero

It’s been a while since I blogged, I’ve been terribly busy going through some changes and prepping the book.

Anyway, lately a lot of blog posts have been written on how to use ironruby with cucumber to test your .NET code. While I think it’s great people are using ironruby and cucumber, the guide you can find on aslak’s github wiki isn’t the most ideal solution, as it will only work for windows and it requires MRI to be installed on your system. So I thought I’d write up how I’ve been creating launchers that work both on windows .NET and mono systems.

Another problem with the approach of setting the GEM_PATH to the MRI gem location is that if your gem requires a C-extension (which could easily be a C# extension in IronRuby) ruby will get confused about which one it’s going to need.

I’m going to use cucumber as an example but this counts for most ruby libraries. I’ve been using this for a few months already so it really doesn’t matter which version of IronRuby you’ve got installed. I’ve compiled a fresh version from github and deployed that to C:\ironruby on windows and added C:\ironruby\bin to my PATH environment variable. I installed my ironruby version on my *nix boxes in /usr/local/ironruby and added /usr/local/ironruby/bin, /usr/local/ironruby/silverlight/bin and /usr/local/ironruby/silverlight/scripts to my PATH environment variable.

1. install the gem: igem install rspec cucumber --no-rdoc --no-ri

This will install the rspec and cucumber gems with their dependencies. The gems process will also install the launcher scripts in C:\ironruby\lib\ironruby\gems\1.8\bin and we’re going to use those scripts to create our launcher script.

2. Get the launcher scripts into the bin dir

On windows you can now go:

copy C:\ironruby\lib\ironruby\gems\1.8\bin\cucumber C:\ironruby\bin\icucumber
copy C:\ironruby\lib\ironruby\gems\1.8\bin\cucumber.bat C:\ironruby\bin\icucumber.bat

NTFS supports symlinks so you could also use the junction tool from the sysinternals toolkit to create those instead of copying the files. http://technet.microsoft.com/en-us/sysinternals/bb896768.aspx

On *nix based systems there is one more step to go through.

cp /usr/local/ironruby/lib/IronRuby/gems/1.8/bin/cucumber /usr/local/ironruby/bin/icucumber
chmod +x /usr/local/ironruby/bin/icucumber

And this is the easier way to properly use installed gems from the ironruby distribution, it will also make it a lot easier to upgrade your gems in 2 different ruby installations at different times etc.
