    Use Single-Constructor Union Types in Elm to Prevent Invalid Data

    Enrico Buonanno

    In this lesson you will learn to use Elm's type and module system to make invalid data unrepresentable, a robust approach to ensure that your data is always consistent!

    You should know the basics of union types before tackling this content; they're explained in the lesson Define simple union types in Elm.

    You can follow along using any editor and Elm reactor; if you need help with the setup, watch Installing and setting up Elm.




    00:01 One of the nice things about Elm's type system is that it allows you to be very precise about how you model your data.

    00:09 For example, let's say that you're writing an application that sends some messages via email. You could model your email messages like this. You could have a type alias Message, and you could say that a message needs a recipient, which is a String, and a body, which is also a String.

    00:29 This approach works, but it has two drawbacks. Firstly, you're not really taking advantage of the types. Here, you have 2 properties that are strings, but if you have 10 properties, and they're all strings, then the types are not very descriptive.

    00:43 The second drawback, which is much more important, is that you can now create a message that is bogus. For example, you could create a message where the recipient is "Hello" and the body is also "Hello." If you say that this message is of type Message, then this still compiles and works fine.
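As a rough sketch of what has been described so far, the alias and a nonsensical message that still type-checks might look like this. The value name bogus is made up here for illustration; the lesson's own code is not shown in the transcript.

```elm
module Main exposing (..)


type alias Message =
    { recipient : String
    , body : String
    }


-- Hypothetical value: it type-checks even though "Hello"
-- is clearly not an email address.
bogus : Message
bogus =
    { recipient = "Hello"
    , body = "Hello"
    }
```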

    01:09 The first problem, we can address to a certain extent with type aliases. What we would like to say is that recipient cannot just be any string, but needs to be an email address. To do this, we would need another type alias, EmailAddress, and say that EmailAddress is a synonym for String.

    01:34 This also compiles. You can see that now, the declaration of the Message type is more explicit. We are stating our intention that recipient should be an email address, but on the other hand, we can still construct a message with a recipient that is "Hello", which is not an email address.

    01:55 This really is the nature of type aliases: from the point of view of a human, it's more readable, because you can say it's an email address, but from the point of view of the compiler, EmailAddress is really the same as String.
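A minimal sketch of the alias-based version, to show that the compiler still accepts a plain string as the recipient. As before, bogus is a hypothetical name used only for this illustration.

```elm
module Main exposing (..)


-- Only a synonym: the compiler treats EmailAddress and String as the same type.
type alias EmailAddress =
    String


type alias Message =
    { recipient : EmailAddress
    , body : String
    }


-- Still type-checks, because "Hello" is a String and so also an EmailAddress.
bogus : Message
bogus =
    { recipient = "Hello"
    , body = "Hello"
    }
```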

    02:09 If type aliases don't really give us more safety, we can use union types instead. Let EmailAddress be a union type with a single case. We call the constructor EmailAddress as well, and it has a String as its payload.

    02:29 As you can see, this no longer compiles, because when I construct a message here, I need to give it an EmailAddress, whereas now I'm giving it a String.

    02:41 Firstly, let me format this, and now let me fix the error by creating an EmailAddress. This compiles again. In a sense, this is even worse, because now I have a specific type that is supposed to represent email addresses, but I can build an EmailAddress with the string "Hello."
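The single-constructor union type at this stage could be sketched as follows. The comment marks the record that the compiler now rejects; the name bogus is again an assumption for illustration.

```elm
module Main exposing (..)


-- A union type with one case. The type and its constructor share a name.
type EmailAddress
    = EmailAddress String


type alias Message =
    { recipient : EmailAddress
    , body : String
    }


-- { recipient = "Hello", body = "Hello" } would now be a type error,
-- but wrapping the string in the constructor makes it compile again.
bogus : Message
bogus =
    { recipient = EmailAddress "Hello"
    , body = "Hello"
    }
```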

    03:02 Let's fix that. We have a function, validateAddress. It should take a string and somehow, it should return an email address. Of course, not all strings are valid email addresses. That's the whole point.

    03:19 Instead of returning an EmailAddress, we'll return a Result. In case of an error, we'll return a String with an error message. If it's successful, then we'll return an EmailAddress. Let me try to implement this.

    03:39 This will take a string. For now, I'm going to have a very naïve validation. If the string contains an @ then I'm going to assume that it's a valid email address, so I'm going to return Ok with an EmailAddress wrapping s. Otherwise, I'm going to return Err with the message, "Not a valid email address."
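Based on the description above, the naive validation function might look roughly like this. It is a sketch reconstructed from the transcript, not the lesson's exact code.

```elm
module Main exposing (..)


type EmailAddress
    = EmailAddress String


-- Naive check: any string containing "@" is accepted.
validateAddress : String -> Result String EmailAddress
validateAddress s =
    if String.contains "@" s then
        Ok (EmailAddress s)

    else
        Err "Not a valid email address"
```

Returning a Result forces every caller to handle the failure case before getting at the EmailAddress.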

    04:18 Of course, we could improve this function and make it more robust, but this is not the focus of this lesson. Instead, let's see how we could use this approach.

    04:26 Let's say we have some data of type String. Maybe we get this from a text box. Let's say the data is "Hello". In the main function, we want to display this data, but only if it's a valid email address.

    04:41 We could take our data and feed it to the validateAddress function. Remember, this will return a result with a string in the error case or an email address in the successful case.

    04:54 In the successful case, we can use Result.map and map the toString function. This will take the email address inside the Ok result and apply the toString function to it.

    05:09 To deal with the error case, we can give it a default value. We can say Result.withDefault, providing a default value of "invalid". Then all of this, we feed to the text function.

    05:29 Let me make this a bit bigger. Let me also import the h1 function. What I want to make is an h1 with an empty list of attributes and one child, which will be the text content. I'm going to call this whole thing content.

    05:56 My main function returns an h1 element with a single child that is a text node displaying whatever the content variable evaluates to. To show you that this works in the valid case, let me say someone@web.com and save.

    06:14 You see that, in this case, we get the default rendering that we get when we use the toString function with an email address.

    06:21 I'm not too happy with that rendering. I would just like to see the email address. Let me quickly define a function to do that. I'll call this emailToString. It will take an email address and return a string.

    06:39 If we look at the definition of EmailAddress, you can see that EmailAddress wraps a String. We need to somehow unwrap that. There's a very easy way to do this. I can say emailToString. I can say that this takes an EmailAddress with a string inside. What I want to return is that string inside.

    07:02 Instead of mapping the toString function, I will map emailToString. When I save, you can see that the rendering of a valid string has changed.
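Putting the pieces from the last few minutes together, the single-file program might look something like the sketch below. The names data and content follow the transcript; the exact layout of the lesson's file is an assumption.

```elm
module Main exposing (..)

import Html exposing (Html, h1, text)


type EmailAddress
    = EmailAddress String


validateAddress : String -> Result String EmailAddress
validateAddress s =
    if String.contains "@" s then
        Ok (EmailAddress s)

    else
        Err "Not a valid email address"


-- Unwrap the string by pattern matching on the single constructor.
emailToString : EmailAddress -> String
emailToString (EmailAddress s) =
    s


data : String
data =
    "someone@web.com"


-- Validate, render the address on success, fall back to "invalid" on error.
content : String
content =
    validateAddress data
        |> Result.map emailToString
        |> Result.withDefault "invalid"


main : Html msg
main =
    h1 [] [ text content ]
```

Pattern matching directly in the function argument works here because the type has only one constructor, so the match can never fail.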

    07:12 What have we achieved? We now have an email address type, and we have this function, validateAddress, that will return a populated email address only if the given string satisfies some validation.

    07:24 However, there is still a shortcoming, in the sense that it's still possible to create a bogus email address such as "Hello". The problem has to do with visibility, because we can see this function, validateAddress.

    07:37 If a developer is disciplined, he would use this, so that he gets an error result in case the given string is not a valid email. At the same time, we can also see this constructor that allows us to feed it an arbitrary string to create an email address.

    07:54 Somehow, what we would like to achieve is that the outer type EmailAddress is visible, but the inner constructor EmailAddress is not, so that we're always forced to use the validateAddress function to create an email address.

    08:10 This is something that we can achieve if we use a dedicated module. Let me switch to the terminal and create an Email.elm file. Let me open these two files side by side. The Email module will have everything that has to do with creating a valid email.

    08:26 Firstly, I'm going to declare the module with module Email. For now, I'm going to expose all the elements. Now, I'm going to refactor and take everything in the main file that had to do with email and put it into the Email module.

    08:41 This includes, of course, the definition for the email union type. That's the most important thing, and also the validation function for creating a valid email, and the function for rendering an email as a string.

    08:56 From the main file, I need to import this module. For simplicity, I'm going to expose all the elements.

    09:10 Notice that everything still works, so nothing has really changed. What I can do now is, instead of exposing everything, only expose the members I want. Instead of the two dots that export all the declarations, I would just export EmailAddress.

    09:30 Notice that this refers to the outer type, but not the constructors. Then I would still need to export validateAddress and emailToString.

    09:40 If I save and recompile this, notice that the main module no longer compiles, because here I'm referring to EmailAddress the constructor, not the type, and the constructor is no longer exposed.

    10:03 I've effectively achieved my goal of exposing the union type EmailAddress, but not the constructor that allows you to create an email address just by passing it a string.
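A sketch of what the dedicated module could look like at this point, assuming it lives in a file named Email.elm next to the main file (Elm expects the file name to match the module name). The exposing list names the type without its constructor.

```elm
module Email exposing (EmailAddress, emailToString, validateAddress)


-- The type is exported, but its constructor is not, so code outside
-- this module cannot write `EmailAddress "Hello"`.
type EmailAddress
    = EmailAddress String


-- The only exported way to obtain an EmailAddress.
validateAddress : String -> Result String EmailAddress
validateAddress s =
    if String.contains "@" s then
        Ok (EmailAddress s)

    else
        Err "Not a valid email address"


emailToString : EmailAddress -> String
emailToString (EmailAddress s) =
    s
```

Inside the module the constructor is still available, which is why validateAddress and emailToString can use it freely.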

    10:14 I would now need to remove this, and if I want to create a message, then I would need to create an email address first. The only way to do that is by consuming the validateAddress function in the email module.

    10:29 Let me clean up quickly. Now my code compiles again. Let me also clean up the naming a bit for good practice.

    10:38 Since the module is already called Email, I will call the type Address. emailToString could also simply be called toString. In the client code, exposing all the members of the module is never a good idea because it leads to name collisions, so I would remove that.

    11:00 In this case, we have Email.Address as the type of recipient. Here, we would have Email.validateAddress, and then Email.toString. I also need to rename these exports.

    11:22 If I save, you can see the code compiles again, and if I break it again by supplying an invalid email address, you see that we indeed get an invalid result.
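After the renaming, the module might end up roughly as follows. This is a sketch under the same assumptions as before; the client-side usage is shown as comments because it belongs in a separate file.

```elm
module Email exposing (Address, toString, validateAddress)


type Address
    = Address String


validateAddress : String -> Result String Address
validateAddress s =
    if String.contains "@" s then
        Ok (Address s)

    else
        Err "Not a valid email address"


toString : Address -> String
toString (Address s) =
    s



-- In the client module (for example Main.elm), import without an
-- exposing clause and use qualified names:
--
--     import Email
--
--     type alias Message =
--         { recipient : Email.Address
--         , body : String
--         }
--
--     content =
--         Email.validateAddress data
--             |> Result.map Email.toString
--             |> Result.withDefault "invalid"
```

Qualified names such as Email.toString keep the short function name from colliding with other functions of the same name in the client code.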

    11:34 You've now seen a use case where it makes sense to have a union type that has a single case constructor, and also where it makes sense to export the type, but not export the constructor.

    11:46 By the way, another thing to keep in mind is that if you did want to export the constructors, you would do it like this. The double dot in this case would export all the constructors. Otherwise, you could list your constructors explicitly here. In this case, this is precisely what we wanted to avoid.
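For contrast, a minimal sketch of an exposing list that does export the constructor. The explicit form mentioned in the lesson would be written Address(Address); as far as I know, that form was accepted by the Elm version current when the lesson was recorded, while newer compilers only accept the (..) form, so treat the explicit variant as version-dependent.

```elm
module Email exposing (Address(..))


-- With Address(..) in the exposing list, every constructor of Address is
-- public, so client code can again write `Email.Address "Hello"` and
-- bypass any validation.
type Address
    = Address String
```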