Posted on :: min read :: Tags: :: listen via tts

Building on the approach outlined in Write Your Own json.Unmarshal, I want to introduce a versatile data model for Go to handle serialized data and a straightforward way to unmarshal it into Go types.

Recap of the Problem

The core challenge was mapping different data sources — such as http.Request.PathValue or url.Values — into a Go struct. Ideally, we wanted something as simple and intuitive as this:

var student struct {
  Name string
  Age  int
}

err := magic.Unmarshal(url.Query(), &student)

The goal was clear: abstract the source data and walk through the Go struct using reflection. For each struct field encountered, we extract the corresponding value from the source and populate the target struct.

After writing the initial article, I refined the serialized data model further, wrote comprehensive tests, and packaged everything into a library. This library is designed to make mapping data sources seamless and efficient.

Let me introduce unravel. Get it now at github.com/go-gum/unravel.

Data model

The foundation of unravel lies in its unravel.Source interface. This interface serves as an abstraction for any serialized data source or value, enabling flexible and consistent handling of diverse input formats.

type Source interface {
  Bool() (bool, error)
  Int() (int64, error)
  Uint() (uint64, error)
  Float() (float64, error)
  String() (string, error)

  // ...
}

The unravel.Source interface offers a collection of accessor methods for primitive types, such as Int, Float, and String. These methods attempt to retrieve the value as the specified primitive type. If a conversion to the requested type is not possible, the method must return ErrNotSupported to signal that the source value cannot be represented in the desired form.

In addition to primitive type accessors, the unravel.Source interface includes three additional methods designed for working with container types:

type Source interface {
  // ...
   Get(key string) (Source, error)
  Iter() (iter.Seq[Source], error)
  KeyValues() (iter.Seq2[Source, Source], error)
}

The Get method retrieves an unravel.Source for a field within the current value. If the source value is a primitive type, unravel.ErrNotSupported is returned. For container types that support indexing by string, unravel.ErrNoValue must be returned if the specified field name does not exist.

The Iter method handles slice-like values. When unravel encounters a slice or array during decoding, it calls Iter to obtain an iter.Seq representing the sequence of elements if the source value is iterable.

To manage unstructured types such as map, the interface provides a KeyValues method. Unlike structured types, unstructured data lacks predefined field mappings, so the Go type cannot guide the conversion. KeyValues addresses this by returning an iter.Seq2, which iterates over key-value pairs represented as unravel.Source. This allows unravel to populate maps dynamically.

unravel.Source values are not required to be idempotent, meaning they are not obligated to return the same values consistently. This flexibility enables the creation of sources like FakeSource, which can generate dummy data for testing purposes.

Two implementations are provided:

  1. StringSource wraps a string and implements the primitive type accessor methods. It uses strconv.ParseInt, strconv.ParseFloat, and strconv.ParseBool to convert the string into corresponding primitive types. Container methods like Get and Iter return unravel.ErrNotSupported.

  2. EmptySource does not wrap any value. All its methods return unravel.ErrNotSupported, making it a useful base for custom unravel.Source implementations where only a subset of methods needs to be implemented.

This design ensures flexibility, allowing unravel to adapt to various data representations while offering robust support for both structured and unstructured types.

Bring your own Source

The built-in unravel.Source implementations, while foundational, aren't particularly useful for day-to-day operations. The real power of unravel lies in its flexibility to create custom unravel.Source implementations tailored to specific data. Let's revisit the original problem: mapping path values from an http.Request into a Student struct.

Mapping Path Values to a Struct

Suppose we have an http.Request with path values, such as those provided by path parameters when registering a handler with http.ServeMux. Our goal is to map these path values to fields in the Student struct.

To achieve this, we need a custom unravel.Source that acts as a container. This source should allow access to the path values within the http.Request, functioning similarly to a map or struct.

We can begin by defining a PathValueSource that wraps the http.Request and provides the required access methods. Here's how we might start:

type PathValueSource struct {
  Request *http.Request
}

To implement the Source.Get method for PathValueSource, we extract the path value from the http.Request using the specified field name. Since the method must return an unravel.Source, we wrap the resulting string in an unravel.StringSource. If the path value is empty, we treat it as missing and return unravel.ErrNoValue.

func (p PathValueSource) Get(key string) (unravel.Source, error) {
  v := p.Request.PathValue(key)
  if v == "" { return nil, unravel.ErrNoValue }
  return unravel.StringSource(v), nil
}

Since PathValueSource is neither iterable nor representable as a primitive Go value, we fulfill the requirements of the unravel.Source interface by embedding unravel.EmptySource.

type PathValueSource struct {
  unravel.EmptySource
  Request *http.Request
}

Now, we can write a simple test to verify that our PathValueSource behaves as expected:

func TestPathValueSource(t *testing.T) {
  var req http.Request
  req.SetPathValue("Name", "Albert")
  req.SetPathValue("Age", "18")

  var student Student
  err := unravel.Unmarshal(PathValueSource{Request: &req}, &student)

  require.NoError(t, err)
  require.Equal(t, Student{Name: "Albert", Age: 18}, student)
}

Perfect, this works:

=== RUN   TestPathValueSource
--- PASS: TestPathValueSource (0.00s)
PASS

Now, we can integrate our PathValueSource implementation into an http.Handler:

mux := http.NewServeMux()

mux.HandleFunc("GET /register/student/{Name}/age/{Age}", func(w http.ResponseWriter, req *http.Request) {
  var student Student
  if err := unravel.Unmarshal(PathValueSource{Request: req}, &student); err != nil {
    http.Error(w, err.Error(), http.StatusBadRequest)
    return
  }

  // students Name and Age are not set correctly
  business.RegisterStudent(w, student)
})

Query Values

Let's explore a different example: mapping query values from a url.Values to a struct.

Unlike a simple map[string]string, url.Values is defined as map[string][]string because URL parameters can appear multiple times. This added complexity means we need to handle multiple values for the same key.

Our goal is to support this scenario and unmarshal query parameters into a struct like this:

type Basket struct {
  PaymentMethod string `json:"pay"`
  ItemIds       []int  `json:"itemId"`
}

To rename struct fields, unravel.Unmarshal supports struct tags in a manner similar to json.Unmarshal. By default, it uses json tags, but you can easily customize this behavior by creating a unravel.Decoder and calling the WithTag(string) method.

Let's proceed by writing a custom unravel.Source for a []string, which represents a single key in the url.Values type. We begin by defining a new struct:

type stringSliceSource struct {
    Values []string
}

In the case of a single value, such as ?x=foo&y=bar, we would retrieve []string{"foo"} for x. This value should now be decodable as a primitive type. To handle this, we can either delegate the conversion to unravel.StringSource or write our own custom conversion logic.

As an example, here are the Int and String methods that would be part of our implementation. The other methods of unravel.Source would follow a similar pattern:

func (s stringSliceSource) Int() (int64, error) {
  if len(s.Values) != 1 {
    return 0, unravel.ErrNotSupported
  }
  return unravel.StringSource(s.Values[0]).Int()
}

func (s stringSliceSource) String() (string, error) {
  if len(s.Values) != 1 {
    return "", unravel.ErrNotSupported
  }
  return s.Values[0], nil
}

In cases where a query parameter has multiple values, such as ?q=foo&q=bar, we would receive a slice like []string{"foo", "bar"}. To handle this, we implement the Iter() method, which returns an iterator over the values.

To simplify the implementation, we wrap each value in a unravel.StringSource as we yield it. This ensures that each value is returned in the correct format, while maintaining compatibility with the unravel.Source interface. Here's how the Iter() method might look:

func (s stringSliceSource) Iter() (iter.Seq[unravel.Source], error) {
  it := func(yield func(unravel.Source) bool) {
    for _, value := range s.Values {
      if !yield(unravel.StringSource(value)) { break }
    }
  }

  return it, nil
}

We can now seamlessly unmarshal zero, one, or more values into a slice, and even convert a slice with a single value into a primitive value. The only remaining piece is the ability to access individual URL values by their name.

The implementation for this is very similar to the PathValueSource example. The key difference is that, in this case, we need to return instances of stringSliceSource from the Get() method:

type UrlValuesSource struct {
  unravel.EmptySource
  Values url.Values
}

func (p UrlValuesSource) Get(key string) (unravel.Source, error) {
  return stringSliceSource{Values: p.Values[key]}, nil
}

We can now test the UrlValuesSource by mapping it to the basket struct we defined earlier:

func TestBasket(t *testing.T) {
  query, _ := url.ParseQuery("itemId=42&itemId=34&itemId=69&pay=Credit")
  t.Logf("Query: %#v", query)

  var basket Basket
  _ = unravel.Unmarshal(UrlValuesSource{Values: query}, &basket)

  require.Equal(t, Basket{ItemIds: []int{42, 34, 69}, PaymentMethod: "Credit"}, basket)
  t.Logf("Mapped Query: %#v", basket)
}

Running the test gives us the expected output:

=== RUN   TestBasket
    examples_test.go:71: Query: url.Values{"itemId":[]string{"42", "34", "69"}, "pay":[]string{"Credit"}}
    examples_test.go:77: Mapped Query: examples.Basket{PaymentMethod:"Credit", ItemIds:[]int{42, 34, 69}}
--- PASS: TestBasket (0.00s)
PASS

Parsing binary data

To accommodate a broader range of use cases, unravel.BinarySource extends the unravel.Source interface with additional functionality. While the Int() method in unravel.Source is defined as int64 to encompass the full range of Go integer types, the unravel.BinarySource interface introduces methods to retrieve integers of specific sizes, such as int8 or int16, directly.

These additional methods, named Int8, Int16, Uint16 and so on, provide precise control over integer sizes.

When a source implements the unravel.BinarySource interface, it enables unravel to select appropriately sized integers during unmarshaling. This feature is particularly useful for decoding binary formats. For instance, consider a scenario where we need to parse a bitmap header like this:

type BitmapFileHeader struct {
  Signature  [2]byte
  FileSize, Reserved, PixelOffset uint32
}

type BitmapInfoHeader struct {
  HeaderSize      uint32
  Width, Height   int32
  Planes, BPP     uint16
  Compression     uint32
  ImageSize       uint32
  XRes, YRes      int32
  ColorsUsed      uint32
  ImportantColors uint32
}

type Header struct {
  File BitmapFileHeader
  Info BitmapInfoHeader
}

Our goal is to unmarshal data from an io.Reader into this bitmap header. Fortunately, with unravel, this process is straightforward. We can define a new Source just as we did previously:

type binarySource struct {
  unravel.EmptySource
  r io.Reader
}

Next, we can implement the methods of the unravel.BinarySource interface to read data from the io.Reader and decode it into integers of the correct size using binary.LittleEndian. For example, to read a uint32, we implement the following method:

func (b binarySource) Uint32() (uint32, error) {
  var buf [4]byte
  err := b.r.Read(buf[:])
  return binary.LittleEndian.Uint32(buf[:]), err
}

We then proceed to implement the remaining methods of the unravel.BinarySource interface, such as Int32(), and others.

Even if the input data is not self describing, we still need to implement the Source.Get method to enable unravel to unmarshal struct fields. unravel processes struct fields by reading them from the Source in the order they appear.

In our case, as the Source reads values directly from the reader without maintaining additional state, it simply returns itself when Get is called. This makes fetching a child value from the source effectively a no-op. For example, source.Get("Width").Int32() is functionally identical to source.Int32(), or reading a field is just reading the next integer from the io.Reader.

func (b binarySource) Get(key string) (unravel.Source, error) {
  return b, nil
}

Finally, we need to implement the Iter() method to enable iteration over a sequence of values. This is necessary in our example to deserialize the Signature array in the file header. To deserialize a [2]byte value, unravel will call Iter() and then extract exactly two values from the returned iterator.

Since our source does not inherently track the number of elements to be read or their individual sizes, we follow the same approach as with the Get() method. Each iteration will simply return the binarySource itself, allowing unravel to handle value extraction directly.

func (b binarySource) Iter() (iter.Seq[unravel.Source], error) {
  it := func(yield func(unravel.Source) bool) {
    for {
      if !yield(b) { break }
    }
  }
  return it, nil
}

With this approach, pulling out two values from Iter() and calling Int8() on each is functionally identical to calling Int8() directly on the binarySource.

At this point, we have all the pieces in place to deserialize a binary bitmap header into our Go struct. Given an io.Reader named bmp, we can deserialize the bitmap header as follows:

// a 3x5 px bitmap header
var header Header
parsed, err := unravel.Unmarshal(binarySource{r: bmp}, &header)
require.Equal(t, err, nil)
require.Equal(t, parsed, expected)

Closing words

As demonstrated, unmarshaling values with unravel is straightforward and highly flexible. By implementing your own unravel.Source, you can create various custom mappings tailored to specific needs. To build on the examples discussed in this article, here are some additional ideas:

  • FakeSource: A source that returns random values for each primitive type whenever called. This can be useful for generating fake data in tests quickly and effortlessly.

  • CopySource: A source that uses reflection to read data from another Go struct, enabling seamless copying of values between slightly different types.

  • ConvertingSource: A source that holds a value of type any and attempts to convert it to the requested target type. For example, it could convert an int to a string, a string to an int, or even split a string into a slice.

The possibilities are vast, and you can likely think of many other use cases.

You can start using unravel in your project today! As the API has not yet reached version 1.0.0, feedback and suggestions are highly welcome.

go get github.com/go-gum/unravel

Thank you for making it this far — now go build something!

Discuss this article on reddit.